What is the most optimal method of querying a reliable dictionary collection - azure-service-fabric

I would like to use Service Fabric reliable dictionaries to store data that we intend to query via a REST interface. I was wondering what the best approach is for querying data held in those collections.
The documentation doesn't seem to provide any methods to support querying an IReliableDictionary interface apart from simply enumerating over the collection.
Is the only option to fetch every object out of the reliable dictionary and pop into a List<> collection for querying?
Thanks

As Vaclav mentions in his answer, you can always enumerate the entire dictionary and filter as you wish. Additionally, take a look at the docs for IReliableDictionary, specifically CreateEnumerableAsync(ITransaction, Func<TKey, Boolean>, EnumerationMode). This allows you to filter the keys you wish to include in the enumeration up-front, which may improve perf (e.g. by avoiding paging unnecessary values back in from disk).
Simple Example (exception handling/retry/etc. omitted for brevity):
var myDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<int, string>>(new Uri("fabric:/myDictionary"));
using (var tx = this.StateManager.CreateTransaction())
{
    // Fetch all key-value pairs where the key is an integer less than 10
    var enumerable = await myDictionary.CreateEnumerableAsync(tx, key => key < 10, EnumerationMode.Ordered);
    var asyncEnumerator = enumerable.GetAsyncEnumerator();
    while (await asyncEnumerator.MoveNextAsync(cancellationToken))
    {
        // Process asyncEnumerator.Current.Key and asyncEnumerator.Current.Value as you wish
    }
}
You could create canned or dynamic key filters corresponding to your REST queries which search based on key - for value-based queries you will need to fall back to full enumeration (e.g. by using the same pattern as above, but omitting the key filter) or use notifications in order to build your own secondary index.

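In early SDK releases, IReliableDictionary implemented IEnumerable, which meant you could query it with LINQ expressions the same way you would an IDictionary. Later releases expose CreateEnumerableAsync instead, so the enumeration is asynchronous.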
Note that an enumeration over IReliableDictionary uses snapshot isolation, so it is lock free. Basically this means when you start the enumeration you are enumerating over the dictionary as it is at the time you started. If items are added or removed while you're enumerating, those changes won't appear in that enumeration. More info about this and Reliable Collections in general here: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-services-reliable-collections/

See also: What would be the best way to search an IReliableDictionary?
There are two ways to do it:
Eli Arbel has async extension methods on Gist that provide a bridge to Ix-Async and also async implementations of Select, SelectMany, and Where.
You can wrap IAsyncEnumerable in a regular IEnumerable that simply wraps the asynchronous calls.

This answer: Convert IReliableDictionary to IList shows how to convert Service Fabric's IAsyncEnumerable to a generic IAsyncEnumerable. You can then use the async LINQ operators provided by .NET.

Firestore database structure in flutter best practice

I'm creating a fitness app, and so far I came up with the following structure:
Workout
    difficulty (String)
    duration (String)
    exerciseSets (Firestore ref)
ExerciseSet
    repNumber (int)
    exercise (Firestore Ref)
and the Exercise object has a few fields describing the exercise.
So right now, if I want to retrieve a whole workout, I need to make at least 3 calls to Firestore: one for the Workout, then one per ExerciseSet by ref (and there are usually a few in each workout), and then one per Exercise by ref as well.
ExerciseSet and Exercise objects are shared between workouts, that's why I have them in different docs.
Also, after retrieving all 3 or more snapshots from Firestore, I need to iterate through them to map them to my model. I currently do something like this:
for (var exerciseSet in fsWorkout.exerciseSets) {
    var fsExerciseSet = await _getFsExerciseSet(exerciseSet.ref);
    var set = ExerciseSet.fromFirstoreObject(fsExerciseSet);
    var fsExercise = await _getFsExercise(fsExerciseSet.exerciseRef.ref);
    set.exercise = Exercise.fromFirestoreObject(fsExercise);
    exerciseSets.add(set);
}
return Workout(fsWorkout.difficulty, fsWorkout.duration, exerciseSets);
Does this make sense, or is there a more efficient or easier way to achieve this? It feels like I overcomplicated things.
And is there any advantage to using a Firestore reference instead of just a String field with the ID?
Thanks!
EDIT: I would like to mention that in my case all the data is added once by me, and the client reads the data and needs to retrieve a Workout object that contains all the ExerciseSet and Exercise objects.
You are actually applying an SQL normalization data-modelling strategy to a NoSQL database. This is not the most efficient approach...
In the NoSQL world, you should not be afraid to duplicate data and denormalize your data model. I would suggest you read this "famous" post about NoSQL data-modelling approaches.
So, instead of designing your data model according to SQL normalization, you should, in the NoSQL world, think about it from a query perspective, trying to minimize the number of queries for a given screen/use case.
In your case, a common approach would be to use a set of Cloud Functions (which are executed in the back-end) to duplicate your data and have all the ExerciseSets and corresponding Exercises in your Workout Firestore document. To keep all this data in sync, you would also use Cloud Functions.
You could also go for an intermediate approach where you only add the ExerciseSets data to a Workout, and when the user wants to see the details of an ExerciseSet (e.g. by clicking on the ExerciseSet link) you query the corresponding Exercises.
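As a rough illustration of the full duplication approach, the embedding step that such a Cloud Function would perform can be sketched as a pure JavaScript function. The helper name denormalizeWorkout and the field names exerciseSetIds, exerciseId and name are assumptions made for this sketch; they are not taken from the question or from the Firestore API, and the lookups are plain objects rather than Firestore reads.

```javascript
// Hypothetical helper: builds one self-contained Workout document by
// embedding each referenced ExerciseSet together with its Exercise.
// Field names (exerciseSetIds, exerciseId, name) are assumptions.
function denormalizeWorkout(workout, exerciseSetsById, exercisesById) {
  const exerciseSets = workout.exerciseSetIds.map((setId) => {
    const set = exerciseSetsById[setId];
    if (!set) {
      throw new Error(`Unknown ExerciseSet: ${setId}`);
    }
    const exercise = exercisesById[set.exerciseId];
    if (!exercise) {
      throw new Error(`Unknown Exercise: ${set.exerciseId}`);
    }
    // Copy the data into the workout instead of keeping a reference to it.
    return { repNumber: set.repNumber, exercise: { ...exercise } };
  });

  return {
    difficulty: workout.difficulty,
    duration: workout.duration,
    exerciseSets,
  };
}

// In-memory stand-ins for the normalized collections.
const exercisesById = { pushup: { name: "Push-up" } };
const exerciseSetsById = { s1: { repNumber: 10, exerciseId: "pushup" } };
const workout = { difficulty: "easy", duration: "20 min", exerciseSetIds: ["s1"] };

const workoutDoc = denormalizeWorkout(workout, exerciseSetsById, exercisesById);
console.log(JSON.stringify(workoutDoc, null, 2));
```

With a document shaped like workoutDoc, the client can load a whole workout in a single read, at the cost of rewriting every affected workout whenever a shared ExerciseSet or Exercise changes.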

Get int value from database

How can I get an int value from the database?
The table has 4 columns:
Id, Author, Like, Dislike.
I want to get the Dislike amount and add 1.
I tried:
var db = new memyContext();
var amountLike = db.Memy.Where(s => s.IdMema == id).select(like);
memy.like=amountLike+1;
I know that this is a bad way.
Please help
I'm not entirely sure what your question is here, but there's a few things that might help.
First, if you're retrieving via something that reasonably only has one match, or in a scenario where you want just one thing, then you should use SingleOrDefault or FirstOrDefault, respectively, not Where. Where is reserved for scenarios where you expect multiple things to match, i.e. the result will be a list of objects, not an object. Since you're querying by an id, it's fairly obvious that you expect just one match. Therefore:
var memy = db.Memy.SingleOrDefault(s => s.IdMema == id);
Second, if you just need to read the value of Like, then you can use Select, but there are two problems with that. First, Select operates on sequences, and as already discussed, you need a single object, not a list of objects. In truth, you can sidestep this in a somewhat convoluted way:
var amountLike = db.Memy.Where(x => x.IdMema == id).Select(x => x.Like).SingleOrDefault();
However, this is still flawed, because you not only need to read this value, but also write back to it, which then needs the context of the object it belongs to. As such, your code should actually look like:
var memy = db.Memy.SingleOrDefault(s => s.IdMema == id);
memy.Like++;
In other words, you pull out the instance you want to modify, and then modify the value in place on that instance. I also took the liberty of using the increment operator here, since it makes far more sense that way.
That then only solves part of your problem, as you need to persist this value back to the database as well, of course. That also brings up the side issue of how you're getting your context. Since this is an EF context, it implements IDisposable and should therefore be disposed when you're done with it. That can be achieved simply by calling db.Dispose(), but it's far better to use using instead:
using (var db = new memyContext())
{
    // do stuff with db
}
And while we're here, based on the tags of your question, you're using ASP.NET Core, which means that even this is sub-optimal. ASP.NET Core uses DI (dependency injection) heavily, and encourages you to do likewise. An EF context is generally registered as a scoped service, and should therefore be injected where it's needed. I don't have the context of where this code exists, but for illustration purposes, we'll assume it's in a controller:
public class MemyController : Controller
{
    private readonly memyContext _db;

    public MemyController(memyContext db)
    {
        _db = db;
    }
    ...
}
With that, ASP.NET Core will automatically pass in an instance of your context to the constructor, and you do not need to worry about creating the context or disposing of it. It's all handled for you.
Finally, you need to do the actual persistence, but that's where things start to get trickier, as you now most likely need to deal with the concept of concurrency. This code could be being run simultaneously on multiple different threads, each one querying the database at its current state, incrementing this value, and then attempting to save it back. If you do nothing, one thread will inevitably overwrite the changes of the other. For example, let's say we receive three simultaneous "likes" on this object. They all query the object from the database, and let's say that the current like count is 0. They then each increment that value, making it 1, and then they each save the result back to the database. The end result is the value will be 1, but that's not correct: there were three likes just added.
As such, you'll need to implement a semaphore to essentially gate this logic, allowing only one like operation through at a time for this particular object. That's a bit beyond the scope here, but there's plenty of stuff online about how to achieve that.
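The lost update described above, and the effect of gating, can be reproduced in a few lines. This is a JavaScript simulation rather than EF code: the in-memory row, the pause helper and all function names are hypothetical, and the promise-chain gate shown here only serializes callers inside a single process.

```javascript
// Simulated database row; pause() stands in for the latency of real I/O.
const row = { likes: 0 };
const pause = () => new Promise((resolve) => setTimeout(resolve, 1));

async function readLikes() {
  await pause();
  return row.likes;
}

async function writeLikes(value) {
  await pause();
  row.likes = value;
}

// Unprotected read-modify-write: concurrent callers all read the same value.
async function likeUnsafe() {
  const current = await readLikes();
  await writeLikes(current + 1);
}

// Minimal gate: each call starts only after the previous one has finished.
let gate = Promise.resolve();
function likeGated() {
  const run = gate.then(likeUnsafe);
  gate = run.catch(() => {}); // keep the chain usable if one call fails
  return run;
}

// Fires three "likes" at once and reports the final count.
async function runThree(like) {
  row.likes = 0;
  await Promise.all([like(), like(), like()]);
  return row.likes;
}

async function demo() {
  const unsafe = await runThree(likeUnsafe);
  const gated = await runThree(likeGated);
  console.log({ unsafe, gated });
  return { unsafe, gated };
}

const demoResult = demo();
```

In the unsafe run all three callers read 0 before any of them writes, so two likes are lost; in the gated run each caller sees the previous caller's write.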

How to update data with GraphQL

I am studying graphql.
I can retrieve data from my mongo database with queries, I can create data with mutations.
But how I can modify existing data?
I am a bit lost here...
I have to create a new mutation?
Yes, every mutation describes a specific action that can be done to a bit of data. GraphQL is not like REST - it doesn't specify any standard CRUD-type actions.
When you are writing a mutation to update some data, you have two options. Let's explain them in the context of a todo item that has a completed status, and a text field:
Write mutations that represent semantic actions - markTodoCompleted, updateTodoText, etc.
Write a generic mutation that just sets any properties passed to it; you could call it updateTodo.
I prefer the first approach, because it makes it clearer what the client is doing when it calls a certain mutation. In the second approach, you need to be careful to validate the values to be set to make sure someone can't set some invalid combination.
In short, you need to define your own mutations to update data.
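To make the two options concrete, here is a minimal sketch of what the resolver functions might look like, written as plain JavaScript with no GraphQL library involved. The in-memory todos map, the getTodo helper, the changes argument and the validation rule are all assumptions for illustration; a real server would wire these functions into its schema and talk to the database instead.

```javascript
// In-memory stand-in for the database; the todo shape is an assumption.
const todos = new Map([
  ["1", { id: "1", text: "Write docs", completed: false }],
]);

function getTodo(id) {
  const todo = todos.get(id);
  if (!todo) {
    throw new Error(`Todo ${id} not found`);
  }
  return todo;
}

// Resolvers use the conventional (parent, args) parameter order.
const Mutation = {
  // Option 1: semantic mutations, one per action.
  markTodoCompleted(_parent, { id }) {
    const todo = getTodo(id);
    todo.completed = true;
    return todo;
  },
  updateTodoText(_parent, { id, text }) {
    const todo = getTodo(id);
    todo.text = text;
    return todo;
  },

  // Option 2: one generic mutation; it has to validate what it accepts.
  updateTodo(_parent, { id, changes }) {
    const todo = getTodo(id);
    const allowedTypes = { text: "string", completed: "boolean" };
    for (const [field, value] of Object.entries(changes)) {
      if (typeof value !== allowedTypes[field]) {
        throw new Error(`Invalid value for field "${field}"`);
      }
    }
    return Object.assign(todo, changes);
  },
};

console.log(Mutation.markTodoCompleted(null, { id: "1" }));
console.log(Mutation.updateTodo(null, { id: "1", changes: { text: "Ship docs" } }));
```

The semantic resolvers need no field validation because each one can only do a single thing, which is the trade-off the answer describes.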

MongoDB - is this efficient?

I have a collection users in mongodb. I am using the node/mongodb API.
In my node.js if I go:
var users = db.collection("users");
and then
users.findOne({ ... })
Is this idiomatic and efficient?
I want to avoid loading all users into application memory.
In Mongo, collection.find() returns a cursor, which is an abstract representation of the results that match your query. It doesn't return your results over the wire until you iterate over the cursor by calling cursor.next(). Before iterating over it, you can, for instance, call cursor.limit(x) to limit the number of results that it is allowed to return.
collection.findOne() is effectively just a shortcut version of collection.find().limit(1).next(). So the cursor is never allowed to return more than the one result.
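The lazy behaviour described above can be pictured with a toy model. The LazyCursor class below is hypothetical and is not the MongoDB driver's implementation; it works on an in-memory array and only mimics the idea that nothing is examined until next() is called and that a limit of 1 stops after the first match.

```javascript
// Toy model of a lazy cursor; NOT the MongoDB driver's implementation.
class LazyCursor {
  constructor(documents, predicate) {
    this.documents = documents;
    this.predicate = predicate;
    this.max = Infinity; // maximum number of results to hand out
    this.position = 0;
    this.returned = 0;
    this.examined = 0; // how many documents were actually looked at
  }

  limit(n) {
    this.max = n;
    return this;
  }

  // Scans forward only as far as needed to produce the next match.
  next() {
    while (this.returned < this.max && this.position < this.documents.length) {
      const doc = this.documents[this.position++];
      this.examined++;
      if (this.predicate(doc)) {
        this.returned++;
        return doc;
      }
    }
    return null;
  }
}

const sampleUsers = [
  { name: "ann", age: 31 },
  { name: "bob", age: 42 },
  { name: "cy", age: 42 },
];

const find = (predicate) => new LazyCursor(sampleUsers, predicate);
const findOne = (predicate) => find(predicate).limit(1).next();

const cursor = find((user) => user.age === 42);
console.log(cursor.examined); // creating the cursor examines nothing

const firstMatch = findOne((user) => user.age === 42);
console.log(firstMatch);
```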
As already explained, the collection object itself is a facade allowing access to the collection, but which doesn't hold any actual documents in memory.
findOne() is very efficient, and it is indeed idiomatic, though IME it's more used in dev/debugging than in real application code. It's very useful in the CLI, but how often does a real application need to just grab any one document from a collection? The only case I can think of is when a given query can only ever return one document (i.e. an effective primary key).
Further reading:
Cursors,
Read Operations
Yes, that should only load one user into memory.
The collection object is exactly that: a handle to the collection. Creating it doesn't fetch any users; documents are only returned when you call findOne or iterate the cursor returned by find.

Breeze: complex graph returns only 1 collection

I have a physician graph that looks something like this:
The query I use to get data from a WebApi backend looks like this:
var query = new breeze.EntityQuery().from("Physicians")
    .expand("ContactInfo")
    .expand("ContactInfo.Phones")
    .expand("ContactInfo.Addresses")
    .expand("PhysicianNotes")
    .expand("PhysicianSpecialties")
    .where("ContactInfo.LastName", "startsWith", lastInitial).take(5);
(note the ContactInfo is a pseudonym of the People object)
What I find is that if I request ContactInfo.Phones to be expanded, I get just Phones and no Notes or Specialties. If I comment out the Phones, I get ContactInfo.Addresses and no other collections. If I comment out ContactInfo along with Phones and Addresses, I get Notes only, etc. Essentially, it seems like I can only get one collection at a time.
So, is this a built-in 'don't let the programmer shoot himself in the foot' safeguard, or do I have to enable something?
Or is this graph too complicated? Should I consider a NoSQL object store?
Thanks
You need to put all your expand clauses in a single one like this:
var query = new breeze.EntityQuery().from("Physicians")
    .expand("ContactInfo, ContactInfo.Phones, ContactInfo.Addresses, PhysicianNotes, PhysicianSpecialties")
    .where("ContactInfo.LastName", "startsWith", lastInitial).take(5);
You can see the documentation here: http://www.breezejs.com/sites/all/apidocs/classes/EntityQuery.html#method_expand
JY told you HOW. But BEWARE of performance consequences ... both on the data tier and over the wire. You can die a miserable death by grabbing too widely and deeply at once.
I saw the take(5) in his sample. That is crucial for restraining a runaway request (something you really must do also on the server). In general, I would reserve extended graph fetches of this kind for queries that pulled a single root entity. If I'm presenting a list for selection and I need data from different parts of the entity graph, I'd use a projection to get exactly what I need to display (assuming, of course, that there is no SQL View readily available for this purpose).
If any of the related items are reference lists (color, status, states, ...), consider bringing them into cache separately in a preparation step. Don't include them in the expand; Breeze will connect them on the client to your queried entities automatically.
Finally, as a matter of syntax, you don't have to repeat the name of a segment. When you write "ContactInfo.Phones", you get both ContactInfos and Phones so you don't need to specify "ContactInfo" by itself.
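Both points, one combined expand string and no repeated parent segments, can be handled by a small helper. buildExpand below is a hypothetical function written for this sketch, not part of Breeze; it only does string manipulation and assumes the comma-separated format used in the answer above.

```javascript
// Hypothetical helper (not part of Breeze): joins navigation paths into a
// single expand string, dropping any path that is already implied by a
// longer path starting with it (e.g. "ContactInfo" vs "ContactInfo.Phones").
function buildExpand(paths) {
  const unique = [...new Set(paths)];
  const needed = unique.filter(
    (path) => !unique.some((other) => other.startsWith(path + "."))
  );
  return needed.join(", ");
}

const expandClause = buildExpand([
  "ContactInfo",
  "ContactInfo.Phones",
  "ContactInfo.Addresses",
  "PhysicianNotes",
  "PhysicianSpecialties",
]);

console.log(expandClause);
```

The resulting string would then be passed to a single .expand(expandClause) call on the query.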