Service Fabric Reliable Dictionary performance with 1 million keys

I am evaluating the performance of Service Fabric with a Reliable Dictionary of ~1 million keys. I'm getting fairly disappointing results, so I wanted to check if either my code or my expectations are wrong.
I have a dictionary initialized with
dict = await _stateManager.GetOrAddAsync<IReliableDictionary2<string, string>>("test_"+id);
id is unique for each test run.
I populate it with a list of strings, like
"1-1-1-1-1-1-1-1-1",
"1-1-1-1-1-1-1-1-2",
"1-1-1-1-1-1-1-1-3".... up to 576,000 items. The value in the dictionary is not used, I'm currently just using "1".
It takes about 3 minutes to add all the items to the dictionary. I have to split the transaction into batches of 100,000 at a time; otherwise it seems to hang forever (is there a limit to the number of operations in a transaction before you need to CommitAsync()?)
// take100_000 is the next 100,000 of the original list of 576,000
using (var tx = _stateManager.CreateTransaction())
{
    foreach (var tick in take100_000)
    {
        await dict.AddAsync(tx, tick, "1");
    }
    await tx.CommitAsync();
}
After that, I need to iterate through the dictionary to visit each item:
using (var tx = _stateManager.CreateTransaction())
{
    var enumerator = (await dict.CreateEnumerableAsync(tx)).GetAsyncEnumerator();
    try
    {
        while (await enumerator.MoveNextAsync(ct))
        {
            var tick = enumerator.Current.Key;
            // do something with tick
        }
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
}
This takes 16 seconds.
I'm not so concerned about the write time; I know it has to be replicated and persisted. But why does it take so long to read? 576,000 17-character string keys should be no more than 11.5 MB in memory, and the values are only a single character and are ignored. Aren't Reliable Collections cached in RAM? Iterating through a regular Dictionary of the same values takes 13 ms.
I then called ContainsKeyAsync 576,000 times on an empty dictionary (in one transaction). This took 112 seconds. Trying this on just about any other data structure would take ~0 ms.
This is on a local 1 node cluster. I got similar results when deployed to Azure.
Are these results plausible? Any configuration I should check? Am I doing something wrong, or are my expectations wildly inaccurate? If so, is there something better suited to these requirements? (~1 million tiny keys, no values, persistent transactional updates)

Ok, for what it's worth:
Not everything is stored in memory. To support large Reliable Collections, some values are cached and some reside on disk, which can lead to extra I/O while retrieving the data you request. I've heard a rumor that at some point we may get a chance to adjust the caching policy, but I don't think it has been implemented yet.
You iterate through the data reading records one by one. IMHO, if you issue half a million separate sequential queries against any data source, the outcome won't be encouraging. I'm not saying that every single MoveNext() results in a separate I/O operation, but overall it doesn't look like a single fetch.
It depends on the resources you have. For instance, trying to reproduce your case on my local machine with a single partition and three replicas, I get the records in about 5 seconds on average.
Thinking about a workaround, here is what comes in mind:
Chunking - I tried the same scenario with the records split into string arrays capped at 10 elements (IReliableDictionary<string, string[]>). So essentially it was the same amount of data, but the read time dropped from 5 seconds to 7 ms. I guess if you keep your items below 80 KB, reducing the number of round-trips and keeping the LOH small, you should see your performance improve; a sketch follows.
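For instance, here is a minimal sketch of that chunking, reusing the question's _stateManager and id; sourceKeys is a hypothetical stand-in for the full list of 576,000 keys:
// (requires System.Linq for Skip/Take)
// Pack every 10 keys into a single string[] value so that each record
// fetched during enumeration carries 10 items instead of one.
var chunkedDict = await _stateManager
    .GetOrAddAsync<IReliableDictionary2<string, string[]>>("test_chunked_" + id);

const int chunkSize = 10;
using (var tx = _stateManager.CreateTransaction())
{
    for (int i = 0; i < sourceKeys.Count; i += chunkSize)
    {
        var chunk = sourceKeys.Skip(i).Take(chunkSize).ToArray();
        // The first key of the chunk doubles as the dictionary key here.
        await chunkedDict.AddAsync(tx, chunk[0], chunk);
    }
    await tx.CommitAsync();
}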
Filtering - CreateEnumerableAsync has an overload that allows you to specify a key filter delegate, so values are not retrieved from disk for keys that do not match the filter (illustrated below).
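A sketch of that overload, reusing the question's dict and ct; the prefix filter itself is just an example:
using (var tx = _stateManager.CreateTransaction())
{
    var enumerable = await dict.CreateEnumerableAsync(
        tx,
        key => key.StartsWith("1-1-"),   // values are only loaded for matching keys
        EnumerationMode.Unordered);

    var enumerator = enumerable.GetAsyncEnumerator();
    while (await enumerator.MoveNextAsync(ct))
    {
        var tick = enumerator.Current.Key;
        // do something with tick
    }
}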
State Serializer - in case you go beyond simple strings, you could develop your own serializer and try to reduce the I/O incurred for your type; see the sketch below.
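A hedged sketch of such a serializer, assuming the IStateSerializer<T> interface from Microsoft.ServiceFabric.Data (MyType and MyTypeSerializer are hypothetical names; check the registration API of the SDK version you target):
using System.IO;
using Microsoft.ServiceFabric.Data;

public class MyType { public int Id; public string Name; }

public class MyTypeSerializer : IStateSerializer<MyType>
{
    public void Write(MyType value, BinaryWriter writer)
    {
        writer.Write(value.Id);    // write only the fields you need,
        writer.Write(value.Name);  // in a compact binary form
    }

    public MyType Read(BinaryReader reader)
    {
        return new MyType { Id = reader.ReadInt32(), Name = reader.ReadString() };
    }

    // The differential overloads can simply fall back to the full versions.
    public void Write(MyType baseValue, MyType targetValue, BinaryWriter writer)
        => Write(targetValue, writer);

    public MyType Read(MyType baseValue, BinaryReader reader)
        => Read(reader);
}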
Hopefully it makes sense.

Related

What is a more efficient way to compare and filter Sequences (multiple calls to a single call)

I have two sequences of Data objects and I want to establish what has been added, removed and is common between DataSeq1 and DataSeq2 based upon the id in the Data objects within each sequence.
I can achieve this using the following:
val dataRemoved = DataSeq1.filterNot(c => DataSeq2.exists(_.id == c.id))
val dataAdded   = DataSeq2.filterNot(c => DataSeq1.exists(_.id == c.id))
val dataCommon  = DataSeq1.filter(c => DataSeq2.exists(_.id == c.id))

// Based upon what is common I want to filter DataSeq2
var incomingDataToCompare = List[Data]()
dataCommon.foreach(data => {
  incomingDataToCompare = DataSeq2.find(_.id == data.id).get :: incomingDataToCompare
})
However, as the Data objects get larger, calling filter three separate times may have a performance impact. Is there a more efficient way to achieve the same output (i.e. what has been removed, added, and is in common) in a single call?
The short answer is, quite possibly not, unless you are going to add some additional features into the system. I would guess that you need to keep a log of operations in order to improve the time complexity. Even better if that log will be indexed both by the order in which the operation has occurred and by the id of the item that was added/removed. I will leave it to you to discover how such a log can be used.
Also, you might be able to improve the time complexity if you keep the original sequences sorted by id (or keep a separate seq of sorted ids; you should be able to maintain that with a log N penalty per mutating operation). This seq should be a Vector or similar, to allow fast random access. Then you can iterate with two pointers, as sketched below. But this algorithm's efficiency will depend greatly on whether the unique ids are bounded, and on whether this "establish added/removed/same" operation is called much more frequently than the operations that mutate the sequences.
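For illustration, here is the two-pointer pass over the sorted ids; the idea is language-agnostic, so this sketch is in C# with hypothetical inputs:
using System.Collections.Generic;

// ids1/ids2: id lists already sorted ascending (hypothetical example data).
var ids1 = new List<int> { 1, 2, 4, 7 };
var ids2 = new List<int> { 2, 3, 4, 8 };

var removed = new List<int>();
var added   = new List<int>();
var common  = new List<int>();

int i = 0, j = 0;
while (i < ids1.Count && j < ids2.Count)
{
    if (ids1[i] == ids2[j])     { common.Add(ids1[i]); i++; j++; }
    else if (ids1[i] < ids2[j]) { removed.Add(ids1[i]); i++; } // only in seq 1
    else                        { added.Add(ids2[j]); j++; }   // only in seq 2
}
while (i < ids1.Count) removed.Add(ids1[i++]); // tail of seq 1 was removed
while (j < ids2.Count) added.Add(ids2[j++]);   // tail of seq 2 was added
// One O(n + m) sweep yields removed, added, and common in a single pass.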

Database technology choice based on a use case

I am creating an API Limiter, and I am having issues deciding on what system to use for data storage.
It is clear that I am going to need volatile storage plus persistent storage.
On the volatile I want to store a key-value like this:
read:14522145 100
read:99885669 16
read:78951585 100
This is a key composed of: {action}:{client} and an integer value (available credits).
On the persistent, I want to keep a record of all resource outages.
The algorithm (pseudo-code) is pretty simple:
MAX_AMOUNT = 100

call(action, client, cost) {
    key = action + ":" + client
    if (volatileStorage.hasKey(key)) {
        value = volatileStorage.getValue(key)
        if (value >= cost) {
            volatileStorage.setValue(key, value - cost)
            return true
        } else {
            persistentStorage.logOutage(action, client, cost)
            return false
        }
    } else {
        volatileStorage.setValue(key, MAX_AMOUNT)
        return call(action, client, cost)
    }
}
There is a parallel process that runs every N seconds for each method, that increases all keys {action}:* by M, up to O.
Additionally, I want to remove from the volatile store all items older (not modified since) than P seconds.
So basically every action is action<N, M, O, P>. For instance, reading users is increased every 1 second, by 5 points, up to 100, and removed after 60 seconds of inactivity: read_users<1, 5, 100, 60>.
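To make the refill process concrete, here is a sketch of read_users<1, 5, 100, 60> against an in-memory stand-in for the volatile storage (a real system would use Redis or similar; all names here are hypothetical, and it is written in C# only for illustration):
using System;
using System.Collections.Concurrent;
using System.Threading;

var credits = new ConcurrentDictionary<string, (int Value, DateTime Touched)>();

var refill = new Timer(_ =>
{
    foreach (var entry in credits)
    {
        if (!entry.Key.StartsWith("read_users:")) continue;

        // P: drop items not modified for 60 seconds.
        if ((DateTime.UtcNow - entry.Value.Touched).TotalSeconds > 60)
        {
            credits.TryRemove(entry.Key, out _);
            continue;
        }
        // M, O: add 5 credits, capped at 100; only succeeds if unchanged since read.
        credits.TryUpdate(entry.Key,
            (Math.Min(entry.Value.Value + 5, 100), entry.Value.Touched),
            entry.Value);
    }
}, null, TimeSpan.Zero, TimeSpan.FromSeconds(1)); // N: run every second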
So I need a volatile storage that:
Reads really quickly, without consuming too many resources (what's the point of rejecting a call if doing so is more expensive than the call itself).
Allows TTL on items.
Can, with good performance, increase all keys matching a pattern (read_users:*) without exceeding a defined limit.
and a persistent storage that:
Is also quick.
Can handle loads of records.
Any advice is welcome.
This isn't an answer but an opinion: there are existing rate limiters that you would be better off using instead of making your own. Getting it right is tricky, so adopting a production-proven implementation is not only easier but also safer.
For example, the Generic cell rate algorithm is nothing short of plain magic and has several Redis implementations, including:
As a Ruby gem (that uses server-side Lua): https://github.com/rwz/redis-gcra
As a (v4) module: https://github.com/brandur/redis-cell/
Of course, there are many more Redis-based rate limiters - use Google to find them ;)

Detecting concurrent data modification of document between read and write

I'm interested in a scenario where a document is fetched from the database, some computations are run based on some external conditions, one of the fields of the document gets updated and then the document gets saved, all in a system that might have concurrent threads accessing the DB.
To make it easier to understand, here's a very simplistic example. Suppose I have the following document:
{
    ...
    items_average: 1234,
    last_10_items: [10, 2187, 2133, ...]
    ...
}
Suppose a new item (X) comes in; five things need to be done:
read the document from the DB
remove the first (oldest) item in the last_10_items
add X to the end of the array
re-compute the average* and save it in items_average.
write the document to the DB
* NOTE: the average computation was chosen as a very simple example, but the question should take into account more complex operations based on data existing in the document and on new data (i.e. not something solvable with the $inc operator)
This is certainly easy to implement in a single-threaded system, but in a concurrent system, if two threads follow the above steps, inconsistencies can occur: both will update last_10_items and items_average without seeing, and therefore overwriting, each other's concurrent changes.
So, my question is how can such a scenario be handled? Is there a way to check or react-upon the fact that the underlying document was changed between steps 1 and 5? Is there such a thing as WATCH from redis or 'Concurrent Modification Error' from relational DBs?
Thanks
Database systems handle this with a memory-inspection-and-rollback scheme similar to transactional memory.
Briefly speaking, the system monitors the shared memory regions you specify and applies something like compare-and-swap, load-link/store-conditional, or test-and-set.
Therefore, if any monitored memory content is changed during the transaction, it aborts and retries until there is no conflicting operation on that shared memory.
For example, GCC implements the following atomic builtins:
https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
type __sync_lock_test_and_set (type *ptr, type value, ...)
type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
For more info about transactional memory,
http://en.wikipedia.org/wiki/Software_transactional_memory
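For a concrete picture of that retry loop, here is the same compare-and-swap idea sketched in C# (the class and field names are hypothetical; against a real document store you would compare a version field in the update query rather than a memory word):
using System;
using System.Threading;

class CasExample
{
    private int _sharedAverage = 1234; // stand-in for the contended field

    public void Update(Func<int, int> compute)
    {
        while (true)
        {
            int seen = Volatile.Read(ref _sharedAverage);
            int next = compute(seen);
            // The write succeeds only if no other thread changed the value
            // between our read and this point; otherwise we retry.
            if (Interlocked.CompareExchange(ref _sharedAverage, next, seen) == seen)
                return;
        }
    }
}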

Best way to store lots of unstructured data in a multithreaded iOS app with memory constraints?

I've gone around in circles with this problem in my app over several months. I've tried many home-brewed solutions and will explain what I have working here, but I hope someone can suggest a better solution I missed.
The basic problem is this: I have (potentially) thousands of Items that need to be accessed by my app at any time. An NSMutableDictionary would normally be my first approach to represent each Item, since each Item might have anywhere from a few to hundreds of properties. But the rest of the requirements make things hairy:
Each Item might be read from or written to by any thread
Each Item needs to be stored to disk so that it can be retrieved between sessions
There are (potentially) so many items (and so much data) that to have them ALL in memory at once could cause memory issues
I wanted to use CoreData because Apple likes it so much, but I ran into lots of problems. Each Item does not have a definitive structure, so there is no good way to structure the data model. Furthermore, querying for data caused the single .sqlite file to act as a bottleneck, meaning that the wait times (lag) got absurd very quickly when many threads were trying to retrieve items at once.
I have a working solution, but it has problems. Here's a chunk of the code, and I'll explain what it does below
- (NSObject*) getValue:(NSString*)key {
    @synchronized(self) {
        if (!_cached_obj) { // private variable in this object
            _cached_obj = [self loadFromDisk]; // simply loads the NSDictionary from a file
        }
        _last_access = time(nil); // don't release for a while
        return [_cached_obj valueForKey:key];
    }
}

- (void) setValue:(NSObject*)value forKey:(NSString*)key {
    @synchronized(self) {
        [self getValue:key]; // ensures the cache is active
        [_cached_obj setValue:value forKey:key];
        _needs_save = YES;
    }
}

- (void) clean {
    if (!_cached_obj)
        return;
    @synchronized(self) {
        if (_needs_save) {
            [self writeToFile]; // writes the cache obj to a file
            _needs_save = NO;
        }
        NSTimeInterval elapsed = time(nil) - _last_access;
        if (elapsed > 20) {
            [_cached_obj release];
            _cached_obj = nil;
        }
    }
}
When I need the data from an Item, the getValue function is called. It tries to use a cached object (NSMutableDictionary). If the cached object is nil, it loads the object from disk before returning.
The setValue function works as expected, but also sets a save flag
The "clean" function is run on a 10s timer by a background thread. This takes care of saving the Item to disk and uncaching data in order to conserve memory.
What I don't like about my approach is that there's a LOT of waiting on semaphores due to my use of @synchronized. Occasionally, this also means that the main thread blocks while it waits for disk reads/writes, which is painful.
Is there a better data structure or storage mechanism I'm missing?
Thanks!
EDIT: More information:
The speed at which the getValue function returns is also of very high importance, even when it is not blocking the main thread. For example, consider the scenario where I am searching through 10k Items on the background thread. I will need to get a single value from each of the 10k objects once. With my current mechanism it works, but loading each non-cached object from disk is time-consuming, and it ends up taking ~20 sec on my iPhone 4. I understand that this might just be "a price I have to pay." But perhaps storing the data in smaller chunks could help? E.g., don't store an entire Item as one dictionary, but as a collection of distinct objects.
As I understand it, you profiled your app and the profiles show that the @synchronized blocks are the biggest performance bottleneck. Right?
Well, I'm not overly surprised: you read and write your files while holding the mutex, as you pointed out. Moreover, you allow only one thread at a time, while you could easily allow many readers or one writer to access your cache.
Identified locking operations:
get value -> get value in cache, get value on disk if not in cache, put value in cache
set value -> get value in cache, get value on disk if not in cache, put value in cache, put new value in cache
clean -> save cache, empty cache
So then, the basic operations are:
get value in cache
get value on disk
put a value in cache
save cache
empty cache
It's pretty easy to determine the concurrency of these simple operations, and then to rework your locks to ensure everything works nicely with each other.
You can allow many readers or one writer to access the cache. One thread can read from (or write to) disk without having to lock the cache; the value read from disk is then installed into the cache as a writer later on. So: one read-write lock for the cache, and a mutex for the file. The set-value sequence is also a bit puzzling: I do not see the point of reading the old value from the file only to replace it immediately. If you need the cache data structures to be ready, just ensure they are without triggering a file operation. A sketch of that locking split follows.
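Here is a minimal sketch of the split (written in C# since the pattern is platform-agnostic; on iOS the equivalents would be a pthread rwlock or GCD with a barrier, and LoadFromDisk is a hypothetical stand-in for the poster's loadFromDisk):
using System.Collections.Generic;
using System.Threading;

class ItemCache
{
    private readonly ReaderWriterLockSlim _cacheLock = new ReaderWriterLockSlim();
    private readonly object _fileLock = new object();
    private Dictionary<string, object> _cache; // stand-in for _cached_obj

    public object GetValue(string key)
    {
        _cacheLock.EnterReadLock(); // many readers can hold this concurrently
        try
        {
            if (_cache != null && _cache.TryGetValue(key, out var hit))
                return hit;
        }
        finally { _cacheLock.ExitReadLock(); }

        Dictionary<string, object> loaded;
        lock (_fileLock) // disk I/O never blocks cache readers
        {
            loaded = LoadFromDisk(); // hypothetical file read
        }

        _cacheLock.EnterWriteLock(); // a single writer installs the result
        try
        {
            _cache = loaded;
            return _cache.TryGetValue(key, out var value) ? value : null;
        }
        finally { _cacheLock.ExitWriteLock(); }
    }

    private Dictionary<string, object> LoadFromDisk()
        => new Dictionary<string, object>(); // placeholder for real file I/O
}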
All of this can also be implemented using GCD, avoiding most of the locks if not all.
There is plenty of room to reduce the contention without introducing a lot of complexity or changing the app's threading model. GCD offers even more opportunities, I think, but you'll have to think in terms of queues and operations instead of threads, which is not always easy at first sight.
I won't say that reworking the locks will be enough; you may also have to improve how the data are read from and saved to disk. But start with the locks. You may be surprised.

Should I store an array or individual items in Memcache?

Right now we are storing some query results in Memcache. After investigating a bit more, I've seen that many people save each individual item in Memcache instead. The benefit of doing this is that they can then get these items from Memcache on any other request.
Store an array
$key = 'page.items.20';
if (!($results = $memcache->get($key)))
{
    $results = $con->execute('SELECT * FROM table LEFT JOIN .... LIMIT 0,20')->fetchAll();
    $memcache->save($results, $key, 3600);
}
...
...
PROS:
Easier
CONS:
If I change an individual item, I have to delete all caches (it can be a pain)
I can have duplicated results (the same item on different queries)
vs
Store each item
$key = 'page.items.20';
if (!($results_ids = $memcache->get($key)))
{
    $results = $con->execute('SELECT * FROM table LEFT JOIN .... LIMIT 0,20')->fetchAll();
    $results_ids = array();
    foreach ($results as $result)
    {
        $results_ids[] = $result['id'];
        // if it doesn't exist, save the individual item
        $memcache->add($result, 'item'.$result['id'], 3600);
    }
    // save results_ids
    $memcache->save($results_ids, $key, 3600);
}
else
{
    $results = $memcache->multi_get($results_ids);
    // get elements which are not cached
    ...
}
...
PROS:
I don't have the same item stored twice on Memcache
Easier to invalidate results on several queries (just the item we change)
CONS:
More complicated business logic.
What do you think? Any other PROS or CONS on each way?
Some links
Post explaining the second method in Memcached list
Thread in Memcached Group
Grab stats and try to calculate a hit ratio or possible improvement if you cache the full query vs doing individual item grabs in MC. Profiling this kind of code is also very helpful to actually see how your theory applies.
It depends on what the query does. If you have a set of users and then want to grab the "top 10 music affinity" among some of those friends, it is worth having both cached:
- each friend (in fact, each user of the site)
- the top 10 query for each user (space is cheaper than CPU time)
But in general it is worth storing in MC all individual entities that are going to be used frequently (either in the same code execution, in subsequent requests, or by other users). Then, for CPU- or resource-heavy queries and data processing, either cache them in MC as well or delegate them to async jobs instead of computing them in real time (e.g. a "Top 10 site users" list doesn't need to be real-time; it can be updated hourly or daily).
And of course take into account that if you store individual entities in MC, you have to remove all referential integrity from the DB to be able to reuse them either individually or in groups. A sketch of the per-entity flow follows.
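To make the per-entity pattern concrete, here is a hedged sketch of the multi-get-plus-backfill flow (written in C# for illustration, though the flow is identical in PHP; ICacheClient, GetMulti, Set, and loadFromDb are hypothetical stand-ins, not a specific memcached library's API):
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface standing in for a memcached client.
public interface ICacheClient
{
    IDictionary<string, object> GetMulti(IEnumerable<string> keys);
    void Set(string key, object value, TimeSpan ttl);
}

public class Item { public int Id; }

public class ItemRepository
{
    private readonly ICacheClient _cache;
    public ItemRepository(ICacheClient cache) { _cache = cache; }

    public IList<Item> GetItems(IList<int> ids, Func<IList<int>, IList<Item>> loadFromDb)
    {
        var cached = _cache.GetMulti(ids.Select(id => "item" + id));

        // Fetch only the misses from the DB in one query, then backfill MC.
        var missing = ids.Where(id => !cached.ContainsKey("item" + id)).ToList();
        foreach (var item in loadFromDb(missing))
        {
            cached["item" + item.Id] = item;
            _cache.Set("item" + item.Id, item, TimeSpan.FromHours(1));
        }
        return ids.Select(id => (Item)cached["item" + id]).ToList();
    }
}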
The question is subjective and argumentative...
This depends on your usage pattern. If you're constantly pulling individual nodes by ID, store each one separately.
Also, note that in either case, storing the list isn't all that useful except for the top 20. If you insert/update/delete a node in such a way that the top-20 is no longer valid, you may end up needing to flush the next 20, and so on.
Lastly, keep in mind that it's a cache. If you're using a cache, you're making the underlying statement that it's no big deal if the data you're outputting is slightly stale.
Memcached stores data in chunks of specific sizes, as explained better in the link below.
http://code.google.com/p/memcached/wiki/NewUserInternals
If your data items in memcached are large, there will be fewer of the larger-size chunks, and the least-recently-used algorithm can therefore push data out even if there is space available in the other chunk sizes: the LRU algorithm works within each chunk size.
You can decide which implementation to choose based on the data size distribution in memcached.