Checking if a value exists in SpyMemcached without knowing the key - key-value

SpyMemcached: How to check if a value exists without knowing the key?
Thank you!

Memcached is a cache and only supports set/delete/get type operations. You cannot create indexes on data in memcached or run searches on those indexes. If you want this functionality, you should create an index in your database that maps values to keys.
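As a rough illustration of that side-index idea in Python (assuming a memcached client such as pymemcache and an SQLite table whose name and schema are made up for this sketch):

import hashlib
import sqlite3
from pymemcache.client.base import Client

cache = Client(("localhost", 11211))
db = sqlite3.connect("cache_index.db")
db.execute("CREATE TABLE IF NOT EXISTS value_index (value_hash TEXT PRIMARY KEY, cache_key TEXT)")

def put(key, value):
    # Write to memcached and record a value -> key mapping in the side index.
    cache.set(key, value)
    digest = hashlib.sha1(value.encode()).hexdigest()
    db.execute("INSERT OR REPLACE INTO value_index VALUES (?, ?)", (digest, key))
    db.commit()

def key_for_value(value):
    # Look the value up in the index, then confirm it is still in the cache.
    digest = hashlib.sha1(value.encode()).hexdigest()
    row = db.execute("SELECT cache_key FROM value_index WHERE value_hash = ?", (digest,)).fetchone()
    if row and cache.get(row[0]) is not None:
        return row[0]
    return None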
Another option would be to look at Couchbase. It provides the same interface as memcached and has indexing capabilities. Note, however, that Couchbase is more than just a cache: it also persists all of the data you set into it.

Related

Sorting encrypted fields in MongoDB for a spring boot project

In the project I'm working on, I need to encrypt only specific data fields of a collection. But I also need to run search and sort queries on these fields, which is currently not possible because they are encrypted. When I run a sort query, the sort is applied to the encrypted values rather than the plain text, which gives the wrong result.
Another part of the use case is that the encryption key is configurable per transaction: I am not storing the key in the application but fetching it from a custom KMS for every save/find operation.
In short, I'm looking for a way to perform sort operations on encrypted fields in MongoDB using a configurable key. I found CSFLE as a solution online, but I am not sure whether it covers all the points above.
Can you suggest any technology or library that can fulfill these needs? I'm not looking for code; a documentation link or library name is more than enough.
Thank you

MongoDB supports references from one document to another. Does DynamoDB support the same?

MongoDB supports references from one document to another.
Source: https://docs.mongodb.com/manual/core/data-modeling-introduction/
Does DynamoDB also support this feature?
MongoDB references are not hard bound. Unlike a foreign key constraint, they allow you to delete a parent document even if other documents still reference it. A manual reference is something you maintain and use from the application. MongoDB also provides DBRefs, which are a more formal kind of reference, but they are strongly advised against because they limit the benefits of MongoDB.
As far as manual references go, there is no reason why you can't use them in DynamoDB: the id of an item in one table can be stored as an attribute of an item in another table. Again, it won't be binding, and I don't think there is any hard-bound reference system in DynamoDB.
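A minimal sketch of such a manual reference using the Python SDK (boto3); the table and attribute names are made up for the example, and nothing enforces the link:

import boto3

dynamodb = boto3.resource("dynamodb")
authors = dynamodb.Table("Authors")
books = dynamodb.Table("Books")

# Store the parent item, then reference its id from the child item.
authors.put_item(Item={"author_id": "a1", "name": "Jane Doe"})
books.put_item(Item={"book_id": "b1", "title": "Example Book", "author_id": "a1"})

# The application resolves the reference itself; deleting the author
# would leave the book's author_id dangling.
book = books.get_item(Key={"book_id": "b1"})["Item"]
author = authors.get_item(Key={"author_id": book["author_id"]})["Item"]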
MongoDB provides ObjectId as a built-in way to generate ids, though you can use other types in the _id field as well. In DynamoDB you are more likely to choose an id type yourself, though you can also use generated UUIDs. But since you mostly query DynamoDB on the primary key, a random UUID won't be that helpful.

Can I replace the storage engine of MongoDB with a key-value store?

MongoDB is easy to start with, but it is not easy to ensure availability (buy EC2 instances to build a master-slave setup? or a bigger replica set?). And there are many public key-value services (Dynamo, Azure Table) with high availability and good performance. So if I could replace the MongoDB storage engine with something like Dynamo, I would get the friendly MongoDB API plus highly available storage. Is that possible?
Actually, it is pretty easy to use MongoDB as an in-memory key/value store:
Approach 1: Prevent disk usage
Disable journaling
Set syncPeriodSecs to 0
Store the key in the _id field and the value in a value field.
Now you have an all in-memory key/value store.
The problem with this approach is that you are limited to your machine's RAM size and your data does not persist.
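For reference, a minimal sketch of what those two settings could look like in a mongod YAML config file (the dbPath is a placeholder, and note that journaling can no longer be disabled in recent MongoDB releases):

# mongod.conf - sketch only
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: false     # disable journaling
  syncPeriodSecs: 0    # do not flush data files to disk in the background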
A far more elegant approach is to use
Approach 2: Using covered queries
A covered query is a query that is answered using only an index, which is kept in memory as long as enough RAM is available. If the index exceeds the RAM, the least recently used parts are "swapped" to disk. Furthermore, your key/value pairs are persisted, and all of that happens transparently.
Create an index over both the keys and the values:
db.keyValueCollection.createIndex({_id:1,value:1})
Make sure MongoDB knows that only _id and value are needed, so it doesn't have to bother reading anything else from disk (flexible schema!), by using a projection to limit the fields returned by your query to those two fields:
db.keyValueCollection.find({_id:"foo"},{_id:1,value:1})
Done! Your queries will now be answered from RAM, overflow will be automatically handled and your values will persist.
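The same pattern from a driver, as a rough sketch in Python with pymongo (the collection name matches the shell example above; the explain call is only a sanity check):

from pymongo import MongoClient

coll = MongoClient()["test"]["keyValueCollection"]
coll.create_index([("_id", 1), ("value", 1)])

# Upsert a key/value pair: key in _id, value in value.
coll.update_one({"_id": "foo"}, {"$set": {"value": "bar"}}, upsert=True)

# Read with a projection limited to indexed fields so the query can be covered.
doc = coll.find_one({"_id": "foo"}, {"_id": 1, "value": 1})

# Optional: inspect the plan; if the read is covered, totalDocsExamined is 0.
plan = coll.database.command({
    "explain": {"find": "keyValueCollection",
                "filter": {"_id": "foo"},
                "projection": {"_id": 1, "value": 1}},
    "verbosity": "executionStats"})
print(plan["executionStats"]["totalDocsExamined"])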
Side note
Ensuring availability with MongoDB is close to foolproof: choose a replica set name, add it to all config files, fire up one of the members, initiate the replica set, and add the members. Problem solved.
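A rough sketch of that initiation step from Python, assuming three mongod processes already started with the same --replSet name (the host names and the set name rs0 are placeholders):

from pymongo import MongoClient

# Connect directly to one member that was started with --replSet rs0.
client = MongoClient("mongodb://host1:27017/?directConnection=true")

client.admin.command("replSetInitiate", {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "host1:27017"},
        {"_id": 1, "host": "host2:27017"},
        {"_id": 2, "host": "host3:27017"},
    ],
})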
Edit: removed an artificial key field, and made sure the query in approach 2 is covered by limiting the fields returned.

mongodb shard key hash algorithm

I'm unable to find documentation about the algorithm that MongoDB uses for collection or shard keys.
Can anyone help with this or post a reference?
If you are more interested in how indexing works in general, check this presentation about the internals: http://www.mongodb.com/presentations/storage-engine-internals or this one: http://www.mongodb.com/presentations/mongodbs-storage-engine-bit-bit
Since an individual shard does not know much about the overall structure of the cluster, it uses the same indexing algorithm internally; there is just a metadata layer on top which knows which part of the data belongs to which shard.
There are some special cases, which are described in these docs: http://docs.mongodb.org/manual/core/indexes/
What is not covered this way in the presentations above are the geospatial indexes and the special case of the hashed index (DOCS). A hashed index can also be used as a shard key, in which case the sharding is hash-based sharding; check THIS and THIS.
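As a rough illustration of hash-based sharding from a driver, assuming a running sharded cluster and a connection through mongos (the database, collection, and field names are placeholders):

from pymongo import HASHED, MongoClient

client = MongoClient("mongodb://mongos-host:27017")

# A hashed index on its own behaves like any other secondary index.
client["mydb"]["users"].create_index([("user_id", HASHED)])

# Using the hashed field as the shard key gives hash-based sharding.
client.admin.command("enableSharding", "mydb")
client.admin.command("shardCollection", "mydb.users", key={"user_id": "hashed"})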
The hashing algorithm used for this is MD5, used in this file:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/hasher.cpp
and implemented here:
https://github.com/mongodb/mongo/blob/master/src/mongo/util/md5.cpp
Currently it works only with an individual field as the shard key, at least as far as can be read from the comments in the https://github.com/mongodb/mongo/blob/master/src/mongo/db/index/hash_access_method.cpp source file.
The official doc about shard keys is
http://docs.mongodb.org/manual/core/sharded-clusters/
If by 'algorithm' you mean how the cluster operates, you can get help through:
http://docs.mongodb.org/manual/core/sharded-cluster-operations/
As of version 4.0 you can use convertShardKeyToHashed to convert a key to its hash value.
From this ref, you can browse the source code and read its implementation.

Question about doing in-place updates in Basho Riak

Currently I use MongoDB for recording statistics and ad serving. I log raw impressions to a log collection, and worker processes do findAndModify to pull entries off the log and aggregate them into a precomputed collection using upserts (similar to how Rainbird works at Twitter).
http://techcrunch.com/2011/02/04/twitter-rainbird/
I aggregate on the parent, child, child's child, etc., which makes querying for statistics fast and painless.
In Mongo I use a key consisting of {Item_id, Hour} and upsert to it (a lot).
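For reference, that upsert pattern looks roughly like this in Python with pymongo (the collection and field names are made up for the sketch):

from datetime import datetime, timezone
from pymongo import MongoClient

stats = MongoClient()["adserver"]["hourly_stats"]

def record_impression(item_id):
    hour = datetime.now(timezone.utc).replace(minute=0, second=0, microsecond=0)
    # Upsert keyed on {item_id, hour}: creates the bucket on the first hit,
    # then increments the counter on every later hit.
    stats.update_one(
        {"item_id": item_id, "hour": hour},
        {"$inc": {"impressions": 1}},
        upsert=True,
    )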
I was wondering if Riak had a strong way to solve the same problem, and how I would implement it.
Short answer: I don't think Riak supports upsert-like operations.
Long answer: Riak is a key-value store which treats stored values as opaque data. But in the future Riak could consider adding support for HTTP PATCH, which might make it possible to support operations similar to upsert. There is another category of operations (compare-and-set) which would also be interesting, but supporting those is definitely much more complicated.
The way this works with Riak depends on the storage backend that you're using for Riak.
Bitcask, the current default storage backend, uses a log-structured hash table as its internal storage mechanism. When you write a new record to Riak, an entirely new copy of your data is stored on disk. Eventually, compaction of the bitcask will occur and the old copies of your data will be removed from the bitcask file.
Any put into Riak is effectively an upsert - if the data doesn't exist, a new record will be inserted. Otherwise, the existing value will be updated by expiring the old value and making the newest value the current value.
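Since there is no atomic upsert, the closest client-side pattern is a fetch-modify-store cycle. A rough sketch with the official Riak Python client (the bucket and field names are made up, and this is not safe against concurrent writers without sibling/conflict resolution):

import riak

client = riak.RiakClient(protocol="pbc", host="127.0.0.1", pb_port=8087)
bucket = client.bucket("hourly_stats")

def record_impression(item_id, hour):
    key = "%s:%s" % (item_id, hour)
    obj = bucket.get(key)
    if obj.exists:
        obj.data["impressions"] += 1                      # update branch
    else:
        obj = bucket.new(key, data={"impressions": 1})    # insert branch
    obj.store()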