I need to model data in Redis where the keys are file paths, and each path has N key/value pairs associated with it.
I am currently modeling the data using Hashes.
HSET /products/clothes/pants/501-jeans "title" "Levi 501 Jeans"
This works fine; however, I want to be able to get a list of all the "children" of a "pants" path. I can do this via
KEYS /products/clothes/pants/*
However, the Redis documentation states that KEYS with a pattern should not be used in production.
I was thinking of creating a SET of all "paths" associated with hashes, but I'm still not sure how/if I can search over those.
Thoughts on how to best model this type of data?
Here is how I've modeled the data, and it seems to work fairly well.
resources::/path/foo => Redis Hash of Resource data
resources::/path/foo/bar => Redis Hash of Resource data
resources::/path/foo/cat => Redis Hash of Resource data
resources::/path/foo/cat/dog => Redis Hash of Resource data
children::/path/foo => Redis Set [ /path/foo/bar, /path/foo/cat ]
children::/path/foo/cat => Redis Set [ /path/foo/cat/dog ]
I have to manage the children:: sets any time I add or remove a resources:: key.
I chose to model children:: using Redis Sets so it's impossible to have duplicate path keys, though a List could work as well.
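For what it's worth, here is a minimal sketch of that bookkeeping in Python with redis-py (the resources::/children:: key names mirror the layout above; the helper functions are my own, not anything Redis provides):

import redis

r = redis.Redis(decode_responses=True)

def parent_of(path):
    # "/path/foo/bar" -> "/path/foo"; a top-level path has no parent
    head, _, _ = path.rpartition("/")
    return head or None

def add_resource(path, fields):
    # store the resource hash and register the path in its parent's children set
    r.hset("resources::" + path, mapping=fields)
    parent = parent_of(path)
    if parent:
        r.sadd("children::" + parent, path)

def remove_resource(path):
    # drop the hash and unregister the path from its parent's children set
    r.delete("resources::" + path)
    parent = parent_of(path)
    if parent:
        r.srem("children::" + parent, path)

def children(path):
    # SMEMBERS replaces the unbounded KEYS pattern scan
    return r.smembers("children::" + path)

A lookup like children("/products/clothes/pants") then costs only the size of that one set, rather than a scan of the whole keyspace.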
I am building a simple API service using Ruby on Rails. In production, I would like to integrate Redis/Memcached in order to cache some frequently-used endpoints with key-based caching. For example, I have a Car table with name and color fields.
My question is, what is the best way to define a cache key for a particular endpoint (e.g. /cars) when the resource has a variety of params that could come in different orders? e.g. /cars?name=honda&color=white, /cars?color=white&name=honda.
If I use the request URL as the cache key, I will have two different cache records, but technically speaking, if both name and color have the same values, there should be only one cache record in the Redis database.
Arrange the parameters in alphabetical order and use that as the basis for a cache key.
/cars?name=honda&color=white
/cars?color=white&name=honda
In both cases the cache key would be based on the parameters listed in alphabetical order by name, values included:
color=white&name=honda
So both of the reordered URLs above resolve to the same cache key.
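Here's a sketch of that normalization in Python (the question is Rails, but the idea is the same in any language; the cache_key name and cache: prefix are just illustrative):

from urllib.parse import urlsplit, parse_qsl

def cache_key(url):
    # sort query parameters by name so equivalent URLs collapse to one key
    parts = urlsplit(url)
    query = "&".join(f"{k}={v}" for k, v in sorted(parse_qsl(parts.query)))
    return f"cache:{parts.path}?{query}"

# both orderings yield cache:/cars?color=white&name=honda
assert cache_key("/cars?name=honda&color=white") == cache_key("/cars?color=white&name=honda")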
I have a Memcached backend, and I want to add Redis to store metadata about the Memcached keys.
The metadata is as follows:
Miss_count: The number of times the data was not present in Memcached.
Hash_value: The hash value of the data corresponding to the key in Memcached.
Data in memcache : key1 ::: Data
Meta data (miss count) : key1_miss ::: 10
Meta data (hash value) : key1_hash ::: hash(Data)
Please advise which data store is preferable. When I store the metadata in Memcached itself, it is evicted well before its expiry time, because the metadata is small and slab allocation assigns it a small memory chunk.
As the metadata grows over time, Redis's compact small-hash (ziplist) encoding will eventually stop applying, so client-side logic would be needed to keep each hash within the configured ziplist limits.
If I understand your use case correctly, I suspect Redis might be a good choice. Assuming you'll be periodically updating the miss counts associated with the various keys over time, you'd probably want to use Redis sorted sets. For example, if you wanted the miss counts stored in a sorted set called "misscounts", the Redis command to add and to update those counts would be one and the same:
zadd misscounts misscount key1
... because zadd adds the entry if one doesn't already exist or overwrites an existing entry if it does. If you have a hook into the process that fires each time a miss occurs, you could instead use:
zincrby misscounts 1 key1
Similar to the zadd command's behavior, zincrby will create a new entry (using the increment as the initial count) if one doesn't exist, or increment the existing count by the amount you pass if an entry does exist.
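A short redis-py sketch of both approaches (key names follow the example above; this assumes redis-py 3.x, where zadd takes a mapping and zincrby takes the amount before the member):

import redis

r = redis.Redis()

# seed/overwrite the miss count for key1 (ZADD replaces any existing score)
r.zadd("misscounts", {"key1": 10})

# or, from a hook that fires on each miss, bump the count atomically
r.zincrby("misscounts", 1, "key1")

# the hash values could live in plain string keys alongside
r.set("hash::key1", "0x9f2c")  # hash(Data); the value here is made up

# top 10 most-missed keys, highest count first
print(r.zrevrange("misscounts", 0, 9, withscores=True))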
Complete documentation of the Redis commands can be found at https://redis.io/commands, and the different Redis data types are described at https://redis.io/topics/data-types.
Oh, and a final note. In my experience, Redis is THE SHIT. Sorry to curse (in caps), but there's simply no other way to do Redis justice. We call our Redis server "honey badger", because when load starts increasing and our other servers start auto-scaling, honey badger just don't give a shit.
I've installed Riak 1.0.2 on Ubuntu Natty.
I have also added some sample data to the database. I'm using a LevelDB backend because I want to test the Secondary Indexing functionality.
I added a test_1 bucket, and to that bucket I added the following records.
array("name" => "Ray_6", "age" => rand(25, 90), "email" => "addr_6#orbican.com") with key "id_1"
array("name" => "Ray_7", "age" => rand(25, 90), "email" => "addr_7#orbican.com") with key "id_2"
array("name" => "Ray_8", "age" => rand(25, 90), "email" => "addr_8#orbican.com") with key "id_3"
I'm trying to use the Search feature to query this data. Below is the cURL request that I enter on the command line:
curl http://localhost:8098/solr/test_1/select?q=name:Ray_6
But when I do this, I get a "not found" error.
Is there something I'm missing? Am I supposed to do something to the bucket to make it searchable?
I'd appreciate some assistance.
Thanks in advance.
Well, firstly, the above URL is using Riak Search, not secondary indexes. The URL to query a secondary index has the form:
/buckets/<bucket>/index/<fieldname_bin>/query
You form a secondary index by adding metadata headers when creating a record through the cURL interface. Client libraries for different languages will generate this for you.
Back to your specific question, though: did you use the search-cmd tool to install an index for the test_1 bucket? If you did, did you have data in the bucket before doing so? Riak Search will not retroactively index your data. There are a few ways to reindex it, but they are all time-consuming if this is just an experimental app.
If you don't have much data, I suggest you re-enter it after setting up the index. Otherwise, you need to add a secondary index to each piece of data, or run it through the search API, as you read/write it. It'll take time, but it's what's available in Riak right now.
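For illustration, here is roughly what writing a record with a secondary index and then querying it looks like over Riak's HTTP interface, sketched in Python with requests (bucket, key, and field names follow the example above; the x-riak-index-*_bin header is how the HTTP API attaches 2i metadata):

import json
import requests

base = "http://localhost:8098"

# PUT the record with an x-riak-index-* header so Riak indexes the field
record = {"name": "Ray_6", "age": 32, "email": "addr_6@orbican.com"}
requests.put(
    base + "/buckets/test_1/keys/id_1",
    data=json.dumps(record),
    headers={
        "Content-Type": "application/json",
        "x-riak-index-name_bin": "Ray_6",  # _bin = binary (string) index
    },
)

# query the index: /buckets/<bucket>/index/<fieldname_bin>/<value>
resp = requests.get(base + "/buckets/test_1/index/name_bin/Ray_6")
print(resp.json())  # expected: {"keys": ["id_1"]}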
Hope this helps.
I found a question, which is: implement a key-value server.
A user should be able to connect to the server and run the command SET a = b.
On then running the command GET a, it should print b.
First of all, I didn't really understand what the question is all about.
In its simplest form, a Key-Value server is nothing more than a server that holds keys in a dictionary structure and associates a value with each key.
If it helps, you can think of a key as a variable name in a programming language or as an environment variable in the bash shell.
A client to the Key-Value server would either tell the server what value the key has, or request the current value of the key from the server.
As Ramon mentioned in his comment, memcached.org is one such example of a Key-Value server.
Of course, the server can be much more complex than what I described above. Values could be more than just simple strings (for instance, objects), and the server/client could have a lot more functionality than the basic set/get.
Note that the term Key-Value server is very broad and doesn't mean anything concrete by itself. NoSQL systems make use of key-value stores, for example, so you could technically call any NoSQL database system a Key-Value server.
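To make the idea concrete, here is a toy key-value server in Python that speaks the SET a = b / GET a protocol from the question over a plain TCP socket (purely illustrative; a real server would need concurrency, error handling, and persistence):

import socketserver

store = {}  # the in-memory dictionary that backs the server

class KVHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for raw in self.rfile:  # one command per line
            parts = raw.decode().strip().split()
            if len(parts) == 4 and parts[0].upper() == "SET" and parts[2] == "=":
                store[parts[1]] = parts[3]            # SET a = b
                self.wfile.write(b"OK\n")
            elif len(parts) == 2 and parts[0].upper() == "GET":
                value = store.get(parts[1], "(nil)")  # GET a -> b
                self.wfile.write(value.encode() + b"\n")
            else:
                self.wfile.write(b"ERR unknown command\n")

if __name__ == "__main__":
    with socketserver.TCPServer(("127.0.0.1", 4000), KVHandler) as server:
        server.serve_forever()

You can exercise it with telnet 127.0.0.1 4000, typing SET a = b and then GET a.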
http://www.infoq.com/presentations/newport-evolving-key-value-programming-model is a video about KV stores, and its whole premise is that Redis promotes a column-based style, storing the attributes of an object under separate keys rather than serializing the object and storing it under a single key.
(This question is not redis-specific, but more a general style and best practice for KV stores in general.)
Instead of a blob for, say, a 'person', Redis encourages a column-based style where the attributes of an object are stored as separate keys, e.g.
R.set("U:123:firstname","Billy")
R.set("U:123:surname","Newport")
...
I am curious if this is best practice, and if people take different approaches.
E.g. you could 'pickle' an object under a single key. This has the advantage of being fetched or set in a single request.
Or a person could be a list, with the first item being a field-name index or some such?
This got me thinking - I'd like a hierarchical key store, e.g.
R.set(["U:123","firstname"],"Billy")
R.set(["U:123","surname"],"Newport")
R.get(["U:123"]) returns [("firstname","Billy"),("surname","Newport")]
And then to add in transactions:
with(R.get(["U:132"]) as user):
user.set("firstname","Paul")
user.set("lastname","Simon")
From a scaling perspective, the batching of gets and sets is going to be important, right?
Are there key stores that do have support for this or have other applicable approaches?
You can get similar behavior in Redis by using an extra Set to keep track of the individual members of your object.
SET U:123:firstname Billy
SADD U:123:members firstname
SET U:123:surname Cobin
SADD U:123:members surname
GET U:123:firstname => Billy
GET U:123:surname => Cobin
SORT U:123:members ALPHA GET U:123:* -> [Billy, Cobin]
or
SMEMBERS U:123:members -> [firstname, surname]
MGET U:123:firstname U:123:surname
Not a perfect match, but good enough in many situations. There's an interesting article about how hurl uses this pattern with Redis.
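For completeness, the same pattern as a redis-py sketch (key names follow the example above; set_field/get_object are my own helper names):

import redis

r = redis.Redis(decode_responses=True)

def set_field(obj_key, field, value):
    # write the column and record the field name in the members set
    r.set(f"{obj_key}:{field}", value)
    r.sadd(f"{obj_key}:members", field)

def get_object(obj_key):
    # read the members set, then fetch every column in one MGET round trip
    fields = sorted(r.smembers(f"{obj_key}:members"))
    values = r.mget([f"{obj_key}:{f}" for f in fields])
    return dict(zip(fields, values))

set_field("U:123", "firstname", "Billy")
set_field("U:123", "surname", "Cobin")
print(get_object("U:123"))  # {'firstname': 'Billy', 'surname': 'Cobin'}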