I found a question, and it is: implement a key-value server.
A user should be able to connect to the server and run the command SET a = b.
On then running the command GET a, the server should print b.
First of all, I didn't really understand what the question is all about.
In its simplest form, a Key-Value server is nothing more than a server that holds keys in a dictionary structure and associates a value with each key.
If it helps, you can think of a key as a variable name in a programming language or as an environment variable in the bash shell.
A client to the Key-Value server would either tell the server what value the key has, or request the current value of the key from the server.
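A minimal sketch of such a server, written here in TypeScript on Node.js (the port number, reply strings, and one-command-per-chunk framing are my assumptions, not part of the question):

```typescript
// Minimal key-value server sketch: "SET a = b" stores b under a,
// "GET a" replies with the stored value.
import * as net from "net";

const store = new Map<string, string>();

const server = net.createServer((socket) => {
  socket.on("data", (chunk) => {
    // Naive framing: assumes each chunk is exactly one command line.
    const line = chunk.toString().trim();
    const set = line.match(/^SET (\S+) = (.+)$/);
    const get = line.match(/^GET (\S+)$/);
    if (set) {
      store.set(set[1], set[2]);
      socket.write("OK\n");
    } else if (get) {
      socket.write((store.get(get[1]) ?? "(nil)") + "\n");
    } else {
      socket.write("ERR unknown command\n");
    }
  });
});

server.listen(4000); // try it with: nc localhost 4000
```

A client is then anything that can open a TCP connection and write those lines (nc, telnet, and so on).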
As Ramon mentioned in his comment, memcached.org is one such example of a Key-Value server.
Of course, the server can be much more complex than what I described above. Values could be more than just simple scalars (objects, for instance), and the server/client could have a lot more functionality than the basic set/get.
Note that the term Key-Value server is very broad and doesn't mean anything concrete by itself. NoSQL systems make use of key-value stores, for example, so you could technically call any NoSQL database system a Key-Value server.
Related
The normal MO for creating items in a database is to let the database control the generation of the primary key (id). That's usually true whether you're using auto-incremented integer ids or UUIDs.
I'm building a client-side app (Angular, but the tech is irrelevant) that I want to build offline behaviour into. To allow offline object creation (and association), I need the client application to generate primary keys for new objects. This is both to allow for associations with other objects created offline and to allow for idempotence (making sure I don't accidentally save the same object to the server twice due to a network issue).
The challenge, though, is what happens when that object gets sent to the server. Do you use a temporary client-side ID which you then replace with the ID that the server subsequently generates, or do you use some sort of ID translation layer between the client and the server? The latter is what Trello did when building their offline functionality.
However, it occurred to me that there may be a third way. I'm using UUIDs for all tables on the back end. And so this made me realise that I could in theory insert a UUID into the back end that was generated on the front end. The whole point of UUIDs is that they're universally unique so the front end doesn't need to know the server state to generate one. In the unlikely event that they do collide then the uniqueness criteria on the server would prevent a duplicate.
Is this a legitimate approach? The risks seem to be 1. collisions and 2. any form of security issue that I haven't anticipated. Collisions seem to be taken care of by the way UUIDs are generated, but I can't tell if there are risks in allowing a client to choose the ID of an inserted object.
Yes, this is fine. Postgres even has a UUID type.
Set the default ID to be a server-generated UUID if the client does not send one.
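A sketch of one way to implement that, assuming an Express-style JSON endpoint (the route and the insertItem helper named in the comment are hypothetical; in Postgres the fallback could equally be a column-level DEFAULT gen_random_uuid()):

```typescript
import express from "express";
import { randomUUID } from "crypto";

const app = express();
app.use(express.json());

app.post("/items", (req, res) => {
  // Use the client-supplied UUID if present, otherwise generate one.
  const id: string = req.body.id ?? randomUUID();
  // insertItem({ id, ...req.body }) would go here; with Postgres, the id
  // column would be of type uuid with a UNIQUE/PRIMARY KEY constraint,
  // which is what catches the astronomically unlikely collision.
  res.status(201).json({ id });
});

app.listen(3000);
```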
Collisions.
UUIDs are designed to not collide.
Any form of security that I haven't anticipated.
Avoid UUIDv1 because...
This involves the MAC address of the computer and a time stamp. Note that UUIDs of this kind reveal the identity of the computer that created the identifier and the time at which it did so, which might make it unsuitable for certain security-sensitive applications.
You can instead use uuid_generate_v1mc which obscures the MAC address.
Avoid UUIDv3 because it uses MD5. Use UUIDv5 instead.
UUIDv4 is simplest: it's a 122-bit random number, and it's built into Postgres (the others live in the commonly available uuid-ossp extension). However, it depends on the strength of each client's random number generator. Even so, a bad UUIDv4 generator is still better than an incrementing integer.
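On the client side, generating a v4 UUID requires no server state at all. A sketch (crypto.randomUUID() is available in modern browsers and in Node 16+; the endpoint is a placeholder), which also gives you the idempotent-retry behaviour the question asks about, since the id is fixed before the first attempt:

```typescript
async function createItem(data: object): Promise<string> {
  const id = crypto.randomUUID(); // UUIDv4, can be generated offline
  // Retrying this request after a network failure cannot create a
  // duplicate: the server's unique constraint sees the same id.
  await fetch("/items", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ id, ...data }),
  });
  return id;
}
```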
Suppose I have a recipe page where the recipe can have a number of ingredients associated with it. Users can edit the ingredients list and update/save the recipe. In the database there are a recipes table, an ingredients table, and an ingredients_recipes join table. Suppose a recipe has ingredients a, b, c, d, but then the user changes it to a, d, e, f. With the request to the server, do I send only the new ingredients list and have the back end determine what needs to be deleted from/inserted into the database? Or do I explicitly state in the payload what needs to be deleted and what needs to be inserted? I'm guessing it's probably the former, but then is this handled before or during the db query? Do I read from the table first and then write after calculating the differences? Or does the query just handle this?
I searched and I'm seeing solutions involving INSERT IGNORE... + DELETE ... NOT IN ... or using the MERGE statement. The project isn't using an ORM -- would I be right to assume that this could be done easily with an ORM?
Can you share what the user interface looks like? It's pretty standard practice to post a single new ingredient as one action or delete one as another. You can simply have a button next to each ingredient to initiate a DELETE request, and a form beneath for a POST.
Having the users input a list creates unnecessary complexity.
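A sketch of those per-action endpoints, using Express (the route shapes and the SQL in the comments are my assumptions):

```typescript
import express from "express";

const app = express();
app.use(express.json());

// Add one ingredient to a recipe.
app.post("/recipes/:recipeId/ingredients", (req, res) => {
  // e.g. INSERT INTO ingredients_recipes (recipe_id, ingredient_id) VALUES ($1, $2)
  res.status(201).end();
});

// Remove one ingredient from a recipe.
app.delete("/recipes/:recipeId/ingredients/:ingredientId", (req, res) => {
  // e.g. DELETE FROM ingredients_recipes WHERE recipe_id = $1 AND ingredient_id = $2
  res.status(204).end();
});

app.listen(3000);
```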
A common pattern to use would be to treat this like a remote authoring problem.
The basic idea of remote authoring is that we ask the server for its current representation of a resource. We then make local (to the client) edits to the representation, and then request that the server accept our representation as a replacement.
So we might GET a representation that includes a JSON array of ingredients. In our local copy, we remove the ingredients we no longer want and add the new ones in. Then we would PUT our local copy back to the server.
When the documents are very large, with changes that are easily described, we might, instead of sending the entire document to the server, send a PATCH request with a "patch document" that describes the changes we have made locally.
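A sketch of that flow from the client's side (URL and field names are placeholders):

```typescript
async function replaceIngredients(recipeId: string, ingredients: string[]) {
  // GET the server's current representation...
  const res = await fetch(`/recipes/${recipeId}`);
  const recipe = await res.json();
  // ...edit it locally...
  recipe.ingredients = ingredients;
  // ...and ask the server to accept our copy as the replacement.
  await fetch(`/recipes/${recipeId}`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(recipe),
  });
}
```

For the PATCH variant, the body could instead be a JSON Patch (RFC 6902) document listing just the individual edits to the ingredients array.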
When the server is just a document store, the implementation on the server is easy -- you can review the changes to decide if they are valid, compute the new representation (if necessary), and then save it into a file, or whatever.
When you are using a relational database, the server implementation needs to figure out how to update itself. An ORM library might save you a bunch of work, but there are no guarantees -- people tend to get tangled up in the "object" end of the "object relational mapper". You may need to fall back to hand-rolling your own SQL.
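A sketch of one server-side reconciliation, using node-postgres (my choice of library; table and column names are assumed from the question). It answers the read-then-write part directly: read the current rows, diff against the desired list, then delete/insert inside a transaction:

```typescript
import { Pool } from "pg";

const pool = new Pool();

async function updateIngredients(recipeId: string, desired: string[]) {
  const { rows } = await pool.query(
    "SELECT ingredient_id FROM ingredients_recipes WHERE recipe_id = $1",
    [recipeId]
  );
  const current = new Set(rows.map((r) => r.ingredient_id));
  const wanted = new Set(desired);

  const toDelete = [...current].filter((id) => !wanted.has(id));
  const toInsert = desired.filter((id) => !current.has(id));

  // Read-then-write: wrap in a transaction so concurrent edits don't interleave.
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    for (const id of toDelete) {
      await client.query(
        "DELETE FROM ingredients_recipes WHERE recipe_id = $1 AND ingredient_id = $2",
        [recipeId, id]
      );
    }
    for (const id of toInsert) {
      await client.query(
        "INSERT INTO ingredients_recipes (recipe_id, ingredient_id) VALUES ($1, $2)",
        [recipeId, id]
      );
    }
    await client.query("COMMIT");
  } catch (e) {
    await client.query("ROLLBACK");
    throw e;
  } finally {
    client.release();
  }
}
```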
An alternative to remote authoring is to treat the problem like a web site. In that case, you would get some representation of a form that allows the client to describe the change that should be made, and then submit the form, producing a POST request that describes the intended changes.
But you run into the same mapping problem on the server end -- how much work do you have to do to translate the POST request into the correct database transaction?
REST, alas, doesn't tell you anything about how to transform the representation provided in the request into your relational database. After all, that's part of the point -- REST is intended to allow you to replace the server with an alternative implementation without breaking existing clients, and vice versa.
That said, yes - your basic ideas are right; you might just replace the entire existing representation in your database, or you might instead optimize to only issue the necessary changes. An ORM may be able to effectively perform the transformations for you -- optimizations like lazy loading have been known to complicate things significantly.
Hello Internet Denizens,
I was reading through a nice database design article and the final determination on how to properly generate DB primary keys was ...
So, in reality, the right solution is probably: use UUIDs for keys, and don't ever expose them. The external/internal thing is probably best left to things like friendly-url treatments, and then (as Medium does) with a hashed value tacked on the end.
That is, use UUIDs for internal purposes like db joins, but use a friendly-url for external purposes (like a REST API).
My question is ... how do you make uniquely identifiable (and friendly) keys for external purposes?
I've used several APIs (Stripe, QuickBooks, Amazon, etc.) and it seems like they use straight-up sequential IDs for things like customers, report IDs, etc. for retrieving information. It makes me wonder whether the security risk of exposing IDs is a little overblown, because in theory you should be able to append a WHERE clause to your queries.
SELECT * FROM products WHERE uuid = <supplied uuid> AND owner/role/group/etc = <logged-in user>
The follow-up question is: If you expose a primary key, how do people efficiently restrict access to that resource in a database environment? Assign an owner to a db row?
Interested in the design responses.
Potential Relevant Posts for Further Reading:
Should I use UUIDs for resources in my public API?
It is not a good idea to expose your internal ids to the outside. You should either encode them (with some algorithm) or have a lookup table.
Also, do not append parameters provided by the user (or the URL) to your SQL query (UUIDs or not); that is prone to SQL injection. Use parameterized SQL queries instead.
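A parameterized version of the query from the earlier question, sketched with node-postgres (column names are assumed):

```typescript
import { Pool } from "pg";

const pool = new Pool();

async function getProduct(suppliedUuid: string, ownerId: string) {
  const { rows } = await pool.query(
    "SELECT * FROM products WHERE uuid = $1 AND owner = $2",
    [suppliedUuid, ownerId] // bound parameters, never string-spliced
  );
  // null both when the row is absent and when it belongs to someone else.
  return rows[0] ?? null;
}
```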
I have a RESTful service for documents, where the documents are stored in MongoDB, and the API for a document is /document/:id. Initially the :id in the API is MongoDB's ObjectID, but I wonder: does this approach reveal the database id and expose a potential threat? Should I replace it with a pseudonymous id?
If it does need to be replaced with a pseudonymous id, I wonder if there is an algorithmic method to transform the ObjectID and the pseudonymous id back and forth without much computation.
First, there is no "database id" contained in the ObjectID.
I'm assuming your concern comes from the fact that the spec lists a 3 byte machine identifier as part of the ObjectID. A couple of things to note on that:
Most of the time, the ObjectID is actually generated on the client side, not the server (though it can be). Hence this is usually the machine identifier for the application server, not your database.
The 3-byte machine ID is the first three bytes of the (MD5) hash of the machine host name, or of the MAC/network address, or of the virtual machine id (depending on the particular implementation), so it can't be reversed back into anything particularly meaningful.
With the above in mind, you can see that worrying about exposing information is not really a concern.
However, with even a small sample, it is relatively easy to guess valid ObjectIDs, so if you want to avoid that type of traffic hitting your application, then you may want to use something else (a hash of the ObjectID might be a good idea for example), but that will be dependent on your requirements.
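A sketch of that hash-based approach using Node's built-in crypto. One caveat worth flagging as my own addition: a plain hash of a guessable id is itself guessable, so a keyed hash (HMAC) with a server-side secret is a reasonable hedge:

```typescript
import { createHmac } from "crypto";

const SECRET = process.env.ID_SECRET ?? "change-me"; // keep server-side

function pseudonymousId(objectId: string): string {
  // One-way mapping: store this alongside the document (or index it)
  // so you can look documents up by it.
  return createHmac("sha256", SECRET).update(objectId).digest("hex");
}
```

Note that any hash is one-way, so the cheap back-and-forth translation the question asks about would instead need a lookup table or symmetric encryption of the id.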
Is it possible to re-create the Net::Telnet connection if I have its memory location?
How can I turn the string Net::Telnet=GLOB(0x1b50ff0) back into a Net::Telnet object?
Thanks.
UPDATE
You cannot re-use your object in 2 separate processes, as your comments suggest you are trying to do - one process will NOT see the other's memory/address space. You can only do one of 3 things:
Re-create the object from scratch to be the duplicate of the other object in a different program, but only if the object's class supports serialization/de-serialization (usually done via saving object state using Data::Dumper, Storable or other methods). I don't know if Net::Telnet can be handled that way.
Just to be clear, your second program will obtain a COPY of the object once deserialized, which has nothing to do with the original object.
Allow the client to talk to the server and send Telnet commands, which the server passes to its Net::Telnet object, reporting the result back to the client. Basically, the server acts as a Telnet proxy for the client. The client should refer to the server's Net::Telnet objects via their IDs, as mentioned in the registry explanation in my original answer.
Use shared memory to store Net::Telnet object if the client and server reside on the same physical server.
ORIGINAL ANSWER
You can try looking at Acme::Ref, which un-stringifies a reference... I have never used it, so I can't guarantee that it works well or that it works with Net::Telnet specifically.
I agree with the comment posted above: if you need to do this, you most likely are not applying the correct solution to your underlying problem - it would help if you provided more details of what you're trying to achieve at a high level.
You should almost never have to deal with a stringified reference as opposed to an object reference. If you're within the bounds of your own process, you can pass the object reference around (or make it global if you really must). If you are using some sort of inter-process communication and an external process needs to refer to one of the Net::Telnet objects in your program, you need to create a registry of Net::Telnet objects (it could be just an array) and reference them by an index in the registry, as sketched below.
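The registry itself is a handful of lines in any language; sketched here in TypeScript for illustration, with a plain object standing in for the live Net::Telnet handle:

```typescript
type Conn = { host: string }; // stand-in for a live Net::Telnet handle

const registry = new Map<number, Conn>();
let nextId = 0;

// Hand the returned id to the other process, never the reference itself.
function register(conn: Conn): number {
  const id = nextId++;
  registry.set(id, conn);
  return id;
}

function lookup(id: number): Conn | undefined {
  return registry.get(id);
}
```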