Redis hash search by field and value

here is my use case:
I have a simple client/server app where the communication goes through socket.io. Since I need to keep track of the mapping between a room name and its corresponding socket owner, I decided to create a simple Redis hash where each pair is (room, socketId). This hash allows me to quickly find a specific room owner's socketId by its room name. So far so good.
The above hash is updated on the backend's subscribe event, using a very simple hset call via node_redis: redis.client.hset(keyRoomToSocketId, room, socketId, cb);
This makes sure that each time a new socket arrives and creates its own room with a unique name, its socketId is set in the hash along with its corresponding field, the room name.
Now, on the socket disconnect event, I would like to find this pair and set the socketId to an empty string. Apparently (tell me if I am wrong), I cannot search the hash by socketId. What I have in mind is to make one more hash in parallel, in which the pair is reversed, i.e. (socketId, room). This will allow me to search the second hash by socketId, retrieve the room, delete the pair from there, and then search the first hash and set the socketId to "" in the corresponding pair.
Is there anything I am missing and can I make this in a more efficient manner, using Redis?

This should work - your thinking is correct. What you'll be doing is basically a two-way mapping, and a Hash or two are simple and efficient for that, with the main "price" being the duplication of data. Denormalization is a common practice with NoSQL and specifically Redis.
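As a minimal sketch of that two-way mapping, assuming node_redis with the classic callback API and hypothetical hash names roomToSocketId / socketIdToRoom:

// Sketch only: two parallel hashes form the two-way mapping (key names are hypothetical).
const redis = require("redis");
const client = redis.createClient();

const ROOM_TO_SOCKET = "roomToSocketId";
const SOCKET_TO_ROOM = "socketIdToRoom";

// On subscribe: record the pair in both directions.
function onSubscribe(room, socketId, cb) {
  client.hset(ROOM_TO_SOCKET, room, socketId, (err) => {
    if (err) return cb(err);
    client.hset(SOCKET_TO_ROOM, socketId, room, cb);
  });
}

// On disconnect: look up the room by socketId, drop the reverse entry, then clear the owner.
function onDisconnect(socketId, cb) {
  client.hget(SOCKET_TO_ROOM, socketId, (err, room) => {
    if (err || !room) return cb(err);
    client.hdel(SOCKET_TO_ROOM, socketId, (err) => {
      if (err) return cb(err);
      client.hset(ROOM_TO_SOCKET, room, "", cb); // room is now ownerless
    });
  });
}

The price is that both hashes must be kept in sync on every update, which is the duplication mentioned above.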

Actually, in light of the fact that I am using Redis along with socket.io, I ended up with just one hash, where each pair is (room, socketId).
As a second "hash", I am using the socket object on the backend: when the subscribe event fires, I assign the room to the socket.ownRoom field. Then, on the disconnect event, I use this field from the socket object and search the single hash.
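A minimal sketch of that final approach, assuming the node_redis client and keyRoomToSocketId key from the question and a socket.io server instance io:

// Sketch only: one Redis hash plus a field stored on the socket object itself.
io.on("connection", (socket) => {
  socket.on("subscribe", (room) => {
    socket.ownRoom = room;                                // remember the room on the socket
    client.hset(keyRoomToSocketId, room, socket.id, (err) => {
      if (err) console.error(err);
    });
  });

  socket.on("disconnect", () => {
    if (!socket.ownRoom) return;
    // clear the owner of the room this socket created
    client.hset(keyRoomToSocketId, socket.ownRoom, "", (err) => {
      if (err) console.error(err);
    });
  });
});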

Related

Is it legitimate to insert UUIDs into Postgres that have been generated by a client application?

The normal MO for creating items in a database is to let the database control the generation of the primary key (id). That's usually true whether you're using auto-incremented integer ids or UUIDs.
I'm building a client-side app (Angular, but the tech is irrelevant) that I want to be able to build offline behaviour into. In order to allow offline object creation (and association), I need the client application to generate primary keys for new objects. This is both to allow for associations with other objects created offline and also to allow for idempotence (making sure I don't accidentally save the same object to the server twice due to a network issue).
The challenge, though, is what happens when that object gets sent to the server. Do you use a temporary client-side ID which you then replace with the ID that the server subsequently generates, or do you use some sort of ID translation layer between the client and the server? The latter is what Trello did when building their offline functionality.
However, it occurred to me that there may be a third way. I'm using UUIDs for all tables on the back end. And so this made me realise that I could in theory insert a UUID into the back end that was generated on the front end. The whole point of UUIDs is that they're universally unique so the front end doesn't need to know the server state to generate one. In the unlikely event that they do collide then the uniqueness criteria on the server would prevent a duplicate.
Is this a legitimate approach? The risks seem to be 1. collisions and 2. any form of security that I haven't anticipated. Collisions seem to be taken care of by the way UUIDs are generated, but I can't tell whether there are risks in allowing a client to choose the ID of an inserted object.
Yes, this is fine. Postgres even has a UUID type.
Set the default ID to be a server-generated UUID if the client does not send one.
Collisions.
UUIDs are designed to not collide.
Any form of security that I haven't anticipated.
Avoid UUIDv1 because...
This involves the MAC address of the computer and a time stamp. Note that UUIDs of this kind reveal the identity of the computer that created the identifier and the time at which it did so, which might make it unsuitable for certain security-sensitive applications.
You can instead use uuid_generate_v1mc which obscures the MAC address.
Avoid UUIDv3 because it uses MD5. Use UUIDv5 instead.
UUIDv4 is simplest: it's a 122-bit random number, and it's built into Postgres (the others are in the commonly available uuid-ossp extension). However, it depends on the strength of each client's random number generator. Even so, a bad UUIDv4 generator is better than incrementing an integer.
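As a minimal sketch of generating the UUID on the client and sending it with the new object, assuming a runtime that provides crypto.randomUUID() and a hypothetical /api/todos endpoint:

// Sketch only: the client generates a UUIDv4 and the server keeps it.
const { randomUUID } = require("crypto"); // in a browser, use the global crypto.randomUUID()

const todo = {
  id: randomUUID(),                       // generated offline, no server round-trip needed
  title: "Write offline sync layer",
  done: false,
};

// Send it once connectivity returns; a UNIQUE / PRIMARY KEY constraint on the UUID column
// still catches the (astronomically unlikely) collision, and retries stay idempotent.
fetch("/api/todos", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(todo),
}).catch(console.error);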

I want to use bcrypt.compare together with the mongoose/mongo search engine

Consider this code:
const crypto = require("crypto");

// Deterministic keyed hash: the same input always produces the same output.
// Note: createHmac takes (algorithm, key), so the secret belongs in the second argument.
const hashPassword = function (plainText) {
  return crypto
    .createHmac("sha256", process.env.Secret_hash_Password)
    .update(plainText)
    .digest("hex");
};
As you may have noticed, this is a simple hashing function using crypto.
Now consider this code excerpt:
bcrypt.compare(password, user.password, (err, isMatch) => { .... });
As you may have noticed, this is a simple hash comparison using bcryptjs.
As I believe everyone will agree, the second is more secure.
Now consider the problem:
I have a key to store in Mongo, and this key is sensitive information, so I have decided to hash it so that no one can recover it. This key is used to make Mongo searches; it is information that only the user has, a sort of password.
Solution: use the first code; even though you cannot decrypt it, you get the same hash output whenever the input is the same.
Problem: my solution uses a technique that is well known to be easily attacked. Someone who somehow gains access to the server just needs to try several inputs, and once they get the same output, they've got it! This is a well-known flaw of my solution.
Desired solution: use the second code with mongo.
Discussion: I could simply get all the database documents with find({}) and apply, say, forEach and bcrypt.compare. Nonetheless, I know from my studies that Mongo is optimized for search, e.g. it uses indexes. It would be nice to be able to pass bcrypt.compare as a customized function to the Mongo search engine.
It was suggested "Increase the bcrypt salt rounds.": I cannot use salt since that would change the key and whenever I will need to compare, it will change. bcrypt.compareexists to overcome that, but mongo/mongoose queries does not have such internal enginee.
What I have in my head, in pseudocode:
Model.findOne({bcrypt.compare(internalID, internalID')}) // return when true
Where bcrypt.compare(internalID, internalID') would be a sort of callback function: on each search, Mongo would use this function with internalID', the current internalID under comparison, and return the document that produces true.
Any suggestion, comment, or anything?
PS. I am using mongoose.
From what I understand, you don't ever want anyone to know the patient IDs (non-discoverable from real-life patient IDs), not even the database admin (and of course hackers).
I think your design is a bit messed up.
Firstly, indexes use a B-tree data structure for faster lookups, so you have to provide the exact string for a lookup; given your requirement of un-hashable IDs, indexes won't work. You'll have to iterate over every patient ID for that doctor and compare each one to get a true result, which is pretty compute-intensive and, frankly, bad design.
There are multiple ways of approaching this problem, depending on your level of trust and paranoia.
I think using cryptojs is the correct solution, but you have to add some randomness to the key. Basically, you hash the ID with cryptojs, but instead of supplying the key yourself, you take the secret key from the doctor and hash every ID with that key. You will then have to re-hash every patient ID whenever the doctor changes the secret key (using some sort of message queue).
You could also hash the secret key entered by the doctor before saving it, at the cost of extra work (twice!) every time the doctor wants to look up by patientId.
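As a minimal sketch of the keyed-hash lookup described above, assuming a hypothetical Patient model with a patientIdHash field and a per-doctor secret; the deterministic HMAC allows an exact-match, indexed query, unlike bcrypt:

const crypto = require("crypto");
const mongoose = require("mongoose");

// Sketch only: model and field names are hypothetical.
const Patient = mongoose.model("Patient", new mongoose.Schema({
  patientIdHash: { type: String, index: true }, // indexed, so findOne uses a B-tree lookup
  // ...other fields
}));

// Same patientId + same doctor secret => same token, so an equality search works.
function lookupToken(patientId, doctorSecret) {
  return crypto.createHmac("sha256", doctorSecret).update(patientId).digest("hex");
}

function findPatient(patientId, doctorSecret) {
  return Patient.findOne({ patientIdHash: lookupToken(patientId, doctorSecret) });
}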
Depending on the number of users you expect your application to serve: if the number is low enough, my solution would work. With too many users, you'd have to increase compute resources and probably invest in some security measures instead of my overkill solution. Why would you be losing the secret key to hackers anyway?
Good luck.

How to optimize collection subscription in Meteor?

I'm working on a filtered live search module with Meteor.js.
Usecase & problem:
A user wants to search through all the users to find friends, but I cannot afford to send each user the complete users collection. The user filters the search using checkboxes. I'd like to subscribe to just the matched users. What is the best way to do it?
I guess it would be better to create the query client-side, then send it to the method to get back the desired set of users. But I wonder: when the filtering criteria change, does the new subscription erase all of the old one? Because if I do a first search which returns [usr1, usr3, usr5], and after that a search that returns [usr2, usr4], the best would be to keep the first set and simply add the new one to the client-side subscribed collection.
And, in addition, if I then do a third search which should return [usr1, usr3, usr2, usr4], the autorun subscription would not need to send me anything, as I already have the whole result set in my collection.
The goal is to spare processing and data transfer from the server.
I have some ideas, but I haven't coded enough of them yet to share in an easily comprehensible way.
How would you advise me to proceed to be as efficient as possible in terms of time and performance savings?
Thank you all.
David
It depends on your application, but you'll probably send a non-empty string to a publisher which uses that string to search the users collection for matching names. For example:
Meteor.publish('usersByName', function(search) {
  check(search, String);
  // make sure the user is logged in and that search is sufficiently long
  if (!(this.userId && search.length > 2))
    return [];
  // search by case insensitive regular expression
  var selector = {username: new RegExp(search, 'i')};
  // only publish the necessary fields
  var options = {fields: {username: 1}};
  return Meteor.users.find(selector, options);
});
Also see common mistakes for why we limit the fields.
performance
Meteor is clever enough to keep track of the current document set that each client has for each publisher. When the publisher reruns, it knows to only send the difference between the sets. So the situation you described above is already taken care of for you.
If you were subscribed for users: 1,2,3
Then you restarted the subscription for users 2,3,4
The server would send a removed message for 1 and an added message for 4.
Note this will not happen if you stopped the subscription prior to rerunning it.
To my knowledge, there isn't a way to avoid removed messages when modifying the parameters for a single subscription. I can think of two possible (but tricky) alternatives:
Accumulate the intersection of all prior search queries and use that when subscribing. For example, if a user searched for {height: 5} and then searched for {eyes: 'blue'} you could subscribe with {height: 5, eyes: 'blue'}. This may be hard to implement on the client, but it should accomplish what you want with the minimum network traffic.
Accumulate active subscriptions. Rather than modifying the existing subscription each time the user modifies the search, start a new subscription for the new set of documents, and push the subscription handle to an array. When the template is destroyed, you'll need to iterate through all of the handles and call stop() on them. This should work, but it will consume more resources (both network and server memory + CPU).
Before attempting either of these solutions, I'd recommend benchmarking the worst case scenario without using them. My main concern is that without fairly tight controls, you could end up publishing the entire users collection after successive searches.
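A minimal sketch of the second alternative (accumulating subscription handles), assuming a hypothetical Blaze template named search with a form input named search, and the usersByName publisher above:

// Sketch only: start a new subscription per search and keep every handle.
Template.search.onCreated(function () {
  this.searchHandles = [];
});

Template.search.events({
  'submit .search-form'(event, template) {
    event.preventDefault();
    const search = event.target.search.value;
    // start an additional subscription and remember its handle
    template.searchHandles.push(Meteor.subscribe('usersByName', search));
  },
});

Template.search.onDestroyed(function () {
  // stop every accumulated subscription when the template goes away
  this.searchHandles.forEach((handle) => handle.stop());
});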
If you want to go easy on your server, you'll want to send as little data to the client as possible. That means every document you send to the client that is NOT a friend is waste. So let's eliminate all that waste.
Collect your filters (e.g. filters = {sex: 'Male', state: 'Oregon'}). Then call a method to search based on your filters (e.g. Users.find(filters)). Additionally, you can run your own proprietary ranking algorithm to determine the % chance that a person is a friend. Maybe base it off of distance from IP address (or from phone GPS history), mutual friends, etc. This will pay dividends in efficiency in a bit. Index things like GPS coords or other highly unique attributes, and maybe try out composite indexes. But remember: more indexes means slower writes.
Now you've got a cursor with all possible friends, ranked from most likely to least likely.
Next, change your subscription to match those friends, but put a limit:20 on there. Also, only send over the fields you need. That way, if a user wants to skip this step, you only wasted sending 20 partial docs over the wire. Then, have an infinite scroll or 'load more' button the user can click. When they load more, it's an additive subscription, so it's not resending duplicate info. Discover Meteor describes this pattern in great detail, so I won't.
After a few clicks/scrolls, the user won't find any more friends (because you were smart & sorted them) so they will stop trying & move on to the next step. If you returned 200 possible friends & they stop trying after 60, you just saved 140 docs from going through the pipeline. There's your efficiency.
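A minimal sketch of that additive, limited subscription, assuming a hypothetical possibleFriends publication that applies the filters, the ranking sort, and the limit, plus a friendSearch template:

// Sketch only: a 'load more' pattern with a growing limit.
Session.setDefault('friendLimit', 20);

Tracker.autorun(() => {
  const filters = Session.get('friendFilters') || {};
  Meteor.subscribe('possibleFriends', filters, Session.get('friendLimit'));
});

// "Load more" simply raises the limit; documents already on the client are not resent.
Template.friendSearch.events({
  'click .load-more'() {
    Session.set('friendLimit', Session.get('friendLimit') + 20);
  },
});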

Mitigation techniques for Insecure direct object reference

What are the mitigation techniques for preventing horizontal privilege escalation through insecure direct object references, other than securing the session? In other words, how do we achieve access controls at the horizontal level? I mean, the functionality, data, etc. are accessible to everyone on the same level; if we are breaching privilege, I feel the only possible way other than hijacking a session is through an insecure direct object reference. Or is there another way that I'm not aware of?
Maybe use the AccessReferenceMap from OWASP ESAPI, linked below, to prevent Insecure Direct Object References: http://owasp-esapi-java.googlecode.com/svn/trunk_doc/latest/org/owasp/esapi/AccessReferenceMap.html
Whether horizontal or vertical, IDOR occurs when an authorization check is missing when an object in the system is reached. It is critical if the reached object is sensitive, such as displaying an invoice that belongs to another user of the system.
So, I advise using randomly generated IDs or UUIDs to avoid IDOR altogether. The attacker then has to guess valid random ID values that belong to another user.
If that sounds hard to apply because you already use auto-incremented object IDs, you can instead apply a hash function with a salt to the IDs and put them in a hash map as key-value pairs. You then store that key-value map in the session.
Instead of exposing auto-increment IDs to the user, you expose the hash values of the corresponding IDs. When you get a value back from the user, you find the actual ID by looking it up in the key-value map in the session. That means that even if the attacker spoofs a generated value, it will not exist in the map. Basically, the IDOR is no longer exploitable.
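A minimal sketch of such an indirect reference map, assuming an Express-style app with session middleware; route, field and helper names (loadInvoicesForUser, loadInvoiceById) are hypothetical:

// Sketch only: per-session opaque tokens stand in for the real database IDs.
const crypto = require("crypto");

app.get("/invoices", (req, res) => {
  const invoices = loadInvoicesForUser(req.user.id);       // hypothetical data access
  req.session.invoiceMap = {};
  const items = invoices.map((invoice) => {
    const token = crypto.randomBytes(16).toString("hex");  // opaque, per-session reference
    req.session.invoiceMap[token] = invoice.id;            // token -> real id, server-side only
    return { ref: token, total: invoice.total };
  });
  res.json(items);
});

app.get("/invoices/:ref", (req, res) => {
  const realId = (req.session.invoiceMap || {})[req.params.ref];
  if (!realId) return res.sendStatus(404);                 // unknown/spoofed token resolves to nothing
  res.json(loadInvoiceById(realId, req.user.id));
});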
To read all about IDOR and mitigation here is a post I wrote about it considering every possible aspect: https://medium.com/#aysebilgegunduz/everything-you-need-to-know-about-idor-insecure-direct-object-references-375f83e03a87

Which way should I create a list of objects when sending POST request

When I am creating objects of the same type and saving them into the database, should I send a list of those objects in one request, or should I send them individually?
For example, I would like to create a todo list. I can create multiple todos and then click save to send a list of todos, or, when I finish editing one todo, save it directly.
The first way saves on the number of requests; only one request is needed to create many objects. But is the first way RESTful? All the information I can find about create in REST is about creating a single object, but will there be problems if the number of requests increases?
----Edit
Thank you guys answering me.
For a more specific use case: I am using Django Rest Framework. I created a Todo model and a corresponding serializer. I am wondering how I could create a list of Todos. I tried to send a list of Todos to the serializer, expecting the serializer to loop through it automatically the same way it handles a list of instances, but that doesn't work. I know I may be able to write a loop that calls the create method each time, but is there a better way to do it?
There is nothing in REST that tells you what kind of payload you are allowed to use. You can POST/PUT whatever you want: one entity representation or many representations, in lists, dictionaries, XML, URL-encoded key/values or JSON, whatever suits your use case best.
In your case you might even want to send a delta/diff list of changes from the client. Let's for instance say your client loads 3 existing todo items. Then the user modifies one of them, deletes another one and adds a new one. You can either do that in three requests or in one single request with add/modify/delete operations encoded in it. Both ways are valid, and the best solution depends on your use case and constraints like bandwidth, processing power and network round-trip time.
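A minimal sketch of the single-request option from the client side, assuming a hypothetical /api/todos/bulk endpoint; on the server, something like a DRF view that instantiates the serializer with many=True can persist the list:

// Sketch only: one POST request carrying a list of todos.
const todos = [
  { title: "Buy milk", done: false },
  { title: "Write report", done: false },
];

fetch("/api/todos/bulk", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(todos),
})
  .then((res) => res.json())
  .then((created) => console.log("created", created.length, "todos"))
  .catch(console.error);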