Parse DB/Mongo compound index with an OR and an Object


I am working on a project where the data is stored in a very odd shape, which makes the query that finds a user on the first login attempt take a long time. I'm having a hard time wrapping my mind around how to create an index for this. I want to find a single user. Any of these attributes can be undefined, and I need to find a single user that matches any of them (OR):
query.equalTo('firebaseUid', uid);
query.equalTo('emails.apple', email);
query.equalTo('emails.google', email);
query.equalTo('phoneNumber', phoneNumber);
query.equalTo('emails.facebook', email);
query.equalTo('emails.password', email);
Is this index correct, and does it make sense? I'm specifically unsure about the fact that we are querying inside an emails object (don't ask me why we are keeping them separate...). A user is more likely than not to have a UID (after first login). From there I ordered the fields by the most likely login method based on our user base (mostly Apple users, then Google...).
db.userCollection.createIndex({ firebaseUid: 1, 'emails.apple': 1, 'emails.google': 1, phoneNumber: 1, 'emails.facebook': 1, 'emails.password': 1 })
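As a hedged aside: MongoDB's documentation states that a compound index supports queries on a prefix of its fields, and that each clause of an `$or` can use its own index. Under that reading, the six-field compound index above would mainly serve the `firebaseUid` clause, and one single-field index per queried path may be a better fit. The sketch below only builds the index specs and the `$or` filter as plain objects, skipping inputs that are undefined. The names `LOOKUP_FIELDS`, `buildIndexSpecs` and `buildOrFilter` are hypothetical, and nothing here connects to a database.

```javascript
// Hypothetical helper names; nothing here talks to a database.
// Each entry maps a document path to the input value it is compared against.
const LOOKUP_FIELDS = [
  ['firebaseUid', 'uid'],
  ['emails.apple', 'email'],
  ['emails.google', 'email'],
  ['phoneNumber', 'phoneNumber'],
  ['emails.facebook', 'email'],
  ['emails.password', 'email'],
];

// One single-field index spec per queried path, e.g. { 'emails.apple': 1 }.
// Each spec could be passed to db.userCollection.createIndex(spec).
function buildIndexSpecs() {
  return LOOKUP_FIELDS.map(([path]) => ({ [path]: 1 }));
}

// Build a MongoDB-style $or filter, skipping inputs that are undefined or
// null so that a missing input never turns into a clause.
function buildOrFilter(input) {
  const clauses = LOOKUP_FIELDS
    .filter(([, key]) => input[key] !== undefined && input[key] !== null)
    .map(([path, key]) => ({ [path]: input[key] }));
  return clauses.length > 0 ? { $or: clauses } : null;
}

console.log(JSON.stringify(buildOrFilter({ uid: 'abc', phoneNumber: undefined })));
// prints {"$or":[{"firebaseUid":"abc"}]}
```

On the Parse side, chained `equalTo` calls on one query are combined with AND, so the OR would presumably be expressed with `Parse.Query.or(...)` over one query per field. Running `explain()` on the resulting MongoDB query is the way to confirm which indexes are actually used.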

Related

Check for existing value inside of Firebase Realtime Database

Hello, I have a problem. I created a registration form and I'm trying to check whether any user with a certain username exists in the Firebase database. I tried to get a reference to all the users:
let users = Database.database().reference(withPath: "users")
But I don't know how to check whether there is any user with a specified username.

You'll want to use a query for that. Something like:
let query = users.queryOrdered(byChild: "username").queryEqual(toValue: "two")
Then execute the query and check whether the result snapshot exists.
Note though that you won't be able to guarantee uniqueness in this way. If multiple users perform the check at the same time, they may both end up claiming the same user name.
To guarantee a unique user name, you will need to store the user names as keys, since keys are by definition unique within their parent node.

Which is the better way of relating these database tables?

I have two tables, Users and Matters. Ordinarily only admins can create a matter, so I set up a one-to-many relationship between users and matters (User.hasMany(Matter) and Matter.belongsTo(User)).
Now on the frontend, Matter is supposed to have a multi-select field called assignees, where users from the User table can be selected.
My current approach is to make assignees a column on Matter holding an array of the user emails selected on the frontend. The frontend developer thinks it should be an array of user IDs instead, but I think that won't be efficient: when getting all matters or updating them, one would need to run a query each time to fetch the associated assignees from the array of IDs stored in the assignees column (and I am not entirely sure how to go about that).
Another option is a UserMatters join table, but I don't think it would perform well to populate two tables (Matter and UserMatters) on creation of a matter, and updating and getting all matters would involve writing a lot of code.
My question is: is there a better way to go about this, or should I just stick with populating the assignees field with user emails, since that looks like the better approach as far as I can see?
N.B.: I am using Sequelize (Postgres).

So what I did, instead of creating a through/join table, was to have the frontend send an array of integers containing the IDs of the assignees. Since the assignees are also users of the app, when I get a particular matter I loop through the IDs in the assignees column (an array), fetch the user details from the user table, reassign the result to that column, and return the resource.
try {
  // Get the resource; bail out early if it does not exist
  const resource = await getResource(id);
  if (!resource) {
    return res.status(404).json({
      status: 404,
      message: 'Resource not found'
    });
  }
  // Convert the stored IDs to numbers (they may be strings)
  const assigneeIds = resource.assignees.map(assigneeId => Number(assigneeId));
  // Fetch all users with the corresponding IDs and wait for every lookup to finish
  const users = await Promise.all(assigneeIds.map(assigneeId => getUserById(assigneeId)));
  // Convert the Sequelize instances to plain objects and reassign the result
  resource.assignees = users.map(user => user.get({ plain: true }));
  return res.status(200).json({
    status: 200,
    resource
  });
} catch (error) {
  return res.status(500).json({
    status: 500,
    err: error.message
  });
}
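The handler above issues one `getUserById` call per assignee. A possible variation, sketched here under stated assumptions, collects the unique IDs and resolves them with a single batched lookup. `resolveAssignees` and `fetchUsersByIds` are hypothetical names: the latter is an injected function assumed to take an array of numeric IDs and resolve to plain user objects with an `id` property (with Sequelize it could wrap something like `User.findAll({ where: { id: ids } })`).

```javascript
// Sketch: resolve an array of assignee IDs with a single batched lookup.
// `fetchUsersByIds` is a hypothetical function supplied by the caller; it is
// assumed to take an array of numeric IDs and resolve to plain user objects
// that each carry an `id` property.
async function resolveAssignees(assigneeIds, fetchUsersByIds) {
  // Normalise to numbers and drop duplicates so each user is requested once.
  const ids = [...new Set(assigneeIds.map(id => Number(id)))];
  if (ids.length === 0) return [];

  const users = await fetchUsersByIds(ids);
  const byId = new Map(users.map(user => [user.id, user]));

  // Keep the order stored on the resource; skip IDs with no matching user.
  return ids.filter(id => byId.has(id)).map(id => byId.get(id));
}

// Usage with an in-memory stand-in for the database call.
const fakeUsers = [
  { id: 1, email: 'a@example.com' },
  { id: 2, email: 'b@example.com' },
];
const fakeFetch = async ids => fakeUsers.filter(user => ids.includes(user.id));

resolveAssignees(['2', 1, 2, 99], fakeFetch).then(result => console.log(result));
```

The trade-off is one query per matter instead of one per assignee; whether that matters depends on how many assignees a matter typically has.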

Relational or full object in MongoDB documents

I have a general MongoDB question as I have recently found an issue with how I store things.
Currently, there is a collection called spaces like this:
{
_id: 5e1c4689429a8a0decf16f69,
challengers: [
5dfa24dce9cbc0180fb60226,
5dfa26f46719311869ac1756,
5dfa270c6719311869ac1757
],
tasks: [],
owner: 5dfa24dce9cbc0180fb60226,
name: 'testSpace',
description: 'testSpace'
}
As you can see, this has a challengers array, in which we store the ID of the User.
Would it be okay if, instead of storing the ID, I stored the entire User object, minus fields such as the password?
Or should I continue with this reference approach of referring to the IDs of other documents?
The problem I have with this is that when I go through all the spaces a user has, I want to see which members are part of each space (the challengers array). However, I obviously receive the IDs instead of names and emails. I am therefore struggling to send the correct data to the frontend (I have tried some manual manipulation without luck).
So, if I have to continue down the reference path, I will need to solve that problem somehow.
If it is okay to store the entire object in the array, it would be a lot easier.
However, I want to follow best practice.
Thank you everyone!
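One way to keep the ID references and still send names and emails to the frontend is to expand the IDs at read time and strip sensitive fields while doing so. The following is a minimal sketch in plain JavaScript, assuming the relevant users have already been loaded into an array. The helper names `toPublicUser` and `expandChallengers` are hypothetical, and the `name`, `email` and `password` fields are assumed from the question's description.

```javascript
// Keep only the fields that are safe to send to the frontend.
function toPublicUser(user) {
  return { _id: user._id, name: user.name, email: user.email };
}

// Return a copy of the space whose `challengers` holds public user objects
// instead of bare IDs. IDs without a matching user are left out.
function expandChallengers(space, users) {
  const byId = new Map(users.map(user => [String(user._id), user]));
  const challengers = space.challengers
    .map(id => byId.get(String(id)))
    .filter(user => user !== undefined)
    .map(toPublicUser);
  return { ...space, challengers };
}

// Usage with made-up data shaped like the document in the question.
const sampleSpace = { name: 'testSpace', challengers: ['a1', 'b2'] };
const sampleUsers = [
  { _id: 'a1', name: 'Ann', email: 'ann@example.com', password: 'secret' },
  { _id: 'b2', name: 'Bob', email: 'bob@example.com', password: 'secret' },
];
console.log(expandChallengers(sampleSpace, sampleUsers));
```

The stored document keeps only IDs, so a user who changes their name or email does not leave stale copies behind in every space.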

How to save one value of Parse object without overwriting entire object?

I have two users accessing the same object. If userA saves without first fetching the object to refresh their version, data that userB has already successfully saved would be overwritten. Is there any way (perhaps cloud code?) to access and update one, and only one, data value of a PFObject?
I was thinking about pushing the save out to the cloud, refreshing the object once it gets there, updating the value in the cloud, and then saving it back. However, that's a pain and still not without its faults.

This seems easy enough, but for me it was more difficult than it should have been. Intuitively, you should be able to filter out the fields you don't want in beforeSave. Indeed, this was the advice given in several posts on Parse.com. In my experience though, it would actually treat the filtering as deletions.
My goal was a bit different: I was trying to filter out a few fields rather than save only a few fields. Translating to your context, you could try querying the existing matching record and overriding the new object with it. You can't abort via response.failure(), and I don't know what would happen if you immediately saved the existing record with the field of interest and nulled out the request.object property; you could experiment with that on your own:
Parse.Cloud.beforeSave("Foo", function(request, response) {
  // Skip the merge for master-key requests (an option you may not need)
  if (request.master) {
    response.success();
    return;
  }
  var query = new Parse.Query("Foo");
  query.get(request.object.id).then(function(existing) {
    // Copy only the field of interest onto the stored record
    existing.set("some_field", request.object.get("some_field"));
    request.object = existing; // haven't tried this; otherwise, set all fields from existing on the new object
    response.success();
  }, function(error) {
    // No existing record could be fetched (e.g. a new object): let the save proceed
    response.success();
  });
});

Searches (and general querying) with HBase and/or Cassandra (best practices?)

I have a User model object with quite a few fields (properties, if you wish) in it, say "firstname", "lastname", "city" and "year-of-birth". Each user also gets a "unique id".
I want to be able to search by them. How do I do that properly? How do I do that at all?
My understanding (will work for pretty much any key-value storage -- first goes key, then value):
u:123456789 = serialized_json_object
("u" as a simple prefix for user's keys, 123456789 is "unique id").
Now, thinking that I want to be able to search by firstname and lastname, I can save in:
f:Steve = u:384734807,u:2398248764,u:23276263
f:Alex = u:12324355,u:121324334
so the key prefix is "f", which marks firstnames, and "Steve" is the actual firstname.
For "f:Steve" we save as the value the ids of all users named Steve.
That makes every search very easy. Querying by a few fields (properties), say by firstname (i.e. "f:Steve") and lastname (i.e. "l:Anything"), is still easy: first get the list of user ids from "f:Steve", then the list from "l:Anything", intersect the two lists, and there you go.
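The scheme described above can be sketched with an in-memory `Map` standing in for the key-value store. This is only an illustration of the prefix keys and the intersection step, not of any particular HBase or Cassandra API; the function names `addToIndex`, `indexUser` and `search` are made up for the example.

```javascript
// In-memory stand-in for a key-value store: index key -> Set of user keys.
const index = new Map();

// Add a user key under an index key such as "f:Steve" or "l:Smith".
function addToIndex(indexKey, userKey) {
  if (!index.has(indexKey)) index.set(indexKey, new Set());
  index.get(indexKey).add(userKey);
}

// Index one user under the "f:" (firstname) and "l:" (lastname) prefixes.
function indexUser(userKey, user) {
  addToIndex(`f:${user.firstname}`, userKey);
  addToIndex(`l:${user.lastname}`, userKey);
}

// Return the user keys present under every given index key (set intersection).
function search(indexKeys) {
  const sets = indexKeys.map(key => index.get(key) || new Set());
  if (sets.length === 0) return [];
  // Start from the smallest set so the intersection does the least work.
  sets.sort((a, b) => a.size - b.size);
  const [smallest, ...rest] = sets;
  return [...smallest].filter(userKey => rest.every(set => set.has(userKey)));
}

indexUser('u:1', { firstname: 'Steve', lastname: 'Smith' });
indexUser('u:2', { firstname: 'Steve', lastname: 'Jones' });
indexUser('u:3', { firstname: 'Alex', lastname: 'Smith' });
console.log(search(['f:Steve', 'l:Smith'])); // prints [ 'u:1' ]
```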
Problems (and there are quite a few):
1. Saving, updating, or deleting a user is a pain. It has to be an atomic and consistent operation. Also, if the size of a value is limited, then we are in (potential) trouble, and there is really no good answer here. Only zipping the list of user ids? Not too cool, though.
2. What if we want to add a new field to search by, eventually, say "city"? We certainly can do it the same way, "c:Los Angeles" = ..., "c:Chicago" = ..., but if we didn't foresee all those "search choices" from the very beginning, then we will need some nightly job or similar to go through all existing User records and update those "c:CITY" keys for them... Quite a big job!
3. Problems with locking. User "u:123" updates his name to "Alex", and user "u:456" updates his name to "Alex". They both have to update "f:Alex" with their ids. That means either we run into an overwriting problem, or one update will wait for the other (and imagine if there are many of them?!).
What's the best way of doing that? Keeping in mind that I want to search by many fields?
P.S. Please, the question is about HBase/Cassandra/NoSQL/key-value storages. Please, please: no advice to use MySQL and "read about" SELECTs, and to worry about scaling problems "later". There is a reason why I asked MY question exactly the way I did. :-)

Being able to query properties directly is one of the features you lose when moving away from SQL, so you need a way to maintain your own index to let you find records.
If your datastore does not have built in indexing or atomic list operations, you will need to deal with the locking issues you mention. However, indexing doesn't necessarily need to be synchronous - maintain a queue of updated records to be reindexed and you have a solution for 3 that can be reused to solve 2 also.
If the index list for a particular value becomes too large for the system to handle in a single list, you can replace the list of users with a list of lists. However, if you have that many records with the same value, it probably isn't a particularly useful search criterion anyway.
Another option that is useful in some cases is to use a separate system for the indexing. For example, you could set up Lucene to index the records in your main datastore.
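The queue-based reindexing suggested in this answer could look roughly like the sketch below. It uses in-memory structures in place of a real datastore, and all names (`saveUser`, `drainReindexQueue`, `indexedAs`, and so on) are illustrative assumptions. Writers only touch their own record and append to the queue; a single worker applies the index changes one at a time, which is what sidesteps the locking problem.

```javascript
// In-memory stand-ins: the primary records, the index, and a reindex queue.
const records = new Map();    // userKey -> { firstname }
const nameIndex = new Map();  // "f:<firstname>" -> Set of userKeys
const indexedAs = new Map();  // userKey -> index key it is currently filed under
const reindexQueue = [];      // userKeys waiting to be reindexed

// Writes touch only the primary record and enqueue the key; they never
// modify the shared index lists directly.
function saveUser(userKey, user) {
  records.set(userKey, user);
  reindexQueue.push(userKey);
}

// A single worker drains the queue, so index updates are applied one at a time.
function drainReindexQueue() {
  while (reindexQueue.length > 0) {
    const userKey = reindexQueue.shift();
    const newKey = `f:${records.get(userKey).firstname}`;
    const oldKey = indexedAs.get(userKey);
    if (oldKey === newKey) continue; // already filed under the right key
    if (oldKey !== undefined) nameIndex.get(oldKey).delete(userKey);
    if (!nameIndex.has(newKey)) nameIndex.set(newKey, new Set());
    nameIndex.get(newKey).add(userKey);
    indexedAs.set(userKey, newKey);
  }
}

saveUser('u:123', { firstname: 'Al' });
saveUser('u:456', { firstname: 'Alex' });
saveUser('u:123', { firstname: 'Alex' }); // renamed before the worker runs
drainReindexQueue();
console.log([...nameIndex.get('f:Alex')]); // prints [ 'u:123', 'u:456' ]
```

The same worker can backfill a newly added index: enqueueing every existing user key once makes it build the new entries. The cost is that the index lags briefly behind the records until the queue is drained.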

I guess I would have implemented this as a MapReduce job that runs on a schedule.
Each search word would be a row key with a lookup to the UID.
Rowkey: uid1
profile:firstName: Joe
profile:lastName: Doe
profile:nick: DoeMaster
Rowkey: uid2
profile:firstName: Jane
profile:lastName: Doe
profile:nick: SuperBabe
MapReduce indexes all searchable properties and adds them with the search word as the row key:
Rowkey: Jane
lookup:uid: uid2
Rowkey: Doe
lookup:uid: uid2, uid1
Rowkey: DoeMaster
lookup:uid: uid1
..etc
Now, if you need to update the index on the fly when a user changes, you would write the change directly to the index table by removing the uid value from one row key and adding it to another. In case this happens concurrently, temporary locking could be implemented.
For users being removed, an additional attribute holding the state of the user could be used to filter them out of search results.
Adding an additional search word isn't very hard, since it's just a matter of which name:value pair you want to index. You could also narrow searches further by adding a type attribute to your row key/keyword, e.g. boston - lookup:type: city.
The idea is to maintain your own row-key-based search index inside HBase.
