Mongodb performance design

Mongodb performance design - mongodb

I have the below use case.
Let's say I have user, order details in the same collection.
Id is user_id|order_id
I have the below informations in single document.
user_detail: {
name, age, email, address
}
Assume single email, single address, single phone number field.
Then order detail which contains order id, order items, etc.
Order details contains indices also.
Note: My use case is something different I am trying to map it.
Approach 1:
Replace the order details. Replace that order wherever it is present.
Approach2:
Have a order collection, compare the changes on the order, update the changed fields where and all this order is present for that customer.
Here, order detail could be present on multiple users. Kindly assume to map my use case.
I implemented approach 2. It seems time consuming, non-scalable. I wonder which one is better or any new approaches? My concern is more about the data in the index.
Any suggestions?

Related

Prioritise which identifier to use

My crystal report pulls data about books, including an identifier (isbn, issn order number etc.), author, and publisher.
The ID field stores multiple ways to identify the book. The report displays any of the identifiers for that record. If one book has two identifiers; issn and order number, the report currently displays one apparently at random.
How can I make it prioritise which type to use based on a preset order? I figured some sort of filter on the field could work, but I haven't figured out how. I can't edit the table, but I can use SQL within the report.

If all the different types of ID are stored in a single field, your best bet is to use a SQL Command inside your report to separate them into multiple virtual fields.
Go to Database Fields / Database Expert, expand the connection you want to use, and pick Add Command. From here you can write a custom SQL statement to grab the information you're currently using, and at the same time separate the ID field into multiple different fields (as far as the report will be concerned, anyway. The table will stay unchanged.)
The trick is to figure out how to write your command to do the separation. We don't know what your data looks like, so you're on your own from here.

Based on the very little information that you have provided and if i was to make a guess.I suggest you make use of the formula field in your report and then use something like this to accomplish your goal.
IF ISNULL{first_priority_field_name} OR {first_priority_field_name} = '' THEN
{second_priority_field_name}
ELSE
{first_priority_field_name}
Use nested IF statement in case there are more than 2 identifier fields.

Geofire TableView - CircleQuery Users for leaderboard [duplicate]

I'm trying to figure out how to query with filter with Geofire.
Suppose I have restaurants with different category. and I want to add that category to my query. How do I go about this?
One way I have now is querying the key with Geofire, run the for loop through each key and get the restaurant, and insert the appropriate restaurant to the array.
These seems so inefficient. Is there any other way to go about this?
Ideally I will have the filtered results, and only load each item when they're about to be shown.
Cheers!

Firebase queries can only filter by one condition. Geofire already does quite some "magic" to allow it to filter on both longitude and latitude. Adding another property to that equation might be possible, but is well beyond what Geofire handles by default. See GeoFire: How to add extra conditions within the query?
If you only ever want to access one category at a time, you can put the restaurants in a top-level node per category and point Geofire to one category.
/category1
item1
g: "pns0h0mf2u"
l: [-53.435719, 140.808716]
item2
g: "u417k3dwub"
l: [56.83069, 1.94822]
/category2
item3
g: "8m3rz3s480"
l: [30.902225, -166.66809]
/items
item1: ...
item2: ...
item3: ...
In the above example, we have two categories: category1 with 2 items and category2 with just 1 item. For each item, we see the data that Geofire uses: a geohash and the longitude and latitude. We also keep a single list with the other properties of these 3 items.
But more commonly, you simply do the extra filtering in client-side code. If you're worried about the performance of that: measure it, share the code, JSON data and measurements.

This is an old question, but I've seen it in a few places on the web, so I thought I might share one trick I've used.
The Problem
If you have a large collection in your database, maybe containing hundreds of thousands of keys, for example, it might not be feasible to grab them all. If you're trying to filter results based on location in addition to other criteria, you're stuck with something like:
Execute the location query
Loop through each returned geofire key and grab the corresponding data in the database
Check each returned piece of data to see if it matches the other criteria
Unfortunately, that's a lot of network requests, which is quite slow.
More concretely, let's say we want to get all users within e.g. 100 miles of a particular location that are male and between ages 20 and 25. If there are 10,000 users within 100 miles, that means 10,000 network requests to grab the user data and compare their gender and age.
The Workaround:
You can store the data you need for your comparisons in the geofire key itself, separated by a delimiter. Then, you can just split the keys returned by the geofire query to get access to the data. You still have to filter through them, but it's much faster than sending hundreds or thousands of requests.
For instance, you could use the format:
UserID*gender*age, which might look something like facebook:1234567*male*24. The important points are
Separate data points by a delimiter
Use a valid character for the delimiter -- "It can include any unicode characters except for . $ # [ ] / and ASCII control characters 0-31 and 127.)"
Use a character that is not going to be found elsewhere in your database - I used *, but that might not work for you. Do not use any characters from -0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz, since those are fair-game for keys generated by firebase's push()
Choose a consistent order for the data - in this case, UserID first, then gender, then age.
You can store up to 768 bytes of data in firebase keys, which goes a long way.
Hope this helps!

TSQL - Deleting with Inner Joins and multiple conditions

My question is a variation on one already asked and answered (TSQL Delete Using Inner Joins) but I have a different level of complexity and I couldn't see a solution to it.
My requirement is to delete Special Prices which haven't been accessed in 90 days. Special Prices are keyed on Customer ID and Product ID and the products have to matched to a Customer Order Detail table which also contains a Customer ID and a Product ID. I want to write one function that will look at the Special Price table for each Customer, compare each Product for that Customer with the Customer Order Detail table and if the Maximum Order Date is more than 90 days earlier than today, delete it from the Special Price table.
I know I can use a CURSOR (slow but effective) but would prefer to have a single query like the one in the TSQL Delete Using Inner Joins example. Any ideas and/or is more information required?

I cannot dig more on the situation of your system but i think and if it is ok for you, check MERGE STATEMENT, it might be a help instead of using cursors. check this Link MERGE STATEMENT

Search Logic removing records with no association from results when ordering by that association

I'm using search logic to filter and order my results but it removes records from my results when I order by a association and when that association is not always present for all records.
For example say I have a user model which can have one vehicle model but does not have to, if I have a results table where you can order by the users vehicles make I would hope all users without a vehicle record would be considered empty strings and therefore ordered all at the beginning followed by the other user records which have vehicles ordered by the make name.
Unfortunately all the user records which do not have a vehicle are removed from the results.
Is there anyway round this and still use search logic as I find it extremely useful

I think you'll have to explicitly assign a default vehicle that has an empty name

Searches (and general querying) with HBase and/or Cassandra (best practices?)

I have User model object with quite few fields (properties, if you wish) in it. Say "firstname", "lastname", "city" and "year-of-birth". Each user also gets "unique id".
I want to be able to search by them. How do I do that properly? How to do that at all?
My understanding (will work for pretty much any key-value storage -- first goes key, then value)
u:123456789 = serialized_json_object
("u" as a simple prefix for user's keys, 123456789 is "unique id").
Now, thinking that I want to be able to search by firstname and lastname, I can save in:
f:Steve = u:384734807,u:2398248764,u:23276263
f:Alex = u:12324355,u:121324334
so key is "f" - which is prefix for firstnames, and "Steve" is actual firstname.
For "u:Steve" we save as value all user id's who are "Steve's".
That makes every search very-very easy. Querying by few fields (properties) -- say by firstname (i.e. "Steve") and lastname (i.e. "l:Anything") is still easy - first get list of user ids from "f:Steve", then list from "l:Anything", find crossing user ids, an here you go.
Problems (and there are quite a few):
Saving, updating, deleting user is a pain. It has to be atomic and consistent operation. Also, if we have size of value limited to some value - then we are in (potential) trouble. And really not of an answer here. Only zipping the list of user ids? Not too cool, though.
What id we want to add new field to search by. Eventually. Say by "city". We certainly can do the same way "c:Los Angeles" = ..., "c:Chicago" = ..., but if we didn't foresee all those "search choices" from the very beginning, then we will have to be able to create some night job or something to go by all existing User records and update those "c:CITY" for them... Quite a big job!
Problems with locking. User "u:123" updates his name "Alex", and user "u:456" updates his name "Alex". They both have to update "f:Alex" with their id's. That means either we get into overwriting problem, or one update will wait for another (and imaging if there are many of them?!).
What's the best way of doing that? Keeping in mind that I want to search by many fields?
P.S. Please, the question is about HBase/Cassandra/NoSQL/Key-Value storages. Please please - no advices to use MySQL and "read about" SELECTs; and worry about scaling problems "later". There is a reason why I asked MY question exactly the way I did. :-)

Being able to query properties directly is one of the features you lose when moving away from SQL, so you need a way to maintain your own index to let you find records.
If your datastore does not have built in indexing or atomic list operations, you will need to deal with the locking issues you mention. However, indexing doesn't necessarily need to be synchronous - maintain a queue of updated records to be reindexed and you have a solution for 3 that can be reused to solve 2 also.
If the index list for a particular value becomes too large for the system to handle in a single list, you can replace the list of users with a list of lists. However, if you have that many records with the same value it probably isn't a particularly useful search criteria anyway.
Another option that is useful in some cases is to use a seperate system for the indexing - for example you could set up lucene to index the records in your main datastore.

I guess i would have implemented this as a MapReduce job, which would run on schedule.
Each search word, would be a row-key with lookup to UID.
Rowkey:uid1
profile:firstName: Joe
profile:lastName: Doe
profile:nick: DoeMaster
Rowkey: uid2
profile:firstName: Jane
profile:lastName: Doe
profile:nick: SuperBabe
MapReduse indexes all searchable properties and add them with search word as row key
Rowkey: Jane
lookup:uid: uid2
Rowkey: Doe
lookup:uid: uid2, uid1
Rowkey: DoeMaster
lookup:uid: uid1
..etc
Now, if you need to update the index list on the fly as a user change, you would write the change directly to the index base, by remove uid value from index and add to another row key. In case of this happens at the same time, temporary locking could be implemented.
For users being removed, an additional attribute telling the state of the user could be use to filter them out from search.
Adding additional search word isn't very hard, since its just about which name:value you want to index. you could filter search more also by adding type attribute to your row key/keyword. i.e boston - lookup:type: city.
The idea is to maintain your own row key based search index inside hbase.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse