Is there any performance impact when accessing the last sort key for a particular partition key?

I am creating a DynamoDB table with customerId as the partition key and versionNumber as the sort key. Suppose there are 1000 versions for a particular customerId. For my use case I always want to find the latest version for a customerId. Will there be any difference in performance between fetching the first versionNumber and fetching the last versionNumber, or will both take the same time?

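No. The Query operation takes a ScanIndexForward parameter (true/false). Depending on its value, DynamoDB reads the items for that partition key in ascending (true, the default) or descending (false) sort-key order, so fetching the last versionNumber costs the same as fetching the first.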
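As a rough illustration of the ScanIndexForward point, here is a minimal Python sketch that only builds the request parameters for a Query call. The table name "CustomerVersions", the customer ID value, and the helper name latest_version_query are hypothetical; customerId is assumed to be stored as a string attribute. Limit is a standard Query parameter that stops the read after one item.

```python
from typing import Any, Dict


def latest_version_query(table_name: str, customer_id: str) -> Dict[str, Any]:
    """Build Query parameters that return only the highest versionNumber.

    Hypothetical helper: the table and attribute names are assumptions
    taken from the question, not from a real schema.
    """
    return {
        "TableName": table_name,
        "KeyConditionExpression": "customerId = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
        "ScanIndexForward": False,  # read sort keys in descending order
        "Limit": 1,                 # stop after the first (latest) item
    }


params = latest_version_query("CustomerVersions", "customer-123")

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   response = boto3.client("dynamodb").query(**params)
```

Setting ScanIndexForward to True with the same Limit would return the first versionNumber instead; only the read direction changes.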

Related

Firestore order by time but sort by ID

I have been trying to figure out a way to query a list of documents where I have a range filter on one field and order by another field which of course isn't possible, see my other question: Order by timestamp with range filter on different field Swift Firestore
But is it possible to save documents with the timestamp as id and then it would sort by default? Or maybe hardcode an ID, then retrieve the last created document id and increase id by one for the next post to be uploaded?
This shows how the documents are ordered in the collection
Any ideas how to store documents so they are ordered by created at in the collection?
It will order by document ID (ascending) by default in Swift.
You can use .order(by: '__id__'), but the better, documented way is with FieldPath documentID(). I don't really know Swift, but I assume that it's something like...
.order(by: FirebaseFirestore.FieldPath.documentID())
JavaScript too has an internal variable which simply returns __id__.
.orderBy(firebase.firestore.FieldPath.documentId())
Interestingly enough __name__ also works, but that sorts the whole path, including the collection name (and also the id of course).
If I correctly understood your need, by doing the following you should get the correct order:
For each document, add a specific field of type number, called for example sortNbr and assign as value a timestamp you calculate (e.g. the epoch time, see Get Unix Epoch Time in Swift)
Then build a query sorted on this field value, like:
let docRef = db.collection("xxxx")
docRef.order(by: "sortNbr")
See the doc here: https://firebase.google.com/docs/firestore/query-data/order-limit-data
Yes, you can do this.
By default, a query retrieves all documents that satisfy the query in ascending order by document ID.
See the docs here: https://firebase.google.com/docs/firestore/query-data/order-limit-data
So if you find a way to use a timestamp or other primary key value where the ascending lexicographical ordering is what you want, you can filter by any fields and still have the results sorted by the primary key, ascending.
Be careful to zero-pad your numbers to the maximum width if you use a numeric key such as seconds since the epoch or an integer sequence. Compared as strings, "10" is lexicographically less than "2", but "10" is greater than "02".
Using ISO formatted YYYY-mm-ddTHH:MM:SS date-time strings would work, because they sort naturally in ascending order.
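The zero-padding caveat is easy to check with plain string sorting. The sketch below is Python rather than Swift and does not touch Firestore at all; the helper name padded_key and the width of 10 are arbitrary choices for illustration.

```python
from datetime import datetime, timezone


def padded_key(n: int, width: int = 10) -> str:
    """Render an integer as a fixed-width, zero-padded string key."""
    return str(n).zfill(width)


numbers = (2, 10, 1)

# Unpadded strings sort lexicographically, so "10" lands before "2".
unpadded = sorted(str(n) for n in numbers)

# Zero-padded strings sort in the same order as the numbers themselves.
padded = sorted(padded_key(n) for n in numbers)

# ISO-style date-time strings also sort chronologically as plain strings.
fmt = "%Y-%m-%dT%H:%M:%S"
earlier = datetime(2020, 1, 9, 8, 0, 0, tzinfo=timezone.utc).strftime(fmt)
later = datetime(2020, 1, 10, 7, 30, 0, tzinfo=timezone.utc).strftime(fmt)

print(unpadded)
print(padded)
print(earlier < later)
```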
The order of the documents shown in the Firebase console is mostly irrelevant to the functioning of your code that uses Firestore. The console is just for browsing data, and that sorting scheme makes it relatively intuitive to find a document you might be looking for, if you know its ID. You can't change this sort order in the console.
Your code is obviously going to have other requirements, and those requirements should be coded into your queries, without regarding any sort order you see in the dashboard. If you want time-based ordering of your documents, you'll have to store some sort of timestamp field in the document, and use that for ordering. I don't recommend using the timestamp as the ID of a document, as that could cause problems for you in the future.

create unique id in mongodb from last inserted id using pymongo

Is there a way I can find the last inserted document and the field, i.e. _id or id such that I can increment and use when inserting a new document?
The issue is that I create my own id count but do not store it. Now that I've deleted records, I cannot add new records because I end up trying to reuse an existing id.
There is no way to check insertion order in MongoDB, because the database does not keep any metadata in the collections regarding the documents.
If your _id field is generated server-side then you need to have a very good algorithm for this value in order to provide collision avoidance and uniqueness while at the same time following any sequential constraints that you might have.

MongoDB store and select order

Basic question. Will the MongoDB find command always return documents in the order they were added to the collection? If not, how can I select documents in the right order?
Sort? But what if documents were added simultaneously and, say, the created date is the same, but there was still an order?
Well, yes and ... not exactly.
By default, documents are returned in natural order. This is initially the order in which the documents are stored on disk, which is indeed the order in which they were added to the collection.
This order, however, is not guaranteed: a document may be moved on disk when it grows after an update operation and no longer fits in its current space. In that case the initial (insert) order may change.
The way to get results in insert order is to sort by {_id : 1}, as long as _id is of type ObjectId. This returns your documents sorted in ascending order.
Write operations do not take place simultaneously. Write locks are imposed at the database level (V 2.4 and on). The first four bytes of _id are the insert timestamp, and the last three bytes are a counter, initialized to a random value, that is used to distinguish (and sort) ObjectId instances with the same timestamp.
The _id field is indexed by default.
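To make the ObjectId layout concrete, here is a small Python sketch that decodes the leading four-byte timestamp and the trailing three-byte counter from the 24-character hex form, using only the standard library. The function names are made up for this example, and the sample value is just a commonly seen illustrative ObjectId, not one from the question.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    raw = bytes.fromhex(oid_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[:4], "big")  # big-endian seconds since epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def objectid_counter(oid_hex: str) -> int:
    """Decode the counter stored in the last 3 bytes of an ObjectId."""
    raw = bytes.fromhex(oid_hex)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    return int.from_bytes(raw[-3:], "big")


sample = "507f1f77bcf86cd799439011"  # illustrative value only
created = objectid_timestamp(sample)
counter = objectid_counter(sample)
print(created.isoformat(), counter)
```

Because the timestamp has one-second resolution and comes from the clock of whichever machine generated the id, sorting by _id approximates insert order rather than proving it.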

Does updating a MongoDB record rewrite the whole record or only the updated fields?

I have a MongoDB collection as follows:
comment_id (number)
comment_title (text)
score (number)
time_score (number)
final_score (number)
created_time (timestamp)
score is an integer that's usually updated using $inc with 1 or -1 whenever someone votes that record up or down,
but time_score is updated using a function of the timestamp, the current time, and other factors such as how many whole days and whole weeks have passed.
So I apply $inc to score directly in the database, but for time_score I retrieve the data from the database, calculate the new score, and write it back. What I'm worried about is that if many users increment the "score" field while I am calculating time_score, then writing time_score back will overwrite score with a stale value.
To be more clear: does updating specific fields in a record in Mongo rewrite the whole record or only the updated fields? (Assume that all these fields are indexed.)
By default, whole documents are rewritten. To specify the fields that are changed without modifying anything else, use the $set operator.
Edit: The comments on this answer are correct - any of the update modifiers will cause only relevant fields to be rewritten rather than the whole document. By "default", I meant a case where no special modifiers are used (a vanilla document is provided).
The algorithm you are describing is definitely not thread-safe.
When you read the entire document, change one field and then write back the entire document, you are creating a race condition - any field in the document that is modified after your read but before your write will be overwritten by your update.
That's one of many reasons to use $set or $inc operators to atomically set individual fields rather than updating the entire document based on possibly stale values in it.
Another reason is that setting/updating a single field "in-place" is much more efficient than writing the entire document. In addition, you put less load on your network when you pass a smaller update document ({$set:{field:value}}) rather than an entire new version of the document.
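The race described above can be imitated without a database. The following Python sketch uses an in-memory dict as a stand-in for the collection; replace_document, set_field and inc_field are hypothetical helpers that only mimic a whole-document write, a $set and an $inc, and the numbers are invented. It is not MongoDB driver code.

```python
import copy


def replace_document(store, doc_id, new_doc):
    """Mimic writing back an entire document."""
    store[doc_id] = copy.deepcopy(new_doc)


def set_field(store, doc_id, field, value):
    """Mimic {$set: {field: value}}: touch one field only."""
    store[doc_id][field] = value


def inc_field(store, doc_id, field, amount):
    """Mimic {$inc: {field: amount}}: adjust one field only."""
    store[doc_id][field] += amount


# Case 1: read the document, recompute time_score, write the whole document back.
store = {1: {"score": 10, "time_score": 0.0}}
snapshot = copy.deepcopy(store[1])   # our read
inc_field(store, 1, "score", 1)      # someone votes while we are calculating
snapshot["time_score"] = 4.2         # our recalculated value
replace_document(store, 1, snapshot)
lost_update_score = store[1]["score"]  # the vote has been overwritten

# Case 2: same interleaving, but only time_score is written back.
store = {1: {"score": 10, "time_score": 0.0}}
snapshot = copy.deepcopy(store[1])
inc_field(store, 1, "score", 1)
set_field(store, 1, "time_score", 4.2)
safe_score = store[1]["score"]         # the vote survives

print(lost_update_score, safe_score)
```

In the first case the concurrent vote is lost because the stale snapshot replaces the whole document; in the second, the field-level write leaves score untouched.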

Does a newly inserted document in MongoDB always have a "bigger" _id than an older document?

What algorithm does MongoDB use to calculate the "_id" field? It looks like it is incremental.
I'm wondering if it is safe to sort by the "_id" field as a way of sorting by the time the document was inserted.
The way ids are generated is described here. Turns out leading bytes are given to the timestamp, so probably the order of ids corresponds to the order of insertion (if we don't consider deviations in time between different machines).
If you need to sort by order of insertion then you need to add your own field for timestamp or incremental counter. In a sharded set-up sorting by _id might not work.