In Couchbase, expired documents are included in view query results with NULL contents - touch

We recently started using Couchbase. We access it from Java through spring-data-couchbase with Jersey. Using the lower-level Java SDK API, we set an expiry time (TTL) on a particular document by its key (id). That works fine. The code is as follows.
// define couchbaseTemplate for lower-level access to the Java SDK
@Autowired
CouchbaseTemplate couchbaseTemplate;
// setExpiry updates the expiry of the document with the given ID
@Override
public void setExpiry(String key, int expN) throws RepositoryException {
    couchbaseTemplate.getCouchbaseClient().touch(key, expN);
}
The problem is that when we fetch a list of documents with a view query, the list still contains the expired documents, and when we access those documents from the list they are null.
If we run the query a while later, the expired documents are no longer in the list.
Example: with expN = 10 seconds, a query run around 10 seconds after setting the TTL still includes the expired documents.
A query run around 20 seconds after setting the TTL no longer includes them.
For the stale option we set
Query.setStale(Stale.FALSE)
We have also tried changing
Query.setIncludeDocs
but with no luck. Any help is appreciated.

Couchbase Server does expiries lazily. There are three ways an item can be expired:
When a document is accessed (get operation) the expiration value is checked
When the expiry pager runs
When disk compaction process runs (Only in Couchbase Server 3 and onwards)
As a result, views are not updated until one of these three things has happened.
For this use case you could simply do a range query against the view using the current time, so that it only returns documents that have not expired. This assumes the clocks on the cluster and the client agree, and that the view being used is this one:
function (doc, meta) {
  emit(meta.expiration, null);
}
The meta.expiration is an epoch timestamp, so the following query could be used:
// pass the key as a number so it collates with the numeric meta.expiration keys
long currentEpoch = System.currentTimeMillis() / 1000;
bucket.query(ViewQuery.from("designdoc", "myview").startKey(currentEpoch));
Please note that this returns only live documents that have an expiration set. Documents without a TTL have meta.expiration 0, so they sort before the start key and are left out.
If you want to do something more interesting with date formats have a look at the Date and time selection half way down in the View and query examples chapter in the Couchbase Server manual.

As the official Couchbase documentation says:
Detecting Expired Documents in Result Sets : If you are using views for indexing items from Couchbase Server, items that have not yet been removed as part of the expiry pager maintenance process will be part of a result set returned by querying the view. To exclude these items from a result set you should use query parameter include_doc set to true.
For expired documents, if you set include_doc=true, Couchbase Server returns a result set indicating the document does not exist anymore. Specifically, the key that had expired but had not yet been removed by the cleanup process will appear in the result set as a row where "doc":null.
So, this is how Couchbase handles expired documents.
In your case, just filter out the rows where doc is null; the rest is your expected result.
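The filtering step can be sketched in plain Java. This is only an illustration: Row is a hypothetical stand-in for a view row (it is not the SDK's row type), with doc holding whatever the query returned for that row when documents are included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExpiredRowFilter {
    // Hypothetical stand-in for one view row: the row id plus the document
    // returned with it, which is null once the document has expired.
    static final class Row {
        final String id;
        final String doc;

        Row(String id, String doc) {
            this.id = id;
            this.doc = doc;
        }
    }

    // Returns only the rows whose document still exists.
    static List<Row> dropExpired(List<Row> rows) {
        List<Row> alive = new ArrayList<>();
        for (Row row : rows) {
            if (row.doc != null) {
                alive.add(row);
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("user::1", "{\"name\":\"a\"}"),
                new Row("user::2", null), // expired, not yet cleaned up
                new Row("user::3", "{\"name\":\"c\"}"));
        for (Row row : dropExpired(rows)) {
            System.out.println(row.id);
        }
    }
}
```

With the real SDK the same null check would be applied to whatever row objects the view query returns.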

Related

query modified documents and load other from cache in firestore

To reduce the number of reads, a common technique is to keep a last-edited timestamp in each document and compare against it so that only modified documents are loaded.
Here is an example from the Firebase docs:
db.collection('groups')
  .where('participants', 'array-contains', 'user123')
  .where('lastUpdated', '>', lastFetchTimestamp)
  .orderBy('lastUpdated', 'desc')
  .limit(25)
They claim this reduces the number of reads.
I tried implementing this use case. I have a document as shown below:
I have sections in my app where I use scorecards to list top scorers. My query is as follows:
private void loadFriendScores(UserScorecard scorecard) {
    Query friendScoreQuery = scorecardRef.whereIn("uid", scorecard.getFriendsList())
            .whereGreaterThan("lastActive", scorecard.getLastActive()).limit(5);
    FirestoreRecyclerOptions<UserScorecard> friends = new FirestoreRecyclerOptions
            .Builder<UserScorecard>()
            .setQuery(friendScoreQuery, UserScorecard.class)
            .setLifecycleOwner(getViewLifecycleOwner())
            .build();
    TopScoresAdapter friendsAdapter = new TopScoresAdapter(friends, getContext(), this);
    binding.topScorersFriendsRcv.setAdapter(friendsAdapter);
    binding.topScorersFriendsRcv.setLayoutManager(new LinearLayoutManager(getContext()));
}
I assumed the query would load all modified documents along with the others (from cache):
The screen on Android is as follows:
I expected it to load my whole friends list (as I understood from the docs).
I suppose the docs did not mention that you also need to fetch the cached list yourself. There is a way to make a cache-only request in Firestore,
but I'm not sure how reliable it is: the cache may have been cleared, in which case that request comes back empty.
In that case you should save the last response yourself using a local storage library such as
@react-native-async-storage/async-storage
I'm struggling with the cost issue myself. My reads are way higher than 50 and I'm not sure how to count them properly, so I upvoted the issue.
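One way to read the suggestion above is: keep the previously loaded list yourself and merge into it only the documents that the timestamp query returns. A minimal sketch of that merge, with no Firestore calls at all; Scorecard and ScorecardCache are hypothetical names, and the field names are borrowed from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ScorecardCache {
    // Hypothetical, simplified scorecard: only the fields the merge needs.
    static final class Scorecard {
        final String uid;
        final int score;
        final long lastActive;

        Scorecard(String uid, int score, long lastActive) {
            this.uid = uid;
            this.score = score;
            this.lastActive = lastActive;
        }
    }

    // Locally kept copy of everything loaded so far, keyed by uid.
    private final Map<String, Scorecard> byUid = new LinkedHashMap<>();

    // Overwrites cached entries with the modified documents and returns the
    // newest lastActive seen, to be used as the threshold of the next query.
    long merge(List<Scorecard> modified, long previousThreshold) {
        long newest = previousThreshold;
        for (Scorecard card : modified) {
            byUid.put(card.uid, card);
            newest = Math.max(newest, card.lastActive);
        }
        return newest;
    }

    // The full list to display: cached entries plus merged changes.
    List<Scorecard> all() {
        return new ArrayList<>(byUid.values());
    }

    public static void main(String[] args) {
        ScorecardCache cache = new ScorecardCache();
        // First load: every friend is "modified" relative to threshold 0.
        long threshold = cache.merge(Arrays.asList(
                new Scorecard("alice", 10, 100L),
                new Scorecard("bob", 20, 110L)), 0L);
        // Later load: the query only returned bob, who changed since then.
        threshold = cache.merge(Arrays.asList(new Scorecard("bob", 25, 150L)), threshold);
        for (Scorecard card : cache.all()) {
            System.out.println(card.uid + " " + card.score);
        }
        System.out.println("next threshold: " + threshold);
    }
}
```

The point of the sketch is that the timestamp query alone only ever yields the changed documents; the unchanged ones have to come from somewhere you control.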

Atomically query for all collection documents + watching for further changes

Our Java app saves its configurations in MongoDB collections. When the app starts it reads all the configurations from MongoDB and caches them in Maps. We would also like to use the change stream API to watch for updates to the configuration collections.
So, on app startup, we would first like to get all configurations and, from then on, watch for any further change.
Is there an easy way to execute the following atomically:
1. A find() that retrieves all configurations (documents)
2. Start a watch() that will send all further updates
By atomically I mean without potentially missing any update (between 1 and 2 someone could update the collection with a new configuration).
To make sure I lose no update notifications, I found that I can use watch().startAtOperationTime(serverTime) (MongoDB 4.0 or later), as follows.
1. Query the MongoDB server for its current time, using a command such as Document hostInfoDoc = mongoTemplate.executeCommand(new Document("hostInfo", 1))
2. Query for all interesting documents: List<C> configList = mongoTemplate.findAll(clazz);
3. Extract the server time from hostInfoDoc: BsonTimestamp serverTime = (BsonTimestamp) hostInfoDoc.get("operationTime");
4. Start the change stream configured with the saved server time: ChangeStreamIterable<Document> changes = eventCollection.watch().startAtOperationTime(serverTime);
Since 1 ends before 2 starts, we know that the documents returned by 2 are at least as fresh as they were at that server time. Any updates that happened at or after this server time will be sent to us by the change stream. (I don't mind re-applying redundant updates: I use a map as the cache, so an extra add/remove makes no difference as long as the last action arrives.)
I think I could also use watch().resumeAfter(_idOfLastAddedDoc) (didn't try). I did not use this approach because of the following scenario: the collection is empty, and the first document is added after getting all (none) documents, and before starting the watch(). In that scenario I don't have previous document _id to use as resume token.
Update
Instead of using "hostInfo" to get the server time, which couldn't be used in our production environment, I ended up using "dbStats" like this:
Document dbStats = mongoOperations.executeCommand(new Document("dbStats", 1));
BsonTimestamp serverTime = (BsonTimestamp) dbStats.get("operationTime");
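Why replaying redundant updates is harmless for a map cache can be sketched without a database. In this illustration ConfigCache and Change are hypothetical names; a Change with a null value stands for a delete. The snapshot already reflects the first change, which is then replayed anyway because the stream starts at a slightly earlier point in time.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConfigCache {
    // Hypothetical change event: a null value means the key was deleted.
    static final class Change {
        final String key;
        final String value;

        Change(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, String> configs = new HashMap<>();

    // Replaces the cache content with the result of the initial find().
    void loadSnapshot(Map<String, String> snapshot) {
        configs.clear();
        configs.putAll(snapshot);
    }

    // Applies one change; applying the same change twice leaves the same state.
    void apply(Change change) {
        if (change.value == null) {
            configs.remove(change.key);
        } else {
            configs.put(change.key, change.value);
        }
    }

    Map<String, String> view() {
        return Collections.unmodifiableMap(configs);
    }

    public static void main(String[] args) {
        ConfigCache cache = new ConfigCache();
        // Snapshot taken after "timeout" was already set to 30.
        cache.loadSnapshot(Collections.singletonMap("timeout", "30"));
        // The stream starts slightly earlier, so the first event is redundant.
        List<Change> replayed = Arrays.asList(
                new Change("timeout", "30"),
                new Change("retries", "5"),
                new Change("timeout", null));
        for (Change change : replayed) {
            cache.apply(change);
        }
        System.out.println(cache.view());
    }
}
```

This only holds as long as the events are applied in stream order through to the most recent one, which is the same condition the text states.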

Setting the _id of cloudant in node-red to ensure order of documents

I have a node-red code that works in the following way:
It receives a message (json form) and saves it to cloudant DB
Then I can make an http call where I can see all the contents of the DB
This is all good, but the problem is that when it saves it to cloudant, it gives it a random _id, so the order of the documents in the DB isn't the same as the order they came in, but random.
Is there a way to maybe set the _id while saving in node red? Or is there another solution?
I just want that when I call the http it shows it in the order that it came in (last to first, or first to last, doesn't matter).
You can set the _id with a function node or a change node before passing it to the Cloudant out node.
But if you just want them in the order they arrived, add a timestamp field and have the query node use a view that sorts the documents by that timestamp.

paginated data with the help of mongo inbound adapter in spring integration

I am using the mongo inbound adapter to retrieve data from MongoDB. Currently I am using the configuration below.
<int-mongo:inbound-channel-adapter
id="mongoInboundAdapter" collection-name="updates_IPMS_PRICING"
mongo-template="mongoTemplatePublisher" channel="ipmsPricingUpdateChannelSplitter"
query="{'flagged' : false}" entity-class="com.snapdeal.coms.publisher.bean.PublisherVendorProductUpdate">
<poller max-messages-per-poll="2" fixed-rate="10000"></poller>
</int-mongo:inbound-channel-adapter>
I have around 20 records in my database that match this query, but since I set max-messages-per-poll to 2, I expected to get at most 2 records per poll.
Instead I am getting all the records that match the query. I am not sure what I am doing wrong.
Actually I'd suggest raising a New Feature JIRA ticket so that query-expression can accept an org.springframework.data.mongodb.core.query.Query builder, which has skip() and limit() options. With that, your issue could be fixed like this:
<int-mongo:inbound-channel-adapter
query-expression="new BasicQuery('{\'flagged\' : false}').limit(2)"/>
The mongo adapter is designed to return a single message containing a collection of query results per poll. So max-messages-per-poll makes no difference here.
max-messages-per-poll is used to short-circuit the poller and, in your case, the second poll is done immediately rather than waiting 10 seconds again. After 2 polls, we wait again.
In order to implement paging, you will need to use a query-expression instead of query and maintain some state somewhere that can be included in the query on each poll.
For example, if the documents have some value that increments you can store off that value in a bean and use the value in the next poll to get the next one.
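A minimal sketch of such a state-holding bean, under loud assumptions: the class name LastSeenTracker and the increasing field "seq" are hypothetical, and nothing here is part of Spring Integration itself. It only shows the idea of remembering the last handled value and folding it into the next query.

```java
import java.util.concurrent.atomic.AtomicLong;

final class LastSeenTracker {
    // Highest value of the (hypothetical) increasing field handled so far.
    private final AtomicLong lastSeen = new AtomicLong(0L);

    // Builds the query for the next poll: only unflagged documents whose
    // "seq" is greater than the last value already handled.
    String nextQuery() {
        return "{'flagged' : false, 'seq' : {'$gt' : " + lastSeen.get() + "}}";
    }

    // Records a handled value; the stored maximum never moves backwards.
    void advance(long seq) {
        lastSeen.accumulateAndGet(seq, Math::max);
    }

    public static void main(String[] args) {
        LastSeenTracker tracker = new LastSeenTracker();
        System.out.println(tracker.nextQuery());
        tracker.advance(5L);
        tracker.advance(3L); // older value, ignored
        System.out.println(tracker.nextQuery());
    }
}
```

If a class like this were registered as a bean, the adapter's query-expression could presumably call it through SpEL's @beanName syntax, and a downstream handler would call advance() after processing each batch; check the Spring Integration MongoDB reference for what the expression is allowed to return.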

Updating an element in all documents in a MongoDB collection

I am running the following query with the purpose of updating a single element in all the existing documents in the collection. I am basically trying to clear their value to "0".
Here is the code:
MongoCollection collection = db.GetCollection(DataAccessConfiguration.Settings.CollectionName);
var query = Query.Exists("ElementName", true);
var update = Update.Set("ElementName", "0");
collection.Update(query, update);
It only updates a single document.
How can I update all elements at once?
Updates in MongoDB affect 0 or 1 documents by default (0 only if the query specifier doesn't match anything). To update all documents, you need to pass UpdateFlags.Multi as the third argument to Update. There is also a 4-argument version of Update which accepts the "safe mode" flag as the fourth argument.
(Safe mode bundles a getLastError command with the update, and causes the driver to wait until the server acknowledges that the write has succeeded. There are various options to safe mode that will wait for acknowledgement from multiple servers if you are using a replica set, that will wait only for a certain period of time and then return with an error, etc).
Also be sure to see the C# driver documentation for details on the API.