Does Spring Data Couchbase reactive support pagination?

I have a use case to provide an API that returns a list of users. Since there are more than 10,000 users, pagination is needed as well.
I am using Spring Data Couchbase reactive, and I cannot find a way to implement pagination reactively for my API.
Is it possible to achieve pagination so that API callers can control how many records they want and process them?

You can use the standard N1QL LIMIT and OFFSET:
@Query("#{#n1ql.selectEntity} where #{#n1ql.filter} and companyId = $1 LIMIT $2 OFFSET $3")
Flux<Users> listUsers(String companyId, Integer limit, Integer offset);
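Callers can then drive the paging themselves by sending a page number and page size, which the API translates into LIMIT/OFFSET. A minimal sketch of that arithmetic (plain Python, function name hypothetical):

```python
def page_params(page: int, page_size: int) -> tuple:
    """Translate a zero-based page number into (LIMIT, OFFSET) values."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size > 0")
    return page_size, page * page_size

# A caller asking for page 3 with 50 records per page:
limit, offset = page_params(3, 50)
# limit == 50, offset == 150
```

Note that OFFSET pagination gets slower for deep pages, since the server still has to skip the offset rows; for very large result sets, keyset pagination (a WHERE clause on the last seen key plus LIMIT) tends to scale better.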

Related

Is there an interface like PagingAndSortingRepository in Firestore to handle pagination, similar to Spring Data?

I want to do pagination in Firestore using the REST APIs (with Spring Boot). Is there any way similar to Spring Data?
Currently I am paginating data with query cursors, but that requires the last element (last token/last count) or the last snapshot to get the next batch.
Is there any other way to paginate using just a page number and page size in Firestore?
Unfortunately this is not possible; Firestore's pagination was designed with a query-cursor approach in mind.
Even if there were a library that did this under the hood and gave you the illusion of a page/page-size approach (which I don't think there is), it wouldn't matter, since you are using the REST API to connect to Firestore. You should use the parameters Firestore was designed to accept, which are, as you mentioned, the last element (last token/last count) or the last snapshot to get the next batch.
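To illustrate the cursor pattern Firestore expects, here is a language-agnostic sketch in Python: a stand-in query function returns a batch plus the last document, and the next call starts after that document. All names and the data shape are hypothetical, not the Firestore REST API:

```python
DOCS = [{"id": i} for i in range(10)]  # stand-in for an ordered collection

def query_batch(start_after, limit):
    """Return up to `limit` docs ordered by id, starting after the cursor doc."""
    start = 0 if start_after is None else start_after["id"] + 1
    batch = DOCS[start:start + limit]
    cursor = batch[-1] if batch else None  # last snapshot acts as the next cursor
    return batch, cursor

pages, cursor = [], None
while True:
    batch, cursor = query_batch(cursor, 4)
    if not batch:
        break
    pages.append([d["id"] for d in batch])
# pages == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The trade-off versus page/page-size is that you can only walk forward from a known document, but each batch costs the same regardless of how deep into the collection you are.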

How does ScalarDB support pagination?

I am using ScalarDB, which provides ACID functionality on Cassandra. Does ScalarDB support pagination?
If a partition has, say, 100 records, can I query 10 records at a time, with each query starting from where the previous one left off?
No, ScalarDB doesn't support pagination.
It has to be done on the application side.
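A minimal sketch of doing it on the application side: fetch the partition's records once (e.g. via a scan), then slice them into pages locally. This is plain Python with hypothetical data, not the ScalarDB API:

```python
def paginate(records, page, page_size):
    """Slice an already-fetched, ordered result list into one page."""
    start = page * page_size
    return records[start:start + page_size]

records = list(range(100))  # stand-in for 100 rows read from one partition

first_page = paginate(records, 0, 10)   # rows 0..9
last_page = paginate(records, 9, 10)    # rows 90..99
```

For large partitions you would instead remember the last clustering key of each page and issue the next scan starting after it, so you never hold the whole partition in memory.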

Firestore | Why do all where filters have to be on the same field?

Apparently Firestore does not support queries with range filters on different fields, as described here:
My question is, WHY?
HOW can I query by time and location for example:
Firestore.instance.collection('events')
.where("eventTime", isGreaterThanOrEqualTo: DateTime.now().millisecondsSinceEpoch)
.where("eventLocation", isGreaterThan: lesserGeopoint)
.where("eventLocation", isLessThan: greaterGeopoint)
.where("eventStatus", isEqualTo: "created")
.orderBy('eventLocation', descending: false)
.orderBy('eventTime')
.snapshots(),
(Example from Flutter App written in Dart)
I receive the following Error:
All where filters other than whereEqualTo() must be on the same field. But you have filters on 'eventTime' and 'eventLocation', null)
I don't understand why this is not supported, or how to solve queries like this.
Any help is appreciated :-)
Thanks,
Michael
Cloud Firestore gives a strong performance guarantee for any read operations it allows: the time a read operation takes depends on the number of items you're reading, not on the number of items in the collection. This is a quite unique performance guarantee, since it means that your queries will take the same amount of time when you have 1 million users as when you have 1 billion users.
Firestore only offers query operations for which it can maintain this performance guarantee. This is the main reason for the limitations you may find in the Firestore query API.
To work around the limit, you'll typically perform the filtering and ordering on the primary field in the query, and then sort on the secondary field client-side.
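A sketch of that workaround, using plain Python stand-ins for the query and the client (all data hypothetical): the range filter and ordering on eventTime happen "server-side", and the ordering by eventLocation is applied afterwards on the already-bounded result set:

```python
events = [
    {"eventTime": 100, "eventLocation": 5.0},
    {"eventTime": 200, "eventLocation": 1.0},
    {"eventTime": 50,  "eventLocation": 9.0},
    {"eventTime": 300, "eventLocation": 3.0},
]

# "Server-side": a single-field range filter plus orderBy, which Firestore allows.
matching = sorted((e for e in events if e["eventTime"] >= 100),
                  key=lambda e: e["eventTime"])

# Client-side: re-sort the fetched results on the secondary field.
by_location = sorted(matching, key=lambda e: e["eventLocation"])
# by_location order: locations 1.0, 3.0, 5.0
```

The key point is to put the most selective range filter in the query itself, so the client only ever sorts a small, already-filtered set.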
If you'd like to learn more about Firestore's query capabilities and limitations, I highly recommend watching How do queries work in Cloud Firestore? and the rest of the Get to Know Cloud Firestore video series.
Also see:
Firestore order by two fields
Firestore order by two fields in one query
Have you had a read through the page on indexes, which describes how some compound queries can be supported, and how composite indexes can be created and used?

Is it possible to use SQL functions in Sort.Order in Spring Data JPA

We have a query where we need to run a MySQL function to get correct ORDER BY results. We are using Spring Data JPA, and the WHERE clause is built from a Specification. The pagination and sorting are done by a Pageable instance. So I need to be able to pass the MySQL function convert() to the Sort.Order object in order to sort our results correctly. Since we use pagination, we need to be able to put that into the Sort.Order object. Is it possible?
I tried countless Google searches, and all I would get was the high-level Spring Data documentation, or results that were just about MySQL.

How to do basic aggregation with DynamoDB?

How is aggregation achieved with DynamoDB? MongoDB and Couchbase have map-reduce support.
Let's say we are building a tech blog where users can post articles, and articles can be tagged.
user
{
  id : 12345,
  name : "John",
  ...
}
article
{
  id : 789,
  title : "dynamodb use cases",
  author : 12345, // userId
  tags : ["dynamodb", "aws", "nosql", "document database"]
}
In the user interface we want to show, for the current user, the tags and their respective counts.
How can the following aggregation be achieved?
{
  userid : 12345,
  tag_stats : {
    "dynamodb" : 3,
    "nosql" : 8
  }
}
We will provide this data through a REST API, and it will be called frequently, since this information is shown on the app's main page.
I can think of extracting all documents and doing the aggregation at the application level, but I am afraid my read capacity units will be exhausted.
We could use tools like EMR, Redshift, BigQuery, or AWS Lambda, but I think those are for data-warehousing purposes.
I would like to know other, better ways of achieving the same.
How are people achieving simple dynamic queries like these, having chosen DynamoDB as the primary data store, considering cost and response time?
Long story short: DynamoDB does not support this. It's not built for this use case; it's intended for quick data access with low latency. It simply does not support any aggregation functionality.
You have three main options:
Export DynamoDB data to Redshift or EMR Hive. Then you can execute SQL queries on the stale data. The benefit of this approach is that it consumes RCUs just once, but you will be stuck with outdated data.
Use the DynamoDB connector for Hive and query DynamoDB directly. Again you can write arbitrary SQL queries, but in this case the queries access the data in DynamoDB directly. The downside is that they consume read capacity on every query you run.
Maintain aggregated data in a separate table using DynamoDB streams. For example, you can have a table with UserId as the partition key and a nested map of tags and counts as an attribute. On every update to your original data, DynamoDB streams will execute a Lambda function (or some code on your hosts) to update the aggregate table. This is the most cost-efficient method, but you will need to implement additional code for each new query.
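The update step of that third option can be sketched as follows: each stream record carries the old and new images of an article, and the handler applies the tag deltas to a per-user aggregate. Here a plain dict stands in for the aggregate table, and the record shape is simplified, not the exact DynamoDB Streams event format:

```python
from collections import defaultdict

user_stats = defaultdict(lambda: defaultdict(int))  # userId -> {tag: count}

def handle_record(record):
    """Apply one article create/update/delete to the aggregate table."""
    old = record.get("old") or {"author": None, "tags": []}
    new = record.get("new") or {"author": None, "tags": []}
    for tag in old["tags"]:      # remove the old image's contribution
        user_stats[old["author"]][tag] -= 1
    for tag in new["tags"]:      # add the new image's contribution
        user_stats[new["author"]][tag] += 1

# Two article creates by the same user:
handle_record({"new": {"author": 12345, "tags": ["dynamodb", "aws"]}})
handle_record({"new": {"author": 12345, "tags": ["dynamodb"]}})
# user_stats[12345] == {"dynamodb": 2, "aws": 1}
```

In a real deployment the dict writes would become DynamoDB UpdateItem calls with ADD expressions, so the counters update atomically under concurrent stream batches.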
Of course you can extract data at the application level and aggregate it there, but I would not recommend doing that. Unless you have a small table, you will need to think about throttling, using only part of your provisioned capacity (you want aggregation to consume, say, 20% of your RCUs, not 100%), and how to distribute the work among multiple workers.
Both Redshift and Hive already know how to do this. Redshift relies on multiple worker nodes when it executes a query, while Hive is built on top of MapReduce. Also, both Redshift and Hive can use a predefined percentage of your RCU throughput.
DynamoDB is a pure key/value store and does not support aggregation out of the box.
If you really want to do aggregation using DynamoDB, here are some hints.
For your particular case, let's have a table named articles.
To do the aggregation we need an extra table, user-stats, holding userId and tag_stats.
Enable DynamoDB streams on the articles table.
Create a new Lambda function, user-stats-aggregate, which is subscribed to the articles DynamoDB stream and receives NEW_AND_OLD_IMAGES on every create/update/delete operation on the articles table.
The Lambda will perform the following logic:
If there is no old image, get the current tags and increment each tag's count by 1 in the database for this user. (Keep in mind there may be no initial record in user-stats for this user.)
If there is an old image, check whether each tag was added or removed and apply +1 or -1 accordingly for each affected tag for the received user.
Stand up an API service that retrieves these user stats.
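The add/remove decision in the second step amounts to a set difference over the old and new tag lists. A sketch (plain Python, shapes hypothetical):

```python
def tag_deltas(old_tags, new_tags):
    """Return {tag: +1 or -1} for tags added to or removed from an article."""
    old_set, new_set = set(old_tags), set(new_tags)
    deltas = {tag: 1 for tag in new_set - old_set}    # newly added tags
    deltas.update({tag: -1 for tag in old_set - new_set})  # removed tags
    return deltas

deltas = tag_deltas(["dynamodb", "aws"], ["dynamodb", "nosql"])
# deltas == {"nosql": 1, "aws": -1}; "dynamodb" is unchanged, so no entry
```

Tags present in both images produce no delta, which keeps the write load on user-stats proportional to what actually changed.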
Usually, aggregation in DynamoDB is done using DynamoDB streams, Lambdas that perform the aggregation, and extra tables keeping the aggregated results at different granularities (minutes, hours, days, years, ...).
This gives near-real-time aggregation without computing it on the fly for every request; you query the aggregated data instead.
Basic aggregation can also be done using scan() and query() in a Lambda.
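A sketch of that scan-and-aggregate fallback, with a plain list standing in for the scan() result (hypothetical item shape; a real Lambda would also follow LastEvaluatedKey across scan pages and pay RCUs for every item read):

```python
from collections import Counter

def aggregate_tags(items, user_id):
    """Count tag occurrences across one user's scanned articles."""
    counts = Counter()
    for item in items:
        if item["author"] == user_id:
            counts.update(item["tags"])
    return dict(counts)

items = [
    {"author": 12345, "tags": ["dynamodb", "aws"]},
    {"author": 12345, "tags": ["dynamodb", "nosql"]},
    {"author": 99999, "tags": ["aws"]},
]
# aggregate_tags(items, 12345) == {"dynamodb": 2, "aws": 1, "nosql": 1}
```

This is fine for small tables or infrequent calls, but for a frequently hit endpoint the stream-maintained aggregate table above is the cheaper option.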