Getting Cassandra query results asynchronously using Scala + Monix

I'm building a REST API using Akka HTTP, Monix and the DataStax Java Driver for Apache Cassandra, and I'm having some trouble fetching items from Cassandra, waiting for the query to be fulfilled, and returning the results.
I'm able to print all the results easily, but unable to wait for the query to be done before returning the items. My REST endpoint simply returns an empty array of items, since it does not wait for the query to complete.
I have an executeQuery method that takes:
queryString: String (a Cassandra query)
page: Int (used for pagination)
parameters: Any* (parameters for the query, if needed)
and returns an Observable[Row].
Then, in order to perform such a query, retrieve its results, parse them and send them back, I use a Monix Observable and Subscription.
Let's suppose I want to retrieve some items by a common field known as pid:
import monix.execution.Ack
import monix.execution.Scheduler.Implicits.global
import com.datastax.driver.core.Row
import monix.reactive.Observable
import cassandra.src.CassandraHelper
import item.src.entity.{Item, Items}
. . .
val keyspace = "my_keyspace"
val table = "items"
. . .
def getItems(): Items = {
  var itemList: List[Item] = List()
  val observable: Observable[Row] = CassandraHelper.executeQuery(
    "SELECT * FROM " + keyspace + "." + table,
    1
  )
  observable.subscribe { row =>
    itemList ::= ItemMapper.rowToItem()(row)
    Ack.Continue
  }
  Items(itemList)
}
Here rowToItem simply parses a Row into an Item, and Items wraps a List[Item].
I was taking a look at Task, but I'm not quite sure it's what I'm looking for.
EDIT
With @Alexandru Nedelcu's solution I'm able to print all the items in itemList as soon as they are inserted into it, but I still get an empty response for that call: { "items" : [] }.
Here's the edited code:
def getItems(): Items = {
  var itemList: List[Item] = List()
  val observable: Observable[Row] = CassandraHelper.executeQuery(
    "SELECT * FROM " + keyspace + "." + table,
    1
  )
  observable.subscribe { row =>
    println(itemList)
    itemList ::= ItemMapper.rowToItem()(row)
    Ack.Continue
  }
  Items(itemList)
}
How can I wait for all the results to be parsed and inserted into itemList, and then send them back?

From what I understand, you have an Observable[Row] and you want to build an Items out of it, aggregating every Row element from the source stream, is that correct?
If so, foldLeftL is what you want: it aggregates every element into a state and returns the final result once the source stream completes:
// We need to suspend the Task, because your Items is probably a
// mutable object and it's best to suspend side effects ;-)
val items: Task[Items] = Task.suspend {
  val initial: List[Item] = List.empty
  val observable: Observable[Row] = ???
  // This returns a Task[Items] when the source completes
  observable.foldLeftL(initial) { (list, row) =>
    // prepend each parsed row to the (immutable) accumulator
    ItemMapper.rowToItem()(row) :: list
  }.map(Items(_))
}
A Task is like a lazy Future, and you can convert it into a Future with runAsync. More details here: https://monix.io/docs/2x/eval/task.html
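For instance, here is a minimal sketch of how the question's getItems could be rewritten so that it returns a Future[Items] which Akka HTTP can wait on; it reuses the question's CassandraHelper, ItemMapper, Item and Items types, and assumes a marshaller for Items exists:
import scala.concurrent.Future
import monix.execution.Scheduler.Implicits.global
import monix.reactive.Observable
import com.datastax.driver.core.Row

// Sketch: aggregate all rows with foldLeftL, then expose the lazy Task
// as a running Future that the HTTP layer can wait on.
def getItems(): Future[Items] = {
  val observable: Observable[Row] = CassandraHelper.executeQuery(
    "SELECT * FROM " + keyspace + "." + table,
    1
  )
  observable
    .foldLeftL(List.empty[Item]) { (acc, row) =>
      ItemMapper.rowToItem()(row) :: acc
    }
    .map(Items(_)) // wrap the accumulated list once the stream completes
    .runAsync      // convert the lazy Task into a running CancelableFuture
}
In a route you can then write complete(getItems()), and the response is sent only after the source stream has completed.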

Related

Updating a many-to-many join table in slick 3.0

I have a database structure with a many-to-many relationship between Dreams and Tags.
Dreams and Tags are kept in separate tables, and there is a join table between them as usual in this kind of situation, with the class DreamTag representing the connection:
protected class DreamTagTable(tag: Tag) extends Table[DreamTag](tag, "dreamtags") {
  def dreamId = column[Long]("dream_id")
  def dream = foreignKey("dreams", dreamId, dreams)(_.id)
  def tagId = column[Long]("tag_id")
  def tag = foreignKey("tags", tagId, tags)(_.id)
  // default projection
  def * = (dreamId, tagId) <> ((DreamTag.apply _).tupled, DreamTag.unapply)
}
I have managed to perform the appropriate double JOIN to retrieve a Dream with its Tags, but I struggled to do it in a fully non-blocking manner.
Here is my code for performing the retrieval, as this may shed some light on things:
def createWithTags(form: DreamForm): Future[Seq[Int]] = db.run {
  logger.info(s"Creating dream [$form]")
  // action to put the dream
  val dreamAction: DBIO[Dream] =
    dreams.map(d => (d.title, d.content, d.date, d.userId, d.emotion))
      .returning(dreams.map(_.id))
      .into((fields, id) => Dream(id, fields._1, fields._2, fields._3, fields._4, fields._5))
      .+=((form.title, form.content, form.date, form.userId, form.emotion))
  // action to put any tags that don't already exist (create a single action)
  val tagActions: DBIO[Seq[MyTag]] =
    DBIO.sequence(form.tags.map(text => createTagIfNotExistsAction(text)))
  // zip allows us to get the results of both actions in a tuple
  val zipAction: DBIO[(Dream, Seq[MyTag])] = dreamAction.zip(tagActions)
  // put the entries into the join table, if the zipAction succeeds
  val dreamTagsAction = exec(zipAction.asTry) match {
    case Success(value) => value match {
      case (dream, tags) =>
        DBIO.sequence(tags.map(tag => createDreamTagAction(dream, tag)))
    }
    case Failure(exception) => throw exception
  }
  dreamTagsAction
}
private def createTagIfNotExistsAction(text: String): DBIO[MyTag] = {
  tags.filter(_.text === text).result.headOption.flatMap {
    case Some(t: MyTag) => DBIO.successful(t)
    case None =>
      tags.map(t => (t.text))
        .returning(tags.map(_.id))
        .into((text, id) => MyTag(id, text)) += text
  }
}

private def createDreamTagAction(dream: Dream, tag: MyTag): DBIO[Int] = {
  dreamTags += DreamTag(dream.id, tag.id)
}

/**
 * Helper method for executing an async action in a blocking way
 */
private def exec[T](action: DBIO[T]): T = Await.result(db.run(action), 2.seconds)
Scenario
Now I'm at the stage where I want to be able to update a Dream and the list of Tags, and I'm struggling.
Given a situation where the existing list of tags is ["one", "two", "three"] and is being updated to ["two", "three", "four"] I want to:
Delete the Tag for "one", if no other Dreams reference it.
Not touch the entries for "two" and "three", as the Tag and DreamTag entries already exist.
Create Tag "four" if it doesn't exist, and add an entry to the join table for it.
I think I need to do something like list1.diff(list2) and list2.diff(list1), but that would require performing a get first, which seems wrong.
Perhaps my thinking is wrong: is it best to just clear all entries in the join table for this Dream and then create every item in the new list, or is there a nice way to diff the two lists (previous and existing) and perform the deletes/adds as appropriate?
Thanks for the help.
N.B. Yes, Tag is a super-annoying class name to have, as it clashes with slick.lifted.Tag!
Update - My Solution:
I went for option 2 as mentioned by Richard in his answer...
// action to put any tags that don't already exist (create a single action)
val tagActions: DBIO[Seq[MyTag]] =
  DBIO.sequence(form.tags.map(text => createTagIfNotExistsAction(text)))
// zip allows us to get the results of both actions in a tuple
val zipAction: DBIO[(Int, Seq[MyTag])] = dreamAction.zip(tagActions)
// first clear away the existing dreamtags
val deleteExistingDreamTags = dreamTags
  .filter(_.dreamId === dreamId)
  .delete
// put the entries into the join table, if the zipAction succeeds
val dreamTagsAction = zipAction.flatMap {
  case (_, tags) =>
    DBIO.sequence(tags.map(tag => createDreamTagAction(dreamId, tag)))
}
deleteExistingDreamTags.andThen(dreamTagsAction)
I struggled to do it in a fully non-blocking manner.
I see you have an exec call, which is blocking. It looks like this can be replaced with a flatMap:
case class Dream()
case class MyTag()

val zipAction: DBIO[(Dream, Seq[MyTag])] =
  DBIO.successful((Dream(), MyTag() :: MyTag() :: Nil))

def createDreamTagAction(dream: Dream)(tag: MyTag): DBIO[Int] =
  DBIO.successful(1)

val action: DBIO[Seq[Int]] =
  zipAction.flatMap {
    case (dream, tags) => DBIO.sequence(tags.map(createDreamTagAction(dream)))
  }
Is it best to just clear all entries in the join table for this Dream and then create every item in the new list, or is there a nice way to diff the two lists (previous and existing) and perform the deletes/adds as appropriate?
Broadly, you have three options:
1) Look in the database to see what tags exist, compare them to what you want the state to be, and compute a set of insert and delete actions.
2) Delete all the tags and insert the state you want to reach.
3) Move the problem to SQL, so you insert tags where they don't already exist in the table, and delete tags that don't exist in your desired state. You'd need to look at the capabilities of your database, and you'd likely need to use Plain SQL in Slick to get the effect. I'm not sure what the insert would be for adding tags (perhaps a MERGE or upsert of some kind), but the delete would be of the form delete from tags where tag not in (1,2) if you wanted a final state of just tags 1 and 2.
The trade-offs:
For 1, you need to run one query to fetch the existing tags, then one query for the deletes and at least one for the inserts. This will change the smallest number of rows, but uses the largest number of queries.
For 2, you'll be executing at least two queries: a delete and (potentially) one bulk insert. This will change the largest number of rows.
For 3, you'll be executing a constant two queries (if your database can carry out the logic for you). If this is even possible, the queries will be more complicated.
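For option 1, a minimal sketch of what the diffing action could look like, reusing the question's tags and dreamTags tables and its createTagIfNotExistsAction. Note this only adjusts the join table; deleting Tag rows that no other Dream references would need an extra step:
// Sketch of option 1: diff the desired tag texts against the database state.
def updateDreamTagsAction(dreamId: Long, desired: Set[String]): DBIO[Unit] =
  (for {
    // tags currently linked to this dream
    current <- tags
      .join(dreamTags.filter(_.dreamId === dreamId)).on(_.id === _.tagId)
      .map(_._1)
      .result
    toRemove = current.filter(t => !desired.contains(t.text))
    toAdd = desired -- current.map(_.text).toSet
    // drop join entries for tags that are no longer wanted
    _ <- dreamTags
      .filter(dt => dt.dreamId === dreamId && dt.tagId.inSet(toRemove.map(_.id)))
      .delete
    // create any missing tags, then link them to the dream
    added <- DBIO.sequence(toAdd.toSeq.map(createTagIfNotExistsAction))
    _ <- dreamTags ++= added.map(t => DreamTag(dreamId, t.id))
  } yield ()).transactionally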

Disable caching in Angular Firestore queries

I am running a Firestore query to get data, but the query first returns data from queries cached earlier, and then returns additional data (which was not queried earlier) in a second pass from the server. Is there a way I can disable caching for Firestore queries, so that the request goes to the DB every time I query something?
this.parts$ = this.db.collection<OrderBom>('OrderBom', ref => {
  let query: firebase.firestore.Query = ref;
  query = query.where('orderPartLC', '==', this.searchValue.toLowerCase());
  return query;
}).valueChanges();
Change that .valueChanges() to a .snapshotChanges(), and then you can apply a filter. See the example below.
I don't like changing default behavior (default configurations). I saw that this is the intended behavior, and that the recommended practice is to show data to the user as soon as possible, even if the screen refreshes twice.
I don't think it's bad practice to filter on fromCache === false when we don't have a choice. (In my case I make more requests after I receive this first one, so due to promises and other async tasks the cache/server order is completely lost.)
See this closed issue.
getChats(user: User) {
  return this.afs.collection<Chat>("chats",
    ref => ref.where('participantsId', 'array-contains', user.id))
    .snapshotChanges()
    .pipe(filter(actions => actions.every(a => a.payload.doc.metadata.fromCache === false)))
    .pipe(map(/* probably want to parse your object here */));
}
If using AngularFire2, you can try the following.
I read on the Internet that you can disable offline persistence (which caches your results) by not calling enablePersistence() on AngularFireStoreModule.
I did that first and still had no success, but try it before anything else. What I managed to do to get rid of cached results was to use the get() method of the DocumentReference class. This method receives a GetOptions parameter, through which you can force the data to come from the server. Usage example:
// fireStore is an instance of AngularFireStore injected by AngularFire2
let collection = fireStore.collection<any>("my-collection-name");
let options: GetOptions = { source: "server" };
collection.ref.get(options).then(results => {
  // results contains an array property called docs with the collection's documents.
});
Persistence and caching should be disabled for angular/fire by default, but it is not, and there is no way to turn it off. As such, @BorisD's answer is correct, but he hasn't explained it too well. Here's a full example of converting valueChanges to snapshotChanges.
constructor(private afs: AngularFirestore) {}

private getSequences(collection: string): Observable<IPaddedSequence[]> {
  return this.afs.collection<IFirestoreVideo>('videos', ref => {
    return ref
      .where('flowPlayerProcessed', '==', true)
      .orderBy('sequence', 'asc')
  }).valueChanges().pipe(
    map((results: IFirestoreVideo[]) => results.map((result: IFirestoreVideo) => ({ videoId: result.id, sequence: result.sequence })))
  )
}
Converting the above to use snapshotChanges to filter out stuff from cache:
constructor(private afs: AngularFirestore) {}

private getSequences(collection: string): Observable<IPaddedSequence[]> {
  return this.afs.collection<IFirestoreVideo>('videos', ref => {
    return ref
      .where('flowPlayerProcessed', '==', true)
      .orderBy('sequence', 'asc')
  }).snapshotChanges().pipe(
    filter((actions: DocumentChangeAction<any>[], idx: number) => idx > 0 || actions.every(a => a.payload.doc.metadata.fromCache === false)),
    map((actions: DocumentChangeAction<any>[]) => actions.map(a => ({ id: a.payload.doc.id, ...a.payload.doc.data() }))),
    map((results: IFirestoreVideo[]) => results.map((result: IFirestoreVideo) => ({ videoId: result.id, sequence: result.sequence })))
  )
}
The only differences are that valueChanges becomes snapshotChanges, and that the filter and the first map lines are added at the top of the snapshotChanges pipe; everything else remains unchanged.
This approach is discussed here

How to use results of one slick query in another with computation in between

I have a database table InventoryLotUsage which has columns id, inventoryLotId, and date. I want to delete an InventoryLot, but before I can do that I need to update the InventoryLotUsage rows that have a foreign key inventoryLotId, based on date and some other conditions.
My question is: how do I query for data using monadic joins, do some computations on it, and use the result of those computations to run an update, all in one transaction in Slick?
I was attempting to get a sequence of rows like this:
for {
  il <- InventoryLot.filter(_.id === id)
  lotUsage <- InventoryLotUsage.filter(_.inventoryLotId === id).result
  groupedUsage = lotUsage.groupBy(_.date)
  ...
}
My IDE suggests that lotUsage will be a Seq[InventoryLotUsageRows], but when compiling I get a type error because of the .result:
type mismatch;
found : slick.dbio.DBIOAction[FacilityProductRepository.this.InventoryLot,slick.dbio.NoStream,slick.dbio.Effect.Read with slick.dbio.Effect.Read with slick.dbio.Effect.Read]
required: slick.lifted.Query[?,?,?]
lotUsage <- InventoryLotUsage.filter(_.inventoryLotId === id).result
Without using .result, its type is the InventoryLotUsage table. How can I wrangle the query into something usable for computation?
You need to compose DBIOActions to achieve the desired result. For example, first load all the data that you need:
val loadStaffAction = (for {
  il <- InventoryLot.filter(_.id === id)
  lotUsage <- InventoryLotUsage.filter(_.inventoryLotId === id)
} yield (il, lotUsage)).result
Then you could use map/flatMap on loadStaffAction to create update statements based on computations. You can also use for-comprehensions here.
val updateAction = loadStaffAction.flatMap { result =>
  // creating update statements based on some computations and conditions;
  // note flatMap rather than map, so the inner action is flattened and runs
  DBIO.seq(
    InventoryLotUsage.filter(_.id === inventory1.id).map(_.name).update("new value"),
    InventoryLotUsage.filter(_.id === inventory2.id).map(_.name).update("another value")
  )
}
After this you can run all queries in one transaction
db.run(updateAction.transactionally)
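Applied to the question's scenario, the same composition might look like this sketch; the per-group update and newLotId are placeholders for whatever the real computation produces:
// Sketch: load usages, group them in Scala, derive updates, then delete the lot,
// all inside one transaction.
val fullAction = (for {
  usages <- InventoryLotUsage.filter(_.inventoryLotId === id).result
  groupedUsage = usages.groupBy(_.date)
  _ <- DBIO.sequence(groupedUsage.toSeq.map { case (date, rows) =>
    InventoryLotUsage
      .filter(_.id inSet rows.map(_.id))
      .map(_.inventoryLotId)
      .update(newLotId) // hypothetical replacement lot id
  })
  _ <- InventoryLot.filter(_.id === id).delete
} yield ()).transactionally

db.run(fullAction)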

Inner actions are not getting executed inside transaction in slick

I have a requirement where I have to insert into the book table and, based on the generated auto-increment id, insert into the bookModules table. For every bookModule auto-increment id, I then have to insert into the bookAssoc table. The bookModule rows are built from a Seq[bookMods], and the bookAssociation data is built from a Seq[userModRoles].
I have written the code below to achieve this, but only action1 is executed; my inner actions are not getting executed. Please help me.
val action1 = bookDao.insert(book)
val action2 = action1.map { id =>
  DBIO.sequence(
    bookMods.map { bookMod =>
      bookModDao.insert(new bookModule(None, id, bookMod.moduleId, bookMod.isActive))
        .map { bookModId =>
          userModRoles.map { userModRole =>
            bookAssocDao.insert(new bookAssociation(None, bookModId, userModRole.moduleId, userModRole.roleId))
          }
        }
    }
  )
}
db.run(action2.transactionally)
EDIT 1: Adding the code using a for comprehension
val action1 = for {
  bookId <- bookDao.insert(book) // db transaction
  bookMod <- bookModules // this is a scala collection; iterate each element and insert into tables
  bookModId <- bookModDao.insert(new bookModule(None, bookId, bookMod.moduleId, bookMod.isActive))
  userModRole <- userModRoles // this is a scala collection; iterate each element and insert into tables
  _ <- bookAssocDao.insert(new bookAssociation(None, bookModId, userModRole.moduleId, userModRole.roleId))
} yield ()
db.run(action1.transactionally)
You need to split your logic into two parts:
1) the DBIO actions
2) iterating over the collections.
In that case, a solution should be easy, but without using a for comprehension:
bookModules.map { bookMod =>
  userModRoles.map { userModRole =>
    db.run(bookDao.insert(book).flatMap { bookId =>
      bookModDao.insert(new bookModule(None, bookId, bookMod.moduleId, bookMod.isActive)).flatMap { bookModId =>
        bookAssocDao.insert(new bookAssociation(None, bookModId, userModRole.moduleId, userModRole.roleId))
      }
    }.transactionally)
  }
}
Try something like this; it should work. You could think about moving db.run into the DAO classes, and here, at the service level, you should probably work with Futures.
And sorry if I've made some mistake with the parentheses; it's difficult to keep everything straight here :)
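Alternatively, if the DAO insert methods return DBIO values (as they appear to in the question), everything can be composed into one action so that all inserts really do run in a single transaction. The key point is flatMap rather than map: map wraps the inner inserts in a nested DBIO[DBIO[...]] that is never executed, which is why the original action2 only ran action1. A sketch:
// Sketch: compose all inserts into a single DBIO and run one transaction.
val composed: DBIO[Unit] = (for {
  bookId <- bookDao.insert(book)
  _ <- DBIO.sequence(bookMods.map { bookMod =>
    bookModDao
      .insert(new bookModule(None, bookId, bookMod.moduleId, bookMod.isActive))
      .flatMap { bookModId =>
        DBIO.sequence(userModRoles.map { userModRole =>
          bookAssocDao.insert(new bookAssociation(None, bookModId, userModRole.moduleId, userModRole.roleId))
        })
      }
  })
} yield ()).transactionally

db.run(composed)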

Using Elastic4s for Percolator Queries

I'm currently trying to create a percolator query with Elastic4s. I've got about this far, but I can't seem to find any examples, so I'm not sure quite how this works. So far I have:
val percQuery = percolate in esIndex / esType query myQuery
esClient.execute(percQuery)
Every time it runs it doesn't match anything. I figured out that I need to be able to percolate on an id, but I can't seem to find any examples of how to do it, not even in the docs. I know that with Elastic4s, creating queries other than a percolator query lets you specify an id field, like:
val query = index into esIndex / esType source myDoc id 12345
I've tried this for percolate, but it doesn't like the id field. Does anyone know how this can be done?
I was using Dispatch Http to do this previously but I'm trying to move away from it. Before, I was doing this to submit the percolator query:
url(s"$esUrl/.percolator/$queryId)
.setContentType("application/json", "utf-8")
.setBody(someJson)
.POST
Notice the queryId; I just need something similar to that, but in elastic4s.
So you want to add a document and return the queries that are waiting for that id to be added? That seems an odd use of percolate, as it will be one-time use only, since only one document can be added per id. You can't currently percolate on id in elastic4s, and I'm not sure you can even do it in elasticsearch itself.
This is the best attempt I can come up with, where you have your own "id" field, which could mirror the 'proper' _id field.
object Test extends App {

  import ElasticDsl._

  val client = ElasticClient.local

  client.execute {
    create index "perc" mappings {
      "idtest" as (
        "id" typed StringType
      )
    }
  }.await

  client.execute {
    register id "a" into "perc" query {
      termQuery("id", "a")
    }
  }.await

  client.execute {
    register id "b" into "perc" query {
      termQuery("id", "b")
    }
  }.await

  val resp1 = client.execute {
    percolate in "perc/idtest" doc ("id" -> "a")
  }.await

  // prints a
  println(resp1.getMatches.head.getId)

  val resp2 = client.execute {
    percolate in "perc/idtest" doc ("id" -> "b")
  }.await

  // prints b
  println(resp2.getMatches.head.getId)
}
Written using elastic4s 1.7.4
So after much more research I figured out how this works with elastic4s. To do this in Elastic4s you actually have to use register instead of percolate, like so:
val percQuery = register id queryId into esIndex query myQuery
This will register a percolator query at the given id.
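Once registered, documents can then be percolated against the stored queries, as in the longer example above. A short sketch in the same elastic4s 1.x DSL (the field name and value are placeholders):
// Percolate a document against the registered queries;
// each match carries the id of a registered query.
val resp = client.execute {
  percolate in esIndex / esType doc ("someField" -> "someValue")
}.await
resp.getMatches.foreach(m => println(m.getId))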