This post is related to the issue already raised at "Dynamically changing the database shard that I am connecting to".
Pointers to the code that should be changed to implement this feature are given at https://github.com/slick/slick/issues/703
I am a newbie to Scala and Slick. Can I get some help on how to proceed with implementing this feature? Is there any Slick/Scala pattern for doing this at the application level?
My problem is: I have a pool of connections to different MySQL shards, and when I write a query (or queries) involving IDs (the sharding keys), Slick should dynamically run that particular query on the respective database shard.
For example, if I write a query like this:
val q = for {
  user    <- users.filter(_.name === "cat")
  post    <- posts.filter(_.postedBy === user.id)
  comment <- comments.filter(_.postId === post.id)
} yield comment.content
q.run
A trivial case would be like the one below:
users += User(id = 1, name = "cat", email = "cat@mat.com") => hits shard no. 1
Even if the User ID, Post ID, and Comment ID are dynamically produced, Slick should hit the correct database shard using some sharding criterion (e.g. key (ID) % 3), and everything should happen in the background just like a single-database query.
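For illustration, a minimal sketch of such a criterion at the application level (shardFor is a hypothetical helper, assuming shards is a Map[Int, Database] keyed by shard number):

def shardFor(id: Long, shards: Map[Int, Database], numShards: Int): Database =
  shards((id % numShards).toInt) // e.g. id % 3 selects one of three shards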
To implement the feature at the application level:
Is there any way to read the Query object's state dynamically, so that I can write a function like this?
def func(q: Query[Something], shards: Map[Int, Database], num: Int): Unit = {
  shards(q.getId % num).withSession { implicit session =>
    q.run
  }
}
Usage:
val q = users.insert(User(id = 1, name = "cat", email = "cat@cat.com"))
func(q, shards, 10) => q executes on one of the 10 shards.
Thanks.
I'm trying to query DynamoDB through withFilterExpression. I get an error because the argument is a composite key:
Filter Expression can only contain non-primary key attributes: Primary key attribute: question_id
and also because the query uses the OR operator, which cannot be passed to withKeyConditionExpression.
The query that was passed to withFilterExpression is similar to question_id = 1 OR question_id = 2. The entire code is as follows:
import scala.collection.mutable.ArrayBuffer
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec

def getQuestionItems(conceptCode: String) = {
  val qIds = List("1", "2", "3")
  val hash_map: java.util.Map[String, Object] = new java.util.HashMap[String, Object]()
  var queries = ArrayBuffer[String]()
  hash_map.put(":c_id", conceptCode)
  // build one placeholder per question id and OR them together
  for ((qId, index) <- qIds.zipWithIndex) {
    val placeholder = ":qId" + index
    hash_map.put(placeholder, qId)
    queries += "question_id = " + placeholder
  }
  val query = queries.mkString(" or ")
  val querySpec = new QuerySpec()
    .withKeyConditionExpression("concept_id = :c_id")
    .withFilterExpression(query)
    .withValueMap(hash_map)
  questionsTable.query(querySpec)
}
Apart from withFilterExpression and withConditionExpression, are there any other methods I can use that are part of QuerySpec?
Let's raise things up a level. With a Query (as opposed to a GetItem or Scan) you provide a single PK value and optionally an SK condition. That's what a Query requires. You can't provide multiple PK values. If you want multiple PK values, you can do multiple Query calls. Or possibly you may consider a Scan across all PK values.
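As a rough sketch of the multiple-Query-calls approach (reusing questionsTable, concept_id and question_id from the question above, and assuming question_id is the table's sort key):

import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap

// One Query call per question_id; each call uses the full key condition
// (partition key + sort key), so no filter expression is needed.
def getQuestionItemsById(conceptCode: String, qIds: List[String]) =
  qIds.map { qId =>
    val spec = new QuerySpec()
      .withKeyConditionExpression("concept_id = :c_id and question_id = :q_id")
      .withValueMap(new ValueMap()
        .withString(":c_id", conceptCode)
        .withString(":q_id", qId))
    questionsTable.query(spec)
  }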
You can also consider having a GSI that presents the data in a format more suitable to efficient lookup.
Side note: With PartiQL you can actually specify multiple PK values, up to a limit. So if you really truly want this, that's a possibility. The downside is it raises things up to a new level of abstraction and can make inefficiencies hard to spot.
I have a database structure with a many-to-many relationship between Dreams and Tags.
Dreams and Tags are kept in separate tables, and there is a join table between them as usual in this kind of situation, with the class DreamTag representing the connection:
protected class DreamTagTable(tag: Tag) extends Table[DreamTag](tag, "dreamtags") {
  def dreamId = column[Long]("dream_id")
  def dream = foreignKey("dreams", dreamId, dreams)(_.id)
  def tagId = column[Long]("tag_id")
  def tag = foreignKey("tags", tagId, tags)(_.id)
  // default projection
  def * = (dreamId, tagId) <> ((DreamTag.apply _).tupled, DreamTag.unapply)
}
I have managed to perform the appropriate double JOIN to retrieve a Dream with its Tags, but I struggled to do it in a fully non-blocking manner.
Here is my code for performing the retrieval, as this may shed some light on things:
def createWithTags(form: DreamForm): Future[Seq[Int]] = db.run {
  logger.info(s"Creating dream [$form]")

  // action to put the dream
  val dreamAction: DBIO[Dream] =
    dreams.map(d => (d.title, d.content, d.date, d.userId, d.emotion))
      .returning(dreams.map(_.id))
      .into((fields, id) => Dream(id, fields._1, fields._2, fields._3, fields._4, fields._5))
      .+=((form.title, form.content, form.date, form.userId, form.emotion))

  // action to put any tags that don't already exist (create a single action)
  val tagActions: DBIO[Seq[MyTag]] =
    DBIO.sequence(form.tags.map(text => createTagIfNotExistsAction(text)))

  // zip allows us to get the results of both actions in a tuple
  val zipAction: DBIO[(Dream, Seq[MyTag])] = dreamAction.zip(tagActions)

  // put the entries into the join table, if the zipAction succeeds
  val dreamTagsAction = exec(zipAction.asTry) match {
    case Success(value) => value match {
      case (dream, tags) =>
        DBIO.sequence(tags.map(tag => createDreamTagAction(dream, tag)))
    }
    case Failure(exception) => throw exception
  }

  dreamTagsAction
}
private def createTagIfNotExistsAction(text: String): DBIO[MyTag] = {
  tags.filter(_.text === text).result.headOption.flatMap {
    case Some(t: MyTag) => DBIO.successful(t)
    case None =>
      tags.map(t => (t.text))
        .returning(tags.map(_.id))
        .into((text, id) => MyTag(id, text)) += text
  }
}

private def createDreamTagAction(dream: Dream, tag: MyTag): DBIO[Int] = {
  dreamTags += DreamTag(dream.id, tag.id)
}
/**
* Helper method for executing an async action in a blocking way
*/
private def exec[T](action: DBIO[T]): T = Await.result(db.run(action), 2.seconds)
Scenario
Now I'm at the stage where I want to be able to update a Dream and the list of Tags, and I'm struggling.
Given a situation where the existing list of tags is ["one", "two", "three"] and is being updated to ["two", "three", "four"], I want to:
Delete the Tag for "one", if no other Dreams reference it.
Not touch the entries for "two" and "three", as the Tag and DreamTag entries already exist.
Create Tag "four" if it doesn't exist, and add an entry to the join table for it.
I think I need to do something like list1.diff(list2) and list2.diff(list1), but that would require performing a get first, which seems wrong.
Perhaps my thinking is wrong: is it best to just clear all entries in the join table for this Dream and then create every item in the new list, or is there a nice way to diff the two lists (previous and existing) and perform the deletes/adds as appropriate?
Thanks for the help.
N.B. Yes, Tag is a super-annoying class name to have, as it clashes with slick.lifted.Tag!
Update - My Solution:
I went for option 2 as mentioned by Richard in his answer...
// action to put any tags that don't already exist (create a single action)
val tagActions: DBIO[Seq[MyTag]] =
  DBIO.sequence(form.tags.map(text => createTagIfNotExistsAction(text)))

// zip allows us to get the results of both actions in a tuple
val zipAction: DBIO[(Int, Seq[MyTag])] = dreamAction.zip(tagActions)

// first clear away the existing dreamtags
val deleteExistingDreamTags = dreamTags
  .filter(_.dreamId === dreamId)
  .delete

// put the entries into the join table, if the zipAction succeeds
val dreamTagsAction = zipAction.flatMap {
  case (_, tags) =>
    DBIO.sequence(tags.map(tag => createDreamTagAction(dreamId, tag)))
}

deleteExistingDreamTags.andThen(dreamTagsAction)
I struggled to do it in a fully non-blocking manner.
I see you have an exec call, which is blocking. It looks like this can be replaced with a flatMap:
case class Dream()
case class MyTag()

val zipAction: DBIO[(Dream, Seq[MyTag])] =
  DBIO.successful( (Dream(), MyTag() :: MyTag() :: Nil) )

def createDreamTagAction(dream: Dream)(tag: MyTag): DBIO[Int] =
  DBIO.successful(1)

val action: DBIO[Seq[Int]] =
  zipAction.flatMap {
    case (dream, tags) => DBIO.sequence(tags.map(createDreamTagAction(dream)))
  }
Is it best to just clear all entries in the join table for this Dream and then create every item in the new list, or is there a nice way to diff the two lists (previous and existing) and perform the deletes/adds as appropriate?
Broadly, you have three options:
Look in the database to see what tags exist, compare them to what you want the state to be, and compute a set of insert and delete actions (see the sketch after the trade-offs below).
Delete all the tags and insert the state you want to reach.
Move the problem to SQL so you insert tags where they don't already exist in the table, and delete tags that don't exist in your desired state. You'd need to look at the capabilities of your database and likely need to use Plain SQL in Slick to get the effect. I'm not sure what the insert would be for adding tags (perhaps a MERGE or upsert of some kind), but deleting would be of the form: delete from tags where tag not in (1,2) if you wanted a final state of just tags 1 and 2.
The trade-offs:
For 1, you need to run 1 query to fetch existing tags, and then 1 query for the deletes, and at least 1 for the inserts. This will change the smallest number of rows, but will be the largest number of queries.
For 2, you'll be executing at least 2 queries: a delete and 1 (potentially) for a bulk insert. This will change the largest number of rows.
For 3, you'll be executing a constant 2 queries (if your database can carry out the logic for you). If this is even possible, the queries will be more complicated.
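For completeness, a minimal sketch of option 1, reusing the dreamTags table and DreamTag case class from the question (syncDreamTags and the desiredTagIds parameter are hypothetical, and an implicit ExecutionContext is assumed to be in scope for map/flatMap on DBIO):

def syncDreamTags(dreamId: Long, desiredTagIds: Set[Long]): DBIO[Unit] =
  for {
    existing <- dreamTags.filter(_.dreamId === dreamId).map(_.tagId).result
    existingSet = existing.toSet
    toDelete = existingSet.diff(desiredTagIds)
    toInsert = desiredTagIds.diff(existingSet)
    // delete only the join rows that are no longer wanted
    _ <- dreamTags.filter(dt => dt.dreamId === dreamId && dt.tagId.inSet(toDelete)).delete
    // insert only the join rows that are missing
    _ <- dreamTags ++= toInsert.toSeq.map(tagId => DreamTag(dreamId, tagId))
  } yield ()

This issues one select, one delete and one bulk insert, touching only the rows that differ; it synchronises the join table only, so removing Tag rows that no other Dream references would still be a separate step.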
I'm currently facing an issue with my update query in my scala-slick3 project. I have a Report-Class, which contains multiple Products and each Product contains multiple Parts. I want to implement a function that marks every Part of every Product within this Report as assessed.
I thought about doing something like this:
def markProductPartsForReportAsAssessed(reportId: Int) = {
  val query = for {
    (products, parts) <- (report_product_query filter(_.reportId === reportId)
                          join (part_query filter(_.isAssessed === false))
                          on (_.productId === _.productId))
  } yield parts.isAssessed

  db.run(query.update(true))
}
Now, when I run this code slick throws this exception:
SlickException: A query for an UPDATE statement must resolve to a comprehension with a single table.
I already looked at similar problems, but their solutions (like this or this) weren't really satisfying to me.
Why does Slick throw this exception, and why is it a problem to begin with? I was under the impression that my yield already takes care of not "updating multiple tables".
Thanks in advance!
I guess it's because the UPDATE query requires just one table. If you write SQL for the above query, it could look like this:
UPDATE parts a SET isAssessed = 'true'
WHERE a.isAssessed = 'false' and
      exists(select 'x' from products b
             where a.productId = b.productId and b.reportId = reportId)
Therefore, you can put the conditions related to the Product table in the filter, as follows:
val reportId = "123" // some variable
val subQuery = (reportId:Rep[String], productId:Rep[String]) =>
report_product_query.filter(r => r.report_id === reportId && r.product_id === productId)
val query = part_query.filter(p => p.isAccesssed === false:Rep[Boolean] &&
subQuery(reportId, p.productId).exists).map(_.isAccessed)
db.run(query.update(true))
I have a database table InventoryLotUsage which has columns id, inventoryLotId, and date. I want to delete an InventoryLot, but before I can do that I need to update the InventoryLotUsage rows that have a foreign key inventoryLotId, based on date and some other conditions.
My question is how do I query for data using monadic joins, do some computations on it and use the result of those computations to run an update all in one transaction in Slick?
I was attempting to get a sequence of rows like this
for {
  il <- InventoryLot.filter(_.id === id)
  lotUsage <- InventoryLotUsage.filter(_.inventoryLotId === id).result
  groupedUsage = lotUsage.groupBy(_.date)
  ...
}
My IDE suggests that lotUsage will be a Seq[InventoryLotUsageRows], but when compiling I get a type error because of the .result:
type mismatch;
found : slick.dbio.DBIOAction[FacilityProductRepository.this.InventoryLot,slick.dbio.NoStream,slick.dbio.Effect.Read with slick.dbio.Effect.Read with slick.dbio.Effect.Read]
required: slick.lifted.Query[?,?,?]
lotUsage <- InventoryLotUsage.filter(_.inventoryLotId === id).result
Without using .result its type is the InventoryLotUsage table. How can I wrangle the query into something usable for computation?
You need to compose DBIOActions to achieve the desired result. For example:
Load all the data that you need:
val loadStaffAction = (for {
  il <- InventoryLot.filter(_.id === id)
  lotUsage <- InventoryLotUsage.filter(_.inventoryLotId === id)
} yield (il, lotUsage)).result
Then you could use map/flatMap on loadStaffAction to create update statements based on computations. You can also use for-comprehensions here.
val updateAction = loadStaffAction.flatMap { result =>
  // creating update statements based on some computations and conditions;
  // flatMap (rather than map) ensures the nested update action is actually executed
  DBIO.seq(
    InventoryLotUsage.filter(_.id === inventory1.id).map(_.name).update("new value"),
    InventoryLotUsage.filter(_.id === inventory2.id).map(_.name).update("another value")
  )
}
After this you can run all queries in one transaction
db.run(updateAction.transactionally)
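As mentioned above, the same composition can also be written as a for-comprehension. A rough sketch (the grouping step is a placeholder, inventory1/inventory2 and the name column are carried over from the hypothetical example above, and an implicit ExecutionContext is assumed for map/flatMap on DBIO):

val updateAction = for {
  rows <- loadStaffAction                                   // Seq of (lot, usage) tuples
  grouped = rows.groupBy { case (_, usage) => usage.date }  // placeholder for your computations
  _ <- DBIO.seq(
    InventoryLotUsage.filter(_.id === inventory1.id).map(_.name).update("new value"),
    InventoryLotUsage.filter(_.id === inventory2.id).map(_.name).update("another value")
  )
} yield ()

db.run(updateAction.transactionally)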
I'm currently trying to create a percolator query with Elastic4s. I've got about this far but I can't seem to find any examples so I'm not sure how this quite works. So I've got:
val percQuery = percolate in esIndex / esType query myQuery
esClient.execute(percQuery)
Every time it runs it doesn't match anything. I figured out I need to be able to percolate on an Id but I can't seem to find any examples on how to do it, not even in the docs. I know with Elastic4s creating queries other than a percolator query lets you specify an id field like:
val query = index into esIndex / esType source myDoc id 12345
I've tried this way for percolate, but it doesn't like the id field. Does anyone know how this can be done?
I was using Dispatch Http to do this previously but I'm trying to move away from it. Before, I was doing this to submit the percolator query:
url(s"$esUrl/.percolator/$queryId)
.setContentType("application/json", "utf-8")
.setBody(someJson)
.POST
Notice the queryId; I just need something similar to that, but in elastic4s.
So you want to add a document and return the queries that are waiting for that id to be added? That seems an odd use for percolate as it will be a one time use only, as only one document can be added per id. You can't do a percolate currently on id in elastic4s, and I'm not sure if you can even do it in elasticsearch itself.
This is the best attempt I can come up with, where you have your own "id" field, which could mirror the 'proper' _id field.
object Test extends App {

  import ElasticDsl._

  val client = ElasticClient.local

  client.execute {
    create index "perc" mappings {
      "idtest" as(
        "id" typed StringType
      )
    }
  }.await

  client.execute {
    register id "a" into "perc" query {
      termQuery("id", "a")
    }
  }.await

  client.execute {
    register id "b" into "perc" query {
      termQuery("id", "b")
    }
  }.await

  val resp1 = client.execute {
    percolate in "perc/idtest" doc("id" -> "a")
  }.await

  // prints a
  println(resp1.getMatches.head.getId)

  val resp2 = client.execute {
    percolate in "perc/idtest" doc("id" -> "b")
  }.await

  // prints b
  println(resp2.getMatches.head.getId)
}
Written using elastic4s 1.7.4
So after much more research, I figured out how this works with elastic4s. To do this in Elastic4s you actually have to use register instead of percolate, like so:
val percQuery = register id queryId into esIndex query myQuery
This will register a percolator query at the id.