I need to delete certain entries from an Elasticsearch index. I cannot find any hints in the documentation, and I'm new to Elasticsearch. The rows to be deleted are identified by their type and owner_id. Is it possible to call deleteByQuery with multiple parameters, or is there an alternative that achieves the same result?
I'm using this library: https://github.com/sksamuel/elastic4s
What the data looks like:
| id | type | owner_id | cost |
|----|-------|----------|------|
| 1 | house | 1 | 10 |
| 2 | hut | 1 | 3 |
| 3 | house | 2 | 16 |
| 4 | house | 1 | 11 |
In the code it looks like this currently:
deleteByQuery(someIndex, matchQuery("type", "house"))
and I would need something like this:
deleteByQuery(someIndex, matchQuery("type", "house"), matchQuery("owner_id", 1))
But this won't work since deleteByQuery only accepts a single Query.
In this example it should delete the entries with id 1 and 4.
Here is the same thing in JSON and REST API format, to make it clearer.
Index sample documents:
put myindex/_doc/1
{
"type" : "house",
"owner_id" :1
}
put myindex/_doc/2
{
"type" : "hut",
"owner_id" :1
}
put myindex/_doc/3
{
"type" : "house",
"owner_id" :2
}
put myindex/_doc/4
{
"type" : "house",
"owner_id" :1
}
Search using the boolean query
GET myindex/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"type": "house"
}
}
],
"filter": [
{
"term": {
"owner_id": 1
}
}
]
}
}
}
And query result
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.35667494,
"_source" : {
"type" : "house",
"owner_id" : 1
}
},
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.35667494,
"_source" : {
"type" : "house",
"owner_id" : 1
}
}
]
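The same bool query body can be sent to the delete-by-query endpoint (POST myindex/_delete_by_query) instead of _search, and a client library only needs to receive the combined bool query as its single query argument. As a minimal sketch, the Python below just builds that request body and checks it against the sample documents locally; it does not talk to Elasticsearch. The helper names build_delete_body and matches are made up for illustration, and the local matcher treats match as an exact comparison, which holds for these single-word values but is not how an analyzer behaves in general.

```python
import json

def build_delete_body(doc_type, owner_id):
    """Build a bool query combining both conditions (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"type": doc_type}}],
                "filter": [{"term": {"owner_id": owner_id}}],
            }
        }
    }

def matches(doc, body):
    """Local stand-in for the bool query: every clause must hold exactly."""
    bool_query = body["query"]["bool"]
    for clause in bool_query["must"] + bool_query["filter"]:
        # Each clause looks like {"match": {field: value}} or {"term": {field: value}}.
        (conditions,) = clause.values()
        for field, value in conditions.items():
            if doc.get(field) != value:
                return False
    return True

# The four sample documents indexed above.
docs = {
    "1": {"type": "house", "owner_id": 1},
    "2": {"type": "hut", "owner_id": 1},
    "3": {"type": "house", "owner_id": 2},
    "4": {"type": "house", "owner_id": 1},
}

body = build_delete_body("house", 1)
to_delete = sorted(doc_id for doc_id, doc in docs.items() if matches(doc, body))

print(json.dumps(body, indent=2))  # body to POST to myindex/_delete_by_query
print(to_delete)                   # ['1', '4']
```

Running the _search version first, as shown above, is a cheap way to confirm which documents the delete would remove.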
I tried the following JSON, but WireMock doesn't pick up my change. The WireMock documentation says: "JSON equality matching is based on JsonUnit and therefore supports placeholders." I also tried with both JDK 8 and JDK 13, but neither works.
Details below:
"method" : "POST",
"bodyPatterns" : [{
"equalToJson" : {
"recipient": {
"address": {
"city": "Bellevue",
"postalCode": "52031",
"countryCode": "US"
}
},
"sender": {
"address": {
"city": "",
"postalCode": "",
"countryCode": "HK"
}
},
"shipDate": "${json-unit.any-string}",
"accountNumber": {
"key": ""
}
},
Result when running selenium test with mock (I executed mock via java -jar tmp/wiremock.jar --global-response-templating --root-dir ./mock --port 1337 ):
|
{ | { <<<<< Body does not match
"recipient" : { | "recipient" : {
"address" : { | "address" : {
"city" : "Bellevue", | "city" : "Bellevue",
"postalCode" : "52031", | "postalCode" : "52031",
"countryCode" : "US" | "countryCode" : "US"
} | }
}, | },
"sender" : { | "sender" : {
"address" : { | "address" : {
"city" : "", | "city" : "",
"postalCode" : "", | "postalCode" : "",
"countryCode" : "HK" | "countryCode" : "HK"
} | }
}, | },
"shipDate" : "${json-unit.any-string}", | "shipDate" : "May-26-2020",
"accountNumber" : { | "accountNumber" : {
"key" : "" | "key" : ""
} | }
} | }
|
Can anybody make some suggestions here? Thank you for reading my question.
The usage of "${json-unit.any-string}" is right, but the placeholder only works when the right dependency is used.
Using the dependency com.github.tomakehurst:wiremock-jre8 worked for me.
Refer to https://wiremock.org/docs/request-matching/ for more info. It includes the following note:
Placeholders are only available in the jre8 WireMock JARs, as the JsonUnit library requires at least Java 8.
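To see why the diff above reports a mismatch on shipDate, it helps to model what a placeholder-aware comparison does. The Python below is only a rough sketch of the idea, not JsonUnit's or WireMock's implementation; the function name json_matches is made up, and only the any-string placeholder is handled. Without placeholder support the expected string is compared literally, so "${json-unit.any-string}" can never equal "May-26-2020".

```python
ANY_STRING = "${json-unit.any-string}"

def json_matches(expected, actual):
    """Recursively compare parsed JSON values, honouring the any-string placeholder."""
    if expected == ANY_STRING:
        return isinstance(actual, str)
    if isinstance(expected, dict):
        return (
            isinstance(actual, dict)
            and expected.keys() == actual.keys()
            and all(json_matches(expected[key], actual[key]) for key in expected)
        )
    if isinstance(expected, list):
        return (
            isinstance(actual, list)
            and len(expected) == len(actual)
            and all(json_matches(e, a) for e, a in zip(expected, actual))
        )
    return expected == actual

# Trimmed-down version of the stub body and the request body from the question.
expected = {
    "sender": {"address": {"city": "", "postalCode": "", "countryCode": "HK"}},
    "shipDate": ANY_STRING,
}
actual = {
    "sender": {"address": {"city": "", "postalCode": "", "countryCode": "HK"}},
    "shipDate": "May-26-2020",
}

print(json_matches(expected, actual))  # True: placeholder accepts any string
print(expected == actual)              # False: literal comparison, as in the diff
```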
You have to enable placeholders as shown below, and you should make sure you are using the jre8 standalone JAR. You seem to be using the normal standalone JAR.
"enablePlaceholders" : true
I have some input data :
Brand | Model | Number
Peugeot | 208 | 1
Peugeot | 4008 | 2
Renault | Clio | 3
Renault | Megane | 4
I would like to get both :
the sum for each brand
the global sum
Here is my expected output :
Brand | Number
Peugeot | 3
Renault | 7
Total | 10
I think I have to create two $group operations and set Total with $literal.
What is the right way to do so?
As you said, this can be done with two $group stages, so let's start by putting some data into MongoDB similar to your example input:
> db.cars.insertMany([
{ "Brand" : "Peugeot", "Model" : "208", "Number": 1 },
{ "Brand" : "Peugeot", "Model" : "4008", "Number": 2 },
{ "Brand" : "Renault", "Model" : "Clio", "Number": 3 },
{ "Brand" : "Renault", "Model" : "Megane", "Number": 4 }
]);
Now that all our cars are inserted, we can aggregate them using the two $group stages:
db.cars.aggregate([
{ $group : { "_id" : "$Brand", "Number" : { $sum : "$Number" }}},
{ $group : { "_id" : null, "Rows" : { $push : { "Brand" : "$$ROOT._id", "Number" : "$Number" } }, "Total" : {$sum : "$Number" } }}
])
This will give us the following output
{
"_id" : null,
"Rows" : [
{
"Brand" : "Renault",
"Number" : 7
},
{
"Brand" : "Peugeot",
"Number" : 3
}
],
"Total" : 10
}
We can then clean it up with a projection
db.cars.aggregate([
{ "$group" : { "_id" : "$Brand", "Number" : { $sum : "$Number" }}},
{ "$group" : { "_id" : null, "Rows" : { $push : { "Brand" : "$$ROOT._id", "Number" : "$Number" } }, "Total" : {$sum : "$Number" } } },
{ "$project" : { "_id" : 0, "Data" : { "$concatArrays" : [ "$Rows", [ { "Brand": { $literal : "Total" }, "Number" : "$Total" } ] ] } } }
])
Giving us the following result
{
"Data" : [
{
"Brand" : "Renault",
"Number" : 7
},
{
"Brand" : "Peugeot",
"Number" : 3
},
{
"Brand" : "Total",
"Number" : 10
}
]
}
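The three stages can be hard to follow in one pipeline, so here is a plain-Python sketch of what each stage computes on the sample data. It is only a model of the logic and does not use a MongoDB driver; the variable names are made up for illustration.

```python
from collections import defaultdict

cars = [
    {"Brand": "Peugeot", "Model": "208", "Number": 1},
    {"Brand": "Peugeot", "Model": "4008", "Number": 2},
    {"Brand": "Renault", "Model": "Clio", "Number": 3},
    {"Brand": "Renault", "Model": "Megane", "Number": 4},
]

# Stage 1: sum Number per Brand (mirrors the first $group).
per_brand = defaultdict(int)
for car in cars:
    per_brand[car["Brand"]] += car["Number"]

# Stage 2: collect the per-brand rows and the grand total (mirrors the second $group).
rows = [{"Brand": brand, "Number": number} for brand, number in per_brand.items()]
total = sum(row["Number"] for row in rows)

# Stage 3: append a literal "Total" row (mirrors $concatArrays in the $project).
data = rows + [{"Brand": "Total", "Number": total}]

for row in data:
    print(row["Brand"], row["Number"])
```

Here the brand rows come out in insertion order; the order of the brand rows in the MongoDB output above may differ, so add a $sort stage if the order matters.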
I have a collection named genre_collection of following structure :
user | genres
----------------
1 | comedy
1 | action
1 | thriller
1 | comedy
1 | action
2 | war
2 | adventure
2 | war
2 | thriller
I'm trying to find the count for each genre for each user i.e. my ideal final result would be something like this :
1 | comedy |2
1 | action |2
1 | thriller |1
2 | war |2
2 | adventure |1
2 | thriller |1
Any help would be appreciated.
You can do this with an aggregation using $group.
Try this:
db.genre_collection.aggregate([
{
$group:{
_id:{
genre:"$genres",
user:"$user"
},
count:{
$sum:1
}
}
}
])
output:
{ "_id" : { "genre" : "adventure", "user" : 2 }, "count" : 1 }
{ "_id" : { "genre" : "action", "user" : 1 }, "count" : 2 }
{ "_id" : { "genre" : "thriller", "user" : 2 }, "count" : 1 }
{ "_id" : { "genre" : "war", "user" : 2 }, "count" : 2 }
{ "_id" : { "genre" : "comedy", "user" : 1 }, "count" : 2 }
{ "_id" : { "genre" : "thriller", "user" : 1 }, "count" : 1 }
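Grouping on a compound _id is the same as counting distinct (user, genre) pairs. As a quick way to check the expected counts without a database, here is a small Python sketch that uses a Counter over the sample rows; it models the logic only and is not MongoDB code.

```python
from collections import Counter

# (user, genre) pairs from the sample collection.
records = [
    (1, "comedy"), (1, "action"), (1, "thriller"), (1, "comedy"), (1, "action"),
    (2, "war"), (2, "adventure"), (2, "war"), (2, "thriller"),
]

# Counting tuples is equivalent to $group with _id: {genre, user} and $sum: 1.
counts = Counter(records)

for (user, genre), count in sorted(counts.items()):
    print(user, genre, count)
```

For user 1 this gives comedy 2, action 2 and thriller 1, which agrees with the aggregation output above.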
Try this:
// Note: this groups by genre only, so the counts are across all users.
db.genre_collection.aggregate([
{"$group" : {_id:{genres:"$genres"}, count:{$sum:1}}}
])
Hope it helps !!!
Say in mongo I have a collection that looks like this:
+----+-----+-----+----------+
| id | x | y | quantity |
+----+-----+-----+----------+
| 1 | abc | jkl | 5 |
+----+-----+-----+----------+
| 2 | jkl | xyz | 10 |
+----+-----+-----+----------+
| 3 | xyz | abc | 20 |
+----+-----+-----+----------+
I want to do a $group where x equals y and sum up the quantity. So the output would look like:
+-----+-------+
| x | total |
+-----+-------+
| abc | 25 |
+-----+-------+
| jkl | 15 |
+-----+-------+
| xyz | 30 |
+-----+-------+
Is this even possible to do in mongo?
You won't be performing a $group to retrieve the results. You're performing a $lookup. This feature is new in MongoDB 3.2.
Using the sample data you provided, the aggregation would be the following:
db.join.aggregate( [
{
"$lookup" : {
"from" : "join",
"localField" : "x",
"foreignField" : "y",
"as" : "matching_field"
}
},
{
"$unwind" : "$matching_field"
},
{
"$project" : {
"_id" : 0,
"x" : 1,
"total" : { "$sum" : [ "$quantity", "$matching_field.quantity"]}
}
}
])
The sample data set is pretty simple, so you'll need to test the behavior when more than a single result is returned for a value, etc.
Edit:
It gets more complicated if there can be more than a single match between X and Y.
// Add document to return more than a single match for abc
db.join.insert( { "x" : "123", "y" : "abc", "quantity" : 100 })
// Had to add $group stage to consolidate matched results
db.join.aggregate( [
{
"$lookup" : {
"from" : "join",
"localField" : "x",
"foreignField" : "y",
"as" : "matching_field"
}
},
{
"$unwind" : "$matching_field"
},
{ "$group" : {
"_id" : { "x" : "$x", "quantity" : "$quantity" },
"matched_quantities" : { "$sum" : "$matching_field.quantity" }
}},
{
"$project" : {
"x" : "$_id.x",
"total" : { "$sum" : [ "$_id.quantity", "$matched_quantities" ]}
}
}
])
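The intent of the self-join can be written out in plain Python to check the numbers: for each document, add its own quantity to the quantities of every document whose y equals its x. This is only a sketch of the logic, not MongoDB code; the function name totals_by_x is made up, and unlike the $group stage it never merges two documents that share the same x and quantity.

```python
def totals_by_x(docs):
    """For each doc, add its quantity to the quantities of docs whose y equals its x."""
    results = []
    for doc in docs:
        matched = [other["quantity"] for other in docs if other["y"] == doc["x"]]
        if not matched:
            # $unwind drops documents whose lookup array is empty.
            continue
        results.append({"x": doc["x"], "total": doc["quantity"] + sum(matched)})
    return results

sample = [
    {"x": "abc", "y": "jkl", "quantity": 5},
    {"x": "jkl", "y": "xyz", "quantity": 10},
    {"x": "xyz", "y": "abc", "quantity": 20},
]
print(totals_by_x(sample))

# Same data plus the extra document that adds a second match for abc.
extended = sample + [{"x": "123", "y": "abc", "quantity": 100}]
print(totals_by_x(extended))
```

With the original three rows this yields 25, 15 and 30. With the extra document, abc becomes 125 and the new document itself is dropped, because no y equals "123".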
I have two PostgreSQL tables with the following data:
houses:
-# select * from houses;
id | address
----+----------------
1 | 123 Main Ave.
2 | 456 Elm St.
3 | 789 County Rd.
(3 rows)
and people:
-# select * from people;
id | name | house_id
----+-------+----------
1 | Fred | 1
2 | Jane | 1
3 | Bob | 1
4 | Mary | 2
5 | John | 2
6 | Susan | 2
7 | Bill | 3
8 | Nancy | 3
9 | Adam | 3
(9 rows)
In Spoon I have two table inputs, the first named House Input with the SQL:
SELECT
id
, address
FROM houses
ORDER BY id;
The second table input is named People Input with the SQL:
SELECT
"name"
, house_id
FROM people
ORDER BY house_id;
I have both table inputs going into a Merge Join that uses House Input as the first step with a key of id and People Input as the second step with a key of house_id.
I then have this going into a MongoDb Output with the database demo, collection houses, and Mongo document fields address and name (as I am expecting MongoDB to assign the _id).
When I run the transformation and type db.houses.find(); from a Mongo shell, I get:
{ "_id" : ObjectId("52083706b251cc4be9813153"), "address" : "123 Main Ave.", "name" : "Fred" }
{ "_id" : ObjectId("52083706b251cc4be9813154"), "address" : "123 Main Ave.", "name" : "Jane" }
{ "_id" : ObjectId("52083706b251cc4be9813155"), "address" : "123 Main Ave.", "name" : "Bob" }
{ "_id" : ObjectId("52083706b251cc4be9813156"), "address" : "456 Elm St.", "name" : "Mary" }
{ "_id" : ObjectId("52083706b251cc4be9813157"), "address" : "456 Elm St.", "name" : "John" }
{ "_id" : ObjectId("52083706b251cc4be9813158"), "address" : "456 Elm St.", "name" : "Susan" }
{ "_id" : ObjectId("52083706b251cc4be9813159"), "address" : "789 County Rd.", "name" : "Bill" }
{ "_id" : ObjectId("52083706b251cc4be981315a"), "address" : "789 County Rd.", "name" : "Nancy" }
{ "_id" : ObjectId("52083706b251cc4be981315b"), "address" : "789 County Rd.", "name" : "Adam" }
What I want to get is something like:
{ "_id" : ObjectId("52083706b251cc4be9813153"), "address" : "123 Main Ave.", "people" : [
{ "_id" : ObjectId("52083706b251cc4be9813154"), "name" : "Fred"} ,
{ "_id" : ObjectId("52083706b251cc4be9813155"), "name" : "Jane" } ,
{ "_id" : ObjectId("52083706b251cc4be9813155"), "name" : "Bob" }
]
},
{ "_id" : ObjectId("52083706b251cc4be9813156"), "address" : "456 Elm St.", "people" : [
{ "_id" : ObjectId("52083706b251cc4be9813157"), "name" : "Mary"} ,
{ "_id" : ObjectId("52083706b251cc4be9813158"), "name" : "John" } ,
{ "_id" : ObjectId("52083706b251cc4be9813159"), "name" : "Susan" }
]
},
{ "_id" : ObjectId("52083706b251cc4be981315a"), "address" : "789 County Rd.", "people" : [
{ "_id" : ObjectId("52083706b251cc4be981315b"), "name" : "Bill"} ,
{ "_id" : ObjectId("52083706b251cc4be981315c"), "name" : "Nancy" } ,
{ "_id" : ObjectId("52083706b251cc4be981315d"), "name" : "Adam" }
]
}
I know why I am getting what I am getting, but can't seem to find anything online or in the examples to get me where I want to be.
I was hoping someone could nudge me in the right direction, point to an example that is closer to what I am trying to accomplish, or tell me that this is out of scope for what Kettle is supposed to do (Hopefully not the latter).
It turns out that creating nested sub-documents is handled entirely in the MongoDB Output step.
First make sure that you have Upsert and Modifier update checked on the Configure connection tab.
Then on the Mongo document fields tab enter the following (the first line is the column names):
Name | Mongo document Path | Use field name | Match field for upsert | Modifier operation | Modifier policy
--------+---------------------+----------------+------------------------+--------------------+----------------
address | | Y | N | N/A | Insert
address | | Y | Y | N/A | Insert
name | people[0] | Y | N | $set | Insert
name | people[1] | Y | N | $push | Update
Now when I run db.houses.find(); I get:
{ "_id" : ObjectId("520ccb8978d96b204daa029d"), "address" : "123 Main Ave.", "people" : [ { "name" : "Fred" }, { "name" : "Jane" }, { "name" : "Bob" } ] }
{ "_id" : ObjectId("520ccb8978d96b204daa029e"), "address" : "456 Elm St.", "people" : [ { "name" : "Mary" }, { "name" : "John" }, { "name" : "Susan" } ] }
{ "_id" : ObjectId("520ccb8a78d96b204daa029f"), "address" : "789 County Rd.", "people" : [ { "name" : "Bill" }, { "name" : "Nancy" }, { "name" : "Adam" } ] }
Two things I would like to note:
This assumes that my addresses are unique and that names are unique within a house. If that is not the case, I would need to map the ids from my OLTP tables to id (not _id) fields in MongoDB and set Match field for upsert on the house id.
As @G Gordon Worley III pointed out above, if these two tables are in the same database, I could do the join in the Table Input step, and this would be a two-step transformation (and faster).
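For comparison, the effect of the upsert on address combined with $push on people can be modelled in a few lines of Python. This is only a sketch of the resulting document shape, not Kettle or MongoDB code; it keys on the house id rather than the address, which corresponds to the variant described in the first note.

```python
# Rows as returned by the two table inputs.
houses = [(1, "123 Main Ave."), (2, "456 Elm St."), (3, "789 County Rd.")]
people = [
    ("Fred", 1), ("Jane", 1), ("Bob", 1),
    ("Mary", 2), ("John", 2), ("Susan", 2),
    ("Bill", 3), ("Nancy", 3), ("Adam", 3),
]

# One document per house, created on first sight (the "upsert" part).
documents = {}
for house_id, address in houses:
    documents[house_id] = {"address": address, "people": []}

# Each person row appends to the matching house's array (the "$push" part).
for name, house_id in people:
    documents[house_id]["people"].append({"name": name})

result = list(documents.values())
for document in result:
    print(document)
```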