Cloudant database search index - ibm-cloud

I have JSON documents in Cloudant like these:
{
"createdAt": "2022-10-26T09:16:29.472Z",
"user_id": "4499c1c2-7507-4707-b0e4-ec83e2d2f34d",
"_id": "606a4d591031c14a8c48fcb4a9541ff0"
}
{
"createdAt": "2022-10-24T11:15:24.269Z",
"user_id": "c4bdcb54-3d0a-4b6a-a8a9-aa12e45345f3",
"_id": "fb24a15d8fb7cdf12feadac08e7c05dc"
}
{
"createdAt": "2022-10-24T11:08:24.269Z",
"user_id": "06d67681-e2c4-4ed4-b40a-5a2c5e7e6ed9",
"_id": "2d277ec3dd8c33da7642b72722aa93ed"
}
I created a JSON index:
{
"type": "json",
"partitioned": false,
"def": {
"fields": [
{
"createdAt": "asc"
},
{
"user_id": "asc"
}
]
}
}
I created a text index:
{
"type": "text",
"partitioned": false,
"def": {
"default_analyzer": "keyword",
"default_field": {},
"selector": {},
"fields": [
{
"_id": "string"
},
{
"createdAt": "string"
},
{
"user_id": "string"
}
],
"index_array_lengths": true
}
}
I created a Cloudant selector query:
{
"selector": {
"$and": [
{
"createdAt": {
"$exists": true
}
},
{
"user_id": {
"$exists": true
}
}
]
},
"fields": [
"createdAt",
"user_id",
"_id"
],
"sort": [
{
"createdAt": "desc"
}
],
"limit": 10,
"skip": 0
}
This query works fine inside the Cloudant environment.
My problem is with the search index.
I created this index function, which works:
function (doc) {
index("specialsearch", doc._id);
if(doc.createdAt){
index("createdAt", doc.createdAt, {"store":true})
}
if(doc.user_id){
index("user_id", doc.user_id, {"store":true})
}
}
This is the result returned by the following URL:
// https://[user]-bluemix.cloudant.com/[database]/_design/attributes/_search/by_all?q=*:*&counts=["createdAt"]&limit=2
{
"total_rows": 10,
"bookmark": "xxx",
"rows": [
{
"id": "fb24a15d8fb7cdf12feadac08e7c05dc",
"order": [
1.0,
0
],
"fields": {
"createdAt": "2022-10-24T11:15:24.269Z",
"user_id": "c4bdcb54-3d0a-4b6a-a8a9-aa12e45345f3"
}
},
{
"id": "dad431735986bbf41b1fa3b1cd30cd0f",
"order": [
1.0,
0
],
"fields": {
"createdAt": "2022-10-24T11:07:02.138Z",
"user_id": "76f03307-4497-4a19-a647-8097fa288e77"
}
},
{
"id": "2d277ec3dd8c33da7642b72722aa93ed",
"order": [
1.0,
0
],
"fields": {
"createdAt": "2022-10-24T11:08:24.269Z",
"user_id": "06d67681-e2c4-4ed4-b40a-5a2c5e7e6ed9"
}
}
]
}
but it doesn't return the ids sorted by date, based on the createdAt and user_id keys.
What I would like is a list of results ordered by the indexed createdAt and user_id keys, without having to specify a value; that is, a wildcard-style search.
Where am I going wrong?
I have read several posts and guides, but I did not understand how to do it.
Thanks for your help.

You say you want to return a list of id, createdAt and user_id, sorted by createdAt and user_id, and that you want all the documents returned.
If that is the case, simply create a MapReduce view of your data that emits the createdAt and user_id fields in that order:
function (doc) {
emit([doc.createdAt, doc.user_id], 1);
}
You don't need to include the document id because that comes for free.
You can then query the view by visiting the URL:
https://<URL>/<database>/_design/<ddoc_name>/_view/<view_name>
You will get all the docs like this:
{"total_rows":3,"offset":0,"rows":[
{"id":"2d277ec3dd8c33da7642b72722aa93ed","key":["2022-10-24T11:08:24.269Z","06d67681-e2c4-4ed4-b40a-5a2c5e7e6ed9"],"value":1},
{"id":"fb24a15d8fb7cdf12feadac08e7c05dc","key":["2022-10-24T11:15:24.269Z","c4bdcb54-3d0a-4b6a-a8a9-aa12e45345f3"],"value":1},
{"id":"606a4d591031c14a8c48fcb4a9541ff0","key":["2022-10-26T09:16:29.472Z","4499c1c2-7507-4707-b0e4-ec83e2d2f34d"],"value":1}
]}
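
As a rough sketch of how the view above could be stored and queried: the design document name "attributes", the view name "by_created_user", the host and the database name below are all placeholders, and the guard inside the map function is an addition that skips documents missing either field. The `descending` and `limit` parameters are standard view query parameters; `descending=true` returns the newest createdAt first, which matches the "desc" sort in the selector query from the question.

```javascript
// Design document holding the view from the answer above.
// "attributes" and "by_created_user" are hypothetical names.
const designDoc = {
  _id: "_design/attributes",
  views: {
    by_created_user: {
      // The if-guard is an addition: it skips docs that lack either field.
      map: "function (doc) { if (doc.createdAt && doc.user_id) { emit([doc.createdAt, doc.user_id], 1); } }"
    }
  }
};

// Build the URL used to query a view, with optional query parameters.
function viewUrl(baseUrl, db, ddoc, view, params = {}) {
  const qs = new URLSearchParams(params).toString();
  const path = `${baseUrl}/${encodeURIComponent(db)}/_design/${ddoc}/_view/${view}`;
  return qs ? `${path}?${qs}` : path;
}

// Placeholder host and database; newest documents first, ten rows at most.
const url = viewUrl(
  "https://example.cloudant.com",
  "mydb",
  "attributes",
  "by_created_user",
  { descending: "true", limit: "10" }
);
console.log(url);
```

ISO 8601 timestamps such as the createdAt values here sort chronologically when compared as strings, which is why the view key order is also the date order.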

Related

How to filter array in nested documents in MongoDB?

I'm starting out with MongoDB and am having difficulty understanding how to filter nested documents in an array. The objective is to keep only the relevant data from a nested array.
Here is the data:
{
"_id": {
"$oid": "47bb"
},
"email": "myemail#gmail.com",
"orders": [
{
"orderNumber": "",
"products": [
{
"brand": "Brand 1",
"processing": {
"status": "pending"
}
}
],
"updated": {
"$date": {
"$numberLong": "1673031718883"
}
}
},
{
"orderNumber": "",
"products": [
{
"brand": "Brand 2",
"processing": {
"status": "pending"
}
}
],
"updated": {
"$date": {
"$numberLong": "1673031718883"
}
}
},
{
"orderNumber": "",
"products": [
{
"brand": "Brand 3",
"processing": {
"status": "processing"
}
}
],
"updated": {
"$date": {
"$numberLong": "1673031718883"
}
}
}
],
"privilege": {
"admin": false
},
"isVerified": {
"email": "true"
}
}
I want exactly the same data structure, but containing only the orders where 'orders.products.processing.status' is 'pending'.
The response from the database should be:
{
"_id": {
"$oid": "62b333644f70f94aa47bb4da"
},
"email": "myemail#gmail.com",
"orders": [
{
"orderNumber": "",
"products": [
{
"brand": "Brand 1",
"processing": {
"status": "pending"
}
}
],
"updated": {
"$date": {
"$numberLong": "1673031718883"
}
}
},
{
"orderNumber": "",
"products": [
{
"brand": "Brand 2",
"processing": {
"status": "pending"
}
}
],
"updated": {
"$date": {
"$numberLong": "1673031718883"
}
}
}
],
"privilege": {
"admin": false
},
"isVerified": {
"email": "true"
}
}
My closest attempt at a correct query is:
db.collection.aggregate([{
$unwind: '$orders'
},
{
$unwind: '$orders.products'
},
{
$match: {
"orders.products.processing.status": 'pending'
}
}, {
$group: {
_id: {
"_id": "$_id",
"email": "$email",
"orders": {
"orderNumber": "$orders.orderNumber",
"products": {
"processing": "$orders.products.processing.updated",
"brand": "$orders.products.brand",
}
},
},
products: {
$push: "$orders.products"
},
}
}, {
$project: {
products: 0,
}
}
])
The problem is that the result loses the grouping by _id and the initial JSON structure. Thanks.

You can try this query:
First, $match to get only the documents that have orders.products.processing.status set to pending. This may be redundant because the values are filtered again later, but I prefer not to run $map and $filter over the whole collection.
Then $project to keep only the desired values. The trick here is to return in orders only the orders you want.
To accomplish that, use $map to iterate over the array and return a new one with the values that match the filter (like a JS map).
Inside the $map, $filter drops the products whose status is not pending and hands the result back to the $map, which outputs it in the orders field.
And all this without $unwind and $group :)
db.collection.aggregate([
{
"$match": {
"orders.products.processing.status": "pending"
}
},
{
"$project": {
"email": 1,
"isVerified": 1,
"privilege": 1,
"orders": {
"$map": {
"input": "$orders",
"as": "order",
"in": {
"orderNumber": "$$order.orderNumber",
"products": {
"$filter": {
"input": "$$order.products",
"cond": {
"$eq": [ "$$this.processing.status", "pending" ]
}
}
}
}
}
}
}
}
])
As a bonus, you can add one more $filter. It looks messy, but it is easy once you see the idea: the $map from the first example returns an array, so wrap that array in a $filter that drops the orders whose products array is empty (i.e. the orders with no product whose status is pending).
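
A minimal sketch of that bonus variant, assuming the document shape from the question. The `pipeline` constant is what would be passed to `db.collection.aggregate(...)`; it also carries `updated` through the `$map`, since the expected output in the question keeps that field (this is an addition to the query above). The `keepPendingOrders` helper is a hypothetical plain-JavaScript mirror of the same `$map` + `$filter` logic, included only so the behaviour can be checked without a database.

```javascript
// Aggregation pipeline: inner $filter keeps pending products,
// outer $filter drops orders left with no products.
const pipeline = [
  { $match: { "orders.products.processing.status": "pending" } },
  {
    $project: {
      email: 1,
      isVerified: 1,
      privilege: 1,
      orders: {
        $filter: {
          input: {
            $map: {
              input: "$orders",
              as: "order",
              in: {
                orderNumber: "$$order.orderNumber",
                updated: "$$order.updated",
                products: {
                  $filter: {
                    input: "$$order.products",
                    as: "product",
                    cond: { $eq: ["$$product.processing.status", "pending"] }
                  }
                }
              }
            }
          },
          as: "order",
          cond: { $gt: [{ $size: "$$order.products" }, 0] }
        }
      }
    }
  }
];

// Plain-JS mirror of the $project stage (illustration only, not MongoDB code).
function keepPendingOrders(doc) {
  const orders = doc.orders
    .map((order) => ({
      ...order,
      products: order.products.filter((p) => p.processing.status === "pending")
    }))
    .filter((order) => order.products.length > 0);
  return { ...doc, orders };
}

// Trimmed-down version of the sample document from the question.
const sample = {
  email: "myemail#gmail.com",
  orders: [
    { orderNumber: "", products: [{ brand: "Brand 1", processing: { status: "pending" } }] },
    { orderNumber: "", products: [{ brand: "Brand 2", processing: { status: "pending" } }] },
    { orderNumber: "", products: [{ brand: "Brand 3", processing: { status: "processing" } }] }
  ]
};

const filtered = keepPendingOrders(sample);
console.log(filtered.orders.map((o) => o.products[0].brand));
```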

Configure monitor query with limitation on aggregation

I am trying to configure a monitor that looks at data logged by cron jobs.
I want to trigger an alert if a job stops logging data.
The query using SQL looks something like this:
POST _plugins/_sql/
{
"query" : "SELECT instance, job-id, count(*), max(#timestamp) as newest FROM job-statistics-* where #timestamp > '2022-09-28 00:00:00.000' group BY job-id, instance HAVING newest < '2022-09-28 08:45:00.000'"
}
Using explain, I converted this to a JSON query and made the timestamp dynamic:
{
"from": 0,
"size": 0,
"timeout": "1m",
"query": {
"range": {
"#timestamp": {
"from": "now-1h",
"to": null,
"include_lower": false,
"include_upper": true,
"boost": 1
}
}
},
"sort": [
{
"_doc": {
"order": "asc"
}
}
],
"aggregations": {
"composite_buckets": {
"composite": {
"size": 1000,
"sources": [
{
"job-id": {
"terms": {
"field": "job-id.keyword",
"missing_bucket": true,
"missing_order": "first",
"order": "asc"
}
}
},
{
"instance": {
"terms": {
"field": "instance.keyword",
"missing_bucket": true,
"missing_order": "first",
"order": "asc"
}
}
}
]
},
"aggregations": {
"count(*)": {
"value_count": {
"field": "_index"
}
},
"max(#timestamp)": {
"max": {
"field": "#timestamp"
}
}
}
}
}
}
The limitation on the aggregation max(#timestamp) is missing from this query.
In the explain response it appears here:
"name": "FilterOperator",
"description": {
"conditions": """<(max(#timestamp), cast_to_timestamp("2022-09-28 08:45:00.000"))"""
},
Ideally, this should be max(#timestamp) < now-30m
My question:
How can I integrate this into the query or the monitor?
Is there another way to do this?
Thanks a lot
Marius

Spring Boot MongoDB update nested document

Document Structure
{
"_id": {
"$oid": "61e6e300f78f707b9c3ec32f"
},
"userId": "61d51daa0e09c2a97f11c81d",
"services": [
{
"sId": "62036c3cde7ac3b60203bdb5",
"status": 1,
"permissionGroups": [
{
"pgId": "61e52858b6d31433bb7faf6c",
"status": 1,
"permissions": [
{
"pId": "61e3f5891e0b130b3a228ff0",
"status": 1
},
{
"pId": "61e4ec54974ad2600b58f2ad",
"status": 1
},
{
"pId": "61e4ec54974ad2600b58f2ae",
"status": 1
}
]
},
{
"pgId": "61e8456e5b1359cb2b89888c",
"status": 1,
"permissions": [
{
"pId": "61e5086fd3af37389f1ba2a0",
"status": 1
},
{
"pId": "61e50870d3af37389f1ba2a1",
"status": 1
},
{
"pId": "61e52313f1a85a169f4f9afd",
"status": 1
},
{
"pId": "61e525c4f1a85a169f4f9b00",
"status": 1
}
]
}
]
}
]
}
I'm trying to update a field in the nested document shown above using a MongoDB query in Spring Boot. My goal is to update the field called status inside the permissions array. I know the query that does this in the mongo shell; can someone help me convert it to Java or suggest another way to do the same?
db.customer_service_details.update(
{
"userId": "61d51daa0e09c2a97f11c81d",
"services": {
"$elemMatch": {
"sId": "62036c3cde7ac3b60203bdb5","permissionGroups.pgId": "61e52858b6d31433bb7faf6c","permissionGroups.permissions.pId":"61e3f5891e0b130b3a228ff0"
}
}
},
{
"$set": { "services.$[outer].permissionGroups.$[inner].permissions.$[inner3].status": "7899" }
},
{
"arrayFilters": [{ "outer.sId": "62036c3cde7ac3b60203bdb5" },{ "inner.pgId": "61e52858b6d31433bb7faf6c" },{"inner3.pId":"61e3f5891e0b130b3a228ff0"}]
}
)

How can I merge two documents, get rid of duplicates and keep certain data?

I have the following data, which describes who is going to do what work.
Basically I want to replace the "workId" and "userId" with objects that contain all the data from their respective documents and retain the "when" data.
I am starting with this data:
{
"schedule": {
"WorkId": "4e51dc1069c27c015ede4e3e",
"daily": [
{
"when": 1,
"U_W": [
{
"workId": "3a60dc1069c27c015ede1111",
"userId": "5f60c3b7f93d8e00a1cdf414"
},
{
"workId": "3a60dc1069c27c015ede1122",
"userId": "5f60c3b7f93d8e00a1cdf415"
}
]
}
]
}
}
Here is the user table
"userSchema": [
{
_id: "5f60c3b7f93d8e00a1cdf414",
Name: "Bob"
},
{
_id: "5f60c3b7f93d8e00a1cdf415",
Name: "Joe"
}
],
Here is the work table
"workSchema": [
{
_id: "3a60dc1069c27c015ede1111",
Name: "shovel"
},
{
_id: "3a60dc1069c27c015ede1122",
Name: "hammer"
}
]
What I want to end up with is this:
{
"schedule": {
"WorkId": "4e51dc1069c27c015ede4e3e",
"daily": [
{
"when": 1,
"U_W": [
{
"work": {
"id": "3a60dc1069c27c015ede1111",
"name": "shovel"
},
"user": {
"id": "5f60c3b7f93d8e00a1cdf414",
"name": "bob"
}
},
{
"work": {
"id": "3a60dc1069c27c015ede1122",
"name": "hammer"
},
"user": {
"id": "5f60c3b7f93d8e00a1cdf415",
"name": "joe"
}
}
]
}
]
}
}
Here is my first attempt.
I have it joining the two documents.
How can I get rid of the duplicates (bob:hammer and joe:shovel)?
And how do I include the "when"?
The playground produces the following:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Bob",
"_id": "5f60c3b7f93d8e00a1cdf414"
},
"work_role": {
"Name": "shovel",
"_id": "3a60dc1069c27c015ede1111"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Bob",
"_id": "5f60c3b7f93d8e00a1cdf414"
},
"work_role": {
"Name": "hammer",
"_id": "3a60dc1069c27c015ede1122"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Joe",
"_id": "5f60c3b7f93d8e00a1cdf415"
},
"work_role": {
"Name": "shovel",
"_id": "3a60dc1069c27c015ede1111"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"user_info": {
"Name": "Joe",
"_id": "5f60c3b7f93d8e00a1cdf415"
},
"work_role": {
"Name": "hammer",
"_id": "3a60dc1069c27c015ede1122"
}
}
]

After beating my head against the wall for some time, I found a pretty cool Mongoose feature: references.
For example:
REF_work: { type: Schema.Types.ObjectId, required: true, ref: 'work' },
REF_person: { type: Schema.Types.ObjectId, required: true, ref: 'users' },
Then, when I call it from my get function, I add a populate to the find:
assignments.find(query).populate('daily.cp.REF_person').populate('daily.cp.REF_work');
I get exactly what I want:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"REF_person": {
"Name": "Bob",
"_id": "5f60c3b7f93d8e00a1cdf414"
},
"REF_work": {
"Name": "shovel",
"_id": "3a60dc1069c27c015ede1111"
}
},
{
"_id": ObjectId("5a934e000102030405000000"),
"REF_person": {
"Name": "Joe",
"_id": "5f60c3b7f93d8e00a1cdf415"
},
"REF_work": {
"Name": "hammer",
"_id": "3a60dc1069c27c015ede1122"
}
}
]
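
To illustrate why resolving each reference per entry avoids the bob:hammer and joe:shovel duplicates, here is a plain-JavaScript sketch that works on in-memory copies of the data from the question. It is only an analogue of what populate does conceptually, not Mongoose code; `resolveSchedule` and `byId` are hypothetical helper names. Each U_W entry is looked up by its own workId and userId, so no cross product is formed, and spreading the day object keeps "when".

```javascript
// In-memory stand-ins for the two referenced collections (data from the question).
const users = [
  { _id: "5f60c3b7f93d8e00a1cdf414", Name: "Bob" },
  { _id: "5f60c3b7f93d8e00a1cdf415", Name: "Joe" }
];
const works = [
  { _id: "3a60dc1069c27c015ede1111", Name: "shovel" },
  { _id: "3a60dc1069c27c015ede1122", Name: "hammer" }
];
const schedule = {
  WorkId: "4e51dc1069c27c015ede4e3e",
  daily: [
    {
      when: 1,
      U_W: [
        { workId: "3a60dc1069c27c015ede1111", userId: "5f60c3b7f93d8e00a1cdf414" },
        { workId: "3a60dc1069c27c015ede1122", userId: "5f60c3b7f93d8e00a1cdf415" }
      ]
    }
  ]
};

// Index a list of documents by _id for constant-time lookups.
const byId = (docs) => new Map(docs.map((d) => [d._id, d]));

// Replace each workId/userId pair with the referenced documents.
function resolveSchedule(schedule, users, works) {
  const userById = byId(users);
  const workById = byId(works);
  return {
    ...schedule,
    daily: schedule.daily.map((day) => ({
      ...day, // keeps "when" and any other per-day fields
      U_W: day.U_W.map(({ workId, userId }) => ({
        work: workById.get(workId),
        user: userById.get(userId)
      }))
    }))
  };
}

const resolved = resolveSchedule(schedule, users, works);
console.log(JSON.stringify(resolved, null, 2));
```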

mongodb distinct query values

I have the following mongodb documents:
{
"_id": "",
"name": "example1",
"colors": [
{
"id": 1000000,
"properties": [
{
"id": "1000",
"name": "",
"value": "green"
},
{
"id": "2000",
"name": "",
"value": "circle"
}
]
} ]
}
{
"_id": "",
"name": "example2",
"colors": [
{
"id": 1000000,
"properties": [
{
"id": "1000",
"name": "",
"value": "red"
},
{
"id": "4000",
"name": "",
"value": "box"
}
]
} ]
}
I would like to get the distinct values of the value field in the array where id=1000:
db.getCollection('product').distinct('colors.properties.value', {'colors.properties.id':{'$eq': 1000}})
but it returns all values in the array.
The expected result would be:
["green", "red"]

There are many ways to do this. Here is one:
$match eliminates unwanted documents
$unwind deconstructs the arrays
$addToSet in $group collects the distinct values
The mongo script:
db.collection.aggregate([
{
$match: {
"colors.properties.id": "1000"
}
},
{
"$unwind": "$colors"
},
{
"$unwind": "$colors.properties"
},
{
$match: {
"colors.properties.id": "1000"
}
},
{
$group: {
_id: null,
distinctData: {
$addToSet: "$colors.properties.value"
}
}
}
])
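
The stages above can be mirrored in plain JavaScript to see why the second $match matters. This is only an illustration on in-memory copies of the two sample documents, and `distinctValues` is a hypothetical helper name. Note that the ids in the sample data are strings, so the comparison has to use "1000" rather than the number 1000.

```javascript
// Trimmed-down copies of the two documents from the question.
const products = [
  {
    name: "example1",
    colors: [
      {
        id: 1000000,
        properties: [
          { id: "1000", name: "", value: "green" },
          { id: "2000", name: "", value: "circle" }
        ]
      }
    ]
  },
  {
    name: "example2",
    colors: [
      {
        id: 1000000,
        properties: [
          { id: "1000", name: "", value: "red" },
          { id: "4000", name: "", value: "box" }
        ]
      }
    ]
  }
];

// Walk every property of every color and collect matching values in a Set.
function distinctValues(docs, propertyId) {
  const seen = new Set();
  for (const doc of docs) {
    for (const color of doc.colors) {          // like $unwind: "$colors"
      for (const prop of color.properties) {   // like $unwind: "$colors.properties"
        if (prop.id === propertyId) {          // like the second $match
          seen.add(prop.value);                // like $addToSet
        }
      }
    }
  }
  return [...seen];
}

const result = distinctValues(products, "1000");
console.log(result);
```

Without the per-property check, every value of a matching document would be collected, which is the behaviour described in the question. In MongoDB itself the order of the elements produced by $addToSet is not guaranteed.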