NoSQL MongoDB: How to select data (countries) that only has a code with 3 characters?

SQL How to select data (countries) that only has code with 3 characters?
My issue is the same as in the link above, except that I have to do it in NoSQL.
I have to extract the list of all countries. However, the country column (2nd column) of the data table also includes continents, which I do not want.
I realised that rows whose first column (iso_code) has exactly 3 characters are countries, while those with more than 3 characters are continents or other non-country entries. How do I go about extracting only the countries?
My code for extracting everything:
db.owid_energy_data.distinct("country")

Use $strLenCP to compute the length of iso_code. $match with 3 to get the documents needed. $group to find distinct countries.
db.collection.aggregate([
  {
    "$match": {
      $expr: {
        $eq: [
          3,
          {
            "$strLenCP": "$iso_code"
          }
        ]
      }
    }
  },
  {
    $group: {
      _id: "$country"
    }
  }
])
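To make the matching rule concrete, here is a minimal sketch in plain JavaScript (no database needed) that applies the same idea to a few in-memory rows. The sample rows and the helper names `codePointLength` and `distinctCountries` are made up for illustration. One difference to keep in mind: the sketch simply skips rows without an `iso_code`, whereas `$strLenCP` is documented to raise an error for a null or missing field, so the pipeline may need an extra `$match` on the field type if such rows exist.

```javascript
// Hypothetical sample rows shaped like the owid_energy_data documents.
const rows = [
  { iso_code: "AFG", country: "Afghanistan" },
  { iso_code: "AFG", country: "Afghanistan" },
  { iso_code: "OWID_AFR", country: "Africa" },
  { iso_code: "FRA", country: "France" },
  { country: "World" }, // iso_code missing
];

// Count Unicode code points, which is what $strLenCP measures for a string.
// Non-string values get -1 so they never match a real length.
function codePointLength(value) {
  return typeof value === "string" ? [...value].length : -1;
}

// Keep rows whose code has exactly 3 code points, then de-duplicate the names
// (the $match + $group steps of the pipeline).
function distinctCountries(docs) {
  const names = docs
    .filter((doc) => codePointLength(doc.iso_code) === 3)
    .map((doc) => doc.country);
  return [...new Set(names)].sort();
}

console.log(distinctCountries(rows)); // [ 'Afghanistan', 'France' ]
```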

Related

I am looking for some queries in MongoDB. I have to write one query for every question.

Questions are
How many products have a discount on them?
How many "Water" products don't have any discount on them?
How many products have "mega" in their name?
How many products have discount % greater than 30%?
Which brand in "Oil" section is selling the most number of products?
Data format:
{
  "_id": {"$oid": "62363ce631312ffd2dc724f5"},
  "Title": "Fortune mega",
  "Brand": "Gewti Laurent",
  "Name": "lorem ipsum",
  "Original_Price": 590,
  "Sale_price": 590,
  "Product_category": "Oil"
}
{
  "_id": {"$oid": "62363ce631342ffd2dc724f5"},
  "Title": "katy mega",
  "Brand": "Gewti Laurent",
  "Name": "targi",
  "Original_Price": 1890,
  "Sale_price": 1890,
  "Product_category": "Oil"
}
{
  "_id": {"$oid": "62363ce641312ffd2dc724f5"},
  "Title": "ydnsu",
  "Brand": "Gewti Laurent",
  "Name": "otgu",
  "Original_Price": 1390,
  "Sale_price": 1290,
  "Product_category": "Water"
}
{
  "_id": {"$oid": "62363ce431312ffd2dc724f5"},
  "Title": "ykjssu",
  "Brand": "Gewti Laurent",
  "Name": "itru",
  "Original_Price": 190,
  "Sale_price": 170,
  "Product_category": "Water"
}
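Here is a solution using the aggregation pipeline.
Products having a discount:
db.collection.aggregate([
  {
    "$match": {
      "$expr": {
        // $lt rather than $lte: equal prices mean there is no discount.
        // The field is spelled Sale_price (lowercase p) in the data.
        "$lt": [
          "$Sale_price",
          "$Original_Price"
        ]
      }
    }
  }
])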
"Water" products not having any discount:
db.collection.aggregate([
  {
    "$match": {
      "Product_category": "Water",
      "$expr": {
        // The field is spelled Sale_price (lowercase p) in the data.
        "$gte": [
          "$Sale_price",
          "$Original_Price"
        ]
      }
    }
  }
])
Note: If you want the opposite, i.e., Water products that do have a discount, change $gte to $lt.
Products having "mega" in their name:
db.collection.find({
  "Title": {
    "$regex": "mega"
  }
})
Note: This isn't an aggregation; a plain find() query is enough here.
Products having discount % greater than 30:
db.collection.aggregate([
  {
    "$addFields": {
      // (Original_Price - Sale_price) / Original_Price * 100
      "discount_in_percent": {
        "$multiply": [
          {
            "$divide": [
              {
                "$subtract": [
                  "$Original_Price",
                  "$Sale_price"
                ]
              },
              "$Original_Price"
            ]
          },
          100
        ]
      }
    }
  },
  {
    // No category filter: the question asks about all products.
    "$match": {
      "$expr": {
        "$gt": [
          "$discount_in_percent",
          30
        ]
      }
    }
  }
])
Note: This is a bit of overkill, but it also tells you the discount percentage of each product.
Your last question, finding the best-selling product in a particular category, is vague, as pointed out by @Rohit Roa. From the given data, it could mean the product with the highest discount. For that, you can play around with the second query by tweaking the $expr selector and using the $project operator to return a particular product.
Since your question is more about how many than which, you can look up the $count stage (or $size for arrays) to get the count, or loop over the result.
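As a worked check of the discount arithmetic, the following plain JavaScript sketch applies the same formula to the four sample products from the question. The helper name `discountPercent` is made up for illustration, and only the fields needed for the calculation are copied.

```javascript
// The four sample products from the question (only the fields used here).
const products = [
  { Title: "Fortune mega", Original_Price: 590, Sale_price: 590, Product_category: "Oil" },
  { Title: "katy mega", Original_Price: 1890, Sale_price: 1890, Product_category: "Oil" },
  { Title: "ydnsu", Original_Price: 1390, Sale_price: 1290, Product_category: "Water" },
  { Title: "ykjssu", Original_Price: 190, Sale_price: 170, Product_category: "Water" },
];

// Same arithmetic as the $addFields stage: (original - sale) / original * 100.
function discountPercent(product) {
  return ((product.Original_Price - product.Sale_price) / product.Original_Price) * 100;
}

// Print the discount of every product, rounded to one decimal place.
for (const product of products) {
  console.log(product.Title, discountPercent(product).toFixed(1) + "%");
}

// Count the products whose discount is strictly greater than 30%.
const over30 = products.filter((p) => discountPercent(p) > 30).length;
console.log(over30); // 0
```

With this sample data the two discounted products come out at roughly 7.2% and 10.5%, so none of them passes the 30% threshold.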
How many products have a discount on them ?
db.products.find().toArray().filter(product => product.Sale_price < product.Original_Price).length
find() returns all products in the collection, and toArray() converts the cursor into an array on which you can use other functions. filter() then looks at every product in the list and returns a new list of the products whose Sale_price is lower than the Original_Price, and .length measures the length of that list.
How many "Water" products don't have any discount on them?
db.products.find({"Product_category": "Water"}).toArray().filter(product => product.Sale_price >= product.Original_Price).length
Similar to the first one, except that find() now takes a filter on the product category and filter() uses a different condition.
How many products have "mega" in their name ?
db.products.find({"Name": {$regex: /mega/}}).toArray().length
Here, use a regular expression to search for names containing "mega".
How many products have discount % greater than 30% ?
db.products.find().toArray().filter(product => (product.Original_Price-product.Sale_price) > 0.3*product.Original_Price).length
Similar to the earlier ones; just change the filter.
The last question (which brand in the "Oil" section sells the most products) needs more input to figure out.
These commands will work in the Mongo shell. Refer to the documentation if you need to query the database from Node or Python using similar functions.
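Note that toArray() pulls every matching document to the client before counting. As an alternative, the counting can be left to the server with a $count stage. The sketch below only builds the pipeline arrays in plain JavaScript; the helper name `countPipeline` and the output field name `total` are arbitrary choices for this example, and `db.products` is assumed to be the collection used above.

```javascript
// Build a pipeline that filters documents and then counts them on the server.
// "total" is an arbitrary output field name chosen for this sketch.
function countPipeline(filter) {
  return [{ $match: filter }, { $count: "total" }];
}

// Products with a discount: sale price strictly below the original price.
const discounted = countPipeline({
  $expr: { $lt: ["$Sale_price", "$Original_Price"] },
});

// "Water" products without a discount.
const waterWithoutDiscount = countPipeline({
  Product_category: "Water",
  $expr: { $gte: ["$Sale_price", "$Original_Price"] },
});

console.log(JSON.stringify(discounted));
console.log(JSON.stringify(waterWithoutDiscount));

// In the Mongo shell the pipelines would be run like this:
//   db.products.aggregate(discounted)
```

Keep in mind that $count emits no document at all when nothing matches, so an empty result should be read as a count of zero.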

MongoDB querying aggregation in one single document

I have a short but important question. I am new to MongoDB and querying.
My database looks like the following: I only have one document stored in my database (sorry for blurring).
The document consists of different fields:
two are blurred and not important
datum -> date
instance -> Array with an Embedded Document Object; Our instance has an id, two not important fields and a code.
Now I want to query how many times an object in my instance array has the group "a" and a text "sample"?
Is this even possible?
I only found methods to count how many documents have something...
I am using MongoDB Compass, but I can also use PyMongo, MongoEngine, or any other tool for querying MongoDB.
Thank you in advance, and if you have more questions, please leave a comment!
You can try this
db.collection.aggregate([
  {
    $unwind: "$instance"
  },
  {
    $unwind: "$instance.label"
  },
  {
    $match: {
      "instance.label.group": "a",
      "instance.label.text": "sample",
    }
  },
  {
    $group: {
      _id: {
        group: "$instance.label.group",
        text: "$instance.label.text"
      },
      count: {
        $sum: 1
      }
    }
  }
])
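The pipeline above flattens both arrays and then counts the label entries that match. The same logic can be sketched in plain JavaScript on an in-memory document. The document shape here is an assumption inferred from the pipeline (each `instance` element holding a `label` array of `{group, text}` objects), and the sample values and the helper name `countLabels` are made up.

```javascript
// Hypothetical document; the label structure is inferred from the pipeline above.
const doc = {
  datum: "2020-01-01",
  instance: [
    { id: 1, label: [{ group: "a", text: "sample" }, { group: "b", text: "sample" }] },
    { id: 2, label: [{ group: "a", text: "sample" }] },
    { id: 3, label: [{ group: "a", text: "other" }] },
  ],
};

// Flatten instance -> label (the two $unwind stages), keep the matching
// labels ($match), and count them ($group with $sum: 1).
function countLabels(document, group, text) {
  return (document.instance || [])
    .flatMap((item) => item.label || [])
    .filter((label) => label.group === group && label.text === text).length;
}

console.log(countLabels(doc, "a", "sample")); // 2
```

Like the pipeline, this counts matching label entries, so an instance object with two matching labels contributes two to the total.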

Mongo queries to search all the collections of a database (Mongo/PyMongo)

I have been stuck on how to query a database in which every document shares this common structure:
{
"_id": {
"$oid": "5e0983863bcf0dab51f2872b"
},
"word": "never", // get the `word` value for each of below queries
"wordset_id": "a42b50e85e",
"meanings": [{
"id": "1f1bca9d9f",
"def": "not ever",
"speech_part": "adverb",
"synonyms": ["ne'er"]
}, {
"id": "d35f973ed0",
"def": "not at all",
"speech_part": "adverb"
}]
}
1) A query to get every word with speech_part: "adverb" (e.g. never, ...)
2) A query to get every word with a word length of 6 and speech_part: "adverb"
I have learnt from SO that, to search all collections, I first have to retrieve all the collections in the database, but writing the query is where I am stuck.
db.collection.find({"meanings.speech_part":"adverb"},{"_id":0, "word":1})
The query above returns all words with a specific speech_part.
The first part of the query is the filter predicate, in your scenario matching speech_part. If the field you are matching were not inside another object, or inside an object within an array, you could just write {column_name: "something"}.
As speech_part is inside an object that is itself inside an array, you have to write {"parentColumn.key": "something"}, in your case {"meanings.speech_part": "adverb"}.
The second part of the query is the projection, where you define which fields you want in the result. To get only the word values, use {word: 1}; to get more fields, use {word: 1, etc: 1}. MongoDB projects _id by default, so to remove _id from the result you have to explicitly set {_id: 0}.
db.collection.find({
"meanings.speech_part":"adverb",
"$expr": { "$gt": [ { "$strLenCP": "$word" }, 6 ] }
},{"_id":0, "word":1})
This returns all words of a specific speech_part with a length greater than 6. This query is a bit more complex; you can look up the $expr documentation. Inside $expr you can apply functions to your fields and match on the result. Here, $strLenCP computes the length of the word value, and the $gt comparison operator checks whether it is greater than 6. If you need a length of exactly 6, as in the question, use $eq in place of $gt.
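The two filters can be tried out without a database. This plain JavaScript sketch mimics them on a small in-memory list: "never" comes from the question, while the other entries and the helper name `wordsWith` are made up for illustration.

```javascript
// "never" is taken from the question; the other entries are hypothetical.
const entries = [
  { word: "never", meanings: [{ def: "not ever", speech_part: "adverb" }, { def: "not at all", speech_part: "adverb" }] },
  { word: "seldom", meanings: [{ def: "not often", speech_part: "adverb" }] },
  { word: "quickly", meanings: [{ def: "with speed", speech_part: "adverb" }] },
  { word: "window", meanings: [{ def: "an opening in a wall", speech_part: "noun" }] },
];

// A document matches when at least one element of its meanings array has the
// requested speech_part (the behaviour of the "meanings.speech_part" filter).
// lengthTest receives the code point length of the word, like $strLenCP.
function wordsWith(docs, speechPart, lengthTest = () => true) {
  return docs
    .filter((doc) => doc.meanings.some((m) => m.speech_part === speechPart))
    .filter((doc) => lengthTest([...doc.word].length))
    .map((doc) => doc.word);
}

console.log(wordsWith(entries, "adverb"));                 // [ 'never', 'seldom', 'quickly' ]
console.log(wordsWith(entries, "adverb", (n) => n === 6)); // [ 'seldom' ]
console.log(wordsWith(entries, "adverb", (n) => n > 6));   // [ 'quickly' ]
```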
You may try the query below to get the matching documents. You will have to translate it to PyMongo yourself.
db.getCollection('test-collection').find(
{
'meanings.speech_part': 'adverb'
},
{
_id: 0,
word: 1
}
);
Read about projections in MongoDB here:
https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results

How to compare one field with combined fields in mongodb

I have a collection like this
sample document
{
"field1":"one",
"field2":"two",
"field3":"three",
...some more key values
}
Now I have a value, field, which is the combination of the field1, field2 and field3 values; i.e. for the document above I have field = "onetwothree".
One option is to split that value into 3 separate fields, but that is not possible because its pattern is not fixed.
So I want to compare like this:
db.collection.find({"field1+field2+field3":field})
But that is not possible. One solution is to fetch all documents from the collection and compare them one by one, but that is not an optimized way to do this. Is there a better solution?
Starting with MongoDB v3.6 you can run this query:
db.collection.find({
$expr: {
$eq: [
"onetwothree",
{ $concat: [ "$field1", "$field2", "$field3" ] }
]
}
})
Here's the documentation on $concat and $expr.
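For illustration, the comparison that `$expr` performs per document can be sketched in plain JavaScript. The sample documents and the helper names `concatFields` and `findByCombined` are hypothetical. The sketch assumes all three fields hold strings and mirrors the documented `$concat` behaviour of returning null when any part is null or missing.

```javascript
// Hypothetical documents shaped like the sample in the question.
const records = [
  { field1: "one", field2: "two", field3: "three" },
  { field1: "on", field2: "etwo", field3: "three" },
  { field1: "one", field2: "two" }, // field3 missing
];

// Mirrors $concat for string fields: a null or missing part makes the result null.
function concatFields(doc, names) {
  let joined = "";
  for (const name of names) {
    if (doc[name] == null) return null;
    joined += doc[name];
  }
  return joined;
}

// Keep the documents whose three fields, joined together, equal the target.
function findByCombined(docs, target) {
  return docs.filter(
    (doc) => concatFields(doc, ["field1", "field2", "field3"]) === target
  );
}

console.log(findByCombined(records, "onetwothree").length); // 2
```

The second sample document also matches because only the joined string is compared, which is what you want when the split pattern is not fixed.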

MongoDB: aggregate and group by splitting the id

My schema implementation is influenced by this tutorial on the official MongoDB site:
{
_id: String,
data:[
{
point_1: Number,
ts: Date
}
]
}
This is basically a schema designed for time series data: I store the data for each hour per device in an array in a single document. I create the _id field by combining the id of the device sending the data with the time. For example, if a device with id xyz1234 sends data at 2018-09-11 12:30:00, then my _id field becomes xyz1234:2018091112.
I create a new document if the document for that hour and that device doesn't exist; otherwise I just push my data to the data array.
client.db('iot')
.collection('iotdata')
.update({_id:id},{$push:{data:{point_1,ts:date}}},{upsert:true});
Now I am facing a problem while doing aggregation. I am trying to get these types of values:
Min point_1 value for many devices in last 24 hours by grouping on device id
Max point_1 value for many devices in last 24 hours by grouping on device id
Average point_1 for many devices in last 24 hours by grouping on device id
I thought this was a very simple aggregation, but then I realized that the device id is not stored on its own but is mixed with the time data, so grouping by device id is not straightforward. How can I split the _id and group based on the device id? I tried my best to write the question as clearly as possible, so please ask in the comments if any part of it is not clear.
You can start with $unwind on data to get a single document per entry. Then you can get the deviceId using the $substr and $indexOfBytes operators. Then you can apply your filtering condition (last 24 hours) and use $group to get the min, max and avg:
db.col.aggregate([
  {
    $unwind: "$data"
  },
  {
    $project: {
      point_1: "$data.point_1",
      // everything before the first ":" in _id is the device id
      deviceId: { $substr: [ "$_id", 0, { $indexOfBytes: [ "$_id", ":" ] } ] },
      dateTime: "$data.ts"
    }
  },
  {
    $match: {
      dateTime: { $gte: ISODate("2018-09-10T12:00:00Z") }
    }
  },
  {
    $group: {
      _id: "$deviceId",
      min: { $min: "$point_1" },
      max: { $max: "$point_1" },
      avg: { $avg: "$point_1" }
    }
  }
])
You can use the query below in MongoDB 3.6 or later. The cutoff in $match is compared as a string against the time part of _id, so it has to use the same YYYYMMDDHH format.
db.colname.aggregate([
{"$project":{
"deviceandtime":{"$split":["$_id", ":"]},
"minpoint":{"$min":"$data.point_1"},
"maxpoint":{"$max":"$data.point_1"},
"sumpoint":{"$sum":"$data.point_1"},
"count":{"$size":"$data.point_1"}
}},
{"$match":{"$expr":{"$gte":[{"$arrayElemAt":["$deviceandtime",1]},"2018091000"]}}},
{"$group":{
"_id":{"$arrayElemAt":["$deviceandtime",0]},
"minpoint":{"$min":"$minpoint"},
"maxpoint":{"$max":"$maxpoint"},
"sumpoint":{"$sum":"$sumpoint"},
"countpoint":{"$sum":"$count"}
}},
{"$project":{
"minpoint":1,
"maxpoint":1,
"avgpoint":{"$divide":["$sumpoint","$countpoint"]}
}}
])
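To see the split-and-group idea in isolation, here is a minimal plain JavaScript sketch that does the same work on in-memory documents. The sample documents, the cutoff date, and the helper name `summarize` are all made up for illustration; the document shape follows the schema from the question.

```javascript
// Hypothetical hourly documents following the schema from the question.
const hourlyDocs = [
  { _id: "xyz1234:2018091112", data: [
    { point_1: 10, ts: new Date("2018-09-11T12:30:00Z") },
    { point_1: 14, ts: new Date("2018-09-11T12:45:00Z") },
  ] },
  { _id: "xyz1234:2018091113", data: [
    { point_1: 6, ts: new Date("2018-09-11T13:05:00Z") },
  ] },
  { _id: "abc9999:2018091112", data: [
    { point_1: 3, ts: new Date("2018-09-11T12:10:00Z") },
    { point_1: 5, ts: new Date("2018-09-11T12:20:00Z") },
  ] },
  { _id: "abc9999:2018090901", data: [
    { point_1: 100, ts: new Date("2018-09-09T01:00:00Z") }, // older than the cutoff
  ] },
];

// Group readings taken at or after `since` by the device id, which is the
// part of _id before the first ":".
function summarize(docs, since) {
  const groups = new Map();
  for (const doc of docs) {
    const deviceId = doc._id.slice(0, doc._id.indexOf(":"));
    for (const entry of doc.data) {
      if (entry.ts < since) continue; // skip readings before the cutoff
      const g = groups.get(deviceId) || { min: Infinity, max: -Infinity, sum: 0, count: 0 };
      g.min = Math.min(g.min, entry.point_1);
      g.max = Math.max(g.max, entry.point_1);
      g.sum += entry.point_1;
      g.count += 1;
      groups.set(deviceId, g);
    }
  }
  return [...groups].map(([deviceId, g]) => ({
    _id: deviceId, min: g.min, max: g.max, avg: g.sum / g.count,
  }));
}

console.log(summarize(hourlyDocs, new Date("2018-09-10T12:00:00Z")));
```

For a rolling window, the cutoff could be computed as `new Date(Date.now() - 24 * 60 * 60 * 1000)` instead of a fixed date.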