mongoDB distinct return multiple attributes - mongodb

Is there a way to combine distinct with another command to print not only the distinct attributes but also an attribute linked to the distinct one?
For example, print only 0,foo and 1,bar from the table below.
-----------------
| id | name |
| 0 | foo |
| 1 | bar |
| 1 | bar |
I am currently using
>db.foo.distinct('id')
to return the ids in a collection, and I want to use that to print the matching names.

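You can try this, which groups on both id and name and counts the documents in each group (db.collection.group() was removed in MongoDB 4.2, so it only works on older versions):
db.foo.group({key: {id: 1, name: 1}, initial: {sum: 0}, reduce: function(doc, prev){ prev.sum += 1; }});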

You can accomplish this with MapReduce as follows:
map = function(){
    emit(this.id + "," + this.name, {id: this.id, name: this.name});
}
reduce = function(key, values){
    return {"id": values[0].id, "name": values[0].name};
}
db.mycollection.mapReduce(map, reduce, {out: "myresult_collection"})
db.myresult_collection.find({}, {value: true, _id: false})
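The idea behind the composite key can be tried without a server. The following is only a sketch in plain JavaScript: the docs array is a stand-in for the table in the question, and distinctPairs is a hypothetical helper, not a MongoDB API. It keeps the first document seen for each "id,name" key, which mirrors what the reduce function does with values[0].

```javascript
// In-memory stand-in for the collection from the question (assumed data).
const docs = [
  { id: 0, name: "foo" },
  { id: 1, name: "bar" },
  { id: 1, name: "bar" },
];

// Hypothetical helper: deduplicate on the same composite key the map step emits.
function distinctPairs(rows) {
  const seen = new Map();
  for (const row of rows) {
    const key = row.id + "," + row.name;
    // Keep only the first document for each key, like values[0] in reduce.
    if (!seen.has(key)) {
      seen.set(key, { id: row.id, name: row.name });
    }
  }
  return Array.from(seen.values());
}

const pairs = distinctPairs(docs);
console.log(pairs);
```

On a live server, an aggregation with a $group stage whose _id holds both fields is a commonly used alternative to mapReduce for this kind of deduplication.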

Related

Postgresql - Filter object array and extract required values in a json object

I have a PostgreSQL table like below:
| data |
| -------------- |
| {"name":"a","tag":[{"type":"country","value":"US"}]} |
| {"name":"b","tag":[{"type":"country","value":"US"}, {"type":"country","value":"UK"}]} |
| {"name":"c","tag":[{"type":"gender","value":"male"}]} |
The goal is to extract all the values in the "tag" array with "type" = "country" and aggregate them into a text array. The expected result is as follows:
| result |
| -------------- |
| ["US"] |
| ["US", "UK"] |
| [] |
I've tried to expand the "tag" array and aggregate the desired result back; however, it requires a unique id to group up the results. Hence, I add a column with row number to serve as unique id. Here is what I've done:
SELECT ROW_NUMBER() OVER () AS id, * INTO data_table_with_id FROM data_table;
SELECT ARRAY_AGG(tag_value) AS result
FROM (
    SELECT
        id,
        json_array_elements("data"::json->'tag')->>'type' AS tag_type,
        json_array_elements("data"::json->'tag')->>'value' AS tag_value
    FROM data_table_with_id
) tags
WHERE tag_type = 'country'
GROUP BY id;
Is it possible to use a single select to filter the object array and get the required results?
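You can do this easily with a JSON path function (available from PostgreSQL 12; the argument must be jsonb, so cast the column if it is stored as text or json):
select jsonb_path_query_array(data::jsonb, '$.tag[*] ? (@.type == "country").value')
from data_table;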
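To make clear what the path expression selects, here is a rough equivalent of the filter in plain JavaScript. It is a sketch only: rows mirrors the sample data from the question, and countryValues is a hypothetical helper name. For each row it keeps the tag entries whose type is "country" and returns their values, giving an empty array when nothing matches.

```javascript
// Sample rows copied from the question (the second row uses the repaired JSON).
const rows = [
  { name: "a", tag: [{ type: "country", value: "US" }] },
  { name: "b", tag: [{ type: "country", value: "US" }, { type: "country", value: "UK" }] },
  { name: "c", tag: [{ type: "gender", value: "male" }] },
];

// Hypothetical helper: same selection as $.tag[*] ? (@.type == "country").value
function countryValues(doc) {
  return doc.tag
    .filter((t) => t.type === "country")
    .map((t) => t.value);
}

const result = rows.map(countryValues);
console.log(JSON.stringify(result));
```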

Optimizing MongoDB indexing multiple field with multiple query

I am new to database indexing. My application has the following "find" and "update" queries, searched by single and multiple fields
operation | reference | timestamp | phone | username | key | address
update | x | | | | |
findOne | | x | x | | |
find/limit:16 | | x | x | x | |
find/limit:11 | | x | | | x | x
find/limit:1/sort:-1 | | x | x | | x | x
find | | x | | | |
1)update({"reference":"f0d3dba-278de4a-79a6cb-1284a5a85cde"}, ……….
2)findOne({"timestamp":"1466595571", "phone":"9112345678900"})
3)find({"timestamp":"1466595571", "phone":"9112345678900", "username":"a0001a"}).limit(16)
4)find({"timestamp":"1466595571", "key":"443447644g5fff", "address":"abc road, mumbai, india"}).limit(11)
5)find({"timestamp":"1466595571", "phone":"9112345678900", "key":"443447644g5fff", "address":"abc road, mumbai, india"}).sort({"_id":-1}).limit(1)
6)find({"timestamp":"1466595571"})
I am creating these indexes:
db.coll.createIndex( { "reference": 1 } ) //for 1st, 6th query
db.coll.createIndex( { "timestamp": 1, "phone": 1, "username": 1 } ) //for 2nd, 3rd query
db.coll.createIndex( { "timestamp": 1, "key": 1, "address": 1, phone: 1 } ) //for 4th, 5th query
Is this the correct way?
Please help me
Thank you
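I think what you have done looks fine. One detail: the 6th query filters only on timestamp, so it would be served by one of the compound indexes that start with timestamp (through the index prefix), not by the reference index. To check whether a query uses an index, which index it uses, and whether that index is effective, call explain() on your find().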
For example:
db.coll.find({"timestamp":"1466595571"}).explain()
This returns a JSON document detailing which index (if any) was used. You can also ask explain() to return "executionStats", e.g.:
db.coll.find({"timestamp":"1466595571"}).explain("executionStats")
This will tell you how many index keys were examined to find the result set as well as the execution time and other useful metrics.
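As a rough way to reason about which of the three indexes can help a given query before running explain(), the sketch below counts how many leading fields of each index a query constrains. This is a simplification of the compound-index prefix rule, not the real query planner; the names indexes, prefixLength and candidateIndexes are hypothetical, and explain() remains the authority on what is actually used.

```javascript
// Index definitions from the question, as ordered lists of field names.
const indexes = {
  reference: ["reference"],
  timestampPhoneUsername: ["timestamp", "phone", "username"],
  timestampKeyAddressPhone: ["timestamp", "key", "address", "phone"],
};

// Count how many leading index fields appear in the query's equality filter.
function prefixLength(indexFields, queryFields) {
  let n = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    n += 1;
  }
  return n;
}

// Return every index whose first field is constrained, with its usable prefix length.
function candidateIndexes(queryFields) {
  const out = {};
  for (const [name, fields] of Object.entries(indexes)) {
    const n = prefixLength(fields, queryFields);
    if (n > 0) out[name] = n;
  }
  return out;
}

const query6 = candidateIndexes(["timestamp"]);
const query4 = candidateIndexes(["timestamp", "key", "address"]);
console.log(query6, query4);
```

Under this simplified rule the reference index never appears for query 6, and query 4 matches three leading fields of the third index but only one of the second.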

Mongo - find with multiple

Given that I have this data in my mongo collection
product_id | original_id | text
1 | "A00149" | "1280 x 1024"
1 | "A00373" | "Black"
2 | "A00149" | "1280 x 1024"
2 | "A00373" | "White"
3 | "A00149" | "1980 x 1200"
3 | "A00373" | "Black"
(I have added quotes around the values in hand - these are not in the real collection)
With the following query, I'm getting 0 results, though I was expecting 1.
product_id = 1 should meet the query.
Can somebody explain to me what I'm doing wrong?
In SQL the where would look like this
WHERE
(original_id = "A00149" AND text = "1280 x 1024")
AND
(original_id = "A00373" AND text = "Black")
And the mongo query
db.Filter.find({
    "find": true,
    "query": {
        "$and": [
            {
                "original_id": "A00149",
                "text": "1280 x 1024"
            },
            {
                "original_id": "A00373",
                "text": "Black"
            }
        ]
    },
    "fields": {
        "product_id": 1
    }
});
If your collection is called 'Filter' and you want a query to return the documents with product_id = 1, then it's simple:
db.Filter.find({"product_id" : 1})
I may have misunderstood your question, though.
Edit:
try:
db.Filter.find({$and: [{"original_id": "A00149", "text": "1280 x 1024"}, {"original_id": "A00373", "text": "Black"}]},{"product_id": 1})
see http://docs.mongodb.org/manual/reference/operator/query/and/#op._S_and
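One caveat, assuming each row of the table in the question is stored as its own document: $and requires a single document to satisfy every condition, and no single document here has both original_id values, so such a query can still return nothing. The sketch below reproduces that in plain JavaScript and contrasts it with a per-product check. The names filterDocs, wanted, matches and productsMatchingAll are hypothetical and only illustrate the logic.

```javascript
// Documents as listed in the question, one per table row (assumed layout).
const filterDocs = [
  { product_id: 1, original_id: "A00149", text: "1280 x 1024" },
  { product_id: 1, original_id: "A00373", text: "Black" },
  { product_id: 2, original_id: "A00149", text: "1280 x 1024" },
  { product_id: 2, original_id: "A00373", text: "White" },
  { product_id: 3, original_id: "A00149", text: "1980 x 1200" },
  { product_id: 3, original_id: "A00373", text: "Black" },
];

const wanted = [
  { original_id: "A00149", text: "1280 x 1024" },
  { original_id: "A00373", text: "Black" },
];

const matches = (doc, cond) =>
  doc.original_id === cond.original_id && doc.text === cond.text;

// $and semantics: one document must satisfy every condition at once.
const andMatches = filterDocs.filter((d) => wanted.every((c) => matches(d, c)));

// Per-product semantics: every condition is met by some document of the product.
function productsMatchingAll(docs, conds) {
  const byProduct = new Map();
  for (const d of docs) {
    if (!byProduct.has(d.product_id)) byProduct.set(d.product_id, []);
    byProduct.get(d.product_id).push(d);
  }
  const ids = [];
  for (const [id, group] of byProduct) {
    if (conds.every((c) => group.some((d) => matches(d, c)))) ids.push(id);
  }
  return ids;
}

const productIds = productsMatchingAll(filterDocs, wanted);
console.log(andMatches.length, productIds);
```

On the server, the per-product check is typically expressed by matching either condition and then grouping by product_id and keeping the groups that matched both.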

group_by or distinct with postgres/dbix-class

I have a posts table like so:
+-----+----------+------------+------------+
| id | topic_id | text | timestamp |
+-----+----------+------------+------------+
| 789 | 2 | foobar | 1396026357 |
| 790 | 2 | foobar | 1396026358 |
| 791 | 2 | foobar | 1396026359 |
| 792 | 3 | foobar | 1396026360 |
| 793 | 3 | foobar | 1396026361 |
+-----+----------+------------+------------+
How would I go about "grouping" the results by topic id, while pulling the most recent record (sorting by timestamp desc)?
I've come to the understanding that I might not want "group_by" but rather "distinct on". My postgres query looks like this:
select distinct on (topic_id) topic_id, id, text, timestamp
from posts
order by topic_id desc, timestamp desc;
This works great. However, I can't figure out if this is something I can do in DBIx::Class without having to write a custom ResultSource::View. I've tried various arrangements of group_by with selects and columns, and have tried distinct => 1. If/when a result is returned, it doesn't actually preserve the uniqueness.
Is there a way to write the query I am trying through a resultset search, or is there perhaps a better way to achieve the same result through a different type of query?
Check out the section in the DBIC Cookbook on grouping results.
I believe what you want is something along the lines of this though:
my $rs = $base_posts_rs->search(undef, {
    columns  => [ {topic_id=>"topic_id"}, {text=>"text"}, {timestamp=>"timestamp"} ],
    group_by => ["topic_id"],
    order_by => [ {-desc=>"topic_id"}, {-desc=>"timestamp"} ],
});
Edit: A quick and dirty way to get around strict SQL grouping would be something like this:
my $rs = $base_posts_rs->search(undef, {
    columns => [
        { topic_id => \"MAX(topic_id)" },
        { text => \"MAX(text)" },
        { timestamp => \"MAX(timestamp)" },
    ],
    group_by => ["topic_id"],
    order_by => [ {-desc=>"topic_id"}, {-desc=>"timestamp"} ],
});
Of course, use the appropriate aggregate function for your need.
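For comparison, this is what the DISTINCT ON query from the question selects: for each topic_id, the whole row with the highest timestamp. The sketch is written in plain JavaScript rather than Perl, purely as a language-neutral illustration, and latestPerTopic is a hypothetical helper. It also shows why independent aggregates need care: MAX(text) and MAX(timestamp) are computed separately, so they are not guaranteed to come from the same row.

```javascript
// Rows from the posts table in the question.
const posts = [
  { id: 789, topic_id: 2, text: "foobar", timestamp: 1396026357 },
  { id: 790, topic_id: 2, text: "foobar", timestamp: 1396026358 },
  { id: 791, topic_id: 2, text: "foobar", timestamp: 1396026359 },
  { id: 792, topic_id: 3, text: "foobar", timestamp: 1396026360 },
  { id: 793, topic_id: 3, text: "foobar", timestamp: 1396026361 },
];

// Hypothetical helper: keep the most recent full row for each topic.
function latestPerTopic(rows) {
  const latest = new Map();
  for (const row of rows) {
    const current = latest.get(row.topic_id);
    if (current === undefined || row.timestamp > current.timestamp) {
      latest.set(row.topic_id, row);
    }
  }
  // Order by topic_id descending, as in the SQL query.
  return Array.from(latest.values()).sort((a, b) => b.topic_id - a.topic_id);
}

const newest = latestPerTopic(posts);
console.log(newest.map((p) => p.id));
```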

How to map-reduce group, sort and count sort values

I have some problems with mapReduce.
I want to group, sort and count some values in a collection. I have a collection such as:
----------------------------
| item_id | date |
----------------------------
| 1 | 01/15/2012 |
----------------------------
| 2 | 01/01/2012 |
----------------------------
| 1 | 01/15/2012 |
----------------------------
| 1 | 01/01/2012 |
----------------------------
| 2 | 01/03/2012 |
----------------------------
| 2 | 01/03/2012 |
----------------------------
| 1 | 01/01/2012 |
----------------------------
| 1 | 01/01/2012 |
----------------------------
| 2 | 01/01/2012 |
----------------------------
| 2 | 01/01/2012 |
----------------------------
I want to group by item_id, count the occurrences of each date per item, sort the dates for each item, and get a result such as:
value: {{item_id:1, date:{01/01/2012:3, 01/15/2012:2 }},{item_id:2, date:{01/01/2012:3, 01/03/2012:2 }}}
I use mapReduce:
m = function() {
    emit(this.item_id, this.date);
}
r = function(key, values) {
    var res = {};
    values.forEach(function(v) {
        if (typeof res[v] != 'undefined') {
            res[v] += 1;
        } else {
            res[v] = 1;
        }
    });
    return res;
}
But I don't get a result such as:
{{item_id:1, date:{01/01/2012:3, 01/15/2012:2 }},{item_id:2, date:{01/01/2012:3, 01/03/2012:2 }}}
Any ideas?
Given input documents of the form:
> db.dates.findOne()
{ "_id" : 1, "item_id" : 1, "date" : "1/15/2012" }
>
The following map and reduce functions should produce the output that you are looking for:
var map = function(){
    var myDate = this.date;
    var value = {"item_id": this.item_id, "date": {}};
    value.date[myDate] = 1;
    emit(this.item_id, value);
}
var reduce = function(key, values){
    var output = {"item_id": key, "date": {}};
    for (var v in values) {
        for (var thisDate in values[v].date) {
            if (output.date[thisDate] == null) {
                // start from the incoming count, which can exceed 1 on a re-reduce
                output.date[thisDate] = values[v].date[thisDate];
            } else {
                output.date[thisDate] += values[v].date[thisDate];
            }
        }
    }
    return output;
}
> db.runCommand({"mapReduce":"dates", map:map, reduce:reduce, out:{replace:"dates_output"}})
> db.dates_output.find()
{ "_id" : 1, "value" : { "item_id" : 1, "date" : { "1/15/2012" : 2, "1/01/2012" : 3 } } }
{ "_id" : 2, "value" : { "item_id" : 2, "date" : { "1/01/2012" : 3, "1/03/2012" : 2 } } }
Hopefully the above will do what you need it to, or at least get you pointed in the right direction.
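MongoDB may call reduce more than once for the same key, feeding earlier reduce output back in as a value, so the counts have to be added rather than reset. The following sketch exercises that property in plain JavaScript without a server. It is an illustration in the same spirit as the functions above, not the mapReduce implementation: dates mirrors the table in the question, and mapDoc, reduceCounts and emitted are hypothetical names.

```javascript
// Rows from the question's collection.
const dates = [
  { item_id: 1, date: "01/15/2012" }, { item_id: 2, date: "01/01/2012" },
  { item_id: 1, date: "01/15/2012" }, { item_id: 1, date: "01/01/2012" },
  { item_id: 2, date: "01/03/2012" }, { item_id: 2, date: "01/03/2012" },
  { item_id: 1, date: "01/01/2012" }, { item_id: 1, date: "01/01/2012" },
  { item_id: 2, date: "01/01/2012" }, { item_id: 2, date: "01/01/2012" },
];

// Map step: emit one value per document, shaped like the reduce output.
function mapDoc(doc) {
  const value = { item_id: doc.item_id, date: {} };
  value.date[doc.date] = 1;
  return [doc.item_id, value];
}

// Reduce step: add incoming counts so partial results can be merged again.
function reduceCounts(key, values) {
  const output = { item_id: key, date: {} };
  for (const v of values) {
    for (const d of Object.keys(v.date)) {
      output.date[d] = (output.date[d] || 0) + v.date[d];
    }
  }
  return output;
}

// Group the emitted values by key.
const emitted = new Map();
for (const doc of dates) {
  const [key, value] = mapDoc(doc);
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

// Reduce item 1 in one pass, then in two stages to mimic a re-reduce.
const item1 = emitted.get(1);
const onePass = reduceCounts(1, item1);
const partial = reduceCounts(1, item1.slice(0, 3));
const twoStage = reduceCounts(1, [partial, ...item1.slice(3)]);
console.log(onePass, twoStage);
```

Both results should agree; if reduce reset a date's count to 1 whenever it first saw that date, the two-stage result could undercount.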
For more information on using Map Reduce with MongoDB, please see the Mongo Documentation:
http://www.mongodb.org/display/DOCS/MapReduce
There are some additional Map Reduce examples in the MongoDB Cookbook:
http://cookbook.mongodb.org/
For a step-by-step walkthrough of how a Map Reduce operation is run, please see the "Extras" section of the MongoDB Cookbook recipe "Finding Max And Min Values with Versioned Documents" http://cookbook.mongodb.org/patterns/finding_max_and_min/
Good luck!