Significance of $ and "" in MongoDB

I am learning MongoDB and am confused about the usage of "$".
I have a collection with the schema below:
{
_id: 1,
"name": "test",
"city": "gr",
"sector": "IT",
"salary":1000
}
I get the results below when executing these queries:
Query | Result
db.user.find({salary:2000}); | Works
db.user.find({$salary:2000}); | Does not work (unknown top level operator: $salary)
db.user.aggregate({$group:{_id:null,avg:{$avg:"$salary"}}}); | Works
db.user.aggregate({$group:{_id:null,avg:{$avg:$salary}}}); | Does not work ($salary is not defined)
db.user.aggregate({$group:{_id:null,avg:{$avg:"salary"}}}); | Gives wrong output
Can anyone please explain the syntactic significance of "" and $ in MongoDB?

Hi, let's look at these queries:
1- db.user.find({salary:2000});
2- db.user.find({$salary:2000});
Take a look at the documentation for find.
According to it, find takes {field: value}; your first query works because salary is a valid field.
Your second query doesn't work because a top-level key that starts with $ is treated as an operator, and there is no operator named $salary (hence "unknown top level operator: $salary").
3- db.user.aggregate({$group:{_id:null,avg:{$avg:"$salary"}}});
4- db.user.aggregate({$group:{_id:null,avg:{$avg:$salary}}});
5- db.user.aggregate({$group:{_id:null,avg:{$avg:"salary"}}});
For aggregation, let's take a look at $avg.
It says that $avg takes {$avg: <expression>}. So what you put there is an expression, not a field name.
Now take a look at what an expression can be.
Expressions can be field paths and system variables, literals, expression objects, and expression operators.
The arguments in queries 3, 4 and 5 aren't expression objects or expression operators, so let's eliminate those options.
Now let's take a look at $literal.
It states that literals can be of any type; however, MongoDB parses string literals that start with a dollar sign as a path to a field.
Finally, take a look at Field Path and System Variables.
It states "To specify a field path, use a string that prefixes with a dollar sign $ ... For example, "$user" to specify the field path for the user field or "$user.name" to specify the field path to "user.name" field."
That means $avg:"$salary" specifies $salary as the path to the salary field, so query number 3 works.
Query number 4 doesn't work because an unquoted $salary is not a string at all: the shell evaluates it as a JavaScript variable, which is not defined, so the query fails before it is ever sent to the server.
This should explain the significance of "".
Query number 5 doesn't give the result you expect because "salary" without the dollar sign is a literal string, not a field path, so $avg finds nothing numeric to average. It is still a valid query, so it runs; it simply returns null.
You could have had
db.user.aggregate({$group:{_id:null,avg:{$avg:"some_non_existent_field"}}});
And the query will still run fine but you will get null for your results.
I hope this helps, this was a lot of fun to gather.
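The difference between queries 3 and 5 can be sketched in plain JavaScript. This is only a toy model of how $avg treats its operand, not MongoDB's implementation; resolveOperand and avgOf are hypothetical helper names, and the sample documents are made up for the illustration.

```javascript
// Toy model (not MongoDB code): how an $avg operand string is interpreted.
// A string starting with "$" is a field path; any other string is a literal.
function resolveOperand(doc, operand) {
  if (typeof operand === "string" && operand.startsWith("$")) {
    // Follow the dotted path, e.g. "$user.name" -> doc.user.name
    return operand
      .slice(1)
      .split(".")
      .reduce((value, key) => (value == null ? undefined : value[key]), doc);
  }
  return operand; // literal value, returned unchanged
}

// Average the resolved operand over all documents, ignoring non-numeric
// values; returns null when nothing numeric was found.
function avgOf(docs, operand) {
  const numbers = docs
    .map((doc) => resolveOperand(doc, operand))
    .filter((value) => typeof value === "number");
  if (numbers.length === 0) return null;
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}

// Made-up sample data
const users = [
  { _id: 1, name: "test", salary: 1000 },
  { _id: 2, name: "other", salary: 2000 },
];

console.log(avgOf(users, "$salary")); // 1500
console.log(avgOf(users, "salary")); // null
```

With "$salary" the helper reads the field from each document; with "salary" every document yields the same literal string, so there is nothing numeric to average.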

Related

MongoDB - find if a field value is contained in a given string

Is it possible to query documents where a specific field is contained in a given string?
for example if I have these documents:
{a: 'test blabla'},
{a: 'test not'}
I would like to find all documents that field a is fully included in the string "test blabla test", so only the first document would be returned.
I know I can do it with aggregation using $indexOfCP, and it is also possible with $where and mapReduce. I was wondering if it's possible to do it in a find query using the standard MongoDB operators (e.g., $gt, $in).
thanks.
I can think of 2 ways you could do this:
Option 1
Using $where:
db.someCol.find( { $where: function() { return "test blabla test".indexOf(this.a) > -1; } } )
Explained: Find all documents whose value of field "a" is found WITHIN some given string.
This approach is universal, as you can run any code you like, but less recommended from a performance perspective. For instance, it cannot take advantage of indexes. Read full $where considerations here: https://docs.mongodb.com/manual/reference/operator/query/where/#considerations
Option 2
Using regex matching trickery, ONLY under certain circumstances; below is an example that only works with matching that the field value is found as a starting substring of the given string:
db.someCol.find( { a : /^(t(e(s(t( (b(l(a(b(l(a( (t(e(s(t)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?$/ } )
Explained: Break up the components of your "should-be-contained-within" string and match against all sub-possibilities of that with regex.
For your case, this option is pretty much insane, but it's worth noting as there may be specific cases (such as limited namespace matching) where you would not have to break up each letter, but only some very finite set of predetermined parts. In that case, this option could still make use of indexes and not suffer the $where performance penalties (as long as the complexity of the regex doesn't outweigh that benefit :)
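If you did want to try Option 2, the nested pattern does not have to be typed by hand. Below is a minimal sketch that generates it from the target string; buildPrefixRegex and escapeRegExp are hypothetical helper names, not part of any MongoDB API. Like the hand-written pattern, it only matches values that are a leading substring of the target (and it also matches the empty string).

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build ^(c1(c2(c3...)?)?)?$ from the characters of `target`, so the
// regex matches every leading substring of `target` and nothing else.
function buildPrefixRegex(target) {
  let pattern = "";
  // Wrap from the last character outwards to get the nested optional groups.
  for (const ch of Array.from(target).reverse()) {
    pattern = "(" + escapeRegExp(ch) + pattern + ")?";
  }
  return new RegExp("^" + pattern + "$");
}

const re = buildPrefixRegex("test blabla test");
console.log(re.test("test blabla")); // true
console.log(re.test("test not")); // false
```

The resulting RegExp could then be passed as the value for the field in the find query, in the same position as the literal regex in the query above.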
You can use a regex to search.
db.company.findOne({"companyname" : {$regex : ".*hello.*"}});
If you are using Mongo v3.6+, you can use $expr.
As you mentioned, $indexOfCP can be used to get the index; here it would be:
{
"$expr": {
$ne: [{$indexOfCP: ["test blabla test", "$a"]}, -1]
}
}
The field name should be prefixed with a dollar sign ($), as $expr allows filters from aggregation pipeline.
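As a small sketch, the filter document can be built by a helper so the search string and field name are not hard-coded. containedInFilter is a hypothetical name used only for this illustration; the operators inside are the $expr, $ne and $indexOfCP operators mentioned above. How documents with a missing or non-string field behave is not covered here.

```javascript
// Build a find() filter that keeps documents whose `fieldName` value
// occurs somewhere inside `haystack`.
function containedInFilter(haystack, fieldName) {
  return {
    $expr: {
      // $indexOfCP returns -1 when the field value is not found in haystack.
      // "$" + fieldName turns the plain name into a field path.
      $ne: [{ $indexOfCP: [haystack, "$" + fieldName] }, -1],
    },
  };
}

const filter = containedInFilter("test blabla test", "a");
console.log(JSON.stringify(filter));
```

The object would be passed straight to find, for example db.someCol.find(filter). Note that a haystack which itself starts with a dollar sign would be read as a field path, so it would need to be wrapped in $literal.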

Find all Mongo document with array containing all search terms

I have set of documents that contain an array of search terms, e.g.
[ "apples", "oranges", "bananas" ]
The user will enter a search string of keyword prefixes, and I'd like to match all the documents that contain each term in the array. So, for example, "app oranges" will match the list above, but "applet oranges" wouldn't.
It would be fairly trivial to construct a $and query that checked that each term matched one of the items in the array as a prefix using $regex, however that doesn't go far enough...
Each keyword should have a unique match within the set, such that searching "apples app" will not match the list above because the "app" term can't match against "apples" since "apples" has already been matched. This constraint leads to a more subtle problem. Take this set as an example:
[ "france", "fred", "freddy" ]
If the user types "fr france" then this should match. It's important that the match for "fr" doesn't remove "france" from the possible list of terms for the remaining keywords, otherwise the test for the term "france" that follows would fail.
I need to implement this as a Mongo query. I'm quite new to Mongo and I haven't a clue where to start, or even if this is possible. Can it be done? If so, how?
To start with, you can use the $regex operator to match text patterns:
var searchTerms = "app oranges".split(" ");
var arr = [];
searchTerms.forEach(function(i){
var reg = new RegExp("^"+i);
arr.push({"names":{$regex:reg}});
})
db.collection.find({$and:arr});
This would give you the documents whose names array contains a value starting with app and a value starting with oranges.
"Each keyword should have a unique match within the set, such that searching "apples app" will not match the list above because the "app" term can't match against "apples" since "apples" has already been matched. This constraint leads to a more subtle problem. Take this set as an example:"
This logic should be carried out in the application server before or after firing the query. If the user enters a string that is a substring of an earlier input, then the match is bound to fail, since the only matching term would already have been taken by the former.
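One way to do that application-side check is a small backtracking search over the candidate documents returned by the $and query. This is a sketch only; matchesAllUniquely is a hypothetical helper, and it assumes each keyword must be a prefix of a distinct term.

```javascript
// Return true if every keyword can be assigned to a different term that
// starts with it. Backtracking lets "fr" give up "france" when a later
// keyword needs it.
function matchesAllUniquely(terms, keywords) {
  const used = new Array(terms.length).fill(false);

  function assign(k) {
    if (k === keywords.length) return true; // every keyword has a term
    for (let i = 0; i < terms.length; i++) {
      if (!used[i] && terms[i].startsWith(keywords[k])) {
        used[i] = true;
        if (assign(k + 1)) return true;
        used[i] = false; // undo and try the next candidate term
      }
    }
    return false;
  }

  return assign(0);
}

const fruit = ["apples", "oranges", "bananas"];
console.log(matchesAllUniquely(fruit, "app oranges".split(" "))); // true
console.log(matchesAllUniquely(fruit, "apples app".split(" "))); // false
console.log(matchesAllUniquely(["france", "fred", "freddy"], "fr france".split(" "))); // true
```

The search is exponential in the worst case, which should be acceptable for the handful of keywords a user types; for large inputs a bipartite matching algorithm would be the more robust choice.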

Entity cannot be found by elasticsearch

I have the following entity in ElasticSearch:
{
"id": 123,
"entity-id": 1019,
"entity-name": "aaa",
"status": "New",
"creation-date": "2014-08-06",
"author": "bubu"
}
I try to query for all entities with status=New, so the above entity should appear there.
I run this code:
qResponse.setQuery(QueryBuilders.termQuery("status", "New"));
return qResponse.setFrom(start).setSize(size).execute().actionGet().toString();
But it returns no result.
If I use this code (a general search, not on a specific field), I get the above entity.
qResponse.setQuery(QueryBuilders.queryString("New"));
return qResponse.setFrom(start).setSize(size).execute().actionGet().toString();
Why?
The problem is a mismatch between a Term Query and using the Standard Analyzer when you index. The Standard Analyzer, among other things, lowercases the field when it's indexed:
Standard Analyzer
An analyzer of type standard is built using the Standard Tokenizer
with the Standard Token Filter, Lower Case Token Filter, and Stop
Token Filter.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html
The Term query, however, matches without analysis:
Term Query
Matches documents that have fields that contain a term (not analyzed).
The term query maps to Lucene TermQuery.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
So in your case, when you index the field status, it becomes "new". But when you search with a Term Query, it's looking for "New", and they don't match. The general search works because it also uses the Standard Analyzer.
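The mismatch can be illustrated with a toy inverted index in plain JavaScript. This is not Elasticsearch code and the analyzer is heavily simplified (lowercasing and splitting on non-alphanumerics only); analyze, indexField, termQuery and analyzedQuery are hypothetical names for the illustration.

```javascript
// Simplified stand-in for the Standard Analyzer: lowercase, then split.
function analyze(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Index time: the field value is analyzed and the tokens are stored.
function indexField(value) {
  return new Set(analyze(value));
}

// Term query: looks up the search term exactly as given, no analysis.
function termQuery(indexedTerms, term) {
  return indexedTerms.has(term);
}

// Analyzed query: the search text goes through the same analyzer first.
function analyzedQuery(indexedTerms, text) {
  return analyze(text).every((token) => indexedTerms.has(token));
}

const statusIndex = indexField("New"); // stores "new"
console.log(termQuery(statusIndex, "New")); // false
console.log(termQuery(statusIndex, "new")); // true
console.log(analyzedQuery(statusIndex, "New")); // true
```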
The default value of index for a string field is analyzed. So, when you write "status": "New", it will use the standard analyzer, and after analysis the value is indexed as "new".
So the term query doesn't match. If you wish to query the way you specified, write a mapping for the field with index set to "not_analyzed".
For more info. link

Mongodb $strcasecmp. Strange behaviour when the field content has dollar signs

I'm trying to compare two strings in the MongoDB Aggregation Framework. This is the query I'm using:
db.people.aggregate({
$project:{
name:1,
balance:1,
compareBalance:{$strcasecmp:["$balance","$2,500.00"]}
}
});
My problem is that each "balance" field has a dollar sign at the beginning of the string, and the results returned by the query seem to be incorrect. For example:
{
"_id" : ObjectId("5257e2e7834a87e7ea509665"),
"balance" : "$1,416.00",
"name" : "Houston Monroe",
"compareBalance" : 1
}
As you can see in the results, the field comparison is 1, but it should be -1 because $2,500.00 is higher than $1,416.00. In fact, all comparisons have a value of 1.
There is a workaround by using $substr to remove the dollar sign at the beginning of all fields, but I want to know who is doing this wrong, MongoDB or me.
Thanks in advance.
It sounds like you are trying to use the "balance" field as a number; for example, you might want to compare $10 to $100.
The best way to do this is to store the actual value, and add the formatting (the $, the comma, etc.) when displaying it to the user.
So, you would have - balance: 2500
Slightly unrelated...
Not sure if you are doing much calculation on the value, but using binary floating point numbers for currency is a bad idea (can't accurately represent all numbers), so, it's often better to store an integer with the cents (or if high precision is required, an integer for hundredths of cents)
This could give: balanceCents: 250000 or balanceFourDec: 25000000
Then you can use $gt $lt and arithmetic
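As a sketch of that conversion, the formatted strings could be turned into integer cents before they are stored. parseCurrencyToCents is a hypothetical helper, and it assumes non-negative, well-formed amounts such as "$2,500.00" with at most two decimal places that matter.

```javascript
// Convert a formatted amount like "$2,500.00" to an integer number of
// cents without going through floating point.
function parseCurrencyToCents(text) {
  const cleaned = text.replace(/[^0-9.]/g, ""); // "$2,500.00" -> "2500.00"
  const [whole, fraction = ""] = cleaned.split(".");
  const cents = (fraction + "00").slice(0, 2); // pad or cut to two digits
  return parseInt(whole, 10) * 100 + parseInt(cents, 10);
}

console.log(parseCurrencyToCents("$2,500.00")); // 250000
console.log(parseCurrencyToCents("$1,416.00")); // 141600

// Comparing the strings goes character by character, so "9" beats "2":
console.log("$900.00" > "$2,500.00"); // true
// Comparing the integer cents gives the numeric answer:
console.log(parseCurrencyToCents("$900.00") > parseCurrencyToCents("$2,500.00")); // false
```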
The $ is used as a field reference operator. So, the aggregation pipeline is trying to do a comparison between the field "balance" and a field named by the path "2,500.00", which does not exist in the document:
{
"balance": "$5,000.00",
"2,500.00": undefined
}
Of course, that's not what you are looking for.
You shouldn't start with the $ in the data. Also, unless you've got fixed-length strings, sorting and comparisons aren't going to work the way you would expect if you're trying to store numbers as strings. If you're just doing this as an example, I'd suggest you use the actual math operators for numbers, and leave $strcasecmp to actual strings.
You can use the { $literal: <value> } expression operator so that "$2,500.00" is treated as a literal string rather than as a field path.
https://docs.mongodb.com/manual/reference/operator/aggregation/literal/
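Applied to the pipeline from the question, that could look like the sketch below. The stage is written as a plain JavaScript object so it can be inspected before being passed to aggregate; the collection and field names are the ones from the question.

```javascript
// $literal keeps "$2,500.00" from being parsed as a field path.
const pipeline = [
  {
    $project: {
      name: 1,
      balance: 1,
      compareBalance: {
        $strcasecmp: ["$balance", { $literal: "$2,500.00" }],
      },
    },
  },
];

console.log(JSON.stringify(pipeline, null, 2));
```

It would be run as db.people.aggregate(pipeline). Keep in mind that this is still a string comparison, so it only orders amounts correctly when the strings have the same length and format.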

What does a dollar sign mean in mongodb in terms of groups?

According to the docs, a "$" is reserved for operators. If you look at the group operator however, values need to have a dollar prefixed. These values are not operators. What does it mean in this context then? Example below:
db.article.aggregate(
{ $group : {
_id : "$author",
docsPerAuthor : { $sum : 1 },
viewsPerAuthor : { $sum : "$pageViews" }
}}
);
Why does pageViews need a leading dollar sign? I've tried it locally and it doesn't work without the dollar sign.
In this case "$string" means you want to use the value of the key named "string" in the processed document. Contrast with "string" which would be a literal string.
$<field> is short for $$CURRENT.<field>:
"$<field>" is equivalent to "$$CURRENT.<field>" where the CURRENT is a system variable that defaults to the root of the current object in the most stages, unless stated otherwise in specific stages. CURRENT can be rebound.
And, "Unless documented otherwise, all stages start with CURRENT the same as ROOT."
Finally:
"ROOT: References the root document, i.e. the top-level document, currently being processed in the aggregation pipeline stage." Reference: System Variables
I.e. ROOT, and therefore CURRENT, is the document being grouped, and $<field> accesses a property of CURRENT.
Note:
CURRENT is modifiable. However, since $<field> is equivalent to $$CURRENT.<field>, rebinding CURRENT changes the meaning of $ accesses.
Reference: System Variables
You use the $<field-name> format when you want to reference a field from the original or an intermediate document. Here you are summing all the page views, grouped by author.
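To make the field references concrete, here is a plain JavaScript sketch of what the $group stage from the question computes. It is only an illustration of the semantics, not how MongoDB executes the stage; groupByAuthor is a hypothetical function and the sample articles are made up.

```javascript
// Plain-JS equivalent of:
// { $group: { _id: "$author", docsPerAuthor: { $sum: 1 },
//             viewsPerAuthor: { $sum: "$pageViews" } } }
function groupByAuthor(articles) {
  const groups = new Map();
  for (const article of articles) {
    const key = article.author; // "$author": read the field from the document
    const group =
      groups.get(key) || { _id: key, docsPerAuthor: 0, viewsPerAuthor: 0 };
    group.docsPerAuthor += 1; // $sum: 1 adds the literal 1 per document
    group.viewsPerAuthor += article.pageViews; // $sum: "$pageViews" adds the field value
    groups.set(key, group);
  }
  return Array.from(groups.values());
}

// Made-up sample data
const articles = [
  { author: "bob", pageViews: 5 },
  { author: "dave", pageViews: 7 },
  { author: "bob", pageViews: 10 },
];

console.log(groupByAuthor(articles));
```

Without the dollar sign, "pageViews" would be the same literal string for every document rather than each document's own value, which is why the sum does not work.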