How to escape some characters in postgresql - postgresql

I have this data in one column in postgresql
{
"geometry":{
"status":"Point",
"coordinates":[
-122.421583,
37.795027
]
},
and I am using this query:
select * from students where data_json LIKE '%status%' ;
The above query returns results, but this one does not:
select * from students where data_json LIKE '%status:%' ;
How can I fix that?

Of course the 2nd one doesn't find a match, there's no status: text in the value. I think you wanted:
select * from students where data_json LIKE '%"status":%'
... however, like most cases where you attempt text pattern matching on structured data this is in general a terrible idea that will bite you. Just a couple of problem examples:
{
"somekey": "the value is \"status\": true"
}
... where "status": appears as part of the text value and will match even though it shouldn't, and:
{
status : "blah"
}
where status has no quotes and there is a space between the key and the colon. As far as JavaScript is concerned this is the same as "status": but it won't match.
If you're trying to find fields within json or extract fields from json, do it with a json parser. PL/V8 may be of interest, or the json libraries available for tools like pl/perl, pl/pythonu, etc. Future PostgreSQL versions will have functions to get a json key by path, test if a json value exists, etc, but 9.2 does not.
At this point you might be thinking "why don't I use regular expressions?". Don't go there; you do not want to try to write a full JSON parser in regex. This blog entry is somewhat relevant.
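To make the difference concrete, here is a minimal Python sketch (not PostgreSQL code) that compares two text patterns with a real JSON parser. The sample documents and the has_key helper are made up for illustration; only the geometry/status shape comes from the question.

```python
import json

# Hypothetical sample values for the data_json column.
docs = [
    '{"geometry": {"status": "Point", "coordinates": [-122.421583, 37.795027]}}',
    '{"status" : "ok"}',            # valid JSON, space before the colon
    '{"note": "status pending"}',   # "status" only appears inside a value
]

def has_key(value, key):
    """Return True if `key` is an object key anywhere in the parsed JSON."""
    if isinstance(value, dict):
        return key in value or any(has_key(v, key) for v in value.values())
    if isinstance(value, list):
        return any(has_key(v, key) for v in value)
    return False

# Text matching, comparable to LIKE '%"status":%' and LIKE '%status%'.
strict_hits = ['"status":' in d for d in docs]
loose_hits = ["status" in d for d in docs]

# Structural matching on the parsed document.
parsed_hits = [has_key(json.loads(d), "status") for d in docs]
```

The strict pattern misses the second document, the loose pattern wrongly accepts the third, and only the parsed check gets all three right.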

Related

Create an operation with CEP using a custom fragment

I'm trying to write a custom CEP rule that creates an operation for an agent to gather the measurement that caused the rule to trigger. The CEP rule looks like this:
insert into CreateOperation
select
OperationStatus.PENDING as status,
"5345" as deviceId,
{
"c8y_GetData", {
"name": "get measurement data",
"measurementID": measurementEvent.measurement.id.value,
"measurementType": measurementEvent.measurement.type
}
} as fragments
from MeasurementCreated measurementEvent
where measurementEvent.measurement.type = "c8y_TemperatureMeasurement";
When I'm using simple strings for the measurementEvent.measurement... fields (e.g. "testString") the rule works. But those lines cause errors when typed like in this example. Changing the ":" to "," like in examples from the documentation makes the rule work but the result is
"name",
"get measurement data",
"measurementID",
"176438",
"measurementType",
"c8y_TemperatureMeasurement"
which doesn't work as key,value pair like "name": "get measurement data" would have.
Trying to encapsulate fragments within fragments doesn't seem to work either.
The fragments parameter in CEL is not an object. It is a flat list of key/value pairs where each key is a JSONPath, with everything separated by commas (I know it looks weird). The curly braces in Esper actually indicate an array, not an object.
Your statement should look like this:
insert into CreateOperation
select
"5345" as deviceId,
{
"c8y_GetData.name", "get measurement data",
"c8y_GetData.measurementID", measurementEvent.measurement.id.value,
"c8y_GetData.measurementType", measurementEvent.measurement.type
} as fragments
from MeasurementCreated measurementEvent
where measurementEvent.measurement.type = "c8y_TemperatureMeasurement";
Also note that I dropped the line with OperationStatus. On POST of an operation you cannot set the status. It will be automatically in status PENDING. Only on PUT you can change it. Keeping the line will result in an error on the API when trying to POST it.
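As a rough illustration of how such a flat path/value list maps onto a nested fragment, here is a toy Python model. It is not Cumulocity's or Esper's implementation, expand_fragments is a hypothetical helper, and it only handles simple dotted paths.

```python
def expand_fragments(flat):
    """Turn [path1, value1, path2, value2, ...] into a nested dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected alternating path/value entries")
    result = {}
    for path, value in zip(flat[0::2], flat[1::2]):
        node = result
        *parents, leaf = path.split(".")
        # Walk (and create) the intermediate objects named by the path.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result

fragments = expand_fragments([
    "c8y_GetData.name", "get measurement data",
    "c8y_GetData.measurementID", "176438",
    "c8y_GetData.measurementType", "c8y_TemperatureMeasurement",
])
```

Under this reading, the three dotted keys end up as one c8y_GetData object with name, measurementID and measurementType fields, which is the key/value structure the question was after.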

significance of $ and "" in mongodb

I am learning MongoDB and am getting confused about the usage of "$".
I have a collection with the schema below:
{
_id: 1,
"name": "test",
"city": "gr",
"sector": "IT",
"salary":1000
}
Running the queries below gives these results:
Query => Result
db.user.find({salary:2000}); => works
db.user.find({$salary:2000}); => does not work (unknown top level operator: $salary)
db.user.aggregate({$group:{_id:null,avg:{$avg:"$salary"}}}); => works
db.user.aggregate({$group:{_id:null,avg:{$avg:$salary}}}); => does not work ($salary is not defined)
db.user.aggregate({$group:{_id:null,avg:{$avg:"salary"}}}); => gives wrong output
Can anyone please explain the syntactical significance of "" and $ in MongoDB?
Hi, let's look at these queries:
1- db.user.find({salary:2000});
2- db.user.find({$salary:2000});
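Take a look at the documentation for find.
According to it, find takes {field: value}; your first query works because salary is a valid field.
Your second query doesn't work because a top-level key that starts with $ is read as an operator, and there is no $salary operator (hence "unknown top level operator").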
3- db.user.aggregate({$group:{_id:null,avg:{$avg:"$salary"}}});
4- db.user.aggregate({$group:{_id:null,avg:{$avg:$salary}}});
5- db.user.aggregate({$group:{_id:null,avg:{$avg:"salary"}}});
For aggregation, let's take a look at the documentation for $avg.
It says that $avg takes {$avg: expression}, so what you put there is an expression, not a field name.
Now take a look at the documentation for expressions.
Expressions can be field paths and system variables, literals, expression objects, and expression operators.
The operands in queries 3, 4 and 5 aren't expression objects or expression operators, so let's eliminate those options.
Now let's take a look at $literal.
It states that literals can be of any type, however MongoDB parses literals that start with a dollar sign as a path to a field.
Finally take a look at Field Path and System variables.
It states "To specify a field path, use a string that prefixes with a dollar sign $ ... For example, "$user" to specify the field path for the user field or "$user.name" to specify the field path to "user.name" field."
That means you are specifying "$salary" as the path to the field in $avg:"$salary", which is why query number 3 works.
Query number 4 doesn't work because an unquoted $salary is not a string at all; the shell reads it as a JavaScript variable, which is not defined.
This should explain the significance of ""
Query number 5 runs because it is a valid query, but "salary" without the $ is a plain string literal rather than a field path. $avg ignores non-numeric values, so there is nothing to average and it simply returns null.
You could have had
db.user.aggregate({$group:{_id:null,avg:{$avg:"some_non_existent_field"}}});
And the query will still run fine but you will get null for your results.
I hope this helps, this was a lot of fun to gather.
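The distinction between "$salary" and "salary" can be mimicked with a small Python toy model. This is only a sketch of the rule described above, not how MongoDB is implemented; resolve and avg are hypothetical helpers and the second user document is invented.

```python
def resolve(operand, doc):
    """Toy rule: a string starting with '$' is a field path, anything else is a literal."""
    if isinstance(operand, str) and operand.startswith("$"):
        value = doc
        for part in operand[1:].split("."):
            if not isinstance(value, dict) or part not in value:
                return None  # missing field
            value = value[part]
        return value
    return operand

def avg(operand, docs):
    """Average the numeric results; return None when there are none."""
    values = [resolve(operand, d) for d in docs]
    numbers = [v for v in values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    return sum(numbers) / len(numbers) if numbers else None

users = [
    {"name": "test", "salary": 1000},
    {"name": "other", "salary": 3000},  # invented second document
]

avg_field = avg("$salary", users)    # field path: averages the salary values
avg_literal = avg("salary", users)   # string literal: nothing numeric to average
```

Here avg_field is 2000.0 while avg_literal is None, which mirrors queries 3 and 5.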

Mongo query with regex fails when backslash\newline is there in a field

Hi, I have a field in a user collection called "address". Users enter their address in a textarea in my application, and MongoDB stores the line breaks as \r\n, like the following:
{
"_id": ObjectId("56a9ba0ffbe0856d4f8b456d"),
"address": "HOUSE NO. 3157,\r\nSECTOR 50-D",
"pincode": "",
},
{
"_id": ObjectId("56a9ba0ffbe0856d4f8b456d"),
"address": "HOUSE NO. 3257,\r\nSECTOR 50-C",
"pincode": "",
}
Now, when I run a search query on "address" like the following:
$guardianAdd = $dm->getRepository('EduStudentBundle:GuardianAddress')->findBy(array(
'address' => new \MongoRegex('/.*' .$data['address'] . '.*/i'),
'isDelete' => false
));
echo count($guardianAdd);die;
it does not give any result. My search keyword is: "HOUSE NO.3157 SECTOR 50-D".
However, if I search for HOUSE NO. 3157, it gives the correct result.
Please advise how to fix this. Thanks in advance.
First of all, the leading and trailing .* are redundant. The regexps /.*aaa.*/ and /aaa/ match the same strings.
Second, you probably need to use multiline modifier /pattern/im
Finally, it is not quite clear what you want to fix. The best thing you can do is to provide some basic explanation of regex syntax in the search form, so users can search properly, e.g. HOUSE NO.*3157.*SECTOR 50-D to get the best results.
You can make some bold assumptions and build the pattern with something like
$pattern = implode('\W+', preg_split('/\W+/', $data['address']));
which will give you a regexp HOUSE\W+NO\W+3157\W+SECTOR\W+50\W+D for different kinds of HOUSE NO.3157 SECTOR 50-D requests, but it will cut all the regex flexibility available with bare input, and will eventually produce unexpected responses anyway. You can follow this slippery slope and end up with your own query DSL that compiles to regex, but I doubt it can be any better or more convenient than pure regex. It will be more error prone for sure.
Asking the right question to get the right answer matters not only on SO, but also in your application. Unfortunately there is no general solution for finding what people have in mind but fail to ask for. I believe that in your particular case the best code is no code.
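For reference, the \W+ joining idea from the answer can be tried out in Python with the standard re module. This is a sketch of the same technique, not the PHP/Doctrine code from the question; build_pattern is a hypothetical helper, and dropping empty parts and escaping each word are additions of this sketch.

```python
import re

def build_pattern(user_input):
    """Join the word-like parts of the input with \\W+ so punctuation and line breaks are ignored."""
    parts = [p for p in re.split(r"\W+", user_input) if p]
    return re.compile(r"\W+".join(re.escape(p) for p in parts), re.IGNORECASE)

pattern = build_pattern("HOUSE NO.3157 SECTOR 50-D")

# The two stored addresses from the question.
stored = [
    "HOUSE NO. 3157,\r\nSECTOR 50-D",
    "HOUSE NO. 3257,\r\nSECTOR 50-C",
]
matches = [s for s in stored if pattern.search(s)]
```

The generated pattern is HOUSE\W+NO\W+3157\W+SECTOR\W+50\W+D, and it matches only the first stored address even though the comma and \r\n are absent from the search text. The trade-off named in the answer still applies: users can no longer type their own regex syntax.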

MongoDB - Using regex wildcards for search that properly filter results

I have a Mongo search set up that goes through my entries based on numerous criteria.
Currently the easiest way (I know it's not performance-friendly due to using wildcards, but I can't figure out a better way to do this due to case insensitivity and users not putting in whole words) is to use regex wildcards in the search. The search ends up looking like this:
{ gender: /Womens/i, designer: /Voodoo Girl/i } // Should return ~200 results
{ gender: /Mens/i, designer: /Voodoo Girl/i } // Should return 0 results
In the example above, both searches are returning ~200 results ("Voodoo Girl" is a womenswear label and all corresponding entries have a gender: "Womens" field.). Bizarrely, when I do other searches, like:
{ designer: /Voodoo Girl/i, store: /Store XYZ/i } // should return 0 results
I get the correct number of results (0). Is this an order thing? How can I ensure that my search only returns results that match all of my wildcarded queries?
For reference, the queries are being made in nodeJS through a simple db.products.find({criteria}) lookup.
To answer the aside real fast, something like ElasticSearch is a wonderful way to get more powerful, performant searching capabilities in your app.
Now, the reason that your searches are returning results is that "mens" is a substring of "womens"! You probably want either /^Mens/i and /^Womens/i (if Mens starts the gender field), or /\bMens\b/ if it can appear in the middle of the field. The first form will only match the given field from the beginning of the string, while the second form looks for the given word surrounded by word boundaries (that is, not as a substring of another word).
If you can use the /^Mens/ form (note the lack of the /i), it's advisable, as anchored case-sensitive regex queries can use indexes, while other regex forms cannot.
$regex can only use an index efficiently when the regular expression has an anchor for the beginning (i.e. ^) of a string and is a case-sensitive match.
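The substring effect and the two suggested fixes are easy to reproduce with Python's re module, whose ^ and \b behave the same way for this case. The gender values below are invented sample data.

```python
import re

genders = ["Womens", "Mens", "Unisex / Mens"]

# /Mens/i : plain substring match, so "Womens" matches too.
unanchored = [g for g in genders if re.search(r"Mens", g, re.IGNORECASE)]

# /^Mens/i : the field has to start with "Mens".
anchored = [g for g in genders if re.search(r"^Mens", g, re.IGNORECASE)]

# /\bMens\b/i : "Mens" as a whole word anywhere in the field.
bounded = [g for g in genders if re.search(r"\bMens\b", g, re.IGNORECASE)]
```

The unanchored pattern matches all three values, the anchored one matches only "Mens", and the word-boundary one matches "Mens" and "Unisex / Mens".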

whoosh doesn't search for short words like "C#"

I am using Whoosh to index over 200,000 books, but I have encountered some problems with it.
The Whoosh query parser returns NullQuery for words like "C#" and "C++" that contain meta-characters, and also for some other short words. These words are used in the title and body of some documents, so I am not using the keyword type for them. I guess the problem is in the analysis or query-parsing phase of searching or indexing, but I can't touch my data blindly. Can anyone help me correct this issue? Thanks.
I fixed the problem by creating a StandardAnalyzer with a regex pattern that meets my requirements. Here is the regex pattern:
'\w+[#+.\w]*'
This makes the fields tokenize successfully, and searching also works well.
But when I use queries like "some query++*" or "some##*", the parsed query is a single Every query, just the '*'. I also found that this is not related to my analyzer and is Whoosh's default behavior. So here is my new question: is this behavior correct, or is it a bug?
Note: removing the WildcardPlugin from the query parser solves this problem, but I also need the WildcardPlugin.
Now I am using the following code:
from whoosh import analysis, fields
from whoosh.util import rcompile
# for matching words like '.NET', 'C++' and 'C#'
word_pattern = rcompile(r'(\.|[\w]+)(\.?\w+|#|\+\+)*')
# I don't need words shorter than two characters, so I keep the default minsize
analyzer = analysis.StandardAnalyzer(expression=word_pattern)
... now in my schema:
...
title = fields.TEXT(analyzer=analyzer),
...
This solves my first problem, yes, but the main problem is in searching. I don't want to let users search with the Every query or *. But when I parse queries like C++*, I end up with an Every(*) query. I know there is some problem, but I can't figure out what it is.
I had the same issue and found out that StandardAnalyzer() uses minsize=2 by default. So in your schema, you have to tell it otherwise.
import whoosh.analysis
import whoosh.fields

schema = whoosh.fields.Schema(
    name=whoosh.fields.TEXT(stored=True, analyzer=whoosh.analysis.StandardAnalyzer(minsize=1)),
    # ...
)
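To see what the custom pattern and the minsize setting do to tokens, without needing an index, here is a small sketch using only Python's re module. It imitates the tokenizing and minimum-length steps; it is not Whoosh code, tokenize is a hypothetical helper, and lowercasing and stop-word removal are left out.

```python
import re

# Same expression as in the question's analyzer.
word_pattern = re.compile(r"(\.|[\w]+)(\.?\w+|#|\+\+)*")

def tokenize(text, minsize=2):
    """Split text with word_pattern and drop tokens shorter than minsize."""
    tokens = [m.group(0) for m in word_pattern.finditer(text)]
    return [t for t in tokens if len(t) >= minsize]

text = "C# and C++ on .NET, plus C"
default_tokens = tokenize(text)               # minsize=2
short_tokens = tokenize(text, minsize=1)      # minsize=1 keeps one-letter words
```

With the default minimum length the trailing "C" is dropped while "C#", "C++" and ".NET" survive; with minsize=1 the single letter is kept as well.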