Elasticsearch Date Range Filter Not Working

Context
I have an index with a field called "date" which contains dates. I need an Elasticsearch query that returns records where "date" is greater than a specific date value.
Issue
Running the following query with a range filter does not work: records with earlier dates are returned in the result set.
{
  "size": 1000,
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "date": {
            "gt": "2014-02-23T00:00:00"
          }
        }
      }
    }
  }
}
Questions
- What is the correct query to pull data where date is greater than a specific value?
- If my query is syntactically correct, is there something else I can go check (e.g. whether the datatype of the field is actually date)?
- How should I go about root-causing this? Etc.

Solution
In lieu of implementing a mapping, I came up with a partial solution. I used Chrome's developer tools to analyze some of the Kibana traffic and noticed that Kibana passes date filters as int values. So I converted the dates to ints using Unix timestamp conversion, and things are working now.
(Reference: http://www.epochconverter.com/)
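For reference, a sketch of the converted query (1393113600 is 2014-02-23T00:00:00 UTC in epoch seconds; whether the field expects seconds or milliseconds depends on how it was indexed):
{
  "size": 1000,
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "date": {
            "gt": 1393113600
          }
        }
      }
    }
  }
}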
What about mapping?
I looked at the mappings earlier; on my index they don't exist. I seem to recall reading that mappings will be inferred for known types that have strong consistency.
My date data is consistent:
- no nulls
- dates are passed consistently from SQL, to C#, to Elasticsearch
I guess I could implement a mapping, but I'm going with the epoch conversion for now, until I have a truly compelling need to map this.

Your query is syntactically correct.
Use the get mapping API to see the document mapping:
curl -XGET 'http://localhost:9200/twitter/_mapping/tweet'
It's hard to say where this goes wrong, but most likely the date field is not actually mapped as a date type.
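If the mapping shows the field as a string, one option is to recreate the index with an explicit date mapping and reindex. A sketch, assuming 1.x-era Elasticsearch (to match the filtered query above); the index and type names are placeholders:
curl -XPUT 'http://localhost:9200/myindex' -d '{
  "mappings": {
    "mytype": {
      "properties": {
        "date": { "type": "date", "format": "dateOptionalTime" }
      }
    }
  }
}'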

Related

What is the alternative for es.mapping.timestamp in ElasticSearch 7.x?

The official documentation for ElasticSearch says:
The document field/property name containing the document timestamp. To specify a constant, use the <CONSTANT> format. Will not work on Elasticsearch 6.0+ index versions, but support will continue for 5.x index versions and below.
Accordingly, when I try to use it to make sure that my indices carry the timestamp separately, I get this error:
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot use timestamps on index/update requests in ES 6.x and above. Please remove the [es.mapping.timestamp] setting.
The code I tried is:
df.write.format("org.elasticsearch.spark.sql").option("es.mapping.timestamp", "timestamp").mode("overwrite").save("indexname/doc")
timestamp is a field in the Spark DataFrame. I have tried it with saveToEs as well and got the same error. Is there any way I can do this in Elasticsearch 7.x by using any other field?
I'm not too familiar with Spark, but the _timestamp meta field is deprecated, so you'll have to use a date field defined in the ES mapping. Depending on your spec, its format could be either epoch_second or epoch_millis:
PUT your_index_name
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "epoch_second"
      }
    }
  }
}
After that, I suppose you wouldn't need .option("es.mapping.timestamp", "timestamp") anymore, because the timestamp values in your docs would be parsed directly by ES as date fields, supporting their respective date queries.
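The write itself would then simply be (a sketch; the index name matches the mapping above, and elasticsearch-hadoop 7.x accepts an index name without a type):
df.write.format("org.elasticsearch.spark.sql").mode("overwrite").save("your_index_name")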

Sails js rest api date range

I have a Sails API with the created date formatted like
"createdAt": "2018-11-01T11:49:53.700Z",
I can get simple filtering on a field working, e.g.
api2/items?status=IN_PROGRESS
but I can't get the date range working. I have tried the following:
api2/items?createdAt={'>=":2018-11-01T11:49:53.700Z, '<=":2018-11-01T11:49:53.700Z}
/api2/items?where={createdAt: { '>=': 2018-11-01T11:49:53.700Z, '<=': 2018-11-01T11:49:53.700Z }}
Any ideas?
I don't believe Waterline supports this type of query on a datetime field. I would urge you to instead store these as a number (the Unix time), which will make such range queries much easier. When you want to format these items for display, you can use moment.js to help out.
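A sketch of what that could look like; the numeric createdAt field and the URL-level where criteria are assumptions, not tested against your model:
// store createdAt as a Unix ms timestamp in the model, then range-query it numerically:
// GET /api2/items?where={"createdAt":{">=":1541072993700,"<=":1541159393700}}
// (1541072993700 is 2018-11-01T11:49:53.700Z)

// format for display with moment.js:
var moment = require('moment');
moment(1541072993700).format('YYYY-MM-DD HH:mm:ss'); // renders in local time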

Is it possible to search within a MongoDB field with type of Date or Int?

I have a search field within my application that I want users to enter a search term into; it searches across various fields in each Mongo document. Naturally, I can search data within fields of type String (currently using a regular expression), but how do I do this for fields of type Date or type Int?
Note: when I say search within the field, I mean if a user types '16' into the search field, it will return dates that contain '16', e.g. 01/01/2016 or 16/03/2014. Same principle for integers.
One quick way, I think, is to use $where.
With val = "16" as the value to search for:
var val = "16";
db.foo.find({ $where: "this.dateField.toString().indexOf('" + val + "') >= 0 || ('' + this.intField).indexOf('" + val + "') >= 0" });
What you can basically try is converting the field value into a string and then searching within it. The downside is that $where doesn't use indexes; it basically scans the collection, and you cannot use other operators inside the $where expression.
Yes, it is possible.
You can tell your find() to consider only fields of a specific data type.
$type comes in handy.
Check out the following link for examples and usage:
https://docs.mongodb.com/manual/reference/operator/query/type/
An example would be
db.addressBook.find({ $and: [ { "field": "search-value" }, { "field": { $type: "double" } } ] })
which will return documents where the value matches and the field is of type double.
A screenshot (omitted here) showed the following scenario: documents with _id 2 and _id 6 are exactly the same; the only difference is the data type of the zipcode field. Running the query with $and, as above, brings back only the matching record; removing the $and and the $type condition brings back both records. I hope this helps you solve your issue.
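A runnable sketch of that scenario (collection and field names assumed); in the mongo shell, a bare number is stored as a double, while NumberInt() stores a 32-bit int:
db.addressBook.insertMany([
  { _id: 2, name: "Ann", zipcode: 12345 },           // zipcode stored as double
  { _id: 6, name: "Ann", zipcode: NumberInt(12345) } // zipcode stored as int
]);
// the value alone matches both documents (numeric equality ignores the numeric type):
db.addressBook.find({ zipcode: 12345 });
// adding the $type condition narrows it to _id 2 only:
db.addressBook.find({ $and: [ { zipcode: 12345 }, { zipcode: { $type: "double" } } ] });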

Should I use the timestamp in "_id"?

I need to monitor when records are created, for later querying and modification.
The first thing that flashed into my mind was giving the document a "createDateTime" field with a default value of new Date(). But MongoDB says the document _id has a timestamp embedded in it, generated when the document was created, so it seems redundant to add a new field for that.
Many times I've seen people set a "createDateTime" on their data, and I don't know whether they know about the details of MongoDB's _id.
Should I use the _id as a "createDateTime" field? What is the best practice, and what are the pros and cons?
Thanks for any tips.
I'd actually say it depends on how you want to use the date.
For example, the embedded timestamp is not actionable using the aggregation framework's Date operators.
This will fail for example:
db.test.aggregate( { $group : { _id: { $year: "$_id" } } })
The following error occurs:
"errmsg" : "exception: can't convert from BSON type OID to Date"
(The date cannot be extracted from the ObjectId.)
So operations that are normally simple date operations become much more complex if you want to do any sort of date math in an aggregation. It would be far easier to have a createDateTime stamp: counting the number of documents created in a particular year and month, for example, would be simple using aggregation with a distinct createDateTime field.
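For example, a minimal sketch of that count, assuming a createDateTime field stored as a BSON date:
db.test.aggregate([
  { $group: {
      _id: { year: { $year: "$createDateTime" }, month: { $month: "$createDateTime" } },
      count: { $sum: 1 }
  } }
]);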
You can sort on an ObjectId to some degree, since its first 4 bytes are a second-precision timestamp, but the remaining 8 bytes aren't sortable in a meaningful way. Also, most MongoDB drivers default to creating the ObjectId within the driver and not on the database, so if you've got multiple clients (like web servers, for example) creating new documents (and new ObjectIds), the time stamps will only be as accurate as the clocks on those various servers.
Also, depending on the precision you'd need, an ISODate value is stored using 8 bytes, rather than the 4 used in an ObjectId.
Yes, you should. There is no reason not to, besides human readability when looking directly into the database.
If you want to use the aggregation framework to group by the date within _id, this is not possible yet, as WiredPrairie correctly said; there is an open JIRA ticket for that which you might watch. But of course you can do this with map-reduce and ObjectId.getTimestamp(), as sketched below.
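A shell sketch of reading the embedded timestamp (collection name assumed); a full map-reduce grouping would build on this same call:
var doc = db.test.findOne();
doc._id.getTimestamp(); // returns an ISODate built from the _id's embedded seconds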

Rally bulk query api with date placeholder

I am using the Rally bulk query API to pull data from multiple tables. My issue happens when I try to use a placeholder for the Iteration's StartDate and pass it along to a following query in the same bulk request, i.e.:
"iteration": "/Iteration?fetch=ObjectID,StartDate&query=(Name = \"Sprint 1\")",
"started": "${iteration.StartDate}",
"other_queries": "...?query=(CreatedDate > $(iteration.StartDate))"
The bulk service seems to convert this field to a formatted string. Is there a way to prevent this from happening? I am attempting to use the placeholder to limit other queries by date without making several requests.
It looks like the iteration object comes back with the date correctly, but when it is used as a placeholder it is automatically converted to a string.
"started": ["Wed Jan 16 22:00:00 MST 2013"],
"iteration": {
"Results": [
....
"StartDate": "2013-01-17T05:00:00.000Z",
]}
Unfortunately no; as this functionality is currently implemented, this is expected behavior. The placeholder is converted to a formatted string server-side, so it will be necessary to formulate a similar follow-up request if the same data is needed in another query.
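A sketch of the two-step workaround, reusing the queries above (the second request is issued only after the first response arrives, with the returned date inlined as a literal):
// request 1: fetch the iteration's StartDate
"iteration": "/Iteration?fetch=ObjectID,StartDate&query=(Name = \"Sprint 1\")"
// request 2, in a separate bulk call:
"other_queries": "...?query=(CreatedDate > \"2013-01-17T05:00:00.000Z\")"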