Reformat timestamp from Unix to ISO - unix-timestamp

I've got JSON input data where I need to reformat the timestamp from Unix time to ISO 8601 (to process the file afterwards).
I tried to do this by using:
<input.json jq .[2].timestamp |= jq todate >output.json
This reformats the timestamp in the right way, but how do I get the reformatted timestamp back into the original file? I aim to get the original file with all its information, but with the reformatted timestamp.
It works the way I'd like at https://jqplay.org/ , but not on the command line.
I appreciate your help!
Sample Input:
[
  {
    "channelId": 9088,
    "errorCode": 0,
    "value": 0,
    "timestamp": 1460258309
  },
  {
    "channelId": 10087,
    "errorCode": 0,
    "value": 1000,
    "timestamp": 1460258294
  },
  {
    "channelId": 10086,
    "errorCode": 0,
    "value": 90,
    "timestamp": 1460258294
  },
  {
    "errorCode": 0,
    "errorLine": ""
  }
]
Wanted output:
[
  {
    "channelId": 9088,
    "errorCode": 0,
    "value": 0,
    "timestamp": "2016-04-10T03:18:29Z"
  },
  {
    "channelId": 10087,
    "errorCode": 0,
    "value": 1000,
    "timestamp": "2016-04-10T03:18:14Z"
  },
  {
    "channelId": 10086,
    "errorCode": 0,
    "value": 90,
    "timestamp": "2016-04-10T03:18:14Z"
  },
  {
    "errorCode": 0,
    "errorLine": ""
  }
]

With your input:
<input.json jq 'map(if .timestamp then .timestamp |= todate else . end)'
the output is:
[
  {
    "channelId": 9088,
    "errorCode": 0,
    "value": 0,
    "timestamp": "2016-04-10T03:18:29Z"
  },
  {
    "channelId": 10087,
    "errorCode": 0,
    "value": 1000,
    "timestamp": "2016-04-10T03:18:14Z"
  },
  {
    "channelId": 10086,
    "errorCode": 0,
    "value": 90,
    "timestamp": "2016-04-10T03:18:14Z"
  },
  {
    "errorCode": 0,
    "errorLine": ""
  }
]
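That covers the conversion; to get the result back into the original file, note that jq cannot edit a file in place, so write to a temporary file and then replace the original. A minimal sketch:

jq 'map(if .timestamp then .timestamp |= todate else . end)' input.json > output.json \
  && mv output.json input.json
# or, if you have moreutils installed:
# jq 'map(if .timestamp then .timestamp |= todate else . end)' input.json | sponge input.json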

If you want every occurrence of "timestamp" converted (no matter where it occurs), and your jq has walk/1, you can use a filter along the lines of the following:
jq -n '[{timestamp: (24*60*60)}] | walk(if type == "object" and .timestamp then .timestamp |= todate else . end)'
[
  {
    "timestamp": "1970-01-02T00:00:00Z"
  }
]
If your jq does not have walk/1, then you can copy its definition from https://github.com/stedolan/jq/blob/master/src/builtin.jq
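For convenience, that definition is short enough to prepend to your own filter (copied from the linked file; verify against your jq version):

def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + {($key): ($in[$key] | walk(f))} ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;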

Related

Kafka connector: apply Transform.filter.Value

I have created a connector to Azure Event Hub; it works fine to pull the data into the topic.
My use case is to filter the messages that I'm pulling based on the message type.
Example:
{
  "messageType": "Transformed",
  "timeStamp": 1652113146105,
  "payload": {
    "externalId": "24323",
    "equipmentType": "TemperatureSensor",
    "measureType": "Temperature",
    "normalizedData": {
      "equipmentData": [
        {
          "key": "ReadingValue",
          "value": 23,
          "valueType": "number",
          "measurementUnit": "celsius",
          "measurementDateTime": "2022-05-09T16:18:34.0000000Z"
        }
      ]
    },
    "dataProviderName": "LineMetrics"
  }
},
{
  "messageType": "IntegratorGenericDataStream",
  "timeStamp": 1652113146103,
  "payload": {
    "dataSource": {
      "type": "sensor"
    },
    "dataPoints": [
      {
        "type": "Motion",
        "value": 0,
        "valueType": "number",
        "dateTime": "2022-05-09T16:18:37.0000000Z",
        "unit": "count"
      }
    ],
    "dataProvider": {
      "id": "ba84cbdb-cbf8-4d4f-9a55-93b43f671b5a",
      "name": "LineMetrics",
      "displayName": "Line Metrics"
    }
  }
}
I wanted to apply a filter on that value in the connector configuration (the screenshot of the transform settings is not reproduced here), but doing so produces an error (that screenshot is also not reproduced here). A sketch of such a filter configuration follows.
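A message-type filter like this is usually expressed as a Filter SMT in the connector config. A minimal sketch, assuming Confluent's Filter (Value) transform and its JsonPath-style filter.condition property; the alias keepTransformed is a placeholder, and the property names should be verified against your Connect installation:

{
  "transforms": "keepTransformed",
  "transforms.keepTransformed.type": "io.confluent.connect.transforms.Filter$Value",
  "transforms.keepTransformed.filter.condition": "$[?(@.messageType == 'Transformed')]",
  "transforms.keepTransformed.filter.type": "include",
  "transforms.keepTransformed.missing.or.null.behavior": "exclude"
}

With filter.type set to include, only records whose value satisfies the condition pass through; exclude would drop them instead.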

Increase Precision of Text Parsed via Facebook Duckling?

I'm using Facebook Duckling to parse some text but the results include some odd dimensions:
String text = "Tomorrow, February 28";
String result = Duckling.parseText(text);
Result:
[
  {
    "body": "Tomorrow, February 28",
    "start": 0,
    "value": {
      "values": [
        {
          "value": "2022-02-28T00:00:00.000-08:00",
          "grain": "day",
          "type": "value"
        }
      ],
      "value": "2022-02-28T00:00:00.000-08:00",
      "grain": "day",
      "type": "value"
    },
    "end": 21,
    "dim": "time",
    "latent": false
  },
  {
    "body": "28'",
    "start": 19,
    "value": {
      "value": 28,
      "type": "value",
      "minute": 28,
      "unit": "minute",
      "normalized": {
        "value": 1680,
        "unit": "second"
      }
    },
    "end": 22,
    "dim": "duration",
    "latent": false
  },
  {
    "body": "28'",
    "start": 19,
    "value": {
      "value": 28,
      "type": "value",
      "unit": "foot"
    },
    "end": 22,
    "dim": "distance",
    "latent": false
  }
]
This result is odd: given the context of the query, the text "28" clearly refers to the day of the month, yet Duckling also returns results as if it referred to the duration and distance dimensions.
Is there a way to make Duckling context-aware so that it only returns results matching the full query? Passing "dimensions" as an argument is not ideal, since I don't know the dimensions in advance.
Thanks
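One pragmatic workaround is to post-filter Duckling's JSON output rather than restrict its dimensions: among overlapping parses, keep only the one covering the widest span of the input, which here drops the duration and distance readings. A minimal sketch in jq, assuming the result above is saved as duckling-result.json (a hypothetical filename):

# Widest span wins: compute max(end - start), then keep only parses of that width.
jq '(map(.end - .start) | max) as $w
  | map(select(.end - .start == $w))' duckling-result.json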

Elasticsearch - query dates without a specified timezone

I have an index with the following mappings (the standard format for a date). In the second record below, the time specified is actually a local time, but ES treats it as UTC.
ES internally converts all parsed datetimes to UTC, but it obviously must store the original string as well.
My question is whether (and how) it is possible to query all records for which the scheduleDT value doesn't have the timezone explicitly specified.
{
  "curator_v3": {
    "mappings": {
      "published": {
        "analyzer": "classic",
        "numeric_detection": true,
        "properties": {
          "Id": {
            "type": "string",
            "index": "not_analyzed",
            "include_in_all": false
          },
          "createDT": {
            "type": "date",
            "format": "dateOptionalTime",
            "include_in_all": false
          },
          "scheduleDT": {
            "type": "date",
            "format": "dateOptionalTime",
            "include_in_all": false
          },
          "title": {
            "type": "string",
            "fields": {
              "english": {
                "type": "string",
                "analyzer": "english"
              },
              "raw": {
                "type": "string",
                "index": "not_analyzed"
              },
              "shingle": {
                "type": "string",
                "analyzer": "shingle"
              },
              "spanish": {
                "type": "string",
                "analyzer": "spanish"
              }
            },
            "include_in_all": false
          }
        }
      }
    }
  }
}
We use .NET as our client to Elasticsearch and haven't been consistent in specifying a timezone for the scheduleDT field.
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 12,
    "successful": 12,
    "failed": 0
  },
  "hits": {
    "total": 32,
    "max_score": null,
    "hits": [
      {
        "_index": "curator_v3",
        "_type": "published",
        "_id": "29651227",
        "_score": null,
        "fields": {
          "Id": [
            "29651227"
          ],
          "scheduleDT": [
            "2015-11-21T22:17:51.0946798-06:00"
          ],
          "title": [
            "97 Year-Old Woman Cries Tears Of Joy After Finally Getting Her High School Diploma"
          ],
          "createDT": [
            "2015-11-21T22:13:32.3597142-06:00"
          ]
        },
        "sort": [
          1448165871094
        ]
      },
      {
        "_index": "curator_v3",
        "_type": "published",
        "_id": "210466413",
        "_score": null,
        "fields": {
          "Id": [
            "210466413"
          ],
          "scheduleDT": [
            "2015-11-22T12:00:00"
          ],
          "title": [
            "6 KC treats to bring to Thanksgiving"
          ],
          "createDT": [
            "2015-11-20T15:08:25.4282-06:00"
          ]
        },
        "sort": [
          1448193600000
        ]
      }
    ]
  },
  "aggregations": {
    "ScheduleDT": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 27,
      "buckets": [
        {
          "key": 1448165871094,
          "key_as_string": "2015-11-22T04:17:51.094Z",
          "doc_count": 1
        },
        {
          "key": 1448193600000,
          "key_as_string": "2015-11-22T12:00:00.000Z",
          "doc_count": 4
        }
      ]
    }
  }
}
You can do this by querying for documents whose scheduleDT field is less than 20 characters long (e.g. 2015-11-22T12:00:00); all date values with a specified time zone are longer.
Something like this should do:
{
  "query": {
    "filtered": {
      "filter": {
        "script": {
          "script": "doc.scheduleDT.value.size() < 20"
        }
      }
    }
  }
}
Note, however, that in order to make your queries easier to create, you should always try to convert your timestamps to UTC before indexing your documents.
Finally, also make sure that you have dynamic scripting enabled in order to run the above query.
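On the 1.x releases this mapping suggests, dynamic scripting is enabled in elasticsearch.yml followed by a node restart; the exact setting changed between minor versions, so verify against your version's documentation:

# elasticsearch.yml — enable dynamic (inline) scripting, then restart the node
script.disable_dynamic: false    # ES before 1.6
# or, on ES 1.6 and later:
script.inline: on
script.indexed: on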
UPDATE
Actually, if you use the _source directly in the script, it will work, because _source returns the real value from the source as it was when the document was indexed:
{
  "query": {
    "filtered": {
      "filter": {
        "script": {
          "script": "_source.scheduleDT.size() < 20"
        }
      }
    }
  }
}
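For completeness, a minimal sketch of running that query from the shell; the host and port are assumptions, while the index and type names are taken from the response above:

curl -s -XPOST 'http://localhost:9200/curator_v3/published/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "script": { "script": "_source.scheduleDT.size() < 20" }
      }
    }
  }
}'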

Find records by an array key in a document

I have some records like this
{
  "_id": "test1",
  "friend": {
    "test2": {
      "name": "ax",
      "level": 1,
      "intimacy": 0,
      "status": 0
    },
    "test3": {
      "name": "es",
      "level": 1,
      "intimacy": 0,
      "status": 0
    }
  }
},
{
  "_id": "test2",
  "friend": {
    "test2": {
      "name": "ff",
      "level": 1,
      "intimacy": 0,
      "status": 0
    },
    "test3": {
      "name": "dd",
      "level": 1,
      "intimacy": 0,
      "status": 0
    }
  }
}
I need to find the friend.test2 node in the document with _id = "test1".
I don't want to change the data structure to solve this problem; can anyone help me?
The result should look like this:
{
  "test2": {
    "name": "ax",
    "level": 1,
    "intimacy": 0,
    "status": 0
  }
}
Try the query below:
db.<coll>.find({ _id:"test1" }, { "friend.test2":1 })
In mongo shell you can do this:
db.collection.find({_id:"test1"},{"friend.test2":1})
and for PHP you may try something like:
$collection->find(array("_id"=>"test1"),array("friend.test2"=>true))
The following code works correctly:
$results = $coll->find(array("_id" => "test1"), array("friend.test2" => 1));
foreach ($results as $test) {
    print_r($test);
}
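Note that all of the projections above still return the _id field, since projections include it by default. To get exactly the desired output with only the friend.test2 node, exclude _id explicitly, for example in the mongo shell:

// Include friend.test2 and suppress the automatically returned _id field.
db.collection.find(
  { _id: "test1" },
  { "friend.test2": 1, _id: 0 }
)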

Using Elasticsearch with MongoDB River for searching PDF

I want to search PDF files by content, but the result doesn't contain readable PDF content. It looks like the following:
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "mongoindex",
        "_type": "files",
        "_id": "532595b8f37d5cc2d64a517d",
        "_score": 1.0,
        "_source": {
          "content": {
            "content_type": "application/pdf",
            "title": "D:/sample.pdf",
            "content": "JVBERi0xLjUNCiW1tbW1DQoxIDAgb2JqDQo8PC9UeXBlL0NhdGFsb2cvUGFnZXMgMiAwIFIvTGFuZyhlbi1VUykgPj4NCmVuZG9iag0KMiAwIG9iag0",
            "filename": "D:/sample.pdf",
            "contentType": "application/pdf",
            "md5": "afe70f97bce7876e39aa43f71dc7266f",
            "length": 82441,
            "chunkSize": 262144,
            "uploadDate": "2014-03-16T12:14:48.542Z",
            "metadata": {}
          }
        }
      }
    ]
  }
}
Could you please help me find my mistake?
Here is the link I used:
http://v.bartko.info/?p=463
Your attachment has been encoded in Base64.
You need to decode it on the client side.
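As a quick sanity check from the shell: the JVBERi0... prefix of the content field decodes to the PDF file signature, confirming the field holds Base64-encoded raw file bytes rather than extracted text (the fragment below is truncated from the response above, with = padding added to make it valid Base64):

echo 'JVBERi0xLjUNCiW1tbW1DQo=' | base64 -d
# prints "%PDF-1.5" followed by a binary comment line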