Querying for output address using regex and tx time range - mongodb

qqqqqqqqqqqqqqqqqqqqqqqqqqqqqu08dsyxz98whc is one of possibly many output addresses that end with the substring whc; see this transaction:
https://blockchair.com/bitcoin-cash/transaction/ce4b6388c3b57dc188bfafde87d7af28ee3ba210d0a3223a3bc86f6083337459
I would like to find such outputs for August 2018 using the bitdb query language, which is similar to MongoDB's:
{
"v": 3,
"q": {
"find": {
"out.e.a": { "$regex": "whc$" },
"blk.t": {
"$gte": "2018-08-01T00:00:00Z",
"$lte": "2018-08-31T00:00:00Z"
}
},
"limit": 1
}
}
Unfortunately, this query returns no results.
Is there a syntax issue that prevents it from matching?

Try querying with Unix timestamps instead of ISO dates.
To convert an ISO date to a Unix timestamp in JavaScript, you can do:
var myDate = new Date("2015-10-25T00:00:00.000Z");
var myTimeStamp = myDate.getTime() / 1000;
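Applied to the query from the question, that suggestion could look like the sketch below. It assumes that blk.t is stored as Unix seconds, which the answer implies but the question does not confirm, and toUnixSeconds is a hypothetical helper name.

```javascript
// Convert an ISO 8601 string to Unix seconds (hypothetical helper).
function toUnixSeconds(iso) {
  return Math.floor(new Date(iso).getTime() / 1000);
}

// Same query as in the question, with the time bounds as numbers.
// Assumption: blk.t holds Unix seconds rather than ISO strings.
var query = {
  v: 3,
  q: {
    find: {
      "out.e.a": { $regex: "whc$" },
      "blk.t": {
        $gte: toUnixSeconds("2018-08-01T00:00:00Z"),
        $lt: toUnixSeconds("2018-09-01T00:00:00Z")
      }
    },
    limit: 1
  }
};

console.log(JSON.stringify(query, null, 2));
```

Using $lt with the first instant of September covers all of 31 August, which an upper bound of 2018-08-31T00:00:00Z would leave out.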

Related

How to get ISO string in Nifi getMongo Query Field

I'm trying to use expression languge to generate ISO string in Nifi getMongo Query field using following query,
{
"remindmeDate": {
"$gte": "${now():format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",'GMT')}",
"$lte": "${now():toNumber():plus(359999):format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",'GMT')}"
}
}
But I'm getting an invalid JSON error because the double quotes are not escaped. When we try to escape them with \, NiFi does not evaluate the expression language. Is there any method or workaround to get this working?
Thanks in advance
The GetMongo processor in NiFi requires the query to be in MongoDB Extended JSON format, so you can use a query like the following to filter on a datetime:
{"bday":{"$gt":{"$date":"2014-01-01T05:00:00.000Z"}, "$lt":{"$date":"2019-01-01T05:00:00.000Z"}}}
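If the query text is generated by a script before it is handed to NiFi, a sketch like this produces the same Extended JSON shape. buildDateRange and its parameters are hypothetical names, and nothing here is part of NiFi itself.

```javascript
// Build a MongoDB Extended JSON range filter as a string.
// buildDateRange is a hypothetical helper, not a NiFi or MongoDB API.
function buildDateRange(field, from, to) {
  var filter = {};
  filter[field] = {
    $gt: { $date: from.toISOString() },
    $lt: { $date: to.toISOString() }
  };
  return JSON.stringify(filter);
}

var text = buildDateRange(
  "bday",
  new Date("2014-01-01T05:00:00.000Z"),
  new Date("2019-01-01T05:00:00.000Z")
);
console.log(text);
```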
I used your expression unchanged in an UpdateAttribute processor to evaluate it into a new flowFile attribute.
Your expression:
{
"remindmeDate": {
"$gte": "${now():format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",'GMT')}",
"$lte": "${now():toNumber():plus(359999):format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",'GMT')}"
}
}
the result:
{
"remindmeDate": {
"$gte": "2017-06-16T07:38:04.811Z",
"$lte": "2017-06-16T07:44:04.810Z"
}
}
This is a valid JSON object.
In the end I found that the GetMongo Query property does not support NiFi expression language (NiFi 1.2.0 and 1.3.0). Hover over the question mark next to the property to see this.
That means there is no way to build a dynamic query in that field.
It seems an issue needs to be registered: https://issues.apache.org/jira/browse/NIFI-4082
However, it is possible to specify the current date and a relative date in the MongoDB query language, something like this:
{
"remindmeDate": {
"$gte": new Date(),
"$lte": new Date(ISODate().getTime() + 359999)
}
}
NiFi's GetMongo Query field doesn't support EL, so I created a stored function in MongoDB for my dynamic query and called it from NiFi.
{
"_id" : "reminderDateGMT",
"value" : function (reminderDateGMT) {
var reminder = new Date(reminderDateGMT)
var fromDate = new Date();
var toDate = new Date(new Date().getTime()+(1000 * 60 * 60));
if ((reminder >= fromDate) && (reminder <=toDate )) {
return true;
} else {
return false;
}
}
}
In the NiFi GetMongo Query field:
{
"$where": "reminderDateGMT(this.reminderDateGMT)"
}
I think you may be able to use the unescapeJson expression language function to handle this. You have to provide valid JSON (escaped quotes) for the field level (PropertyDescriptor in NiFi parlance) validation, but the expression language string expects unescaped JSON during expression parsing, so the unescapeJson function removes the escapes first and then format receives a properly quoted string.
{
"remindmeDate": {
"$gte": "${now():format(\"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'\":unescapeJson(),'GMT')}",
"$lte": "${now():toNumber():plus(359999):format(\"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'\":unescapeJson(),'GMT')}"
}
}
I had a similar discussion on the mailing list, and here is the solution I found that works:
Mongo console:
db.system.js.save({
"_id": "lastFiveMinutes",
"value": function() {
return new Date(ISODate().getTime() - (1000 * 60 * 5));
}
});
db.loadServerScripts();
Query field:
{
"$where": "obj.ts >= lastFiveMinutes()"
}
Note: you probably want to set this on a timer in the scheduling property.
I know this is a pretty old post, but I spent many hours on this and found a solution that worked for me.
I used an UpdateAttribute processor to create two attributes that define the date range of the MongoDB documents I need to fetch:
startDate: "${now():format('yyyy-MM-dd')}"
endDate : "${now():toNumber():plus(86400000):format('yyyy-MM-dd')}"
After that pass these attributes to GetMongo processor:
Query : {"createdDate":{"$gte":ISODate(${startDate}), "$lt":ISODate(${endDate})}}

Cannot use $dayOfMonth aggregate with expression

So, I need to extract the day-of-week for some objects to make some aggregations. But all my documents have is a timestamp, not a Date. So I'm trying to use $dayOfMonth (and others) with an expression, but I can't figure out why it is not working.
Here is my query (along with a helper function to create my date from the timestamp):
db.Logging.aggregate([
{
$match: {
"timestamp": { $gte: dateToTimestamp("2017-04-10") }
}
},
{
$project: {
_id: 0,
timestamp: "$timestamp",
dia: { $dayOfMonth: myDate("$timestamp") }
}
}
])
function dateToTimestamp(str) {
let d = new Date( str );
return d.getTime() + d.getTimezoneOffset()*60*1000;
}
function myDate( ts) {
var d = new ISODate();
d.setTime( ts );
return d;
}
The problem seems to be in passing the value of $timestamp to the myDate function. If I use a literal (e.g. 1492430243) either as the value of ts inside the function or as the value of the parameter passed to myDate it works fine.
In other words: this $dayOfMonth: myDate("1492430243") works.
Although a solution has been shown to work here (Mongodb aggregation by day based on unix timestamp - a pretty ugly solution, if I may add), I want to know why my solution doesn't. As per mongodb docs, $dayOfMonth works with Date types, my function returns a date, so what is wrong?
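One likely explanation, offered as a sketch rather than a confirmed diagnosis: the shell runs myDate("$timestamp") on the client while it builds the pipeline, so the function receives the literal string "$timestamp" and never sees a document's value. The plain JavaScript below reproduces that, with Date standing in for the shell's ISODate, and then shows a stage that does the conversion on the server by adding the field to the epoch date. The stage assumes timestamp holds milliseconds since the epoch.

```javascript
// Client-side helper from the question, with Date in place of ISODate.
function myDate(ts) {
  var d = new Date();
  d.setTime(ts);
  return d;
}

// What gets evaluated before the pipeline is sent to the server:
var evaluated = myDate("$timestamp");
console.log(isNaN(evaluated.getTime())); // true: the string is not a number

// Server-side alternative: let the aggregation compute the date.
// Assumption: "timestamp" is a number of milliseconds since the epoch.
var projectStage = {
  $project: {
    _id: 0,
    timestamp: "$timestamp",
    dia: { $dayOfMonth: { $add: [new Date(0), "$timestamp"] } }
  }
};
```

A numeric literal such as "1492430243" works with the helper because setTime can coerce it to a number, which is why the literal case appears fine.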

momentjs date collation from a json table

Background
momentjs 2.8.3
angularjs
collating dates in a date table
Problem
Trevor wishes to get the global timespan of dates in a date table, where each record contains a start date and an end date.
Goal
The goal is to get a global timespan, such that the earliest part of the timespan reflects the earliest date in any row of the table, and the latest part of the timespan reflects the latest date in any row of the table.
Trevor does not know in advance how the dates are arranged in the table, other than they are all formatted as 'YYYY-MM-DD'
Trevor is sold on momentjs as the most effective js library for handling this kind of problem, but he is open to using any others.
Details
The data is all encoded in JSON and structured as below.
```
dataroot {
"datedemo_data_table": [
{
"datebeg": "2014-01-15",
"dateend": "2014-02-15"
},
{
"datebeg": "2014-03-15",
"dateend": "2015-01-01"
},
{
"datebeg": "2015-06-15",
"dateend": "2015-07-20"
},
{
"datebeg": "2012-08-15",
"dateend": "2013-08-15"
},
{
"datebeg": "2013-01-15",
"dateend": "2013-01-16"
}
],
"datedemosummary_data_dict": {
"x": "x",
"ds_soonst_date": "",
"ds_latest_date": ""
}
}
```
The goal is to populate the ds_soonst_date and ds_latest_date with the correct date values.
Questions
Is momentjs the best library for a task such as this?
Are there any performance implications for large data tables (over 10k records)?
You actually don't need moment (or any library) for this. Since the values are in YYYY-MM-DD format, they are sortable as strings. Simple array/object manipulation will work.
var data = JSON.parse('{"datedemo_data_table":[{"datebeg":"2014-01-15","dateend":"2014-02-15"},{"datebeg":"2014-03-15","dateend":"2015-01-01"},{"datebeg":"2015-06-15","dateend":"2015-07-20"},{"datebeg":"2012-08-15","dateend":"2013-08-15"},{"datebeg":"2013-01-15","dateend":"2013-01-16"}],"datedemosummary_data_dict":{"x":"x","ds_soonst_date":"","ds_latest_date":""}}');
var firstBegDate = data.datedemo_data_table
.map(function(x){return x.datebeg;})
.sort().shift();
var lastEndDate = data.datedemo_data_table
.map(function(x){return x.dateend;})
.sort().pop();
As far as performance goes - if you have 10k items in a single JSON, that's probably an issue right there. You will always have O(n) performance with any approach unless you use an index to reduce the data to start with.
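The sort-based snippet does more work than strictly necessary, since sorting is typically O(n log n). A single pass that tracks the smallest and largest string is O(n). Here is a minimal sketch, assuming every row has datebeg and dateend in YYYY-MM-DD form; dateSpan is a hypothetical helper name.

```javascript
// One pass over the rows; YYYY-MM-DD strings compare correctly as text.
function dateSpan(rows) {
  return rows.reduce(function (acc, row) {
    if (acc.earliest === null || row.datebeg < acc.earliest) acc.earliest = row.datebeg;
    if (acc.latest === null || row.dateend > acc.latest) acc.latest = row.dateend;
    return acc;
  }, { earliest: null, latest: null });
}

var rows = [
  { datebeg: "2014-01-15", dateend: "2014-02-15" },
  { datebeg: "2014-03-15", dateend: "2015-01-01" },
  { datebeg: "2015-06-15", dateend: "2015-07-20" },
  { datebeg: "2012-08-15", dateend: "2013-08-15" },
  { datebeg: "2013-01-15", dateend: "2013-01-16" }
];

var span = dateSpan(rows);
console.log(span.earliest, span.latest);
```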
Answer
momentjs is an excellent choice as it is a well-documented and feature-full library.
The performance question is not addressed here, perhaps someone else can chime in on that.
Nevertheless with a small table of a few values, you can get a quick result by doing a collation of the dates into a single javascript array, and then extracting the max and min using the relevant functions from momentjs.
This can be done easily with the following:
Solution
var fmt = 'YYYY-MM-DD'
,ddtemp = $scope.dataroot.datedemosummary_data_dict
,aatemp_dates = []
;
$scope.dataroot.datedemo_data_table.forEach(function(currow, ixx, arr) {
aatemp_dates.push(moment(currow.datebeg,fmt));
aatemp_dates.push(moment(currow.dateend,fmt));
},ddtemp);
ddtemp.ds_soonst_date = (moment.min(aatemp_dates).format(fmt));
ddtemp.ds_latest_date = (moment.max(aatemp_dates).format(fmt));
Result
dataroot {
"datedemo_data_table": [
{
"datebeg": "2014-01-15",
"dateend": "2014-02-15"
},
{
"datebeg": "2014-03-15",
"dateend": "2015-01-01"
},
{
"datebeg": "2015-06-15",
"dateend": "2015-07-20"
},
{
"datebeg": "2012-08-15",
"dateend": "2013-08-15"
},
{
"datebeg": "2013-01-15",
"dateend": "2013-01-16"
}
],
"datedemosummary_data_dict": {
"x": "x",
"ds_soonst_date": "2012-08-15",
"ds_latest_date": "2015-07-20"
}
}
See also
momentjs #min
momentjs #max
momentjs range addon library by gf3 https://github.com/gf3/moment-range

Aggregate MongoDB results by ObjectId date

How can I aggregate my MongoDB results by ObjectId date. Example:
Default cursor results:
cursor = [
{'_id': ObjectId('5220b974a61ad0000746c0d0'),'content': 'Foo'},
{'_id': ObjectId('521f541d4ce02a000752763a'),'content': 'Bar'},
{'_id': ObjectId('521ef350d24a9b00077090a5'),'content': 'Baz'},
]
Projected results:
projected_cursor = [
{'2013-09-08':
{'_id': ObjectId('5220b974a61ad0000746c0d0'),'content': 'Foo'},
{'_id': ObjectId('521f541d4ce02a000752763a'),'content': 'Bar'}
},
{'2013-09-07':
{'_id': ObjectId('521ef350d24a9b00077090a5'),'content': 'Baz'}
}
]
This is what I'm currently using in PyMongo to achieve these results, but it's messy and I'd like to see how I can do it using MongoDB's aggregation framework (or even MapReduce):
cursor = db.find({}, limit=10).sort("_id", pymongo.DESCENDING)
messages = [x for x in cursor]
this_date = lambda x: x['_id'].generation_time.date()
dates = set([this_date(message) for message in messages])
dates_dict = {date: [m for m in messages if this_date(m) == date] for date in dates}
And yes, I know that the easiest way would be to simply add a new date field to each record then aggregate by that, but that's not what I want to do right now.
Thanks!
Update: There is a built in way to do this now, see https://stackoverflow.com/a/51766657/295687
There is no way to accomplish what you're asking with MongoDB's aggregation framework, because there is no aggregation operator that can turn ObjectIds into something date-like (there is a JIRA ticket, though). You should be able to accomplish what you want using map-reduce, however:
// map function
function domap() {
// turn ObjectId --> ISODate
var date = this._id.getTimestamp();
// format the date however you want
var year = date.getFullYear();
var month = date.getMonth() + 1; // getMonth() is zero-based
var day = date.getDate();
// yields date string as key, entire document as value
emit(year+"-"+month+"-"+day, this);
}
// reduce function
function doreduce(datestring, docs) {
return {"date":datestring, "docs":docs};
}
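The map function relies on getTimestamp(), which works because the first four bytes of an ObjectId encode its creation time in seconds since the Unix epoch. The sketch below shows the same extraction in plain JavaScript from the hex string, for cases where the shell helper is not available; objectIdToDate and dayKey are hypothetical names.

```javascript
// The first 4 bytes (8 hex characters) of an ObjectId are a big-endian
// count of seconds since the Unix epoch.
function objectIdToDate(hex) {
  var seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

// Day key in UTC, zero-padded, suitable as a grouping key.
function dayKey(hex) {
  return objectIdToDate(hex).toISOString().slice(0, 10);
}

console.log(dayKey("5220b974a61ad0000746c0d0"));
```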
The Jira Ticket pointed out by llovett has been solved, so now you can use date operators like $isoWeek and $year to extract this information from an ObjectId.
Your aggregation would look something like this:
{
"$project":
{
"_id": {
"$dateFromParts" : {
"year": { "$year": "$_id"},
"month": { "$month": "$_id"},
"day": { "$dayOfMonth": "$_id"}
}
}
}
}
So this doesn't answer my question directly, but I did find a better way to replace all that lambda nonsense above using Python's setdefault:
d = {}
for message in messages:
key = message['_id'].generation_time.date()
d.setdefault(key,[]).append(message)
Thanks to @raymondh for the hint in his PyCon talk:
Transforming Code into Beautiful, Idiomatic Python

Converting string to date in mongodb

Is there a way to convert string to date using custom format using mongodb shell
I am trying to convert "21/May/2012:16:35:33 -0400" to date,
Is there a way to pass DateFormatter or something to
Date.parse(...) or ISODate(....) method?
Using MongoDB 4.0 and newer
The $toDate operator converts a value to a date. If the value cannot be converted to a date, $toDate errors. If the value is null or missing, $toDate returns null.
You can use it within an aggregate pipeline as follows:
db.collection.aggregate([
{ "$addFields": {
"created_at": {
"$toDate": "$created_at"
}
} }
])
The above is equivalent to using the $convert operator as follows:
db.collection.aggregate([
{ "$addFields": {
"created_at": {
"$convert": {
"input": "$created_at",
"to": "date"
}
}
} }
])
Using MongoDB 3.6 and newer
You can also use the $dateFromString operator, which converts a date/time string to a date object and has options for specifying the date format as well as the time zone:
db.collection.aggregate([
{ "$addFields": {
"created_at": {
"$dateFromString": {
"dateString": "$created_at",
"format": "%m-%d-%Y" /* <-- option available only in version 4.0 and newer */
}
}
} }
])
Using MongoDB versions >= 2.6 and < 3.2
If your MongoDB version does not have the native conversion operators, you need to iterate the cursor returned by find() manually, using either the forEach() method or the cursor method next() to access the documents. Within the loop, convert the field to an ISODate object and then update the field using the $set operator, as in the following example, where the field is called created_at and currently holds the date as a string:
var cursor = db.collection.find({"created_at": {"$exists": true, "$type": 2 }});
while (cursor.hasNext()) {
var doc = cursor.next();
db.collection.update(
{"_id" : doc._id},
{"$set" : {"created_at" : new ISODate(doc.created_at)}}
)
};
For better performance, especially with large collections, use the Bulk API: operations are sent to the server in batches of, say, 1000, so you make one round trip per 1000 updates instead of one per document.
The first example below uses the Bulk API available in MongoDB versions >= 2.6 and < 3.2. It updates all the documents in the collection by changing the created_at fields to date fields:
var bulk = db.collection.initializeUnorderedBulkOp(),
counter = 0;
db.collection.find({"created_at": {"$exists": true, "$type": 2 }}).forEach(function (doc) {
var newDate = new ISODate(doc.created_at);
bulk.find({ "_id": doc._id }).updateOne({
"$set": { "created_at": newDate}
});
counter++;
if (counter % 1000 == 0) {
bulk.execute(); // Execute per 1000 operations and re-initialize every 1000 update statements
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// Clean up remaining operations in queue
if (counter % 1000 != 0) { bulk.execute(); }
Using MongoDB 3.2
The next example applies to MongoDB 3.2, which deprecated the Bulk API and provides a newer set of APIs using bulkWrite():
var bulkOps = [],
cursor = db.collection.find({"created_at": {"$exists": true, "$type": 2 }});
cursor.forEach(function (doc) {
var newDate = new ISODate(doc.created_at);
bulkOps.push(
{
"updateOne": {
"filter": { "_id": doc._id } ,
"update": { "$set": { "created_at": newDate } }
}
}
);
if (bulkOps.length === 500) {
db.collection.bulkWrite(bulkOps);
bulkOps = [];
}
});
if (bulkOps.length > 0) db.collection.bulkWrite(bulkOps);
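The batching idea in both examples can be separated from the database calls. In the sketch below, inBatches is a hypothetical helper and the flush callback stands in for a call such as db.collection.bulkWrite, so the snippet runs without a server.

```javascript
// Split a list of operations into fixed-size batches and hand each batch
// to a callback. The callback is an assumption: in the shell it could wrap
// db.collection.bulkWrite; here it only records batch sizes.
function inBatches(ops, size, flush) {
  for (var i = 0; i < ops.length; i += size) {
    flush(ops.slice(i, i + size));
  }
}

var ops = [];
for (var n = 0; n < 1234; n++) {
  ops.push({
    updateOne: { filter: { _id: n }, update: { $set: { converted: true } } }
  });
}

var sizes = [];
inBatches(ops, 500, function (batch) { sizes.push(batch.length); });
console.log(sizes);
```

The last batch is simply shorter, which removes the need for a separate clean-up step after the loop.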
In my case, the following solution succeeded in converting the ClockInTime field in the ClockTime collection from string to Date type:
db.ClockTime.find().forEach(function(doc) {
doc.ClockInTime=new Date(doc.ClockInTime);
db.ClockTime.save(doc);
})
You can use the JavaScript in the second link provided by Ravi Khakhkhar, or you will have to perform some string manipulation to convert your original string (some of the special characters in your original format aren't recognised as valid delimiters). Once you do that, you can use "new":
training:PRIMARY> Date()
Fri Jun 08 2012 13:53:03 GMT+0100 (IST)
training:PRIMARY> new Date()
ISODate("2012-06-08T12:53:06.831Z")
training:PRIMARY> var start = new Date("21/May/2012:16:35:33 -0400") => doesn't work
training:PRIMARY> start
ISODate("0NaN-NaN-NaNTNaN:NaN:NaNZ")
training:PRIMARY> var start = new Date("21 May 2012:16:35:33 -0400") => doesn't work
training:PRIMARY> start
ISODate("0NaN-NaN-NaNTNaN:NaN:NaNZ")
training:PRIMARY> var start = new Date("21 May 2012 16:35:33 -0400") => works
training:PRIMARY> start
ISODate("2012-05-21T20:35:33Z")
Here's some links that you may find useful (regarding modification of the data within the mongo shell) -
http://cookbook.mongodb.org/patterns/date_range/
http://www.mongodb.org/display/DOCS/Dates
http://www.mongodb.org/display/DOCS/Overview+-+The+MongoDB+Interactive+Shell
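The string manipulation mentioned above can be done with a regular expression. This is a minimal sketch for the exact format in the question (dd/Mon/yyyy:HH:mm:ss +hhmm with English month abbreviations); parseLogDate is a hypothetical helper, not a MongoDB function.

```javascript
var MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Parse "21/May/2012:16:35:33 -0400" into a Date; returns null on mismatch.
function parseLogDate(str) {
  var m = /^(\d{2})\/([A-Za-z]{3})\/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})$/.exec(str);
  if (!m) return null;
  var month = MONTHS.indexOf(m[2]);
  if (month < 0) return null;
  // Read the wall-clock fields as if they were UTC, then remove the zone offset.
  var utc = Date.UTC(+m[3], month, +m[1], +m[4], +m[5], +m[6]);
  var offsetMinutes = (m[7] === "+" ? 1 : -1) * (+m[8] * 60 + +m[9]);
  return new Date(utc - offsetMinutes * 60000);
}

var parsed = parseLogDate("21/May/2012:16:35:33 -0400");
console.log(parsed.toISOString());
```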
I had some strings stored in MongoDB which had to be reformatted into a proper, valid dateTime field.
Here is my code for the special date format "2014-03-12T09:14:19.5303017+01:00",
but you can easily take this idea and write your own regex to parse other date formats:
// format: "2014-03-12T09:14:19.5303017+01:00"
var myregexp = /(....)-(..)-(..)T(..):(..):(..)\.(.+)([\+-])(..)/;
db.Product.find().forEach(function(doc) {
var matches = myregexp.exec(doc.metadata.insertTime);
if (myregexp.test(doc.metadata.insertTime)) {
var offset = matches[9] * (matches[8] == "+" ? 1 : -1);
var hours = matches[4]-(-offset)+1
var date = new Date(matches[1], matches[2]-1, matches[3],hours, matches[5], matches[6], matches[7] / 10000.0)
db.Product.update({_id : doc._id}, {$set : {"metadata.insertTime" : date}})
print("successfully updated");
} else {
print("not updated");
}
})
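An alternative sketch for the same format, under the assumption that millisecond precision is enough: trim the fraction to three digits and let Date parse the result, offset included. parseOffsetTimestamp is a hypothetical helper name.

```javascript
// Trim the fractional seconds to three digits so Date can parse the string.
// Returns null when the input does not match the expected shape.
function parseOffsetTimestamp(str) {
  var m = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$/.exec(str);
  if (!m) return null;
  var millis = ((m[2] || "") + "000").slice(0, 3);
  return new Date(m[1] + "." + millis + m[3]);
}

var ts = parseOffsetTimestamp("2014-03-12T09:14:19.5303017+01:00");
console.log(ts.toISOString());
```

This leaves the time zone arithmetic to the Date parser instead of adjusting the hours by hand.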
How about using a library like momentjs by writing a script like this:
[install_moment.js]
function get_moment(){
// shim to get UMD module to load as CommonJS
var module = {exports:{}};
/*
copy your favorite UMD module (i.e. moment.js) here
*/
return module.exports
}
//load the module generator into the stored procedures:
db.system.js.save( {
_id:"get_moment",
value: get_moment,
});
Then load the script at the command line like so:
> mongo install_moment.js
Finally, in your next mongo session, use it like so:
// LOAD STORED PROCEDURES
db.loadServerScripts();
// GET THE MOMENT MODULE
var moment = get_moment();
// parse a date-time string
var a = moment("23 Feb 1997 at 3:23 pm","DD MMM YYYY [at] hh:mm a");
// reformat the string as you wish:
a.format("[The] DDD['th day of] YYYY"); //"The 54'th day of 1997"