mongodb remove all dates less than specified - mongodb

I have the following data.
{
deviceID: 186,
date: "2014-3-15"
}
{
deviceID: 186,
date: "2014-3-14"
}
{
deviceID: 186,
date: "2014-3-13"
}
And some earlier dates, like 2014-3-9, 8, 7, 6, etc.
When I run db.coll.remove({date:{$lte:"2014-3-5"}})
Mongo removes the 15, 14 and 13 as well, but keeps the single-digit day dates. Is this because the date is a string?
I don't know how else to format the date so that I can remove all dates below a certain date.
It is supposed to be a cleaning process, removing all documents with a date lower than specified.

It's because the date field you are querying on is a string field and not a Date(). In your Mongo documents, instead of a custom date string, insert JavaScript Date objects into the date field,
like
{ deviceID: 186, "date": new Date(2012, 7, 14) }
and when you execute the remove, do it like
db.coll.remove({date:{$lte:new Date(2012, 7, 14)}})

If you want to remove data from MongoDB with a date less than the one specified, you must make sure the date value is what you expect.
Easiest way for you to check whether you are inputting the right format is to test it before you use it in your query.
For example if you want to get current date in ISODate in Mongo shell, just type new Date and you will get the current date in Mongo.
I've tried the following in the Mongo shell:
new Date(2017, 11, 1)
and it returns
ISODate("2017-11-30T16:00:00Z")
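which is not what I wanted: the month argument is zero-based, so 11 means December, and the date is built in the shell's local time zone (UTC+8 here, judging by the output).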
What I want is to delete data before 1 November 2017.
Here's what works for me:
new Date("2017-11-01")
and it returns:
ISODate("2017-11-01T00:00:00Z")
Which is what I wanted.
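The difference between the two constructor forms can be checked in plain JavaScript without a database. This is only a small sketch, not part of the original answer: it relies on date-only ISO strings being parsed as UTC midnight and on Date.UTC taking a zero-based month. The variable names are made up for illustration.

```javascript
// Date-only ISO strings are parsed as UTC midnight.
const fromIsoString = new Date("2017-11-01");

// Date.UTC takes a zero-based month: 10 is November, 11 is December.
const fromUtcParts = new Date(Date.UTC(2017, 10, 1));
const decemberFirst = new Date(Date.UTC(2017, 11, 1));

console.log(fromIsoString.toISOString()); // 2017-11-01T00:00:00.000Z
console.log(fromUtcParts.toISOString());  // 2017-11-01T00:00:00.000Z
console.log(decemberFirst.toISOString()); // 2017-12-01T00:00:00.000Z
```

Building the cutoff with Date.UTC or an ISO string keeps the result independent of the machine's local time zone.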

This is because you are storing your data in the wrong format. You have strings, and when strings are compared
the string '15' is smaller than the string '5'. Convert your strings to dates first (read here how to use dates in mongo).
Only then can you properly compare your dates:
db.coll.remove({
date:{
$lte : new Date(2012, 7, 14)
}
})
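The string-versus-date behaviour described above can be reproduced in plain JavaScript, without MongoDB. The sketch below uses the sample values from the question; parseLooseDate is a hypothetical helper introduced only for this illustration.

```javascript
// Strings compare character by character, so "2014-3-15" sorts before "2014-3-5".
const storedDates = ["2014-3-15", "2014-3-14", "2014-3-13", "2014-3-9"];
const cutoffString = "2014-3-5";
const matchedAsStrings = storedDates.filter((s) => s <= cutoffString);

// Hypothetical helper: parse an unpadded "year-month-day" string into a UTC Date.
function parseLooseDate(text) {
  const [year, month, day] = text.split("-").map(Number);
  return new Date(Date.UTC(year, month - 1, day));
}

// Real dates compare by their position in time.
const cutoffDate = parseLooseDate(cutoffString);
const matchedAsDates = storedDates.filter((s) => parseLooseDate(s) <= cutoffDate);

console.log(matchedAsStrings); // [ '2014-3-15', '2014-3-14', '2014-3-13' ]
console.log(matchedAsDates);   // []
```

The string filter matches exactly the documents the question saw removed, while the date filter matches none of them.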

The reason for this is that your dates are strings.
So in a lexical sense when comparing strings "2014-3-5" is greater than "2014-3-15", as what is being compared is that "1" is less than "5".
Fix your dates as real ISO Dates, or you will forever have this problem.
Batch convert like this, assuming the format is "year-month-day":
db.eval(function(){
    db.collection.find().forEach(function(doc) {
        var d = doc.date.split("-");
        // Each conditional is parenthesized, because "+" binds tighter than "?:".
        // Pad only single-character parts so "3" becomes "03" and "15" stays "15".
        var date = new Date(
            d[0] + "-" +
            ( ( d[1].length < 2 ) ? "0" + d[1] : d[1] ) + "-" +
            ( ( d[2].length < 2 ) ? "0" + d[2] : d[2] )
        );
        db.collection.update(
            { "_id": doc._id },
            { "$set": { "date": date } }
        );
    });
})
That makes sure you get the right dates on conversion.
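The same conversion can also be prepared as a list of bulk operations. The following is a minimal sketch under assumptions: toUtcDate and buildDateFixes are hypothetical helper names, the year is assumed to have four digits, and the sample documents are invented. It only builds the operations and runs in plain JavaScript.

```javascript
// Hypothetical helper: turn "2014-3-5" into a Date at UTC midnight.
function toUtcDate(looseDate) {
  const [year, month, day] = looseDate.split("-");
  const iso = year + "-" + month.padStart(2, "0") + "-" + day.padStart(2, "0");
  return new Date(iso); // date-only ISO strings are parsed as UTC
}

// Build one updateOne operation per document that still holds a string date.
function buildDateFixes(docs) {
  return docs
    .filter((doc) => typeof doc.date === "string")
    .map((doc) => ({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { date: toUtcDate(doc.date) } },
      },
    }));
}

// Invented sample documents in the shape shown in the question.
const sampleDocs = [
  { _id: 1, deviceID: 186, date: "2014-3-15" },
  { _id: 2, deviceID: 186, date: "2014-3-5" },
];
const ops = buildDateFixes(sampleDocs);
console.log(JSON.stringify(ops, null, 2));
```

In the shell, an array of this shape could then be handed to db.coll.bulkWrite(ops), which avoids one round trip per document.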

Related

Converting StringDate to a queryable representation. Group and project sum of today,yesterday, week, month

I was about to try to figure out how to complete the given question.
It might consist of 2 parts: first, my collection dates are stored as plain strings in a MySQL format (YYYY-mm-dd HH:mm:ss); second, how to project the summaries (today, yesterday, 7 days, month).
I have been experimenting around and this is what I came up with.
Pipe 1.
$match - nothing fancy there just a simple field = value.
Pipe 2.
$addFields - trying to process the string date as an ISO date, I believe? I am not sure
{
expired: {
$dateFromString: {
dateString: '$expired',
timezone: 'America/New_York'
}
}
}
Pipe 3.
$match - Commented out; I wanted to select only a specific range, i.e. not older than 30 days. It doesn't work
expired: {
$gt: ISODate(new Date(new Date(ISODate().getTime() - 1000*60*60*24*30)))
}
Pipe 4.
$group - Here I group and sum everything per day. So an output is
_id: 2021-09-27, theVal : 100
{
_id: {
$dateToString: {
date: { $toDate: "$expired" },
format: "%Y-%m-%d" }
},
theVal : {$sum:{$first:"$values.quantity"}} // as $values is an array [0].quantity,[1].quantity,[2].quantity - I am just interested in the first element.
}
Pipe 5.
$project - getting rid of the _id field - making it date name field, keeping theVal.
{
"date": "$_id",
"theVal": 1,
"_id": 0
}
theVal is a sum of integers within a day.
Questions
Between Pipe 1 and 2 ( temporary 3 ) I should be able to match dates
within the last 30 days to reduce the processing?
How to get a desired output like this:
{
today : 100,
yesterday : 10,
7days : 220,
month: 1000,
}
Really appreciate any help here.
I tried to replicate what you intend to do, as you didn't provide sample test data.
You may want to do the following in an aggregation pipeline:
1. $match: filter out the ids you want, same as your pipe 1
2. $dateFromString: use "format": "%Y-%m-%d %H:%M:%S", "timezone": "America/New_York"
3. $match: keep only records that are within 30 days, using $expr and $$NOW
4. $group: group by date without time, achieved by converting to a date string with the date part only
5. $addFields: project flags that determine whether the record is within today, yesterday, 7days, month
6. $group: as you didn't provide the meaning of today, yesterday, 7days, month, I assumed that they are cumulative sums over those ranges. A simple conditional $sum will do the summation with the help of the flags from step 5
Here is a Mongo playground for your reference.
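Since the playground link does not show the stages here, the following is one possible shape for such a pipeline, not the playground's actual content. It is a simplified sketch under stated assumptions: it uses rolling 24-hour windows rather than calendar days, it folds the per-day grouping and flag steps into a single conditional $sum, the match filter { status: "active" } is a placeholder, and it assumes a server recent enough to support $$NOW and the array form of $first. The helper names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Sum "$quantity" for documents dated in [now - olderDays, now - newerDays).
function sumBetween(olderDays, newerDays) {
  return {
    $sum: {
      $cond: [
        {
          $and: [
            { $gte: ["$expiredDate", { $subtract: ["$$NOW", olderDays * DAY_MS] }] },
            { $lt: ["$expiredDate", { $subtract: ["$$NOW", newerDays * DAY_MS] }] },
          ],
        },
        "$quantity",
        0,
      ],
    },
  };
}

function buildSummaryPipeline(matchFilter) {
  return [
    { $match: matchFilter },
    {
      $addFields: {
        // Parse the MySQL-style string into a real date.
        expiredDate: {
          $dateFromString: {
            dateString: "$expired",
            format: "%Y-%m-%d %H:%M:%S",
            timezone: "America/New_York",
          },
        },
        // Only the first element of the values array is of interest.
        quantity: { $first: "$values.quantity" },
      },
    },
    // Drop everything older than 30 days before grouping.
    { $match: { $expr: { $gte: ["$expiredDate", { $subtract: ["$$NOW", 30 * DAY_MS] }] } } },
    {
      $group: {
        _id: null,
        today: sumBetween(1, 0),
        yesterday: sumBetween(2, 1),
        "7days": sumBetween(7, 0),
        month: sumBetween(30, 0),
      },
    },
    { $project: { _id: 0 } },
  ];
}

// Placeholder filter; replace with the real pipe 1 condition.
const pipeline = buildSummaryPipeline({ status: "active" });
console.log(JSON.stringify(pipeline, null, 2));
```

The result could be passed to db.collection.aggregate(pipeline). If calendar days are wanted instead of rolling windows, the boundaries would have to be truncated to midnight in the chosen time zone.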

MongoDB query to retrieve distinct documents by date

I have documents in the database with a dateTime value like so:
{
"_id" : ObjectId("5a66fa22d29dbd0001521023"),
"exportSuccessful" : true,
"month" : 0,
"week" : 4,
"weekDay" : "Mon",
"dateTime" : ISODate("2018-01-22T09:02:26.525Z"),
"__v" : 0
}
I'd like to:
query the database for a given date and have it return the document that contains the dateTime if the date matches (I don't care about the time). This is mainly to test before inserting a document that there isn't already one for this date. In the above example, if my given date is 2018-01-22 I'd like the document to be returned.
retrieve all documents with a distinct date from the database (again, I don't care about the time portion). If there are two documents with the same date (but different times), just return the first one.
From what I understand Mongo's ISODate type does not allow me to store only a date, it will always have to be a dateTime value. And on my side, I don't have control over what goes in the database.
Try a range query from the start of the day to the end of the day, so basically create two dates a day apart.
Something like
// Convert the moment objects to native Date objects before querying
var start = moment().utc().startOf('day').toDate();
var end = moment().utc().endOf('day').toDate();
db.collection.find({
dateTime: {
$gte: start,
$lte: end
}
})
Get all distinct dates documents:
db.collection.aggregate([
{"$group":{
"_id":{
"$dateToString":{"format":"%Y-%m-%d","date":"$dateTime"}
},
"first":{
"$first":"$$ROOT"
}
}}])
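For the first part (finding a document for a given calendar day), the range can also be built without moment. This is a minimal sketch assuming days are meant in UTC; utcDayFilter is a hypothetical helper name, and the half-open range ($gte start, $lt next day) is a design choice that avoids missing the last millisecond of the day.

```javascript
// Hypothetical helper: build a filter matching any dateTime on the given UTC day.
function utcDayFilter(isoDay) {
  const start = new Date(isoDay + "T00:00:00.000Z");
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { dateTime: { $gte: start, $lt: end } };
}

const filter = utcDayFilter("2018-01-22");
console.log(filter.dateTime.$gte.toISOString()); // 2018-01-22T00:00:00.000Z
console.log(filter.dateTime.$lt.toISOString());  // 2018-01-23T00:00:00.000Z

// The sample document's timestamp falls inside that range.
const sample = new Date("2018-01-22T09:02:26.525Z");
console.log(sample >= filter.dateTime.$gte && sample < filter.dateTime.$lt); // true
```

The filter could be used as db.collection.findOne(utcDayFilter("2018-01-22")) to test whether a document already exists for that day.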

Mongo DB ISO format

Based on the result returned by the query below, I want to filter by the month and the year.
For example I only want data for the month of March.
db.SBM_USER_DETAIL.aggregate([
{
$project: {
join_date: '$JOIN_DATE'
}
}
]).map(
function(d) {
d.join_date = moment(d.join_date).locale('es').tz("Asia/Kolkata").format();
return d
})
How to use the returned formatted value of join_date inside the MongoDB aggregation query?
MongoDB's ISODate is very similar to the JavaScript Date class. If you have a date range in the Kolkata timezone, and want to filter by that, instantiate a pair of Date objects to define the range, before running the find.
For this instance, to return all join_date values that fall within March 2017 in the Kolkata (UTC+05:30) timezone, filter for dates greater than or equal to midnight March 1 and less than midnight April 1, then convert the results using moment:
var first = new Date("2017-03-01T00:00:00+05:30");
var last = new Date("2017-04-01T00:00:00+05:30");
db.SBM_USER_DETAIL.find(
{join_date:{$gte: first, $lt: last}}, //filter based on join_date
{join_date:1,_id:0} // only return join_date, omit this if you need all fields
).map(
function(d) {
d.join_date = moment(d.join_date).locale('es').tz("Asia/Kolkata").format();
return d;
}
);
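The range boundaries can be generated for any month with a small helper. This is a sketch under assumptions: monthRange is a hypothetical name, the offset is passed as a fixed string (India Standard Time is UTC+05:30 and has no daylight saving time, so a fixed offset is safe for Kolkata but not for every zone), and months are numbered 1 to 12.

```javascript
// Hypothetical helper: half-open range covering one month at a fixed UTC offset.
function monthRange(year, month, utcOffset) {
  const pad = (n) => String(n).padStart(2, "0");
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  return {
    $gte: new Date(`${year}-${pad(month)}-01T00:00:00${utcOffset}`),
    $lt: new Date(`${nextYear}-${pad(nextMonth)}-01T00:00:00${utcOffset}`),
  };
}

const march2017 = monthRange(2017, 3, "+05:30");
console.log(march2017.$gte.toISOString()); // 2017-02-28T18:30:00.000Z
console.log(march2017.$lt.toISOString());  // 2017-03-31T18:30:00.000Z
```

Midnight in Kolkata is 18:30 UTC on the previous day, which is why the stored UTC boundaries fall on the last day of the preceding month.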

get mongodb records created in a specific month

I'm trying to get a specific range of documents, based on when they were created. What I'm trying to do is something like:
/getclaims/2015-01
/getclaims/2015-02
...
that way a user can browse through all records based on the selected month.
In my database I'm not storing a created_at date, but I know mongodb stores this in the objectid.
I found that I can get records like this:
db.claims.find({
$where: function () { return Date.now() - this._id.getTimestamp() < (365 * 24 * 60 * 60 * 1000) }
})
of course that doesn't filter based on a specific month, but only within a certain time limit.
What would be a possible way of limiting a query to a specific month, using the timestamp from the ObjectIds?
I'm using mongoose, but it's probably a good idea to start in mongo shell itself.
Based on the function borrowed from the answer to this question - https://stackoverflow.com/a/8753670/131809
function objectIdWithTimestamp(timestamp) {
// Convert date object to hex seconds since Unix epoch
var hexSeconds = Math.floor(timestamp/1000).toString(16);
// Create an ObjectId with that hex timestamp
return ObjectId(hexSeconds + "0000000000000000");
}
Create a start and an end date for the month you're looking for, here January 2015 (JavaScript months are zero-based, so 0 is January and 1 is February):
var start = objectIdWithTimestamp(new Date(2015, 0, 1));
var end = objectIdWithTimestamp(new Date(2015, 1, 1));
Then, run the query with $gte and $lt:
db.claims.find({_id: {$gte: start, $lt: end}});
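To go from a route parameter such as 2015-01 to the two boundaries, something like the following could work. It is a sketch under assumptions: the helper names are hypothetical, months are taken in UTC, and it returns 24-character hex strings so that it runs outside the shell. In the shell or a driver each string would still be wrapped in ObjectId(...).

```javascript
// Hex form of an ObjectId whose timestamp is `date` and whose remaining bytes are zero.
function objectIdHexForDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

// Hypothetical helper: boundaries for a "YYYY-MM" string, in UTC.
function objectIdBoundsForMonth(yearMonth) {
  const [year, month] = yearMonth.split("-").map(Number);
  const start = new Date(Date.UTC(year, month - 1, 1));
  const end = new Date(Date.UTC(year, month, 1)); // month 12 rolls over into the next year
  return { $gte: objectIdHexForDate(start), $lt: objectIdHexForDate(end) };
}

const bounds = objectIdBoundsForMonth("2015-01");
console.log(bounds);
```

With real ObjectIds the query would look like db.claims.find({_id: {$gte: ObjectId(bounds.$gte), $lt: ObjectId(bounds.$lt)}}).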

Convert to date MongoDB via mongoimport

I have downloaded huge chunks of data in the format in csv. I am using mongoimport to enter the data into MongoDB for processing.
How do I get the date into date format recognized by MongoDB?
sample data with header
Date, Open Price, High Price, Low Price, Last Traded Price , Close Price, Total Traded Quantity, Turnover (in Lakhs)
04-Apr-2014,901,912,889.5,896.75,892.85,207149,1867.08
03-Apr-2014,908,918,897.65,900,900.75,156260,1419.9
02-Apr-2014,916,921.85,898,900.7,900.75,175990,1591.97
As far as I know, there is no way to do this with mongoimport.
But this is achievable by importing the data and then running the following script (note that there is no point in all the hassle with months as in Neil Lunn's script, because mongo can properly convert your date with new Date('04-Apr-2014')):
db.collName.find().forEach(function(el){
el.dateField = new Date(el.dateField);
db.collName.save(el)
});
PS If timezone is so important (I assume that it is not, if there are only dates without time information), you can just change timezone on your local machine and then run the query. (Thanks to Neil Lunn for clarification regarding this)
As of Mongo version 3.4, you can use --columnsHaveTypes option to specify the type of your field while using mongoimport to import your data.
here is the link for reference.
Sample mongoimport syntax below:
mongoimport --db XYZ --collection abc --type tsv --fields id.int32(),client_name.string(),app_name.auto(),date.date() --columnsHaveTypes --file "abc.tsv" --verbose
You basically have three options here, because although you can import CSV directly using mongoimport, it has no idea how to convert dates from this format.
1. Convert your CSV input to JSON format by whatever means. For your date values you can use the extended JSON syntax form that will be recognized by the tool. The resulting JSON you produce can then be passed to mongoimport.
2. Write your own program to import the data by reading your CSV input and doing the correct conversions.
3. Import the CSV content as is, and then manipulate the data directly in your MongoDB collection using your language of choice.
One take on the third option would be to loop the results and update the dates accordingly:
var months = [
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
];
db.collection.find({ },{ "Date": 1 }).forEach(function(doc){
var splitDate = doc.Date.split("-");
var mval = months.indexOf( splitDate[1] ) + 1; // indexOf is zero-based, calendar months are 1-12
mval = ( mval < 10 ) ? "0" + mval : mval;
var newDate = new Date( splitDate[2] + "-" + mval + "-" + splitDate[0] );
db.collection.update(
{ _id: doc._id },
{ "$set": { "Date": newDate } }
);
})
And that would make sure your dates are then converted to the correct BSON date format with the same matching date values you expect.
Beware of "local" timezone conversions, you will want to be storing as UTC time.
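To sidestep local timezone conversion entirely, the parts of the string can be fed to Date.UTC. This is a minimal sketch in plain JavaScript: parseDayMonYear is a hypothetical helper, and it assumes the "DD-Mon-YYYY" layout with English month abbreviations shown in the sample data.

```javascript
const MONTHS = [
  "Jan", "Feb", "Mar", "Apr", "May", "Jun",
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

// Hypothetical helper: parse "04-Apr-2014" into a Date at UTC midnight.
function parseDayMonYear(text) {
  const [day, mon, year] = text.trim().split("-");
  const monthIndex = MONTHS.indexOf(mon); // zero-based, as Date.UTC expects
  if (monthIndex === -1) {
    throw new Error("Unknown month abbreviation: " + mon);
  }
  return new Date(Date.UTC(Number(year), monthIndex, Number(day)));
}

console.log(parseDayMonYear("04-Apr-2014").toISOString()); // 2014-04-04T00:00:00.000Z
console.log(parseDayMonYear("03-Apr-2014").toISOString()); // 2014-04-03T00:00:00.000Z
```

Because the value is built from Date.UTC, the stored date does not depend on the time zone of the machine running the conversion.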