How to combine hourly rows into one day with MongoDB? - mongodb

Are you able to use MongoDB to combine rows of data into one row?
I'm using dates with year, month, day and hour. The data is shown per hour. Is there a way to combine the hourly data into a single row per day? I would basically remove the hour column and sum the hourly data into per-day data.

I'm not sure what you mean by "the data is shown per hour" - do you mean it's stored in the database that way?
MongoDB doesn't have rows and columns - the equivalent of a row is a document, and the column equivalent is a field. Unlike in traditional SQL, a field isn't just one piece of information (a string, number/date, boolean, null, etc). It can be more than one piece of data - it can be an array, or a document, or an array of documents, etc.
Anyway, based on the small amount of information I have on your situation, I'd absolutely design the data with the bucket pattern. https://www.mongodb.com/blog/post/building-with-patterns-the-bucket-pattern
You could $unset the 'measurements' array and just keep the sum/count fields if that's what you want.
If your data is already set in stone, then I'd use an aggregation pipeline to group all the documents ('rows') together - the group _id would be year, month, day, and you could sum/count/min/max/etc the data in the group too.
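As a rough sketch of that last option: assuming each hourly document has numeric year, month, day and hour fields plus a numeric value field (these field names and the readings collection are guesses on my part, not something from your question), the pipeline could look something like this:

```javascript
// Hypothetical document shape: { year: 2023, month: 5, day: 17, hour: 9, value: 12 }
const dailyTotalsPipeline = [
  {
    // One group per calendar day; the hour field is simply left out of the key.
    $group: {
      _id: { year: "$year", month: "$month", day: "$day" },
      total: { $sum: "$value" }, // sum of the hourly values
      hours: { $sum: 1 },        // how many hourly documents were combined
      min: { $min: "$value" },
      max: { $max: "$value" },
    },
  },
  // Sort the daily results chronologically.
  { $sort: { "_id.year": 1, "_id.month": 1, "_id.day": 1 } },
];

// In mongosh this would be run roughly as:
//   db.readings.aggregate(dailyTotalsPipeline)
```

If the date is stored as a single Date field instead, putting $dateToString (or $dateTrunc on MongoDB 5.0+) in the group _id should do the same job.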

Related

MongoDB aggregate documents in fixed sized buckets and then filter one document for each

I have a collection of documents with a field x, a timestamp date field, and other fields. Since there are a fixed number of documents per day, I want to bucket them by day (one bucket for all documents with timestamp on that particular day) and then select from each bucket the document with the maximum value of x with all its fields. So the pipeline must return a list of documents (as many as the number of days) each of which is the document that presents the highest value of x on that day.
How can I do that? P.S. I'm using Mongoose.

Weekly hour allocation problem in Rails and Postgresql

If I have a list of tasks with a certain date ranges, and the task is broken into weekly hour chunks of work (ie. 30 hours from 2018-12-31 to 2019-01-06 ... etc starting from Monday).
The kind of operations I would like to do are
Display all the weekly hours of all the tasks for a list of users
Sum the weekly hours for a user for all his tasks for the week
When the duration of the task is modified, create/destroy the weekly hour chunks.
Would it be more efficient to store these weekly records as
start date/end date/hours,
year/week number/hours
Storing start/end date probably gives more flexibility to the table, as it could potentially store hours that are not aligned to weeks.
Storing week number means given a date range, creating the weekly chunks is as simple as finding the week number of the start date and the week number of the end date, and populating the weeks in between (without converting to date ranges). Also easier validation for updating the hours for a week, as long as the week number is 1-53.
Wondering if anyone has tried out either option and can give any pointers on their preferred option.
I would probably go for a daterange column.
That gives you the flexibility to have differently sized chunks and allows you to define an exclusion constraint to prevent overlapping ranges.
Finding the row for a given week is still quite simple using the "contains" operator @>, e.g. where the_column @> to_date('2019-24', 'iyyy-iw') finds the row(s) that contain week number 24 in 2019.
The expression to_date('2019-24', 'iyyy-iw') returns the first day (Monday) of the specified week.
Finding all rows that are between two weeks can also be done, however constructing the corresponding date range looks a bit ugly. You can either construct an inclusive range with the first and last day: daterange(to_date('2019-24', 'iyyy-iw'), to_date('2019-24', 'iyyy-iw') + 6, '[]')
Or you can create a range with an exclusive upper range with the next week's first day: daterange(to_date('2019-24', 'iyyy-iw'), to_date('2019-25', 'iyyy-iw'), '[)')
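To make the two range forms concrete, here is a small JavaScript sketch (not Postgres itself) that mimics to_date(..., 'iyyy-iw') and the half-open '[)' range. The helper names isoWeekStart, weekRange and rangeContains are made up for the illustration, and everything is computed in UTC:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Monday (UTC) of the given ISO week, mimicking to_date('2019-24', 'iyyy-iw').
// ISO week 1 is the week that contains January 4th.
function isoWeekStart(isoYear, isoWeek) {
  const jan4 = new Date(Date.UTC(isoYear, 0, 4));
  const daysSinceMonday = (jan4.getUTCDay() + 6) % 7; // Monday -> 0 ... Sunday -> 6
  const week1Monday = jan4.getTime() - daysSinceMonday * DAY_MS;
  return new Date(week1Monday + (isoWeek - 1) * 7 * DAY_MS);
}

// Half-open range [Monday, next Monday), like daterange(..., ..., '[)').
function weekRange(isoYear, isoWeek) {
  const start = isoWeekStart(isoYear, isoWeek);
  return { start, end: new Date(start.getTime() + 7 * DAY_MS) };
}

// Equivalent of "range @> date" for a half-open range.
function rangeContains(range, date) {
  return date >= range.start && date < range.end;
}

const week24 = weekRange(2019, 24);
console.log(week24.start.toISOString().slice(0, 10)); // 2019-06-10
console.log(week24.end.toISOString().slice(0, 10));   // 2019-06-17
console.log(rangeContains(week24, new Date(Date.UTC(2019, 5, 16)))); // true (Sunday)
console.log(rangeContains(week24, week24.end));                      // false (next Monday)
```

The exclusive upper bound means consecutive weeks never share a day, which is also what makes an exclusion constraint on such ranges straightforward.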
While ranges can be indexed quite efficiently, the required GiST indexes are a bit more expensive to maintain than a B-Tree index on two integer columns.
Another downside of using ranges (if you don't really need the flexibility) is that they take up more space than two integer columns (14 bytes instead of 8, or even 4 with two smallint). So if the size of the table is of any concern, then your current solution with the year/week columns is more efficient.
"Storing week number means given a date range, creating the weekly chunks is as simple as finding the week number of the start date and the week number of the end date"
If your input is a start and end date to begin with (rather than a "week number"), then I would definitely go for a daterange column. If that start and end date cover more than one week, then you store only one row, rather than multiple rows.

Crystal Reports - create calendar

I need to create an attendance list showing days in rows and employee names in columns. The list will always cover one full month chosen in parameters.
How can I create a recordset of the days of the chosen month? I've done it in the command section but, due to ERP system limitations, it must be done some other way.
Thank you,
Przemek
A good approach is to create a Calendar table (aka Date Dimension in data warehousing lingo). It makes it easy to show days without any attendance. If you don't need that aspect, you can simply create a formula that returns the attendance date month's day, and Group on that formula. The Day() function gets you the day of month. For example,
Day ({Orders.Order Date})
If you search 'creating a date dimension or calendar table' you'll find many helpful sources such as this one: https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/
For your case, I agree with the comments in that post about using date instead of integer as the primary key. Integer PK makes more sense for true data warehousing scenarios as opposed to legacy databases.
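To illustrate why the calendar approach helps with days that have no attendance, here is a minimal JavaScript sketch of the idea (outside Crystal Reports, so only an illustration). The function names and the sample records are hypothetical:

```javascript
// All dates of one month as "YYYY-MM-DD" strings; month is 1-12.
function daysOfMonth(year, month) {
  // Day 0 of the following month is the last day of this month.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const days = [];
  for (let d = 1; d <= lastDay; d++) {
    days.push(new Date(Date.UTC(year, month - 1, d)).toISOString().slice(0, 10));
  }
  return days;
}

// "Left join" the calendar to the attendance records, so that days nobody
// attended still appear, just with an empty list.
function attendanceByDay(year, month, records) {
  return daysOfMonth(year, month).map((date) => ({
    date,
    employees: records.filter((r) => r.date === date).map((r) => r.employee),
  }));
}

// Made-up sample data.
const records = [
  { date: "2019-02-01", employee: "Anna" },
  { date: "2019-02-03", employee: "Jan" },
];
const february = attendanceByDay(2019, 2, records);
console.log(february.length);       // 28
console.log(february[1].employees); // [] (2019-02-02 has no attendance)
```

A calendar table plays the role of daysOfMonth here: it supplies one row per day regardless of whether any attendance exists for it.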

Firestore creating a representation of a order form

I'm new at firebase/firestore. Appreciate your thoughts on this.
I'm trying to create an order form using Firestore.
I was wondering:
Do I need to create a YEAR, MONTH, DATE of collections and documents just to save all the order forms ?
This felt redundant to me, otherwise a lot of databases in Firestore will require a lot of collections and documents of dates!
E.g.:
(CAPS is Collection, bracketed are the documents)
YEAR (2018) -> MONTH (Jan, Feb, Mar, etc)
JAN -> 1 (order #1, order #2, order #3, etc)
JAN -> 2 (order #1, order #2, etc...)
Or just a simple time stamp on each document of order form - and then if I want to look up say for the month of January, is there a search query to look up all the documents in a specific month? or date?
And as order forms go, the user can enter as many items as possible. Is it correct to have each item stored as a new document? Therefore possibly creating 100+ documents per collection of order form.
E.g.:
order 1 collection - banana document, apple document, orange document
There is no single best data model. It all depends on the use-cases of your app. For a general introduction to this domain, I recommend reading NoSQL data modeling.
But in Firestore having a collection with many documents is not a concern. The database was made to scale to almost any number of documents.
You'll indeed want to store a timestamp in each order. The best way to do this is probably to use a server-side timestamp. You can then query for orders in a given date range with a query like this:
var now = new Date();
var yesterday = new Date(Date.now() - 24*60*60*1000);
ordersRef.where("timestamp", ">=", yesterday).where("timestamp", "<=", now)
Or
ordersRef.where("timestamp", ">=", new Date("2017-01-01")).where("timestamp", "<", new Date("2018-01-01"))
For more examples, see the Firestore documentation on queries.
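For the "all orders in a specific month" case, a small helper that computes the month boundaries may be handy. This is only a sketch: monthRange is a made-up name, and it assumes you want month boundaries in UTC rather than in the user's local time zone.

```javascript
// Start (inclusive) and end (exclusive) of a calendar month in UTC; month is 1-12.
function monthRange(year, month) {
  return {
    start: new Date(Date.UTC(year, month - 1, 1)),
    end: new Date(Date.UTC(year, month, 1)), // first instant of the next month
  };
}

const january = monthRange(2018, 1);
console.log(january.start.toISOString()); // 2018-01-01T00:00:00.000Z
console.log(january.end.toISOString());   // 2018-02-01T00:00:00.000Z

// With the same ordersRef as above, the query would then be roughly:
//   ordersRef.where("timestamp", ">=", january.start)
//            .where("timestamp", "<", january.end)
```

Using an exclusive upper bound avoids having to work out the last millisecond of the month.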

How to query data on weekly basis in MongoDB?

My actual documents are more complex than this but simplifying them like so will explain the problem I want to solve. I have daily and weekly documents.
Daily Document: ObjectId, Type, Count, Date
Weekly Documents: ObjectId, Type, Count, StartDate, EndDate
If I wanted a daily report I can run a query that will select documents with Date field value between range X to Y and Type equal to 'daily'. I can do the same thing for Weekly reports and it all works.
The problem:
For weekly reports if the start date is not the first day of the week and the end date is not exactly the last day of the week, selecting documents with Type 'weekly' will produce inaccurate reports since weekly documents store the data for the entire week. This may seem strange but Google Analytics lets you do it:
In the above screenshot Jul 3rd isn't the beginning of that week, nor is Jul 17th the end of that week. But Google Analytics lets you see the data as you want.
Possible Solution:
One possible solution is to produce a daily report for the overflowing days and subtract it from the weekly report.
The question:
Is there a nicer solution to solving the problem I described? I'm open to redesigning the documents
"For weekly reports if the start date is not the first day of the week and the end date is not exactly the last day of the week, selecting documents with Type 'weekly' will produce inaccurate reports since weekly documents store the data for the entire week. This may seem strange but Google Analytics lets you do it:"
Because they don't store documents like that.
The way this works (I reckon) is that Google just summarises the daily documents, ignores your "week" range and applies its own range, saying:
get as many full weeks as possible and aggregate the daily documents that
come under that range
and then:
Just throw the others on top, making each the end and start point
I wouldn't try to mix weekly and daily documents here. I would just query the daily documents only and aggregate client side.
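For what it's worth, the client-side aggregation could look roughly like this. It is only a sketch: it assumes the daily documents for the requested range have already been fetched, that each has a JavaScript Date in date and a number in count (lower-cased versions of the fields in the question), and that weeks start on Monday. The function names are made up.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// "YYYY-MM-DD" of the Monday (UTC) of the week that contains the given date.
function weekStart(date) {
  const midnight = Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate());
  const daysSinceMonday = (date.getUTCDay() + 6) % 7; // Monday -> 0 ... Sunday -> 6
  return new Date(midnight - daysSinceMonday * DAY_MS).toISOString().slice(0, 10);
}

// Sum daily counts per week. Weeks that are only partly inside the queried
// range simply end up with the days that were actually fetched.
function sumByWeek(dailyDocs) {
  const totals = new Map();
  for (const doc of dailyDocs) {
    const key = weekStart(doc.date);
    totals.set(key, (totals.get(key) || 0) + doc.count);
  }
  return [...totals].map(([week, count]) => ({ week, count }));
}

// Made-up daily documents for Wed 3 Jul to Tue 9 Jul 2019, 10 per day.
const docs = [3, 4, 5, 6, 7, 8, 9].map((day) => ({
  date: new Date(Date.UTC(2019, 6, day)),
  count: 10,
}));
for (const { week, count } of sumByWeek(docs)) {
  console.log(week, count);
}
// 2019-07-01 50   (partial week: Wed to Sun)
// 2019-07-08 20   (partial week: Mon and Tue)
```

Because only daily documents are involved, a range that starts or ends mid-week needs no special handling or subtraction.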