Aggregation framework for MongoDB - mongodb

I have the following schema:
Customer ID
Location Name
Time of Visit
The above stores information about every customer's visits to various locations.
I would like to know if there's a way to write an aggregate query in MongoDB that gives the total visitor count for each section of the day, per day and per location.
Sections of the day:
EDIT:
12 am - 8 am
8 am - 11 am
11 am - 1 pm
1 pm - 4 pm
4 pm - 8 pm
8 pm - 12 am
If a customer visits a location more than once on the same day and in the same section of the day, it should be counted just once. However, if that customer visits a location on the same day but in different sections of the day, it should be counted exactly once for each section of the day in which he appeared.
Example:
Customer 1 visits store A on day 1 at 9:30 AM
Customer 1 visits store A on day 1 at 10:30 PM
Customer 1 visits store B on day 2 at 9:30 AM
Customer 1 visits store B on day 2 at 11:30 AM
Customer 1 visits store B on day 2 at 2:45 PM
Customer 2 visits store A on day 1 at 9:45 AM
Customer 2 visits store B on day 1 at 11:00 AM
Customer 2 visits store B on day 2 at 9:45 AM
Final output of repeat visits:
Store A, Day 1, Section (00:00 - 08:00) : 0 Visitors
Store A, Day 1, Section (08:00 - 16:00) : 2 Visitors
Store A, Day 1, Section (16:00 - 24:00) : 1 Visitor
Store B, Day 2, Section (00:00 - 08:00) : 0 Visitors
Store B, Day 2, Section (08:00 - 16:00) : 2 Visitors
Store B, Day 2, Section (16:00 - 24:00) : 0 Visitors
Is there any way the above kind of query could be done using aggregation framework for MongoDB?

Yes, this can be done quite simply. It's very similar to the query that I describe in the answer to your previous question, but rather than aggregating by day, you need to aggregate by day-hour combinations.
To start with, rather than doing a group, you will need to project a new date field in which your "Time of Visit" field is transformed into the appropriate hour bucket. Let's look at one way to do it:
{ $project : {
    newDate : {
        y : { $year : "$tov" },
        m : { $month : "$tov" },
        d : { $dayOfMonth : "$tov" },
        h : { $subtract : [
            { $hour : "$tov" },
            { $mod : [ { $hour : "$tov" }, 8 ] }
        ] }
    },
    customerId : 1,
    locationId : 1
} }
As you can see, this generates year, month, day and hour, but the hour is rounded down to a multiple of 8 by subtracting the hour mod 8 (so you get 0, 8 (8 am), or 16 (4 pm)).
Next we can do the same steps we did before, but now we are aggregating to a different level of time granularity.
There are other ways of achieving the same thing, you can see some examples of date manipulation on my blog.
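The remaining steps are not spelled out in this answer. As a rough sketch, assuming they amount to de-duplicating visits per customer and then counting, the full pipeline might look like the following. The field names tov, customerId and locationId follow the $project stage above; the collection name "visits" and the output field names "section" and "visitors" are assumptions made for illustration.

```javascript
// Stage 1: bucket each visit into an 8-hour section (0, 8 or 16).
const projectStage = {
  $project: {
    newDate: {
      y: { $year: "$tov" },
      m: { $month: "$tov" },
      d: { $dayOfMonth: "$tov" },
      h: { $subtract: [{ $hour: "$tov" }, { $mod: [{ $hour: "$tov" }, 8] }] }
    },
    customerId: 1,
    locationId: 1
  }
};

// Stage 2: one document per (location, section, customer), so repeat
// visits by the same customer within a section collapse into one.
const dedupeStage = {
  $group: {
    _id: {
      locationId: "$locationId",
      section: "$newDate",
      customerId: "$customerId"
    }
  }
};

// Stage 3: count the distinct customers per (location, section).
const countStage = {
  $group: {
    _id: { locationId: "$_id.locationId", section: "$_id.section" },
    visitors: { $sum: 1 }
  }
};

const pipeline = [projectStage, dedupeStage, countStage];

// In the mongo shell (the collection name "visits" is hypothetical):
// db.visits.aggregate(pipeline)
```

Two caveats with this sketch: sections without any visit produce no output document, so the "0 Visitors" rows would have to be filled in by the client, and $hour works on the UTC value of the date unless a time zone is handled separately.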


Restart the order number count each day in dart

I am creating a Flutter app and my client wants me to generate customer order numbers starting from 1 each day.
So, Day 1: order Number 1, 2, 3 ... etc
Day 2: order number 1, 2, 3 ... etc
How can I accomplish this in Dart?
I would store the current date in a database field "dateField" and the highest order number of the day in another database field "orderNumberLast".
Before adding a new order, I would check whether "dateField" equals the current date.
If so, I would set orderNumberLast = orderNumberLast + 1.
If not, I would set "dateField" to the current date and orderNumberLast = 1.
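The check described above can be sketched as a small pure function. This is only an illustration, written in JavaScript rather than Dart; the function name nextOrderNumber and the shape of the state object are assumptions, and the day is taken in UTC to keep the example simple.

```javascript
// state mirrors the two stored fields: { dateField, orderNumberLast }.
// Returns the state to write back; orderNumberLast is the new order number.
function nextOrderNumber(state, now) {
  const today = now.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
  if (state.dateField === today) {
    // Same day: continue the running count.
    return { dateField: today, orderNumberLast: state.orderNumberLast + 1 };
  }
  // First order of a new day: restart at 1.
  return { dateField: today, orderNumberLast: 1 };
}

// Usage example with fixed timestamps.
let state = { dateField: null, orderNumberLast: 0 };
state = nextOrderNumber(state, new Date(Date.UTC(2024, 0, 1, 9)));  // day 1
state = nextOrderNumber(state, new Date(Date.UTC(2024, 0, 1, 17))); // day 1
const afterDayOne = state.orderNumberLast;
state = nextOrderNumber(state, new Date(Date.UTC(2024, 0, 2, 8)));  // day 2
const afterDayTwo = state.orderNumberLast;
```

If several devices can create orders at the same time, the read and the write would probably need to happen in a single database transaction so that two orders cannot receive the same number.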

convert year-month string into daily dates

Recently I asked how to convert calendar weeks into a list of dates and received a great and most helpful answer:
convert calendar weeks into daily dates
I tried to apply the above method to create a list of dates based on a column with "year - month". Alas, I cannot work out how to account for the different number of days in different months.
I also wonder whether the lubridate package 'automatically' takes leap years into account.
Sample data:
df <- data.frame(YearMonth = c("2016 - M02", "2016 - M06"), values = c(28,60))
M02 = February, M06 = June (M11 would mean November, etc.)
Desired result:
DateList Values
2016-02-01 1
2016-02-02 1
etc
2016-02-28 1
2016-06-01 2
etc
2016-06-30 2
Values would be something like
df$values / days_in_month()
Thanks a million in advance - it is honestly very much appreciated!
I'll leave the parsing of the line to you.
To find the last day of a month, assuming you have GNU date, you can do this:
year=2016
month=02
last_day=$(date -d "$year-$month-01 + 1 month - 1 day" +%d)
echo $last_day # => 29 -- oho, a leap year!
Then you can use a for loop to print out each day.
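For readers who are not working in a shell, the same two steps (find the last day of the month, then loop over the days) can be sketched in JavaScript. The function names daysInMonth and datesInMonth are made up for this example, and months are numbered 1 to 12.

```javascript
// Day 0 of the following month is the last day of the requested month.
function daysInMonth(year, month) {
  return new Date(Date.UTC(year, month, 0)).getUTCDate();
}

// Build the list of ISO dates ("YYYY-MM-DD") for one month.
function datesInMonth(year, month) {
  const mm = String(month).padStart(2, "0");
  const dates = [];
  for (let day = 1; day <= daysInMonth(year, month); day++) {
    dates.push(`${year}-${mm}-${String(day).padStart(2, "0")}`);
  }
  return dates;
}

const february = datesInMonth(2016, 2); // leap year, so 29 entries
const june = datesInMonth(2016, 6);     // 30 entries
```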
Thanks to answer 6 at "Add a month to a Date" and the answer to "how to extract number with leading 0", I got an idea for solving my own question using lubridate. It might not be the most elegant way, but it works.
Sample data (the steps below use the lubridate and data.table packages):
library(lubridate)
library(data.table)
data <- data.frame(mon=c("M11","M02"), year=c("2013","2014"), costs=c(200,300), stringsAsFactors=FALSE)
Step 1: create a column with the number of the month
temp2 <- gregexpr("[0-9]+", data$mon)
data$monN <- as.numeric(unlist(regmatches(data$mon, temp2)))
Step 2: from the year and the number of the month, create a column with the start date
data$StartDate <- as.Date(paste(as.numeric(data$year), formatC(data$monN, width=2, flag="0"), "01", sep="-"))
Step 3: create a column EndDate holding the last day of the month, based on StartDate
data$EndDate <- data$StartDate
day(data$EndDate) <- days_in_month(data$EndDate)
Step 4: apply the answer from "Apply seq.Date using two dataframe columns" to create the daily list for each month
data$id <- c(1:nrow(data))
dataL <- setDT(data)[, list(datelist=seq(StartDate, EndDate, by='1 day'), costs=costs/days_in_month(EndDate)), by=id]

mongodb indexable $and query possible?

I'm developing an app using a MongoDB database that needs to check for items enabled for today's particular weekday.
Items can be enabled for any individual days of the week. (eg: Monday and Wednesday, or Tuesday and Thursday and Saturday, every day, whatever)
I was going to do this:
var currentWeekDay = Math.pow(2,new Date().getDay());
Therefore
Sunday === 1
Monday === 2
Tuesday === 4
Wednesday === 8
...
Saturday === 64
An example item might be like this
{_id:'blah', weekDays:127}
Now I want to query all items that are enabled for today...
MongoDB has an operator $and, but that's only for logical operations.
It has $bitsAnySet, but it looks like it's only implemented as of 3.1.6.
https://jira.mongodb.org/browse/SERVER-3518
I'm running MongoDB v2.6.10.
So I'm wondering how to come up with a sensible indexable query.
Maybe
{_id:'blah', w0:1, w1:1, w2:1, w3:1, w4:1, w5:1, w6:1} //every day
{_id:'blah', w0:1, w1:0, w2:0, w3:0, w4:0, w5:0, w6:1} //Sat and Sun
That would be easily indexable. Can anyone think of a more terse way of doing it?
One option would be storing the days as an array of integers, using the same numbering as getDay() (Sunday = 0):
{ '_id' : '1' , 'weekDays' : [1,2,3,4,5] } // mon-fri
{ '_id' : '2' , 'weekDays' : [0,6] } // sat-sun
Then you could create a simple index on the weekDays field:
db.collection.createIndex({ weekDays : 1 })
And querying would also be pretty simple:
db.collection.find({weekDays : 3}) // wed
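If existing items already carry the bitmask from the question, a conversion along these lines could produce the array form. This is a sketch only; maskToWeekDays and todayQuery are hypothetical helper names, and days are numbered as getDay() numbers them (Sunday = 0).

```javascript
// Expand a bitmask (Sunday = 1, Monday = 2, ..., Saturday = 64)
// into an array of weekday numbers (Sunday = 0, ..., Saturday = 6).
function maskToWeekDays(mask) {
  const days = [];
  for (let day = 0; day < 7; day++) {
    if (mask & (1 << day)) {
      days.push(day);
    }
  }
  return days;
}

// Build the filter document for "enabled on the weekday of `now`".
function todayQuery(now) {
  return { weekDays: now.getDay() };
}

const everyDay = maskToWeekDays(127); // all seven bits set
const weekend = maskToWeekDays(65);   // Sunday (1) + Saturday (64)

// In the mongo shell the filter would be passed to find(), e.g.
// db.collection.find(todayQuery(new Date()))
```

Because an equality match on an array field matches any element, the single-field index on weekDays can serve this query.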

Qlikview - Data between dates; filter out data past or future data depending on selected date

I've seen threads where the document has Start Date and End Date "widgets" where users type in their dates, however, I'm looking for a dynamic solution, for example on the table below, when I select a date, say "1/1/2004", I only want to see active players (this would exclude Michael Jordan only).
Jersey#  Name            RookieYr  RetirementYr  Average PPG
23       Michael Jordan  1/1/1984  1/1/2003      24
33       Scotty Pippen   1/1/1987  1/1/2008      15
1        Derrick Rose    1/1/2008  1/1/9999      16
25       Vince Carter    1/1/1998  1/1/9999      18
The most flexible way is to IntervalMatch the RookieYr and RetirementYr dates into a table of all dates. See http://qlikviewcookbook.com/recipes/download-info/count-days-in-a-transaction-using-intervalmatch/ for a complete example.
Here's the interval match for your data. You can obviously create your calendar however you want.
STATS:
load * inline [
Jersey#, Name, RookieYr, RetirementYr, Average PPG
23, Michael Jordan, 1/1/1984, 1/1/2003, 24
33, Scotty Pippen, 1/1/1987, 1/1/2008, 15
1, Derrick Rose, 1/1/2008, 1/1/9999, 16
25, Vince Carter, 1/1/1998, 1/1/9999, 18
];
let zDateMin=37000;
let zDateMax=40000;
DATES:
LOAD
Date($(zDateMin) + IterNo() - 1) as [DATE],
year( Date($(zDateMin) + IterNo() - 1)) as YEAR,
month( Date($(zDateMin) + IterNo() - 1)) as MONTH
AUTOGENERATE 1
WHILE $(zDateMin)+IterNo()-1<= $(zDateMax);
INTERVAL:
IntervalMatch (DATE) load RookieYr, RetirementYr resident STATS;
left join (DATES) load * resident INTERVAL; drop table INTERVAL;
There's not much to it. You need to load two tables: one with the start and end dates and one with the calendar dates. Then you interval match the date field to the start and end fields, and from there it will work. The last join is just to tidy up a bit.
The result of all of that can be seen in the table viewer (Ctrl+T). Don't worry about the Syn key; it is required to maintain the interval matching.
Then you can have something like this.
Derrick Rose is also excluded, since he had not started by 1/1/2004.
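To make the matching rule concrete outside QlikView, here is a small JavaScript sketch of the condition that the interval match encodes: a player is active on a date when RookieYr <= date <= RetirementYr, with both ends included. The helper names parseDate and activeOn are made up for this illustration, and dates are assumed to be in month/day/year form as in the table.

```javascript
const players = [
  { name: "Michael Jordan", rookieYr: "1/1/1984", retirementYr: "1/1/2003" },
  { name: "Scotty Pippen", rookieYr: "1/1/1987", retirementYr: "1/1/2008" },
  { name: "Derrick Rose", rookieYr: "1/1/2008", retirementYr: "1/1/9999" },
  { name: "Vince Carter", rookieYr: "1/1/1998", retirementYr: "1/1/9999" }
];

// Parse "M/D/YYYY" into a UTC timestamp so dates compare as numbers.
function parseDate(text) {
  const [month, day, year] = text.split("/").map(Number);
  return Date.UTC(year, month - 1, day);
}

// Names of the players whose career interval contains the selected date.
function activeOn(list, selected) {
  const t = parseDate(selected);
  return list
    .filter(p => parseDate(p.rookieYr) <= t && t <= parseDate(p.retirementYr))
    .map(p => p.name);
}

const active2004 = activeOn(players, "1/1/2004");
```

For 1/1/2004 this leaves Scotty Pippen and Vince Carter, which agrees with the note above that Derrick Rose is excluded along with Michael Jordan.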

JPA query, find values for chosen members of groups of records

Suppose I have a table with some natural grouping and ordering, for example records by date, where the records for any given date are ordered by some other differentiator field
1 July, 1, 56.6
1 July, 2, 45.8
1 July, 3, 78.9
2 July, 1, 34.2
2 July, 2, 26.7
I want to select the records with the highest differentiator for each day, for example, to get at
1 July, 3, 78.9
2 July, 2, 26.7
in this simple case. I can't think how to structure a query to retrieve those records. So far I'm pulling back the whole set and selecting in Java, which is not really what I want to do.
Perhaps something like,
Select o from MyClass o where o.value = (Select Max(g.value) from MyClass g where g.date = o.date)