Unique hash/index for time interval - MongoDB

I am working on a simple resource booking app. The use of the resource is exclusive so it can't be booked more than once at the same time. I am wondering if this constraint can be enforced by a unique index instead of having to build validation in code.
The resource can only be booked in blocks of 30 minutes, and the start and end times must fall on the hour or the half hour. So a booking can be modeled as an array of unique blocks (dividing the timeline into 30-minute chunks).
Can anyone think of a way to hash this so that any booking with one or more 30-minute blocks in common would violate the unique index condition?
NB: I am using MongoDB (I don't think it really matters)

I am wondering if this constraint can be enforced by a unique index instead of having to build validation in code.
Use a unique compound index on the resource id, the day, and the 30-minute chunk of the day. Then insert one document for each 30-minute period of the reservation.
For example, to reserve resource id 123 on 6 September 2015 from 8:00 to 9:30 (chunks 16, 17 and 18, numbering the day's 30-minute periods from 0), you insert 3 documents:
> db.booking.createIndex({resource: 1, day: 1, period: 1}, {unique: true})
{
    resource: 123,
    day: ISODate("2015-09-06"),
    period: 16
},
{
    resource: 123,
    day: ISODate("2015-09-06"),
    period: 17
},
{
    resource: 123,
    day: ISODate("2015-09-06"),
    period: 18
}
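As an illustration, a minimal shell sketch of a safe reservation with this design (the one-by-one insert and rollback logic is my addition, not part of the original answer): each insert hits the unique index, and on an E11000 duplicate key error only the documents inserted by this reservation are removed.

var day = ISODate("2015-09-06"), periods = [16, 17, 18];
var inserted = [];
try {
    periods.forEach(function (p) {
        var res = db.booking.insertOne({resource: 123, day: day, period: p});
        inserted.push(res.insertedId);
    });
} catch (e) {
    // A period was already booked: undo our partial inserts and rethrow.
    db.booking.deleteMany({_id: {$in: inserted}});
    throw e;
}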
Depending on the number of entries, you might consider using embedded documents instead:
> db.resource.createIndex({_id: 1, "booking.day": 1, "booking.period": 1}, {unique: true})
And describe your resources like this:
{
    _id: 123,
    someOtherResourceAttributes: "...",
    booking: [
        {
            day: ISODate("2015-09-06"),
            period: 16
        },
        {
            day: ISODate("2015-09-06"),
            period: 17
        },
        {
            day: ISODate("2015-09-06"),
            period: 18
        }
    ]
}
This has the great advantage that the insert/update is atomic for the whole reservation. But beware that document size is limited to 16MB, and note that a unique index does not prevent duplicate values within a single document's array, so with this variant the conflict check must live in the update itself, as in the sketch below.
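For that reason, here is a hedged sketch of the embedded variant in which the update's query filter (not the index) performs the conflict check, atomically pushing the new periods only when none of them is already booked:

var day = ISODate("2015-09-06"), periods = [16, 17, 18];
var res = db.resource.updateOne(
    {
        _id: 123,
        // Match only if none of the requested periods is already booked.
        booking: {$not: {$elemMatch: {day: day, period: {$in: periods}}}}
    },
    {$push: {booking: {$each: periods.map(function (p) {
        return {day: day, period: p};
    })}}}
);
if (res.modifiedCount === 0) {
    // At least one requested period was already taken: reservation failed.
}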

Related

How Do I Structure MongoDB To Query By Date Time Per Minute?

I am trying to store stock prices per minute, so I can easily return results at per-minute intervals and keep historical data, allowing queries like the last 24 hours, the last 30 days, etc. (Please also let me know if this is the wrong approach.)
For example, if I check the current time with fmt.Println("time now: ", time.Now()) I get the following date time: 2022-01-29 11:47:02.398118591 +0000 UTC m=+499755.770119738.
What I want is to keep only minute-level precision, so I can store per minute.
That is, I would like to use this date time: 2022-01-29 11:47:00 +0000 UTC.
I would like to use UTC, so I can stick to one universal time zone to store and retrieve data.
Each row will be a list of multiple stock price entries.
Do I need to have the _id field? I am not sure, so I am just looking for best practice.
database name: "stock-price-db"
collection name: "stock-price"
I am thinking of something like this, just as an example:
[
    {
        "_id": ObjectId("5458b6ee09d76eb7326df3a4"),
        "2022-01-29 11:48:00 +0000 UTC": [
            {
                "stock": "TSLA",
                "price": "859.83",
                "marketcap": "8938289305"
            },
            {
                "stock": "AAPL",
                "price": "175.50",
                "marketcap": "3648289305"
            }
        ]
    },
    {
        "_id": ObjectId("5458b6ee09d76eb7326df3a5"),
        "2022-01-29 11:47:00 +0000 UTC": [
            {
                "stock": "TSLA",
                "price": "855.50",
                "marketcap": "8848289305"
            },
            {
                "stock": "AAPL",
                "price": "172.96",
                "marketcap": "3638289305"
            }
        ]
    }
]
First, is this the right way to store this type of data in MongoDB? And how do I structure the model so the data is stored per minute interval and can be queried per minute interval?
There are a few drawbacks in your design:
Do not use dynamic keys - you will end up needing a few extra aggregation stages to work around them.
Store the date in a static-key field, i.e. time: ISODate(...) (see the sketch after this list).
Better to store the full timestamp, down to milliseconds; it will help to handle future requirement changes.
If there are too many stock changes, it is not a scalable design.
If you want to find historical data for one stock, the proposed design may have performance issues.
You will end up with issues in sharding.
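To make that concrete, here is a hedged sketch of the reshaped collection - one document per stock per minute (the database and collection names come from the question; storing price and marketcap as numbers rather than strings is my assumption):

var sp = db.getSiblingDB("stock-price-db").getCollection("stock-price");

// One document per stock per minute, with a static time key.
sp.insertOne({
    time: ISODate("2022-01-29T11:47:00Z"),
    stock: "TSLA",
    price: 855.50,              // numeric, so it can be compared and averaged
    marketcap: 8848289305
});

// Index to serve per-stock, per-interval queries.
sp.createIndex({stock: 1, time: -1});

// Example: the last 24 hours for one stock, newest first.
var dayAgo = new Date(Date.now() - 24 * 60 * 60 * 1000);
sp.find({stock: "TSLA", time: {$gte: dayAgo}}).sort({time: -1});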
What are the alternatives?
Not all use-cases can be solved by one design.
If this is purely a time-series use case, I would recommend a time-series design or a time-series database, e.g. InfluxDB or another TSDB (a MongoDB option is sketched below).
If you need to cover all the use-cases, normalise and use GQL.
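If staying on MongoDB is an option, its native time series collections (MongoDB 5.0+) implement the same static-time-field idea; a minimal sketch, reusing the collection name from the question:

db.createCollection("stock-price", {
    timeseries: {
        timeField: "time",       // the static-key date field
        metaField: "stock",      // identifies the series
        granularity: "minutes"
    }
});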

MongoDB $group dynamic expression

I have a set of logs with timestamps and need to group those logs into some non-existent 'virtual sessions'.
A new session begins when there is more than half an hour between the last log of the previous session and the first log of the current one.
For example, we have the following set of data:
[
    {
        id: "b4f0d0d7-495b-48db-95bf-d5ac0c8c9e9b",
        time: 1461872894322,
        timestamp: "Apr 28, 2016 7:48:14 PM"
    },
    {
        id: "bf55ca2f-b544-406c-bed6-766a1204683d",
        time: 1461872937941,
        timestamp: "Apr 28, 2016 7:48:57 PM"
    },
    {
        id: "7f2ab420-0434-46f8-9444-6e2ffa73aea8",
        time: 1461873088155,
        timestamp: "Apr 28, 2016 7:51:28 PM"
    },
    {
        id: "dd31124c-0375-454a-acca-c239465a2b22",
        time: 1461839257257,
        timestamp: "Apr 28, 2016 10:27:37 AM"
    },
    {
        id: "a4370974-bfea-408f-aa69-973961e9f058",
        time: 1461839281324,
        timestamp: "Apr 28, 2016 10:28:01 AM"
    }
]
It should be grouped into two virtual sessions. As a result of the grouping I want to get the min and max time for each group via Mongo's aggregate $group, but how do I write the correct expression?
The expected answer is something like:
[
{min: 1461872894322, max: 1461873088155},
{min: 1461839257257, max: 1461839281324}
]
Unfortunately there is no way to do this in a Mongo query, as there is no way to refer to the previous document (unlike CTEs, common table expressions, in SQL).
To solve this problem you need to process the data client side (or with JavaScript in the Mongo console - like a stored procedure in the SQL world): iterate over all documents, check for the time gap, and add a grouping indicator to the collection.
Then you will be able to group by the added indicator, as in the sketch below.
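A minimal shell sketch of that two-pass approach (assuming the collection is named logs and the gap is measured on the numeric time field):

var GAP = 30 * 60 * 1000;   // half an hour in milliseconds
var session = 0, prev = null;
db.logs.find().sort({time: 1}).forEach(function (doc) {
    if (prev !== null && doc.time - prev > GAP) {
        session++;          // gap found: a new virtual session starts here
    }
    db.logs.updateOne({_id: doc._id}, {$set: {session: session}});
    prev = doc.time;
});

// With the grouping indicator in place, $group becomes trivial:
db.logs.aggregate([
    {$group: {_id: "$session", min: {$min: "$time"}, max: {$max: "$time"}}}
]);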
I was thinking of using $let as it can access an external variable - but that access is read-only, so we cannot rely on it.
Have fun!
Any comments welcome.

MongoDB schema design for objects with several dates

I'm building an event website. There are 2 types of events:
Events with specific dates and times. For example, a theatre show can have performances on Jan 10 at 8pm, Jan 11 at 8pm and Jan 13 at 7pm.
Events which are open for a range of hours over several days. For example, an exhibition at a museum can be open from Jan 10 to Jan 30, from 10am to 6pm.
I need to save the dates and times so that I can answer the following questions/queries:
Which events are going to happen tomorrow from 7pm to 12am?
Which events are going to happen this weekend?
Which events are about to finish? (the last day is less than one week away)
If we didn't have events of type 2, we could use the following schema:
name
category
dates: an array of dates (each day would be on the array)
But because we have events of type 2, it has to be different. I thought of having:
name
category
dates: an array of objects like {"2015-01-10 09:00": "2015-01-10 18:00"} with the range of hours of each day.
But I think it's not possible to write a query that solves Question 1 with this schema. Am I wrong?
How would you structure the data so I could answer those three questions?
Thanks!
It was easier than I thought.
First, note that in MongoDB you can't use dates as keys.
The model is:
{
    "name": "Bob Dylan",
    "category": "Exhibition",
    "dates": [
        {
            "init": ISODate("2015-01-08T08:00:00Z"),
            "end": ISODate("2015-01-08T19:00:00Z")
        },
        {
            "init": ISODate("2015-01-09T08:00:00Z"),
            "end": ISODate("2015-01-09T21:00:00Z")
        },
        {
            "init": ISODate("2015-01-10T08:00:00Z"),
            "end": ISODate("2015-01-10T21:00:00Z")
        }
    ],
    "createdAt": ISODate("2015-01-09T16:33:51.338Z")
}
And the query is:
return Events.find({
    'dates.init': {$gte: dateInit},
    'dates.end': {$lte: dateEndPlusOneDay}
});
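One caveat worth adding: without $elemMatch, the two conditions may be satisfied by different elements of the dates array, so an event could match even though no single day covers the window. To require one array element to satisfy both bounds, the query can be written as:

return Events.find({
    dates: {
        $elemMatch: {
            init: {$gte: dateInit},
            end: {$lte: dateEndPlusOneDay}
        }
    }
});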

Why does FB Insights API return different values but the same dates when segmenting by days_28 / week / day / lifetime?

These 3 API calls all return values for essentially the same dates (Jan 1st to Jan 30th).
/50813163906/insights/page_impressions_paid_unique/week?since=1388552400&until=1391144400
/50813163906/insights/page_impressions_paid_unique/day?since=1388552400&until=1391144400
/50813163906/insights/page_impressions_paid_unique/days_28?since=1388552400&until=1391144400
However the values for each date are hugely different.
/week gives
{value: 635756, end_time: "2014-01-01"}, {value: 479251, end_time: "2014-01-02"}, {value: 396633, end_time: "2014-01-03"}...
/day gives
{value: 110598, end_time: "2014-01-01"}, {value: 458, end_time: "2014-01-02"}, {value: 4, end_time: "2014-01-03"}...
/days_28 gives
{value: 411634, end_time: "2014-01-01"}, {value: 407725, end_time: "2014-01-02"}, {value: 403430, end_time: "2014-01-03"}...
What are these date segments supposed to total up, and from when to when?
I'm pretty sure that the values given are totals for the end date, dependent on your segmenting.
For example, the segment /week returns:
{value: 635756,end_time: "2014-01-01"}
{value: 479251,end_time: "2014-01-02"}
{value: 396633,end_time: "2014-01-03"}
Which means:
For the 7 days prior to & ending on *2014-01-01* there were 635756 impressions
For the 7 days prior to & ending on *2014-01-02* there were 479251 impressions
For the 7 days prior to & ending on *2014-01-03* there were 396633 impressions
The segment /day returns:
{value: 110598, end_time: "2014-01-01"}
{value: 458, end_time: "2014-01-02"}
{value: 4, end_time: "2014-01-03"}
Which means:
For the day *2014-01-01* there were 110598 impressions
For the day *2014-01-02* there were 458 impressions
For the day *2014-01-03* there were 4 impressions
The segment /days_28 returns:
{value: 411634, end_time: "2014-01-01"}
{value: 407725, end_time: "2014-01-02"}
{value: 403430, end_time: "2014-01-03"}
Which means:
For the 28 days prior to & ending on *2014-01-01* there were 411634 impressions
For the 28 days prior to & ending on *2014-01-02* there were 407725 impressions
For the 28 days prior to & ending on *2014-01-03* there were 403430 impressions
These numbers look about right, but your 28-day numbers are smaller than your week numbers, which is strange. Maybe that has something to do with the since and until limits that you're putting on the GET request.
If you want the true numbers for a month, it's probably best to query by day, take the individual values, and use the paging next/previous values returned as part of the results to navigate backwards until you reach the first of the month, as sketched below.
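As an illustration only, a sketch of that day-by-day walk (the monthTotal name and its parameters are hypothetical; the response shape with data[0].values and paging.previous follows the Graph API's documented insights format, and fetch assumes Node 18+ or a browser):

async function monthTotal(startUrl, firstOfMonth /* e.g. "2014-01-01" */) {
    var url = startUrl, total = 0, done = false;
    while (url && !done) {
        var body = await (await fetch(url)).json();
        var values = (body.data[0] && body.data[0].values) || [];
        values.forEach(function (v) {
            if (v.end_time.slice(0, 10) >= firstOfMonth) total += v.value;
            else done = true;   // paged back past the 1st: stop walking
        });
        url = body.paging && body.paging.previous;
    }
    return total;
}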

Designing a database for querying by dates?

For example, I need to query the last 6 months by the earliest instance for each day (and the last day by the earliest instance in each hour, and the last day by minutes). I was thinking of using MongoDB and having a nested structure like:
{ year: { month: { day: { hour: { minute: [array of seconds] } } } } }
But to get the first instance I would have to sort the array, which is costly. Is there an easier way?
It would be better just to have a date field.
And the query would be something like:
find({date: {$gt: startingDate, $lt: endingDate}})
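Building on that, a sketch of the "earliest instance per day over the last 6 months" query against a plain date field (the events collection name is assumed; $dateToString needs MongoDB 3.0+):

var sixMonthsAgo = new Date();
sixMonthsAgo.setMonth(sixMonthsAgo.getMonth() - 6);

db.events.aggregate([
    {$match: {date: {$gte: sixMonthsAgo}}},
    // Bucket by calendar day and keep the earliest timestamp per bucket.
    {$group: {
        _id: {$dateToString: {format: "%Y-%m-%d", date: "$date"}},
        earliest: {$min: "$date"}
    }},
    {$sort: {_id: 1}}
]);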