How to deal with large ICS files? - icalendar

My application (php/laravel, but irrelevant here) holds calendar entries for its users (comparable to a car logbook), and some users want to sync those events to their calendar app of choice. I started looking into the ics standard (RFC 5545 etc.) and created an endpoint that generates those files.
Problem: The files are getting huge. Some users have their entire driving history in the application, with hundreds or even thousands of entries. Generating and transferring those MBs of ICS files will take ages (using php, anyway), let alone doing that every time the calendar app tries to sync.
Question(s): What is the preferred way of dealing with huge ICS documents? HTTP headers and caching are one thing, but how do other people solve this problem? Just send events of the last year? Is there a (pagination?) spec that I haven't found yet?

This is historical data, so it is not going to change. You could offer batches by time period and cache the historical batches. Last year's data, or anything older than the last 4 weeks, never gets updated, for example. Users would do a one-off import of each historical batch into a separate 'driving history' calendar. No more subscribing? Or maybe they can only subscribe to the last month, say?
One cannot import and subscribe into the same calendar, so it does mean they would have at least 2 calendars: 1 historical calendar used for imports, and 1 'current' calendar that updates with yesterday's ride. Of course, that means manual effort for anyone who always wants to have the old data: as events fall off the 'current' calendar, at some point they'd have to go and import the latest 'old' events.
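The batching idea could be sketched roughly like this. It is only an illustration: the function names, the (date, uid) input shape, and the choice of whole calendar months as the batch unit are my assumptions, not something any calendar client requires. Months that lie entirely before the cutoff are treated as frozen, and a stable ETag per batch lets the server answer repeat requests from cache.

```python
import hashlib
from collections import defaultdict
from datetime import date, timedelta


def split_into_batches(events, today, current_window_days=28):
    """Split (event_date, uid) pairs into frozen monthly batches and a 'current' batch.

    Only whole months before the cutoff are frozen, so a frozen batch
    never gains events later on (assuming old entries are not edited).
    """
    # Hypothetical rule: go back the window, then snap to the first of that month.
    cutoff = (today - timedelta(days=current_window_days)).replace(day=1)
    historical = defaultdict(list)
    current = []
    for event_date, uid in events:
        if event_date < cutoff:
            historical[(event_date.year, event_date.month)].append(uid)
        else:
            current.append(uid)
    return dict(historical), current


def batch_etag(uids):
    """Derive a quoted ETag value from the UIDs in a batch, independent of order."""
    digest = hashlib.sha256("\n".join(sorted(uids)).encode("utf-8")).hexdigest()
    return f'"{digest[:16]}"'
```

The frozen batches can be rendered to ICS once and stored; only the 'current' batch has to be regenerated when the calendar app syncs.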

Related

How to setup an ICS file with recurring events that have lapses?

I've hit a roadblock in trying to figure out the best way to create an ICS file that will have recurring events (and single instance events). My main issue is with recurring events, because a lot of times the recurring events could have a break in between them. Given various articles I have read about ICS file size limitations, Google only accepting 1,111 events in total for a file, what are my options for formatting the event data to accurately represent the events while maximizing the events I can fit into the file?
So far I've come up with:
Option 1: A single VEVENT for each group of recurring events that have no lapse in recurrence.
Option 2: Separate VEVENTS for every single instance of recurring events.
Option 2 seems like the most accurate representation of events, however, that would use up data quickly.
Are these basically my two options?
If your lapses only involve a small number of skipped instances, then EXDATE may work for you. You could generate the next x years of EXDATEs perhaps.
https://www.rfc-editor.org/rfc/rfc2445#section-4.8.5.1
Unfortunately for you, EXRULE is deprecated, and application support for it may not be there.
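To make the EXDATE suggestion concrete, here is a minimal sketch that builds one weekly VEVENT and lists the skipped starts in a single EXDATE line. The function name and parameters are made up for illustration, all times are assumed to be naive datetimes meant as UTC, and the sketch does not fold long lines, which a real generator should do.

```python
from datetime import datetime


def weekly_event_with_gaps(uid, summary, first_start, count, skipped, created):
    """Return one VEVENT that recurs weekly, with skipped starts listed in EXDATE.

    Assumption: every datetime is naive and interpreted as UTC.
    """
    fmt = "%Y%m%dT%H%M%SZ"
    lines = [
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{created.strftime(fmt)}",
        f"DTSTART:{first_start.strftime(fmt)}",
        f"RRULE:FREQ=WEEKLY;COUNT={count}",
    ]
    if skipped:
        # EXDATE values must match the start times of the instances they remove.
        lines.append("EXDATE:" + ",".join(d.strftime(fmt) for d in skipped))
    lines += [f"SUMMARY:{summary}", "END:VEVENT"]
    # iCalendar content lines end with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

This keeps a recurring series with a few lapses in a single VEVENT instead of one VEVENT per instance, which helps with per-file event limits. Note that COUNT applies to the rule before exclusions, so the number of visible instances is COUNT minus the number of EXDATEs.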

Fetching 1 minute bars from Yahoo Finance

I'm trying to download 1 minute historical stock prices from Yahoo Finance, both for the current day and the previous ones.
Yahoo (just like Google) supports up to 15 days worth of data, using the following API query:
http://chartapi.finance.yahoo.com/instrument/1.0/AAPL/chartdata;type=quote;range=1d/csv
The thing is that data keeps on changing even when the markets are closed! Try refreshing every minute or so and some minute bars change, even from the beginning of the session.
Another interesting thing is that all of these queries return slightly different data for the same bars:
http://chartapi.finance.yahoo.com/instrument/2.0/AAPL/chartdata;type=quote;range=1d/csv
Replace the version number in the path (2.0 above) with 100000 and it will still work but return slightly different data.
Does anyone understand this?
Is there a modern YQL query that can fetch historical minute data instead of this API?
Thanks!
Historical minute data is not as easily accessible as we all would like. I have found that the most affordable way to gather intraday stock price data is to develop automated scripts that log price information whenever the markets are open.
Similar to the Yahoo data URL that you shared, Bloomberg maintains 1-day intraday price information in JSON format like this: https://www.bloomberg.com/markets/api/bulk-time-series/price/AAPL%3AUS?timeFrame=1_DAY
The URL convention appears easy to input on your own once you have a list of Ticker Symbols and an understanding of the consistent syntax.
To arrive at that URL initially though, without having any idea for guessing / reverse-engineering it, I simply went here https://www.bloomberg.com/quote/AAPL:US and used Developer Tools on my browser and tracked a background GET request which led me to that URL. I wouldn't be surprised if you could employ similar methods on other Price Data-related websites.
You can also write scripts to track price data as fast as your internet goes. One Python package that I find pretty handy is ystockquote.
You can have it request price data every couple of seconds and log that into a daily time series database.
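A rough sketch of such a logging script is below. I am deliberately not showing any particular package's calls: `fetch_price` is a hypothetical callable you would supply yourself (wrapping ystockquote or any other source), and the table name and columns are just assumptions for the example.

```python
import sqlite3
import time
from datetime import datetime, timezone


def log_prices(conn, symbols, fetch_price, polls, interval_seconds=5.0, sleep=time.sleep):
    """Poll fetch_price(symbol) a fixed number of times and append rows to a ticks table.

    fetch_price is a placeholder for whatever quote source you use.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS ticks (ts TEXT, symbol TEXT, price REAL)")
    for i in range(polls):
        # One UTC timestamp per polling round, shared by all symbols in that round.
        now = datetime.now(timezone.utc).isoformat()
        for symbol in symbols:
            conn.execute(
                "INSERT INTO ticks VALUES (?, ?, ?)",
                (now, symbol, fetch_price(symbol)),
            )
        conn.commit()
        if i < polls - 1:
            sleep(interval_seconds)
```

In practice you would open a file-backed database with `sqlite3.connect(...)`, start the script when the market opens, and pick `polls` and `interval_seconds` to cover the session.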
Yes, there are other APIs.
I don't know if it can still help, but if you need intraday data, there is an API on RapidAPI called Quotient which lets you pull intraday (at 1-min level) and EOD market data for FX, crypto, stocks (US, Canadian, UK, Australian, European), ETFs and futures. It also provides earnings, dividends, splits and a lot of other information.

redmine: sending alert if user does not enter hours

So we have this distributed team who are working on a project and whose hours/progress is being monitored using redmine. All the guys are really talented and hardworking but pretty bad when it comes to updating their daily tasks/progress using redmine. This makes it very difficult for our project managers to understand and steer the progress as well as the upper management to get a quick overview of where we stand with various development initiatives.
Things have become so bad that I have been tasked to set up an email alert such that every night, say at 12:00 midnight, an email goes out to everyone on an email list with the date and names of users who have not updated their hours for that day. The management hopes that this exercise will instil in them the discipline to update their tickets on a daily basis.
My question is: Is this possible in redmine? Is there any API, or an ad hoc way of sending out emails based on a custom query? I have not worked with redmine before and have no idea how to go about this.
If there is anyone with prior experience, I would be very grateful to get some directions!
I send some daily reminders to our redmine users, to help ensure that issues don't slip through the cracks in our workflow. I skip the API and just write perl scripts that connect directly to the database, scheduled via cron. The database is well designed and easy to understand: my SQL skills are very basic, and I've always been able to pretty easily hack out a query that gets what I need.
Some thoughts:
The end of "today" might be a relative concept if your team is worldwide. You could run your script hourly and base the reminders on users' time zones.
You might want to handle holidays and vacations, so that your users don't get nagged on their days off.
We use custom roles in redmine to control some of the emails. (We have a "new issue watcher" who gets triage mails in each project.) You could do the same thing to let certain users opt in or out of the time-tracking nag mails.
If you're interested, here's a link to one of my reminder scripts:
http://joecullin.com/redmine_scripts/redmine_reminders
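For the time zone point above, an hourly job could decide who is due for a reminder along these lines. This is only a sketch: the function name and input shapes are hypothetical, and in a real script the logged hours would come from a query against the redmine database rather than an in-memory set.

```python
from datetime import datetime, timedelta, timezone


def users_to_remind(user_zones, logged, now_utc, reminder_hour=23):
    """Return logins whose local time is in the reminder hour and who logged nothing today.

    user_zones: login -> tzinfo (for example zoneinfo.ZoneInfo("Europe/Berlin"))
    logged:     set of (login, local_date) pairs that already have hours entered
    now_utc:    timezone-aware datetime for the current run
    """
    due = []
    for login, zone in user_zones.items():
        local_now = now_utc.astimezone(zone)
        # "Today" is judged in the user's own zone, not the server's.
        if local_now.hour == reminder_hour and (login, local_now.date()) not in logged:
            due.append(login)
    return sorted(due)
```

Run from cron once an hour, each user gets at most one mail per day, near the end of their own working day. Holidays and vacations could be handled by skipping dates before the check.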

ICal Format - One time import or RSS like functionality?

I'd like to publish a list of events in the iCal format. If I have a public ICS file that I add/update events on, will clients like Google Reader and Outlook receive those updates automatically? I.e., does it behave like an RSS feed that gets periodically pinged for changes or is it a one time import?
One time import. See Wikipedia, it basically represents a mechanism for sending one or more appointments.
If the iCal is published and updated, and your client can be set to periodically refresh, you may get the effect of a subscription to an RSS, but really it is a series of imports.
It depends on how the client is configured. For instance, in Korganizer I often do one-time imports (e.g. departure times when I book a flight). But I have other ICS files (e.g. organizational calendars) where I set it to refresh periodically.

Third party data delivery of lots of data

Does anyone know how sites that have a real-time feed of a lot of data work? I am referring to something like a stock site, where they can tell you in real time (well, 20 minute delay mostly, but still real-time - 20 minutes as I understand it).
They have thousands of data pieces delivered to them every second, I would imagine: MSFT 25.00 +.23 VOL 12000 ???? for each stock that had a change during some interval.
So, is there just a constant feed of small pushes going on? Or do you think a site will pull from the place that has the real data and say "give me all changes since 12:23:45 CST to now" type query?
I ask this because at work we might have a situation where we need to have at our application's fingertips real time information like this, and it won't make sense to hit our third party provider over and over and over again every second...
Generally there is a server/client protocol defined between the 2 parties. In the company I work for the connection is maintained at all times.
Here is info on real-time data feeds to go with your stock example: NYSE, NASDAQ.
It is common for data providers to also have FTP sites with (delayed) batched data. One that comes to mind is the NWS EMWIN
Sites like Twitter feed data to certain approved sites in real-time via XMPP.
In the broadest terms, a push model is going to be the best way of achieving "real time" transfer, particularly if you're talking about a large amount of data.
However, with a purely push model you always have the problem of how to recover from missed data.
Depending on the nature of your data that may not be a problem (thinking of video delivery as an analogue, where the amount of data is huge but there is sufficient redundancy for it to recover from missing data). And if you have any control over the data you may be able to build some redundancy in. For example, on every change event you can provide absolute values rather than changes, or previous value and new value.
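As a small illustration of why absolute values help, here is a sketch of a receiver for pushed quotes. The sequence numbers are my own assumption (the feed would have to provide them); they let the receiver notice what it missed, while the absolute price in each message means the latest state is still correct after a gap.

```python
class QuoteBook:
    """Keeps the latest absolute price per symbol and records gaps in the feed."""

    def __init__(self):
        self.prices = {}
        self.last_seq = 0
        self.gaps = []  # inclusive (first_missing, last_missing) ranges

    def apply(self, seq, symbol, price):
        """Apply one pushed message; return False if it is a duplicate or arrives late."""
        if seq <= self.last_seq:
            return False
        if seq > self.last_seq + 1:
            # Messages were missed; remember the range so it can be re-requested.
            self.gaps.append((self.last_seq + 1, seq - 1))
        # The message carries the absolute price, so no earlier message is needed.
        self.prices[symbol] = price
        self.last_seq = seq
        return True
```

With deltas instead of absolute prices, every recorded gap would leave the stored price wrong until the missing messages were replayed. Here the gap list is only needed if you want the full history.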
I've done this by attempting to retrieve the stock quote from the source, and falling back to a timestamped on-disk cache of the quote when the main source fails or times out.
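A minimal sketch of that fallback is below, under some assumptions of mine: `fetch` is a hypothetical callable that raises on failure or timeout, the cache is one JSON file per symbol, and a cached quote older than `max_age_seconds` is considered too stale to serve.

```python
import json
import time
from pathlib import Path


def get_quote(symbol, fetch, cache_dir, max_age_seconds=1200, now=time.time):
    """Return (price, timestamp, from_cache), preferring the live source.

    fetch(symbol) stands in for the real provider call and is expected
    to raise an exception when the source fails or times out.
    """
    cache_file = Path(cache_dir) / f"{symbol}.json"
    try:
        price = fetch(symbol)
    except Exception:
        # Source is down: serve the cached quote if one exists and is fresh enough.
        if not cache_file.exists():
            raise
        cached = json.loads(cache_file.read_text())
        if now() - cached["ts"] > max_age_seconds:
            raise
        return cached["price"], cached["ts"], True
    ts = now()
    # Store the quote with the time it was fetched so staleness can be judged later.
    cache_file.write_text(json.dumps({"price": price, "ts": ts}))
    return price, ts, False
```

Returning the timestamp and a `from_cache` flag lets the caller show the user how old the quote is instead of silently presenting cached data as live.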