Advice on a paid for streaming solution please - streaming

I have a potential project for which the client has asked for the ability for people to visit a website and pay to hear an audio file.
They have an attraction and would like people to pay a nominal amount (£1) for them to listen to a short story as they wander round.
My thoughts were to have some sort of paywall streaming service integrated into his website so that they have no issues over hosting/bandwidth.
My research, however is drawing blanks. Most streaming services are set up for video, radio stations, or the fees involved wouldn't be worth paying at £1 a shot.
Both he client and myself are only expecting a few hundred streams every year, and the fee is really just to raise funds for local charity, so this is a pretty low level operation.
Does anyone have any advice in this area please?

If the total income will be just several hundred pounds per year, then I suspect it might be easier and even more profitable to just have a donations bucket at the venue.
If you do want to have some sort of online payment, then for the type of low monetary value content, i.e. people are unlikely to put the time and effort into circumventing the payment or pirating the content, a simple paywall before exposing the link to the audio may be fine.
There are also many open source audio streaming solutions and audio bandwidth is actually quite low by todays standards so actually hosting the audio may be easier that you expect. A goos list of servers is here:
https://en.wikipedia.org/wiki/List_of_streaming_media_systems

You don't need anything special for a pre-recorded audio file streaming. Any HTTP server would do just fine.
I would handle this by uploading your media file to an S3 bucket, with no public read permissions. Then, once someone has paid, use a key pair that does have read permissions to sign a URL so that the client can access it. Then, just play that file client-side in a normal <audio> tag like anything else.

Related

Audio streaming from Google Cloud Storage and CDNs - costs

So I'm making an app that involves streaming audio(radio-like) from the Google Cloud Storage and was looking into the costs. It seems it would be much too expensive as is.
e.g. Lets say I have 10MB audio files, a user listens to 20 files a day and I have 2000 active users. That's 400GBs or $48/day. i.e. ~$1440/month just for that.
I then looked into putting a CDN in front of it, to minimize direct reads from the Storage. Now initially that made sense to me. The CDN would cache the audio files and the clients would be getting the files from the cache most of the time. However, as I was looking at Fastly's pricing (Fastly is a Google partner and seems like a good fit) I noticed that they seem to be pricing bandwidth usage to their cache at the exact same rate as Google cloud does ($0.12/GB). So unless I'm reading this wrong, putting up the CDN would not save me ANY money. Now I get that there are other reasons why putting a CDN in front of it could be a good idea, but am I really reading this right?
Also, if you have any other tips on how I should set this up, I'm all ears.
Estimating the invoice of such a service is a complex matter. To get an informed answer and tips regarding possible implementation paths I would suggest reaching out to a GCP Sales representative. Similarly you should contact the Fastly team to get a precise picture of their pricing model.
Moreover, any estimate we could make here would be outdated as soon as any of the respective pricing model changes, which would invalidate the answer and probably drive future readers to wrong conclusions.

How can I search for past sent emails with Sendgrid?

As Sendgrid's documentation makes clear, their web GUI activity page is only searchable for the past 7 days.
How do I search for activity from farther in the past?
Web API documentation is here, but I can't find anything about just plain searching for info on sent emails. All I see are endpoints for seeing particular categories of emails' various fates, like blocks, bounces, invalid emails, and "filters", which seem like actions and not like filters.
It's got to be possible to just find info about some particular sent email, right?
It's not possible. As you noted, the documentation clearly states that:
Email activity only shows the most recent 7 days. To access data in
real time, we recommend that you consider implementing our Event
Webhook.
If you want to record all the history associated with your account you should record and save it yourself. You can record all the emails you send provided you have an endpoint to do so. See here: https://sendgrid.com/docs/User_Guide/Settings/parse.html
Later Edit:
"real time" means "as it happens", it does not mean "history searchable at any point in time".
When you use an API, as a developer, the responsibility to log all API calls and responses lies with you. While it's true that bounces aren't necessarily reported in the API call response, the SendGrid API offers several ways in which you can be notified. Personal opinion: I know this functionality is often omitted in the MVP because you need to go to market as soon as possible, but an ELK stack is not that hard to set up.
There are several ways you can look for bounces and other events as you can see here: https://sendgrid.com/docs/Classroom/Track/Bounces/bounce_reports_how_can_i_be_notified.html
Webhook for events: http://sendgrid.com/docs/API_Reference/Webhooks/event.html
Enabling Bounce Forwarding on your account
Bounce API: https://sendgrid.com/docs/API_Reference/Web_API_v3/bounces.html
If you really need to find out what happened on day X with email send Y, you can contact their Support team. They can probably look it up for you.
Personal opinion:
That 7 days is not a random number. I'm willing to bet that SendGrid does in fact log all calls you made but it can't provide them for an earlier time. When you use Facebook API, Twitter API, etc. You don't expect them to provide you with historical data of every API call you made. This is an ungodly amount of data. We're talking about an API that is used to send probably upwards of millions of emails per day, maybe even more. I believe they actually did the math and recalling historical data from earlier would put an unnecessary strain on the system, it would take a long time to answer such a request.
I'm sorry if I went on a bit of a rant but people often don't think about the volume of data needed to store such things and how much it would cost to search it.

Avoiding data loss: suggested reading

I am about to work on an app which handles extremely valuable data. Any loss of this data for the user would be very costly, so I'm interested in finding out more about the best architecture design for our needs.
The user will be inputting this data in their iPhone each day. The alternative to using this app is carrying around a piece of paper with this sensitive information on it. So while I know we can be more secure than a piece of paper, I want to make sure we also cover the user stories like "I flushed my phone down the toilet" or "my son deleted the app, where's my data?"
A service like Dropbox comes to mind, but I wouldn't want to require our users to have a Dropbox account; the syncing architecture must be transparent to the user. iCloud is out because web and Android versions may follow.
Can anyone suggest either some good reading on this subject, or some good frameworks to look at? I expect to use a node.js backend, and while we are targeting iPhone first, Android will follow.
The data itself consists of 2 tables, each with a small number of fields, with a many to many relationship. A few new rows will be created by the user each day, but the data will be small and highly compressible.
Turns out this is an extremely difficult issue. In data assurance (this isnt yet a security type situation although could become one because of the assurance aspect) there is ALWAYS a time element. As a simple example what happens if your use has locally updated some piece of data. Just before you have the ability to fully push the data to some cloud service, etc... he / she dumps it in the toilet. Even if good signal was there for transmitting the data there is time in transferring and time necessary for the cloud server to respond saying the data got there properly.
Generally in data assurance, you really have to work to the best you can. You will NEVER be able to solve all issues as there is no data center, nor link to a data center, etc... that is perfect. There is always a chance of data loss. Truly the best you can do, is SYNC as fast as data changes, and if there is loss of connection, as soon as the connection becomes alive again.
Now, for security. Security by itself does not create assurance. If the data itself is something that the customer does not want to lose, and that is his only requirement, then security is un-necessary. If he / she is also worried about other getting their hands on his data, then you have to be worried about data-in-transit (both up and down during syncing), and on the device itself. For the best potential security, encrypt the data locally on the device prior to pushing over the cloud. There are many known attacks that even if using SSL or other services, can get at the data. If you wish, locally encrypt a file, then you could for SOME added security still use SSL (at this point you will have doubly encrypted the data). You also want to sign the data so that there is little chance of it being manipulated in transit, or by the cloud server itself (if a hacker hacked the cloud server). Generally the way to protect the data while on device, you may choose to have the user input a password, and put some fairly strict rules around how passwords are formed, and how many tries you allow before you disallow attempts for 30 minutes or so.
You may also wish to store the data locally in an encrypted form. This way if someone gets the device, they still will need to have the password before they can get the data (unless of course they can crack the algorithm you use to generate the symetric key from the password).
In terms of online data service, you could use iCloud, etc... I am actually NOT a fan of anything cloud. I think it is SO counter enterprise / proprietary data, it isnt even funny. I think it actually almost laughable that so many of these phone / device manufacturers are going SOOOOO cloud based. I think they are abandoning the big companies, as NO big company I know of wants to place their proprietary data on a cloud server that THEY DONT CONTROL. In any case, I would argue that so long as you have a good local encryption scheme prior to sending out the data, then you should be OK. I would from an assurance perspective however look at where the servers are in locale. the reason being that if assurance of data is of prime concern, most larger IT setups like to have replicated data centers on opposing sides of the country / world etc... The reason for this is if an earthquake takes down the data center on one side of the country, it most likely will NOT take down the one on the other side of the country simultaneously. If the data centers for iCloud or whatever you can find are essentially in one locale, then you may consider syncing with one data center on the west coast, and choose a completely differing data center (in this case company) to sync with that is centered on the east coast.
This is all very high level, how you would implement this on an iPhone specifically we could also talk about, byt I hope this at least begins to help pave a path.

Facebook image URLs - how are they kept from un-authorised users?

I'm interested in social networks and have stumbled upon something which makes me curious.
How does facebook keep people from playing with URLs and gaining access to photos they should not?
Let me expand, here's an altered example of a facebook image URL that came up on my feed-
https://fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-prn1/s480x480/{five_digit_number}_{twelve_digit_number}_{ten_digit_number}_n.jpg
So, those with more web application experience will presumably know the answer to this, I suspect it's well understood, but what is to stop me from changing the numbers and seeing other people's photos that possibly I'm not supposed to?
[I understand that this doesn't work, I'm just trying to understand how they maintain security and avoid this problem]
Many thanks in advance,
Nick
There's a couple ways you can achieve it.
The first is link to a script or action that authenticates the request, and then returns an image. You can find an example with ASP.NET MVC here. The downside is it's pretty inefficient, and you run the risk of twice the bandwidth for each request (once so your server can grab the image from wherever it's stored, and once to serve it to your users).
The second option, you can do like Facebook and just generate obscure url's for each photo. As Thomas said in his comment, you're not going to guess a 27 digit number.
The third option I think is the best, especially if you're using something like Microsoft Azure or Amazon S3. Azure Blob Storage supports Shared Access Signatures, which let's you generate temporary url's for private files. These can be set to expire in a few minutes, or last a lifetime. The files are served directly to the user, and there's no risk if the url leaks after the expiration period.
Amazon S3 has something similar with Query String Authentication.
Ultimately, you need to figure out your threat model, and make a decision weighing the pros and cons of each approach. On Facebook, these are images that have presumably been shared with hundreds of friends. There's a significantly lower expectation of privacy, and so maybe authenticating every request is overkill. A random, hard to guess URL is probably sufficient, and let's them serve data through their CDN, and minimizes the amount of processing per request. With Option 3, you're still going to have overhead of generating those signed URL's.

Third party data delivery of lots of data

Does anyone know how sites that have a real-time feed of a lot of data work? I am referring to something like a stock site, where they can tell you in real time (well, 20 minute delay mostly, but still real-time - 20 minutes as I understand it).
They have thousands of data pieces delivered to them every second, I would imagine: MSFT 25.00 +.23 VOL 12000 ???? for each stock that had a change during some interval.
So, is there just a constant feed of small pushes going on? Or do you think a site will pull from the place that has the real data and say "give me all changes since 12:23:45 CST to now" type query?
I ask this because at work we might have a situation where we need to have at our application's fingertips real time information like this, and it won't make sense to hit our third party provider over and over and over again every second...
Generally there is a server/client protocol defined between the 2 parties. In the company I work for the connection is maintained at all times.
Here is info on real time data feeds to go with your stock example
NYSE,NASDAQ
It is common for data providers to also have FTP sites with (delayed) batched data. One that comes to mind is the NWS EMWIN
Sites like Twitter feed data to certain approved sites in real-time via XMPP (Wiki link).
In the broadest terms, a push model is going to be the best way of achieving "real time" transfer, particularly if you're talking about a large amount of data.
However you do always have a problem when using a purely push model of how to recover from missed data.
Depending on the nature of your data that may not be a problem (thinking of video delivery as an analogue, where the amount of data is huge but there is sufficient redundancy for it to recover from missing data). And if you have any control over the data you may be able to build some redundancy in. For example, on every change event you can provide absolute values rather than changes, or previous value and new value.
I've done this making an attempt to retrieve the stock quote from the source, and falling back to a timestamped on-disk cache of the quote when the main source fails or times out.