CSV file storage vs Postgres database for multiple users - postgresql

I have a Dash app that shows live updates of CSV data uploaded to a private GitHub repository at 5-minute intervals. Currently the app reads the entire CSV file at each update and populates the graphs. The app is hosted on Heroku. I'm concerned about scalability: with multiple users, every session will send API requests for CSV data of perhaps 30-70 rows, potentially requiring a lot of dynos. Would hosting the data on Postgres be advantageous in any way? Currently, each new user has a separate folder in the GitHub repository, which makes API requests simple. Thank you. Note: I'm new to SQL, please be gentle.
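For readers weighing the same choice, here is a minimal sketch of what the database version could look like. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server. The table name `readings`, the columns `timestamp` and `value`, and the sample data are all assumptions, not details from the question. The idea it shows is that a table keyed by user and timestamp lets the app fetch only the rows added since the last refresh instead of re-reading a whole file.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one user's CSV file.
SAMPLE_CSV = """timestamp,value
2024-01-01T00:00:00,1.0
2024-01-01T00:05:00,1.5
2024-01-01T00:10:00,2.0
"""


def make_db():
    """Create an in-memory database with one table shared by all users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings ("
        " user_id TEXT NOT NULL,"
        " ts TEXT NOT NULL,"
        " value REAL NOT NULL,"
        " PRIMARY KEY (user_id, ts))"
    )
    return conn


def load_csv(conn, user_id, csv_text):
    """Insert one user's CSV rows; rows that are already stored are skipped."""
    rows = [
        (user_id, r["timestamp"], float(r["value"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR IGNORE is SQLite syntax; in Postgres the equivalent is
    # INSERT ... ON CONFLICT DO NOTHING.
    conn.executemany(
        "INSERT OR IGNORE INTO readings (user_id, ts, value) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def rows_since(conn, user_id, last_ts):
    """Return only the rows newer than the last timestamp the app has seen."""
    cur = conn.execute(
        "SELECT ts, value FROM readings"
        " WHERE user_id = ? AND ts > ? ORDER BY ts",
        (user_id, last_ts),
    )
    return cur.fetchall()
```

A refresh callback would then call `rows_since(conn, user_id, last_ts)` with the newest timestamp it has already plotted. With files of only 30-70 rows the read cost is small either way, so the main gain from a database is probably a single shared store with per-user queries rather than raw speed.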

Related

Flutter web NoSQL options with local saving and cross-user access

I'm looking for a Flutter web-compatible solution for NoSQL data storage and retrieval that lets multiple users save and fetch data, with the catch that it needs to be hosted locally with no third-party tools. Firebase/Firestore would be perfect except that the data is stored remotely.
So ideally, when any user launches the app, the app will parse a locally stored JSON/NoSQL-like DB (from a SharePoint directory, sigh), and their edits, saves, and uploads will be written back. I'm not too worried about data-access collisions at this time. What solutions do I have? Thanks!

Near-real-time streaming data from 100s of customers to Google Pub/Sub to GCS

I am getting near-real-time data from 100s of customers. I need to store this data in Google Cloud Storage buckets created for each customer, i.e. /gcs/customer_id/yy/mm/day/hhhh/
My data is in Avro. I guess I can use the "Pub/Sub to Avro Files on Cloud Storage" template.
However, I'm not sure if Google Pub/Sub can accept data from multiple customers.
Appreciate any help here, thanks!
The template is quite simple: it takes all the data from Pub/Sub and stores it in an Avro file on GCS.
However, it's a good starting point, and you can extend it to split the output per customer and to build the file path that you want.
The template's Java source is available on GitHub.
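The per-customer split mentioned above can be sketched independently of the template. The example below is plain Python, not Dataflow or Beam code. It assumes each message carries a customer identifier and an event timestamp; the field names `customer_id`, `event_ts`, and `payload` are hypothetical. It also assumes that "hhhh" in the question means the hour written as four digits, such as 1400. It groups messages by the path prefix they would be written under.

```python
from collections import defaultdict
from datetime import datetime, timezone


def object_prefix(customer_id, event_time):
    """Build a 'customer_id/yy/mm/dd/hhhh/' prefix for one record.

    Assumption: 'hhhh' is the hour as four digits, e.g. 1400 for 2 pm.
    """
    return f"{customer_id}/" + event_time.strftime("%y/%m/%d/%H00") + "/"


def split_by_customer(messages):
    """Group message payloads by the prefix they should be stored under.

    Each message is a dict with hypothetical keys 'customer_id',
    'event_ts' (seconds since the epoch, UTC) and 'payload'.
    """
    groups = defaultdict(list)
    for msg in messages:
        ts = datetime.fromtimestamp(msg["event_ts"], tz=timezone.utc)
        groups[object_prefix(msg["customer_id"], ts)].append(msg["payload"])
    return dict(groups)
```

In a real pipeline the same grouping key would decide which file each record lands in; the writing itself is left out here because it depends on how the template is modified.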

Using Google Cloud Storage with Google Data Prep

I am using Google Cloud Storage to store CSV files. These CSV files get updated daily with new data. I'm hoping to use Google Data Prep to automate the process of cleaning these files and then combining them. Before I start to build this process, I am curious whether this is a good way to use the platform. The CSV files will be in the same format each time. Is there any cause for concern if the files get updated on a daily basis? Are there possible errors that could arise that I don't know about?
This is a great use case for Google Cloud Dataprep. You can parameterize your inputs. See https://cloud.google.com/dataprep/docs/html/Overview-of-Parameterization_118228665 and https://cloud.google.com/dataprep/docs/html/Create-Dataset-with-Parameters_118228628

Which kind of Google Cloud Platform mobile backend client is appropriate?

THE PROBLEM
I'm writing a mobile app which will allow a user to log in, save some preferences that must be stored in a database, and display congressional bills to the user.
I've only written simple RESTful services with PHP and MySQL in the past. I'd like to take advantage of newer technologies, and am a little lost on general direction.
The bill data (formatted as JSON) can be gathered by running the scrapers found here. Using docker, I managed to set a working directory and download the files on my local machine.
I've designed a MySQL database for holding the relevant bill and user data.
I started to mess around in Google Cloud Platform and read the doc that describes the different models. I'm thinking of a few different ideas, but I'm not familiar with GCP or with what I can actually accomplish.
QUESTIONS
1) What are App Engine, Compute Engine, and Container Engine each for? I get the gist that Container Engine holds different instances of stuff you load up with docker, and that Compute Engine sets up a VM, but I don't really understand the relationships. How should I think of them?
2) When I run those scrapers from the shell, where are the files being stored, and how can I check on them? On my computer, I set a working directory, but how do directories work in GCP? Is it just a directory in the currently selected VM, or is this what Buckets are for?
IDEAS
1) Since my bill data already comes as JSON, should I skip the entire process of building a database for the bills and insert them into Firebase somehow? Is this even possible? If so, am I stuck using Firebase's NoSQL, or can I still set up a relational database?
2) I could schedule the scrapers to run periodically, detect new files, and run a script to parse the JSON and insert new bill data into a database (PostgreSQL or MySQL?). Then I would write an API.
3) Download the JSON files to a bucket, and write an API that reads from them. Not sure how the performance would compare to using a DB.
I'm open to other suggestions as well.
For your use case (a stateless web application), App Engine is probably your best choice. The Google documentation has several comparisons of your computing options.
You can use App Engine with PHP and cloud-hosted MySQL if you want, which could be a good way to get your feet wet without going in over your head.
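Idea 2 from the question (parse the scraped JSON and insert only new bills) can be sketched in a few lines. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL or PostgreSQL, and the JSON keys `bill_id` and `title` are hypothetical, since the scrapers' actual output format is not shown here.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with a minimal, hypothetical bills table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bills (bill_id TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    return conn


def count_bills(conn):
    """Return the number of bills currently stored."""
    return conn.execute("SELECT COUNT(*) FROM bills").fetchone()[0]


def insert_new_bills(conn, json_documents):
    """Parse JSON strings and insert bills that are not stored yet.

    Returns how many bills were newly inserted. The primary key on
    bill_id makes repeated runs safe: known bills are skipped.
    """
    before = count_bills(conn)
    for doc in json_documents:
        bill = json.loads(doc)
        # INSERT OR IGNORE is SQLite syntax; MySQL has INSERT IGNORE and
        # PostgreSQL has INSERT ... ON CONFLICT DO NOTHING.
        conn.execute(
            "INSERT OR IGNORE INTO bills (bill_id, title) VALUES (?, ?)",
            (bill["bill_id"], bill["title"]),
        )
    conn.commit()
    return count_bills(conn) - before
```

A scheduled job could call `insert_new_bills` with the contents of each scraped file, and the API would then query the table rather than reading the files directly.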

Free web servers to publish csv files or other types of data

I am writing an app that requires some kind of outside input, which I want the user to be able to update sporadically.
Are there any free web services where one can upload a file (e.g. a CSV), update it at any time, and then access it from iOS?
What about Dropbox? It's free for small amounts of data and has desktop and iOS clients. It also has a fairly decent API.
http://www.dropbox.com