Can anyone point me to any documentation on streaming data into big query using Dart/Flutter?
I'm pretty new to all this and can't for the life of me find any Flutter-specific documentation or precedent.
Can anyone help?
Dart/Flutter isn't one of the languages with a directly maintained BigQuery client library, but there is a discovery-based library. The API method you need for streaming data is bigquery.tabledata.insertAll, which appears to be surfaced in the Dart library here:
https://pub.dev/documentation/googleapis/latest/googleapis.bigquery.v2/TabledataResourceApi/insertAll.html
None of the BigQuery public documentation contains Dart/Flutter code examples, but you can get more background on streaming data at https://cloud.google.com/bigquery/streaming-data-into-bigquery
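If it helps to see the request shape, here's a minimal sketch of the same streaming insert using the official Python client (google-cloud-bigquery); the project, dataset, and table names are placeholders, and the Dart insertAll call sends an equivalent request:

```python
# Minimal sketch of a tabledata.insertAll streaming insert via the official
# Python client; the Dart googleapis call sends an equivalent request.
from google.cloud import bigquery

client = bigquery.Client()  # uses Application Default Credentials
table_id = "my-project.my_dataset.events"  # placeholder names

rows = [
    {"user_id": "u123", "event": "signup", "ts": "2021-01-01T00:00:00Z"},
]

# insert_rows_json wraps the tabledata.insertAll streaming endpoint
errors = client.insert_rows_json(table_id, rows)
if errors:
    print(f"Some rows failed to insert: {errors}")
```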
Flutter --> Google Cloud Function --> BigQuery
Your best approach by far will be to use a Google Cloud Function to write to BigQuery. Architecturally, this is the right choice, though I could argue that putting a streaming platform (like Google Pub/Sub) in the middle has some additional benefits. In your situation, however, I'd recommend going straight from Cloud Functions to BigQuery for simplicity, since you already have a learning curve ahead of you and probably don't need the stream layer at this point.
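To make that concrete, here's a rough sketch of what the Cloud Function side could look like in the Python runtime; the function name, table name, and row shape are hypothetical:

```python
# Hypothetical HTTP Cloud Function (Python runtime) that streams the posted
# JSON body into BigQuery. Deploy it with an HTTP trigger and call it from
# your Flutter app.
from google.cloud import bigquery

client = bigquery.Client()
TABLE_ID = "my-project.my_dataset.events"  # placeholder

def stream_row(request):
    """Expects a JSON object in the request body representing one row."""
    row = request.get_json(silent=True)
    if row is None:
        return ("Expected a JSON body", 400)
    errors = client.insert_rows_json(TABLE_ID, [row])  # streaming insert
    if errors:
        return (f"Insert errors: {errors}", 500)
    return ("OK", 200)
```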
There's good documentation for integrating Cloud Functions into Flutter, and it's quite intuitive to use.
Here are the FlutterFire docs that cover this integration path:
https://firebase.flutter.dev/docs/functions/overview/
Here's a good walkthrough on how to actually do it: https://rominirani.com/tutorial-flutter-app-powered-by-google-cloud-functions-3eab0df5f957
On the BigQuery side:
There's a very detailed walkthrough here on writing to BigQuery from Cloud Functions: https://cloud.google.com/solutions/streaming-data-from-cloud-storage-into-bigquery-using-cloud-functions
You can find more guides as well if you search Google.
What is the proper way to cache API results using Hive?
The way I currently plan to implement it is to use the request URL as the key and the returned data as the value.
Is there a more production-friendly way to do this? I can't find a tutorial: most tutorials either abstract this away behind another package that handles it for them, or use a different package.
To cache REST API data you can use Hive, a NoSQL database that is easy to use and is faster at retrieval than shared_preferences and sqflite.
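For illustration only (Hive itself is a Dart package, so this is just the pattern, sketched in Python with shelve standing in for a Hive box): the request URL is the key and the response body is the value, exactly as described in the question:

```python
# Illustration of the request-URL-as-key cache pattern; shelve stands in for
# a Hive box here. In Flutter you'd call box.get(url) / box.put(url, body).
import json
import shelve
import urllib.request

def fetch_with_cache(url, cache_path="api_cache"):
    with shelve.open(cache_path) as cache:  # persistent key-value store
        if url in cache:
            return cache[url]  # cache hit: return the stored body
        with urllib.request.urlopen(url) as resp:
            body = json.loads(resp.read())
        cache[url] = body  # key = request URL, value = response body
        return body
```

In production you would usually also store a timestamp alongside each entry so stale responses can be expired and refetched.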
For more details, check this repo to understand better:
https://github.com/shashiben/Anime-details
You can also read this article: https://medium.com/flutter-community/flutter-cache-with-hive-410c3283280c
The code there is written cleanly and structured using the Stacked architecture. Hope this answer is helpful to you.
Is there any way to search text in the JSON files that the Google Vision API created from a PDF?
The text search should happen over Google Cloud Storage only.
Google Cloud Storage is an object-based storage solution that does not provide processing features. In order to run any processing job over Cloud Storage data you need a computing/processing solution, and I’d opt for a serverless option such as Cloud Functions.
In the Cloud Functions docs I found a sample application that integrates several APIs with Cloud Functions and Cloud Storage; I think you can use it as a guideline to develop your own setup.
Once you have that setup in place you can apply a regex to search for the desired data; how to implement it will depend on the runtime, libraries, and technologies you choose.
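As a sketch of that idea, assuming a Python runtime, placeholder bucket/prefix names, and the JSON layout produced by Vision's asyncBatchAnnotateFiles:

```python
# Sketch: scan the Vision API's JSON output in a bucket for a regex match.
# Bucket and prefix names are placeholders; the output format is the one
# produced by asyncBatchAnnotateFiles (responses[].fullTextAnnotation.text).
import json
import re
from google.cloud import storage

def search_ocr_output(pattern, bucket_name="my-ocr-output", prefix="pdf-results/"):
    client = storage.Client()
    regex = re.compile(pattern)
    matches = []
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        if not blob.name.endswith(".json"):
            continue
        doc = json.loads(blob.download_as_bytes())
        for response in doc.get("responses", []):
            text = response.get("fullTextAnnotation", {}).get("text", "")
            if regex.search(text):
                matches.append(blob.name)  # record which file matched
                break
    return matches
```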
I've done quite a bit of searching, but haven't been able to find anything within this community that fits my problem.
I have a MongoDB collection that I would like to normalize and upload to Google Big Query. Unfortunately, I don't even know where to start with this project.
What would be the best approach to normalize the data? From there, what is recommended when it comes to loading that data to BQ?
I realize I'm not giving much detail here... but any help would be appreciated. Please let me know if I can provide any additional information.
If you're using Python, an easy way is to read the collection in chunks and use pandas' to_gbq method. It's easy and quite fast to implement, but it would be better to have more details about your data.
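A rough sketch of that approach, assuming pymongo and pandas-gbq are installed; the connection string, database, collection, and table names are placeholders:

```python
# Read the collection in chunks, flatten nested documents, and append each
# chunk to BigQuery with to_gbq. Names below are placeholders.
import itertools

import pandas as pd
from pymongo import MongoClient

mongo = MongoClient("mongodb://localhost:27017")
cursor = mongo["mydb"]["mycollection"].find({}, {"_id": 0})  # skip ObjectId

CHUNK_SIZE = 10_000
while True:
    docs = list(itertools.islice(cursor, CHUNK_SIZE))
    if not docs:
        break
    df = pd.json_normalize(docs)  # one normalization option: flatten nesting
    df.to_gbq("my_dataset.my_table", project_id="my-project", if_exists="append")
```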
In addition to the answer provided by SirJ, you have multiple options to load data into BigQuery, including loading it from Cloud Storage, from your local machine, via Dataflow, and more, as mentioned here. Cloud Storage supports data in multiple formats such as CSV, JSON, Avro, Parquet, and others. You also have various ways to load the data: the web UI, the command line, the API, or the client libraries, which support C#, Go, Java, Node.js, PHP, Python, and Ruby.
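As one example, the Cloud Storage route looks roughly like this with the Python client library (bucket and table names are placeholders):

```python
# Sketch: a load job from Cloud Storage using the Python client library.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # infer the schema from the data
)
load_job = client.load_table_from_uri(
    "gs://my-bucket/export.json",      # placeholder source
    "my-project.my_dataset.my_table",  # placeholder destination
    job_config=job_config,
)
load_job.result()  # block until the load job completes
```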
I know that, out of the box, Google Dataflow only officially supports files in Google Cloud Storage, BigQuery, Avro files, and Pub/Sub as pipeline I/O.
But as it has an API for Custom Source and Sink I was wondering, is there some Pipeline I/O implementation for MongoDB?
Right now I will have to either migrate my data to BigQuery or write the whole Pipeline I/O implementation before even being able to know if Google Dataflow is a viable solution to my current problems.
I tried googling and looking at the current SDK issues and didn't see anything related. I even started to wonder if I'd missed something very basic in the Google Dataflow concepts and docs that completely invalidates the idea of using MongoDB as a data source.
Recently a MongoDB connector was added to Apache Beam (incubating). Please see MongoDbIO.
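For what it's worth, Beam's Python SDK later gained an equivalent connector (apache_beam.io.mongodbio); a minimal read sketch with placeholder connection details:

```python
# Minimal sketch reading a MongoDB collection with Beam's Python SDK
# connector; the URI, db, and collection names are placeholders.
import apache_beam as beam
from apache_beam.io.mongodbio import ReadFromMongoDB

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "ReadFromMongo" >> ReadFromMongoDB(
            uri="mongodb://localhost:27017",
            db="mydb",
            coll="mycollection",
        )
        | "Inspect" >> beam.Map(print)  # replace with your transforms/sink
    )
```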
I'm a Cocoa programmer, but right now I've hit a situation where I can't go any further without smarter people :)
I've always used small databases in my applications. I programmed a PHP backend on my own server and it worked well.
Right now I have to switch for something much bigger and I decided to try with Google App Engine, because it is relatively cheap and has great scalability.
I'm so confused with documentation and I really don't know where to start.
My new app will store data (images, videos) as well as a database (MySQL) in Google Cloud.
I concluded that for app like that I should use:
Google Cloud Storage for images / videos etc.
Google Cloud SQL for CRUD operations for users (inserting and fetching personal data)
I would prefer to use the JSON API. Then I don't have to write any Java, Python, or Go code, right? Only REST requests to Google Cloud SQL...
My question is : Am I thinking correctly? Should I use these two services?
Google App Engine has a feature called "Cloud Endpoints" (Java | Python)
that automatically generates a JSON API similar to the APIs Google provides for its own services (and also generates client libraries in JavaScript, Objective-C, and Java to invoke those APIs). This saves you the trouble of writing the REST API yourself and manually serializing/deserializing requests, letting you focus on just the business logic that performs the storage and retrieval operations. So what I'd suggest is that you write the code that reads/writes data in the datastore (and Cloud Storage), but then use Cloud Endpoints to automatically generate your JSON API and client libraries rather than writing that code by hand.
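Very roughly, a service built with the Python Endpoints Framework looks like the sketch below; the API, message, and method names are made up:

```python
# Legacy sketch of the Python Cloud Endpoints Framework on App Engine.
# Endpoints generates the JSON API and client libraries from these
# definitions; only the method body is your own business logic.
import endpoints
from protorpc import message_types, messages, remote

class Greeting(messages.Message):
    text = messages.StringField(1)

@endpoints.api(name="myapi", version="v1")
class MyApi(remote.Service):
    @endpoints.method(message_types.VoidMessage, Greeting,
                      path="greet", http_method="GET", name="greet")
    def greet(self, request):
        # Your storage/retrieval logic would go here.
        return Greeting(text="Hello from Cloud Endpoints")

api = endpoints.api_server([MyApi])
```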
Your plan seems fine so far. Google Cloud Storage is a great choice for storing a large number of images and movies, and Google Cloud SQL is a great choice for handling smaller, more relational data.
If you're using PHP from App Engine, there's built-in support for Google Cloud Storage. See https://developers.google.com/appengine/docs/php/googlestorage/
If you're using PHP from your app that lives somewhere else, you could write to the Google Cloud Storage JSON or XML APIs directly, but there's also a PHP library for the Google APIs that might be easier for you to use: https://code.google.com/p/google-api-php-client/