Is there any way to call the Bing Ads API through a pipeline and load the data into BigQuery through Google Data Fusion?

I'm creating a pipeline in Google Data Fusion that should let me export my Bing Ads data into BigQuery using my Bing Ads developer token. I couldn't find any suitable data source to add to my pipeline in Data Fusion. Is fetching data from API calls even supported in Google Data Fusion, and if it is, how can it be done?

HTTP-based sources for Cloud Data Fusion are currently in development and will be released by Q3. Could you elaborate on your use case a little more, so we can make sure that your requirements will be covered by those plugins? For example, are you looking to build a batch or real-time pipeline?
In the meantime, you have the following two more immediate workarounds:
If you are OK with storing the data in a staging area in GCS before loading it into BigQuery, you can use the HTTPToHDFS plugin that is available in the Hub. Use a path that starts with gs://<bucket>/path/to/file (a sketch of the subsequent load into BigQuery follows below).
Alternatively, we also welcome contributions, so you can build the plugin yourself using the Cloud Data Fusion APIs. We are happy to guide you, and can point you to documentation and samples.
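For the first workaround, once the HTTP response is staged in GCS, the load into BigQuery can be scripted with the BigQuery client library. This is a minimal sketch, assuming newline-delimited JSON and hypothetical bucket, path, and table names:

    # Minimal sketch: load a file staged in GCS by the HTTP step into BigQuery.
    # The bucket, object path, and table name below are hypothetical.
    from google.cloud import bigquery

    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,  # let BigQuery infer the schema from the staged file
    )

    load_job = client.load_table_from_uri(
        "gs://my-staging-bucket/bing-ads/report.json",  # staged by the HTTP step
        "my-project.ads_dataset.bing_ads_report",       # destination table
        job_config=job_config,
    )
    load_job.result()  # block until the load job completes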

Related

Is connectivity possible between Google Firebase and MongoDB on AWS?

I have data being stored in Google Firebase (i.e., the output of the Google Vision API). I need the same data to be stored in MongoDB, which is running on AWS. Is connectivity between Google Cloud and AWS possible for data migration?
There is no out-of-the-box solution for what you're trying to accomplish. Currently, Firestore supports exporting its data as documented here, though the format of the export is probably not something MongoDB could import right away. Even if the format were compatible, you would need some kind of data processing pipeline to handle the flow from one side to the other.
Depending on how you're handling the ingestion of Vision API results, you might be able to include code to also send that data to MongoDB. If that's not the case, you might need to design a custom solution for this particular use case.
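As a rough illustration of that ingestion-side approach, the code that persists a Vision API result could fan it out to both stores. This is a minimal sketch, assuming pymongo is available and that the connection string, database, and collection names are placeholders:

    # Minimal sketch: write the same Vision API result to Firestore and to a
    # MongoDB instance on AWS. Connection details and names are hypothetical.
    from google.cloud import firestore
    from pymongo import MongoClient

    firestore_client = firestore.Client()
    mongo_client = MongoClient("mongodb://user:password@my-aws-host:27017")

    def store_vision_result(doc_id, result):
        """Persist one Vision API result (a dict) in both databases."""
        # Firestore: document keyed by doc_id in a 'vision_results' collection
        firestore_client.collection("vision_results").document(doc_id).set(result)
        # MongoDB: same payload, upserted with doc_id as the _id
        mongo_client.vision_db.vision_results.replace_one(
            {"_id": doc_id}, dict(result, _id=doc_id), upsert=True
        )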

How to search text in JSON files that the Google Vision API created from a PDF

Is there any way to search text in the JSON files that the Google Vision API created from a PDF? The text search should happen over Google Cloud Storage only.
Google Cloud Storage is an object-based storage solution that does not provide processing features. In order to run any processing job over the Cloud Storage data, you would need a computing solution, and I'd opt for a serverless option such as Cloud Functions.
I've found in the Cloud Functions docs a sample application that integrates several APIs with Cloud Functions and Cloud Storage; I think you can use it as a guideline to develop your own setup.
Once you have the mentioned setup, you could apply a regex implementation to search for the desired data; how to implement it will depend on the runtime, libraries, and technologies that you choose to use.
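As an illustration of that setup, here is a minimal sketch of an HTTP-triggered Cloud Function in Python that downloads one of the Vision output files from Cloud Storage and runs a regex over the detected text. The bucket and object names are hypothetical, and the layout assumed is that of Vision's asynchronous PDF annotation output:

    # Minimal sketch: HTTP-triggered Cloud Function that regex-searches the
    # JSON output Vision wrote to GCS. Bucket and object names are hypothetical.
    import json
    import re

    from google.cloud import storage

    def search_vision_output(request):
        """Expects a 'pattern' and an optional 'blob' query parameter."""
        pattern = request.args.get("pattern", "")
        blob_name = request.args.get("blob", "vision-output/output-1-to-1.json")

        bucket = storage.Client().bucket("my-vision-results-bucket")
        data = json.loads(bucket.blob(blob_name).download_as_text())

        # Vision's PDF output stores page text under responses[].fullTextAnnotation
        matches = []
        for response in data.get("responses", []):
            text = response.get("fullTextAnnotation", {}).get("text", "")
            matches.extend(re.findall(pattern, text))

        return json.dumps({"matches": matches})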

How to set up Google Ads as the source of a Cloud Data Fusion pipeline?

I'm trying to ingest data from my Google Ads account into a Cloud Data Fusion pipeline, but I see that only 12 sources are available (BigQuery, Amazon S3, File, Excel, Kafka Consumer, etc.).
Does anybody know if there is a way to connect directly via the API? Or do I need a paid solution to extract the data?
Many thanks!
Are you looking to ingest data from Analytics 360? https://marketingplatform.google.com/about/analytics-360/
Cloud Data Fusion does not have this connector yet, but it will be available in the future.
An update: you can now go to the Hub in the top-right corner and choose your data source, such as Google Ads or Google Analytics.

Is there a Google Dataflow MongoDB Source/Sink?

I know Google Dataflow only officially supports, as I/O for a pipeline, files in Google Cloud Storage, BigQuery, Avro files, or Pub/Sub out of the box.
But since it has an API for custom sources and sinks, I was wondering: is there some Pipeline I/O implementation for MongoDB?
Right now I will have to either migrate my data to BigQuery or write the whole Pipeline I/O implementation before even being able to know whether Google Dataflow is a viable solution to my current problems.
I tried googling and looking at the current SDK issues and didn't see anything related. I even started to wonder if I missed something very basic from the Google Dataflow concept and docs that completely invalidates this initial idea to use MongoDB as a data source.
Recently, a MongoDB connector was added to Apache Beam (incubating). Please see MongoDbIO.
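For reference, the Beam Python SDK also gained an equivalent mongodbio connector (the MongoDbIO referenced above is the Java connector). This is a minimal sketch of reading a collection with it; the URI, database, and collection names are placeholders:

    # Minimal sketch: read a MongoDB collection with the Beam Python SDK's
    # mongodbio connector. URI, database, and collection names are hypothetical.
    import apache_beam as beam
    from apache_beam.io.mongodbio import ReadFromMongoDB

    with beam.Pipeline() as pipeline:
        _ = (
            pipeline
            | "ReadFromMongo" >> ReadFromMongoDB(
                uri="mongodb://localhost:27017",
                db="my_database",
                coll="my_collection",
            )
            | "PrintDocs" >> beam.Map(print)  # each element is a Python dict
        )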

Is there a way to use AWS Data Pipeline for an ETL project?

I have a data transformation task at hand and currently need to implement an SSIS-class package using AWS Data Pipeline. Is it possible to write custom code using its SDK to retrieve data from third-party SOAP-based web services?
I obviously need to pull data from a third-party SOAP service and then do a lot of data massaging of my own before I can dump that data into Amazon S3 storage.
Any help in this direction is welcome.
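For what it's worth, one common pattern is to have a Data Pipeline ShellCommandActivity run a custom script that performs the SOAP call and the upload. This is a minimal sketch of such a script, assuming the zeep SOAP client, with an entirely hypothetical WSDL URL, operation name, bucket, and key:

    # Minimal sketch of a script a ShellCommandActivity could run: pull data
    # from a SOAP service, transform it, and write it to S3. The WSDL URL,
    # operation, bucket, and key below are hypothetical placeholders.
    import json

    import boto3
    from zeep import Client
    from zeep.helpers import serialize_object

    soap_client = Client("https://example.com/service?wsdl")

    # Call a hypothetical SOAP operation; serialize_object turns zeep's
    # response objects into plain Python dicts/lists
    records = serialize_object(soap_client.service.GetRecords(startDate="2020-01-01"))

    # Placeholder for the "data massaging" step described above
    transformed = records

    boto3.client("s3").put_object(
        Bucket="my-etl-bucket",
        Key="soap-extracts/records.json",
        Body=json.dumps(transformed, default=str),
    )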