How to search text in JSON files that the Google Vision API created from a PDF - google-cloud-storage

Is there any way to search text in the JSON files that the Google Vision API created from a PDF?
The text search should happen over Google Cloud Storage only.

Google Cloud Storage is an object-based storage solution that does not provide processing features. To run any processing job over Cloud Storage data you need a compute solution, and I'd opt for a serverless option such as Cloud Functions.
The Cloud Functions docs include a sample application that integrates several APIs with Cloud Functions and Cloud Storage; I think you can use it as a guideline to develop your own setup.
Once you have that setup, you could apply a regex to search for the desired data. How to implement it will depend on the runtime, libraries and technologies that you choose.
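As a rough illustration of the regex step, here is a minimal Python sketch. It assumes the JSON layout that the Vision API's asynchronous PDF annotation typically writes (a top-level `responses` list whose entries carry `fullTextAnnotation.text` and `context.pageNumber`); check this against your own output files. The function name and the sample data are made up for the example, and downloading the file from the bucket inside the Cloud Function is left out.

```python
import json
import re


def search_vision_output(raw_json, pattern):
    """Return (page_number, matched_text) pairs for every regex hit."""
    document = json.loads(raw_json)
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    # Assumed layout: one entry in "responses" per PDF page.
    for index, response in enumerate(document.get("responses", []), start=1):
        text = response.get("fullTextAnnotation", {}).get("text", "")
        # Fall back to the position in the list if no page number is present.
        page = response.get("context", {}).get("pageNumber", index)
        for match in regex.finditer(text):
            hits.append((page, match.group(0)))
    return hits


# Hypothetical sample standing in for one output file read from the bucket.
sample = json.dumps({
    "responses": [
        {"fullTextAnnotation": {"text": "Invoice 1234\nTotal due: 50 USD"},
         "context": {"pageNumber": 1}},
        {"fullTextAnnotation": {"text": "Invoice 5678"},
         "context": {"pageNumber": 2}},
    ]
})
hits = search_vision_output(sample, r"invoice \d+")
```

Inside a function, `raw_json` would be the contents of each output object fetched from the bucket, and the returned pairs tell you which page each match came from.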

Related

Is there any way to call the Bing Ads API through a pipeline and load the data into BigQuery through Google Data Fusion?

I'm creating a pipeline in Google Data Fusion that exports my Bing Ads data into BigQuery using my Bing Ads developer token. I couldn't find any data source to add to my pipeline in Data Fusion. Is fetching data from API calls even supported in Google Data Fusion, and if so, how can it be done?
HTTP based sources for Cloud Data Fusion are currently in development and will be released by Q3. Could you elaborate on your use case a little more, so we can make sure that your requirements will be covered by those plugins? For example, are you looking to build a batch or real-time pipeline?
In the meantime, you have the following two, more immediate options/workarounds:
If you are ok with storing the data in a staging area in GCS before loading it into BigQuery, you can use the HTTPToHDFS plugin that is available in the Hub. Use a path that starts with gs:///path/to/file
Alternatively, we also welcome contributions, so you can also build the plugin using the Cloud Data Fusion APIs. We are happy to guide you, and can point you to documentation and samples.

Creating an IBM Watson search engine using Bluemix for internet & database research

I would like to use Bluemix to create an IBM Watson search engine (i.e. similar to a Google Search Engine interface) that will query either the internet (websites I specify) or online database and provide summaries of unstructured data, identify concepts, etc.
Are there any existing apps like this available, or does anyone know how this can be set up with Bluemix or another platform?
You should take a look at the Alchemy API service on Bluemix.
It allows you to do things like extract entities and keywords.
Most of the APIs allow you to feed them html, text or web-based content. Stringing a bunch of these together and tagging content in a database such as Elasticsearch should allow you to achieve what you're after.
It's hard to be more specific given the fairly broad nature of your question.

Uploading images to a PHP app on GCE and storing them in GCS

I have a PHP app running on several instances of Google Compute Engine (GCE). The app allows users to upload images of various sizes, resizes the images, and then stores the resized images (and their thumbnails) on the storage disk and their metadata in the database.
What I've been trying to find is a method for storing the images in Google Cloud Storage (GCS) from the PHP app running on the GCE instances. A similar question was asked here, but no clear answer was given. Any hints or guidance on the best way to achieve this would be highly appreciated.
You have several options, all with pros and cons.
Your first decision is how users upload data to your service. You might choose to have customers upload their initial data to Google Cloud Storage, where your app would then fetch it and transform it, or you could choose to have them upload it directly to your service. Let's assume you choose the second option, and you want users to stream data directly to your service.
Your service then transforms the data into a different size. Great. You now have a new file. If this was video, you might care about streaming the data to Google Cloud Storage as you encode it, but for images, let's assume you want to process the whole thing locally and then store it in GCS afterwards.
Now we have to get a file into GCS. It's a PHP app, and so as you have identified, your main three options are:
Invoke the GCS JSON API through the Google API PHP client.
Invoke either the GCS XML or JSON API via custom code.
Use gsutil.
Using gsutil will be the easiest solution here. On GCE, it automatically picks up appropriate credentials for your service account, and it's got several useful performance optimizations and tuning that a raw use of the API might not do without extra work (for example, multithreaded uploads). Plus it's already installed on your GCE instances.
The upside of the PHP API is that it's in-process and offers more fine-grained, programmatic control. As your logic gets more complicated, you may eventually prefer this approach. Getting it to perform as well as gsutil may take some extra work, though.
This choice is comparable to copying files via SCP with the "scp" command line application or by using the libssh2 library.
tl;dr; Using gsutil is a good idea unless you have a need to handle interactions with GCS more directly.
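To make the gsutil route concrete, here is a minimal sketch of shelling out to `gsutil cp` after the resize step. It is written in Python for illustration; the app in the question is PHP, where the same command could be run through PHP's process-execution functions. The helper names, bucket name and paths are hypothetical, and it assumes gsutil is on the PATH with service-account credentials available, as described above for GCE.

```python
import subprocess


def build_gsutil_cp(local_path, bucket, object_name):
    """Build the argument list for `gsutil cp <local> gs://<bucket>/<object>`."""
    return ["gsutil", "cp", local_path, f"gs://{bucket}/{object_name}"]


def upload_with_gsutil(local_path, bucket, object_name):
    """Run gsutil; raises CalledProcessError if the copy fails."""
    subprocess.run(build_gsutil_cp(local_path, bucket, object_name), check=True)


# Hypothetical paths and bucket name, only to show the resulting command.
command = build_gsutil_cp("/tmp/resized.jpg", "my-image-bucket", "images/resized.jpg")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when user-supplied file names contain spaces or shell metacharacters.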

Blob Storage Server with REST API

I am looking for a solution similar to Amazon S3 or Azure Blob Storage that can be hosted internally instead of remotely. I don't necessarily need to scale out, but I'd like to create a central location where my growing stable of apps can take advantage of file storage. I would also like to formalize file access. Does anybody know of anything like the two services I mentioned above?
I could write this myself, but if something exists then I'd rather not reinvent the wheel, unless that wheel has corners :)
The only real alternative to services like S3 and Azure blobs I've seen is Swift, though if you don't plan to scale out this may be overkill for your specific scenario.
The OpenStack Object Store project, known as Swift, offers cloud storage software so that you can store and retrieve lots of data in virtual containers. It's based on the Cloud Files offering from Rackspace.
The OpenStack Object Storage API is implemented as a set of ReSTful (Representational State Transfer) web services. All authentication and container/object operations can be performed with standard HTTP calls.
http://docs.openstack.org/developer/swift/
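To illustrate what "standard HTTP calls" means here, the sketch below builds (but does not send) a PUT request that would store an object in a Swift container, using only the Python standard library. It assumes you already have a storage URL and an auth token from your Swift authentication step; the host, account, container and token values are hypothetical placeholders, and the helper name is made up for the example.

```python
import urllib.request


def build_swift_put(storage_url, container, object_name, token, data):
    """Build a PUT request for <storage_url>/<container>/<object>."""
    url = f"{storage_url.rstrip('/')}/{container}/{object_name}"
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        # Swift expects the token from the auth step in this header.
        headers={"X-Auth-Token": token},
    )


# Hypothetical endpoint and token; nothing is sent over the network here.
request = build_swift_put(
    "http://swift.example.internal:8080/v1/AUTH_demo",
    "app-files",
    "report.pdf",
    "hypothetical-token",
    b"file contents",
)
# Sending it would be: urllib.request.urlopen(request)
```

Because every operation is a plain HTTP request like this one, any of your apps can talk to the store without a dedicated SDK.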

Using Google App Engine confusion

I'm a Cocoa programmer, but I've run into a situation where I can't go any further without help from smarter people :)
I have always used small databases in my applications. I programmed a PHP backend on my own server and it worked well.
Now I have to switch to something much bigger, and I decided to try Google App Engine because it is relatively cheap and scales well.
I'm confused by the documentation and really don't know where to start.
My new app will store data (images, videos) as well as a database (MySQL) in Google Cloud.
I concluded that for an app like that I should use:
Google Cloud Storage for images, videos, etc.
Google Cloud SQL for CRUD operations for users (inserting and fetching personal data)
I would prefer to use a JSON API. Then I don't have to write any Java, Python or Go code, right? Only REST requests for Google Cloud SQL...
My question is: am I thinking correctly? Should I use these two services?
Google App Engine has a feature called "Cloud Endpoints" (Java | Python) that automatically generates a JSON API similar to the APIs that Google provides for its own services. It also generates client libraries in JavaScript, Obj-C, and Java to invoke those APIs. This saves you the trouble of writing the REST API yourself and manually serializing/deserializing requests, so you can focus on the business logic that performs the storage and retrieval operations. So what I would suggest is that you write the code that reads/writes data into the datastore (and Cloud Storage), but use Cloud Endpoints to generate your JSON API and client libraries rather than writing that code by hand.
Your plan seems fine so far. Google Cloud Storage is a great choice for storing a large number of images and movies, and Google Cloud SQL is a great choice for handling smaller, more relational data.
If you're using PHP from App Engine, there's built-in support for Google Cloud Storage. See https://developers.google.com/appengine/docs/php/googlestorage/
If you're using PHP from your app that lives somewhere else, you could write to the Google Cloud Storage JSON or XML APIs directly, but there's also a PHP library for the Google APIs that might be easier for you to use: https://code.google.com/p/google-api-php-client/