Google manual search results do not match Custom Search API results - google-api-client

I am trying to obtain the total result count of a manual Google web search via the Custom Search API. I am searching the entire web based on a keyword and an associated site, so the search query is "<keyword> site:<site>". Judging by my research, it is a known issue that manual Google search results tend to differ from Custom Search API results obtained when searching the entire web, as cited here and elsewhere.
Is there really no way to exactly reproduce manual search results with the API? If that is the case, the API is rather limited, and this limitation should either be explained up-front to developers in the documentation or fixed.
I believe my custom search engine is already set up to search the entire web.
It looks like PDF files returned in a manual web search may not be returned by the API. I have already tried specifying the fileType parameter to include PDFs, to no avail.
3 results returned via manual web search.
0 results returned by API.
If anyone has lessons to share I will be thankful!
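For context, here is a minimal sketch of the kind of request involved, using Python's requests against the Custom Search JSON API; the API key, search engine ID, and query are placeholders, and fileType is the parameter mentioned above:

import requests

# Placeholders: supply a real API key and custom search engine ID (cx).
API_KEY = "YOUR_API_KEY"
CX = "YOUR_SEARCH_ENGINE_ID"

params = {
    "key": API_KEY,
    "cx": CX,
    "q": "some keyword site:example.com",  # keyword plus site: restriction
    "fileType": "pdf",                     # the fileType parameter mentioned above
}

resp = requests.get("https://www.googleapis.com/customsearch/v1", params=params)
resp.raise_for_status()
data = resp.json()

# totalResults is returned as a string, e.g. "0" or "3".
print(data["searchInformation"]["totalResults"])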

Related

Azure Devops API - how to filter/search by project name?

I'm using the GET Projects List API, and I want to filter the results with a search query parameter.
I have multiple projects under a certain organization, and I want to get back only the projects whose names start with a given prefix. I looked everywhere in the documentation but couldn't find any way to do such a query. Is there a way to narrow down the results?
I saw that some APIs have the ?$filter={filter} query parameter, but it doesn't work for filtering projects.
As you said, this API does not let you filter the results server-side.
You can only filter the projects after you get the results, with Bash/PowerShell etc.
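For example, a minimal client-side sketch in Python (the same idea works in Bash or PowerShell); the organization name, personal access token, and prefix are placeholders:

import requests

ORG = "your-organization"          # placeholder
PAT = "your-personal-access-token" # placeholder
PREFIX = "name"                    # keep only projects whose names start with this

url = f"https://dev.azure.com/{ORG}/_apis/projects?api-version=6.0"
resp = requests.get(url, auth=("", PAT))  # PAT goes in as the basic-auth password
resp.raise_for_status()

projects = resp.json()["value"]
matching = [p["name"] for p in projects if p["name"].startswith(PREFIX)]
print(matching)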
I ran into a similar issue, except I am trying to filter Azure DevOps's repository list.
It seems Microsoft has done a terrible job of providing consistent filtering features.
I did notice that some of the APIs do provide ?$filter={filter}, and the documentation I found here would probably help.
Some of the APIs provide a different search method in the form of search criteria, e.g. /commits?searchCriteria.$skip={searchCriteria.$skip}&searchCriteria.$top={searchCriteria.$top}...
https://learn.microsoft.com/en-us/rest/api/azure/devops/git/commits/get-commits?view=azure-devops-rest-6.0
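For example, a rough sketch of paging through commits with those searchCriteria parameters; the organization, project, repository, and PAT values are placeholders:

import requests

ORG, PROJECT, REPO = "your-org", "your-project", "your-repo"  # placeholders
PAT = "your-personal-access-token"                            # placeholder

url = (f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/git/"
       f"repositories/{REPO}/commits")
params = {
    "searchCriteria.$skip": 0,   # offset into the result set
    "searchCriteria.$top": 50,   # page size
    "api-version": "6.0",
}
resp = requests.get(url, params=params, auth=("", PAT))
resp.raise_for_status()
for commit in resp.json()["value"]:
    print(commit["commitId"], commit["comment"])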

async autocomplete service

Call me crazy, but I'm looking for a service that will deliver autocomplete functionality similar to Google, Twitter, etc. After searching around for 20 min I thought to ask the geniuses here. Ideas?
I don't mind paying, but it would be great if it were free. Also, is there a top-notch NLP service that I can submit strings to and get back states, cities, currencies, company names, establishments, etc.? Basically I need to take unstructured data (a generic search string) and pull out key information with relevant metadata.
Big challenge, I know.
Sharing solutions I found after further research.
https://github.com/haochi/jquery.googleSuggest
http://shreyaschand.com/blog/2013/01/03/google-autocomplete-api/
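The second link describes Google's unofficial suggest endpoint; here is a minimal sketch of querying it in Python, with the caveat that the endpoint is undocumented and may change or be blocked at any time:

import requests

# Unofficial endpoint described in the blog post above; not an official API.
resp = requests.get(
    "https://suggestqueries.google.com/complete/search",
    params={"client": "firefox", "q": "how to make"},
)
resp.raise_for_status()

# The response is a JSON array: [original query, [suggestion, suggestion, ...]]
query, suggestions = resp.json()[:2]
print(suggestions)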
If you don't want to implement it yourself, you can use a service called 'Autocomplete as a Service', which is written specifically for this purpose. You can access it here: www.aaas.io.
You can add metadata to each record, and it returns the metadata along with the matching results. Do check out the demo on the home page. It has a very simple API written specifically for autocomplete search.
It supports large datasets, and you can apply filters as well while searching.
Its usage is simple: add your data and use the API URL as the autocomplete data source.
Disclaimer: I am the founder of it. I will be happy to provide this service to you.

StockTwits API Streaming and Search Used Like Twitter Streaming

The StockTwits API documentation describes streams in a way that sounds like static search results, for example streams/symbol:
This allows an API application to search for a symbol or user. 30 Results will be a combined list of symbols and users.
This seems similar to search/symbols:
This allows an API application to search for a symbol directly. 30 Results will return only ticker symbols.
Other than the fact that search excludes users, I don't see the difference.
In contrast, the Twitter API provides methods to request a continuous stream of tweets, which has given me tens of thousands of tweets over a few days.
Is it possible to have StockTwits pump tweets continuously, similar to Twitter?
If so, what is required? Since StockTwits streaming looks like searching to me, the only option I have seen is to submit repeated search requests (roughly the polling loop sketched below), but that would exhaust the rate limit.
I prefer C#, but I am glad to study answers in other languages, such as PHP.
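To make the repeated-search idea concrete, here is a rough polling sketch, in Python for brevity rather than C#; it assumes the streams/symbol endpoint and its since parameter behave as the StockTwits docs describe, and it is exactly the kind of loop that runs into the rate limit:

import time
import requests

SYMBOL = "AAPL"   # placeholder symbol
since_id = 0      # highest message id seen so far

while True:
    resp = requests.get(
        f"https://api.stocktwits.com/api/2/streams/symbol/{SYMBOL}.json",
        params={"since": since_id} if since_id else {},
    )
    resp.raise_for_status()
    for msg in resp.json().get("messages", []):
        print(msg["id"], msg["body"])
        since_id = max(since_id, msg["id"])
    # Unauthenticated requests are heavily rate limited, so poll slowly.
    time.sleep(60)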
This is a static search for symbols, or for both symbols and users as a combined search. It isn't a streaming endpoint for filtering content; it is strictly for finding a symbol or a user in order to go directly to their stream.
We are looking into offering streaming endpoints and search would be part of this offering.
You may be interested in using streamdata.io, which allows you to stream any API. We have already implemented a StockTwits demo, which can be found here, and explanations can be found in this blog post.
I think it's quite easy to transpose what has been done with Android to the C# world. All you need is an EventSource library and a JSON-Patch library.

how to specify URL in filters pagePath core reporting api V3

I am building a web app that pulls data from Google through the Core Reporting API v3. I am using the PHP client library offered by Google.
I am currently trying to specify a page and retrieve its pageviews for a time range. Everything else seems to be working okay, except that if I specify a filter with ga:pagePath==http://link/uri then I get 0 all the time, no matter the time range.
I think the problem has to do with how the value for pagePath is set. I want to have separate data for the desktop version of the site and for the smartphone version, denoted by the s. subdomain.
Can anyone give me some tips or tricks to get the required data?
Example URL:
http://domain.com/user/profile/id/1
http://s.domain.com/user/profile/id/1
Thanks in advance!
For the default implementation of Google Analytics, ga:pagePath doesn't include the scheme or hostname, so in your case you'd actually want to filter using ga:hostname and ga:pagePath together.
I suggest you use the Query Explorer to build your queries and get familiar with what will work. You can also use this tool to at least get a sense for what type of data the ga:pagePath and ga:hostname dimensions return before trying to filter on them. Finally, once you have the query you want, you can easily get the exact Core Reporting API query by clicking on the Query URI button.
Also check out the Combining Filters section of GA API docs.
So if you want to filter on ga:pagePath for domain.com and s.domain.com separately, you could do something like:
filters=ga:pagePath==/user/profile/id/1;ga:hostname==domain.com
filters=ga:pagePath==/user/profile/id/1;ga:hostname==s.domain.com
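The question uses the PHP client library, but for illustration the same query sketched in Python with google-api-python-client looks roughly like this; the key file, view (profile) ID, and date range are placeholders:

from googleapiclient.discovery import build
from google.oauth2 import service_account

# Placeholder: a service account key file with Analytics read access.
credentials = service_account.Credentials.from_service_account_file(
    "key.json", scopes=["https://www.googleapis.com/auth/analytics.readonly"]
)
service = build("analytics", "v3", credentials=credentials)

result = service.data().ga().get(
    ids="ga:12345678",        # placeholder view (profile) ID
    start_date="2014-01-01",  # placeholder date range
    end_date="2014-01-31",
    metrics="ga:pageviews",
    filters="ga:pagePath==/user/profile/id/1;ga:hostname==domain.com",
).execute()

print(result.get("totalsForAllResults"))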

How can I fetch more than 1000 Google results with the Perl Google API?

Using the regular search engine, as a human, can get you no more than 1000 results, which is far more than a regular person needs.
But what if I do want to get 2000? Is it possible? I read that it is possible using App Engine or something like that (over here...), but is it possible, somehow, to do it through Perl?
I don't know a way around this limit, other than to use a series of refined searches versus one general search.
For example instead of just "Tim Medora", I might search for myself by:
Search #1: "Tim Medora Phoenix"
Search #2: "Tim Medora Boston"
Search #3: "Tim Medora Canada"
However, if you are trying to use Google to search a particular site, you may be able to read that site's Google sitemaps.
For example, www.linkedin.com exposes all 80 million+ users/businesses via a series of nested sitemap XML files: http://www.linkedin.com/sitemap.xml.
Using this method, you can crawl a specific site quite easily with your own search algorithm if they have good Google sitemaps.
Of course, I am in no way suggesting that you exploit a sitemap for illegal/unfriendly purposes.
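The question asks about Perl, but as an illustration of the sitemap approach, here is a short sketch in Python (the same idea ports directly to LWP::UserAgent plus an XML parser); the sitemap URL is a placeholder:

import requests
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def fetch_locs(sitemap_url):
    """Return all <loc> URLs from a sitemap or sitemap index file."""
    resp = requests.get(sitemap_url)
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    return [loc.text for loc in root.iter(NS + "loc")]

# A sitemap index lists nested sitemaps, each of which lists page URLs.
# Note: large sites often serve nested sitemaps as .xml.gz, which would
# need gzip decompression before parsing.
top_level = fetch_locs("https://www.example.com/sitemap.xml")  # placeholder
for nested in top_level[:3]:  # only crawl a few, politely
    for page_url in fetch_locs(nested):
        print(page_url)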