About CWV data differences between PSI and GSC - google-search-console

I have a question about CWV (Core Web Vitals) data in PSI and GSC. I hope specialists can help me with it.
I noticed that when I check a specific URL in PSI, the report has two views: this URL and origin. According to the documentation, PSI data is based on real-user data for the URL. Does this mean that if a URL has insufficient users, i.e. traffic (whether from SEO or direct visits), the CWV data shown will fall back to the origin view, which reflects the whole domain rather than that specific URL?
Similarly, we assume that the CWV data shown in GSC is based only on URLs with organic search traffic (search users). We are not sure whether we understand this correctly.
So can we say that PSI CWV data is based on URLs with traffic from any source, including SEO, direct, and perhaps social, while GSC CWV data is based only on URLs with organic search traffic?
We want to understand the difference between the CWV data sources so our engineers can decide which data to use to monitor and therefore improve CWV across the whole site (over 30M live listings). Thanks.

Both PSI and GSC are based on the same underlying data source: the Chrome UX Report (CrUX). All eligible users' experiences are included in the dataset, and there is no difference in the user population between PSI and GSC.
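For illustration, the public PageSpeed Insights v5 API returns both views in one response; here is a minimal sketch of checking which one applies (assuming Python with the requests library; to my understanding the v5 response sets an origin_fallback flag when URL-level CrUX data is unavailable, but treat that field name as an assumption):

import requests

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def check_field_data(url, api_key=None):
    # Query PSI for the page; an API key is optional for light usage.
    params = {"url": url, "strategy": "mobile"}
    if api_key:
        params["key"] = api_key
    resp = requests.get(PSI_ENDPOINT, params=params, timeout=60)
    resp.raise_for_status()
    loading = resp.json().get("loadingExperience", {})
    # When CrUX has too few samples for this specific URL, origin-level
    # data is returned instead (assumed to be signalled by origin_fallback).
    if loading.get("origin_fallback"):
        print("Insufficient URL-level data; origin-level CWV shown.")
    else:
        print("URL-level CWV data is available for this page.")

check_field_data("https://example.com/some-listing")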

Related

OSM place name translations

I use OSM in my application to determine the name of the rental location from a given longitude and latitude; however, I sometimes get no answer from the OSM server (network is unreachable). I want to know the number of requests allowed and the usage restrictions of OSM.
You can find the full list of policy rules here.
Among others here is a list of rules:
The requirements:
No heavy uses (an absolute maximum of 1 request per second).
Provide a valid HTTP Referer or User-Agent identifying the application (stock User-Agents as set by HTTP libraries will not do).
Clearly display attribution as suitable for your medium.
Data is provided under the ODbL license, which requires share-alike (although small extractions are likely to be covered by fair usage / fair dealing).
And there is also a set of "Unacceptable usage" rules:
Auto-complete search: this is not yet supported by Nominatim, and you must not implement such a service on the client side using the API.
Systematic queries: this includes reverse queries in a grid, searching for complete lists of postcodes, towns, etc., and downloading all POIs in an area. If you need complete sets of data, get them from the OSM planet file or an extract.
Scraping of details: the details page is there for debugging only and may not be downloaded automatically.
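For illustration, a minimal reverse-geocoding sketch that stays within these rules (assuming Python with the requests library; the User-Agent string and coordinates are placeholders):

import time
import requests

NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"
# Identify your application, as the policy requires (placeholder value).
HEADERS = {"User-Agent": "my-rental-app/1.0 (contact@example.com)"}

def reverse_geocode(lat, lon):
    params = {"lat": lat, "lon": lon, "format": "jsonv2"}
    try:
        resp = requests.get(NOMINATIM_REVERSE, params=params,
                            headers=HEADERS, timeout=10)
        resp.raise_for_status()
        return resp.json().get("display_name")
    except requests.RequestException:
        # Network unreachable or server error: fail gracefully.
        return None

# Respect the absolute maximum of 1 request per second.
for lat, lon in [(48.8584, 2.2945), (51.5007, -0.1246)]:
    print(reverse_geocode(lat, lon))
    time.sleep(1)

Handling the exception also covers the "network is unreachable" case from the question, so the application can degrade gracefully instead of failing.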

How to design endpoints for data not considered as a resource in a REST API

User context:
A school administrator logs into a dashboard. The page displays a block of data at the top of the page:
Number of students who used the service over the past week
The aggregate feedback (positive, negative, neutral) left by the students over the past week in percentages.
Other aggregate data
Underneath is a bunch of charts and graphs representing usage of the service broken down by month, daily usage broken down by hour, etc.
My problem:
I'm trying to build an API following REST principles, where endpoints should define a resource and HTTP verbs are the actions to take on those resources. My problem is figuring out how to build endpoints for this more 'analytical' and aggregate data, which doesn't really seem to fit anywhere in my resources. Ideally, each graph or chart could be one request to an endpoint, and the block of aggregate data at the top would also be its own request, rather than 3 requests (1 for each piece of data). Can someone point me in the right direction on how to build the endpoints for these specific scenarios?
Thanks
Can someone point me in the right direction on how to build the endpoints for these specific scenarios?
TL;DR: How would you build a web site to support those scenarios? Do that.
If you were using something like a document store, then you would take the URI, say /feedbackReports/lastWeek, and use that as a key, and pull from the document store a representation of that report, and return it to the client (along with various bits of metadata).
If you were using something like a file system, then you would take the URI, and construct some reference to a file, like /www_root/feedbackReports/lastWeek, and read the representation of that report from disk, and return it to the client (along with various bits of metadata).
If you were using something like a relational database, then you would take the URI, see that the "last week" report was being asked for, inject a bunch of "-7 days" parameters into prepared statements, run them, reshape the data in memory into some representation of that report, and return it to the client (along with various bits of metadata).
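To make the relational-database case concrete, here is a minimal sketch (assuming Python with Flask and SQLite; the feedback table and its columns are invented for illustration):

import sqlite3
from flask import Flask, jsonify

app = Flask(__name__)

@app.get("/feedbackReports/lastWeek")
def last_week_feedback_report():
    # The URI names the report; the SQL below is an implementation detail
    # the client never sees.
    conn = sqlite3.connect("school.db")
    try:
        row = conn.execute(
            """
            SELECT
                COUNT(DISTINCT student_id),
                AVG(CASE WHEN sentiment = 'positive' THEN 1.0 ELSE 0 END),
                AVG(CASE WHEN sentiment = 'negative' THEN 1.0 ELSE 0 END),
                AVG(CASE WHEN sentiment = 'neutral'  THEN 1.0 ELSE 0 END)
            FROM feedback
            WHERE created_at >= datetime('now', '-7 days')
            """
        ).fetchone()
    finally:
        conn.close()
    # Reshape the data in memory into the representation the client needs.
    return jsonify({
        "students": row[0],
        "feedback": {"positive": row[1], "negative": row[2], "neutral": row[3]},
    })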
I'm trying to build an API following REST principles where endpoints should define a resource and HTTP verbs as the action to take on those resources
The REST principle in question is that the API isolates the clients (and intermediary components) from all of the implementation details. The API is the mask that your application wears so that web integrations just work.
My problem is figuring out how to build endpoints for this more 'analytical' and aggregate data that doesn't really seem to fit anywhere in my resources.
So create more resources.
Note: these are integration resources; which is to say that they produce the representations that web clients need to interact with your domain.
Jim Webber, in 2008
URIs do NOT map onto domain objects - that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources. In other words, the resources are part of the anti-corruption layer. You should expect to have many many more resources in your integration domain than you do business objects in your business domain.

exporting data from Bluemix Presence Insights

I'm trying to export data from Presence Insights on Bluemix. I followed this documentation:
https://presenceinsights.ng.bluemix.net/pidocs/analytics/export/
However, I can't seem to find the export button mentioned in the document.
Data can be exported from the IBM Presence Insights Dashboard if you have data available. There are also REST APIs for exporting data. They are documented in the Floors, Sites, and Zones sections of the API Reference.
There were REST APIs in the product some time ago, but they were found to have limitations that made them less useful in production. In particular, the amount of data that builds up forced the response time of the API to grow beyond what the Bluemix infrastructure allowed, and the API requests would time out. To that end, the APIs were backed out, but it appears the documentation was left behind; that will be removed shortly.
The Presence Insights team still understands the value of exporting the data, so a new scheme is under investigation. For example, it would be ideal if the data could be exported under the covers to a production data storage facility on a regular schedule.
In the interim, an alternative solution would be to use a Subscription to gather the backend enter/exit/dwell/timeout events directly and roll your own solution to store only what you need in whatever facility works for your application.

Clarification on Bing Maps API locationCode field

I'm working on learning the Bing Maps API for my current project at work. In various locations in route and traffic data output, there is a field called locationCode (or locationCodes). The documentation provides no clarification on what this field is, except to state: "A subscription is typically required to be able to interpret these codes for a geographical area or country."
Nothing I've been able to find yet defines this further or clarifies what data that code can be translated into by these subscription services. So, what are some of the services available to translate this code and what data can be obtained via these services?
That information is related to traffic for specific roads; see the details of the API:
http://msdn.microsoft.com/en-us/library/hh441726.aspx
Also the description of each field included in the result:
http://msdn.microsoft.com/en-us/library/hh441730.aspx
You are referring to the locationCode. These codes relate to Traffic Message Channel (TMC) information, which includes the event ID, the location code, the expected incident duration, the affected extent, and other details.
TMC data is managed in different ways depending on the country you are interested in, and specialised mapping data providers can give you access to this kind of information; that is what the subscription is about. How you can use TMC data to get or contribute information also depends on the area concerned.
For more information about TMC:
http://en.wikipedia.org/wiki/Traffic_message_channel

Comparing data with RESTful API

For a website I am working on, I am defining a RESTful API. I believe I got it (mostly) correct, using proper resource URIs and proper use of GET/POST/PUT/DELETE.
However there is one point where I can't quite figure out what the proper way to do it "in" REST would be - comparison of lists.
Let's say I have a bookstore and a customer can have a wishlist. The wishlist consists of books (their full Book record, i.e. name, synopsis, etc) and a full copy of the list exists on the client. What would be a good way to design the RESTful API to allow a client to query the correctness of its local wishlist (i.e. get to know what books have been added/removed on the wishlist on the server side)?
One option would be to just download the full wishlist from the server and compare it locally. However this is quite a large amount of data (due to the embedded content) and this is a mobile client with a low-bandwidth connection, so this would cause a lot of problems.
Another option would be to download not the whole wishlist (i.e. not including book infos) but only a list of the books' identifiers. This would be not much data (compared to the previous option) and the client could compare the lists locally. However to get the full book record for newly added books, a REST call would have to be made for every single new book. Again, as this is a mobile client with bad network connectivity, this could be problematic.
A third option, and my favorite, would be that the client sends its list of identifiers to the server, and the server compares it to the wishlist and returns which books were removed along with the data for books that were added. This would mean a single round trip and only the necessary amount of data. As the wishlist size is estimated to be less than 100 entries, sending just the IDs would be a minimal amount of data (~0.5 KB). However, I don't know what kind of call would be appropriate - it can't be GET, as we are sending data (and putting it all in the URL does not feel right); it can't be POST/PUT, as we do not change anything on the server. Obviously it's not DELETE either.
How would you implement this third option?
Side question: how would you solve this problem (i.e., is option 3 misguided, and what better, simpler solutions might there be)?
Thank you.
P.S.: A fourth option would be to implement a more sophisticated protocol where the server keeps track of changes to the list (additions/deletions) and the client can, e.g., query for changes based on a version identifier or simply a timestamp. However, I like the third option better, as implementation-wise it is much simpler and less error-prone on both client and server.
There is nothing in HTTP that says that POST must update the server. People seem to forget the following line in RFC 2616 regarding one use of POST:
Providing a block of data, such as the result of submitting a form, to a data-handling process;
There is nothing wrong with taking your client side wishlist and POSTing to a resource whose sole purpose is to return a set of differences.
POST /Bookstore/WishlistComparisonEngine
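A minimal sketch of what that exchange could look like from the client (assuming Python with the requests library; the host name and the request/response payload shapes are invented for illustration):

import requests

# IDs cached on the device; the server computes the differences.
local_ids = ["978-0134685991", "978-1492056355"]

resp = requests.post(
    "https://api.example.com/Bookstore/WishlistComparisonEngine",
    json={"bookIds": local_ids},
    timeout=10,
)
resp.raise_for_status()
diff = resp.json()
# Assumed response shape: {"removed": [ids], "added": [full book records]}
for book in diff["added"]:
    print("added to wishlist:", book["name"])
for book_id in diff["removed"]:
    print("removed from wishlist:", book_id)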
The whole concept behind REST is that you leverage the power of the underlying HTTP protocol.
In this case there are two HTTP headers that can help you find out if the list on your mobile device is stale. An added benefit is that the client on your mobile device probably supports these headers natively, which means you won't have to add any client side code to implement them!
If-Modified-Since: check whether the server's copy has been updated since your client first retrieved it.
ETag: check whether a unique identifier for your client's local copy matches the one on the server. An easy way to generate the unique string required for ETags on your server is to hash the service's text output using MD5.
You might try reading Mark Nottingham's excellent HTTP caching tutorial for information on how these headers work.
If you are using Rails 2.2 or greater, there is built in support for these headers.
Django 1.1 supports conditional view processing.
And this MIX video shows how to implement with ASP.Net MVC.
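A minimal sketch of a conditional GET driven by an ETag (assuming Python with the requests library; the wishlist URL is a placeholder):

import hashlib
import requests

url = "https://api.example.com/customer/42/wishlist"  # placeholder

# First fetch: keep the representation and the validator the server sent.
first = requests.get(url, timeout=10)
cached_body = first.content
etag = first.headers.get("ETag")

# Later check: send the validator back; 304 means the cached copy is current.
headers = {"If-None-Match": etag} if etag else {}
second = requests.get(url, headers=headers, timeout=10)
if second.status_code == 304:
    print("Wishlist unchanged; keep using the cached copy.")
else:
    cached_body = second.content
    print("Wishlist changed; cache refreshed.")

# Server side, the ETag can be as simple as an MD5 hash of the output:
print(hashlib.md5(cached_body).hexdigest())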
I think the key problems here are the definitions of Book and Wishlist, and where the authoritative copies of Wishlists are kept.
I'd attack the problem this way. First you have Books, which are keyed by ISBN number and have all the metadata describing the book (title, authors, description, publication date, pages, etc.) Then you have Wishlists, which are merely lists of ISBN numbers. You'll also have Customer and other resources.
You could name Book resources something like:
/book/{isbn}
and Wishlist resources:
/customer/{customer}/wishlist
assuming you have one wishlist per customer.
The authoritative Wishlists are on the server, and the client has a local cached copy. Likewise the authoritative Books are on the server, and the client has cached copies.
The Book representation could be, say, an XML document with the metadata. The Wishlist representation would be a list of Book resource names (and perhaps snippets of metadata). The Atom and RSS formats seem good fits for Wishlist representations.
So your client-server synchronization would go like this:
GET /customer/{customer}/wishlist
for ( each Book resource name /book/{isbn} in the wishlist )
GET /book/{isbn}
This is fully RESTful, and lets the client later on do PUT (to update a Wishlist) and DELETE (to delete it).
This synchronization would be pretty efficient on a wired connection, but since you're on a mobile connection you need to be more careful. As @marshally points out, HTTP 1.1 has a lot of optimization features. Do read that HTTP caching tutorial, and be sure to have your web server properly set Expires headers, ETags, etc. Then make sure the client has an HTTP cache. If your app were browser-based, you could leverage the browser cache. If you're rolling your own app and can't find a caching library to use, you can write a really basic HTTP 1.1 cache that stores the returned representations in a database or in the file system. The cache entries would be indexed by resource names, and hold the expiration dates, entity tags, etc. This cache might take a couple of days or a week or two to write, but it is a general solution to your synchronization problems.
You can also consider using GZIP compression on the responses, as this cuts down the sizes by maybe 60%. All major browsers and servers support it, and there are client libraries you can use if your programming language doesn't already (Java has GzipInputStream, for instance).
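A minimal sketch of that synchronization loop with a very basic validator cache (assuming Python with the requests library, a JSON array of ISBNs as the Wishlist representation, and an in-memory dict standing in for the persistent cache described above):

import json
import requests

BASE = "https://api.example.com"  # placeholder host
cache = {}  # resource name -> (etag, body); a real cache would persist this

def get_cached(path):
    etag, body = cache.get(path, (None, None))
    headers = {"If-None-Match": etag} if etag else {}
    resp = requests.get(BASE + path, headers=headers, timeout=10)
    if resp.status_code == 304:
        return body  # cached copy is still fresh; nothing was re-downloaded
    resp.raise_for_status()
    cache[path] = (resp.headers.get("ETag"), resp.content)
    return resp.content

# Sync: fetch the wishlist, then each Book resource it references.
wishlist = get_cached("/customer/42/wishlist")
for isbn in json.loads(wishlist):
    book = get_cached("/book/" + isbn)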
If I strip out the domain-specific details from your question, here's what I get:
In your RESTful client-server application, the client stores a local copy of a large resource. Periodically, the client needs to check with the server to determine whether its copy of the resource is up-to-date.
marshally's suggestion is to use HTTP caching, which IMO is a good approach provided it can be done within your app's constraints (e.g., authentication system).
The downside is that if the resource is stale in any way, you'll be downloading the entire list, which sounds like it's not feasible in your situation.
Instead, how about re-evaluating the need to keep a local copy of the Wishlist in the first place:
How is your client currently using the local Wishlist?
If you had to, how would you replace the local copy with data fetched from the server?
What have you done to minimize your client's data requirements when building its Wishlist view(s) and executing business logic?
Your third alternative sounds nice, but I agree that it doesn't feel too RESTful...
Here's another suggestion that may or may not work: if you keep a version history of your list, you could ask for updates since a specific version. This feels more like something that can be a GET operation. The version identifiers could either be simple version numbers (as in e.g. svn), or if you want to support branching or other non-linear history, they could be some kind of checksums (as in e.g. monotone).
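A minimal sketch of such a delta query (assuming Python with the requests library; the changes sub-resource and the response shape are invented for illustration):

import requests

# Ask only for changes since the version the client already has.
resp = requests.get(
    "https://api.example.com/customer/42/wishlist/changes",
    params={"since": 17},  # version number stored with the local copy
    timeout=10,
)
resp.raise_for_status()
delta = resp.json()
# Assumed shape: {"version": 19, "added": [...], "removed": [...]}
local_version = delta["version"]

Because the request only reads state, it fits naturally as a GET, and the since parameter keeps the response small.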
Disclaimer: I'm not an expert on REST philosophy or implementation by any means.
Edit: Did you add that PS after I loaded the question? Or did I simply not read your question all the way through before writing an answer? Sorry. I still think the versioning might be a good idea, though.