HAR file - access "Size" column entries from Chrome Dev Tools Network tab? - google-chrome-devtools

I am working on measuring the percentage of GET requests being handled / returned by a site's service worker. Within Chrome Dev Tools there is a "Size" column that shows "(from ServiceWorker)" for files matched by the cache.
When I right-click any row and choose "Save as HAR with content", then open the downloaded file in a text editor, searching for "service worker" turns up a few results (responses containing "statusText": "Service Worker Fallback Required"), but none of them look related to the fact that some requests were handled by the service worker.
Is this information I'm looking for accessible anywhere within the downloaded HAR file? Alternatively, could this be found out by some other means like capturing network traffic through Selenium Webdriver / ChromeDriver?

It looks like the content object defines the size of requests: http://www.softwareishard.com/blog/har-12-spec/#content
But I'm not seeing anything in a sample HAR file from airhorner.com that would help you determine that the request came from a service worker. Seems like a shortcoming in the HAR spec.
It looks like Puppeteer provides this information. See response.fromServiceWorker().
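For example, a minimal Puppeteer sketch along those lines (assuming Puppeteer is installed; airhorner.com from the answer above is used as the test site) could count how many responses were served by the service worker:

    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();

      let total = 0;
      let fromSW = 0;

      // Count every response, and how many of them were served by the service worker.
      page.on('response', (response) => {
        total += 1;
        if (response.fromServiceWorker()) {
          fromSW += 1;
        }
      });

      // Load once so the service worker installs, then reload so it can intercept.
      await page.goto('https://airhorner.com');
      await page.reload();

      console.log(`${fromSW} of ${total} responses came from the service worker`);
      await browser.close();
    })();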

I tried to investigate this a bit in Chrome 70. Here's a summary.
I'm tracking all requests for the https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.5/require.min.js URL, which is a critical script for my site.
TL;DR
As Kayce suggests, within a Chrome HAR file there is no explicit way of determining that an entry was handled by a service worker (as far as I can see). I also haven't been able to find a combination of existing HAR entry fields that would positively identify an entry as being handled by a service worker (but perhaps there is such a combination).
In any case, it would be useful for browsers to record any explicit relationships between HAR entries, so that tools like HAR Viewer could recognise that two entries are for the same logical request, and therefore not display two requests in the waterfall.
Setup
Clear cache, cookies, etc, using the Clear Cache extension.
First and second entries found in HAR
The first entry (below) looks like a request that is made by the page and intercepted/handled by the service worker. There is no serverIPAddress and no connection, so we can probably assume this is not a 'real' network request.
The second entry is also present as a result of the initial page load - there has been no other refresh/reload - you get 2 entries in the HAR for the same URL on initial page load (if it passes through a service worker and reaches the network).
The second entry (below) looks like a request made by the service worker to the network. We see the serverIPAddress and response.connection fields populated.
An interesting observation here is that entry#2's startedDateTime and time fall 'within' the startedDateTime and time of the 'parent' request/entry.
By this I mean entry#2's start and end time fall completely within entry#1's start and end time. Which makes sense as entry#2 is a kind of 'sub-request' of entry#1.
It would be good if the HAR spec had a way of explicitly recording this relationship. I.e. that request-A from the page resulted in request-B being sent by the service worker. Then a tool like HAR Viewer would not display two entries for what is effectively a single request (would this cover the case where a single fetch made by the page resulted in multiple service worker fetches?).
Another observation is that entry#1 records the request.httpVersion and response.httpVersion as http/1.1, whereas the 'real' request used http/2.0.
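If you only need a rough percentage, the absence-of-network-fields observation above can be scripted against the exported HAR. A sketch (the filename example.har is assumed; this is a heuristic, not a reliable detector, and it will also flag other non-network entries):

    const fs = require('fs');

    const har = JSON.parse(fs.readFileSync('example.har', 'utf8'));

    for (const entry of har.log.entries) {
      // No server IP and no connection id suggest the entry never hit the
      // network directly; Chrome's custom _fromCache field rules out plain
      // memory/disk cache hits.
      const noNetwork = !entry.serverIPAddress && !entry.connection;
      if (noNetwork && !entry._fromCache) {
        console.log('possibly handled by a service worker:', entry.request.url);
      }
    }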
Third entry (from pressing enter in the address bar after initial page load)
This entry appears in the HAR as a result of pressing enter in the address bar. The _fromCache field is memory, as expected, since the resource should be served from the regular browser cache in this situation (the resource is served with Cache-Control: public, max-age=30672000).
Questions:
Was this entry 'handled' by the service worker's fetch event?
Maybe when a resource is in memory cache the service worker fetch event isn't fired?
Or is service worker effectively 'transparent' here?
There are no serverIPAddress or connection fields, as expected, since there was no 'real' network request.
There is a pageref field present, unlike for entry#2 (entry#2 was a service-worker-initiated network request).
Fourth entry
The preparation work for this entry was:
Add the resource to a service-worker-cache (https://developer.mozilla.org/en-US/docs/Web/API/Cache).
Use Clear Cache extension to clear all caches (except service-worker-cache).
Reload page.
This entry has _fromCache set to disk. I assume this is because the service-worker cache was able to satisfy the request.
Neither serverIPAddress nor connection is set, but pageref is set.
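For reference, a cache-first fetch handler along these lines would produce exactly this kind of entry. A minimal sketch (the cache name and precache list are invented for illustration):

    // sw.js - minimal cache-first service worker.
    const CACHE_NAME = 'static-v1';

    self.addEventListener('install', (event) => {
      // Pre-populate the service-worker-cache with the tracked resource.
      event.waitUntil(
        caches.open(CACHE_NAME).then((cache) =>
          cache.addAll([
            'https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.5/require.min.js',
          ])
        )
      );
    });

    self.addEventListener('fetch', (event) => {
      // Serve from the cache when possible, fall back to the network otherwise.
      event.respondWith(
        caches.match(event.request).then((cached) => cached || fetch(event.request))
      );
    });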
Fifth entry
The preparation work for this entry was:
Use devtools to enter 'Offline' mode.
This entry is basically the same as entry#4.

Related

REST: How to support create-or-update and partial-update? (aka PUT vs PATCH)

We are designing WebAPI for our software for managing ecommerce product information. We want to provide (among many others) two operations:
Simple one: allow user to add/modify existing product information:
don't create a new product if it doesn't exist
don't delete any information from the existing product that was not provided in this request
In my opinion the HTTP PATCH method is the proper way to handle this scenario (with json-patch or json-merge-patch), with a URL like /products/{ID}.
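For instance, a json-merge-patch request for this case might look like the following (field names are invented for illustration; per RFC 7396, fields absent from the patch are left untouched):

    PATCH /products/123 HTTP/1.1
    Content-Type: application/merge-patch+json

    {
      "price": 19.99,
      "description": "Updated copy - fields not listed here are preserved"
    }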
Harder one: allow the user to add/modify an existing product, or create one:
create the product if it doesn't exist in the DB
don't delete any information from the existing product that was not provided in this request (same behaviour as in the first case)
I'm struggling with designing a REST endpoint for this second use case. I have a few options, but none of them fits REST principles perfectly for me:
a) Add a custom HTTP header to the endpoint designed for the first case (PATCH) to let the caller control the "not found" behaviour, e.g. create-entity-when-not-exists: true/false - but in my opinion PATCH shouldn't be used for creating resources.
b) Design a new endpoint using PUT with a special header "preserve-not-provided-data" - this, on the other hand, violates PUT semantics for me, because PUT is a create-or-replace method, not create-or-update.
c) Create a PATCH for the /products URL (without {ID} at the end) - in this case we are updating the whole collection (resource) of products, so if a product exists we can update it, or create a new one if it doesn't.
For now, solution c) looks fine to me, with one exception: if in the future we want to support batch operations (for both use cases 1 and 2), we would want to use the /products URL, and it would conflict with the URL from solution c).
What do you think? Do you have any other ideas?
PUT and PATCH have differing message semantics, but the core context ("remote authoring") is the same. In both cases, the client request is "Please, server, make your representation of this resource match my local copy".
For example, I GET a JSON document from the server. I make local edits to it. Now I want to "save" my changes on the server. If the document is modest in size, I might just send the entire revised document over the network. If the document is very large, and my changes are modest, then I might instead send the patch instead.
If you imagine using HTTP to publish edits of HTML web pages to a server, then you've got the right frame of reference. There's not a lot of practical difference between "please patch the title of your copy of the document" and "here is a complete new copy of the document, with my edit to the title". The bytes on disk are going to be the same in either case.
Given that, it would be very odd if those two methods for publishing a new revision of the document were to have vastly different side effects.
Your third approach, based on modifying /products, is potentially fine for both your individual and batch cases. The server gets the new representation of /products (or the patch document describing the changes), decides whether to accept the changes, and if so computes what it needs to do to its own database to make things work.
Note:
A PUT request applied to the target resource can have side effects on other resources.
The HTTP specification is relatively strict about what the message means, but offers the server a lot of leeway in how it behaves in response.
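To make the remote-authoring framing concrete, both of the following requests ask the server to end up with the same stored document (paths and fields are illustrative). If the server's copy differed only in price, the bytes on disk are identical either way:

    PUT /products/123 HTTP/1.1
    Content-Type: application/json

    { "name": "Widget", "price": 19.99, "stock": 4 }

    PATCH /products/123 HTTP/1.1
    Content-Type: application/merge-patch+json

    { "price": 19.99 }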

Jmeter recorded script is not added data on frontend side

I recorded my JMeter script on server X and made it dynamic, then ran the same script against server Y. It fetches all data via post-processors and gives no errors, but the data is not added on the frontend. How can I solve this, and what might the reason be? (The website is the same; only the server was changed for testing.)
Expected: data should be added on the frontend, e.g. a lead is created on server Y (it is created successfully on server X).
Actual: data is not added on server Y.
Most probably you need to correlate your script as it is not doing what it is supposed to be doing.
You can run your test with 1 virtual user and 1 iteration configured in the Thread Group and inspect request and response details using the View Results Tree listener.
My expectation is that you are either not getting logged in (you have added an HTTP Cookie Manager to your Test Plan, haven't you?) or failing to provide valid dynamic parameters. Modern web applications widely use dynamic parameters, for example for client-side state tracking or for CSRF protection.
You can easily detect dynamic parameters by recording the same scenario one more time and comparing the generated scripts. All the values which differ need to be correlated, to wit extracted from the previous response using a suitable Post-Processor and stored into a JMeter Variable. Once done, you will need to replace the recorded hard-coded values with the aforementioned JMeter Variables.
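For example, correlating a CSRF token with a Regular Expression Extractor might look something like this (the parameter name csrf_token is assumed for illustration):

    Regular Expression Extractor (Post-Processor)
      Reference Name:      csrfToken
      Regular Expression:  name="csrf_token" value="(.+?)"
      Template:            $1$
      Match No.:           1

Then the recorded hard-coded value in subsequent requests is replaced with ${csrfToken}.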
Check out the How to Handle Correlation in JMeter article for comprehensive information with examples.

DELETE different resources with one request - is it OK, or should we merge those resources into one?

Let's assume I have a collection at the /playrequests endpoint. It is a collection (list) of players who want to find another player to start a match.
The server will check this collection periodically, and if it finds two unassigned players, it will create another resource in another collection at the /quickmatchs endpoint and also change (for example) a field in the playRequests collection for both players to show that they are assigned to a quickMatch.
At this point, players can send a PUT or PATCH request to set the (for example) "ready" field of their related quickMatch resource to true, so the server and each player can find out whether both of them are ready and the match can be started.
(The problematic part is below...)
Also, both before a playRequest is assigned to a match and after it is assigned, a player can send a DELETE request to the /playrequests endpoint to tell the server they want to give up the request. If the match hasn't been created yet, that's fine: the resource related to the player is removed from the playRequests collection. But if the player has been assigned to a match, the server must delete the related playRequest and also the related quickMatch resource from the quickmatchs collection. (And we should also modify the playRequest of the other player to indicate that they are unassigned now, or we can check and change it later when they check the status of their related resources in both collections. That is not the main issue for now.)
So, my question is: is it OK to change a resource related to the given endpoint and also change another resource accordingly, if necessary? (I mean, is it OK to manipulate different resources with different endpoints in one request? I don't want to send multiple requests.) Or do I need to merge those two collections to avoid such an action?
I know that many things (different strategies) are possible, but I want to know what is standard/appropriate and what is not, from a RESTful point of view. (Consider that I am kind of new to REST.)
is it OK to change a resource that is related to the given endpoint and also change another resource accordingly
Yes, and there are consequences.
Suppose we have two resources, /fizz and /buzz, and the representations of those resources are related via some implementation details on the server. On the client, we have accessed both of these resources, so we have cached copies of each of them.
The potential issue here is that the server changes the representations of these resources together, but as far as the client is concerned, they are separate.
For instance, if the client sends an unsafe request to change /fizz, a successful response message from the server will invalidate the locally cached copy of that representation, but the stale representation of /buzz does not get evicted. In effect, the client now has a view of the world with version 0 /buzz and version 1 /fizz.
Is that OK? "It depends" -- will expensive things happen to you if your clients are in a state of unmatched representations? Can you patch over the "problem" in other ways (for instance, by telling the client to check resources for updates more often)?

Avoid duplicate POSTs with REST

I have been using POST in a REST API to create objects. Every once in a while, the server will create the object, but the client will be disconnected before it receives the 201 Created response. The client only sees a failed POST request, and tries again later, and the server happily creates a duplicate object...
Others must have had this problem, right? But I google around, and everyone just seems to ignore it.
I have 2 solutions:
A) Use PUT instead, and create the (GU)ID on the client.
B) Add a GUID to all objects created on the client, and have the server enforce their UNIQUE-ness.
A doesn't match existing frameworks very well, and B feels like a hack. How do other people solve this in the real world?
Edit:
With Backbone.js, you can set a GUID as the id when you create an object on the client. When it is saved, Backbone will do a PUT request. Make your REST backend handle PUT to non-existent ids, and you're set.
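A minimal sketch of that approach (the model and GUID helper are illustrative; Backbone and its dependencies are assumed to be loaded):

    // Because the model is constructed with an id, Backbone considers it
    // not new, so model.save() issues PUT /orders/<guid> instead of POST.
    const Order = Backbone.Model.extend({ urlRoot: '/orders' });

    const order = new Order({
      id: generateGuid(), // client-generated GUID; helper assumed
      total: 42.5,
    });

    order.save(); // -> PUT /orders/<guid>; retrying after a timeout is safe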
Another solution that's been proposed for this is POST Once Exactly (POE), in which the server generates single-use POST URIs that, when used more than once, will cause the server to return a 405 response.
The downsides are that 1) the POE draft was allowed to expire without any further progress on standardization, and thus 2) implementing it requires changes to clients to make use of the new POE headers, and extra work by servers to implement the POE semantics.
By googling you can find a few APIs that are using it though.
Another idea I had for solving this problem is that of a conditional POST, which I described and asked for feedback on here.
There seems to be no consensus on the best way to prevent duplicate resource creation in cases where unique URI generation can't be delegated to the client (ruling out PUT) and POST is therefore needed.
I always use B -- detection of dups due to whatever problem belongs on the server side.
Detection of duplicates is a kludge, and can get very complicated. Genuine distinct but similar requests can arrive at the same time, perhaps because a network connection is restored. And repeat requests can arrive hours or days apart if a network connection drops out.
All of the discussion of identifiers in the other answers has the goal of returning an error in response to duplicate requests, but this will normally just incite a client to get or generate a new id and try again.
A simple and robust pattern to solve this problem is as follows: Server applications should store all responses to unsafe requests, then, if they see a duplicate request, they can repeat the previous response and do nothing else. Do this for all unsafe requests and you will solve a bunch of thorny problems. Repeat DELETE requests will get the original confirmation, not a 404 error. Repeat POSTS do not create duplicates. Repeated updates do not overwrite subsequent changes etc. etc.
"Duplicate" is determined by an application-level id (that serves just to identify the action, not the underlying resource). This can be either a client-generated GUID or a server-generated sequence number. In this second case, a request-response should be dedicated just to exchanging the id. I like this solution because the dedicated step makes clients think they're getting something precious that they need to look after. If they can generate their own identifiers, they're more likely to put this line inside the loop and every bloody request will have a new id.
Using this scheme, all POSTs are empty, and POST is used only for retrieving an action identifier. All PUTs and DELETEs are fully idempotent: successive requests get the same (stored and replayed) response and cause nothing further to happen. The nicest thing about this pattern is its Kung-Fu (Panda) quality. It takes a weakness: the propensity for clients to repeat a request any time they get an unexpected response, and turns it into a force :-)
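A sketch of the store-and-replay core in an Express handler, using the client-generated-GUID variant (the route and the X-Action-Id header name are invented for illustration):

    const express = require('express');
    const app = express();
    app.use(express.json());

    // action id -> stored response; a real server would persist this.
    const replies = new Map();

    app.post('/orders', (req, res) => {
      // The id identifies the action, not the underlying resource.
      const actionId = req.get('X-Action-Id');
      if (!actionId) return res.status(400).send('missing X-Action-Id');
      if (replies.has(actionId)) {
        // Duplicate request: repeat the previous response, do nothing else.
        const prev = replies.get(actionId);
        return res.status(prev.status).json(prev.body);
      }
      const order = { id: actionId, ...req.body }; // create the resource once
      replies.set(actionId, { status: 201, body: order });
      res.status(201).json(order);
    });

    app.listen(3000);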
I have a little google doc here if anyone cares.
You could try a two step approach. You request an object to be created, which returns a token. Then in a second request, ask for a status using the token. Until the status is requested using the token, you leave it in a "staged" state.
If the client disconnects after the first request, they won't have the token and the object stays "staged" indefinitely or until you remove it with another process.
If the first request succeeds, you have a valid token and you can grab the created object as many times as you want without it recreating anything.
There's no reason why the token can't be the ID of the object in the data store. You can create the object during the first request. The second request really just updates the "staged" field.
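A rough sketch of that two-step flow (routes are invented; a Map stands in for the real data store):

    const express = require('express');
    const crypto = require('crypto');
    const app = express();
    app.use(express.json());

    const store = new Map(); // in-memory stand-in for the data store

    // Step 1: create the object in a "staged" state and hand back a token.
    app.post('/orders', (req, res) => {
      const token = crypto.randomUUID();
      store.set(token, { ...req.body, staged: true });
      res.status(202).json({ token });
    });

    // Step 2: asking for the object un-stages it; repeats are harmless.
    app.get('/orders/:token', (req, res) => {
      const order = store.get(req.params.token);
      if (!order) return res.sendStatus(404);
      order.staged = false;
      res.json(order);
    });

    app.listen(3000);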
Server-issued Identifiers
If you are dealing with the case where it is the server that issues the identifiers, create the object in a temporary, staged state. (This is an inherently non-idempotent operation, so it should be done with POST.) The client then has to do a further operation on it to transfer it from the staged state into the active/preserved state (which might be a PUT of a property of the resource, or a suitable POST to the resource).
Each client ought to be able to GET a list of their resources in the staged state somehow (maybe mixed with other resources) and ought to be able to DELETE resources they've created if they're still just staged. You can also periodically delete staged resources that have been inactive for some time.
You do not need to reveal one client's staged resources to any other client; they need exist globally only after the confirmatory step.
Client-issued Identifiers
The alternative is for the client to issue the identifiers. This is mainly useful where you are modeling something like a filestore, as the names of files are typically significant to user code. In this case, you can use PUT to do the creation of the resource as you can do it all idempotently.
The down-side of this is that clients are able to create IDs, and so you have no control at all over what IDs they use.
There is another variation of this problem. Having a client generate a unique id means that we are asking a customer to solve this problem for us. Consider an environment where we have publicly exposed APIs and hundreds of clients integrating with them. Practically, we have no control over the client code or the correctness of its implementation of uniqueness. Hence, it would probably be better to have intelligence in understanding whether a request is a duplicate. One simple approach here would be to calculate and store a checksum of every request based on attributes from the user input, define a time threshold (x mins), and compare every new request from the same client against the ones received in the past x mins. If the checksum matches, it could be a duplicate request, so add some challenge mechanism for the client to resolve it.
If a client is making two different requests with the same parameters within x mins, it might be worth ensuring that this is intentional, even if it comes with a unique request id.
This approach may not be suitable for every use case; however, I think it will be useful for cases where the business impact of executing the second call is high and can potentially cost the customer. Consider a payment processing engine where an intermediate layer ends up retrying failed requests, or a customer double-clicks, resulting in two requests being submitted by the client layer.
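A sketch of the checksum idea (the window length and the hashed fields are illustrative choices):

    const crypto = require('crypto');

    const seen = new Map(); // checksum -> timestamp of the last sighting
    const WINDOW_MS = 5 * 60 * 1000; // the "x mins" threshold, here 5 minutes

    function isLikelyDuplicate(clientId, body) {
      // Hash the client id plus the user-supplied attributes of the request.
      const checksum = crypto
        .createHash('sha256')
        .update(clientId + JSON.stringify(body))
        .digest('hex');
      const last = seen.get(checksum);
      seen.set(checksum, Date.now());
      // Same client, same payload, inside the window: challenge the client.
      return last !== undefined && Date.now() - last < WINDOW_MS;
    }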
Design
Automatic (without the need to maintain a manual blacklist)
Memory optimized
Disk optimized
Algorithm [solution 1]
A REST request arrives with a UUID.
The web server checks whether the UUID is in the Memory Cache blacklist table (if yes, answer 409).
The server writes the request to the DB (if it was not filtered by ETS).
The DB checks whether the UUID is repeated before writing.
If yes, it answers 409 to the server, which blacklists the UUID in the Memory Cache and on disk.
If not repeated, it writes to the DB and answers 200.
Algorithm [solution 2]
A REST request arrives with a UUID.
Save the UUID in the Memory Cache table (expires after 30 days).
The web server checks whether the UUID is in the Memory Cache blacklist table [return HTTP 409].
The server writes the request to the DB [return HTTP 200].
In solution 2, the Memory Cache blacklist is built ONLY in memory, so the DB is never checked for duplicates. The definition of 'duplicate' is "any request that comes in within a period of time". We also replicate the Memory Cache table to disk, so we can refill it before starting up the server.
In solution 1, there will never be a duplicate, because we always check the disk ONLY once before writing, and if it's duplicated, the next round trips will be handled by the Memory Cache. This solution is better for Big Query, because requests there are not idempotent, but it's also less optimized.

RESTful way to create multiple items in one request

I am working on a small client-server program to collect orders. I want to do this in a "REST(ful) way".
What I want to do is:
Collect all orderlines (product and quantity) and send the complete order to the server
At the moment I see two options to do this:
Send each orderline to the server: POST qty and product_id
I actually don't want to do this because I want to limit the number of requests to the server, so option 2:
Collect all the orderlines and send them to the server at once.
How should I implement option 2? A couple of ideas I have:
Wrap all orderlines in a JSON object and send this to the server or use an array to post the orderlines.
Is it a good idea or good practice to implement option 2, and if so, how should I do it?
What is good practice?
I believe that another correct way to approach this would be to create another resource that represents your collection of resources.
For example, imagine that we have an endpoint like /api/sheep/{id} and we can POST to /api/sheep to create a sheep resource.
Now, if we want to support bulk creation, we should consider a new flock resource at /api/flock (or /api/<your-resource>-collection if you lack a better meaningful name). Remember that resources don't need to map to your database or app models. This is a common misconception.
Resources are a higher-level representation, unrelated to your data. Operating on a resource can have significant side effects, like firing an alert to a user, updating other related data, initiating a long-lived process, etc. For example, we could map a file system or even the unix ps command as a REST API.
I think it is safe to assume that operating on a resource may also mean creating several other entities as a side effect.
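For example, a single request against the collection-of-collections resource (payload fields invented for illustration):

    POST /api/flock HTTP/1.1
    Content-Type: application/json

    {
      "sheep": [
        { "name": "Dolly" },
        { "name": "Shaun" }
      ]
    }

One POST creates one flock resource, and creating the individual sheep is a side effect of operating on it.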
Although bulk operations (e.g. batch create) are essential in many systems, they are not formally addressed by the RESTful architecture style.
I found that POSTing a collection as you suggested basically works, but problems arise when you need to report failures in response to such a request. Such problems are worse when multiple failures occur for different causes or when the server doesn't support transactions.
My suggestion to you is that if there is no performance problem, for example when the service provider is on the LAN (not WAN) or the data is relatively small, it's worth it to send 100 POST requests to the server. Keep it simple, start with separate requests and if you have a performance problem try to optimize.
Facebook explains how to do this: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Simple batched requests
The batch API takes in an array of logical HTTP requests represented as JSON arrays - each request has a method (corresponding to HTTP method GET/PUT/POST/DELETE etc.), a relative_url (the portion of the URL after graph.facebook.com), an optional headers array (corresponding to HTTP headers) and an optional body (for POST and PUT requests). The Batch API returns an array of logical HTTP responses represented as JSON arrays - each response has a status code, an optional headers array and an optional body (which is a JSON encoded string).
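Based on that description, the batch form parameter sent in a single POST to graph.facebook.com would look something like this (the exact shape may have changed since; see the linked docs):

    [
      { "method": "GET",  "relative_url": "me" },
      { "method": "POST", "relative_url": "me/feed", "body": "message=Hello" }
    ]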
Your idea seems valid to me. The implementation is a matter of your preference. You can use JSON or just parameters for this ("order_lines[]" array) and do
POST /orders
Since you are going to create more than one resource at once in a single action (the order and its lines), it's vital to validate each and every one of them and save them only if all of them pass validation, i.e. you should do it in a transaction.
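For instance, such a request might look like this (field names follow the question; values are illustrative):

    POST /orders HTTP/1.1
    Content-Type: application/json

    {
      "order_lines": [
        { "product_id": 17, "qty": 2 },
        { "product_id": 42, "qty": 1 }
      ]
    }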
I've actually been wrestling with this lately, and here's what I'm working towards.
If a POST that adds multiple resources succeeds, return a 200 OK (I was considering a 201, but the user ultimately doesn't land on a resource that was created) along with a page that displays all resources that were added, either in read-only or editable fashion. For instance, a user is able to select and POST multiple images to a gallery using a form comprising only a single file input. If the POST request succeeds in its entirety the user is presented with a set of forms for each image resource representation created that allows them to specify more details about each (name, description, etc).
In the event that one or more resources fails to be created, the POST handler aborts all processing and appends each individual error message to an array. Then a 409 Conflict is returned and the user is routed to an error page that presents the contents of the error array, as well as a way back to the form that was submitted.
I guess it's better to send separate requests within a single connection. Of course, your web server should support it.
You won't want to send HTTP headers for 100 orderlines, nor do you want to generate any more requests than necessary.
Send the whole order in one JSON object to the server, to: server/order or server/order/new.
Return something that points to: server/order/order_id
Also consider using PUT (create) instead of POST.