I am quite new to API testing and any topics in networking in general.
I'm attempting to retrieve BC, Canada school names and rankings from a website. The target data is in the right table, available after a prompt to choose a province (here I selected British Columbia). I am using the chrome developer tools to analyze the requests/responses after selecting the province, however I am getting gibberish when expecting a JSON response.
After choosing the province, 3 XMLHttpRequests are made to compareschoolrankings.org/api/v1/ with response headers of
content-type: application/json
I assume the responses to these requests hold my target data, however the response content is gibberish when I would otherwise expect it in json format, in example:
3dd3U2FsdGVkX1/TDJgJ2Kpx3ekEf3yT9DaZMp8nDRMZJlP85M8RWOruj5tm1Qu6c2UF1ifJVFMU8+XbeXvIbWZ/Or ... bo/XkaOUHOnWGMhpFIC8mYz
Here is one request (it's header) that I expect is requesting the target data:
:authority: www.compareschoolrankings.org
:method: GET
:path: /api/v1/schools.json?province=bc&ht=NzQ1Mjg
:scheme: https
accept: application/json, text/plain, */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cookie: _ga=GA1.2.217437611.1609538525; _gid=GA1.2.1976080400.1609538525; _gat_UA-3850680-10=1; _hjTLDTest=1; _hjid=1d1327a2-dd14-4388-b08f-ef670f6178cf; _hjFirstSeen=1
referer: https://www.compareschoolrankings.org/
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36
x-requested-with: XMLHttpRequest
Here is the corresponding response header:
accept-ranges: bytes
age: 14440
cache-control: max-age=21600, public
content-encoding: gzip
content-language: en
content-length: 448428
content-type: application/json
date: Fri, 01 Jan 2021 22:02:13 GMT
etag: W/"1609524093"
expires: Sun, 19 Nov 1978 05:00:00 GMT
last-modified: Fri, 01 Jan 2021 18:01:33 GMT
server: nginx
strict-transport-security: max-age=300
vary: Accept-Encoding, Cookie
via: 1.1 varnish, 1.1 varnish, 1.1 varnish
x-cache: MISS, HIT, HIT
x-cache-hits: 0, 1, 1
x-content-type-options: nosniff
x-drupal-cache: MISS
x-frame-options: SAMEORIGIN
x-generator: Drupal 8 (https://www.drupal.org)
x-pantheon-styx-hostname: styx-fe3fe4-h-6c4765d776-86gmx
x-served-by: cache-yyz4534-YYZ, cache-sea4433-SEA, cache-sea4483-SEA
x-styx-req-id: 623d1ec1-4c5b-11eb-bbc4-620e110c7f7f
x-timer: S1609538534.794192,VS0,VE0
x-ua-compatible: IE=edge
Question: Why is the response not in a JSON format when both the request and response headers indicate the content to be so? Where should I be looking to retreive my target data?
Any help or references would be much appreciated!
Related
I am facing an issue in downloading the txt file from a website. The script below downloads the http code instead of the actual txt file and its contents.
$WebClient = New-Object System.Net.WebClient
$WebClient.DownloadFile("https://thegivebackproject.org/CheckStatus.txt", "D:\CheckStatus.txt")
Short Answer
The server is doing browser sniffing to send different responses based on the User-Agent header in your request. You can get the response you want by sending a canned user agent string:
$useragent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36"
Invoke-WebRequest -URI https://thegivebackproject.org/CheckStatus.txt -OutFile c:\temp\CheckStatus.txt -UserAgent $useragent
Long Answer
The server responding to the url you're hitting is doing browser sniffing to decide what content to return. If you give it a User-Agent header that it recognises it will return the response you're expecting (i.e. the literal text "Azeemkhan-WaseemRaza").
If you don't include a User-Agent header (and $WebClient.DownloadFile doesn't include one), the server is responding with a html page instead.
You can see this behaviour yourself if you install a HTTP trace tool like Fiddler. When you hit the page in a browser you see this HTTP request and response pair:
request
GET https://thegivebackproject.org/CheckStatus.txt HTTP/1.1
Host: thegivebackproject.org
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36
Sec-Fetch-User: ?1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Accept-Encoding: gzip, deflate, br
Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
Cookie: SPSI=ee952ba44e33e958f963807ede78624b
response
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 12 Nov 2019 08:13:57 GMT
Content-Type: text/plain
Content-Length: 20
Connection: keep-alive
Last-Modified: Thu, 07 Nov 2019 16:15:48 GMT
Accept-Ranges: bytes
X-Cache: MISS
Azeemkhan-WaseemRaza
but when you use $WebClient.DownloadFile you see this instead:
request
GET https://thegivebackproject.org/CheckStatus.txt HTTP/1.1
Host: thegivebackproject.org
response
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 12 Nov 2019 08:14:21 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: SPSI=9c24f8993046ef610e25cc727c4a4ae2; Path=/
Set-Cookie: adOtr=obsvl; Expires=Thu, 2 Aug 2001 20:47:11 UTC; Path=/
Set-Cookie: UTGv2=D-h4d40f620bfdd6c3b77b035ee99f96621134; Expires=Wed, 11-Nov-20 08:14:21 GMT; Path=/
cache-control: no-store, no-cache, max-age=0, must-revalidate, private, max-stale=0, post-check=0, pre-check=0
Vary: Accept-Encoding
X-Cache: MISS
Accept-Ranges: bytes
5908
<!doctype html>
<head>
<meta charset="utf-8">
<meta http-equiv="x-ua-compatible" content="ie=edge">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>StackPath</title>
<style>
* {
box-sizing: border-box;
}
... etc...
The workaround is to include a recognised User-Agent header in your request, which is easier to to if you use Invoke-WebRequest like #BiNZGi suggested, rather than the WebClient class - see the "short answer" above for the code.
Also, note that this sniffing behaviour with the User-Agent is specific to "thegivebackproject.org" website and isn't necessarily true for other websites - you don't always need to include a User-Agent header as a rule of thumb.
You can use the easier Invoke-WebRequest:
Invoke-WebRequest -URI https://thegivebackproject.org/CheckStatus.txt -OutFile D:\CheckStatus.txt
I have a small site where I have a mailing list contact form in an iFrame, and once its submitted, a callback page I registered with the mailing list service is called, displaying in the iFrame and asking the user to check their email. The page I registered is http://mydomain.com/verify.html. In vertify.html I use "window.parent.document.getElementById('lightbox4').style.display='none';" to close the lightbox div that contains the I frame. This all works well, as long as the user initially visits http://mydomain.com, but if they visit http://www.mydomain.com, then calling "window.parent.document.getElementById('lightbox4').style.display='none';" doesn't work, because its a cross domain request.
So, no problem I thought, I'll just create a redirect rule to convert calls from www.mydomain.com, to mydomain.com. But now I'm getting the error "This webpage has a redirect loop" when I try to go to either www.mydomain.com or mydomain.com. In IIS7, I have two bindings, one for mydomain.com and one for www.mydomain.com. My DNS zone has an A record for mydomain.com, and a CNAME for www.mydomain.com.
Am I doing something stupid here? Is there ome way to debug this? I can see in Firefox, using the Live HTTP headers plugin, the URL is redirected properly from www.mydomain.com to mydomain.com , but then tries to keep trying to redirect mydomain.com to mydomain.com, creating the endless loop:
http://www.mydomain.com/
GET / HTTP/1.1
Host: www.mydomain.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:20.0) Gecko/20100101 Firefox/20.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 302 Redirect
Content-Type: text/html; charset=UTF-8
Location: http://mydomain.com/
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Sun, 21 Apr 2013 15:20:12 GMT
Content-Length: 150
----------------------------------------------------------
http://mydomain.com/
GET / HTTP/1.1
Host: mydomain.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:20.0) Gecko/20100101 Firefox/20.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 302 Redirect
Content-Type: text/html; charset=UTF-8
Location: http://mydomain.com/
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Sun, 21 Apr 2013 15:20:12 GMT
Content-Length: 150
----------------------------------------------------------
http://mydomain.com/
GET / HTTP/1.1
Host: mydomain.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:20.0) Gecko/20100101 Firefox/20.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 302 Redirect
Content-Type: text/html; charset=UTF-8
Location: http://mydomain.com/
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Sun, 21 Apr 2013 15:20:12 GMT
Content-Length: 150
----------------------------------------------------------
and it keeps going until "This webpage has a redirect loop" is displayed
I expect I have to create a new virtual directory for www.mydomain.com and then redirect that to mydomain.com, but that seems awkward.
I am using rails 4.0 to develop facebook page_tab. I got blank content showed on the facebook tabpage.
From what I think, the issue is related to turbolink. The following are the firefox requrest and response headers
Response header
HTTP/1.1 200 OK
Date: Mon, 01 Apr 2013 08:54:54 GMT
Status: 200 OK
Connection: close
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-UA-Compatible: chrome=1
X-XHR-Current-Location: /page_tab
Content-Type: text/html; charset=utf-8
Etag: "5d34060006e527f1a21db545df3d919f"
Cache-Control: max-age=0, private, must-revalidate
Set-Cookie: _likenotlike_session=SEhKbk5oZ0FHT2o0RkRMK3k2OThidHY1Yk5HYjdIWGNkNFIrWisxbkVKRitLT2tJM2d2b1NVV0xQYW5Qc015L0ljVjdDWCtITWR4cUhLc2VjK3hGUHNCbHAzb0YxV1F4OUNaa0hudDE0MkFZRlhYUGgxK2M5eDBNMTRIZzdhZXVyRTBmZEx3Q1RKaXRrZFJwaUYyY2JMdUNpSmlZRmhNS0Z6dGFEMEE5b2RLOXJGdWF0Z1NHcDR1N0ZleVgvZDRJLS1KcjhndzRuUjJaSXZnd1lNdjUyNTJBPT0%3D--a51e845979d81ace643d14b399ffa655ece63d79; path=/; HttpOnly
X-Request-Id: aac0e275-92b7-4b4b-9be7-b811ff9dec29
X-Runtime: 0.024202
Request Header
POST /page_tab HTTP/1.1
Host: localhost:60000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:20.0) Gecko/20100101 Firefox/20.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://static.ak.facebook.com/platform/page_proxy.php?v=5
Cookie: fbm_353759128067702=base_domain=.localhost; fbm_470420673030979=base_domain=.localhost; request_method=POST; _likenotlike_session=T2o2dVZUSkhxUDhWdDJyWGsvQmYxZHVGVGszYy9pc2VIdGs3OWJ0YkRQTSt2eTJtR2pxTDZLSFRpbWVDamx2ZFVxU2pJRENNRzl2elNqMkF4Q01hcTlWZkZNNUVnSy9ucnJrUWQ0YWFheUJqRklsaEQ1RlM5ZGN1MEhGV0NpQ0E5bjc0VXZoQThuVzJjbjFQTmpZeUVzK2M1anRBamZqU3VwZVlYUlNpQmRnYnlVNWJZTk5wc3dZTEZpR0lyWTE2LS1tSkRHb3JpNGM4U205bEdxMEpkOE5nPT0%3D--85ea3314a43d08dda9d00218a5045968ef040d0b
Connection: keep-alive
In the response header there are X-- headers that I think are related to ajax. So I think rails together with turbolink think that the request is the ajax request but actually the request is normal post request if you can see from the request header above.
Really appreciate for your help.
Solution to the problem is the following link
http://conpanna.net/en-us/blog/5185b5ce79ec73ae54000003
Just add response.headers["X-Frame-Options"] = "GOFORIT"
and every thing works
I have three websites hosted (example1.com, example2.com, example3.com) on a server. There is a page (test.php) on example1.com with just code below inside it:
<?php
header('Location:http://example2.com/a.php');
?>
When I browse test.php it goes to http://example1.com/a.php . it doesn't understand it is another domain url, it tried to find the page on itself.
but when I put http://google.com instead of example2.com/a.php it works correct. I really get confused.
What is the problem ? Should I set some configuration on the server?
( I am administrator of the hosting server ).
Ps. The server is behind a pound server.
Edited:
Here's the Firebug Net output for example1.com/test.php
Response Headers:
HTTP/1.1 302 Found
Date: Tue, 09 Oct 2012 09:03:34 GMT
Server: Apache/2.2.16 (Debian)
Location: http://example1.com/a.php
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 21
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8
Request Headers:
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding gzip, deflate
Accept-Language en-us,en;q=0.5
Connection keep-alive
Cookie mycookie
Host example1.com
User-Agent Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20100101 Firefox/14.0.1
the problem is solved. it was because of pound server configuration. 'RewriteLocation' entry in pound server configuration must be set to 2 to this server doesn't change the redirect location.
anyway, thank you for answering.
I'm trying to implement file upload functionality in the iPhone app. Server code is tested and works when files are uploaded from the desktop browser, so I moved to implementing the Objective-C client code. I'm assembling HTTP requests body manually, and despite that it looks correct, it is rejected by the server (server handler unable to extract the parts from multipart content). In desperation I've simplified the form to having only one parameter, but it still does not work.
I've captured the network traffic and I could see that Wireshark could not parse my multipart content as well (have a look at screenshots: Firefox request, iPhone request). I'm pasting it below in hope that you could see the errors I can't see.
Thanks in advance.
Firefox:
POST /cubepaint/actions/gallery/post HTTP/1.1
Host: [...]
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-GB; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Authorization: Basic [...]
Content-Type: multipart/form-data; boundary=---------------------------20072377098235644401115438165
Content-Length: 180
-----------------------------20072377098235644401115438165
Content-Disposition: form-data; name="deviceId"
12345
-----------------------------20072377098235644401115438165--
HTTP/1.1 200 OK
Date: Sat, 17 Oct 2009 22:09:21 GMT
Server: Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_python/3.2.10 Python/2.4.4 mod_ssl/2.2.3 OpenSSL/0.9.8c
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
iPhone:
POST /cubepaint/actions/gallery/post HTTP/1.1
Host: [...]
User-Agent: Copenhagen/1.0 CFNetwork/459 Darwin/9.8.0
Content-Type: multipart/form-data; boundary=----------0E7B16E6-CD3D-4213-9B42-07DA30822C74
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Authorization: Basic [...]
Content-Length: 187
Connection: keep-alive
----------0E7B16E6-CD3D-4213-9B42-07DA30822C74
Content-Disposition: form-data; name="deviceId"
00000000-0000-1000-8000-0016CBCC0B61
----------0E7B16E6-CD3D-4213-9B42-07DA30822C74--
HTTP/1.1 200 OK
Date: Sat, 17 Oct 2009 22:04:07 GMT
Server: Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_python/3.2.10 Python/2.4.4 mod_ssl/2.2.3 OpenSSL/0.9.8c
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Your iPhone version indicates keep-alive but doesn't specify a length. Not sure that's enough to cause trouble.
Also, is it possible your server is checking for user-agent strings it recognizes (say, for backward-compatibility mode)?
I'd also compare the two in a text editor that shows CR/LF characters to make sure you're getting proper line endings.
Another thing you could try is create a simple web-page that does a multipart POST and run it from the iPhone browser (instead of the Mac one) then check the headers that go across the wire. Or you could snag a toolkit like ASIHTTPRequest and see what kind of output it generates for multi-part posts (or just use the toolkit instead of trying to write your own).
Good luck
Solved by reading RFC 2046 (MIME specification): boundary between parts of multipart message should contain two leading '-'s, and last boundary should additionally contain two trailing '-'s. The boundary in the request header and request body in the Firefox request differ:
---------------------------20072377098235644401115438165
and
-----------------------------20072377098235644401115438165
The last boundary looks like this:
-----------------------------20072377098235644401115438165--
You really could not see this with the eye when there are so many leading '-'s in the original boundary.