I am trying to fetch a CSV file from a website (https://www.stocknet.fr/accueil.asp) using a GET request on the https URL. The response I get via Postman looks like this:
Type;Groupe Acc�s;Code;EOTP autoris�s;Familles EOTP autoris�es;Nom;Pr�nom;Adresse Mail;Agences autoris�es;D�p�ts autoris�s;Date cr�ation;Fournisseurs autoris�s;Classes autoris�es;Familles article
But when access the URL directly, my browser automatically downloads the file, and I open it on windows with a proper encoding:
Type;Groupe Accès;Code;EOTP autorisés;Familles EOTP autorisées;Nom;Prénom;Adresse Mail;Agences autorisées;Dépôts autorisés;Date création;Fournisseurs autorisés;Classes autorisées;Familles article
When I inspect the website HTML, I can see the tag <meta charset="ISO-8859-1" />
I tried using headers as such:
Accept-Charset: ISO-8859-1
Accept-Charset: UTF-8
Content-Type: text/csv; charset=ISO-8859-1
Content-Type: text/csv; charset=UTF-8
Content-Encoding: gzip
Content-Encoding: compress
Content-Encoding: deflate
Content-Encoding: identity
Content-Encoding: br
Nothing seem to return a response with the correct encoding.
Any idea what I am doing wrong ? Note that, whatever page of the website I try to fetch, I get this wrong encoding. It's not only with the CSV file.
The server is returning content in iso-8859-1 and telling you it's iso-8859-1. You will not convince the server to return anything else. Your web browser contains code to convert encodings. If you want to have the content in a different encoding, you have to convert it yourself.
For ways how to do that, see:
Best way to convert text files between character sets?
I'm working on a Delphi REST client for a public API that requires an HMAC256/Base64 signed string to be added to the headers of the request to authenticate. I've spent hours trying to figure out why it's not working, so I compared the raw request from my Delphi client to that of a working C# library (using Wireshark).
It turns out my request matches perfectly the request generated by the working C# library, except that Delphi's REST client is URL-encoding the values added to the request's header, therefore invalidating the carefully crafted signature.
This is how I'm adding the signature string to the header:
RESTRequest1.Params.AddHeader('SIGNATURE', FSignature);
The signature string may have slashes, plus signs, and/or equal signs that are being URL-encoded when they shouldn't. For example when the value of the signature string is...
FSignature = '8A1BgACL9kB6P/kXuPdm99s05whfkrOUnEziEtU+0OY=';
...then the request should should output raw headers like...
GET /resource HTTP/1.1
User-Agent: Embarcadero URI Client/1.0
Connection: Keep-Alive
<snip>
SIGNATURE: 8A1BgACL9kB6P/kXuPdm99s05whfkrOUnEziEtU+0OY=
<snip>
...but instead Wireshark shows this as the real value being sent...
SIGNATURE: 8A1BgACL9kB6P%2FkXuPdm99s05whfkrOUnEziEtU%2B0OY%3D
Is there a way to prevent the URL-encoding of values when using AddHeader? Or maybe another way to add raw headers to a TRESTClient request?
PS: I already tried both TRESTRequest.Params.AddHeader and TRESTClient.AddParameter with TRESTRequestParameterKind.pkHTTPHEADER as the Kind parameter. Both resulted in URL-encoded values.
PS2: Using Delphi RAD Studio 10.3.
You should include poDoNotEncode in the Options property of the TRESTRequestParameter.
This can be done using:
RESTClient1.AddParameter('SIGNATURE', FSignature, pkHTTPHEADER, [poDoNotEncode]);
or by using:
RESTClient1.Params.AddHeader('SIGNATURE', FSignature).Options := [poDoNotEncode];
I am using the Postman Chrome extension for testing a web service.
There are three options available for data input.
I guess the raw is for sending JSON.
What is the difference between the other two, form-data and x-www-form-urlencoded?
These are different Form content types defined by W3C.
If you want to send simple text/ ASCII data, then x-www-form-urlencoded will work. This is the default.
But if you have to send non-ASCII text or large binary data, the form-data is for that.
You can use Raw if you want to send plain text or JSON or any other kind of string. Like the name suggests, Postman sends your raw string data as it is without modifications. The type of data that you are sending can be set by using the content-type header from the drop down.
Binary can be used when you want to attach non-textual data to the request, e.g. a video/audio file, images, or any other binary data file.
Refer to this link for further reading:
Forms in HTML documents
This explains better:
Postman docs
Request body
While constructing requests, you would be dealing with the request body editor a lot. Postman lets you send almost any kind of HTTP request (If you can't send something, let us know!). The body editor is divided into 4 areas and has different controls depending on the body type.
form-data
multipart/form-data is the default encoding a web form uses to transfer data. This simulates filling a form on a website, and submitting it. The form-data editor lets you set key/value pairs (using the key-value editor) for your data. You can attach files to a key as well. Do note that due to restrictions of the HTML5 spec, files are not stored in history or collections. You would have to select the file again at the time of sending a request.
urlencoded
This encoding is the same as the one used in URL parameters. You just need to enter key/value pairs and Postman will encode the keys and values properly. Note that you can not upload files through this encoding mode. There might be some confusion between form-data and urlencoded so make sure to check with your API first.
raw
A raw request can contain anything. Postman doesn't touch the string entered in the raw editor except replacing environment variables. Whatever you put in the text area gets sent with the request. The raw editor lets you set the formatting type along with the correct header that you should send with the raw body. You can set the Content-Type header manually as well. Normally, you would be sending XML or JSON data here.
binary
binary data allows you to send things which you can not enter in Postman. For example, image, audio or video files. You can send text files as well. As mentioned earlier in the form-data section, you would have to reattach a file if you are loading a request through the history or the collection.
UPDATE
As pointed out by VKK, the WHATWG spec say urlencoded is the default encoding type for forms.
The invalid value default for these attributes is the application/x-www-form-urlencoded state. The missing value default for the enctype attribute is also the application/x-www-form-urlencoded state.
Here are some supplemental examples to see the raw text that Postman passes in the request. You can see this by opening the Postman console:
form-data
Header
content-type: multipart/form-data; boundary=--------------------------590299136414163472038474
Body
key1=value1key2=value2
x-www-form-urlencoded
Header
Content-Type: application/x-www-form-urlencoded
Body
key1=value1&key2=value2
Raw text/plain
Header
Content-Type: text/plain
Body
This is some text.
Raw json
Header
Content-Type: application/json
Body
{"key1":"value1","key2":"value2"}
multipart/form-data
Note. Please consult RFC2388 for additional information about file uploads, including backwards compatibility issues, the relationship between "multipart/form-data" and other content types, performance issues, etc.
Please consult the appendix for information about security issues for forms.
The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters. The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.
The content type "multipart/form-data" follows the rules of all multipart MIME data streams as outlined in RFC2045. The definition of "multipart/form-data" is available at the [IANA] registry.
A "multipart/form-data" message contains a series of parts, each representing a successful control. The parts are sent to the processing agent in the same order the corresponding controls appear in the document stream. Part boundaries should not occur in any of the data; how this is done lies outside the scope of this specification.
As with all multipart MIME types, each part has an optional "Content-Type" header that defaults to "text/plain". User agents should supply the "Content-Type" header, accompanied by a "charset" parameter.
application/x-www-form-urlencoded
This is the default content type. Forms submitted with this content type must be encoded as follows:
Control names and values are escaped. Space characters are replaced by +', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by %HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., %0D%0A'). The control names/values are listed in the order they appear in the document. The name is separated from the value by =' and name/value pairs are separated from each other by `&'.
application/x-www-form-urlencoded the body of the HTTP message sent to the server is essentially one giant query string -- name/value pairs are separated by the ampersand (&), and names are separated from values by the equals symbol (=). An example of this would be:
MyVariableOne=ValueOne&MyVariableTwo=ValueTwo
The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters. The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.
let's take everything easy, it's all about how a http request is made:
1- x-www-form-urlencoded
http request:
GET /getParam1 HTTP/1.1
User-Agent: PostmanRuntime/7.28.4
Accept: */*
Postman-Token: a14f1286-52ae-4871-919d-887b0e273052
Host: localhost:12345
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 55
postParam1Key=postParam1Val&postParam2Key=postParam2Val
2- raw
http request:
GET /getParam1 HTTP/1.1
Content-Type: text/plain
User-Agent: PostmanRuntime/7.28.4
Accept: */*
Postman-Token: e3f7514b-3f87-4354-bcb1-cee67c306fef
Host: localhost:12345
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Length: 73
{
postParam1Key: postParam1Val,
postParam2Key: postParam2Val
}
3- form-data
http request:
GET /getParam1 HTTP/1.1
User-Agent: PostmanRuntime/7.28.4
Accept: */*
Postman-Token: 8e2ce54b-d697-4179-b599-99e20271df90
Host: localhost:12345
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Type: multipart/form-data; boundary=--------------------------140760168634293019785817
Content-Length: 181
----------------------------140760168634293019785817
Content-Disposition: form-data; name="postParam1Key"
postParam1Val
----------------------------140760168634293019785817--
While analyzing the HTTP Requests OF a website. I found that in one of the POST request it sends three postdata to the server the first one was SAML data first base64 encoded then urlencoded.
But I am not able to figure out the value of other two postvars. One thing I am sure about is that it is not using any encryption methods like md5 or sha1 etc. COZ the response text contains my user name value which according to my research is neither stored in session variable or cookies means this encoding of post data can be reversed. So I am guessing that may be my user name "RAHUL" is inside one of these post variables. But am unable to read it.
First String:
sRrWj1zUsisp/UylJiEf/pekY//ok1nYAAcvJfkxL9kMEggMAX0jTTs1hPPKTU9d1u/qgdq6eIvS
nk3NT6KkR9bKiGyQKY5iJ39JXGNlBvxs3F9N7TMHUBeNZ2BSDg05dTyYtdiVffRDnQ5KgDCy7ZjG
Lzj5J3x3LJumTau7aFc5CZ2b4xqzEPc4kGVcg/6l5D7Hxonp6U/0DnIzemcrXfb95X40CidNmz1J
PlGaeZzgAsA619vhs3AlGPNZ/Nbbm7IsJlVcKY6TvigrP0jMCp/0BvYb45gztvaJicN43JrNUsgc
+CLKaTvxflkLhul/sAe5Gbm83AtR/kNKQZf2hg==
Second String:
Og5+F9RTHNs7NqUEYpgGSshInxZQzCP3gU2fkI8VnS60Ce2hmurlTLn6IcdP63zUkrDbdA2/+J00
DNgD15yW2lNo5Zi3PdfEEOxFjw8L5/RFwoIrMzTzS8csZaWqSAfqW1GiE4hbpAgeKZ4pXrmTLy2A
/AfT90uCptaoEa19qzD6/5o2+G4lCeJf5ZUMeZRMLvX3U909TlzCggf9KsHeJpfXGnGEefu9o0V9
kbQ5FzLEuao9ByCnXaFBEcDBDAFljrK0fsqJyLyv2gnhj4IOcCAEowa9N6tBsu/ngac9uR+NHY4+
r4l67i+nt5CRZ9PRLq/hT2qCoy6PguhDOEHbgg==
When I decoded the above strings using base64 decoder it returns unreadable value. I want to know what to do next so as to get useful data inside it.
I am pasting the complete post request including Headers and form data.
HEADERS
Host: xxxx.xxxx.xxxx.xxxxxxx
User-Agent: Mozilla/5.0 (Windows NT 6.2; rv:20.0) Gecko/20100101 Firefox/20.0
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer:xxx.xxx.com/xxxxxx/adsf
Cookie: _abc_abc_session=1032510200e6bf9a8ae265553120e1ca;
AWSELB=F7610D8306188BFF856DC4E8C0134950D9FBEC546F2ACFBA970F103CC9E2B9074253115B0BB906564BB68191596A2637A0D1F52106813C785600B014A199891F5B8C6C8420
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 8725
FORM
TARGET: www.XXVVVVVVVVVF.com/sessions/consume
SAMLResponse: XXXXXXXXXX
APID: ap_00001
pca_red74:
KiiYkBzqSHEKWu2Q//CgZg47iEBSOkU1Ew3yaUIAQNqHAf8AwZVLQXdNw5ZF0B67WJH46JDKQ/sP
Cypp2sofHA/Eq0gXMoH7yZt3RG0LXTuNANYNr/chOx4kks0/fINjpowPXTiSkWc0bsXimWH62BZy
mq7TATEsXM6w4ywu1cVTP+/DlfNy3Mf0V3VVwEjMWwtR/3X8zKgtRJKMTtwe/YGhus6YefSEknPO
pO9oy3zdDy0Yp7qRp93tPAdxRSXyIsJs5bJlefH8o5QSzsk7hlBhQFhd/OlKpMCsYMDSOHa+FJ1K
AqEWgH0eMzczO6LFhVdhAAm3DFaAvxL4u+DkuQ==
pca_red75:
tU48SalKFzVys9fZR1Se+5xP1dlOh9SlbYBT/Ct6BGiyIFEVEdyq2XR7BDuz/0BAsMfGwhgwI3Ws
uNk6KnEyOBIX+9u0eFer/VoHkGydw8310fGxJiiq13BYHnkzk9OLZCdD43VF27a6SvEtaA/LXnm4
ZrURgpoFWtfBmaC4zIkHkYgXW5wTYeJ1Ze0rgmBYPFlms2BefeRricA68NR3OsbSoCmwIKfuWe+2
esM4RN8t9jG/nccM2EeluDXRKJHA09O02Lq7KBhZw5o2OBCQ7nDc9p47Poli0as1yo+ylHfjJOag
qCeVuPBCLEwpJL74CreuzJGAYqSOVA9BOx5SQA==
I building a module for compressing HTTP output. Reading the spec, I haven't found a clear distinction on a couple of things:
Accept-Encoding:
Should this be treated the same as a Accept-Encoding: * or as if no header is present?
Or what if I don't support gzip, but I get a header like this:
Accept-Encoding: gzip
Should I return a 406 error or just return the data unencoded?
EDIT:
I've read over the spec a few times. It mentions my first case, but it doesn't define what the behavior of the server should be.
Should I treat this case as if the header is not present? Or should I return a 406 error because there's no way to encode something given the field value ('' isn't a valid encoding).
There is written everything in the Spec: 14.3 Accept-Encoding:
The special "*" symbol in an Accept-Encoding field matches any
available content-coding not explicitly listed in the header
field.
If an Accept-Encoding field is present in a request, and if the server cannot send a response which is acceptable according to the Accept-Encoding header, then the server SHOULD send an error response with the 406 (Not Acceptable) status code.
edit:
If the Accept-Encoding field-value is empty, then only the "identity"
encoding is acceptable.
In this case, if "identity" is one of the available content-codings, then the server SHOULD use the "identity" content-coding, unless it has additional information that a different content-coding is meaningful to the client.
What is "identity"
identity
The default (identity) encoding; the use of no transformation whatsoever. This content-coding is used only in the Accept- Encoding header, and SHOULD NOT be used in the Content-Encoding header.