HTTP Response encoding issue

I am trying to fetch a CSV file from a website (https://www.stocknet.fr/accueil.asp) using a GET request on the https URL. The response I get via Postman looks like this:
Type;Groupe Acc�s;Code;EOTP autoris�s;Familles EOTP autoris�es;Nom;Pr�nom;Adresse Mail;Agences autoris�es;D�p�ts autoris�s;Date cr�ation;Fournisseurs autoris�s;Classes autoris�es;Familles article
But when I access the URL directly, my browser automatically downloads the file, and when I open it on Windows it shows the proper encoding:
Type;Groupe Accès;Code;EOTP autorisés;Familles EOTP autorisées;Nom;Prénom;Adresse Mail;Agences autorisées;Dépôts autorisés;Date création;Fournisseurs autorisés;Classes autorisées;Familles article
When I inspect the website HTML, I can see the tag <meta charset="ISO-8859-1" />
I tried sending the following request headers:
Accept-Charset: ISO-8859-1
Accept-Charset: UTF-8
Content-Type: text/csv; charset=ISO-8859-1
Content-Type: text/csv; charset=UTF-8
Content-Encoding: gzip
Content-Encoding: compress
Content-Encoding: deflate
Content-Encoding: identity
Content-Encoding: br
Nothing seems to return a response with the correct encoding.
Any idea what I am doing wrong? Note that whatever page of the website I try to fetch, I get this wrong encoding. It's not only the CSV file.

The server is returning the content in ISO-8859-1 and declaring it as ISO-8859-1. You will not convince the server to return anything else. Your web browser contains code to convert between encodings; if you want the content in a different encoding, you have to convert it yourself.
For ways to do that, see:
Best way to convert text files between character sets?
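As a rough illustration of the conversion described above, here is a minimal Python sketch. The sample bytes are an assumption built from the header row quoted in the question, not the real server response; in practice they would come from the response body.

```python
# Bytes as an ISO-8859-1 server would send them (hypothetical sample based
# on the header row quoted in the question).
raw = "Groupe Accès;Prénom;Dépôts autorisés".encode("iso-8859-1")

# Decoding with the wrong charset produces replacement characters,
# which is what the client in the question appears to be doing.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the charset the server declares gives the intended text.
text = raw.decode("iso-8859-1")

# Re-encode as UTF-8 if the rest of the pipeline expects UTF-8.
utf8_bytes = text.encode("utf-8")

print(wrong)
print(text)
```

The point of the sketch is that the bytes themselves are fine; only the charset used to interpret them differs.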

Related

Incomplete attachments remain attached to the mail

I am using the MIMEDefang filtering tool. In the configuration, I strip out all attachments and forward them to another address. For one particular sender, I can see that the milter changes the Content-Type header from application/pdf to multipart/mixed. When I open the PDF from the email received in Outlook with a text editor, it contains content like "This is a multi-part message in MIME format..." followed by what looks like random numbers ("------------=_1525668389-64274-8--").
Can anyone guess why this might be happening?

Multi-part messages (like those with attachments) have their parts divided by a boundary. This boundary is between 1 and 70 characters long and must not appear anywhere within the encapsulated parts of the message (between boundaries).
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p
This is a message with multiple parts in MIME format.
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: text/html; charset=UTF-8
<html><head></head><body>This is the HTML body of the message.</body></html>
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: text/plain
This is the body of the message.
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg
Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
--gc0p4Jq0M2Yt08jU534c0p--
I suspect that somewhere between MIMEDefang and your milter configuration, the boundaries are getting mangled or included in the attachment, which corrupts it.
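One way to check this suspicion is to parse a saved copy of the message and look for boundary markers inside the decoded parts. The following is only a sketch using Python's standard email package; the message text is a shortened, hypothetical variant of the example above, with the blank lines that separate headers from bodies written out.

```python
from email import message_from_string

# Hypothetical two-part message; blank lines separate headers from bodies.
raw_message = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p\n"
    "\n"
    "This is a message with multiple parts in MIME format.\n"
    "--gc0p4Jq0M2Yt08jU534c0p\n"
    "Content-Type: text/plain\n"
    "\n"
    "This is the body of the message.\n"
    "--gc0p4Jq0M2Yt08jU534c0p\n"
    "Content-Type: application/octet-stream\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQ==\n"
    "--gc0p4Jq0M2Yt08jU534c0p--\n"
)

msg = message_from_string(raw_message)
boundary = msg.get_boundary()

# Keep only the leaf parts (the actual bodies and attachments).
parts = [p for p in msg.walk() if not p.is_multipart()]

for part in parts:
    payload = part.get_payload(decode=True)
    # A boundary marker inside a decoded part would point to corruption.
    leaked = (b"--" + boundary.encode("ascii")) in payload
    print(part.get_content_type(), len(payload), "boundary leaked:", leaked)
```

If a part reports a leaked boundary, or an attachment starts with the "This is a multi-part message" preamble, the message was most likely re-wrapped incorrectly somewhere along the way.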

SoapUI UTF-8 encoding

I'm trying to make a DELETE REST request using SoapUI Pro. One of the parameters includes a value containing '+' (which gets treated as a space if it is not URL-encoded). SoapUI doesn't seem to URL-encode the request even though I tried to enforce encoding in the REST request step.
Secondly, I tried to pass an already encoded value (encoded by a Groovy step), but SoapUI somehow seems to decode the value automatically before sending. Am I missing something here? Please help.
Sample request (RAW)
Sensitive data masked by XXXXX, YYYYY
DELETE https://XXXXXservices.com/XXXXX_services/ogw/emf/securityShare/888247189312/members?shareId=8882015810875&memberIds=XXXXX&auth-username=YYYYY&auth-nonce=XIX9UL9UBIE2L8K&auth-sharedkeyid=System&auth-expiresby=1414607713258&auth-algorithm=HMACSHA256&auth-signature=9JmAxG6rkqZe0Fu+zPQIh9Eh3tazDoE1YBZFxgIRLMc= HTTP/1.1
Accept-Encoding: gzip,deflate
Host: XXXXXservices.com
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.1.1 (java 1.5)
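For reference, the effect described in the question can be reproduced outside SoapUI. The sketch below uses Python's urllib.parse purely as an illustration (it says nothing about how SoapUI itself behaves); the signature value is the one from the sample request above.

```python
from urllib.parse import parse_qs, quote

signature = "9JmAxG6rkqZe0Fu+zPQIh9Eh3tazDoE1YBZFxgIRLMc="

# Sent as-is, a form-style parser reads the '+' as a space.
unescaped = parse_qs("auth-signature=" + signature)

# Percent-encoding every reserved character turns '+' into %2B.
escaped = quote(signature, safe="")
round_trip = parse_qs("auth-signature=" + escaped)

print(unescaped["auth-signature"][0])
print(escaped)
print(round_trip["auth-signature"][0])
```

The value that has to reach the wire is the percent-encoded one; if a tool decodes it again before sending, the server will see a space where the '+' was.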

Postman Chrome: What is the difference between form-data, x-www-form-urlencoded and raw

I am using the Postman Chrome extension for testing a web service.
There are three options available for data input.
I guess the raw is for sending JSON.
What is the difference between the other two, form-data and x-www-form-urlencoded?

These are different Form content types defined by W3C.
If you want to send simple text/ASCII data, then x-www-form-urlencoded will work. This is the default.
But if you have to send non-ASCII text or large binary data, form-data is for that.
You can use Raw if you want to send plain text, JSON, or any other kind of string. As the name suggests, Postman sends your raw string data as it is, without modifications. The type of data you are sending can be set with the Content-Type header from the drop-down.
Binary can be used when you want to attach non-textual data to the request, e.g. a video/audio file, images, or any other binary data file.
Refer to this link for further reading:
Forms in HTML documents

This explains better:
Postman docs
Request body
While constructing requests, you would be dealing with the request body editor a lot. Postman lets you send almost any kind of HTTP request (If you can't send something, let us know!). The body editor is divided into 4 areas and has different controls depending on the body type.
form-data
multipart/form-data is the default encoding a web form uses to transfer data. This simulates filling a form on a website, and submitting it. The form-data editor lets you set key/value pairs (using the key-value editor) for your data. You can attach files to a key as well. Do note that due to restrictions of the HTML5 spec, files are not stored in history or collections. You would have to select the file again at the time of sending a request.
urlencoded
This encoding is the same as the one used in URL parameters. You just need to enter key/value pairs and Postman will encode the keys and values properly. Note that you can not upload files through this encoding mode. There might be some confusion between form-data and urlencoded so make sure to check with your API first.
raw
A raw request can contain anything. Postman doesn't touch the string entered in the raw editor except replacing environment variables. Whatever you put in the text area gets sent with the request. The raw editor lets you set the formatting type along with the correct header that you should send with the raw body. You can set the Content-Type header manually as well. Normally, you would be sending XML or JSON data here.
binary
binary data allows you to send things which you can not enter in Postman. For example, image, audio or video files. You can send text files as well. As mentioned earlier in the form-data section, you would have to reattach a file if you are loading a request through the history or the collection.
UPDATE
As pointed out by VKK, the WHATWG spec says urlencoded is the default encoding type for forms.
The invalid value default for these attributes is the application/x-www-form-urlencoded state. The missing value default for the enctype attribute is also the application/x-www-form-urlencoded state.

Here are some supplemental examples to see the raw text that Postman passes in the request. You can see this by opening the Postman console:
form-data
Header
content-type: multipart/form-data; boundary=--------------------------590299136414163472038474
Body
key1=value1key2=value2
x-www-form-urlencoded
Header
Content-Type: application/x-www-form-urlencoded
Body
key1=value1&key2=value2
Raw text/plain
Header
Content-Type: text/plain
Body
This is some text.
Raw json
Header
Content-Type: application/json
Body
{"key1":"value1","key2":"value2"}

multipart/form-data
Note. Please consult RFC2388 for additional information about file uploads, including backwards compatibility issues, the relationship between "multipart/form-data" and other content types, performance issues, etc.
Please consult the appendix for information about security issues for forms.
The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters. The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.
The content type "multipart/form-data" follows the rules of all multipart MIME data streams as outlined in RFC2045. The definition of "multipart/form-data" is available at the [IANA] registry.
A "multipart/form-data" message contains a series of parts, each representing a successful control. The parts are sent to the processing agent in the same order the corresponding controls appear in the document stream. Part boundaries should not occur in any of the data; how this is done lies outside the scope of this specification.
As with all multipart MIME types, each part has an optional "Content-Type" header that defaults to "text/plain". User agents should supply the "Content-Type" header, accompanied by a "charset" parameter.
application/x-www-form-urlencoded
This is the default content type. Forms submitted with this content type must be encoded as follows:
Control names and values are escaped. Space characters are replaced by '+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by '%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., '%0D%0A'). The control names/values are listed in the order they appear in the document. The name is separated from the value by '=' and name/value pairs are separated from each other by '&'.
With application/x-www-form-urlencoded, the body of the HTTP message sent to the server is essentially one giant query string: name/value pairs are separated by the ampersand (&), and names are separated from values by the equals symbol (=). An example of this would be:
MyVariableOne=ValueOne&MyVariableTwo=ValueTwo
The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters. The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.
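The escaping rules quoted above can be tried out with Python's urllib.parse.urlencode. This is a sketch only: the field names and values are made up, and urlencode percent-encodes non-ASCII characters as UTF-8 bytes by default, which is an assumption that goes beyond the ASCII-only wording of the quoted spec.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical form fields: a space, a non-ASCII character, and a line break.
fields = [
    ("MyVariableOne", "Value One"),
    ("name", "Prénom"),
    ("note", "a\r\nb"),
]

# Spaces become '+', other reserved or non-ASCII bytes become %HH.
body = urlencode(fields)
print(body)

# Decoding the body gives back the original pairs, in order.
decoded = parse_qsl(body)
print(decoded)
```

Note how the single character "é" takes six characters on the wire, which is the inefficiency the quoted text refers to for non-ASCII and binary data.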

Let's keep it simple: it's all about how the HTTP request is made:
1- x-www-form-urlencoded
http request:
GET /getParam1 HTTP/1.1
User-Agent: PostmanRuntime/7.28.4
Accept: */*
Postman-Token: a14f1286-52ae-4871-919d-887b0e273052
Host: localhost:12345
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 55
postParam1Key=postParam1Val&postParam2Key=postParam2Val
2- raw
http request:
GET /getParam1 HTTP/1.1
Content-Type: text/plain
User-Agent: PostmanRuntime/7.28.4
Accept: */*
Postman-Token: e3f7514b-3f87-4354-bcb1-cee67c306fef
Host: localhost:12345
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Length: 73
{
postParam1Key: postParam1Val,
postParam2Key: postParam2Val
}
3- form-data
http request:
GET /getParam1 HTTP/1.1
User-Agent: PostmanRuntime/7.28.4
Accept: */*
Postman-Token: 8e2ce54b-d697-4179-b599-99e20271df90
Host: localhost:12345
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Content-Type: multipart/form-data; boundary=--------------------------140760168634293019785817
Content-Length: 181
----------------------------140760168634293019785817
Content-Disposition: form-data; name="postParam1Key"
postParam1Val
----------------------------140760168634293019785817--
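To make the form-data layout above more concrete, here is a minimal sketch that assembles such a body by hand. The helper name build_multipart and the fixed boundary are hypothetical; real clients generate a random boundary and handle files and escaping as well.

```python
import uuid


def build_multipart(fields, boundary=None):
    """Return (content_type, body_bytes) for simple text fields."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        # Each part starts with "--" followed by the boundary.
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line between part headers and part body
        lines.append(value)
    # The closing delimiter has an extra "--" at the end.
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body


content_type, body = build_multipart(
    {"postParam1Key": "postParam1Val"}, boundary="example-boundary"
)
print(content_type)
print(body.decode("utf-8"))
```

This also explains why the delimiter lines in the request above have two more dashes than the boundary value in the Content-Type header: the leading "--" is part of the delimiter syntax, not of the boundary itself.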

How to Determine encoding/compression of the string which appears like characters are dancing in gangam style

While analyzing the HTTP requests of a website, I found that one of the POST requests sends three post variables to the server. The first one is SAML data, first base64-encoded and then URL-encoded.
But I am not able to figure out the values of the other two post variables. One thing I am sure about is that they are not produced by a one-way hash such as MD5 or SHA-1, because the response text contains my user name, which according to my research is stored in neither a session variable nor cookies. That means this encoding of the post data can be reversed. So I am guessing that my user name "RAHUL" may be inside one of these post variables, but I am unable to read it.
First String:
sRrWj1zUsisp/UylJiEf/pekY//ok1nYAAcvJfkxL9kMEggMAX0jTTs1hPPKTU9d1u/qgdq6eIvS
nk3NT6KkR9bKiGyQKY5iJ39JXGNlBvxs3F9N7TMHUBeNZ2BSDg05dTyYtdiVffRDnQ5KgDCy7ZjG
Lzj5J3x3LJumTau7aFc5CZ2b4xqzEPc4kGVcg/6l5D7Hxonp6U/0DnIzemcrXfb95X40CidNmz1J
PlGaeZzgAsA619vhs3AlGPNZ/Nbbm7IsJlVcKY6TvigrP0jMCp/0BvYb45gztvaJicN43JrNUsgc
+CLKaTvxflkLhul/sAe5Gbm83AtR/kNKQZf2hg==
Second String:
Og5+F9RTHNs7NqUEYpgGSshInxZQzCP3gU2fkI8VnS60Ce2hmurlTLn6IcdP63zUkrDbdA2/+J00
DNgD15yW2lNo5Zi3PdfEEOxFjw8L5/RFwoIrMzTzS8csZaWqSAfqW1GiE4hbpAgeKZ4pXrmTLy2A
/AfT90uCptaoEa19qzD6/5o2+G4lCeJf5ZUMeZRMLvX3U909TlzCggf9KsHeJpfXGnGEefu9o0V9
kbQ5FzLEuao9ByCnXaFBEcDBDAFljrK0fsqJyLyv2gnhj4IOcCAEowa9N6tBsu/ngac9uR+NHY4+
r4l67i+nt5CRZ9PRLq/hT2qCoy6PguhDOEHbgg==
When I decode the above strings with a base64 decoder, I get unreadable values. I want to know what to do next to get at the useful data inside them.
I am pasting the complete post request including Headers and form data.
HEADERS
Host: xxxx.xxxx.xxxx.xxxxxxx
User-Agent: Mozilla/5.0 (Windows NT 6.2; rv:20.0) Gecko/20100101 Firefox/20.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer:xxx.xxx.com/xxxxxx/adsf
Cookie: _abc_abc_session=1032510200e6bf9a8ae265553120e1ca;
AWSELB=F7610D8306188BFF856DC4E8C0134950D9FBEC546F2ACFBA970F103CC9E2B9074253115B0BB906564BB68191596A2637A0D1F52106813C785600B014A199891F5B8C6C8420
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 8725
FORM
TARGET: www.XXVVVVVVVVVF.com/sessions/consume
SAMLResponse: XXXXXXXXXX
APID: ap_00001
pca_red74:
KiiYkBzqSHEKWu2Q//CgZg47iEBSOkU1Ew3yaUIAQNqHAf8AwZVLQXdNw5ZF0B67WJH46JDKQ/sP
Cypp2sofHA/Eq0gXMoH7yZt3RG0LXTuNANYNr/chOx4kks0/fINjpowPXTiSkWc0bsXimWH62BZy
mq7TATEsXM6w4ywu1cVTP+/DlfNy3Mf0V3VVwEjMWwtR/3X8zKgtRJKMTtwe/YGhus6YefSEknPO
pO9oy3zdDy0Yp7qRp93tPAdxRSXyIsJs5bJlefH8o5QSzsk7hlBhQFhd/OlKpMCsYMDSOHa+FJ1K
AqEWgH0eMzczO6LFhVdhAAm3DFaAvxL4u+DkuQ==
pca_red75:
tU48SalKFzVys9fZR1Se+5xP1dlOh9SlbYBT/Ct6BGiyIFEVEdyq2XR7BDuz/0BAsMfGwhgwI3Ws
uNk6KnEyOBIX+9u0eFer/VoHkGydw8310fGxJiiq13BYHnkzk9OLZCdD43VF27a6SvEtaA/LXnm4
ZrURgpoFWtfBmaC4zIkHkYgXW5wTYeJ1Ze0rgmBYPFlms2BefeRricA68NR3OsbSoCmwIKfuWe+2
esM4RN8t9jG/nccM2EeluDXRKJHA09O02Lq7KBhZw5o2OBCQ7nDc9p47Poli0as1yo+ylHfjJOag
qCeVuPBCLEwpJL74CreuzJGAYqSOVA9BOx5SQA==
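A possible next step after base64 decoding is to look at a few properties of the decoded bytes before guessing at a format. The sketch below is only a heuristic aid; the helper name inspect_blob and the sample input are hypothetical, and none of the checks can prove what a blob contains.

```python
import base64
import math
from collections import Counter


def inspect_blob(b64_text):
    """Decode base64 text and report a few hints about the decoded bytes."""
    # Remove the line breaks that wrapped base64 usually contains.
    raw = base64.b64decode("".join(b64_text.split()))
    if not raw:
        raise ValueError("nothing to inspect")
    counts = Counter(raw)
    # Shannon entropy in bits per byte (8.0 is the theoretical maximum).
    entropy = -sum(n / len(raw) * math.log2(n / len(raw)) for n in counts.values())
    # Share of bytes that are printable ASCII.
    printable = sum(32 <= b < 127 for b in raw) / len(raw)
    return {
        "length": len(raw),
        "entropy": round(entropy, 2),
        "printable_ratio": round(printable, 2),
        "starts_with_gzip_magic": raw[:2] == b"\x1f\x8b",
    }


# Hypothetical sample: plain text that was only base64-encoded.
sample = base64.b64encode(b"user=RAHUL;role=student").decode("ascii")
print(inspect_blob(sample))
```

Mostly printable output suggests plain encoding, and a recognizable magic number suggests compression. Relatively high entropy with no recognizable header, combined with a length that is an exact multiple of a common cipher block or key size, is more consistent with encryption, in which case the data cannot be read without the key.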

HTTP Accept-Encoding and sending unencoded data

I am building a module for compressing HTTP output. Reading the spec, I haven't found a clear distinction on a couple of things:
Accept-Encoding:
Should this be treated the same as Accept-Encoding: * or as if no header is present?
Or what if I don't support gzip, but I get a header like this:
Accept-Encoding: gzip
Should I return a 406 error or just return the data unencoded?
EDIT:
I've read over the spec a few times. It mentions my first case, but it doesn't define what the behavior of the server should be.
Should I treat this case as if the header is not present? Or should I return a 406 error because there's no way to encode something given the field value ('' isn't a valid encoding).

Everything is written in the spec, section 14.3 Accept-Encoding:
The special "*" symbol in an Accept-Encoding field matches any
available content-coding not explicitly listed in the header
field.
If an Accept-Encoding field is present in a request, and if the server cannot send a response which is acceptable according to the Accept-Encoding header, then the server SHOULD send an error response with the 406 (Not Acceptable) status code.
edit:
If the Accept-Encoding field-value is empty, then only the "identity"
encoding is acceptable.
In this case, if "identity" is one of the available content-codings, then the server SHOULD use the "identity" content-coding, unless it has additional information that a different content-coding is meaningful to the client.
What is "identity"?
identity
The default (identity) encoding; the use of no transformation whatsoever. This content-coding is used only in the Accept-Encoding header, and SHOULD NOT be used in the Content-Encoding header.
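The rules quoted above can be put together in a small selection function. This is a simplified sketch, not a complete implementation of the spec: the name choose_coding is hypothetical, q-value parsing is minimal, and it relies on the additional rule from the same section of the spec that "identity" is always acceptable unless it is refused with "identity;q=0" or with "*;q=0" and no explicit identity entry.

```python
def choose_coding(accept_encoding, supported):
    """Pick a content-coding, or return None when a 406 would apply.

    accept_encoding: the header value, or None if the header is absent.
    supported: codings the server can produce, in order of preference.
    """
    available = [c.lower() for c in supported]
    if "identity" not in available:
        available.append("identity")

    # No header at all: prefer sending the data unencoded.
    if accept_encoding is None:
        return "identity"

    qvalues = {}
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        if not name.strip():
            continue
        q = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        qvalues[name.strip().lower()] = q

    # Empty field value: only "identity" is acceptable.
    if not qvalues:
        return "identity"

    def q_for(coding):
        if coding in qvalues:
            return qvalues[coding]
        if "*" in qvalues:
            return qvalues["*"]
        # Unlisted codings are refused, except identity.
        return 1.0 if coding == "identity" else 0.0

    # Highest q wins; ties go to the server's preference order.
    candidates = [(q_for(c), -i, c) for i, c in enumerate(available) if q_for(c) > 0]
    return max(candidates)[2] if candidates else None


print(choose_coding("", ["gzip"]))
print(choose_coding("gzip", []))
print(choose_coding("identity;q=0, br", ["gzip"]))
```

Under these assumptions, both cases from the question end up with unencoded data: an empty header allows only identity, and "Accept-Encoding: gzip" on a server without gzip support still leaves identity acceptable. A 406 only comes into play when identity has been refused explicitly and nothing else matches.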
