Hotmail messing with encoded URL parameters - email

We have a system that sends out regular emails with links in, many of which contain URL encoded parameters such as this:
href="http://www.mydomain.com/login.aspx?returnurl=http%3A%2F%2Fwww.mydomain.com%2Fview.aspx%3Fid%3D1234%26alert%3Dtrue"
You can see that the "returnurl" parameter is encoded. However, it seems that a large number of our users (seemingly hotmail) are receiving the emails with this paramater partly decoded such as:
href="http://www.mydomain.com/login.aspx?returnurl=http://www.mydomain.com/view.aspx?view.aspx%3Fid%3D1234%26alert%3Dtrue"
Why would it decode like this? Why only partly decode?? I therefore have no idea how to deal with it. I thought of base-64 encoding but that base64 strings contain characters that would need decoding too... I thought of double encoding but then I will not know whether to double-decode the parameter or not... Can anyone help? Thanks.

One reason this could be happening is because url rules for encoding are different before and after ? so if mechanism that is doing decoding does it from the 'back' of url and apples query decoding rules until it finds first ? then this could cause problem you are describing...
Not sure how to deal with it though as I understand system that does this inappropriate decoding is outside of your control. I would try to hide the ? in return url query somehow...

Related

How can I decode email contents that are differently "Content-Transfer-Encoding" encoded?

I'm reading emails with imaplib, and found out that some email contents are encoded base64, and some 7bits.
I tried to decode it with 'Content-Transfer-Encoding' value.
But even more, some have 'Content-Transfer-Encoding' header in message object, whereas some have it in message.get_payload()[0].
I can deal with these some cases, but I think there can be more cases that I haven't found.
Is there any better way to decode email contents, no matter how they are encoded?
Thanks :)
when using get_payload(), I added decode=True option, so that it can automatically decode if needed. link
And then, isinstance(content, bytes) tells you whether you have to uni-decode or not.

How to build a Uri in Spray?

I would like to make a simple GET request via Spray with a few query parameters
Get("http://localhost/user?email=abc+a#abc.com")
However + means a space in application/x-www-form-urlencoded content resulting the call to http://localhost/user?email=abc a#abc.com (with a space instead of plus sign).
I could use a non-Spray java.net.URLEncoder to encode the URL before passing it to the GET request however I doing this every time seems like a hack.
Is there a Spray way of applying query parameters and encoding them?
Uri("http://localhost/").withQuery(Map("email"->"abc+a#abc.com")) is a nice way to construct a Uri but it doesn't encode the params as well...
Actually Uri("http://localhost/").withQuery(Map("email"->"abc+a#abc.com")) works fine as it encodes the special symbols.
However, Uri("http://localhost/").withQuery("email=abc+a#abc.com") doesn't.
I use java.net.URLEncoder. I believe that is the accepted method.
It would be nice if that happened automatically!

Is form data automatically encoded by browsers?

I have read some stuff about form data encoding, but one thing remains unclear. In case of enctype="application/x-www-form-urlencoded" we need to urlencode data by hand, don't we?
... Forms submitted with this content type must be encoded as follows
Must be encoded by whom? By browsers? Or by application developers?
The other thing is -- what encoding (if any) is used, or should be used, in case of multipart/form-data?
I'm kindda mislead so big thx in advance.
Actually, browsers url-encode data automatically. And this w3 docs is first of all for those who make browsers. So that phrase, Forms submitted with this content type must be encoded as follows means that data should be encoded by browsers. Anyway, one can check it by viewing raw post in the form data handling script (in case of php in looks like file_get_contents("php://input");)

How to convert all \u**** string into the real readable NSString? [duplicate]

I am trying to support arbitrary unicode from a variety of international users. They have already put a bunch of data into sqlite databases on their iPhones, and now I want to capture the data into a database, then send it back to their device. Right now I am using a php page that is sending data back to from an internet mysql database. The data is saved in the mysql database properly, but when it's sent back it comes out as unicode text, such as
Frank\u00e2\u0080\u0099s iPad
instead of just
Frank's iPad
where the apostrophe should really be a curly apostrophe.
The answer posted to another question indicates that there is no built-in Cocoa methods to convert the "\u00e2\u0080\u0099" portion of the unicode string from the webserver to an NSString object. Is this correct?
That seems really surprising (and scarily disappointing), since Cocoa definitely allows input from many different Unicode characters, and I need to support any arbitrary language that I have never heard of, and all of the possible characters. I save them to and from the local sqlite database just fine now, but once I send it to a web server, then perhaps pull down different data, I want to ensure the data pulled from the web server is correctly formatted.
[...] there is no built-in Cocoa methods to convert [...]. Is this
correct?
It's not correct.
You might be interested in CFStringTransform and it's capabilities. It is a full blown ICU transformation engine, which can (also) perform your requested transformation.
See Using Objective C/Cocoa to unescape unicode characters, ie \u1234
All NSStrings are Unicode.
The problem with the “Frank\u00e2\u0080\u0099s iPad” data isn't that it's Unicode; it's that it's escaped to ASCII. “Frank’s iPad” is valid Unicode in any UTF, and is what you need.
So, you need to see whether the database is returning the data escaped or the PHP layer is escaping it at some point. If either of those is the case, fix it if you can; the PHP resource should return UTF-8/16/32. Only if that approach fails should you seek to unescape the string on the Cocoa side.
You're correct that there is no built-in way to unescape the string in Cocoa. If you get to that point, see if you can find some open-source code to do it; if not, you'll need to do it yourself, probably using NSScanner.
Check that your web service response has Content type and charset. Also that xml has encoding specified. In PHP you need to add the following before printing XML:
header('Content-type: text/xml; charset=UTF-8');
print '<?xml version="1.0" encoding="UTF-8"?>';
I guess there is just no encoding specified.

MVC HttpUtility.UrlEncode

i am attempting to use HttpUtility.UrlEncode to encode strings that ultimately are used in URLs.
example
/string/http://www.google.com
or
/string/my test string
where http://www.google.com is a parameter passed to a controller.
I have tried UrlEncode but it doesn't seem to work quite right
my route looks like:
routes.MapRoute(
"mStringView",
"mString/{sText}",
new { controller = "mString", action = "Index", sText = UrlParameter.Optional }
);
The problem is the encoded bits are decoded it seems somewhere in the routing.. except things like "+" which replace " " are not decoded..
Understanding my case, where a UrlParameter can be any string, including URL's.. what is the best way to encode them before pushing them into my db, and then handling the decode knowing they will be passed to a controller as a parameter?
thanks!
It seems this problem has come up in other forums and the general recommendation is to not rely on standard url encoding for asp.net mvc. The advantage is url encoding is not necessarily as user friendly as we want, which is one of the goals of custom routed urls. For example, this:
http://server.com/products/Goods+%26+Services
can be friendlier written as
http://server.com/products/Good-and-Services
So custom url encoding has advantages beyond working around this quirk/bug. More details and examples here:
http://www.dominicpettifer.co.uk/Blog/34/asp-net-mvc-and-clean-seo-friendly-urls
You could convert the parameter to byte array and use the HttpServerUtility.UrlTokenEncode
If the problem is that the "+" doesn't get decoded, use HttpUtility.UrlPathEncode to encode and the decoding will work as desired.
From the documentation of HttpUtility.UrlEncode:
You can encode a URL using with the UrlEncode method or the
UrlPathEncode method. However, the methods return different results.
The UrlEncode method converts each space character to a plus character
(+). The UrlPathEncode method converts each space character into the
string "%20", which represents a space in hexadecimal notation. Use
the UrlPathEncode method when you encode the path portion of a URL in
order to guarantee a consistent decoded URL, regardless of which
platform or browser performs the decoding.