Unicode turns ANSI after FTP transfer

I have a bunch of Unicode (UTF-16LE) XML files that I want to transfer via an old, OLD VB6 FTP component, but when I send them through it, they turn into ANSI on the FTP server side (Win2k3 server).
When I send the same files using the Windows command-line ftp client, it works fine whether I use binary or ASCII transfer mode; the files stay Unicode. What could be the possible causes of this?
Edit: perhaps unrelated, but I notice that sending files through an old email component also does this to Unicode files.

The answer was finalized here: Writing ANSI string to Unicode file over FTP
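For anyone hitting the same thing, a quick way to confirm the conversion is to compare the byte order mark before and after transfer: UTF-16LE files written on Windows normally begin with FF FE, and an ANSI copy will not. A minimal diagnostic sketch in Python (the file names are placeholders):

```python
import codecs

def describe_encoding(path):
    """Report the BOM, if any, at the start of the file."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE (FF FE BOM present)"
    if head.startswith(codecs.BOM_UTF8):
        return "UTF-8 with BOM"
    return "no BOM - possibly ANSI, or the BOM was stripped in transit"

print(describe_encoding("before_upload.xml"))   # hypothetical local original
print(describe_encoding("after_download.xml"))  # hypothetical copy fetched back from the server
```

If the fetched copy has lost its BOM (and is roughly half the size), the component is reading the file as a string and re-encoding it rather than sending the raw bytes.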

Related

Wrong encoding in Joomla 3

I am using Joomla 3, which changes the character ' into â€.
Secondly, it is also converting any space into Â.
I tried using
and
on my head file, but the problem still persists.
My database collation is utf8 too.
I am using no editor in my Joomla administrator.
Also, on my Windows operating system it works fine, but when I push the files to the Linux server, it shows these weird signs.
I have searched Google a lot, but in vain.
Any help will be appreciated.
One possible reason is that the files you are pushing to the Linux server have an encoding other than UTF-8; they might be Windows-1252.
Here's one suggestion to test whether this might be the case:
Create a new text document in Notepad, then Save As, and under Encoding select UTF-8.
Upload this test file and check the results. If this still doesn't help, you might want to double-check the preferences of your SFTP application, just in case it is overriding the file encoding.
Good luck!
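If you want to check the files themselves rather than a single test document, a small script can flag anything that is not valid UTF-8 and write a converted copy. A rough sketch in Python, assuming the files live under a templates/ directory (the path and file pattern are assumptions; adjust them to your site):

```python
import pathlib

# Hypothetical location of the files you push to the server.
for path in pathlib.Path("templates").rglob("*.php"):
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")          # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        # Not valid UTF-8 -- very likely Windows-1252 from a Windows editor.
        # Write a UTF-8 copy alongside the original rather than overwriting it.
        text = raw.decode("windows-1252")
        path.with_suffix(path.suffix + ".utf8").write_bytes(text.encode("utf-8"))
        print(f"{path}: not UTF-8, wrote a UTF-8 copy")
```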
I had the same problem. I deactivated Google ModPagespeed in my cPanel and the problem was solved.
Remove "ModPagespeed On" from .htaccess.

.bin file in Eclipse or Notepad

Whenever I try to open the .bin file in Windows (and also in Eclipse), it looks like this, so I cannot read anything. I am using it to test a buffer pool, but since I cannot read it, I cannot tell whether the test was successful or not. It is the same when I open it with Notepad.
I am using a U.S. Windows installation with the Korean language added, but I can still read/write English fine.
The file extension ".bin" stands for "binary". That means your file may contain non-printable characters, as you saw.
If you want to see the contents of a binary file, you should use a hexadecimal editor ('hex editor' for short) instead of a text editor like Notepad.
http://en.wikipedia.org/wiki/Comparison_of_hex_editors lists many hex editors.
Some software may be able to open your .bin files directly; it depends on where the file came from.
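If you just want a quick peek at the bytes without installing a hex editor, a few lines of script will do. A minimal sketch in Python (the file name is an assumption):

```python
# Dump a binary file 16 bytes per row: offset, hex bytes, printable-ASCII column.
with open("bufferpool_test.bin", "rb") as f:   # hypothetical file name
    data = f.read()

for offset in range(0, len(data), 16):
    chunk = data[offset:offset + 16]
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    print(f"{offset:08x}  {hex_part:<47}  {text_part}")
```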

Keep file encoding in Eclipse for each file (different encodings for different files)

I'm working with a git repository where some of the files are encoded in latin-1 and some of them in utf-8. I'm using Eclipse CDT to work with them, and it's configured to use UTF-8 as default encoding.
The thing is, when I open Latin-1 encoded files, some of the characters are not shown properly, and although I've also just tried the Luna version, which came out 2 days ago, the problem persists (Latin-1 and Latin-2 are supposedly supported now, according to the review information).
Furthermore, and here comes the real trouble: when I modify and save Latin-1 encoded files, they are saved as UTF-8 (as configured in Eclipse), so if I push these changes to the repository, quite a lot of conflicts will emerge, messing up the entire commit.
Is there some way of telling Eclipse to keep the original encoding for each file?
Thank you.

Jekyll does not parse UTF-8

I created a page in Notepad and selected UTF-8 as the encoding while saving. Jekyll does not parse this page; it renders the Liquid tags in the page as they are.
Then I saved the same page using ANSI encoding. Jekyll parses that easily and my site is up and running. But it is limited to ANSI, and some characters appear as question marks due to the wrong encoding. I do not want to use ANSI instead of UTF-8 when the web fully supports the latter.
It may be due to the fact that Notepad inserts a byte order mark (BOM) at the beginning of UTF-8 documents, which may interfere with their processing (especially by tools that are aimed primarily at Unix). You could try using another text editor (or stripping out the BOM with another tool may work).
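If you would rather keep Notepad, stripping the BOM after saving is straightforward. A small sketch in Python (the page name is a placeholder):

```python
import codecs

path = "index.md"  # hypothetical page name
with open(path, "rb") as f:
    data = f.read()

# The UTF-8 BOM is the three bytes EF BB BF; drop it if present.
if data.startswith(codecs.BOM_UTF8):
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    print("BOM removed")
else:
    print("no BOM found")
```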

Apple DMG files over FTP are getting corrupted, why?

I am trying to FTP some Apple DMG files. If we do it by hand through Safari or IE, the file ends up at the destination just fine and uncorrupted. However, if I use a freeware FTP client that we had been using with great success for zips and exes, or if I use a PowerShell script I finished off (adapted from another Stack Overflow question's answer), then I lose about 1/2 MB on a 10.5 MB file and the DMG is corrupted. Does anyone have any clues about what could be going wrong, or things I could do to prevent it? So far all I have tried is gzipping the DMG before sending, and that accomplished nothing. Again, anything but a DMG gets transmitted just fine.
FYI, I am using binary mode transfers, so that is not it... thanks though.
It seems like your client treats the DMG file as a text file.
Set binary transfer mode in your FTP client and it will transfer the file as-is.
I always thought that ASCII transfer mode in FTP is just plain stupid; it causes more trouble than it is worth.
Are you sure everything except a DMG gets transferred correctly? It sounds like a problem with the transfer encoding. FTP supports both binary and ASCII transfer types, mainly due to historical baggage. In ye olde days, when bandwidth was scarcer, leaving off the high bit (which ASCII doesn't use) was a good time saver. However, if you have any bytes with the high bit set, ASCII transfer mode will lose them - hence "binary" mode, which truncates nothing.
Typically, the command to switch transfer modes is "bin" or "ascii".
Just so everyone knows: it turns out the client I was using had the exact same issue as my PowerShell script. I was using a StreamReader to get the bytes for transfer, and it was assuming an encoding that was not correct. I switched to a BinaryReader, which does not assume an encoding, and it now works.
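The fix above is .NET-specific (a StreamReader decodes bytes as text using some assumed encoding, while a BinaryReader hands back the raw bytes), but the principle applies to any upload code. As a language-neutral illustration, a minimal Python sketch with placeholder connection details and file names:

```python
from ftplib import FTP

ftp = FTP("ftp.example.com")        # placeholder host
ftp.login("user", "password")       # placeholder credentials

# Wrong: reading the file as text decodes it with an assumed encoding,
# which is exactly what corrupted the DMG.
#   data = open("installer.dmg", "r").read()

# Right: read the raw bytes and upload with a binary STOR, so nothing is re-encoded.
with open("installer.dmg", "rb") as f:
    ftp.storbinary("STOR installer.dmg", f)

ftp.quit()
```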