SAS import problem with UTF-8 and U+2019 character

I am trying to import a value from a CSV file, using an INPUT statement with the encoding set to UTF-8.
The value contains a U+2019 character (right single quotation mark), which SAS doesn't recognize at all; it displays a box instead.
Does anyone know what the problem could be?

The session needs to be running UTF-8; otherwise SAS will try to transcode the text into the session encoding. Ask your SAS admin to show you how to connect to an application server that runs with UTF-8 encoding.
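For reference, a small Python (not SAS) sketch of why this particular character is fragile: in UTF-8 it occupies three bytes, and it has no slot at all in ISO-8859-1, so transcoding it into a non-UTF-8 Latin session encoding has nothing sensible to map it to.

ch = "\u2019"                 # right single quotation mark
print(ch.encode("utf-8"))     # b'\xe2\x80\x99' -- three bytes in UTF-8
try:
    ch.encode("iso-8859-1")   # no code point for U+2019 in Latin-1
except UnicodeEncodeError as err:
    print("not representable in ISO-8859-1:", err)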

Related

Can PostgreSQL convert entries to UTF-8 even though the input is Latin1?

I have psql (PostgreSQL) 10.10 and client_encoding is UTF8. The entries are made by an older Delphi version which cannot use UTF-8, so the special characters in the DB are not stored as UTF-8; a ™ sign is represented by \u0099, for instance. Is it possible to force a conversion when the sign is entered into the database? Switching Delphi is not an option right now. I am sorry if this is a basic question; my knowledge of databases is limited.
It looks like your Delphi client is not using LATIN1 but WINDOWS-1252, because ™ is code point 0x99 in that encoding.
You can change client_encoding per session, and that is what you should do.
Either let your application execute
SET client_encoding = WIN1252;
or set the PGCLIENTENCODING environment variable, or specify client_encoding as part of the connection string.
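A minimal sketch of the same idea from Python with psycopg2 (the connection string is a placeholder); it also shows that byte 0x99 really is the trademark sign in Windows-1252:

import psycopg2

print(b"\x99".decode("cp1252"))    # '™' -- 0x99 is the trademark sign in Windows-1252

# Tell PostgreSQL this client sends and expects Windows-1252 text,
# so the server transcodes to and from the UTF8 database encoding.
conn = psycopg2.connect("dbname=mydb user=myuser")   # placeholder DSN
conn.set_client_encoding("WIN1252")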

Python3 with FileMaker data using pyodbc not giving actual unicode value [French and Hungary character]

I am using Python 3 to connect to a FileMaker database. The result I need to get from the database is "Pontificia Universidad Católica del Perú", but using pyodbc with UTF-8 encoding the value comes back garbled. When I pass this value to HTML, the Unicode characters are printed as spaces. I tried setting the encoding on my DB connection object to 'UTF-8', 'LATIN1' and 'iso-8859-1'; the result is still the same. I have been stuck on this for a few days. Please help me get the actual value from the FileMaker DB.
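For reference, pyodbc configures text decoding per connection; a minimal sketch of the kind of setup being attempted (the DSN, credentials, query, and the actual encoding the FileMaker driver uses are all assumptions):

import pyodbc

conn = pyodbc.connect("DSN=FileMakerDSN;UID=user;PWD=secret")   # placeholder DSN

# Tell pyodbc how to decode the bytes the ODBC driver returns.
# Whether the driver really sends UTF-8 is an assumption to experiment with.
conn.setdecoding(pyodbc.SQL_CHAR, encoding="utf-8")
conn.setdecoding(pyodbc.SQL_WCHAR, encoding="utf-16le")
conn.setencoding(encoding="utf-8")

row = conn.cursor().execute("SELECT name FROM universities").fetchone()   # hypothetical query
print(row[0])   # expecting: Pontificia Universidad Católica del Perú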

Escape Cyrillic, Chinese, Arabic, Hebrew characters in postgresql

I'm trying to load records from a flat file into a Postgres table. I'm doing it with the COPY command, which has worked well so far.
But now I am receiving fields with words in Chinese, Japanese, Cyrillic and other languages, and when I try to load them I get an error.
How could I escape those characters in Postgres? I searched, but I have not found any reference to this kind of topic.
You should not escape the characters; you should load them as they are.
If your database encoding is UTF8, that's no problem; if it is not UTF8, change that.
For each file, figure out what its encoding is and use the ENCODING option of COPY or the PGCLIENTENCODING environment variable so that PostgreSQL knows which encoding the file is in.
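A minimal sketch of the ENCODING option from Python with psycopg2 (the connection string, table, column and file names are placeholders, and the file is assumed to really be UTF-8):

import psycopg2

conn = psycopg2.connect("dbname=mydb user=myuser")   # placeholder DSN

with conn, conn.cursor() as cur, open("names.csv", "rb") as f:
    # ENCODING tells COPY how the incoming file is encoded; the server
    # transcodes it into the database encoding on the way in.
    cur.copy_expert(
        "COPY people (name) FROM STDIN WITH (FORMAT csv, ENCODING 'UTF8')",
        f,
    )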

CSV export with the wrong character set (utf-8 ?)

I have exported the data in CSV format, but it contains funny characters such as é.
What is the charset? UTF-8, or the one my computer uses?
Is there a way to specify the charset at export?
It is unfortunately impossible to specify the charset at export... but I think you can set the encoding during the CSV import process in LibreOffice or MS Excel. Tell me if that solves your issue.
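If the export turns out to be UTF-8 and the spreadsheet is guessing the local codepage instead, one workaround is to rewrite the file before opening it; a sketch in Python, assuming the file names and that the export really is UTF-8:

# Rewriting with a UTF-8 byte order mark ("utf-8-sig") lets Excel detect
# the encoding instead of falling back to the ANSI codepage.
with open("export.csv", encoding="utf-8") as src:
    data = src.read()

with open("export-bom.csv", "w", encoding="utf-8-sig", newline="") as dst:
    dst.write(data)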

Non-ISO extended-ASCII CSV giving special character while importing in DB

I am getting a CSV file from an S3 server and inserting it into PostgreSQL using Java.
S3Object object = s3Client.getObject(new GetObjectRequest(bucketName, key));
// No charset is passed to InputStreamReader, so the JVM's default encoding is used.
BufferedReader reader = new BufferedReader(
        new InputStreamReader(object.getObjectContent())
);
For some of the rows, the value in one column contains the special character �. I tried the encodings UTF-8, UTF-16 and ISO-8859-1 with InputStreamReader, but it didn't work.
When the encoding WIN-1252 is used, the DB still shows some special characters, and when I export the data to CSV it shows the same characters I found in the raw file.
Then again, when I open the file in Notepad the character looks fine, but when I open it in Excel the same special character appears.
All the PostgreSQL stuff is quite irrelevant; PostgreSQL can deal with practically any encoding. Check your data with a utility such as enca to determine how it is encoded, and set your PostgreSQL session to that encoding. If the server is in the same encoding or in some Unicode encoding, it should work fine.
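The same check can be done in code; a small sketch using the chardet package instead of enca (the file name is a placeholder):

import chardet   # third-party: pip install chardet

# Sniff the raw bytes to guess the file's real encoding, analogous to
# running enca on the command line.
with open("rows-from-s3.csv", "rb") as f:
    raw = f.read(100000)

print(chardet.detect(raw))   # e.g. {'encoding': 'Windows-1252', 'confidence': 0.73, ...}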