When inserting Unicode data in an ODBC application, how do I determine the encoding it should be in?

I have a generic ODBC application reading and writing data via ODBC to some database (it can be MS SQL Server, MySQL or anything else). The data received and sent can be Unicode. I'm using SQL_C_WCHAR for my bindings in this case.
So I have two questions here:
Can I determine the encoding in which the data came from the ODBC data source?
In which encoding should I send data to the ODBC data source? I'm running a parameterised insert statement for this purpose.
My research showed that some data sources have connection options to set the encoding, but I want to write a generic application that works with anything.
I couldn't find any ODBC option telling me the encoding of the data source. Is there something like that? The ODBC docs just say to use SQL_C_WCHAR. Does SQL_C_WCHAR mean UTF-16?

I did some more research, and both the Microsoft docs and the unixODBC docs seem to point out that ODBC only supports UCS-2. So I think all the data sent or received needs to be UCS-2 encoded.
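For what it's worth, here is a minimal sketch of such a parameterised insert bound as SQL_C_WCHAR. It assumes an already-connected HDBC, a hypothetical table people(name NVARCHAR(50)), and a platform where SQLWCHAR is the 2-byte UTF-16/UCS-2 type (Windows, or a default unixODBC build); error checking is omitted:

    #include <sql.h>
    #include <sqlext.h>

    void insert_name(SQLHDBC hdbc)
    {
        SQLHSTMT hstmt;
        /* UTF-16 code units for "Łódź", built explicitly because on most
           Unix systems wchar_t is 4 bytes, so an L"" literal is not a
           SQLWCHAR string there. */
        SQLWCHAR name[] = { 0x0141, 0x00F3, 0x0064, 0x017A, 0 };
        SQLLEN   ind    = SQL_NTS;   /* null-terminated input */

        SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);
        SQLPrepare(hstmt, (SQLCHAR *)"INSERT INTO people(name) VALUES (?)", SQL_NTS);
        /* C type SQL_C_WCHAR, SQL type SQL_WVARCHAR: the driver converts
           from UTF-16/UCS-2 to whatever the data source itself uses. */
        SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_WCHAR, SQL_WVARCHAR,
                         50, 0, name, sizeof(name), &ind);
        SQLExecute(hstmt);
        SQLFreeHandle(SQL_HANDLE_STMT, hstmt);
    }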

Related

Reading Postgres (encoded in SQL_ASCII) from Data Factory

I am trying to read an on-premises Postgres database that is encoded in SQL_ASCII from the Azure Data Factory Copy Data activity, in order to copy the database's data into an Azure Data Lake.
I am running into encoding issues with special characters such as "è" and "é", and I am quite clueless as to how I should go about fixing this.
When setting up my source dataset providing a given table, its preview shows the characters as "?". Does anyone have an idea of how I could fix this?
Note that I cannot change the database's encoding because it is not under my control.
Any help will be greatly appreciated!
I was able to fix/work around my issue by using the Postgres ANSI ODBC driver (64-bit) and specifying the DSN as an additional property on the Postgres Linked Service inside Azure Data Factory.
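For reference, a DSN-less ODBC connection string for the psqlODBC ANSI driver would look roughly like this (the driver name is the one installed by the 64-bit psqlODBC package; server, database and credentials are placeholders, and the answer above used a configured DSN instead):

    Driver={PostgreSQL ANSI(x64)};Server=myhost;Port=5432;Database=mydb;Uid=myuser;Pwd=mypassword;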

Character decoding problem when importing records from a DB2 database table into IBM Lotus Notes documents using a LotusScript agent

I have an agent written in LotusScript (IBM Domino 9.0.1, Windows 10) that reads records from a DB2 database and writes them to Notes documents. The table in DB2 (on CentOS) contains international names in VARCHAR fields such as "Łódź".
The DB2 database was created as UTF-8 (code page 1208), and Domino natively supports Unicode. Unfortunately, the value loaded into the Notes document is not "Łódź" as it should be, but "? Ód?".
How can I import special characters from DB2 into Domino NSF databases correctly?
Thank you
To import the table I used the following code, taken from the OpenNTF XSnippets:
https://openntf.org/XSnippets.nsf/snippet.xsp?id=db2-run-from-lotusscript-into-notes-form
Find where the code page conversion is happening. Alter the LotusScript to dump the hex of the received data for the column concerned to a file or a dialog box. If the hex codes differ from what is in the column, then it may be your Db2 client that is using the wrong code page. Are you aware of the DB2CODEPAGE environment variable for Windows? It might help if it is the Db2 client that is doing the code page conversion.
That is, setting the environment variable DB2CODEPAGE=1208 may help, although careful testing is required to ensure it does not cause other symptoms that are mentioned online.
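For example, on the Windows machine where the Db2 client runs, a session-level setting would be (it can also be made permanent through the Windows environment variable settings; test carefully as noted above):

    set DB2CODEPAGE=1208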

How to Get translateBinary to Work in Rational Application Developer Data Connection

Using Rational Application Developer for WebSphere 9.1.0 to make a data connection to a DB2 for iSeries file, the column data displays as hex (I think).
I have added the "translateBinary=true" property to the connection URL but it does not change the display results.
jdbc:as400:host;translateBinary=true
DB2 for iSeries uses EBCDIC natively, but the Toolbox JDBC driver will automatically attempt to translate EBCDIC to Unicode for you. Since only some fields are not being translated, it is likely those fields are tagged with CCSID 65535, which tells the Toolbox driver not to translate them. You can either tag those fields with a CCSID that indicates translation, or use the translate binary driver property, which is what you're attempting. The property is not working because you mistyped it. According to this FAQ, it should be ";translate binary=true" instead of what you've tried.
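With that correction, the connection URL would look something like this (the host is a placeholder; the Toolbox documentation writes the prefix as jdbc:as400://):

    jdbc:as400://host;translate binary=true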

How to read Chinese text from a PostgreSQL database using ADO.NET and C#?

I am trying to read Chinese text but did not succeed.
I have used Npgsql as the provider (Npgsql.dll).
I have used the ADO.NET NpgsqlConnection, NpgsqlCommand, NpgsqlDataReader and NpgsqlDataAdapter class objects.
I want to read the Chinese text which is stored in a table of a PostgreSQL database.
Can anyone help me?
If your database has encoding SQL_ASCII, you are lost.
Other than that, set the connection string parameter Client Encoding to the value your .NET application expects.
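For example, a connection string along these lines (host, database and credentials are placeholders; Client Encoding is the Npgsql parameter mentioned above, and UTF8 is what a .NET application will normally want):

    Host=myhost;Database=mydb;Username=myuser;Password=mypassword;Client Encoding=UTF8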

Database encoding in PostgreSQL

I have recently started using PostgreSQL for creating/updating existing SQL databases. Being rather new to this, I came across the issue of selecting the correct encoding type while creating a new database. UTF-8 (the default) did not work for me, as the data to be included is in various languages (English, Chinese, Japanese, Russian, etc.) and also includes symbolic characters.
Question: What is the right database encoding type to satisfy my needs?
Any help is highly appreciated.
There are four different encoding settings at play here:
The server side encoding for the database
The client_encoding that the PostgreSQL client announces to the PostgreSQL server. The PostgreSQL server assumes that text coming from the client is in client_encoding and converts it to the server encoding.
The operating system default encoding. This is the default client_encoding set by psql if you don't provide a different one. Other client drivers might have different defaults; e.g. PgJDBC always uses UTF-8.
The encoding of any files or text being sent via the client driver. This is usually the OS default encoding, but it might be a different one - for example, your OS might be set to use utf-8 by default, but you might be trying to COPY some CSV content that was saved as latin-1.
You almost always want the server encoding set to utf-8. It's the rest that you need to change depending on what's appropriate for your situation. You would have to give more detail (exact error messages, file contents, etc) to be able to get help with the details.
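As a starting point, you can check and adjust these settings from psql or any SQL client (a quick sketch; the database name is a placeholder, and depending on your cluster's locale you may also need compatible LC_COLLATE/LC_CTYPE values for the new database):

    SHOW server_encoding;            -- encoding of the current database
    SHOW client_encoding;            -- what the server assumes the client sends
    SET client_encoding = 'UTF8';    -- per-session override
    CREATE DATABASE mydb WITH ENCODING 'UTF8' TEMPLATE template0;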