libpqxx postgresql utf8 strings

Is it possible to insert a UTF-8 (Unicode) string into a database table (PostgreSQL)?
pqxx::work tr(*_conn.get(), "notify");
std::stringstream ss;
ss << "INSERT INTO tbl (msg) VALUES ('" << msg << "');";
tr.exec(ss.str());
tr.commit();
I want the message content to be, for example, キエオイウカクケコサシスセソタチツテア. But the exec method expects a char-based string, not wchar_t. How can I encode a UTF-8 string to pass it into the query?
Additional question: how can I produce a UTF-8 string starting from wchar_t data? I assume that wchar_t represents 2-byte characters, but a single UTF-8 character may take up to 4 bytes.

It's possible to convert a wide-char string into UTF-8 like this:
std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
std::string u8str = conv.to_bytes(msg);
or this way, using the Windows API:
std::wstring wmsg_text = L"キエオイウカクケコサシスセソタチツテア";
char buffer[100] = { 0 };
WideCharToMultiByte(CP_UTF8, 0, wmsg_text.data(), wmsg_text.size(), buffer, sizeof(buffer)-1, NULL, NULL);
Of course, after obtaining the string from the database, you need the reverse conversion:
std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
std::wstring wmsg = conv.from_bytes(message);
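The key point is that UTF-8 text is just a sequence of bytes, so it fits in the ordinary char-based std::string that exec accepts. The same round trip, illustrated in Python purely for reference (not libpqxx code):
# -*- coding: utf-8 -*-
# Illustration only: the wide/Unicode text is encoded to UTF-8 bytes before the
# INSERT and decoded back to a Unicode string after the SELECT.
msg = u"キエオイウカクケコサシスセソタチツテア"

utf8_bytes = msg.encode("utf-8")  # 19 characters become 57 bytes (3 bytes each here)
assert utf8_bytes.decode("utf-8") == msg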

PowerShell 5.1 handling sql server nvarchar datatype

I am loading a DataTable with data from SQL Server as such:
$queryStr = "SELECT TOP 10 ID, QueryText FROM dbo.DatabaseName";
$dataRows = Invoke-DbaQuery -SqlInstance instance.name -Database databasename -Query $queryStr -As DataSet;
In SQL Server, QueryText is nvarchar(max). In PowerShell this becomes a string datatype, equivalent to varchar I think. I think this because when I try to calculate the hash in PowerShell with Get-FileHash, and in SQL Server I calculate the hash on the nvarchar column with SELECT (CONVERT([varchar](70), HASHBYTES('SHA2_256', QueryText), (1))), the hashes do not match.
They DO match, however, if I convert the nvarchar to a varchar(max) in SQL Server.
So the question is: in PowerShell, how can I convert the string datatype to match the nvarchar datatype in SQL Server? Because as far as I know, PowerShell does not have an nvarchar datatype, just the generic string datatype.
Added this part after reading comments.
In the DataTable that I retrieve from SQL Server as per the above code I add an extra column to hold the hash that I calculate in PowerShell.
Add extra column to DataTable:
$HashColumn = [System.Data.DataColumn]::new('QueryHashString', [string]);
$dataRows.Tables[0].Columns.Add($HashColumn);
Now I do a foreach to fill this column I just added:
foreach($row in $dataRows.Tables[0]) {
    $stringAsStream = [System.IO.MemoryStream]::new()
    $writer = [System.IO.StreamWriter]::new($stringAsStream)
    $writer.write("$($row.QueryText)")
    $writer.Flush()
    $stringAsStream.Position = 0
    $row.QueryHashString = (Get-FileHash -InputStream $stringAsStream | Select-Object -ExpandProperty Hash)
}
Your code uses a StreamWriter with its default UTF-8 encoding, which matches what you get when hashing a VARCHAR -- as long as you stick to ASCII characters. To hash Unicode (UTF-16) instead (and, for variety, let's use SHA256 directly instead of going through Get-FileHash, and throw in an emoji so we have to deal with surrogates):
$s = "Hello, world! I 💖 you"
$sha256 = [System.Security.Cryptography.SHA256]::Create()
[BitConverter]::ToString(
    $sha256.ComputeHash([System.Text.Encoding]::Unicode.GetBytes($s))
).Replace("-", "")
This yields the same result as
SELECT CONVERT(CHAR(64), HASHBYTES('SHA2_256', N'Hello, world! I 💖 you'), 2)
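For reference, the same distinction can be reproduced outside PowerShell. A small Python sketch (illustrative only) hashes the text once as UTF-8 and once as UTF-16LE, the encoding behind both .NET's Encoding.Unicode and SQL Server's NVARCHAR:
import hashlib

s = "Hello, world! I 💖 you"

# SHA-256 over the UTF-8 bytes: what a default StreamWriter / Get-FileHash pipeline produces.
utf8_hash = hashlib.sha256(s.encode("utf-8")).hexdigest().upper()

# SHA-256 over the UTF-16LE bytes: matches the Encoding.Unicode example above
# and HASHBYTES('SHA2_256', N'...') on the NVARCHAR literal.
utf16_hash = hashlib.sha256(s.encode("utf-16-le")).hexdigest().upper()

print(utf8_hash)
print(utf16_hash)  # differs from utf8_hash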

encode a text which contains hex string into utf-8

I'm struggling to decode this hex-escaped string: =D8=A8=D8=A7 <br /> =D8=B3=D9=84=D8=A7=D9=85 hello =D9=88 =D8=A7=D8=AD=D8=AA=D8=B1=D8=A7= =D9=85 into proper UTF-8, which must be displayed as:
با سلام hello و احترا= م
What I've tried so far: as a first step I tried to decode the string with the decode function, but since my string contains a character that is not a valid hex digit (=), it throws an error:
select decode(content, 'hex') from attachments
ERROR: invalid hexadecimal digit: "="
I also tried to convert it directly to UTF-8, but nothing changed in the output:
select convert_from(content::bytea, 'utf-8') from attachments
=D8=A8=D8=A7 <br /> =D8=B3=D9=84=D8=A7=D9=85 hello =D9=88 =D8=A7=D8=AD=D8=AA=D8=B1=D8=A7= =D9=85
Try something like this:
select convert_from(decode(replace(replace(content,'=',''),' ','20'), 'hex'), 'UTF8') from attachments
or, equivalently, with an explicit cast:
select convert_from(decode(replace(replace(content,'=',''),' ','20'), 'hex')::bytea, 'utf-8') from attachments
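The =XX escapes look like quoted-printable encoding of UTF-8 bytes, and the replace-and-decode trick above may still trip over the literal parts of the value ('hello', '<br />'), which are not valid hex digits. If the value can be processed outside the database, a quoted-printable decoder handles both cases. A small Python sketch, using the sample value inline:
import quopri

raw = b"=D8=A8=D8=A7 <br /> =D8=B3=D9=84=D8=A7=D9=85 hello =D9=88 =D8=A7=D8=AD=D8=AA=D8=B1=D8=A7= =D9=85"

# Each =XX escape becomes the original byte; plain text (the <br /> and "hello")
# passes through unchanged. The decoded bytes are then valid UTF-8.
text = quopri.decodestring(raw).decode("utf-8")
print(text)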

Error in Pentaho Data Integrator - invalid byte sequence for encoding "UTF8": 0x00

I am getting this error while inserting bulk rows with Pentaho Data Integration. I am using PostgreSQL.
ERROR: invalid byte sequence for encoding "UTF8": 0x00
"UTF8": 0x00 = "null character". You can use "Modified Javascript" step, and then apply a mask pattern as follows:
function removeNull(e) {
    if (e != null)
        return e.replace(/\0/g, '');
    else
        return '';
}
var replacedString = removeNull(fieldToRemoveNullChars);
Select the new field in the step's output fields, and voilà! I used to have this problem with incoming AS/400 data.
PostgreSQL is very strict about the content of text fields and doesn't allow 0x00 in UTF-8 encoded fields. You should fix your input data.
A possible solution: https://superuser.com/questions/287997/how-to-use-sed-to-remove-null-bytes
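If the data can be cleaned before it ever reaches the transformation, stripping the 0x00 bytes from the source file works just as well. A minimal sketch in Python (the file names are placeholders):
# Minimal sketch: copy a file while dropping NUL (0x00) bytes.
# "input.csv" and "cleaned.csv" are placeholder names.
with open("input.csv", "rb") as src, open("cleaned.csv", "wb") as dst:
    for chunk in iter(lambda: src.read(1 << 20), b""):
        dst.write(chunk.replace(b"\x00", b""))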
Finally I got the solution:
In the Table Input step, check the "Enable lazy conversion" option.
Add a "Select Values" step, select all fields, and on the "Meta-data" tab set the encoding to "UTF-8" for all fields.

QT QPSQL table insert return column

Using Qt Creator 4.1.0 based on Qt 5.6.
I have a postgresql table that looks like this:
CREATE TABLE shelf (
    bookid serial,
    title character varying(200),
    author character varying(200),
    publisher character varying(200),
    isbn character varying(200),
    genre character varying(200)
);
I have just done a successful insert
QSqlQuery que;
que.exec("insert into shelf(title) values('blah') returning bookid;");
The insert worked fine; how do I get the returned bookid?
Try:
while (que.next()) {
    QString bookid = que.value(0).toString();
    qDebug() << bookid;
}
You have two ways to get the last insert id when using Qt and PostgreSQL:
Using lastInsertId(): This has the advantage of being portable (i.e. your code keeps working if you decide to change to another DBMS later on), but it might be unsafe if you had triggers in your database that fire upon insertion in your table (e.g. shelf) and insert other values in other tables (note that it is implemented by issuing the query SELECT lastval(); on PostgreSQL). When using this approach, your code should look something like this:
QSqlQuery query;
query.prepare("INSERT INTO some_table(col_name) VALUES(?);");
query.addBindValue("sth");
if (!query.exec()) // insert statement failed
    qWarning() << "insert statement failed with error: "
               << query.lastError().databaseText();
int insertId = query.lastInsertId().toInt();
// do anything you want with the id
qDebug() << "id: " << insertId;
Using the RETURNING clause: This has the advantage of being safer in case you had the weird triggers mentioned above, but it is not portable (i.e. the insert statement won't execute if you ever decide to change to another DBMS because it is not standard SQL anymore). This way, your code should look something like this:
QSqlQuery query;
query.prepare("INSERT INTO some_table(col_name) VALUES(?) RETURNING id_col_name;");
query.addBindValue("sth");
if (!query.exec()) // insert statement failed
    qWarning() << "insert statement failed with error: "
               << query.lastError().databaseText();
if (!query.next()) // returning clause failed
    qWarning() << "returning clause did not return any data";
int insertId = query.value(0).toInt();
// do anything you want with the id
qDebug() << "id: " << insertId;

Psycopg2 copy_from throws DataError: invalid input syntax for integer

I have a table with some integer columns. I am using psycopg2's copy_from:
import psycopg2

conn = psycopg2.connect(database=the_database,
                        user="postgres",
                        password=PASSWORD,
                        host="",
                        port="")
print "Putting data in the table: Opened database successfully"
cur = conn.cursor()
with open(the_file, 'r') as f:
    cur.copy_from(file=f, table=the_table, sep=the_delimiter)
conn.commit()
print "Successfully copied all data to the database!"
conn.close()
The error says that it expects the 8th column to be an integer and not a string. But Python's write method can only write strings to a file. So how would you import a file full of string representations of numbers into a Postgres table whose columns expect integers, when your file can only contain the character representation of each integer (e.g. str(your_number))?
You either have to write the numbers to the file in integer format (which Python's write method doesn't allow), or psycopg2 should be smart enough to do the conversion as part of the copy_from procedure, which it apparently is not. Any idea is appreciated.
I ended up using the copy_expert command. Note that on Windows you have to set the permissions on the file; this post is very useful for setting the permissions.
with open(the_file, 'r') as f:
    sql_copy_statement = (
        "copy {table} FROM '{from_file}' "
        "DELIMITER '{deli}' {file_type} HEADER;"
    ).format(table=the_table,
             from_file=the_file,
             deli=the_delimiter,
             file_type=the_file_type)
    print sql_copy_statement
    cur.copy_expert(sql_copy_statement, f)
    conn.commit()
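Note that because the COPY statement above names a file path, the file is read by the PostgreSQL server process itself, which is why the file permissions matter. A client-side alternative is to stream the local file over the connection with COPY ... FROM STDIN. A sketch, reusing the variables and open connection from the code above and assuming CSV input:
# Sketch of the client-side variant: psycopg2 streams the local file to the
# server, so the server process never has to open the file itself.
copy_stdin = ("COPY {table} FROM STDIN "
              "WITH (FORMAT csv, DELIMITER '{deli}', HEADER)").format(
                  table=the_table, deli=the_delimiter)

with open(the_file, 'r') as f:
    cur.copy_expert(copy_stdin, f)
conn.commit()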