How to return strings that are trimmed automatically by ServiceStack.OrmLite.PostgreSQL? - postgresql

I've noticed that when my entities are returned, the string values are padded to the number of characters of the field definition in the database. Is there a setting to control this behavior?
Thank you,
Stephen
PostgreSQL 9.3
ServiceStack.OrmLite.PostgreSQL.4.0.30

PostgreSQL only pads strings for fixed-size CHAR(N) columns. If you don't want this behavior you should use VARCHAR(N) columns instead - the recommended default for strings.
OrmLite also uses VARCHAR for strings when it's used to generate your table schemas, e.g.:
db.CreateTable<Table>();
If dealing with legacy DBs, you can get OrmLite to automatically trim padded strings returned from the RDBMS with:
OrmLiteConfig.StringFilter = s => s.TrimEnd();
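For illustration, here's a minimal sketch of the underlying padding behavior in PostgreSQL itself (plain SQL, independent of OrmLite):
-- char(10) is blank-padded to its declared length, varchar(10) is not
SELECT 'abc'::char(10)    AS padded,      -- 'abc       ' (7 trailing spaces)
       'abc'::varchar(10) AS not_padded;  -- 'abc'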

Related

PostgreSQL - Converting Binary data to Varchar

We are working towards migrating databases from MSSQL to PostgreSQL. During this process we came across a situation where a table contains a password field of NVARCHAR type whose value was converted from VARBINARY and stored as NVARCHAR.
For example: if I execute
SELECT HASHBYTES('SHA1','Password')
then it returns 0x8BE3C943B1609FFFBFC51AAD666D0A04ADF83C9D and in turn, if this value is converted into NVARCHAR, it returns text in the format "䏉悱゚얿괚浦Њ鴼".
As we know, PostgreSQL doesn't support VARBINARY, so we have used BYTEA instead, and it returns binary data. But when we try to convert this binary data into VARCHAR it returns a hex format.
For example: if the same statement is executed in PostgreSQL
SELECT ENCODE(DIGEST('Password','SHA1'),'hex')
then it returns
8be3c943b1609fffbfc51aad666d0a04adf83c9d.
When we try to convert this encoded text to VARCHAR it returns the same result, 8be3c943b1609fffbfc51aad666d0a04adf83c9d
Is it possible to get the same result that we retrieved from the MSSQL server? As these are password fields, we do not intend to change the values. Please suggest what needs to be done.
It sounds like you're taking a byte array containing a cryptographic hash and you want to convert it to a string to do a string comparison. This is a strange way to do hash comparisons but it might be possible depending on which encoding you were using on the MSSQL side.
If you have a byte array that can be converted to string in the encoding you're using (e.g. doesn't contain any invalid code points or sequences for that encoding) you can convert the byte array to string as follows:
SELECT CONVERT_FROM(DIGEST('Password','SHA1'), 'latin1') AS hash_string;
hash_string
-----------------------------
\u008BãÉC±`\u009Fÿ¿Å\x1A­fm+
\x04­ø<\u009D
If you're using Unicode this approach won't work at all, since arbitrary binary data can't be converted to Unicode because there are certain sequences that are always invalid. You'll get an error like the following:
# SELECT CONVERT_FROM(DIGEST('Password','SHA1'), 'utf-8');
ERROR: invalid byte sequence for encoding "UTF8": 0x8b
Here's a list of valid string encodings in PostgreSQL. Find out which encoding you're using on the MSSQL side and try to match it to PostgreSQL. If you can, I'd recommend changing your business logic to compare byte arrays directly, since this will be less error-prone and should be significantly faster.
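As a sketch of that byte-array comparison (the users table and password_hash column are hypothetical; assumes the hash is stored as BYTEA and the pgcrypto extension is installed):
-- Compare the stored hash directly as bytes instead of converting it to text
SELECT *
FROM users
WHERE password_hash = DIGEST('Password', 'SHA1');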
then it returns 0x8BE3C943B1609FFFBFC51AAD666D0A04ADF83C9D and in turn, if this value is converted into NVARCHAR, it returns text in the format "䏉悱゚얿괚浦Њ鴼"
Based on that, MSSQL interprets these bytes as text encoded in UTF-16LE.
With PostgreSQL and using only built-in functions, you cannot obtain that result because PostgreSQL doesn't use or support UTF-16 at all, for anything.
It also doesn't support nul bytes in strings, and there are nul bytes in UTF-16.
This Q/A: UTF16 hex to text suggests several solutions.
Changing your business logic not to depend on UTF-16 would be your best long-term option, though. The hexadecimal representation, for instance, is simpler and much more portable.
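For instance, a hedged sketch of that hexadecimal route, assuming the goal is simply to produce the same comparable string on both systems (CONVERT style 2 on the MSSQL side yields the hex digits without the 0x prefix):
-- SQL Server: lowercase hex string of the hash
SELECT LOWER(CONVERT(VARCHAR(40), HASHBYTES('SHA1', 'Password'), 2));
-- PostgreSQL (pgcrypto): the same lowercase hex string
SELECT ENCODE(DIGEST('Password', 'SHA1'), 'hex');
-- Both return 8be3c943b1609fffbfc51aad666d0a04adf83c9d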

replacing characters in a CLOB column (db2)

I have a CLOB(2000000) field in a DB2 (v10) database, and I would like to run a simple UPDATE query on it to replace each occurrence of "foo" with "baaz".
Since the contents of the field are more than 32k, I get the following error:
"{some char data from field}" is too long.. SQLCODE=-433, SQLSTATE=22001
How can I replace the values?
UPDATE:
The query was the following (changed UPDATE into SELECT for easier testing):
SELECT REPLACE(my_clob_column, 'foo', 'baaz') FROM my_table WHERE id = 10726
UPDATE 2
As mustaccio pointed out, REPLACE does not work on CLOB fields (or at least not without casting the data to VARCHAR - which in my case is not possible since the size of the data is more than 32k) - the question is about finding an alternative way to achieve the REPLACE functionality for CLOB fields.
Thanks,
krisy
Finally, since I have found no way to do this with an SQL query, I ended up exporting the table, editing its LOB content in Notepad++, and importing the table back again.
Not sure if this applies to your case: There are 2 different REPLACE functions offered by DB2, SYSIBM.REPLACE and SYSFUN.REPLACE. The version of REPLACE in SYSFUN accepts CLOBs and supports values up to 1 MByte. In case your values are longer than that, you would need to write your own (SQL-based?) function.
BTW: You can check function resolution by executing "values(current path)"
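A hedged sketch of what that could look like, assuming the values stay within SYSFUN.REPLACE's 1 MB limit (table and column names are taken from the question):
-- Schema-qualify the call so the SYSFUN version, which accepts CLOBs, is used
UPDATE my_table
SET my_clob_column = SYSFUN.REPLACE(my_clob_column, 'foo', 'baaz')
WHERE id = 10726;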

Scala Slick, Trouble with inserting Unicode into database

When I insert a new row of data in which several columns may contain Unicode, the columns that do contain Unicode are corrupted.
However, if I insert that data directly using the mysql-cli, Slick will retrieve that Unicode data fine.
Is there anything I should add to my table class to tell Slick that this Column may be a Unicode string?
I found the problem: I have to set the character encoding for the connection.
db.default.url="jdbc:mysql://localhost/your_db_name?characterEncoding=UTF-8"
You probably need to configure that on the db schema side by setting the right collation.
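If you go the schema route, a sketch of what that might look like in MySQL (the table name and the utf8mb4/utf8mb4_unicode_ci choice are assumptions; pick whatever matches your data):
-- Convert an existing table to a Unicode character set and collation
ALTER TABLE your_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;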

How to avoid trailing spaces in T-SQL

I store data in SQL Server and use nchar fields to store string values.
The problem is that after selecting an element from the database I get a very long string padded with trailing spaces.
Can I use another field type in the database to avoid this situation, or do I have to handle it in my ORM?
Rewrite the DB to store string values as nvarchar if it better represents your data.
If rewriting your DB is not possible or practical, then you have to trim the trailing spaces in your ORM.
Adding to the other answer: you can also deal with it in the SELECT using RTRIM.
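For example (table and column names are illustrative):
-- Strip the trailing blanks that the nchar padding adds, at query time
SELECT RTRIM(my_nchar_column) AS my_value
FROM my_table;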
One situation where char or nchar is useful is to prevent fragmentation: if a value starts out NULL but you are going to add a value later, the space is already reserved.

IBM DB2 9.7, any difference between inlined CLOB and VARCHAR?

Using IBM DB2 9.7, in a 32k tablespace, assume a 10000-byte (ten thousand bytes) long column fits nicely in the tablespace. Is there a difference between these two, and is one preferred over the other?
VARCHAR(10000)
CLOB(536870912) INLINE LENGTH 10000
Is either one preferred in terms of functionality and performance? At a quick glance the CLOB actually seems more versatile; all content shorter than 10000 bytes is stored in the tablespace, but if bigger content is required that is fine too, it is just stored elsewhere on disk.
There are a number of restrictions on the way CLOB can be used in a query:
Special restrictions apply to expressions resulting in a CLOB data type, and to structured type columns; such expressions and columns are not permitted in:
- A SELECT list preceded by the DISTINCT clause
- A GROUP BY clause
- An ORDER BY clause
- A subselect of a set operator other than UNION ALL
- A basic, quantified, BETWEEN, or IN predicate
- An aggregate function
- VARGRAPHIC, TRANSLATE, and datetime scalar functions
- The pattern operand in a LIKE predicate, or the search string operand in a POSSTR function
- The string representation of a datetime value
So if you need to do any of those things, VARCHAR is to be preferred.
I don't have a definitive answer about performance (unfortunately, information like this just doesn't seem to be in the documentation--or at least, it is not easily locatable). However, logically speaking, the DB has more work to do with a CLOB. It has to decide whether to return the CLOB directly in the result or not. That has to mean at least some overhead. Here is a good discussion of some of the issues, though it doesn't give a clear answer on performance, either.
My default position would be to use VARCHAR unless CLOB is really needed (data in the column can be bigger than the VARCHAR limit).
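As a sketch of the two definitions side by side (DB2 9.7 syntax; the table names are illustrative and a 32k page size is assumed):
-- Plain VARCHAR: the value always lives in the base table row
CREATE TABLE t_varchar (
    id  INTEGER NOT NULL,
    txt VARCHAR(10000)
);
-- Inlined CLOB: values up to 10000 bytes stay in the row, longer values
-- spill to separate LOB storage; the query restrictions quoted above
-- (e.g. ORDER BY txt) would still apply to this column
CREATE TABLE t_clob (
    id  INTEGER NOT NULL,
    txt CLOB(536870912) INLINE LENGTH 10000
);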