PostgreSQL ORDER BY returning two different orders - postgresql

I have one PostgreSQL server running on beta and one running locally. On both, I have a table called profile with a column named name of type character varying(255). I have checked that both databases contain the same values.
The weird part is that when I select from the profile table with order by name asc, I get different results. On my local db, the profile named (I)Contractor comes first; on beta, the profile named 3B comes first.
So it seems that on my local db the character ( sorts before digits, and on beta it is the other way around. Any idea how this is happening? Would the sort rules differ between versions of PostgreSQL?

The reason for this behavior is probably that the two servers run on different operating systems (e.g. GNU/Linux and MS Windows). The sort order differs because the collation is provided by the operating system. To get the same sort order on both, you can use collate:
select name from profile order by name collate "C"
See also Different behaviour in “order by” clause: Oracle vs. PostgreSQL.
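As a rough illustration of what collate "C" does with the two names from the question, here is a minimal Python sketch (not PostgreSQL itself). It assumes only that Python compares strings by code point, which for ASCII text gives the same order as the byte-wise comparison behind the "C" collation. The variable names are made up for the example.

```python
# The two profile names mentioned in the question.
names = ["3B", "(I)Contractor"]

# Code point of each name's first character: "(" is 40, "3" is 51.
first_char_codes = {name: ord(name[0]) for name in names}

# Python compares str values code point by code point, so for ASCII text
# this matches the byte-wise ordering of the "C" collation.
byte_order = sorted(names)

print(first_char_codes)
print(byte_order)
```

Because ( has a lower code than any digit, (I)Contractor sorts first under this rule on every platform, which is why forcing "C" makes the two servers agree.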

Related

Heroku Postgres ignores underscores when sorting

This is driving me bonkers. My Heroku Postgres (9.5.18) DB seems to be ignoring underscores when sorting results:
Query:
SELECT category FROM categories ORDER BY category ASC;
Results:
category
-------------------
z_commercial_overlay
z_district
zr_use_group
zr_uses_footnote
z_special_district
This is new to me. I've never noticed another system where underscores are not respected in sorting, and this is the first time I've noticed Postgres behaving like this.
On my local OSX box (Postgres 10.5) the results are sorted the 'normal' expected way:
category
-------------------
z_commercial_overlay
z_district
z_special_district
zr_use_group
zr_uses_footnote
UPDATE:
Based on the comments, I was able to get the correct sorting by using COLLATE "C"
SELECT category FROM categories ORDER BY category COLLATE "C" ASC;
But I don't understand why this is necessary. BOTH of the Postgres instances show the same default collation value, and all of the table columns were created the same way, with no alternate collation specified.
SHOW lc_collate;
lc_collate
-------------
en_US.UTF-8
SHOW lc_ctype;
lc_ctype
-------------
en_US.UTF-8
So why does the Heroku Postgres DB require the COLLATE declaration?
I've never encountered another system where underscores are not respected in sorting
Really? Never used one, or just never paid attention to one?
On Ubuntu 16.04 (and every other modern system I've paid attention to), the system sort tool behaves the same way as long as you are using en_US.
LC_ALL= LANG=en_US.UTF-8 sort
(produced the same order as the first one you show above)
On my local box (Postgres 10.5) the results are sorted the 'normal' expected way:
BOTH of the Postgres instances show the same collation value:
SHOW lc_collate;
lc_collate
en_US.UTF-8
That only shows the default collation for the database. The column could have been declared to use a different collation than the default:
create table categories(category text collate "C");
If your local database is supposed to be using en_US, and is not, then it is busted.
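The two orderings in the question can be reproduced outside the database. The Python sketch below is only an approximation: `ignore_punctuation_key` is a hypothetical helper that mimics a linguistic collation by dropping punctuation for the main comparison, and it is not how PostgreSQL or the operating system actually implements en_US.

```python
import string

# Category values from the question, in the order Heroku returned them.
categories = [
    "z_commercial_overlay",
    "z_district",
    "zr_use_group",
    "zr_uses_footnote",
    "z_special_district",
]


def ignore_punctuation_key(value):
    """Hypothetical stand-in for a linguistic collation.

    Punctuation is dropped for the main comparison; the original
    string is kept only as a tie-breaker.
    """
    stripped = "".join(ch for ch in value if ch not in string.punctuation)
    return (stripped.lower(), value)


# Code-point order, comparable to ORDER BY category COLLATE "C".
c_order = sorted(categories)

# Punctuation-insensitive order, similar to what the Heroku server returned.
linguistic_like = sorted(categories, key=ignore_punctuation_key)

print(c_order)
print(linguistic_like)
```

With the underscore ignored, z_special_district compares as zspecialdistrict and lands after the zr... entries, which matches the Heroku output; in code-point order the underscore (95) sorts before r (114), which matches the local output.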

PostgreSQL shows wrong sequence details in the table's information schema

Recently I saw a strange scenario with my PostgreSQL DB. The information schema of my database is showing a different sequence name than the one actually allocated for the column of my table.
The issue is:
I have a table tab_1
id name
1 emp1
2 emp2
3 emp3
Previously the id column (integer) of the table was an auto-generated field whose value was generated at run time via JPA. (Sequence name: tab_1_seq)
We then changed the id column to bigserial, so the sequence is now maintained at the column level (newly allocated sequence: tab_1_temp_seq) and is no longer handled by JPA.
After this change everything worked fine for a few months, and then we faced an error - "the sequence "tab_1_temp_seq" is not yet defined in this session"
On analyzing the issue I found that there is a mismatch between the sequences allocated to the table.
In the table structure we were shown the sequence tab_1_temp_seq, while in the information_schema the table was allocated the old sequence, tab_1_seq.
I am not sure what triggered this, as we do not manage our database system ourselves. If you have faced an issue like this, kindly let me know its root cause.
Queries:
SELECT table_name, column_name, column_default from information_schema.columns where table_name = 'tab_1';
result :
table_name column_name column_default
tab_1 id nextval('tab_1_seq::regclass')
Below are the details found in the table structure/properties:
id nextval('tab_1_temp_seq::regclass')
name varChar
Perhaps you are suffering from data corruption, but it is most likely that you are suffering from bad tools to visualize your database objects. Whatever program shows you the “table structure/properties” might be confused.
To find out the truth (which DEFAULT value PostgreSQL uses), run:
SELECT pg_get_expr(adbin, adrelid)
FROM pg_attrdef
WHERE adrelid = 'tab_1'::regclass;
This is also what information_schema.columns will show, but I added the naked query for clarity.
This DEFAULT value will be used whenever the INSERT statement either doesn't specify the id column or fills it with the special value DEFAULT.
Perhaps the confusion is also caused by different programs that set default values in their own way. The way I showed above is PostgreSQL's way, but nothing keeps a third-party tool from using its own sequence to fill the id.

Can we set up a table in postgres to always view the latest inserts first?

Right now when I create a table and do a
select * from table
I always see the first inserted rows first. I'd like to have my latest inserts displayed first. Is it possible to achieve this with minimal performance impact?
I believe that Postgres uses an internal field called OID that can be sorted by. Try the following.
select *,OID from table order by OID desc;
There are some limitations to this approach as described in SQL, Postgres OIDs, What are they and why are they useful?
Apparently the OID sequence "does" wrap if it exceeds 4B. So in essence it's a global counter that can wrap. If it does wrap, some slowdown may start occurring when it's used and "searched" for unique values, etc.
See also https://wiki.postgresql.org/wiki/FAQ#What_is_an_OID.3F
NB - this only works if the table was created with OIDs. Tables have not been created with OIDs by default since PostgreSQL 8.1 (see default_with_oids: https://www.postgresql.org/docs/8.4/static/runtime-config-compatible.html#GUC-DEFAULT-WITH-OIDS ), and support for WITH OIDS was removed entirely in PostgreSQL 12.
On versions before 12 you can still get an OID column by asking for it explicitly at table creation, as per https://www.postgresql.org/docs/9.5/static/sql-createtable.html
Although the behaviour you are observing in the CLI appears consistent, it is not guaranteed and cannot be depended on: without an ORDER BY, rows may come back in any order. If you regularly need to see the most recently added rows of a specific table, you could add a timestamp field or some other sortable field, and perhaps even wrap the query in a stored function. I guess the approach depends on your particular use case.
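The "add a sortable field" idea can be sketched as follows. This is a minimal, runnable illustration that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so the column types differ from what you would write on a real server. The table name `events` and the columns `payload` and `created_at` are hypothetical; in PostgreSQL the equivalent columns would typically be a bigserial (or identity) id and a timestamptz column defaulting to now().

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL,"
    " created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
)

# Insert three rows in a known order; id and created_at are filled
# in by their defaults.
for payload in ("first", "second", "third"):
    conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))

# An explicit ORDER BY is the only reliable way to get newest rows first.
# id breaks ties when several rows share the same timestamp.
rows = conn.execute(
    "SELECT payload FROM events ORDER BY created_at DESC, id DESC"
).fetchall()
latest_first = [row[0] for row in rows]

print(latest_first)
conn.close()
```

On a real table, an index on the sort column keeps the "latest first" query cheap, since the database can read the newest entries straight from the index instead of sorting the whole table.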

The type of column conflicts with the type of other columns specified in the UNPIVOT list

In SQL Server 2005, I built a trigger that contains a SQL statement that unpivots some data, somewhat similar to the following simple example: http://sqlfiddle.com/#!3/cdc1b/1/0. Let's say that the table the trigger is built on is "table1" and it's set to run after updates.
Within SSMS, whenever I update "table1" everything works fine. Unfortunately, whenever I update "table1" in a proprietary application (which I don't have the source code to), it fails with the message "The type of column conflicts with the type of other columns specified in the UNPIVOT list".
After doing a bit of searching I added COLLATE DATABASE_DEFAULT to my casts in the view, without any luck. It was a bit of a long shot because the collations all matched whenever I queried INFORMATION_SCHEMA.COLUMNS.
I then changed the casts from VARCHAR to CHAR and it worked without issue. For obvious reasons, I'd rather use VARCHAR. What is different between an SSMS connection and an application connection? I assume the application isn't using a connection property that SSMS uses.
PS: The database is a bit funky because it does not use NULLs and uses CHAR instead of VARCHAR.

Records not matching without LTRIM, RTRIM and Upper/Lower function

I have been facing this issue for a long time. I have two tables in different databases with the same columns and exactly the same data types. But when I join them or run any other matching query, I get only a few results. I noticed that when I use
LTRIM(RTRIM(UPPER(SourceTable.Column))) =
LTRIM(RTRIM(UPPER(DestinationTable.Column)))
it works fine. Surprisingly, I have seen the same issue on bit and integer columns, and they also work fine when I keep LTRIM, RTRIM and UPPER/LOWER.
Below are the collation of the two databases:
Source: SQL_Latin1_General_CP1_CI_AS
Destination: SQL_Latin1_General_CP1_CI_AS
As you can see, they have the same collation, yet I still get this issue. Is there a permanent solution to this?
If the datatypes are exactly the same, it could be that the columns themselves have a different collation: a collation can be specified at the column level and need not match the database default. First port of call for me would be to check that.
MSDN resource, quote:
Column-level collations
When you create or alter a table, you can specify collations for each
character-string column by using the COLLATE clause. If no collation is
specified, the column is assigned the default collation of the database.