Need to use COLLATION in a SELECT DISTINCT

Need to use COLLATION in a SELECT DISTINCT - tsql

I am trying to apply a collation in a SELECT DISTINCT statement. Does anyone know how to do this?
One would think that DISTINCT would treat upper and lower case as different, e.g. 'Yes' and 'YES'.
But DISTINCT does not appear to be case sensitive, so I believe I need to add COLLATE...
SELECT DISTINCT COLLATE Latin1_General_CS_AS Shrt_Text AS Sht_text
FROM tblMatStrings
Any idea on how to distinguish upper and lower in a SELECT DISTINCT?

You've just got your syntax a bit backwards. COLLATE goes after the expression it applies to:
SELECT DISTINCT
Shrt_Text COLLATE Latin1_General_CS_AS As Shrt_text
FROM tblMatStrings
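
A minimal T-SQL sketch of the difference. It assumes the column's own collation is case-insensitive, which the question does not state outright. The table variable and its sample rows are hypothetical stand-ins for tblMatStrings.

```sql
-- Hypothetical sample data; @tblMatStrings stands in for the real table.
-- The column is declared case-insensitive (CI) to mimic the behaviour described in the question.
DECLARE @tblMatStrings TABLE (Shrt_Text varchar(20) COLLATE Latin1_General_CI_AS);

INSERT INTO @tblMatStrings (Shrt_Text)
VALUES ('Yes'), ('YES'), ('yes'), ('No');

-- Uses the column's CI collation: the three spellings of "yes" collapse into one row (2 rows total).
SELECT DISTINCT Shrt_Text
FROM @tblMatStrings;

-- COLLATE follows the expression, so DISTINCT compares case-sensitively (4 rows total).
SELECT DISTINCT Shrt_Text COLLATE Latin1_General_CS_AS AS Shrt_Text
FROM @tblMatStrings;
```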

Related

Postgresql subqueries using a calculated column

I am new to this platform and need to get a value using a column I already calculated. I know I need a subquery, but am confused by the proper syntax.
SELECT well_id, reported_date, oil,
(EXTRACT(EPOCH FROM age(reported_date,
LAG(reported_date) OVER w))/3600)::int as hourly_rate,
(oil/hourly_rate)::double precision as six
FROM public.production
WINDOW w AS (PARTITION BY well_id ORDER BY well_id, reported_date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
The error I am getting is
ERROR: column "hourly_rate" does not exist
LINE 4: (oil/hourly_rate)::double precision as six
^
HINT: Perhaps you meant to reference the column "production.hour_rate".
SQL state: 42703
Character: 171
I understand the error, and I have tried brackets, naming the subqueries and other tactics. I know this is a syntax issue. Can someone please give me a hand? Thank you.

I'm a bit confused by your notation, but it looks like there are parenthesis issues: your FROM clause is not linked to the SELECT.
In my opinion, the best way to manage subqueries is to write something like this:
WITH query1 AS (
select col1, col2
from table1
),
query2 as (
select col1, col2
from query1
(additional clauses)
)
select (what you want)
from query2
(additional statements)
Then you can manipulate your data progressively until it has the right organisation for the final select, including aggregations.

You cannot reference a column alias elsewhere in the same SELECT list. You need to repeat the original calculation in the column. So your updated query would look like this:
SELECT well_id, reported_date, oil,
(EXTRACT(EPOCH FROM age(reported_date, LAG(reported_date) OVER w))/3600)::int as hourly_rate,
(Oil/(EXTRACT(EPOCH FROM age(reported_date, LAG(reported_date) OVER w))/3600))::double precision as six
FROM public.production
WINDOW w AS (PARTITION BY well_id ORDER BY well_id, reported_date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
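
As an alternative to repeating the calculation, the WITH pattern from the earlier answer can be applied to this query. This is a sketch rather than a tested solution: the CTE name rates is made up, and the NULLIF guard against division by zero is an addition that the original query does not have.

```sql
-- "rates" is a hypothetical CTE name; table and column names come from the question.
WITH rates AS (
    SELECT well_id, reported_date, oil,
           (EXTRACT(EPOCH FROM age(reported_date,
                LAG(reported_date) OVER w)) / 3600)::int AS hourly_rate
    FROM public.production
    WINDOW w AS (PARTITION BY well_id ORDER BY well_id, reported_date
                 ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
)
-- The alias is an ordinary column of "rates" here, so the outer SELECT may use it.
SELECT well_id, reported_date, oil, hourly_rate,
       -- NULLIF is an added assumption: it yields NULL instead of a division-by-zero error.
       (oil / NULLIF(hourly_rate, 0))::double precision AS six
FROM rates;
```

Unlike the query above, this divides by the rounded integer hourly_rate, which is what the question's original expression asked for. The first row of each well has no previous reported_date, so both hourly_rate and six come back as NULL there.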

Firebird sort order with other character set

The sorting in this query does not take symbols into account, only letters:
SELECT CAST(Text AS VARCHAR(20) CHARACTER SET ISO8859_1) COLLATE NO_NO Result FROM (
select CAST('_Anon' AS VARCHAR(20)) COLLATE UNICODE_CI_AI as Text from RDB$DATABASE
UNION
SELECT CAST('Abba' AS VARCHAR(20)) COLLATE UNICODE_CI_AI AS Text from RDB$DATABASE
UNION
SELECT CAST('Beatles' AS VARCHAR(20)) COLLATE UNICODE_CI_AI AS Text from RDB$DATABASE)
ORDER BY Result
Expected sort order (non-alphanumeric characters before any letter):
_Anon
Abba
Beatles
But I get:
Abba
_Anon
Beatles
The collation does not matter. If you delete "COLLATE NO_NO" it still sorts incorrectly.
Edit: I found that collation ES_ES sorts this correctly, but it fails to sort Norwegian characters.
Is this a bug or am I missing something in this query?
What I'm trying to do is get the correct sort order in Norwegian, and none of the collations in UNICODE_CI_AI gives me the correct order.
Update: Expanded the example with another sub-query so that it shows the point more clearly.

Mark's hint to look at the collation pointed me in the direction of a solution.
I do consider this a bug, so I was going to file a bug report with firebirdsql, but found out it's a "Won't fix" and the workaround below is the official fix.
Of all the base collations defined, ES_* is the only one with the attribute SPECIALS-FIRST=1 set. In fact, it's the only collation with any attribute set.
That attribute specifies that special characters should be sorted before alphanumeric characters.
So the workaround is to create a new collation based on the NO_NO collation:
CREATE COLLATION NO_NO_NOPAD_CI_SF
FOR ISO8859_1
FROM NO_NO
NO PAD
CASE INSENSITIVE
'SPECIALS-FIRST=1';
then using the new collation like this:
SELECT CAST(Text AS VARCHAR(20) CHARACTER SET ISO8859_1) COLLATE NO_NO_NOPAD_CI_SF Result FROM (
select CAST('_Anon' AS VARCHAR(20)) COLLATE UNICODE_CI_AI as Text from RDB$DATABASE
UNION
SELECT CAST('Abba' AS VARCHAR(20)) COLLATE UNICODE_CI_AI AS Text from RDB$DATABASE
UNION
SELECT CAST('Beatles' AS VARCHAR(20)) COLLATE UNICODE_CI_AI AS Text from RDB$DATABASE)
ORDER BY Result
Yields the expected result:
_Anon
Abba
Beatles

Postgres: Find number of distinct values for each column

I am trying to find the number of distinct values in each column of a table. Declaratively that is:
for each column of table xyz
run_query("SELECT COUNT(DISTINCT column) FROM xyz")
Finding the column names of a table is shown here.
SELECT column_name
FROM information_schema.columns
WHERE table_name='xyz'
However, I can't manage to merge the count query into it. I tried various queries. This one:
SELECT column_name, thecount
FROM information_schema.columns,
(SELECT COUNT(DISTINCT column_name) FROM myTable) AS thecount
WHERE table_name=myTable
is syntactically not allowed (reference to column_name in the nested query not allowed).
This one seems erroneous too (timeout):
SELECT column_name, count(distinct column_name)
FROM information_schema.columns, myTable
WHERE table_name=myTable
What is the right way to get the number of distinct values for each column of a table with one query?
The article "SQL to find the number of distinct values in a column" covers a fixed column only.

In general, SQL expects the names of items (fields, tables, roles, indices, constraints, etc) in a statement to be constant. That many database systems let you examine the structure through something like information_schema does not mean you can plug that data into the running statement.
You can however use the information_schema to construct new SQL statements that you execute separately.
First consider your original problem.
CREATE TABLE foo (a numeric, b numeric, c numeric);
INSERT INTO foo(a,b,c)
VALUES (1,1,1), (1,1,2), (1,1,3), (1,2,1), (1,2,2);
SELECT COUNT(DISTINCT a) "distinct a",
COUNT(DISTINCT b) "distinct b",
COUNT(DISTINCT c) "distinct c"
FROM foo;
If you know the name of all of your columns when you are writing the query, that is sufficient.
If you are seeking data for an arbitrary table, you need to construct the SQL statement via SQL (I've added plenty of whitespace so you can see the different levels involved):
SELECT 'SELECT ' || STRING_AGG( 'COUNT (DISTINCT '
|| column_name
|| ') "'
|| column_name
|| '"',
',')
|| ' FROM foo;'
FROM information_schema.columns
WHERE table_name='foo';
That, however, is just the text of the necessary SQL statement. Depending on how you are accessing PostgreSQL, it might be easy for you to feed that into a new query. If you are keeping everything inside PostgreSQL, you will have to resort to one of the integrated procedural languages. An excellent (though complex) discussion of the issues may provide guidance.
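
A minimal PL/pgSQL sketch of the "integrated procedural language" route, using the foo table created above. It assumes foo lives in the public schema. The variable names are hypothetical, and the counts are reported as NOTICE messages rather than returned as a result set.

```sql
DO $$
DECLARE
    tbl text := 'foo';   -- table to inspect (the example table from this answer)
    col text;            -- current column name
    n   bigint;          -- distinct count for that column
BEGIN
    FOR col IN
        SELECT column_name
        FROM information_schema.columns
        WHERE table_schema = 'public'   -- assumption: foo is in the public schema
          AND table_name = tbl
        ORDER BY ordinal_position
    LOOP
        -- format() with %I quotes identifiers safely before the statement is executed.
        EXECUTE format('SELECT COUNT(DISTINCT %I) FROM %I.%I', col, 'public', tbl)
        INTO n;
        RAISE NOTICE '%: % distinct value(s)', col, n;
    END LOOP;
END
$$;
```

With the five rows inserted above, this should report 1 distinct value for a, 2 for b and 3 for c.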

What's the difference between SQL_Latin1_General_CP1_CI_AS and SQL_Latin1_General_CP1_CI_AI

I get this error when I run an update query in Microsoft SQL Server:
Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "SQL_Latin1_General_CP1_CI_AI" in the equal to operation.
The query uses only two tables: the table it's updating and a temp table that it inner joins to. I have not specified the collation of either table, and both are in the same database, which means they should share the database default collation, right?
Looking at the collations, the only difference is the last character. All I understand of the last part is that CI stands for Case Insensitive. If I were to take a stab in the dark, I would guess AI stands for Auto Increment, but I have no idea what AS stands for.

AI stands for accent insensitive and AS stands for accent sensitive (i.e. it determines whether cafe = café).
You can use the collate keyword to convert one (or both) of the values' collations.
See link for more info: http://msdn.microsoft.com/en-us/library/aa258237(v=sql.80).aspx
Example: DBFiddle
--set up a couple of tables and populate them with the same words, varying only whether the accents are included
create table SomeWords (Word nvarchar(32) not null)
create table OtherWords (Word nvarchar(32) not null)
insert SomeWords (Word) values ('café'), ('store'), ('fiancé'), ('ampère'), ('cafétería'), ('fête'), ('jalapeño'), ('über'), ('zloty'), ('Zürich')
insert OtherWords (Word) values ('cafe'), ('store'), ('fiance'), ('ampere'), ('cafétería'), ('fete'), ('jalapeno'), ('uber'), ('zloty'), ('Zurich')
--now run a join between the two tables, showing what comes back when we use AS vs AI.
--NB: Since this could be run on a database of any collation I've used COLLATE on both sides of the equality operator
select sw.Word MainWord
, ow1.Word MatchAS
, ow2.Word MatchAI
from SomeWords sw
left outer join OtherWords ow1 on ow1.Word collate SQL_Latin1_General_CP1_CI_AS = sw.Word collate SQL_Latin1_General_CP1_CI_AS
left outer join OtherWords ow2 on ow2.Word collate SQL_Latin1_General_CP1_CI_AI = sw.Word collate SQL_Latin1_General_CP1_CI_AI
Example output (NULL marks no match from the left outer join):
MainWord    MatchAS     MatchAI
café        NULL        cafe
store       store       store
fiancé      NULL        fiance
ampère      NULL        ampere
cafétería   cafétería   cafétería
fête        NULL        fete
jalapeño    NULL        jalapeno
über        NULL        uber
zloty       zloty       zloty
Zürich      NULL        Zurich
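
A sketch of how COLLATE can resolve the conflict in an update that joins to a temp table, as in the question. The table and column names are hypothetical, and the two collations are declared explicitly so that the conflict reproduces on any server. In the question's case, a likely cause (not confirmed there) is that temp table columns take the default collation of tempdb rather than that of the current database.

```sql
-- Hypothetical tables: the two KeyText columns deliberately use different collations.
CREATE TABLE #Target  (KeyText varchar(32) COLLATE SQL_Latin1_General_CP1_CI_AS, Amount int);
CREATE TABLE #Staging (KeyText varchar(32) COLLATE SQL_Latin1_General_CP1_CI_AI, Amount int);

INSERT INTO #Target  VALUES ('cafe', 0);
INSERT INTO #Staging VALUES ('cafe', 5);

-- Without the COLLATE clause this join raises the "Cannot resolve the collation conflict" error.
-- Putting an explicit collation on one side tells SQL Server which rules to compare with.
UPDATE t
SET    t.Amount = s.Amount
FROM   #Target AS t
INNER JOIN #Staging AS s
        ON s.KeyText COLLATE SQL_Latin1_General_CP1_CI_AS = t.KeyText;

DROP TABLE #Target;
DROP TABLE #Staging;
```

If the goal is simply to compare with the current database's rules, COLLATE DATABASE_DEFAULT on both sides of the comparison is another option.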

Eliminating Duplicate rows in Postgres

I want to remove the duplicate rows returned by a SELECT query in Postgres.
I have the following query:
SELECT DISTINCT name FROM names ORDER BY name
But this somehow does not eliminate the duplicate rows.

PostgreSQL is case sensitive, which might be the problem here.
DISTINCT ON can be used for a case-insensitive search (tested on 7.4):
SELECT DISTINCT ON (upper(name)) name FROM names ORDER BY upper(name);

Maybe something with same-looking-but-different characters (like LATIN 'a'/CYRILLIC 'а')

Don't forget to add a trim() on that too, or else 'Record' and 'Record ' will be treated as separate entities. That ended up hurting me at first, and I had to update my query to:
SELECT DISTINCT ON (upper(trim(name))) name FROM names ORDER BY upper(trim(name));
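
A small self-contained sketch of the effect, using hypothetical sample rows in a temporary names table. The extra name at the end of the ORDER BY is an addition that makes the choice of surviving spelling deterministic.

```sql
-- Hypothetical sample data: three spellings of one value plus an unrelated row.
CREATE TEMP TABLE names (name text);
INSERT INTO names VALUES ('Record'), ('record'), ('Record '), ('Other');

-- Plain DISTINCT compares the raw strings, so all 4 rows come back.
SELECT DISTINCT name FROM names ORDER BY name;

-- DISTINCT ON keeps one row per normalised value, so 2 rows come back.
-- The leading ORDER BY expression must match the DISTINCT ON expression.
SELECT DISTINCT ON (upper(trim(name))) name
FROM names
ORDER BY upper(trim(name)), name;
```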

In Postgres 9.2 and greater you can cast the column to the citext type (provided by the citext extension), or even declare the column as citext so you don't have to cast in the SELECT.
SELECT DISTINCT name::citext FROM names ORDER BY name::citext
