postgres, substring a subselect

I have a query that uses a subselect like this
SELECT "columnA","columnB", (SELECT column1 FROM tableB WHERE id=1 LIMIT 1) as text
FROM tableA WHERE id=1
Now I would like to get only the last 3 characters of my "text" column. I have tried wrapping the subselect in substring or right, but that returns an error. Can anyone explain why, and how to do this properly?

You can use the built-in substring function with a POSIX regular expression inside the subselect:
SELECT "columnA","columnB", (SELECT substring(column1::TEXT from '...$') FROM tableB WHERE id=1 LIMIT 1) as text
FROM tableA WHERE id=1
Please keep in mind that if more than one row in tableA matches your WHERE criteria, every row will still get the same value in the text column, because the subselect does not depend on tableA.
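
As for why wrapping the subselect fails: one likely cause (an assumption, since the question does not show the error) is that a scalar subquery used as a function argument needs its own pair of parentheses in addition to those of the function call. A minimal sketch, with made-up data in a CTE standing in for tableB:

```sql
-- Hypothetical sample row; the CTE "tableb" stands in for the asker's tableB.
WITH tableb(id, column1) AS (
    VALUES (1, 'abcdef'::text)
)
SELECT
    -- Note the doubled parentheses: one pair for right(), one for the subquery.
    right((SELECT column1 FROM tableb WHERE id = 1 LIMIT 1), 3) AS last3_right,
    -- The same wrapping works for the regex form of substring().
    substring((SELECT column1 FROM tableb WHERE id = 1 LIMIT 1) from '...$') AS last3_regex;
-- last3_right | last3_regex
-- def         | def
```

The two forms differ on short input: right() returns the whole string when it has fewer than three characters, whereas the '...$' pattern needs at least three characters to match and yields NULL otherwise.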

Related

How to do a Select * followed by a join SEA-ORM

I want to do a join with another table. I followed the tutorial on the site and my code compiles, but it's not performing the join and instead just selects from the first table.
SELECT
"table1"."col1",
"table1"."col2",
"table1"."col3"
FROM
"table1"
JOIN "table2" ON "table1"."col1" = "table2"."col1"
LIMIT
1
It is only returning the data from table1 and not concatenating the columns where the condition for table1 and table2 is met.
I execute the query using the following code:
Entity::find()
.from_raw_sql(Statement::from_string(DatabaseBackend::Postgres, query.to_owned()))
.all(&self.connection)
.await?
That returns a Vec<Model>. Is this the correct way? Also, how can I build a SQL statement using an Entity as the base which looks like SELECT * from "table1".

After 'SELECT' (and before 'FROM') you are specifying which columns
to include in the output,
and you are selecting only three columns from table1 in your code.
Add the columns you want to include from table2 here, and you may get
the results you want.
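
To illustrate the point, a minimal sketch: the join only becomes visible in the output once a column from table2 appears in the SELECT list. The column col4 and the sample rows are hypothetical, since the question does not show table2's schema.

```sql
-- Hypothetical sample data; col4 is an assumed column of table2.
WITH table1(col1, col2, col3) AS (
    VALUES (1, 'a'::text, 'b'::text)
),
table2(col1, col4) AS (
    VALUES (1, 'from table2'::text)
)
SELECT
    table1.col1,
    table1.col2,
    table1.col3,
    table2.col4          -- without this line only table1's columns are returned
FROM table1
JOIN table2 ON table1.col1 = table2.col1
LIMIT 1;
-- col1 | col2 | col3 | col4
-- 1    | a    | b    | from table2
```

On the Rust side, Entity::find() maps each row to table1's Model, so the extra column would likely need a custom result type to be read back; that part is not covered here.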

Equivalent of FIRST in Postgresql

Edit: The answer is to use MIN; it works on both strings and numbers. Credit to @cadet below.
Original question:
I've been reading through similar questions around this for the last half an hour and cannot understand the responses so let me try to get a simple easy to follow answer.
What is the PostgreSQL equivalent of this code, which I would write if I were using SQL Server, to bring back the first value in field2 when aggregating:
Select field1, first(field2) from table group by field1
I have read that DISTINCT ON is the right thing to use. In that case, would it be:
Select field1, DISTINCT ON(field2) from table group by field1
I ask because that gives me a syntax error.
Edit:
Here is the error stating that the FIRST function does not exist in PostgreSQL:
ERROR: function first(asset32type) does not exist
LINE 1: Select policy, first (name) from multi_asset group by policy...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
SQL state: 42883
Character: 16
And in case it isn't already clear when I say that in SQL Server the first() function brings back the first value in field2 when aggregating, I mean if you had data like this:
field1 field2
Tom 32
Tom 53
Then select field1, first(field2) group by field1 would give you back:
Tom, 32 - i.e. it picks the first value from field2

Maybe this one, using DISTINCT ON():
SELECT DISTINCT ON (field1)
field1
, field2
FROM table
ORDER BY
field1
, field2;
But without any data or any example, it's just a wild guess.
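
As a sketch of how this behaves on the two sample rows from the question (the CTE name t is made up for the example): DISTINCT ON keeps the first row of each field1 group, and "first" is decided by the ORDER BY.

```sql
-- Sample rows taken from the question; "t" is a hypothetical stand-in table.
WITH t(field1, field2) AS (
    VALUES ('Tom'::text, 32), ('Tom'::text, 53)
)
SELECT DISTINCT ON (field1)
    field1,
    field2
FROM t
ORDER BY
    field1,      -- must start with the DISTINCT ON expression
    field2;      -- decides which row counts as "first" within each group
-- field1 | field2
-- Tom    | 32
```

Without a column to order by, a table has no inherent "first" row, so some ordering criterion has to be chosen.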

If "first" depends on a specific order:
select distinct field1,
first_value(field2)
over (partition by field1 order by field2) from
(
values (1,10),(1,11),(1,12),(2,23),(2,24)
) as a(field1,field2)
If "first" simply means the minimum or maximum:
select field1,
min(field2)
from
(
values (1,10),(1,11),(1,12),(2,23),(2,24)
) as a(field1,field2)
group by field1

Is a subquery able to select columns from outer query? [duplicate]

I have the following select:
SELECT DISTINCT pl
FROM [dbo].[VendorPriceList] h
WHERE PartNumber IN (SELECT DISTINCT PartNumber
FROM [dbo].InvoiceData
WHERE amount > 10
AND invoiceDate > DATEADD(yyyy, -1, CURRENT_TIMESTAMP)
UNION
SELECT DISTINCT PartNumber
FROM [dbo].VendorDeals)
The issue here is that the table [dbo].VendorDeals has NO column PartNumber, however no error is detected and the query works with the first part of the union.
What is more, IntelliSense also accepts and recognizes PartNumber. The mistake only goes undetected when the SELECT sits inside a more complex statement.
It is pretty obvious that if you qualify column names, the mistake will be evident.

This isn't a bug in SQL Server/the T-SQL dialect parsing, no, this is working exactly as intended. The problem, or bug, is in your T-SQL; specifically because you haven't qualified your columns. As I don't have the definition of your table, I'm going to provide sample DDL first:
CREATE TABLE dbo.Table1 (MyColumn varchar(10), OtherColumn int);
CREATE TABLE dbo.Table2 (YourColumn varchar(10), OtherColumn int);
And then an example that is similar to your query:
SELECT MyColumn
FROM dbo.Table1
WHERE MyColumn IN (SELECT MyColumn FROM dbo.Table2);
This, firstly, will parse; it is a valid query. Secondly, provided that dbo.Table2 contains at least one row, then every row from table dbo.Table1 will be returned where MyColumn has a non-NULL value. Why? Well, let's qualify the column with table's name as SQL Server would parse them:
SELECT Table1.MyColumn
FROM dbo.Table1
WHERE Table1.MyColumn IN (SELECT Table1.MyColumn FROM dbo.Table2);
Notice that the column inside the IN is also referencing Table1, not Table2. By default, if a column in a subquery is not qualified with a table name or alias, it is assumed to reference the table(s) defined in that subquery. If, however, none of the tables in the subquery has a column by that name, it is resolved against the tables of the outer query instead; in this case Table1.
Let's, instead, take a different example, using the other column in the tables:
SELECT OtherColumn
FROM dbo.Table1
WHERE OtherColumn IN (SELECT OtherColumn FROM dbo.Table2);
This would be parsed as the following:
SELECT Table1.OtherColumn
FROM dbo.Table1
WHERE Table1.OtherColumn IN (SELECT Table2.OtherColumn FROM dbo.Table2);
This is because OtherColumn exists in both tables. As, in the subquery, OtherColumn isn't qualified it is assumed the column wanted is the one in the table defined in the same scope, Table2.
So what is the solution? Alias and qualify your columns:
SELECT T1.MyColumn
FROM dbo.Table1 T1
WHERE T1.MyColumn IN (SELECT T2.MyColumn FROM dbo.Table2 T2);
This will, unsurprisingly, error as Table2 has no column MyColumn.
Personally, I suggest that unless you have only one table being referenced in a query, you alias and qualify all your columns. This not only ensures that the wrong column can't be referenced (such as in a subquery) but also means that other readers know exactly what columns are being referenced. It also stops failures in the future. I have honestly lost count of how many times over the years I have had a process fall over with the "ambiguous column" error because a table's definition was changed and a query referencing that table hadn't been properly qualified by the developer.

Full outer join with different WHERE clauses in Knex.js for PostgreSQL

I am trying to get a single row with two columns showing aggregation results: one column should show the total sum based on one WHERE clause, while the other column should show the total sum based on a different WHERE clause.
Desired output:
amount_vic amount_qld
100 70
In raw PostgreSQL I could write something like this:
select
sum(a.amount) as amount_vic,
sum(b.amount) as amount_qld
from mytable a
full outer join mytable b on 1=1
where a.state='vic' and b.state= 'qld'
Question: How do I write this, or a similar query that returns the desired outcome, in knex.js? For example, the 'on 1=1' probably needs knex.raw(), and I cannot get the table and column aliases to work; every attempt returns errors.
One of my non-working attempts in knex.js:
knex
.sum({ amount_vic: 'a.amount' })
.sum({ amount_qld: 'b.amount' })
.from('mytable')
.as('a')
.raw('full outer join mytable on 1=1')
.as('b')
.where({
a.state: 'vic',
b.state: 'qld'
})
Thank you for your help.

Disclaimer: this does not answer the Knex part of the question - but it is too long for a comment.
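Although your current query may appear to do what you want, the way it is phrased is suboptimal. There is no need to generate a self-cartesian product here, which is what full join ... on 1 = 1 does. The product also pairs every 'vic' row with every 'qld' row, so each sum gets multiplied by the number of rows of the other state as soon as that state has more than one row. You can just use conditional aggregation.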
In Postgres, you would phrase this as:
select
sum(amount) filter(where state = 'vic') amount_vic,
sum(amount) filter(where state = 'qld') amount_qld
from mytable
where state in ('vic', 'qld')
I don't know Knex so I cannot tell how to translate the query to it. Maybe this query is easier for you to translate.
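
A small sketch comparing the two approaches on made-up rows (the amounts are hypothetical, chosen so the totals match the desired output). The self-join is written here as CROSS JOIN, which is equivalent to the question's join once the WHERE clause filters both sides.

```sql
-- Hypothetical sample data: two 'vic' rows, one 'qld' row, one unrelated row.
WITH mytable(state, amount) AS (
    VALUES ('vic'::text, 60), ('vic'::text, 40), ('qld'::text, 70), ('nsw'::text, 5)
)
-- Self-join: every 'vic' row is paired with every 'qld' row before summing.
SELECT 'self-join'::text AS approach,
       sum(a.amount) AS amount_vic,
       sum(b.amount) AS amount_qld
FROM mytable a
CROSS JOIN mytable b
WHERE a.state = 'vic' AND b.state = 'qld'
UNION ALL
-- Conditional aggregation: each row is read once and counted in one sum.
SELECT 'filter'::text,
       sum(amount) FILTER (WHERE state = 'vic'),
       sum(amount) FILTER (WHERE state = 'qld')
FROM mytable
WHERE state IN ('vic', 'qld');
-- approach  | amount_vic | amount_qld
-- self-join | 100        | 140
-- filter    | 100        | 70
```

The 'qld' total doubles in the self-join because the single 'qld' row is repeated once for each of the two 'vic' rows.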

Getting group by attribute in nested query

I am trying to find the most frequent value in a postgresql table. The problem is that I also want to "group by" in that table and only get the most frequent from the values that have the same name.
So I have the following query:
select name,
(SELECT value FROM table where name=name GROUP BY value ORDER BY COUNT(*) DESC limit 1)
as mfq from table group by name;
So, I am using where name=name, trying to get the outside group by attribute "name", but it doesn't seem to work. Any ideas on how to do it?
Edit: for example in the following table:
name value
a 3
a 3
a 3
b 2
b 2
I want to get:
name value
a 3
b 2
but the above statement gives:
name value
a 3
b 3
instead, since where doesn't work correctly.

There is a dedicated function in PostgreSQL for this case: the mode() ordered-set aggregate:
select name, mode() within group (order by value) mode_value
from table
group by name;
which returns the most frequent input value (arbitrarily choosing the first one if there are multiple equally-frequent results) -- which is the same behavior as with your order by count(*) desc limit 1.
It is available from PostgreSQL 9.4+.
http://rextester.com/GHGJH15037
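
For reference, a self-contained sketch of mode() on the sample rows from the question. The CTE name tbl is a stand-in, because table itself is a reserved word and cannot be used unquoted.

```sql
-- Sample rows from the question; "tbl" is a hypothetical table name.
WITH tbl(name, value) AS (
    VALUES ('a'::text, 3), ('a'::text, 3), ('a'::text, 3), ('b'::text, 2), ('b'::text, 2)
)
SELECT name,
       mode() WITHIN GROUP (ORDER BY value) AS mode_value
FROM tbl
GROUP BY name
ORDER BY name;
-- name | mode_value
-- a    | 3
-- b    | 2
```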

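If you want your query to work, you need table aliases. In your subquery, both sides of name=name resolve to the inner table, so the condition is true for every row and the subquery returns the most frequent value of the whole table. Table aliases and qualified column names are always a good idea: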
select t.name,
(select t2.value
from table t2
where t2.name = t.name
group by t2.value
order by COUNT(*) desc
limit 1
) as mfq
from table t
group by t.name;
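
The difference can be checked side by side with a small sketch on the sample rows from the question. The table name tbl and the output column names are made up for the example; the first subquery spells out what an unqualified name = name resolves to.

```sql
-- Sample rows from the question; "tbl" is a hypothetical table name.
WITH tbl(name, value) AS (
    VALUES ('a'::text, 3), ('a'::text, 3), ('a'::text, 3), ('b'::text, 2), ('b'::text, 2)
)
SELECT t.name,
       (SELECT t2.value
        FROM tbl t2
        WHERE t2.name = t2.name      -- inner column compared with itself: always true
        GROUP BY t2.value
        ORDER BY count(*) DESC
        LIMIT 1) AS unqualified_result,
       (SELECT t2.value
        FROM tbl t2
        WHERE t2.name = t.name       -- correlated with the outer row
        GROUP BY t2.value
        ORDER BY count(*) DESC
        LIMIT 1) AS qualified_result
FROM tbl t
GROUP BY t.name
ORDER BY t.name;
-- name | unqualified_result | qualified_result
-- a    | 3                  | 3
-- b    | 3                  | 2
```

The unqualified_result column reproduces the wrong output reported in the question, while qualified_result gives the expected per-name value.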