PostgreSQL to BigQuery - LEFT JOIN on X and Y

I have a table with a column (value) that holds different types of information, which I need to split into separate columns. In PostgreSQL, I can easily do this:
SELECT m1.value AS shipname,
       m2.value AS agent
FROM maritimeDB m1
JOIN maritimeDB m2
  ON m1.rowID = m2.rowID
 AND m2.itemname = 'Agent'
WHERE m1.rowID IN (SELECT DISTINCT rowID FROM maritimeDB WHERE entity = '9999')
  AND m1.itemname = 'shipname'
I want to do this same sort of query in BigQuery (with JOIN becoming LEFT JOIN), but I get this error:
Error: ON clause must be AND of = comparisons of one field name from each table, with all field names prefixed with table name.
Any suggestions?

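This error comes from BigQuery's Legacy SQL dialect (which may be the default, depending on how you run the query). Legacy SQL only accepts equality comparisons between one field from each table in the ON clause. The query should work in the Standard SQL dialect, which supports arbitrary JOIN predicates.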
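To see why the extra predicate has to stay in the ON clause once JOIN becomes LEFT JOIN, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than BigQuery, and the rows are made up; only the table and column names come from the question.

```python
import sqlite3

# In-memory stand-in for the question's table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE maritimeDB (rowID INTEGER, entity TEXT, itemname TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO maritimeDB VALUES (?, ?, ?, ?)",
    [
        (1, "9999", "shipname", "Aurora"),
        (1, "9999", "Agent", "Smith & Co"),
        (2, "9999", "shipname", "Borealis"),  # this ship has no Agent row
    ],
)

# The itemname filter on m2 lives in ON, so ships without an agent
# survive the LEFT JOIN with a NULL agent instead of being dropped.
query = """
SELECT m1.value AS shipname, m2.value AS agent
FROM maritimeDB m1
LEFT JOIN maritimeDB m2
  ON m1.rowID = m2.rowID
 AND m2.itemname = 'Agent'
WHERE m1.rowID IN (SELECT DISTINCT rowID FROM maritimeDB WHERE entity = '9999')
  AND m1.itemname = 'shipname'
ORDER BY m1.rowID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Moving m2.itemname = 'Agent' into the WHERE clause would filter out the NULL-extended rows and turn the LEFT JOIN back into an inner join.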

Related

The SQLite db query is not working in PostgreSQL db

I have a query which works correctly in SQLite, but it gives an error in PostgreSQL.
SELECT decks.id, decks.name, count(cards.id)
from decks
JOIN cards ON decks.id = cards.did
GROUP BY cards.did
The above query gives this error in PostgreSQL:
ERROR: column "decks.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT decks.id, decks.name, count(cards.id) FROM decks JOIN...
You can't have columns in the SELECT list that are neither used in an aggregate function nor part of the GROUP BY (or functionally dependent on it). SQLite accepts such queries as a nonstandard extension and takes the values of the ungrouped columns from an arbitrary row of each group. Postgres is correct to reject the query.
You need to rewrite your query to:
SELECT decks.id, decks.name, count(cards.id)
from decks
JOIN cards ON decks.id = cards.did
GROUP BY decks.id, decks.name;
If decks.id is the primary key, you can shorten the grouping to GROUP BY decks.id.
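As a quick check, the rewritten query can be run against a small in-memory SQLite database. This is only a sketch: the table definitions and rows below are assumptions, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; only the table and column names
# used in the query come from the question.
conn.executescript("""
CREATE TABLE decks (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cards (id INTEGER PRIMARY KEY, did INTEGER REFERENCES decks(id));
INSERT INTO decks VALUES (1, 'Spanish'), (2, 'Kanji');
INSERT INTO cards VALUES (10, 1), (11, 1), (12, 2);
""")

# Every non-aggregated column in the SELECT list appears in GROUP BY,
# so the same statement is accepted by both SQLite and PostgreSQL.
counts = conn.execute("""
SELECT decks.id, decks.name, count(cards.id)
FROM decks
JOIN cards ON decks.id = cards.did
GROUP BY decks.id, decks.name
ORDER BY decks.id
""").fetchall()
print(counts)
```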

Transpose/Pivot a table in Postgres

I have been trying for hours to transpose one table into another one this way:
My idea is to take an arbitrary expression (which can be a simple SELECT * FROM X INNER JOIN Y ...) and transpose it into a MATERIALIZED VIEW.
The problem is that the original table can have an arbitrary number of rows (and hence columns in the transposed table), so I was not able to find a working solution, not even with colpivot.
Can this ever be done?
Use conditional aggregation:
select "user",
       max(value) filter (where property = 'Name') as name,
       max(value) filter (where property = 'Age') as age,
       max(value) filter (where property = 'Address') as address
from the_table
group by "user";
A fundamental restriction of SQL is that all columns of a query must be known to the database before it starts running that query.
There is no way to have a "dynamic" number of columns (evaluated at runtime) in a single SQL query.
Another alternative is to aggregate everything into a JSON value:
select "user",
jsonb_object_agg(property, value) as properties
from the_table
group by "user";
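Because the column list has to be fixed before the query runs, a common workaround is to build the statement in client code: first read the distinct property names, then generate one conditional-aggregation column per property. The sketch below does this with Python's sqlite3 module and invented rows. The helper build_pivot_sql is hypothetical, and it uses max(CASE WHEN ...) instead of FILTER so that it also runs on older SQLite versions.

```python
import sqlite3

def build_pivot_sql(properties):
    """Return a pivot query with one output column per property name.

    Hypothetical helper: assumes at least one property is given.
    """
    cols = []
    for prop in properties:
        literal = prop.replace("'", "''")        # escape for a string literal
        ident = prop.lower().replace('"', '""')  # escape for a quoted identifier
        cols.append(
            f"max(CASE WHEN property = '{literal}' THEN value END) AS \"{ident}\""
        )
    return (
        'SELECT "user" AS "user", ' + ", ".join(cols)
        + ' FROM the_table GROUP BY "user" ORDER BY "user"'
    )

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE the_table ("user" TEXT, property TEXT, value TEXT)')
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        ("alice", "Name", "Alice"),
        ("alice", "Age", "30"),
        ("bob", "Name", "Bob"),
        ("bob", "Address", "Main St"),
    ],
)

# Step 1: discover the columns. Step 2: run the generated statement.
props = [r[0] for r in conn.execute(
    "SELECT DISTINCT property FROM the_table ORDER BY property")]
cursor = conn.execute(build_pivot_sql(props))
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

The restriction still holds for each individual statement. A materialized view created from the generated SQL would keep the columns that existed at creation time and would have to be recreated when new properties appear.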

Snowflake invalid identifier when performing a join

I have been trying to do an outer join across two tables in two different schemas. Before joining, I want to filter the variants table so that only SKUs that are 4 or 5 characters long remain. The join was not working with a simple WHERE clause at the end, hence the subquery.
The problem is that if I leave out the quotes, Snowflake complains about invalid identifiers. With the quotes the query runs, but every row of the raw.stitch_heroku.spree_variants.SKU column contains just the column name itself!
SELECT
analytics.dbt_lcasucci.product_category.product_description,
'raw.stitch_heroku.spree_variants.SKU'
FROM analytics.dbt_lcasucci.product_category
LEFT JOIN (
SELECT * FROM raw.stitch_heroku.spree_variants
WHERE LENGTH('raw.stitch_heroku.spree_variants.SKU')<=5
and LENGTH('raw.stitch_heroku.spree_variants.SKU')>=4
) ON 'analytics.dbt_lcasucci.product_category.product_id'
= 'raw.stitch_heroku.spree_variants.SKU'
Is there a way to work around this? I am confused and have not found this issue on forums yet!
Thanks in advance
Firstly, single quotes define a string literal ('this is text'), whereas double quotes delimit identifiers such as table and column names ("this_is_a_table_name").
Adding aliases to the tables makes the SQL more readable, and the duplicated LENGTH call can be reduced to a BETWEEN, so this should work better:
SELECT pc.product_description,
       sp.SKU
FROM analytics.dbt_lcasucci.product_category AS pc
LEFT JOIN (
    SELECT SKU
    FROM raw.stitch_heroku.spree_variants
    WHERE LENGTH(SKU) BETWEEN 4 AND 5
) AS sp
  ON pc.product_id = sp.SKU;
I reduced the sub-select's result to SKU, since that is the only column you use from sp. But given that your example joins on product_id = SKU, the SKU you select is just product_id wherever there is a match, so you may not need the join to sp at all.
The invalid identifier error suggests to me that something is named incorrectly. The first step is to check that the tables exist, that the columns are named as you expect, and that the column types match for the JOIN x ON y clause, via:
describe table analytics.dbt_lcasucci.product_category;
describe table raw.stitch_heroku.spree_variants;
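The literal-versus-identifier point explains the symptom in the question: a single-quoted name is just a constant string, so it comes back unchanged on every row. Here is a small sketch of that behaviour using Python's sqlite3 module instead of Snowflake, with a made-up single-column table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for raw.stitch_heroku.spree_variants.
conn.execute("CREATE TABLE spree_variants (SKU TEXT)")
conn.executemany(
    "INSERT INTO spree_variants VALUES (?)",
    [("1234",), ("56789",), ("123456",)],
)

# Single quotes: a string literal, returned once per row.
as_literal = conn.execute("SELECT 'SKU' FROM spree_variants").fetchall()

# No quotes: resolved as the column, so LENGTH sees the real values.
as_column = conn.execute(
    "SELECT SKU FROM spree_variants "
    "WHERE LENGTH(SKU) BETWEEN 4 AND 5 ORDER BY SKU"
).fetchall()

print(as_literal)
print(as_column)
```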

How to Get Talend to Keep Table Names in tOracleInput

Is there a way to tell Talend not to remove the prefix of column names especially when they are specified in the query to retrieve data from data source and keep the names mentioned in the query itself?
Thanks!
I assume you are using the 'guess schema' feature with a query that joins some tables. If those tables have columns with the same names, you run into trouble with the guessed schema. There is no way to have Talend use or even know the names of the tables the columns come from, because they are part of a 'projection' and could result from transformation and/or aggregation. Thus, you'll need to help Talend guess the correct schema, which means a) you can't use * to select all columns and b) you should assign each column an alias that hints at the table the column comes from.
So instead of select * from employee join department on employee.department_id = department.id you'd have something like select e.id as emp_id, e.name as emp_name, d.id as department_id, d.name as department_name from employee e join department d on e.department_id = d.id. The id from employee will be emp_id in the guessed schema.

selecting a distinct column with alias table not working in postgres

SELECT pl_id,
distinct ON (store.store_ID),
in_user_id
FROM plan1.plan_copy_levl copy1
INNER JOIN plan1._PLAN_STORE store
ON copy1.PLAN_ID = store .PLAN_ID;
While running this query on a Postgres server I get the error below. How should I use the DISTINCT clause? In the code above, plan1 is the schema name.
ERROR: syntax error at or near "distinct"
LINE 2: distinct ON (store.store_ID),
The syntax error occurs because the DISTINCT ON clause must come at the start of the select list. You are also missing an ORDER BY whose leading expressions match those in the DISTINCT ON clause; without it, the row that is kept for each store_ID is unpredictable.
Try this:
SELECT DISTINCT ON (store.store_ID)
       store.store_ID,
       pl_id,
       in_user_id
FROM plan1.plan_copy_levl copy1
INNER JOIN plan1._PLAN_STORE store
  ON copy1.PLAN_ID = store.PLAN_ID
ORDER BY store.store_ID, pl_id;
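To make the semantics concrete, here is a rough model in plain Python of what DISTINCT ON with ORDER BY does: sort the rows, then keep the first row of each group. It is not how Postgres implements it. The function distinct_on and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Invented joined rows; the column names follow the query above.
rows = [
    {"store_id": 2, "pl_id": 20, "in_user_id": "u3"},
    {"store_id": 1, "pl_id": 11, "in_user_id": "u2"},
    {"store_id": 1, "pl_id": 10, "in_user_id": "u1"},
]

def distinct_on(rows, key, order):
    """Keep the first row per `key` after sorting by `key` + `order`."""
    ordered = sorted(rows, key=itemgetter(*key, *order))
    return [next(group) for _, group in groupby(ordered, key=itemgetter(*key))]

result = distinct_on(rows, key=("store_id",), order=("pl_id",))
print(result)
```

With order=("pl_id",) the lowest pl_id per store is kept, which is why the ORDER BY matters: it decides which row of each group is the first one.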