How to use the results of a SELECT query in a WHERE clause - hiveql

Coming from Teradata, I would usually create a volatile table with some variables that I would use throughout my code.
E.g.,
create volatile table var as (
select 'filter_value' as var_field
) with data on commit preserve rows;
Then I would use that table in a SELECT WHERE clause:
select * from table
where some_field = (select var_field from var);
I am trying to do something similar in Hue (Impala editor), but I am getting an error:
create table var as
select 'filter_value' as var_field
select * from table
where some_field = (select var_field from var)
AnalysisException: Syntax error in line 5:undefined: from
table-name-hidden ^
Encountered: FROM
Expected: CASE, CAST, DEFAULT, EXISTS, FALSE, IF, INTERVAL, NOT, NULL, REPLACE, TRUNCATE, TRUE, IDENTIFIER
CAUSED BY: Exception: Syntax error
Does anyone know how to do this, or how to replicate this feature, in Hue?
It's convenient to keep all my variables at the top in one table instead of defining them throughout the code.

I have a solution. I tried it in Hue and I was able to accomplish your goal.
Queries used in Teradata:
Query 1:
create volatile table var as (
select 'filter_value' as var_field
) with data on commit preserve rows;
Query 2:
select * from table
where some_field = (select var_field from var);
Queries that work in Impala for the same scenario as above:
Query 1:
create table var as
select 'filter_value' as var_field;
Query 2:
SELECT * FROM test
where test_field in (select var_field from var);
(Note that each statement is terminated with its own semicolon, and that IN is used here; a scalar = comparison also works in Impala as long as the subquery returns exactly one row.)
Steps performed to test:
1. Variable table creation (Impala has no volatile tables, so this is a regular table):
create table var as
select 'filter_value' as var_field;
+-------------------+
| summary           |
+-------------------+
| Inserted 1 row(s) |
+-------------------+
2. Variable table data check:
SELECT * FROM var;
+--------------+
| var_field    |
+--------------+
| filter_value |
+--------------+
3. Sample table creation and data insertion:
CREATE TABLE test
(test_field string);
Fetched 0 row(s) in 0.81s
----
INSERT INTO test
values ("filter_value");
Modified 1 row(s) in 5.64s
----
INSERT INTO test
values ("filter_value2");
Modified 1 row(s) in 0.32s
----
4. Sample table data check:
select * from test;
+---------------+
| test_field    |
+---------------+
| filter_value  |
| filter_value2 |
+---------------+
5. Goal:
SELECT * FROM test
where test_field in (select var_field from var);
+--------------+
| test_field   |
+--------------+
| filter_value |
+--------------+
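As an aside, if you would rather not create a table at all, the same value can be kept at the top of a single statement with a WITH clause. This is a sketch, not one of the tested steps above, reusing the same test table:
with var as (select 'filter_value' as var_field)
select * from test
where test_field in (select var_field from var);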

Related

PostgreSQL - Loop Over Rows to Fill NULL Values

I have a table named players which has the following data
+------+------------+
| id   | username   |
|------+------------|
| 1    | mike93     |
| 2    | james_op   |
| 3    | will_sniff |
+------+------------+
desired result:
+------+------------+------------+
| id   | username   | uniqueId   |
|------+------------+------------|
| 1    | mike93     | PvS3T5     |
| 2    | james_op   | PqWN7C     |
| 3    | will_sniff | PHtPrW     |
+------+------------+------------+
I need to create a new column called uniqueId. This value is different from the default serial numeric value: uniqueId is a unique, NOT NULL, 6-character text with the prefix "P".
In my migration, here's the code I have so far:
ALTER TABLE players ADD COLUMN uniqueId varchar(6) UNIQUE;
(loop comes here)
ALTER TABLE players ALTER COLUMN uniqueId SET NOT NULL;
and here's the SQL code I use to generate these unique IDs
SELECT CONCAT('P', string_agg (substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', ceil (random() * 62)::integer, 1), ''))
FROM generate_series(1, 5);
So, in other words, I need to create the new column without the NOT NULL constraint, loop over every already existing row, fill the NULL value with a valid ID and eventually add the NOT NULL constraint.
In theory it should be enough to run:
update players
set unique_id = (SELECT CONCAT('P', string_agg ...))
;
However, Postgres will not re-evaluate the expression in the SELECT for every row, so this generates a unique constraint violation. One workaround is to create a function (which you might want to do anyway) that generates these fake IDs:
create function generate_fake_id()
returns text
as
$$
SELECT CONCAT('P', string_agg (substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', ceil (random() * 62)::integer, 1), ''))
FROM generate_series(1, 5)
$$
language sql
volatile;
Then you can update your table using:
update players
set unique_id = generate_fake_id()
;
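Putting it together, the whole migration from the question could then look like this. This is a sketch combining the statements above; note the question declares the column as uniqueId while the UPDATE uses unique_id, so use whichever name your schema actually has:
-- add the column without NOT NULL first
ALTER TABLE players ADD COLUMN unique_id varchar(6) UNIQUE;
-- the volatile function is re-evaluated for every existing row
UPDATE players SET unique_id = generate_fake_id();
-- no NULLs remain, so the constraint can now be added
ALTER TABLE players ALTER COLUMN unique_id SET NOT NULL;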

Postgres 10: do rows automatically move between partitions?

Assuming I have a parent table with child partitions that are created based on the value of a field.
If the value of that field changes, is there a way to have Postgres automatically move the row into the appropriate partition?
For example:
create table my_table(name text)
partition by list (left(name, 1));
create table my_table_a
partition of my_table
for values in ('a');
create table my_table_b
partition of my_table
for values in ('b');
In this case, if I change the value of name in a row from aaa to bbb, how can I get it to automatically move that row into my_table_b?
When I tried to do that (i.e., update my_table set name = 'bbb' where name = 'aaa';), I got the following error:
ERROR: new row for relation "my_table_a" violates partition constraint
As the commit that introduced declarative partitioning notes (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63), it doesn't handle updates that cross partition boundaries, so you need to create that behaviour yourself. Here's your set:
t=# insert into my_table select 'abc';
INSERT 0 1
t=# insert into my_table select 'bcd';
INSERT 0 1
t=# select tableoid::regclass,* from my_table;
  tableoid  | name
------------+------
 my_table_a | abc
 my_table_b | bcd
(2 rows)
Here's the rule and the function:
t=# create or replace function puf(_j json,_o text) returns void as $$
begin
raise info '%',': '||left(_j->>'name',1);
execute format('insert into %I select * from json_populate_record(null::my_table, %L)','my_table_'||left(_j->>'name',1), _j);
execute format('delete from %I where name = %L','my_table_'||left(_o,1), _o);
end;
$$language plpgsql;
CREATE FUNCTION
t=# create rule psr AS ON update to my_table do instead select puf(row_to_json(n),OLD.name) from (select NEW.*) n;
CREATE RULE
Here's the update:
t=# update my_table set name = 'bbb' where name = 'abc';
INFO: : b
puf
-----
(1 row)
UPDATE 0
Checking the result:
t=# select tableoid::regclass,* from my_table;
  tableoid  | name
------------+------
 my_table_b | bcd
 my_table_b | bbb
(2 rows)
Once again:
t=# update my_table set name = 'a1' where name = 'bcd';
INFO: : a
puf
-----
(1 row)
UPDATE 0
t=# select tableoid::regclass,* from my_table;
  tableoid  | name
------------+------
 my_table_a | a1
 my_table_b | bbb
(2 rows)
Of course, using JSON to pass the NEW record looks ugly, and it is indeed ugly. But I have not had time to study the new partitioning feature of 10, so I don't know a more elegant way to do this task. Hopefully this gives the general idea of how you can solve the problem, and you can produce neater code.
Update: it is probably a good idea to limit such a rule with ON update to my_table where left(NEW.name,1) <> left(OLD.name,1) do instead, so the heavy manipulation only runs when the row actually has to change partitions.
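A minimal sketch of that restricted rule, assuming the same puf() function as above (updates that stay in the same partition then fall through to a plain UPDATE):
create or replace rule psr as on update to my_table
where left(NEW.name,1) <> left(OLD.name,1)
do instead select puf(row_to_json(n),OLD.name) from (select NEW.*) n;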

Postgres Translate column value into schema prefix in a query

I have a database that uses PostgreSQL schemas for multi-tenancy purposes. It has a table in the public schema called customers, with an id and a tenant column. The tenant value is a string, and there's a corresponding PostgreSQL schema with matching tables in it.
It looks like this:
# public.customers
| id | tenant |
|----|--------|
| 1  | first  |
| 2  | second |
# first.users
| id | name |
|----|------|
| 1  | bob  |
| 2  | jess |
# second.users
| id | name |
|----|------|
| 1  | jen  |
| 2  | mike |
I'm wondering how I could make a single query to fetch values from a table in the schema, just given a customer id.
So if I have a customer_id of 1, how can I select * from first.users in a single query.
I'm guessing this might have to be a function written in plpgsql, but I don't have a lot of experience with that. Something like:
select * from tenant_table(1, 'users');
?
create or replace function f(_id int)
returns table (id int, name text) as $f$
declare _tenant text;
begin
-- look up the schema name for this customer id
select tenant into _tenant
from public.customers
where id = _id;
-- run a dynamically built query against that tenant's schema
return query execute format($e$
select *
from %I.users
$e$, _tenant);
end;
$f$ language plpgsql;
You cannot do that with a single query.
You'll have to use one query that selects the schema name, then construct a second query and run that.
Of course you can define a PL/pgSQL function that does both for you and executes the dynamic query with EXECUTE.
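With the function above, the lookup and the dynamic query collapse into a single call. For the sample data, customer 1 maps to the first schema:
select * from f(1);
-- returns the rows of first.users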

How can I get a result by using "execute 'delete from table1'"?

When I use the execute command to run a SQL command, I want to get its result.
As we know, I can get totals into the variable sc when I use:
execute 'select * from table' into sc;
But how can I get a result by using:
execute 'delete from table1'?
When I use INTO, I get:
ERROR: "INTO used with a command that cannot return data"
execute 'WITH row_deleted AS (DELETE FROM table1 RETURNING *) SELECT count(*) FROM row_deleted' into c;
You can use it inside a PL/pgSQL function as follows:
--Drop the table and the function if they exist:
DROP TABLE IF EXISTS table1;
DROP FUNCTION if exists _deleted_rows();
--Create the table for the example:
CREATE TABLE table1
(
row_id serial NOT NULL,
col1 character varying,
CONSTRAINT table1_pkey PRIMARY KEY (row_id)
);
--Insert some rows:
insert into table1 (col1) values ('test1');
insert into table1 (col1) values ('test2');
insert into table1 (col1) values ('test3');
--Create the function that counts the number of deleted rows in table1:
CREATE OR REPLACE FUNCTION _deleted_rows()
RETURNS character varying AS
$BODY$declare
nbr_deleted integer;
begin
execute 'WITH row_deleted AS (DELETE FROM table1 RETURNING *) SELECT count(*) FROM row_deleted' into nbr_deleted;
return (nbr_deleted);
end;$BODY$
LANGUAGE plpgsql VOLATILE;
Test that function (I had a problem building the schema on SQL Fiddle):
select * from _deleted_rows();
 _deleted_rows
---------------
 3
(1 row)
See the PostgreSQL documentation for the EXECUTE and DELETE commands.
It's a little unclear to me what you are trying to do, but you should be able to use "RETURNING". Here I am just returning the rows that were deleted:
CREATE TEMP TABLE foo(id int, description text);
INSERT INTO foo VALUES
(1, 'HELLO'),
(2, 'WORLD');
DELETE FROM foo returning *;
+----+-------------+
| id | description |
+----+-------------+
| 1  | HELLO       |
| 2  | WORLD       |
+----+-------------+
(2 rows)
Also, if you need them moved "into" a table (for example), you could do something like:
DROP TABLE IF EXISTS foo;
DROP TABLE IF EXISTS deleted_foo;
CREATE TEMP TABLE foo(id int, description text);
INSERT INTO foo VALUES
(1, 'HELLO'),
(2, 'WORLD');
CREATE TEMP TABLE deleted_foo(id int, description text);
WITH x AS (DELETE FROM foo RETURNING *)
INSERT INTO deleted_foo
SELECT * FROM x;
SELECT * FROM deleted_foo;
+----+-------------+
| id | description |
+----+-------------+
| 1  | HELLO       |
| 2  | WORLD       |
+----+-------------+
(2 rows)
Assuming that you are doing this from inside a plpgsql function, you could also use the ROW_COUNT variable. For example:
GET DIAGNOSTICS integer_var = ROW_COUNT;
This would give you the number of rows that were deleted.
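For example, a minimal sketch in an anonymous block, reusing table1 from the question:
DO $$
declare
integer_var integer;
begin
execute 'delete from table1';
-- ROW_COUNT reflects the most recent SQL command, including EXECUTE
get diagnostics integer_var = row_count;
raise notice 'deleted % row(s)', integer_var;
end;
$$;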

T-SQL One column in multiple columns select query

I have a simple problem that I have not been able to find a solution to and I'm hoping someone on StackOverflow can help.
I currently have an example query as shown below:
SELECT ID
, ColumnName
FROM Table
If I run this query I get the following result:
==================
ID | ColumnName
------------------
1  | One_Two_Three
2  | Four_Five_Six
==================
The result I'm after is as follows:
========================
ID | Col1 | Col2 | Col3
------------------------
1  | One  | Two  | Three
2  | Four | Five | Six
========================
Your assistance is appreciated.
Have a look at this example:
DECLARE @Table1 TABLE
([ID] int, [ColumnName] varchar(13))
INSERT INTO @Table1
([ID], [ColumnName])
VALUES
(1, 'One_Two_Three'),
(2, 'Four_Five_Six')
;WITH Vals AS (
SELECT *,
CAST('<d>' + REPLACE([ColumnName], '_', '</d><d>') + '</d>' AS XML) ColumnValue
FROM @Table1
)
SELECT v.*,
A.B.value('.', 'varchar(max)')
FROM Vals v CROSS APPLY
ColumnValue.nodes('/d') A(B)
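If you need the parts as fixed, named columns (Col1 to Col3) rather than one row per part, a variant of the same XML trick could look like this sketch, which assumes exactly three parts per value:
;WITH Vals AS (
SELECT *,
CAST('<d>' + REPLACE([ColumnName], '_', '</d><d>') + '</d>' AS XML) ColumnValue
FROM @Table1
)
SELECT v.[ID],
-- pick each part out of the XML by position
v.ColumnValue.value('(/d)[1]', 'varchar(max)') AS Col1,
v.ColumnValue.value('(/d)[2]', 'varchar(max)') AS Col2,
v.ColumnValue.value('(/d)[3]', 'varchar(max)') AS Col3
FROM Vals v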