'TOK_STRINGLITERALSEQUENCE not supported in insert/values' getting this error while loading data into the hive.
when trying to insert a comma-separated string into a single column it is showing error as
'TOK_STRINGLITERALSEQUENCE not supported in insert/values'
insert into table table_name values('llu'/t'ghf'/t'a,b,c,d'/t'gh,edf,ghu,kjhl'/t'1')
/t represents delimiter as tab
while loading data I am getting an error as 'TOK_STRINGLITERALSEQUENCE not supported in insert/values'.
Expected results
col1 col2 col3 col4 col5
llu ghf a,b,c,d gh,edf,ghu,kjhl 1
I'm not sure why you're using tab-delimitation for the insert statement. This worked for me in Hive version 1.2.1
create table test (col1 STRING, col2 STRING, col3 STRING, col4 STRING, col5 STRING);
insert into table test values('llu','ghf','a,b,c,d','gh,edf,ghu,kjhl','1');
select * from test;
+------------+------------+------------+------------------+------------+--+
| test.col1 | test.col2 | test.col3 | test.col4 | test.col5 |
+------------+------------+------------+------------------+------------+--+
| llu | ghf | a,b,c,d | gh,edf,ghu,kjhl | 1 |
+------------+------------+------------+------------------+------------+--+
I have two tables. The first one:
col1 | col2 | ColumnOfInterest | DateOfInterest
--------------------------------------------------------
abc | def | ghi | 2013-02-24 17:48:32.548
.
.
.
The second one:
ColumnOfInterest | DateChanged | col3 | col4
--------------------------------------------------------
ghi | 2012-08-13 06:28:11.092 | jkl | mno
ghi | 2012-10-16 23:54:07.613 | pqr | stu
ghi | 2013-01-29 14:13:18.502 | vwx | yz1
ghi | 2013-10-01 14:17:32.992 | 234 | 567
.
.
.
What I'm trying to do is to make a 1:1 join between the two tables on the ColumnOfInterest and so that the DateOfInterest reflects the date from the second table.
That is, the line from the first table would be joined to the third line of the second table.
Do you have any ideas?
Thanks
select table1.ColumnOfInterest, max(table2.DateChanged)
from table1
join table2
on table1.ColumnOfInterest = table1.ColumnOfInterest
and table1.CDateOfInterest >= table2.DateChanged
group by table1.ColumnOfInterest
SELECT 'abc' col1,
'def' col2,
'ghi' ColumnOfInterest,
CAST('2013-02-24 17:48:32.548' AS DATETIME) DateOfInterest
INTO #DateOfInterest
CREATE TABLE #History
(
ColumnOfInterest VARCHAR(5),
DateChanged DATETIME,
col3 VARCHAR(5),
col4 VARCHAR(5)
)
INSERT INTO #History
VALUES ('ghi','2012-08-13 06:28:11.092','jkl','mno'),
('ghi','2012-10-16 23:54:07.613','pqr','stu'),
('ghi','2013-01-29 14:13:18.502','vwx','yz1'),
('ghi','2013-10-01 14:17:32.992','234','567');
;WITH CTE_Date_Ranges
AS
(
SELECT ColumnOfInterest,
DateChanged,
LAG(DateChanged,1,GETDATE()) OVER (PARTITION BY ColumnOfInterest ORDER BY DateChanged) AS end_date,
col3,
col4
FROM #History
)
SELECT B.*,
A.*
FROM CTE_Date_Ranges A
INNER JOIN #DateOfInterest B
ON B.DateOfInterest > A.DateChanged AND B.DateOfInterest < A.end_date
Results:
col1 col2 ColumnOfInterest DateOfInterest ColumnOfInterest DateChanged end_date col3 col4
---- ---- ---------------- ----------------------- ---------------- ----------------------- ----------------------- ----- -----
abc def ghi 2013-02-24 17:48:32.547 ghi 2012-08-13 06:28:11.093 2015-04-21 18:46:46.967 jkl mno
How can I simulate a XOR function in PostgreSQL? Or, at least, I think this is a XOR-kind-of situation.
Lets say the data is as follows:
id | col1 | col2 | col3
---+------+------+------
1 | 1 | | 4
2 | | 5 | 4
3 | | 8 |
4 | 12 | 5 | 4
5 | | | 4
6 | 1 | |
7 | | 12 |
And I want to return 1 column for those rows where only one of the columns is filled in. (ignore col3 for now..
Lets start with this example of 2 columns:
SELECT
id, COALESCE(col1, col2) AS col
FROM
my_table
WHERE
COALESCE(col1, col2) IS NOT NULL -- at least 1 is filled in
AND
(col1 IS NULL OR col2 IS NULL) -- at least 1 is empty
;
This works nicely an should result in:
id | col
---+----
1 | 1
3 | 8
6 | 1
7 | 12
But now, I would like to include col3 in a similar way. Like this:
id | col
---+----
1 | 1
3 | 8
5 | 4
6 | 1
7 | 12
How can this be done is a more generic way? Does Postgres support such a method?
I'm not able to find anything like it.
rows with exactly 1 column filled in:
select * from my_table where
(col1 is not null)::integer
+(col1 is not null)::integer
+(col1 is not null)::integer
=1
rows with 1 or 2
select * from my_table where
(col1 is not null)::integer
+(col1 is not null)::integer
+(col1 is not null)::integer
between 1 and 2
The "case" statement might be your friend here, the "min" aggregated function doesn't affect the result.
select id, min(coalesce(col1,col2,col3))
from my_table
group by 1
having sum(case when col1 is null then 0 else 1 end+
case when col2 is null then 0 else 1 end+
case when col3 is null then 0 else 1 end)=1
[Edit]
Well, i found a better answer without using aggregated functions, it's still based on the use of "case" but i think is more simple.
select id, coalesce(col1,col2,col3)
from my_table
where (case when col1 is null then 0 else 1 end+
case when col2 is null then 0 else 1 end+
case when col3 is null then 0 else 1 end)=1
How about
select coalesce(col1, col2, col3)
from my_table
where array_length(array_remove(array[col1, col2, col3], null), 1) = 1
I have two tables, table A has ID column whose values are comma separated, each of those ID value has a representation in table B.
Table A
+-----------------+
| Name | ID |
+------------------
| A1 | 1,2,3|
| A2 | 2 |
| A3 | 3,2 |
+------------------
Table B
+-------------------+
| ID | Value |
+-------------------+
| 1 | Apple |
| 2 | Orange |
| 3 | Mango |
+-------------------+
I was wondering if there is an efficient way to do a select where the result would as below,
Name, Value
A1 Apple, Orange, Mango
A2 Orange
A3 Mango, Orange
Any suggestions would be welcome. Thanks.
You need to first "normalize" table_a into a new table using the following:
select name, regexp_split_to_table(id, ',') id
from table_a;
The result of this can be joined to table_b and the result of the join then needs to be grouped in order to get the comma separated list of the names:
select a.name, string_agg(b.value, ',')
from (
select name, regexp_split_to_table(id, ',') id
from table_a
) a
JOIN table_b b on b.id = a.id
group by a.name;
SQLFiddle: http://sqlfiddle.com/#!12/77fdf/1
There are two regex related functions that can be useful:
http://www.postgresql.org/docs/current/static/functions-string.html
regexp_split_to_table()
regexp_split_to_array()
Code below is untested, but you'd use something like it to match A and B:
select name, value
from A
join B on B.id = ANY(regexp_split_to_array(A.id, E'\\s*,\\s*', 'g')::int[]))
You can then use array_agg(value), grouping by name, and format using array_to_string().
Two notes, though:
It won't be as efficient as normalizing things.
The formatting itself ought to be done further down, in your views.
Is there a unpivot equivalent function in PostgreSQL?
Create an example table:
CREATE TEMP TABLE foo (id int, a text, b text, c text);
INSERT INTO foo VALUES (1, 'ant', 'cat', 'chimp'), (2, 'grape', 'mint', 'basil');
You can 'unpivot' or 'uncrosstab' using UNION ALL:
SELECT id,
'a' AS colname,
a AS thing
FROM foo
UNION ALL
SELECT id,
'b' AS colname,
b AS thing
FROM foo
UNION ALL
SELECT id,
'c' AS colname,
c AS thing
FROM foo
ORDER BY id;
This runs 3 different subqueries on foo, one for each column we want to unpivot, and returns, in one table, every record from each of the subqueries.
But that will scan the table N times, where N is the number of columns you want to unpivot. This is inefficient, and a big problem when, for example, you're working with a very large table that takes a long time to scan.
Instead, use:
SELECT id,
unnest(array['a', 'b', 'c']) AS colname,
unnest(array[a, b, c]) AS thing
FROM foo
ORDER BY id;
This is easier to write, and it will only scan the table once.
array[a, b, c] returns an array object, with the values of a, b, and c as it's elements.
unnest(array[a, b, c]) breaks the results into one row for each of the array's elements.
You could use VALUES() and JOIN LATERAL to unpivot the columns.
Sample data:
CREATE TABLE test(id int, a INT, b INT, c INT);
INSERT INTO test(id,a,b,c) VALUES (1,11,12,13),(2,21,22,23),(3,31,32,33);
Query:
SELECT t.id, s.col_name, s.col_value
FROM test t
JOIN LATERAL(VALUES('a',t.a),('b',t.b),('c',t.c)) s(col_name, col_value) ON TRUE;
DBFiddle Demo
Using this approach it is possible to unpivot multiple groups of columns at once.
EDIT
Using Zack's suggestion:
SELECT t.id, col_name, col_value
FROM test t
CROSS JOIN LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
<=>
SELECT t.id, col_name, col_value
FROM test t
,LATERAL (VALUES('a', t.a),('b', t.b),('c',t.c)) s(col_name, col_value);
db<>fiddle demo
Great article by Thomas Kellerer found here
Unpivot with Postgres
Sometimes it’s necessary to normalize de-normalized tables - the opposite of a “crosstab” or “pivot” operation. Postgres does not support an UNPIVOT operator like Oracle or SQL Server, but simulating it, is very simple.
Take the following table that stores aggregated values per quarter:
create table customer_turnover
(
customer_id integer,
q1 integer,
q2 integer,
q3 integer,
q4 integer
);
And the following sample data:
customer_id | q1 | q2 | q3 | q4
------------+-----+-----+-----+----
1 | 100 | 210 | 203 | 304
2 | 150 | 118 | 422 | 257
3 | 220 | 311 | 271 | 269
But we want the quarters to be rows (as they should be in a normalized data model).
In Oracle or SQL Server this could be achieved with the UNPIVOT operator, but that is not available in Postgres. However Postgres’ ability to use the VALUES clause like a table makes this actually quite easy:
select c.customer_id, t.*
from customer_turnover c
cross join lateral (
values
(c.q1, 'Q1'),
(c.q2, 'Q2'),
(c.q3, 'Q3'),
(c.q4, 'Q4')
) as t(turnover, quarter)
order by customer_id, quarter;
will return the following result:
customer_id | turnover | quarter
------------+----------+--------
1 | 100 | Q1
1 | 210 | Q2
1 | 203 | Q3
1 | 304 | Q4
2 | 150 | Q1
2 | 118 | Q2
2 | 422 | Q3
2 | 257 | Q4
3 | 220 | Q1
3 | 311 | Q2
3 | 271 | Q3
3 | 269 | Q4
The equivalent query with the standard UNPIVOT operator would be:
select customer_id, turnover, quarter
from customer_turnover c
UNPIVOT (turnover for quarter in (q1 as 'Q1',
q2 as 'Q2',
q3 as 'Q3',
q4 as 'Q4'))
order by customer_id, quarter;
FYI for those of us looking for how to unpivot in RedShift.
The long form solution given by Stew appears to be the only way to accomplish this.
For those who cannot see it there, here is the text pasted below:
We do not have built-in functions that will do pivot or unpivot. However,
you can always write SQL to do that.
create table sales (regionid integer, q1 integer, q2 integer, q3 integer, q4 integer);
insert into sales values (1,10,12,14,16), (2,20,22,24,26);
select * from sales order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
pivot query
create table sales_pivoted (regionid, quarter, sales)
as
select regionid, 'Q1', q1 from sales
UNION ALL
select regionid, 'Q2', q2 from sales
UNION ALL
select regionid, 'Q3', q3 from sales
UNION ALL
select regionid, 'Q4', q4 from sales
;
select * from sales_pivoted order by regionid, quarter;
regionid | quarter | sales
----------+---------+-------
1 | Q1 | 10
1 | Q2 | 12
1 | Q3 | 14
1 | Q4 | 16
2 | Q1 | 20
2 | Q2 | 22
2 | Q3 | 24
2 | Q4 | 26
(8 rows)
unpivot query
select regionid, sum(Q1) as Q1, sum(Q2) as Q2, sum(Q3) as Q3, sum(Q4) as Q4
from
(select regionid,
case quarter when 'Q1' then sales else 0 end as Q1,
case quarter when 'Q2' then sales else 0 end as Q2,
case quarter when 'Q3' then sales else 0 end as Q3,
case quarter when 'Q4' then sales else 0 end as Q4
from sales_pivoted)
group by regionid
order by regionid;
regionid | q1 | q2 | q3 | q4
----------+----+----+----+----
1 | 10 | 12 | 14 | 16
2 | 20 | 22 | 24 | 26
(2 rows)
Hope this helps, Neil
Pulling slightly modified content from the link in the comment from #a_horse_with_no_name into an answer because it works:
Installing Hstore
If you don't have hstore installed and are running PostgreSQL 9.1+, you can use the handy
CREATE EXTENSION hstore;
For lower versions, look for the hstore.sql file in share/contrib and run in your database.
Assuming that your source (e.g., wide data) table has one 'id' column, named id_field, and any number of 'value' columns, all of the same type, the following will create an unpivoted view of that table.
CREATE VIEW vw_unpivot AS
SELECT id_field, (h).key AS column_name, (h).value AS column_value
FROM (
SELECT id_field, each(hstore(foo) - 'id_field'::text) AS h
FROM zcta5 as foo
) AS unpiv ;
This works with any number of 'value' columns. All of the resulting values will be text, unless you cast, e.g., (h).value::numeric.
Just use JSON:
with data (id, name) as (
values (1, 'a'), (2, 'b')
)
select t.*
from data, lateral jsonb_each_text(to_jsonb(data)) with ordinality as t
order by data.id, t.ordinality;
This yields
|key |value|ordinality|
|----|-----|----------|
|id |1 |1 |
|name|a |2 |
|id |2 |1 |
|name|b |2 |
dbfiddle
I wrote a horrible unpivot function for PostgreSQL. It's rather slow but it at least returns results like you'd expect an unpivot operation to.
https://cgsrv1.arrc.csiro.au/blog/2010/05/14/unpivotuncrosstab-in-postgresql/
Hopefully you can find it useful..
Depending on what you want to do... something like this can be helpful.
with wide_table as (
select 1 a, 2 b, 3 c
union all
select 4 a, 5 b, 6 c
)
select unnest(array[a,b,c]) from wide_table
You can use FROM UNNEST() array handling to UnPivot a dataset, tandem with a correlated subquery (works w/ PG 9.4).
FROM UNNEST() is more powerful & flexible than the typical method of using FROM (VALUES .... ) to unpivot datasets. This is b/c FROM UNNEST() is variadic (with n-ary arity). By using a correlated subquery the need for the lateral ORDINAL clause is eliminated, & Postgres keeps the resulting parallel columnar sets in the proper ordinal sequence.
This is, BTW, FAST -- in practical use spawning 8 million rows in < 15 seconds on a 24-core system.
WITH _students AS ( /** CTE **/
SELECT * FROM
( SELECT 'jane'::TEXT ,'doe'::TEXT , 1::INT
UNION
SELECT 'john'::TEXT ,'doe'::TEXT , 2::INT
UNION
SELECT 'jerry'::TEXT ,'roe'::TEXT , 3::INT
UNION
SELECT 'jodi'::TEXT ,'roe'::TEXT , 4::INT
) s ( fn, ln, id )
) /** end WITH **/
SELECT s.id
, ax.fanm -- field labels, now expanded to two rows
, ax.anm -- field data, now expanded to two rows
, ax.someval -- manually incl. data
, ax.rankednum -- manually assigned ranks
,ax.genser -- auto-generate ranks
FROM _students s
,UNNEST /** MULTI-UNNEST() BLOCK **/
(
( SELECT ARRAY[ fn, ln ]::text[] AS anm -- expanded into two rows by outer UNNEST()
/** CORRELATED SUBQUERY **/
FROM _students s2 WHERE s2.id = s.id -- outer relation
)
,( /** ordinal relationship preserved in variadic UNNEST() **/
SELECT ARRAY[ 'first name', 'last name' ]::text[] -- exp. into 2 rows
AS fanm
)
,( SELECT ARRAY[ 'z','x','y'] -- only 3 rows gen'd, but ordinal rela. kept
AS someval
)
,( SELECT ARRAY[ 1,2,3,4,5 ] -- 5 rows gen'd, ordinal rela. kept.
AS rankednum
)
,( SELECT ARRAY( /** you may go wild ... **/
SELECT generate_series(1, 15, 3 )
AS genser
)
)
) ax ( anm, fanm, someval, rankednum , genser )
;
RESULT SET:
+--------+----------------+-----------+----------+---------+-------
| id | fanm | anm | someval |rankednum| [ etc. ]
+--------+----------------+-----------+----------+---------+-------
| 2 | first name | john | z | 1 | .
| 2 | last name | doe | y | 2 | .
| 2 | [null] | [null] | x | 3 | .
| 2 | [null] | [null] | [null] | 4 | .
| 2 | [null] | [null] | [null] | 5 | .
| 1 | first name | jane | z | 1 | .
| 1 | last name | doe | y | 2 | .
| 1 | | | x | 3 | .
| 1 | | | | 4 | .
| 1 | | | | 5 | .
| 4 | first name | jodi | z | 1 | .
| 4 | last name | roe | y | 2 | .
| 4 | | | x | 3 | .
| 4 | | | | 4 | .
| 4 | | | | 5 | .
| 3 | first name | jerry | z | 1 | .
| 3 | last name | roe | y | 2 | .
| 3 | | | x | 3 | .
| 3 | | | | 4 | .
| 3 | | | | 5 | .
+--------+----------------+-----------+----------+---------+ ----
Here's a way that combines the hstore and CROSS JOIN approaches from other answers.
It's a modified version of my answer to a similar question, which is itself based on the method at https://blog.sql-workbench.eu/post/dynamic-unpivot/ and another answer to that question.
-- Example wide data with a column for each year...
WITH example_wide_data("id", "2001", "2002", "2003", "2004") AS (
VALUES
(1, 4, 5, 6, 7),
(2, 8, 9, 10, 11)
)
-- that is tided to have "year" and "value" columns
SELECT
id,
r.key AS year,
r.value AS value
FROM
example_wide_data w
CROSS JOIN
each(hstore(w.*)) AS r(key, value)
WHERE
-- This chooses columns that look like years
-- In other cases you might need a different condition
r.key ~ '^[0-9]{4}$';
It has a few benefits over other solutions:
By using hstore and not jsonb, it hopefully minimises issues with type conversions (although hstore does convert everything to text)
The columns don't need to be hard coded or known in advance. Here, columns are chosen by a regex on the name, but you could use any SQL logic based on the name, or even the value.
It doesn't require PL/pgSQL - it's all SQL