Casting string like "[1, 2, 3]" to array [duplicate] - pyspark

This question already has an answer here:
Handle string to array conversion in pyspark dataframe
(1 answer)
Closed 4 years ago.
Pretty straightforward. I have an array-like column encoded as a string (varchar) and want to cast it to array (so I can then explode it and manipulate the elements in "long" format).
The two most natural approaches don't seem to work:
-- just returns a length-1 array with a single string element '[1, 2, 3]'
select array('[1, 2, 3]')
-- errors: DataType array is not supported.
select cast('[1, 2, 3]' as array)
The ugly/inelegant/circuitous way to get what I want is:
select explode(split(replace(replace('[1, 2, 3]', '['), ']'), ', '))
-- '1'
-- '2'
-- '3'
(regexp_replace could subsume the two replace calls, but regexes with square brackets are always a pain; ltrim and rtrim, or trim(BOTH '[]'...), could also be used.)
Is there any more concise way to go about this? I'm on Spark 2.3.1.

I am assuming here that the elements are single digits (a multi-digit number would be split into separate characters), but you get the idea:
>>> s = '[1,2,3]'
>>> list(c for c in s if c.isdigit())
['1', '2', '3']
>>> list(map(int, (c for c in s if c.isdigit())))
[1, 2, 3]
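The character filter above breaks on multi-digit elements. A minimal sketch that avoids this, assuming the string is always valid JSON-style array syntax holding integers (the helper name `parse_array_string` is hypothetical and not part of any library):

```python
import json

def parse_array_string(s):
    """Parse a string such as '[1, 2, 3]' into a list of ints."""
    # json.loads handles the brackets, whitespace, and multi-digit numbers
    return [int(x) for x in json.loads(s)]

print(parse_array_string('[1, 2, 3]'))
print(parse_array_string('[10, 20, 30]'))
```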

Related

Gaps - how to show values with a specific symbol?

I have a column with numbers: 1, 2, 3 ... 100. Five of them look like 1_, 3_, 5_
So, actually I have 1, 1_, 2, 3, 3_, 4, 5, 5_, 6, 7, 8 ...
I did the query:
SELECT code
FROM table.column1
WHERE column1 LIKE '%_'
to get the list of these numbers with _. Instead, I got the list of all values, including the values with _. What's the reason? Where is the mistake?
The character _ is, like %, a wildcard in LIKE operations. While % matches any sequence of zero or more characters, _ matches exactly one character. So '%_' matches every value with at least one character. To get your expected result you need to escape the _ character:
SELECT code
FROM table.column1
WHERE column1 LIKE '%\_'
Here is a dbfiddle showing it.
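In PostgreSQL you can also write the pattern as an escape string constant by putting E before it. Inside E'...' a backslash must be doubled so that a single backslash reaches LIKE:
SELECT col1
FROM table1
WHERE col1 LIKE E'%\\_';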
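The wildcard behaviour is easy to reproduce locally. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. This is an assumption: the database in the question is not SQLite, and SQLite needs an explicit ESCAPE clause because it has no default escape character. The table name `t` and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [("1",), ("1_",), ("2",), ("3",), ("3_",)])

# Unescaped: '_' matches any single character, so every non-empty value matches
unescaped = [r[0] for r in conn.execute(
    "SELECT code FROM t WHERE code LIKE '%_' ORDER BY rowid")]

# Escaped: only values that literally end in an underscore match
escaped = [r[0] for r in conn.execute(
    r"SELECT code FROM t WHERE code LIKE '%\_' ESCAPE '\' ORDER BY rowid")]

print(unescaped)
print(escaped)
```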

How to map a jsonb array of numbers to a jsonb array of strings?

I have a jsonb array of numbers or strings, e.g. '[1, "2", 3]'::jsonb
I want to map all values to strings, so that I can end up with '["1", "2", "3"]'::jsonb
I tried this:
select jsonb_agg(jsonb_array_elements_text('[1, 2, 3]'::jsonb))
But it complains:
LINE 1: select jsonb_agg(jsonb_array_elements_text('[1, 2, 3]'::json...
^
HINT: You might be able to move the set-returning function into a LATERAL FROM item.
Do what the error message suggests: use the set-returning function jsonb_array_elements_text() like a table in the FROM clause, then aggregate its rows:
select jsonb_agg(element)
from jsonb_array_elements_text('[1, 2, 3]'::jsonb) as x(element);
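To check the expected result outside the database, here is a rough Python equivalent of the same mapping. It is not PostgreSQL code, and `to_string_array` is a hypothetical helper: each element is unnested, rendered as text, and re-aggregated into an array.

```python
import json

def to_string_array(jsonb_text):
    """Map every element of a JSON array to a string."""
    elements = json.loads(jsonb_text)
    # Strings stay as they are; other scalars are rendered as JSON text
    as_text = [e if isinstance(e, str) else json.dumps(e) for e in elements]
    return json.dumps(as_text)

print(to_string_array('[1, "2", 3]'))
```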

How to extract a value from a delimited string in Db2 [duplicate]

This question already has answers here:
How to split a string value based on a delimiter in DB2
(12 answers)
Closed 2 years ago.
How can I extract PAXG from the following string in Db2 SQL?
0410|M|PAXG|20181114
Try this for table BILL and its column COL1 with data.
SELECT
COL1
-- since 9.7
, xmlcast(xmlquery('fn:tokenize($s, "\|")[3]' passing BILL.COL1 as "s") as varchar(20)) as one
-- since 11.1
, REGEXP_SUBSTR(BILL.COL1 || '|', '([^\|]*)\|', 1, 3, '', 1) as two
FROM
(VALUES '0410|M|PAXG|20181114', '0410|M||20181114') BILL (COL1)
--BILL
;
If you are using Db2 11.1 or above then REGEXP_EXTRACT will work for you. E.g.
VALUES REGEXP_EXTRACT('0410|M|PAXG|20181114','([^\|]*)\|?', 1, 3, '', 1)
returns
1
------
PAXG
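The "take the third token" logic can be sketched in Python for comparison. This is not Db2 code, and `nth_token` is a hypothetical helper. An empty field still counts as a token, which is the case the second test row ('0410|M||20181114') in the first query exercises.

```python
def nth_token(value, n, delimiter="|"):
    """Return the n-th (1-based) delimited token, or None if out of range."""
    tokens = value.split(delimiter)
    return tokens[n - 1] if 1 <= n <= len(tokens) else None

print(nth_token("0410|M|PAXG|20181114", 3))
print(repr(nth_token("0410|M||20181114", 3)))
```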

How can I conditionally add elements to a jsonb array? [duplicate]

This question already has answers here:
PostgreSQL JSON building an array without null values
(4 answers)
Closed 8 months ago.
Is there a way in which I can conditionally add elements into a postgres jsonb array? I'm trying to construct an array to be added into a larger object where most of the elements are always required but I'd like to have some of them optional.
As a simplified example:
select jsonb_build_array(
jsonb_build_object('a', a),
jsonb_build_object('b', b),
jsonb_build_object('c', c),
case when a + b <> c then
jsonb_build_object('error', c - (a + b))
end
) from ( values (2, 2, 5) ) as things (a,b,c);
This works fine when a + b <> c, but when a + b = c I get a null in the array, e.g.:
sophia=> \i ~/cc/dpdb/migration/foo.sql
jsonb_build_array
----------------------------------------------
[{"a": 2}, {"b": 2}, {"c": 5}, {"error": 1}]
(1 row)
sophia=> \i ~/cc/dpdb/migration/foo.sql
jsonb_build_array
--------------------------------------
[{"a": 2}, {"b": 2}, {"c": 4}, null]
(1 row)
sophia=>
Is there a way to add the element without the null or if added, remove the null? Obviously, I could put the whole block in a case and duplicate the first few lines but that would be rather ugly and verbose. There's jsonb_strip_nulls but that only works on objects not arrays.
You have to use a second step because you cannot create "no element" in your syntax. Either you really separate both cases with two different array creations or you have to conditionally adjust the created array afterwards:
SELECT
CASE WHEN a + b <> c THEN
my_array || jsonb_build_object('error', c - (a + b))
ELSE
my_array
END
FROM (
select
a, b, c,
jsonb_build_array(
jsonb_build_object('a', a),
jsonb_build_object('b', b),
jsonb_build_object('c', c)
) AS my_array
from ( values (2, 2, 5), (2, 2, 4) ) as things (a,b,c)
) s
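The "build first, append conditionally" pattern from the query above can be sketched in Python. This is an illustration only, not PostgreSQL code, and `build_array` is a hypothetical helper: the base array is always created, and the error object is appended only when the condition holds, so no null placeholder ever enters the array.

```python
import json

def build_array(a, b, c):
    """Build the base array, then append the error object only when needed."""
    my_array = [{"a": a}, {"b": b}, {"c": c}]
    if a + b != c:
        my_array.append({"error": c - (a + b)})
    return json.dumps(my_array)

print(build_array(2, 2, 5))
print(build_array(2, 2, 4))
```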

Compare array against array column in postgres

# things :string is an Array
scope :things, ->(q) { where('ARRAY[?]::varchar[] IN things', Array.wrap(q)) }
scope :things, ->(q) { where('things && ARRAY[?]::varchar[]', Array.wrap(q)) }
scope :things, ->(q) { where('ARRAY[?]::varchar[] <# things', Array.wrap(q)) }
I've tried a few versions, but I can't seem to find the proper incantation. I'm looking to find any row that has any of the things in the array... is there any overlap?
[1, 2, 3] & [1, 8] = t
[1, 2, 3] & [8, 9] = f
I'm trying to mimic ActiveRecord's default where behavior. If I give it an array, it'll get all the matching rows. Is this possible with postgres arrays? Is it even efficient?
One way of doing this is to convert the arrays to sets of rows. Once both arrays are sets of rows, you can intersect them and check whether the result is empty.
For example:
CREATE TABLE my_test_table(id BIGINT, test_array BIGINT[]);
INSERT INTO my_test_table(id, test_array)
VALUES
(1, array[1,2,3]),
(2, ARRAY[1,5,8]);
SELECT * FROM my_test_table
WHERE array_length((SELECT array
(
SELECT UNNEST(test_array)
INTERSECT
SELECT UNNEST(array[3,15,2])
)), 1) > 0;
The result of the SELECT statement above is:
1 | {1,2,3}
This allows for more complex matching of elements of 2 arrays. For example, if you would like to select the arrays that have at least 2 common elements, you could just change the WHERE part to
WHERE array_length((SELECT array
(
SELECT UNNEST(test_array)
INTERSECT
SELECT UNNEST(array[3,15,2])
)), 1) > 1;
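The UNNEST plus INTERSECT logic amounts to counting the distinct elements two arrays share. Here is a minimal Python sketch of that idea, using the same sample rows as the SQL above (`common_count` is a hypothetical helper, and this is not PostgreSQL code). For the plain "any overlap" case, the `&&` operator from the question's second scope is PostgreSQL's array overlap operator.

```python
def common_count(left, right):
    """Number of distinct elements the two arrays share (UNNEST + INTERSECT)."""
    return len(set(left) & set(right))

rows = {1: [1, 2, 3], 2: [1, 5, 8]}
probe = [3, 15, 2]

# Rows sharing at least one element, and at least two elements, with the probe
at_least_one = [i for i, arr in rows.items() if common_count(arr, probe) > 0]
at_least_two = [i for i, arr in rows.items() if common_count(arr, probe) > 1]
print(at_least_one, at_least_two)
```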