How to find even, odd and arbitrary series in PostgreSQL

This might be naive, but is there any way to find which rows contain only even numbers and which contain only odd numbers?
I have data in this format
"41,43,45,49,35,39,47,37"
"12,14,18,16,20,24,22,10"
"1,2,3,4,5,6"
"1,7,521,65,32"
What I have tried is splitting these values against an id column and then reading them with even and odd functions, but that takes too long.
Is there a query or a function through which I can find out which rows are even sequences, which are odd, and which are arbitrary?
Thanks in advance.

Assuming your table has some way of (uniquely) identifying each of those lists, this should work:
create table data (id integer, list text);
insert into data
values
(1, '41,43,45,49,35,39,47,37'),
(2, '12,14,18,16,20,24,22,10'),
(3, '1,2,3,4,5,6'),
(4, '1,7,521,65,32');
with normalized as (
    select id, unnest(string_to_array(list, ',')::int[]) as val
    from data
)
select id,
       bool_and((val % 2) = 0)  as all_even,
       bool_and((val % 2) <> 0) as all_odd
from normalized
group by id;
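For the four sample rows, that query should return something like this (row order may vary without an ORDER BY):
 id | all_even | all_odd
----+----------+---------
  1 | f        | t
  2 | t        | f
  3 | f        | f
  4 | f        | f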
Not sure if that is fast enough for your needs.

Related

PostgreSQL: update a composite element in an array using a where condition

I have a composite type:
CREATE TYPE mydata_t AS
(
user_id integer,
value character(4)
);
Also, I have a table that uses this composite type as an array of mydata_t.
CREATE TABLE tbl
(
id serial NOT NULL,
data_list mydata_t[],
PRIMARY KEY (id)
);
Here I want to update the mydata_t element in data_list whose user_id is 100000.
But I don't know which array element's user_id equals 100000, so I first have to search for that element. That's my problem: I don't know how to write the query. In fact, I want to update the value of the array element whose user_id is 100000 (and where the id of tbl is, for example, 1). What would the query be?
Something like this (I know it's wrong !!!)
UPDATE "tbl" SET "data_list"[i]."value"='YYYY'
WHERE "id"=1 AND EXISTS (SELECT ROW_NUMBER() OVER() AS i
FROM unnest("data_list") "d" WHERE "d"."user_id"=10000 LIMIT 1)
For example, this is my tbl data:
Row1 => id = 1, data = ARRAY[ROW(5,'YYYY'),ROW(6,'YYYY')]
Row2 => id = 2, data = ARRAY[ROW(10,'YYYY'),ROW(11,'YYYY')]
Now I want to update tbl where id is 2 and set the value of the tbl.data element whose user_id is 11 to 'XXXX'.
In fact, the final result of Row2 will be this:
Row2 => id = 2, data = ARRAY[ROW(10,'YYYY'),ROW(11,'XXXX')]
If you know the current value of value, you can use the array_replace() function to make the change:
UPDATE tbl
SET data_list = array_replace(data_list, (11, 'YYYY')::mydata_t, (11, 'XXXX')::mydata_t)
WHERE id = 2
If you do not know the current value then the situation becomes more complex:
UPDATE tbl SET data_list = data_arr
FROM (
    -- UPDATE doesn't allow aggregate functions so aggregate here
    SELECT array_agg(new_data) AS data_arr
    FROM (
        -- For the id value, get the data_list values that are NOT modified
        SELECT (user_id, value)::mydata_t AS new_data
        FROM tbl, unnest(data_list)
        WHERE id = 2 AND user_id != 11
        UNION
        -- Add the values to update
        VALUES ((11, 'XXXX')::mydata_t)
    ) x
) y
WHERE id = 2
You should keep in mind, though, that there is an awful lot of work going on in the background that cannot be optimised. The array of mydata_t values has to be examined from start to finish and you cannot use an index on this.
Furthermore, updates actually insert a new row in the underlying file on disk, and if your array has more than a few entries this will involve substantial work. This gets even more problematic when your arrays are larger than the page size of your PostgreSQL server, typically 8kB. It all happens behind the scenes, so it will work, but at a performance penalty.
Even though array_replace sounds like changes are made in place (and they indeed are in memory), the UPDATE command will write a completely new tuple to disk. So if you have 4,000 array elements, that means at least 40kB of data will have to be read (8 bytes for the mydata_t type on a typical system x 4,000 = 32kB in a TOAST file, plus the main page of the table, 8kB) and then written to disk after the update. A real performance killer.
As @klin pointed out, this design may be more trouble than it is worth. Should you make data_list a table of its own (as I would do), the update query becomes:
UPDATE data_list SET value = 'XXXX'
WHERE id = 2 AND user_id = 11
This will have MUCH better performance, especially if you add the appropriate indexes. You could then still create a view to publish the data in an aggregated form with a custom type if your business logic so requires.
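A minimal sketch of what such a normalized table could look like (the layout and constraint names here are just an illustration, not part of the original answer):
-- hypothetical normalized replacement for the mydata_t[] column
CREATE TABLE data_list
(
    id      integer NOT NULL REFERENCES tbl (id),
    user_id integer NOT NULL,
    value   character(4),
    PRIMARY KEY (id, user_id)
);
The composite primary key on (id, user_id) doubles as the index that turns the UPDATE above into a simple indexed lookup.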

Use sum function in calculated column

Is it possible to use a sum function in a calculated column?
If yes, I would like to create a calculated column that calculates the sum of a column in the same table where the date is smaller than the date of this entry. Is this possible?
And lastly, would this optimize repeated calls for this value compared with the view exemplified below?
SELECT ProductGroup, SalesDate, (
SELECT SUM(Sales)
FROM SomeList
WHERE (ProductGroup= KVU.ProductGroup) AND (SalesDate<= KVU.SalesDate)) AS cumulated
FROM SomeList AS KVU
Is it possible to use a sum function in a calculated column?
Yes, it's possible using a scalar-valued function (scalar UDF) for your computed column, but this would be a disaster. Using scalar UDFs for computed columns destroys performance. Adding a scalar UDF that accesses data (which would be required here) makes things even worse.
It sounds to me like you just need a good ol' fashioned index to speed things up. First some sample data:
IF OBJECT_ID('dbo.somelist','U') IS NOT NULL DROP TABLE dbo.somelist;
GO
CREATE TABLE dbo.somelist
(
ProductGroup INT NOT NULL,
[Month] TINYINT NOT NULL CHECK ([Month] <= 12),
Sales DECIMAL(10,2) NOT NULL
);
INSERT dbo.somelist
VALUES (1,1,22),(2,1,45),(2,1,25),(2,1,19),(1,2,100),(1,2,200),(2,2,50.55);
and the correct index:
CREATE NONCLUSTERED INDEX nc_somelist ON dbo.somelist(ProductGroup,[Month])
INCLUDE (Sales);
With this index in place this query would be extremely efficient:
SELECT s.ProductGroup, s.[Month], SUM(s.Sales)
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
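For the sample rows above, that query should return something like this (the SUM column is unaliased in the query, and row order may vary):
ProductGroup  Month  (No column name)
1             1      22.00
2             1      89.00
1             2      300.00
2             2      50.55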
If you needed to get a COUNT by month & product group you could create an indexed view like so:
CREATE VIEW dbo.vw_somelist WITH SCHEMABINDING AS
SELECT s.ProductGroup, s.[Month], TotalSales = COUNT_BIG(*)
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
GO
CREATE UNIQUE CLUSTERED INDEX uq_cl__vw_somelist ON dbo.vw_somelist(ProductGroup, [Month]);
Once that indexed view was in place your counts would be pre-aggregated. You can also include SUM(Sales) in the indexed view, because indexed views allow SUM over a non-nullable expression as long as COUNT_BIG(*) is present (AVG, MIN and MAX, on the other hand, are not allowed).
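A sketch of what that could look like (the view and column names below are illustrative, not from the original answer):
CREATE VIEW dbo.vw_somelist_sum WITH SCHEMABINDING AS
SELECT s.ProductGroup,
       s.[Month],
       SUM(s.Sales) AS TotalSales, -- allowed because Sales is declared NOT NULL
       COUNT_BIG(*) AS RowCnt      -- required in an indexed view with GROUP BY
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
GO
CREATE UNIQUE CLUSTERED INDEX uq_cl__vw_somelist_sum
ON dbo.vw_somelist_sum (ProductGroup, [Month]);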

How to split array in json using json_query?

I've got a column in a table that's a json. It contains only values without keys like
Now I'm trying to split the data from the json and create new table using every index of each array as new entry like
I've already tried
SELECT JSON_QUERY(abc) as 'Type', Id as 'ValueId' from Table FOR JSON AUTO
Is there any way to handle splitting given that some arrays might be empty and look like
[]
?
A fairly simple approach would be to use OUTER APPLY with OPENJSON.
First, create and populate a sample table (please save us this step in your future questions):
DECLARE @T AS TABLE
(
Id int,
Value nvarchar(20)
)
INSERT INTO @T VALUES
(1, '[10]'),
(2, '[20, 200]'),
(3, '[]'),
(4, '')
The query:
SELECT Id, JsonValues.Value
FROM @T As t
OUTER APPLY
OPENJSON( Value ) As JsonValues
WHERE ISJSON(t.Value) = 1
Results:
Id Value
1 10
2 20
2 200
3 NULL
Note the ISJSON condition in the where clause will prevent exceptions in case the Value column contains anything other than a valid json (an empty array [] is still considered valid for this purpose).
If you don't want to return a row where the json array is empty, use cross apply instead of outer apply.
Your own code calling FOR JSON AUTO tries to create JSON out of tabular data. But what you really need seems to be the opposite direction: you want to transform JSON into a result set, a derived table. This is done by OPENJSON.
Your JSON seems to be a very minimalistic array.
You can try something along this.
DECLARE @json NVARCHAR(MAX) = N'[1,2,3]';
SELECT * FROM OPENJSON(@json);
The result returns the zero-based ordinal position in key, the actual value in value and a (very limited) type-enum.
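For the literal above, the output should look roughly like this (type 2 denotes a numeric value):
key  value  type
0    1      2
1    2      2
2    3      2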
Hint: If you want to use this against a table's column you must use APPLY, something along these lines:
SELECT *
FROM YourTable t
OUTER APPLY OPENJSON(t.TheJsonColumn);

Multiply rows by difference of numbers in columns, with sequence list

I need to create a table in Postgres that multiplies a row by the difference of the numbers in two columns and provides the corresponding sequence. It's hard to explain; I'll leave a picture to save us a thousand words:
I have found a partial answer to this question in SQL, but it only multiplies by one column, and I'm having trouble using it in PostgreSQL:
How to multiply a single row with a number from column in sql.
You can use the generate_series function: https://www.postgresql.org/docs/current/static/functions-srf.html
create table table_a(
a integer primary key,
start_a integer,
end_a integer
);
insert into table_a values
(1, 1, 3),
(2, 2, 5);
create table table_b as
select a, start_a, end_a, g as start_b, g+1 as end_b
from table_a, lateral generate_series(start_a, end_a-1) g;
select * from table_b;
You can try it here: http://rextester.com/RTZWK4070
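With the sample rows above, table_b should end up containing (row order may vary):
 a | start_a | end_a | start_b | end_b
---+---------+-------+---------+-------
 1 |       1 |     3 |       1 |     2
 1 |       1 |     3 |       2 |     3
 2 |       2 |     5 |       2 |     3
 2 |       2 |     5 |       3 |     4
 2 |       2 |     5 |       4 |     5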

searching an element in an array without a loop in postgresql

I'm writing a query to search for an element in an array. Searching with a "for" loop is not efficient because my array has a lot of elements, so the query takes a lot of time to execute. Can anyone say how to search for an element in an array without a "for" loop, which should be faster? I also need to get the index of the matching element.
Thanks,
Karthika
Use the ANY operator:
where 1 = ANY (array_column)
That will return all rows where array_column contains the value 1 at least once. If you want to check for multiple values, see Clodoaldo's answer.
If you create a GIN index on that column, such containment checks can be very fast. Something like this:
create index on the_table using gin (the_array_column);
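Note, though, that the planner will only use a GIN index for array operators such as @> (contains), not for the 1 = ANY(array_column) form shown above, so the index-friendly version of the check would look something like this (same hypothetical table and column names as above):
-- "the array column contains the value 1"
select *
from the_table
where the_array_column @> array[1];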
The following is inspired by the solution shown here: Finding the position of a value in PostgreSQL arrays
with sample_data (pk_column, array_data) as (
values
(1, array[1,2,3,4,5]),
(2, array[7,8,9,11]),
(3, array[5,4,3,2,1]),
(4, array[10,9,8,1,4,6]),
(5, array[7,8,9])
)
select *
from (
select pk_column,
unnest(array_data) as value,
generate_subscripts(array_data, 1) as array_index
from sample_data
where 1 = any(array_data)
) t
where value = 1
The inner where will reduce the total work that needs to be done to only those rows that actually contain the value. The outer query will then "explode" the array to get the value's index. But using the function shown in the linked question might actually be what you are after.
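For the sample data in the CTE above, the result should be (row order may vary):
 pk_column | value | array_index
-----------+-------+-------------
         1 |     1 |           1
         3 |     1 |           5
         4 |     1 |           4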
Check the contains operator @>
select array[1,2] @> array[1];
?column?
----------
t
http://www.postgresql.org/docs/current/static/functions-array.html