Counting occurrences of a value in a column of a table - postgresql

I have a table with multiple columns:
table1:
| column1 | column2 | column3 |
| ------- | ------- | ------- |
| x       | ....    | ....    |
| y       | ....    | ....    |
| x       | ....    | ....    |
How can I count the occurrences of a value, for example x, in one of the columns, for example column1? Given table1 this would have to return 2 (the number of x values present in column1).

You can use the SUM() aggregate function with a CASE expression, like:
select sum(case when column1 = 'x' then 1 else 0 end) as x_count
from table1;

SELECT COUNT(*) FROM table1 WHERE column1 = 'x'
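PostgreSQL (9.4+) also supports the aggregate FILTER clause, which reads more clearly than the CASE trick when you want counts for several values in one pass (table and column names as in the question):

```sql
-- Count occurrences of 'x' and 'y' in column1 in a single scan of table1
SELECT count(*) FILTER (WHERE column1 = 'x') AS x_count,
       count(*) FILTER (WHERE column1 = 'y') AS y_count
FROM table1;
```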

Related

Counting consecutive days in postgres

I'm trying to count the number of consecutive days in two tables with the following structure:
| id | email | timestamp |
| -------- | -------------- | -------------- |
| 1 | hello#example.com | 2021-10-22 00:35:22 |
| 2 | hello2#example.com | 2021-10-21 21:17:41 |
| 1 | hello#example.com | 2021-10-19 00:35:22 |
| 1 | hello#example.com | 2021-10-18 00:35:22 |
| 1 | hello#example.com | 2021-10-17 00:35:22 |
I would like to count the number of consecutive days of activity. The data above would show:
| id | email | length |
| -------- | -------------- | -- |
| 1 | hello#example.com | 1 |
| 2 | hello2#example.com | 1 |
| 1 | hello#example.com | 3 |
This is made more difficult because I need to join the two tables using a UNION (or something similar) and then run the grouping. I tried to build on this query (Finding the length of a series in postgres) but I'm unable to group by consecutive days.
select max(id) as max_id, email, count(*) as length
from (
select *, row_number() over wa - row_number() over wp as grp
from began_playing_video
window
wp as (partition by email order by id desc),
wa as (order by id desc)
) s
group by email, grp
order by 1 desc
Any ideas on how I could do this in Postgres?
First create an aggregate function in order to count the adjacent dates within an ascending-ordered list. The jsonb data type is used because it allows mixing various data types inside the same array:
CREATE OR REPLACE FUNCTION count_date(x jsonb, y jsonb, d date)
RETURNS jsonb LANGUAGE sql AS
$$
SELECT CASE
WHEN d IS NULL
THEN COALESCE(x,y)
ELSE
to_jsonb(d :: text)
|| CASE
WHEN COALESCE(x,y) = '[]' :: jsonb
THEN '[1]' :: jsonb
WHEN COALESCE(x->>0, y->>0) :: date + 1 = d :: date
THEN jsonb_set(COALESCE(x-0, y-0), '{-1}', to_jsonb(COALESCE(x->>-1, y->>-1) :: integer + 1))
ELSE COALESCE(x-0, y-0) || to_jsonb(1)
END
END ;
$$ ;
DROP AGGREGATE IF EXISTS count_date(jsonb, date) ;
CREATE AGGREGATE count_date(jsonb, date)
(
sfunc = count_date
, stype = jsonb
) ;
Then run the count_date aggregate on your table, grouped by id:
WITH list AS (
SELECT id, email, count_date('[]', timestamp :: date ORDER BY timestamp) as count_list
FROM your_table
GROUP BY id, email
)
SELECT id, email, jsonb_array_elements(count_list-0) AS length
FROM list
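A lighter-weight alternative is the classic gaps-and-islands trick, with no custom aggregate: subtracting a row number from each date yields a value that stays constant within every run of consecutive days. A sketch, assuming the two tables are already combined into your_table with at most one row per id/email/day (deduplicate with SELECT DISTINCT first if needed):

```sql
-- Each run of consecutive days shares the same (date - row_number) value,
-- so grouping by it counts the run length.
SELECT id, email, count(*) AS length
FROM (
    SELECT id, email,
           timestamp::date
             - row_number() OVER (PARTITION BY id, email
                                  ORDER BY timestamp::date)::int AS grp
    FROM your_table
) s
GROUP BY id, email, grp;
```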

Parse text data in PostgreSQL

I've got a PostgreSQL database, one table with 2 text columns, stored data like this:
id| col1 | col2 |
------------------------------------------------------------------------------|
1 | value_1, value_2, value_3 | name_1(date_1), name_2(date_2), name_3(date_3)|
2 | value_4, value_5, value_6 | name_4(date_4), name_5(date_5), name_6(date_6)|
I need to parse rows in a new table like this:
id | col1 | col2 | col3 |
1 | value_1 | name_1 | date_1 |
1 | value_2 | name_2 | date_2 |
...| ... | ... | ... |
2 | value_6 | name_6 | date_6 |
How might I do this?
step-by-step demo: db<>fiddle
SELECT
id,
u_col1 as col1,
col2_matches[1] as col2, -- 5
col2_matches[2] as col3
FROM
mytable,
unnest( -- 3
regexp_split_to_array(col1, ', '), -- 1
regexp_split_to_array(col2, ', ') -- 2
) as u (u_col1, u_col2),
regexp_matches(u_col2, '(.+)\((.+)\)') as col2_matches -- 4
1. Split the data of your first column into an array
2. Split the data of your second column into an array of form {a(a), b(b), c(c)}
3. Transpose all array elements into their own records
4. Split the elements of form a(b) into an array of form {a,b}
5. Show the required columns. For col2 and col3, show the first or second array element from step 4
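The whole pipeline can be tried standalone with inline sample data (the VALUES rows here are invented for the demo):

```sql
-- Multi-argument unnest zips the two arrays row by row;
-- regexp_matches then splits each "name(date)" pair.
SELECT id,
       u_col1          AS col1,
       col2_matches[1] AS col2,
       col2_matches[2] AS col3
FROM (VALUES
        (1, 'value_1, value_2', 'name_1(date_1), name_2(date_2)')
     ) AS mytable(id, col1, col2),
     unnest(regexp_split_to_array(col1, ', '),
            regexp_split_to_array(col2, ', ')) AS u(u_col1, u_col2),
     regexp_matches(u_col2, '(.+)\((.+)\)') AS col2_matches;
```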

Postgres 10 lateral unnest missing null values

I have a Postgres table where the content of a text column is delimited with '|'.
ID | ... | my_column
-----------------------
1 | ... | text|concatenated|as|such
2 | ... | NULL
3 | ... | NULL
I tried to unnest(string_to_array()) this column to separate rows which works fine, except that my NULL values (>90% of all entries) are excluded. I have tried several approaches:
SELECT * from "my_table", lateral unnest(CASE WHEN "this_column" is NULL
THEN NULL else string_to_array("this_column", '|') END);
or
as suggested here: PostgreSQL unnest with empty array
What I get:
ID | ... | my_column
-----------------------
1 | ... | text
1 | ... | concatenated
1 | ... | as
1 | ... | such
But this is what I need:
ID | ... | my_column
-----------------------
1 | ... | text
1 | ... | concatenated
1 | ... | as
1 | ... | such
2 | ... | NULL
3 | ... | NULL
Use a LEFT JOIN instead:
SELECT m.id, t.*
from my_table m
left join lateral unnest(string_to_array(my_column, '|')) as t(w) on true;
There is no need for the CASE expression to handle NULL values; string_to_array handles them correctly.
Online example: http://rextester.com/XIGXP80374
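Since external fiddle links tend to rot, here is a self-contained version of the same idea with inline sample data mirroring the question:

```sql
-- string_to_array(NULL, '|') yields NULL, unnest(NULL) yields zero rows,
-- and LEFT JOIN LATERAL ... ON true keeps those rows with w = NULL.
SELECT m.id, t.w AS my_column
FROM (VALUES (1, 'text|concatenated|as|such'),
             (2, NULL),
             (3, NULL)) AS m(id, my_column)
LEFT JOIN LATERAL unnest(string_to_array(m.my_column, '|')) AS t(w) ON true;
```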

SQL group by distinct array

I have table1:
col1 (integer) | col2 (varchar[]) | col3 (integer)
----------------------------------------------------
1 | {A,B,C} | 2
1 | {A} | 5
1 | {A,B} | 1
2 | {A,B} | 2
2 | {A} | 3
2 | {B} | 1
I want to summarize col3 with a GROUP BY on col1, while keeping only DISTINCT values in col2.
Expected result below :
col1 (integer) | col2 (varchar[]) | col3 (integer)
----------------------------------------------------
1 | {A,B,C} | 8
2 | {A,B} | 6
I tried this :
SELECT col1, array_to_string(array_accum(col2), ','::text),sum(col3) FROM table1 GROUP BY col1
but the result is not the one expected :
col1 (integer) | col2 (varchar[]) | col3 (integer)
---------------------------------------------------------------
1 | {A,B,C,A,A,B} | 8
2 | {A,B,A,B} | 6
do you have any suggestion?
If the logic of which col2 you want is by the largest (like in your expected output, {A,B,C} and {A,B}), you can pick it with DISTINCT ON ordered by array length, and sum col3 with a window function:
SELECT DISTINCT ON (col1)
col1,
col2,
SUM(col3) OVER (PARTITION BY col1) AS col3
FROM table1
ORDER BY col1, array_length(col2, 1) DESC
SELECT
col1,
array_to_string(array_accum(col2), ','::text),
sum(col3)
FROM table1
GROUP BY col1;
but array_to_string just concatenates the array elements using the supplied delimiter (and optional null string), so the duplicates are kept.
You have to devise a different strategy, like using array_dims(anyarray) to select the array with the most elements, or create a new aggregate function.
For this you could be interested in this answer:
eliminate duplicate array values in postgres
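A sketch of that deduplication combined with the sum, using the column names from the question: unnest col2, aggregate the distinct elements, and compute the col3 sum in a separate CTE so the cross join does not inflate it:

```sql
-- Sum col3 per col1 first, so unnesting col2 cannot double-count it.
WITH sums AS (
    SELECT col1, sum(col3) AS col3
    FROM table1
    GROUP BY col1
)
SELECT t.col1,
       array_agg(DISTINCT e) AS col2,   -- duplicates removed here
       s.col3
FROM table1 t
CROSS JOIN LATERAL unnest(t.col2) AS e
JOIN sums s USING (col1)
GROUP BY t.col1, s.col3;
```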

SQL - group by - limit clause - postgresql

I have a table which has two columns C1 and C2.
C1 has an integer data type and C2 has text.
Table looks like this.
---C1--- ---C2---
1 | a |
1 | b |
1 | c |
1 | d |
1 | e |
1 | f |
1 | g |
2 | h |
2 | i |
2 | j |
2 | k |
2 | l |
2 | m |
2 | n |
------------------
My question: I want a SQL query which does a GROUP BY on column C1, but in chunks of size 3.
The result should look like this:
------------------
1 | a,b,c |
1 | d,e,f |
1 | g |
2 | h,i,j |
2 | k,l,m |
2 | n |
------------------
Is it possible with plain SQL?
Note: I do not want to write a stored procedure or function...
You can use a common table expression to partition the results into rows, and then use STRING_AGG to join them into comma-separated lists:
WITH cte AS (
SELECT *, (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)-1)/3 rn
FROM mytable
)
SELECT C1, STRING_AGG(C2, ',') ALL_C2
FROM cte
GROUP BY C1,rn
ORDER BY C1
An SQLfiddle to test with.
A short explanation of the common table expression:
ROW_NUMBER() OVER (...) will number the results from 1 to n for each value of C1. We then subtract 1 and divide by 3 to get the sequence 0,0,0,1,1,1,2,2,2... and group by that value in the outer query to get 3 results per row.
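The chunk-number arithmetic can be checked in isolation, with generate_series standing in for the row numbers:

```sql
-- Integer division maps row numbers 1,2,3 -> 0; 4,5,6 -> 1; 7 -> 2
SELECT n AS row_number, (n - 1) / 3 AS chunk
FROM generate_series(1, 7) AS n;
```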
Apart from Joachim Isaksson's answer, you can also try this method:
SELECT C1, string_agg(C2, ',') as c2
FROM (
SELECT *, (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)-1)/3 as row_num
FROM atable) t
GROUP BY C1,row_num
ORDER BY c2