How to multiply decimal numbers in a column within a GROUP BY? - amazon-redshift

I have a SQL table that looks like this:
date id value type
2020-01-01 1 1.03 a
2020-01-01 1 1.02 a
2020-01-02 2 1.06 a
2020-01-02 2 1.2 a
2020-01-03 3 1.09 b
I need to build a query that groups by date, id, and type, multiplying the values in the value column wherever type = 'a'.
This is what the new table should look like:
date id value type
2020-01-01 1 1.0506 a
2020-01-02 2 1.272 a
2020-01-03 3 1.09 b
Currently I am building this query:
select
date, id, value, type
from my_table
where date between 'some date' and 'some date'
and trying to fit in EXP(SUM(LOG(value))),
but how do I do the multiplication only where type = 'a' in a GROUP BY?
Edit:
There are more than 2 distinct values in the type column.
I am using Redshift, not PostgreSQL.

select date
, id
-- use CASE to check whether the group is type 'a'
, case when type = 'a' then EXP(SUM(LN(value::float))) -- multiply via logs; LN is the natural log that EXP inverts (single-argument LOG is base 10), and values must be > 0
else max(value) -- use min or max to pick only one value
end as value
, type
from my_table
where date between 'some date' and 'some date'
group
by date, id, type
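The log trick rests on the identity exp(ln a + ln b) = a * b, which only holds when the logarithm and the exponential share a base. As a rough sketch outside the database, the Python below reproduces the grouping on the question's sample rows. The function name `grouped_product` and the in-memory row list are illustrative assumptions, not part of the original query.

```python
import math
from collections import defaultdict

# Sample rows from the question: (date, id, value, type)
rows = [
    ("2020-01-01", 1, 1.03, "a"),
    ("2020-01-01", 1, 1.02, "a"),
    ("2020-01-02", 2, 1.06, "a"),
    ("2020-01-02", 2, 1.2, "a"),
    ("2020-01-03", 3, 1.09, "b"),
]


def grouped_product(rows):
    """Group by (date, id, type); multiply values for type 'a', else take the max."""
    groups = defaultdict(list)
    for date, id_, value, type_ in rows:
        groups[(date, id_, type_)].append(value)
    result = {}
    for key, values in groups.items():
        if key[2] == "a":
            # exp(sum(ln(v))) equals the product when every v is > 0
            result[key] = math.exp(sum(math.log(v) for v in values))
        else:
            result[key] = max(values)
    return result


result = grouped_product(rows)
for (date, id_, type_), value in sorted(result.items()):
    print(date, id_, round(value, 4), type_)

# Mixing bases breaks the identity: a base-10 log fed into exp() is not the product.
mixed_base = math.exp(math.log10(1.03) + math.log10(1.02))
```

The `mixed_base` value at the end shows why the base matters: it comes out noticeably different from 1.0506, which is the kind of silent error a base-10 logarithm inside EXP would produce.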

Related

ORDER BY MIN date

I need to fetch the number of employees per month whose first work falls in a selected period, and I have to display only the month in which each employee appears for the first time. My query works fine, but I need to order the result by date. Here is my query:
SELECT TO_CHAR(sub.minStartDate,'mm/YYYY') as date,
COUNT(DISTINCT sub.id) AS nombre
FROM (
SELECT MIN(sw.start_date) as minStartDate,
e.id
FROM employee e
INNER JOIN social_work sw ON e.id = sw.employee_id
GROUP BY e.id
HAVING MIN(sw.start_date) BETWEEN '2020-01-01' AND '2022-12-31'
) sub
GROUP BY date
ORDER BY date
And the result:
date | nombre
--------------
04/2021 | 2
05/2020 | 1
Expected output:
date | nombre
--------------
05/2020 | 1
04/2021 | 2
I've tried to put sub.minStartDate in the ORDER BY clause, but then I also have to put it in the GROUP BY clause, which gives me this output:
date | nombre
--------------
05/2020 | 1
04/2021 | 1
04/2021 | 1
And it's not what I want.
You're ordering by date, which is the result of the TO_CHAR() function. TO_CHAR() returns text, so your ORDER BY clause sorts the values as strings: '04/2021' comes before '05/2020' because '04' is less than '05'.
Since you don't want to ORDER BY sub.minStartDate, you could try changing your format to put the least significant part of the date (in this case, the month) on the right: TO_CHAR(sub.minStartDate, 'YYYY/mm').
If you can't change the displayed format either, you can add a second, sortable rendering of the same date and group and order by it:
SELECT
TO_CHAR(sub.minStartDate,'mm/YYYY') as date,
TO_CHAR(sub.minStartDate,'YYYY/mm') sortingDate,
COUNT(DISTINCT sub.id) AS nombre
FROM
-- omitted for simplicity
GROUP BY date, sortingDate
ORDER BY sortingDate
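Another option that may be worth trying is to keep grouping by the label but order by an aggregate of the underlying date, such as MIN(sub.minStartDate). The sketch below uses SQLite through Python purely as a stand-in, so `strftime` replaces TO_CHAR, and the `first_work` table and its three rows are hypothetical data standing in for the subquery's output. Treat it as an illustration of the sorting behaviour, not as a tested PostgreSQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the subquery: one row per employee with the first start date.
conn.execute("CREATE TABLE first_work (id INTEGER, min_start_date TEXT)")
conn.executemany(
    "INSERT INTO first_work VALUES (?, ?)",
    [(1, "2021-04-03"), (2, "2021-04-20"), (3, "2020-05-11")],
)

# Sorting on the 'mm/YYYY' label compares text, so '04/2021' sorts before '05/2020'.
by_label = conn.execute(
    """
    SELECT strftime('%m/%Y', min_start_date) AS month_label, COUNT(DISTINCT id)
    FROM first_work
    GROUP BY month_label
    ORDER BY month_label
    """
).fetchall()

# Sorting on an aggregate of the real date keeps one row per month, in date order.
by_date = conn.execute(
    """
    SELECT strftime('%m/%Y', min_start_date) AS month_label, COUNT(DISTINCT id)
    FROM first_work
    GROUP BY month_label
    ORDER BY MIN(min_start_date)
    """
).fetchall()

print(by_label)
print(by_date)
```

Because the ORDER BY expression is an aggregate, it does not need to appear in GROUP BY, so the two April employees stay in a single row.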

confusion in using select command in postgresql with timestamp column

I have a table with this structure:
CREATE TABLE users (
id serial NOT NULL,
created_at timestamp NOT NULL
)
I have more than 1000 records in this table.
This is my first query.
1st Query
select id,created_at from users where id in (1051,1052)
This returns two rows, which is as expected. However, when I use
2nd Query
select id,created_at from users where created_at = '2020-06-28'
or
select id,created_at from users where created_at = date '2020-06-28'
it returns nothing. This is not the expected result, as it should return the same two rows.
Similarly, if I use this
3rd Query
select id, created_at from users where created_at between date '2020-06-28' and date '2020-06-28'
it returns nothing, although I think this should also return two rows.
Whereas this
4th Query
select id, created_at from users where created_at between date '2020-06-28' and date '2020-06-29'
returns two rows.
SHOW timezone returns the correct time zone for my current location.
I don't understand why the results differ between the 2nd, 3rd and 4th queries. How can I get the same result as query 1 using the 3rd query?
There is a single reason behind all of these results: you are comparing a timestamp with a date, and the date is treated as midnight of that day.
In Query 2
you are comparing 2020-06-28 13:02:53 = 2020-06-28 00:00:00, which does not match, so no records are returned.
In Query 3
you are using BETWEEN, i.e. 2020-06-28 13:02:53 between 2020-06-28 00:00:00 and 2020-06-28 00:00:00, which does not match, so no records are returned.
In Query 4
you are using BETWEEN, i.e. 2020-06-28 13:02:53 between 2020-06-28 00:00:00 and 2020-06-29 00:00:00. Both records fall between those timestamps, so you get them back.
So you have to compare date values. As the right operand is a date, convert the left operand to a date as well. Try this
for the 2nd Query
select id,created_at from users where date(created_at) = '2020-06-28'
for 3rd Query
select id, created_at from users where date(created_at) between date '2020-06-28' and date '2020-06-28'
Use the 3rd form if you want to compare a date range. For a single day, use the 2nd query.
Because what you are doing is:
test(5432)=# select '2020-06-28'::timestamp;
timestamp
---------------------
06/28/2020 00:00:00
You are selecting for created_at that is exactly at midnight and there is none. The same thing when you do:
select id, created_at from users where created_at between date '2020-06-28' and date '2020-06-28'
You already corrected the mistake in your 3rd query in the 4th query:
select id, created_at from users where created_at between date '2020-06-28' and date '2020-06-29'
where you span the time from midnight of 06/28/2020 to midnight 06/29/2020
An alternate solution is:
create table dt_test(id integer, ts_fld timestamp);
insert into dt_test values (1, '07/04/2020 8:00'), (2, '07/05/2020 1:00'), (3, '07/05/2020 8:15');
select * from dt_test ;
id | ts_fld
----+---------------------
1 | 07/04/2020 08:00:00
2 | 07/05/2020 01:00:00
3 | 07/05/2020 08:15:00
select * from dt_test where date_trunc('days', ts_fld) = '07/05/2020'::date;
id | ts_fld
----+---------------------
2 | 07/05/2020 01:00:00
3 | 07/05/2020 08:15:00
In your case:
select id, created_at from users where date_trunc('days', created_at) = '2020-06-28'::date;
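To make the comparison rules concrete, here is a small Python sketch of the same three checks: equality against midnight, comparison after reducing the timestamp to its calendar day, and a half-open range from one midnight to the next. The second timestamp (18:45:00) is an assumed value for illustration; only 13:02:53 appears in the answers above.

```python
from datetime import date, datetime, time, timedelta

# Assumed sample data: two timestamps on the same calendar day.
created_at = [datetime(2020, 6, 28, 13, 2, 53), datetime(2020, 6, 28, 18, 45, 0)]

day = date(2020, 6, 28)
# A date compared with a timestamp behaves like midnight at the start of that day.
midnight = datetime.combine(day, time.min)
next_midnight = midnight + timedelta(days=1)

# Queries 2 and 3: only a row stamped exactly 00:00:00 would match.
equal_to_midnight = [ts for ts in created_at if ts == midnight]

# date(created_at) = '2020-06-28': compare calendar days instead of instants.
same_calendar_day = [ts for ts in created_at if ts.date() == day]

# Half-open range: from midnight (inclusive) to the next midnight (exclusive).
half_open_range = [ts for ts in created_at if midnight <= ts < next_midnight]

print(len(equal_to_midnight), len(same_calendar_day), len(half_open_range))
```

In SQL the half-open form would read created_at >= date '2020-06-28' and created_at < date '2020-06-29'. Unlike BETWEEN, it excludes a row stamped exactly at the following midnight, and because the column is not wrapped in a function, an ordinary index on created_at can usually still be used.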

Get distinct rows based on one column with T-SQL

I have a table in the following format:
Time Value
17:27 2
17:27 3
I want to get the distinct rows based on one column: Time. So my expected result would be a single row, either 17:27 2 or 17:27 3.
Distinct
T-SQL uses distinct on multiple columns instead of one. Distinct would return two rows since the combinations of Time and Value are unique (see below).
select distinct [Time], * from SAPQMDATA
would return
Time Value
17:27 2
17:27 3
instead of
Time Value
17:27 2
Group by
Also group by does not appear to work
select * from table group by [Time]
Will result in:
Column 'Value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Questions
How can I select all unique 'Time' columns without taking into account other columns provided in a select query?
How can I remove duplicate entries?
This is where ROW_NUMBER will be your best friend. Using this as your sample data...
time value
-------------------- -----------
17:27 2
17:27 3
11:36 9
15:14 5
15:14 6
... below are two solutions that you can copy/paste/run.
DECLARE @youtable TABLE ([time] VARCHAR(20), [value] INT);
INSERT @youtable VALUES ('17:27',2),('17:27',3),('11:36',9),('15:14',5),('15:14',6);
-- The most elegant way to solve this
SELECT TOP (1) WITH TIES t.[time], t.[value]
FROM @youtable AS t
ORDER BY ROW_NUMBER() OVER (PARTITION BY t.[time] ORDER BY (SELECT NULL));
-- A more efficient way to solve this
SELECT t.[time], t.[value]
FROM
(
SELECT t.[time], t.[value], ROW_NUMBER() OVER (PARTITION BY t.[time] ORDER BY (SELECT NULL)) AS RN
FROM @youtable AS t
) AS t
WHERE t.RN = 1;
Each returns:
time value
-------------------- -----------
11:36 9
15:14 5
17:27 2
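Note that ORDER BY (SELECT NULL) leaves it unspecified which row in each partition receives row number 1, so the value kept for a duplicated time is not guaranteed. If a deterministic pick matters, one approach is to order within the partition by a real column (for example, ORDER BY t.[value]). The Python sketch below mirrors that idea by keeping the smallest value per time; the function name `first_per_time` is a hypothetical helper, not part of the answer.

```python
# Sample data from the answer: (time, value)
rows = [("17:27", 2), ("17:27", 3), ("11:36", 9), ("15:14", 5), ("15:14", 6)]


def first_per_time(rows):
    """Keep one row per time: the one with the smallest value."""
    best = {}
    for time_, value in rows:
        if time_ not in best or value < best[time_]:
            best[time_] = value
    # Sort by time so the output order is stable.
    return sorted(best.items())


deduped = first_per_time(rows)
print(deduped)
```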

PostgreSQL group by error

I have a relation in a PostgreSQL database called 'processed_data' having the following schema:
Date -> date type, shop_id -> integer type, item_category_id -> integer type, sum_item_cnt_day -> real type.
Displaying the first 5 rows of the relation is as follows:
date | shop_id | item_category_id | sum_item_cnt_day
------+-----------+--------------------+------------------
2014-12-29 | 49 | 3 | 4
2014-12-29 | 49 | 6 | 1
2014-12-29 | 49 | 7 | 1
2014-12-29 | 49 | 12 | 3
2014-12-29 | 49 | 16 | 1
Now, 'shop_id' has 60 unique shops ranging from 0-59. Each shop sells items that are grouped by 'item_category_id', and 'sum_item_cnt_day' denotes the number of items sold by a shop in that item category on that date.
I am now trying to aggregate the data further to get only the following columns as the final result:
date, shop_id, sum_item_cnt_day
so that the data is aggregated across all values of 'item_category_id' per shop (denoted by 'shop_id'), summing 'sum_item_cnt_day'.
When I try to execute the following SQL command:
select date, shop_id, sum(sum_item_cnt_day) from processed_data group by shop_id;
It gives the error-
ERROR: column "processed_data.date" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select date, shop_id, sum(sum_item_cnt_day) from processed_d...
^
Even the following SQL command-
select date, shop_id, sum(sum_item_cnt_day) from processed_data where date between '2013-01-01' and '2013-01-31' group by shop_id;
Gives the error-
ERROR: column "processed_data.date" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select date, shop_id, sum(sum_item_cnt_day) from processed_d...
^
Any suggestions as to what's going wrong and what am I missing?
Thanks!
The simplest fix, which is what I think you want, would be to just add date to the GROUP BY clause:
SELECT date, shop_id, SUM(sum_item_cnt_day)
FROM processed_data
GROUP BY date, shop_id;
If you really don't want sums taken for each shop on each day, but rather for each shop over all days, then you will have to think of which of the many dates you want to display.
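The two readings of the question can be sketched side by side. The example below uses SQLite through Python only as a convenient stand-in (SQLite is more permissive than PostgreSQL about ungrouped columns, so it will not reproduce the error itself). The first five rows come from the question; the sixth row, dated 2014-12-30, is an assumed extra so that the per-shop total spans more than one day. For the per-shop version, MIN(date) and MAX(date) are one possible choice of which dates to display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_data ("
    "date TEXT, shop_id INTEGER, item_category_id INTEGER, sum_item_cnt_day REAL)"
)
conn.executemany(
    "INSERT INTO processed_data VALUES (?, ?, ?, ?)",
    [
        ("2014-12-29", 49, 3, 4),
        ("2014-12-29", 49, 6, 1),
        ("2014-12-29", 49, 7, 1),
        ("2014-12-29", 49, 12, 3),
        ("2014-12-29", 49, 16, 1),
        ("2014-12-30", 49, 3, 2),  # assumed extra row, not in the question
    ],
)

# One row per shop per day: every non-aggregated column is in GROUP BY.
per_shop_per_day = conn.execute(
    """
    SELECT date, shop_id, SUM(sum_item_cnt_day)
    FROM processed_data
    GROUP BY date, shop_id
    ORDER BY date, shop_id
    """
).fetchall()

# One row per shop over all days: the date must be aggregated too.
per_shop = conn.execute(
    """
    SELECT shop_id, MIN(date), MAX(date), SUM(sum_item_cnt_day)
    FROM processed_data
    GROUP BY shop_id
    """
).fetchall()

print(per_shop_per_day)
print(per_shop)
```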

LIKE operator in Postgresql

Is it possible to use the LIKE operator in a query to find values that reside in a column with a numeric datatype?
For example,
Table sample
ID | VALUE(numeric)
1 | 1.00
2 | 2.00
select * from sample where VALUE LIKE '1%'
Please clarify this for me.
If I understood you correctly, then the following could be a solution for you.
Consider this sample:
create table num12 (id int,VALUE numeric);
insert into num12 values (1,1.00),(2,2.00);
insert into num12 values (3,1.50),(4,1.90);
The table looks like this:
id value
-- -----
1 1.00
2 2.00
3 1.50
4 1.90
select * from num12 where value =1
will return only a single row:
id value
-- -----
1 1.00
If you want to select all values whose integer part is 1 (I guess this is what you're trying to do), use
select * from num12 where trunc(value) =1
result:
id value
-- -----
1 1.00
3 1.50
4 1.90
Is it possible using the LIKE operator to write a query to find values
that reside in a numeric datatype column?
Answer: Yes
You can use select * from num12 where value::text like '1%'
Note: for this sample it yields the same result as shown above, but it's not a good method. The pattern compares text, so it would also match values such as 10.00.
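The difference between matching on text and matching on the number only shows up once the data contains something like 10.50. The Python sketch below adds that value (an assumption, it is not in the sample table) and compares a text-prefix test, a truncation test, and a plain numeric range.

```python
from decimal import Decimal

# Sample values from the answer, plus an assumed 10.50 to expose the difference.
values = [Decimal("1.00"), Decimal("2.00"), Decimal("1.50"), Decimal("1.90"), Decimal("10.50")]

# Mimics value::text LIKE '1%': any text that starts with the character '1'.
like_1 = [v for v in values if str(v).startswith("1")]

# Mimics trunc(value) = 1: int() drops the fractional part (truncates toward zero).
trunc_1 = [v for v in values if int(v) == 1]

# A numeric range gives the same rows as the truncation test for these values.
range_1 = [v for v in values if Decimal(1) <= v < Decimal(2)]

print(like_1)
print(trunc_1)
print(range_1)
```

A range condition such as value >= 1 and value < 2 also leaves the column untouched, so it is generally friendlier to an index than either trunc(value) or a cast to text.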