How to get sum of each column from the same select query - oracle10g

I have a table as follows
Col1    Col2   Col3   Col4
--------------------------
100      400    400    300
200      600    400    700
800      600    500    900
300      100    700    500
--------------------------
Total   1700   2000   2400
As you can see, I want the total of each column (excluding the first column).
I am not sure whether these totals can be fetched with the same select query I am using to fetch the data.
If not, please suggest an alternative.

Use the UNION ALL operator to get this done, like:
SELECT col1, col2, col3, col4
FROM <table>
UNION ALL
SELECT NULL, sum(col2), sum(col3), sum(col4)
FROM <table>;
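As a footnote, since col1 is unique per row in the sample, Oracle's GROUP BY ROLLUP could produce the detail rows and the totals row in a single pass; treat this as a sketch rather than a tested drop-in:
SELECT col1, SUM(col2) AS col2, SUM(col3) AS col3, SUM(col4) AS col4
FROM <table>
GROUP BY ROLLUP (col1)
ORDER BY GROUPING(col1), col1;
The grand-total row comes back with col1 = NULL, and GROUPING(col1) = 1 identifies it in case col1 can itself be NULL.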

Why can't you achieve this with a simple select statement?
select sum(col2), sum(col3), sum(col4) from table;
Note that this returns only the totals row (1700, 2000, 2400 for the sample data), without the detail rows.
Hope this works!

Related

Taking N-samples from each group in PostgreSQL

I have a table with a column named id; the data looks like this:
id   value 1   value 2   value 3
1    244       550       1000
1    251       551       700
1    540       60        1200
...  ...       ...       ...
2    19        744       2000
2    10        903       100
2    44        231       600
2    120       910       1100
...  ...       ...       ...
I want to take 50 sample rows per id, but if fewer than 50 exist for a group, simply take the entire set of data points for that group.
For example, I would like a maximum of 50 data points randomly selected from id = 1, id = 2, etc.
I cannot find any previous questions similar to this, but I have tried to at least work through the solution logically, iterating and unioning queries by id, each limited to 50:
SELECT * FROM (SELECT * FROM schema.table AS tbl WHERE tbl.id = X LIMIT 50) UNION ALL;
But obviously this type of solution cannot work: UNION ALL requires spelling out the query for each id, one after the next, and I do not have a list of id values to use in place of X in tbl.id = X.
Is there a way to accomplish this by gathering that list of unique id values and unioning all the results, or is there a more optimal way to do this?
If you want to select a random sample for each id, then you need to randomize the rows somehow. Here is a way to do it:
select * from (
  select *, row_number() over (partition by id order by random()) as u
  from schema.table
) as a
where u <= 50;
Example (limiting to 3, and adding a row number within each id so you can see the randomness of the selection):
Setup
DROP TABLE IF EXISTS foo;
CREATE TABLE foo
(
id int,
value1 int,
idrow int
);
INSERT INTO foo
select 1 as id, (1000*random())::int as value1, generate_series(1, 100) as idrow
union all
select 2 as id, (1000*random())::int as value1, generate_series(1, 100) as idrow
union all
select 3 as id, (1000*random())::int as value1, generate_series(1, 100) as idrow;
Selection
select * from (
  select *, row_number() over (partition by id order by random()) as u
  from foo
) as a
where u <= 3;
Output:
id  value1  idrow  u
--------------------
1   542     6      1
1   24      86     2
1   155     74     3
2   505     95     1
2   100     46     2
2   422     33     3
3   966     88     1
3   747     89     2
3   664     19     3
In case you are looking to get 50 (or fewer) rows from each group of IDs, you can use windowing.
From the question: "I want to take 50 sample rows per id, but if fewer than 50 exist for a group, simply take the entire set of data points for that group."
Query:
with data as (
  select row_number() over (partition by id order by random()) rn, *
  from table_name
)
select * from data where rn <= 50 order by id;
Fiddle.
Your description of trying to get the UNION ALL without specifying all the branches ahead of time is aiming for a LATERAL join, and that is one way to solve the problem. But unless you have a table of all distinct ids, you will have to compute one on the fly. For example (using the same fiddle Pankaj used):
with uniq as (select distinct id from test)
select foo.* from uniq cross join lateral
  (select * from test where test.id = uniq.id order by random() limit 3) foo;
This could be either slower or faster than the window function method, depending on your system, your data, and your indexes. In my hands it was quite a bit faster, even with the need to dynamically compute the list of distinct ids.
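If this runs often, it is presumably worth having an index on id so each lateral branch can locate its group without scanning the whole table; a hypothetical setup (the index name is arbitrary), worth checking against the actual plan for your data:
CREATE INDEX test_id_idx ON test (id);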

TSQL Query Calculating number of days and number of occurrences

I'm having some issues writing a query that summarizes some stocking information.
I've tried a few CTEs (common table expressions), grouping, subqueries, and the LAG function to summarize the data. Oddly, I've been stuck on this problem for the last few days.
Here is example of the data that I'm dealing with.
--Optional create table
--Drop table testdata
--Create table testdata ( Part int, StockDate date, OutOfStock bit);
with inventorydata(Part, StockDate, OutOfStock) as
(
select 1000, '1/1/2019',1
union
select 1000,'1/2/2019',1
union
select 1000, '1/3/2019',1
union
select 1000, '1/4/2019',0
union
select 1000, '1/5/2019',1
union
select 1005, '1/1/2019',0
union
select 1005,'1/2/2019',1
union
select 1005, '1/3/2019',1
union
select 1005, '1/4/2019',1
union
select 1005, '1/5/2019',0
)
--Insert into testdata ( Part,StockDate,OutOfStock)
Select Part,StockDate,OutOfStock from inventorydata
--Select * from testdata
Output
Part StockDate OutOfStock
----------- --------- -----------
1000 1/1/2019 1
1000 1/2/2019 1
1000 1/3/2019 1
1000 1/4/2019 0
1000 1/5/2019 1
1005 1/1/2019 0
1005 1/2/2019 1
1005 1/3/2019 1
1005 1/4/2019 1
1005 1/5/2019 0
I'm trying to get the desired output below.
Part StockDate BackInStock Occurance
----------- --------- ----------- -----------
1000 1/1/2019 1/3/2019 1
1000 1/5/2019 1/5/2019 1
1005 1/2/2019 1/4/2019 1
Any help is much appreciated.
Thank you.
Without more detail, it's hard to be sure of exactly what you are going for. Your desired output is missing some detail that would be helpful in giving a better answer. With that said, hopefully this will help you out.
My assumptions:
You are looking to find the periods where parts are out of stock, and note the date they first were out of stock and the date they came back into stock
I'm interpreting "back in stock" date to be the date that an item first had a record noting it was not out of stock. For example, part 1000 went out of stock on 1/1/2019 and would be back in stock starting 1/4/2019, per the data, rather than 1/3/2019 as in the example
If the item is not back in stock, then we don't want a "back in stock" date
I'm going to put your example data into a temp table to make life a little easier for demonstrating:
with inventorydata(Part, StockDate, OutOfStock) as
(
select 1000, '1/1/2019',1
union
select 1000,'1/2/2019',1
union
select 1000, '1/3/2019',1
union
select 1000, '1/4/2019',0
union
select 1000, '1/5/2019',1
union
select 1005, '1/1/2019',0
union
select 1005,'1/2/2019',1
union
select 1005, '1/3/2019',1
union
select 1005, '1/4/2019',1
union
select 1005, '1/5/2019',0
)
Select Part,StockDate,OutOfStock
INTO #Test
from inventorydata
Consider the following:
SELECT
Part,
StockDate,
BackInStockDate
FROM
(
SELECT
Part,
StockDate,
OutOfStock,
LEAD(StockDate, 1) OVER (PARTITION BY Part ORDER BY StockDate) AS BackInStockDate
FROM
(
SELECT
Part,
StockDate,
OutOfStock,
LAG(OutOfStock, 1) OVER (PARTITION BY Part ORDER BY StockDate) AS PrevOutOfStock
FROM #Test
) AS InnerData
WHERE
OutOfStock <> PrevOutOfStock
OR PrevOutOfStock IS NULL
) AS OuterData
WHERE
OutOfStock = 1
The InnerData query pulls each record along with the OutOfStock value of the previous row. We use that to find only the rows that represent a date where the stock status changed from in-stock to out-of-stock or vice versa. This is determined by OutOfStock <> PrevOutOfStock, with PrevOutOfStock IS NULL covering each part's first row.
Once we have just the rows that represent changes, we look at the next row to get the date on which the state changed again from the state represented in the current row.
Finally, we filter out rows where OutOfStock = 0, as those represent in-stock periods, which we are ignoring. This gives us the following result:
Part StockDate BackInStockDate
------- ----------- ----------------
1000 1/1/2019 1/4/2019
1000 1/5/2019 NULL
1005 1/2/2019 1/5/2019
You can modify the structure to get a different BackInStockDate if this isn't the value you are wanting, and add some counting to get whatever Occurance is supposed to be; one guess at that is sketched below.
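For instance, if Occurance is meant to number each out-of-stock spell per part in date order (an assumption; your sample shows 1 on every row, so the intent may differ), ROW_NUMBER can be layered over the same query:
SELECT
    Part,
    StockDate,
    BackInStockDate,
    ROW_NUMBER() OVER (PARTITION BY Part ORDER BY StockDate) AS Occurance
FROM
    (
        SELECT
            Part,
            StockDate,
            OutOfStock,
            LEAD(StockDate, 1) OVER (PARTITION BY Part ORDER BY StockDate) AS BackInStockDate
        FROM
            (
                SELECT
                    Part,
                    StockDate,
                    OutOfStock,
                    LAG(OutOfStock, 1) OVER (PARTITION BY Part ORDER BY StockDate) AS PrevOutOfStock
                FROM #Test
            ) AS InnerData
        WHERE
            OutOfStock <> PrevOutOfStock
            OR PrevOutOfStock IS NULL
    ) AS OuterData
WHERE
    OutOfStock = 1
Note this would number part 1000's two spells 1 and 2 rather than 1 and 1.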

Replace infinity with nulls throughout entire table KDB

Example table:
table:([]col1:20 40 30 0w;col2:4?4;col3: 100 200 0w 300)
My solution:
{.[table;(where 0w=table[x];x);:;0n]}'[exec c from meta table where t="f"]
There must be a way I am not seeing, I'm sure. This just returns a list of tables, one per amended column, which I don't want. I just want the original table returned with the infinities replaced by nulls.
Thanks in advance!
It would be good to flesh out your question a bit more. Are you always expecting it to be float columns? Will the table have many columns? Will there be string/sym columns mixed in that might complicate things?
If your table has a small number of columns, you could just do an update:
q)show t
col1 col2 col3
--------------
20   1    100
40   2    200
30   2    0w
0w   1    300
q)inftonull:{(x where x=0w):0n;x}   / amend infinite entries to float null
q)update inftonull col1, inftonull col3 from t
col1 col2 col3
--------------
20   1    100
40   2    200
30   2
     1    300
If you think the column names might change, or you have a very large number of columns, you could try a functional update (where you can pass the column names in as parameters):
q){![t;();0b;x!inftonull,/:x,:()]}`col1`col3
col1 col2 col3
--------------
20   1    100
40   2    200
30   2
     1    300
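To avoid hard-coding the column names at all, the meta-based detection from your own attempt can feed the same functional update; a sketch, assuming only float ("f") columns can hold 0w:
q)fc:exec c from meta t where t="f"
q)![t;();0b;fc!inftonull,/:fc]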
If your table is comprised of only numeric data, something like the following might work; it tries to account for the fact that the numeric columns have different types:
q)flip{(x where x=.Q.t[type x]$0w):x 0N;x}each flip t
col1 col2 col3
--------------
20   1    100
40   2    200
30   2
     1    300
If your data is going to contain string/sym columns, the last example won't work.

Default value in select query for null values in postgres

I have a table with sales id, product code, and amount. In some places the product code is null; I want to show Missing instead of null. Below is my table.
salesId  prodTypeCode  amount
1        123           150
2        123           200
3        234           3000
4        234           400
5        234           500
6        123           200
7        111           40
8        111           500
9                      1000
10       123           100
I want to display the total amount for every prodTypeCode, with Missing displayed when prodTypeCode is null.
select (CASE WHEN prodTypeCode IS NULL
             THEN 'Missing'
             ELSE prodTypeCode
        END) as ProductCode,
       SUM(amount)
from sales
group by prodTypeCode
The above query gives an error. Please suggest how to overcome this issue. I have created a SQLFiddle.
The problem is a mismatch of datatypes; 'Missing' is text, but the product type code is numeric.
Cast the product type code to text so the two values are compatible:
select (CASE WHEN prodTypeCode IS NULL
             THEN 'Missing'
             ELSE prodTypeCode::varchar(40)
        END) as ProductCode,
       SUM(amount)
from sales
group by prodTypeCode
See SQLFiddle.
Or, simpler:
select coalesce(prodTypeCode::varchar(40), 'Missing') ProductCode, SUM(amount)
from sales
group by prodTypeCode
See SQLFiddle.
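With the sample data above, either form should produce a result like this (row order is not guaranteed without an ORDER BY):
productcode | sum
------------+------
123         |  650
234         | 3900
111         |  540
Missing     | 1000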
Perhaps you have a type mismatch:
select coalesce(cast(prodTypeCode as varchar(255)), 'Missing') as ProductCode,
SUM(amount)
From sales s
group by prodTypeCode;
I prefer coalesce() to the case, simply because it is shorter.
I tried both answers and in my case neither worked. I hope this snippet can help if they do not work for someone else either (note that it assumes prodTypeCode is a text column, since it also maps empty strings to Missing):
SELECT
COALESCE(NULLIF(prodTypeCode,''), 'Missing') AS ProductCode,
SUM(amount)
From sales s
group by prodTypeCode;

kdb+: group by and sum over multiple columns

Consider the following data:
table:
time colA colB colC
-----------------------------------
11:30:04.194 31 250 a
11:30:04.441 31 280 a
11:30:14.761 31.6 100 a
11:30:21.324 34 100 a
11:30:38.991 32 100 b
11:31:20.968 32 100 b
11:31:56.922 32.2 1000 b
11:31:57.035 32.6 5000 c
11:32:05.810 33 100 c
11:32:05.810 33 100 a
11:32:14.461 32 300 b
Now, how can I sum colB over consecutive rows with the same colC, without losing the time order?
So the output would be:
first time   avgA  sumB colC
----------------------------
11:30:04.194 31.9  730  a
11:30:38.991 32.07 1200 b
11:31:57.035 32.8  5100 c
11:32:05.810 33    100  a
11:32:14.461 32    300  b
What I have so far:
select by time from (select first time, avg colA, sum colB by colC, time from table)
But the output is not grouped by runs of colC. What should the query look like?
How about this?
get select first time, avg colA, sum colB, first colC by sums colC<>prev colC from table
A slightly different way to achieve this, using differ:
value select first time, avg colA, sum colB, first colC by g:(sums differ colC) from table
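Both answers rely on the same trick: differ returns 1b wherever a value changes from the previous row (sums colC<>prev colC is equivalent), so the running sum assigns one group number per consecutive run of equal colC values. For the sample column:
q)sums differ `a`a`a`a`b`b`b`c`c`a`b
1 1 1 1 2 2 2 3 3 4 5i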