I need a query that shows which numbers are used, and how many times each is used, across several columns of integer type.
For this purpose I made a small example table, with code suitable for pasting into pgAdmin's SQL editor:
DROP TABLE IF EXISTS mynums;
CREATE TABLE mynums
(rowindex serial primary key, mydate timestamp, num1 integer, num2 integer, num3 integer);
INSERT INTO mynums (rowindex, mydate, num1, num2, num3)
VALUES (1, '2015-03-09 07:12:45', 1, 2, 3),
(2, '2015-03-09 07:17:12', 4, 5, 2),
(3, '2015-03-09 07:22:43', 1, 2, 4),
(4, '2015-03-09 07:25:15', 3, 4, 5),
(5, '2015-03-09 07:41:46', 2, 5, 4),
(6, '2015-03-09 07:42:05', 1, 4, 5),
(7, '2015-03-09 07:45:16', 4, 1, 2),
(9, '2015-03-09 07:48:38', 5, 2, 3),
(10, '2015-03-09 08:15:44', 2, 3, 4);
Please help me build a query that returns the numbers used in columns num1, num2 and num3 together with how many times each is used, ordered by the number of uses.
The result should be:
number times
2 7
4 7
5 5
1 4
3 4
You need to turn your columns into rows in order to be able to aggregate them:
select number, count(*) as times
from (
  select num1 as number
  from mynums
  union all
  select num2
  from mynums
  union all
  select num3
  from mynums
) t
group by number
order by times desc, number;
In general, having columns like num1, num2, num3 is a sign of a questionable database design. What happens if you need to add more numbers? It's better to create a one-to-many relationship and store the numbers associated with a rowindex in a separate table.
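As a rough check of the columns-to-rows idea, here is a minimal sketch that runs the same UNION ALL query against an in-memory SQLite database from Python. SQLite stands in for PostgreSQL purely for illustration, and the mydate column is left out because it plays no part in the count; both are assumptions of this sketch, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mynums (rowindex INTEGER PRIMARY KEY, "
    "num1 INTEGER, num2 INTEGER, num3 INTEGER)"
)
rows = [
    (1, 1, 2, 3), (2, 4, 5, 2), (3, 1, 2, 4),
    (4, 3, 4, 5), (5, 2, 5, 4), (6, 1, 4, 5),
    (7, 4, 1, 2), (9, 5, 2, 3), (10, 2, 3, 4),
]
conn.executemany("INSERT INTO mynums VALUES (?, ?, ?, ?)", rows)

# Stack the three columns into one, then count each value.
query = """
SELECT number, COUNT(*) AS times
FROM (
    SELECT num1 AS number FROM mynums
    UNION ALL
    SELECT num2 FROM mynums
    UNION ALL
    SELECT num3 FROM mynums
) AS t
GROUP BY number
ORDER BY times DESC, number
"""
result = conn.execute(query).fetchall()
print(result)  # [(2, 7), (4, 7), (5, 5), (1, 4), (3, 4)]
```

UNION ALL (rather than UNION) matters here: UNION would remove duplicates and every count would collapse to 1.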
This would work:
select number, count(*) as times
FROM (
select rowindex, mydate, num1 as number FROM mynums
UNION ALL
select rowindex, mydate, num2 FROM mynums
UNION ALL
select rowindex, mydate, num3 FROM mynums
) as src
group by number
order by count(*) desc, number
http://sqlfiddle.com/#!15/cb1a7/3
I have tried many approaches but could not find the answer. My problem is:
There is a table ORG_DATA_AS_VARCHAR with a column FLOATNMBRS (varchar):
FLOATNMBRS
--------------------------
0
0
*0,25 /*Yeah, there is a star in data ... bad data quality ....*/
*0,31
0
Now, my aim is to convert these strings to float and write the new float values to an existing table CONVERTED_DATA:
FLOATNMBRS (float)
--------------------------
0
0
0.25
0.31
0
...
What I have tried:
UPDATE CONVERTED_DATA
SET
FLOATNMBRS = b.newValue
FROM
(
Select convert (float, replace(replace(FLOATNMBRS, '*', ''),',','.')) as newValue from
ORG_DATA_AS_VARCHAR
) b
or
replacing and converting the values, storing them in a #Temp table, and updating CONVERTED_DATA with the values from #Temp.
But every time I ended up with:
FLOATNMBRS (float)
--------------------------
0
0
0
0
0
All values were updated as 0.
When I tried:
Select convert (float, replace(replace(FLOATNMBRS, '*', ''),',','.')) as newValue from
ORG_DATA_AS_VARCHAR
the result is correct. Even when I copy the values to #Temp, all values are correct.
Does anyone know what I am doing wrong?
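You are probably not matching the records in the two tables. It is not clear from your example how the rows are identified (what the key is). Without a condition relating the derived table b to CONVERTED_DATA, every target row is paired with every row of b and ends up with one arbitrary value from it, which here is 0.
Assuming you have the same ID in the ORG_DATA_AS_VARCHAR and CONVERTED_DATA tables, this works: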
create table #ORG_DATA_AS_VARCHAR (ID int, floatnmbrs_varchar varchar(128))
create table #CONVERTED_DATA (ID int, floatnmbrs float)
go
insert into #ORG_DATA_AS_VARCHAR (ID, floatnmbrs_varchar)
values (1, '0'), (2, '0'), (3, '*0,25'), (4, '*0,31'), (5, '0')
insert into #CONVERTED_DATA (ID, floatnmbrs)
values (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)
go
update #CONVERTED_DATA
set floatnmbrs = x.converted
from (
select ID, converted = convert(float, replace(replace(floatnmbrs_varchar, '*', ''), ',', '.'))
from #ORG_DATA_AS_VARCHAR
) as x
where x.ID = #CONVERTED_DATA.ID
select * from #CONVERTED_DATA
go
drop table #ORG_DATA_AS_VARCHAR
drop table #CONVERTED_DATA
go
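To make the difference between an unkeyed and a keyed update concrete, here is a small Python sketch. The dictionaries stand in for the two tables and the ID column is the same assumption the answer makes; the "unkeyed" branch only imitates the effect of an UPDATE whose source is not correlated with the target, it is not a model of how SQL Server picks the value.

```python
def to_float(raw: str) -> float:
    """Strip the stray '*' and turn the decimal comma into a dot."""
    return float(raw.replace("*", "").replace(",", "."))

# Hypothetical stand-ins for ORG_DATA_AS_VARCHAR and CONVERTED_DATA, keyed by ID.
org_data = {1: "0", 2: "0", 3: "*0,25", 4: "*0,31", 5: "0"}
converted = {row_id: 0.0 for row_id in org_data}

# Unkeyed update: nothing ties a source row to a target row, so every target
# row receives the same value (here simply the first source value).
some_value = to_float(next(iter(org_data.values())))
unkeyed = {row_id: some_value for row_id in converted}

# Keyed update: each target row takes the value of the source row with its ID.
keyed = {row_id: to_float(org_data[row_id]) for row_id in converted}

print(unkeyed)  # every value is 0.0
print(keyed)    # rows 3 and 4 carry 0.25 and 0.31
```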
I would like to collapse a list of consecutive and non-consecutive numbers into ranges, using commas and hyphens where appropriate.
Using STUFF & XML PATH I was able to accomplish some of what I want by getting something like 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 15, 19, 20, 21, 22, 24.
WITH CTE AS (
SELECT DISTINCT t1.ORDERNo, t1.Part, t2.LineNum
FROM [DBName].[DBA].Table1 t1
JOIN Table2 t2 ON t2.Part = t1.Part
WHERE t1.ORDERNo = 'AB12345')
SELECT c1.ORDERNo, c1.Part, STUFF((SELECT ', ' + CAST(LineNum AS VARCHAR(5))
FROM CTE c2
WHERE c2.ORDERNo= c1.ORDERNo
FOR XML PATH('')), 1, 2, '') AS [LineNums]
FROM CTE c1
GROUP BY c1.ORDERNo, c1.Part
Here is some sample output:
ORDERNo Part LineNums
ON5650 PT01-0181 5, 6, 7, 8, 12
ON5652 PT01-0181 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 15, 19, 20, 21, 22, 24
ON5654 PT01-0181 1, 4
ON5656 PT01-0181 1, 2, 4
ON5730 PT01-0181 1, 2
ON5253 PT16-3934 1, 2, 3, 4, 5
ON1723 PT02-0585 1, 2, 3, 6, 8, 9, 10
Would like to have:
OrderNo Part LineNums
ON5650 PT01-0181 5-8, 12
ON5652 PT01-0181 1-10, 13, 15, 19-22, 24
ON5654 PT01-0181 1, 4
ON5656 PT01-0181 1-2, 4
ON5730 PT01-0181 1-2
ON5253 PT16-3934 1-5
ON1723 PT02-0585 1-3, 6, 8-10
This is a classic gaps-and-islands problem.
(a good read on the subject is Itzik Ben-Gan's Gaps and islands from SQL Server MVP Deep Dives)
The idea is that you first need to identify the groups of consecutive numbers. Once you've done that, the rest is easy.
First, create and populate sample table (Please save us this step in your future questions):
DECLARE @T AS TABLE
(
N int
);
INSERT INTO @T VALUES
(1), (2), (3), (4),
(6),
(8),
(10), (11),
(13), (14), (15),
(17),
(19), (20), (21),
(25);
Then, use a common table expression to identify the groups.
With Grouped AS
(
SELECT N,
N - ROW_NUMBER() OVER(ORDER BY N) As Grp
FROM @T
)
The result of this CTE is:
N Grp
1 0
2 0
3 0
4 0
6 1
8 2
10 3
11 3
13 4
14 4
15 4
17 5
19 6
20 6
21 6
25 9
As you can see, while the numbers are consecutive, the grp value stays the same.
When a row has a number that isn't consecutive with the previous number, the grp value changes.
Then you select from that CTE, using a CASE expression to select either a single number (if it's the only one in its group) or the start and end of the group, separated by a dash:
SELECT STUFF(
(
SELECT ', ' +
CASE WHEN MIN(N) = MAX(N) THEN CAST(MIN(N) as varchar(11))
ELSE CAST(MIN(N) as varchar(11)) +'-' + CAST(MAX(N) as varchar(11))
END
FROM Grouped
GROUP BY grp
ORDER BY MIN(N) -- without this the order of the groups is not guaranteed
FOR XML PATH('')
), 1, 2, '') As GapsAndIslands
The result:
GapsAndIslands
1-4, 6, 8, 10-11, 13-15, 17, 19-21, 25
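The "value minus row number" trick is independent of SQL. As an illustration only, here is a minimal Python sketch of the same idea; the function name collapse_ranges is made up for this example.

```python
from itertools import groupby

def collapse_ranges(numbers):
    """Collapse integers into 'a-b' ranges using the N - ROW_NUMBER() key."""
    ordered = sorted(set(numbers))
    parts = []
    # enumerate(..., start=1) plays the role of ROW_NUMBER() OVER (ORDER BY N);
    # value - position stays constant while the values are consecutive.
    for _, group in groupby(enumerate(ordered, start=1),
                            key=lambda pair: pair[1] - pair[0]):
        run = [n for _, n in group]
        if run[0] == run[-1]:
            parts.append(str(run[0]))
        else:
            parts.append(f"{run[0]}-{run[-1]}")
    return ", ".join(parts)

sample = [1, 2, 3, 4, 6, 8, 10, 11, 13, 14, 15, 17, 19, 20, 21, 25]
print(collapse_ranges(sample))  # 1-4, 6, 8, 10-11, 13-15, 17, 19-21, 25
```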
For fun I put together another way using window aggregates (e.g. SUM() OVER ...). I also use some newer T-SQL functionality such as CONCAT (2012+) and STRING_AGG (2017+). This uses Zohar's sample data.
DECLARE @T AS TABLE(N INT PRIMARY KEY CLUSTERED);
INSERT INTO @T VALUES (1),(2),(3),(4),(6),(8),(10),(11),(13),(14),(15),(17),(19),(20),(21),(25);
WITH
a AS (
SELECT t.N,isNewGroup = SIGN(t.N-LAG(t.N,1,t.N-1) OVER (ORDER BY t.N)-1)
FROM @T AS t),
b AS (
SELECT a.N, GroupNbr = SUM(a.isNewGroup) OVER (ORDER BY a.N)
FROM a),
c AS (
SELECT b.GroupNbr,
txt = CONCAT(MIN(b.N), REPLICATE(CONCAT('-',MAX(b.N)), SIGN(MAX(b.N)-MIN(b.N))))
FROM b
GROUP BY b.GroupNbr)
SELECT STRING_AGG(c.txt,', ') WITHIN GROUP (ORDER BY c.GroupNbr) AS Islands
FROM c;
Returns:
Islands
1-4, 6, 8, 10-11, 13-15, 17, 19-21, 25
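The three CTEs map onto three small steps: flag the rows that start a new group (LAG), turn the flags into a group number (running SUM), then take the minimum and maximum per group. A hedged Python sketch of that pipeline, with made-up names, could look like this:

```python
from itertools import accumulate

def islands_by_running_sum(numbers):
    """Flag group starts, number the groups with a running sum, then format."""
    ordered = sorted(set(numbers))
    if not ordered:
        return ""
    # LAG(N, 1, N - 1): the first row compares against N - 1, so it is never
    # flagged as the start of a new group.
    previous = [ordered[0] - 1] + ordered[:-1]
    is_new_group = [1 if n - p > 1 else 0 for n, p in zip(ordered, previous)]
    # SUM(isNewGroup) OVER (ORDER BY N): running total of the flags.
    group_nbr = list(accumulate(is_new_group))
    # MIN(N) and MAX(N) per group; dicts keep insertion order.
    bounds = {}
    for n, g in zip(ordered, group_nbr):
        low, _ = bounds.get(g, (n, n))
        bounds[g] = (low, n)
    return ", ".join(str(low) if low == high else f"{low}-{high}"
                     for low, high in bounds.values())

data = [1, 2, 3, 4, 6, 8, 10, 11, 13, 14, 15, 17, 19, 20, 21, 25]
print(islands_by_running_sum(data))  # 1-4, 6, 8, 10-11, 13-15, 17, 19-21, 25
```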
And here is an approach using a recursive CTE.
DECLARE @T AS TABLE(N INT PRIMARY KEY CLUSTERED);
INSERT INTO @T VALUES (1),(2),(3),(4),(6),(8),(10),(11),(13),(14),(15),(17),(19),(20),(21),(25);
WITH Numbered AS
(
SELECT N, ROW_NUMBER() OVER(ORDER BY N) AS RowIndex FROM @T
)
,recCTE AS
(
SELECT N
,RowIndex
,CAST(N AS VARCHAR(MAX)) AS OutputString
,(SELECT MAX(n2.RowIndex) FROM Numbered n2) AS MaxRowIndex
FROM Numbered WHERE RowIndex=1
UNION ALL
SELECT n.N
,n.RowIndex
,CASE WHEN A.TheEnd =1 THEN CONCAT(r.OutputString,CASE WHEN A.IsIsland=1 AND A.IsWithin=1 THEN '' WHEN A.IsIsland=1 THEN '-' WHEN A.IsWithin=1 THEN CONCAT(r.N,',') ELSE ',' END, n.N)
WHEN A.IsIsland=1 AND A.IsWithin=0 THEN CONCAT(r.OutputString,'-')
WHEN A.IsIsland=1 AND A.IsWithin=1 THEN r.OutputString
WHEN A.IsIsland=0 AND A.IsWithin=1 THEN CONCAT(r.OutputString,r.N,',',n.N)
ELSE CONCAT(r.OutputString,',',n.N)
END
,r.MaxRowIndex
FROM Numbered n
INNER JOIN recCTE r ON n.RowIndex=r.RowIndex+1
CROSS APPLY(SELECT CASE WHEN n.N-r.N=1 THEN 1 ELSE 0 END AS IsIsland
,CASE WHEN RIGHT(r.OutputString,1)='-' THEN 1 ELSE 0 END AS IsWithin
,CASE WHEN n.RowIndex=r.MaxRowIndex THEN 1 ELSE 0 END AS TheEnd) A
)
SELECT TOP 1 OutputString FROM recCTE ORDER BY RowIndex DESC;
The idea in short:
First we create a numbered set.
The recursive CTE will use the row's index to pick the next row, thus iterating through the set row-by-row
The APPLY determines three BIT values:
Is the distance to the previous value 1, then we are on the island, otherwise not
Is the last character of the growing output string a hyphen, then we are waiting for the end of an island, otherwise not.
...and if we've reached the end
The CASE deals with this four-field-matrix:
First we deal with the end to avoid a trailing hyphen at the end
Reaching an island we add a hyphen
Staying on the island we just continue
Reaching the end of an island we add the last number, a comma and start a new island
any other case will just add a comma and start a new island.
Hint: You can read island as group or section, while the commas mark the gaps.
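The row-by-row logic described above can be written as a plain loop. This Python sketch is only an illustration of the idea: instead of a TheEnd flag it closes a still-open island after the loop, which is a simplification chosen for this example and not the answer's exact CASE matrix.

```python
def islands_row_by_row(numbers):
    """Build the output string one row at a time, like the recursive CTE."""
    ordered = sorted(set(numbers))
    if not ordered:
        return ""
    output = str(ordered[0])
    prev = ordered[0]
    for n in ordered[1:]:
        on_island = (n - prev == 1)       # distance to the previous value is 1
        within = output.endswith("-")     # waiting for the end of an island
        if on_island and not within:
            output += "-"                 # reaching an island
        elif not on_island and within:
            output += f"{prev},{n}"       # island ended: close it, start anew
        elif not on_island:
            output += f",{n}"             # isolated value
        # on_island and within: staying on the island, nothing to append
        prev = n
    if output.endswith("-"):
        output += str(prev)               # close an island that runs to the end
    return output

values = [1, 2, 3, 4, 6, 8, 10, 11, 13, 14, 15, 17, 19, 20, 21, 25]
print(islands_row_by_row(values))  # 1-4,6,8,10-11,13-15,17,19-21,25
```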
Combining what I already had and using Zohar Peled's code I was finally able to figure out a solution:
WITH cteLineNums AS (
SELECT TOP 100 PERCENT t1.OrderNo, t1.Part, t2.LineNum
, (t2.LineNum - ROW_NUMBER() OVER(PARTITION BY t1.OrderNo, t1.Part ORDER BY t2.LineNum)) AS RowSeq
FROM [DBName].[DBA].Table1 t1
JOIN Table2 t2 ON t2.Part = t1.Part
WHERE t1.OrderNo = 'AB12345'
GROUP BY t1.OrderNo, t1.Part, t2.LineNum
ORDER BY t1.OrderNo, t1.Part, t2.LineNum)
SELECT OrderNo, Part
, STUFF((SELECT ', ' +
CASE WHEN MIN(LineNum) = MAX(LineNum) THEN CAST(MIN(LineNum) AS VARCHAR(3))
WHEN MIN(LineNum) = (MAX(LineNum)-1) THEN CAST(MIN(LineNum) AS VARCHAR(3)) + ', ' + CAST(MAX(LineNum) AS VARCHAR(3))
ELSE CAST(MIN(LineNum) AS VARCHAR(3)) + '-' + CAST(MAX(LineNum) AS VARCHAR(3))
END
FROM cteLineNums c1
WHERE c1.OrderNo = c2.OrderNo
AND c1.Part = c2.Part
GROUP BY OrderNo, Part, RowSeq -- RowSeq identifies each run of consecutive line numbers
ORDER BY MIN(LineNum)
FOR XML PATH('')), 1, 2, '') AS [LineNums]
FROM cteLineNums c2
GROUP BY OrderNo, Part
I used ROW_NUMBER() OVER (PARTITION BY ...) because I return multiple records with different order numbers and part numbers. For the same reason I still had to do the self join in the second part in order to show the correct LineNums for each record.
The second WHEN in the CASE expression is there because the code would otherwise display something like 2, 5, 8-9, 14 when it should be 2, 5, 8, 9, 14.
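The two extra requirements here, one result per (OrderNo, Part) and runs of only two numbers staying comma-separated, can be sketched in Python as follows. The function name and the tuple layout are assumptions made for this illustration; the sample rows are taken from the output shown in the question plus the 2, 5, 8, 9, 14 case mentioned above.

```python
from collections import defaultdict
from itertools import groupby

def format_line_nums(rows):
    """rows: (order_no, part, line_num) tuples -> {(order_no, part): text}."""
    by_key = defaultdict(set)            # the set removes duplicate line numbers
    for order_no, part, line_num in rows:
        by_key[(order_no, part)].add(line_num)

    result = {}
    for key, nums in by_key.items():
        ordered = sorted(nums)
        parts = []
        # value - position is constant within a run of consecutive numbers,
        # restarted per (order_no, part) like PARTITION BY.
        for _, grp in groupby(enumerate(ordered, start=1),
                              key=lambda pair: pair[1] - pair[0]):
            run = [n for _, n in grp]
            if len(run) <= 2:
                parts.extend(str(n) for n in run)    # 8, 9 rather than 8-9
            else:
                parts.append(f"{run[0]}-{run[-1]}")
        result[key] = ", ".join(parts)
    return result

rows = (
    [("ON5650", "PT01-0181", n) for n in (5, 6, 7, 8, 12)]
    + [("ON5656", "PT01-0181", n) for n in (1, 2, 4)]
    + [("EXAMPLE", "PART-X", n) for n in (2, 5, 8, 9, 14)]  # hypothetical key
)
formatted = format_line_nums(rows)
print(formatted[("ON5650", "PT01-0181")])  # 5-8, 12
```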
So my tables are:
user_msgs: http://sqlfiddle.com/#!9/7d6a9
token_msgs: http://sqlfiddle.com/#!9/3ac0f
There are only these 4 users as listed. When a user sends a message to another user, the query checks if there is a communication between those 2 users already started by checking the token_msgs table's from_id and to_id and if no token exists, create token and use that in the user_msgs table. So the token is a unique field in these 2 tables.
Now, I want to list the users with whom user1 has had a conversation. So if from_id or to_id is 1, that conversation should be listed.
The user_msgs table contains multiple rows per conversation between the same users.
I think I need to use group_concat but I am not sure. I am trying to build a query that does this and shows the latest message of each conversation at the top, hence ORDER BY time DESC:
SELECT * FROM (SELECT * FROM user_msgs ORDER BY time DESC) as temp_messages GROUP BY token
Please help in building the query.
Thanks.
CREATE TABLE `token_msgs` (
`id` int(11) NOT NULL,
`from_id` int(100) NOT NULL,
`to_id` int(100) NOT NULL,
`token` varchar(50) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `token_msgs`
--
INSERT INTO `token_msgs` (`id`, `from_id`, `to_id`, `token`) VALUES
(1, 1, 2, '1omcda84om2'),
(2, 1, 3, '1omd0666om3'),
(3, 4, 1, '4om6713bom1'),
(4, 3, 4, '3om0e1abom4');
---
CREATE TABLE `user_msgs` (
`id` int(11) NOT NULL,
`token` varchar(50) NOT NULL,
`from_id` int(50) NOT NULL,
`to_id` int(50) NOT NULL,
`message` text NOT NULL,
`time` datetime NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `user_msgs`
--
INSERT INTO `user_msgs` (`id`, `token`, `from_id`, `to_id`, `message`, `time`) VALUES
(1, '1omcda84om2', 1, 2, '1 => 2\r\nCan I have your picture so I can show Santa what I want for Christmas?', '2016-08-14 22:50:34'),
(2, '1omcda84om2', 2, 1, 'Makeup tip: You\'re not in the circus.\r\n2=>1', '2016-08-14 22:51:26'),
(3, '1omd0666om3', 1, 3, 'Behind every fat woman there is a beautiful woman. No seriously, your in the way. 1=>3', '2016-08-14 22:52:08'),
(4, '1omd0666om3', 3, 1, 'Me: Siri, why am I alone? Siri: *opens front facing camera*', '2016-08-14 22:53:24'),
(5, '1omcda84om2', 1, 2, 'I know milk does a body good, but damn girl, how much have you been drinking? 1 => 2', '2016-08-14 22:54:36'),
(6, '4om6713bom1', 4, 1, 'Hi, Im interested in your profile. Please send your contact number and I will call you.', '2016-08-15 00:18:11'),
(7, '3om0e1abom4', 3, 4, 'Girl you\'re like a car accident, cause I just can\'t look away. 3=>4', '2016-08-15 00:42:57'),
(8, '3om0e1abom4', 3, 4, 'Hola!! \r\n3=>4', '2016-08-15 00:43:34'),
(9, '1omd0666om3', 3, 1, 'Sometext from 3=>1', '2016-08-15 13:53:54'),
(10, '3om0e1abom4', 3, 4, 'More from 3->4', '2016-08-15 13:54:46');
Let's try this (on fiddle):
SELECT *
FROM (SELECT * FROM user_msgs
WHERE from_id = 1 OR to_id = 1
ORDER BY id DESC
) main
GROUP BY from_id + to_id
ORDER BY id DESC
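One thing to mention is GROUP BY from_id + to_id: because the WHERE clause fixes one participant to user 1, the sum is unique for each conversation partner, so a message from 1 to 3 falls into the same group as one from 3 to 1. (Without that filter the sum would not be unique, since 1 + 4 equals 2 + 3.) There is no need for the extra token table; it only makes the data harder to maintain.
UPDATE:
Because GROUP BY with non-aggregated columns does not reliably return the latest row of each group in MySQL, I've created a new approach to this problem: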
SELECT
a.*
FROM user_msgs a
LEFT JOIN user_msgs b
ON ((b.`from_id` = a.`from_id` AND b.`to_id` = a.`to_id`)
OR (b.`from_id` = a.`to_id` AND b.`to_id` = a.`from_id`))
AND a.`id` < b.`id`
WHERE (a.from_id = 1 OR a.to_id = 1)
AND b.`id` IS NULL
ORDER BY a.id DESC
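The self-join keeps a message only if no later message exists between the same two users. The same "latest row per unordered pair" idea is sketched below in Python, using the (id, from_id, to_id) values from the question's user_msgs dump; the function name is made up, and an unordered pair is used as the key instead of the sum so that it stays unique without the user filter.

```python
# (id, from_id, to_id) taken from the user_msgs sample data.
messages = [
    (1, 1, 2), (2, 2, 1), (3, 1, 3), (4, 3, 1), (5, 1, 2),
    (6, 4, 1), (7, 3, 4), (8, 3, 4), (9, 3, 1), (10, 3, 4),
]

def latest_per_conversation(messages, user_id):
    """Return the newest message of every conversation involving user_id."""
    latest = {}
    for msg_id, from_id, to_id in messages:
        if user_id not in (from_id, to_id):
            continue
        pair = frozenset((from_id, to_id))   # 1 -> 3 and 3 -> 1 share one key
        if pair not in latest or msg_id > latest[pair][0]:
            latest[pair] = (msg_id, from_id, to_id)
    # Newest conversation first, like ORDER BY a.id DESC.
    return sorted(latest.values(), reverse=True)

print(latest_per_conversation(messages, 1))  # [(9, 3, 1), (6, 4, 1), (5, 1, 2)]
```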
I suppose it is not easy to query a table for data that doesn't exist, but maybe there is some trick to find the holes in an integer column (rowindex).
Here is a small table illustrating the situation:
DROP TABLE IF EXISTS examtable1;
CREATE TABLE examtable1
(rowindex integer primary key, mydate timestamp, num1 integer);
INSERT INTO examtable1 (rowindex, mydate, num1)
VALUES (1, '2015-03-09 07:12:45', 1),
(3, '2015-03-09 07:17:12', 4),
(5, '2015-03-09 07:22:43', 1),
(6, '2015-03-09 07:25:15', 3),
(7, '2015-03-09 07:41:46', 2),
(10, '2015-03-09 07:42:05', 1),
(11, '2015-03-09 07:45:16', 4),
(14, '2015-03-09 07:48:38', 5),
(15, '2015-03-09 08:15:44', 2);
SELECT rowindex FROM examtable1;
The query shown lists all used indexes.
But I would like to get (say) the first five missing indexes so I can use them to insert new data at the desired rowindex.
In this example the result would be: 2, 4, 8, 9, 12, which are indexes that are not used.
Is there any trick to build a query that returns the first n missing indexes?
In reality such a table may contain many rows and the "holes" can be anywhere.
You can do this by generating a list of all numbers using generate_series() and then check which numbers don't exist in your table.
This can either be done using an outer join:
select nr.i as missing_index
from (
select i
from generate_series(1, (select max(rowindex) from examtable1)) i
) nr
left join examtable1 t1 on nr.i = t1.rowindex
where t1.rowindex is null;
or a NOT EXISTS query:
select i
from generate_series(1, (select max(rowindex) from examtable1)) i
where not exists (select 1
from examtable1 t1
where t1.rowindex = i.i);
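I have used a hardcoded lower bound for generate_series() so that you would also detect a missing rowindex that is smaller than the lowest number. To return only the first n missing values, add an order by on the generated number and a limit n to either query.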
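The same "generate all candidates, keep those not in the table" logic is sketched below in Python, limited to the first n holes as the question asks. The function name first_missing is an assumption for this example; like the queries, it only looks for holes up to the highest used index.

```python
from itertools import islice

def first_missing(used_indexes, n):
    """Return the first n integers in 1..max(used) that are not in use."""
    used = set(used_indexes)
    if not used:
        return []
    # range(1, max + 1) plays the role of generate_series(1, max(rowindex)).
    holes = (i for i in range(1, max(used) + 1) if i not in used)
    return list(islice(holes, n))

rowindexes = [1, 3, 5, 6, 7, 10, 11, 14, 15]
print(first_missing(rowindexes, 5))  # [2, 4, 8, 9, 12]
```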
Is the data imputation method Last Observation Carried Forward (LOCF) implemented in PostgreSQL?
If not, how could I implement this method?
The following code assumes a table tbl with columns a, b (keys), t (time) and v (value to locf impute):
create or replace function locf_s(a float, b float)
returns float
language sql
as '
select coalesce(b, a)
';
drop aggregate if exists locf(float);
CREATE AGGREGATE locf(FLOAT) (
SFUNC = locf_s,
STYPE = FLOAT
);
select a,b,t,v,
locf(v) over (PARTITION by a,b ORDER by t) as v_locf
from tbl
order by a,b,t
;
(SQLFiddle)
For a tutorial: "LOCF and Linear Imputation with PostgreSQL"
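The aggregate's state function is just coalesce(new value, previous state), applied in time order within each (a, b) partition. A minimal Python sketch of that behaviour, with None standing in for NULL and invented sample rows, might be:

```python
def locf(rows):
    """rows: (a, b, t, v) tuples, v may be None.

    Returns (a, b, t, v, v_locf) ordered by a, b, t, where v_locf is the
    last non-missing v seen so far within the same (a, b) partition.
    """
    state = {}   # last observed value per (a, b) partition
    out = []
    for a, b, t, v in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        carried = v if v is not None else state.get((a, b))  # coalesce(v, state)
        state[(a, b)] = carried
        out.append((a, b, t, v, carried))
    return out

# Hypothetical sample rows, not taken from the original answer.
sample_rows = [
    (1, 1, 1, 10.0), (1, 1, 2, None), (1, 1, 3, None), (1, 1, 4, 7.5),
    (2, 1, 1, None), (2, 1, 2, 3.0), (2, 1, 3, None),
]
for row in locf(sample_rows):
    print(row)
```

A value that is missing before any observation in its partition stays missing, which matches what the window aggregate would return.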
I based this table and data directly on the table in the linked article.
create table test (
unit integer not null
check (unit >= 1),
obs_time integer not null
check (obs_time >= 1),
obs_value numeric(5, 1),
primary key (unit, obs_time)
);
insert into test values
(1, 1, 3.8), (1, 2, 3.1), (1, 3, 2.0),
(2, 1, 4.1), (2, 2, 3.5), (2, 3, 3.8), (2, 4, 2.4), (2, 5, 2.8), (2, 6, 3.0),
(3, 1, 2.7), (3, 2, 2.4), (3, 3, 2.9), (3, 4, 3.5);
For the six observations in the linked article we need all the possible combinations of "unit" and "obs_time".
select distinct unit, times.obs_time
from test
cross join (select generate_series(1, 6) obs_time) times;
unit obs_time
--
1 1
1 2
1 3
1 4
1 5
1 6
2 1
. . .
3 6
We also need to know which row has the last observed value in it for each unit.
select unit, max(obs_time) obs_time
from test
group by unit
order by unit;
unit obs_time
--
1 3
2 6
3 4
Knowing those two sets, we can join and coalesce to get the last observation and carry it forward.
with unit_times as (
select distinct unit, times.obs_time
from test
cross join (select generate_series(1, 6) obs_time) times
), last_obs_time as (
select unit, max(obs_time) obs_time
from test
group by unit
)
select t1.unit, t1.obs_time,
coalesce(t2.obs_value, (select obs_value
from test
inner join last_obs_time
on test.unit = last_obs_time.unit
and test.obs_time = last_obs_time.obs_time
where test.unit = t1.unit)) obs_value
from unit_times t1
left join test t2
on t1.unit = t2.unit and t1.obs_time = t2.obs_time
order by t1.unit, t1.obs_time;
unit obs_time obs_value
--
1 1 3.8
1 2 3.1
1 3 2.0
1 4 2.0
1 5 2.0
1 6 2.0
2 1 4.1
. . .
3 4 3.5
3 5 3.5
3 6 3.5
To get the same visual output as the linked article shows, use the crosstab() function in the tablefunc module. You could also do that manipulation with application code.
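The join-and-coalesce query above can be paraphrased in Python: build every (unit, obs_time) combination, keep the observed value where one exists, and otherwise fall back to the unit's last observation. This is only a sketch of that logic with a made-up function name, and it stores the values as floats rather than numeric(5, 1).

```python
# Observations from the test table: (unit, obs_time) -> obs_value.
observations = {
    (1, 1): 3.8, (1, 2): 3.1, (1, 3): 2.0,
    (2, 1): 4.1, (2, 2): 3.5, (2, 3): 3.8, (2, 4): 2.4, (2, 5): 2.8, (2, 6): 3.0,
    (3, 1): 2.7, (3, 2): 2.4, (3, 3): 2.9, (3, 4): 3.5,
}

def carry_last_forward(obs, n_times):
    """Fill every (unit, 1..n_times) slot, using the unit's last observation
    wherever no value was recorded."""
    filled = {}
    for unit in sorted({u for u, _ in obs}):
        last_time = max(t for u, t in obs if u == unit)   # last_obs_time CTE
        last_value = obs[(unit, last_time)]
        for t in range(1, n_times + 1):                   # unit_times CTE
            filled[(unit, t)] = obs.get((unit, t), last_value)  # coalesce
    return filled

filled = carry_last_forward(observations, 6)
print(filled[(1, 6)], filled[(3, 5)])  # 2.0 3.5
```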