So let's say I have a table of:
Name Born
John 1994-01-01
John 1994-02-08
Jack 1995-03-09
Bob 1992-03-10
Tom 1995-07-13
Ronda 1984-01-25
I want the query to return only these rows:
John 1994-01-01
Ronda 1984-01-25
Jack 1995-03-09
Bob 1992-03-10
Because each of them shares a birth month with someone else in the table.
I've tried different SELECTs with EXTRACT and such, but nothing seems to work for me.
You can do this with window functions:
select t.*
from (select t.*,
count(*) over (partition by extract(month from born)) as cnt
from t
) t
where cnt > 1
order by extract(month from born);
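To check the idea against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. It is an assumption-laden stand-in, not the original database: the table name t comes from the answer, SQLite has no EXTRACT so strftime('%m', born) takes its place, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory table "t" holding the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, born TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("John", "1994-01-01"),
        ("John", "1994-02-08"),
        ("Jack", "1995-03-09"),
        ("Bob", "1992-03-10"),
        ("Tom", "1995-07-13"),
        ("Ronda", "1984-01-25"),
    ],
)

# strftime('%m', born) stands in for extract(month from born).
# cnt is the number of rows sharing the same birth month.
query = """
SELECT name, born
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY strftime('%m', born)) AS cnt
      FROM t) AS sub
WHERE cnt > 1
ORDER BY strftime('%m', born), born
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The February John and Tom drop out because their months occur only once; the remaining four rows come back grouped by month.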
I am having a hard time trying to get this done. I have the following table:
cod_prod seller price date
A Andres 10 anydate
A Paul 5 anydate
A Mike 2.5 anydate
A Josh 1.75 anydate
A Karen 7.5 anydate
.... ..... ... .......
I am trying to calculate quartiles of the price for each product and classify each seller's price into 4 quartiles.
The output I am expecting is:
Cod_Prod Seller Price Quartile 1stQ 2ndQ 3rdQ 4thQ
A Andres 10 4 2.5 5 7.5 10
A Karen 7.5 3 2.5 5 7.5 10
A Paul 5 2 2.5 5 7.5 10
A Mike 2.5 1 2.5 5 7.5 10
A Josh 1.75 1 2.5 5 7.5 10
.. ..... .... .... .... .. ... ...
This table has thousands of distinct cod_prod and thousands of sellers.
I am trying this query:
with cte as (
select seller, cod_prod, sum(price) as sum_price
from tablename
group by 2,1
)
select seller,
cod_prod,
sum_price,
ntile(4) over (partition by seller order by sum_price asc) quartile
from cte
But this is not doing what I expect, and it is still missing the 1stQ to 4thQ bin columns.
I tried many different things, but this is the closest I got to what I want.
Can someone help me solve it?
I am not sure if this query is exactly what you want, but I think it can help you.
I calculated the quartiles per product by grouping on cod_prod.
WITH cte AS (SELECT seller, cod_prod, sum(price) as sum_price
FROM t
GROUP BY seller, cod_prod),
quartiles AS (SELECT
cod_prod,
percentile_cont(0.25) within group (order by sum_price asc) as "1stQ",
percentile_cont(0.50) within group (order by sum_price asc) as "2ndQ",
percentile_cont(0.75) within group (order by sum_price asc) as "3rdQ",
percentile_cont(1) within group (order by sum_price asc) as "4thQ"
FROM cte
GROUP BY cod_prod)
SELECT cte.*,
ntile(4) over (PARTITION BY cte.cod_prod ORDER BY sum_price ASC) quartile,
quartiles.*
FROM cte
INNER JOIN quartiles ON cte.cod_prod = quartiles.cod_prod;
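The key change from the question's attempt is partitioning NTILE by cod_prod instead of seller. A small sketch of just that part, run through Python's sqlite3 module (SQLite 3.25+ assumed for window functions; SQLite has no percentile_cont, so the 1stQ to 4thQ columns are left out here). Product B and its prices are made up to show that each product gets its own buckets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cod_prod TEXT, seller TEXT, price REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "Andres", 10.0),
        ("A", "Paul", 5.0),
        ("A", "Mike", 2.5),
        ("A", "Josh", 1.75),
        ("A", "Karen", 7.5),
        ("B", "Andres", 1.0),  # hypothetical second product
        ("B", "Paul", 2.0),    # hypothetical second product
    ],
)

# NTILE(4) splits each product's sellers into 4 buckets by ascending price.
query = """
SELECT cod_prod, seller, price,
       NTILE(4) OVER (PARTITION BY cod_prod ORDER BY price ASC) AS quartile
FROM t
ORDER BY cod_prod, price DESC
"""
quartile_of = {(prod, seller): q for prod, seller, _, q in conn.execute(query)}
for key, q in quartile_of.items():
    print(key, q)
```

With five sellers for product A, NTILE puts the extra row in the first bucket, so Josh and Mike both land in quartile 1, which matches the expected output in the question.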
SELECT max(salary),
(SELECT MAX(SALARY) FROM EMPLOYEE
WHERE SALARY NOT IN(SELECT MAX(SALARY) FROM EMPLOYEE)) as 2ND_MAX_SALARY;
This is giving me the error: FROM keyword not found where expected
You want the top 2 rows of your table ordered by one of the columns (the FETCH NEXT clause is available from Oracle 12c R1; Oracle has no LIMIT keyword):
SELECT Salary FROM Employee ORDER BY Salary DESC
FETCH NEXT 2 ROWS ONLY;
Use
SELECT Salary FROM Employee ORDER BY Salary DESC
FETCH NEXT 2 ROWS WITH TIES;
if you also want to return every employee whose salary ties with the last row fetched. There might be only one highest salary amount in the company, but more than one employee who gets the second-highest amount. Those extra rows are the ties.
If you're on an Oracle database version lower than 12c, the rank analytic function might help.
For sample rows:
SQL> select * from employee order by salary desc;
ENAME SALARY
---------- ----------
KING 5000 --> highest salary
FORD 3000 --> Ford and Scott "share" the 2nd
SCOTT 3000 --> highest salary
JONES 2975
BLAKE 2850
CLARK 2450
ALLEN 1600
TURNER 1500
MILLER 1300
WARD 1250
MARTIN 1250
ADAMS 1100
JAMES 950
SMITH 800
14 rows selected.
In a subquery (or a CTE, as I did), calculate the rank of each salary and then, in the main query, select the rows that rank among the top two salaries:
SQL> with temp as
2 (select ename,
3 salary,
4 rank() over (order by salary desc) rnk
5 from employee
6 )
7 select ename, salary
8 from temp
9 where rnk <= 2
10 order by rnk desc;
ENAME SALARY
---------- ----------
SCOTT 3000
FORD 3000
KING 5000
SQL>
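One thing to watch with rank: it leaves gaps after ties, so if two employees share the highest salary, the ranks go 1, 1, 3 and the filter rnk <= 2 never reaches the second-highest amount. dense_rank does not leave gaps. The sketch below compares the two using Python's sqlite3 module rather than Oracle (SQLite 3.25+ assumed); the QUEEN row is invented purely to create a tie at the top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [
        ("KING", 5000),
        ("QUEEN", 5000),  # made-up row: ties with KING for the top salary
        ("FORD", 3000),
        ("SCOTT", 3000),
        ("JONES", 2975),
    ],
)


def top_two(rank_function):
    """Return rows whose rank is 1 or 2 under RANK or DENSE_RANK."""
    query = f"""
    SELECT ename, salary
    FROM (SELECT ename, salary,
                 {rank_function}() OVER (ORDER BY salary DESC) AS rnk
          FROM employee) AS ranked
    WHERE rnk <= 2
    ORDER BY salary DESC, ename
    """
    return conn.execute(query).fetchall()


with_rank = top_two("RANK")
with_dense_rank = top_two("DENSE_RANK")
print(with_rank)
print(with_dense_rank)
```

With this data, RANK returns only the two employees on 5000, while DENSE_RANK also returns FORD and SCOTT on 3000.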
SELECT MAX(salary) AS max_salary,
(SELECT MAX(salary)
FROM employee
WHERE salary NOT IN (SELECT MAX(salary)
FROM employee
)
) AS second_max_salary
FROM employee;
I have been asked to generate a report to show the number of occurrences an employee is absent from work sick.
If an employee is absent from work for 3 consecutive days this will be counted as 1 occurrence. If they then return to work and are then absent again for another 2 consecutive days this will be recorded as 2 occurrences.
I need to generate a report to show the number of occurrences an employee is away from work sick within a 6 month period.
I have set out an example below of the data showing an employee's absence records and how I need the report to look.
How data shows in database:
Name Absence Dates
John Smith 01-Sep-19
John Smith 02-Sep-19
John Smith 03-Sep-19
John Smith 10-Sep-19
John Smith 11-Sep-19
How I wish for the report to look:
Name Occurrences
John Smith 2
I would be grateful for any assistance with writing the code to achieve this result.
Not a full answer, as you should really do some of this yourself. However, based on what you have detailed in your question, you could use the approach below to count up any spells of absence within a 6 month period.
This assumes you are using SQL Server.
-- table variables are declared with @ in SQL Server
declare @absences table (empid nvarchar(10), [abs date] date, [ret date] date);
declare @staff table ([empid] int, [name1] nvarchar(50), [name2] nvarchar(50), [surname] nvarchar(50));
-- put some test values in the staff table to work with
insert into @staff
values
(1, 'John', 'Lewis', 'Smith'), -- using a unique ID here, in any good system this should be an incremental number for each new staff member added to the table
(2, 'James', 'Thomas', 'Brown')
-- put some test values in the absences table to work with
insert into @absences
values
(1, '2019-07-01', '2019-07-04'), -- userid, absence date & return date
(1, '2019-08-04', '2019-08-06'),
(2, '2019-07-02', '2019-07-05'),
(2, '2019-08-05', '2019-08-07')
select count(*) spellsoff, empid, name1, name2, surname, [days absent]
from
(
select
s.empid,
s.name1,
s.name2,
s.surname,
a.[abs date],
a.[ret date],
datediff(d,a.[abs date], a.[ret date]) [days absent]
from @staff s
left join @absences a
on s.empid = a.empid
where [abs date] >= DATEADD(M,-6,GETDATE()) -- pull back those employees that have been absent in the last 6 months from today's date
)doff
group by empid, name1, name2, surname, [days absent]
Gives you the following breakdown:
spellsoff empid name1 name2 surname days absent
1 1 John Lewis Smith 2
1 1 John Lewis Smith 3
1 2 James Thomas Brown 2
1 2 James Thomas Brown 3
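If the data really is stored as one row per absence date, as in the question, counting occurrences becomes a "gaps and islands" problem. A rough sketch of one common approach follows, run with Python's sqlite3 module instead of SQL Server (SQLite 3.25+ assumed). The table name absences, its column names, the ISO date format, and the Jane Doe row are all assumptions for illustration. It treats any calendar gap, weekends included, as a return to work, and it leaves out the 6 month filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE absences (name TEXT, absence_date TEXT)")
conn.executemany(
    "INSERT INTO absences VALUES (?, ?)",
    [
        ("John Smith", "2019-09-01"),
        ("John Smith", "2019-09-02"),
        ("John Smith", "2019-09-03"),
        ("John Smith", "2019-09-10"),
        ("John Smith", "2019-09-11"),
        ("Jane Doe", "2019-09-05"),  # hypothetical second employee
    ],
)

# Within a run of consecutive days, (day number - row number) stays constant,
# so each distinct value of grp marks one spell of absence.
query = """
SELECT name, COUNT(DISTINCT grp) AS occurrences
FROM (SELECT name,
             CAST(julianday(absence_date) AS INTEGER)
               - ROW_NUMBER() OVER (PARTITION BY name
                                    ORDER BY absence_date) AS grp
      FROM absences) AS islands
GROUP BY name
ORDER BY name
"""
occurrences = dict(conn.execute(query).fetchall())
print(occurrences)
```

For John Smith this gives 2, matching the report layout the question asks for.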
I have 3 columns. SSN|AccountNumber|OpenDate
1 SSN may have multiple AccountNumbers
Each AccountNumber has a corresponding OpenDate
In my list I have many SSN's, each containing several account numbers which may have been opened on different days.
I want the results of my query to be SSN|earliest OpenDate|AccountNumber that corresponds with the earliest OpenDate.
I'm dealing with about 200,000 records.
EDIT: First I did
select SSN, min(OpenDate), AcctNumber from Table Group By SSN, AccountNumber
but that didn't quite give me the correct data.
The raw data gives me something like this:
SSN | AcctNumber | OpenDate
---------------------------
10 101 Jan
10 102 Feb
10 103 Mar
I got 10, Jan, and AcctNumber 102, which is not the account number associated with the Jan OpenDate. After looking at other results, I found that the account number I got was just one of the account numbers associated with that SSN rather than the one that corresponds with min(OpenDate).
WITH CTE AS (
SELECT SSN, AcctNumber, OpenDate,
ROW_NUMBER() OVER (PARTITION BY SSN ORDER BY OpenDate ASC) AS RN
FROM tbl
)
SELECT SSN, AcctNumber, OpenDate FROM CTE WHERE RN=1;
If your table is like this:
SSN | AcctNumber | OpenDate
---------------------------
10 101 April
10 101 May
10 102 April
20 201 June
20 201 July
Do you want your query to return this?
SSN | AcctNumber | OpenDate
---------------------------
10 101 April
10 102 April
20 201 June
Then you would use this query:
select ssn, min(OpenDate), acctNumber from tbl group by ssn, acctNumber
You can try this:
select SSN , AcctNumber, OpenDate
from (SELECT SSN , AcctNumber, OpenDate
, ROW_NUMBER() OVER ( PARTITION BY SSN ORDER BY OpenDate ASC ) AS RN
FROM table) AS temp
WHERE temp.RN= 1
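To see the ROW_NUMBER approach pick the account that actually belongs to the earliest date, here is a small sketch using Python's sqlite3 module (SQLite 3.25+ assumed). The table name accounts, the snake_case column names, and the ISO dates are placeholders I made up, since the question only shows month names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (ssn INTEGER, acct_number INTEGER, open_date TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (10, 101, "2020-01-15"),
        (10, 102, "2020-02-10"),
        (10, 103, "2020-03-05"),
        (20, 201, "2020-06-01"),  # hypothetical second SSN
        (20, 202, "2020-05-20"),
    ],
)

# Number each SSN's accounts from earliest to latest open date,
# then keep only the first row per SSN.
query = """
SELECT ssn, acct_number, open_date
FROM (SELECT ssn, acct_number, open_date,
             ROW_NUMBER() OVER (PARTITION BY ssn
                                ORDER BY open_date ASC) AS rn
      FROM accounts) AS temp
WHERE temp.rn = 1
ORDER BY ssn
"""
earliest = conn.execute(query).fetchall()
print(earliest)
```

Because the whole row is numbered, the account number stays attached to its own open date, which is what GROUP BY with min(OpenDate) could not guarantee. If two accounts for one SSN share the earliest date, ROW_NUMBER keeps an arbitrary one of them unless you add a tie-breaker to the ORDER BY.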
Please help me understand how ORDER BY influences the OVER clause. I have read MSDN and one book and still don't understand it.
Let's say we have such query:
SELECT Count(OrderID) over(Partition By Year(OrderDate))
,*
FROM [Northwind].[dbo].[Orders]
ORDER BY OrderDate
The result is that each row has a column showing how many entries in the table have the same year.
alt text http://img-fotki.yandex.ru/get/3912/svin80.2/0_3b871_3bb591da_XL
But what happens when I try this query?
SELECT ROW_NUMBER() over(Partition By Year(OrderDate)
order by OrderDate) as RowN
,*
FROM [Northwind].[dbo].[Orders]
ORDER BY RowN
alt text http://img-fotki.yandex.ru/get/3908/svin80.2/0_3b872_c9352fb1_XL
Now the only thing I see is that each RowN value appears with three different years (1996, 1997, 1998). I expected RowN to be the same value for all 1996 dates. Please explain what happens and why.
In this case:
SELECT ROW_NUMBER() over(Partition By Year(OrderDate)
order by OrderDate) as RowN,*
FROM [Northwind].[dbo].[Orders]
order by RowN
What you're seeing is a row number that is partitioned by year, meaning that each year has its own climbing row number. To make this a bit clearer in the results:
SELECT ROW_NUMBER() over(Partition By Year(OrderDate)
order by OrderDate) as RowN,*
FROM [Northwind].[dbo].[Orders]
order by RowN, Year(OrderDate)
This means that each year, say 1997, will have orders 1 through n ordered by the date that year...like this was the 1st order of 1997, 2nd order of 1997, etc.
The results will make far more sense if you do this:
SELECT
Year(OrderDate),
ROW_NUMBER() over(Partition By Year(OrderDate)order by OrderDate) as RowN,
*
FROM [Northwind].[dbo].[Orders]
ORDER BY Year(OrderDate), RowN
Now you can see that each year has increasing row numbers starting from 1, ordered by order date:
Year RowN OrderID OrderDate
1997 1 10400 1997-01-01 00:00:00
1997 2 10401 1997-01-01 00:00:00
1997 3 10402 1997-01-02 00:00:00
...
1998 1 10808 1998-01-01 00:00:00
1998 2 10809 1998-01-01 00:00:00
1998 3 10810 1998-01-01 00:00:00
...
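A compact way to see the difference between the two queries in the question is to put COUNT and ROW_NUMBER side by side. The sketch below does this with Python's sqlite3 module rather than SQL Server (SQLite 3.25+ assumed). The orders table is a tiny stand-in for Northwind built from the sample rows above, strftime('%Y', ...) replaces Year(), and order_id is added to the window ORDER BY only to make ties on the same date deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (10400, "1997-01-01"),
        (10401, "1997-01-01"),
        (10402, "1997-01-02"),
        (10808, "1998-01-01"),
        (10809, "1998-01-01"),
    ],
)

# COUNT(*) with only PARTITION BY is computed over the whole partition,
# so every row of a year gets the same value.
# ROW_NUMBER() needs ORDER BY and numbers the rows inside each partition,
# restarting at 1 for every year.
query = """
SELECT strftime('%Y', order_date) AS yr,
       ROW_NUMBER() OVER (PARTITION BY strftime('%Y', order_date)
                          ORDER BY order_date, order_id) AS row_n,
       COUNT(*) OVER (PARTITION BY strftime('%Y', order_date)) AS per_year,
       order_id
FROM orders
ORDER BY yr, row_n
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Sorting the final result by RowN alone, as in the question, interleaves the first row of each year, then the second row of each year, and so on, which is why each RowN value showed up with several different years.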