sql code to compare previous row in a series - tsql

So I have a table that has an ArticleID (GUID), RevisionNumber (integer), and StatusCode (text).
An Article can have any number of revisions but each time a new revision is created, the StatusCode of the previous revision should be "Revised" and the newest revision's StatusCode could be "Active" or "Draft" or "Canceled". However the data is messed up and I need to identify which records (out of 100's of thousands) do not have the correct status.
Sample data:
Article ID RevisionNumber StatusCode
========== ============== ==========
xx-xxxx-xx 7 Active
xx-xxxx-xx 6 Revised
xx-xxxx-xx 5 Active
xx-xxxx-xx 4 Draft
xx-xxxx-xx 3 Revised
xx-xxxx-xx 2 Active
xx-xxxx-xx 1 Revised
xx-xxxx-xx 0 Revised
xx-yyyy-yy 1 Active
xx-yyyy-yy 0 Active
In the above scenario, I would need to know that xx-xxxx-xx Revision 5, 4, and 2 are not the proper status and xx-yyyy-yy Revision 0 is incorrect. How could I get this information from a sql query using sql server 2012?

To identify any revisions that are not "Revised" if there is a higher number revision.
Then it seems just a matter of knowing what the latest revision is.
A MAX OVER can do that.
SELECT ArticleID, RevisionNumber, StatusCode
FROM
(
SELECT ArticleID, RevisionNumber, StatusCode
, MAX(RevisionNumber) OVER (PARTITION BY ArticleID) AS MaxRevisionNumber
FROM YourTable
) q
WHERE (RevisionNumber < MaxRevisionNumber AND StatusCode != 'Revised')

You can do this with a left join -- for each record we look for one with a greater revision -- like this:
SELECT *
FROM table_you_did_not_name base
LEFT JOIN table_you_did_not_name next ON base.ArticleID = next.ArticleID and base.revisionnumber = next.revisionnumber + 1
WHERE status <> 'Revised' and next.ArticleID is not null

Related

SQL Server 2012 finding text before and after delimiter and sometimes without delimiter

I have been having fun with an issue where I need to break apart a string in SQL Server 2012 and test for values it may or may not contain. The values, when present, will be separated by up to two different ; symbols.
When there is nothing, it will be blank.
When there is a single value, it will show up without the delimiter.
When there are two or more, up to 3, they will be separated by the delimiter.
As I said, if there is nothing in the record, it will be blank. Below are some example of how the data may come across:
' ',
'1',
'24',
'15;1;24',
'10;1;22',
'5;1;7',
'12;1',
'10;12',
'1;5',
'1;1;1',
'15;20;22'
I have searched the forums and found many clues, but I have not been able to come up with a total solution given all potential data values. Essentially, I would like to break it into 3 separate values.
text before the first delimiter or in the absence of the delimiter, just the text.
Text after the first delimiter and before the second in situation where there are two delimiters.
The following has worked consistently:
substring(SUBSTRING(Food_Desc, charindex(';', Food_Desc) + 1, 4), 0,
charindex(';', SUBSTRING(Food_Desc, charindex(';', Food_Desc) + 1, 4))) as [Middle]
Text after the second delimiter in the even there are two delimiters and there is a third value
The main challenge is the fact that the delimiter, when present, moves depending on the value in the table. values 1-9 make it show up as the second character in the string, values 10-24 make it show up as the 3rd, etc.
Any help would be greatly appreciated.
This is simple if you have a well written t-sql splitter function. For this solution I'm using Jeff Moden's delimitedsplit8k.
sample data and solution
DECLARE #table table (someid int identity, sometext varchar(100));
INSERT #table VALUES (' '),('1'),('24'),('15;1;24'),('10;1;22'),
('5;1;7'),('12;1'),('10;12'),('1;5'),('1;1;1'),('15;20;22');
SELECT
someid,
sometext,
ItemNumber,
Item
FROM #table
CROSS APPLY dbo.DelimitedSplit8K_LEAD(sometext, ';');
results
someid sometext ItemNumber Item
----------- ----------------- ----------- --------
1 1
2 1 1 1
3 24 1 24
4 15;1;24 1 15
4 15;1;24 2 1
4 15;1;24 3 24
5 10;1;22 1 10
5 10;1;22 2 1
5 10;1;22 3 22
6 5;1;7 1 5
6 5;1;7 2 1
6 5;1;7 3 7
7 12;1 1 12
7 12;1 2 1
8 10;12 1 10
8 10;12 2 12
9 1;5 1 1
9 1;5 2 5
10 1;1;1 1 1
10 1;1;1 2 1
10 1;1;1 3 1
11 15;20;22 1 15
11 15;20;22 2 20
11 15;20;22 3 22
Below is a modified version of a similar question How do I split a string so I can access item x?. Changing the text value for #sample to each of your possibilities listed seemed to work for me.
DECLARE #sample VARCHAR(200) = '15;20;22';
DECLARE #individual VARCHAR(20) = NULL;
WHILE LEN(#sample) > 0
BEGIN
IF PATINDEX('%;%', #sample) > 0
BEGIN
SET #individual = SUBSTRING(#sample, 0, PATINDEX('%;%', #sample));
SELECT #individual;
SET #sample = SUBSTRING(#sample, LEN(#individual + ';') + 1, LEN(#sample));
END;
ELSE
BEGIN
SET #individual = #sample;
SET #sample = NULL;
SELECT #individual;
END;
END;

Update rows returned by a complex SQL query with data from query result

I have a multi-table join and want to update a table based on the result of that join. The join table produces both the scope of the update (only those rows whose effort.id appears in the result should be updated) and the data for the update (a new column should be set to the value of a calculated column).
I've made progress but can't quite make it work. Here's my statement:
UPDATE
efforts
SET
dropped_int = jt.split
FROM
(
SELECT
ef.id,
s.id split,
s.kind,
s.distance_from_start,
s.sub_order,
max(s.distance_from_start + s.sub_order)
OVER (PARTITION BY ef.id) AS max_dist
FROM
split_times st
LEFT JOIN splits s ON s.id = st.split_id
LEFT JOIN efforts ef ON ef.id = st.effort_id
) jt
WHERE
((jt.distance_from_start + jt.sub_order) = max_dist)
AND
kind <> 1;
The SELECT produces the correct join table:
id split kind dfs sub max_dist dropped dropped_int
403 33 2 152404 1 152405 TRUE 33
404 33 2 152404 1 152405 TRUE 33
405 31 2 143392 1 143393 TRUE 33
406 31 2 143392 1 143393 TRUE 33
407 29 2 132127 1 132128 TRUE 33
408 29 2 132127 1 132128 TRUE 33
409 29 2 132127 1 132128 TRUE 33
and does indeed update the efforts.id column, but there are two problems: First, it updates all efforts, not just those that are produced from the query, and second, it sets effort.id to the split value of the first row in the query result, but I need it to set each effort to the associated split value.
If this were non-SQL, it might look something like:
jt_rows.each do |jt_row|
efforts[jt_row].dropped_int = jt[jt_row].split
end
But I don't know how to do that in SQL. It seems like this should be a fairly common problem, but after a couple of hours of searching I'm coming up short.
How should I modify my statement to produce the described result? If it matters, this is Postgres 9.5. Thanks in advance for any suggestions.
EDIT:
I did not get a workable answer but ended up solving this with a mixture of SQL and native code (Ruby/Rails):
dropped_splits = SplitTime.joins(:split).joins(:effort)
.select('DISTINCT ON (efforts.id) split_times.effort_id, split_times.split_id')
.where(efforts: {dropped: true})
.order('efforts.id, splits.distance_from_start DESC, splits.sub_order DESC')
update_hash = Hash[dropped_splits.map { |x| [x.effort_id, {dropped_split_id: x.split_id, updated_at: Time.now}] }]
Effort.update(update_hash.keys, update_hash.values)
Use a condition in the WHERE clause that relates efforts table with a subquery:
efforts.id = jt.id
that is:
WHERE
((jt.distance_from_start + jt.sub_order) = max_dist)
AND
kind <> 1
AND
efforts.id = jt.id

Selecting rows only if meeting criteria

I am new to PostgreSQL and to database queries in general.
I have a list of user_id with university courses taken, date started and finished.
Some users have multiple entries and sometimes the start date or finish date (or both) are missing.
I need to retrieve the longest course taken by a user or, if start date is missing, the latest.
If multiple choices are still available, then pick random among the multiple options.
For example
on user 2 (below) I want to get only "Economics and Politics" because it has the latest date;
on user 6, only "Electrical and Electronics Engineering" because it is the longer course.
The query I did doesn't work (and I think I am off-track):
(SELECT Q.user_id, min(Q.started_at) as Started_on, max(Q.ended_at) as Completed_on,
q.field_of_study
FROM
(select distinct(user_id),started_at, Ended_at, field_of_study
from educations
) as Q
group by Q.user_id, q.field_of_study )
order by q.user_id
as the result is:
User_id Started_on Completed_on Field_of_studies
2 "2001-01-01" "" "International Economics"
2 "" "2002-01-01" "Economics and Politics"
3 "1992-01-01" "1999-01-01" "Economics, Management of ..."
5 "2012-01-01" "2016-01-01" ""
6 "2005-01-01" "2009-01-01" "Electrical and Electronics Engineering"
6 "2011-01-01" "2012-01-01" "Finance, General"
6 "" "" ""
6 "2010-01-01" "2012-01-01" "Financial Mathematics"
I think this query should do what you need, it relies on calculating the difference in days between ended_at and started_at, and uses 0001-01-01 if the started_at is null (making it a really long interval):
select
educations.user_id,
max(educations.started_at) started_at,
max(educations.ended_at) ended_at,
max(educations.field_of_study) field_of_study
from educations
join (
select
user_id,
max(
ended_at::date
-
coalesce(started_at, '0001-01-01')::date
) max_length
from educations
where (started_at is not null or ended_at is not null)
group by user_id
) x on educations.user_id = x.user_id
and ended_at::date
-
coalesce(started_at, '0001-01-01')::date
= x.max_length
group by educations.user_id
;
Sample SQL Fiddle

Transactional Replication: Column name or number of supplied values does not match table definition

I have set up a transactionnal replication for some tables.
The Master and the Slave Database are identical.
I used this query and compared the result from master and slave to make sure the table is identical
select * from sys.columns c
join sys.tables t on t.object_id = c.object_id
where t.name = 'customers'
In the Replication Monitor I can find this error:
Column name or number of supplied values does not match table definition.
If I check the details I get this:
Command attempted:
if ##trancount > 0 rollback tran
(Transaction sequence number: 0x0011775200000105007600000000, Command ID: 1)
So I checked in the destribution database using this query to find the command that is failing.
sp_browsereplcmds #xact_seqno_start = '0x0011775200000105007600000000',
#xact_seqno_end = '0x0011775200000105007600000000'
This is the command (its in 2 lines in that table):
{CALL [sp_MSins_dboCustomers] (0,'575',N'todelete','575',N'todelete',118594,118595,118596,N'10T 3% Sk 30T net.',0,'Deutschland',4,24399158193054E-314,4,24399158193054E-314,4,24399158193054E-314,4,24399158193054E-314,2,54639494915833E-313,'','','','','','TGW',N'Liefern LKW',NULL,NULL,0,0,6,79038653108887E-311,NULL,'',0,NULL,NULL,NULL,0,0,0,-1,-1,1900-01-01 00:00:00,0,1,{AEB3D911-36D1-4A8A-B713-6B2F2CCA1641},0,0,2,'de-AT',25,NULL,NULL,0,1,NULL,NULL,2014-03-07 08:57:45.727,-1,NULL,0,'','','','','','','','','','','','',
'','','','','','','','')}
This is what I have in my DB
TypeID CustomerID Name SiteID SiteName AddressID BillAddressID ShipAddressID Terms TaxExempt TaxSchedID TaxPercent TaxPercent1 TaxPercent2 TaxPercent3 TaxPercent4 TaxTitle TaxTitle1 TaxTitle2 TaxTitle3 TaxTitle4 LocationID ShipVia PackingType PackingNoteID CutoffDay UploadAction LeadTime ExpDays Notes SalesPersonID CreditLimit OpenOrders OrderValueScheduleID OAHidePrices DefaultAckType DefaultInvType DefaultPackType UploadEmployee UploadDateTime OAHideImages MfgCustomer CustomerGUID PricingMethod DefaultCustomer EngineeringUnitSetID CurrencyCulture FamilyGroupID InvoiceMinimum InvoiceSurcharge InvoiceGroup InvoiceCopies DeliveryMinimum DeliverySurcharge CreateDate EnteredBy LanguageCulture DropShip UserDef1 UserDef2 UserDef3 UserDef4 UserDef5 UserDef6 UserDef7 UserDef8 UserDef9 UserDef10 UserDef11 UserDef12 UserDef13 UserDef14 UserDef15 UserDef16 UserDef17 UserDef18 UserDef19 UserDef20
0 575 todelete 575 todelete 118594 118595 118596 10T 3% Sk 30T net. 0 Deutschland 0 0 0 0 0 TGW Liefern LKW NULL NULL 0 0 0 NULL 302 NULL NULL NULL 0 0 0 -1 -1 1900-01-01 00:00:00 0 1 AEB3D911-36D1-4A8A-B713-6B2F2CCA1641 0 0 2 de-AT 25 NULL NULL 0 1 NULL NULL 2014-03-07 08:57:45.727 -1 NULL 0 1 2 3 4 0 1 2 3 4
As you can see here, the values for the taxpercent fields (after "Deutschland") are 0 in my DB, in the command they are really weird (4,24399158193054E-314)
The Datatype is "real"
Maybe this is not the issue but this is the only weird thing I could find.
I found my problem.
In fact this 4,24399158193054E-314 is a value for "0" in real, the problem is that it did not use the "." but the "," as decimal separator and therefore the call of the procedure had too much argument.
What I did is to change the statement delivery for insert, update, delete from "Call " to INSERT/UPDATE/DELETE statement.
I don't know why this is not selected by default, but now it works.

Pulling correct results from my PitchValues Table

I am getting a tad frustrated and was wondering if you can help:
I have a Pitch Values Table with the following Columns PitchValues_Skey, PitchType_Skey (this is a foreign key), Start Date, End Date and finally value:
For Example:
1 7 01/01/2010 31/12/2010 £15
2 7 01/01/2011 31/12/2011 £20
And all I want to do is update my Bookings table with how much each booking is going to be, so I put together the code below which worked fine when I only had 2010 data, but I know have 2011 and 2012 and want to update it but it will only update with the 2010 prices.
SELECT Bookings.Booking_Skey, DATEDIFF(day, Bookings.ArrivalDate, Bookings.DepartureDate) * PitchValues.Value AS BookingValue,
PitchValues.PitchType_Skey
FROM Bookings INNER JOIN
PitchValues ON Bookings.PitchType_Skey = PitchValues.PitchType_Skey
WHERE (Bookings.Booking_Skey = 1)
So when I run the query above I would expect to see one line of data but instead I see 4 (See Below)
I would expect this:
Booking_Skey BookingValue PitchType_Skey
1 420 4
But I get this
Booking_Skey BookingValue PitchType_Skey
1 420 4
1 453.6 4
1 476.7 4
1 476.7 4
All sorted now, thanks for your help.
SELECT Bookings.Booking_Skey, DATEDIFF(DAY, Bookings.ArrivalDate, Bookings.DepartureDate) * PitchValues.Value AS BookingValue, PitchValues.PitchType_Skey
FROM Bookings
INNER JOIN PitchValues ON Bookings.PitchType_Skey = PitchValues.PitchType_Skey
AND Bookings.ArrivalDate BETWEEN PitchValues.StartDate AND PitchValues.EndDate
WHERE (Bookings.Booking_Skey = 1)