Using T-SQL to perform calculations

I've got a table called NewCodes with the following records:
| NewCode | Mapping |
| ------- | --------------------- |
| pp1 | [US1] + [US5] |
| qq1 | [US8] - [US9] |
| ww1 | [RE5] + [RE6] + [RE7] |
| zx1 | [KJ1] - [XC4] |
I've got another table called SourceCodes which contains a list of values assigned to all the codes in the Mapping column.
| Code | Value |
| ---- | ----- |
| US1 | 35 |
| US5 | 10 |
| US8 | 20 |
| US9 | 5 |
| RE5 | 7 |
| RE6 | 8 |
| RE7 | 6 |
I am trying to figure out a way of assigning a value to each code in the NewCode column using the calculation defined in its Mapping row; for example, pp1 would get 35 + 10 = 45. I currently use SSMS.
I have no idea how to attempt this and I was wondering if anyone could help.

As long as the production data doesn't get much more complicated than the example, this can be done. Specifically:
- Only addition and subtraction are performed, or at least there is no concern for order of operations.
- The expressions are all consistently well formed.
- All codes exist in SourceCodes. (This could be overcome using a LEFT JOIN and providing a default value like 0.)
- The SQL Server version supports string_split. (Splitting with XML also works on older versions, so this can be overcome too.)
The following query does these steps:
1. Split each Mapping into a table of symbols.
2. Determine the proper order of the symbols, since string_split does not guarantee output order.
3. Normalize the symbols so the codes in brackets match what is found in SourceCodes.
4. Accumulate the result for each new code.
5. Return the accumulated result from the last row of each new code's partition.
The secret sauce in this solution is the use of recursive CTEs to act like for loops. The first instance determines the order of symbols: to find the start index of each successive occurrence of the same symbol, the recursive part of the CTE takes the CHARINDEX from the previous row. The second instance works in a similar fashion to accumulate values, except it relies on the convention that an operator appears on every even row and a code on every odd one.
WITH Symbols AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY NewCode, Symbol ORDER BY Symbol) [SymbolSeqNum]
    FROM NewCodes
    CROSS APPLY (
        SELECT value [Symbol]
        FROM string_split(Mapping, ' ')
    ) x
)
, UnnormalizedOrderedSymbols AS (
    -- since string_split does not guarantee order we need a way to restore it.
    SELECT NewCode, Symbol, SymbolSeqNum, CHARINDEX(Symbol, Mapping, 1) SymbolOrderIndex
    FROM Symbols
    WHERE SymbolSeqNum = 1
    UNION ALL
    SELECT s.NewCode, s.Symbol, s.SymbolSeqNum, CHARINDEX(s.Symbol, s.Mapping, os.SymbolOrderIndex + 1) SymbolOrderIndex
    FROM UnnormalizedOrderedSymbols os
    INNER JOIN Symbols s ON s.NewCode = os.NewCode AND s.Symbol = os.Symbol AND s.SymbolSeqNum = os.SymbolSeqNum + 1
)
, NormalizedOrderedSymbols AS (
    SELECT NewCode
        , CASE SymbolType WHEN 'Code' THEN SUBSTRING(Symbol, 2, LEN(Symbol) - 2) ELSE Symbol END [Symbol]
        , SymbolType
        , ROW_NUMBER() OVER (PARTITION BY NewCode ORDER BY SymbolOrderIndex) [SymbolOrderIndex]
    FROM UnnormalizedOrderedSymbols
    CROSS APPLY (
        SELECT CASE WHEN Symbol LIKE '[[]%]' THEN 'Code' ELSE 'Operator' END [SymbolType]
    ) x
)
, RunningTotal AS (
    SELECT NewCode, c.Value, SymbolOrderIndex
    FROM NormalizedOrderedSymbols o
    INNER JOIN SourceCodes c ON c.Code = o.Symbol
    WHERE o.SymbolOrderIndex = 1
    UNION ALL
    SELECT rt.NewCode
        , CASE op.Symbol
              WHEN '+' THEN rt.Value + c.Value
              WHEN '-' THEN rt.Value - c.Value
          END
        , num.SymbolOrderIndex
    FROM RunningTotal rt
    INNER JOIN NormalizedOrderedSymbols op ON op.NewCode = rt.NewCode AND op.SymbolOrderIndex = rt.SymbolOrderIndex + 1
    INNER JOIN NormalizedOrderedSymbols num ON num.NewCode = rt.NewCode AND num.SymbolOrderIndex = rt.SymbolOrderIndex + 2
    INNER JOIN SourceCodes c ON c.Code = num.Symbol
)
SELECT x.NewCode, x.Value
FROM (
    SELECT rt.NewCode, rt.Value, ROW_NUMBER() OVER (PARTITION BY rt.NewCode ORDER BY SymbolOrderIndex DESC) rn
    FROM RunningTotal rt
) x
WHERE x.rn = 1
ORDER BY NewCode
This is obviously not a very good use of SQL Server, and you're probably better off writing a script to perform whatever you're trying to accomplish.
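For a sense of what such a script might look like, here is a minimal Python sketch of the same odd/even token walk the recursive CTE performs. The table contents are hard-coded for illustration, and zx1 is omitted since KJ1 and XC4 have no source values:

```python
# Tokenize each Mapping, then walk the tokens: operators sit on even
# positions (1, 3, ...) and bracketed codes on odd positions (0, 2, ...).
source_codes = {"US1": 35, "US5": 10, "US8": 20, "US9": 5,
                "RE5": 7, "RE6": 8, "RE7": 6}
new_codes = {"pp1": "[US1] + [US5]",
             "qq1": "[US8] - [US9]",
             "ww1": "[RE5] + [RE6] + [RE7]"}

def evaluate(mapping, values):
    tokens = mapping.split(" ")
    total = values[tokens[0].strip("[]")]   # first token is always a code
    for op, code in zip(tokens[1::2], tokens[2::2]):
        operand = values[code.strip("[]")]
        total = total + operand if op == "+" else total - operand
    return total

results = {code: evaluate(m, source_codes) for code, m in new_codes.items()}
```

The loop plays the role of the RunningTotal CTE: each iteration consumes one operator and one operand, just as each recursive step joins the rows at SymbolOrderIndex + 1 and + 2.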


Extract words before and after a specific word

I need to extract words before and after a word like '%don%' in an ntext column.
table A, column name: Text
Example:
TEXT
where it was done it will retrieve the...
at the end of the trip clare done everything to improve
it is the only one done in these times
I would like the following results:
was done it
clare done everything
one done in
I am using T-SQL; the LEFT and RIGHT functions did not work with the ntext data type of the column containing the text.
As others have said, you can use a string splitting function to split out each word and then return those you require. Using the previously linked DelimitedSplit8K:
CREATE FUNCTION dbo.DelimitedSplit8K
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
    -- for both a performance gain and prevention of accidental "overruns"
    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
    SELECT 1 UNION ALL
    SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
    SELECT s.N1,
        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
    FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
    Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
go
declare @t table (t ntext);
insert into @t values('where it was done it will retrieve the...'),('at the end of the trip clare done everything to improve'),('we don''t take donut donations here'),('ending in don');
with t as (select cast(t as nvarchar(max)) as t from @t)
,d as (select t.t
,case when patindex('%don%',s.Item) > 0 then 1 else 0 end as d
,s.ItemNumber as i
,lag(s.Item,1,'') over (partition by t.t order by s.ItemNumber) + ' '
+ s.Item + ' '
+ lead(s.Item,1,'') over (partition by t.t order by s.ItemNumber) as r
from t
cross apply dbo.DelimitedSplit8K(t.t, ' ') as s
)
select t
,r
from d
where d = 1
order by t
,i;
Output:
+---------------------------------------------------------+-----------------------+
| t | r |
+---------------------------------------------------------+-----------------------+
| at the end of the trip clare done everything to improve | clare done everything |
| ending in don | in don |
| we don't take donut donations here | we don't take |
| we don't take donut donations here | take donut donations |
| we don't take donut donations here | donut donations here |
| where it was done it will retrieve the... | was done it |
+---------------------------------------------------------+-----------------------+
And a working example:
http://rextester.com/RND43071
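The underlying idea is just a split plus a look at each token's neighbors, which is what the lag/lead pair does in the query above. A minimal Python sketch of the same logic (sample rows hard-coded; the helper name `context` is made up for illustration):

```python
# Split each line on spaces and, for every word containing "don",
# emit the previous word, the word itself, and the next word.
rows = [
    "where it was done it will retrieve the...",
    "at the end of the trip clare done everything to improve",
    "it is the only one done in these times",
]

def context(text, needle="don"):
    words = text.split()
    out = []
    for i, w in enumerate(words):
        if needle in w:
            before = words[i - 1] if i > 0 else ""
            after = words[i + 1] if i + 1 < len(words) else ""
            out.append(" ".join(p for p in (before, w, after) if p))
    return out

results = [r for row in rows for r in context(row)]
```

Like the SQL version, words at the start or end of a line simply have an empty neighbor on that side.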

PostgreSQL 9.10 - returning the maximum value from JSON arrays

Looking for a method to calculate the maximum value in the numeric arrays contained in a JSON column using PostgreSQL.
Simple example:
room, data
1 , '{"history":{"samples":{"101":[5,10,50,20],"102":[10,15,5,5]}}}'
What I'm looking for is the maximum value for a particular "history -> samples" item for a room. In this case, it would be 50 for sample 101 and 15 for sample 102, but the real data is larger than this.
Here is a sqlfiddle with some actual data: http://sqlfiddle.com/#!17/2c7a0
Ultimately, I would like to end up with a pivot with the room and samples as columns with the maximum value in that array. Is there a fairly simple way to do this with the large number of elements in the arrays? (crosstab or cross lateral join?) Something like the following based on the simple example from above:
room | 101 | 102 | ... ->
1 | 50 | 15
2 | x | x
etc.
again, see sqlfiddle for sample data
You could use LATERAL and json_array_elements:
SELECT j.id, s2.*
FROM jsonData j
,LATERAL (SELECT (data -> 'history') -> 'data' ) s(c)
,LATERAL ( VALUES(
(SELECT MAX(value::text::decimal(10,2))
FROM json_array_elements((s.c -> '101')::json) x),
(SELECT MAX(value::text::decimal(10,2))
FROM json_array_elements((s.c -> '102')::json) x))
)s2("101","102"); -- add more cols here
DBFiddle Demo
This is not a complete answer, but it may help get you closer to what you're looking for:
select key, data->'history'->'data' #> array[key] as values
from
(select *, jsonb_object_keys(data->'history'->'data') as key
from jsonData) as a
Output:
See fiddle demo
You can select only a single room and do all the work on it, then it's easier:
select key, max(val::text::float) from
(
select key, jsonb_array_elements(values) as val
from
(select key, data->'history'->'data' #> array[key] as values
from
(select *, jsonb_object_keys(data->'history'->'data') as key
from jsonData) as a)
as b
) as c
group by key
order by 1
Fiddle demo
And if you want to display it in horizontal way instead of vertical, you can use crosstab (tablefunc)
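For a sense of the target result shape, here is a minimal Python sketch of the same max-per-sample pivot, computed outside the database (sample row hard-coded to match the question's simple example):

```python
import json

# Hypothetical (room, data) rows mirroring the question's example.
rows = [
    (1, '{"history":{"samples":{"101":[5,10,50,20],"102":[10,15,5,5]}}}'),
]

# For each room, take the max of every sample array -> {room: {sample: max}}
pivot = {}
for room, data in rows:
    samples = json.loads(data)["history"]["samples"]
    pivot[room] = {key: max(values) for key, values in samples.items()}
```

Each inner dict corresponds to one row of the desired crosstab output, with sample ids as column names.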

psql/redshift: is there a way to use window functions like FIRST_VALUE in a GROUP BY expression?

motivation: This seems kind of terrible, but I'm trying to write string_agg in Redshift using multiple queries, which will coalesce neighboring rows. My maximum group size isn't that big, so I think the query would only run for a few iterations. I've managed to preprocess my data in a form that's like,
key | merge index | value
a | 0 | foo
a | 0 | bar
a | 1 | baz
b | 0 | fandangle
in one step, everything with the same (key, merge_index) should be concatenated, so we get,
key | merge index | value
a | 0 | foo, bar
a | 1 | baz
b | 0 | fandangle
I want to use first_value and last_value in a GROUP BY statement like so,
SELECT key,
merge_index,
FIRST_VALUE(value) || COALESCE((', ' || NTH_VALUE(value, 2)), '')
GROUP BY key, merge_index;
but, of course, you can't do that because FIRST_VALUE and NTH_VALUE are window functions, not aggregate functions.
question: Why can't I use FIRST_VALUE and friends in a GROUP BY group?
note: It works functionally to do a SELECT DISTINCT, omit the GROUP BY, and use the relevant OVER (PARTITION BY key, merge_index) windows, but I can't imagine this is efficient if it's trying to deduplicate the entire result table. I also realize I could do more preprocessing and add a column like left_or_right which indicates which side it's trying to merge, and then use a left join. That also doesn't seem too efficient, but maybe it's not bad.
I like David's queries, but he didn't get into why:
Window functions are the last part of the query to be executed, after the grouping and ordering. Because of this, a window function always outputs one value per record in the final data set. You can use aggregates inside window functions, but not window functions inside aggregates. To achieve your goal, you need to do another pass over the data set to aggregate, which is accomplished with a subquery.
Have you tried something like the following? This way you can avoid FIRST_VALUE() and NTH_VALUE() as well as aggregation:
WITH p AS (
SELECT key, merge, value
, ROW_NUMBER() OVER ( PARTITION BY key, merge ) AS rn
FROM mytable
)
SELECT p1.key, p1.merge, p1.value || COALESCE(',' || p2.value, '')
FROM p p1 LEFT JOIN p p2
ON p1.key = p2.key
AND p1.merge = p2.merge
AND p2.rn = 2
WHERE p1.rn = 1
Please see SQL Fiddle demo here. Yes, I did use Postgres 9 for the fiddle; I couldn't get a connection on 8 (but I don't think I'm using any features of 9).
Alternately, you might use the following and avoid a self-join:
WITH p AS (
SELECT key, merge, value
, LEAD(value) OVER ( PARTITION BY key, merge ) AS next_value
, ROW_NUMBER() OVER ( PARTITION BY key, merge ) AS rn
FROM mytable
)
SELECT key, merge, value || COALESCE(',' || next_value, '')
FROM p
WHERE rn = 1
SQL Fiddle here. If you knew in advance how many values you needed to concatenate, you could make multiple calls to LEAD() with increasing offset values (more SQL Fiddle):
WITH p AS (
SELECT key, merge, value
, LEAD(value) OVER ( PARTITION BY key, merge ) AS next_value
, LEAD(value,2) OVER ( PARTITION BY key, merge ) AS n2_value
, LEAD(value,3) OVER ( PARTITION BY key, merge ) AS n3_value
, ROW_NUMBER() OVER ( PARTITION BY key, merge ) AS rn
FROM mytable
)
SELECT key, merge, value || COALESCE(',' || next_value, '') || COALESCE(',' || n2_value, '') || COALESCE(',' || n3_value, '')
FROM p
WHERE rn = 1
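The transformation itself is simple once you see it outside SQL; a minimal Python sketch of the intended (key, merge_index) concatenation, with the sample rows from the question hard-coded:

```python
from itertools import groupby

# Concatenate all values sharing the same (key, merge_index), preserving
# input order. groupby requires adjacent rows to share a key, which the
# preprocessed data already satisfies.
rows = [
    ("a", 0, "foo"),
    ("a", 0, "bar"),
    ("a", 1, "baz"),
    ("b", 0, "fandangle"),
]

merged = [
    (key, idx, ", ".join(v for _, _, v in group))
    for (key, idx), group in groupby(rows, key=lambda r: (r[0], r[1]))
]
```

This is exactly what the LEAD-based query emulates: each group's first row picks up the values of its followers.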

How to find the last descendant (that matches other criteria) in a linear “ancestor-descendant” relationship

This question is based on the following question, but with an additional requirement: PostgreSQL: How to find the last descendant in a linear "ancestor-descendant" relationship
Basically, what I need is a PostgreSQL statement that finds the last descendant in a linear “ancestor-descendant” relationship that matches additional criteria.
Example:
Here is the content of table "RELATIONSHIP_TABLE":
id | id_ancestor | id_entry | bool_flag
---------------------------------------
1 | null | a | false
2 | 1 | a | false
3 | 2 | a | true
4 | 3 | a | false
5 | null | b | true
6 | null | c | false
7 | 6 | c | false
Every record within a particular hierarchy has the same "id_entry"
There are 3 different “ancestor-descendant” relationships in this example:
1. 1 <- 2 <- 3 <- 4
2. 5
3. 6 <- 7
Question PostgreSQL: How to find the last descendant in a linear "ancestor-descendant" relationship shows how to find the last record of each relationship. In the example above:
1. 4
2. 5
3. 7
So, what I need this time is the last descendant by "id_entry" whose "bool_flag" is set to true. In the example above:
1. 3
2. 5
3. <empty result>
Does anyone know a solution?
Thanks in advance :)
QStormDS
Graphs, trees, chains, etc. represented as edge lists are usually good use cases for recursive common table expressions, i.e. WITH RECURSIVE queries.
Something like:
WITH RECURSIVE walk(id, id_ancestor, id_entry, bool_flag, id_root, generation) AS (
SELECT id, id_ancestor, id_entry, bool_flag, id, 0
FROM RELATIONSHIP_TABLE
WHERE id_ancestor IS NULL
UNION ALL
SELECT x.id, x.id_ancestor, x.id_entry, x.bool_flag, walk.id_root, walk.generation + 1
FROM RELATIONSHIP_TABLE x INNER JOIN walk ON x.id_ancestor = walk.id
)
SELECT
id_entry, id_root, id
FROM (
SELECT
id, id_entry, bool_flag, id_root, generation,
max(CASE WHEN bool_flag THEN generation END ) OVER w as max_enabled_generation
FROM walk
WINDOW w AS (PARTITION BY id_root ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
) x
WHERE generation = max_enabled_generation;
... though it feels like there really should be a better way to do this than tracking how many generations we've walked down each path.
If id_entry is common to all members of a tree, you can avoid needing to track id_root. You should create a UNIQUE constraint on (id_entry, id) and a self-referencing foreign key FOREIGN KEY (id_entry, id_ancestor) REFERENCES RELATIONSHIP_TABLE (id_entry, id) to make sure that the ordering is consistent, then use:
WITH RECURSIVE walk(id, id_ancestor, id_entry, bool_flag, generation) AS (
SELECT id, id_ancestor, id_entry, bool_flag, 0
FROM RELATIONSHIP_TABLE
WHERE id_ancestor IS NULL
UNION ALL
SELECT x.id, x.id_ancestor, x.id_entry, x.bool_flag, walk.generation + 1
FROM RELATIONSHIP_TABLE x INNER JOIN walk ON x.id_ancestor = walk.id
)
SELECT
id_entry, id
FROM (
SELECT
id, id_entry, bool_flag, generation,
max(CASE WHEN bool_flag THEN generation END ) OVER w as max_enabled_generation
FROM walk
WINDOW w AS (PARTITION BY id_entry ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
) x
WHERE generation = max_enabled_generation;
Since this gives you a table of final descendants matched up with root parents, you can just filter with a regular WHERE clause now, just append AND bool_flag. If you instead want to exclude chains that have bool_flag set to false at any point along the way, you can add WHERE bool_flag in the RECURSIVE query's join.
SQLFiddle example: http://sqlfiddle.com/#!12/92a64/3
WITH RECURSIVE tail AS (
SELECT id AS opa
, id, bool_flag FROM boolshit
WHERE bool_flag = True
UNION ALL
SELECT t.opa AS opa
, b.id, b.bool_flag FROM boolshit b
JOIN tail t ON b.id_ancestor = t.id
)
SELECT *
FROM boolshit bs
WHERE bs.bool_flag = True
AND NOT EXISTS (
SELECT * FROM tail t
WHERE t.opa = bs.id
AND t.id <> bs.id
AND t.bool_flag = True
);
Explanation: select all records that have the bool_flag set, EXCEPT those that have offspring (direct or indirect) that have the bool_flag set, too. This effectively picks the last record of the chain that has the flag set.

query for a range of records in result

I am wondering if there is some easy way, a function, or other method to return data from a query with the following results.
I have a SQL Server 2008 R2 Express database with a table that contains numerical data in a given column, say col T.
I am given a value X in code and would like to return up to three records: the record where col T equals my value X, and the records immediately before and after it, and nothing else. The sort is done on col T. The record before may not exist if X is at the beginning of the table; likewise, if X matches the last record then the record after would not exist.
The value of X may not exist in the table.
This, I think, is similar to getting a range of results in numerical order.
Any help or direction in solving this would be greatly appreciated.
Thanks again,
It might not be the most optimal solution, but:
SELECT T
FROM theTable
WHERE T = X
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T > X
ORDER BY T
) blah
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T < X
ORDER BY T DESC
) blah2
DECLARE #x int = 100
;WITH t as
(
select ROW_NUMBER() OVER (ORDER BY T ASC) AS row_nm,*
from YourTable
)
, t1 as
(
select *
from t
WHERE T = #x
)
select *
from t
CROSS APPLY t1
WHERE t.row_nm BETWEEN t1.row_nm -1 and t1.row_nm + 1
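Outside SQL, the same "match plus neighbors" lookup is a small binary search over the sorted values of col T; a minimal Python sketch (the helper name `neighbors` is made up for illustration):

```python
from bisect import bisect_left

# Given the values of column T, return the exact match (if any) plus its
# immediate neighbors in sorted order -- at most three values.
def neighbors(values, x):
    values = sorted(values)
    i = bisect_left(values, x)
    out = []
    if i > 0:
        out.append(values[i - 1])      # record before (may not exist)
    if i < len(values) and values[i] == x:
        out.append(values[i])          # exact match (X may not exist)
        i += 1
    if i < len(values):
        out.append(values[i])          # record after (may not exist)
    return out
```

Note that, like the first UNION ALL query above (and unlike the ROW_NUMBER variant), this still returns the surrounding records when X itself is absent from the table.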