I have been given a T-SQL task: converting names that are in ALL CAPS into Title Case. I have decided that splitting the names into tokens and capitalizing the first letter of each token would be a reasonable approach (I am willing to take advice if there is a better option, especially in T-SQL).
That said, to accomplish this I'd have to split the name fields on spaces as well as hyphens, dashes, etc. Then, once the names are tokenized, I can worry about normalizing the case.
Is there any reasonable way to split a string along any delimiter in a list?
If ease and performance are important, grab a copy of PatExtract8K.
Here's a basic example where I split on any character that is not a letter or number ([^a-z0-9]):
-- Sample String
DECLARE @string VARCHAR(8000) = 'abc.123&xyz!4445556__5566^rrr';
-- Basic Use
SELECT pe.* FROM samd.patExtract8K(@string,'[^a-z0-9]') AS pe;
Output:
itemNumber itemIndex itemLength item
---------- --------- ---------- -------
1          1         3          abc
2          5         3          123
3          9         3          xyz
4          13        7          4445556
5          22        4          5566
6          27        3          rrr
It returns the items you need, as well as:
the length of the item (itemLength)
its position in the string (itemIndex)
its ordinal position in the string (itemNumber)
Now against a table. Here we're doing the same thing, but this time I'll explicitly call out the characters I want treated as delimiters. Here it's any of these characters: *.&,?%/>
-- Sample Table
DECLARE @table TABLE (SomeId INT IDENTITY, SomeString VARCHAR(100));
INSERT @table VALUES('abc***332211,,XXX'),('abc.123&&555%jjj'),('ll/111>ff?12345');
SELECT t.*, pe.*
FROM @table AS t
CROSS APPLY samd.patExtract8K(t.SomeString,'[*.&,?%/>]') AS pe;
This returns:
SomeId SomeString        itemNumber itemIndex itemLength item
------ ----------------- ---------- --------- ---------- ------
1      abc***332211,,XXX 1          1         3          abc
1      abc***332211,,XXX 2          7         6          332211
1      abc***332211,,XXX 3          15        3          XXX
2      abc.123&&555%jjj  1          1         3          abc
2      abc.123&&555%jjj  2          5         3          123
2      abc.123&&555%jjj  3          10        3          555
2      abc.123&&555%jjj  4          14        3          jjj
3      ll/111>ff?12345   1          1         2          ll
3      ll/111>ff?12345   2          4         3          111
3      ll/111>ff?12345   3          8         2          ff
3      ll/111>ff?12345   4          11        5          12345
On the other hand, if I wanted to extract the delimiters instead, I could invert the pattern: [^*.&,?%/>]. Now the same query returns:
SomeId itemNumber itemIndex itemLength item
------ ---------- --------- ---------- ----
1      1          4         3          ***
1      2          13        2          ,,
2      1          4         1          .
2      2          8         2          &&
2      3          13        1          %
3      1          3         1          /
3      2          7         1          >
3      3          10        1          ?
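To tie this back to your title-case task, here's a minimal sketch (assuming a hypothetical dbo.Names table with a FullName column) that title-cases each token patExtract8K returns; itemIndex and itemLength give you what you need if you later want to stitch the tokens back into the original string:
-- Hypothetical table/column names; adjust to your schema
SELECT n.FullName,
       pe.itemNumber,
       pe.itemIndex,
       -- upper-case the first letter, lower-case the rest of the token
       UPPER(LEFT(pe.item, 1)) + LOWER(SUBSTRING(pe.item, 2, 8000)) AS titleCasedItem
FROM dbo.Names AS n
CROSS APPLY samd.patExtract8K(n.FullName, '[^a-z0-9]') AS pe;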
Given the following table:
time kind counter key1 value
----------------------------
1    1    1       1    1
2    0    1       1    2
3    0    1       2    3
5    0    1       1    4
5    1    2       2    5
6    0    2       3    6
7    0    2       2    7
8    1    3       3    8
9    1    4       3    9
How would one select the value in the first row
immediately after and immediately before each
row of kind 1, ordered by time, where the key1
value is the same in both instances, i.e.:
time value prevvalue nextvalue
------------------------------
1    1     0n        2
5    5     3         7
8    8     6         0n
9    9     6         0n
Here are some of the things I have tried, though
to be honest I have no idea how to canonically achieve
something like this in q, where the prior value has a
variable offset from the current row:
select
prev[value],
next[value],
by key1 where kind<>1
update 0N^prevval,0N^nextval from update prevval:prev value1,nextval:next value1 by key1 from table
Some advice or a pointer on how to achieve this would be great!
Thanks
I was able to use the following code to return a table meeting your requirements (note that I renamed the value column to value1, since value is a keyword in q). If this is correct, the sample output you provided is incorrect; otherwise, I have misunderstood the question.
q)table:([] time:1 2 3 5 5 6 7 8 9;kind:1 0 0 0 1 0 0 1 1;counter:1 1 1 1 2 2 2 3 4;key1:1 1 2 1 2 3 2 3 3;value1:1 2 3 4 5 6 7 8 9)
q)tab2:update 0N^prevval,0N^nextval from update prevval:prev value1,nextval:next value1 by key1 from table
q)tab3:select from tab2 where kind=1
time value1 prevval nextval
---------------------------
1    1              2
5    5      3       7
8    8      6       9
9    9      8
The update statement in tab2:
update 0N^prevval,0N^nextval from update prevval:prev value1,nextval:next value1 by key1 from table
is simply adding two columns onto the original table with the previous and next value1 for each row within its key1 group. 0N^ fills the empty fields with null (0N).
The select statement in tab3:
tab3:select from tab2 where kind=1
is filtering tab2 for rows where kind=1.
The final select statement:
select time,value1,prevval,nextval from tab3
is selecting the rows you want to be returned in the final result.
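For reference, the three steps compose into a single statement (the 0N^ fill can be dropped, since prev and next already leave nulls at the group edges):
q)select time,value1,prevval,nextval from (update prevval:prev value1,nextval:next value1 by key1 from table) where kind=1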
Hope this answers your question.
Thanks,
Caitlin
I have a partitioned table, similar to the table below:
q)t:([]date:3#2019.01.01; a:1 2 3; a_test:2 3 4; b_test:3 4 5; c: 6 7 8);
date       a a_test b_test c
----------------------------
2019.01.01 1 2      3      6
2019.01.01 2 3      4      7
2019.01.01 3 4      5      8
Now I want to fetch the date column and all columns whose names have the suffix "_test" from table t.
Expected output:
date       a_test b_test
------------------------
2019.01.01 2      3
2019.01.01 3      4
2019.01.01 4      5
In my original table there are more than 100 columns whose names contain _test, so the following is not a practical solution in this case:
q)select date, a_test, b_test from t where date=2019.01.01
I tried various options like the one below, but to no avail:
q)delete all except date, *_test from select from t where date=2019.01.01
If the columns you are selecting are variable, you should use a functional qSQL statement to perform the query. The following can be used in your case (note the pattern "*_test" matches only the suffixed columns):
q)query:{[tab;dt;c]?[tab;enlist (=;`date;dt);0b;(`date,c)!`date,c]}
q)query[t;2019.01.01;cols[t] where cols[t] like "*_test"]
date       a_test b_test
------------------------
2019.01.01 2      3
2019.01.01 3      4
2019.01.01 4      5
In order to craft a particular functional statement, you can parse your query, putting dummy columns in place if you aren't sure what they should be:
q)parse "select date,c1,c2 from tab where date=dt"
?
`tab
,,(=;`date;`dt)
0b
`date`c1`c2!`date`c1`c2
A functional select is probably the best way to go here if you require adding further filters.
?[`t;();0b;{x!x}`date,exec c from meta t where c like "*_test"]
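This version omits the date constraint; since the table is partitioned, you would presumably keep the where clause as the second argument, along the lines of:
?[`t;enlist (=;`date;2019.01.01);0b;{x!x}`date,exec c from meta t where c like "*_test"]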
The functional form of any select query can be obtained by applying the -5! internal function (equivalent to parse) to a SQL-style statement.
In the example below I have created a table with 20 fields, each one beginning with either a or b.
I then use the functional form to define which fields I want.
q)tab:{[x] enlist x!count[x]#0}`$"_" sv ' raze string `a`b,/:\:til 10
q){[t;s]?[t;();0b;{[x] x!x} cols[t] where cols[t] like s]}[tab;"b*"]
b_0 b_1 b_2 b_3 b_4 b_5 b_6 b_7 b_8 b_9
---------------------------------------
0   0   0   0   0   0   0   0   0   0
q){[t;s]?[t;();0b;{[x] x!x} cols[t] where cols[t] like s]}[tab;"a*"]
a_0 a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9
---------------------------------------
0   0   0   0   0   0   0   0   0   0
q)-5!" select a,b from c"
?
`c
()
0b
`a`b!`a`b
Alternatively, if I don't require any filtering, I can use the # (take) operator, as below:
q){[x;s] (cols[x] where cols[x] like s)#x}[tab;"a*"]
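Since the original table is partitioned, the take would still need to be combined with a date-constrained select, something like:
q)(`date,cols[t] where cols[t] like "*_test")#select from t where date=2019.01.01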
I have a table that looks like this:
GroupID UserID Value
1       1      10
1       2      20
1       3      30
1       4      40
1       5      45
1       6      49
1       7      80
1       8      90
2       1      2
2       2      24
2       3      34
2       4      48
2       5      56
3       1      etc.
3       2
3       3
3       4
4       1
4       2
4       3
I am trying to write a query with LEAD that will give me the midpoint between each pair of consecutive values. To do this I have written the following:
SELECT
[GroupID]
, [UserID]+0.5
, (LEAD ([Value], 1) OVER (ORDER BY GroupID, UserID) + [Value])/2 as [Value]
from dbo.myTable
The problem with this query is that when it gets to the last user in a group, it gives me a bad value, because it averages the [Value] on the current row with the value from the first row of the next group.
What I want to do is stop it when it reaches the maximum UserID for each group. In other words, when it gets to GroupID = 1 and UserID = 8, it should stop and start over at the next group. I do not want a row that looks like this:
GroupID UserID Value
1       8.5    46
I could run a DELETE statement after I INSERT the rows into the original table, but I don't have anything to identify when a row is the "maximum" user for its group. Ideally, I would like to somehow tell the LEAD expression not to calculate it in the first place.
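For what it's worth, a minimal sketch of the usual fix: add PARTITION BY GroupID to the OVER clause so LEAD never reads past a group boundary. LEAD then returns NULL on each group's last row, and those rows can be filtered out before any INSERT:
SELECT [GroupID]
     , [UserID] + 0.5 AS [UserID]
     , ([Value] + [NextValue]) / 2 AS [Value]
FROM (
    SELECT [GroupID], [UserID], [Value]
         , LEAD([Value], 1) OVER (PARTITION BY [GroupID] ORDER BY [UserID]) AS [NextValue]
    FROM dbo.myTable
) AS x
WHERE [NextValue] IS NOT NULL; -- drops each group's last row, where LEAD has no successor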
My table looks like this:
id name  count
-- ----- -----
1  Mike  0
2  Duke  2
3  Smith 1
4  Dave  6
5  Rich  3
6  Rozie 8
7  Romeo 0
8  Khan  1
I want to select the rows with the highest count, limited to 5 (the top 5 names by count). The result would look something like this:
id name  count
-- ----- -----
6  Rozie 8
4  Dave  6
5  Rich  3
2  Duke  2
3  Smith 1
Please help, thanks.
Here is how:
MySQL:
SELECT * FROM tableName ORDER BY count DESC LIMIT 5
MS SQL:
SELECT TOP 5 * FROM tableName ORDER BY count DESC
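One caveat: a plain TOP 5 / LIMIT 5 cuts ties arbitrarily. In MS SQL you can keep tied rows with WITH TIES, and bracketing the column name avoids any clash with the COUNT keyword:
SELECT TOP (5) WITH TIES *
FROM tableName
ORDER BY [count] DESC;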
I would like a query that shows a sum of the values for each lookup row, with a default value for missing data. For example, assume I have a table as follows:
type_lookup:
id name
-- -------
1  self
2  manager
3  peer
And a table as follows:
data:
id type_lookup_id value
-- -------------- -----
1  1              1
2  1              4
3  2              9
4  2              1
5  2              9
6  1              5
7  2              6
8  1              2
9  1              1
After running a query I would like a result set as follows:
type_lookup_id value
-------------- -----
1              13
2              25
3              0
I would like all rows in the type_lookup table to be included in the result set, even if they don't appear in the data table.
It's a bit hard to read your data layout, but something like the following should do the trick. The left outer join keeps every type_lookup row even when data has no matching rows, and COALESCE turns the NULL sum for those rows into the 0 you want:
SELECT tl.id AS type_lookup_id, tl.name, COALESCE(SUM(da.value), 0) AS value
FROM type_lookup tl
LEFT OUTER JOIN data da
  ON da.type_lookup_id = tl.id
GROUP BY tl.id, tl.name
ORDER BY tl.id
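Against the sample data above, this produces the requested result, including the zero row for peer:
type_lookup_id name    value
-------------- ------- -----
1              self    13
2              manager 25
3              peer    0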
[EDIT] Subsequently edited to change count() to sum().