I have a string that contains a semicolon-separated list of key-value pairs,
e.g. ref:12345;code:ab and so on.
I would like to split it into 'ab' as code, '12345' as ref, etc.
Any help is really appreciated.
You can use a combination of regexp_split_to_table and split_part.
Have a look at PostgreSQL docs.
CREATE TABLE t (myText text);
INSERT INTO t VALUES ('ref:12345;code:ab;ref:5678;code:cd');
SELECT
split_part(pair, ':', 1) as name,
split_part(pair, ':', 2) as value
FROM
(SELECT regexp_split_to_table(myText, ';') pair FROM t) t1
Result:
 name | value
------+-------
 ref  | 12345
 code | ab
 ref  | 5678
 code | cd
UPDATE
According to your comment if your desired result is:
xxx as code
xxx as ref
You can use:
SELECT
CONCAT(split_part(pair, ':', 2), ' as ', split_part(pair, ':', 1)) RESULT
FROM (SELECT regexp_split_to_table(myText, ';') pair FROM t) t1
That returns:
result
12345 as ref
ab as code
5678 as ref
cd as code
A little bit messy but I hope self explanatory.
with t as
(
select (r + 1)/2 as r,
split_part(txt, ':', 1) as k,
split_part(txt, ':', 2) as v
from unnest(string_to_array('ref:12345;code:ab;ref:5678;code:cd;ref:9876;code:yz', ';'))
with ordinality as t(txt, r)
)
select
max(v) filter (where k = 'ref') as ref_fld,
max(v) filter (where k = 'code') as code_fld
from t group by r;
Result:
 ref_fld | code_fld
---------+----------
 12345   | ab
 9876    | yz
 5678    | cd
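To see why grouping by (r + 1)/2 pairs each ref with the code that follows it, it can help to look at the intermediate rows. This is only a sketch: the aliases ordinal and pair_no are illustrative, it assumes the keys always arrive in ref, code order, and WITH ORDINALITY needs PostgreSQL 9.4 or later.

```sql
-- Integer division maps ordinals 1,2 -> pair 1 and 3,4 -> pair 2, and so on.
select r           as ordinal,
       (r + 1) / 2 as pair_no,
       txt
from unnest(string_to_array('ref:12345;code:ab;ref:5678;code:cd', ';'))
     with ordinality as t(txt, r)
order by r;
-- ordinal | pair_no | txt
--       1 |       1 | ref:12345
--       2 |       1 | code:ab
--       3 |       2 | ref:5678
--       4 |       2 | code:cd
```

Grouping by pair_no and picking the value per key with max(v) filter (...) then yields one row per ref/code pair.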
I work with SQL Server 2012 and have an issue: when I build the Text Unit list with STUFF, I can't get a Text Unit to appear only once when it is repeated for the same feature.
When the same Text Unit is repeated for the same feature, it should be displayed only once, not once per row.
For example, instead of Voltage | Voltage | Voltage, only a single Voltage should be displayed.
CREATE TABLE #FinalTable
(
PartID INT,
DKFeatureName NVARCHAR(100),
TextUnit NVARCHAR(100),
StatusId INT
)
INSERT INTO #FinalTable (PartID, DKFeatureName, TextUnit, StatusId)
VALUES
(1211, 'PowerSupply', 'Voltage', 3),
(1211, 'PowerSupply', 'Voltage', 3),
(1211, 'PowerSupply', 'Voltage', 3)
SELECT
PartID, DKFeatureName,
COUNT(PartID) AS CountParts,
TextUnit = STUFF ((SELECT ' | ' + TextUnit
FROM #FinalTable b
WHERE b.PartID = a.PartID
AND a.DKFeatureName = b.DKFeatureName
AND StatusId = 3
FOR XML PATH('')), 1, 2, ' ')
INTO
#getUnitsSticky
FROM
#FinalTable a
GROUP BY
PartID, DKFeatureName
HAVING
(COUNT(PartID) > 1)
SELECT *
FROM #getUnitsSticky
Expected result is :
Voltage
Incorrect result or result I don't need is as below :
Voltage|Voltage|Voltage
TomC's answer is basically correct. However, when using this method with SQL Server, it is usually more efficient to get the rows in a subquery and then use stuff() in the outer query. That way, the values in each row are processed only once.
So:
SELECT PartID, DKFeatureName, CountParts,
STUFF( (SELECT DISTINCT ' | ' + TextUnit -- DISTINCT collapses repeated units
FROM #FinalTable b
WHERE b.PartID = a.PartID AND
b.DKFeatureName = a.DKFeatureName AND
StatusId = 3
FOR XML PATH('')
), 1, 3, '') as TextUnit -- delete the leading ' | ' (3 characters)
INTO #getUnitsSticky
FROM (SELECT PartID, DKFeatureName, COUNT(*) as CountParts
FROM #FinalTable a
GROUP BY PartID, DKFeatureName
HAVING COUNT(*) > 1
) a;
This also removes the leading space from the concatenated result.
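To see what the STUFF arguments do in isolation, here is a minimal sketch on a literal string. The literal and the column aliases are illustrative values, not taken from #FinalTable.

```sql
-- STUFF(string, start, length, replacement) deletes `length` characters
-- beginning at `start` and inserts `replacement` in their place.
SELECT STUFF(' | Voltage | Current', 1, 3, '')  AS trimmed,       -- 'Voltage | Current'
       STUFF(' | Voltage | Current', 1, 2, ' ') AS still_padded;  -- '  Voltage | Current'
```

Deleting all three characters of the leading ' | ' and inserting an empty string leaves no leading space; deleting only two and inserting a space leaves the result padded.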
To put this into a complete answer - this should be your SQL (shortened slightly and removed the last temp table):
SELECT
PartID, DKFeatureName,
COUNT(PartID) AS CountParts,
TextUnit = STUFF ((SELECT distinct ' | ' + TextUnit
FROM #FinalTable b
WHERE b.PartID = a.PartID
AND a.DKFeatureName = b.DKFeatureName
AND StatusId = 3
FOR XML PATH('')), 1, 2, ' ')
FROM #FinalTable a
GROUP BY PartID, DKFeatureName
HAVING (COUNT(PartID) > 1)
Function px_explode will be provided with two parameters:
separator
string
Final result will look like this:
SELECT * FROM dbo.px_explode('xxy', 'alfaxxybetaxxygama')
and should return the rows alfa, beta and gama.
But...
The query never finishes executing, so I assume I have run into an infinite loop.
How can I avoid the infinite loop, and what am I missing?
Code:
CREATE FUNCTION dbo.px_explode
    (@separator VARCHAR(10), @string VARCHAR(2000))
RETURNS @expl_tbl TABLE
    (val VARCHAR(100))
AS
BEGIN
    IF (CHARINDEX(@separator, @string) = 0) and (LTRIM(RTRIM(@string)) <> '')
        INSERT INTO @expl_tbl VALUES(LTRIM(RTRIM(@string)))
    ELSE
    BEGIN
        WHILE CHARINDEX(@separator, @string) > 0
        BEGIN
            IF (LTRIM(RTRIM(LEFT(@string, CHARINDEX(@separator, @string) - 1)))
                <> '')
                INSERT INTO @expl_tbl VALUES(LTRIM(RTRIM(LEFT(@string,
                    CHARINDEX(@separator, @string) - 1))))
        END
        IF LTRIM(RTRIM(@string)) <> ''
            INSERT INTO @expl_tbl VALUES(LTRIM(RTRIM(@string)))
    END
    RETURN
END
Loops are bad, and so are multi-statement table-valued functions (i.e. where you define the table). If performance is important, then you want a tally table and an inline table-valued function (iTVF).
For a high-performing way to resolve this I would first grab a copy of Ngrams8k. The solution you're looking for will look like this:
DECLARE @string varchar(8000) = 'alfaxxybetaxxygama',
        @delimiter varchar(20) = 'xxy'; -- use
SELECT
    itemNumber = row_number() over (ORDER BY d.p),
    itemIndex = isnull(nullif(d.p+l.d, 0),1),
    item = SUBSTRING
    (
        @string,
        d.p+l.d, -- delimiter position + delimiter length
        isnull(nullif(charindex(@delimiter, @string, d.p+l.d),0) - (d.p+l.d), 8000)
    )
FROM (values (len(@string), len(@delimiter))) l(s,d) -- 1 is fine for l.d but keeping uniform
CROSS APPLY
(
    SELECT -(l.d) union all
    SELECT ng.position
    FROM dbo.NGrams8K(@string, l.d) as ng
    WHERE token = @delimiter
) as d(p); -- delimiter.position
Which returns
itemNumber  itemIndex  item
----------  ---------  ----
1           1          alfa
2           8          beta
3           15         gama
Against a table it would look like this:
DECLARE @table table (string varchar(8000));
INSERT @table VALUES ('abcxxyXYZxxy123'), ('alfaxxybetaxxygama');
DECLARE @delimiter varchar(100) = 'xxy';
SELECT *
FROM @table t
CROSS APPLY
(
    SELECT
        itemNumber = row_number() over (ORDER BY d.p),
        itemIndex = isnull(nullif(d.p+l.d, 0),1),
        item = SUBSTRING
        (
            t.string,
            d.p+l.d, -- delimiter position + delimiter length
            isnull(nullif(charindex(@delimiter, t.string, d.p+l.d),0) - (d.p+l.d), 8000)
        )
    FROM (values (len(t.string), len(@delimiter))) l(s,d) -- 1 is fine for l.d but keeping uniform
    CROSS APPLY
    (
        SELECT -(l.d) union all
        SELECT ng.position
        FROM dbo.NGrams8K(t.string, l.d) as ng
        WHERE token = @delimiter
    ) as d(p) -- delimiter.position
) split;
Results:
string              itemNumber  itemIndex  item
------------------  ----------  ---------  ----
abcxxyXYZxxy123     1           1          abc
abcxxyXYZxxy123     2           7          XYZ
abcxxyXYZxxy123     3           13         123
alfaxxybetaxxygama  1           1          alfa
alfaxxybetaxxygama  2           8          beta
alfaxxybetaxxygama  3           15         gama
My favourite is the XML splitter. This needs no function and is fully inlineable. If you can introduce a function to your database, the suggested links in Gareth's comment give you some very good ideas.
This is simple and quite straight forward:
DECLARE @YourString VARCHAR(100)='alfaxxybetaxxygama';
SELECT nd.value('text()[1]','nvarchar(max)')
FROM (SELECT CAST('<x>' + REPLACE((SELECT @YourString AS [*] FOR XML PATH('')),'xxy','</x><x>') + '</x>' AS XML)) AS A(Casted)
CROSS APPLY A.Casted.nodes('/x') AS B(nd);
This will first transform your string to an XML like this
<x>alfa</x>
<x>beta</x>
<x>gama</x>
... simply by replacing the delimiters xxy with XML tags. The rest is easy reading from XML .nodes()
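As for the infinite loop in the original function: the WHILE condition tests CHARINDEX of the separator in the string, but nothing inside the loop ever shortens the string, so the condition stays true forever. A minimal sketch of a terminating version follows. The name px_explode_fixed and the local variables @pos and @sepLen are hypothetical, and the parameters are assumed to match the question. It is still a loop-based multi-statement function, so it only illustrates the fix and is not the fastest approach.

```sql
CREATE FUNCTION dbo.px_explode_fixed
    (@separator VARCHAR(10), @string VARCHAR(2000))
RETURNS @expl_tbl TABLE (val VARCHAR(100))
AS
BEGIN
    -- LEN ignores trailing spaces, so append a sentinel character before measuring.
    DECLARE @sepLen INT = LEN(@separator + '.') - 1;
    DECLARE @pos INT = CHARINDEX(@separator, @string);

    WHILE @pos > 0
    BEGIN
        -- Store the piece in front of the separator unless it is blank.
        IF LTRIM(RTRIM(LEFT(@string, @pos - 1))) <> ''
            INSERT INTO @expl_tbl VALUES (LTRIM(RTRIM(LEFT(@string, @pos - 1))));

        -- Drop the processed piece and the separator so the loop makes progress.
        SET @string = SUBSTRING(@string, @pos + @sepLen, 2000);
        SET @pos = CHARINDEX(@separator, @string);
    END

    -- Whatever remains after the last separator is the final item.
    IF LTRIM(RTRIM(@string)) <> ''
        INSERT INTO @expl_tbl VALUES (LTRIM(RTRIM(@string)));

    RETURN
END
```

With this change, SELECT * FROM dbo.px_explode_fixed('xxy', 'alfaxxybetaxxygama') should return the three rows alfa, beta and gama, because each pass removes one item and one separator from the front of the string.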
So I have a varchar string column, let's call it ID (samples below):
97.128.39.256.1460854333288493
25.365.49.12.13454154815132
346.45.156.354.1523425161233
I want to grab everything to the left of the 4th period, like LEFT in Excel. How do I write an expression that finds the fourth instance of a period?
I know substring is a start, but I'm not sure how to handle the dynamic length.
This is probably the easiest for someone else to read:
select split_part(i, '.', 1) || '.' ||
split_part(i, '.', 2) || '.' ||
split_part(i, '.', 3) || '.' ||
split_part(i, '.', 4)
from (select '97.128.39.256.1460854333288493' as i) as sub;
Or if you don't like split_part and prefer to use arrays:
select array_to_string((string_to_array(i, '.'))[1:4], '.')
from (select '97.128.39.256.1460854333288493' as i) as sub;
I think the array example is a bit harder to grasp at first glance but both work.
Updated answer based on revised question to also convert the Unix timestamp to a Greenplum timestamp:
select 'epoch'::timestamp + '1 second'::interval *
(split_part(i, '.', 5)::numeric/1000000) as event_time,
array_to_string((string_to_array(i, '.'))[1:4], '.') as ip_address
from (
select '97.128.39.256.1460854333288493' as i
) as sub;
You could also try this:
mydb=> select regexp_replace('97.128.39.256.1460854333288493', E'^((?:\\d+\\.){3}\\d+).+$', E'\\1');
regexp_replace
----------------
97.128.39.256
(1 row)
Time: 0.634 ms
with t (s) as ( values
('97.128.39.256.1460854333288493'),
('25.365.49.12.13454154815132'),
('346.45.156.354.1523425161233')
)
select a[1] || '.' || a[2] || '.' || a[3] || '.' || a[4]
from (
select regexp_split_to_array(s, '\.')
from t
) t (a)
;
?column?
----------------
97.128.39.256
25.365.49.12
346.45.156.354
I have a SQL Server 2008 R2 database with a table that has a column containing multiple values, separated by CR+LF (char 13 and char 10).
I'm building a script to import the data in a properly normalized schema.
My source table contains something like this :
ID | Value
________________
1 | line 1
line 2
________________
2 | line 3
________________
3 | line 4
line 5
line 6
________________
and so on.
[edit] FYI, ID is an integer and Value is nvarchar(3072) [/edit]
What I want is to query the table to output something like this:
ID | Value
________________
1 | line 1
________________
1 | line 2
________________
2 | line 3
________________
3 | line 4
________________
3 | line 5
________________
3 | line 6
________________
I've read many answers here on SO and around the web, and it seems that using master..spt_values should be the solution. In particular, I tried to reproduce the solution from the question Split one column into multiple rows.
However, I had no success (I suspect the two-character separator is causing problems).
So far, I have written this query:
SELECT
T.ID,
T.Value,
RIGHT(LEFT(T.Value,spt.Number-1),
CHARINDEX(char(13)+char(10),REVERSE(LEFT(char(13)+char(10)+T.Value,spt.Number-1)))) as Extracted
FROM
master..spt_values spt,
ContactsNew T
WHERE
Type = 'P' AND
spt.Number BETWEEN 1 AND LEN(T.Value)+1
AND
(SUBSTRING(T.Value,spt.Number,2) = char(13)+char(10) OR SUBSTRING(T.Value,spt.Number,2) = '')
This query, unfortunately is returning :
ID | Value | Extracted
________________________________
1 | line 1 | <blank>
line 2 |
________________________________
1 | line 1 | line 2
line 2 |
________________________________
2 | line 3 | <blank>
________________________________
3 | line 4 | <blank>
line 5 |
line 6 |
________________________________
3 | line 4 | line 5
line 5 | line 6
line 6 |
________________________________
3 | line 4 | line 6
line 5 |
line 6 |
________________________________
<blank> is an empty string, not null string.
I'd appreciate some help to tune my query.
[Edit2] My source table contains less than 200 records, and performance is not a requirement, so I'm targeting a simple solution rather than an efficient one [Edit2]
[Edit3] The source database is readonly. I can't add stored procedure, function, or clr type. I have to do this in a single query. [Edit3]
[Edit4] Something strange... it seems that whitespaces are also considered as separators.
If I run the following query :
SELECT
T.ID,
replace(T.Value, '#', ' '),
replace(RIGHT(
LEFT(T.Value,spt.Number-1),
CHARINDEX( char(13) + char(10),REVERSE(LEFT(char(10) + char(13)+T.Value,spt.Number-0)))
), '#', ' ')
FROM
master..spt_values spt,
(
select contactID,
replace(Value,' ', '#') Value
from ContactsNew where Value is not null
) T
WHERE
Type = 'P' AND
spt.Number BETWEEN 1 AND LEN(T.Value)+1
AND
(SUBSTRING(T.Value,spt.Number,2) = char(13) + char(10) OR SUBSTRING(T.Value,spt.Number,1) = '')
I got the correct number of returns (however, still having wrong values), while running this query :
SELECT
T.ID,
T.Value,
RIGHT(
LEFT(T.Value,spt.Number-1),
CHARINDEX( char(13) + char(10),REVERSE(LEFT(char(10) + char(13)+T.Value,spt.Number-0)))
)
FROM
master..spt_values spt,
(
select contactID,
Value
from ContactsNew where Value is not null
) T
WHERE
Type = 'P' AND
spt.Number BETWEEN 1 AND LEN(T.Value)+1
AND
(SUBSTRING(T.Value,spt.Number,2) = char(13) + char(10) OR SUBSTRING(T.Value,spt.Number,1) = '')
splits on spaces also
EDIT #1: I've deleted the original answer text. Try the following query; I slightly modified your logic. If you have any questions about it, don't hesitate to ask in a comment. If you need another split delimiter, just introduce another nested query that replaces that delimiter with CHAR(13)+CHAR(10).
SELECT
*
FROM
(
SELECT
T.ID,
T.Value,
CASE
WHEN CHARINDEX(CHAR(13) + CHAR(10), SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1)) > 0 THEN
LEFT(
SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1),
CHARINDEX(CHAR(13) + CHAR(10), SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1)) - 1)
/* added by Steve B. see comments for the reasons */
when len(T.Value) = spt.Number then right(t.Value, spt.number -1)
/* end of edit */
ELSE
SUBSTRING(T.Value, spt.number, LEN(T.Value) - spt.Number + 1)
END EXTRACTED
FROM
master..spt_values spt,
ContactsNew T
WHERE
Type = 'P' AND
spt.Number BETWEEN 1 AND LEN(T.Value)+1
) X
WHERE
EXTRACTED <> '' AND
(
LEFT(X.VALUE, LEN(EXTRACTED)) = EXTRACTED OR
X.Value LIKE '%' + CHAR(13) + CHAR(10) + EXTRACTED + CHAR(13) + CHAR(10) + '%' OR
X.Value LIKE '%' + CHAR(13) + CHAR(10) + EXTRACTED
)
Here is a sample query showing how to perform this kind of operation against test data similar to that described.
If you aren't able to declare variables in your final statement, you can find/replace them with their values, but declaring them makes things a bit simpler.
This works by replacing CR+LF with a single character before doing the split.
If '|' is in use in your data, choose another single character that does not occur in it as the temporary delimiter.
declare @crlf nvarchar(2) = char(13) + char(10) -- CR + LF, in that order
declare @cDelim nvarchar(1) = N'|'
-- test data
declare @t table
    (id int
    ,value nvarchar(3072))
insert @t
select 1, 'line1' + @crlf + 'line2'
union all select 2, 'line3'
union all select 3, 'line4' + @crlf + 'line5' + @crlf + 'line6'
-- /test data
;WITH charCTE
AS
(
    --split the string into a dataset
    SELECT D.id, D.value, SUBSTRING(D.s,n,CHARINDEX(@cDelim, D.s + @cDelim,n) -n) AS ELEMENT
    FROM (SELECT id, value, REPLACE(value,@crlf,@cDelim) as s from @t) AS D
    JOIN (SELECT TOP 3072 ROW_NUMBER() OVER (ORDER BY a.type, a.number, a.name) AS n
          FROM master.dbo.spt_values a
          CROSS
          JOIN master.dbo.spt_values b
         ) AS numsCte
    ON n <= LEN(s)
    AND SUBSTRING(@cDelim + s,n,1) = @cDelim
)
SELECT id, ELEMENT
FROM charCTE
order by id, element