Expression to look up a certain character and store it in an SSIS variable - tsql

In SSIS 2008 I have a variable called @[User::EANcode]. It contains a string with a product EAN code like '1234567891123'. The value is derived from a filename like '1234567891123.jpg' via a foreach loop.
However, sometimes the filenames contain an extra '_1', '_2' etc. at the end like '1234567891123_1.jpg' resulting in a value '1234567891123_1' in the EANcode variable.
This happens when there is more than one image for the same EANcode (product). The _N addition is always a number and it is always at the end of the name/string.
What is the expression to find/catch the '_1' (or _2, or _N, etc.) so you can store it in another variable called @[User::Addition]?
If there is no addition, the variable stays empty which is fine.
The reason I need to get this _N addition into a separate variable is that I later need it when renaming the file, so I can paste the addition back onto the end.
Thanks!

I think you're looking for CHARINDEX() in conjunction with SUBSTRING(). With that, you can split off that _# into another variable like this (copy/paste and execute to see; play with the @temp1 value to see the limitations of the code):
DECLARE @temp1 varchar(20), @temp2 varchar(20)
SET @temp1 = '1234567891123_12'
IF CHARINDEX('_', @temp1) > 1
    SET @temp2 = SUBSTRING(@temp1, CHARINDEX('_', @temp1), LEN(@temp1) - CHARINDEX('_', @temp1) + 1)
SELECT @temp1, @temp2
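If you also need the bare EAN code without the suffix (e.g. to rebuild the filename later), here is a minimal T-SQL sketch along the same lines; the variable names are just for illustration:
-- split '1234567891123_12' into the base code and the addition
DECLARE @full varchar(20), @ean varchar(20), @addition varchar(20)
SET @full = '1234567891123_12'
SET @addition = ''
IF CHARINDEX('_', @full) > 1
BEGIN
    -- everything before the underscore is the EAN code
    SET @ean = LEFT(@full, CHARINDEX('_', @full) - 1)
    -- everything from the underscore onwards is the addition ('_12')
    SET @addition = SUBSTRING(@full, CHARINDEX('_', @full), LEN(@full))
END
ELSE
    SET @ean = @full
SELECT @ean AS EANcode, @addition AS Addition
The same LEFT/SUBSTRING idea can be mirrored in an SSIS expression (FINDSTRING instead of CHARINDEX); the T-SQL above is only meant to illustrate the split.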
Hope it helps!

How do I escape a single quote in SQL Server?

I am trying to insert some text data into a table in SQL Server 9.
The text includes a single quote '.
How do I escape that?
I tried using two single quotes, but it threw me some errors.
eg. insert into my_table values('hi, my name''s tim.');
Single quotes are escaped by doubling them up, just as you've shown us in your example. The following SQL illustrates this functionality. I tested it on SQL Server 2008:
DECLARE @my_table TABLE (
    [value] VARCHAR(200)
)
INSERT INTO @my_table VALUES ('hi, my name''s tim.')
SELECT * FROM @my_table
Results
value
==================
hi, my name's tim.
If escaping your single quote with another single quote isn't working for you (like it didn't for one of my recent REPLACE() queries), you can use SET QUOTED_IDENTIFIER OFF before your query, then SET QUOTED_IDENTIFIER ON after your query.
For example
SET QUOTED_IDENTIFIER OFF;
UPDATE TABLE SET NAME = REPLACE(NAME, "'S", "S");
SET QUOTED_IDENTIFIER ON;
-- set OFF then ON again
How about:
insert into my_table values('hi, my name' + char(39) + 's tim.')
Many of us know that the popular method of escaping single quotes is to double them up, like below.
PRINT 'It''s me, Arul.';
Here we are going to look at some other, alternative ways of escaping single quotes.
1. UNICODE Characters
39 is the Unicode/ASCII code point of the single quote, so we can use it like below.
PRINT 'Hi,it'+CHAR(39)+'s Arul.';
PRINT 'Helo,it'+NCHAR(39)+'s Arul.';
2. QUOTED_IDENTIFIER
Another simple alternative is to use QUOTED_IDENTIFIER.
When QUOTED_IDENTIFIER is set to OFF, strings can be enclosed in double quotes.
In this scenario we don't need to escape single quotes, which is very helpful when working with lots of string values that contain single quotes, for example long INSERT/UPDATE scripts where column values include them.
SET QUOTED_IDENTIFIER OFF;
PRINT "It's Arul."
SET QUOTED_IDENTIFIER ON;
CONCLUSION
The above-mentioned methods are applicable to both Azure and on-premises SQL Server.
2 ways to work around this:
For ' you can simply double it in the string, e.g.
select 'I''m happy' -- will get: I'm happy
For any character you are not sure of: in SQL Server you can get a character's Unicode code point with select unicode(':') (put the character you need in place of ':' and keep the number).
So in this case you can also select 'I' + nchar(39) + 'm happy'
The doubling up of the quote should have worked, so it's peculiar that it didn't work for you; however, an alternative is using double quote characters, instead of single ones, around the string. I.e.,
insert into my_table values("hi, my name's tim.");
Also, another thing to be careful of is whether it is really stored as a classic ASCII apostrophe ' (0x27) or as Unicode U+2019 (which looks similar, but is not the same). This isn't a big deal on inserts, but it can mean the world on selects and updates. If it's the Unicode value, then escaping the ' in a WHERE clause (e.g. where blah = 'Worker''s Comp') will return as if the value you are searching for isn't there, because the ' in "Worker's Comp" is actually the Unicode value. If your client application supports free-key as well as copy-and-paste based input, it could be Unicode in some rows and ASCII in others!
A simple way to confirm this is to run some kind of open-ended query that brings back the value you are searching for, and then copy and paste it into Notepad++ or some other Unicode-aware editor. The difference in appearance between the ASCII value and the Unicode one should be obvious to the eye, but if you want to be precise, it will show up as 0x27 (ASCII) versus a multi-byte sequence (Unicode) in a hex editor.
The following syntax will produce exactly ONE escaped quotation mark:
SELECT ''''
The result will be a single quote. Might be very helpful for creating dynamic SQL :).
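For example, a minimal dynamic SQL sketch (the variable names here are made up), where any quote that has to survive inside the dynamic string gets doubled again:
-- build and run a statement that itself contains a quoted literal
DECLARE @name nvarchar(50), @sql nvarchar(200)
SET @name = N'tim''s team'
-- double the quotes inside the value so the literal stays valid inside the dynamic string
SET @sql = N'SELECT ''' + REPLACE(@name, '''', '''''') + N''' AS escaped_value;'
EXEC sp_executesql @sql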
The double quotes option helped me:
SET QUOTED_IDENTIFIER OFF;
insert into my_table values("hi, my name's tim.");
SET QUOTED_IDENTIFIER ON;
This should work
DECLARE @singleQuote CHAR
SET @singleQuote = CHAR(39)
insert into my_table values('hi, my name' + @singleQuote + 's tim.')
Just insert an extra ' before any ' to be inserted. It works like an escape character in SQL Server.
Example:
When you have a value such as: I'm fine.
you can do:
UPDATE my_table SET row ='I''m fine.';
I had the same problem, but mine was not based on static data in the SQL code itself; it came from values in the data.
This code lists all the columns names and data types in my database:
SELECT DISTINCT QUOTENAME(COLUMN_NAME),DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS
But some column names actually have a single quote embedded in the name of the column, such as ...
[MyTable].[LEOS'DATACOLUMN]
To process these, I had to use the REPLACE function along with the suggested QUOTED_IDENTIFIER setting. Otherwise it would cause a syntax error when the column is used in dynamic SQL.
SET QUOTED_IDENTIFIER OFF;
SET @sql = 'SELECT DISTINCT ''' + @TableName + ''',''' + REPLACE(@ColumnName,"'","''") + ...etc
SET QUOTED_IDENTIFIER ON;
The STRING_ESCAPE function can be used on newer versions of SQL Server (2016+), although note that it escapes strings for a target format such as JSON rather than doubling single quotes for T-SQL literals.
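A small sketch of what it actually does (JSON is currently the only supported escaping type, and single quotes are not special in JSON, so they pass through unchanged):
-- escape a string for embedding in JSON; double quotes and backslashes get escaped, apostrophes do not
SELECT STRING_ESCAPE(N'hi, my name''s "tim"', 'json') AS json_escaped
-- result: hi, my name's \"tim\"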
This should work: use a backslash and put a double quote:
"UPDATE my_table SET row = \"hi, my name's tim.\";"

How can I break a long string in an "XMLTABLE" embedded SQL statement in RPGLE across multiple lines?

I have an XML path that exceeds 100 characters (and therefore truncates when the source is saved). My statement is something like this:
Exec SQL
Select Whatever
Into :Stuff
From Table as X,
XmlTable(
XmlNamespaces('http://namespace.url/' as "namespacevalue"),
'$X/really/long/path' Passing X.C1 as "X"
Columns
Field1 Char(3) Path 'example1',
Field2 Char(8) Path 'example2',
Field3 Char(32) Path '../example3'
) As R;
I must break $X/really/long/path across multiple lines. Per IBM's documentation,
The plus sign (+) can be used to indicate a continuation of a string constant.
However, this does not even pass precompile ("Token + was not valid"). I suspect this is due to where the string is in the statement.
I have also tried:
Putting the path in a host variable; this was not allowed
Using SQL CONCAT or ||; not allowed
Putting the path in a SQL global variable instead of a host variable; not allowed
I have considered:
Preparing the entire statement, but this is not ideal for a multitude of reasons
Truncating the path at a higher level in the hierarchy, but this does not return the desired "granularity" of records
Is there any way to span this specific literal in an XmlTable function across multiple lines in my source? Thanks for any and all ideas!
Something like
Exec SQL
Select Whatever
Into :Stuff
From Table as X,
XmlTable(
XmlNamespaces('http://namespace.url/' as "namespacevalue"),
'$X/really/+
long/path' Passing X.C1 as "X"
Columns
Field1 Char(3) Path 'example1',
Field2 Char(8) Path 'example2',
Field3 Char(32) Path '../example3'
) As R;
should work. Is that what you tried?
The + didn't work for me, so I had to shorten the path by using // instead of /, which might be suboptimal.

How to use regex capture groups in postgres stored procedures (if possible at all)?

In a system, I'm using a standard urn (RFC8141) as one of the fields. From that urn, one can derive a unique identifier. The weird thing about the urns described in RFC8141 is that you can have two different urns which are equal.
In order to check for unique keys, I need to extract different parts of the urn that make a unique key. To do so, I have this regex (Regex which matches URN by rfc8141):
\A(?i:urn:(?!urn:)(?<nid>[a-z0-9][a-z0-9-]{1,31}[^-]):(?<nss>(?:[-a-z0-9()+,.:=#;$_!*'&~\/]|%[0-9a-f]{2})+)(?:\?\+(?<rcomponent>.*?))?(?:\?=(?<qcomponent>.*?))?(?:#(?<fcomponent>.*?))?)\z
which results in five named capture groups (nid, nss, rcomponent, qcomponent and fcomponent). Only the nid and nss are important to check for uniqueness/equality. Or: even if the components change, as long as nid and nss are the same, two items/records are equal (no matter the values of the components). nid is checked case-insensitively, nss is checked case-sensitively.
Now, in order to check for uniqueness/equality, I'm defining a 'cleaned urn', which is the primary key. I've added a trigger, so I can extract the different capture groups. What I'd like to do is:
extract the nid and nss (see regex) of the urn
capture them by name. This is where I don't know how to do it: how can I capture these two capture groups in a postgresql stored procedure?
add them as the 'cleaned urn', lowercasing nid (so as to have case-insensitivity on that part) and url-encoding or url-decoding the string (one of the two, it doesn't matter, as long as it's consistent). (I'm also not sure if there is a url encode/decode function in Postgres, but that'll be another question once this one is solved :) ).
Example:
all these urns are equal/equivalent (and I want the primary key to be urn:example:a123,z456):
urn:example:a123,z456
URN:example:a123,z456
urn:EXAMPLE:a123,z456
urn:example:a123,z456?+abc (?+ denotes the start of the rcomponent)
urn:example:a123,z456?=xyz/something (?= denotes the start of the qcomponent)
urn:example:a123,z456#789 (# denotes the start of the fcomponent)
urn:example:a123%2Cz456
URN:EXAMPLE:a123%2cz456
urn:example:A123,z456 and urn:Example:A123,z456 both have key urn:example:A123,z456, which is different from the previous examples (because of the case-sensitiveness of the A123,z456).
just for completeness: urn:example:a123,z456?=xyz/something is different from urn:example:a123,z456/something?=xyz: everything after ?= (or ?+ or #) can be omitted, so the /something is part of the primary key in the latter case, but not in the former. (That's what the regex is actually capturing already.)
== EDIT 1: unnamed capture groups ==
With unnamed capture groups, this does the same thing:
select
  g[2] as nid,
  g[3] as nss,
  g[4] as rcomp,
  g[5] as qcomp,
  g[6] as fcomp
from (
  select regexp_matches('uRn:example:a123,z456?=xyz/something',
    '\A(urn:(?!urn:)([a-z0-9][a-z0-9-]{1,31}[^-]):((?:[-a-z0-9()+,.:=#;$_!*''&~\/]|%[0-9a-f]{2})+)(?:\?\+(.*?))?(?:\?=(.*?))?(?:#(.*?))?)$', 'i') as g
) as ar;
(g[1] is the full match, which I don't need)
I updated the query:
case-insensitive matching is done via a flag
no named capturing groups (Postgres seems to have issues with named capture groups)
and did a select on the array, splitting the array into columns.
Named capture groups don't seem to be supported, and there seem to be some issues with greedy/lazy matching and the negative lookahead. So, here's a solution that works fine:
DO $$
BEGIN
  if not exists (SELECT 1 FROM pg_type WHERE typname = 'urn') then
    CREATE TYPE urn AS (nid text, nss text, rcomp text, qcomp text, fcomp text);
  end if;
END
$$;

CREATE or REPLACE FUNCTION spliturn(urnstring text)
RETURNS urn as $$
DECLARE
  urn urn;
  urnregex text = concat(
    '\A(urn:(?!urn:)',
    '([a-z0-9][a-z0-9-]{1,31}[^-]):',
    '((?:[-a-z0-9()+,.:=#;$_!*''&~\/]|%[0-9a-f]{2})+)',
    '(?:\?\+(.*?))??',
    '(?:\?=(.*?))??',
    '(?:#(.*?))??',
    ')$');
BEGIN
  select
    lower(g[2]) as nid,
    g[3] as nss,
    g[4] as rcomp,
    g[5] as qcomp,
    g[6] as fcomp
  into urn
  from (select regexp_matches(urnstring, urnregex, 'i') as g) as ar;
  RETURN urn;
END;
$$ language 'plpgsql' immutable strict;
Notes:
no named groups (?<...>)
indicate case insensitive search with a flag
replacement of \z with $ to match the end of the string
escaping a quote with another quote ('') to allow for quotes
the double ?? for non-greedy search (Table 9-14)
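A quick usage sketch for the function above, building the 'cleaned urn' out of nid and nss (the sample value comes from the examples in the question; url-decoding is left out, and nid is already lowercased inside spliturn):
-- assumes the composite type "urn" and the spliturn() function defined above
SELECT 'urn:' || (u).nid || ':' || (u).nss AS cleaned_urn
FROM (SELECT spliturn('URN:EXAMPLE:a123,z456?=xyz/something') AS u) AS s;
-- expected result: urn:example:a123,z456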

Keeping original variable order with 'select variables' command

The following code was used to generate a list of numeric variables and their maxima and minima from a datafile containing >500 variables and >2000 cases:
OMS select tables
/if commands=["descriptives"]
subtypes=["descriptive statistics"]
/DESTINATION FORMAT = SAV
OUTFILE = "C:\statyMcStatFace.sav".
SPSSINC SELECT VARIABLES MACRONAME="!nums" /PROPERTIES TYPE= NUMERIC.
DESCRIPTIVES !nums /STATISTICS=MIN MAX.
omsend.
Sadly, the variables weren't listed in the same order in the output file as they were in the original file, nor according to any discernible order I can see. For example, if you run the given code on plantar_fascitiitis.csv at
kaggle.com/rameessahlu/plantar-fasciitis
you'll find that the order of the variables in the original table is age, sex, weight... etc., while the order in which the variables are listed in the macro is Status, TendernessOfFoot, Alignment, Burning... etc. Why does this happen, and is there a way for me to order the variables as they are in the original table?
When you are creating your numerical variables list using the select variables command, there is an option to keep the created list in the original order of the dataset. So all you have to do is use the command with this addition:
SPSSINC SELECT VARIABLES MACRONAME="!nums" /PROPERTIES TYPE= NUMERIC /OPTIONS ORDER=FILE.

Creating a list of all variables except some specific ones

I have multiple SPSS files, each with many variables (col1, col2, ... col150). I am trying to create common code to restructure the files using VARSTOCASES. For this I need to KEEP 3 variables (col1, col34, col66) which are common to all files, but the rest of the variables are different. I know the normal way, where we add all the remaining variables to the MAKE subcommand; I am adding that below:
VARSTOCASES
/MAKE VariableName1 FROM Col1 Col2 Col3 ....etc(except 3)
/INDEX=VariableName(VariableName1)
/KEEP=Col1 Col34 Col66
Instead of this I want to create a variable list using the SPSSINC SELECT VARIABLES command. I got this idea but I don't have any examples of it. This selection needs to be generic, meaning it should dynamically select all the variables except these 3 (Col1, Col34, Col66), because I have different SPSS files in which these 3 variables are the same but the rest are different, each file containing a different number of variables.
If I have a variable list (dynamically generated by excluding the 3), then I can point to it in the MAKE subcommand. Please, can anyone help me?
One way to go about this could be to rename these specific columns and then select all other variables that start with "col":
rename variables (col1 col34 col66=var1 var34 var66).
spssinc select variables MACRONAME = "!allCOL"
/PROPERTIES PATTERN="Col*".
Now all variables with names starting with "Col" are in the list called "!allCOL" which you can use in your syntax, for example:
VARSTOCASES
/MAKE VariableName1 FROM !allCOL /INDEX=VariableName(VariableName1) .
EDIT: another solution
The solution above is valid only if there is a constant pattern to all the variables you want on the list. If that is not the case, the following solution enables you to name the variables that you don't want on the list and put all the rest on the list.
* first we define a new attribute in which we mark the
variables we don't want on the list.
VARIABLE ATTRIBUTE VARIABLES=Car_Model_1 Car_Model_2
ATTRIBUTE=IncludeInMake ("no").
* now we create the list, leaving out the unwanted variables.
spssinc select variables MACRONAME = "!forMake"
/ATTRVALUES NAME=IncludeInMake VALUE="".
VARSTOCASES /MAKE Val FROM !forMake /INDEX=var(val) .