Talend : tExtractRegexFields and globalMap - talend

My job is composed like this:
tRest >> tExtractJSonFields >> tExtractRegexFields > (row3) > tMSSqlRow
I'm using a tExtractRegexFields component with 3 output variables.
The next component is a tMSSqlRow.
I would like to use the tExtractRegexFields output in my SQL query as a parameter.
My SQL query looks like this:
;WITH nums AS
(SELECT 1 AS PAGE
UNION ALL
SELECT PAGE + 1 AS value
FROM nums
WHERE nums.PAGE < "+(Integer)globalMap.get("row3.lastpage")")
INSERT INTO flight.Calendar_Page (DT_CAL, NUM_PAGE)
SELECT '2016-01-01', PAGE
FROM nums
option (maxrecursion 32767);"
"row3.lastpage" is a tExtractRegexFields output variable.
It always contains a NULL value.
I don't understand why globalMap.get("row3.lastpage") is null. Does anyone know how to use the tExtractRegexFields outputs?
Thank you all

Do not use globalMap.get("row3.lastpage") in the SQL query of your tMSSqlRow; simply concatenate row3.lastpage into your SQL string. Note that "row3.lastpage" is not a string key: row3 is an instance of a Java class in the Talend-generated code, and lastpage is one of its fields. This field contains the data you want to inject into the SQL statement.
globalMap.get("row3.lastpage") will always be null.
Your query should be something like:
"WITH nums AS (SELECT 1 AS PAGE UNION ALL SELECT PAGE + 1 AS value FROM nums WHERE nums.PAGE < " + row3.lastpage + ") INSERT INTO flight.Calendar_Page (DT_CAL, NUM_PAGE) SELECT '2016-01-01', PAGE FROM nums option (maxrecursion 32767);"
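The difference between the two lookups can be sketched outside Talend. The following is a Python analogy, not Talend's generated code: Row3, global_map and build_query are hypothetical names, and the sketch assumes only what the answer states, namely that the row is an object with a lastpage field and that nothing ever stores a "row3.lastpage" key in the map.

```python
from dataclasses import dataclass


@dataclass
class Row3:
    # Hypothetical stand-in for the row class in the generated code.
    lastpage: int


# Hypothetical stand-in for globalMap: it only holds keys that were put there.
global_map = {}


def build_query(row: Row3) -> str:
    # Concatenate the field value into the SQL text, as the answer suggests.
    return (
        "WITH nums AS (SELECT 1 AS PAGE UNION ALL "
        "SELECT PAGE + 1 AS value FROM nums "
        "WHERE nums.PAGE < " + str(row.lastpage) + ") "
        "INSERT INTO flight.Calendar_Page (DT_CAL, NUM_PAGE) "
        "SELECT '2016-01-01', PAGE FROM nums option (maxrecursion 32767);"
    )


row3 = Row3(lastpage=12)
missing = global_map.get("row3.lastpage")  # None: no such key was ever stored
query = build_query(row3)
```

Reading the field directly gives the value for the current row, whereas the map lookup returns nothing because the key was never written.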

Related

REDSHIFT if value in list

I am trying to set some variables on the top of my query via CTEs to make maintenance of a long query more easy to handle.
I have extracted an example of what I am trying to achieve. I am not managing to make 'tags' be perceived as a list rather than a whole string. I have tried split_part but have not managed to get what I require.
WITH tmp AS (
SELECT
'tag1, tag2, tag3' as tags
)
select
CASE WHEN 'tag1' in (select tags from tmp) THEN 1 ELSE 0 END matched_tags
Basically what I need is to have a string 'tag1' and see if it exists in the list 'tag1','tag2' or 'tag3'. This should give me 1 as there is a match
This is obviously not working because it is taking the 'tag1, tag2, tag3' as one string so there is no match.
Can anyone help me with this?
The STRPOS() function should do what you want. https://docs.aws.amazon.com/redshift/latest/dg/r_STRPOS.html
Something like this:
WITH tmp AS (
SELECT
'tag1, tag2, tag3' as tags
)
SELECT
CASE WHEN STRPOS(tags, 'tag1') > 0 THEN 1 ELSE 0 END as matched_tags
FROM tmp;
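One caveat worth checking: STRPOS searches for a substring, so 'tag1' would also be found inside a longer tag such as 'tag10'. The Python sketch below only illustrates the difference between substring search and whole-item list membership; the helper names are made up for the example, and this is not Redshift code.

```python
def substring_match(tags: str, wanted: str) -> int:
    # Mirrors STRPOS(tags, wanted) > 0: 1 if wanted occurs anywhere in tags.
    return 1 if wanted in tags else 0


def list_match(tags: str, wanted: str) -> int:
    # Split on commas, trim spaces, then compare against whole tags only.
    items = [t.strip() for t in tags.split(",")]
    return 1 if wanted in items else 0
```

For the sample data both approaches agree; they differ only when one tag is a prefix or fragment of another.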

PostgreSQL calculate prefix combinations after split

I do have a string as entry, of the form foo:bar:something:221. I'm looking for a way to generate a table with all prefixes for this string, like:
foo
foo:bar
foo:bar:something
foo:bar:something:221
I wrote the following query to split the string, but can't figure out where to go from there:
select unnest(string_to_array('foo:bar:something:221', ':'));
An option is to simulate a loop over all elements, then take the sub-array from the input for each element index:
with data(input) as (
values (string_to_array('foo:bar:something:221', ':'))
)
select array_to_string(input[1:g.idx], ':')
from data
cross join generate_series(1, cardinality(input)) as g(idx);
generate_series(1, cardinality(input)) generates as many rows as the array has elements. The expression input[1:g.idx] takes the "sub-array" from the first element up to the one at position idx. As the output is an array, I use array_to_string to re-create the representation with the ':' separator.
You can use string_agg as a window function. The default frame is from the beginning of the partition to the current row:
SELECT string_agg(s, ':') OVER (ORDER BY n)
FROM unnest(string_to_array('foo:bar:something:221', ':')) WITH ORDINALITY AS u(s, n);
string_agg
-----------------------
foo
foo:bar
foo:bar:something
foo:bar:something:221
(4 rows)
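For comparison, the same "take the first idx elements and re-join them" idea can be sketched in Python. The function name prefixes is an assumption made for this example; it mirrors the input[1:g.idx] slicing from the first answer rather than any PostgreSQL API.

```python
from typing import List


def prefixes(value: str, sep: str = ":") -> List[str]:
    parts = value.split(sep)
    # Join the first idx parts for idx = 1..len(parts), like input[1:g.idx].
    return [sep.join(parts[:idx]) for idx in range(1, len(parts) + 1)]
```

Calling prefixes('foo:bar:something:221') yields the four rows shown above, in the same order.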

Cast a PostgreSQL column to stored type

I am creating a viewer for PostgreSQL. My SQL needs to sort on the type that is normal for that column. Take for example:
Table:
CREATE TABLE contacts (id serial primary key, name varchar)
SQL:
SELECT id::text FROM contacts ORDER BY id;
Gives:
1
10
100
2
Ok, so I change the SQL to:
SELECT id::text FROM contacts ORDER BY id::regtype;
Which results in:
1
2
10
100
Nice! But now I try:
SELECT name::text FROM contacts ORDER BY name::regtype;
Which results in:
invalid type name "my first string"
Google is no help. Any ideas? Thanks
Repeat: the error is not my problem. My problem is that I need to convert each column to text, but order by the normal type for that column.
regtype is an object identifier type, and there is no reason to use it when you are not referring to system objects (types in this case).
You should cast the column to integer in the first query:
SELECT id::text
FROM contacts
ORDER BY id::integer;
You can use qualified column names in the order by clause. This will work with any sortable type of column.
SELECT id::text
FROM contacts
ORDER BY contacts.id;
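The ordering problem itself is easy to reproduce outside the database. This small Python sketch, with made-up sample ids, shows why sorting the text form gives 1, 10, 100, 2 while sorting the original values and converting afterwards gives the expected order.

```python
ids = [1, 2, 10, 100]

# Sorting after the cast compares character by character.
as_text = sorted(str(i) for i in ids)

# Sorting on the original type first, then casting for display.
by_value = [str(i) for i in sorted(ids)]
```

This is the same distinction as ordering by the output expression id::text versus ordering by the underlying column contacts.id.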
So, I found two ways to accomplish this. The first is the solution @klin provided: query the table, then construct my own query based on the column metadata. An untested psycopg2 example:
c = conn.cursor()
c.execute("SELECT * FROM contacts LIMIT 1")
sort_by_sql = ""
for col in c.description:
    if col.name == "my_sort_column":
        if col.type_code == 23:  # 23 is the type OID of int4 (integer)
            sort_by_sql = "ORDER BY " + col.name + "::integer"
        else:
            sort_by_sql = "ORDER BY " + col.name + "::text"
c.execute("SELECT * FROM contacts " + sort_by_sql)
A more elegant way would be like this:
SELECT id::text AS _id, name::text AS _name FROM contacts ORDER BY id
This uses aliases so that ORDER BY still picks up the original data. The last option is more readable if nothing else.

TSQL split comma delimited string

I am trying to create a stored procedure that will split 3 text boxes on a webpage, each of which holds a comma-delimited string of user input. We have a field called 'combined_name' in our table that we have to search for first and last name and any known errors or nicknames etc., such as @p1: 'grei,grie' @p2: 'joh,jon,j..' @p3: is empty.
The reason for the third box is after I get the basics set up we will have does not contain, starts with, ends with and IS to narrow our results further.
So I am looking to get all records that CONTAINS any combination of those. I originally wrote this in LINQ but it didn't work as you cannot query a list and a dataset. The dataset is too large (1.3 million records) to be put into a list so I have to use a stored procedure which is likely better anyway.
Will I have to use 2 SPs, one to split each field and one for the select query, or can this be done with one? What function do I use for contains in T-SQL? I tried using IN in a query but cannot figure out how it works with multiple parameters.
Please note that this will be an internal site that has limited access so worrying about sql injection is not a priority.
I did attempt dynamic SQL but am not getting the correct results back:
CREATE PROCEDURE uspJudgments @fullName nvarchar(100) AS
EXEC('SELECT *
FROM new_judgment_system.dbo.defendants_ALL
WHERE combined_name IN (' + @fullName + ')')
GO
EXEC uspJudgments @fullName = '''grein'', ''grien'''
Even if this did retrieve the correct results how would this be done with 3 parameters?
You can use the function below to split each string into a table of strings. To get all the combinations, use a full join of the resulting tables, and then do your select.
Here is the Table valued function I set up:
ALTER FUNCTION [dbo].[Split] (@sep char(1), @s varchar(8000))
RETURNS table
AS
RETURN (
    WITH splitter_cte AS (
        SELECT CHARINDEX(@sep, @s) as pos, 0 as lastPos
        UNION ALL
        SELECT CHARINDEX(@sep, @s, pos + 1), pos
        FROM splitter_cte
        WHERE pos > 0
    )
    SELECT SUBSTRING(@s, lastPos + 1,
        case when pos = 0 then 8000
        else pos - lastPos - 1 end) as OutputValues
    FROM splitter_cte
)
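The logic of the recursive CTE may be easier to follow as a loop. The Python sketch below is an illustration of the same position-tracking idea, not T-SQL: pos and last_pos play the roles of the pos and lastPos columns, positions are kept 1-based as CHARINDEX returns them, and the function name split is chosen only to match the example.

```python
from typing import List


def split(sep: str, s: str) -> List[str]:
    out = []
    last_pos = 0
    pos = s.find(sep) + 1  # 1-based like CHARINDEX; 0 means "not found"
    while True:
        if pos == 0:
            # No further separator: take everything after the last one.
            out.append(s[last_pos:])
            break
        # Take the text between the previous separator and this one.
        out.append(s[last_pos:pos - 1])
        last_pos = pos
        pos = s.find(sep, pos) + 1  # search resumes just after this separator
    return out
```

Each loop iteration corresponds to one row produced by splitter_cte, and the final pos = 0 row yields the last piece.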

How can I query 'between' numeric data on a not numeric field?

I've got a query that I've just found in the database that is failing causing a report to fall over. The basic gist of the query:
Select *
From table
Where IsNull(myField, '') <> ''
And IsNumeric(myField) = 1
And Convert(int, myField) Between @StartRange And @EndRange
Now, myField doesn't contain numeric data in all the rows [it is of nvarchar type]... but this query was obviously designed such that it only cares about rows where the data in this field is numeric.
The problem with this is that T-SQL (near as I understand) doesn't shortcircuit the Where clause thus causing it to ditch out on records where the data is not numeric with the exception:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the nvarchar value '/A' to data type int.
Short of dumping all the rows where myField is numeric into a temporary table and then querying that for rows where the field is in the specified range, what can I do that is optimal?
My first pass, purely to analyse the returned data and see what was going on, was:
Select *
From (
Select *
From table
Where IsNull(myField, '') <> ''
And IsNumeric(myField) = 1
) t0
Where Convert(int, myField) Between @StartRange And @EndRange
But I get the same error I did for the first query which I'm not sure I understand as I'm not converting any data that shouldn't be numeric at this point. The subquery should only have returned rows where myField contains numeric data.
Maybe I need my morning tea, but does this make sense to anyone? Another set of eyes would help.
Thanks in advance
IsNumeric only tells you that the string can be converted to one of the numeric types in SQL Server. It may be able to convert it to money, or to a float, but may not be able to convert it to an int.
Change your
IsNumeric(myField) = 1
to be:
not myField like '%[^0-9]%' and LEN(myField) < 9
(that is, you want myField to contain only digits, and fit in an int)
Edit examples:
select ISNUMERIC('.'),ISNUMERIC('£'),ISNUMERIC('1d9')
result:
----------- ----------- -----------
1 1 1
(1 row(s) affected)
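The suggested replacement test can be read as "non-empty, digits only, and short enough to fit in an int". A minimal Python sketch of that rule follows; is_safe_int is a hypothetical helper name, and the empty-string check stands in for the IsNull(myField, '') <> '' condition from the original query.

```python
import re


def is_safe_int(value) -> bool:
    # Reject NULL and empty strings, as the original query does.
    if value is None or value == "":
        return False
    # No character outside 0-9, and fewer than 9 characters.
    return re.search(r"[^0-9]", value) is None and len(value) < 9
```

Under this rule the values from the examples above ('.', '£', '1d9') and the failing '/A' are all rejected, while plain digit strings of up to 8 characters pass.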
You'd have to force SQL to evaluate the expressions in a certain order.
Here is one solution
Select *
From (
    Select TOP 2000000000 *
    From table
    Where IsNumeric(myField) = 1
    And IsNull(myField, '') <> ''
    ORDER BY Key
) t0
Where Convert(int, myField) Between @StartRange And @EndRange
and another
Select *
From table
Where
CASE
WHEN IsNumeric(myField) = 1 And IsNull(myField, '') <> ''
THEN Convert(int, myField) ELSE @StartRange-1
END Between @StartRange And @EndRange
The first technique is "intermediate materialisation": it forces a sort on a working table.
The second relies on the evaluation order of CASE being guaranteed.
Neither is pretty or whizzy
SQL is declarative: you tell the optimiser what you want, not how to do it. The tricks above force things to be done in a certain order.
Not sure if this helps you, but I did read somewhere that a failed conversion using CONVERT will always raise an error in SQL. So I think it would be better to use CASE in the WHERE clause, so that CONVERT does not run on every row.
Use a CASE statement.
declare @StartRange int
declare @EndRange int
set @StartRange = 1
set @EndRange = 3
select *
from TestData
WHERE Case WHEN ISNUMERIC(Value) = 0 THEN 0
WHEN Value IS NULL THEN 0
WHEN Value = '' THEN 0
WHEN CONVERT(int, Value) BETWEEN @StartRange AND @EndRange THEN 1
END = 1