I am not new to ETL and trying to become familiar with Talend. I cannot seem to get any output of a stored procedure (using tMSSqlSP) or a query (using tMSSqlRow). NOTE: Things I read indicated that tMSSqlRow doesn't produce columnar output but not sure that is correct.
The job shown below runs but no output comes from the tMSSqlSP component. The trace debug shows that the output title is null. However, manual execution of the SP in SSMS succeeds, showing both objid and title.
The SP performs a simple query accepting a single input parameter (int) and outputs two columns -- the objid (int) and the title (string):
create procedure st_sp_case_title_get
#objid int
as
select [objid], [title] from [dbo].[table_case] where [objid] = #objid
You need to use tParseRecordSet in order to retrieve and parse result sets from tMSSqlRow and tMSSqlSP :
Define a column to be your result set (mine is called result) of type Object, in addition to your input columns (my input param is personid). In tMSSqlSP parameters tab, set personid as type IN, and result to be of type RECORD SET.
tParseRecordSet schema :
It parses the result column and gets Firstname and Lastname columns (your objid, and title columns)
tMSSqlRow is very similar. Check my previous answer here for an example.
Related
I am trying to create a custom template fragment that builds a table of value properties. I started by creating a SQL query fragment that pulls all properties classified by a Value Type. Now I would like to pull in the default (initial) value assigned. I figured out that it's in the Description table of t_xref, with the property guid in the client field, but I don't know how to write a query that will reliably parse the default value out since the string length may be different depending on other values set. I tried using the template content selector first but I couldn't figure out how to filter to only value properties. I'm still using the default .qeax file but will be migrating to a windows based DBMS soon. Appreciate any help!
Tried using the content selector. Successfully built a query to get value properties but got stuck trying to join and query t_xref for default value.
Edited to add current query and image
Value Properties are block properties that are typed to Value Types. I'm using SysML.
This is my current query, I am no SQL expert! I don't pull anything from t_xref yet but am pulling out only the value properties with this query:
SELECT property.ea_guid AS CLASSGUID, property.Object_Type AS CLASSTYPE, property.Name, property.Note as [Notes], classifier.Name AS TYPE
FROM t_object property
LEFT JOIN t_object classifier ON property.PDATA1 = classifier.ea_guid
LEFT JOIN t_object block on property.ParentID = block.Object_ID
WHERE block.Object_ID = #OBJECTID# AND property.Object_Type = 'Part' AND classifier.Object_Type = 'DataType'
ORDER BY property.Name
I guess that Geert will come up with a more elaborate answer, but (assuming you are after the Run State) here are some details. The value for these Run States is stored in t_object.runstate as one of the crude Sparxian formats. You find something like
#VAR;Variable=v1;Value=4711;Op==;#ENDVAR;
where v1 is the name and 4711 the default in this example. How you can marry that with your template? Not the faintest idea :-/
I can't give a full answer to the original question as I can't reproduce your data, but I can provide an answer for the generic problem of "how to extract data through SQL from the name-value pair in t_xref".
Note, this is heavily dependent on the database used. The example below extracts fully qualified stereotype names from t_xref in SQL Server for custom profiles.
select
substring(
t_xref.Description, charindex('FQName=',t_xref.Description)+7,
charindex(';ENDSTEREO',t_xref.Description,charindex('FQName=',t_xref.Description))
-charindex('FQName=',t_xref.Description)-7
),
Description from t_xref where t_xref.Description like '%FQName%'
This works using:
substring(string, start, length)
The string is the xref description column, and the start and length are set using:
charindex(substring, string, [start position])
This finds the start and end tags within the xref description field, for the data you're trying to parse.
For your data, I imagine something like the below is the equivalent (I haven't tested this). It's then a case of combining it with the query you've already got.
select
substring(
t_xref.Description, #the string to search in
charindex('#VALU=',t_xref.Description,charindex('#NAME=default',t_xref.Description)+6, #the start position, find the position of the first #VALU= tag after name=default
charindex('#ENDVALU;',t_xref.Description,charindex('#VALU=',t_xref.Description))
-charindex('#VALU=',t_xref.Description,charindex('#NAME=default',t_xref.Description))-6 #the length, find the position of the first #ENDVALU tag after the start, and subtract it from the start position
),
Description from t_xref where t_xref.Description like '%#NAME=default%' #filter anything which doesn't contain this tag to avoid "out of range" index errors
I have the following table in redshift:
Column | Type
id integer
value varchar(255)
I'm trying to copy in (using the datapipeline's RedshiftCopyActivity), and the data has the line 1,maybe as the entry trying to be added, but I get back the error 1214:Delimiter not found, and the raw_field_data value is maybe. Is there something I'm missing in the copy parameters?
The entire csv is three lines that goes:
1,maybe
2,no
3,yes
You may want to take a look at the similar question Redshift COPY command delimiter not found.
Make sure your RedshiftCopyActivity configuration includes FORMAT AS CSV from https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-format.html#copy-csv.
Be sure your input data has your configured delimiter between every field, even in the case of nulls.
Be sure you do not have any trailing blank lines.
You can run the following SQL (from the linked question) to see more specific details of what row is causing the problem.
SELECT le.starttime,
d.query,
d.line_number,
d.colname,
d.value,
le.raw_line,
le.err_reason
FROM stl_loaderror_detail d,
JOIN stl_load_errors le
ON d.query = le.query
ORDER BY le.starttime DESC;
See below for what is returned in my automated test for this query:
Select visit_date
from patient_visits
where patient_id = '50'
AND site_id = '216'
ORDER by patient_id
DESC LIMIT 1
08:52:48.406 DEBUG Executing : Select visit_date from patient_visits
where patient_id = '50' AND site_id = '216' ORDER by patient_id DESC
LIMIT 1 08:52:48.416 TRACE Return: [(datetime.date(2017, 2, 17),)]
When i run this in workbench i get
2017-02-17
How can i make the query return this instead of the datetime.date bit above. Some formatting needed?
What you got from the database is python's datetime.date object - and that happens due to the db connector drivers casting the DB records to the their corresponding python counterparts. Trust me, it's much better this way than plain strings the user would have to parse and cast himself later.
Imaging the result of this query is stored in a variable ${record}, there are a couple of things to get to it, in the form you want.
First, the response is (pretty much always) a list of tuples; as in your case it will always be a single record, go for the 1st list member, and its first tuple member:
${the_date}= Set Variable ${record[0][0]}
Now {the_date} is the datetime.date object; there are at least two ways to get its string representation.
1) With strftime() (the pythonic way):
${the_date_string}= Evaluate $the_date.strftime('%Y-%m-%d') datetime
here's a link for the strftime's directives
2) Using the fact it's a regular object, access its attributes and construct the result as you'd like:
${the_date_string}= Set Variable ${the_date.year}-${the_date.month}-${the_date.day}
Note that this ^ way, you'd most certainly loose the leading zeros in the month and day.
I am pretty new to Pentaho so my query might sound very novice.
I have written a transformation in which am using CSV file input step and table input step.
Steps I followed:
Initially, I created a parameter in transformation properties. The
parameter birthdate doesn't have any default value set.
I have used this parameter in postgresql query in table input step
in the following manner:
select * from person where EXTRACT(YEAR FROM birthdate) > ${birthdate};
I am reading the CSV file using CSV file input step. How do I assign the birthdate value which is present in my CSV file to the parameter which I created in the transformation?
(OR)
Could you guide me the process of assigning the CSV field value directly to the SQL query used in the table input step without the use of a parameter?
TLDR;
I recommend using a "database join" step like in my third suggestion below.
See the last image for reference
First idea - Using Table Input as originally asked
Well, you don't need any parameter for that, unless you are going to provide the value for that parameter when asking the transformation to run. If you need to read data from a CSV you can do that with this approach.
First, read your CSV and make sure your rows are ok.
After that, use a select values to keep only the columns to be used as parameters.
In the table input, use a placeholder (?) to determine where to place the data and ask it to run for each row that it receives from the source step.
Just keep in ming that the order of columns received by the table input (the columns out of the select values) is the same order that it will be used for the placeholders (?). This should not be a problem with your question that uses only one placeholder, but keep that in mind as you ramp up using Pentaho.
Second idea, using a Database Lookup
This is another approach where you can't personalize the query made to the database and may experience a better performance because you can set a "Enable cache" flag and if you don't need to use a function on your where clause this is really recommended.
Third idea, using a Database Join
That is my recommended approach if you need a function on your where clause. It looks a lot like the Table Input approach but you can skip the select values step and select what columns to use, repeat the same column a bunch of times and enable a "outer join" flag that returns the rows without result from the query
ProTip: If you feel the transformation running too slow, try to use multiple copies from the step (documentation here) and obviously make sure the table have the appropriate indexes in place.
Yes there's a way of assigning directly without the use of parameter. Do as follows.
Use Block this step until steps finish to halt the table input step till csv input step completes.
Following is how you configure each step.
Note:
Postgres query should be select * from person where EXTRACT(YEAR
FROM birthdate) > ?::integer
Check Execute for each row and Replace variables in in Table input step.
Select only the birthday column in CSV input step.
I am trying to retrieve values from a PostgreSQL database in a variable using a WHERE clause, but I am getting an error.
The query is:
select age into x from employee where name=name.GetValue()
name is the textcontrol in which I am entering a value from wxpython GUI.
I am getting an error as name schema doesn't exist.
What is the correct method for retrieving values?
"name.GetValue()" is a literal string, you are sending that to your db which knows nothing about wxpython and nothing about the variables in your program. You need to send the value of that data to your db, probably using bound parameters. Something like:
cur.execute("select age from employee where name=%s", [name.GetValue()])
x = cur.fetchone()[0] # returns a row containing [age] from the db
is probably what you're after. This will create a query with a placeholder in the database, then bind the value of name.GetValue() to that placeholder and execute the query. The next line fetches the first row of the result of the query and assigns x to the first item in that row.
I'm not positive what you are trying to do, but I think your issue might be syntax (misuse of INTO instead of AS):
SELECT age AS x FROM employee WHERE name = ....