Filtering column strings that contain substring - talend

I am working on an if/else expression in a tMap, and one of the conditions checks whether a column contains a substring.
I am fairly new to Talend and unsure exactly how to go about this.
This is the current syntax that I am using.
row16.Location.contains("clos")?"Pending":""
I have not been able to find any good examples of the correct way to go about this, other than the one above.

Talend uses Java as an underlying language, so you need to use the ternary operator of Java:
row16.Location.contains("clos") ? "Pending" : ""
But make sure you first check row16.Location for null, otherwise you'll get a NullPointerException if Location is null:
row16.Location != null && row16.Location.contains("clos") ? "Pending" : ""
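As a rough sketch, the same expression can be tried outside Talend in plain Java. The class and method names below are hypothetical and only stand in for the tMap expression; note that contains is case-sensitive, so "Closed" would not match "clos".

```java
// Hypothetical helper mirroring the tMap expression above; not part of Talend.
final class LocationStatus {
    // Returns "Pending" when location is non-null and contains "clos", otherwise "".
    static String status(String location) {
        // && binds tighter than ?:, so the null check guards the contains call.
        return location != null && location.contains("clos") ? "Pending" : "";
    }

    public static void main(String[] args) {
        System.out.println(status("closed"));          // Pending
        System.out.println(status("open").isEmpty());  // true
        System.out.println(status(null).isEmpty());    // true, no NullPointerException
    }
}
```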

Related

Kafka Connect - json path - regex condition not working

I have an application where I use a connector to save data to a database.
I want to filter the saved messages by removing the ones in which a certain property is very long.
My messages are like this :
{
field_a : value,
field_b : value,
field_c : possible very long value
}
So, I used in the kafka connector the Confluent Filter like this :
transforms: filterSpam
transforms.filterSpam.type: io.confluent.connect.transforms.Filter$Value
transforms.filterSpam.filter.condition: $[?(#.field_c =~ /^.{32000,}$/)]
transforms.filterSpam.filter.type: exclude
transforms.filterSpam.missing.or.null.behavior: include
For some reason the filter is not working. All messages pass through.
I tried also with the negation :
$[?(!(#.field_c =~ /^.{1,32000}$/))]
In this case, the very long messages were filtered out, but so were some of the shorter ones.
I do not understand where the issue is coming from. Any help ?
Actually, I needed to update my regex knowledge.
The issue was related to the fact that the string field on which I tried to apply the regex sometimes was multiline.
Thanks to this I managed to use a proper validation of the size of this field.
The final solution is :
transforms.filterSpam.filter.condition: $[?(#.field_c =~ /(\s)^.{32000,}$/)]
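The multiline behaviour can be reproduced in plain Java. This is only a sketch: it assumes the filter's regex engine behaves like java.util.regex, where "." does not match line terminators by default, and it uses a minimum length of 5 instead of 32000 to keep the demo small. The class name is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; it only illustrates how "." treats line breaks.
final class MultilineLengthCheck {
    // Same shape as the original condition, with a smaller minimum length.
    static final Pattern DEFAULT = Pattern.compile("^.{5,}$");
    // With DOTALL, "." also matches line terminators.
    static final Pattern DOTALL = Pattern.compile("^.{5,}$", Pattern.DOTALL);

    static boolean matches(Pattern pattern, String value) {
        return pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        String singleLine = "abcdefgh";
        String multiLine = "abc\ndefgh";
        System.out.println(matches(DEFAULT, singleLine)); // true
        System.out.println(matches(DEFAULT, multiLine));  // false: "." stops at the newline
        System.out.println(matches(DOTALL, multiLine));   // true
    }
}
```

This would also explain the negated attempt: a short multiline value does not match /^.{1,32000}$/ either, so the negation excludes it.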

Compare String using tMap

I am using Talend to prepare dataware.
I want to compare a string with the contents of a column using the tMap component and create a variable to store in the DB. The problem is that the == operator does not give the right result (for example, with row2.recipient == "text"?"text":"" I always get ""), and if I use .equals I get errors at execution.
You will get an error if row2.recipient is null, and "==" should not be used to compare strings, because it compares object references rather than contents.
Correct syntax would be :
"text".equals(row2.recipient)?"text":""
Calling equals on the literal prevents a NullPointerException, because "text" itself is never null.
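A small standalone sketch of the difference, assuming nothing beyond plain Java (the class and method names are hypothetical):

```java
// Hypothetical demo class comparing == with a literal-first equals call.
final class RecipientCompare {
    // Null-safe comparison: the literal is the receiver, so a null argument is fine.
    static String label(String recipient) {
        return "text".equals(recipient) ? "text" : "";
    }

    public static void main(String[] args) {
        // A string built at runtime is a different object from the literal.
        String runtimeValue = new String("text");
        System.out.println(runtimeValue == "text");           // false: compares references
        System.out.println(label(runtimeValue));              // text
        System.out.println(label(null).isEmpty());            // true, no NullPointerException
    }
}
```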

tPostgresqlInput - How to convert input to Null-if-empty?

I want to trim each column value to null if they are empty.
Is there a way to attach an inbuilt talend function to those column values? I think informatica has something like that.
PS: I need to do that in talend, not in sql level.
The rows go to another table in another DB.
I like to solve this with Java; I'm sure there are other, more graphical ways to do this.
If we put a tJavaRow in between and then press "Generate code", it results in:
output_row.plateid = input_row.plateid;
If we change this to:
output_row.plateid = (input_row.plateid == null || input_row.plateid.length() == 0) ? null : input_row.plateid ;
Then we get the desired results. null stays null, empty string also becomes null.
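If the same check is needed for many columns, it could be factored into a small helper, for example in a Talend routine. The sketch below is an assumption, not an inbuilt Talend function: the names are hypothetical, and unlike the expression above it also treats whitespace-only values as empty.

```java
// Hypothetical helper; Talend does not ship this class.
final class NullIfEmpty {
    // Returns null for null, empty, or whitespace-only input; otherwise the value unchanged.
    static String nullIfEmpty(String value) {
        return (value == null || value.trim().isEmpty()) ? null : value;
    }

    public static void main(String[] args) {
        System.out.println(nullIfEmpty(""));      // null
        System.out.println(nullIfEmpty("   "));   // null
        System.out.println(nullIfEmpty("AB12"));  // AB12
    }
}
```

In a tJavaRow this would be used as output_row.plateid = NullIfEmpty.nullIfEmpty(input_row.plateid);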
Most t***Input components have a "Trim all columns" option. Look for it in the advanced settings.

Elasticsearch mongodb river script in index doesn't work

I'm trying to change a few string fields using JavaScript.
For example, I want to take only the last part of the URL that comes from Mongo through the river, so that Elasticsearch stores only the end of it.
When creating the index (using curl) I added under "options" the following script:
"script": "ctx.document.shorturl = ctx.document.url.substr(-4);delete ctx.document.url;
I tried some manipulations such as adding \"...\" or use ctx['doc']['url'] and others but nothing seems to work.
I always get only url field with the full url (shorturl is not created at all).
Can anyone suggest what is the right syntax to make it work?
Another thing I need to do is combine two fields, lat and long, into one "location" field in order to use it in Kibana. Can anyone suggest the right script for that? (It should create a new field called "location" that contains both "lat" and "long" with a comma between them.)
Thanks.
You passed a negative start index; substring(-4) treats it as 0 and returns the whole string. Use a positive index instead:
ctx.document.shorturl = ctx.document.url.substr(4);delete ctx.document.url;

Reuse of conditions (when) in Drool DSL statements

Is it possible to reuse a when/condition statement into another when/condition statement in a DSL file?
For example, I have two conditions:
[condition][]The client is invalid = Client( name == null || email == null )
[condition][]All the clients are invalid = forall( Client( name == null || email == null ) )
Note that the second condition differs from the first only by the forall command; the statement inside is the same. In this case, I want to reuse the first condition in the second.
Is it possible? How?
Thank you.
Even the most recent version of Drools only lets you substitute values into a template from POJOs or a corresponding map, as described in its documentation.
This won't work for your use case though.
Since Drools files are simply text files, nothing prevents you from using a more powerful templating toolkit.
Possibilities include Apache Velocity, ANTLR or Scala!