How to extract the first digit from an Integer in a Transformer stage in IBM DataStage? - datastage

I have an incoming integer field and I want to extract its first digit. How can I do that? I cannot cast the field because the data comes from a dataset. Is there a way to extract the first digit in the Transformer stage in IBM DataStage?
Example:
Input:
ABC = 1234
Output: 1
Can anyone please help me with the same?
Thanks!

Use a Transformer, define a stage variable as varchar, and use this formula to get the substring:
ABC[1,1]
Alternatively, you can convert your numeric value to a string first by using DecimalToString.

You CAN convert to string within the context of your expression, and back again if the result needs to be an integer.
AsInteger(Left(ln_jn_ENCNTR_DTL.CCH,1))
This solution has used implicit conversion from integer to string. It assumes that the value of CCH is always an integer.

If ABC has type int, you can define a stage variable of type char with length 1.
You then need to convert the number to a string first and use the Left function to extract the first character:
Left(DecimalToString(ABC),1)
If you are getting ABC as a string, you can apply the Left function directly.

You can first define a stage variable (named, say, SV) of varchar type, to convert the input integer column into varchar:
[Image: Stage variable definition]
Now assign the input integer column to the stage variable SV and derive the output integer column as AsInteger(SV[1,1]):
[Image: Column definition]
That is: input integer => stage variable (type conversion to varchar) => substring [1,1] => conversion of the substring back to integer using AsInteger.

DecimalToString is an implicit conversion, so all you need is the Left() function. Left(MyString,1)
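The two ideas in the answers above (take the leftmost character of the text form, or work on the number itself) can be sketched outside DataStage. The Python below only illustrates the logic and is not DataStage syntax; the function names are made up for this example, and it assumes a leading minus sign should be ignored, which the question does not state.

```python
def first_digit_via_string(value: int) -> int:
    """Convert to text, drop any sign, and take the leftmost character."""
    return int(str(abs(value))[0])


def first_digit_via_arithmetic(value: int) -> int:
    """Divide by 10 until only a single digit remains."""
    value = abs(value)
    while value >= 10:
        value //= 10
    return value


print(first_digit_via_string(1234))
print(first_digit_via_arithmetic(1234))
```

Without the abs() call, the leftmost character of a negative value would be the minus sign, which is worth checking if the real column can hold negative numbers.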

Related

SQLAlchemy IN_ - trouble with leading zeroes

In my sqlalchemy ( sqlalchemy = "^1.4.36" ) query I have a clause:
.filter( some_model.some_field[2].in_(['item1', 'item2']) )
where some_field is jsonb and the value in some_field value in the db formatted like this:
["something","something","123"]
or
["something","something","0123"]
note: some_field[2] is always digits-only double-quoted string, sometimes with leading zeroes and sometimes without them.
The query works fine for cases like this:
.filter( some_model.some_field[2].in_(['123', '345']) )
and fails when the values in the in_ clause have leading zeroes:
e.g. .filter( some_model.some_field[2].in_(['0123', '0345']) ) fails.
The error it gives:
cursor.execute(statement, parameters)
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type json
LINE 3: ...d_on) = 2 AND (app_cache.value_metadata -> 2) IN ('0123'
 ^
DETAIL: Token "0123" is invalid.
Again, in the case of '123' (or any string of digits without leading zero) instead of '0123' the error is not thrown.
What is wrong with having leading zeroes for the strings in the list of in_ clause? Thanks.
UPDATE: basically, SQLAlchemy's IN_ assumes int input and fails accordingly. There must be some reasoning behind this behavior, but I can't tell what it is. I removed that filter from the query and filtered the output in Python code afterwards.
The problem here is that the values in the IN clause are being interpreted by PostgreSQL as JSON representations of integers, and an integer with a leading zero is not valid JSON.
The IN clause has a value of type jsonb on the left hand side. The values on the right hand side are not explicitly typed, so Postgres tries to find the best match that will allow them to be compared with a jsonb value. This type is jsonb, so Postgres attempts to cast the values to jsonb. This works for values without a leading zero, because digits in single quotes without leading zeroes are valid representations of integers in JSON:
test# select '123'::jsonb;
jsonb
═══════
123
(1 row)
but it doesn't work for values with leading zeroes, because they are not valid JSON:
test# select '0123'::jsonb;
ERROR: invalid input syntax for type json
LINE 1: select '0123'::jsonb;
^
DETAIL: Token "0123" is invalid.
CONTEXT: JSON data, line 1: 0123
Assuming that you expect some_field[2].in_(['123', '345']) and some_field[2].in_(['0123', '345']) to match ["something","something","123"] and ["something","something","0123"] respectively, you can either serialise the values to JSON yourself:
some_field[2].in_([json.dumps(x) for x in ['0123', '345']])
or use the contained_by operator (<@ in PostgreSQL), to test whether some_field[2] is present in the list of values:
some_field[2].contained_by(['0123', '345'])
or cast some_field[2] to text (that is, use the ->> operator) so that the values are compared as text, not JSON.
some_field[2].astext.in_(['0123', '345'])
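The JSON rule behind the error can be checked without a database. This is a small sketch using only Python's standard json module; the variable names are chosen for illustration. It shows that a bare 123 is valid JSON, a bare 0123 is not, and that json.dumps produces the quoted form that matches what the column stores.

```python
import json

# A bare run of digits without a leading zero parses as a JSON number.
print(json.loads("123"))

# With a leading zero the same text is rejected by the JSON parser.
try:
    json.loads("0123")
    leading_zero_is_valid = True
except json.JSONDecodeError:
    leading_zero_is_valid = False
print(leading_zero_is_valid)

# json.dumps adds the double quotes that turn each value into a JSON string.
quoted = [json.dumps(x) for x in ["0123", "345"]]
print(quoted)
```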

Snowflake - Convert varchar to numeric

I have a column (field1) defined as varchar in Snowflake. It stores both strings and numbers (example values: US15876, 1.106336965E9). How can I convert the numeric values to display something like 1106336965, without losing the rows that store string values or null values? I am trying try_to_numeric(field1), but this eliminates the string values and shows them as null. Any help is appreciated.
So try_to_number is the way to get numbers, with nulls for non-numbers and no errors. But if you want to keep the strings, you have to convert your newly created number back to text (or variant); otherwise it cannot live in the same column, so nothing is gained:
select column1
,try_to_number(column1) as_num
,nvl(as_num::text, column1) as why_not_both
from values
('US15876'),
('1.106336965E9'),
('1.106336965'),
('1106336965');
COLUMN1     | AS_NUM        | WHY_NOT_BOTH
US15876     | null          | US15876
1106336965  | 1,106,336,965 | 1106336965
1.106336965 | 1             | 1
1106336965  | 1,106,336,965 | 1106336965
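The "convert if numeric, otherwise keep the original text" logic of the nvl(...) expression can be sketched in plain Python. This is an analogy rather than Snowflake code: the function name number_or_original is hypothetical, and it assumes numeric values are rounded to scale 0 (half up), as the default try_to_number call in the query appears to do.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP
from typing import Optional


def number_or_original(value: Optional[str]) -> Optional[str]:
    """Return the integer text form of numeric input, else the input unchanged."""
    if value is None:
        return None  # nulls pass through untouched
    try:
        number = Decimal(value.strip())
        if not number.is_finite():
            return value  # treat NaN/Infinity spellings as plain text
        return str(number.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    except InvalidOperation:
        return value  # not a number: keep the original string


for raw in ["US15876", "1.106336965E9", "1.106336965", "1106336965", None]:
    print(raw, "->", number_or_original(raw))
```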

RIGHT Function in UPDATE Statement w/ Integer Field

I am attempting to run a simple UPDATE script on an integer field, whereby the trailing 2 numbers are "kept", and the leading numbers are removed. For example, "0440" would be updated as "40." I can get the desired data in a SELECT statement, such as
SELECT RIGHT(field_name::varchar, 2)
FROM table_name;
However, I run into an error when I try to use this same functionality in an UPDATE script, such as:
UPDATE schema_name.table_name
SET field_name = RIGHT(field_name::varchar, 2);
The error I receive reads:
column . . . is of type integer but expression is of type text . . .
HINT: You will need to rewrite or cast the expression
You're casting the integer to varchar but you're not casting the result back to integer.
UPDATE schema_name.table_name
SET field_name = RIGHT(field_name::TEXT, 2)::INTEGER;
The error is quite straightforward: RIGHT returns textual data, which you cannot assign to an integer column. You could, however, explicitly cast it back:
UPDATE schema_name.table_name
SET field_name = RIGHT(field_name::varchar, 2)::int;
1 is a digit (or a number - or a string), '123' is a number (or a string).
Your example 0440 does not make sense for an integer value, since leading (insignificant) 0 are not stored.
Strictly speaking data type integer is no good to store the "trailing 2 numbers" - meaning digits - since 00 and 0 both result in the same integer value 0. But I don't think that's what you meant.
For operating on the numeric value, don't use string functions (which require casting back and forth). The modulo operator % does exactly what you need: field_name%100. So:
UPDATE schema_name.table_name
SET field_name = field_name%100
WHERE field_name > 99; -- to avoid empty updates
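The equivalence between the string approach and the modulo approach can be checked quickly in Python. This is only a sketch with made-up function names, and it assumes non-negative values; the sign rules of % for negative numbers differ between Python and PostgreSQL.

```python
def last_two_via_string(value: int) -> int:
    """Cast to text, keep the last two characters, cast back to int."""
    return int(str(value)[-2:])


def last_two_via_modulo(value: int) -> int:
    """Stay numeric: the remainder after dividing by 100."""
    return value % 100


# Both approaches agree for these non-negative sample values.
for n in (440, 7, 100, 12345):
    assert last_two_via_string(n) == last_two_via_modulo(n)
    print(n, "->", last_two_via_modulo(n))
```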

Spark SQL change format of the number

After show command spark prints the following:
+-----------------------+---------------------------+
|NameColumn |NumberColumn |
+-----------------------+---------------------------+
|name |4.3E-5 |
+-----------------------+---------------------------+
Is there a way to change NumberColumn format to something like 0.000043?
You can use the format_number function as follows:
import org.apache.spark.sql.functions.format_number
df.withColumn("NumberColumn", format_number($"NumberColumn", 5))
Here 5 is the number of decimal places you want to show (6 would be needed to display 4.3E-5 as 0.000043).
Note that format_number returns a string column, as its documentation states:
format_number(Column x, int d)
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places, and returns the result as a string column.
If you don't want the comma (,) separators, you can call the regexp_replace function, which is defined as
regexp_replace(Column e, String pattern, String replacement)
Replace all substrings of the specified string value that match regexp with rep.
and use it as
import org.apache.spark.sql.functions.regexp_replace
df.withColumn("NumberColumn", regexp_replace(format_number($"NumberColumn", 5), ",", ""))
This removes the comma (,) separators that format_number inserts into large numbers.
You can use cast operation as below:
val df = sc.parallelize(Seq(0.000043)).toDF("num")
df.createOrReplaceTempView("data")
spark.sql("select CAST (num as DECIMAL(8,6)) from data")
Adjust the precision and scale accordingly.
In newer versions of PySpark you can use the round() or bround() functions.
These functions return a numeric column and avoid the problem with ",".
It would look like this:
df.withColumn("NumberColumn", bround("NumberColumn",5))
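The difference between scientific and fixed-point display, and the effect of the chosen number of decimal places, can be seen with plain Python string formatting. This sketch does not use Spark at all; it only illustrates the formatting behaviour discussed above, with variable names chosen for the example.

```python
value = 4.3e-5

# Python's default text form for this float is also scientific notation.
print(value)

# Fixed-point formatting with 5 and with 6 decimal places.
fixed_5 = format(value, ".5f")
fixed_6 = format(value, ".6f")
print(fixed_5, fixed_6)

# A thousands separator appears only when requested, and can be stripped
# afterwards in the same way as the regexp_replace call above.
grouped = format(1234567.891, ",.2f")
ungrouped = grouped.replace(",", "")
print(grouped, ungrouped)
```

Five decimal places are too few for this particular value, so the trailing 3 is rounded away; six places keep it.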

Explicit type conversion in postgreSQL

I am joining the two tables using the query below:
update campaign_items
set last_modified = evt.event_time
from (
select max(event_time) event_time
,result
from events
where request = '/campaignitem/add'
group by result
) evt
where evt.result = campaign_items.id
where the result column is of type character varying and id is of type integer.
But the data in the result column contains digits (e.g. 12345).
How can I run this query while converting the type of result (character) to that of id (integer)?
For an untyped string literal you don't need to, because PostgreSQL does the type conversion implicitly. For example, you can try
select ' 12 ' = 12
You will see that it returns true even though there is extra whitespace in the string version. A character varying column is not an untyped literal, though, so in the join condition use an explicit conversion:
where evt.result::int = campaign_items.id
According to your comment you have values like convRepeatDelay; these obviously cannot be converted to int. What you should do then is convert your int to text instead (a bare ::char means char(1) and would keep only the first character):
where evt.result = campaign_items.id::text
There are several solutions. You can use the cast operator :: to cast a value from a given type into another type:
WHERE evt.result::int = campaign_items.id
You can also use the CAST function, which is more portable:
WHERE CAST(evt.result AS int) = campaign_items.id
Note that to improve performance, you can add an index on the cast expression (note the mandatory double parentheses), but then you have to use GROUP BY result::int instead of GROUP BY result to take advantage of the index:
CREATE INDEX i_events_result ON events ((result::int));
By the way the best option is maybe to change the result column type to int if you know that it will only contain integers ;-)