Redshift: how to update the last octet of an IP address column to another value

a. 123.12.1 -> 123.12.999
b. 123.12.100.0 -> 123.12.100.999
c. 123.123 -> 123.999
I have a Redshift table with an IP address column containing cases like the ones above. I nested the SUBSTRING and POSITION functions many times to meet the requirement, but I want to know whether there is another, cleaner way to do it.

A cleaner way is a Python UDF that splits the string on the dot character and returns all elements but the last one, with 999 appended. The body of the function is below (val is the parameter; see the official Redshift docs for how to create the function):
return '.'.join(val.split('.')[:-1])+'.999'
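For completeness, a minimal sketch of the full definition (the function name f_replace_last_octet is hypothetical; Redshift Python UDFs use the plpythonu language):

CREATE OR REPLACE FUNCTION f_replace_last_octet(val VARCHAR)
RETURNS VARCHAR
IMMUTABLE
AS $$
    # keep everything before the last dot, then append .999
    return '.'.join(val.split('.')[:-1]) + '.999'
$$ LANGUAGE plpythonu;

You would then call it as: update your_table set ip = f_replace_last_octet(ip);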

No need to use split; the easiest way to update your values is:
update table set ip = regexp_replace(ip, '[.][0-9]{1,3}$','.999');
See the Redshift REGEXP_REPLACE function documentation.
The $ anchor ensures that only the last octet is replaced.
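As a quick sanity check against the three sample cases (an inline query with the sample values; no real table needed):

select ip, regexp_replace(ip, '[.][0-9]{1,3}$', '.999') as new_ip
from (
    select '123.12.1' as ip
    union all select '123.12.100.0'
    union all select '123.123'
) as samples;
-- 123.12.1     -> 123.12.999
-- 123.12.100.0 -> 123.12.100.999
-- 123.123      -> 123.999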

Related

Azure Data Factory Data Flow: MD5 specific columns

First of all, I have an array-of-columns parameter called $array_merge_keys:
$array_merge_keys = ['Column1', 'Column2', 'NoColumnInSomeCases']
I then want to hash them. If the third column, NoColumnInSomeCases, does not exist, I would like to treat it as null or some other string; otherwise, use its value.
But when I use them with byNames(), it returns NULL because the last column does not exist, even though the first and second still have values. I would expect byNames($array_merge_keys) to always return a value so I can hash it.
Since that problem cannot be solved directly, I fell back to filtering down to only the columns that exist:
filter(columnNames('', true()), contains(['Column1', 'Column2', 'NoColumnInSomeCases'], #item_1 == #item)) => ['Column1', 'Column2']
But that runs into another problem: byNames() cannot be computed on the fly; it fails with "'byNames' does not accept column or argument parameters":
array(byNames(filter(columnNames('', true()), contains(['Column1', 'Column2', 'NoColumnInSomeCases'], #item_1 == #item))))
Spark job failed:
DF-EXPR-030 at Derive 'CreateTypeFromFile'(Line 35/Col 36): Column name function 'byNames' does not accept column or argument parameters
RunId: 649f28bf-35af-4472-a170-1b6ece50c551
I have tried many approaches, and even created a new derived column (before that stream) to store ['Column1', 'Column2'], but it said that the column cannot be referenced within the byNames() function.
Is there an elegant solution?
It is true that byNames() cannot evaluate with late binding. You need to either use a Select transformation to set the columns in the stream you wish to hash first, or send in the column names via a parameter. Since that is "early column binding", byNames() will work.
You can use a get metadata activity in the pipeline to inspect which columns are present in your source before calling the data flow, allowing you to send a pipeline parameter with just those columns you wish to hash.
Alternatively, you can create a new branch, use a select matching rule, and then hash the row based on those columns.

How to convert the records in a column to camel case in DataStage?

I have a table with a name column. The names in that column are all lower case, and I want to convert them to camel case. The table is as follows:
Table
|Id|Name |Phone Number|
|01|bob wheeler|999999999 |
If you are trying to do this in a server job, then use Oconv(InLink.MyString,"MCT").
There is no directly-available function in a parallel Transformer stage. You could:
(a) put a server Transformer in a server Shared Container and use that in your parallel job,
(b) write your own routine in C++ and refer to it (called a "parallel routine"), or
(c) investigate whether the database in which the table resides has any kind of camel case conversion function.
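On option (c): many databases do ship such a function. For example, in PostgreSQL or Oracle, INITCAP does exactly this (shown here as a SQL illustration, not DataStage code):

select initcap('bob wheeler');  -- returns 'Bob Wheeler'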
I don't think there is a specific function for that. UpCase is available, so you might consider a workaround that converts the first character, and every first character after a blank, to upper case.
Assuming the name column contains only a first name and a last name (if there is a middle name as well, you can extend the formula to handle it), the formula below converts the first character after the first occurrence of a space to upper case. It concatenates the first name, the first character of the second name in upper case, and then the rest of the second name. You may need to adjust the positions if you get wrong output, but this will work:
left(NAME,Index(NAME,' ',1)) : Upcase(Name[(Index(NAME,' ',1))+1,1]) : Name[(Index(NAME,' ',1))+2,length(NAME)-Index(NAME,' ',1)+1]
For 'bob wheeler', Index(NAME,' ',1) is 4, so this yields 'bob ' : 'W' : 'heeler', i.e. 'bob Wheeler'.

SSRS multi value parameter - can't get it to work

First off, this is my first attempt at a multi-select parameter. I've done a lot of searching, but I can't find an answer that works for me.
I have a PostgreSQL query which has bg.revision_key in (_revision_key), which holds the parameter. As a side note, we've named all our parameters in the queries with the underscore and they all work; they are single-select in SSRS.
In my SSRS report I have a parameter called Revision Key Segment, which is the multi-select parameter. I've ticked Allow multiple values, and under Available Values I have the value field pointing to revision_key in the dataset.
In my dataset parameter options I have Parameter Value [@revision_key].
In my shared dataset I have also set the parameter to Allow multiple values.
For some reason I can't get the multi-select to work, so I must be missing something somewhere, but I've run out of ideas.
Unlike with SQL Server, when you connect to a database using an ODBC connection, the parameter support is different. You cannot use named parameters and instead have to use the ? syntax.
In order to accommodate multiple values you can concatenate them into a single string and use a like statement to search them. However, this is inefficient. Another approach is to use a function to split the values into an in-line table.
In PostgreSQL you can use an expression like this:
inner join (select CAST(regexp_split_to_table(?, ',') AS int) as filter) as my on my.filter = key_column
Then in the dataset properties, under the parameters tab, use an expression like this to concatenate the values:
=Join(Parameters!Keys.Value, ",")
In other words, the report is concatenating the values into a comma-separated list. The database is splitting them into a table of integers then inner joining on the values.
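Putting it together, a minimal sketch of the full query (the table name bg_table is hypothetical; the bg alias and revision_key column are taken from the question):

select bg.*
from bg_table bg
inner join (
    select cast(regexp_split_to_table(?, ',') as int) as filter
) as my
  on my.filter = bg.revision_key;

The single ? placeholder receives the comma-separated string produced by the Join expression above.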

Convert a varchar parameter with CSV into column values in Postgres

I have a Postgres query with one input parameter of type varchar, whose value is used in the where clause.
Until now only a single value was sent to the query, but now we need to send multiple values so that they can be used with an IN clause.
Earlier:
value = 'abc'
where data = value -- current usage
Now:
value = 'abc,def,ghk'
where data in (value) -- intended usage
I tried many ways of providing the value, i.e.
value = 'abc','def','ghk'
or
value = "abc","def","ghk"
etc., but none of them works: the query returns no results even though there is matching data. If I put the values directly in the IN clause, I do see the data.
I think I should somehow split the comma-separated parameter string into multiple values, but I am not sure how to do that.
Please note it is a Postgres DB.
You can try splitting the input string into an array, something like this:
where data = ANY(string_to_array('abc,def,ghk',','))
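For instance, a self-contained sketch (the table name items and column data are hypothetical):

-- split the CSV parameter into a text[] array and match against each element
select *
from items
where data = any(string_to_array('abc,def,ghk', ','));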

Retrieve the first value with XQuery using a wildcard

In an XmlData column in SQL Server 2008 that has no schema assigned to it, how can I pull the first item at a particular node level? For example, I have:
SELECT XmlData.value('//*/*[1]','NVARCHAR(6)')
FROM table
WHERE XmlData.exist('//*/*[1]') = 1
I assume this does not work because, if there are multiple nodes with different names at the 2nd level, the first of each of those could be returned (and value() requires that a singleton be selected).
Since I don't know what the names of any of the nodes will be, is there a way to always select whatever the first node is at the 2nd level?
I found the answer by chaining the XQuery .query() and .value() methods:
XMLDATA.query('//*/*[1]').value('.[1]','NVARCHAR(6)')
This returns the value of the first node and works perfectly for my needs.
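As a minimal, self-contained illustration of the chaining (the XML literal is made up just to show the behaviour):

DECLARE @XmlData xml = '<root><a>first</a><b>second</b></root>';
SELECT @XmlData.query('//*/*[1]').value('.[1]', 'NVARCHAR(6)');  -- returns 'first'

The .query() call narrows the document down to the matching fragment, and .value() then extracts a singleton value from it.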