How to parse a field containing Unicode characters with Logstash

I used this Logstash jdbc statement
"select utl_raw.cast_to_varchar2(dbms_lob.substr(field,length(field))) from table;"
to fetch a BLOB field. The output in Logstash contains Unicode escape sequences, like "\v\u0004\u0000\u0000\u0000\u0001\u0005\u0001\u0004\u0019\u0000c\u0000o\u0000m\u0000.\u0000w\u0000m\u0000.\u0000d\u0000a\u0000t\u0000a\u0000.\u0000I\u0000S\u0000M\u0000e\u0000m",
whose actual value is "com.wm.data.ISMem".
I need to remove all of these Unicode characters and parse my data.
Can anyone suggest how this can be resolved, and which Logstash filter can be used to parse the SQL output?

You can use gsub from the mutate filter.
mutate {
  gsub => [
    "field-name", "\u0000", "",
    "field-name", "\v\u0004\u0001\u0005\u0001\u0004\u0019", ""
  ]
}
The first substitution removes every \u0000 from your string. It has to run first because, in your sample, \u0000 characters also sit inside the header.
The second one then removes \v\u0004\u0001\u0005\u0001\u0004\u0019, which seems to be some kind of header.

You could mutate it using
mutate {
  gsub => [
    "message", "[\\]u0000", "",
    "message", "[\\]v[\\]u0004[\\]u0001[\\]u0005[\\]u0001[\\]u0004[\\]u0019", ""
  ]
}
I would expect there is a way to fix the encoding in a ruby filter, but I do not know what it is.
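The two substitutions can be checked outside Logstash. The following is only a sketch in Python, not a Logstash filter: it assumes the field holds real control characters (not literal backslash sequences), and the names raw, HEADER and strip_header_and_nuls are made up for illustration.

```python
# Sample value copied from the question: a binary header followed by
# text in which every character is preceded by a NUL (\u0000).
raw = (
    "\v\u0004\u0000\u0000\u0000\u0001\u0005\u0001\u0004\u0019"
    "\u0000c\u0000o\u0000m\u0000.\u0000w\u0000m\u0000.\u0000d\u0000a"
    "\u0000t\u0000a\u0000.\u0000I\u0000S\u0000M\u0000e\u0000m"
)

# The header as it looks once the NULs have been removed (assumption:
# it is always this exact sequence, as in the sample).
HEADER = "\v\u0004\u0001\u0005\u0001\u0004\u0019"


def strip_header_and_nuls(value: str) -> str:
    """Remove NUL characters first, then the now-contiguous header."""
    without_nuls = value.replace("\u0000", "")
    return without_nuls.replace(HEADER, "")


cleaned = strip_header_and_nuls(raw)
print(cleaned)
```

The order matters here: the header in the sample has NULs inside it, so it only matches as one sequence after they are stripped.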

Related

Convert comma separated non json string to json

Below is the value of a string in a text column.
select col1 from tt_d_tab;
'A:10000000,B:50000000,C:1000000,D:10000000,E:10000000'
I'm trying to convert it into JSON in the format below.
'{"A": 10000000,"B": 50000000,"C": 1000000,"D": 10000000,"E": 10000000}'
Can someone help with this?
If you know that neither the keys nor values will have : or , characters in them, you can write
select json_object(regexp_split_to_array(col1,'[:,]')) from tt_d_tab;
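This splits the string on every colon and comma, then interprets the resulting array as alternating keys and values. Note that json_object produces every value as a JSON string ("10000000"), not as a number.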
If the string manipulation gets any more complicated, SQL may not be the ideal tool for the job, but it's still doable, either by this method or by converting the string into the form you need directly and then casting it to json with ::json.
If your key is a single capital letter, as in your example:
select concat('{',regexp_replace('A:10000000,B:50000000,C:1000000,D:10000000,E:10000000','([A-Z])','"\1"','g'),'}')::json json_field;
A more general case, with keys of any number of letters in either case:
select concat('{',regexp_replace('Ac:10000000,BT:50000000,Cs:1000000,D:10000000,E:10000000','([a-zA-Z]+)','"\1"','g'),'}')::json json_field;
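If the conversion ends up outside the database, the same split-on-delimiters idea is short to write in application code. This is a minimal sketch in Python, assuming (as above) that keys and values never contain ':' or ',' and that every value is an integer; pairs_to_json is a hypothetical helper name.

```python
import json


def pairs_to_json(text: str) -> str:
    """Turn 'A:1,B:2' into a JSON object string with integer values.

    Assumes keys and values contain no ':' or ',' and values are integers.
    """
    result = {}
    for pair in text.split(","):
        key, value = pair.split(":")
        result[key.strip()] = int(value)
    return json.dumps(result)


converted = pairs_to_json("A:10000000,B:50000000,C:1000000,D:10000000,E:10000000")
print(converted)
```

Unlike the json_object query, this keeps the values as JSON numbers because each one is passed through int() before serialization.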

How to add bash array variable to mongo update query

I am writing a bash script which updates a mongo document. If the document is new, it is created in mongo; otherwise it is updated.
I want to pass a bash array variable to the mongo query, since a few fields are of array type. I am not sure how to pass the array field. Any help would be appreciated.
This is my query:
db.my_col.update({emp_id: '"'$emp_id'"'}, {$set: {contacts: '"'${contacts}'"', emp_name: '"'$emp_name'"'}}, {upsert: true})
If I just add it like a normal variable, only the first value in the array gets added.
If you expand an array variable like a plain string ($contacts), you only get its first item.
You get all of the items with the syntax ${contacts[@]}, but they will be unquoted and separated by spaces, which is not valid syntax inside the query.
So you need to convert the Bash array to a JS array manually. I don't know an automagical way to do it, but consider this function:
function jsarray {
  res=""
  # Loop over all arguments; with IFS="" (set below) they are not split on spaces.
  for a in ${@}; do
    res="$res,\"$a\""
  done
  echo "[ ${res:1} ]"
}
It goes through its arguments, wraps each one in quotes, and joins them with commas.
Use it like this (I assume $contacts is an array of string values):
IFS=""
contacts=("skype name" "email address" phone website whatever)
jscontacts=$(jsarray ${contacts[@]})
# Proceed to using $jscontacts in your query.
After that $jscontacts will be a string value of [ "skype name","email address","phone","website","whatever" ].
Notes:
IFS="" - IFS stands for internal field separator; it is how Bash splits unquoted expansions into separate words. By default it contains a space, a tab, and a newline. Setting it to nothing preserves array items with spaces in them; otherwise "skype name" would become two separate values, "skype" and "name".
${res:1} - skips the first character of $res (because it's a comma).
jscontacts=$(...) - captures the function's standard output in the jscontacts variable.
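One limitation of wrapping each item in quotes by hand is that a value containing a double quote or a backslash would produce an invalid array literal. If a scripting language is available, a JSON encoder handles that escaping. This is only an illustrative sketch in Python, not part of the bash approach above; to_js_array is a made-up name.

```python
import json


def to_js_array(items):
    """Serialize a list of strings as a JSON array literal.

    JSON array syntax is also valid JavaScript, so the result can be
    pasted into a query string.
    """
    return json.dumps(list(items))


contacts = ["skype name", "email address", "phone", "website", "whatever"]
js_contacts = to_js_array(contacts)
print(js_contacts)

# A value with an embedded double quote is escaped automatically.
tricky = to_js_array(['say "hi"'])
print(tricky)
```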

Perl regex to remove data between two semicolons

I have a string that contains semicolons, and I want to remove everything between each pair of semicolons (including the semicolons) and leave the rest as it is. I am using a Perl regex to remove the unwanted data:
String:
$val="Data;test is here ;&data=1dffvdviofv;&dt&;&data=343";
I want to remove the data between each pair of semicolons throughout the string:
$val=~s/(.*)(\;.*\;)(.*)$/$1$3/g;
But this is not working for me. The final output should look like this:
Data &data=1dffvdviofv&data=343
One of the problems is that .* is greedy, that is, it will consume as much as it can. You can make it non-greedy by writing .*?, but that alone won't fix your regex since you've anchored it to the end of the string with $. Personally I don't think there is a need for the capture groups, you can just write
$val =~ s/;.*?;//g;
I'm assuming that the extra space in your expected output (Data &data...) is a typo.
You might also want to consider using a proper parser for whatever data format this is.
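The difference between the greedy and non-greedy forms is easy to see side by side. The sketch below uses Python's re module rather than Perl, purely as an illustration; the quantifier behaviour it shows (.* versus .*?) is the same in both.

```python
import re

val = "Data;test is here ;&data=1dffvdviofv;&dt&;&data=343"

# Greedy: .* runs from the first semicolon to the last one,
# so a single match swallows most of the string.
greedy = re.sub(r";.*;", "", val)

# Non-greedy: .*? stops at the nearest following semicolon,
# so each semicolon-delimited chunk is removed separately.
non_greedy = re.sub(r";.*?;", "", val)

print(greedy)
print(non_greedy)
```

re.sub replaces every non-overlapping match, which corresponds to the /g flag in the Perl substitution.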

Perl JSON pound sign escaping

I am trying to use a web API of a service written in Perl (OTRS).
The data is sent in JSON format.
One of the string values inside the JSON structure contains a pound sign, which apparently is used as a comment character in JSON.
This results in a parsing error:
unexpected end of string while parsing JSON string
I couldn't find how to escape the character in order to get the string parsed successfully.
The obvious backslash escaping results in:
illegal backslash escape sequence in string
Any ideas how to escape it?
Update:
The URL I am trying to use looks something like this (simplified, but it still causes the error):
http://otrs.server.url/otrs/json.pl?User=username&Password=password&Object=TicketObject&Method=ArticleSend&Data={"Subject":"[Ticket#100000] Test Ticket from OTRS"}
Use URI::Escape:
use URI::Escape;
# $json is a placeholder for the JSON text passed in Data; escape the value, not the whole URL.
my $safe = uri_escape($json);
See RFC 1738 for the list of characters that can be unsafe.
The hash symbol, #, has a special meaning in URLs, not in JSON. Your URL is probably getting truncated at the hash before the remote server even sees it:
http://otrs.server.url/otrs/json.pl?User=username&Password=password&Object=TicketObject&Method=ArticleSend&Data={"Subject":"[Ticket
And that means that the remote server gets mangled JSON in Data. The solution is to URL-encode your parameters before pasting them together to form your URL; the URI::Escape answer above shows how to do this.
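The truncation can be reproduced with any URL parser. The sketch below uses Python's urllib.parse instead of Perl, as an illustration only; the host and parameter names come from the question, the User and Password parameters are left out for brevity, and the variable names are made up.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://otrs.server.url/otrs/json.pl"  # host taken from the question

data = json.dumps({"Subject": "[Ticket#100000] Test Ticket from OTRS"})

# Naive concatenation: everything after '#' is parsed as a fragment,
# so it never becomes part of the query string.
naive_url = BASE + "?Object=TicketObject&Method=ArticleSend&Data=" + data
naive = urlsplit(naive_url)

# Encoding each parameter value turns '#' into %23 and keeps it in the query.
safe_url = BASE + "?" + urlencode(
    {"Object": "TicketObject", "Method": "ArticleSend", "Data": data}
)
safe = urlsplit(safe_url)
round_trip = parse_qs(safe.query)["Data"][0]

print(naive.fragment)
print(safe_url)
```

Decoding the Data parameter of the encoded URL gives back the original JSON text, while the naive URL loses everything from the hash onward.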

Can I use XML::Simple with an in-memory string, rather than a file?

The XML::Simple documentation says to initialize the data structure from an XML file, using XMLin('[FILENAME]') ... but I have an in-memory string.
Can I pass the string directly, or do I need to save it to the filesystem and then load it into XMLin?
The documentation covers this case:
XMLin() accepts an optional XML specifier followed by zero or more 'name => value' option pairs. The XML specifier can be one of the following:
...
A string of XML
A string containing XML (recognised by the presence of '<' and '>' characters) will be parsed directly.
eg:
$ref = XMLin('<opt username="bob" password="flurp" />');
It also says
my $ref = $xs->XMLin([<xml file or string>] [, <options>]);