Replace dot in datafusion wrangler not working - google-cloud-data-fusion

I need to remove dots from a number in Google Cloud Data Fusion. I'm using the Wrangler transformation for this, but I'm having trouble with one file: if I replace the dots, the whole cell becomes empty.
If I replace any other character, it works.
What could be the problem?
Thanks!
Original Value:
After replacing dots (.):
Same cell/row, but replacing spaces and the number 1:

Wrangler's find-and-replace function is similar to "sed" in that it applies regular expressions.
In a regular expression, an unescaped period (.) matches any character except a newline, so replacing it removes every character in the cell.
Here is the original data:
I tried this on my own project and here is the result when using the unescaped period:
You need to escape the period (\.) so that it is treated as a literal period. Here is the result when escaping the period:
As you can see, the period (.) before "jpg" was removed.
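
The difference between the escaped and unescaped period can be reproduced outside Wrangler. The sketch below uses Python's `re` module rather than Wrangler itself, and the sample value is a made-up number, so treat it only as an illustration of the regex behaviour:

```python
import re

def remove_dots_unescaped(value: str) -> str:
    """Buggy variant: an unescaped '.' matches every character, so nothing is left."""
    return re.sub(r".", "", value)

def remove_dots(value: str) -> str:
    """Escaped variant: the backslash makes '.' match only a literal period."""
    return re.sub(r"\.", "", value)

sample = "1.234.567"  # hypothetical cell value, not taken from the question
print(repr(remove_dots_unescaped(sample)))  # ''
print(repr(remove_dots(sample)))            # '1234567'
```

The same escaping rule applies to any tool that treats the search text as a regular expression.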

Related

Difficulty forming a regular expression

I'm trying to check a string to make sure that it only contains lowercase and uppercase letters, the digits 0-9, underscores, dashes, spaces and periods.
The regular expression I've been using for letters, numbers, underscores and dashes works fine and is this:
"[^a-zA-Z0-9_-]"
I'm having difficulty adding the check for spaces and periods though.
I've tried:
"[^a-zA-Z0-9_- ]" (added a space after the dash)
"[^a-zA-Z0-9_-\s\.]" (trying to escape a white space character)
I've also tried putting the \s and \. outside of the main block and also in blocks of their own.
Thanks for any advice.
A hyphen that stands for the literal character must be placed at the beginning or at the end of the (negated) character class; anywhere else it is read as a range operator.
Inside a character class the period is a normal character, so it doesn't need to be escaped.
let pattern = "[^a-zA-Z0-9_. -]+"
Be careful when adding characters next to ones that have a special meaning: you overlooked the hyphen, which defines a range when it sits between two characters.
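
As a rough check of the pattern above, here is a minimal sketch in Python (not the Swift/Foundation API the answer targets). The helper name `is_valid` is made up for illustration:

```python
import re

# Negated class: matches any run of characters that are NOT allowed.
# The hyphen comes last, so it is a literal hyphen, not a range operator.
DISALLOWED = re.compile(r"[^a-zA-Z0-9_. -]+")

def is_valid(text: str) -> bool:
    """Return True when text contains only letters, digits, '_', '.', ' ' and '-'."""
    return DISALLOWED.search(text) is None

print(is_valid("file_name-1.0 final"))  # True
print(is_valid("bad/name"))             # False
```

In Python, the question's first attempt, with the space placed after the hyphen, is rejected as a bad character range, which matches the explanation given in the answer.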
I think that this is what you are looking for:
"[\^ a-zA-Z0-9_,\.\-]"

Trying to work around the error DF-CSVWriter-InvalidEscapeSetting

I have a dataset which I want to export to CSV with a pipe as the separator and no escape character.
The dataset contains 4 source columns: 3 regular ones (just text) and one of variable length.
That last column holds a subset of values that are also separated with a pipe.
The goal is for the export to look like this, where the values come from my 4th field:
COL1|COL2|COL3|VAL1|VAL2|VAL3|....
The number of values can be different for each record.
When I set the CSV export separator to ";", I get this result, which is expected:
COL1;COL2;COL3;VAL1|VAL2|VAL3|....
However, setting it to "|" throws the error DF-CSVWriter-InvalidEscapeSetting.
Most likely this is because it detects the separator character in my 4th field and then requires an escape character to be set.
That is logical in most cases, but in my case I would like it to ignore this and just export the data as-is.
Is there any way to work around this, perhaps with a different approach or some additional settings?
Split & flatten produces extra rows, but that's not what I want.
Regards,
Sven Peeters
Because your column value contains the same character as your delimiter, writing the dataset without an escape character throws an error.
You have to either change the delimiter to a different character or set both the Quote character and the Escape character to a double quote (").
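
The same conflict can be seen in other CSV writers. The sketch below uses Python's standard `csv` module as an analogy only; it is not the data flow sink that raises DF-CSVWriter-InvalidEscapeSetting, and the record and the helper `write_row` are made up for illustration:

```python
import csv
import io

row = ["COL1", "COL2", "COL3", "VAL1|VAL2|VAL3"]  # hypothetical record

def write_row(fields, **fmt) -> str:
    """Write one pipe-delimited row to a string using the given csv options."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="|", lineterminator="\n", **fmt).writerow(fields)
    return buf.getvalue()

# No quoting and no escape character: the writer refuses, because the
# embedded pipes could not be told apart from separators.
try:
    write_row(row, quoting=csv.QUOTE_NONE)
except csv.Error as exc:
    print("error:", exc)

# With a double quote as the quote character, the field is wrapped in quotes.
print(write_row(row, quoting=csv.QUOTE_MINIMAL, quotechar='"'), end="")
# COL1|COL2|COL3|"VAL1|VAL2|VAL3"
```

Note that the quoted output is not the flat pipe-separated line the question asks for; the sketch only shows what the quote setting does.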
Downloaded file:

Notepad++ how to swap characters in a string

I have a computer generated text file. I need to swap positions of certain entries. These entries are always 4 characters long and separated from the rest by semicolons. The 4th character needs to become the first character.
For example:
;1234;
has to become:
;4123;
Note: There's a lot of other text separated by semicolons, but only these entries are exactly 4 characters long. The rest is longer or shorter.
Have a try with:
Find what: ;(\d\d\d)(\d);
Replace with: ;$2$1;
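
For readers who want to try the substitution outside Notepad++, here is a minimal sketch in Python, where back-references in the replacement are written `\2\1` instead of `$2$1`. The function names are made up, and the second variant is an assumption that goes beyond the answer: it uses a lookahead so that two 4-digit entries sharing a semicolon are both rewritten.

```python
import re

def rotate_entries(text: str) -> str:
    """Same pattern as the answer: move the 4th digit in front of the first three."""
    return re.sub(r";(\d\d\d)(\d);", r";\2\1;", text)

def rotate_entries_adjacent(text: str) -> str:
    """Variant: the lookahead leaves the closing ';' unconsumed for the next match."""
    return re.sub(r";(\d\d\d)(\d)(?=;)", r";\2\1", text)

print(rotate_entries("abc;1234;defgh;"))       # abc;4123;defgh;
print(rotate_entries(";1234;5678;"))           # ;4123;5678;
print(rotate_entries_adjacent(";1234;5678;"))  # ;4123;8567;
```

Like the answer, both versions assume the entries consist of digits only.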

Removing a trailing Space from Regex Matched group

I'm using the regular expression lib icucore via RegKit on the iPhone to replace a pattern in a large string.
The pattern I'm looking for looks something like this:
| hello world (P1)|
I'm matching this pattern with the following regular expression:
\|((\w*|.| )+)\((\w\d+)\)\|
This splits the input string into 3 groups when a match is found, of which group 1 (the string) and group 3 (the string in parentheses) are of interest to me.
I'm converting these formatted strings into HTML links, so the above would be transformed into
Hello world
My problem is the trailing space in the first group. When the link is highlighted and underlined, the line extends beyond the printed characters.
While I know I could extract all the matches and process them manually, using the search and replace feature of the ICU lib is a much cleaner solution, so I would rather not do that.
Many thanks as always
Would the following work as an alternate regular expression?
\|((\w*|.| )+)\s+\((\w\d+)\)\|
Inserting the extra \s+ pulls the space outside the 1st grouping.
Though, given your example and regex, I'm not sure why you don't just do:
\|(.+)\s+\((\w\d+)\)\|
which will have the same effect. However, both your original regex and my simpler one would fail on:
| hello world (P1)| and on the same line | howdy world (P1)|
where everything would be rolled up into 1 match.
\|\s*([\w ,.-]+)\s+\((\w\d+)\)\|
will put the trailing space(s) outside the capturing group. This will of course only work if there always is a space. Can you guarantee that?
If not, use
\|\s*([\w ,.-]+(?<!\s))\s*\((\w\d+)\)\|
This uses a lookbehind assertion to make sure the capturing group ends in a non-space character.
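
The lookbehind version can be checked with a short sketch. It uses Python's `re` module, not icucore, so small differences between the two regex engines are possible, and the sample strings and the helper `extract` are made up for illustration:

```python
import re

# Group 1: the label without a trailing space; group 2: the code in parentheses.
LINK = re.compile(r"\|\s*([\w ,.-]+(?<!\s))\s*\((\w\d+)\)\|")

def extract(text: str):
    """Return a list of (label, code) tuples for every match in text."""
    return LINK.findall(text)

print(extract("| hello world (P1)|"))
# [('hello world', 'P1')]
print(extract("| hello world (P1)| and on the same line | howdy world (P2)|"))
# [('hello world', 'P1'), ('howdy world', 'P2')]
```

Because the character class excludes `|` and `(`, two entries on the same line are matched separately instead of being rolled into one match.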

crystal reports : substring error

Since Crystal Reports doesn't seem to have a substring function, I've developed a workaround with the following formula:
right({_v_hardware.groupname}, truncate(instr(replace({_v_hardware.groupname}, ".", ","), ",")))
What I'm trying to do is search for the period (".") in a string and replace it with a comma. Then find the comma position in the string and print all characters following after the comma. This is assuming the string will only have 1 period in the entire string.
When I attempt this, I get some weird characters that look like Wingdings. Any ideas?
Thanks in advance.
I don't know the full problem you are trying to solve, but for this question alone, replacing the period with a comma seems unnecessary. If you know that there is only one period in the string and you only want the characters to the right of it, you should be able to do something like the following (this is #first_formula):
right({_v_hardware.groupname}, len({_v_hardware.groupname}) - instr({_v_hardware.groupname},"."))
If for some reason you want to show the comma, I'd do that in a separate formula. If you need the entire string with the period replaced by a comma, just do:
replace({_v_hardware.groupname},".",",")
And if you need the comma included in front of the extracted string, it might be easier to do something like:
"," + {#first_formula}
Hope this helps.
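
The arithmetic behind the suggested formula can be traced with a small sketch. It is written in Python, not Crystal syntax; `right` and `instr` are stand-ins that assume Crystal's behaviour (rightmost N characters, and a 1-based position that is 0 when nothing is found), and the sample group name is made up:

```python
def right(s: str, n: int) -> str:
    """Rightmost n characters of s (assumed stand-in for Crystal's Right)."""
    n = max(0, min(n, len(s)))
    return s[len(s) - n:]

def instr(s: str, sub: str) -> int:
    """1-based position of sub in s, or 0 if absent (assumed stand-in for InStr)."""
    return s.find(sub) + 1

def after_period(s: str) -> str:
    """The answer's approach: keep len(s) - position characters from the right."""
    return right(s, len(s) - instr(s, "."))

def original_attempt(s: str) -> str:
    """The question's approach: passes the position itself as the length."""
    return right(s, instr(s, "."))

name = "servers.web01"  # hypothetical group name
print(after_period(name))      # web01
print(original_attempt(name))  # rs.web01
```

Passing the position as the length returns a different slice than intended, although that alone may not explain the odd characters described in the question.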