Replace multiple line breaks with one newline character in Oracle using REGEXP_REPLACE - oracle10g

How can I replace multiple line breaks with one newline character in Oracle using REGEXP_REPLACE?

I think this is what you are after. Dealing with carriage returns can get tricky depending on whether you are on Windows or UNIX, but you'll get the idea. This was run in Toad. It uses a regular expression that looks for two or more newline characters in a row and replaces them with one newline.
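The query itself is not shown above, so here is a minimal sketch of the same regex idea in Python rather than Oracle SQL. The function name collapse_newlines is hypothetical, and treating an optional carriage return before each newline as part of the line break is my assumption for Windows-style input.

```python
import re

def collapse_newlines(text: str) -> str:
    # Hypothetical helper: replace any run of two or more line breaks
    # (LF, or CRLF as an assumption for Windows input) with a single LF.
    return re.sub(r'(?:\r?\n){2,}', '\n', text)

print(repr(collapse_newlines("a\n\n\nb\nc")))
```

A single line break is left untouched; only runs of two or more are collapsed.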

Related

Last string appears at beginning of line in formatted output

Has anyone any idea why the following would format itself in a weird way? In several years I've had no problem with creating simple text output but this problem has me baffled.
I'm using the line
print "$BC,$Ttl,$FN,$SN,$Finalage,$OurLoc,$OurDT,$FinalPC\n";
Every value is a simple text string on which I've run "chomp" to remove return characters.
I would expect the output to look like the following:
*DD10099999,,Information Services,Guest Ticket 2,41,C G,03/11/2020,NE8 9BB*
$BC is the first item and $FinalPC is the postcode at the end.
Instead I get:
*,NE8 9BB99, ,Information Services,Guest Ticket 2,41,C G,03/11/2020*
The final item has somehow moved to the beginning of the line and overwritten the first item. This is happening consistently on every line of my screen and text file output and I'm completely stumped as to why. The data is read from a text file and compared with database output which is also simple text. There are no occurrences of \b anywhere in my code. Why would a backspace character get into it?
The string in $OurDT ends with a carriage return, which causes your terminal to home the cursor. Presumably, the value of $OurDT came from a Windows file read on a unixy machine.
One option is to fix the file (e.g. by using the dos2unix utility).
Another is to accept both CRLF and LF as line endings (e.g. by using s/\s+\z// instead of chomp).
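To illustrate the difference between the two approaches, here is a small sketch in Python rather than Perl. The function names are hypothetical: chomp_like mimics removing only a trailing newline, and strip_line_ending mimics s/\s+\z// by removing all trailing whitespace, including a stray carriage return.

```python
import re

def chomp_like(line: str) -> str:
    # Removes one trailing "\n" only, so a "\r" from a CRLF file survives.
    return line[:-1] if line.endswith("\n") else line

def strip_line_ending(line: str) -> str:
    # Removes all trailing whitespace, so both "\n" and "\r\n" are handled.
    return re.sub(r'\s+\Z', '', line)

raw = "03/11/2020\r\n"  # a line as read from a Windows-style file
print(repr(chomp_like(raw)))
print(repr(strip_line_ending(raw)))
```

With the first function the value still ends in a carriage return, which is what sends the cursor back to the start of the line when the joined string is printed.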

Postgres regexp_replace: inability to replace source text with first captured group

Using PostgreSQL, I am unable to design the correct regex pattern to achieve the desired output of an SQL statement that uses regexp_replace.
My source text consists of several scattered blocks of text of the form 'PU*' followed by a date string in the form of 'YYYY-MM'--for example, 'PU*2020-11'. These blocks are surrounded by strings of unpredictable, arbitrary text (including other instances of 'PU*' followed by the above date string format, such as 'PU*2017-07'), white space, and line feeds.
My desire is to replace the entire source text with the FIRST instance of the 'YYYY-MM' text pattern. In the above example, the desired output would be '2020-11'.
Currently, my search pattern results in the correct replacement text in place of the first capturing group, but unfortunately, all of the text AFTER the first capturing group also inadvertently appears in the output, which is not the desired output.
Specifically:
Version: postgres (PostgreSQL) 13.0
A more complex example of source text:
First line
Exec committee
PU*2020-08
PU*2019-09--cancelled
PU*2017-10
added by Terranze
My pattern so far:
(\s|\S)*?PU\*(\d{4}-\d{2})(\s|\S*)*
Current SQL statement:
select regexp_replace('First line\nExec committee; PU*2020-08\nPU*2019-09\nPU*2017-10\n\nadded by Terranze\n', '(\s|\S)*?PU\*(\d{4}-\d{2})(\s|\S*)*', '\2') as _regex;
Current output on https://regex101.com/
2020-08
Current output on psql
_regex
───────────────────────────────────────────────────────────────────
2020-08\nPU*2019-09--cancelled\nPU*2017-10\n\nadded by Terranze\n
(1 row)
Desired output:
2020-08
Any help appreciated. Thanks--
How about this expression, used with '\1' as the replacement since it has a single capturing group:
'^.*?PU\*(\d{4}-\d{2}).*$'
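As a rough cross-check outside the database, the same pattern can be tried in Python. This is only a sketch: Python's re module is not the Postgres regex engine, the re.DOTALL flag is my assumption so that . can span line breaks, and first_pu_date is a hypothetical helper name.

```python
import re

# Same pattern as above; DOTALL lets "." match newlines as well.
PATTERN = re.compile(r'^.*?PU\*(\d{4}-\d{2}).*$', re.DOTALL)

def first_pu_date(text: str) -> str:
    # Replace the whole input with the first captured YYYY-MM value.
    return PATTERN.sub(r'\1', text, count=1)

sample = ("First line\nExec committee; PU*2020-08\nPU*2019-09\n"
          "PU*2017-10\n\nadded by Terranze\n")
print(first_pu_date(sample))
```

The lazy prefix stops at the first PU* block, and the anchored greedy suffix consumes everything after the date so that nothing else survives the replacement.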

Can COMMENTS in Postgres contain line breaks?

I have a very long comment I want to add to a Postgres table.
Since I do not want a very long single line as a comment I want to split it into several lines.
Is this possible? \n does not work since Postgres does not use the backslash as an escape character.
Just write a multi-line string:
COMMENT ON TABLE foo IS 'This
comment
is stored
in multiple lines';
You can also embed \n escape sequences in “extended” string constants that start with E:
COMMENT ON TABLE foo IS E'A comment\nwith three\nlines.';
You can use automatic concatenation of adjacent string literals together with E'\n' escape sequences for linebreaks:
COMMENT ON TABLE foo IS E''
'This comment is stored in multiple lines. But only some'
'end with linebreaks like this one.\n'
'You can even create empty lines to simulate paragraphs:'
'\n\n'
'This would be the second paragraph, then.';
Details:
Note the initial E'' at the end of the first line. This is essential to make all the adjacent string literals that follow it use the extended string literal syntax, which lets us write \n for a linebreak. Of course, that E could also be placed on the second line instead, at the start of the real string: E'This comment …'. Putting it on the first line is just source code aesthetics (character alignment and so on).
I consider this solution slightly better than multi-line strings (proposed in another answer here) because it lets you fit the comment within the typical line width limit and the indentation requirements of source files. This is useful when you keep your SQL in well-formatted files under version control, that is, when you treat it like any other source code. Including indentation in multi-line strings, on the other hand, results in lots of additional whitespace in the live table comment.
Note for OP: When you say "I do not want a very long single line as a comment", it is not clear if you don't want that long line in your .sql source code file, or if you don't want it in the table comment of the live table, such as when seen in a database admin tool. It does not really matter, as this solution gives you tools for both purposes: use adjacent string literals to fit your line into the source code file, without affecting line breaks in the live table comment; and use \n to create line breaks and empty lines in the live table comment.

PostgreSQL Trimming Leading and Trailing Characters: = and "

I'm working to build an import tool that utilizes a quoted CSV file. However, several of the fields in the CSV file are reported as such:
"=""38000"""
Where 38000 is the data I need. The data integration software I use (Talend 6.11) already strips the leading and trailing double quotes for me (so, "38000" becomes 38000), but I can't find a way to get rid of those others.
So, essentially, I need "=""38000""" to become "38000" where the leading "=" is removed and the trailing "" is removed.
Is there a TRIM function that can accomplish this for me? Perhaps there is a method in Talend that can do this?
As the other answer stated, you could do that operation in SQL. Or, you could do it in Java, Groovy, etc, within Talend. However, if there is an existing Talend component which does the job, my preference is to use it. That leads to faster development, potentially less testing, and easier maintenance. Having said that, it is worth reviewing all the available components so you know your options.
You can use the Talend component tReplace to inspect each of the input columns you want to trim of quotes and equal signs. A single tReplace component can do search and replace operations on multiple input columns. If all of the replacements are related to each other, I would keep them within a single tReplace. When it gets to the point of doing unrelated replacements, I might place those within a new tReplace so that logical operations are organized and grouped together.
tReplace
For a given Input Column
search for "=", replace with ""
search for "\"", replace with ""
Something like this:
SELECT format( '"%s"', trim( both '"=' from '"=""38000"""' ) );
-[ RECORD 1 ]---
format | "38000"
1st: the trim() function removes all leading and trailing " and = characters. The result is simply 38000
2nd: format() adds the double quotes back to give the desired end result
Alternatively, you can use regexp and other Postgres string functions.
See more:
https://www.postgresql.org/docs/current/static/functions-string.html
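The same two steps (trim the wrapping characters, then add one pair of quotes back) can be sketched in Python for comparison. The function name requote is hypothetical, and this only illustrates the logic; it is not a Talend component.

```python
def requote(raw: str) -> str:
    # Step 1: drop every leading and trailing '"' or '=' character.
    core = raw.strip('"=')
    # Step 2: wrap the remaining value in a single pair of double quotes.
    return '"{}"'.format(core)

print(requote('"=""38000"""'))
```

Like trim(both ...), strip() only touches the two ends of the string, so quotes or equal signs inside the value would be left alone.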

How to use '^@' in Vim scripts?

I'm trying to work around a problem with using ^@ (i.e., <ctrl-@>) characters in Vim scripts. I can insert them into a script, but when the script runs it seems the line is truncated at the point where a ^@ was located.
My kludgy solution so far is to have a ^@ stored in a variable, then reference the variable in the script whenever I would have quoted a literal ^@. Can someone tell me what's going on here? Is there a better way around this problem?
That is one reason why I never use raw special character values in scripts. While ^@ does not work, the string <C-@> in mappings works as expected, so you may use one of
nnoremap <C-@> {rhs}
nnoremap <Nul> {rhs}
It is strange, but you cannot use <Char-0x0> here. Some notes about the null byte in strings:
Inserting a null byte into a string truncates it: Vim uses old C-style strings that end with a null byte, thus it cannot appear in strings. These strings are very inefficient, so if you want to generate a very large text, try accumulating it into a list of lines (using setline is very fast as the buffer is represented as a list of lines).
Most functions that return a list of strings (like readfile, getline(start, end)) or take a list of strings (like writefile, setline, append) treat \n (NL) as Null. It is also the internal representation of buffer lines, see :h NL-used-for-Nul.
If you try to insert a \n character into the command-line, you will get Null shown (but this is really a newline). If you want to edit a file that has \n in a filename (it is possible on *nix), you will need to prepend the newline with a backslash.
The byte ctrl-@ is also known as '\0'. Many languages, programs, etc. use it as an "end of string" marker, so it's not surprising that Vim gets confused there. If you must use this byte in the middle of a script string, it sounds like your workaround is a decent one.
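The truncation effect of a null terminator can be seen outside Vim too. The sketch below uses Python's ctypes module to read a buffer the way C-style string handling does. It is only an analogy and not Vim's actual code, and c_string_view is a hypothetical helper name.

```python
import ctypes

def c_string_view(data: bytes) -> bytes:
    # Copy the bytes into a C char buffer, then read it back as a
    # NUL-terminated string: everything after the first 0 byte is ignored.
    buf = ctypes.create_string_buffer(data)
    return buf.value

line = b"let x = 'a\x00b'"
print(c_string_view(line))
```

The bytes after the null byte are still in the buffer, but anything that treats the buffer as a C string stops reading at that point.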