Export text content to text file without \n mark - postgresql

When I export the text content of a field that contains line breaks, those characters are written out as the string \N.
For example:
create table foo ( txt text );
insert into foo ( txt ) values ( 'first line
second line
...
and other lines');
copy foo ( txt ) to '/tmp/foo.txt';
I want to return the following (a):
first line
second line
...
and other lines
But, output is (b):
first line\Nsecond line\N...\Nand other lines
Does anybody know how to get output (a)?

The \N comes from the fact that one line must correspond to one database row.
This rule is relaxed for the CSV format where multi-line text is possible but then a quote character (by default: ") would enclose the text.
If you want multi-line output and no enclosing character around it, you shouldn't use COPY but SELECT.
Assuming a unix shell as the execution environment of the caller, you could do:
psql -A -t -d dbname -c 'select txt from foo' >/tmp/file.txt

Have you tried: \r\n?
Here's another solution that might work:
E'This is the first part \\n And this is the second'
via https://stackoverflow.com/a/938/1085891
Also, rather than copy the other responses, see here: String literals and escape characters in postgresql

Related

Exporting new line characters to text in Postgres

There is a text field in a Postgres database containing new lines. I would like to export the content of that field to a text file, preserving those new lines. However, the COPY TO command explicitly transforms those characters into the \n string. For example:
$ psql -d postgres -c "COPY (SELECT CHR(10)) TO '/tmp/out.txt';"
COPY 1
$ cat /tmp/out.txt
\n
This behaviour seems to match the short description in the documentation:
Presently, COPY TO will never emit an octal or hex-digits backslash sequence, but it does use the other sequences listed above for those control characters.
Is there any workaround to get the new line in the output? E.g. that a command like:
$ psql -d postgres -c "COPY (SELECT 'A line' || CHR(10) || 'Another line') TO '/tmp/out.txt';"
Results in something like:
A line
Another line
Update: I do not wish to obtain a CSV file. The output must not have headers, column separators or column decorators such as quotes (exactly as exemplified in the output above). The answers provided in a different question with COPY AS CSV do not fulfil this requirement.
Per my comment:
psql -d postgres -U postgres -c "COPY (SELECT CHR(10)) TO '/tmp/out.txt' WITH CSV;"
Null display is "NULL".
COPY 1
cat /tmp/out.txt
"
"
psql -d postgres -U postgres -c "COPY (SELECT 'A line' || CHR(10) || 'Another line') TO '/tmp/out.txt' WITH CSV;"
Null display is "NULL".
COPY 1
cat /tmp/out.txt
"A line
Another line"
Using the CSV format will maintain the embedded line breaks in the output. This is explained in the COPY documentation under CSV Format:
The values in each record are separated by the DELIMITER character. If the value contains the delimiter character, the QUOTE character, the NULL string, a carriage return, or line feed character, then the whole value is prefixed and suffixed by the QUOTE character, and any occurrence within the value of a QUOTE character or the ESCAPE character is preceded by the escape character. You can also use FORCE_QUOTE to force quotes when outputting non-NULL values in specific columns.
...
Note
CSV format will both recognize and produce CSV files with quoted values containing embedded carriage returns and line feeds. Thus the files are not strictly one line per table row like text-format files.
UPDATE
Alternate method that does not involve quoting, using psql.
create table line_wrap(id integer, fld_1 varchar);
insert into line_wrap values (1, 'line1
line2');
insert into line_wrap values (2, 'line3
line4');
select fld_1 from line_wrap
\g (format=unaligned tuples_only=on) out.txt
cat out.txt
line1
line2
line3
line4
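If the file has already been written by COPY in text format, another option is to undo the backslash escapes afterwards. The following is only a rough sketch: copy_unescape is a made-up helper name, it assumes a single exported column, and it does not treat a lone \N (the NULL marker) specially.

```shell
# Hypothetical helper: undo COPY text-format backslash escapes (\n, \t, \r, \\)
# in a single-column export. Reads the named files, or stdin if none are given.
copy_unescape() {
  awk '{
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\\" && i < n) {
        i++
        e = substr($0, i, 1)
        if (e == "n") out = out "\n"
        else if (e == "t") out = out "\t"
        else if (e == "r") out = out "\r"
        else out = out e    # e.g. an escaped backslash
      } else {
        out = out c
      }
    }
    print out
  }' "$@"
}

# Simulates the content COPY would have written for the example above.
printf '%s\n' 'A line\nAnother line' | copy_unescape
```

Against a real export this would be called as copy_unescape /tmp/out.txt > /tmp/plain.txt (the second path is just an example).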

How to remove part of text after certain sign in bash

I have .txt file which has following data:
user-5
user-10
user-12
user-23(some text)
user-11#dsa.dsd
user-23-sometext
I want to leave only user-NUMBER. So I have to remove text after #, ) and second -.
I'm trying to use the sed command and have already succeeded with # and ). How can I remove the text after the second -?
My code: sed 's/[)|#].*//g'
sed 's|\(user-[0-9]*\).*|\1|'
This way you don't need to include every possible character that would terminate a user-NUMBER match.
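As a quick check, here is that substitution run over the sample data from the question. The function name strip_user_suffix is only for illustration.

```shell
# Wraps the sed expression from the answer; keeps "user-" plus the digits
# that follow and drops everything after them.
strip_user_suffix() {
  sed 's|\(user-[0-9]*\).*|\1|' "$@"
}

# Sample lines taken from the question.
printf '%s\n' 'user-5' 'user-10' 'user-12' 'user-23(some text)' \
  'user-11#dsa.dsd' 'user-23-sometext' | strip_user_suffix
```

Each line should come out as user-NUMBER only, whatever character followed the number.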

Remove some rows with " in front

I have a CSV file that is causing me serious headaches going into Tableau. Some of the rows in the CSV are wrapped in a " " and some not. I would like them all to be imported without this (i.e. ignore it on rows that have it).
Some data:
"1;2;Red;3"
1;2;Green;3
1;2;Blue;3
"1;2;Hello;3"
Do you have any suggestions?
If you have a bash prompt hanging around...
You can use cat to output the file contents so you can make sure you're working with the right data:
cat filename.csv
Then, pipe it through sed so you can visually check that the quotes were deleted:
cat filename.csv | sed 's/"//g'
If the output looks good, use the -i flag to edit the file in place:
sed -i 's/"//g' filename.csv
All quotes should now be gone from filename.csv.
If your data has quotes in it, and you want to only strip the quotes that appear at the beginning and end of each line, you can use this instead:
sed -i 's/^"\(.*\)"$/\1/' filename.csv
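To see the difference between the two expressions without touching the file, they can be tried on a few lines first. This is only a sketch: the function names are made up, and the third sample line (with quotes inside the data) is invented for the demonstration.

```shell
# Removes every double quote on every line.
strip_all_quotes() {
  sed 's/"//g' "$@"
}

# Removes only a pair of quotes that wraps the whole line.
strip_wrapping_quotes() {
  sed 's/^"\(.*\)"$/\1/' "$@"
}

# The inner quotes around hi survive here; strip_all_quotes would drop them too.
printf '%s\n' '"1;2;Red;3"' '1;2;Green;3' '"1;2;say "hi";3"' | strip_wrapping_quotes
```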
It's not the most elegant way to do it in Tableau but if you cannot remove it in the source file, you could create a calculated field for the first and last column that strips the quotation marks.
Right-click the field for the first column and choose Create/Calculated Field
Use this formula: INT(REPLACE([FirstColumn],'"',''))
Name the column accordingly
Do the same for the last column
This assumes the sample you provided is representative of the data you work on, and that these fields are integer fields (hence the INT() usage). If they are string fields, make sure you don't remove quotation marks that belong to the field value.

command line method to remove single line paragraphs from text file

I have a .txt file with two types of paragraphs:
Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns

Then a single line paragraph that is followed by two returns

Along with some more double line text return
some more text.
I want to remove all single line paragraphs from the text file. So that the result is:
Some statements and numbers (02) and such followed by a return
With some more stuff followed by two returns

Along with some more double line text return
some more text
I have been attempting to do this with sed and awk, but I keep running into problems coming up with a regex that will look for a newline followed by some characters and ending in two consecutive newlines \n\n.
Is there any way to do this with a one-liner, or am I going to have to write a script that reads the file line by line, determines the length of each paragraph and strips it out that way?
Thanks.
awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' input.txt
When RS is set to the empty string, each record always ends at the first blank line encountered.
When RS is set to the empty string, and FS is set to a single character, the newline character always acts as a field separator.
I tend to reach for Perl for paragraph-oriented parsing:
perl -00 -lne 'print if tr/\n/\n/ > 0'
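Here is the awk version applied to a small stand-in input, as a sketch. The wrapper name drop_single_line_paragraphs and the sample text are made up for the demonstration.

```shell
# Paragraph mode (RS='') makes each blank-line-separated block one record;
# with FS set to newline, NF is the number of lines in that block.
drop_single_line_paragraphs() {
  awk -F '\n' -v RS='' -v ORS='\n\n' 'NF>1' "$@"
}

# Two two-line paragraphs with a single-line paragraph between them.
printf 'a1\na2\n\nsingle\n\nb1\nb2\n' | drop_single_line_paragraphs
```

Only the two-line paragraphs are printed. Because ORS is two newlines, the output ends with one extra blank line.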

Removing newlines in Postgres dump

I'm trying to format a Postgres dump (pg_dump) so that I can import it over a JDBC connection. pg_dump exports text fields that contain newlines as just that, text with newlines, so when I later try to import using JDBC I reach the end of a line and the statement fails.
What I want to do is take the dump, pass it through sed and escape all newlines, so that I end up with one INSERT statement per line. The problem is that I cannot just remove all newlines, but I can remove all newlines that do not match this: );\nINSERT INTO. Is there a simple way to do just this?
Update:
A sample would look like this:
INSERT INTO sometable (123, 'And here goes some text
with
newlines
in
it', 'some more fields');
and the result I'm looking for is something like this:
INSERT INTO sometable (123, 'And here goes some text\nwith\nnewlines\nin\nit', 'some more fields');
So that each INSERT statement is on a single line, with the string's newlines escaped.
Not a sed solution, but might the following work?
perl -0777 -pe 's/\n(?!\z|(?<=\);\n)INSERT INTO)/\\n/g' test_dump.txt
You can do it in vim.
vim my_dump.sql
:%s/\();\)\@<!\n\(INSERT\)\@!//c
% .. do for all lines
s .. substitute
\n .. newline (Unix style; you are aware, that Windows has \r\n and Apple \r for line breaks?)
flags:
c .. Confirm each substitution (for testing first)
info on negative lookahead and lookbehind
:help \@!
:help \@<!
sed normally operates on one line at a time, so it has to go out of its way to replace line breaks.
Google for "sed multi-line replace", you'll find stuff like this.
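A line-oriented alternative is to let awk collect lines until a statement boundary is seen. This is only a sketch: join_insert_lines is a made-up name, the sample statements are invented, and it assumes a new statement starts exactly where a line ending in ); is followed by a line beginning with INSERT INTO (the rule from the question). Whether the importing side then reads \n as a newline depends on how it treats backslashes in string literals.

```shell
# Hypothetical helper: join continuation lines of multi-line INSERT statements,
# replacing each embedded line break with a literal backslash-n.
join_insert_lines() {
  awk '
    NR == 1 { buf = $0; next }
    {
      if (buf ~ /\);$/ && $0 ~ /^INSERT INTO/) {
        print buf             # previous statement is complete
        buf = $0
      } else {
        buf = buf "\\n" $0    # still inside a multi-line string
      }
    }
    END { if (NR > 0) print buf }
  ' "$@"
}

# Invented two-statement dump; the first statement spans two lines.
printf '%s\n' "INSERT INTO t VALUES (1, 'a" "b');" "INSERT INTO t VALUES (2, 'c');" |
  join_insert_lines
```

On a real dump it would be called as join_insert_lines my_dump.sql, with the result redirected to a new file.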