How to convert \r to caret M (^M) in PostgreSQL?

I've spent quite some time trying to google this issue. However, I keep finding articles and answers to the opposite of what I want.
System Specifications:
CentOS 6.7
JBoss 4.2.3
PostgreSQL v8.4
I'm using JBoss as a message server and storing messages in a PostgreSQL database.
Problem:
Prior to the upgrade, messages would be stored into the database with carriage returns formatted as ^M and any database query would return the same format.
After the upgrade, the JBoss log still shows a ^M in the messages being sent through. However, the ^M is now being stored in the database as \r.
Reasons:
I have many scripts that rely on the ^M to parse out messages and lines.
I want the data in the database to be the same as the data that resides in my logs.
When using vim, I find that reading and locating ^M is easier than reading and locating \r.
Update
Encoding of the database is UTF-8.
The field I'm accessing is a text type.
Upon further testing, it seems that the \r is not actually being put into the database.
I have a Perl script that connects to the database and creates a file of records. All these records show the ^M which I desire.
However, using a command like the following in the CLI will output records with \r:
psql dbName -U dbUser -c "select record from table" > records.txt
Workaround
Found a temporary workaround to change the \r back to a ^M.
psql dbName -U dbUser -t -c "select record from table" | sed 's!\\r!^M!g' > records.txt
To type a literal ^M in the sed command:
Hold down Control (Ctrl)
Press and release v
Press and release m
^M should now be displayed, and you can release the Control key
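If your shell is bash, ANSI-C quoting can supply the literal carriage return without the Ctrl-V dance; a sketch, assuming the same database and table as above:
psql dbName -U dbUser -t -c "select record from table" | sed 's!\\r!'$'\r''!g' > records.txt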

Related

How to prevent special characters from being corrupted when using UTF8 encoding in Postgres?

I'm using PostgreSQL DB with UTF8 encoding and I'm trying to fetch some data using psql.exe.
My table diacritics contains the following data:
 id | name
----+----------
  1 | Kočička
  2 | Mňau
As you can see, there are a few (Czech) characters with diacritics. They are displayed correctly when viewed in pgAdmin; however, when I try to fetch the same data using psql.exe, the special characters get corrupted:
id | name
----+-----------
1 | Ko─ìi─ìka
2 | M┼êau
(2 rows)
I manually set client encoding to UTF8 as mentioned in this SO answer as I was getting the same error message as the OP there.
My resulting command looks like this (I'm running it in PowerShell):
./psql.exe -h localhost -p 5433 -U postgres -d experimental -c "SET client_encoding TO 'UTF8'; SELECT * FROM diacritics"
I would expect to get the data with the correct diacritics as the encoding on both sides matches. I tried outputting the fetched data into a file to rule out the possibility that it's a PowerShell related issue but had no success.
Also, I can fetch the data correctly in C# using Npgsql package so the problem only seems to be related to using psql.exe.
I'm using PostgreSQL 10.20 (10.0.20.22038). I have also tried it on version 14.2 (14.0.2.22041) and the results are the same.
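One thing worth ruling out (an assumption on my part, not something confirmed here): psql.exe prints through the Windows console, whose code page may not be UTF-8 even when client_encoding is. Switching the console code page first, e.g. in PowerShell:
chcp 65001
$env:PGCLIENTENCODING = "UTF8"
./psql.exe -h localhost -p 5433 -U postgres -d experimental -c "SELECT * FROM diacritics"
With PGCLIENTENCODING set in the environment, the explicit SET client_encoding in the command becomes redundant.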

How to restore one database from a .sql file in which there are two databases?

I was sent a .sql file that contains two databases. Previously, I had only dealt with .sql files containing a single database. I also can't ask for the databases to be sent in separate files.
Earlier I used this command:
psql -d first_db < /Users/colibri/Desktop/first_db.sql
Databases on the server and locally have different names.
Please tell me: how can I restore a specific database from a file that contains several?
You have two choices:
Use an editor to delete everything except the database you want from the SQL file.
Restore the whole file and then drop the database you don't need.
The file was probably generated with pg_dumpall. Use pg_dump to dump a single database.
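For future dumps, the difference in one line each (database names are placeholders):
# dumps every database in the cluster into a single file
pg_dumpall > all_databases.sql
# dumps just the one database you care about
pg_dump first_db > first_db.sql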
If this is the output of pg_dumpall and the file is too big to edit with something like vi, you can use a stream editor to isolate just what you want.
perl -ne 'print if /^\\connect foobar/.../^\\connect/' < old.sql > new.sql
The last dozen or so lines that this captures will be setting up for and creating the next database it wants to restore, so you might need to tinker with this a bit to get rid of those if you don't want it to attempt to create that database while you replay. You could change the ending landmark to something like the below so that it ends earlier, but that is more likely to hit false positives (where the data itself contains the magic string) than the '^\connect' landmark is.
perl -ne 'print if /^\\connect foobar/.../^-- PostgreSQL database dump complete/'
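Before cutting, it can help to see which databases the file actually contains; a quick check (same file name as above) lists each \connect line with its line number, marking where each database's section begins:
grep -n '^\\connect' old.sql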

How to fix syntax errors in postgresql .sql dump file when restoring with psql?

I have a PostgreSQL .sql dump file created by pg_dump on another Windows 10 box. I am trying to restore it on my Windows 10 laptop with
"psql -U user -d database -1 -f filename.sql". I created the database, but when I run the command to do the restore, I get an error from psql after I give it my password:
psql:filename.sql:1:1: ERROR: syntax error at or near "ÿ_"
LINE 1: ÿ_;
The file looks like straight ASCII (I only see two dashes on line one; I don't see a 'y' with an umlaut anywhere). I ran file on the .sql file with Cygwin bash, and it says:
Little-endian UTF-16 Unicode text, with very long lines, with CRLF, CR line terminators
I really don't want to recreate the database by hand. I am looking for any suggestions.
I tried psql with and without the '-1' option; no luck. I tried putting a ';' at the top of the sql file, which I found suggested somewhere; again no luck.
I did a psql -l on my postgresql installation and the encoding on all my databases (including the one to which I am trying to do the restore) shows UTF8.
There really is no code. It is just that I can't seem to restore this dump file because it errors out.
I think that captures my problem. The windows box that I got the dump from is not available to me now; so I'm just hoping there is a way to get around this problem. Recreating the database by hand table by table is something I would prefer to avoid.
Thanks--
Al
In my case, this exact thing happened because I took the dump using Windows PowerShell, which caused extra characters to be included in the dump file.
Simply using Command Prompt to take the dump solved my problem.
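A related precaution (my suggestion, not part of the answer above): letting pg_dump write the file itself with its -f option bypasses the shell's output redirection, and with it the shell's choice of encoding:
pg_dump -U user -f filename.sql database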
I can only give you leads how to debug the problem, because the cause is not immediately obvious.
First, there should be a line close to the beginning of the dump file that sets client_encoding. The dump file should be in that encoding.
I can see two possibilities:
The file got mangled during transfer. To test for that, calculate a checksum for both files and compare.
Always use binary mode to transfer PostgreSQL dumps.
Some editor or something else sneaked a BOM (byte order mark) into the very beginning of the file.
That's my prime suspect, since the problem is at line 1.
Use a hex editor or od (in Cygwin) to verify that. If this is the problem, simply replace the BOM with spaces.
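A sketch of that check with od, plus a conversion step in case the file really is UTF-16 as the file command reported above:
# the first two bytes; "ff fe" is a little-endian UTF-16 BOM
od -A x -t x1z filename.sql | head -n 1
# rewrite the dump as UTF-8 so psql can parse it
iconv -f UTF-16LE -t UTF-8 filename.sql > filename.utf8.sql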

Greenplum to file using PSQL

I'm trying to export data from Greenplum to a text file (on the client) with a pipe delimiter, using psql and \copy. In the output I see that a single backslash is converted to a double backslash and a tab is converted to \t.
Example
N\A is converted to N\\A
So how do I get just N\A instead of N\\A, and just spaces instead of \t?
Note: I'm only allowed to use \copy. Since my file is huge, I run into space issues when using sed or Perl for find and replace.
Assuming you don't have any "^" characters, you could use that as the escape character.
copy tpcds.call_center to stdout with delimiter '|' escape '^';
More on copy can be found here: https://www.postgresql.org/docs/8.2/static/sql-copy.html
This technique will be relatively slow and put a burden on the Master. If you used gpfdist instead, you could leverage the parallelism in the cluster and bypass the master. This solution is ideal for unloading large amounts of data.
First, start the gpfdist process:
[gpadmin@gpdbsne ~]$ gpfdist -p 8888 > gpfdist_8888.log 2>&1 < gpfdist_8888.log &
[1] 2255
Now, you can create the external table.
[gpadmin@gpdbsne ~]$ psql
SET
Timing is on.
psql (8.2.15)
Type "help" for help.
gpadmin=# create writable external table tpcds.et_call_center
(like tpcds.call_center)
location ('gpfdist://gpdbsne:8888/call_center.txt')
format 'text' (delimiter '|' escape '^');
NOTICE: Table doesn't have 'distributed by' clause, defaulting to distribution columns from LIKE table
CREATE EXTERNAL TABLE
Time: 18.681 ms
Now, you insert the data:
gpadmin=# insert into tpcds.et_call_center select * from tpcds.call_center;
INSERT 0 6
Time: 72.653 ms
gpadmin=# \q
Verify:
[gpadmin@gpdbsne ~]$ wc -l call_center.txt
6 call_center.txt
In my example, I used the hostname "gpdbsne" which is accessible to all segments in this cluster. Typically, Greenplum uses a private network for communication between segments so this hostname will need to be connected to the private network.
Since the writable external table is written to with SQL, you can use whatever transformation logic you want in the SQL, so you can change tabs to spaces if you want. This eliminates the need for awk or sed to post-process the files. COPY can use SQL too, but like I said, it is slower than using writable external tables.
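As a sketch of that idea, assuming for illustration that call_center had just two columns named cc_call_center_sk and cc_name (the real TPC-DS table has more; the select list must match the external table's columns):
insert into tpcds.et_call_center
select cc_call_center_sk,
       replace(cc_name, E'\t', ' ')  -- swap tabs for spaces on the way out
from tpcds.call_center;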

Automating PostgreSQL output to csv

mohpc04pp1: /h/u544835 % psql arco
Welcome to psql 8.1.17, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
WARNING: You are connected to a server with major version 8.3,
but your psql client is major version 8.2. Some backslash commands,
such as \d, might not work properly.
dbname=> \o /h/u544835/data25000.csv
dbname=> select url from urltable where scoreid=1 limit 25000;
dbname=> \q
This is taken from a link online and is basically what I have been doing, but what I need is a script I can use to produce CSV files daily.
The aim is for the script to connect to the db, run the \o and query commands, then close the connection,
but I'm having trouble scripting it to enter the psql arco database and then run those queries.
The command line to connect to the db is psql arco; once the script has connected to that database, it should perform those commands to automate dumping a query to a CSV file.
If anyone can get me started or point me towards reading material to get past that bit, it will be duly appreciated.
I'm running all this from a standard Windows XP box, SSHing to a SLES web server that holds my PostgreSQL database, running psql version 8.1.17.
First of all you should fix your setup. As it turns out, we are dealing with PostgreSQL 8.1 here. This version reached end of life in 2010. You need to seriously think about upgrading - or at least remind the guys running the server. The current version is 9.1.
The command you are looking for:
psql arco -c "\copy (select url from urltable where scoreid=1 limit 25000) to '/h/u544835/data25000.csv'"
Assuming your db is named "arco". Adjusted for changed question (including changed port).
I now see version 8.1 popping up in your question, but it's all contradictory. You need Postgres 8.2 or later to use a query (instead of a table) with the \copy meta-command.
Details about psql in the fine manual.
Alternative approach that should work with obsolete PostgreSQL 8.1:
psql arco -o /h/u544835/data25000.csv -t -A -c 'SELECT url FROM urltable WHERE scoreid = 1 LIMIT 25000'
Find some more info about these command line options under this related question on dba.SE.
With function (syntax compatible with 8.1)
Another way would be to create a server side function (if you can!) that executes COPY from a temp table (old syntax - works with pg 8.1):
CREATE OR REPLACE FUNCTION f_copy_file()
RETURNS void AS
$BODY$
BEGIN
CREATE TEMP TABLE u_tmp AS (
SELECT url FROM urltable WHERE scoreid = 1 LIMIT 25000
);
COPY u_tmp TO '/h/u544835/data25000.csv';
DROP TABLE u_tmp;
END;
$BODY$
LANGUAGE plpgsql;
And then from the shell:
psql arco -c 'SELECT f_copy_file()'
Change the separator
\f sets the field separator. I quote the manual again:
-F separator
--field-separator=separator
Use separator as the field separator for unaligned output.
This is equivalent to \pset fieldsep or \f.
Or you can change the column separator in Excel; MS has instructions for that.
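For reference, the same separator can also be set from the command line; a sketch using the table and output path from the script below (-A switches to unaligned output, -F sets the separator, -t suppresses headers and footers):
psql arco -t -A -F ',' -c 'SELECT * FROM storage' -o /h/u544835/showme.csv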
Thanks to Erwin's help and a link he posted for me to read up on, I managed to combine the two to get:
#!/bin/sh
dbname='arco'
username='' # If you actually supply a username, you need to add the -U switch!
psql $dbname $username << EOF
\f ,
\o /h/u544835/showme.csv
SELECT * FROM storage;
EOF
which will write my queries to a csv file etc for me.
From what there is above, it is not separating the query results, so if I load the file straight into Excel, they all stay in the same column, which means I'm having a problem with the delimiter.
I've tried tab delimiters, and also , and ; etc., but none of them separate it.
Is there an option I can click to see which delimiter is being used with my psql? Or a different way of dumping the data from a query into a file that can be read by Excel, with a different column for each field?
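For what it's worth, one likely culprit (an assumption on my part): \f only affects unaligned output, and the script above never switches psql out of its default aligned mode. Adding \a before \f may be all that's missing:
#!/bin/sh
# same script as above, with unaligned output toggled on first
dbname='arco'
psql $dbname << EOF
\a
\f ,
\o /h/u544835/showme.csv
SELECT * FROM storage;
EOF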