change SAS encoding to utf-8 - encoding

I am struggling greatly with changing my SAS session to utf-8.
I have tried changing the cfg file - sasv9.cfg; however, I am told that access is denied
I have tried using the encoding option; however, it says I must apply this on start-up, I tried restarting SAS and entering it then, still no luck.
I have tried the locale function; however, it returns an invalid command warning.
I am pretty stuck here. I am using SAS base9.4 and I have tried to follow multiple different threads but cannot find a solution.

Copy the sasv9.cfg file from the location you do not have write access to a place you do (i.e. your Documents folder). Change the encoding in there.
Start SAS on the command line with -CONFIG "C:\path\to\cfg\sasv9.cfg"
That should force SAS to start the session with the new config file and the specified encoding.

Related

How does the pgadmin encodes the file path in backups?

I'm trying to restore dump files from locations that contain character from other languages besides English.
So here is what I did:
From inside the pgadmin I used the backup tool like:
And inside the FileName input provided an actual real folder named "א":
C:\א\toc.dump
The actual file argument (-f file) has been auto decoded into:
pg_dump.exe --file "C:\\0F04~1\\TOC~1.DUM"
My question is what is the decoding system pgadmin uses in order to decode the file path argument?
How did it came up with 0F04~1 from א?
I'm asking it because pg_restore is not supporting file path that contains not English chars (from cmd):
pg_dump.exe --file "C:\\0F04~1\\TOC1.DUMP" .... WORKS OK!
pg_dump.exe --file "C:\\א\\TOC1.DUMP" ... Not Working!
pg_restore: [custom archiver] could not open input file "..."
As in this question, so if I'll find the encoding system for pgadmin I'll use it from code.
My goal is to encode the path that contain not-English chars from a batch code so it will work.
This is not something weird pgadmin does, but rather it is something weird Windows itself does when needing to represent such file names in a DOS-like setting. Like when the name is more than 8 chars, or extension more than 3.
In my hands the weird presentation is only there in the logs and status messages. If I use the GUI file chooser, the file names look normal, and replay successfully.
If you really want to know what Windows is doing, I think that is a better question for superuser with a Windows tag. I don't know why you can't restore these files. Are you using the pgAdmin GUI file chooser or trying to type the names in directly to something?

ERROR: could not stat file "XX.csv": Unknown error

I run this command:
COPY XXX FROM 'D:/XXX.csv' WITH (FORMAT CSV, HEADER TRUE, NULL 'NULL')
In Windows 7, it successfully imports CSV files of less than 1GB.
If the file is more then 1GB big, I get an “unknown error”.
[Code: 0, SQL State: XX000] ERROR: could not stat file "'D:/XXX.csv' Unknown error
How can I fix this issue?
You can work around this by piping the file through a program. For example I just used this to copy from a 24GB file on Windows 10 and PostgreSQL 11.
copy t(c,d) from program 'cmd /c "type x:\path\to\file.txt"' with (format text);
This copies the text file file.txt into the table t, columns c and d.
The trick here is to run cmd in a single command mode, with /c and telling it to type out the file in question.
https://github.com/MIT-LCP/mimic-code/issues/493
alistairewj commented Nov 3, 2018 • ►
edited
Okay, the could not stat file "CHARTEVENTS.csv": Unknown error is actually a bug in PostgreSQL 11. Under the hood it makes a call to fstat() to make sure the file is not a directory, and unfortunately fstat() is a 32-bit program which can't handle large files like chartevents. I tested the build on Windows with PostgreSQL 10.5 and I didn't get this error so I think it's fairly new.
The best workaround is to keep the files compressed (i.e. keep them as .csv.gz files) and use 7zip to load in the data directly from compressed files. In testing this seemed to still work. There is a pretty detailed tutorial on how to do this here: https://mimic.physionet.org/tutorials/install-mimic-locally-windows/
The brief version of above is that you keep the .csv.gz files, you add the 7zip binary to your windows environment path, and then you call the postgres_load_data_7zip.sql file to load in the data. You can use the postgres_checks.sql file after everything to make sure you loaded in all the data correctly.
edit: For your later error, where you are using this 7zip approach, I'm not sure why it's not loading. Try redownloading just the ADMISSIONS.csv.gz file and seeing if it still throws you that same error. Maybe there is a new version of 7zip which requires me to update the script or something!
For anyone else who googled this Postgres error message after attempting to work with a >1gb file in Postgres 11, I can confirm that #亚军吴's answer above is spot-on. It is indeed a size issue.
I tried a different approach, though, than #亚军吴's and #Loren's: I simply uninstalled Postgres 11 and installed the stable version of Postgres 10.7. (I'm on Windows 10, by the way, in case that matters.)
I re-ran the original code that had prompted the error and voila, a few minutes later I'd filled in a new table with data from a medium-ish-size csv file (~3gb). I initially tried to use CSVSplitter, per #Loren, which was working fine until I got close to running out of storage space on my machine. (Thanks, Battlefield 5.)
In my case, there isn't anything in PGSQL 11 that I was relying on that wasn't in version 10.7, so I think this could be a good solution for anyone else who runs into this problem. Thanks everyone above for contributing, especially to the OP for posting this in the first place. I cured a huge, huge headache!
This has been fixed in commit bed90759f in PostgreSQL v14.
The file limit for the error is actually 4 GB.
The fix was too invasive to be backported, so you can only upgrade to avoid the problem. Once the fix has had some field testing, you could lobby the pgsql-hackers mailing list to get it backported.
With pgAdmin and AWS, I used CSVSplitter to split into files less than 1GB. Lame, but worked. pgAdmin import appends to the existing table. (Changed escape character from ' to " in order to avoid error due to unquoted text in the source file. Typically I apply quotes in LibreOffice, but these files were too big to open.)
It seems this is not a database problem, but a problem of psql / pgadmin. The workaround is using an admin software from the previous psql versions:
Use the existing PostgreSQL 11 database
Install psql or pgadmin from the PostgreSQL 10 installation and use it to upload the file (with the command shown in the question)
Hope this helps anyone coming across the same problem.
Add two lines to your CSV file: One at the begining and one at the end:
COPY XXX FROM STDIN WITH (FORMAT CSV, HEADER TRUE, NULL 'NULL');
<here are the lines your file already contains>
\.
Don't forget another newline after the \. line. Then call
psql -h hostname -d dbname -U username -f 'D:/XXX.csv'
This is what worked for me:
\COPY member_data.lab_result FROM PROGRAM 'gzip -dcf lab_result.dat.gz' WITH (FORMAT 'csv', DELIMITER '|', QUOTE '`')

postgreSQL COPY command error

Hallo everyone once again,
I did various searches but couldn't gind a suitable/applicable answer to the simple problem below:
On pgAdminIII (Windows 7 64-bit) I am running the following command using SQL editor:
COPY public.Raw20120113 FROM 'D:\my\path\to\Raw CSV Data\13_01_2012.csv';
I tried many different variations for the path name and verified the path, but I keep getting:
ERROR: could not open file "D:\my\path\to\Raw CSV Data\13_01_2012.csv" for reading: No such file or directory
Any suggestions why this happens?
Thank you all in advance
Petros
UPDATE!!
After some tests I came to the following conclusion: The reason I am getting this error is that the path includes some Greek characters. So, while Windows uses codepage 1253, the console is using 727 and this whole thing is causing the confusion. So, some questions arise, you may answer them if you like or prompt me to other questions?
1) How can I permanently change the codepageof the console?
2) How can I define the codepage is SQL editor?
Thank you again, and sorry if the place to post the question was inappropriate!
Try DIR "D:\my\path\to\Raw CSV Data\13_01_2012.csv" from command line and see if it works - just to ensure that you got the directory, file name, extension etc correct.
The problem is that COPY command runs on server so it takes the path to the file from the server's scope.
To use local file to import you need to use \COPY command. This takes local path to the file into account and loads it correctly.

PostgreSQL's COPY statement

Using COPY statement of PostgreSQL, we can load data from a text file into data base's table as below:
COPY CME_ERROR_CODES FROM E'C:\\Program Files\\ERROR_CODES\\errcodes.txt' DELIMITER AS '~'
The above statement is run from a machine which has postgresql client where as the server is in another windows machine. Running the above statement is complaining me that ERROR: could not open file "C:\Program Files\ERROR_CODES\errcodes.txt" for reading: No such file or directory.
After some research, i observed that COPY statement is looking for the loader file(errcodes.txt) in the postgresql server's machine at the same path (C:\Program Files\ERROR_CODES). To test this , i have create the same folder structure in the postgresql server's machine and kept the errcodes.txt file in there. Then the COPY statement worked well. It looks very tough constraint for me with COPY statement.
Is there any setting needed to avoid this? or it is the behavior of COPY statement? I didn't find any information on PostgreSQL documents.
here's the standard solution:
COPY foo (i, j, k) FROM stdin;
1<TAB>2<TAB>3
\.
The data must be properly escaped and tab-separated.
Actually, it is in the docs, even in grammar definition you have STDIN... See: http://www.postgresql.org/docs/9.1/static/sql-copy.html
If you're using some programming language with COPY support, you will have pg_putcopy or similar function. So you don't have to worry about escaping and concatenation.
Hints how to do this manually in Python -> Recreating Postgres COPY directly in Python?
The Perl way -> http://search.cpan.org/dist/DBD-Pg/Pg.pm#COPY_support
Hope this helps.
From the documentation
Quote:
COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible to the server and the name must be specified from the viewpoint of the server. When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.
If you want to copy from a local machine file to a server use \copy command.

Oracle -Character Encoding

I am facing a peculiar problem where i need to update a particular value in database to say 'Hellò'. When i run normal update statement the values are updated fine. But when i put it i a .sql script file and then run the update statement the last character gets replaced by a junk value. Can some one enlighten me on this and how oracle processes script files?
If the update statement works then this isn't an issue with the character encoding in the database so there are two main culprits to look at; your software and your file encoding.
Check that you editor is UTF8 compliant - Notepad for example is not, but Wordpad is, and so are better editors like UltraEdit. You also need to check that the file is saved as UTF8 since this in not always the default and if you edited and saved a file with Notepad, for example, it won't be UTF8 any more.
Once you have a UTF8 file then you must load it into the database via software that supports UTF8, which excludes SQLPlus in Windows 10g. SQL Developer is OK. If you want to use SQLPLus upgrade your client to 11g. You don't say how you are loading it, so check that whatever you are using supported UTF8.