I'm using a reporting tool which needs to load text files into SQL Server, but I have a problem. I searched and can't find an answer for this; I tried several approaches but the problem persists.
All of the txt files are UTF-8 encoded.
I'm using T-SQL's BULK INSERT statement as follows:
IF OBJECT_ID ( 'dbo.tableName','U') IS NOT NULL
DROP TABLE dbo.tableName
CREATE TABLE tableName ( line nvarchar(50) );
BULK INSERT dbo.tableName from 'file_with_path.txt'
WITH ( ROWTERMINATOR='\n' )
select * from dbo.tableName
And this is what I finally get (sorry, I can't upload an image of the output; I don't have enough reputation).
What's wrong?
Note: I also tried varchar and text as the column data type, but the result is the same.
Try using ACP conversion:
BULK INSERT dbo.tableName FROM 'file_with_path.txt'
WITH ( ROWTERMINATOR='\n', CODEPAGE = 'ACP' )
I found the answer. The problem comes from OPENROWSET, which does not support UTF-8.
I used BULK INSERT directly with a temporary table, and now I can read UTF-8 encoded files without a problem. Here is the code for anyone who may have the same problem:
CREATE TABLE #temp
(
lines nvarchar(max) NULL
)
BULK INSERT #temp
FROM 'c:\my\test.txt'
WITH(ROWTERMINATOR = '\n', CODEPAGE = '65001')
SELECT * FROM #temp
DROP TABLE #temp
Code page 65001 is UTF-8.
The file test.txt is UTF-8 encoded.
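If you need the rows in a permanent table afterwards (like dbo.tableName from the question), you can simply copy them over from the temp table; a minimal sketch reusing the names above (lines longer than 50 characters won't fit in nvarchar(50)):
INSERT INTO dbo.tableName (line)
SELECT lines FROM #temp;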
Related
In order to use COPY (in my case, from a CSV file) in PostgreSQL, I need to create the destination table first.
Now, if my table has 60 columns, for instance, it feels tedious and inefficient to write this out manually:
CREATE TABLE table_name(
column1 datatype,
column2 datatype,
column3 datatype,
.....
column60 datatype
);
Those of you who use PostgreSQL: how do you get around this issue?
I usually use the file_fdw extension to read data from CSV files.
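For reference, plain file_fdw usage looks roughly like this (the foreign table name is made up for illustration; the server and file names match the example further below):
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER csv_import FOREIGN DATA WRAPPER file_fdw;
-- you have to list the columns yourself, which is exactly the pain point
CREATE FOREIGN TABLE my_csv (
    col1 text,
    col2 text,
    col3 text
) SERVER csv_import
OPTIONS ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
SELECT * FROM my_csv LIMIT 10;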
But unfortunately, file_fdw is not that convenient or flexible for tasks like reading a CSV file with many columns. CREATE FOREIGN TABLE will accept any number of columns, but if they don't correspond to the CSV file, it will fail later, when you perform a SELECT. So the problem of explicitly creating the table remains. However, it is possible to solve it.
Here is a brute-force approach that doesn't require anything except Postgres. Written in PL/pgSQL, this function tries to create a foreign table with a single column and attempts to SELECT from it. If that fails, it drops the table and tries again, but with 2 columns, and so on, until the SELECT succeeds. All columns are of type text, which is quite a limitation, but it still gives you a ready-to-SELECT table without the manual work.
create or replace function autocreate_table_to_read_csv(
    fdw_server text,
    csv text,
    table_name text,
    max_columns_num int default 100
) returns void as $$
declare
    i int;
    sql text;
    rec record;
begin
    execute format('drop foreign table if exists %I', table_name);
    for i in 1..max_columns_num loop
        begin
            -- build "create foreign table ... (col1 text, ..., colN text)" for the current attempt
            select into sql
                format('create foreign table %I (', table_name)
                || string_agg('col' || n::text || ' text', ', ')
                || format(
                    e') server %I options ( filename \'%s\', format \'csv\' );',
                    fdw_server,
                    csv
                )
            from generate_series(1, i) as g(n);
            raise debug 'SQL: %', sql;
            execute sql;
            execute format('select * from %I limit 1;', table_name) into rec;
            -- looks OK, so the number of columns corresponds to the first row of the CSV file
            raise info 'Table % created with % column(s). SQL: %', table_name, i, sql;
            exit;
        exception when others then
            -- the SELECT failed; the exception block rolls back this attempt's CREATE,
            -- so we can simply retry with one more column
            raise debug 'CSV has more than % column(s), making another attempt...', i;
        end;
    end loop;
end;
$$ language plpgsql;
Once it finds the proper number of columns, it reports it (see raise info).
To see more details, run set client_min_messages to debug; before using the function.
Example of use:
test=# create server csv_import foreign data wrapper file_fdw;
CREATE SERVER
test=# set client_min_messages to debug;
SET
test=# select autocreate_table_to_read_csv('csv_import', '/home/nikolay/tmp/sample.csv', 'readcsv');
NOTICE: foreign table "readcsv" does not exist, skipping
DEBUG: SQL: create foreign table readcsv (col1 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
DEBUG: CSV has more than 1 column(s), making another attempt...
DEBUG: SQL: create foreign table readcsv (col1 text, col2 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
DEBUG: CSV has more than 2 column(s), making another attempt...
DEBUG: SQL: create foreign table readcsv (col1 text, col2 text, col3 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
INFO: Table readcsv created with 3 column(s). SQL: create foreign table readcsv (col1 text, col2 text, col3 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
autocreate_table_to_read_csv
------------------------------
(1 row)
test=# select * from readcsv limit 2;
col1 | col2 | col3
-------+-------+-------
1313 | xvcv | 22
fvbvb | 2434 | 4344
(2 rows)
Update: I found an implementation of a very similar approach (but without the "brute force"; it requires the number of CSV columns to be specified explicitly) for COPY .. FROM: How to import CSV file data into a PostgreSQL table?
P.S. Actually, this would be a really good task for improving the file_fdw and COPY .. FROM capabilities of Postgres, making them more flexible. For example, postgres_fdw has a very handy IMPORT FOREIGN SCHEMA command, which lets you define remote ("foreign") objects very quickly, with just one line; it saves a lot of effort. Having a similar thing for CSV data would be awesome.
I would just like to know whether or not it is possible to use T-SQL (SQL Server 2012) to store an MD5 hash, computed with HASHBYTES, for a file that is not in the database.
I am only trying to achieve what is stated above without using anything other than T-SQL. That is the constraint, whether or not there is a better way. It's an inflexible constraint/mandate that nothing but T-SQL is to be used: no add-ons, no PowerShell, nothing but T-SQL.
Can anyone provide the T-SQL code that would achieve the goal of putting the hash of some file (whatever name you want) into a field in a table (use whatever names you prefer)?
I know some might think there is a better way, but that better way is not an option, like it or not. So please refrain from offering any non-100%-T-SQL solutions or comments.
See if this works for you (tested with both text and binary file types):
DECLARE @x varbinary(max)
DECLARE @y table ( C1 varbinary(max) )
SELECT @x = CONVERT(varbinary(max), BulkColumn)
FROM OPENROWSET (BULK 'c:\temp\MyFile.txt', SINGLE_CLOB) MyFile
SELECT HASHBYTES('MD5', @x)
INSERT INTO @y SELECT HASHBYTES('MD5', @x)
SELECT * FROM @y
output:
0x93B885ADFE0DA089CDF634904FD59F71
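And if you want to put the hash into a field of a permanent table, as the question asks, the same pattern works; a minimal sketch with made-up table and column names, reading the file as SINGLE_BLOB so the raw bytes are hashed:
CREATE TABLE dbo.FileHashes
(
    FilePath nvarchar(260) NOT NULL,
    Md5Hash varbinary(16) NOT NULL
);
DECLARE @data varbinary(max);
-- read the whole file as one binary value
SELECT @data = CONVERT(varbinary(max), BulkColumn)
FROM OPENROWSET (BULK 'c:\temp\MyFile.txt', SINGLE_BLOB) AS MyFile;
-- note: before SQL Server 2016, HASHBYTES input is limited to 8000 bytes
INSERT INTO dbo.FileHashes (FilePath, Md5Hash)
VALUES ('c:\temp\MyFile.txt', HASHBYTES('MD5', @data));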
MSDN HASHBYTES reference:
https://msdn.microsoft.com/en-us/library/ms174415.aspx?f=255&MSPPError=-2147217396
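Another option that stays within T-SQL, assuming xp_cmdshell is enabled on the server, is to shell out to certutil and capture its output: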
DECLARE @csCommand varchar(4000);
DECLARE @csResult varchar(4000);
DECLARE @result table
([ID] int identity(1,1),
[Output] varchar(4000));
SET @csCommand = 'certutil -hashfile "C:\Temp\MyFile.txt" MD5';
INSERT INTO @result ([Output])
EXEC master..xp_cmdshell @csCommand;
SELECT @csResult = [Output] FROM @result WHERE ID = 2;
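certutil writes the hex digest on the second output line, and on older Windows versions it separates the byte pairs with spaces, so you may want to strip those before storing the value; a small sketch with a hypothetical target table:
-- strip the spaces certutil may print between byte pairs (older Windows versions)
SELECT @csResult = REPLACE(@csResult, ' ', '');
-- dbo.FileHashHex is a hypothetical table used for illustration
INSERT INTO dbo.FileHashHex (FilePath, Md5Hex)
VALUES ('C:\Temp\MyFile.txt', @csResult);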
Is it possible to do a bulk insert into Redshift using the CREATE TABLE AS syntax while defining data types and encodings at the same time? What's the correct syntax?
E.g. the following gives a syntax error near 'as':
create table my_table (
a int not null encode runlength,
b int not null encode runlength
) distkey(a) sortkey (a, b) as (
select * from other_table
);
I can only get it to work when I define just the column names (a, b), and that's a huge limitation...
You can specify the DIST and SORT keys in a CREATE TABLE … AS query like this:
CREATE TABLE new_table
DISTSTYLE KEY
DISTKEY ( my_dist )
SORTKEY ( my_sort )
AS
SELECT *
FROM old_table
;
As per the docs (http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_AS.html), I don't believe you can alter the compression encoding from the source table using CREATE TABLE AS.
More details on Redshift CTAS are given here: http://docs.aws.amazon.com/redshift/latest/dg/r_CTAS_usage_notes.html. In a nutshell, nowhere is it mentioned that you can define the encoding in a CTAS statement, but you can define sort keys and distribution keys. The default encoding chosen by this statement is none.
However, if you want to do a bulk insert, why not do it in two steps, as sketched below:
Create new_table with your encodings and sort/dist keys
Insert into new_table with a select * from old_table
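A minimal sketch of that two-step approach, reusing the column definitions from the question:
CREATE TABLE my_table (
    a int NOT NULL ENCODE runlength,
    b int NOT NULL ENCODE runlength
)
DISTKEY (a)
SORTKEY (a, b);

INSERT INTO my_table
SELECT * FROM other_table;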
I have many tables in an SQLite3 database and now I want to export them to PostgreSQL, but I keep getting errors.
I've used different techniques to dump from sqlite:
.mode csv
.header on
.out ddd.sql
select * from my_table;
and
.mode insert
.out ddd.sql
select * from my_table;
And when I try to import it through phpPgAdmin I get errors like this:
ERROR: column "1" of relation "my_table" does not exist
LINE 1: INSERT INTO "public"."my_table" ("1", "1", "Vitas", "a#i.ua", "..
How to avoid this error?
Thanks in advance!
Rant
You get this "column ... does not exist" error with INSERT INTO "public"."my_table" ("1", ... - because quotes around the "1" mean this is an identifier, not literal.
Even if you fix this, the query still will give error, because of missing VAULES keyword, as Jan noticed in other answer.
The correct form would be:
INSERT INTO "public"."my_table" VALUES ('1', ...
If this SQL was autogenerated by sqlite, too bad for sqlite.
This great chapter about SQL syntax is only about 20 pages in print. My advice to whoever generated this INSERT is: read it :-) It will pay off.
Real solution
Now, to the point... To transfer a table from sqlite to Postgres, you should use COPY, because it's way faster than INSERT.
Use CSV format as it's understood on both sides.
In sqlite3:
create table tbl1(one varchar(20), two smallint);
insert into tbl1 values('hello',10);
insert into tbl1 values('with,comma', 20);
insert into tbl1 values('with "quotes"', 30);
insert into tbl1 values('with
enter', 40);
.mode csv
.header on
.out tbl1.csv
select * from tbl1;
In PostgreSQL (psql client):
create table tbl1(one varchar(20), two smallint);
\copy tbl1 from 'tbl1.csv' with csv header delimiter ','
select * from tbl1;
See http://wiki.postgresql.org/wiki/COPY.
It seems the "VALUES" keyword is missing:
INSERT INTO "public"."my_table" VALUES (...)
But you have to insert the values with the appropriate quoting: single quotes for text and no quotes for numbers.
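A tiny self-contained illustration of those quoting rules (table and values made up):
create table quoting_demo (id integer, name text);
-- no quotes around the number, single quotes around the text
insert into quoting_demo values (1, 'Vitas');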
I'm trying to figure out how to insert a .JPG file into a SQL Server 2000 database field of type image using Transact-SQL. Thanks.
Use OPENROWSET:
INSERT MyTable (ImageColumnName)
SELECT BulkColumn FROM OPENROWSET (BULK 'c:\myjpeg.jpg', SINGLE_BLOB) AS X
EDIT: Whoops, you're using 2000, where the previous solution is not supported. You have to use WRITETEXT:
CREATE TABLE MyTable
(
ID INT PRIMARY KEY IDENTITY (1,1),
ImageColumnName IMAGE NULL
)
GO
-- must insert a dummy value into the image column for TEXTPTR
-- to work in next bit
DECLARE @RowId INT
INSERT MyTable (ImageColumnName) VALUES (0xFFFFFFFF)
SELECT @RowId = SCOPE_IDENTITY()
-- get a pointer value to the row+column you want to
-- write the image to
DECLARE @Pointer_Value varbinary(16)
SELECT @Pointer_Value = TEXTPTR(ImageColumnName)
FROM MyTable
WHERE Id = @RowId
-- write the image to the row+column pointer
WRITETEXT MyTable.ImageColumnName @Pointer_Value 'c:\myjpeg.jpg'
There is a tool called textcopy.exe.
You can find it under MSSQL\Binn or get it with SQL Server 2000 SP4.
Alexander Chigrik wrote a nice stored procedure for using it from a SQL query:
http://www.mssqlcity.com/Articles/KnowHow/Textcopy.htm
The stored procedure found in this tutorial worked for me:
Brief tutorial on text, ntext, and image