I need to load an entire file (it contains only ASCII text) into the database (DB2 Express edition). The table has only two columns (ID, TEXT). The ID column is the PK, with auto-generated data, whereas TEXT is CLOB(5). I have no idea about the length parameter 5; it was entered by default by Data Studio.
Now I need to use the load utility to save a text file (containing 5 MB of data) in a single row, namely in the TEXT column. I do not want the text to be broken into different rows.
Thanks for your answers in advance!
Firstly, you may want to redefine your table: CLOB(5) means you expect 5 bytes in the column, which is hardly enough for a 5 MB file. After that you can use the DB2 IMPORT or LOAD commands with the lobsinfile modifier.
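For example, assuming your table is called yourtable (a hypothetical name) and your DB2 version allows widening a LOB column in place (otherwise you would recreate the table), something like this makes room for the file; CLOB(10M) comfortably holds 5 MB:
db2 "alter table yourtable alter column text set data type clob(10M)"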
Create a text file containing a LOB Location Specifier (LLS) for each file you want to import, one per line.
An LLS is a way to tell IMPORT where to find LOB data. It has this format: <file path>[.<offset>.<length>/], e.g. /tmp/lobsource.dta.0.100/ to indicate that the first 100 bytes of the file /tmp/lobsource.dta should be loaded into the particular LOB column. Notice also the trailing slash. If you want to import the entire file, skip the offset and length part. LLSes are placed in the input file instead of the actual data for each row and LOB column.
So, for example:
echo "/home/you/yourfile.txt" > /tmp/import.dat
Since you said the IDs will be auto-generated, you don't need to enter them in the input file; just don't forget to use the appropriate command modifier: identitymissing or generatedmissing, depending on how the ID column is defined.
Now you can connect to the database and run the IMPORT command, e.g.
db2 "import from /tmp/import.dat of del
modified by lobsinfile identitymissing
method p (1)
insert into yourtable (yourclobcolumn)"
I split the command onto multiple lines for readability, but you should type it on a single line.
method p (1) means parse the input file and read the column in position 1.
More info in the manual
I have a strange source CSV file that contains a trailing column delimiter at the end of each record, just before the carriage return/newline.
When ADF previews this data, it displays only 2 columns without issue, along with all the data rows. However, when using the Copy activity, it fails with the following exception.
ErrorCode=DelimitedTextColumnNameNotAllowNull,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The name of column index 3 is empty. Make sure column name is properly specified in the header
Now I understand why it's complaining about this, due to the trailing delimiter, but my question is whether there is a way to deal with this condition. I've tried including the trailing comma in the record delimiter (,\r\n), but then it just pivots the data so that all the columns become rows.
Is there a way to address this condition in the Copy activity?
When previewing the data in the dataset, it seems correct:
But in the Copy activity, the data actually gets split into 3 columns by the column delimiter ",", with the third column holding an empty or NULL value. This causes the error.
If you use a Data Flow and import the projection from the source, you can see the third column:
For now, the Copy activity doesn't support modifying the data schema. You must use a Data Flow Derived Column transformation to create a new schema for the source. For example:
Then mapping the new columns/schema to the sink will solve the problem.
HTH.
Use a different encoding for your CSV. CSV UTF-8 will do the trick.
The file is not importing after I created the table. The first line of code is for the table (COPY), the second line is for the path of the file (FROM), and as for the WITH, I am not entirely sure whether a prior line of code needs to be entered for it to succeed, since it's not being highlighted in pink. The import doesn't go through in either pgAdmin's built-in tool or the SQL syntax; neither of them generates the needed output. Here are some screenshots:
So I made another table, this time focusing on a single column and ensuring that the name of the column matched in both the table and the file, and it worked. The prior example had several columns whose names were spelled differently in the table and in the file:
You can try this sequentially...
1. First create the .csv file. The column sequence in the .csv file is most important.
2. Consider the below employee_info.csv file
And consider your database table employee_info, which contains (emp_id [numeric], emp_name [character], emp_sal [numeric], emp_loc [character]).
Then execute the below query:
a. copy employee_info(emp_id,emp_name,emp_sal,emp_loc) from 'C:\Users\Zbook\Desktop\employee_info.csv' DELIMITER ',' CSV;
Note: Ensure that no row value in the .csv file is null. Like below...
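For illustration, a hypothetical employee_info.csv matching that table might look like this (the values are made up):
101,Sam,45000,Pune
102,Rita,52000,Mumbai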
I need to copy a text file which has a confusing delimiter. I believe the delimiter is a space. However, some of the column values are empty, and I cannot tell which column is which, which makes it harder to load the data into the database since the spaces do not indicate anything. Thus, when I try to COPY, the mapping is not right and I am getting: ERROR: extra data after last expected column
I have tried changing the delimiter to a comma and such, but I am still getting the same error. The code below works when I load some dummy data with a proper delimiter.
COPY usm00070219(HEADREC_ID,YEAR,MONTH,DAY,HOUR,RELTIME,NUMLEV,P_SRC,NP_SRC,LAT,LON) FROM 'D:\....\USM00070219-data.txt' DELIMITER ' ';
This is example data:
It should have 11 columns, but the first row of the data has only 10, and COPY cannot identify the empty-value column. The spacing is not helpful at all!
Is there any way I can separate the columns by character size, treating fixed widths as the delimiter, and force the data to be divided at the given sizes?
COPY is not made to handle fixed-width text files. I can think of two options:
Load the file as it is into a table with a single text column using COPY. Then use regexp_split_to_array to split it into its components and insert these into another table (a sketch follows after this list).
Use file_fdw to create a foreign table with a single text column like the above and operate on that. That saves loading the file into the database.
There is a foreign data wrapper for fixed-width text files that you can try.
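Here is a minimal sketch of the first option. Since the asker wants fixed character positions, it uses substr rather than regexp_split_to_array; the staging table name, file path, and column positions are all hypothetical and must be adjusted to the real layout:
CREATE TABLE usm_raw (line text);                  -- one raw line per row
COPY usm_raw FROM 'C:\data\USM00070219-data.txt';  -- default tab delimiter is fine if the file contains no tabs
INSERT INTO usm00070219 (HEADREC_ID, YEAR, MONTH, DAY, HOUR)
SELECT substr(line,  1, 12),        -- character positions here are invented
       substr(line, 14,  4)::int,
       substr(line, 19,  2)::int,
       substr(line, 22,  2)::int,
       substr(line, 25,  2)::int
FROM usm_raw;
-- the remaining columns (RELTIME, NUMLEV, P_SRC, NP_SRC, LAT, LON) would be extracted the same way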
I run this
db2 "IMPORT FROM C:\my.csv OF DEL MODIFIED BY COLDEL, LOBSINFILE DATEFORMAT=\"D/MM/YYYY\" SKIPCOUNT 1 REPLACE INTO scratch.table_name"
However, some of my rows have an empty date field, so I get this error
SQL3191N which begins with """" does not match the user specified DATEFORMAT, TIMEFORMAT, or TIMESTAMPFORMAT. The row will be rejected.
My CSV file looks like this
"XX","25/10/1985"
"YY",""
"ZZ","25/10/1985"
I realise that if a character were inserted instead of a blank string, I could use the NULL INDICATORS parameter.
However, I do not have access to change the CSV file. Is there a way to import a blank string as a null?
This is an error in your input file. DB2 differentiates between a NULL and a zero-length string. If you need to have NULL dates, a NULL would have no quotes at all, like:
"AA",
If you can't change the format of the input file, you have 2 options:
Insert your data into a staging table (changing the DATE column to a char) and then use SQL to populate the ultimate target table (a sketch follows below)
Write a program to parse ("fix") the input file and then import the resulting fixed data. You can often do this without having to write the entire file out to disk – your program could write to a named pipe, and the DB2 IMPORT (and LOAD) utility is capable of reading from named pipes.
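A minimal sketch of the staging-table option, with hypothetical table and column names (code, dt_txt, dt), using TO_DATE, which recent DB2 versions provide; adjust to your schema:
db2 "CREATE TABLE scratch.staging (code VARCHAR(10), dt_txt VARCHAR(10))"
db2 "IMPORT FROM C:\my.csv OF DEL MODIFIED BY COLDEL, SKIPCOUNT 1 REPLACE INTO scratch.staging"
db2 "INSERT INTO scratch.table_name (code, dt) SELECT code, CASE WHEN dt_txt = '' THEN NULL ELSE DATE(TO_DATE(dt_txt, 'DD/MM/YYYY')) END FROM scratch.staging"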
I'm not aware of anything. Yes, ideally that date field should be null.
Probably the best thing to do would be to load the data into a scratch/temp table where that column isn't a date column; just leave it as character data (it looks like you're already using a scratch table anyway). It should be trivial after that to use a CASE expression to transform the value into a null date when it is blank, during your INSERT into the real table.
On a Mac, I have a txt file with two columns, one being an autoincrement in an sqlite table:
, "mytext1"
, "mytext2"
, "mytext3"
When I try to import this file, I get a datatype mismatch error:
.separator ","
.import mytextfile.txt mytable
How should the txt file be structured so that it uses the autoincrement?
Also, how do I enter text that will have line breaks? For example:
"this is a description of the code below.
The text might have some line breaks and indents. Here's
the related code sample:
foreach (int i = 0; i < 5; i++){
//do some stuff here
}
this is a little more follow up text."
I need the above inserted into one row. Is there anything special I need to do to the formatting?
For one particular table, I want each of my rows to come from its own file and to import them that way. I'm guessing it's a matter of creating some sort of batch file that runs multiple imports.
Edit
That's exactly the syntax I posted, minus a tab, since I'm using a comma. The missing line break in my post made it less apparent. Anyway, that gives the mismatch error.
I was looking into the same problem. It looks like I've found an answer to the first part of your question, about importing a file into a table with an ID field.
So yes, create a temporary table without ID, import your file into it, then do insert..select to copy its data into your target table. (Remove leading commas from mytextfile.txt).
-- assuming your table is called Strings and
-- was created like this:
-- create table Strings( ID integer primary key, Code text )
create table StringsImport( Code text );
.import mytextfile.txt StringsImport
insert into Strings ( Code ) select * from StringsImport;
drop table StringsImport;
I don't know what to do with the newlines. I've read some mentions that importing in CSV mode will do the trick (.mode csv), but when I tried it, it did not seem to work.
In case anyone is still having issues with this, you can download an SQLite manager.
There are several that allow importing from a CSV file.
Here is one, but a Google search should reveal a few: http://sqlitemanager.en.softonic.com/
I'm in the process of moving data containing long text fields with various punctuation marks (they are actually articles on coding) into SQLite and I've been experimenting with various text imports.
I created a database in SQLite with a table:
CREATE TABLE test (id INTEGER PRIMARY KEY AUTOINCREMENT, textfield TEXT);
then did a backup with .dump.
I then add the text below the "CREATE TABLE" line manually in the resulting .dump file, like so:
INSERT INTO test (id, textfield) VALUES (1,'Isn''t it great to have
really long text with various punctuation marks and
newlines');
Change any single quotes to two single quotes (change ' to ''). Note that the index number needs to be added manually (I'm sure there is an AWK/SED command to do it automatically). Change the auto-increment number in the sqlite_sequence line of the dump file to one above the last index number you added (I don't have SQLite in front of me to give you the exact line, but it should be obvious).
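For reference, that line in the dump should look something like this (here assuming the last row you added has index 1):
INSERT INTO sqlite_sequence VALUES('test',1);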
With the new file, I can then do a restore onto the database.
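For example, assuming the edited dump was saved as dump.sql (the file names are hypothetical):
sqlite3 restored.db < dump.sql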