Import Flat File via SSMS to SQL Server fails

When importing a seemingly valid flat file (CSV, text, etc.) into a SQL Server database using the SSMS Import Flat File option, the following error appears:
Microsoft SQL Server Management Studio
Error inserting data into table. (Microsoft.SqlServer.Import.Wizard)
Error inserting data into table. (Microsoft.SqlServer.Prose.Import)
Object reference not set to an instance of an object. (Microsoft.SqlServer.Prose.Import)
The target table may contain rows that imported just fine. The first row that is not imported appears to have no formatting errors.
What's going wrong?

Check the following:
There are no blank lines at the end of the file (leave the last line's line terminator intact) - this seems to be the most common issue.
There are no unexpected blank columns.
There are no badly escaped quotes.
It looks like the import process loads lines in chunks. This means the lines immediately after the last successfully loaded chunk may appear to have no errors; you need to look at subsequent lines that are part of the failing chunk to find the offending line(s) - one way to hunt for them is sketched below.
This cost me hours of hair-pulling while dealing with large files. Hopefully this saves someone some time.
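One way to run those checks without opening a huge file in an editor is to bulk-load the raw lines into a single-column staging table and query for suspicious rows. This is only a sketch - the table name, file path, terminators, and code page are assumptions to adapt to your own file:

-- Staging table that holds each physical line of the file unparsed.
CREATE TABLE dbo.RawLines (Line nvarchar(max) NULL);

BULK INSERT dbo.RawLines
FROM 'C:\data\myfile.csv'              -- hypothetical path to the file being imported
WITH (ROWTERMINATOR = '0x0a',          -- '\n'; adjust if your file uses '\r\n'
      FIELDTERMINATOR = '<|>',         -- a string that never occurs, so each whole line lands in one column
      CODEPAGE = '65001');             -- UTF-8 (SQL Server 2016+); drop or adjust to your file's encoding

-- Blank lines (typically at the end of the file).
SELECT * FROM dbo.RawLines
WHERE Line IS NULL OR LTRIM(RTRIM(REPLACE(Line, CHAR(13), ''))) = '';

-- Lines whose delimiter count differs from the rest (unexpected blank or extra columns).
SELECT LEN(Line) - LEN(REPLACE(Line, ',', '')) AS CommaCount, COUNT(*) AS LineCount
FROM dbo.RawLines
GROUP BY LEN(Line) - LEN(REPLACE(Line, ',', ''))
ORDER BY LineCount;

-- Lines with an odd number of double quotes (badly escaped quotes).
SELECT * FROM dbo.RawLines
WHERE (LEN(Line) - LEN(REPLACE(Line, '"', ''))) % 2 = 1;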

If the file you're importing is already open, SSMS will throw this error. Close the file and try again.

Make sure, when you create your flat file, that if any of your columns contain text (varchar/nvarchar) values, you DO NOT make the file comma "," delimited. Instead, select the vertical bar "|" or some other character that you are SURE cannot appear in those values; commas are super common inside nvarchar fields.
I had this issue and none of the recommendations from the other answers helped me!
I hope this saves someone some time - it took me hours to figure it out!
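If you end up scripting the load rather than using the wizard, the same idea carries over. A minimal BULK INSERT sketch, assuming a pipe-delimited export with a header row (the table name and path are made up):

BULK INSERT dbo.MyImport               -- hypothetical target table
FROM 'C:\data\myfile.psv'              -- hypothetical path to the pipe-delimited export
WITH (FIELDTERMINATOR = '|',           -- pipe instead of comma, so commas inside text values are harmless
      ROWTERMINATOR = '0x0a',
      FIRSTROW = 2);                   -- skip the header row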

None of the other answers worked for me, however this did:
When you import a flat file, SSMS gives you a brief summary of the data type detected for each column. Whenever you see an nvarchar in a column that should be int or double, change it to int or double, and change all remaining nvarchars to nvarchar(max). This worked for me.

I've been working with CSV data for a long time. I ran into similar problems when I first started this job, but as a novice I couldn't get a precise cause out of the exceptions.
Here are a few things you should look at before importing anything.
Your CSV file must not be open in any other software, such as Excel.
Your CSV file cells should not include stray comma or quotation characters.
There are no unnecessary blanks at the end of your data.
There is no use of a reserved word as data.
If in doubt, open your file in Excel and save it as a new file.

After considering all the suggestions, if anyone is still having issues, check the length of the data type for your columns. It took me hours to figure this out, but increasing the nvarchar length from (50) to (100) worked for me.

One thing that worked for me: you can change the error range to 1 under "Modify Columns".
You then get an error message naming the specific line that's problematic in your file instead of "ran out of memory".

I fixed these errors by playing around with the data types. For instance, I changed tinyint to smallint, smallint to int, and increased my nvarchar() lengths to reasonable values, otherwise setting them to nvarchar(MAX). Since most real-life data does have missing values, I also allowed NULLs in all columns. Everything then worked, with a warning message.
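If the wizard has already created the table and you just need to loosen it up before retrying, the same adjustments can be made in T-SQL. The table and column names below are placeholders for illustration:

-- Widen numeric types that turned out to be too small for the data.
ALTER TABLE dbo.MyImport ALTER COLUMN Quantity int NULL;            -- was tinyint/smallint
-- Give text columns room, or fall back to nvarchar(max) when in doubt.
ALTER TABLE dbo.MyImport ALTER COLUMN Comments nvarchar(max) NULL;
-- Dropping NOT NULL lets rows with missing values load instead of failing.
ALTER TABLE dbo.MyImport ALTER COLUMN MiddleName nvarchar(100) NULL;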

Related

PostgreSQL COPY FROM succeeds but fewer rows copied than lines in file

I have several large CSV (tab-delimited) files I need to copy into one table. Each file is five million rows. The first two files copy just fine, with all five million records. But on the third I get missing data.
I only copy about 3.9 million rows instead of five million. There is no error; it runs just fine, but fewer rows are copied than exist in the file.
I have reviewed the text files and indeed there are five million distinct rows in the text file.
So after a very manual trial-and-error process I found the last row that wrote correctly (annoyingly, the chunk that didn't write was neither at the end nor at the beginning). It appears that perhaps there is an issue with a particular field. The field ends with the following string: ."' (this is period, double quote, single quote). I am using tab-delimited files, but is it possible that Postgres is reading this as some kind of special character? I think all of the subsequent rows may be being written into that field for that row.
Just to add some more context - the field in which the double quotes are throwing things off also happens to be an email field. So there is an email with a typo in it and a double quote, and then 1.1 million rows later there is another email with a typo and a double quote. All the records between these two double quotes don't get written correctly.
That is not surprising if you consider that a logical line in a CSV file can span more than one physical line:
1,a text,2019-11-24
2,"a text
that contains a newline",2020-04-01
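If the tabs really are the only delimiters and the double quotes are just data, one hedged workaround is to stop COPY from treating quotes specially at all. The table and file names below are invented for illustration:

-- Text format: tab is the default delimiter and double quotes are ordinary characters.
COPY big_table FROM '/tmp/file3.tsv' WITH (FORMAT text);

-- Or stay in CSV mode but neutralize quoting by choosing a quote
-- character that never appears in the data (here the backspace control character).
COPY big_table FROM '/tmp/file3.tsv'
    WITH (FORMAT csv, DELIMITER E'\t', QUOTE E'\b');

Note that text format treats backslashes as escape characters, so check whether the data contains those as well.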

Data Conversion Failed SQL

I am using the import and export wizard and imported a large csv file. I get the following error.
Error 0xc02020a1: Data Flow Task 1: Data conversion failed. The data
conversion for column "firms" returned status value 2 and status text "The
value could not be converted because of a potential loss of data.".
(SQL Server Import and Export Wizard)
Upon importing, I use the Advanced tab and make all of the adjustments. As for the field in question, I set it as numeric(8,0). I have since gone through this process multiple times and tried 7, 8, 9, 10, and 11 to no avail. I imported the CSV into Excel and looked at the respective column, firms. It shows no entry with more than 5 characters. I thought about making it DT_STR but will need to manipulate that column eventually by averaging it. Also, I have searched for spaces or strange characters and found none.
Any other ideas?
1) Try changing the numeric precision to numeric(30,20) in both the source and the destination table.
2) Change the data type to str/wstr and adjust the output column width while importing. It will run fine. This happened to me as well while loading a large CSV file of approximately 5 GB. After the load, use the TRY_CONVERT function to convert it back to numeric and check which values went NULL during conversion; you will find the root cause then (see the sketch below).
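A minimal sketch of that TRY_CONVERT check; the staging table name is an assumption, standing in for whatever table you loaded the str/wstr columns into:

-- Values that will not convert cleanly to the target type.
SELECT firms
FROM dbo.StagingImport                 -- hypothetical staging table with firms loaded as varchar
WHERE firms IS NOT NULL
  AND TRY_CONVERT(numeric(8,0), firms) IS NULL;

-- Once the bad values are understood or cleaned up, convert for real.
SELECT TRY_CONVERT(numeric(8,0), firms) AS firms_numeric
FROM dbo.StagingImport;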

Keep leading zeros when joining data sources in Tableau

I am trying to create a data source in Tableau (10.0) where I am joining a table from SQL with an Excel file. The join happens on a site id, but when reading the id from the Excel source, Tableau strips the leading zeros (while SQL keeps them). I have seen an example of adding the leading zeros back as a new, calculated field, but the join still drops rows because the id is not properly formatted when the join is made.
How do I get the Excel data source to read the column with the leading zeros so I can do the join?
Launch Excel and choose to open a new blank workbook.
Click the Data tab and select From Text.
Browse to the saved CSV file and select Import.
Ensure that Delimited is selected and click Next.
Leave Tab as the delimiter and click Next.
Select the column containing the data with leading zeros and click Text.
Repeat for each column which contains leading zeros.
Click Finish.
Click OK.
I've never heard of or used Tableau, but it sounds as though something (the Jet/ACE database driver being used to read the Excel file?) is deciding the column is numeric and parsing the data as numbers, losing the leading zeroes.
If your attempts at putting them back are giving you grief, I'd recommend trying the other direction instead: get SQL Server to convert its strings to numbers. Number matching should be more reliable than string matching, as long as the two systems don't handle rounding differently :)
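A hedged sketch of that direction on the SQL Server side - the table and column names are placeholders - exposing a numeric version of the id so it matches the number-typed Excel column:

-- Numeric version of the site id for the Tableau join; TRY_CAST returns NULL
-- instead of erroring if a value turns out not to be numeric.
SELECT s.*,
       TRY_CAST(s.site_id AS int) AS site_id_numeric
FROM dbo.Sites AS s;                   -- hypothetical table; join on site_id_numeric in Tableau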
If your Excel file was read in from a CSV and the Site ID is showing "Number Stored as Text", I think you can solve your problem by telling Tableau on the Data Source entry that the field is actually a string. On the preview data source view, change the "#" (designating number) to string so that both the SQL source and the Excel source are both strings before doing the join.
This typically has to do with the way Excel stores values, as mentioned above. I would play around with the number formatting for the Site ID column in Excel itself, not Tableau, and change it to "Text" in Excel. You can verify whether Tableau will read it properly with the leading 0s by exporting your Excel file to CSV and looking in the CSV file to see if the leading 0s are still there.

COPY ignore blank columns

Unfortunately I've got a huge number of CSV files with missing separators, as in the example below. Notice the second row has only one separator and two values. Currently I'm getting a "Delimiter not found" error.
If only I could insert NULL into the 3rd column when there are only two values.
1,avc,99
2,xyz
3,timmy,6
Is there any way I can COPY these files into Redshift without modifying the CSV files?
Use the FILLRECORD parameter to load NULLs for the blank columns.
You can check the docs for more details; a hedged example is below.
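For illustration, a COPY that uses FILLRECORD might look like the following; the table, bucket, and IAM role are placeholders, not real names:

-- FILLRECORD loads NULLs for the columns missing at the end of short records
-- instead of failing with "Delimiter not found".
COPY my_table
FROM 's3://my-bucket/path/data.csv'                         -- hypothetical S3 location
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'    -- hypothetical role
DELIMITER ','
FILLRECORD;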

Changing text in CSV file using Perl

I know nothing about Perl. I have looked at some online tutorials and am at a loss for the following.
I do a query in PostgreSQL that saves to a CSV file. However, one element needs to be changed after the CSV file is created, and I have no idea how to do it.
The existing query results are like this
phone date time staff email and customer ID -- my explanation
1112223333,10/21/2013,3:00 AM,sklund#myemail.comSMIB010170 -- data in csv
After query is completed, the data in the time field must be converted to:
1112223333,10/21/2013,03:00am,sklund#myemail.comSMIB010170
As you can see, the time needs to be amended to include a leading 0 if the hour is less than ten, and the AM must be changed to am.
Is there a simple Perl script that can do this? The lines of data, of course, will be different, as each line would reflect results of the query for the day.
If someone can point me to a tutorial, link, or help in this I'd be very grateful.
This one-liner will do what you need: it captures the hour, minutes, and AM/PM marker between the surrounding commas, then rewrites them zero-padded and lower-cased.
I assume you want the space before AM removed as well? You don't mention it in your question.
perl -pe 's/,(\d{1,2}):(\d\d)\s+([AP]M),/sprintf ",%02d:%02d%s,",$1,$2,lc $3/ei' mylogfile > newlogfile