Data Type Cast Won't Stick in SSIS - date

I'm trying to automate a process with SSIS that exports data into a flat file (.csv) that is then saved to a directory, where it will be scanned and imported by some accounting software. The software (unfortunately) only recognizes dates in MM/DD/YYYY format. I have tried every which way to cast or convert the data pulled from SQL into MM/DD/YYYY format, but somehow the data is always recognized as either a DT_DATE or DT_DBDATE data type in the flat file connection, and saved down as YYYY-MM-DD.
I've tried various combinations of data conversion, derived columns, and changing the properties of the flat file columns to string in hopes that I can at least use substring operations to get this formatted correctly, but it never fails to save down as YYYY-MM-DD. It is truly baffling. The preview in the OLE DB source will show the dates as "MM/DD/YYYY" but somehow it always changes to "YYYY-MM-DD" when it hits the flat file.
I've tried to look up solutions (for example, here: Stubborn column data type in SSIS flat flat file connection manager won't change. :() but with no luck. Amazingly if I merely open the file in Excel and save it, it will then show dates in a text editor as "MM/DD/YYYY", only adding more mystery to this Bermuda Triangle-esque caper.
If there are any tips, I would be very appreciative.

This is a date formatting issue.
In SQL and in SSIS, dates have one literal string format, and that is YYYY-MM-DD. Ignore the way they appear to you in the data previewer and/or Excel. Dates are displayed to you based upon your Windows regional preferences.
For example, unlike in the US, folks in the UK will see all dates as DD/MM/YYYY. The way dates are shown to us is NOT the way they are stored on disk. When you open the file in Excel, it does this conversion as a favor. It's not until you SAVE that the dates are stored, as text, according to your regional preferences.
To get dates to always display the same way, we need to save them not as dates but as strings of text. To do this, we have to get the data out of a date column (DT_DATE or DT_DBDATE) and into a string column (DT_STR or DT_WSTR). Then, map this new string column into your csv file. There are two ways to do this "date-to-string" conversion.
First, have SQL do it. Update your OLE DB Source query and add one more column...
SELECT
*,
CONVERT(VARCHAR(10), MyDateColumn, 101) AS MyFormattedDateColumn
FROM MyTable
The other way is to let SSIS do it. Add a Derived Column component with the expression
SUBSTRING([MyDateColumn],6,2) + "/" + SUBSTRING([MyDateColumn],9,2) + "/" + SUBSTRING([MyDateColumn],1,4)
Map the string columns into your csv file, NOT the date columns. Hope this helps.
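The same idea (hand the file a pre-formatted string rather than a date value) can be sketched outside SSIS. The Python snippet below is only an illustration under assumptions: the function names, the InvoiceId column, and the sample rows are made up and are not part of the original package.

```python
import csv
import io
from datetime import date


def format_us_date(value: date) -> str:
    """Render a date as zero-padded MM/DD/YYYY text."""
    return value.strftime("%m/%d/%Y")


def rows_to_csv(rows):
    """Write (id, date) pairs to CSV text, converting each date to a string first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["InvoiceId", "MyFormattedDateColumn"])
    for invoice_id, invoice_date in rows:
        # The writer only ever sees text, so no regional or ISO formatting is applied later.
        writer.writerow([invoice_id, format_us_date(invoice_date)])
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv([(1, date(2021, 6, 1)), (2, date(2020, 12, 31))]))
```

Because the conversion happens before the value reaches the writer, the text in the file no longer depends on how the destination treats date types.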

It's been a while but I just came across this today because I had the same issue and hope to be able to spare someone the trouble of figuring it out. What worked for me was adding a new field in the Derived Column transform rather than trying to change the existing field.
Edit
I can't comment on Troy Witthoeft's answer, but wanted to note that if you have a Date type input, you wouldn't be able to do SUBSTRING. Instead, you could use something like this (MONTH and DAY return unpadded numbers, so they are left-padded with a zero here to get two digits):
RIGHT("0" + (DT_WSTR,2)MONTH([Visit Date]),2) + "/" + RIGHT("0" + (DT_WSTR,2)DAY([Visit Date]),2) + "/" + (DT_WSTR,4)YEAR([Visit Date])

Related

I need to split a text column in Access that stores dates in MM/YYYY and MM/DD/YYYY format to remove the day format

I have a program that writes dates as text in a column called Expression. Is there a way to write a query that will remove the day of the month? I need it to be something that I can share with multiple people to use in multiple databases. Examples of the format of the values stored are 6/2021, 06/2021, and 06/01/2021.
You can use IIf in the query to compare the length of the data, and pad/split as required. Something like:
SELECT
Expression,
IIf(Len([Expression])=6,"0" & [Expression],IIf(Len([Expression])=7,[Expression],IIf(Len([Expression])=10,Left([Expression],2) & "/" & Right([Expression],4)))) AS OutputData
FROM tblFormatDate;
You may need to add a few more IIfs, and also a final false return statement. If it starts to get messy with the data, then you may want to look at creating a custom VBA function that makes the process flow easier to understand.
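As a rough illustration of what such a custom function could do, here is the same logic as a Python sketch rather than VBA. It assumes every value is month-first and uses "/" as the separator, as in the three examples from the question; the name strip_day is hypothetical.

```python
def strip_day(expression: str) -> str:
    """Return MM/YYYY for text dates such as 6/2021, 06/2021 or 06/01/2021."""
    parts = expression.strip().split("/")
    if len(parts) == 2:
        month, year = parts          # already month/year
    elif len(parts) == 3:
        month, _day, year = parts    # drop the day of the month
    else:
        raise ValueError(f"Unrecognised date text: {expression!r}")
    if not (month.isdigit() and year.isdigit() and len(year) == 4):
        raise ValueError(f"Unrecognised date text: {expression!r}")
    # Zero-pad the month so 6/2021 and 06/2021 produce the same output.
    return f"{int(month):02d}/{year}"


if __name__ == "__main__":
    for sample in ["6/2021", "06/2021", "06/01/2021"]:
        print(sample, "->", strip_day(sample))
```

Splitting on the separator instead of testing string lengths also covers unpadded values such as 6/1/2021, which the length-based IIf chain would need extra branches for.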

How to check date format in Azure Data Factory

I am creating a pipeline where the source is csv files and sink is SQL Server.
The date column in CSV file may have values like
12/31/2020
10162018
20201017
31/12/1982
1982/12/31
I cannot find a function that checks the format of a date. How do I check the format and convert the above values to yyyy-MM-dd format?
The solution is given by HimanshuSinha-msft
Solved the issues using expression builder in Derived Column in Mapping Data Flow.
coalesce(toDate(Somedate,'MM/dd/yyyy'),toDate(Somedate,'yyyy/MM/dd'),toDate(Somedate,'dd/MM/yyyy'),toDate(Somedate,'MMddyyyy'),toDate(Somedate,'yyyyddMM'),toDate(Somedate,'yyyyMMdd'))
This coalesce function answer will not actually solve the problem. It only gets rid of the errors. Plenty of date strings are valid in more than one format. For example, "2/1/2020" is February 1 when read as mm/dd/yyyy but January 2 when read as dd/mm/yyyy, so coalesce silently takes whichever format is listed first and your analyses downstream will be very incorrect.
You need to do an aggregate analysis of which date format best fits the incoming stream, and then route the logic to the respective separate pipeline branches.
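One way to picture that aggregate analysis is the Python sketch below. It is not Data Factory expression code; the candidate format list is an assumption based on the examples in the question, and the function names are hypothetical. It counts how many values in the whole column parse under each format and picks the format that fits the most values.

```python
from collections import Counter
from datetime import datetime

# Assumed candidates, taken from the sample values in the question.
CANDIDATE_FORMATS = ["%m/%d/%Y", "%d/%m/%Y", "%Y/%m/%d", "%m%d%Y", "%Y%m%d"]


def parses(value: str, fmt: str) -> bool:
    """Return True if value is a valid date under fmt."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def best_format(values):
    """Pick the format that parses the most values; ties go to the earlier candidate."""
    counts = Counter()
    for value in values:
        for fmt in CANDIDATE_FORMATS:
            if parses(value, fmt):
                counts[fmt] += 1
    if not counts:
        raise ValueError("No candidate format matched any value")
    return max(CANDIDATE_FORMATS, key=lambda fmt: counts[fmt])


def to_iso(value: str, fmt: str) -> str:
    """Convert one value to yyyy-MM-dd using the chosen format."""
    return datetime.strptime(value, fmt).date().isoformat()


if __name__ == "__main__":
    column = ["2/1/2020", "1/2/2020", "12/31/2020"]
    chosen = best_format(column)
    print(chosen, [to_iso(v, chosen) for v in column])
```

Here "12/31/2020" is the value that disambiguates the column: it cannot be dd/mm/yyyy, so the ambiguous values are read as month-first too. A column with no disambiguating value stays ambiguous, and the tie-break order is then a guess.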
You can configure this in the Mapping tab of your copy activity. The datetime format can be specified, but it only supports one format type. If you have a mix of formats like in your example then it will not work.
One option would be to ingest the column into a staging table as an nvarchar. Then, in another copy activity, use a custom SELECT statement to detect the column format and cast the date as needed. You should be able to do this using a SQL CASE expression in your SELECT from the staging table.
FYI: data type mapping
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#data-type-mapping

Date in table is dd.mm.yyyy - Can't import to postgres via csv

I'm trying to add a .csv to a table in database.
All dates in the .csv are in the format dd.mm.yyyy (e.g. 18.10.2017).
I'm importing via pgadmin and always get an invalid input error.
I've tried to use almost all date formatting options for the column but without any luck.
I would rather not change the csv manually.
Can anyone help me with this?
I almost always import data into a staging table where all the columns are strings.
Then I use queries to load the final table.
This has several advantages:
It gives me much more control over how the data is transformed.
It makes it easier to debug problems -- the entire staging table can be queried to find all rows with a particular issue (for instance).
Additional validations can be performed before loading into the final table.
This is just a suggestion, but you might find that overall this takes less time.
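The staging idea can be sketched in Python as follows. This illustrates the pattern only, not pgAdmin or PostgreSQL code: the row layout (dicts with "id" and "date" keys) and the function name are assumptions. Everything arrives as strings, and conversion happens in a second step where bad rows can be set aside and inspected.

```python
from datetime import datetime


def load_from_staging(staged_rows):
    """Split staged string rows into converted rows and rejected rows."""
    loaded, rejected = [], []
    for row in staged_rows:
        try:
            # The source file uses dd.mm.yyyy, e.g. 18.10.2017.
            parsed = datetime.strptime(row["date"], "%d.%m.%Y").date()
        except ValueError:
            rejected.append(row)  # keep the raw text for debugging
            continue
        loaded.append({**row, "date": parsed.isoformat()})
    return loaded, rejected


if __name__ == "__main__":
    staged = [
        {"id": "1", "date": "18.10.2017"},
        {"id": "2", "date": "31.02.2017"},  # not a real date
    ]
    good, bad = load_from_staging(staged)
    print(good)
    print(bad)
```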
The DateStyle setting is probably set to MDY. You can check this by running:
show datestyle;
Although dd.mm.yyyy isn't listed as a standard input format, if you expect it to work, you will need the DateStyle to line up with the field ordering in your data (DMY).
The date/time style can be selected by the user using the SET datestyle command, the DateStyle parameter in the postgresql.conf configuration file, or the PGDATESTYLE environment variable on the server or client.
See section "Date Order Conventions":
https://www.postgresql.org/docs/current/static/datatype-datetime.html

MS Access importing dates

At the end of importing a .txt file with the wizard, I get a message that some elements were not imported correctly. I have a column in the .txt which should contain dates, but when I select that column and set its type to date and time, Access cannot recognize the values as dates. I'm thinking that it's because of the language difference. I use dates like 1.1.2011, whereas Access uses 1/1/2011.
Where can I change the format?
You can change it in the Advanced section of the Import Wizard.
If that doesn't work, don't import but link the file and specify the date field as text.
Then create a simple select query where you use the linked table as source. Select all the fields you need.
For the date field, use this expression:
TrueDate: CDate(Replace([YourTextDateField], ".", "/"))
Clean up other fields as well.
Now use this query for the further processing of the data.

Excel 2010 - Pivot using external csv file - how to make dates work?

I have a set of pivot tables that use external csv files as their data sources. The csv files originally contained dates in the format dd/mm/yy (e.g. 31/01/13). The pivot tables did not recognise these as dates. I converted the dates in the csv files to dd/mm/yyyy (e.g. 31/01/2013) but these were still not recognised as dates by the pivot tables.
I tried setting up a calculated field =DATEVALUE(date_from_csv) but when used in the pivot table (I'm using the Max option to select the most recent date) I get #VALUE! errors.
I have tried converting the csv file to xlsx and also importing the data into the workbook that contains the pivot table - but I can't change from the external connection to use the internal data. I don't want to rebuild the pivots as there are a lot of variables and formatting that would take ages to redo.
Any ideas??
The problem was caused by the date column being blank for some rows and I found that if I moved a row to the top (after the header line) that had all the fields filled in, then Excel got the formats correct and the pivot tables now work!