How to connect Excel to MS SQL and get data WITH column names?

One of my users wants to pull data into Excel from a SQL Server 2008 query or stored procedure.
I have never done this before.
I tried a sample using ADO and got the data, but the user reasonably asked: where are the column names?
How do I connect a spreadsheet to a SQL result set and get the data with column names?

Apparently the field names are already in the recordset object; I just needed to pull them out.
' Write each field name into row 1, one column per field
i = 1
For Each objField In rs.Fields
    Sheet1.Cells(1, i).Value = objField.Name
    i = i + 1
Next objField
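
The same idea carries over to other database APIs: the result-set metadata already holds the column names, so a header row can be built before the data rows. A minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for the SQL Server connection (the `orders` table, its columns and the helper name `rows_with_header` are made up for illustration):

```python
import sqlite3

def rows_with_header(cursor):
    """Return the result set as a list of rows, with the column names first."""
    # cursor.description holds one 7-item tuple per column; item 0 is the name.
    header = [col[0] for col in cursor.description]
    return [header] + [list(row) for row in cursor.fetchall()]

# Hypothetical in-memory table standing in for the real query or stored proc.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'Contoso')")
cur = conn.execute("SELECT order_id, customer FROM orders")
grid = rows_with_header(cur)
```

Here `grid[0]` is the header row and the remaining entries are the data rows, which mirrors writing the field names into row 1 of the sheet.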

I don't know which version of Excel you are using but in Excel 2007 you can just connect to the SQL DB by going to Data -> From Other Sources -> From SQL Server. After you select your server and database, your connection will be created. Then you can edit it (Data -> Connections -> Properties) where in the Definition tab you change the Command type to SQL and enter your query in the Command text box. You can also create a view on the server and just point to that from Excel.
This should do it unless I misunderstood your question.

Related

How to populate a table via Pentaho Data Integration's Table Output step?

I am performing an ETL job with Pentaho 7.1.
The job is to populate the table 'PRO_T_TICKETS' in PostgreSQL 9.2 using Pentaho jobs and transformations.
I have mapped the table fields to the stream fields:
[Screenshot: Mapped Fields]
The column names of my table PRO_T_TICKETS are in UPPERCASE.
Is this the reason I can't populate the table PRO_T_TICKETS with my ETL job?
I duplicated the TABLE_OUTPUT step for PRO_T_TICKETS and changed the Target table field to 'PRO_T_TICKETS2'. Pentaho created a new table with lowercase column names and populated it with the data.
But I want this data to be loaded into the table PRO_T_TICKETS only, keeping the UPPERCASE column names if possible.
I am attaching the whole job here and the error thrown by Pentaho: [Screenshot: Pentaho Error]. I have also tried adding double quotes to the column names in my query, as you can see in the error, but it didn't help.
What do you think I should do?

When you create (or modify) the connection, select Advanced in the left panel and tick Force to upper case, Force to lower case or, even better, Preserve case of reserved words.
To find out which option to choose, copy the 4th line of your error log (the line starting with INSERT INTO "public"."PRO_T_TICKETS("OID"...) into your SQL development tool, and change the connection's advanced parameters until it works.
Also, while debugging, don't use batch updates, don't use lazy conversion in previous steps, and try with one (1) field rather than all (25).
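
Why the case options matter: PostgreSQL folds unquoted identifiers to lower case and preserves case only inside double quotes. So if PRO_T_TICKETS was created with quoted uppercase names (an assumption about this particular table), an unquoted or lower-cased reference looks for a different name. A small Python model of that folding rule, with the made-up helper name `fold_identifier`:

```python
def fold_identifier(identifier: str) -> str:
    """Return the name PostgreSQL actually looks up for an identifier."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        # Quoted: case is preserved; a doubled quote is an escaped quote.
        return identifier[1:-1].replace('""', '"')
    # Unquoted: folded to lower case.
    return identifier.lower()

unquoted = fold_identifier('PRO_T_TICKETS')
quoted = fold_identifier('"PRO_T_TICKETS"')
```

The two results differ, which is why the generated INSERT has to quote (or not quote) names the same way the table was created.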

Just as a complement: it worked for me following the tips from AlainD, plus some specific settings that I'd like to share. I have a transformation streaming data from MySQL to PostgreSQL using a Table Input and a Table Output. In both databases the objects are in uppercase.
I took the following steps to make it work:
In the Table Input (MySQL) the objects are uppercase too, but I typed them in lowercase and it worked; I didn't set any special option in the DB connection.
In the Table Output (PostgreSQL) I typed everything in uppercase (schema, table name and columns) and I also checked "Specify database fields" (then clicked "Get fields").
In the target DB connection (PostgreSQL) I set these options in the "Advanced" section: "Quote all in database" and "Preserve case of reserved words".
PS: The last option is there because I found one more problem with my fields: there was a column called "Admin" (yes guys, they created a camel-case column using a reserved word!), so I had to set "Preserve case of reserved words" and type it as "Admin" (without quotes and in camel case) in the Table Output.

SSIS Convert Between Unicode and Non-Unicode Error

I have an SSIS package in which an OLE DB source reads a SQL Server 2005 table. All columns except a date column are NVARCHAR(255). I am using an Excel destination and a SQL statement to create the sheet in the Excel workbook; the SQL is in the Excel connection manager (effectively a CREATE TABLE statement that creates a sheet) and is derived from the mapping of the columns from the DB.
No matter what I do, I keep getting this Unicode --> non-Unicode conversion error between my source and destination. I tried a conversion to string [DT_STR] between source and destination, removed it, changed the SQL table columns from VARCHAR to NVARCHAR, and I still get this flippin error.
Because I am creating the sheet in Excel with a SQL statement, I do not see any way to pre-define the data types of the columns in the Excel sheet. I imagine it uses some default metadata, but I do not know.
So, between my SQL table source and the creation of my Excel sheet with this SSIS SQL statement, how can I stop this error from coming up?
My error is:
Error at Data Flow Task [OLE DB Source [1]]: Column "MyColumn" cannot convert between unicode and non-unicode string data types.
And for all nvarchar columns.
Appreciate any help
Thanks
Andrew

The steps below worked for me:
Right-click the source task.
Click "Show Advanced Editor".
Go to the "Input and Output Properties" tab.
Select the output column for which you are getting the error.
Its data type will be "String [DT_STR]".
Change that data type to "Unicode string [DT_WSTR]".
Save and close.

Add Data Conversion transformations to convert string columns from non-Unicode (DT_STR) to Unicode (DT_WSTR) strings.
You need to do this for all the string columns...

The missing piece here is the Data Conversion object. It should sit between the OLE DB Source and the Destination object.

First, add a Data Conversion block to your data flow diagram.
Open the Data Conversion block and tick the column for which the error is shown. Below, change its data type to Unicode string (DT_WSTR), or whatever data type is expected, and save.
Go to the destination block. Open its mappings, map the newly created column to its corresponding destination column, and save.
Right-click your project in Solution Explorer and select Properties. Select Configuration Properties, then Debugging. There, set the Run64BitRuntime option to False (as Excel does not handle 64-bit applications very well).

Instead of adding a Data Conversion as suggested earlier, you can cast the nvarchar column to a varchar column. This saves an unnecessary step and performs better than the alternative.
In the SELECT of your SQL statement, replace date with CAST(date AS varchar([size])). For some reason this does not by itself change the output data type. To change it, do the following:
Right click your OLE DB Source step and open the advanced editor.
Go to Input and Output Properties
Select Output Columns
Select your column
Under Data Type Properties change DataType to string [DT_STR]
Change Length to the length you specified in your CAST statement
After doing this your source data will be output as a varchar and your error will disappear.

I have been having the same issue and tried everything written here but it was still giving me the same error.
Turned out to be NULL value in the column which I was trying to convert.
Removing the NULL value solved my issue.
Cheers,
Ahmed

No one seems to have mentioned this, but converting varchar to nvarchar in the source query also solves the issue.
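
When there are many string columns, typing the casts by hand gets tedious, so the source query can be generated instead. A rough Python sketch that builds such a SELECT; the table name, column names and lengths are placeholders, `build_unicode_select` is a made-up helper, and it assumes the names contain no "]" character:

```python
def build_unicode_select(table: str, columns: dict) -> str:
    """Build a SELECT that casts every listed column to NVARCHAR(length)."""
    # columns maps column name -> target length, in output order.
    select_list = ",\n    ".join(
        f"CAST([{name}] AS NVARCHAR({length})) AS [{name}]"
        for name, length in columns.items()
    )
    return f"SELECT\n    {select_list}\nFROM {table}"

# Placeholder table and columns, for illustration only.
sql = build_unicode_select("[dbo].[MyTable]", {"MyColumn": 255, "OtherColumn": 50})
```

The generated text can be pasted into the source component's SQL command; the column aliases keep the original names so the existing mappings still line up.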

With the above example I kept losing the changed values. I think delaying validation allows the new data types to be saved as part of the metadata.
On the 'Excel Connection Manager' connection manager, set DelayValidation to True in the Properties window.
Then, on the Excel destination in the data flow, set ValidateExternalMetadata to False, again in the Properties window.
You can now right-click the Excel destination, open the Advanced Editor and go to the far-right tab, Input and Output Properties. In the External Columns folder you can now change the data type and length of the problematic columns, and the changes will be saved.
Good Luck!

I ran into this with a 32-bit Oracle 12 client connected to an Oracle 12 server running on Windows.
Although neither the Oracle source nor the SQL Server destination is Unicode, I kept getting this message, as if the Oracle columns were Unicode.
I solved the problem by inserting a Data Conversion box and selecting type DT_STR (non-Unicode) for varchar2 fields and DT_WSTR (Unicode) for numeric fields; then I dropped the 'Copy of' prefix from the output field names.
Note that I kept getting the error because I had connected the source box arrow to the conversion box BEFORE setting the conversion types. So I had to switch the source box, and this cleared all the errors in the destination box.

When creating the table in SQL Server, make your table columns NVARCHAR instead of VARCHAR.

I think people are missing this. In my case I had 100 character columns to convert between Oracle and MS SQL. All this stuff about Data Conversion and the Advanced Editor is incredibly tedious if you have 100 separate character columns to assign. Plus, SSIS being SSIS, it will sometimes reset all 100 of your Advanced Editor changes even if you set ValidateExternalMetadata to False, which is incredibly obnoxious. I wouldn't mind doing the Data Conversion if there was some value to it, but 20 years ago ETL tools used to take Oracle character columns to MS SQL character columns without fussing. What Bakalolo and Zafer say is the answer if you have a lot of character columns and can live with nvarchar: just declare all your output MS SQL columns as nvarchar, and your data flow task will automatically assign your Oracle fields to the MS SQL fields with no manual overrides. I have also found that the new Oracle Source (2021) doesn't complain about a Unicode conversion to varchar in MS SQL. A colleague just told me that the SSIS wizard (maybe only in VS 2019+) will assign Oracle character columns to MS SQL varchar automatically with no override, but I haven't tried that personally.
2022 update - I think this applies only to packages created in VS 2019 and later. An ADO.NET source reading a varchar MS SQL table and going to an OLE DB (and, I think, ADO.NET) MS SQL varchar destination will throw the Unicode error. If you switch the source to OLE DB reading the MS SQL varchar table, you won't have to do the Advanced Editor overrides for the varchar fields. If you don't want to do Advanced Editor overrides (who does?), try different components, and more OLE DB ones.

I just encountered the same issue and solved it in my SQL query by using CONVERT directly:
CONVERT(NVARCHAR(50),'') AS MyVarName
I needed to put an empty (or fixed-size) string into the Excel file. The CONVERT forces the type of MyVarName from DT_STR to DT_WSTR (Unicode).

I know this is a very old post, but I ran into the same issue and found that I had to manually select the conversion component's output alias as the mapping in the Excel destination component. Since the OLE DB source column names match the Excel column names, the destination was mapping to the OLE DB columns and not to the output aliases: for example, the SourceID column from the OLE DB component is named Copy of SourceID after conversion. The original question doesn't say the new alias names were specifically selected, just that the DB columns were mapped. @Serge Voloshenko's post comes closest, but it also does not mention making sure this mapping happens. A new SSIS user might overlook this.
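
To make the mapping rule concrete, here is a toy Python sketch (plain data, not the SSIS API): for each destination column, prefer the converted "Copy of ..." alias when the pipeline has one, and fall back to the same-named column otherwise. The function name `pick_input_columns` and the LoadDate column are made up for illustration:

```python
def pick_input_columns(destination_columns, pipeline_columns, prefix="Copy of "):
    """Map each destination column to the converted alias when one exists."""
    available = set(pipeline_columns)
    mapping = {}
    for dest in destination_columns:
        alias = prefix + dest
        if alias in available:
            mapping[dest] = alias   # converted column from the Data Conversion
        elif dest in available:
            mapping[dest] = dest    # no conversion was added for this column
    return mapping

mapping = pick_input_columns(
    ["SourceID", "LoadDate"],
    ["SourceID", "Copy of SourceID", "LoadDate"],
)
```

In the designer the default name-based auto-mapping picks the unconverted SourceID, which is exactly the case that has to be corrected by hand.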

Using Script Task to create ADO NET (ODBC) Data Flow Source

I need some help with an SSIS Script Task (SQL 2008 R2) that dynamically creates a package. I am refining a package that copies data from a Sage Timberline (now rebranded as Sage 300) Pervasive SQL environment to a SQL Server data warehouse. I can create a package that opens the connection to Timberline and copies the data to a table in SQL Server. The problem is that, for each company in Timberline and each table in SQL, I need to create a separate data flow task. Given the three Timberline company folders and the number of tables in each folder, this would take a lot of time to create and be cumbersome to maintain and troubleshoot.
I am trying to create a package that uses a Foreach Loop to create a package that creates an ADO/ODBC source (Timberline) and an OLE DB destination (SQL), and dynamically handles the column mapping. I found code here that almost does what I need.
I tested this code and it works great using OLE DB SQL sources and destinations. What makes this script work is that it dynamically handles the column mapping. So, if you placed it into a Foreach Loop over the 100 or so tables, with each iteration it could dynamically create the data flow and map the columns, then execute the new package.
My problem is that I can only connect to Timberline using ODBC. So, I need to modify the script to create the source connection with ADO NET (ODBC) instead of OLE DB. I’m having a lot of trouble trying to figure this out. Could someone please help me out with this?
Here are a couple of other things I tried first, before this approach:
Solution: Setup a Linked server to Timberline Pervasive SQL
Problem: SQL Server is 64-bit and the Timberline driver is 32-bit. Using a linked server returns an architecture mismatch error. I called Sage and they said they have no plans to release a 64-bit driver.
Solution: Use one of the SQL Transfer tasks
Problem: Only works with SQL databases. This source is a Pervasive SQL database
Solution: Use a “INSERT … INTO …” type script
Problem: This requires a linked server. See the problem above
Here’s the section of the original VB .NET code I need help with:
'To Create a package named [Sample Package]
Dim package As New Package()
package.Name = "Sample Package"
package.PackageType = DTSPackageType.DTSDesigner100
package.VersionBuild = 1
'To add Connection Manager to the package
'For source database (OLTP)
Dim OLTP As ConnectionManager = package.Connections.Add("OLEDB")
OLTP.ConnectionString = "Data Source=.;Initial Catalog=OLTP;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;"
OLTP.Name = "LocalHost.OLTP"
'To add Load Employee Dim to the package [Data Flow Task]
Dim dataFlowTaskHost As TaskHost = DirectCast(package.Executables.Add("SSIS.Pipeline.2"), TaskHost)
dataFlowTaskHost.Name = "Load Employee Dim"
dataFlowTaskHost.FailPackageOnFailure = True
dataFlowTaskHost.FailParentOnFailure = True
dataFlowTaskHost.DelayValidation = False
dataFlowTaskHost.Description = "Data Flow Task"
'-----------Data Flow Inner component starts----------------
Dim dataFlowTask As MainPipe = TryCast(dataFlowTaskHost.InnerObject, MainPipe)
' Source OLE DB connection manager to the package.
Dim SconMgr As ConnectionManager = package.Connections("LocalHost.OLTP")
' Create and configure an OLE DB source component.
Dim source As IDTSComponentMetaData100 = dataFlowTask.ComponentMetaDataCollection.[New]()
source.ComponentClassID = "DTSAdapter.OLEDBSource.2"
' Create the design-time instance of the source.
Dim srcDesignTime As CManagedComponentWrapper = source.Instantiate()
' The ProvideComponentProperties method creates a default output.
srcDesignTime.ProvideComponentProperties()
source.Name = "Employee Dim from OLTP"
' Assign the connection manager.
source.RuntimeConnectionCollection(0).ConnectionManagerID = SconMgr.ID
source.RuntimeConnectionCollection(0).ConnectionManager = DtsConvert.GetExtendedInterface(SconMgr)
' Set the custom properties of the source.
srcDesignTime.SetComponentProperty("AccessMode", 0)
' Mode 0 : OpenRowset / Table - View
srcDesignTime.SetComponentProperty("OpenRowset", "[dbo].[Employee_Dim]")
' Connect to the data source, and then update the metadata for the source.
srcDesignTime.AcquireConnections(Nothing)
srcDesignTime.ReinitializeMetaData()
srcDesignTime.ReleaseConnections()
Thanks in advance!

The C# code here is what you need if you need a Derived Column transform between the Source and Destination...
http://bifuture.blogspot.com/2011/01/ssis-adding-derived-column-to-ssis.html
To get the Source & Destination connections working, there is some secret sauce here to get things working between COM and .Net...
http://blogs.msdn.com/b/mattm/archive/2008/12/30/api-sample-ado-net-source.aspx
There is a similar page showing what to do for OleDB connections too.
Creating the source tables is easy. Retrieve the ODBC metadata collections with GetSchema("MetaDataCollections"); this returns a list of the schema collections available for that particular ODBC driver.
Next, you'll want to see the data types returned from GetSchema("DataTypes"), so you can correctly interpret the data types for each column retrieved from GetSchema("Columns") to make your SQL Server create table script (which I'm assuming you've done).
To at least figure out which tables have primary keys, you'll need to loop over each table returned from GetSchema("Tables") in order to work with GetSchema("Indexes"). There's a bug that requires you to query the Indexes one table at a time. It is easy to google this: create a string array of restriction values, with the table name as its 3rd element, and pass it as the second argument: GetSchema("Indexes", restrictions)
What I did was got the Tables and Columns collections into object variables in my parent SSIS package. Because Timberline is so fast (not), it seemed more efficient to pull all the columns down and filter them locally...which I do to create the tables in SQL Server, if necessary.
Once that is done, use the local copy of Tables again to manipulate a SSIS package in a Script task in "design mode" (change source and destination target tables, and redo the column mappings), and execute the now-in-memory SSIS package.
For me it took a while to figure out. Both above URLs were required. I found and copied the .Net 2.0 Dts.PipelineWrap and Dts.RuntimeWrap .dlls to Microsoft.Net\FrameworkV2.0xxxxx folder, then referenced these in each script task wanting to use them, before setting up my "using DtsPW = Microsoft.SqlServer.Dts.Pipeline.Wrapper", etc.
Of note, because Timberline is 32-bit ODBC, I think it's necessary to build the SSIS package to use "X86", and target the script tasks to use .Net 2.0 framework.
I used the Derived Column code because I needed to copy multiple Timberline DBs into one SQL Server DB. Derived Column adds a "CompanyID" value to the output pipeline to SQL Server.
In the end, map the Destination's Virtual Input columns to its External Metadata columns, based off of the pipeline the Destination is attached to:
foreach (DtsPW.IDTSVirtualInputColumn100 vColumn in destVirtInput.VirtualInputColumnCollection)
{
    var vCol = destInst.SetUsageType(destInput.ID, destVirtInput, vColumn.LineageID, DtsPW.DTSUsageType.UT_READWRITE);
    destInst.MapInputColumn(destInput.ID, vCol.ID, destInput.ExternalMetadataColumnCollection[vColumn.Name].ID);
}
Anyways, that code will make more sense in the context of the bifuture.blogspot.com page.
The EzApi library could help with this too, but the AdoNet connection source for it is coded as a virtual class, so you'd need to implement specific classes to use. My C# kungfu is not strong enough for that in the time I have...
Also, CozyRoc sells a toolset with custom SSIS controls (data flow Source and Destination controls...) that looks like it does this on the fly input-to-output column mapping as well.
My package seems to work well enough now... Oh, and one more thing: I did not have any luck with DSN-less ODBC connections to Timberline, only with: Dsn=dsnname;Uid=user;Pwd=pwd;
SSIS packages running in 64-bit land cannot see 32-bit DSNs on a 64-bit OS, it seems... at least, it didn't work for me (win7-64, 32-bit Text ODBC DSN).

Why does SQL Server 2000 treat SELECT test.* and SELECT t.est.* the same?

I butter-fingered a query in SQL Server 2000 and added a period in the middle of the table name:
SELECT t.est.* FROM test
Instead of:
SELECT test.* FROM test
And the query still executed perfectly. Even SELECT t.e.st.* FROM test executes without issue.
I've tried the same query in SQL Server 2008 where the query fails (error: the column prefix does not match with a table name or alias used in the query). For reasons of pure curiosity I have been trying to figure out how SQL Server 2000 handles the table names in a way that would allow the butter-fingered query to run, but I haven't had much luck so far.
Any sql gurus know why SQL Server 2000 ran the query without issue?
Update: The query appears to work regardless of the interface used (e.g. Enterprise Manager, SSMS, OSQL) and as Jhonny pointed out below it bizarrely even works when you try:
SELECT TOP 1000 dbota.ble.* FROM dbo.table

Maybe table names are constructed from a naive concatenation of prefix and base name.
't' + 'est' == 'test'
And maybe in the later versions of SQL Server, the distinction was made more semantic/more rigorously.
{ owner = t, table = est } != { table = test }

SQL Server 2005 and up has a "proper" implementation of schemas. SQL 2000 and earlier did not. The details escape me (it's been years since I used SQL 2000); all I recall clearly is that you'd be nuts to create anything that wasn't owned by "dbo". It all ties into users and object ownership, but the 2000 and earlier model was pretty confusticated. Hopefully someone will read up on BOL, do some experimentation, and post their results here.

S-SQL reference manual:
"[dot] Can be used to combine multiple names into a name of the form A.B to refer to a column in a table, or a table in a schema. Note that you can also just use a symbol with a dot in it."
So I think if you referenced tblTest as tblT.est it would work OK as long as there isn't a column called 'est' in tblTest.
If it can't find a column name referenced with the dot I imagine it checks the parent of the object.

I found a reference to it being a bug
Note: as a result of a comparison algorithm bug in SQL Server 2000, dot symbols themselves have no effect on matching, so "dbo.t" will successfully match with tables "dbot", "d.b.o.t", etc
from Link
It's been fixed in SQL Server 2005. Same link > Changes introduced in SQL Server 2005
Dot-related comparison bug has been fixed.
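
To illustrate what the quoted note describes, here is a toy Python model, not SQL Server's actual code. It compares names with the dots ignored (the reported 2000 behaviour) versus part by part (a simplified stand-in for the fixed behaviour that ignores default schemas). The function names are made up, and the lower-casing assumes a case-insensitive collation:

```python
def _strip_dots(name: str) -> str:
    # Assumption: case-insensitive collation, so compare in lower case.
    return name.replace(".", "").lower()

def sql2000_names_match(prefix: str, table_name: str) -> bool:
    """Model of the reported bug: dots have no effect on matching."""
    return _strip_dots(prefix) == _strip_dots(table_name)

def strict_names_match(prefix: str, table_name: str) -> bool:
    """Simplified model of the fix: compare the dotted parts one by one."""
    return prefix.lower().split(".") == table_name.lower().split(".")

buggy = sql2000_names_match("dbota.ble", "dbo.table")
fixed = strict_names_match("t.est", "test")
```

Under this model `t.est`, `t.e.st` and `dbota.ble` all match their tables in the first function and fail in the second, which lines up with the behaviour seen in the question.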

Is it in the "Open table" view of SSMS or via Enterprise Manager or via an SSMS Query Window?
There is/was a SQL Server 2005 issue with SSMS so how you run the query affects how it behaves.

This is a bug.
It has to do with internal representation of column names in SQL server 2000 that leaked out.
You will also not be able to create a column whose table+column concatenation collides with that of another column: for example, if you have tables User and UserDetail, you won't be able to have columns DetailAge and Age in these tables, respectively.

OpenRowSet command in TSQL is returning NULLS

I've been investigating this for a while now and keep hitting a brick wall. I am importing from xls files into temp tables via the OPENROWSET command. My problem is with a certain column that holds a range of values, of which the most common are the following: some cells are long numbers, e.g. 15598, and some are strings, e.g. 15598-E.
OPENROWSET reads the string version with no problem but reports the number version as NULL. I read (http://www.sqldts.com/254.aspx ) that OPENROWSET has this issue, and the author speaks of adding “HDR=YES;IMEX=1” to the connection string, but that’s not working for me at all.
Have any of you guys ever encountered this?
Just some more info as well: I am not allowed to use the JET engine (Microsoft.Jet.OLEDB.4.0), so this is what my query looks like:
SELECT *
FROM
OPENROWSET('MSDASQL'
, 'Driver=Microsoft Excel Driver (*.xls);HDR=YES;IMEX=1;DBQ=C:\ImportFile.xls;'
, 'SELECT * FROM [Sheet1$]')

I notice you are using the Excel ODBC driver. Have you tried the JET OLEDB Provider with the equivalent connection string?
select * from openrowset(
'Microsoft.Jet.OLEDB.4.0',
'Data Source=C:\ImportFile.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1"',
'SELECT * FROM [Sheet1$]')
EDIT: Sorry, just noticed your last paragraph. Surely the Excel ODBC driver still goes via the JET engine, so what difference would it make?
EDIT: I have looked at the KB194124 link, and the registry values it recommends are the default values on my machine, which I have never changed. I have used the above method several times myself without problems. Maybe it's an environmental issue?

If you don't mind opening the file in Excel, take the columns that have the problem, select the column, and do
Data -> Text to Columns -> Next -> Next -> Text
Save the spreadsheet and they should all come in as Text in OPENROWSET
I've found using .CSV files instead of Excel, opened by setting up a Linked Server, and setting up the format of the files in schema.ini a more practical approach for handling imports like this, with that method you can explicitly choose each column's format.
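
For the schema.ini route, the file sits in the same folder as the CSV and declares each column's type, so nothing is guessed. A minimal Python sketch that generates such a section with every column declared as Text; the file name, column names and the width of 50 are placeholders, and `build_schema_ini` is a made-up helper:

```python
def build_schema_ini(csv_name: str, text_columns: list) -> str:
    """Return a schema.ini section that declares every listed column as Text."""
    lines = [f"[{csv_name}]", "Format=CSVDelimited", "ColNameHeader=True"]
    for position, name in enumerate(text_columns, start=1):
        # ColN=<name> <type> [Width n]; Text avoids any numeric type guessing.
        lines.append(f"Col{position}={name} Text Width 50")
    return "\n".join(lines) + "\n"

# Placeholder file and column names, for illustration only.
ini_text = build_schema_ini("ImportFile.csv", ["PartNo", "Description"])
```

Writing `ini_text` to a file named schema.ini next to ImportFile.csv would then make a mixed column such as 15598 / 15598-E come through as text in every row.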

We've come across the same issue. Unfortunately we've not found a solution either. There's more information here which indicates that there might be a registry fix.

I had the same problem. I fixed it by cutting and pasting a row that contains the mixed string/numeric value (for example 123ABC) into the first row of the sheet. For some reason T-SQL reads the first row and assumes that all the values are numeric.
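
A possible explanation, hedged: the Excel driver is commonly described as sampling the first rows of each column (8 by default, the TypeGuessRows registry value), picking the majority type, and returning NULL for cells that do not fit. The sketch below is only a toy Python model of that description, not the driver's code; `read_column` and `is_number` are made-up names, and it models the behaviour without IMEX=1 taking effect.

```python
from collections import Counter

def is_number(value: str) -> bool:
    """True when the cell text parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def read_column(values, sample_rows=8):
    """Guess one type from the first rows, then NULL out cells that do not fit."""
    sample = values[:sample_rows]
    if not sample:
        return []
    counts = Counter("number" if is_number(v) else "text" for v in sample)
    guessed = counts.most_common(1)[0][0]
    if guessed == "number":
        return [float(v) if is_number(v) else None for v in values]
    # Column guessed as text: in this model the numeric cells are lost.
    return [None if is_number(v) else v for v in values]

mostly_text = read_column(["15598-E", "15599-E", "15598"])
mostly_numbers = read_column(["15598", "15599", "15598-E"])
```

In this model, whichever type dominates the sampled rows wins and the other kind of cell comes back as None, which is why changing what sits in the first rows, or converting the whole column to text, changes the result.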

The response by SqlACID (the Text to Columns answer above) worked great for me [https://wikigurus.com/Article/Show/185717/OpenRowSet-command-in-TSQL-is-returning-NULLS].
