Using a Script Task to create an ADO.NET (ODBC) Data Flow Source

I need some help with an SSIS Script Task (SQL 2008 R2) that dynamically creates a package. I am refining a package that copies data from a Sage Timberline (now rebranded as Sage 300) Pervasive SQL environment to a SQL Server data warehouse. I can create a package that opens the connection to Timberline and copies the data to a table in SQL Server. The problem is that for each company in Timberline and each table in SQL, I need to create a separate data flow task. Given the three Timberline company folders and the number of tables in each folder, this would take a lot of time to create and be cumbersome to maintain and troubleshoot.
I am trying to create a package that uses a Foreach Loop to build a package containing an ADO.NET (ODBC) source (Timberline) and an OLE DB destination (SQL Server), and that handles the column mapping dynamically. I found code here that almost does what I need.
I tested this code and it works great using OLE DB SQL sources and destinations. What makes this script work is that it dynamically handles the column mapping. So, if you placed it into a Foreach Loop over the 100 or so tables, each iteration could dynamically create the data flow, map the columns, and then execute the new package.
My problem is that I can only connect to Timberline using ODBC, so I need to modify the script to create the source connection with ADO.NET (ODBC) instead of OLE DB. I'm having a lot of trouble trying to figure this out. Could someone please help me out with this?
Here are the other things I tried first, before settling on this approach:
Solution: Setup a Linked server to Timberline Pervasive SQL
Problem: SQL Server is 64-bit and the Timberline driver is 32-bit. Using a linked server returns an architecture mismatch error. I called Sage and they said they have no plans to release a 64-bit driver.
Solution: Use one of the SQL Transfer tasks
Problem: Only works with SQL Server databases. This source is a Pervasive SQL database.
Solution: Use an “INSERT … INTO …” type script
Problem: This requires a linked server. See the problem above.
Here’s the section of the original VB .NET code I need help with:
'To Create a package named [Sample Package]
Dim package As New Package()
package.Name = "Sample Package"
package.PackageType = DTSPackageType.DTSDesigner100
package.VersionBuild = 1
'To add Connection Manager to the package
'For source database (OLTP)
Dim OLTP As ConnectionManager = package.Connections.Add("OLEDB")
OLTP.ConnectionString = "Data Source=.;Initial Catalog=OLTP;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;"
OLTP.Name = "LocalHost.OLTP"
'To add Load Employee Dim to the package [Data Flow Task]
Dim dataFlowTaskHost As TaskHost = DirectCast(package.Executables.Add("SSIS.Pipeline.2"), TaskHost)
dataFlowTaskHost.Name = "Load Employee Dim"
dataFlowTaskHost.FailPackageOnFailure = True
dataFlowTaskHost.FailParentOnFailure = True
dataFlowTaskHost.DelayValidation = False
dataFlowTaskHost.Description = "Data Flow Task"
'-----------Data Flow Inner component starts----------------
Dim dataFlowTask As MainPipe = TryCast(dataFlowTaskHost.InnerObject, MainPipe)
' Source OLE DB connection manager to the package.
Dim SconMgr As ConnectionManager = package.Connections("LocalHost.OLTP")
' Create and configure an OLE DB source component.
Dim source As IDTSComponentMetaData100 = dataFlowTask.ComponentMetaDataCollection.[New]()
source.ComponentClassID = "DTSAdapter.OLEDBSource.2"
' Create the design-time instance of the source.
Dim srcDesignTime As CManagedComponentWrapper = source.Instantiate()
' The ProvideComponentProperties method creates a default output.
srcDesignTime.ProvideComponentProperties()
source.Name = "Employee Dim from OLTP"
' Assign the connection manager.
source.RuntimeConnectionCollection(0).ConnectionManagerID = SconMgr.ID
source.RuntimeConnectionCollection(0).ConnectionManager = DtsConvert.GetExtendedInterface(SconMgr)
' Set the custom properties of the source.
srcDesignTime.SetComponentProperty("AccessMode", 0)
' Mode 0 : OpenRowset / Table - View
srcDesignTime.SetComponentProperty("OpenRowset", "[dbo].[Employee_Dim]")
' Connect to the data source, and then update the metadata for the source.
srcDesignTime.AcquireConnections(Nothing)
srcDesignTime.ReinitializeMetaData()
srcDesignTime.ReleaseConnections()
Thanks in advance!

The C# code here is what you need if you need a Derived Column transform between the Source and Destination...
http://bifuture.blogspot.com/2011/01/ssis-adding-derived-column-to-ssis.html
To get the Source and Destination connections working, there is some secret sauce here for getting things working between COM and .NET...
http://blogs.msdn.com/b/mattm/archive/2008/12/30/api-sample-ado-net-source.aspx
There is a similar page showing what to do for OleDB connections too.
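To make that concrete, the core change against the question's OLE DB code is roughly the following C# sketch, based on the mattm sample above. The connection string, the table name, and the exact AccessMode value are assumptions you will want to verify against your SSIS version; the "ADO NET Source" lookup through PipelineComponentInfos avoids hard-coding the component's assembly-qualified class ID:
// Assumes: using Microsoft.SqlServer.Dts.Runtime;
//          using DtsPW = Microsoft.SqlServer.Dts.Pipeline.Wrapper;
Application app = new Application();
Package package = new Package();

// ADO.NET connection manager; the qualifier after "ADO.NET:" selects the ODBC provider
ConnectionManager srcConn = package.Connections.Add(
    "ADO.NET:System.Data.Odbc.OdbcConnection, System.Data, " +
    "Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089");
srcConn.ConnectionString = "Dsn=TimberlineDsn;Uid=user;Pwd=pwd;"; // placeholder DSN
srcConn.Name = "Timberline ODBC";

TaskHost dfHost = (TaskHost)package.Executables.Add("SSIS.Pipeline.2");
DtsPW.MainPipe pipe = (DtsPW.MainPipe)dfHost.InnerObject;

// Look the ADO NET Source up by name instead of hard-coding its class ID
DtsPW.IDTSComponentMetaData100 source = pipe.ComponentMetaDataCollection.New();
source.Name = "Timberline Source";
source.ComponentClassID = app.PipelineComponentInfos["ADO NET Source"].CreationName;

DtsPW.CManagedComponentWrapper srcDesign = source.Instantiate();
srcDesign.ProvideComponentProperties();

// The "secret sauce": bridge the runtime connection manager into the COM pipeline
source.RuntimeConnectionCollection[0].ConnectionManagerID = srcConn.ID;
source.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(srcConn);

// For the ADO NET Source, AccessMode 0 should be "Table or view" (verify on your build)
srcDesign.SetComponentProperty("AccessMode", 0);
srcDesign.SetComponentProperty("TableOrViewName", "\"SomeTimberlineTable\"");

// Connect, pull the column metadata from Timberline, then let go of the connection
srcDesign.AcquireConnections(null);
srcDesign.ReinitializeMetaData();
srcDesign.ReleaseConnections();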
Creating the source tables is easy. The available ODBC metadata collections should be retrieved with GetSchema("MetaDataCollections"); this returns a list of the schema collections available for that particular ODBC driver.
Next, you'll want to look at the data types returned from GetSchema("DataTypes"), so you can correctly interpret the data types for each column retrieved from GetSchema("Columns") when building your SQL Server CREATE TABLE script (which I'm assuming you've done).
To at least figure out which tables have primary keys, you'll need to loop over each table returned from GetSchema("Tables") in order to work with GetSchema("Indexes"). There's a bug that requires you to query the indexes one table at a time; the workaround (easy to google) is to create a string array of restrictions containing the table name and pass it as the second argument, e.g. GetSchema("Indexes", restrictions).
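As a concrete illustration of the GetSchema calls above - a hedged sketch, with a placeholder DSN, and note that the position of the table name in the restriction array can vary by ODBC driver:
using System;
using System.Data;
using System.Data.Odbc;

class SchemaDump
{
    static void Main()
    {
        using (var conn = new OdbcConnection("Dsn=TimberlineDsn;Uid=user;Pwd=pwd;"))
        {
            conn.Open();

            // Which schema collections does this particular driver expose?
            DataTable collections = conn.GetSchema("MetaDataCollections");

            // Driver-specific type names, used to build the CREATE TABLE scripts
            DataTable dataTypes = conn.GetSchema("DataTypes");

            DataTable tables = conn.GetSchema("Tables");
            DataTable columns = conn.GetSchema("Columns");

            // Workaround for the bug above: query Indexes one table at a time,
            // passing the table name in a restriction array
            foreach (DataRow tbl in tables.Rows)
            {
                string tblName = tbl["TABLE_NAME"].ToString();
                string[] restrictions = { null, null, tblName };
                DataTable indexes = conn.GetSchema("Indexes", restrictions);
                Console.WriteLine("{0}: {1} index rows", tblName, indexes.Rows.Count);
            }
        }
    }
}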
What I did was pull the Tables and Columns collections into object variables in my parent SSIS package. Because Timberline is so fast (not), it seemed more efficient to pull all the columns down and filter them locally... which I do to create the tables in SQL Server, if necessary.
Once that is done, use the local copy of Tables again to manipulate an SSIS package in a Script Task in "design mode" (change the source and destination target tables, and redo the column mappings), and execute the now-in-memory SSIS package.
For me it took a while to figure out; both of the URLs above were required. I found and copied the .NET 2.0 Dts.PipelineWrap and Dts.RuntimeWrap DLLs to the Microsoft.Net\FrameworkV2.0xxxxx folder, then referenced these in each Script Task wanting to use them, before setting up my "using DtsPW = Microsoft.SqlServer.Dts.Pipeline.Wrapper", etc.
Of note, because Timberline is 32-bit ODBC, I think it's necessary to build the SSIS package to target "X86", and to target the Script Tasks at the .NET 2.0 framework.
I used the Derived Column code because I needed to copy multiple Timberline DBs into one SQL Server DB. Derived Column adds a "CompanyID" value to the output pipeline to SQL Server.
In the end, map the Destination's Virtual Input columns to its External Metadata columns, based on the pipeline the Destination is attached to:
foreach (DtsPW.IDTSVirtualInputColumn100 vColumn in destVirtInput.VirtualInputColumnCollection)
{
    // Mark the upstream (virtual input) column as used by the destination...
    var vCol = destInst.SetUsageType(destInput.ID, destVirtInput, vColumn.LineageID, DtsPW.DTSUsageType.UT_READWRITE);
    // ...then map it to the external metadata column with the same name
    destInst.MapInputColumn(destInput.ID, vCol.ID, destInput.ExternalMetadataColumnCollection[vColumn.Name].ID);
}
Anyways, that code will make more sense in the context of the bifuture.blogspot.com page.
The EzApi library could help with this too, but its AdoNet connection source is coded as a virtual class, so you'd need to implement specific classes to use it. My C# kung fu is not strong enough for that in the time I have...
Also, CozyRoc sells a toolset with custom SSIS controls (data flow Source and Destination controls...) that looks like it does this on the fly input-to-output column mapping as well.
My package seems to work well enough now... Oh, and one more thing: I did not have luck trying to use DSN-less ODBC connections to Timberline, just: Dsn=dsnname;Uid=user;Pwd=pwd;
SSIS packages running in 64-bit land cannot see 32-bit DSNs on a 64-bit OS, it seems... at least, it didn't work for me (Win7 64-bit, 32-bit Text ODBC DSN).

Related

How to get the servername\hostname in Firebird 2.5.x

I use Firebird 2.5, and I want to retrieve the following values:
Username:
I used SELECT rdb$get_context('SYSTEM', 'CURRENT_USER') FROM ...
Database name:
I used SELECT rdb$get_context('SYSTEM', 'DB_NAME') FROM ...
But for the server name I did not find any client API. Do you know how I can retrieve the server name with a SELECT statement?
There is nothing built-in in Firebird to obtain the server host name (or host names, as a server can have multiple host names) through SQL.
The closest you can get is by requesting the isc_info_firebird_version information item through the isc_database_info API function. This returns version information that - if connecting through TCP/IP - includes a host name of the server.
But as your primary reason for this is to discern between environments in SQL, it might be better to look for a different solution. Some ideas:
Use an external table
You can create an external table to contain the environment information you need.
In this example I just put in a short, fixed-width name for the environment types, but you could include other information. Just be aware that the external table format is a binary format. When using CHAR it will look like a fixed-width text format, where values shorter than declared need to be padded with spaces.
You can follow these steps:
Configure ExternalFileAccess in firebird.conf (for this example, you'd need to set ExternalFileAccess = Restrict D:\data\DB\exttables).
You can then create a table as
create table environment
external file 'D:\data\DB\exttables\environment.dat' (
environment_type CHAR(3) CHARACTER SET ASCII NOT NULL
)
Next, create the file D:\data\DB\exttables\environment.dat and populate it with exactly three characters (e.g. TST for test, PRO for production, etc). You can also insert the value instead; the external table file will be created automatically. Inserting might be simpler if you want more columns, or data with varying length, etc. Just keep in mind that the format is binary, but using CHAR for all columns will make it look like text.
Do this for each environment, and make sure the file is read-only to avoid accidental inserts.
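If you would rather create the file by hand than insert the value, here is a minimal C# sketch (purely illustrative, using the path and CHAR(3) layout from the example above) that writes the exact fixed-width binary content the external table expects:
using System.IO;
using System.Text;

class WriteEnvironmentFile
{
    static void Main()
    {
        // Exactly three ASCII characters, matching CHAR(3) CHARACTER SET ASCII
        File.WriteAllBytes(@"D:\data\DB\exttables\environment.dat",
                           Encoding.ASCII.GetBytes("TST"));
    }
}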
After this is done, you can obtain the environment information using
select environment_type
from environment
You can share the same file for multiple databases on the same host, and external files are - by default - not included in a gbak backup (they are only included if you apply the -convert backup option), so this allows moving databases between environments without dragging this information along.
Use a UDF or UDR
You can write a UDF (User-Defined Function) or UDR (User-Defined Routine) in a suitable programming language to return the information you want and define this function in your database. Firebird can then call this function from SQL.
UDFs are considered deprecated; you should use UDRs - introduced in Firebird 3 - instead, if possible.
I have never written a UDF or UDR myself, so I can't describe it in detail.

Automate SSAS tabular model refresh in the table level

I am trying to automate SSAS tabular model refresh. The requirement is that, depending on the tables chosen, the model will be refreshed only for those tables. I am looking for a way to dynamically build the script to process only the selected tables in the first step of a SQL Agent job and pass that dynamically built script to the next step, which will be a SQL Server Analysis Services Command step. Or maybe execute the script built in step 1 itself. But I am not sure how this could be achieved. Please let me know the possible ways.
Have you considered doing this through SSIS and executing the package from SQL Agent? You can use an Analysis Services Processing Task and select the tables that you want to process. If you want to do this in a more dynamic manner, the following items outline how this can be done.
The table names that you want to process will be stored in an object variable. One option is to query an SSAS DMV from an Execute SQL Task for the names of the tables that will be processed, and output these names into the object variable. You'll need to set the Result Set to use a full result set and map the object variable in the Result Set pane. The following command will return the unique table names (the table_type filter is used to remove results prefixed with $):
select table_name from $SYSTEM.DBSCHEMA_TABLES
where table_catalog = 'YourTabularModel'
and table_schema = 'Model'
and table_type = 'SYSTEM TABLE'
If you will be using SSAS DMVs then create an OLE DB connection manager using Microsoft OLE DB Provider for Analysis Services 13.0 as the provider. Make sure to set the initial catalog to the SSAS model with the tables that will be processed.
Add a Foreach ADO Enumerator Loop that will use the object variable as the source variable in the Collection pane. In the Variable Mappings pane, add a variable to store the table name.
Inside the Foreach Loop, add an Analysis Services Execute DDL Task.
Create a string variable with an expression that builds the SSAS process command for the table. In the expression, replace the table field (assuming you're using TMSL) with the variable holding the table name.
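If an SSIS package is more than you need, another option - a sketch only, not part of the steps above; the server, model, and table names are placeholders - is to drive the refresh from a small AMO (Tabular Object Model) console app called by the SQL Agent job:
// Requires a reference to Microsoft.AnalysisServices.Tabular (AMO/TOM)
using Microsoft.AnalysisServices.Tabular;

class RefreshSelectedTables
{
    static void Main(string[] args)
    {
        var server = new Server();
        server.Connect(@"Data Source=localhost\TABULAR");
        Database db = server.Databases.FindByName("YourTabularModel");

        // args carries the table names chosen for this run
        foreach (string tableName in args)
        {
            Table table = db.Model.Tables.Find(tableName);
            if (table != null)
                table.RequestRefresh(RefreshType.Full);
        }

        // SaveChanges sends the queued refresh requests to the server
        db.Model.SaveChanges();
        server.Disconnect();
    }
}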

SQL Server CE. Delete data from all tables for integration tests

We are using SQL Server CE for our integration tests. At the moment, before every test we delete all data from all tables, then re-seed the test data. And we drop the database file when the structure changes.
For deletion of data we need to go through every table in the correct order and issue "Delete from table blah", and that is error-prone. Many times I simply forget to add a delete statement when I add new entities. So it would be good if we could automate data deletion from the tables.
I have seen Jimmy Bogard's goodness for deletion of data in the correct order. I have implemented that for Entity Framework and it works in full-blown SQL Server. But when I try to use that in SQL CE for testing, I get an exception saying:
System.Data.SqlServerCe.SqlCeException : The specified table does not exist. [ ##sys.tables ]
SQL CE does not have supporting system tables that hold required information.
Is there a script that works with SQL CE version that can delete all data from all tables?
SQL Server Compact does in fact have system tables listing all tables. In my SQL Server Compact scripting API, I have code to list the tables in the "correct" order - not a trivial task! I use QuickGraph; it has an extension method for sorting a DataSet. You should be able to reuse some of that in your test code:
public void SortTables()
{
    var _tableNames = _repository.GetAllTableNames();
    try
    {
        var sortedTables = new List<string>();
        var g = FillSchemaDataSet(_tableNames).ToGraph();
        foreach (var table in g.TopologicalSort())
        {
            sortedTables.Add(table.TableName);
        }
        _tableNames = sortedTables;
        //Now iterate _tableNames and issue a DELETE statement for each
    }
    catch (QuickGraph.NonAcyclicGraphException)
    {
        _sbScript.AppendLine("-- Warning - circular reference preventing proper sorting of tables");
    }
}
You must add the QuickGraph DLL files (from CodePlex or NuGet), and you can find the implementation of GetAllTableNames and FillSchemaDataSet here: http://exportsqlce.codeplex.com/SourceControl/list/changesets (in Generator.cs and DbRepository.cs)
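Once the table names are sorted, the delete loop itself is short. A minimal sketch (the connection string is a placeholder, the brackets are simple-minded quoting, and it assumes the sort placed referencing child tables before their parents):
using System.Collections.Generic;
using System.Data.SqlServerCe;

static void DeleteAllRows(IEnumerable<string> sortedTableNames)
{
    // sortedTableNames: the topologically sorted list produced by SortTables()
    using (var conn = new SqlCeConnection(@"Data Source=C:\tests\TestDb.sdf"))
    {
        conn.Open();
        foreach (string tableName in sortedTableNames)
        {
            // Child tables come first, so plain DELETEs won't trip FK constraints
            using (var cmd = new SqlCeCommand("DELETE FROM [" + tableName + "]", conn))
            {
                cmd.ExecuteNonQuery();
            }
        }
    }
}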

How to connect Excel to MS SQL and get data WITH column names?

One of my users wants to get data into Excel from a SQL 2008 query/stored proc.
I have never actually done this before.
I tried a sample using ADO and got data, but the user reasonably asked: where are the column names?
How do I connect a spreadsheet to a SQL result set and get it with the column names?
Apparently the field names are in the Recordset object already... just needed to pull them out:
i = 1
For Each objField In rs.Fields
    Sheet1.Cells(1, i) = objField.Name
    i = i + 1
Next objField
I don't know which version of Excel you are using, but in Excel 2007 you can just connect to the SQL DB by going to Data -> From Other Sources -> From SQL Server. After you select your server and database, your connection will be created. Then you can edit it (Data -> Connections -> Properties); in the Definition tab, change the Command type to SQL and enter your query in the Command text box. You can also create a view on the server and just point to that from Excel.
This should do it unless I misunderstood your question.

T-SQL 2000: Four part table name

I don't usually work with linked servers, and so I'm not sure what I'm doing wrong here.
A query like this will work against a linked FoxPro server from SQL 2000:
EXEC('Select * from openquery(linkedServer, ''select * from linkedTable'')')
However, from researching on the internet, something like this should also work:
Select * from linkedserver...linkedtable
but I receive this error:
Server: Msg 7313, Level 16, State 1, Line 1
Invalid schema or catalog specified for provider 'MSDASQL'.
OLE DB error trace [Non-interface error: Invalid schema or catalog specified for the provider.].
I realize it's supposed to be ServerAlias.Catalog.Schema.TableName, but if I run sp_tables_ex on the linked server, the catalog for all tables is just the network path to where the data files are, and the schema is null.
Is this server setup incorrectly? Or is what I'm trying to do not possible?
From MSDN:
Always use fully qualified names when working with objects on linked servers. There is no support for implicit resolution to the dbo owner name for tables in linked servers.
You cannot rely on the implicit schema name resolution of the '..' notation for linked servers. For a FoxPro 'server' you're going to have to use the database and schema as they map to their FoxPro counterparts in the driver you use (I think they map to folder and file name, but I haven't used an ISAM file driver in more than 10 years now).
I think you need to be explicit about resources in the linked server part of the query, for example:
EXEC SomeLinkedServer.Database.dbo.SomeStoredProc
In other words, just dotting them out doesn't work in this case; you have to be more specific.
It's actually:
ServerAlias.Catalog.Schema.LinkedTable
Catalog is the database that you're querying on the linked server, and Schema is the schema of the remote table. So a valid four-part name would look like this:
ServerAlias.AdventureWorks.HumanResources.Employee
or
ServerAlias.MyDB.dbo.MyTable