index was out of range must be non negative and less than the size of the collection when generating script with data - sql-server-2008-r2

I am trying to generate the script with data i am getting
index was out of range must be non negative and less than the size of the collection in sql server when generating script with data. How can i resolve this issue.
To get script with data i doing following steps
right click on database.
Task - Generate Script
Select table .
Select specified database object
Select Table
Select Advance The select type of database script to script data
Press Next Then Next to start

Related

Capturing Each Skipped Record - Copy Data Activity

I'm doing a large-scale project with multiple pipelines, millions of records per pipeline. I'm trying to develop a generic skipped row capture process.
What I need to do is: for every source row skipped due to any error encountered on the attempted load, I want to capture a key column value from the row and write it to a distinct log file (or separate DB table row). This can't be summary data: for each individual row that fails, I need to capture the row key from that row so we can review/re-load later (I will add in system variable values to identify pipeline, component, time stamp, etc). Pipeline must complete with all successful rows loaded, all unsuccessful rows logged.
This is no-brainer functionality in most ETL tools; I have to be overlooking something in ADF, because I can't find a way to do this. Appreciate any/all suggestions.
You can enable Fault tolerance and choose Skip incompatible rows option. It will skip the incompatible rows between source and target store during copy data. e.g. type and field mismatch or PK violation.
Then you can enable session log and choose Warning log level in copy activity to log skipped rows. Finally, you can save your log file in Azure Storage or Azure Data Lake Storage Gen2.
Reference:
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-log
With your first copy activity, check the fault tolerance option in 'settings' to log skipped fault rows.
Make sure to place your rows key column, as the first in the mapping definition.
Get the copy activity logFilePath from the activity output into a variable
Add another copy activity to load skipped rows into relational table
it source path will be the variable holds logFilePath
Set the file path type to: 'Wildcard file path'
Keep the 'Wildcard file path' empty
Will be the value in Wildcard file name
Make sure that the delimited file dataset escape character is set to quotations.
The OperationItem field of the lg file holds your record fields seperated by ,; because we placed the rowID first on mapping, it will appear first in OperationalItem as well.
Goodluck

Automate SSAS tabular model refresh in the table level

I am trying to automate SSAS tabular model refresh. The requirement is - depending on the tables chosen, the model will be refreshed only for those tables. I am looking for a way to dynamically build the script to process only the selected tables in the first step of an SQL agent job and pass that dynamically built script to next step which will be SQL Server Analysis Services Command step. Or maybe execute the script built in step 1 itself. But I am not sure how could this be achieved. Please let me know the possible ways.
Have you considered doing this through SSIS and executing the package from SQL Agent? You can use an Analysis Services Processing Task and select the tables that you want to process. If you want to do this in a more dynamic manner, the follow items outline how this can be done.
The table names that you want to process will be stored in an object variable. One option for this is to query an SSAS DMV from an Execute SQL Task for the names of that tables that will be processed and output these names into an object variable. You'll need to set the Result Set to use a full result set and map the object variable in the Result Set pane. The following command will return the unique table names (table_type filter is used to remove results prefixed with $) select table_name from $SYSTEM.DBSCHEMA_TABLES where table_catalog = 'YourTabularModel' and table_schema = 'Model' and table_type = 'SYSTEM TABLE'
If you will be using SSAS DMVs then create an OLE DB connection manager using Microsoft OLE DB Provider for Analysis Services 13.0 as the provider. Make sure to set the initial catalog to the SSAS model with the tables that will be processed.
Add a Foreach ADO Enumerator Loop that will use the object variable as the source variable in the Collection pane. In the Variable Mappings pane, add a variable to store the table name.
Inside the Foreach Loop, add an Analysis Services Execute DDL Task.
Create a string variable with an expression that is the SSAS process command for the table. In the expression replace the table field (assuming you're using TMSL) with the variable holding the table name.

How to populated the table via Pentaho Data Integration's table_output step?

I am performing an ETL job via Pentaho 7.1.
The job is to populate a table 'PRO_T_TICKETS' in PostgreSQL 9.2 via the Pentaho Jobs and transformations?
I have mapped the table fields with respect to the stream fields
Mapped Fields
My Table PRO_T_TICKETS contains the Schema (Column Names) in UPPERCASE.
Is this the reason I can't populate the table PRO_T_TICKETS with my ETL Job?
I duplicated the step TABLE_OUTPUT to PRO_T_TICKETS and changed the Target table field to 'PRO_T_TICKETS2'. Pentaho created a new table with lowercase schema and populated the data in it.
But I want this data to be uploaded in the table PRO_T_TICKETS only and with the UPPERCASE schema if possible.
I am attaching the whole job here and the error thrown by Pentaho. Pentaho Error I have also tried my query by adding double quotes to the column names as you can see in the error. But it didn't help.
What do you think I should do?
When you create (or modify) the connection, select Advanced on the left panel and click on the Force to upper case or Force to lower case or, even better, Preserve case of reserved words.
To know which option to choose, copy the 4th line of your error log, the line starting with INSERT INTO "public"."PRO_T_TICKETS("OID"... in your SQL-developer tool and change the connection advanced parameters until it works.
Also, at debug time, don't use batch updates, don't use lazy conversion on previous steps, and try with one (1) field rather than all (25).
Just as a complement: it worked for me following the tips from AlainD and using specific configurations that I'd like to share with you. I have a transformation streaming data from MySQL to PostgreSQL using a Table Input and Output. In both of DBs I have uppercase objects.
I did the following steps to work in the right way:
In the table input (MySQL) the objects are uppercase too, but I typed in lowercase and it worked and I didn't set any special option in the DB Connection.
In the table output (PostgreSQL) I typed everything in uppercase (schema, table name and columns) and I also set "specify the database fields" (clicking on "Get fields").
In the target DB Connection (PostgreSQL) I put the options (in "Advanced" section): "Quote all in database" and "Preserve case of reserved words".
PS: Ah, the last option is because I've found out that there was one more problem with my fields: there was a column called "Admin" (yes guys, they created a camelcase column using a reserved word!) and for that reason I must to put "Preserve case of reserved words" and type it as "Admin" (without quotes and in camelcase) in the Table Output.

How to assign csv field value to SQL query written inside table input step in Pentaho Spoon

I am pretty new to Pentaho so my query might sound very novice.
I have written a transformation in which am using CSV file input step and table input step.
Steps I followed:
Initially, I created a parameter in transformation properties. The
parameter birthdate doesn't have any default value set.
I have used this parameter in postgresql query in table input step
in the following manner:
select * from person where EXTRACT(YEAR FROM birthdate) > ${birthdate};
I am reading the CSV file using CSV file input step. How do I assign the birthdate value which is present in my CSV file to the parameter which I created in the transformation?
(OR)
Could you guide me the process of assigning the CSV field value directly to the SQL query used in the table input step without the use of a parameter?
TLDR;
I recommend using a "database join" step like in my third suggestion below.
See the last image for reference
First idea - Using Table Input as originally asked
Well, you don't need any parameter for that, unless you are going to provide the value for that parameter when asking the transformation to run. If you need to read data from a CSV you can do that with this approach.
First, read your CSV and make sure your rows are ok.
After that, use a select values to keep only the columns to be used as parameters.
In the table input, use a placeholder (?) to determine where to place the data and ask it to run for each row that it receives from the source step.
Just keep in ming that the order of columns received by the table input (the columns out of the select values) is the same order that it will be used for the placeholders (?). This should not be a problem with your question that uses only one placeholder, but keep that in mind as you ramp up using Pentaho.
Second idea, using a Database Lookup
This is another approach where you can't personalize the query made to the database and may experience a better performance because you can set a "Enable cache" flag and if you don't need to use a function on your where clause this is really recommended.
Third idea, using a Database Join
That is my recommended approach if you need a function on your where clause. It looks a lot like the Table Input approach but you can skip the select values step and select what columns to use, repeat the same column a bunch of times and enable a "outer join" flag that returns the rows without result from the query
ProTip: If you feel the transformation running too slow, try to use multiple copies from the step (documentation here) and obviously make sure the table have the appropriate indexes in place.
Yes there's a way of assigning directly without the use of parameter. Do as follows.
Use Block this step until steps finish to halt the table input step till csv input step completes.
Following is how you configure each step.
Note:
Postgres query should be select * from person where EXTRACT(YEAR
FROM birthdate) > ?::integer
Check Execute for each row and Replace variables in in Table input step.
Select only the birthday column in CSV input step.

Remove header (column names) from query result

I am using a Java based program and I am writing a simple select query inside that program to retrieve data from the PostgreSQL database. The data come with the header which is an error for the rest of my codes.
How do I get rid of all column headings in an SQL query? I just want to
print out the raw data without any headings.
I am using Building Controls Virtual Test Bed (BCVTB) to connect my database to EnergyPlus. This BCVTB has a database actor that you can write a query in it and receive data and send it to your other simulation program. I decided to use PostgreSQL. however when I write Select * From mydb, it brings data with the column names (header). I just want raw data without header. what should I do?
PostgreSQL does not send table headings, not like a CSV file. The protocol (as used via JDBC) sends the rows. The driver does request a description of the rows that includes column names, but it is not part of the result set rows like the "header first" convention for CSV.
Whatever is happening must be a consequence of the BCVTB tools you are using, and I suggest pursuing it on that side of things.