I am in the process of rewriting the script below to run in parallel. As you can see in the code, an array of servers is passed to the script and loaded into a hash table; the script then loops through one server at a time to do the deployment, and for each server there are files to execute in a particular order (see the array of files). Looking at the structure, I feel a workflow is the way to go here, but I could be wrong.
In my opinion, the performance gain would come from having multiple servers execute at the same time (foreach -parallel) rather than waiting for each server to complete before moving on to the next one.
I ran a test calling a function declared outside a workflow, and it worked. Is it good practice to call a function declared outside a workflow? I ask because I would like to reuse some functions outside the workflow. Or is it generally better to put all the code inside the workflow, even code that is not intended for parallel workloads, i.e. one-off calls?
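To make the question concrete, here is a minimal sketch of the pattern I am considering (the workflow name and parameters are placeholders, not my real code). Ordinary PowerShell that does not need workflow semantics can run inside an InlineScript block, which reaches workflow variables through the $Using: scope:
# Minimal sketch: fan out over servers with foreach -parallel; regular
# PowerShell runs inside InlineScript, pulling workflow variables in
# through the $Using: prefix.
Workflow Invoke-ParallelDeployment
{
    param([string[]]$Servers, [string]$Database)

    foreach -parallel ($server in $Servers)
    {
        InlineScript
        {
            # a standard (non-workflow) runspace per iteration
            "Deploying to $Using:server against $Using:Database"
        }
    }
}

Invoke-ParallelDeployment -Servers 'Server1','Server2','Server3' -Database 'Test'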
The below is the code I am testing with.
Function Check-Instance-Connection {
    param
    (
        [Parameter(Mandatory=$true,
                   ValueFromPipelineByPropertyName=$true,
                   Position=0)]
        $sql_server,

        [Parameter(Mandatory=$true,
                   ValueFromPipelineByPropertyName=$true,
                   Position=1)]
        $db_name
    )
    try
    {
        # Return extra useful info by using a custom object
        $check_outcome = "" | Select-Object -Property log_date, stage, status, error_message
        $check_outcome.log_date = (Get-Date)
        $check_outcome.stage = "Ping SQL instance for $sql_server" # double quotes so $sql_server expands

        # Test the connection for a SQL instance
        $connectionstring = "Data Source=$sql_server;Integrated Security=true;Initial Catalog=$db_name;Connect Timeout=5;"
        $sqlconnection = New-Object System.Data.SqlClient.SqlConnection $connectionstring
        $sqlconnection.Open()
        $check_outcome.status = $true
        $check_outcome.error_message = ''
        return $check_outcome
    }
    catch
    {
        $check_outcome.status = $false
        $check_outcome.error_message = $_.Exception.Message
        return $check_outcome
    }
    finally
    {
        if ($sqlconnection) { $sqlconnection.Close() } # guard: Open() may have thrown before the object existed
    }
}
$file_list = @("deployment_1.sql","deployment_2.sql","deployment_3.sql","deployment_4.sql","deployment_5.sql")
$x = (1,"Server1",3,1),(4,"Server2",6,2),(3,"Server3",4,3)
$k = 'serverid','servername','locationid','appid' # key names correspond to data positions in each array in $x

# Transpose $x into a hash table of columns, keyed by the names in $k
$h = @{}
For ($i = 0; $i -lt $x[0].Length; $i++) {
    $x | ForEach-Object {
        [array]$h.($k[$i]) += [string]$_[$i]
    }
}

$folder = "F:\Files\"
$database_name = "Test"
$all_server_ids = $h['serverid'] # the list of server ids that drives the deployment loop
foreach ($server_id in $all_server_ids)
{
    $serverid   = $h["serverid"][$all_server_ids.IndexOf($server_id)]
    $servername = $h["servername"][$all_server_ids.IndexOf($server_id)]
    $locationid = $h["locationid"][$all_server_ids.IndexOf($server_id)]

    $message = 'ServerID {0} has a servername of {1} and a location id of {2}' -f $server_id, $servername, $locationid
    Write-Output $message
    Write-Output "This $serverid and this $servername and this $locationid"

    foreach ($file in $file_list)
    {
        $is_instance_ok = Check-Instance-Connection $servername $database_name
        if ($is_instance_ok.status -eq $true) { # status is the property the function actually sets
            Invoke-Sqlcmd -ServerInstance "$servername" -InputFile $folder$file -Database "$database_name" -QueryTimeout 60 -OutputSqlErrors $true -ConnectionTimeout 10 -ErrorAction Continue -ErrorVariable generated_error | Out-Null
        }
    }
}
Thanks. I did a lot more research and looked at a lot of examples of how workflows work. This is what I have come up with:
Workflow RunExecution
{
    Function Check-Instance-Connection {
        param
        (
            [Parameter(Mandatory=$true,
                       ValueFromPipelineByPropertyName=$true,
                       Position=0)]
            $sql_server,

            [Parameter(Mandatory=$true,
                       ValueFromPipelineByPropertyName=$true,
                       Position=1)]
            $db_name
        )
        try
        {
            # Return extra useful info by using a custom object
            $check_outcome = "" | Select-Object -Property log_date, stage, status, error_message
            $check_outcome.log_date = (Get-Date)
            $check_outcome.stage = "Ping SQL instance for $sql_server"

            # Test the connection for a SQL instance
            $connectionstring = "Data Source=$sql_server;Integrated Security=true;Initial Catalog=$db_name;Connect Timeout=5;"
            $sqlconnection = New-Object System.Data.SqlClient.SqlConnection $connectionstring
            $sqlconnection.Open()
            $check_outcome.status = $true
            $check_outcome.error_message = ''
            return $check_outcome
        }
        catch
        {
            $check_outcome.status = $false
            $check_outcome.error_message = $_.Exception.Message
            return $check_outcome
        }
        finally
        {
            if ($sqlconnection) { $sqlconnection.Close() }
        }
    }

    $file_list = @("deployment_1.sql","deployment_2.sql","deployment_3.sql","deployment_4.sql","deployment_5.sql")
    $x = (1,"server1\DEV3",3,1),(4,"server1\DEV2",6,2),(3,"server2\DEV1",4,3)
    $k = 'serverid','servername','locationid','appid'

    # Transpose $x into a hash table of columns, keyed by the names in $k
    $h = @{}
    For ($i = 0; $i -lt $x[0].Length; $i++) {
        $x | ForEach-Object {
            [array]$h.($k[$i]) += [string]$_[$i]
        }
    }

    $folder = "C:\Temp\"
    $database_name = "Test"
    $all_server_ids = $h['serverid']

    foreach -parallel ($server_id in $all_server_ids)
    {
        $serverid   = $h["serverid"][$all_server_ids.IndexOf($server_id)]
        $servername = $h["servername"][$all_server_ids.IndexOf($server_id)]
        $locationid = $h["locationid"][$all_server_ids.IndexOf($server_id)]

        foreach ($file in $file_list)
        {
            # $check_fine = $is_instance_ok.check_outcome
            # if ($check_fine -eq $true) {
            Invoke-Sqlcmd -ServerInstance "$servername" -InputFile $folder$file -Database "$database_name" -QueryTimeout 60 -OutputSqlErrors $true -ConnectionTimeout 10 -ErrorAction Continue
            Write-Output "Invoke-Sqlcmd -ServerInstance $servername -InputFile $folder$file -Database $database_name -QueryTimeout 60 -OutputSqlErrors $true -ConnectionTimeout 10 -ErrorAction Continue"
            # }
        }
    }
}
RunExecution
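As a side note: on PowerShell 7+, where workflows are no longer available, the same per-server fan-out can be written with ForEach-Object -Parallel instead. A rough sketch under that assumption, assuming $h, $file_list, $folder and $database_name are built in the session the same way as inside the workflow above (the throttle value is arbitrary):
# Sketch only (assumes PowerShell 7+): caller variables enter each
# parallel runspace through the $using: prefix.
$h['serverid'] | ForEach-Object -Parallel {
    $h          = $using:h
    $folder     = $using:folder
    $database   = $using:database_name

    $idx        = $h['serverid'].IndexOf($_)
    $servername = $h['servername'][$idx]

    foreach ($file in $using:file_list) {
        Invoke-Sqlcmd -ServerInstance $servername -InputFile "$folder$file" `
                      -Database $database -QueryTimeout 60 -ConnectionTimeout 10 `
                      -ErrorAction Continue
    }
} -ThrottleLimit 4 # arbitrary: how many servers deploy at the same time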
So I'm trying to make a backup script that will download a CSV from my MSSQL server, then zip the file, then upload the backup to Amazon S3.
The issue I'm having is that the table averages 20 million lines when I run the script daily, and the script looks like it just lags forever until it completes some 20 minutes later. I was wondering if there is a way to show a progress bar for the Invoke-Sqlcmd specifically. I've done some research, and all the examples I could find show a progress bar for a loop only, not for a single command's progress.
Here is my code:
ECHO "Starting Download"
Import-Module sqlps
#$SQLquery="SELECT * FROM dbo.$PREFIX$i"
$SQLquery="SELECT * FROM dbo.events"
ECHO "Executing query = $SQLquery"
$hostname = "."
$pass = "test"
$usern = "test"
$database = "theDB"
$result=invoke-sqlcmd -ServerInstance $hostname -query $SQLquery -HostName $hostname -Password $pass -Username $usern -Database $database -verbose
#echo $result
pause
$result |export-csv -path $CSVPATH -notypeinformation
pause
ECHO "Starting Zip:"
Compress-Archive -LiteralPath $CSVPATH -CompressionLevel Optimal -DestinationPath $ZIPPATH
ECHO "Starting Delete: $CSVPATH "
del "$CSVPATH"
echo "Removed $CSVNAME"
aws s3 cp $ZIPPATH s3://test_$ZIPNAME
pause
This script works, but as I said I would like to add a progress bar to the Invoke-Sqlcmd so that it doesn't look frozen while it downloads the huge file.
This is what I could find so far, but it only tracks a loop's progression:
$VerbosePreference = "Continue"
Write-Verbose "Test Message"
for ($a=1; $a -lt 100; $a++) {
Write-Progress -Activity "Working..." -PercentComplete $a -CurrentOperation "$a% complete" -Status "Please wait."
Start-Sleep -Milliseconds 100
}
Considering your huge ~20 million record data set, it's probably a good idea to use some of the .NET classes in the System.Data.Common namespace. And I'm not sure about how Export-Csv is implemented, but System.IO.StreamWriter is very efficient for writing large files.
A simple tested/working example with inline comments:
# replace $tableName with yours
$sqlCount = "SELECT COUNT(*) FROM dbo.$($tableName)";
$sqlSelect = "SELECT * FROM dbo.$($tableName)";
$provider = [System.Data.Common.DbProviderFactories]::GetFactory('System.Data.SqlClient');
$connection = $provider.CreateConnection();
# replace $connectionString with yours, e.g.:
# "Data Source=$($INSTANCE-NAME);Initial Catalog=$($DATABASE-NAME);Integrated Security=True;";
$connection.ConnectionString = $connectionString;
$command = $connection.CreateCommand();
# get total record count for Write-Progress
$command.CommandText = $sqlCount;
$connection.Open();
$reader = $command.ExecuteReader();
$totalRecords = 0;
while ($reader.Read()) {
$totalRecords = $reader[0];
}
$reader.Dispose();
# select CSV data
$command.CommandText = $sqlSelect;
$reader = $command.ExecuteReader();
# get CSV field names
$columnNames = @();
for ($i = 0; $i -lt $reader.FieldCount; $i++) {
$columnNames += $reader.GetName($i);
}
# read and populate data one row at a time
$values = New-Object object[] $columnNames.Length;
$currentCount = 0;
# replace $CSVPATH with yours
$writer = New-Object System.IO.StreamWriter($CSVPATH);
$writer.WriteLine(($columnNames -join ','));
while ($reader.Read()) {
$null = $reader.GetValues($values);
$writer.WriteLine(($values -join ','));
if (++$currentCount % 1000 -eq 0) {
Write-Progress -Activity 'Reading data' `
-Status "Finished reading $currentCount out of $totalRecords records." `
-PercentComplete ($currentCount / $totalRecords * 100);
}
}
$command.Dispose();
$reader.Dispose();
$connection.Dispose();
$writer.Dispose();
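One design note on the example above: the modulo-1000 throttle on Write-Progress matters more than it looks. Write-Progress is comparatively expensive, and with ~20 million rows calling it once per row could easily cost more time than the query itself; refreshing every thousand rows keeps the overhead negligible.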
I have a PowerShell script to pull data from a database, but some of the fields contain commas, and that breaks up the fields because the CSV is split into fields on commas when read back. How can I change the delimiter, or otherwise keep embedded commas from breaking up the fields?
$ConnectionString = "Data Source=server1; Database=Development; Trusted_Connection=True;";
$streamWriter = New-Object System.IO.StreamWriter ".\output.csv"
$sqlConn = New-Object System.Data.SqlClient.SqlConnection $ConnectionString
$sqlCmd = New-Object System.Data.SqlClient.SqlCommand
$sqlCmd.Connection = $sqlConn
$sqlCmd.CommandText = "SELECT * FROM Development.dbo.All_Opportunities WITH(NOLOCK)"
$sqlConn.Open();
$reader = $sqlCmd.ExecuteReader();
# Initialize the array that holds the row values
$array = @()
for ( $i = 0 ; $i -lt $reader.FieldCount; $i++ )
{ $array += @($i) }
# Write Header
$streamWriter.Write($reader.GetName(0))
for ( $i = 1; $i -lt $reader.FieldCount; $i ++)
{ $streamWriter.Write($("," + $reader.GetName($i))) }
$streamWriter.WriteLine("") # Close the header line
while ($reader.Read())
{
# get the values;
$fieldCount = $reader.GetValues($array);
# add quotes if the value contains a comma or a double quote
for ($i = 0; $i -lt $array.Length; $i++)
{
    if ($array[$i] -match '[",]')
    {
        $array[$i] = '"' + $array[$i].ToString().Replace('"', '""') + '"';
    }
}
$newRow = [string]::Join(",", $array);
$streamWriter.WriteLine($newRow)
}
$reader.Close();
$sqlConn.Close();
$streamWriter.Close();
Have you read this post to see if it helps your effort? It's for a tab-delimited text file, but it could open your mind to what is possible:
'stackoverflow.com/questions/14954437/streamreader-with-tab-delimited-text-file'
FYI, there is no delimiter type called 'field'.
Otherwise, for those columns that have a comma as part of the value, the common approach is either to double-quote the value or escape it, as sketched below.
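To illustrate, here is a minimal sketch (the helper name is mine, not from the post): following the usual CSV convention (RFC 4180), wrap a field in double quotes when it contains a comma, a double quote, or a line break, and double any embedded quotes.
# Hypothetical helper sketching the standard CSV quoting rule:
# quote the field if it contains a comma, double quote, or newline,
# and double up any embedded double quotes.
function ConvertTo-CsvField {
    param([string]$Value)
    if ($Value -match '[",\r\n]') {
        return '"' + $Value.Replace('"', '""') + '"'
    }
    return $Value
}
# usage inside the row-writing loop: escape each value before joining
# $newRow = ($array | ForEach-Object { ConvertTo-CsvField "$_" }) -join ','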
I am attempting to query a CSV file using the Microsoft ACE OLEDB provider. When I add "PrctBusy > 60" to the WHERE clause, I receive the error "Data type mismatch in criteria expression." I have searched Stack Overflow and Google for solutions, and I see this is not an uncommon issue. From my reading it looks to be a data type issue: the data in the PrctBusy column is all numeric, so I think I need to force it to be treated as a number, but I have not found a solution.
Below is the code I am currently working with:
$ArrayNameUtil = "000198701258"
$CatNameUtil = "FE_DIR"
$sdLocalPath = "D:\Logs\SANData\Perf"
$InputCSV = "VMaxSANReportUtilFile.csv"
$csv = Join-Path $sdLocalPath $InputCSV
$firstRowColumnNames = "Yes"
$delimiter = ","
$provider = (New-Object System.Data.OleDb.OleDbEnumerator).GetElements() | Where-Object { $_.SOURCES_NAME -like "Microsoft.ACE.OLEDB.*" }
if ($provider -is [system.array]) { $provider = $provider[0].SOURCES_NAME } else { $provider = $provider.SOURCES_NAME }
# Data Source is the folder; the file itself is addressed as the table name below
$connstring = "Provider=$provider;Data Source=$(Split-Path $csv);Extended Properties='text;HDR=$firstRowColumnNames;';"
$tablename = (Split-Path $csv -Leaf).Replace(".","#")
$conn = New-Object System.Data.OleDb.OleDbConnection
$conn.ConnectionString = $connstring
$conn.Open()
#
$sql = "SELECT TimeStamp, count(PrctBusy) AS Above60 FROM [$tablename] WHERE array = '$ArrayNameUtil' and Category like '$CatNameUtil' and PrctBusy > 60 Group by TimeStamp "
$cmd = New-Object System.Data.OleDB.OleDBCommand
$cmd.Connection = $conn
$cmd.CommandText = $sql
$dtp = New-Object System.Data.DataTable
$dtp.Load($cmd.ExecuteReader())
Because of the pointer from TessellatingHeckler to CodeProject and some follow-on queries, I was led to http://aspdotnetcodes.com/Importing_CSV_Database_Schema.ini.aspx. I found that a schema.ini file in the same directory as the CSV file can specify the data type of each column.
The schema.ini file ended up in the following format:
[VMaxSANReportUtilFile.csv]
ColNameHeader=True
Format=CSVDelimited
Col1=Array Text Width 20
Col2=TimeStamp Text Width 20
Col3=Category Text Width 20
Col4=Instance Text Width 20
Col5=PrctBusy Short
Col6=QueUtil Short
I went through several revisions to get the data types right for the ACE OLEDB provider. Note that if the CSV has named columns, those names need to appear in the schema.ini file.
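If the CSV is regenerated regularly, the schema.ini can be written from the script itself. A small sketch under that assumption, reusing the $csv variable from the question:
# Sketch: drop schema.ini into the folder the OleDb Data Source points at,
# so the ACE text driver picks up the column types on the next connection.
$schemaPath = Join-Path (Split-Path $csv) 'schema.ini'
@"
[$(Split-Path $csv -Leaf)]
ColNameHeader=True
Format=CSVDelimited
Col1=Array Text Width 20
Col2=TimeStamp Text Width 20
Col3=Category Text Width 20
Col4=Instance Text Width 20
Col5=PrctBusy Short
Col6=QueUtil Short
"@ | Set-Content -Path $schemaPath -Encoding ASCII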
The abstracted code:
for ($i = 0; $i -le $filecount; $i++) {
    $name = $droper.Items.Item($i).text
    $copytemp = Split-Path $name.ToString() -Leaf -Resolve
    $pasteitem = $datepath + "\" + $copytemp
    $setclipboard = [System.Windows.Clipboard]::SetFileDropList($name)
    #$t = [System.IO.File]::copy(,$true)
    $t = [Microsoft.VisualBasic.FileIO.FileSystem]::CopyFile($name, $pasteitem, [Microsoft.VisualBasic.FileIO.UIOption]::AllDialogs)
}
This works perfectly, except that the copy dialog appears on every loop iteration, for every file copied.
Is there any way to have the copy dialog appear only once for the whole array/loop of files?
When in doubt, read the documentation. If you tell CopyFile() to show all dialogs ([Microsoft.VisualBasic.FileIO.UIOption]::AllDialogs) then it most certainly will do as it's told. Call CopyFile() without that option if you don't want the dialogs:
$t = [Microsoft.VisualBasic.FileIO.FileSystem]::CopyFile($name, $pasteitem)
or (better yet), do it the PoSh way:
for($i=0; $i -le $filecount; $i++) {
$name = $droper.Items.Item($i).text
Copy-Item $name "$datepath\"
}
You can add Write-Progress to the mix if you want the overall progress displayed:
for($i=0; $i -le $filecount; $i++) {
$name = $droper.Items.Item($i).text
Write-Progress -Activity 'Copying ...' -Percent ($i*100/$filecount) -Current $name
Copy-Item $name "$datepath\"
}
If you need a graphical overall progress bar you'll probably need to build it yourself. Jeffrey Hicks published an example here.
Add-Type -Assembly System.Windows.Forms
$form = New-Object Windows.Forms.Form
$form.Text = 'Copying ...'
$form.Height = 100
$form.Width = 400
$form.StartPosition = [Windows.Forms.FormStartPosition]::CenterScreen
$progress = New-Object Windows.Forms.ProgressBar
$progress.Name = 'progressBar1'
$progress.Left = 5
$progress.Top = 40
$progress.Value = 0
$progress.Style = 'Continuous'
$drawingSize = New-Object Drawing.Size
$drawingSize.Width = 360
$drawingSize.Height = 20
$progress.Size = $drawingSize
$form.Controls.Add($progress)
$form.Show()
[void]$form.Focus()
for($i=0; $i -le $filecount; $i++) {
$name = $droper.Items.Item($i).text
Copy-Item $name "$datepath\"
$progress.Value = [int]($i*100/$filecount)
$form.Refresh()
}
$form.Close()
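One small hardening worth considering (my addition, not part of the original example): wrap the copy loop in try/finally so the form is closed even if a copy fails partway through.
try {
    for ($i = 0; $i -le $filecount; $i++) {
        $name = $droper.Items.Item($i).text
        Copy-Item $name "$datepath\"
        $progress.Value = [int]($i * 100 / $filecount)
        $form.Refresh()
    }
}
finally {
    $form.Close() # runs even when Copy-Item throws a terminating error
}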