Hope someone can offer a suggestion to help me speed up a PowerShell script. What I am doing is reading in hundreds of CSV files, parsing the information to get data about missing entries, and then writing that output to an HTML file. Here is the loop that I am using to process the files:
ForEach ($Filename in $FileList) {
$CustTemp = import-csv "$FilePath\$Filename"
$CustName = $CustTemp[0].CustName
Write-Host "Reading data for $CustName"`r
For ($counter=0;$counter -lt 31;$counter++){
$CheckDate = (get-date).AddDays(-$counter)
$CheckShortDate = $CheckDate.ToShortDateString()
$TempData = import-csv "$FilePath\$Filename" | Select FileName,FileDate | where {$_.FileDate -eq $CheckShortDate}
If ($TempData -eq $null) {
$row = "No file found for $CheckShortDate for $CustName"
$HTMLReportItems += $row
}
$HTMLReportItems = $HTMLReportItems | ConvertTo-Html -Fragment
}
}
This loop worked fine when I was testing with a few CSV files but when running it against a large number of files (300+) the loop is taking an extremely long time to complete for each file (30s-1m). I'm pretty sure the reason why is that the CSV file is being accessed 30 times per iteration. What I am hoping is that someone will have a better suggestion on how I can process the data.
You're reading $FilePath\$Filename multiple times. Read it outside the for loop and only do the filtering inside. Move the HTML generation outside the loop as well.
$HTMLReportItems = foreach ($Filename in $FileList) {
$csv = Import-Csv (Join-Path $FilePath $Filename)
$CustName = $csv[0].CustName
$data = $csv | select FileName,FileDate
Write-Host "Reading data for $CustName"
for ($counter=0;$counter -lt 31;$counter++){
$CheckShortDate = (Get-Date).AddDays(-$counter).ToShortDateString()
$TempData = $data | ? {$_.FileDate -eq $CheckShortDate}
if ($TempData -eq $null) {
"No file found for $CheckShortDate for $CustName"
}
}
}
$HTMLReportItems = $HTMLReportItems | ConvertTo-Html -Fragment
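The filter inside the loop still scans every row 31 times per file. A minimal sketch of one way to avoid that, assuming the FileDate column holds the same short-date strings that ToShortDateString() produces: collect the dates into a HashSet once, then test membership per day. The sample rows and the 'Contoso' customer name are made up for illustration.

```powershell
# Hypothetical sample rows standing in for one customer's imported CSV.
$today = Get-Date
$csv = @(
    [pscustomobject]@{ CustName = 'Contoso'; FileName = 'a.dat'; FileDate = $today.ToShortDateString() }
    [pscustomobject]@{ CustName = 'Contoso'; FileName = 'b.dat'; FileDate = $today.AddDays(-2).ToShortDateString() }
)

# Build the set of dates that are present, once per file.
$seen = [System.Collections.Generic.HashSet[string]]::new()
foreach ($row in $csv) { [void]$seen.Add($row.FileDate) }

$CustName = $csv[0].CustName

# Check each of the last 31 days with a constant-time lookup instead of a pipeline filter.
$missing = for ($counter = 0; $counter -lt 31; $counter++) {
    $CheckShortDate = $today.AddDays(-$counter).ToShortDateString()
    if (-not $seen.Contains($CheckShortDate)) {
        "No file found for $CheckShortDate for $CustName"
    }
}
```

With two of the 31 days covered by the sample rows, $missing ends up holding the remaining messages; in the real script the sample array would be replaced by the Import-Csv result.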
I'm trying to find a way to reliably replace all occurrences of a string found in a file with data from a CSV, using one column as the search pattern and the next column in the same row as the replacement. The new data is then written to a new file so as to keep the original intact. The purpose of this is to simplify exchanging IDs between environments that are hardcoded into the Master pages of a SharePoint site collection. Here's what I have so far.
$file = "C:\Users\jeffery\Documents\ids.csv"
$csv = Import-Csv -Path $file -Delimiter `,
$prd2016 = $csv.'2016 PRD ID'
$stg2016 = $csv.'2016 STG ID'
$prd2010 = $csv.'2010 PRD ID'
$srcFile = "C:\Users\jeffery\Downloads\v5.master"
$dstFile = "C:\Users\jeffery\Downloads\v6.master"
Set-Variable 2010,2016
$content = Get-Content -Path $srcFile
For($i=0; $i -lt $prd2016.Count; $i++){
Clear-Variable 2010
Clear-Variable 2016
$2010 = $prd2010[$i]
$2016 = $prd2016[$i]
$content.replace("$2016", "$2010") | Set-Content -Path $dstFile -Force
}
I've also tried nested loops and using foreach loops to no avail as of yet. Any help will be greatly appreciated. Also, here's some sample data to assist with any answers.
CSV Data:
Navigation,a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3,a66d1d48-ab5e-4aed-9eb9-e8763b88ff2a,2d3cd026-7e2a-4241-8500-abd9a83a0803
Source file data:
<WebPartPages:DataFormWebPart runat="server" IsIncluded="True" AsyncRefresh="false" NoDefaultStyle="TRUE" ViewFlag="8" Title="Navigation" PageType="PAGE_NORMALVIEW" __markuptype="vsattributemarkup" __WebPartId="{9CDA54AA-5C9F-4E62-A0D6-BE149C8B27F0}" partorder="2" id="g_9cda54aa_5c9f_4e62_a0d6_be149c8b27f0" listname="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}" pagesize="1" chrometype="None" __AllowXSLTEditing="true" WebPart="true" Height="" Width="">
<DataSources><SharePoint:SPDataSource runat="server" DataSourceMode="List" UseInternalName="true" UseServerDataFormat="true" selectcommand="<View><Query><Where><Eq><FieldRef Name="Title"/><Value Type="Text">Top Nav</Value></Eq></Where></Query></View>" id="dataformwebpart8"><SelectParameters><WebPartPages:DataFormParameter Name="ListID" ParameterKey="ListID" PropertyName="ParameterValues" DefaultValue="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}"/><asp:Parameter Name="MaximumRows" DefaultValue="1"/></SelectParameters><DeleteParameters><WebPartPages:DataFormParameter Name="ListID" ParameterKey="ListID" PropertyName="ParameterValues" DefaultValue="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}"/></DeleteParameters><UpdateParameters><WebPartPages:DataFormParameter Name="ListID" ParameterKey="ListID" PropertyName="ParameterValues" DefaultValue="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}"/></UpdateParameters><InsertParameters><WebPartPages:DataFormParameter Name="ListID" ParameterKey="ListID" PropertyName="ParameterValues" DefaultValue="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}"/></InsertParameters></SharePoint:SPDataSource></DataSources>
<datafields>#Title,Title;#Navigation,Navigation;#ID,ID;#ContentType,Content Type;#Modified,Modified;#Created,Created;#Author,Created By;#Editor,Modified By;#_UIVersionString,Version;#Attachments,Attachments;#File_x0020_Type,File Type;#FileLeafRef,Name (for use in forms);#FileDirRef,Path;#FSObjType,Item Type;#_HasCopyDestinations,Has Copy Destinations;#_CopySource,Copy Source;#ContentTypeId,Content Type ID;#_ModerationStatus,Approval Status;#_UIVersion,UI Version;#Created_x0020_Date,Created;#FileRef,URL Path;#ItemChildCount,Item Child Count;#FolderChildCount,Folder Child Count;#AppAuthor,App Created By;#AppEditor,App Modified By;</datafields>
<XSL><xsl:stylesheet xmlns:x="http://www.w3.org/2001/XMLSchema" xmlns:d="http://schemas.microsoft.com/sharepoint/dsp" version="1.0" exclude-result-prefixes="xsl msxsl ddwrt" xmlns:ddwrt="http://schemas.microsoft.com/WebParts/v2/DataView/runtime" xmlns:asp="http://schemas.microsoft.com/ASPNET/20" xmlns:__designer="http://schemas.microsoft.com/WebParts/v2/DataView/designer" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:SharePoint="Microsoft.SharePoint.WebControls" xmlns:ddwrt2="urn:frontpage:internal">
Thank you in advance for any and all help with this query.
You're jumping through some serious hoops to avoid using the objects Import-Csv gives you and foreach.
You're also overwriting the destination file every time you loop, which is unnecessary.
$srcFile = "C:\Users\jeffery.grantham\Downloads\v5.master"
$dstFile = "C:\Users\jeffery.grantham\Downloads\v6.master"
$file = "C:\Users\jeffery\Documents\ids.csv"
$csv = Import-Csv -Path $file
$content = Get-Content -Path $srcFile -Raw # Read the file into a single string.
foreach ($row in $csv) {
$content = $content.Replace($row.'2016 PRD ID', $row.'2010 PRD ID')
}
$content | Set-Content -Path $dstFile -Force
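To see the loop in isolation, here is a small in-memory sketch of the same idea. The header row is an assumption: the sample CSV data in the question has no header, so the 'Name' column and the column order are guesses based on the property names used in the script. The content string is a shortened stand-in for the master page.

```powershell
# Assumed header row; only the three ID column names appear in the original script.
$csv = @'
Name,2016 PRD ID,2016 STG ID,2010 PRD ID
Navigation,a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3,a66d1d48-ab5e-4aed-9eb9-e8763b88ff2a,2d3cd026-7e2a-4241-8500-abd9a83a0803
'@ | ConvertFrom-Csv

# Shortened stand-in for the source file, read as one string.
$content = 'listname="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}" DefaultValue="{a5a0c64c-17b1-4cba-a8ff-a6a61d8466f3}"'

# Reassign on every pass so each replacement builds on the previous one.
foreach ($row in $csv) {
    $content = $content.Replace($row.'2016 PRD ID', $row.'2010 PRD ID')
}
```

Note that the .Replace() string method is case-sensitive, so an ID that appears in upper case in the master page would not be matched by a lower-case value from the CSV.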
I found my own answer finally. The problem was that the string was getting replaced in realtime but it was not updating the variable prior to trying to write the result to the $dstFile.
So my final script looks like the following:
$file = "C:\Users\jeffery.grantham\Documents\peer-ids.csv"
$csv = Import-Csv -Path $file -Delimiter `,
$prd2016 = $csv.'2016 PRD ID'
$stg2016 = $csv.'2016 STG ID'
$prd2010 = $csv.'2010 PRD ID'
$srcFile = "C:\Users\jeffery.grantham\Downloads\v5.master"
$dstFile = "C:\Users\jeffery.grantham\Downloads\v6.master"
Set-Variable 2010,2016
$content = Get-Content -Path $srcFile
For($i=0; $i -lt $prd2016.Count; $i++){
Clear-Variable 2010
Clear-Variable 2016
$2010 = [string]$prd2010[$i]
$2016 = [string]$prd2016[$i]
$content = $content.replace("$2016", "$2010")
$content | Set-Content -Path $dstFile -Force
}
I hate when I find my own answer less than an hour after posting a question, but hopefully, finding this will help someone else in the future attempt to do the same thing or something similar.
I am new to PowerShell and to scripting in general.
I have a DataResources CSV file that contains a bunch of data that needs to be inserted into a Results.csv.
Here is my DataResources
My Result.csv looks like this.
My goal is to export a NEW CSV file, using results.csv as a template.
If a HostName/Host matches, the UA and PWD values from the ExportResources CSV file should be inserted or updated in the exported new CSV file.
After running the script I created, it only modifies 1 row, not all of them.
This is what I've done so far.
$DataSource = Import-Csv -Path D:\coding\Powershell\csv\Resources\ExportResources.csv
$DataResults = Import-Csv -Path D:\coding\Powershell\csv\Result\results.csv
foreach ($ItemDataResults in $DataResults)
{
$HostName = $ItemDataResults.HostName
$UA = $ItemDataResults.UA
$PASSWD = $ItemDataResults.PWD
}
$ItemDataSource = $DataSource | ? {$_.Host -eq $HostName}
if ($UA -eq "" -and $PASSWD -eq "")
{
$ItemDataResults.UA = $ItemDataSource.UA
$ItemDataResults.PWD = $ItemDataSource.Passwd
}
$DataResults | Export-Csv D:\coding\Powershell\csv\new.csv -NoTypeInformation
The result almost meets my expectation, but the problem is that it only fills in one hostname; the rest are empty.
The issue with your script is that you loop through $DataResults once, but you have to iterate through it once for each item in $DataSource if you use the method that you're employing. I believe that a better method is to create a hashtable from one CSV, using the host name as the key, and the whole object as the value, then loop through the second array updating the value for each host. That would look something like this:
$DataSource = Import-Csv -Path D:\coding\Powershell\csv\Resources\ExportResources.csv
$DataResults = Import-Csv -Path D:\coding\Powershell\csv\Result\results.csv
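$DataHT = @{}
$DataResults | ForEach-Object { $DataHT.Add($_.HostName,$_) }
ForEach( $Record in $DataSource ){
# Skip source hosts that have no matching row in results.csv
if ($DataHT.ContainsKey($Record.Host)) {
$DataHT[$Record.Host].UA = $Record.UA
$DataHT[$Record.Host].PWD = $Record.Passwd
}
}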
$DataHT.Values | Export-Csv D:\coding\Powershell\csv\new.csv -NoTypeInformation
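As a variation on the same lookup idea, the hashtable can be built from the source file instead, so that the results rows are updated in place and keep their original order. This is only a sketch with made-up hosts and credentials; it also keeps the question's rule of filling in only rows whose UA and PWD are both empty.

```powershell
# Hypothetical stand-ins for ExportResources.csv and results.csv.
$DataSource = @'
Host,UA,Passwd
srv01,alice,pw1
srv02,bob,pw2
'@ | ConvertFrom-Csv

$DataResults = @'
HostName,UA,PWD
srv01,,
srv02,,
srv03,carol,pw3
'@ | ConvertFrom-Csv

# Index the source rows by host name for constant-time lookups.
$lookup = @{}
foreach ($item in $DataSource) { $lookup[$item.Host] = $item }

# Fill in only the rows that have a match and no credentials yet.
foreach ($row in $DataResults) {
    $match = $lookup[$row.HostName]
    if ($match -and [string]::IsNullOrEmpty($row.UA) -and [string]::IsNullOrEmpty($row.PWD)) {
        $row.UA  = $match.UA
        $row.PWD = $match.Passwd
    }
}
```

After the loop, $DataResults can be piped to Export-Csv as in the original script; rows without a match (srv03 here) are left untouched.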
I have a text file that contains millions of records
I want to find every line that does not start with a given string, along with its line number (the string is a double quote followed by 01/01/2019).
Can you help me modify this code?
Get-Content "(path).txt" | Foreach { if ($_.Split(',')[-1] -inotmatch "^01/01/2019") { $_; } }
Thanks
Based on your comments the content will look something like the array.
So you want to read the content, filter it, and get the resulting line from that content:
# Get the content
# $content = Get-Content -Path 'pathtofile.txt'
$content = @('field1,field2,field3', '01/01/2019,b,c')
# Convert from csv
$csvContent = $content | ConvertFrom-Csv
# Add your filter based on the field
$results = $csvContent | Where-Object { $_.field1 -notmatch '01/01/2019'} | % { $_ }
# Convert your results back to csv if needed
$results | ConvertTo-Csv
If performance is an issue, then .NET can handle millions of records with CsvHelper, just like Power BI.
# install CsvHelper
nuget install CsvHelper
# import csvhelper
import-module CsvHelper.2.16.3.0\lib\net45\CsvHelper.dll
# write the content to the file just for this example
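@('field1,field2,field3', '01/01/2019,b,c') | sc -path "c:\temp\text.csv"
# use a list rather than a fixed-size array so that .Add() works below
$results = [System.Collections.Generic.List[object]]::new()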
# open the file for reading
try {
$stream = [System.IO.File]::OpenRead("c:\temp\text.csv")
$sr = [System.IO.StreamReader]::new($stream)
$csv = [CsvHelper.CsvReader]::new($sr)
# read in the records
while($csv.Read()){
# add in the result
$result= @{}
[string] $value = "";
for($i = 0; $csv.TryGetField($i, [ref] $value ); $i++) {
$result.Add($i, $value);
}
# add your filter here for the results
$results.Add($result)
}
# dispose of everything once we are done
}finally {
$stream.Dispose();
$sr.Dispose();
$csv.Dispose();
}
My .txt file looks like this...
date,col2,col3
"01/01/2019 22:42:00", "column2", "column3"
"01/02/2019 22:42:00", "column2", "column3"
"01/01/2019 22:42:00", "column2", "column3"
"02/01/2019 22:42:00", "column2", "column3"
This command does exactly what you are asking...
Get-Content -Path C:\myFile.txt | ? {$_ -notmatch "01/01/2019"} | Select -Skip 1
The output is:
"01/02/2019 22:42:00", "column2", "column3"
"02/01/2019 22:42:00", "column2", "column3"
I skipped the top row. If you want to deal with particular columns, change myFile.txt to a .csv and import it.
Looking at the question and comments, you are dealing with a headerless CSV file it seems. Because the file contains millions of records, I think using Get-Content or Import-Csv could slow down too much. Using [System.IO.File]::ReadLines() would then be faster.
If indeed each line starts with a quoted date, you could use various methods of figuring out whether the line starts with "01/01/2019 or not. Here, I use the -notlike operator:
$fileIn = "D:\your_text_file_which_is_in_fact_a_CSV_file.txt"
$fileOut = "D:\your_text_file_which_is_in_fact_a_CSV_file_FILTERED.txt"
foreach ($line in [System.IO.File]::ReadLines($fileIn)) {
if ($line -notlike '"01/01/2019*') {
# write to a NEW file
Add-Content -Path $fileOut -Value $line
}
}
Update
Judging from your comment, you are apparently using an older .NET framework, as the [System.IO.File]::ReadLines() became available as of version 4.0.
In that case, the below code should work for you:
$fileIn = "D:\your_text_file_which_is_in_fact_a_CSV_file.txt"
$fileOut = "D:\your_text_file_which_is_in_fact_a_CSV_file_FILTERED.txt"
$reader = New-Object System.IO.StreamReader($fileIn)
$writer = New-Object System.IO.StreamWriter($fileOut)
while (($line = $reader.ReadLine()) -ne $null) {
if ($line -notlike '"01/01/2019*') {
# write to a NEW file
$writer.WriteLine($line)
}
}
$reader.Dispose()
$writer.Dispose()
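The question also asks for the line number of each non-matching line, which none of the snippets above report. A minimal sketch of how that could be added to the ReadLines() loop, assuming the file has a header row as in the sample shown earlier; the temp file and its contents are only there to make the example self-contained.

```powershell
# Write the sample data to a throwaway file so the sketch can run on its own.
$fileIn = Join-Path ([System.IO.Path]::GetTempPath()) ("sketch_" + [guid]::NewGuid().ToString('N') + ".txt")
@(
    'date,col2,col3'
    '"01/01/2019 22:42:00", "column2", "column3"'
    '"01/02/2019 22:42:00", "column2", "column3"'
    '"01/01/2019 22:42:00", "column2", "column3"'
    '"02/01/2019 22:42:00", "column2", "column3"'
) | Set-Content -Path $fileIn

$lineNumber = 0
$hits = foreach ($line in [System.IO.File]::ReadLines($fileIn)) {
    $lineNumber++
    if ($lineNumber -eq 1) { continue }   # assumed header row, skipped
    if ($line -notlike '"01/01/2019*') {
        # Emit the 1-based line number together with the line itself.
        [pscustomobject]@{ LineNumber = $lineNumber; Line = $line }
    }
}

Remove-Item -Path $fileIn -ErrorAction SilentlyContinue
```

For a file with millions of rows, the objects could be written out with a StreamWriter inside the loop instead of being collected in $hits.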
I am trying to figure out a more efficient way of using Import-CSV (PowerShell) to place values into an array of CSV files. The problem is that some of these files have several hundred thousand lines, and running this script in conjunction with other lines of code appears to be a big bottleneck. Do you guys have any suggestions for how to make this code more efficient and faster?
foreach($csv in $csvfiles)
{
$csvname = $csv.name;
$paygroup = $csvname.substring(4,3);
$batch = $csvname.substring(14,4);
write-host "Writing $csvname";
$csvimportdata = Import-CSV $CurrentPath"\$csvname";
foreach($record in $csvimportdata)
{
$record.chartfield1 = $paygroup;
$record.chartfield2 = $batch;
$record.chartfield3 = $record.line_descr.substring(0,6);
}
$csvimportdata | Export-CSV $CurrentPath"\$csvname" -NoTypeInformation
};
If your CSVs are large then loading into memory is probably not a good idea. How about something like this:
foreach($csv in $csvfiles)
{
$csvname = $csv.name
$paygroup = $csvname.substring(4,3)
$batch = $csvname.substring(14,4)
write-host "Writing $csvname"
Get-Content $CurrentPath"\$csvname" -Readcount 1 | % {
# Regex below assumes a three column CSV
$_ -replace '^([^,]+,[^,]+,[^,]{6}).*$', '$1'
} | Set-Content $CurrentPath"\$csvname"
}
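One caveat with the pipeline above: it reads from and writes to the same path at the same time, which typically fails because the file is still open for reading. A sketch of a streaming alternative that writes to a temporary file and swaps it in afterwards follows. The column layout (line_descr first, then the three chartfield columns), the sample rows, and the paygroup and batch values are assumptions, and the naive Split(',') only works when no field contains a quoted comma.

```powershell
# Throwaway working folder and sample file; layout and values are hypothetical.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("csvsketch_" + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $dir | Out-Null
$inFile  = Join-Path $dir 'input.csv'
$tmpFile = Join-Path $dir 'input.csv.tmp'
@(
    'line_descr,chartfield1,chartfield2,chartfield3'
    'ABCDEF widget,,,'
    'GHIJKL gadget,,,'
) | Set-Content -Path $inFile

$paygroup = 'P01'
$batch    = '1234'

$reader = [System.IO.StreamReader]::new($inFile)
$writer = [System.IO.StreamWriter]::new($tmpFile)
try {
    # Copy the header unchanged, then rewrite one line at a time.
    $writer.WriteLine($reader.ReadLine())
    while ($null -ne ($line = $reader.ReadLine())) {
        $fields = $line.Split(',')
        $fields[1] = $paygroup
        $fields[2] = $batch
        $fields[3] = $fields[0].Substring(0, 6)
        $writer.WriteLine($fields -join ',')
    }
}
finally {
    $reader.Dispose()
    $writer.Dispose()
}

# Replace the original only after the new file has been written completely.
Move-Item -Path $tmpFile -Destination $inFile -Force
$updated = Get-Content -Path $inFile
Remove-Item -Path $dir -Recurse -Force
```

Only one line is held in memory at a time, and the original file stays intact until the rewrite has finished.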
I have an Excel File which has an unknown number of records in it, and these 3 columns:
Variable Name, Store Number, Email Address
I use this in QlikView to import data for certain stores and then create a separate report for each store in the list. I then need to email each report to each individual store (store number will be in the report file name).
So in PowerShell I would like to read the Excel File and set variables for each store:
$Store1 = The Store Number in Row 2 of the Excel File
$Store1Email = The Store Email in Row 2 of the Excel File
$Store2 = The Store Number in Row 3 of the Excel File
$Store2Email = The Store Email in Row 3 of the Excel File
etc. for each Store in the file (can be any number of stores).
Please note the "Variable Name" in the Excel file must be ignored (that is for QlikView) and the PowerShell variables must be named as per my above examples, each time incrementing the number.
Check out my PowerShell Excel Module on GitHub. You can also grab it from the PowerShell Gallery.
$stores = Import-Excel C:\Temp\store.xlsx
$stores[2].Name
$stores[2].StoreNumber
$stores[2].EmailAddress
''
'All stores'
'----------'
$stores
Ok, first off if you are going to be working with actual .XLS or .XLSX or .XLSM files I would highly suggest using the Import-XLS function from the TechNet gallery (found here).
After that, just reference the object it imports to send the emails instead of making objects for each store. Such as:
$StoreList = Import-XLS <path to Excel file>
GCI <report folder> | %{
$Current = $_
$Store = $StoreList|?{$_.StoreNumber -match $Current.BaseName}|Select -ExpandProperty StoreNumber
$Email = $StoreList|?{$_.StoreNumber -match $Current.BaseName}|Select -ExpandProperty StoreEmail
<code to send $Current to $Email>
}
My preference is to Save-As the Excel file to a '.csv' type. The comma separated value can easily be imported into PowerShell.
$csvFile = Import-Csv -Path c:\scripts\temp\excelFile.csv
#now the entire Excel '.csv' file is saved into csvFile variable
$csvFile |Get-Member
#look at the properties
Remember to study the greats so your PowerShell script looks great. Jeffrey Snover, Jason Hicks, Don Jones, Ashley McGlone, and anyone on their friends list ha ha
The above answers usually work, but I just had a project with excel datasheets that caused some problems.
edit: Here's a much more advanced version that will pull it into an object, can handle blank and duplicate column names, and can skip human information at the beginning of the worksheet by looking for something in the header row. I've also included some example usages
Your example:
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls")
$Store1 = $file.data[0]."Store Number" #first row, column named "Store Number"
$Store1Email = $file.data[0]."Store Email" #first row, column named "Store Email"
foreach ($row in $file.data)
{
write-host "Store: $($row."Store Number")"
write-host "Store Email: $($row."Store Email")"
}
Example 1:
# Simplest example
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls")
$file.data[0]
Example 2:
#advanced usage
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.header_contains="First Name" # if included it will drop everything before the first line that contains this, useful if there are instructions for humans in the worksheet
$file.indexer_column = 5 # Default: 1 (first column); This column's contents will set the minimum number of rows, use if there are blank rows in your file but more data after them
$file.worksheet_index = "January" # Default: 1; can be a sheet index or sheet name
$file.filename = "c:\folder\file.xls" #can set this independently, useful for validation and troubleshooting
$file.from_excel() #This is where we actually pull from excel
$collected = $file.data|ogv -PassThru #this is a neat way to select some rows you want
$file.headers.count # It stores an array of the headers here, useful for troubleshooting and advanced logic
Excel Reader pseudoclass
$file_template = {
# -- universal --
$filename = ""
$delimiter = ","
$headers = @()
$data = @()
# -- used by some functions --
# we put these here to allow assigning them before calling functions, which improves readability and auditability
$header_contains=""
$indexer_column=1
$worksheet_index=1
function from_excel(
$filename=$this.filename,
$worksheet_index=$this.worksheet_index
)`
{
$this.filename = $filename
$this.worksheet_index = $worksheet_index
$data_by_row = $this.from_excel_as_csv() # $data_by_row = $file.from_excel_as_csv($test_file)
$data_by_row = $data_by_row -split"`n"
#if ($this.headers.count -lt 1) {$this.headers = $data_by_row[0] -split $this.delimiter} #this would let us set headers elsewhere which is more flexible but less adaptive, Because columns change unpredicably we need something more adaptive
$temp_headers = $data_by_row[0] -split $this.delimiter
$temp_headers = $this.fix_blank_headers($temp_headers)
$this.headers = $this.dedupe_headers($temp_headers)
$this.data = $data_by_row|select -Skip 1|ConvertFrom-Csv -Header $this.headers -Delimiter $this.delimiter
}
function from_csv($filename=$this.filename)`
{
$this.filename = $filename
$this.headers = (Get-Content $this.filename -ReadCount 1|select -first 1) -split $this.delimiter
$this.data = Get-Content $this.filename|ConvertFrom-Csv -Delimiter $this.delimiter
}
function from_excel_as_csv(
$filename=$this.filename,
$worksheet_index=$this.worksheet_index
)`
{
$this.filename = $filename
$this.worksheet_index = $worksheet_index
#set up excel
Write-Host "Importing from excel, this may take a little while..."
$excel = New-Object -ComObject Excel.Application
$excel.DisplayAlerts = $false
$excel.Visible = $false
$workbook = $excel.workbooks.open($this.filename)
$worksheet = $workbook.Worksheets.Item($this.worksheet_index)
#import from excel
try{
$data_by_row = ""
$indexed_column = $worksheet.columns.item($this.indexer_column).value2 #we use this to work around some files having headers with blank space
$minimum_rows = (($indexed_column -join "◘").TrimEnd("◘") -split "◘").count # This Strips the million or so extra blank rows excel appends to get a realistic column length.
[bool]$header_found = 0
$i=1
do `
{
$row = $worksheet.rows.item($i).value2
$row_as_text = $row -join "◘" # ◘ (alt+8) is just a placeholder that's unlikely to show up in the text
$row_as_text = $row_as_text -replace $this.delimiter,"."
$row_as_text = $row_as_text.TrimEnd("◘")
$row_as_text = $row_as_text -replace "◘",$this.delimiter
if ($row_as_text -like "*$($this.header_contains)*"){[bool]$header_found=1}
if ($header_found) {$data_by_row+="$row_as_text`n"}
$i++
}
while ( ($row_as_text.Length -gt 1) -or ($i -lt $minimum_rows) )
}
catch {Write-Warning "ERROR Importing from excel"}
#close excel
$workbook.Close()
$excel.Quit()
write-host "Done importing from excel"
return $data_by_row
}
function dedupe_headers($headers){
$dupes = ($headers|group)|?{$_.count -gt 1}
if ($dupes.count -ge 1)
{
foreach ($dupe in $dupes)
{ #$dupe = $dupes[0]
$i=1
$new_headers = @()
foreach ($header in $headers)
{ #$header = $headers[0]
if ($header -eq $dupe.name)
{
$header = "$($header)_$($i)" # "header_#"
$i++
}
$new_headers += $header
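}
$headers = $new_headers # carry the renamed headers into the next pass so earlier renames are not lost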
}
}
else {$new_headers = $headers} # no duplicates found
return $new_headers
}
function fix_blank_headers($headers)
{
$replace_blanks_with = "_"
$new_headers = @()
foreach ($header in $headers)
{
if ($header -eq "") {$new_headers += $replace_blanks_with}
else {$new_headers += $header}
}
if ($new_headers.count -ne $headers.count) # compare against the original header count
{
$error_json = @($headers),@($new_headers)|ConvertTo-Json -Compress
Write-Error "Error when fixing blank headers, original and new counts are different $($error_json)"
}
return $new_headers
}
<# function some_function($some_parameter){return $some_parameter} #>
Export-ModuleMember -Function * -Variable *
}
Forgive the ugliness here. I am not a programmer, so there are undoubtedly more optimized ways to do this, as well as better formatting. It will work, however, if I understand your requirements correctly.
$excelfile = import-csv "c:\myfile.csv"
$i = 1
$excelfile | ForEach-Object {
New-Variable "Store$i" $_."Store Number"
$iemail = $i.ToString() + "Email"
New-Variable "Store$iemail" $_."Email Address"
$i ++
}
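The same numbered-variable idea can be tried without a file on disk. The sketch below uses made-up rows with the three column names from the question, and Set-Variable instead of New-Variable so that rerunning it in the same session does not fail on variables that already exist.

```powershell
# Hypothetical rows; only the column names come from the question.
$rows = @'
Variable Name,Store Number,Email Address
vStoreA,1001,store1001@example.com
vStoreB,1002,store1002@example.com
'@ | ConvertFrom-Csv

$i = 1
foreach ($row in $rows) {
    # Creates $Store1, $Store1Email, $Store2, $Store2Email, and so on.
    Set-Variable -Name "Store$i" -Value $row.'Store Number'
    Set-Variable -Name "Store${i}Email" -Value $row.'Email Address'
    $i++
}
```

Because the number of stores is not known in advance, later code has to look these up by name (for example with Get-Variable -Name "Store$n" -ValueOnly), which is why looping over the imported rows directly is usually simpler.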
edit: as per the reply to your original post, this works with a csv file. Just save it to csv first if necessary.
$excelfile = import-csv "C:\Temp\store.csv"
$i = 1
$excelfile | ForEach-Object {
$NA= $_."Name"
$SN= $_."StoreNumber"
Write-Output "row $i"
$NA
$SN
$i++ }