Import Excel data into PowerShell variables

I have an Excel File which has an unknown number of records in it, and these 3 columns:
Variable Name, Store Number, Email Address
I use this in QlikView to import data for certain stores and then create a separate report for each store in the list. I then need to email each report to each individual store (store number will be in the report file name).
So in PowerShell I would like to read the Excel File and set variables for each store:
$Store1 = The Store Number in Row 2 of the Excel File
$Store1Email = The Store Email in Row 2 of the Excel File
$Store2 = The Store Number in Row 3 of the Excel File
$Store2Email = The Store Email in Row 3 of the Excel File
etc. for each store in the file (there can be any number of stores).
Please note the "Variable Name" column in the Excel file must be ignored (that is for QlikView), and the PowerShell variables must be named as per my examples above, incrementing the number each time.

Check out my PowerShell Excel Module on GitHub. You can also grab it from the PowerShell Gallery.
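If you don't have the module yet, installing from the Gallery should be a one-liner (the Gallery name is ImportExcel):
Install-Module -Name ImportExcel -Scope CurrentUser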
$stores = Import-Excel C:\Temp\store.xlsx
$stores[2].Name # third data row (indexing is zero-based)
$stores[2].StoreNumber
$stores[2].EmailAddress
''
'All stores'
'----------'
$stores

OK, first off: if you are going to be working with actual .XLS, .XLSX, or .XLSM files, I would highly suggest using the Import-XLS function from the TechNet Gallery (found here).
After that, just reference the object it imports to send the emails instead of making objects for each store. Such as:
$StoreList = Import-XLS <path to Excel file>
Get-ChildItem <report folder> | ForEach-Object {
$Current = $_
$Store = $StoreList | Where-Object { $_.StoreNumber -match $Current.BaseName } | Select-Object -ExpandProperty StoreNumber
$Email = $StoreList | Where-Object { $_.StoreNumber -match $Current.BaseName } | Select-Object -ExpandProperty StoreEmail
<code to send $Current to $Email>
}
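For the send step itself, here is a minimal sketch using Send-MailMessage; the SMTP server, sender address, and subject are placeholder assumptions to adjust for your environment:
# hypothetical SMTP settings; replace with your own
Send-MailMessage -SmtpServer 'smtp.example.com' -From 'reports@example.com' -To $Email -Subject "Report for store $Store" -Attachments $Current.FullName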

My preference is to save the Excel file as a '.csv' type; comma-separated values can easily be imported into PowerShell.
$csvFile = Import-Csv -Path c:\scripts\temp\excelFile.csv
#the entire Excel '.csv' file is now stored in the $csvFile variable
$csvFile |Get-Member
#look at the properties
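Once you know the property names, each row is just an object you can read directly. For example (column names assumed from the question's spreadsheet):
$csvFile | ForEach-Object {
"Store $($_.'Store Number') -> $($_.'Email Address')"
}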
Remember to study the greats so your PowerShell script looks great: Jeffrey Snover, Jason Hicks, Don Jones, Ashley McGlone, and anyone on their friends list, ha ha.

The above answers usually work, but I just had a project with Excel datasheets that caused some problems.
Edit: here's a much more advanced version that pulls the data into an object, can handle blank and duplicate column names, and can skip human-readable preamble at the top of the worksheet by looking for something in the header row. I've also included some example usages.
Your example:
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls")
$Store1 = $file.data[0]."Store Number" #first row, column named "Store Number"
$Store1Email = $file.data[0]."Store Email" #first row, column named "Store Email"
foreach ($row in $file.data)
{
write-host "Store: $($row."Store Number")"
write-host "Store Email: $($row."Store Email")"
}
Example 1:
# Simplest example
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls")
$file.data[0]
Example 2:
#advanced usage
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.header_contains="First Name" # if included, everything before the first line containing this is dropped; useful when there are instructions for humans in the worksheet
$file.indexer_column = 5 # Default: 1 (first column); This column's contents will set the minimum number of rows, use if there are blank rows in your file but more data after them
$file.worksheet_index = "January" # Default: 1; can be a sheet index or sheet name
$file.filename = "c:\folder\file.xls" #can set this independently, useful for validation and troubleshooting
$file.from_excel() #This is where we actually pull from excel
$collected = $file.data | Out-GridView -PassThru #a neat way to hand-pick the rows you want
$file.headers.count # It stores an array of the headers here, useful for troubleshooting and advanced logic
Excel Reader pseudoclass
$file_template = {
# -- universal --
$filename = ""
$delimiter = ","
$headers = @()
$data = @()
# -- used by some functions --
# we put these here to allow assigning them before calling functions, which improves readability and auditability
$header_contains=""
$indexer_column=1
$worksheet_index=1
function from_excel(
$filename=$this.filename,
$worksheet_index=$this.worksheet_index
)`
{
$this.filename = $filename
$this.worksheet_index = $worksheet_index
$data_by_row = $this.from_excel_as_csv() # $data_by_row = $file.from_excel_as_csv($test_file)
$data_by_row = $data_by_row -split "`n"
#if ($this.headers.count -lt 1) {$this.headers = $data_by_row[0] -split $this.delimiter} #this would let us set headers elsewhere which is more flexible but less adaptive, Because columns change unpredicably we need something more adaptive
$temp_headers = $data_by_row[0] -split $this.delimiter
$temp_headers = $this.fix_blank_headers($temp_headers)
$this.headers = $this.dedupe_headers($temp_headers)
$this.data = $data_by_row|select -Skip 1|ConvertFrom-Csv -Header $this.headers -Delimiter $this.delimiter
}
function from_csv($filename=$this.filename)`
{
$this.filename = $filename
$this.headers = (Get-Content $this.filename -ReadCount 1|select -first 1) -split $this.delimiter
$this.data = Get-Content $this.filename|ConvertFrom-Csv -Delimiter $this.delimiter
}
function from_excel_as_csv(
$filename=$this.filename,
$worksheet_index=$this.worksheet_index
)`
{
$this.filename = $filename
$this.worksheet_index = $worksheet_index
#set up excel
Write-Host "Importing from excel, this may take a little while..."
$excel = New-Object -ComObject Excel.Application
$excel.DisplayAlerts = $false
$excel.Visible = $false
$workbook = $excel.workbooks.open($this.filename)
$worksheet = $workbook.Worksheets.Item($this.worksheet_index)
#import from excel
try{
$data_by_row = ""
$indexed_column = $worksheet.columns.item($this.indexer_column).value2 #we use this to work around some files having headers with blank space
$minimum_rows = (($indexed_column -join "◘").TrimEnd("◘") -split "◘").count # This Strips the million or so extra blank rows excel appends to get a realistic column length.
[bool]$header_found = 0
$i=1
do `
{
$row = $worksheet.rows.item($i).value2
$row_as_text = $row -join "◘" # ◘ (alt+8) is just a placeholder that's unlikely to show up in the text
$row_as_text = $row_as_text -replace $this.delimiter,"."
$row_as_text = $row_as_text.TrimEnd("◘")
$row_as_text = $row_as_text -replace "◘",$this.delimiter
if ($row_as_text -like "*$($this.header_contains)*"){[bool]$header_found=1}
if ($header_found) {$data_by_row+="$row_as_text`n"}
$i++
}
while ( ($row_as_text.Length -gt 1) -or ($i -lt $minimum_rows) )
}
catch {Write-Warning "ERROR Importing from excel"}
#close excel
$workbook.Close()
$excel.Quit()
write-host "Done importing from excel"
return $data_by_row
}
function dedupe_headers($headers){
$dupes = ($headers|group)|?{$_.count -gt 1}
if ($dupes.count -ge 1)
{
foreach ($dupe in $dupes)
{ #$dupe = $dupes[0]
$i=1
$new_headers = @()
foreach ($header in $headers)
{ #$header = $headers[0]
if ($header -eq $dupe.name)
{
$header = "$($header)_$($i)" # "header_#"
$i++
}
$new_headers += $header
}
$headers = $new_headers # carry the renames forward so multiple duplicated names are all handled
}
}
else {$new_headers = $headers} # no duplicates found
return $new_headers
}
function fix_blank_headers($headers)
{
$replace_blanks_with = "_"
$new_headers = @()
foreach ($header in $headers)
{
if ($header -eq "") {$new_headers += $replace_blanks_with}
else {$new_headers += $header}
}
if ($new_headers.count -ne $headers.count)
{
$error_json = @($headers),@($new_headers)|ConvertTo-Json -Compress
Write-Error "Error when fixing blank headers, original and new counts are different $($error_json)"
}
return $new_headers
}
<# function some_function($some_parameter){return $some_parameter} #>
Export-ModuleMember -Function * -Variable *
}

Forgive the ugliness here. I am not a programmer, so there are undoubtedly more optimized ways to do this, as well as better formatting. It will work, however, if I understand your requirements correctly.
$excelfile = import-csv "c:\myfile.csv"
$i = 1
$excelfile | ForEach-Object {
New-Variable "Store$i" $_."Store Number"
$iemail = $i.ToString() + "Email"
New-Variable "Store$iemail" $_."Email Address"
$i ++
}
Edit: as per the reply to your original post, this works with a CSV file. Just save the Excel file to CSV first if necessary.
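If you'd rather script that conversion too, here's a rough sketch using the Excel COM object; the paths are illustrative and 6 is the xlCSV file-format constant:
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$excel.DisplayAlerts = $false
$workbook = $excel.Workbooks.Open('C:\myfile.xlsx')
$workbook.SaveAs('C:\myfile.csv', 6) # 6 = xlCSV
$workbook.Close($false)
$excel.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel) | Out-Null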

$excelfile = import-csv "C:\Temp\store.csv"
$i = 1
$excelfile | ForEach-Object {
$NA= $_."Name"
$SN= $_."StoreNumber"
Write-Output "row $i"
$NA
$SN
$i++ }

Related

How to split through the whole list using PowerShell

In my CSV file I have a "SharePoint Site" column and a few other columns. I'm trying to split the ID out of the "SharePoint Site" column and put it in a new column called "SharePoint ID", but I'm not sure how to do it, so I'd really appreciate any help or suggestions.
$downloadFile = Import-Csv "C:\AuditLogSearch\New folder\Modified-Audit-Log-Records.csv"
(($downloadFile -split "/") -split "_") | Select-Object -Index 5
CSV file
SharePoint Site
Include:[https://companyname-my.sharepoint.com/personal/elksn7_nam_corp_kl_com]
Include:[https://companyname-my.sharepoint.com/personal/tzksn_nam_corp_kl_com]
Include:[https://companyname.sharepoint.com/sites/msteams_c578f2/Shared%20Documents/Forms/AllItems.aspx?id=%2Fsites%2Fmsteams%5Fc578f2%2FShared%20Documents%2FBittner%2DWilfong%20%2D%20Litigation%20Hold%2FWork%20History&viewid=b3e993a1%2De0dc%2D4d33%2D8220%2D5dd778853184]
Include:[https://companyname.sharepoint.com/sites/msteams_c578f2/Shared%20Documents/Forms/AllItems.aspx?id=%2Fsites%2Fmsteams%5Fc578f2%2FShared%20Documents%2FBittner%2DWilfong%20%2D%20Litigation%20Hold%2FWork%20History&viewid=b3e993a1%2De0dc%2D4d33%2D8220%2D5dd778853184]
Include:[All]
After splitting, it should show up under a new column called "SharePoint ID":
SharePoint ID
2. elksn
3. tzksn
4. msteams_c578f2
5. msteams_c578f2
6. All
Try this:
# Import csv into an array
$Sites = (Import-Csv C:\temp\Modified-Audit-Log-Records.csv).'SharePoint Site'
# Create Export variable
$Export = @()
# ForEach loop that goes through the SharePoint sites one at a time
ForEach($Site in $Sites){
# Clean up the input to leave only the hyperlink
$Site = $Site.replace('Include:[','')
$Site = $Site.replace(']','')
# Split the hyperlink on '/' and take the fifth element (indexing is zero-based)
$SiteID = $Site.split('/')[4]
# The 'SharePoint Site' Include:[All] entry has no slashes, so the split above leaves $SiteID empty.
# This If statement detects when $Site is 'All' and sets $SiteID to that.
if($Site -eq 'All'){
$SiteID = $Site
}
# Create variable to export Site ID
$SiteExport = @()
$SiteExport = [pscustomobject]@{
'SharePoint ID' = $SiteID
}
# Add each SiteExport to the Export array
$Export += $SiteExport
}
# Write out the export
$Export
A concise solution that appends a Sharepoint ID column to the existing columns by way of a calculated property:
Import-Csv 'C:\AuditLogSearch\New folder\Modified-Audit-Log-Records.csv' |
Select-Object *, #{
Name = 'SharePoint ID'
Expression = {
$tokens = $_.'SharePoint Site' -split '[][/]'
if ($tokens.Count -eq 3) { $tokens[1] } # matches 'Include:[All]'
else { $tokens[5] -replace '_nam_corp_kl_com$' }
}
}
Note:
To see all resulting column values, pipe the above to Format-List.
To re-export the results to a CSV file, pipe to Export-Csv, as shown below.
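Putting both notes together, the full round trip could look like this (the output file name is illustrative):
Import-Csv 'C:\AuditLogSearch\New folder\Modified-Audit-Log-Records.csv' |
Select-Object *, @{
Name = 'SharePoint ID'
Expression = {
$tokens = $_.'SharePoint Site' -split '[][/]'
if ($tokens.Count -eq 3) { $tokens[1] }
else { $tokens[5] -replace '_nam_corp_kl_com$' }
}
} |
Export-Csv 'C:\AuditLogSearch\New folder\Modified-Audit-Log-Records-with-ID.csv' -NoTypeInformation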
You have 3 distinct patterns you are trying to extract data from. I believe regex would be an appropriate tool.
If you want the new CSV to have just the single ID column:
$file = "C:\AuditLogSearch\New folder\Modified-Audit-Log-Records.csv"
$IdList = switch -Regex -File ($file){
'Include:.+(?=/(\w+?)_)(?<=personal)' {$matches.1}
'Include:(?=\[(\w+)\])' {$matches.1}
'Include:.+(?=/(\w+?)/)(?<=sites)' {$matches.1}
}
$IdList |
ConvertFrom-Csv -Header "Sharepoint ID" |
Export-Csv -Path $newfile -NoTypeInformation
If you want to add a column to your existing CSV
$file = "C:\AuditLogSearch\New folder\Modified-Audit-Log-Records.csv"
$properties = '*', @{
Name = 'Sharepoint ID'
Expression = {
switch -Regex ($_.'sharepoint Site'){
'Include:.+(?=/(\w+?)_)(?<=personal)' {$matches.1}
'Include:(?=\[(\w+)\])' {$matches.1}
'Include:.+(?=/(\w+?)/)(?<=sites)' {$matches.1}
}
}
}
Import-Csv -Path $file |
Select-Object $properties |
Export-Csv -Path $newfile -NoTypeInformation
Regex details
.+ Match any amount of any character
(?=...) Positive look ahead
(...) Capture group
\w+ Match one or more word characters
? Lazy quantifier
(?<=...) Positive look behind
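To sanity-check what the first pattern captures, you can run it against one of the sample rows from the question:
# sample string taken from the question's data
$s = 'Include:[https://companyname-my.sharepoint.com/personal/elksn7_nam_corp_kl_com]'
$s -match 'Include:.+(?=/(\w+?)_)(?<=personal)' # True
$matches.1 # elksn7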
This would require more testing to see if it works well, but with the input we have it works. The main concept is to use System.Uri to parse the strings. From what I'm seeing, the segment you are looking for is always the third one ([2]); depending on the previous segment, you either split on _, trim the trailing /, or leave the string as-is if IsAbsoluteUri is $false.
$csv = Import-Csv path/to/test.csv
$result = foreach($line in $csv)
{
$uri = [uri]($line.'SharePoint Site' -replace '^Include:\[|]$')
$id = switch($uri)
{
{-not $_.IsAbsoluteUri} {
$_
break
}
{ $_.Segments[1] -eq 'personal/' } {
$_.Segments[2].Split('_')[0]
break
}
{ $_.Segments[1] -eq 'sites/' } {
$_.Segments[2].TrimEnd('/')
}
}
[pscustomobject]@{
'SharePoint Site' = $line.'SharePoint Site'
'SharePoint ID' = $id
}
}
$result | Format-List

Powershell-MS Word docx table to csv

I've looked at several solutions regarding this, as I'm still new to PowerShell, but the same kind of code appears everywhere. My problem is that it does not output all the contents of the Word table to CSV in the right format; only the single last column's data is output to the CSV file. I can't understand where I am wrong. Please help me out.
$objWord = New-Object -Com Word.Application
$filename = 'path to file'
$outputfile= 'path to file'
$objDocument = $objWord.Documents.Open($filename)
$Table = $objDocument.Tables.Item(1)
$TableCols = $Table.Columns.Count
$TableRows = $Table.Rows.Count
for($r=1; $r -le $TableRows; $r++) {
for($c=1; $c -le $TableCols; $c++) {
#Write-Host $r "x" $c
$content = $Table.Cell($r,$c).Range.Text
Write-Host $content
$content | Out-File $outputfile
}
}
$objDocument.Close()
$objWord.Quit()
# Stop Winword Process
$rc = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objWord)
I've tried using Export-Csv, but it only gives me the Length property, and even with -NoTypeInformation there is no usable result.
Also, how can I get the CSV file created dynamically, instead of always having to create a new empty CSV file first?
Your code currently outputs every single cell's contents to the file individually.
Creating a CSV from a Word table like this is doable, but you need to capture the cell contents for each row in an array variable first and join the elements with a comma.
Then output the row.
For safety, quote every cell value so that fields containing a comma do not mis-align the file afterwards.
Another snag is that Word appends control characters 0x0D and 0x07 to each cell value from a table, so you need to remove those as well.
Try
$objWord = New-Object -Com Word.Application
$filename = 'D:\Test\blah.docx'
$outputfile = 'D:\Test\blah.csv'
$objDocument = $objWord.Documents.Open($filename)
$Table = $objDocument.Tables.Item(1)
$TableCols = $Table.Columns.Count
$TableRows = $Table.Rows.Count
# this gets the list separator character your local Excel expects when double-clicking a CSV file
$delimiter = [cultureinfo]::CurrentCulture.TextInfo.ListSeparator
for($r = 1; $r -le $TableRows; $r++) {
# capture an array of cell contents
$content = for($c = 1; $c -le $TableCols; $c++) {
# surround each value with quotes to prevent fields that contain the delimiter character would ruin the csv,
# double any double-quotes the value may contain,
# remove the control characters (0x0D 0x07) Word appends to the cell text
# trim the resulting value from leading or trailing whitespace characters
'"{0}"' -f ($Table.Cell($r,$c).Range.Text -replace '"', '""' -replace '[\x00-\x1F\x7F]').Trim()
}
# output this array joined with the delimiter, both on screen and to file
$content -join $delimiter | Add-Content -Path $outputfile -PassThru
}
$objDocument.Close()
$objWord.Quit()
# Stop Winword Process
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objDocument)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objWord)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Using the file you have made available, the output CSV (opened in Excel) shows the table correctly.

How to sort 30 million CSV records in PowerShell

I am using an OleDbConnection to sort the first column of a CSV file. The OleDb query completes for up to 9 million records within 6 minutes, but when I execute it against 10 million records I get the following error message:
Exception calling "ExecuteReader" with "0" argument(s): "The query cannot be completed. Either the size of the query result is larger than the maximum size of a database (2 GB), or
there is not enough temporary storage space on the disk to store the query result."
Is there any other solution to sort 30 million records using PowerShell?
Here is my script:
$OutputFile = "D:\Performance_test_data\output1.csv"
$stream = [System.IO.StreamWriter]::new( $OutputFile )
$sb = [System.Text.StringBuilder]::new()
$sw = [Diagnostics.Stopwatch]::StartNew()
$conn = New-Object System.Data.OleDb.OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source='D:\Performance_test_data\';Extended Properties='Text;HDR=Yes;CharacterSet=65001;FMT=Delimited';")
$cmd=$conn.CreateCommand()
$cmd.CommandText="Select * from 1crores.csv order by col6"
$conn.open()
$data = $cmd.ExecuteReader()
echo "Query has been completed!"
$stream.WriteLine( "col1,col2,col3,col4,col5,col6")
while ($data.read())
{
$stream.WriteLine( $data.GetValue(0) +',' + $data.GetValue(1)+',' + $data.GetValue(2)+',' + $data.GetValue(3)+',' + $data.GetValue(4)+',' + $data.GetValue(5))
}
echo "data written successfully!!!"
$stream.close()
$sw.Stop()
$sw.Elapsed
$cmd.Dispose()
$conn.Dispose()
You can try using this:
$CSVPath = 'C:\test\CSVTest.csv'
$Delimiter = ';'
# list we use to hold the results
$ResultList = [System.Collections.Generic.List[Object]]::new()
# Create a stream (I use OpenText because it returns a streamreader)
$File = [System.IO.File]::OpenText($CSVPath)
# Read and parse the header
$HeaderString = $File.ReadLine()
# Get the properties from the string, replace quotes
$Properties = $HeaderString.Split($Delimiter).Replace('"',$null)
$PropertyCount = $Properties.Count
# now read the rest of the data, parse it, build an object and add it to a list
while ($File.EndOfStream -ne $true)
{
# Read the line
$Line = $File.ReadLine()
# split the fields and replace the quotes
$LineData = $Line.Split($Delimiter).Replace('"',$null)
# Create a hashtable with the properties (we convert this to a PSCustomObject later on). I use an ordered hashtable to keep the order
$PropHash = [System.Collections.Specialized.OrderedDictionary]@{}
# for loop to add the properties and values
for ($i = 0; $i -lt $PropertyCount; $i++)
{
$PropHash.Add($Properties[$i],$LineData[$i])
}
# Now convert the data to a PSCustomObject and add it to the list
$ResultList.Add($([PSCustomObject]$PropHash))
}
# Now you can sort this list using Linq:
Add-Type -AssemblyName System.Linq
# Sort using propertyname (my sample data had a prop called "Name")
$Sorted = [Linq.Enumerable]::OrderBy($ResultList, [Func[object,string]] { $args[0].Name })
Instead of using Import-Csv I've written a quick parser which uses a StreamReader and parses the CSV data on the fly, putting it into a PSCustomObject.
This is then added to a list.
Edit: fixed the LINQ sample.
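One thing to keep in mind: OrderBy is deferred, so nothing is actually sorted until $Sorted is enumerated, for example by piping it onward (the output path here is illustrative):
# enumerating the result is what triggers the sort
$Sorted | Export-Csv 'C:\test\CSVTest-sorted.csv' -NoTypeInformation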
Putting the performance aside and at least coming to a solution that works (meaning one that doesn't hang due to memory shortage), I would rely on the PowerShell pipeline. The issue is, though, that for sorting you need to stall the pipeline, as the last object might potentially become the first object.
To resolve this, I would first do a coarse division on the first character(s) of the property concerned. Once that is done, fine-sort each coarse division and append the results:
Function Sort-BigObject {
[CmdletBinding()] param(
[Parameter(ValueFromPipeLine = $True)]$InputObject,
[Parameter(Position = 0)][String]$Property,
[ValidateRange(1,9)]$Coarse = 1,
[System.Text.Encoding]$Encoding = [System.Text.Encoding]::Default
)
Begin {
$TemporaryFiles = [System.Collections.SortedList]::new()
}
Process {
if ($InputObject.$Property) {
$Grain = $InputObject.$Property.SubString(0, $Coarse)
if (!$TemporaryFiles.Contains($Grain)) { $TemporaryFiles[$Grain] = New-TemporaryFile }
$InputObject | Export-Csv $TemporaryFiles[$Grain] -Encoding $Encoding -Append
} else { $InputObject } # objects with an empty sort property are passed through immediately
}
End {
Foreach ($TemporaryFile in $TemporaryFiles.Values) {
Import-Csv $TemporaryFile -Encoding $Encoding | Sort-Object $Property
Remove-Item -LiteralPath $TemporaryFile
}
}
}
Usage
(Don't assign the stream to a variable and don't use parentheses.)
Import-Csv .\1crores.csv | Sort-BigObject <PropertyName> | Export-Csv .\output.csv
If the temporary files still get too big to handle, you might need to increase the -Coarse parameter.
Caveats (improvement considerations)
Objects with an empty sort property will be output immediately.
The sort column is presumed to be a (single) string column.
I presume the performance is poor (I didn't do a full test on 30 million records, but 10,000 records take about 8 seconds, which suggests roughly 8 hours). Consider replacing native PowerShell cmdlets with .NET streaming methods, buffering/caching file input and output, or parallel processing.
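As a sketch of what replacing the cmdlets with .NET streaming could look like for the per-file fine sort, assuming the sort key is the first column and the values contain no quoted delimiters (file names are hypothetical):
$header = $null
$lines = [System.Collections.Generic.List[string]]::new()
foreach ($line in [System.IO.File]::ReadLines('C:\temp\grain.csv')) {
if ($null -eq $header) { $header = $line; continue } # keep the header line aside
$lines.Add($line)
}
# sort the raw lines on the first field instead of rehydrating objects
$sorted = $lines | Sort-Object { ($_ -split ',')[0] }
[System.IO.File]::WriteAllLines('C:\temp\grain.sorted.csv', [string[]](@($header) + $sorted))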
You could try SQLite:
$OutputFile = "D:\Performance_test_data\output1.csv"
$sw = [Diagnostics.Stopwatch]::StartNew()
sqlite3 output1.db '.mode csv' '.import 1crores.csv 1crores' '.headers on' ".output $OutputFile" 'Select * from 1crores order by 最終アクセス日時'
echo "data written successfully!!!"
$sw.Stop()
$sw.Elapsed
I have added a new answer as this is a completely different approach to tackling this issue.
Instead of creating temporary files (which presumably causes a lot of file opens and closes), you might consider creating an ordered list of indices and then going over the input file (-InputFile) multiple times, each time processing a selective number of lines (-BufferSize = 1Gb; you might have to tweak this "memory usage vs. performance" parameter):
Function Sort-Csv {
[CmdletBinding()] param(
[string]$InputFile,
[String]$Property,
[string]$OutputFile,
[Char]$Delimiter = ',',
[System.Text.Encoding]$Encoding = [System.Text.Encoding]::Default,
[Int]$BufferSize = 1Gb
)
Begin {
if ($InputFile.StartsWith('.\')) { $InputFile = Join-Path (Get-Location) $InputFile }
$Index = 0
$Dictionary = [System.Collections.Generic.SortedDictionary[string, [Collections.Generic.List[Int]]]]::new()
Import-Csv $InputFile -Delimiter $Delimiter -Encoding $Encoding | Foreach-Object {
if (!$Dictionary.ContainsKey($_.$Property)) { $Dictionary[$_.$Property] = [Collections.Generic.List[Int]]::new() }
$Dictionary[$_.$Property].Add($Index++)
}
$Indices = [int[]]($Dictionary.Values | ForEach-Object { $_ })
$Dictionary = $Null # we only need the sorted index list
}
Process {
$Start = 0
$ChunkSize = [int]($BufferSize / (Get-Item $InputFile).Length * $Indices.Count / 2.2)
While ($Start -lt $Indices.Count) {
[System.GC]::Collect()
$End = $Start + $ChunkSize - 1
if ($End -ge $Indices.Count) { $End = $Indices.Count - 1 }
$Chunk = @{}
For ($i = $Start; $i -le $End; $i++) { $Chunk[$Indices[$i]] = $i }
$Reader = [System.IO.StreamReader]::new($InputFile, $Encoding)
$Header = $Reader.ReadLine()
$Count = 0
For ($i = 0; ($Line = $Reader.ReadLine()) -and $Count -lt $ChunkSize; $i++) {
if ($Chunk.Contains($i)) { $Chunk[$i] = $Line; $Count++ } # count the filled slots so reading can stop early
}
$Reader.Dispose()
if ($OutputFile) {
if ($OutputFile.StartsWith('.\')) { $OutputFile = Join-Path (Get-Location) $OutputFile }
$Writer = [System.IO.StreamWriter]::new($OutputFile, ($Start -ne 0), $Encoding)
if ($Start -eq 0) { $Writer.WriteLine($Header) }
For ($i = $Start; $i -le $End; $i++) { $Writer.WriteLine($Chunk[$Indices[$i]]) }
$Writer.Dispose()
} else {
$Start..$End | ForEach-Object { $Header } { $Chunk[$Indices[$_]] } | ConvertFrom-Csv -Delimiter $Delimiter
}
$Chunk = $Null
$Start = $End + 1
}
}
}
Basic usage
Sort-Csv .\Input.csv <PropertyName> -Output .\Output.csv
Sort-Csv .\Input.csv <PropertyName> | ... | Export-Csv .\Output.csv
Note that for 1crores.csv it will probably just export the full file at once unless you set -BufferSize to a lower amount, e.g. 500Kb.
I downloaded GNU sort.exe from here: http://gnuwin32.sourceforge.net/packages/coreutils.htm (it also requires libiconv2.dll and libintl3.dll from the dependency zip). I basically ran it within cmd.exe, and it used a little less than a gig of RAM and took about 5 minutes on a 500 MB file of about 30 million random numbers. This command can also merge sorted files with --merge, and you can specify begin and end key positions for sorting with --key. It automatically uses temp files.
.\sort.exe < file1.csv > file2.csv
The built-in Windows sort works in a similar way from the cmd prompt. It also has a /+n option to specify the character column at which to start the sort.
sort.exe < file1.csv > file2.csv
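For example, to start the sort at character position 10 and write straight to a file (the column number is illustrative):
sort /+10 file1.csv /O file2.csv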

Powershell Mass Rename files with a excel reference list

I need help with PowerShell.
I will have to start renaming files on a weekly basis, more than 100 a week, each with a dynamic name.
The files I want to rename are in a folder named Scans, located at "C:\Documents\Scans", and they are in order of, say, time scanned.
I have an Excel file located at "C:\Documents\Mapping\New File Name.xlsx".
The workbook has only one sheet, and the new names are in column A with x rows. As mentioned above, each cell will have a different value.
Please add comments to your suggestions so that I may understand what is going on, since I'm new to coding.
Thank you all for your time and help.
Although I agree with Ad Kasenally that it would be easier to use CSV files, here's something that may work for you.
$excelFile = 'C:\Documents\Mapping\New File Name.xlsx'
$scansFolder = 'C:\Documents\Scans'
########################################################
# step 1: get the new filenames from the first column in
# the Excel spreadsheet into an array '$newNames'
########################################################
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open($excelFile)
$worksheet = $workbook.Worksheets.Item(1)
$newNames = @()
$i = 1
while ($worksheet.Cells.Item($i, 1).Value() -ne $null) {
$newNames += $worksheet.Cells.Item($i, 1).Value()
$i++
}
$excel.Quit()
# IMPORTANT: clean-up used Com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
########################################################
# step 2: rename the 'scan' files
########################################################
$maxItems = $newNames.Count
if ($maxItems) {
$i = 0
Get-ChildItem -Path $scansFolder -File -Filter 'scan*' | # get a list of FileInfo objects in the folder
Sort-Object { [int]($_.BaseName -replace '\D+', '') } | # sort by the numeric part of the filename
Select-Object -First ($maxItems) | # select no more than there are items in the $newNames array
ForEach-Object {
try {
Rename-Item -Path $_.FullName -NewName $newNames[$i] -ErrorAction Stop
Write-Host "File '$($_.Name)' renamed to '$($newNames[$i])'"
$i++
}
catch {
throw
}
}
}
else {
Write-Warning "Could not get any new filenames from the $excelFile file."
}
You may want to have 2 columns in the Excel file:
original file name
target file name
From there you can save the file as a CSV.
Use Import-Csv to pull the data into PowerShell and a ForEach loop to cycle through each row with a command like move $item.original $item.target, as sketched below.
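A minimal sketch of that loop, assuming a mapping CSV with columns named original and target (the file names are hypothetical):
# rename each scan according to the mapping file
$map = Import-Csv 'C:\Documents\Mapping\rename-map.csv'
foreach ($item in $map) {
Rename-Item -Path (Join-Path 'C:\Documents\Scans' $item.original) -NewName $item.target
}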
There are abundant threads describing using import-csv with forEach.
Good luck.

Get IIS log location via powershell?

I'm writing a script that I'd like to be able to easily move between IIS servers to analyze logs, but these servers store the logs in different places. Some on C:/ some on D:/ some in W3SVC1, some in W3SVC3. I'd like to be able to have powershell look this information up itself rather than having to manually edit this on each server. (Yeah, I'm a lazy sysadmin. #automateallthethings.)
Is this information available to PowerShell if I maybe pass the domain to it or something?
I found this to work for me, since I want to know all of the sites' log directories.
Import-Module WebAdministration
foreach($WebSite in $(get-website))
{
$logFile="$($Website.logFile.directory)\w3svc$($website.id)".replace("%SystemDrive%",$env:SystemDrive)
Write-host "$($WebSite.name) [$logfile]"
}
Import-Module WebAdministration
$sitename = "mysite.com"
$site = Get-Item IIS:\Sites\$sitename
$id = $site.id
$logdir = $site.logfile.directory + "\w3svc" + $id
Thanks to Chris Harris for putting the website ID idea in my head. I was able to search around better after that, and it led me to the WebAdministration module and examples of its use.
Nice... I updated your script a little bit to ask IIS for the log file location.
param($website = 'yourSite')
Import-Module WebAdministration
$site = Get-Item IIS:\Sites\$website
$id = $site.id
$logdir = $site.logfile.directory + "\w3svc" + $id
$time = (Get-Date).AddMinutes(-30)
# Location of IIS LogFile
$File = "$logdir\u_ex$((get-date).ToString("yyMMdd")).log"
# Get-Content gets the file; pipe to Where-Object to filter out the directive lines (they start with #)
$Log = Get-Content $File | where {$_ -notLike "#[D,S-V]*" }
# Replace unwanted text in the line containing the columns.
$Columns = (($Log[0].TrimEnd()) -replace "#Fields: ", "" -replace "-","" -replace "\(","" -replace "\)","").Split(" ")
# Count available Columns, used later
$Count = $Columns.Length
# Keep only the rows of interest (here: HTTP 500 responses); this also strips the repeated header rows that appear after an iisreset
$Rows = $Log | where {$_ -like "*500 0 0*"}
# Create an instance of a System.Data.DataTable
#Set-Variable -Name IISLog -Scope Global
$IISLog = New-Object System.Data.DataTable "IISLog"
# Loop through each Column, create a new column through Data.DataColumn and add it to the DataTable
foreach ($Column in $Columns) {
$NewColumn = New-Object System.Data.DataColumn $Column, ([string])
$IISLog.Columns.Add($NewColumn)
}
# Loop Through each Row and add the Rows.
foreach ($Row in $Rows) {
$Row = $Row.Split(" ")
$AddRow = $IISLog.newrow()
for($i=0;$i -lt $Count; $i++) {
$ColumnName = $Columns[$i]
$AddRow.$ColumnName = $Row[$i]
}
$IISLog.Rows.Add($AddRow)
}
$IISLog | select @{n="DateTime"; e={Get-Date ("$($_.date) $($_.time)")}},csuristem,scstatus | ? { $_.DateTime -ge $time }