How to sort an Excel column using PowerShell

I need to sort an Excel spreadsheet by one column, in ascending numeric order (1, 2, 3, 4, 5, ...).
Does anyone know a quick and dirty way to sort an Excel column in PowerShell?

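# Release a COM reference and force garbage collection so Excel can exit
function Release-Ref ($ref) {
    ([System.Runtime.InteropServices.Marshal]::ReleaseComObject(
        [System.__ComObject]$ref) -gt 0)
    [System.GC]::Collect()
    [System.GC]::WaitForPendingFinalizers()
}
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $True
$objWorkbook = $objExcel.Workbooks.Open("C:\test\flag for errors\1921BB.xls")
$objWorksheet = $objWorkbook.Worksheets.Item(1)
$objRange = $objWorksheet.UsedRange
# Sort key: column E (Sort uses ascending order by default)
$objRange2 = $objWorksheet.Range("E1")
[void] $objRange.Sort($objRange2)
$objWorkbook.Save()
# Close the workbook and quit Excel before releasing the COM references
$objWorkbook.Close()
$objExcel.Quit()
$null = Release-Ref $objRange2
$null = Release-Ref $objRange
$null = Release-Ref $objWorksheet
$null = Release-Ref $objWorkbook
$null = Release-Ref $objExcel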
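
If the column values are already available in PowerShell (for example, read into an array), a numeric ascending sort could be sketched as below. The sample values are made up for illustration. The cast to `[double]` is an assumption that every value is numeric; without it, values held as text sort as strings, so '10' comes before '2'.

```powershell
# Hypothetical sample values, as they might come back as text from a column
$values = '10', '2', '33', '4', '1'

# Plain Sort-Object compares the strings, so '10' sorts before '2'
$asText = $values | Sort-Object

# Casting each value to a number gives the 1, 2, 3, ... order
$asNumbers = $values | Sort-Object { [double]$_ }

$asNumbers -join ', '
```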

Related

powershell rename XLSX spreadsheet columns

I have a spreadsheet with spaces in the column names. How do I replace the spaces with underscores in the column headers?
Note: I am new at this, so bear with me.
I tried the code from "Powershell: search & replace in xlsx except first 3 columns" with no luck.
Theo's code (below) works great!
Assuming your Excel file has the headers in the first row, this should work without using the ImportExcel module:
$sheetname = 'my Data'
$file = 'C:\Users\donkeykong\Desktop\1\booka.xlsx'
# create a COM Excel object
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetname)
$sheet.Activate()
# get the number of columns used
$colMax = $sheet.UsedRange.Columns.Count
# loop over the column headers and replace the whitespaces
for ($col = 1; $col -le $colMax; $col++) {
$header = $sheet.Cells.Item(1, $col).Value() -replace '\s+', '_'
$sheet.Cells.Item(1, $col) = $header
}
# close and save the changes
$workbook.Close($true)
$objExcel.Quit()
# IMPORTANT: clean-up used Com objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
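
The replacement step in the loop above can be tried on its own, without Excel. This is a minimal sketch with made-up header names; it only shows what `-replace '\s+', '_'` does to each string.

```powershell
# Hypothetical header names for illustration
$headers = 'First Name', 'Staff  ID', 'Department'

# \s+ matches one or more whitespace characters,
# so a run of spaces becomes a single underscore
$renamed = $headers -replace '\s+', '_'

$renamed
```

Applied to an array, `-replace` processes each element, so headers without whitespace pass through unchanged.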

Powershell Word Document Page Numbers

$Doc = $Word.Documents.Add();
$Section = $Doc.Sections.Item(1);
$Header = $Section.Headers.Item(1);
$Header.Range.Text = "document1_${FirstName}_${SecondName}_${StaffID}.doc";
$objRange = $Doc.Range()
$objRange.Font.Name = "Arial"
$objRange.Font.Size = 12
$Doc.SaveAs("$fileserver\document1_${FirstName}_${SecondName}_${StaffID}.docx");
$Doc.Close();
I am using this script to create Word documents and deploy them to staff areas on a file server. Is there a way to add page numbers to the documents?
The easiest way to do this is shown below:
$Word = New-Object -ComObject word.application
$word.Visible = $false
$Doc = $Word.Documents.Add()
$Section = $Doc.Sections.Item(1)
$Header = $Section.Headers.Item(1)
$Header.Range.Text = "document1_${FirstName}_${SecondName}_${StaffID}.docx"
$objRange = $Doc.Range()
$objRange.Font.Name = "Arial"
$objRange.Font.Size = 12
# add a page number (centered on the page width for this demo)
# for other values see https://learn.microsoft.com/en-us/office/vba/api/word.wdpagenumberalignment
$wdAlignPageNumberCenter = 1
[void]$Section.Footers.Item(1).PageNumbers.Add($wdAlignPageNumberCenter)
$Doc.SaveAs("$fileserver\document1_${FirstName}_${SecondName}_${StaffID}.docx")
$Doc.Close()
# IMPORTANT: clean up COM objects after use
$Word.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objRange)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Header)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Section)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Doc)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
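
For other alignments, the numeric values of Word's WdPageNumberAlignment enumeration can be kept in a lookup table instead of separate variables. The hashtable name and its keys are only a suggestion; the commented usage line assumes the `$Section` variable from the script above.

```powershell
# Values of the Word WdPageNumberAlignment enumeration
$WdPageNumberAlignment = @{
    Left    = 0  # wdAlignPageNumberLeft
    Center  = 1  # wdAlignPageNumberCenter
    Right   = 2  # wdAlignPageNumberRight
    Inside  = 3  # wdAlignPageNumberInside
    Outside = 4  # wdAlignPageNumberOutside
}

# Hypothetical usage with the $Section variable from the script above:
# [void]$Section.Footers.Item(1).PageNumbers.Add($WdPageNumberAlignment.Right)
$WdPageNumberAlignment.Right
```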

Issue with splitting Excel file (.xlsx) with PowerShell based on column values

I am using the script below, which was posted in "Split Excelfile .xlxs with Powershell based on column values". It works as described, but I am running into an issue.
The script reads one xlsx file, finds the unique values in column A, and copies each matching data set into a new file.
The issue is that out of about 4000 files, 20 are created with the full data set from the original file. The rest of the files are split and created correctly. I am not sure why this occurs. Any help is appreciated.
Function Create-Excel-Spreadsheet {
Param($NameOfSpreadsheet)
# open excel
$excel = New-Object -ComObject excel.application
$excel.visible = $true
# add a worksheet
$workbook = $excel.Workbooks.Add()
$xl_wksht= $workbook.Worksheets.Item(1)
$xl_wksht.Name = $NameOfSpreadsheet
return $workbook
}
$objexcel = New-Object -ComObject Excel.Application
$wb = $objexcel.WorkBooks.Open("C:\Temp\Test.xlsx") # Changing path for test.xlsx file.
$objexcel.Visible = $true
$objexcel.DisplayAlerts = $False
$ws = $wb.Worksheets.Item(1)
$usedRange = $ws.UsedRange
$usedRange.AutoFilter()
$totalRows = $usedRange.Rows.Count
$rangeForUnique = $usedRange.Offset(1, 0).Resize($UsedRange.Rows.Count-1)
[string[]]$UniqueListOfRowValues = $rangeForUnique.Columns.Item(1).Value2 | sort -Unique
for ($i = 0; $i -lt $UniqueListOfRowValues.Count; $i++) {
$newRange = $usedRange.AutoFilter(1, $UniqueListOfRowValues[$i])
$workbook = Create-Excel-Spreadsheet $UniqueListOfRowValues[$i]
$wksheet = $workbook.Worksheets.Item(1)
$range = $ws.UsedRange.Cells
$range.Copy()
$wksheet.Paste($wksheet.Range("A1"))
$workbook.SaveAs("C:\temp\" + $UniqueListOfRowValues[$i], $xlFixedFormat)
$workbook.Close()
}
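
As a way to sanity-check the split, the grouping step can be sketched in memory with Group-Object instead of AutoFilter. This is a different technique from the script above and is not offered as the cause of the issue; the rows and the `Key` and `Amount` property names are hypothetical stand-ins for the worksheet data, with `Key` playing the role of column A.

```powershell
# Hypothetical rows standing in for the worksheet data
$rows = @(
    [pscustomobject]@{ Key = 'North'; Amount = 10 }
    [pscustomobject]@{ Key = 'South'; Amount = 20 }
    [pscustomobject]@{ Key = 'North'; Amount = 30 }
)

# Group-Object collects the rows that share the same key value
$groups = $rows | Group-Object -Property Key

foreach ($group in $groups) {
    # Each group would become one output file; here we only report the row count
    '{0}: {1} row(s)' -f $group.Name, $group.Count
}
```

Comparing these per-key row counts with the row counts of the generated files would show which keys produced a file containing the full data set.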

How to sum an Excel column with PowerShell and print the result?

I have an Excel file with 10 columns. I want to get the sum of the column with the header "Sales" and print it to the console.
How can this be done with PowerShell? I am using the code below, but I do not know how to replace H with $i in the following expression:
='=SUM(H1:H'+$RowCount')'
where H is the "Sales" column.
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $False
$NewWorkbook = $Excel.Workbooks.open("C:\Test.xlsx")
$NewWorksheet = $NewWorkbook.Worksheets.Item(1)
$NewWorksheet.Activate() | Out-Null
$NewWorksheetRange = $NewWorksheet.UsedRange
$RowCount = $NewWorksheetRange.Rows.Count
$ColumnCount = $NewWorksheetRange.Columns.Count
for ($i = 1; $i -lt $ColumnCount; $i++)
{
if ($NewWorksheet.cells.Item(1,$i).Value2 -eq "Sales")
{
$NewWorksheet.Cells.Item($RowCount+2,$i)='=SUM(H1:H'+$RowCount')'
Write-Host $NewWorksheet.Cells.Item($RowCount+1,$i).Value2
}
}
$Excel.Application.DisplayAlerts=$False
$NewWorkbook.SaveAs("C:\Test_New.xlsx")
$NewWorkbook.close($false)
$Excel.quit()
spps -n excel
I have replaced:
$NewWorksheet.Cells.Item($RowCount+2,$i) ='=SUM(H1:H'+(1+$RowCount)+')'
Write-Host $NewWorksheet.Cells.Item($RowCount+2,$i).Value2
with:
$FirstCell = $NewWorksheet.Cells.Item(2, $i).Address($true, $false)
$LastCell = $NewWorksheet.Cells.Item(1 + $RowCount, $i).Address($true, $false)
$NewWorksheet.Cells.Item($RowCount+2,$i)='=SUM('+$FirstCell+':'+$LastCell+')'
Write-Host $NewWorksheet.Cells.Item($RowCount+2,$i).Value2
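
Another way to replace the hard-coded H is to convert the column index to its letter and build the formula string from that. The sketch below is an alternative to the Address approach, not part of the original code; `ConvertTo-ColumnLetter` is a hypothetical helper name, and the values of `$i` and `$RowCount` are made up.

```powershell
# Hypothetical helper: convert a 1-based column index to an Excel column letter
function ConvertTo-ColumnLetter {
    param([int]$Index)
    $letters = ''
    while ($Index -gt 0) {
        # Work in base 26 with digits A..Z, where A stands for 1 (not 0)
        $remainder = ($Index - 1) % 26
        $letters = [string][char](65 + $remainder) + $letters
        $Index = ($Index - 1 - $remainder) / 26
    }
    $letters
}

# Made-up values: the "Sales" column is the 8th column, the sheet has 100 used rows
$i = 8
$RowCount = 100
$col = ConvertTo-ColumnLetter -Index $i
$formula = "=SUM(${col}2:${col}${RowCount})"
$formula
```

The helper also handles columns past Z, for example index 27 gives AA.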
I can highly recommend the PowerShell module "ImportExcel". This module lets you import Excel files as easily as with Import-Csv.
Without knowing much about your files or environment, you could try something like this:
$result = 0
foreach ($data in (Import-Excel "$PSScriptRoot\test.xlsx")) {
    $result += $data.Sales
}
Write-Host $result
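
The same total can also be computed with Measure-Object instead of a loop. In this sketch the rows are hypothetical objects standing in for what Import-Excel would return, so it runs without the module or a workbook; only the `Sales` property name comes from the question.

```powershell
# Hypothetical rows standing in for the imported worksheet
$rows = @(
    [pscustomobject]@{ Product = 'A'; Sales = 120.5 }
    [pscustomobject]@{ Product = 'B'; Sales = 80 }
    [pscustomobject]@{ Product = 'C'; Sales = 99.5 }
)

# Measure-Object -Sum adds up the named property across all rows
$total = ($rows | Measure-Object -Property Sales -Sum).Sum
Write-Host $total
```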

How to Remove Duplicate rows in Excel from Powershell?

I have the following code:
$xl = New-Object -comobject Excel.Application
$xl.visible = $true
$wb = $xl.Workbooks.Add("D:temp\test.xls")
$ws = $wb.worksheets.item(1)
$Range = $ws.range("J6:J65000")
$Range.Removeduplicates()
[gc]::collect()
[gc]::WaitForPendingFinalizers()
$xl.workbooks.close()
$xl.application.quit()
It comes back with "doesn't contain a method named 'RemoveDuplicates'".
All I want to do is delete the row if a duplicate value is found in column J.
The data in column J is a long string (20 characters) of letters, numbers, and some symbols like "#,=;-".
Can anyone help me?
In case it is useful to someone, this works for me in Office 365 ProPlus version 1703 with PowerShell v5:
$path = 'C:\Users\john\Desktop\rows.xlsx'
$Excel = New-Object -ComObject excel.application
$Excel.visible = $True
$Excel.DisplayAlerts = $False
$Workbook = $excel.Workbooks.open($path)
$WS = $Workbook.WorkSheets.Add()
$Worksheet = $Workbook.WorkSheets.item('Sheet1')
$worksheet.activate()
$Range = $Worksheet.Range('A2:G154')
$Range.Copy() | Out-null
$Worksheet = $Workbook.WorkSheets.item('Sheet2')
$Range = $Worksheet.Range('A2:G154')
$Range.PasteSpecial(-4104)
$Range = $Worksheet.Range('A2:G154')
$Worksheet.UsedRange.RemoveDuplicates(1)
$workbook.Save()
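$Excel.Quit()
# Release the COM references first, then let the garbage collector run
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Range)
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Worksheet)
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$WS)
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Workbook)
$null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Excel)
Remove-Variable -Name Excel
[gc]::Collect()
[gc]::WaitForPendingFinalizers()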
BR!!
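
If the rows are already loaded into PowerShell objects, dropping rows with a repeated key can be sketched without Excel as below. The rows and the `Id` and `Code` property names are hypothetical, with `Code` standing in for column J. Note that this comparison is case-sensitive and keeps the first occurrence of each value.

```powershell
# Hypothetical rows; Code stands in for the values in column J
$rows = @(
    [pscustomobject]@{ Id = 1; Code = 'AB#12=;-' }
    [pscustomobject]@{ Id = 2; Code = 'CD#34=;-' }
    [pscustomobject]@{ Id = 3; Code = 'AB#12=;-' }
)

$seen = New-Object 'System.Collections.Generic.HashSet[string]'

# HashSet.Add returns $false when the value is already present,
# so only the first row for each Code passes the filter
$unique = $rows | Where-Object { $seen.Add($_.Code) }

$unique
```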