How to Remove Duplicate rows in Excel from Powershell? - powershell

I have the following code:
$xl = New-Object -comobject Excel.Application
$xl.visible = $true
$wb = $xl.Workbooks.Add("D:temp\test.xls")
$ws = $wb.worksheets.item(1)
$Range = $ws.range("J6:J65000")
$Range.Removeduplicates()
[gc]::collect()
[gc]::WaitForPendingFinalizers()
$xl.workbooks.close()
$xl.application.quit()
Comes back with "doesn't contain a method named 'RemoveDuplicates'
All i want to do is delete the row if a duplicate value in column J is found.
The data in column J is a long string (20 characters) of letters and numbers and some symbols like "#,=;-"
Can anyone help me?

it would be useful for someone, this works for me in office 365 pro plus version 1703 and powershell v5:
$path = 'C:\Users\john\Desktop\rows.xlsx'
$Excel = New-Object -ComObject excel.application
$Excel.visible = $True
$Excel.DisplayAlerts = $False
$Workbook = $excel.Workbooks.open($path)
$WS = $Workbook.WorkSheets.Add()
$Worksheet = $Workbook.WorkSheets.item('Sheet1')
$worksheet.activate()
$Range = $Worksheet.Range('A2:G154')
$Range.Copy() | Out-null
$Worksheet = $Workbook.WorkSheets.item('Sheet2')
$Range = $Worksheet.Range('A2:G154')
$Range.PasteSpecial(-4104)
$Range = $Worksheet.Range('A2:G154')
$Worksheet.UsedRange.RemoveDuplicates(1)
$workbook.Save()
$Excel.Quit()
Remove-Variable -Name excel
[gc]::collect()
[gc]::WaitForPendingFinalizers()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Workbook)
[System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Worksheet)
[System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Range)
BR!!

Related

Poweshell: force save of xlsm

First time here, hope the Community can help :-)
I tried the following with Powershell
it works well with an xlsx file
but it is KO with xlsm
EDIT:
I changed the code sequence below and force .DisplayAlert .
CSV files save are OK, but with console error.
Any idea for these errors ?
Console error
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $false
$Excel.DisplayAlerts = $false
$wb = $Excel.Workbooks.Open($fullpathExcelFile)
Start-Sleep -s 30
foreach ($ws in $wb.Worksheets) {
$Excel.DisplayAlerts = $false
$ws.SaveAs("$fullpathOutputdir\" + $ws.Name + ".csv", 6)
Write-Host "Save csv $ws.Name"
}
$Excel.DisplayAlerts = $false
$wb.Save()
$Excel.Quit()
Thanks !

powershell rename XLSX spreadsheet columns

I have a spreadsheet that has spaces in the column names, how do I go about replacing the space with underscores on the column headers?
Note: I am new at this so bear with me
using this code with no luck:
Powershell: search & replace in xlsx except first 3 columns
Theo's code works great!
$sheetname = 'my Data'
$file = 'C:\Users\donkeykong\Desktop\1\booka.xlsx'
# create a COM Excel object
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetname)
$sheet.Activate()
# get the number of columns used
$colMax = $sheet.UsedRange.Columns.Count
# loop over the column headers and replace the whitespaces
for ($col = 1; $col -le $colMax; $col++) {
$header = $sheet.Cells.Item(1, $col).Value() -replace '\s+', '_'
$sheet.Cells.Item(1, $col) = $header
}
# close and save the changes
$workbook.Close($true)
$objExcel.Quit()
# IMPORTANT: clean-up used Com objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Assuming your Excel file has the headers in the first row, this should work without using the ImportExcel module:
$sheetname = 'my Data'
$file = 'C:\Users\donkeykong\Desktop\1\booka.xlsx'
# create a COM Excel object
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetname)
$sheet.Activate()
# get the number of columns used
$colMax = $sheet.UsedRange.Columns.Count
# loop over the column headers and replace the whitespaces
for ($col = 1; $col -le $colMax; $col++) {
$header = $sheet.Cells.Item(1, $col).Value() -replace '\s+', '_'
$sheet.Cells.Item(1, $col) = $header
}
# close and save the changes
$workbook.Close($true)
$objExcel.Quit()
# IMPORTANT: clean-up used Com objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()

Issue with splitting Excel file (.xlxs) with Powershell based on column values

I have the below script that was posted here: Split Excelfile .xlxs with Powershell based on column values and it works as described but I am experiencing an issue.
The script will check one xlsx file and sort the content by unique values in column A and then copy these data set and create a new file.
The issue I am having is out of about 4000 files, 20 of them are created with the full data set from the original file. The rest of the files a split and created correctly. Not sure as to why this is occurring. Any help is appreciated.
Function Create-Excel-Spreadsheet {
Param($NameOfSpreadsheet)
# open excel
$excel = New-Object -ComObject excel.application
$excel.visible = $true
# add a worksheet
$workbook = $excel.Workbooks.Add()
$xl_wksht= $workbook.Worksheets.Item(1)
$xl_wksht.Name = $NameOfSpreadsheet
return $workbook
}
$objexcel = New-Object -ComObject Excel.Application
$wb = $objexcel.WorkBooks.Open("C:\Temp\Test.xlsx") # Changing path for test.xlsx file.
$objexcel.Visible = $true
$objexcel.DisplayAlerts = $False
$ws = $wb.Worksheets.Item(1)
$usedRange = $ws.UsedRange
$usedRange.AutoFilter()
$totalRows = $usedRange.Rows.Count
$rangeForUnique = $usedRange.Offset(1, 0).Resize($UsedRange.Rows.Count-1)
[string[]]$UniqueListOfRowValues = $rangeForUnique.Columns.Item(1).Value2 | sort -Unique
for ($i = 0; $i -lt $UniqueListOfRowValues.Count; $i++) {
$newRange = $usedRange.AutoFilter(1, $UniqueListOfRowValues[$i])
$workbook = Create-Excel-Spreadsheet $UniqueListOfRowValues[$i]
$wksheet = $workbook.Worksheets.Item(1)
$range = $ws.UsedRange.Cells
$range.Copy()
$wksheet.Paste($wksheet.Range("A1"))
$workbook.SaveAs("C:\temp\" + $UniqueListOfRowValues[$i], $xlFixedFormat)
$workbook.Close()
}

In worksheet, if a word is found in A5 cell then want to copy B5 cell value, using PowerShell

In Excel there are 2 columns with A and B value. If the word "courage" is found in A5 cell then I want to copy B5 cell value.
And if "courage" word is in A10 then copy B10 value (opposite cell). I have prepared the below script to find courage it is working correctly but unable to copy opposite cell value.
$Excel = New-Object -ComObject Excel.Application
$Workbook = $Excel.Workbooks.Open('C:\Users\Raj\Desktop\Book1.xlsx')
$workSheet = $Workbook.Sheets.Item(1)
$WorkSheet.Name
$Found = $WorkSheet.Cells.Find('courage')
If ($Found.)
Try this out -
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $true
$Excel.DisplayAlerts = $false
$Workbook = $Excel.Workbooks.Open('C:\Users\Raj\Desktop\Book1.xlsx')
$workSheet = $Workbook.Sheets.Item(1)
$column = 1
$LastRow = $workSheet.UsedRange.Rows.Count
for($row=1; $row -lt $LastRow; $row++)
{
if ($WorkSheet.Cells.Item($row, $column).Value2 -eq "courage")
{
$WorkSheet.Cells.Item($row, $column+1).Value2 = $WorkSheet.Cells.Item($row, $column).Value2
}
}
#Proceed with other operations
#else save the worksheet and workbook
#Quit excel after you are done

How to sort an excel column using Powershell

I need to sort an excel spreadsheet by sorting one column be ascending numbers so 1,2,3,4,5...
Does anyone know a quick and dirty way to sort a excel column in powershell?
function Release-Ref ($ref) {
([System.Runtime.InteropServices.Marshal]::ReleaseComObject(
[System.__ComObject]$ref) -gt 0)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
}
$objExcel = new-object -comobject excel.application
$objExcel.Visible = $True
$objWorkbook = $objExcel.Workbooks.Open("C:\test\flag for errors\1921BB.xls")
$objWorksheet = $objWorkbook.Worksheets.Item(1)
$objRange = $objWorksheet.UsedRange
$objRange2 = $objworksheet.Range("E1")
[void] $objRange.Sort($objRange2)
$objWorkbook.Save()
$a = Release-Ref($objWorksheet)
$a = Release-Ref($objWorkbook)
$a = Release-Ref($objExcel)