I have a spreadsheet that has spaces in the column names, how do I go about replacing the space with underscores on the column headers?
Note: I am new at this so bear with me
using this code with no luck:
Powershell: search & replace in xlsx except first 3 columns
Theo's code works great!
$sheetname = 'my Data'
$file = 'C:\Users\donkeykong\Desktop\1\booka.xlsx'
# create a COM Excel object
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetname)
$sheet.Activate()
# get the number of columns used
$colMax = $sheet.UsedRange.Columns.Count
# loop over the column headers and replace the whitespaces
for ($col = 1; $col -le $colMax; $col++) {
$header = $sheet.Cells.Item(1, $col).Value() -replace '\s+', '_'
$sheet.Cells.Item(1, $col) = $header
}
# close and save the changes
$workbook.Close($true)
$objExcel.Quit()
# IMPORTANT: clean-up used Com objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Assuming your Excel file has the headers in the first row, this should work without using the ImportExcel module:
$sheetname = 'my Data'
$file = 'C:\Users\donkeykong\Desktop\1\booka.xlsx'
# create a COM Excel object
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
$workbook = $objExcel.Workbooks.Open($file)
$sheet = $workbook.Worksheets.Item($sheetname)
$sheet.Activate()
# get the number of columns used
$colMax = $sheet.UsedRange.Columns.Count
# loop over the column headers and replace the whitespaces
for ($col = 1; $col -le $colMax; $col++) {
$header = $sheet.Cells.Item(1, $col).Value() -replace '\s+', '_'
$sheet.Cells.Item(1, $col) = $header
}
# close and save the changes
$workbook.Close($true)
$objExcel.Quit()
# IMPORTANT: clean-up used Com objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($sheet)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($objExcel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Related
I have the below script that was posted here: Split Excelfile .xlxs with Powershell based on column values and it works as described but I am experiencing an issue.
The script will check one xlsx file and sort the content by unique values in column A and then copy these data set and create a new file.
The issue I am having is out of about 4000 files, 20 of them are created with the full data set from the original file. The rest of the files a split and created correctly. Not sure as to why this is occurring. Any help is appreciated.
Function Create-Excel-Spreadsheet {
Param($NameOfSpreadsheet)
# open excel
$excel = New-Object -ComObject excel.application
$excel.visible = $true
# add a worksheet
$workbook = $excel.Workbooks.Add()
$xl_wksht= $workbook.Worksheets.Item(1)
$xl_wksht.Name = $NameOfSpreadsheet
return $workbook
}
$objexcel = New-Object -ComObject Excel.Application
$wb = $objexcel.WorkBooks.Open("C:\Temp\Test.xlsx") # Changing path for test.xlsx file.
$objexcel.Visible = $true
$objexcel.DisplayAlerts = $False
$ws = $wb.Worksheets.Item(1)
$usedRange = $ws.UsedRange
$usedRange.AutoFilter()
$totalRows = $usedRange.Rows.Count
$rangeForUnique = $usedRange.Offset(1, 0).Resize($UsedRange.Rows.Count-1)
[string[]]$UniqueListOfRowValues = $rangeForUnique.Columns.Item(1).Value2 | sort -Unique
for ($i = 0; $i -lt $UniqueListOfRowValues.Count; $i++) {
$newRange = $usedRange.AutoFilter(1, $UniqueListOfRowValues[$i])
$workbook = Create-Excel-Spreadsheet $UniqueListOfRowValues[$i]
$wksheet = $workbook.Worksheets.Item(1)
$range = $ws.UsedRange.Cells
$range.Copy()
$wksheet.Paste($wksheet.Range("A1"))
$workbook.SaveAs("C:\temp\" + $UniqueListOfRowValues[$i], $xlFixedFormat)
$workbook.Close()
}
Im using the below code for splitting the huge file into 20K TSV UTF-8 files..
However, i need every split files should have the header in the 20k count, how we can do it?
$sourceFile = "C:\Users\lingaguru.c3\Desktop\Test\DE.txt"
$partNumber = 1
$batchSize = 20000
$pathAndFilename = "C:\Users\lingaguru.c3\Desktop\Test\Temp part $partNumber file.tsv"
[System.Text.Encoding]$enc = [System.Text.Encoding]::GetEncoding(65001) # utf8 this one
$fs=New-Object System.IO.FileStream ($sourceFile,"OpenOrCreate", "Read", "ReadWrite",8,"None")
$streamIn=New-Object System.IO.StreamReader($fs, $enc)
$streamout = new-object System.IO.StreamWriter $pathAndFilename
$line = $streamIn.readline()
$counter = 0
while ($line -ne $null)
{
$streamout.writeline($line)
$counter +=1
if ($counter -eq $batchsize)
{
$partNumber+=1
$counter =0
$streamOut.close()
$pathAndFilename = "C:\Users\lingaguru.c3\Desktop\Test\Temp part $partNumber file.tsv"
$streamout = new-object System.IO.StreamWriter $pathAndFilename
}
$line = $streamIn.readline()
}
$streamin.close()
$streamout.close()
Without altering your code too much, you need to capture the header line in a variable and write that out first thing on every new file:
$sourceFile = "C:\Users\lingaguru.c3\Desktop\Test\DE.txt"
# using a template filename saves writing
$pathOut = "C:\Users\lingaguru.c3\Desktop\Test\Temp part {0} file.tsv"
$partNumber = 1
$batchSize = 20000 # max number of data lines to write in each part
# construct the output filename using the template $pathOut
$pathAndFilename = $pathOut -f $partNumber
$enc = [System.Text.Encoding]::UTF8
$fs = [System.IO.FileStream]::new($sourceFile,"Open", "Read") # don't need write access on source file
$streamIn = [System.IO.StreamReader]::new($fs, $enc)
$streamout = [System.IO.StreamWriter]::new($pathAndFilename)
# assuming the first line contains the headers
$header = $streamIn.ReadLine()
# write out the header on the first part
$streamout.WriteLine($header)
$counter = 0
while (($line = $streamIn.ReadLine()) -ne $null) {
$streamout.WriteLine($line)
$counter++
if ($counter -ge $batchsize) {
$partNumber++
$counter = 0
$streamOut.Flush()
$streamOut.Dispose()
$pathAndFilename = $pathOut -f $partNumber
$streamout = [System.IO.StreamWriter]::new($pathAndFilename)
# write the header on this new part
$streamout.WriteLine($header)
}
}
$streamin.Dispose()
$streamout.Dispose()
$fs.Dispose()
I have an excel file with 10 columns. I want to get the sum of the column with header "Sales" and print it on the console.
How this can be done with PowerShell? I am using the below code but I do not know how to replace H with $i in the following expression:
='=SUM(H1:H'+$RowCount')'
Where H is column "Sales"
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $False
$NewWorkbook = $Excel.Workbooks.open("C:\Test.xlsx")
$NewWorksheet = $NewWorkbook.Worksheets.Item(1)
$NewWorksheet.Activate() | Out-Null
$NewWorksheetRange = $NewWorksheet.UsedRange
$RowCount = $NewWorksheetRange.Rows.Count
$ColumnCount = $NewWorksheetRange.Columns.Count
for ($i = 1; $i -lt $ColumnCount; $i++)
{
if ($NewWorksheet.cells.Item(1,$i).Value2 -eq "Sales")
{
$NewWorksheet.Cells.Item($RowCount+2,$i)='=SUM(H1:H'+$RowCount')'
Write-Host $NewWorksheet.Cells.Item($RowCount+1,$i).Value2
}
}
$Excel.Application.DisplayAlerts=$False
$NewWorkbook.SaveAs("C:\Test_New.xlsx")
$NewWorkbook.close($false)
$Excel.quit()
spps -n excel
I have replaced:
$NewWorksheet.Cells.Item($RowCount+2,$i) ='=SUM(H1:H'+(1+$RowCount)+')'
Write-Host $NewWorksheet.Cells.Item($RowCount+2,$i).Value2
with:
$FirstCell = $NewWorksheet.Cells(2,$i).Address('+True, False+')
$LastCell = $NewWorksheet.Cells(1+$RowCount,$i).Address('+True, False+')
$NewWorksheet.Cells.Item($RowCount+2,$i)='=SUM('+$FirstCell+':'+$LastCell+')'
Write-Host $NewWorksheet.Cells.Item($RowCount+2,$i).Value2
I can highly recommend the PowerShell-Module "ImportExcel". This Modules enables you to import Excel-Files as easy as with Import-Csv
Without knowing much about your files/enviroment, you could try something like this:
foreach ($data in (Import-Excel "$PSScriptRoot\test.xlsx")) {
$result += $data.Sales
}
Write-Host $result
I have the following code:
$xl = New-Object -comobject Excel.Application
$xl.visible = $true
$wb = $xl.Workbooks.Add("D:temp\test.xls")
$ws = $wb.worksheets.item(1)
$Range = $ws.range("J6:J65000")
$Range.Removeduplicates()
[gc]::collect()
[gc]::WaitForPendingFinalizers()
$xl.workbooks.close()
$xl.application.quit()
Comes back with "doesn't contain a method named 'RemoveDuplicates'
All i want to do is delete the row if a duplicate value in column J is found.
The data in column J is a long string (20 characters) of letters and numbers and some symbols like "#,=;-"
Can anyone help me?
it would be useful for someone, this works for me in office 365 pro plus version 1703 and powershell v5:
$path = 'C:\Users\john\Desktop\rows.xlsx'
$Excel = New-Object -ComObject excel.application
$Excel.visible = $True
$Excel.DisplayAlerts = $False
$Workbook = $excel.Workbooks.open($path)
$WS = $Workbook.WorkSheets.Add()
$Worksheet = $Workbook.WorkSheets.item('Sheet1')
$worksheet.activate()
$Range = $Worksheet.Range('A2:G154')
$Range.Copy() | Out-null
$Worksheet = $Workbook.WorkSheets.item('Sheet2')
$Range = $Worksheet.Range('A2:G154')
$Range.PasteSpecial(-4104)
$Range = $Worksheet.Range('A2:G154')
$Worksheet.UsedRange.RemoveDuplicates(1)
$workbook.Save()
$Excel.Quit()
Remove-Variable -Name excel
[gc]::collect()
[gc]::WaitForPendingFinalizers()
[System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Workbook)
[System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Worksheet)
[System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$Range)
BR!!
I need to sort an excel spreadsheet by sorting one column be ascending numbers so 1,2,3,4,5...
Does anyone know a quick and dirty way to sort a excel column in powershell?
function Release-Ref ($ref) {
([System.Runtime.InteropServices.Marshal]::ReleaseComObject(
[System.__ComObject]$ref) -gt 0)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
}
$objExcel = new-object -comobject excel.application
$objExcel.Visible = $True
$objWorkbook = $objExcel.Workbooks.Open("C:\test\flag for errors\1921BB.xls")
$objWorksheet = $objWorkbook.Worksheets.Item(1)
$objRange = $objWorksheet.UsedRange
$objRange2 = $objworksheet.Range("E1")
[void] $objRange.Sort($objRange2)
$objWorkbook.Save()
$a = Release-Ref($objWorksheet)
$a = Release-Ref($objWorkbook)
$a = Release-Ref($objExcel)