How to convert XML based XLS file to XLSX? - powershell

I have a bunch of XLS files. When I open one, I get a prompt saying that the file format and extension don't match.
Later, I found out that these are old XML-based XLS files, which is why I couldn't import them directly into R or SAS.
I tried opening one and using Save As to save it in a format that R and SAS support, such as XLSX or CSV.
The problem is that there are hundreds of files, so opening and saving them one by one is not viable.
Any approach that I can incorporate into a PowerShell process would be great.

Try this PowerShell solution:
$Excel = New-Object -Com Excel.Application
foreach ($File in (gci *xls)) {
    $Workbook = $Excel.Workbooks.Open($File.FullName)
    # 51 = xlOpenXMLWorkbook (.xlsx); appending "x" turns .xls into .xlsx
    $Workbook.SaveAs(($File.FullName + "x"), 51)
    $Workbook.Close($false)
}
$Excel.Quit()
Or, if you want the files as CSV:
$Excel = New-Object -Com Excel.Application
foreach ($File in (gci *xls)) {
    $Workbook = $Excel.Workbooks.Open($File.FullName)
    # 6 = xlCSV; ChangeExtension only touches the final extension of the path
    $Workbook.SaveAs([System.IO.Path]::ChangeExtension($File.FullName, ".csv"), 6)
    $Workbook.Close($false)
}
$Excel.Quit()
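One detail worth checking when deriving the output name: String.Replace substitutes every occurrence of ".xls" in the full path, whereas [System.IO.Path]::ChangeExtension only touches the final extension. A minimal sketch, using a made-up path (the folder and file names are assumptions for illustration only):

```powershell
# Hypothetical path whose folder name also contains ".xls"
$path = 'C:\data\reports.xls.old\sales.xls'

# String.Replace rewrites every match, including the one in the folder name
$viaReplace = $path.Replace('.xls', '.csv')

# ChangeExtension only swaps the extension at the end of the path
$viaChangeExtension = [System.IO.Path]::ChangeExtension($path, '.csv')

$viaReplace           # C:\data\reports.csv.old\sales.csv
$viaChangeExtension   # C:\data\reports.xls.old\sales.csv
```

For files that sit in ordinary folders both forms give the same result; the difference only shows up when ".xls" appears elsewhere in the path.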

Related

I have a script that manipulates an .xlsx file. How do I loop it over all files within a folder?

I have a script that updates a query in an Excel file:
$filePath = "C:\Scripts\SheetToRefresh.xlsx"
$excelObj = New-Object -ComObject Excel.Application
$excelObj.Visible = $true
$workBook = $excelObj.Workbooks.Open($filePath)
$workSheet = $workBook.Sheets.Item("Data")
$workSheet.Select()
$workBook.RefreshAll()
$workBook.Save()
Original script comes from here
Now I need to loop it over a folder. I came up with:
$files = Get-ChildItem "C:\path" -Filter *.xlsx
foreach ($f in $files){
}
but I am struggling with changing the file name for each file (I'm a newbie with PowerShell).
Let's break down what needs to happen:
Before:
  - Open Excel
  - Enumerate files
During, for each file:
  - Open workbook
  - Run the relevant part of your existing script
  - Save and close workbook
After:
  - Close Excel
So, let's start by moving the "Before" actions to the top of your new script:
# Open Excel
$excelObj = New-Object -ComObject Excel.Application
$excelObj.Visible = $true
# Enumerate files
$files = Get-ChildItem "C:\path" -Filter *.xlsx
Now we need to move the relevant parts of the existing script into the new loop. To get the full path of the file object returned by Get-ChildItem, use the FullName property:
foreach ($file in $files) {
    # Open workbook from $file
    $workBook = $excelObj.Workbooks.Open($file.FullName)
    # Refresh query results
    $workSheet = $workBook.Sheets.Item("Data")
    $workSheet.Select()
    $workBook.RefreshAll()
    # Save updated workbook to file
    $workBook.Save()
    # Close workbook
    $workBook.Close()
}
And finally we just need to quit Excel:
$excelObj.Quit()
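If a workbook fails to open or refresh partway through, the script would stop before reaching Quit() and leave an Excel process running. One way to guard against that is try/finally. This is a sketch only: the function name Update-WorkbooksInFolder and its -Folder parameter are made up here, it skips the sheet selection step, and it has to run on a machine with Excel installed.

```powershell
# Hypothetical helper; the name and parameter are not from the original script
function Update-WorkbooksInFolder {
    param(
        [Parameter(Mandatory)] [string] $Folder
    )

    $excelObj = New-Object -ComObject Excel.Application
    try {
        foreach ($file in Get-ChildItem $Folder -Filter *.xlsx) {
            $workBook = $excelObj.Workbooks.Open($file.FullName)
            try {
                $workBook.RefreshAll()
                $workBook.Save()
            }
            finally {
                # Runs even if the refresh or save throws; changes were already saved above
                $workBook.Close($false)
            }
        }
    }
    finally {
        # Always shut Excel down, whether or not the loop completed
        $excelObj.Quit()
    }
}
```

It would be called as Update-WorkbooksInFolder -Folder "C:\path".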

Extract data from excel using powershell

I need to export data from a column in Excel, strip some characters, and write the result to a text file (Excel sample attached). Basically, I need to extract ONLY the names in the Orders column and output them to a text file. Here is what I have so far:
#Specify the Sheet name
$SheetName = "Today Order"
# Create an Object Excel.Application using Com interface
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
# Open the Excel file and save it in $WorkBook
$WorkBook = $objExcel.Workbooks.Open($FilePath)
# Load the WorkSheet 'Change Auditor Events'
$WorkSheet = $WorkBook.sheets.item($SheetName)
#====
I can use the replace command below to trim the unneeded characters from the Orders column, since I only need the names:
-replace "order from " -replace " California"
How can I assign the Orders column to a variable, process each line, and then use Out-File to export the result? Or do you have a better suggestion for doing this?
Thanks in Advance.
I assumed your data is in column A. Correct as needed.
I used a regex to pull the name out of your sentence. -match writes its result to the automatic variable $Matches.
It's worth mentioning that using COM objects is the "hard" way to do this.
The very easy way is saving as csv.
The easy way is using a module that handles .xlsx files.
# Specify the sheet name
$SheetName = "Today Order"
$FilePath = "C:\whatever.xlsx"
# Create an Excel.Application object using the COM interface
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
# Open the Excel file and save it in $WorkBook
$WorkBook = $objExcel.Workbooks.Open($FilePath)
# Load the worksheet named in $SheetName
$WorkSheet = $WorkBook.Sheets.Item($SheetName)
# A generic list grows more cheaply than a fixed-size array with += when processing thousands of rows
$MyData = [System.Collections.Generic.List[string]]::new()
# Start at row 2 (this assumes row 1 is a header row)
for ($i = 2; $i -le $WorkSheet.UsedRange.Rows.Count; $i++) {
    # Only add a name when the pattern matches; otherwise $Matches still holds the previous row's result
    if ($WorkSheet.Range("a$i").Text -match 'from (?<name>.+) in') {
        $MyData.Add($Matches.name)
    }
}
$MyData | Out-File "C:\output.txt"
# Close the workbook without saving and quit Excel
$WorkBook.Close($false)
$objExcel.Quit()
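To see what the pattern extracts before pointing it at a workbook, it can be tried on plain strings. The sample sentences below are assumptions about what the Orders column contains (the original screenshot is not reproduced here):

```powershell
# Assumed sample values; the real Orders column may be worded differently
$orders = @(
    'order from John Smith in California'
    'order from Jane Doe in California'
    'no name here'
)

$names = foreach ($order in $orders) {
    # -match returns $true/$false and fills $Matches only on success
    if ($order -match 'from (?<name>.+) in') {
        $Matches.name
    }
}

$names   # John Smith, Jane Doe (the third line is skipped)
```

Wrapping -match in an if keeps the True/False results out of the output and avoids reusing a stale $Matches value when a line does not match.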

Powershell script to convert excel files to csv files with card numbers with 16 digits

I have these Excel sheets and I want CSV files in the same format. Could someone help me with an automation script to convert multiple Excel sheets to CSV files?
I tried the script below, but the 16th digit of the card number turns into zero, since Excel only keeps 15 significant digits. Can this code be modified to convert multiple Excel sheets to CSV files?
Could someone help me with this?
# Convert Excel file to CSV
$xlCSV=6
$Excelfilename = "C:\Temp\file.xlsx"
$CSVfilename = "C:\Temp\file.csv"
$Excel = New-Object -comobject Excel.Application
$Excel.Visible = $False
$Excel.displayalerts=$False
$Workbook = $Excel.Workbooks.Open($ExcelFileName)
$Workbook.SaveAs($CSVfilename,$xlCSV)
$Excel.Quit()
If(ps excel){kill -name excel}
Excel is really particular in its handling of CSV files.
Although the 16-digit numbers are written out in full when using the SaveAs method (provided the cells hold them as text), if you re-open the result by double-clicking the csv file, Excel mangles these numbers by converting them to numeric values instead of strings, and it keeps only 15 significant digits.
In order to force Excel NOT to interpret these values and simply regard them as strings, you need to adjust the values in the csv file afterwards by prefixing them with a TAB character.
(This will make the file useless for other applications.)
Of course, you need to know the correct column header to do this.
Let's assume the card numbers in your Excel file are stored in a column named Number.
To output csv files on which you can double-click so they are opened in Excel, the code below would do that for you:
$xlCSV = 6
$Excelfiles = 'D:\test.xlsx', 'D:\test2.xlsx' # an array of files to convert
$ColumnName = 'Number' # example, you need to know the column name
# create an Excel COM object
$Excel = New-Object -comobject Excel.Application
$Excel.Visible = $False
$Excel.DisplayAlerts = $False
foreach ($fileName in $Excelfiles) {
    $Workbook = $Excel.Workbooks.Open($fileName)
    # use the same file name, but change the extension to .csv for output
    $CSVfile = [System.IO.Path]::ChangeExtension($fileName, 'csv')
    # have Excel save the csv file
    $Workbook.SaveAs($CSVfile, $xlCSV)
    $Workbook.Close($false)
}
# close excel and clean up the used COM objects
$Excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
# now import the csv files just created and update the card number
# column by prefixing the value with a TAB character ("`t").
# this will effectively force Excel NOT to interpret the value as numeric.
# you better not do this inside the same loop, because Excel keeps a lock
# on outputted csv files there.
foreach ($fileName in $Excelfiles) {
    # use the same file name, but change the extension to .csv for output
    $CSVfile = [System.IO.Path]::ChangeExtension($fileName, 'csv')
    # the '-UseCulture' switch makes sure the same delimiter character is used
    $csv = Import-Csv -Path $CSVfile -UseCulture
    foreach ($item in $csv) { $item.$ColumnName = "`t" + $item.$ColumnName }
    # re-save the csv file with the updated values
    $csv | Export-Csv -Path $CSVfile -UseCulture -NoTypeInformation
}
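The effect of the TAB prefix can be checked without Excel by running the same loop on in-memory CSV text. The column names and card numbers below are made up for illustration:

```powershell
# Made-up sample data standing in for a file written by SaveAs
$csv = @(
    '"Name","Number"',
    '"Alice","1234567890123456"',
    '"Bob","6543210987654321"'
) | ConvertFrom-Csv

$ColumnName = 'Number'
foreach ($item in $csv) { $item.$ColumnName = "`t" + $item.$ColumnName }

# Each value now starts with a TAB character
$csv[0].Number.StartsWith("`t")    # True

# A consumer other than Excel can strip the prefix to get the digits back
$csv[0].Number.TrimStart("`t")     # 1234567890123456
```

Because the prefix is part of the stored value, any script that later reads the file needs a step like TrimStart before using the numbers.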

create csv from xls using powershell

I want to create a PowerShell script that creates a csv file from an .xls file, but I don't know exactly how to do this in PowerShell without VBA.
So far I have this:
ConvertTo-Csv "C:\Users\Me\TestsShella\test.xlsx" | Out-File Q:\test\testShella.csv
But it doesn't work.
With Excel present on the running machine use it as a COM-object:
## Q:\Test\2019\01\31\SO_54461362.ps1
$InFile = Get-Item "$($Env:USERPROFILE)\TestsShella\test.xlsx"
$OutFile= $InFile.FullName.replace($InFile.Extension,".csv")
$Excel = new-object -ComObject "Excel.Application"
$Excel.DisplayAlerts = $True
$Excel.Visible = $False # $True while testing
$WorkBook = $Excel.Workbooks.Open($InFile.FullName)
$WorkBook.SaveAs($OutFile, 6) # 6 -> type csv
$WorkBook.Close($True)
$Excel.Quit()
[void][System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel)
Depending on the locale (decimal point/comma), the csv file will be either comma or semicolon separated.
Without Excel installed, use the already suggested module ImportExcel:
$InFile = Get-Item "$($Env:USERPROFILE)\TestsShella\test.xlsx"
$OutFile= $InFile.FullName.replace($InFile.Extension,".csv")
Import-Excel $Infile.FullName | Export-Csv $OutFile -NoTypeInformation
This yields a .csv file with all fields double quoted and comma separated.
There is a prebuilt library for this:
https://www.powershellgallery.com/packages/ImportExcel/5.4.4
You will then have the Import-Excel cmdlet available, which lets you import the workbook, convert it to csv, and export it.
Maybe this could work:
rename-item -Path "C:\Users\Me\TestsShella\test.xlsx" -NewName "item.csv"
You will get a warning when you open the renamed file, because renaming only changes the extension: the contents are still in XLSX format, not CSV.

Copy Sheets from Existing to new Workbook and use a cell as a reference for file name

I am trying to automate certain tasks that, although simple, are tedious because of the number of files. I currently have a script that refreshes every file within a folder. These files have more worksheets than my client needs, so after refreshing I need to copy the first two sheets into a new workbook and save it in a general location where the client picks it up. I have added what I thought was good code for this copy/paste, but unfortunately I'm getting errors in the copy/paste section as well as in the SaveAs part. I did some research here and at "powershell.org", but couldn't find anything that helped :(.
This is my code:
Measure-Command {
$excel = new-object -comobject excel.application
$excel.DisplayAlerts = $false
$excelFiles = Get-ChildItem -Path "Network folder location" -Include *.xls, *.xlsm,*.xlsx, *.lnk -Recurse
Foreach($file in $excelFiles) {
$workbook = $excel.workbooks.open($file.fullname)
foreach ($Conn in $workbook.Connections){
$Conn.OLEDBConnection.BackgroundQuery = $false
$Conn.refresh()
}
$workBook.RefreshAll()
$workbook.save()
$wb2 = $excel.Workbooks.Add()
$sheetToCopy = $workbook.sheets.item(1),$workbook.sheets.item(2) #Source
$sheetToCopy.CopyTo($wb2) #Destination
$filename = $wb2.Sheets.Item(2).Cells.Item(4,2) #Destination file, 2nd sheet, column D row 2 has what I want to call the file (RVP John Doe - Dashboard)
$wb2.SavesAs("Networkfolder\$filename.xlsx")
$workbook.close()
$wb2.close()
}
$excel.quit()
}
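A few things in the script are likely behind the errors: Excel worksheets expose a Copy method rather than CopyTo, the workbook method is SaveAs (not SavesAs), and Cells.Item takes the row first and the column second, so Cells.Item(4,2) is B4 while D2 would be Cells.Item(2,4). The following is an untested sketch of one way the copy step could look. The function name Export-FirstTwoSheets and its parameters are made up, it assumes the file name really is in D2 of the second sheet as the comment says, and it needs a machine with Excel installed.

```powershell
# Hypothetical helper; name and parameters are assumptions, not part of the original script
function Export-FirstTwoSheets {
    param(
        [Parameter(Mandatory)] $Excel,                 # an existing Excel.Application COM object
        [Parameter(Mandatory)] [string] $SourcePath,   # workbook to copy from
        [Parameter(Mandatory)] [string] $TargetFolder  # where the two-sheet copy is saved
    )

    $workbook = $Excel.Workbooks.Open($SourcePath)
    try {
        # Copy() without arguments places the sheet in a new workbook, which becomes the active one
        $workbook.Sheets.Item(2).Copy()
        $wb2 = $Excel.ActiveWorkbook

        # Copy sheet 1 in front of it, so the new workbook holds sheet 1 followed by sheet 2
        $workbook.Sheets.Item(1).Copy($wb2.Sheets.Item(1))

        # Cells.Item(row, column): row 2, column 4 is D2; .Text gives the displayed string
        $name = $wb2.Sheets.Item(2).Cells.Item(2, 4).Text

        # 51 = xlOpenXMLWorkbook (.xlsx)
        $wb2.SaveAs((Join-Path $TargetFolder "$name.xlsx"), 51)
        $wb2.Close($false)
    }
    finally {
        $workbook.Close($false)
    }
}
```

Inside the existing loop this would replace the lines from $excel.Workbooks.Add() through $wb2.close(), called after the refresh and save with the $excel object that is already open.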