Extract data from Excel using PowerShell

I need to export data from a column in Excel, strip out some characters, and then write the result to a txt file (Excel sample attached). Basically, I need to extract ONLY the names in the Orders column and output them to a text file. Here is what I have so far:
#Specify the Sheet name
$SheetName = "Today Order"
# Create an Object Excel.Application using Com interface
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
# Open the Excel file and save it in $WorkBook
$WorkBook = $objExcel.Workbooks.Open($FilePath)
# Load the worksheet named in $SheetName
$WorkSheet = $WorkBook.sheets.item($SheetName)
#====
I can use the -replace operations below to trim the unneeded characters from the Orders column; I only need the names:
-replace "order from " -replace " California"
How can I assign the Orders column to a variable, process each line, and then use Out-File to export the result? Or is there a better way to do this?
Thanks in Advance.

I assumed your data is in column A. Correct as needed.
I used a regex to pull the name out of your sentence. -match writes its result to the automatic variable $Matches.
It's worth mentioning that using COM objects is the "hard" way to do this.
The very easy way is saving the sheet as csv.
The easy way is using a module that handles .xlsx files.
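# Specify the sheet name and the workbook path
$SheetName = "Today Order"
$FilePath = "C:\whatever.xlsx"
# Create an Excel.Application object using the COM interface
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $false
# Open the Excel file and save it in $WorkBook
$WorkBook = $objExcel.Workbooks.Open($FilePath)
# Load the worksheet named in $SheetName
$WorkSheet = $WorkBook.Sheets.Item($SheetName)
# A generic list potentially performs better than a fixed-size array if you're running this on thousands of rows.
$MyData = [System.Collections.Generic.List[string]]::new()
# Read the row count once instead of querying the COM object on every iteration
$RowCount = $WorkSheet.UsedRange.Rows.Count
# Start at row 2 (row 1 is assumed to hold the column header)
for ($i = 2; $i -le $RowCount; $i++) {
    # Only add a name when the pattern matches; otherwise $Matches would still hold the previous row's value
    if ($WorkSheet.Range("A$i").Text -match 'from (?<name>.+) in') {
        $MyData.Add($Matches.name)
    }
}
$MyData | Out-File "C:\output.txt"
# Close the workbook without saving and quit Excel so no hidden process is left running
$WorkBook.Close($false)
$objExcel.Quit()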
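As a rough illustration of the "very easy way" mentioned above, here is a minimal sketch that applies the same regex to CSV data instead of going through COM. The sample rows, the column name Orders as a CSV header, and the sentence format are assumptions made for the example; with a real export you would read the file with Import-Csv rather than the inline text used here.

```powershell
# Hypothetical sample data standing in for a CSV export of the sheet.
# The header "Orders" and the sentence format are assumptions.
$csvText = @'
Orders
order from John Smith in California
order from Jane Doe in California
no name here
'@

$names = foreach ($row in ($csvText | ConvertFrom-Csv)) {
    # Emit the name only when the pattern matches, so a stale $Matches is never reused
    if ($row.Orders -match 'from (?<name>.+) in') {
        $Matches.name
    }
}

# Show the extracted names; pipe $names to Out-File to write them to disk instead
$names
```

Rows that do not match the pattern are skipped rather than producing an empty or repeated entry.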

Related

Powershell - add multiple columns to multiple excel files

I have a folder that has over 50 Excel files in it ('Project dump' in the path below). All of these files have exactly the same layout (it's archived monthly data that's used for a MoM report). I need to update all of these files to add 10 new column headers. None of these columns will have any data in them; they just need to be added to the table to match the most current month's extract, which will have data in them going forward.
I've been using PowerShell and have a script that can add one column to one file at a time, but it would honestly be faster for me to manually open each file and add the columns myself. I can't seem to figure out how to change my script so that it handles multiple files (and multiple columns). Any help would be greatly appreciated!
Background: the script references a specific file in my Project dump folder. Column 50 is the first blank column, which needs to be added to the table:
(Get-ChildItem "C:\Downloads\Project dump\ArchiveJAN21.xlsx")|
foreach-object {
$xl=New-Object -ComObject Excel.Application
$wb=$xl.workbooks.open($_)
$ws = $wb.worksheets.Item(1)
$ws.Columns.ListObject.ListColumns.Add(50)
$ws.Cells.Item(1,50) ='Call Type'
$wb.Save()
$xl.Quit()
while([System.Runtime.Interopservices.Marshal]::ReleaseComObject([System.__ComObject]$xl)){'released'| Out-Null}
}
You need to define the Excel object before the loop and quit afterwards.
Also, use Get-ChildItem to get FileInfo objects from a folder path, not a hardcoded path to a file.
Try:
# an array with the new column names
$newColumns = 'Call Type','NewCol2','NewCol3','NewCol4','NewCol5','NewCol6','NewCol7','NewCol8','NewCol9','NewCol10'
# create the Excel object outside of the loop
$xl = New-Object -ComObject Excel.Application
$xl.Visible = $false
# loop through the files in the folder
(Get-ChildItem -Path 'C:\Downloads\Project dump' -Filter '*.xlsx' -File ) | ForEach-Object {
$wb = $xl.WorkBooks.Open($_.FullName)
$ws = $wb.Worksheets.Item(1)
# get the number of columns in the sheet
$startColumn = $ws.UsedRange.Columns.Count
for ($i = 0; $i -lt $newColumns.Count; $i++) {
$startColumn++ # increment the column counter
$ws.Cells.Item(1, $startColumn) = $newColumns[$i]
}
$wb.Close($true) # $true saves the changes
}
# quit Excel and clean COM objects from memory
$xl.Quit()
# clean up the COM objects used
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($ws)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($wb)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($xl)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()

Filling color in the first row of an Excel sheet using a PS script

I wrote a script that exports all the SSL certificate details to an Excel sheet, but I would like help filling the first row of the sheet with a color.
Please help me write the script.
My script:
Clear-Host
$threshold = 300 #Number of days to look for expiring certificates
$deadline = (Get-Date).AddDays($threshold) #Set deadline date
Invoke-Command -ComputerName 'AAA', 'BBB' {
Get-ChildItem -Path 'Cert:\LocalMachine\My' -Recurse |
Select-Object -Property Issuer, Subject, NotAfter,
@{Label = 'ServerName';Expression = {$env:COMPUTERNAME}},
@{Label='Expires In (Days)';Expression = {(New-TimeSpan -Start (Get-Date) -End $PSItem.NotAfter).Days}}
} | Export-Csv -Path C:\users\$env:username\documents\WorkingScript.csv -NoTypeInformation -Force
Thanks in Advance.
You are creating a CSV file, which cannot hold formatting. Instead, you could interact with Excel directly and apply the formatting that way.
As Olaf mentioned, you could import a module to do this, or use the Excel COM object.
Example using the COM object below:
# Creating COM object to interact with excel
$excel = New-Object -ComObject Excel.Application
# set to false to hide the application
$excel.visible = $true
# Add a workbook to the application
$workbook = $excel.Workbooks.Add()
# Adding a workbook automatically adds a sheet.
# We select it and then name it
$worksheetOne = $workbook.Worksheets.Item(1)
$worksheetOne.Name = 'Data'
# Setting the text in two different cells
$worksheetOne.Cells.Item(1, 1) = 'Column One Text'
$worksheetOne.Cells.Item(1, 2) = 'Column Two Text'
# Selecting the EntireRow of the cell "1,1" and setting it to a color
$worksheetOne.Cells.Item(1, 1).EntireRow.Interior.ColorIndex = 4
# Setting the same row to bold
$worksheetOne.Cells.Item(1, 1).EntireRow.Font.Bold = $true
# Optional: autofit all columns
$worksheetOne.UsedRange.EntireColumn.AutoFit() | Out-Null
# Save the file
$excel.ActiveWorkbook.SaveAs('C:\Users\Username\example.xlsx')
You can see some of the colors below.
https://learn.microsoft.com/en-us/office/vba/api/excel.colorindex
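ColorIndex is limited to the palette linked above. As a possible alternative, the Interior object also has a Color property that takes a single integer built from red, green, and blue components, with red in the lowest byte. The sketch below shows that arithmetic; the helper name ConvertTo-ExcelColor and the chosen color values are hypothetical, and the final assignment is left as a comment because it needs the $worksheetOne object from the example above.

```powershell
# Hypothetical helper: converts red/green/blue components (0-255) to the
# integer layout Excel uses for Color properties (red in the lowest byte).
function ConvertTo-ExcelColor {
    param(
        [ValidateRange(0, 255)][int]$Red,
        [ValidateRange(0, 255)][int]$Green,
        [ValidateRange(0, 255)][int]$Blue
    )
    $Red + ($Green * 256) + ($Blue * 65536)
}

# Example values only: an orange-like header color
$headerColor = ConvertTo-ExcelColor -Red 255 -Green 192 -Blue 0

# Assumed usage with the worksheet object from the example above:
# $worksheetOne.Cells.Item(1, 1).EntireRow.Interior.Color = $headerColor
```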

Powershell script to convert excel files to csv files with card numbers with 16 digits

I have these Excel sheets and I want to keep the same format in the CSV files. Could someone help me with an automation script to convert multiple Excel sheets to CSV files?
I tried this script, but the 16th digit of the card number turns into zero, since Excel keeps only 15 significant digits. Can this code be modified to convert multiple Excel sheets to CSV files?
Could someone help me with this?
# Convert Excel file to CSV
$xlCSV=6
$Excelfilename = "C:\Temp\file.xlsx"
$CSVfilename = "C:\Temp\file.csv"
$Excel = New-Object -comobject Excel.Application
$Excel.Visible = $False
$Excel.displayalerts=$False
$Workbook = $Excel.Workbooks.Open($ExcelFileName)
$Workbook.SaveAs($CSVfilename,$xlCSV)
$Excel.Quit()
If(ps excel){kill -name excel}
Excel is really particular in its handling of CSV files.
Although the 16-digit numbers are written out in full when using the SaveAs method, if you re-open the csv file by double-clicking it, Excel corrupts these numbers by converting them to numeric values instead of strings.
To force Excel NOT to interpret these values and simply treat them as strings, you need to adjust the values in the csv file afterwards by prefixing them with a TAB character.
(Note that this will make the file useless for other applications.)
Of course, you need to know the correct column header to do this.
Let's assume the value we need to adjust is stored in a column named Number in your Excel file.
To output csv files that you can double-click to open in Excel, you can use the code below:
$xlCSV = 6
$Excelfiles = 'D:\test.xlsx', 'D:\test2.xlsx' # an array of files to convert
$ColumnName = 'Number' # example, you need to know the column name
# create an Excel COM object
$Excel = New-Object -comobject Excel.Application
$Excel.Visible = $False
$Excel.DisplayAlerts = $False
foreach ($fileName in $Excelfiles) {
$Workbook = $Excel.Workbooks.Open($fileName)
# use the same file name, but change the extension to .csv for output
$CSVfile = [System.IO.Path]::ChangeExtension($fileName, 'csv')
# have Excel save the csv file
$Workbook.SaveAs($CSVfile, $xlCSV)
$Workbook.Close($false)
}
# close excel and clean up the used COM objects
$Excel.Quit()
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Workbook)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
# now import the csv files just created and update the card number
# column by prefixing the value with a TAB character ("`t").
# this will effectively force Excel NOT to interpret the value as numeric.
# you better not do this inside the same loop, because Excel keeps a lock
# on outputted csv files there.
foreach ($fileName in $Excelfiles) {
# use the same file name, but change the extension to .csv for output
$CSVfile = [System.IO.Path]::ChangeExtension($fileName, 'csv')
# the '-UseCulture' switch makes sure the same delimiter character is used
$csv = Import-Csv -Path $CSVfile -UseCulture
foreach ($item in $csv) { $item.$ColumnName = "`t" + $item.$ColumnName }
# re-save the csv file with the updated values
$csv | Export-Csv -Path $CSVfile -UseCulture -NoTypeInformation
}
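The TAB-prefix step can be tried in isolation, without Excel, on in-memory data. The following is a small sketch under assumptions: the sample names and card numbers are made up, and the StartsWith guard is an addition that keeps the prefix from stacking if the step is run twice on the same data.

```powershell
# Hypothetical in-memory data; 'Number' matches the column name assumed above
$csv = @'
Name,Number
Alice,1234567890123456
Bob,6543210987654321
'@ | ConvertFrom-Csv

foreach ($item in $csv) {
    # Prefix only once, so re-running the step does not stack TAB characters
    if (-not $item.Number.StartsWith("`t")) {
        $item.Number = "`t" + $item.Number
    }
}

# Show the resulting CSV text; use Export-Csv to write a file instead
$csv | ConvertTo-Csv -NoTypeInformation
```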

How to convert XML based XLS file to XLSX?

I have a bunch of XLS files. On opening one, I get a prompt saying that the file format and extension don't match.
Later, I found out that these are old XML-based XLS files. For that reason I couldn't directly import them into R or SAS.
I tried opening one and using Save As to save the file in a format supported by R and SAS, such as XLSX or CSV.
The problem is that there are hundreds of files, so it is not viable to open and save them one by one.
Any approach that I can incorporate into a PowerShell process would be great.
Try this PowerShell solution:
$Excel = New-Object -ComObject Excel.Application
foreach ($File in (Get-ChildItem *.xls)) {
    $Workbook = $Excel.Workbooks.Open($File.FullName)
    # 51 = xlOpenXMLWorkbook (.xlsx); appending "x" turns .xls into .xlsx
    $Workbook.SaveAs(($File.FullName + "x"), 51)
    $Workbook.Close($false)
}
$Excel.Quit()
Or if you want the files in csv:
$Excel = New-Object -ComObject Excel.Application
foreach ($File in (Get-ChildItem *.xls)) {
    $Workbook = $Excel.Workbooks.Open($File.FullName)
    # 6 = xlCSV; ChangeExtension swaps only the trailing extension
    $Workbook.SaveAs([System.IO.Path]::ChangeExtension($File.FullName, 'csv'), 6)
    $Workbook.Close($false)
}
$Excel.Quit()
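One detail worth checking when you build the output file name: String.Replace substitutes every occurrence of ".xls" in the full path, not just the extension. The sketch below contrasts it with [System.IO.Path]::ChangeExtension on a made-up path whose folder name also contains ".xls" (the path is purely hypothetical).

```powershell
# Hypothetical path whose folder name also contains ".xls"
$path = 'C:\reports.xls.archive\sales.xls'

# Replaces every ".xls" occurrence, including the one in the folder name
$viaReplace = $path.Replace('.xls', '.csv')

# Changes only the extension of the final path segment
$viaExtension = [System.IO.Path]::ChangeExtension($path, 'csv')

$viaReplace
$viaExtension
```

For files that sit in ordinary folders both approaches give the same result, so this only matters for unusual folder or file names.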

Using powershell to create a PDF

I am working on a script to create a PDF from PowerShell. I have it working using Word via the ComObject, but that means the computer I run it on has to have Word installed. I was wondering if there is a way to make a PDF file from Notepad, maybe. This is the code I have for making the PDF from Word:
# make pdf
# Required Word variables
$wdExportFormatPDF = 17
$wdDoNotSaveChanges = 0
# Create a hidden Word window
$word = New-Object -ComObject Word.Application
$word.Visible = $false
# Add a Word document
$doc = $word.Documents.Add()
# Put the text into the Word document
$txt = Get-Content $txtPath
$selection = $word.Selection
foreach ($line in $txt) {
    $selection.TypeText($line)
    $selection.TypeParagraph()
}
# Export the PDF file and close without saving a Word document
$doc.ExportAsFixedFormat($pdfPath, $wdExportFormatPDF)
if ($?) { Write-Host 'Users and Groups saved to ' $pdfPath -ForegroundColor Cyan }
$doc.Close([ref]$wdDoNotSaveChanges)
$word.Quit()