Break an Excel file by column value - PowerShell

I have an Excel data source file which contains some customer records.
I would like to break the large file into smaller batches.
Specifically, I want to split the Excel file into four batches by the customer name column, not just break it down evenly by row count.
The source file has a column called "Customer Name", which I would like to use as the indicator for splitting the file. I am writing a PowerShell script but got stuck. The current method I use is:
Get the values of the customer name column.
Deduplicate the customer names into a unique array.
Filter the Excel rows by customer name and break them into batches.
Below is my script, but I am stuck because I do not know how to filter the rows by the customer name array.
# Load the Microsoft Excel Com Object
$Excel = New-Object -ComObject Excel.Application
# Open the workbook
$Workbook = $Excel.Workbooks.Open("XXXXX.xlsx")
# Get the first worksheet
$Worksheet = $Workbook.Sheets.Item(1)
# Get the range of cells that contain the customer names
$Range = $Worksheet.Range("D1:D1000")
# Get the values of the cells and store them in a variable
$Values = $Range.Value2
# Sort the values and remove duplicates
$UniqueValues = ($Values | Sort-Object) | Select-Object -Unique
# Clear the original range of cells
$Range.Clear()
Write-Output $UniqueValues
$ArrayLength = $UniqueValues.Length
$numberofrecordperbatch=$UniqueValues.Length/4
Write-Output $numberofrecordperbatch
#--------Divided into 4 batches-------------
function DivideList {
param(
[object[]]$list,
[int]$chunkSize
)
$j=1
$batch = @()
for ($i = 0; $i -lt $list.Count; $i += $chunkSize) {
$j+=1
$batch =($list | select -Skip $i -First $chunkSize)
#$companynamearray|Export-Excel -Path "XXXX.xlsx"
Write-Output $i
Write-Output $j
Write-Output $chunkSize
}
}
DivideList -list $UniqueValues -chunkSize $numberofrecordperbatch| foreach { $_ -join ',' }
Write-Output "Start"
#---------Filter------------
# Get the first worksheet
# Get the range of cells that contain the customer names
$Range2 = $Worksheet.Range("A1:X1000")
# Get the values of the cells and store them in a variable
$Value2 = $Range2.Value2
Write-Output $Value2
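For the missing filtering step, here is a minimal sketch of one possible approach, assuming the ImportExcel module (the source of the Export-Excel cmdlet referenced in a commented-out line earlier in the script) is installed; the batch output file names are placeholders:
# Read all rows as objects (assumes the ImportExcel module is installed).
$rows = Import-Excel -Path "XXXXX.xlsx"
# Unique customer names and the number of names per batch.
$customers = $rows.'Customer Name' | Sort-Object -Unique
$chunkSize = [math]::Ceiling($customers.Count / 4)
for ($b = 0; $b -lt 4; $b++) {
    # Names belonging to this batch.
    $batchNames = $customers | Select-Object -Skip ($b * $chunkSize) -First $chunkSize
    # Keep only the rows whose customer name falls in this batch and export them.
    $rows | Where-Object { $batchNames -contains $_.'Customer Name' } |
        Export-Excel -Path ("Batch{0}.xlsx" -f ($b + 1))
}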

Related

Display a CSV file in 2 columns in PowerShell

I want to read a CSV file into 2 columns, so that when I open it in Excel it shows the information from my CSV file displayed in 2 columns.
Currently, I only display one column with the following code:
Function Read-File {
Param([string]$file, [string]$head)
if (Test-Path $file) {
Write-Verbose "$file exists, importing list."
$data = Import-Csv -Path $file
$s = @()
if ($data.$head) {
foreach ($device in $data) {
try {
$s += $device.$head
Write-Verbose -Message "$device.$head"
} catch {
Write-Error -Message "Failed to add $device to list."
}
}
} else {
Write-Error -Message "No such column, $head, in CSV file $file."
}
return $s
}
}
$list = Read-File $file $fileColumn
So now I want to do the same but with 2 columns. I'm a beginner in PowerShell, so I would appreciate some help :)
Thank you.
This is my CSV file:
Asset Number, Serial Number
cd5013172cffd6a317bd2a6003414c1,N5WWGGNL
8df47f5b1f1fcda12ed9d390c1c55feaab8,SN65AGGNL
dc0d1aaba9d55ee8992c657,B2NGAA3501119
I am only trying to display both of those (asset number and serial number) in 2 columns in Excel. Don't worry, this information is not sensitive at all, so it's OK :)
Use Select-Object to extract multiple properties from input objects (wrapped in custom object instances, [pscustomobject]).
Use implicit output from your function - no need to collect results in an array first, especially not by building it inefficiently with +=, which behind the scenes creates a new array in every iteration.
Note that return is never required to return (output) data from a function in PowerShell; any command whose output is neither captured, suppressed, nor redirected contributes to the function's overall output.
Function Read-File {
# Note that $column (formerly: $head) now accepts an *array* of strings,
# [string[]]
Param([string] $file, [string[]] $column)
if (Test-Path $file) {
Write-Verbose "$file exists, importing list."
# Import the CSV file, create custom objects with only the column(s)
# of interest, and output the result.
# Also store the result in variable $result via -OutVariable, for
# inspection below.
Import-Csv -Path $file | Select-Object -Property $column -OutVariable result
# If at least 1 object was output, see if all columns specified exist
# in the CSV data.
if ($result.Count) {
foreach ($col in $column) {
if ($null -eq $result[0].$col) {
Write-Error -Message "No such column in CSV file $file`: $column"
}
}
}
}
}
# Sample call with 2 column names.
$list = Read-File $file 'Asset Number', 'Serial Number'
$list can then be piped to Export-Csv to create a new CSV file, which you can open in Excel:
# Export the selected properties to a new CSV file.
$list | Export-Csv -NoTypeInformation new.csv
# Open the new file with the registered application, assumed to be Excel.
Invoke-Item new.csv

PowerShell Mass Rename files with an Excel reference list

I need help with PowerShell.
I will have to start renaming files on a weekly basis; I will be renaming more than 100 files a week, each with a dynamic name.
The files I want to rename are in a folder named Scans, located at "C:\Documents\Scans", and they are in order of time scanned.
I have an Excel file located at "C:\Documents\Mapping\New File Name.xlsx".
The workbook has only one sheet, and the new names are in column A, with x rows. As mentioned above, each cell will have a different value.
Please add comments to your suggestions so that I can understand what is going on, since I'm new to coding.
Thank you all for your time and help.
Although I agree with Ad Kasenally that it would be easier to use CSV files, here's something that may work for you.
$excelFile = 'C:\Documents\Mapping\New File Name.xlsx'
$scansFolder = 'C:\Documents\Scans'
########################################################
# step 1: get the new filenames from the first column in
# the Excel spreadsheet into an array '$newNames'
########################################################
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open($excelFile)
$worksheet = $workbook.Worksheets.Item(1)
$newNames = @()
$i = 1
while ($worksheet.Cells.Item($i, 1).Value() -ne $null) {
$newNames += $worksheet.Cells.Item($i, 1).Value()
$i++
}
$excel.Quit()
# IMPORTANT: clean-up used Com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($worksheet) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($workbook) | Out-Null
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($excel) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
########################################################
# step 2: rename the 'scan' files
########################################################
$maxItems = $newNames.Count
if ($maxItems) {
$i = 0
Get-ChildItem -Path $scansFolder -File -Filter 'scan*' | # get a list of FileInfo objects in the folder
Sort-Object { [int]($_.BaseName -replace '\D+', '') } | # sort by the numeric part of the filename
Select-Object -First ($maxItems) | # select no more than there are items in the $newNames array
ForEach-Object {
try {
Rename-Item -Path $_.FullName -NewName $newNames[$i] -ErrorAction Stop
Write-Host "File '$($_.Name)' renamed to '$($newNames[$i])'"
$i++
}
catch {
throw
}
}
}
else {
Write-Warning "Could not get any new filenames from the $excelFile file.."
}
You may want to have 2 columns in the excel file:
original file name
target file name
From there you can save the file as a csv.
Use Import-Csv to pull the data into Powershell and a ForEach loop to cycle through each row with a command like move $item.original $item.target.
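A minimal sketch of that loop (the mapping.csv path and the original/target column names are assumptions, adjust them to your file):
# Hypothetical mapping file with 'original' and 'target' columns.
$scansFolder = 'C:\Documents\Scans'
$mapping = Import-Csv -Path 'C:\Documents\Mapping\mapping.csv'
foreach ($item in $mapping) {
    # Rename each file from its original name to its target name.
    Move-Item -Path (Join-Path $scansFolder $item.original) -Destination (Join-Path $scansFolder $item.target)
}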
There are abundant threads describing using import-csv with forEach.
Good luck.

Error Adding Columns in PowerShell: "You cannot call a method on a null-valued expression"

I'm trying to append a couple of columns onto a CSV file in Powershell, but for some reason the variables seem to have Null values even though I am trying to populate them. Specifically, I'm having trouble with the line
$Stats2.Columns.Add($colVZA)
which is apparently passed a null value by
$colVZA = New-Object System.Data.DataColumn VZA,([double])
$colVZA = $filename[0].VZA
I thought trying to populate it with data from the first cell in the VZA column in $filename would 'un-null' it, but apparently that's not how it works. Any ideas on how to get these columns populated and appended to the table? Here is my full code:
$i = 1
While ($i -le 211) {
#Set the variable to the filename with the iteration number
$filename = "c:\zMFM\z550Output\20dSummer\fixed20dSum550Output$i.csv"
#Check to see if a file with $filename exists. If not, skip to the next iteration of $i. If so, run the code to collect the
#statistics for each variable and output them each to a different file
If (Test-Path $filename) {
#Calculate the Standard Deviation
#First get the average of the values in the column
$STDEVInputFile = Import-CSV $filename
#Find the average and count for column 'td'
$STDEVAVG = $STDEVInputFile | Measure-Object td -Average | Select Count, Average
$DevMath = 0
# Sum the squares of the differences between the mean and each value in the array
Foreach ($Y in $STDEVInputFile) {
$DevMath += [math]::pow(($Y.Average - $STDEVAVG.Average), 2)
#Divide by the number of samples minus one
$STDEV = [Math]::sqrt($DevMath / ($STDEVAVG.Count-1))
}
#Calculate the basic statistics for column 'td' with the MEASURE-OBJECT cmdlet
$STATS = Import-CSV $Filename |
Measure-Object td -ave -max -min |
#Export the statistics as a CSV and import it back in so you can add columns
Export-CSV -notype "c:\zMFM\z550Output\20dSummer\tempstats$i.csv"
$STATS2 = Import-CSV "c:\zMFM\z550Output\20dSummer\tempstats$i.csv"
#$colSTDDEV = New-Object System.Data.DataColumn StdDev,([double])
$colVZA = New-Object System.Data.DataColumn VZA,([double])
#$colVAZ = New-Object System.Data.DataColumn VAZ,([double])
#Append the standard deviation variable to the statistics table and add the value
$colVZA = $filename[0].VZA
#$colVAZ = $filename[0].VAZ #COMMENTED FOR DEBUGGING
#$colSTDDEV = $STDEV
#$STATS2.Columns.Add($colSTDDEV) #COMMENTED FOR DEBUGGING
#$STATS2[0].StdDev = $STDEV #COMMENTED FOR DEBUGGING
$STATS2.Columns.Add($colVZA) |
#$STATS2[0].VZA = $VZA *****This line may be unnecessary now
#$STATS2.Columns.Add($colVAZ) #COMMENTED FOR DEBUGGING
#$STATS2[0].VZA = $VAZ #COMMENTED FOR DEBUGGING
#Export the $STATS file containing everything you need in the correct folder
Export-CSV -notype "c:\zMFM\z550Output\20dSummer\20dSum550Statistics.csv"
}
$i++
}
Throughout your script, the value of $filename is a string. When you index into a string, like this:
$filename[0]
you get an object of type [char] (a character) returned:
PS C:\> "string"[0]
s
A [char] doesn't have a property called VZA, so this:
$colVZA = $filename[0].VZA
is essentially the same as this:
$colVZA = "C".VZA
which (since VZA doesn't exist) ends up like:
$colVZA = $null
even though $filename is non-null
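If the goal was to read the VZA value from the first data row of the CSV, the file has to be imported first; a minimal sketch, assuming the file really has a VZA column:
# Import the CSV and read the VZA value from the first data row.
$rows = Import-Csv -Path $filename
$colVZA = $rows[0].VZA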

Import Excel data into PowerShell variables

I have an Excel File which has an unknown number of records in it, and these 3 columns:
Variable Name, Store Number, Email Address
I use this in QlikView to import data for certain stores and then create a separate report for each store in the list. I then need to email each report to each individual store (store number will be in the report file name).
So in PowerShell I would like to read the Excel File and set variables for each store:
$Store1 = The Store Number in Row 2 of the Excel File
$Store1Email = The Store Email in Row 2 of the Excel File
$Store2 = The Store Number in Row 3 of the Excel File
$Store2Email = The Store Email in Row 3 of the Excel File
etc. for each Store in the file (there can be any number of stores).
Please note the "Variable Name" in the excel file must be ignored (that is for QLikView) and the PowerShell variables must be named as per my above examples, each time incrementing the number.
Check out my PowerShell Excel Module on Github. You can also grab it from the PowerShell Gallery.
$stores = Import-Excel C:\Temp\store.xlsx
$stores[2].Name
$stores[2].StoreNumber
$stores[2].EmailAddress
''
'All stores'
'----------'
$stores
Ok, first off if you are going to be working with actual .XLS or .XLSX or .XLSM files I would highly suggest using the Import-XLS function from the TechNet gallery (found here).
After that, just reference the object it imports to send the emails instead of making objects for each store. Such as:
$StoreList = Import-XLS <path to Excel file>
GC <report folder> | %{
$Current = $_
$Store = $StoreList|?{$_.StoreNumber -match $Current.BaseName}|Select -ExpandProperty StoreNumber
$Email = $StoreList|?{$_.StoreNumber -match $Current.BaseName}|Select -ExpandProperty StoreEmail
<code to send $Current to $Email>
}
My preference is to Save-As the Excel file to a '.csv' type. The comma-separated values file can easily be imported into PowerShell.
$csvFile = Import-Csv -Path c:\scripts\temp\excelFile.csv
#now the entire Excel '.csv' file is saved into csvFile variable
$csvFile |Get-Member
#look at the properties
Remember to study the greats so your PowerShell script looks great: Jeffrey Snover, Jason Hicks, Don Jones, Ashley McGlone, and anyone on their friends list, ha ha.
The above answers usually work, but I just had a project with excel datasheets that caused some problems.
edit: Here's a much more advanced version that will pull it into an object, can handle blank and duplicate column names, and can skip human information at the beginning of the worksheet by looking for something in the header row. I've also included some example usages
Your example:
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls")
$Store1 = $file.data[0]."Store Number" #first row, column named "Store Number"
$Store1Email = $file.data[0]."Store Email" #first row, column named "Store Email"
foreach ($row in $file.data)
{
write-host "Store: $($row."Store Number")"
write-host "Store Email: $($row."Store Email")"
}
Example 1:
# Simplest example
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.from_excel("c:\folder\file.xls")
$file.data[0]
Example 2:
#advanced usage
$file = New-Module -AsCustomObject -ScriptBlock $file_template
$file.header_contains="First Name" # if included it will drop everything before the first line that contains this, useful if there are instructions for humans in the worksheet
$file.indexer_column = 5 # Default: 1 (first column); This column's contents will set the minimum number of rows, use if there are blank rows in your file but more data after them
$file.worksheet_index = "January" # Default: 1; can be a sheet index or sheet name
$file.filename = "c:\folder\file.xls" #can set this independently, useful for validation and troubleshooting
$file.from_excel() #This is where we actually pull from excel
$collected = $file.data | Out-GridView -PassThru #this is a neat way to select some rows you want
$file.headers.count # It stores an array of the headers here, useful for troubleshooting and advanced logic
Excel Reader pseudoclass
$file_template = {
# -- universal --
$filename = ""
$delimiter = ","
$headers = @()
$data = @()
# -- used by some functions --
# we put these here to allow assigning them before calling functions, which improves readability and auditability
$header_contains=""
$indexer_column=1
$worksheet_index=1
function from_excel(
$filename=$this.filename,
$worksheet_index=$this.worksheet_index
)`
{
$this.filename = $filename
$this.worksheet_index = $worksheet_index
$data_by_row = $this.from_excel_as_csv() # $data_by_row = $file.from_excel_as_csv($test_file)
$data_by_row = $data_by_row -split"`n"
#if ($this.headers.count -lt 1) {$this.headers = $data_by_row[0] -split $this.delimiter} #this would let us set headers elsewhere, which is more flexible but less adaptive; because columns change unpredictably we need something more adaptive
$temp_headers = $data_by_row[0] -split $this.delimiter
$temp_headers = $this.fix_blank_headers($temp_headers)
$this.headers = $this.dedupe_headers($temp_headers)
$this.data = $data_by_row|select -Skip 1|ConvertFrom-Csv -Header $this.headers -Delimiter $this.delimiter
}
function from_csv($filename=$this.filename)`
{
$this.filename = $filename
$this.headers = (Get-Content $this.filename -ReadCount 1|select -first 1) -split $this.delimiter
$this.data = Get-Content $this.filename|ConvertFrom-Csv -Delimiter $this.delimiter
}
function from_excel_as_csv(
$filename=$this.filename,
$worksheet_index=$this.worksheet_index
)`
{
$this.filename = $filename
$this.worksheet_index = $worksheet_index
#set up excel
Write-Host "Importing from excel, this may take a little while..."
$excel = New-Object -ComObject Excel.Application
$excel.DisplayAlerts = $false
$excel.Visible = $false
$workbook = $excel.workbooks.open($this.filename)
$worksheet = $workbook.Worksheets.Item($this.worksheet_index)
#import from excel
try{
$data_by_row = ""
$indexed_column = $worksheet.columns.item($this.indexer_column).value2 #we use this to work around some files having headers with blank space
$minimum_rows = (($indexed_column -join "◘").TrimEnd("◘") -split "◘").count # This Strips the million or so extra blank rows excel appends to get a realistic column length.
[bool]$header_found = 0
$i=1
do `
{
$row = $worksheet.rows.item($i).value2
$row_as_text = $row -join "◘" # ◘ (alt+8) is just a placeholder that's unlikely to show up in the text
$row_as_text = $row_as_text -replace $this.delimiter,"."
$row_as_text = $row_as_text.TrimEnd("◘")
$row_as_text = $row_as_text -replace "◘",$this.delimiter
if ($row_as_text -like "*$($this.header_contains)*"){[bool]$header_found=1}
if ($header_found) {$data_by_row+="$row_as_text`n"}
$i++
}
while ( ($row_as_text.Length -gt 1) -or ($i -lt $minimum_rows) )
}
catch {Write-Warning "ERROR Importing from excel"}
#close excel
$workbook.Close()
$excel.Quit()
write-host "Done importing from excel"
return $data_by_row
}
function dedupe_headers($headers){
$dupes = ($headers|group)|?{$_.count -gt 1}
if ($dupes.count -ge 1)
{
foreach ($dupe in $dupes)
{ #$dupe = $dupes[0]
$i=1
$new_headers = @()
foreach ($header in $headers)
{ #$header = $headers[0]
if ($header -eq $dupe.name)
{
$header = "$($header)_$($i)" # "header_#"
$i++
}
$new_headers += $header
}
}
}
else {$new_headers = $headers} # no duplicates found
return $new_headers
}
function fix_blank_headers($headers)
{
$replace_blanks_with = "_"
$new_headers = @()
foreach ($header in $headers)
{
if ($header -eq "") {$new_headers += $replace_blanks_with}
else {$new_headers += $header}
}
if ($new_headers.count -ne $headers.count)
{
$error_json = @($headers),@($new_headers)|ConvertTo-Json -Compress
Write-Error "Error when fixing blank headers, original and new counts are different $($error_json)"
}
return $new_headers
}
<# function some_function($some_parameter){return $some_parameter} #>
Export-ModuleMember -Function * -Variable *
}
Forgive the ugliness here. I am not a programmer, so there are undoubtedly more optimized ways to do this, as well as better formatting. It will work, however, if I understand your requirements correctly.
$excelfile = import-csv "c:\myfile.csv"
$i = 1
$excelfile | ForEach-Object {
New-Variable "Store$i" $_."Store Number"
$iemail = $i.ToString() + "Email"
New-Variable "Store$iemail" $_."Email Address"
$i ++
}
edit: as per the reply to your original post, this works with a csv file. Just save it to csv first if necessary.
$excelfile = import-csv "C:\Temp\store.csv"
$i = 1
$excelfile | ForEach-Object {
$NA= $_."Name"
$SN= $_."StoreNumber"
Write-Output "row $i"
$NA
$SN
$i++
}

PowerShell - Splitting multiple lines of text into individual records

Fairly new to PowerShell here, and wondering if someone could lend a hand with the following. Basically, I have a .txt document with a number of records separated by a special character ($), e.g.:
ID
Date Entered
Name
Title
Address
Phone
$
ID
Date Entered
Name
Title
Address
Phone
$
I want to split each item between the $ into individual "records" so that I can loop through each record. So for example I could check each record above and return the record ID for all records that didn't have a phone number entered.
Thanks in advance
If each record contains a fixed number of properties you can take this approach. It loops through creating custom objects while skipping the dollar sign line.
$d = Get-Content -Path C:\path\to\text\file.txt
for ($i = 0; $i -lt $d.Length; $i+=7) {
New-Object -TypeName PsObject -Property @{
'ID' = $d[$i]
'Date Entered' = $d[$i+1]
'Name' = $d[$i+2]
'Title' = $d[$i+3]
'Address' = $d[$i+4]
'Phone' = $d[$i+5]
}
}
I once had the same requirement... this is how I did it
$loglocation = "C:\test\dump.txt"
$reportlocation = "C:\test\dump.csv"
$linedelimiter = ":"
$blockdelimiter = "FileSize"
$file = Get-Content $loglocation
$report = @()
$block = @{}
foreach ($line in $file)
{
if($line.contains($linedelimiter))
{
$key = $line.substring(0,$line.indexof($linedelimiter)).trimend()
$value = $line.substring($line.indexof($linedelimiter)+1).trimstart()
$block.Add($key,$value)
if ($block.keys -contains $blockdelimiter)
{
$obj = new-object psobject -property $block
$report += $obj
$block = @{}
}
}
}
$report
$report | Export-Csv $reportlocation -NoTypeInformation
So you cycle through each line, define the key and value, and add them to a hashtable. Once the hashtable's keys contain the blockdelimiter, a new object gets written to the array and the hashtable gets cleared.
In my case the linedelimiter was a colon and the blockdelimiter was a valid record, so you will have to make some changes. Let me know whether this approach suits your needs or if you can't figure out what to do.
P.S. By default only the noteproperties of the first object in the array will be shown, so you will have to pipe the array to Select-Object and add all the properties needed.
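One way to do that, shown as a sketch (collecting the union of all property names first, since the actual keys depend on your log file):
# Gather every property name that occurs anywhere in the array, then display them all.
$allProps = $report | ForEach-Object { $_.PSObject.Properties.Name } | Sort-Object -Unique
$report | Select-Object -Property $allProps | Format-Table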
Grts.