Read a CSV in PowerShell with a variable number of columns

I have a CSV that contains a username, and then one or more values for the rest of the record. There are no headers in the file.
joe.user,Accounting-SG,CustomerService-SG,MidwestRegion-SG
frank.user,Accounting-SG,EastRegion-SG
I would like to read the file into a powershell object where the Username property is set to the first column, and the Membership property is set to either the remainder of the row (including the commas) or ideally, an array of strings with each element containing a single membership value.
Unfortunately, the following line only grabs the first membership and ignores the rest of the line.
$memberships = Import-Csv -Path C:\temp\values.csv -Header "username", "membership"
@{username=joe.user; membership=Accounting-SG}
@{username=frank.user; membership=Accounting-SG}
I'm looking for either of these outputs:
@{username=joe.user; membership=Accounting-SG,CustomerService-SG,MidwestRegion-SG}
@{username=frank.user; membership=Accounting-SG,EastRegion-SG}
or
@{username=joe.user; membership=string[]}
@{username=frank.user; membership=string[]}
I've been able to get the first result by enclosing the "rest" of the data in the csv file in quotes, but that doesn't really feel like the best answer:
joe.user,"Accounting-SG,CustomerService-SG,MidwestRegion-SG"

Well, the issue is that what you have isn't really a (proper) CSV. The CSV format doesn't support that notation.
You can "roll your own" and just process the file yourself, something like this:
$memberships = Get-Content -LiteralPath C:\temp\values.csv |
    ForEach-Object -Process {
        $user,$membership = $_.Split(',')
        New-Object -TypeName PSObject -Property @{
            username = $user
            membership = $membership
        }
    }
You could do a half and half sort of thing. Using your modification, where the groups are all a single field in quotes, do this:
$memberships = Import-Csv -Path C:\temp\values.csv -Header "username", "membership" |
    ForEach-Object -Process {
        $_.membership = $_.membership.Split(',')
        $_
    }
The first example just reads the file line by line, splits on commas, then creates a new object with the properties you want.
The second example uses Import-Csv to create the object initially, then just resets the .membership property (it starts as a string, and we split the string so it's now an array).
The second way only makes sense if whatever is creating the "CSV" can create it that way in the first place. If you have to modify it yourself every time, just skip this and process it as it is.
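One wrinkle with the first approach: when a row holds only a single membership, the multiple assignment leaves $membership as a plain string rather than an array. Below is a minimal sketch of one way to normalize that. It assumes inline sample rows in place of the file, and the solo.user line is made up for illustration:

```powershell
# Sample rows standing in for the lines of the headerless file (assumed data).
$lines = @(
    'joe.user,Accounting-SG,CustomerService-SG,MidwestRegion-SG'
    'frank.user,Accounting-SG,EastRegion-SG'
    'solo.user,Accounting-SG'
)

$memberships = foreach ($line in $lines) {
    # The first field is the user name; everything after it is a membership.
    $user, $groups = $line.Split(',')
    [pscustomobject]@{
        username   = $user
        # @() keeps the property an array even when there is only one group.
        membership = @($groups)
    }
}

# Print one summary line per user.
$memberships | ForEach-Object { '{0}: {1} group(s)' -f $_.username, $_.membership.Count }
```

With a real file, the $lines array would be replaced by the Get-Content call from the answer. Note that this sketch does not handle a row that contains a user name and no memberships at all.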

Powershell: Replace headers while using Import-CSV

I found a related answer here that is really helpful, but not quite what I'm looking for. There are also a number of other questions I've looked at, but I can't figure out how to get this to work unfortunately and it seems rather simple.
Basically, I'm using Import-Csv and manipulating a lot of data; but the names of the headers can sometimes change. So instead of re-writing my code, I'd like to map the headers I'm given to the headers that are used in my code blocks. Outputting the final data as a CSV, I can leave it using the 'updated headers' or, if I can figure out how to swap headers easily, I could always swap them back to what they were.
So let's say I have a mapping file in Excel. I can do the mapping in rows or columns, whichever will be easier. For this first example, I have the mapping in rows. When I use Import-CSV, I want to use the Headers from Row #2 instead of the headers in Row #1. Here's the content of the mapping file:
So basically if I hard coded this all, I'd have something like:
$null, $headerRow, $dataRows = (Get-Content -Raw foo.csv) -split '(^.+\r?\n)', 2
ConvertFrom-Csv ($headerRow.Trim() -replace 'Identification', 'ID' -replace 'Revenue Code', 'Revenue_Code' -replace 'Total Amount for Line', 'Amount' -replace 'Total Quantity for Line', 'Qty'), $dataRows
Except I don't want to hard code it, I am basically looking for a way to use Replace with a mapping file or hashtable if I can create one.
#Pseudo code for what I want
$hashtable = Get-Content mapping.xlsx
ConvertFrom-Csv ($headerRow.Trim() -replace $hashtable.Name, $hashtable.Value), $dataRows
I'm probably failing and failing to find similar examples since I'm trying to be flexible on the format of the mapping file. My original idea was to basically treat the 1st row as a string, and to replace that entire string with the second row. But the hashtable idea came from likely restructuring the mapping to look like this:
Here I would basically -replace each Source value with the corresponding Target value.
EDIT If you need to convert back, give this a shot - but keep in mind it'll only work if you have a one-to-one relationship of Source:Target values.
# Change BACK to the original headers...
$Unmap = @{}
(Import-Csv MappingTable.csv).ForEach({$Unmap[$_.Target] = $_.Source})
# Get string data from the CSV objects
$stringdata = $outputFixed | ConvertTo-Csv -NoTypeInformation
$headerRow = $stringdata[0]
$dataRows = $stringdata[1..($stringdata.Count-1)] -join "`r`n"
# Create the new header data
$unmappedHeaderRow = ($headerRow -replace '"' -split ',').ForEach({'"' + $Unmap[$_] + '"'}) -join ','
$newdata = ConvertFrom-Csv $unmappedHeaderRow, $dataRows
Here's a complete example that builds on your original attempt:
It provides the column-name (header) mapping via (another) .csv file, with columns Source and Target, where each row maps a source name to a target name, as (also) shown in your question.
The mapping CSV file is transformed into a hashtable that maps source names to target names.
The data CSV file is then read as plain text, as in your question - efficiently, but in full - split into header row and data rows, and a new header row with the mapped names is constructed with the help of the hashtable.
The new header row plus the data rows are then sent to ConvertFrom-Csv for to-object conversion based on the mapped column (property) names.
# Create sample column-name mapping file.
@'
Source,Target
Identification,Id
Revenue Code,Revenue_Code
'@ > mapping.csv
# Create a hashtable from the mapping CSV file
# that maps each Source column value to its Target value.
$map = @{}
(Import-Csv mapping.csv).ForEach({ $map[$_.Source] = $_.Target })
# Create sample input CSV file.
@'
Revenue Code,Identification
r1,i1
r2,i2
'@ > data.csv
# Read the data file as plain text, split into a header line and
# a multi-line string comprising all data lines.
$headerRow, $dataRows = (Get-Content -Raw data.csv) -split '\r?\n', 2
# Create the new header based on the column-name mapping.
$mappedHeaderRow =
    ($headerRow -replace '"' -split ',').ForEach({ $map[$_] }) -join ','
# Parse the data rows with the new header.
$mappedHeaderRow, $dataRows | ConvertFrom-Csv
The above outputs the following, showing that the columns were effectively mapped (renamed):
Revenue_Code Id
------------ --
r1 i1
r2 i2
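In the example above, a header that has no entry in the mapping would come out as an empty column name, because looking up a missing key in the hashtable yields $null. Here is a hedged sketch of one way to pass unmapped headers through unchanged. The Notes column is hypothetical, and the data is held as an array of lines rather than read from a file:

```powershell
# Mapping from source header names to target names (same idea as $map above).
$map = @{
    'Identification' = 'Id'
    'Revenue Code'   = 'Revenue_Code'
}

# Sample data as individual lines; 'Notes' is a hypothetical unmapped column.
$lines = @(
    'Revenue Code,Identification,Notes'
    'r1,i1,n1'
    'r2,i2,n2'
)
$headerRow = $lines[0]
$dataRows  = $lines[1..($lines.Count - 1)]

# Rename mapped headers; pass unmapped ones through unchanged.
$mappedHeaderRow = ($headerRow -replace '"' -split ',').ForEach({
    if ($map.ContainsKey($_)) { $map[$_] } else { $_ }
}) -join ','

# Parse the data lines with the new header line in front.
$rows = @($mappedHeaderRow) + $dataRows | ConvertFrom-Csv
$rows | Format-Table
```

Like the original, this splits the header on plain commas, so it assumes that no column name itself contains a comma.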
The easiest thing to do here is to process the CSV and then transform each row, from whatever format it was, into a new desired target format.
Pretend we have an input CSV like this.
RowID,MayBeNull,MightHaveAValue
1,,Value1
2,Value2,
3,,Value3
Then we import the csv like so:
# Helper function that hides the clunky emptiness check
function HasValue($param) {
    return -not [string]::IsNullOrEmpty($param)
}
$csv = Import-Csv C:\pathTo\this.csv
foreach ($row in $csv) {
    # Function arguments are separated by whitespace; no parentheses needed
    if (HasValue $row.MayBeNull) {
        $newColumn = $row.MayBeNull
    }
    else {
        $newColumn = $row.MightHaveAValue
    }
    # Generate new output
    [pscustomobject]@{
        Id        = $row.RowID
        NewColumn = $newColumn
    }
}
Which gives the following output:
Id NewColumn
-- ---------
1  Value1
2  Value2
3  Value3
This is an easy pattern to follow for a data migration script, then you just need to scale it up to fix your problem.

How do i overwrite a field in a csv file with PowerShell?

CSV sample:
I have the below code, where I would like to overwrite the current value in the PreviousGroup field. I know that -append adds to the end of the column, but that's not what I want to do.
$UserGroup = read-host "Enter Group Name"
$csvFile = Import-Csv "C:\HomeFolder\Locations.csv"
if ([string]::IsNullOrEmpty($PreviousGroup)) {$PreviousGroup = ""}
else {$PreviousGroup = $csvFile | Select-Object $csvFile.PreviousGroup -Verbose}
$csvFile.PreviousGroup = $UserGroup
$csvFile | Export-Csv
Secondly, is it possible to link Dom*_Groups in the below code to the list on the CSV?
param([Parameter(Mandatory = $false)]
[ValidateSet(*"list from csv"*)] [string]$Dom1_Groups)
param([Parameter(Mandatory = $false)]
[ValidateSet(*"list from csv"*)] [string]$Dom2_Groups)
$csvFile, as returned from Import-Csv, is an array of [pscustomobject] instances.
Therefore, assigning to $csvFile.PreviousGroup in an attempt to set the .PreviousGroup property of every element will not work. The attempt is understandable, given that getting the elements' property values this way does work, via member-access enumeration; however, member-access enumeration by design works only for getting property values, not for setting them.
The simplest solution is to use the .ForEach() array method:
# Set the .PreviousGroup property of all elements of array $csvFile
# to the value of $UserGroup.
$csvFile.ForEach('PreviousGroup', $UserGroup)
Caveat: As of PowerShell 7.1, the above method of assigning property values unexpectedly fails if the input object happens to be a scalar (single object), which can happen if the CSV file happens to contain just one data row; see GitHub issue #14527.
A workaround, albeit an inefficient one, is to use @(), the array-subexpression operator:
@($csvFile).ForEach('PreviousGroup', $UserGroup)
or to use a script block ({ ... }):
$csvFile.ForEach({ $_.PreviousGroup = $UserGroup })
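To see the get-versus-set difference end to end, here is a small self-contained sketch. The sample rows and the Location column are assumptions standing in for the real Locations.csv, and the result is printed as CSV text rather than written back to a file:

```powershell
# Sample rows standing in for Import-Csv output; 'Location' is an assumed column.
$csvFile = 'Location,PreviousGroup', 'HQ,OldGroupA', 'Branch,OldGroupB' | ConvertFrom-Csv
$UserGroup = 'NewGroup'

# Reading across all rows works via member-access enumeration.
$before = $csvFile.PreviousGroup

# Writing has to happen per element; @() keeps this working for a one-row file.
@($csvFile).ForEach({ $_.PreviousGroup = $UserGroup })

# Show the result as CSV text instead of overwriting a real file.
$csvFile | ConvertTo-Csv -NoTypeInformation
```

To persist the change, the last line would pipe to Export-Csv with a path and -NoTypeInformation instead.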

How to check column count in a file to satisfy a condition

I am trying to write a PowerShell script to check the column count and see if it satisfies the condition or else throw error or email.
something I have tried:
$columns=(Get-Content "C:\Users\xs15169\Desktop\temp\OEC2_CFLOW.txt" | select -First 1).Split(",")
$Count=columns.count
if ($count -eq 280)
echo "column count is:$count"
else
email
I'm going to assume your text file is in CSV format; I can't imagine what other format you'd be working with if it's a text-file table.
If your CSV has headers
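Process the CSV file, and count the number of properties on the first resulting PowerShell object.
$columnCount = @( ( Import-Csv '\path\to\file.txt' | Select-Object -First 1 ).PSObject.Properties ).Count
We need to force the Properties collection to an array (which is what the @() syntax does) to accurately get the count. Select-Object -First 1 makes sure we inspect a single row object; calling .PSObject on the whole array returned by Import-Csv would report the properties of the array itself (Length, Rank, and so on) rather than the columns. The PSObject property is a hidden property for metadata about an object in PowerShell, which is where we look for the Properties (column names) and get the count of how many there are.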
CSV without headers
If your CSV doesn't have headers, Import-Csv requires you to manually specify the headers. There are tricks you can do to build out unique column names on-the-fly, but they are overly complex for simply getting a column count.
To take what you've already tried above, we can get the data in the first line and process the number of columns, though you were doing it incorrectly in the question. Here's how to properly do it:
$columnCount = ( ( Get-Content "\path\to\file.txt" | Select-Object -First 1 ) -Split ',' ).Count
What was wrong with the original
Both above solutions consolidate getting the column count down to one line of code. But in your original sample, you made a couple small mistakes:
$columns=( Get-Content "\path\to\file.txt" | select -First 1 ).Split(",")
# You forgot to prepend "columns" with a $. Should look like the below line
$Count=$columns.count
And you forgot to use curly braces with your if block:
if ($count -eq 280) {
    echo "column count is:$count"
} else {
    email
}
As for using the -Split operator vs. the .Split() method - this is purely stylistic preference on my part, and using Split() is perfectly valid.
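Putting the pieces together, here is a minimal sketch that runs both counting approaches against inline sample lines. The data and the expected count of 3 are assumptions (the question checks for 280), and Write-Warning stands in for the unspecified email step:

```powershell
# Sample lines standing in for Get-Content output (assumed data).
$lines = 'id,name,region', '1,joe,east', '2,ann,west'
$expectedCount = 3   # the question used 280

# Headerless approach: split the first line on commas.
$countFromSplit = (($lines | Select-Object -First 1) -split ',').Count

# Header approach: count the properties of the first parsed row.
$firstRow = $lines | ConvertFrom-Csv | Select-Object -First 1
$countFromProperties = @($firstRow.PSObject.Properties).Count

if ($countFromSplit -eq $expectedCount) {
    "column count is: $countFromSplit"
} else {
    # Placeholder for the notification step, which the question leaves open.
    Write-Warning "expected $expectedCount columns but found $countFromSplit"
}
```

Keep in mind that splitting on commas miscounts when a quoted field itself contains a comma; the property-based count does not have that problem because the line goes through a CSV parser first.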

CSV input, powershell pulling $null value rows from targeted column

I am trying to create a script to create Teams in Microsoft Teams from data in a CSV file.
The CSV file has the following columns: Team_name, Team_owner, Team_Description, Team_class
The script should grab Team_name row value and use that value to create a variable. Use that variable to query if it exists in Teams and if not, create it using the data in the other columns.
The problem I am having is my foreach loop seems to be collecting rows without values. I simplified the testing by first trying to identify the values and monitoring the output.
Here is the test script
$Team_infocsv = Import-csv -path $path Teams_info.csv
# $Team_infocsv | Foreach-object{
foreach($line in $Team_infocsv){
$owner = $line.Team_owner
Write-Host "Team Owner: $owner"
$teamname = $line.Team_name
Write-Host "Team Name: $teamname"
$team_descr = $line.Team_Description
Write-Host "Team Description: $team_descr"
$teamclass = $line.Team_class
Write-Host "Team Class: $teamclass"
}
I only have two rows of data but yet returned are the two lines as requested with extra output (from rows) without values.
There's no problem with your code per se, except:
Teams_info.csv is specified in addition to $path after Import-Csv -Path, which I presume is a typo, however.
$path could conceivably - and accidentally - be an array of file paths, and if the additional file(s) has entirely different columns, you'd get empty values for the first file's columns.
If not, the issue must be with the contents of Teams_info.csv, so I suggest you examine that; piping to Format-Custom as shown below will also help you detect the case where $path is unexpectedly an array of file paths:
Here's a working example of a CSV file resembling your input - created ad hoc - that you can compare to your input file.
# Create sample file.
@'
"Team_name","Team_owner","Team_Description","Team_class"
"Team_nameVal1","Team_ownerVal1","Team_DescriptionVal1","Team_classVal1"
"Team_nameVal2","Team_ownerVal2","Team_DescriptionVal2","Team_classVal2"
'@ > test.csv
# Import the file and examine the objects that get created.
# Note the use of Format-Custom.
Import-Csv test.csv | Format-Custom
The above yields:
class PSCustomObject
{
  Team_name = Team_nameVal1
  Team_owner = Team_ownerVal1
  Team_Description = Team_DescriptionVal1
  Team_class = Team_classVal1
}
class PSCustomObject
{
  Team_name = Team_nameVal2
  Team_owner = Team_ownerVal2
  Team_Description = Team_DescriptionVal2
  Team_class = Team_classVal2
}
Format-Custom produces a custom view (a non-table and non-list view) as defined by the type of the instances being output; in the case of the [pscustomobject] instances that Import-Csv outputs you get the above view, which is a convenient way of getting at least a quick sense of the objects' content (you may still have to dig deeper to distinguish empty strings from $nulls, ...).
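If the inspection shows that the extra objects come from lines that contain only delimiters (an assumption about the cause; spreadsheet exports sometimes leave such lines behind), one possible way to skip them is sketched below with made-up sample rows:

```powershell
# Sample input with a comma-only line (assumed cause of the empty rows).
$rows = @(
    'Team_name,Team_owner,Team_Description,Team_class'
    'TeamA,alice,First team,Public'
    ',,,'
    'TeamB,bob,Second team,Private'
) | ConvertFrom-Csv

# Keep only rows in which at least one column holds non-whitespace text.
$nonEmpty = $rows | Where-Object {
    $values = $_.PSObject.Properties.Value
    @($values | Where-Object { -not [string]::IsNullOrWhiteSpace($_) }).Count -gt 0
}

# Report how many rows survived the filter.
'{0} of {1} rows contain data' -f @($nonEmpty).Count, @($rows).Count
```

The foreach loop from the question could then iterate over $nonEmpty instead of the raw import.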

How to add a column to an existing CSV row in PowerShell?

I'm trying to write a simple usage logger into my script that would store information about the time when user opened the script, finished using the script and the user name.
The first part of the logger where I gather the first two data works fine and adds two necessary columns with values to the CSV file. Yet when I run the second part of the logger it does not add a new column to my existing CSV file.
#Code I will add at the very beginning of my script
$FileNameDate = Get-Date -Format "MMM_yyyy"
$FilePath = "C:\Users\Username\Desktop\Script\Logs\${FileNameDate}_MonthlyLog.csv"
$TimeStamp = (Get-Date).toString("dd/MMM/yyyy HH:mm:ss")
$UserName = [string]($env:UserName)
$LogArray = @()
$LogArrayDetails = @{
    Username = $UserName
    StartDate = $TimeStamp
}
$LogArray += New-Object PSObject -Property $LogArrayDetails | Export-Csv $FilePath -Notypeinformation -Append
#Code I will add at the very end of my script
$logArrayFinishDetails = @{FinishDate = $TimeStamp}
$LogCsv = Import-Csv $FilePath | Select Username, StartDate, @{$LogArrayFinishDetails} | Export-Csv $FilePath -NoTypeInformation -Append
CSV file should look like this when the script is closed:
Username StartDate FinishDate
anyplane 08/Apr/2018 23:47:55 08/Apr/2018 23:48:55
Yet it looks like this:
StartDate Username
08/Apr/2018 23:47:55 anyplane
The other weird thing is that it puts the StartDate first while I clearly stated in $LogArrayDetails that Username goes first.
Assuming that you only ever want to record the most recent run [see bottom if you want to record multiple runs] (PSv3+):
# Log start of execution.
[pscustomobject] @{ Username = $env:USERNAME; StartDate = $TimeStamp } |
    Export-Csv -Notypeinformation $FilePath
# Perform script actions...
# Log end of execution.
(Import-Csv $FilePath) |
    Select-Object *, @{ n='FinishDate'; e={ (Get-Date).toString("dd/MMM/yyyy HH:mm:ss") } } |
    Export-Csv -Notypeinformation $FilePath
As noted in boxdog's helpful answer, using -Append with Export-Csv won't add additional columns.
However, since you're seemingly attempting to rewrite the entire file, there is no need to use -Append at all.
So as to ensure that the old version of the file has been read in full before you attempt to replace it with Export-Csv, be sure to enclose your Import-Csv $FilePath call in (...), however.
This is not strictly necessary with a 1-line file such as in this case, but a good habit to form for such rewrites; do note that this approach is somewhat brittle in general, as something could go wrong while rewriting the file, resulting in potential data loss.
@{ n='FinishDate'; e={ (Get-Date).toString("dd/MMM/yyyy HH:mm:ss") } } is an example of a calculated property/column that is appended to the preexisting columns (*).
As for your observation: "The other weird thing is that it puts the StartDate first while I clearly stated in $LogArrayDetails that Username goes first."
You've used a hashtable (@{ ... }) to declare the columns for the output CSV, but the order in which a hashtable's entries are enumerated is not guaranteed.
In PSv3+, you can use an ordered hashtable instead ([ordered] @{ ... }) to achieve predictable enumeration, which you also get if you convert the hashtable to a custom object by casting to [pscustomobject], as shown above.
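A short sketch of both order-preserving options follows; the user name is taken from the sample output in the question, and the variable names $details and $row are illustrative:

```powershell
$TimeStamp = (Get-Date).ToString('dd/MMM/yyyy HH:mm:ss')

# [ordered] keeps the keys in the order in which they were written.
$details = [ordered]@{
    Username  = 'anyplane'
    StartDate = $TimeStamp
}

# Casting a hashtable literal to [pscustomobject] also preserves that order.
$row = [pscustomobject]@{
    Username  = 'anyplane'
    StartDate = $TimeStamp
}

@($details.Keys) -join ','                  # Username,StartDate
$row.PSObject.Properties.Name -join ','     # Username,StartDate
```

Either $row or an object built from $details can then be piped to Export-Csv, and the columns come out in the declared order.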
If you do want to append to the existing file, you can use the following, but note that:
this approach does not scale well, because the entire log file is read into memory every time (and converted to objects), though limiting the entries to a month's worth should be fine.
as stated, the approach is brittle, as things can go wrong while rewriting the file; consider simply writing 2 rows per execution instead, which allows you to append to the file line by line.
there's no concurrency management, so the assumption is that only ever one instance of the script is run at a time.
$FilePath = './t.csv'
$TimeStamp = (Get-Date).toString("dd/MMM/yyyy HH:mm:ss")
$env:USERNAME = $env:USER
# Log start of execution. Note the empty 'FinishDate' property
# to ensure all rows ultimately have the same column structure.
[pscustomobject] @{ Username = $env:USERNAME; StartDate = $TimeStamp; FinishDate = '' } |
    Export-Csv -Notypeinformation -Append $FilePath
# Perform script actions...
# Log end of execution:
# Read the entire existing file...
$logRows = Import-Csv $FilePath
# ... update the last row's .FinishDate property
$logRows[-1].FinishDate = (Get-Date).toString("dd/MMM/yyyy HH:mm:ss")
# ... and rewrite the entire file, keeping only the last 30 entries
$logRows[-30..-1] | Export-Csv -Notypeinformation $FilePath
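The alternative mentioned in the caveats, writing two rows per execution so the file is only ever appended to, might look roughly like this. The Write-LogEvent helper, the EventName column, and the temp-folder file name are all illustrative assumptions, not part of the original script:

```powershell
# Hypothetical log location in the temp folder so the sketch is safe to run.
$FilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'MonthlyLog_sketch.csv'
Remove-Item $FilePath -ErrorAction SilentlyContinue   # start clean for the demo

# Write-LogEvent is an illustrative helper that appends one row per event.
function Write-LogEvent {
    param([string] $EventName)
    [pscustomobject]@{
        Username  = [Environment]::UserName
        EventName = $EventName
        TimeStamp = (Get-Date).ToString('dd/MMM/yyyy HH:mm:ss')
    } | Export-Csv -NoTypeInformation -Append $FilePath
}

Write-LogEvent -EventName 'Start'
# Perform script actions...
Write-LogEvent -EventName 'Finish'

# Read the log back to confirm that both rows were written.
Import-Csv $FilePath
```

Because every row has the same three columns, -Append never has to deal with a changed column structure, and the existing file is never read back in or rewritten. Pairing up Start and Finish rows is then left to whoever analyzes the log.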
Because your CSV already has a structure (i.e. defined headers), PowerShell honours this when appending and doesn't add additional columns. It is (sort of) explained in this excerpt from the Export-Csv help:
When you submit multiple objects to Export-CSV, Export-CSV organizes
the file based on the properties of the first object that you submit.
If the remaining objects do not have one of the specified properties,
the property value of that object is null, as represented by two
consecutive commas. If the remaining objects have additional
properties, those property values are not included in the file.
You could include the FinishDate property in the original file (even though it would be empty), but the best option might be to export your output to a different CSV at the end, perhaps deleting the original after import then recreating it with the additional data. In fact, just removing the -Append will likely give the result you want.