Powershell replace text in a field from import-csv - powershell

I'm reading in a large CSV file via Import-Csv and have a column of data with the following format: v00001048, v00019045, or v0036905. I'd like to replace all the zeros (0) after the v but before the first non-zero digit, so the above text becomes: v-1048, v-19045, or v-36905. I've done plenty of searches without successful results.

If you have a CSV (say, 'data.csv') with data like this:
Property1,Property2,Property3
SomeText,MoreText,v00001048
Then you can replace the leading zeros in Property3 using this technique:
$data = Import-Csv .\data.csv
$data |
    ForEach-Object {
        $_.Property3 = $_.Property3 -replace "(?<=v)0+(?=\d+)","-"
    }
If the property doesn't have any leading zeros to start with (e.g. v1048) this will leave it untouched. If you'd like it to insert the '-' anyway, then change the regex pattern to:
"(?<=v)0*(?=\d+)"

Powershell: Replace headers while using Import-CSV

I found a related answer here that is really helpful, but not quite what I'm looking for. There are also a number of other questions I've looked at, but unfortunately I can't figure out how to get this to work, and it seems like it should be rather simple.
Basically, I'm using Import-Csv and manipulating a lot of data; but the names of the headers can sometimes change. So instead of re-writing my code, I'd like to map the headers I'm given to the headers that are used in my code blocks. Outputting the final data as a CSV, I can leave it using the 'updated headers' or, if I can figure out how to swap headers easily, I could always swap them back to what they were.
So let's say I have a mapping file in Excel. I can do the mapping in rows or columns, whichever will be easier. For this first example, I have the mapping in rows. When I use Import-CSV, I want to use the Headers from Row #2 instead of the headers in Row #1. Here's the content of the mapping file:
So basically if I hard coded this all, I'd have something like:
$null, $headerRow, $dataRows = (Get-Content -Raw foo.csv) -split '(^.+\r?\n)', 2
ConvertFrom-Csv ($headerRow.Trim() -replace 'Identification', 'ID' -replace 'Revenue Code', 'Revenue_Code' -replace 'Total Amount for Line', 'Amount' -replace 'Total Quantity for Line', 'Qty'), $dataRows
Except I don't want to hard code it, I am basically looking for a way to use Replace with a mapping file or hashtable if I can create one.
#Pseudo code for what I want
$hashtable = Get-Content mapping.xlsx
ConvertFrom-Csv ($headerRow.Trim() -replace $hashtable.Name, $hashtable.Value), $dataRows
I'm probably failing (and failing to find similar examples) because I'm trying to be flexible about the format of the mapping file. My original idea was to basically treat the 1st row as a string and replace that entire string with the second row. But the hashtable idea came from the thought of restructuring the mapping to look like this:
Here I would basically -replace each Source value with the corresponding Target value.
EDIT If you need to convert back, give this a shot - but keep in mind it'll only work if you have a one-to-one relationship of Source:Target values.
#Changing BACK to the original Headers...
$Unmap = @{}
(Import-Csv MappingTable.csv).ForEach({$Unmap[$_.Target] = $_.Source})
#Get string data from CSV Objects
$stringdata = $outputFixed | ConvertTo-CSV -NoTypeInformation
$headerRow = $stringdata[0]
$dataRows = $stringdata[1..($stringdata.Count-1)] -join "`r`n"
#Create new header data
$unmappedHeaderRow = ($headerRow -replace '"' -split ',').ForEach({'"' + $Unmap[$_] + '"'}) -join ','
$newdata = ConvertFrom-Csv $unmappedHeaderRow, $dataRows
Here's a complete example that builds on your original attempt:
It provides the column-name (header) mapping via (another) .csv file, with columns Source and Target, where each row maps a source name to a target name, as (also) shown in your question.
The mapping CSV file is transformed into a hashtable that maps source names to target names.
The data CSV file is then read as plain text, as in your question - efficiently, but in full - split into header row and data rows, and a new header row with the mapped names is constructed with the help of the hashtable.
The new header row plus the data rows are then sent to ConvertFrom-Csv for to-object conversion based on the mapped column (property) names.
# Create sample column-name mapping file.
@'
Source,Target
Identification,Id
Revenue Code,Revenue_Code
'@ > mapping.csv
# Create a hashtable from the mapping CSV file
# that maps each Source column value to its Target value.
$map = @{}
(Import-Csv mapping.csv).ForEach({ $map[$_.Source] = $_.Target })
# Create sample input CSV file.
@'
Revenue Code,Identification
r1,i1
r2,i2
'@ > data.csv
# Read the data file as plain text, split into a header line and
# a multi-line string comprising all data lines.
$headerRow, $dataRows = (Get-Content -Raw data.csv) -split '\r?\n', 2
# Create the new header based on the column-name mapping.
$mappedHeaderRow =
($headerRow -replace '"' -split ',').ForEach({ $map[$_] }) -join ','
# Parse the data rows with the new header.
$mappedHeaderRow, $dataRows | ConvertFrom-Csv
The above outputs the following, showing that the columns were effectively mapped (renamed):
Revenue_Code Id
------------ --
r1           i1
r2           i2
The easiest thing to do here is to process the CSV and then transform each row, from whatever format it was in, into the new desired target format.
Pretend we have an input CSV like this.
RowID,MayBeNull,MightHaveAValue
1,,Value1
2,Value2,
3,,Value3
Then we import the csv like so:
# Helper function for ugly logic
function HasValue($param){
    return -not [string]::IsNullOrEmpty($param)
}
$csv = Import-Csv C:\pathTo\this.csv
foreach($row in $csv){
    if (HasValue($row.MayBeNull)){
        $newColumn = $row.MayBeNull
    }
    else{
        $newColumn = $row.MightHaveAValue
    }
    # Generate new output
    [pscustomobject]@{
        Id        = $row.RowID
        NewColumn = $newColumn
    }
}
Which gives the following output:
Id NewColumn
-- ---------
1  Value1
2  Value2
3  Value3
This is an easy pattern to follow for a data migration script, then you just need to scale it up to fix your problem.
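To scale it up into something closer to a migration script, the transformed rows can simply be captured and exported; here's a minimal sketch reusing the HasValue helper and $csv from above (the output path is a placeholder):
$migrated = foreach ($row in $csv) {
    # Same pick-the-populated-column logic as above.
    if (HasValue($row.MayBeNull)) {
        $newColumn = $row.MayBeNull
    }
    else {
        $newColumn = $row.MightHaveAValue
    }
    [pscustomobject]@{
        Id        = $row.RowID
        NewColumn = $newColumn
    }
}
# Write the transformed rows out; the target path is a placeholder.
$migrated | Export-Csv C:\pathTo\migrated.csv -NoTypeInformation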

Question regarding incrementing a string value in a text file using Powershell

Just beginning with PowerShell. I have a text file that contains the string "CloseYear/2019" and I'm looking for a way to increment the "2019" to "2020". Any advice would be appreciated. Thank you.
If the question is how to update text within a file, you can do the following, which will replace specified text with more specified text. The file (t.txt) is read with Get-Content, the targeted text is updated with the String class Replace method, and the file is rewritten using Set-Content.
(Get-Content t.txt).Replace('CloseYear/2019','CloseYear/2020') | Set-Content t.txt
Additional Considerations:
General incrementing would require an object type that supports incrementing. You can isolate the numeric data using -split, increment it, and create a new, joined string. This solution assumes 32-bit integers but can be adapted to other numeric types.
$str = 'CloseYear/2019'
-join ($str -split "(\d+)" | Foreach-Object {
    if ($_ -as [int]) {
        [int]$_ + 1
    }
    else {
        $_
    }
})
Putting it all together, the following would result in incrementing all complete numbers (123 as opposed to 1 and 2 and 3 individually) in a text file. Again, this can be tailored to target more specific numbers.
$contents = Get-Content t.txt -Raw # Raw to prevent an array output
-join ($contents -split "(\d+)" | Foreach-Object {
    if ($_ -as [int]) {
        [int]$_ + 1
    }
    else {
        $_
    }
}) | Set-Content t.txt
Explanation:
-split uses regex matching to split on the matched result resulting in an array. By default, -split removes the matched text. Creating a capture group using (), ensures the matched text displays as is and is not removed. \d+ is a regex mechanism matching a digit (\d) one or more (+) successive times.
Using the -as operator, we can test that each item in the split array can be cast to [int]. If successful, the if statement will evaluate to true, the text will be cast to [int], and the integer will be incremented by 1. If the -as operator is not successful, the pipeline object will remain as a string and just be output.
The -join operator just joins the resulting array (from the Foreach-Object) into a single string.
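To make that concrete, here is what the split produces for the sample string, and how each piece fares with the -as test (results shown as comments):
'CloseYear/2019' -split "(\d+)"   # -> 'CloseYear/', '2019', '' (trailing empty string)
'CloseYear/' -as [int]            # -> $null, so the else branch passes the text through
'2019' -as [int]                  # -> 2019, so the if branch increments it to 2020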
AdminOfThings' answer is very detailed and the correct answer.
I wanted to provide another answer for options.
Depending on what your end goal is, you might need to convert the date to a datetime object for future use.
Example:
$yearString = 'CloseYear/2019'
#convert to datetime
[datetime]$dateConvert = [datetime]::new((($yearString -split "/")[-1]),1,1)
#add year
$yearAdded = $dateConvert.AddYears(1)
#if you want to display "CloseYear" with the new date and write-host
$out = "CloseYear/{0}" -f $yearAdded.Year
Write-Host $out
This approach would allow you to use $dateConvert and $yearAdded as a datetime allowing you to accurately manipulate dates and cultures, for example.
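And if the end goal is still to update the file itself, the incremented string can be written back with the same Get-Content/Set-Content pattern from the first answer (assuming the same t.txt file used earlier):
# Swap the original 'CloseYear/2019' for the rebuilt 'CloseYear/2020' string.
(Get-Content t.txt).Replace($yearString, $out) | Set-Content t.txt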

How to check column count in a file to satisfy a condition

I am trying to write a PowerShell script to check a file's column count and see if it satisfies the condition, or else throw an error or send an email.
Something I have tried:
$columns=(Get-Content "C:\Users\xs15169\Desktop\temp\OEC2_CFLOW.txt" | select -First 1).Split(",")
$Count=columns.count
if ($count -eq 280)
echo "column count is:$count"
else
email
I'm going to assume your text file is in CSV format, I can't imagine what format you're working with if it's a text-file table and not formatted as CSV.
If your CSV has headers
Process the CSV file, and count the number of properties on the resulting Powershell object.
$columnCount = @( ( Import-Csv '\path\to\file.txt' | Select-Object -First 1 ).PSObject.Properties ).Count
We need to force the Properties collection into an array (which is the @() syntax) to accurately get the count. Selecting the first object gives us a single row to inspect; its PSObject property is a hidden property for metadata about an object in PowerShell, which is where we look for the Properties (column names) and get the count of how many there are.
CSV without headers
If your CSV doesn't have headers, Import-Csv requires you to manually specify the headers. There are tricks you can do to build out unique column names on-the-fly, but they are overly complex for simply getting a column count.
To take what you've already tried above, we can get the data in the first line and process the number of columns, though you were doing it incorrectly in the question. Here's how to properly do it:
$columnCount = ( ( Get-Content "\path\to\file.txt" | Select-Object -First 1 ) -Split ',' ).Count
What was wrong with the original
Both above solutions consolidate getting the column count down to one line of code. But in your original sample, you made a couple small mistakes:
$columns=( Get-Content "\path\to\file.txt" | select -First 1 ).Split(",")
# You forgot to prepend "columns" with a $. Should look like the below line
$Count=$columns.count
And you forgot to use curly braces with your if block:
if ($count -eq 280) {
echo "column count is:$count"
} else {
email
}
As for using the -Split operator vs. the .Split() method - this is purely stylistic preference on my part, and using Split() is perfectly valid.
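Putting either snippet together with the original if/else, a minimal sketch might look like this (the mail addresses and SMTP server are placeholders for your environment):
# $columnCount comes from either of the snippets above.
$columnCount = ( ( Get-Content "\path\to\file.txt" | Select-Object -First 1 ) -split ',' ).Count

# Placeholder mail settings - adjust to your environment.
$mail = @{
    To         = 'ops@example.com'
    From       = 'alerts@example.com'
    Subject    = "Unexpected column count: $columnCount"
    SmtpServer = 'smtp.example.com'
}

if ($columnCount -eq 280) {
    Write-Output "column count is: $columnCount"
}
else {
    Send-MailMessage @mail
}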

Read a CSV in powershell with a variable number of columns

I have a CSV that contains a username, and then one or more values for the rest of the record. There are no headers in the file.
joe.user,Accounting-SG,CustomerService-SG,MidwestRegion-SG
frank.user,Accounting-SG,EastRegion-SG
I would like to read the file into a powershell object where the Username property is set to the first column, and the Membership property is set to either the remainder of the row (including the commas) or ideally, an array of strings with each element containing a single membership value.
Unfortunately, the following line only grabs the first membership and ignores the rest of the line.
$memberships = Import-Csv -Path C:\temp\values.csv -Header "username", "membership"
@{username=joe.user; membership=Accounting-SG}
@{username=frank.user; membership=Accounting-SG}
I'm looking for either of these outputs:
@{username=joe.user; membership=Accounting-SG,CustomerService-SG,MidwestRegion-SG}
@{username=frank.user; membership=Accounting-SG,EastRegion-SG}
or
@{username=joe.user; membership=string[]}
@{username=frank.user; membership=string[]}
I've been able to get the first result by enclosing the "rest" of the data in the csv file in quotes, but that doesn't really feel like the best answer:
joe.user,"Accounting-SG,CustomerService-SG,MidwestRegion-SG"
Well, the issue is that what you have isn't really a (proper) CSV. The CSV format doesn't support that notation.
You can "roll your own" and just process the file yourself, something like this:
$memberships = Get-Content -LiteralPath C:\temp\values.csv |
    ForEach-Object -Process {
        $user,$membership = $_.Split(',')
        New-Object -TypeName PSObject -Property @{
            username   = $user
            membership = $membership
        }
    }
You could do a half and half sort of thing. Using your modification, where the groups are all a single field in quotes, do this:
$memberships = Import-Csv -Path C:\temp\values.csv -Header "username", "membership" |
    ForEach-Object -Process {
        $_.membership = $_.membership.Split(',')
        $_
    }
The first example just reads the file line by line, splits on commas, then creates a new object with the properties you want.
The second example uses Import-Csv to create the object initially, then just resets the .membership property (it starts as a string, and we split the string so it's now an array).
The second way only makes sense if whatever is creating the "CSV" can create it that way in the first place. If you have to modify it yourself every time, just skip this and process it as it is.
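Either way, the membership property ends up as a real array, so you can index, count, and filter it; for example:
$memberships[0].username          # joe.user
$memberships[0].membership.Count  # 3 (Accounting-SG, CustomerService-SG, MidwestRegion-SG)
# Find every user that belongs to a particular group.
$memberships | Where-Object { $_.membership -contains 'Accounting-SG' }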

cleanup improperly formatted csv file

I am downloading an xlsx file from a SharePoint site and then converting it into a CSV file. However, since the xlsx file contained empty columns that were never deleted, those columns end up in the CSV file as follows...
columnOne,columnTwo,columnThree,,,,
valueOne,,,,,,
,valueTwo,,,,,
,,valueThree,,,,
As you can see, the Import-Csv cmdlet will fail with that file because of the extra null titles. I want to know how to count the extra commas at the end. The number of columns is always changing, and the names of the columns are also always changing, so we have to start the count from the last non-null title.
Right now, I'm doing the following...
$csvFileEdited = Get-Content $csvFile
$csvFileEdited[0] = $csvFileEdited[0].TrimEnd(',')
$csvFileEdited | Set-Content "$csvFile-temp"
Move-Item "$csvFile-temp" $csvFile -Force
Write-Host "Trim Complete."
This will make the file output like this...
columnOne,columnTwo,columnThree
valueOne,,,,,,
,valueTwo,,,,,
,,valueThree,,,,
The naming is now accepted by Import-Csv, but as you can see there are still extra null values that are unnecessary, since they are null for every row.
If I did the following code...
$csvFileWithExtraCommas = Get-Content $csvFile
$csvFileWithoutExtraCommas = @()
ForEach ($line in $csvFileWithExtraCommas)
{
    $line = $line.TrimEnd(',')
    $csvFileWithoutExtraCommas += $line
}
$csvFileWithoutExtraCommas | Set-Content "$csvFile-temp"
Move-Item "$csvFile-temp" $csvFile -Force
Write-Host "Trim Complete."
Then it would also remove null values that should be kept as placeholders, because they belong to non-null title names. This is the output:
columnOne,columnTwo,columnThree
valueOne
,valueTwo
,,valueThree
Here is the desired output:
columnOne,columnTwo,columnThree
valueOne,,
,valueTwo,
,,valueThree
Can anyone help with this?
Update
I'm using the following code to count the extra null titles...
$csvFileWithCommas = Get-Content $csvFile
[int]$csvFileWithExtraCommasNumber = $csvFileWithCommas[0].Length
$csvFileTitlesWithoutExtraCommas = $csvFileWithCommas[0].TrimEnd(',')
[int]$csvFileWithoutExtraCommasNumber = $csvFileTitlesWithoutExtraCommas.Length
$numOfCommas = $csvFileWithExtraCommasNumber - $csvFileWithoutExtraCommasNumber
The value of $numOfCommas is 4. Now the question is: how can I use $line.TrimEnd(',') to trim only 4 commas?
OK... If you really need to do this, you can count the trailing commas from the header and use regex to remove that many from the end of each line. There are other string-manipulation approaches, but the regex in this case is pretty clean.
Note that what Bluecakes' answer shows should suffice. Perhaps there are some hidden characters that did not get copied into the question, or perhaps an encoding issue with your real file.
$file = Get-Content "D:\temp\text.csv"
# Number of trailing commas. Compare the length before and after the trim
$numberofcommas = $file[0].Length - $file[0].TrimEnd(",").Length
# Use regex to remove as many commas from the end of each line and convert to csv object.
$file -replace ",{$numberofcommas}$" | ConvertFrom-Csv
The regex is looking for X commas at the end of each line, where X is $numberofcommas. In our case it would look like ,{4}$
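For example, with four trailing commas detected, one of the data lines is transformed like this:
',,valueThree,,,,' -replace ",{4}$"   # -> ,,valueThree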
The source file used with the above code was generated like this:
@"
columnOne,columnTwo,columnThree,,,,
valueOne,,,,,,
,valueTwo,,,,,
,,valueThree,,,,
"@ | Set-Content D:\temp\text.csv
Are you getting an error when trying to Import-Csv? The cmdlet is smart enough to ignore columns without a heading, with no additional code needed.
I copied your csv file to my H:\ drive:
columnOne,columnTwo,columnThree,,,,
valueOne,,,,,,
,valueTwo,,,,,
,,valueThree,,,,
and then ran $nullcsv = Import-Csv -Path H:\nullcsv.csv and this is what I got:
PS> $nullcsv
columnOne columnTwo columnThree
--------- --------- -----------
valueOne
          valueTwo
                    valueThree
The imported csv only contains 3 values as you would expect:
PS> $nullcsv.count
3
The cmdlet is also correctly accounting for null values in each of the columns:
PS> $nullcsv | Format-List
columnOne   : valueOne
columnTwo   :
columnThree :

columnOne   :
columnTwo   : valueTwo
columnThree :

columnOne   :
columnTwo   :
columnThree : valueThree