csv reformatting with powershell - powershell

I have a file containing a lot of lines in this format:
firstname ; lastname ; age ;
(it's a bit more complex but that's basically the file)
so the fields are of a fixed length, padded with spaces and with a semicolon in between the fields.
I would like to have it so:
firstname, lastname, age,
(commas and no fixed width)
I have replaced the semicolons with commas using a regex, but I would also like to trim the trailing spaces from each field, and I don't know how to do this.
The following is my start, but I can't manage to get a ".TrimEnd()" in there. I have also thought of trying a "-replace(" ", " ")" but I can't integrate it into this expression:
Get-Content .\Bestand.txt | %{$data= [regex]::split($_, ';'); [string]:: join(',', $data)}
Can I get some information on how to achieve this?

I suggest you replace each occurrence of 'space;space' with a comma (assuming the replaced characters do not appear within a valid value), so the end result will look like:
firstname,lastname,age
Keeping it like the following is not a good idea, because some of your headers (property names) would then start with a space:
"firstname, lastname, age,"
Give this a try (work on a copy of the file):
(Get-Content .\Bestand.txt) |
foreach {$_ -replace ' ; ',','} |
out-file .\Bestand.txt
Now it's easy to import and process the file with Import-Csv cmdlet.

The -replace operator takes a regular expression, which you can use to remove all spaces before and after each semicolon:
Get-Content .\Bestand.txt |
Foreach-Object { $_ -replace ' *; *',',' } |
Out-File .\Bestand.csv -Encoding OEM
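As a quick check, the same kind of pattern can be tried on a single line before running it over the whole file. This is a minimal sketch: the sample line is made up for illustration, and \s* is used instead of ' *' on the assumption that tabs should be stripped as well.

```powershell
# Hypothetical sample line in the fixed-width format from the question
$line = 'John      ; Smith     ; 42  ;'

# Drop any whitespace around each semicolon and turn the semicolon into a comma
$converted = $line -replace '\s*;\s*', ','

# Optionally remove the comma left over from the trailing semicolon
$trimmed = $converted.TrimEnd(',')

$converted   # John,Smith,42,
$trimmed     # John,Smith,42
```

Whether the trailing comma should stay depends on whether the consumer of the file expects an empty last column.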

Since you already create something CSV-ish, I'd go all the way and create proper CSV:
$cols = "firstname","lastname","age","rest"
Import-Csv "C:\input.txt" -Delimiter ";" -Header $cols | % {
foreach ($property in $_.PsObject.Properties) {
$property.Value = ([string]$property.Value).Trim()
}
$_
} | Export-Csv "C:\output.csv" -NoTypeInformation
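The trimming loop can be tried without touching any files by feeding a couple of lines to ConvertFrom-Csv, which parses strings the same way Import-Csv parses a file. This is a sketch under assumptions: the two sample lines and the $rows variable are invented for illustration.

```powershell
# Hypothetical sample lines; the trailing semicolon produces an empty fourth field
$raw  = 'John      ; Smith     ; 42  ;', 'Jane      ; Doe       ; 37  ;'
$cols = 'firstname', 'lastname', 'age', 'rest'

$rows = @(
    $raw | ConvertFrom-Csv -Delimiter ';' -Header $cols | ForEach-Object {
        # Trim the padding from every field of the current row
        foreach ($property in $_.PSObject.Properties) {
            $property.Value = ([string]$property.Value).Trim()
        }
        $_
    }
)

$rows[0].lastname   # Smith
$rows[1].age        # 37
```

Piping $rows to Select-Object firstname, lastname, age before exporting would drop the empty "rest" column, if it is not wanted in the output.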

Related

How to add quotation mark to existing text in a csv file using PowerShell

I need to convert strings in a csv file to strings with quotation marks around it.
My csv file looks like this:
Description;AllowHosts;SPNs;Owner
Description1;server1$, server2$, server3$;MSSQLSvc/PD01.dom1.com:1521,MSSQLSvc/PD01.dom1;Owner JDOE
Description2;server4$, server5$, server6$;MSSQLSvc/PD02.dom2.com:1521,MSSQLSvc/PD02.dom2;Owner JDOE
Description3;server7$, server8$, server9$;MSSQLSvc/PD03.dom1.com:1521,MSSQLSvc/PD03.dom1;Owner JDOE
I tried to search for header "AllowHosts" and replace with quotation mark in start and end,
$csv = @(
Import-Csv -Path $New -Delimiter ';' -Encoding UTF8
)
$data = ConvertFrom-Csv $csv
$Data[0].AllowHosts = '"'
$Data | where AllowHosts -Like '*$' | foreach {
$_.AllowHosts = '*$"'
}
$Data | where AllowHosts -Like 'SF' | foreach {
$_.AllowHosts = '"SF*'
}
$Data | ConvertTo-Csv -NoTypeInformation
but it did not work as expected....
I would like to have quotation mark around each string
in column "AllowHosts" (servernames)
in column "SPNs"
I am hoping for a result like this:
Description;AllowHosts;SPNs;Owner
Description1;"server1$", "server2$", "server3$";"MSSQLSvc/PD01.dom1.com:1521","MSSQLSvc/PD01.dom1";Owner JDOE
Description2;"server4$", "server5$", "server6$";"MSSQLSvc/PD02.dom2.com:1521","MSSQLSvc/PD02.dom2";Owner JDOE
Description3;"server7$", "server8$", "server9$";"MSSQLSvc/PD03.dom1.com:1521","MSSQLSvc/PD03.dom1";Owner JDOE
But how?
I have a powershell script that imports csv-file and creates json-files. My problem is that this line
" ""PrincipalsAllowedToRetrieveManagedPassword"""+": [" | Out-File $filepath1 -Append
gives this result
"PrincipalsAllowedToRetrieveManagedPassword": [ "server1$, server2$, server3$" ],
instead of
"PrincipalsAllowedToRetrieveManagedPassword": [ "server1$", "server2$", "server3$" ],
Use the -replace operator to add "'s around each "word" in the string:
# read data into memory
$csv = Import-Csv -Path $New -Delimiter ';' -Encoding UTF8
# modify all `AllowHosts` and `SPN` cells
$csv |ForEach-Object {
$_.AllowHosts = $_.AllowHosts -replace '([^\s,]+)','"$1"'
$_.SPNs = $_.SPNs -replace '([^\s,]+)','"$1"'
}
# re-export
$csv |Export-Csv -Path path\to\export.csv -NoTypeInformation
The pattern ([^\s,]+) matches (and captures) any consecutive sequence of characters not containing , or whitespace, and in the substitution string "$1" the $1 expands to the captured text, so each match is replaced by itself wrapped in double quotes.
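To see the substitution in isolation, the same -replace call can be applied to a single value taken from the sample data. This is only a sketch on an in-memory string, not on the file; the variable names are illustrative.

```powershell
# One AllowHosts value from the sample data (single quotes keep the $ literal)
$allowHosts = 'server1$, server2$, server3$'

# Wrap every run of characters that are neither whitespace nor commas in double quotes
$quoted = $allowHosts -replace '([^\s,]+)', '"$1"'

$quoted   # "server1$", "server2$", "server3$"
```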
Beware that this introduces ambiguity, as "'s are also used as value qualifiers in CSVs - so Export-Csv will escape the quotation marks you've added to retain them, and the resulting file will look like this:
"Description","AllowHosts","SPNs","Owner"
"Description1","""server1$"", ""server2$"", ""server3$""","""MSSQLSvc/PD01.dom1.com:1521"",""MSSQLSvc/PD01.dom1""","Owner JDOE"
"Description2","""server4$"", ""server5$"", ""server6$""","""MSSQLSvc/PD02.dom2.com:1521"",""MSSQLSvc/PD02.dom2""","Owner JDOE"
"Description3","""server7$"", ""server8$"", ""server9$""","""MSSQLSvc/PD03.dom1.com:1521"",""MSSQLSvc/PD03.dom1""","Owner JDOE"

Remove additional commas in CSV file using Powershell

I have a CSV file that I'd like to import into SQL, but it isn't properly formatted. I am not able to change the format of the generated file (an Excel file), so I'm looking to do this with the CSV file using PowerShell. I want to remove the extra commas and also replace the empty department field (,,,,,,) with the correct department name, as seen in the example below. Thank you in advance.
Example:
Current Format:
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
Ideal format:
Department,First Name,Last Name,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Psychology,Jane,Doe,2022,Enrolled
Psychology,Jeff,Dane,2019,Enrolled
Psychology,Tate,Anderson,2019,Not Enrolled
Psychology,Daphne,Miller,2021,Enrolled
Psychology,Cora,Dame,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
Computer Science,Dora,Explorer,2022,Not Enrolled
Computer Science,Peppa,Diggs,2020,Enrolled
Computer Science,Conrad,Strat,2020,Enrolled
Computer Science,Kat,Noir,2019,Enrolled
Computer Science,Lance,Bug,2018,Enrolled
here you go:
$csvArray = new-object System.Collections.Generic.List[string]
#Import the file
$text = (gc "C:\tmp\testdata.txt") -replace ",{2,}",","
$arrayEnd = $text.count -1
$text[1..$arrayEnd] | %{
If ($_ -notmatch "^(,)"){
# department-only line: strip the commas and the trailing padding
$department = ($_ -replace ",").Trim()
}
Else {
$csvArray.add($department + $_)
}
}
$csvArray.Insert(0,$text[0])
$csvArray | set-content 'C:\tmp\my.csv'
Using the Csv cmdlets:
$Csv = @'
Department,,,,,,First Name,,,,Last Name,,,,,,,School Year,Enrolment Status
Psychology ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled
,,,,,,Jeff,,,,Dane,,,,,,,2019,Enrolled
,,,,,,Tate,,,,Anderson,,,,,,,2019,Not Enrolled
,,,,,,Daphne,,,,Miller,,,,,,,2021,Enrolled
,,,,,,Cora,,,,Dame,,,,,,,2022,Enrolled
Computer Science ,,,,,,,,,,,,,,,,,,,,,,, (Remove this line)
,,,,,,Dora,,,,Explorer,,,,,,,2022,Not Enrolled
,,,,,,Peppa,,,,Diggs,,,,,,,2020,Enrolled
,,,,,,Conrad,,,,Strat,,,,,,,2020,Enrolled
,,,,,,Kat,Noir,,,,2019,,,,,,,Enrolled
,,,,,,Lance,,,,Bug,2018,,,,,,,Enrolled
'@
$List = ConvertFrom-Csv $Csv -Header @(1..20) # |Import-Csv .\Your.Csv -Header @(1..20)
$Columns = $List[0].PSObject.Properties.Where{ $_.Value -and $_.Value -ne 'Department' }.Name
$List |Select-Object -Property $Columns |Where-Object { $_.$($Columns[0]) } |
ConvertTo-Csv -UseQuote Never |Select-Object -Skip 1 # |Set-Content -Encoding utf8 out.csv
First Name,Last Name,School Year,Enrolment Status
Jane,Doe,2022,Enrolled
Jeff,Dane,2019,Enrolled
Tate,Anderson,2019,Not Enrolled
Daphne,Miller,2021,Enrolled
Cora,Dame,2022,Enrolled
Dora,Explorer,2022,Not Enrolled
Peppa,Diggs,2020,Enrolled
Conrad,Strat,2020,Enrolled
Kat,,,Enrolled
Lance,Bug,,Enrolled
Use a switch statement:
& {
$first = $true
switch -Wildcard -File in.csv { # Loop over all lines in file in.csv
',*' { # intra-department line
# Prepend the department name, eliminate empty fields and output.
$dept + ',' + (($_ -split ',' -ne '') -join ',')
}
default {
if ($first) { # header line
# Eliminate empty fields and output.
($_ -split ',' -ne '') -join ','
$first = $false
}
else { # department-only line
$dept = ($_ -split ',')[0].Trim() # save department name, without padding
}
}
}
} | Set-Content -Encoding utf8 out.csv
Note:
$_ -split ',' splits each line into fields by ,, and -ne '' filters out empty fields from the resulting array; applying -join ',' rejoins the nonempty fields with ,, which in effect removes multiple adjacent , and thereby eliminates empty fields.
If you don't mind the complexity of a regex, you can perform the above more simply with a single -replace operation, as shown in Toni's helpful answer.
Using switch -File is an efficient way to read files line by line and perform conditional processing based on sophisticated matching (as an alternative to -Wildcard you can use -Regex for regex matching, and you can even use script blocks ({ ... }) as conditionals).
As a language statement, switch cannot be used directly in a pipeline.
This limitation can be overcome by enclosing it in a script block ({ ... }) invoked with &, which enables the usual, memory-friendly streaming behavior in the pipeline; that is, the lines are processed one by one, as are the modified output lines relayed to Set-Content, so that the input file needn't be read into memory as a whole.
In your case, plain-text processing of your CSV file enabled a simple solution, but in general it is better to parse CSV files into objects whose properties you can work with, using the Import-Csv cmdlet, and, for later re-exporting to a CSV file, Export-Csv.
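The split, filter and join steps described in the note can be followed on one data line from the sample file. This is a minimal sketch with illustrative variable names; keep in mind that filtering with -ne '' also drops any field that is legitimately empty.

```powershell
# One data line from the sample file
$line = ',,,,,,Jane,,,,Doe,,,,,,,2022,Enrolled'

# -split yields every field, including the empty ones; -ne '' keeps only the nonempty ones
$fields = $line -split ',' -ne ''

# Rejoin the surviving fields with single commas
$joined = $fields -join ','

$fields.Count   # 4
$joined         # Jane,Doe,2022,Enrolled
```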

Read text file and check for value in a specific position and change when true

I need to loop through multiple text files and check for a $ value in position 7 on each line of text and replace it with an * when found. But ONLY when it is in position 7. I do not want to change it if it is found in other positions. This is as far as I have gotten. Any help would be greatly appreciated.
Get-ChildItem 'C:\*.txt' -Recurse |
foreach $line in Get-Content $_ {
$linePosition1to5 = $line.Substring(0,6)
$linePosition7 = $line.Substring(6,1)
$linePositionRest = $line.Substring(8)
if($linePosition7 = "$"){
$linePosition7 = "*"
}
$linePosition1to5 + $linePosition7 + $linePositionRest |
Set-Content $_
}
Is there something that doesn't work in your example, or just that all the nested substrings are annoying to work with?
I'd use regex for this one. e.g.
$Lines = Get-Content -Path "C:\examplefile.txt" -raw
$Lines -replace '(?m)(^.{6})\$', '$1*'
To explain the regex:
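(?m) enables multiline mode, so that ^ matches at the start of every line; it is required because I used Get-Content -Raw, which returns a single string, rather than pulling an array of lines. An array would work too, since the replacement is then applied line by line.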
^.{6} line start plus any 6 characters (capture group 1)
\$ escaped dollar character
$1* Capture group 1 left as is, dollar replaced with *, anything else not captured and therefore left untouched.
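The pattern can be tested on a couple of in-memory lines before it is pointed at real files. This is a sketch with invented sample lines; it shows both the array form (no (?m) needed) and the single-string form that corresponds to Get-Content -Raw.

```powershell
# Hypothetical sample lines (single quotes keep the $ literal)
$lines = 'ABCDEF$XYZ$', 'AB$DEFGHI'

# Array input: -replace runs on each element, so ^ already means "start of line"
$perLine = $lines -replace '^(.{6})\$', '$1*'

# Single multi-line string (like Get-Content -Raw): (?m) makes ^ match after each newline
$raw   = $lines -join "`n"
$multi = $raw -replace '(?m)^(.{6})\$', '$1*'

$perLine[0]   # ABCDEF*XYZ$
$perLine[1]   # AB$DEFGHI
```

Only the $ in position 7 of the first line is replaced; the $ at the end of that line and the $ in position 3 of the second line are left alone.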
Thanks for code and the explanation. I realized that I left out the -raw option and it did work. Putting it back in it seems to add a line to the end of each file. Unless you can think of reason why I shouldn't I was going to leave it out.
Get-ChildItem 'C:\TEST\*.txt' -Recurse | ForEach {
(Get-Content $_ | ForEach { $_ -replace '(?m)(^.{6})\$', '$1*'}) |
Set-Content $_
}

Powershell replace text once per line

I have a PowerShell script that I am trying to work out part of. The text input to it lists each user and the group they are part of. The script is supposed to replace the group with the groups that I am assigning them in Active Directory (I am limited to only changing groups in Active Directory). My issue is that when it reaches HR and replaces it, it then continues and also replaces the HR in CHRL in the new text, so my groups look nuts right now. Looking it over, it doesn't do this on every line, but for gilchrist it puts something in there for the HR in the name. Is there anything I can do to keep it from changing, or am I going to have to change my HR to Human Resources? Thanks for the help.
$lookupTable = @{
'Admin' = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
'HR' = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}
$original_file = 'c:\tmp\test.txt'
$destination_file = 'c:\tmp\test2.txt'
Get-Content -Path $original_file | ForEach-Object {
$line = $_
$lookupTable.GetEnumerator() | ForEach-Object {
if ($line -match $_.Key)
{
$line = $line -replace $_.Key, $_.Value
}
}
$line
} | Set-Content -Path $destination_file
Get-Content $destination_file
test.txt:
user,group
john.smith,Admin
joanha.smith,HR
john.gilchrist,security
aaron.r.smith,admin
abby.doe,secuity
abigail.doe,admin
Your input appears to be in CSV format (though note that your sample rows have trailing spaces, which you'd have to deal with, if they're part of your actual data).
Therefore, use Import-Csv and Export-Csv to read / rewrite your data, which allows a more concise and convenient solution:
Import-Csv test.txt |
Select-Object user, @{ Name='group'; Expression = { $lookupTable[$_.group] } } |
Export-Csv -NoTypeInformation -Encoding Utf8 test2.txt
Import-Csv reads the CSV file as a collection of custom objects whose properties correspond to the CSV column values; that is, each object has a .user and a .group property in your case.
$_.group therefore robustly reports the abstract group name only, which you can directly pass to your lookup hashtable; Select-Object is used to pass the original .user value through, and to replace the original .group value with the lookup result, using a calculated property.
Export-Csv re-converts the custom objects to a CSV file:
-NoTypeInformation suppresses the (usually useless) data-type-information line at the top of the output file
-Encoding Utf8 was added to prevent potential data loss, because Windows PowerShell uses ASCII encoding by default.
Note that Export-Csv blindly double-quotes all field values, whether they need it or not; that said, CSV readers should be able to deal with that (and Import-Csv certainly does).
As for what you tried:
The -replace operator replaces all occurrences of a given regex (regular expression) in the input.
Your regexes amount to looking for (case-insensitive) substrings, which explains why HR matches both the HR group name and the substring hr in the username gilchrist.
A simple workaround would be to add assertions to your regex so that the substrings only match where you want them; e.g.: ,HR$ would only match after a , at the end of a line ($).
However, your approach of enumerating the hashtable keys for each input CSV row is inefficient, and you're better off splitting off the group name and doing a straight lookup based on it:
# Split the row into fields.
$fields = $line -split ','
# Update the group value (last field)
$fields[-1] = $lookupTable[$fields[-1]]
# Rebuild the line
$line = $fields -join ','
Note that you'd have to make an exception for the header row (e.g., test if the lookup result is empty and refrain from updating, if so).
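Put together, the split-and-lookup approach with the header exception might look like the following. This is a sketch rather than a drop-in replacement: it works on a few in-memory sample lines instead of the file, trims the group name on the assumption that rows may carry trailing spaces, and leaves any row whose group is not in the table (including the header) unchanged.

```powershell
$lookupTable = @{
    'Admin'    = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
    'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
    'HR'       = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'
}

# Sample rows in the shape of test.txt
$lines = 'user,group', 'john.gilchrist,HR', 'john.smith,Admin '

$result = foreach ($line in $lines) {
    # Split the row into fields and take the group name (last field), without padding
    $fields = $line -split ','
    $key = $fields[-1].Trim()

    # Only rewrite the field when the group is known; this skips the header row
    if ($lookupTable.ContainsKey($key)) {
        $fields[-1] = $lookupTable[$key]
    }

    # Rebuild the line
    $fields -join ','
}

$result[1]   # john.gilchrist,M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS
```

Because the lookup is done on the whole group field rather than by substring, the hr inside gilchrist is never touched.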
Why don't you load your text file as a CSV file, using Import-CSV and use "," as a delimiter?
This will give you a PowerShell object you can work on and then export as text or CSV. If I use your file and lookup table, this code may help you:
$file = Import-Csv -Delimiter "," -Path "c:\ps\test.txt"
$lookupTable = @{
'Admin' = 'W_CHRL_ADMIN_GS,M_CHRL_ADMIN_UD,M_CHRL_SITE_GS'
'Security' = 'W_CHRL_SECURITY_GS,M_CHRL_SITE_GS'
'HR' = 'M_CHRL_HR_UD,W_CHRL_HR_GS,M_CHRL_SITE_GS'}
foreach ($i in $file) {
#Compare and replace
...
}
$file | Export-Csv -Path "c:\ps\test2.txt" -Delimiter "," -NoTypeInformation # output path chosen as an example
You can then iterate over $file and compare and replace. You can also pipe $file to Export-Csv after you're done.

PowerShell script to convert one-column CSV file

I'm looking for a script, doesn't have to be in PS but must run under Windows, that converts a one column text file like below
abc
def
ghi
into
'abc',
'def',
'ghi'
I'm currently making this change in Excel using =concatenate, but a script would be better.
You can use a regular expression to insert characters at the beginning and end of each line.
get-content ./myonlinecolumn.txt | foreach {$_ -replace "^","'" -replace "`$","',"}
Or you could use the format operator -f:
get-content ./myonlinecolumn.txt | foreach {"'{0}'," -f $_ }
It's a bit more work to remove the last trailing comma, but this is also possible:
$a = get-content ./myonlinecolumn.txt
get-content ./myonlinecolumn.txt | foreach { if ($_.readcount -lt $a.count) {"'{0}'," -f $_ } else {"'{0}'" -f $_ }}
My first idea was similar to what Chad already wrote, that is a check on the line number. So I've tried a different solution. Not very nice but I post it too :)
((gc c:\before.txt | % {"'"+$_+"'"} ) -join ",*").split("*") | out-file c:\after.txt
You can just use
(gc myfile | %{"'$_'"}) -join ',
'
or, if you love escapes:
(gc myfile | %{"'$_'"}) -join ",`n"
This loads the file into an array of strings (Get-Content), then processes each string by putting it into single quotes. (Use "'$($_.Trim())'" if you need to trim whitespace, too). Then the lines are joined with a comma and line break (those can be embedded directly into strings).
If your values can contain single quotes (which need to be escaped) it's trivial to stick that in there, too:
(gc myfile | %{"'$($_.Trim() -replace "'","''")'"}) -join ",`n"
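The quote-and-join approach can be checked on the three sample values without a file. This is a small sketch; $lines stands in for the output of Get-Content, and "`n" is assumed to be an acceptable line break (use [Environment]::NewLine instead if the consumer needs CRLF on Windows).

```powershell
# Stand-in for: Get-Content myfile
$lines = 'abc', 'def', 'ghi'

# Quote each value, then join with a comma plus line break so the last line gets no comma
$quoted = ($lines | ForEach-Object { "'$_'" }) -join ",`n"

$quoted
# 'abc',
# 'def',
# 'ghi'
```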