Sort and export-CSV - powershell

I have a CSV file containing rows like the following extract:
"EmployeeID","FirstName","LastName","Location","Department","TelephoneNo","Email"
"000001 ","abc ","def ","Loc1"," "," ","name1@company.com "
"000023 ","ghi ","jkl ","Loc2"," "," ","name2@company.com "
"000089 ","mno ","pqr ","Loc2"," "," ","name3@company.com "
How do I keep the quotes and sort and save as a csv file?
I have the following powershell source script which works with csv files not having double quotes for the columns:
Get-Content $Source -ReadCount 1000 |
ConvertFrom-Csv -Delimiter $Delimiter |
Sort-Object -Property $NamesOfColumns -Unique |
ForEach-Object {
# Each of the values in $ColumnValueFormat must be executed to get the property from the loop variable ($_).
$values = foreach ($value in $ColumnValueFormat) {
Invoke-Expression $value
}
# Then the values can be passed in as an argument for the format operator.
$ShowColsByNumber -f $values
} |
Add-Content $Destination;
The $Source, $Delimiter, $NamesOfColumns and $ColumnValueFormat are given or built dynamically.
$ColumnValueFormat with a non quoted csv file contains:
$_.EmployeeID.Trim()
$_.FirstName.Trim()
$_.LastName.Trim()
$_.Location.Trim()
$_.Department.Trim()
$_.TelephoneNo.Trim()
$_.Email.Trim()
$ColumnValueFormat with a quoted csv file contains:
$_."EmployeeID".Trim()
$_."FirstName".Trim()
$_."LastName".Trim()
$_."Location".Trim()
$_."Department".Trim()
$_."TelephoneNo".Trim()
$_."Email".Trim()
I am having two problems:
The column headings are surrounded by double quotes. The problem seems to come from $ColumnValueFormat, which puts the column headers in double quotes, so the rows are not processed. (If I remove the double quotes, it does not recognize the column headings when it is processing the rows.)
Another problem I came across at the last minute: if the last column is blank, it is treated as null, and when Invoke-Expression $value executes (where $value holds the last column expression, $_.Email.Trim(), on a non-quoted CSV file) it bombs. If I place the statement in a try/catch block, the error is simply ignored, the last column is not added to the $values array, and it bombs again.

Quotes around property names are used syntactically to access names with spaces, not to write quotes to the output.
The Export-Csv cmdlet doesn't have an option to force quotes, so we'll have to write the CSV manually. We'll also have to replace empty values, which are $null after ConvertFrom-Csv, with an empty string. In case only some fields are needed, we'll use the Select-Object cmdlet with the -Index parameter.
Get-Content $Source |
    ConvertFrom-Csv |
    %{ $header = $false } {
        if (!$header) {
            # Emit the header row once, built from the first object's property names.
            $header = $true
            '"' + (
                ($_.PSObject.Properties.Name.Trim() |
                    select -index 1,6
                ) -join '","'
            ) + '"'
        }
        # Emit the data row, replacing $null values with an empty string.
        '"' + (
            ($_.PSObject.Properties.Value |
                %{ if ($_) { $_.Trim() } else { '' } } |
                select -index 1,6
            ) -join '","'
        ) + '"'
    } | Out-File $Destination
The above code is great for pass-through processing of large CSV files because it doesn't keep the entire file in memory. Otherwise it's possible to simplify the code a bit:
$csv = Get-Content $Source | ConvertFrom-Csv
$csv | %{
    # Begin block: emit the header row once.
    '"' + (
        ($csv[0].PSObject.Properties.Name.Trim() |
            select -index 1,6
        ) -join '","'
    ) + '"'
} {
    # Process block: emit each data row, replacing $null values with an empty string.
    '"' + (
        ($_.PSObject.Properties.Value |
            %{ if ($_) { $_.Trim() } else { '' } } |
            select -index 1,6
        ) -join '","'
    ) + '"'
} | Out-File $Destination
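Neither snippet above sorts the rows, which the question also asks for. Here is a minimal sketch of one way to add that. It assumes the sort columns are EmployeeID and LastName (the question builds $NamesOfColumns dynamically). It also relies on ConvertTo-Csv quoting every field by default, which it does in Windows PowerShell 5.1 and PowerShell 7 as far as I know. The sample rows are inlined only to keep the example self-contained.

```powershell
# Sample rows modeled on the question; in a real script this would be Get-Content $Source.
$lines = @(
    '"EmployeeID","FirstName","LastName","Location","Department","TelephoneNo","Email"'
    '"000023 ","ghi ","jkl ","Loc2"," "," ","name2@company.com "'
    '"000001 ","abc ","def ","Loc1"," "," ","name1@company.com "'
    '"000001 ","abc ","def ","Loc1"," "," ","name1@company.com "'
)

$result = $lines |
    ConvertFrom-Csv |
    ForEach-Object {
        # Trim every field; the [string] cast turns $null into an empty string.
        foreach ($p in $_.PSObject.Properties) {
            $p.Value = ([string]$p.Value).Trim()
        }
        $_
    } |
    # Assumed sort columns; -Unique drops rows whose sort columns are equal.
    Sort-Object -Property EmployeeID, LastName -Unique |
    ConvertTo-Csv -NoTypeInformation

$result
```

To save the file, $result would be piped to Set-Content $Destination. Sorting needs all rows in memory, so unlike the first snippet this is not a streaming solution.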

Related

How to replace a value in text file using powershell script

My file consists of following data (no header)
DEPOSIT ADD 123456789 (VALUE)(VARIABLE) NNNN VALUEVARIABLE
DEPOSIT ADD 234567890 (VALUE)(P75) NNNN VALUEVARIABLE
DEPOSIT ADD 345678901 (VALUE)(VARIABLE) NNNN VALUEVARIABLE
This is a tab delimited text file.
There are total of 5 columns. (123456789 (VALUE)(VARIABLE) is a single value column)
My requirements are:
I need to fetch only the row which contains P75 to update in the same file.
For the rows containing P75, I have to replace the values in Col3, Col4 and Col5; other rows should be unaffected.
from
DEPOSIT ADD 234567890 (VALUE)(P75) NNNN VALUEVARIABLE
to
DEPOSIT ADD 234567890 (VTG)(SPVTG) TCM VTGSPVTG
Only the records which contain P75 should be updated like this. The replacement values are the same for all selected records.
My script which I have written is
$original_file='C:\Path\20200721130155_copy.txt' -header Col1,Col2,Col3,Col4,Col5,| Select Col3,Col4,Col5
(Get-Content $original_file) |ForEach-Object {
if($_.Col3 -match '(VALUE)(P75)')
{
$_ -replace '(VALUE)(P75)', '(VTG)(SPVTG)' `
-replace 'VALUEVARIABLE', 'VTGSPVTG' `
-replace 'NNNN', 'TCM' `
}
$_
}| Set-Content $original_file+'_new.txt' -Force
I am getting an output file with the same content. The file is not getting updated.
Please advise.
Thanks
You can do the following:
$newfile = "{0}_new.txt" -f $original_file
Get-Content $original_file | Foreach-Object {
if ($_ -match '\(VALUE\)\(P75\)') {
$_ -replace '\(VALUE\)\(P75\)','(VTG)(SPVTG)' -replace 'VALUEVARIABLE', 'VTGSPVTG' -replace 'NNNN', 'TCM'
} else {
$_
}
} | Set-Content $newfile -Force
Since -replace uses regex, you must backslash escape special regex characters like ( and ).
Since $_ is the current line read from Get-Content without -Raw, you will need to output $_ if you want to make no changes. If you do want to replace text, then $_ -replace 'regex','text' will output that line with the replaced text.
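As a small illustration of the escaping point, here is a sketch that lets [regex]::Escape build the pattern from the literal text instead of escaping it by hand. The variable names and the sample line are only for this example.

```powershell
$literal = '(VALUE)(P75)'
# Escape() backslash-escapes the regex metacharacters in the literal text.
$pattern = [regex]::Escape($literal)

$line = "DEPOSIT`tADD`t234567890 (VALUE)(P75)`tNNNN`tVALUEVARIABLE"

# Unescaped, the parentheses are capture groups, so the pattern looks for
# the text VALUEP75, which is not in the line: nothing is replaced.
$unescaped = $line -replace $literal, '(VTG)(SPVTG)'

# Escaped, the parentheses are matched literally and the text is replaced.
$escaped = $line -replace $pattern, '(VTG)(SPVTG)'

$unescaped
$escaped
```

This also explains the original symptom: the unescaped pattern never matched, so every line was written back unchanged.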
Alternatively, you can apply the same logic in a switch statement, which is typically faster than piping Get-Content to ForEach-Object:
$newfile = "{0}_new.txt" -f $original_file
$(switch -regex -file $original_file {
'\(VALUE\)\(P75\)' {
$_ -replace '\(VALUE\)\(P75\)','(VTG)(SPVTG)' -replace 'VALUEVARIABLE', 'VTGSPVTG' -replace 'NNNN', 'TCM'
}
default { $_ }
}) | Set-Content $newfile -Force
You can use Import-Csv -Delimiter "`t" to import the data of the original file.
Then loop over the items and change the values if Col3 matches the search text.
$original_file = 'C:\Path\20200721130155_copy.txt'
$out_file = $original_file -replace '\.txt$', '_new.txt'
(Import-Csv -Path $original_file -Delimiter "`t" -Header 'Col1','Col2','Col3','Col4','Col5') | ForEach-Object {
if( $_.Col3 -like '*(*)*(P75)') {
$_.Col3 = $_.Col3 -replace '\([^)]+\)\(P75\)$', '(VTG)(SPVTG)'
$_.Col4 = 'TCM'
$_.Col5 = 'VTGSPVTG'
}
# rejoin the fields and output the line
$_.PsObject.Properties.Value -join "`t"
} | Set-Content -Path $out_file -Force
Output will be
DEPOSIT ADD 123456789 (VALUE)(VARIABLE) NNNN VALUEVARIABLE
DEPOSIT ADD 234567890 (VTG)(SPVTG) TCM VTGSPVTG
DEPOSIT ADD 345678901 (VALUE)(VARIABLE) NNNN VALUEVARIABLE
Regex details for -replace:
\( Match the character “(” literally
[^)] Match any character that is NOT a “)”
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\) Match the character “)” literally
\( Match the character “(” literally
P75 Match the characters “P75” literally
\) Match the character “)” literally
$ Assert position at the end of the string (or before the line break at the end of the string, if any)

Keep DC part of distinguished name

I have for input distinguished names like the following:
CN=A00.user,OU=MyOU,OU=A00,OU=MyOU3,DC=my,DC=domain
CN=A01.user1,OU=MyOU1,OU-MyOU2,OU=A00,OU=MyOU3,DC=my,DC=first,DC=domain
I need to print only the DC part, to get an output like:
my.domain
my.first.domain
Looks like split or replace should work, but I'm having trouble figuring out the syntax.
You can use Get-ADPathname.ps1 with the -Split parameter, Select-String with a regular expression, and the -join operator:
(
Get-ADPathname 'CN=A01.user1,OU=MyOU1,OU-MyOU2,OU=A00,OU=MyOU3,DC=my,DC=first,DC=domain' -Split | Select-String '^DC=(.+)' | ForEach-Object {
$_.Matches[0].Groups[1].Value
}
) -join '.'
Output:
my.first.domain
Here's a quick and dirty way to get it done.
("CN=A00.user,OU=MyOU,OU=A00,OU=MyOU3,DC=my,DC=domain" -split "," |
Where-Object { $_.StartsWith("DC=") } |
ForEach-Object { $_.Replace("DC=","")}) -join "."
Produces
my.domain
I would simply remove everything up to and including the first ,DC= and then replace the remaining ,DC= with dots.
$dn = 'CN=A00.user,OU=MyOU,OU=A00,OU=MyOU3,DC=my,DC=domain',
'CN=A01.user1,OU=MyOU1,OU-MyOU2,OU=A00,OU=MyOU3,DC=my,DC=first,DC=domain'
$dn -replace '^.*?,dc=' -replace ',dc=', '.'
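Another variation is to pick out only the DC components with a regular expression and join them. This is a rough sketch: Get-DomainFromDN is a made-up helper name, and it assumes the DC values contain no escaped commas.

```powershell
function Get-DomainFromDN {
    # Hypothetical helper: returns the DC components of a DN joined with dots.
    param([string]$DistinguishedName)

    # Match each DC=value at the start of the string or after a comma (case-insensitive).
    $parts = [regex]::Matches($DistinguishedName, '(?i)(?:^|,)DC=([^,]+)') |
        ForEach-Object { $_.Groups[1].Value }

    $parts -join '.'
}

$dn1 = 'CN=A00.user,OU=MyOU,OU=A00,OU=MyOU3,DC=my,DC=domain'
$dn2 = 'CN=A01.user1,OU=MyOU1,OU-MyOU2,OU=A00,OU=MyOU3,DC=my,DC=first,DC=domain'

Get-DomainFromDN $dn1
Get-DomainFromDN $dn2
```

For the two sample names this should give my.domain and my.first.domain.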

csv reformatting with powershell

I have a file containing a lot of lines in this format:
firstname ; lastname ; age ;
(it's a bit more complex but that's basically the file)
so the fields are of a fixed length, padded with spaces and with a semicolon in between the fields.
I would like to have it so:
firstname, lastname, age,
(commas and no fixed width)
I have replaced the semicolons with commas using a regexp, but I would also like to trim the end of the strings, and I don't know how to do this.
The following is my start, but I can't manage to get a .TrimEnd() in there. I have also thought of trying a -replace(" ", " "), but I can't integrate it into this expression:
Get-Content .\Bestand.txt | %{$data= [regex]::split($_, ';'); [string]:: join(',', $data)}
Can I get some information on how to achieve this?
I suggest you replace each occurrence of 'space;space' with a comma (assuming the replaced characters do not appear within a valid value), so the end result will look like:
firstname,lastname,age
Keeping it like the following is not a good idea, because some of your headers (property names) would then start with a space:
"firstname, lastname, age,"
Give this a try (work on a copy of the file):
(Get-Content .\Bestand.txt) |
foreach {$_ -replace ' ; ',','} |
out-file .\Bestand.txt
Now it's easy to import and process the file with Import-Csv cmdlet.
The -replace operator takes a regular expression, which you can use to remove all leading and trailing spaces:
Get-Content .\Bestand.txt |
Foreach-Object { $_ -replace ' *; *',',' } |
Out-File .\Bestand.csv -Encoding OEM
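One thing to watch with this pattern is that the semicolon at the end of each line becomes a trailing comma. If that is not wanted, a possible tweak is to strip the final separator first. This is a sketch on a single sample line; the variable names are only for illustration.

```powershell
$line = 'firstname   ; lastname     ; age  ;'

# First drop the final separator and the padding around it,
# then turn every remaining padded semicolon into a comma.
$csvLine = $line.Trim() -replace '\s*;\s*$' -replace '\s*;\s*', ','

$csvLine
```

In the pipeline above, the same two -replace operations would go inside the Foreach-Object block, applied to $_.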
Since you already create something CSV-ish, I'd go all the way and create proper CSV:
$cols = "firstname","lastname","age","rest"
Import-Csv "C:\input.txt" -Delimiter ";" -Header $cols | % {
foreach ($property in $_.PsObject.Properties) {
$property.Value = ([string]$property.Value).Trim()
}
$_
} | Export-Csv "C:\output.csv" -NoTypeInformation

Remove Duplicate Group of Data in Text file

I have a text file formatted similar to the following:
Description1: Data-123
Description2: Data-ABC
Description3: Data-789
Description4: Data-EFG
Description5: Data-XYZ

Description1: Data-123
Description2: Data-ABC
Description3: Data-789
Description4: Data-EFG
Description5: Data-XYZ

Description1: Data-123
Description2: Data-ABC
Description3: Data-789
Description4: Data-EFG
Description5: Data-584
I need PowerShell to compare each group (5 lines of data) as a whole and remove any duplicate groups, leaving only the unique groups of data. I can get it to remove single duplicate lines with the code below, but no luck comparing each group.
get-content TextFile.txt | sort-object | get-unique > NewTextFile.txt
Maybe this can work. It assumes the groups are separated by an empty line, and the last line of code creates the output file from the result. I give no further explanation because you don't show us any code you have so far.
$a = gc mylist.txt
$b = [string]::Empty
$c = @()
$a | % {if ( $_ -ne [string]::Empty )
{ $b += "$_`n" }
else
{ $c += $b
$b = [string]::Empty
}
}
$c += $b
$c | select -Unique | out-file .\mynew.txt
Split the file content on double new line characters (that should match the end of the line right before the empty line + the empty line right after it), split each object returned (remove the empty line) and then join it back, add new line and write the results to a new file.
(Get-Content TextFile.txt | Out-String) -split "`r`n`r`n" | ForEach-Object{
($_.Split("`r`n",[System.StringSplitOptions]::RemoveEmptyEntries) -join "`r`n") + "`n"
} | Select-Object -Unique | Out-File NewTextFile.txt
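Both answers rely on an empty line between groups. If the file has no separator lines and every group is exactly 5 lines, as the question describes, a sketch along these lines could chunk by count instead. The sample data is inlined to keep it self-contained, and the variable names are made up.

```powershell
# Sample data: three groups of 5 lines, the first two identical.
$g1 = 'Description1: Data-123', 'Description2: Data-ABC', 'Description3: Data-789',
      'Description4: Data-EFG', 'Description5: Data-XYZ'
$g2 = $g1[0..3] + 'Description5: Data-584'
$lines = $g1 + $g1 + $g2      # in a real script: Get-Content TextFile.txt

$groupSize = 5
$seen = New-Object 'System.Collections.Generic.HashSet[string]'

$unique = for ($i = 0; $i -lt $lines.Count; $i += $groupSize) {
    $last  = [Math]::Min($i + $groupSize, $lines.Count) - 1
    $group = $lines[$i..$last] -join "`r`n"
    # HashSet.Add returns $true only the first time a group is seen.
    if ($seen.Add($group)) { $group }
}

$unique
```

$unique keeps the groups in their original order and can be piped to Out-File NewTextFile.txt.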

Extracting columns from text file using PowerShell

I have to extract columns from a text file explained in this post:
Extracting columns from text file using Perl one-liner: similar to Unix cut
but I have to do this also in a Windows Server 2008 which does not have Perl installed. How could I do this using PowerShell? Any ideas or resources? I'm PowerShell noob...
Try this:
Get-Content test.txt | Foreach {($_ -split '\s+',4)[0..2]}
And if you want the data in those columns printed on the same line:
Get-Content test.txt | Foreach {"$(($_ -split '\s+',4)[0..2])"}
Note that this requires PowerShell 2.0 for the -split operator. Also, the ,4 tells the split operator the maximum number of substrings to return; keep in mind that the last substring will always contain all the extras concat'd.
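To make the effect of the limit concrete, here is a short sketch on a made-up sample line:

```powershell
$line = 'alpha  beta gamma delta epsilon'

# Split on runs of whitespace, but return at most 4 substrings.
$parts = $line -split '\s+', 4

# The 4th element holds everything that was left over, unsplit.
$rest = $parts[3]

# The first three columns, joined back onto one line with spaces.
$firstThree = "$($parts[0..2])"

$rest
$firstThree
```

If a line starts with whitespace, the first element will be an empty string, so trimming the line first may be needed.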
For fixed width columns, here's one approach for column width equal to 7 ($w=7):
$res = Get-Content test.txt | Foreach {
    $i = 0; $w = 7; $c = 0
    # Emit up to 2 columns of width $w from each line.
    while ($i + $w -le $_.Length -and $c++ -lt 2) {
        $_.Substring($i, $w)
        $i += $w   # advance by the full column width
    }
}
$res will contain each column for all rows. To set the max columns change $c++ -lt 2 from 2 to something else. There is probably a more elegant solution but don't have time right now to ponder it. :-)
Assuming it's white space delimited this code should do.
$fileName = "someFilePath.txt"
$columnToGet = 2
$columns = gc $fileName |
%{ $_.Split(" ",[StringSplitOptions]"RemoveEmptyEntries")[$columnToGet] }
The ordinary way:
type foo.bar | % { $_.Split(" ") | select -first 3 }
Try this. This will help to skip initial rows if you want, extract/iterate through columns, edit the column data and rebuild the record:
$header3 = @("Field_1","Field_2","Field_3","Field_4","Field_5")
Import-Csv $fileName -Header $header3 -Delimiter "`t" | select -skip 3 | Foreach-Object {
$record = $indexName
foreach ($property in $_.PSObject.Properties){
#doSomething $property.Name, $property.Value
if($property.Name -like '*CUSIP*'){
$record = $record + "," + '"' + $property.Value + '"'
}
else{
$record = $record + "," + $property.Value
}
}
$array.add($record) | out-null
#write-host $record
}