PowerShell, replacing headers where replacement is the same name but in lowercase

At the moment I have this code to replace all the headers of my csv file.
$Csv = Import-Csv "$treatmentfolder\2_1_traitement.csv"
$OldColumnHeaders = "Avis,N° invent.,Cd.Srv.Cl.,NumOrdre"
$NewColumnHeaders = "avis","num_inventaire","cd_srv_cl","num_ordre"
$i=0
ForEach ($header in $OldColumnHeaders){
if ($header -ne $NewColumnHeaders[$i]){
$Csv |
Select-Object *,@{n=$NewColumnHeaders[$i]; e={$header} } -Exclude $header |
Export-Csv -NoTypeInformation "$treatmentfolder\2_1_2_traitement.csv"
(gc "$treatmentfolder\2_1_2_traitement.csv") |
% {$_ -replace '"', ""} |
out-file "$treatmentfolder\2_2_traitement.csv" -Fo -Encoding UTF8
$Csv= Import-Csv "$treatmentfolder\2_2_traitement.csv"
}
$i += 1
}
The problem is that I get an error saying that "avis" already exists as a column header, even though the two names differ in the case of the letter 'a'. How can I replace this header?
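The error is consistent with PowerShell treating property names case-insensitively, so a calculated property named "avis" collides with the existing "Avis" column. One way around it, sketched below, is to treat the header as plain text and swap out only the first line of the file. The temp-file names, the sample rows, and the simplified old header are assumptions made up for this illustration; only the new header names come from the question.

```powershell
# Hypothetical sample input standing in for 2_1_traitement.csv.
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'headers_in.csv'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'headers_out.csv'
Set-Content -Encoding UTF8 $inPath -Value @(
    'Avis,NumInvent,Cd.Srv.Cl.,NumOrdre'   # simplified old header (assumption)
    'A1,100,S1,1'
    'A2,200,S2,2'
)

# The new header line, written out as one comma-separated string.
$newHeader = 'avis,num_inventaire,cd_srv_cl,num_ordre'

# Emit the new header, then every original line except the old header,
# and write the result to a different file than the one being read.
& {
    $newHeader
    Get-Content -Encoding UTF8 $inPath | Select-Object -Skip 1
} | Set-Content -Encoding UTF8 $outPath

# The renamed columns are now visible as property names.
Import-Csv $outPath | Format-Table
```

Because the data rows are copied as text, no quotes are added, so the separate quote-stripping pass is not needed with this approach.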

Related

Insert Content into specific place in Powershell

I am trying to compare the string of two CSV files. If the string from the 2nd CSV file occurs in the 1st CSV file, the corresponding line in the 1st CSV file should be marked with a label (e.g.: "TestLabel") after the semicolon. The strings contain a lot of special characters. By and large, the comparison already works, I can also already add the label.
Since Powershell is still new to me and this is my first script, the following question still arises. How can I set my text "TestLabel" to a certain place in an uncomplicated way? Here, for example, in the next empty field between the semicolons?
CSV1 contains:
Testdefinition;Stichwörter;Stichwörter;Stichwörter;Stichwörter;Stichwörter
It is just a normal text (with round brackets).Test: success;ExistingLabel;;;;
This is a second text;;;
Another text;ExistingLabel;;;;
One more text for the testing - success;ExistingLabel;;;;
CSV2 contains:
Testdefinition;Stichwörter;Stichwörter;Stichwörter;Stichwörter;Stichwörter
It is just a normal text (with round brackets).Test: success
One more text for the testing - success
My script so far:
$header='Testdefinition', 'Stichwörter1', 'Stichwörter2', 'Stichwörter3', 'Stichwörter4', 'Stichwörter5'
$exportheader="Testdefinition;Stichwörter;Stichwörter;Stichwörter;Stichwörter;Stichwörter"
$path1='D:\data\.....test.csv'
$path2='D:\data\.....test_failed.csv'
$wfile='temp1.csv'
$wfile2='temp2.csv'
Get-Content $path1 | Select-Object -Skip 1 | Set-Content $wfile -Encoding UTF8
Get-Content $path2 | Select-Object -Skip 1 | Set-Content $wfile2 -Encoding UTF8
$file1=Import-CSV -Path $wfile -Delimiter ";" -Header $header
$file2=Import-CSV -Path $wfile2 -Delimiter ";" -Header $header
$exportfile='test.csv'
#$exportfile=$file1
$file1 | Get-Member
$file2 | Get-Member
$file1 | Format-Table
$file2 | Format-Table
Write-Output ""
Write-Output "Searching for failed results"
Set-Content $exportfile -Value $exportheader
$file1.Testdefinition | ForEach-Object {
Write-Output "The Testdefinition is: $_ "
$testSearch = $_
$testlinecontent = $file2.Testdefinition | Select-String $testSearch
$testlinenumber = $testlinecontent.LineNumber
if("$_" -eq "$testlinecontent")
{
Write-Output "Testline found: $testlinecontent in Line $testlinenumber"
Write-Output "$_ = $testlinecontent"
$testlineexport = "$_;$testlinenumber;TestLabel"
Write-Output $testlineexport
$testlineexport | Add-Content -Path $exportfile
}
else
{
Write-Output "Testline not found"
$testlineexport = "$_;$testlinenumber;NULL"
Write-Output $testlineexport
$testlineexport | Add-Content -Path $exportfile
}
Write-Output ""
}
$exportCsv = Import-Csv $exportfile -Delimiter ";" -Header $header
$exportCsv | Format-Table
Remove-Item -Path $wfile
Remove-Item -Path $wfile2
I hope you can give me a hint. Thanks in advance!
Assuming the files aren't too big, you can use the following approach based on Compare-Object, which is conceptually clear and relatively simple:
# Read the CSV files into their header row and the array of data rows, as strings.
$header, $rows1 = Get-Content $path1
$null, $rows2 = Get-Content $path2
# Initialize the export file by writing its header
Set-Content -Encoding utf8 $exportfile -Value $exportheader
# Compare the data rows by their first ";"-separated field.
# If the fields match, append ";TestLabel" to the LHS data row before
# passing it through, otherwise pass it as-is, and append to the
# export file.
Compare-Object -PassThru $rows1 $rows2 -IncludeEqual -Property { $_.Split(';')[0] } |
ForEach-Object { if ($_.SideIndicator -eq '==') { $_ + ';TestLabel' } else { $_ } } |
Add-Content $exportfile
Note:
For brevity I've omitted the code to also add a line number.
As you are already aware, PowerShell doesn't support CSV files whose headers contain duplicate column names, given that the column names become property names on import, and must therefore be unique.
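The answer above appends the label at the end of the row. To place it in the next empty field instead, as the question asks, the row can be split on ";" and the first empty field after the first column filled in. This is only a sketch: the function name Set-FirstEmptyField is hypothetical, and it assumes that field values never contain a ";" themselves.

```powershell
# Hypothetical helper: put $Label into the first empty ";"-separated field
# after the first column, or append a new field if none is empty.
function Set-FirstEmptyField {
    param(
        [string]$Row,
        [string]$Label
    )
    $fields = $Row.Split(';')
    # Start at index 1 so the Testdefinition column itself is never overwritten.
    for ($i = 1; $i -lt $fields.Count; $i++) {
        if ($fields[$i] -eq '') {
            $fields[$i] = $Label
            return ($fields -join ';')
        }
    }
    # No empty field available: fall back to appending a new one.
    return "$Row;$Label"
}

Set-FirstEmptyField -Row 'Another text;ExistingLabel;;;;' -Label 'TestLabel'
# → Another text;ExistingLabel;TestLabel;;;
Set-FirstEmptyField -Row 'This is a second text;;;' -Label 'TestLabel'
# → This is a second text;TestLabel;;
```

In the Compare-Object pipeline, the expression `$_ + ';TestLabel'` could then be swapped for `Set-FirstEmptyField -Row $_ -Label 'TestLabel'`.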

Powershell 5: ConvertTo-Csv a CSV with quotes in some but not all columns

I am updating a script which imports a large CSV file and then splits it into lots of separate CSV files based on the values in the first two columns,
so POIMP_NL_20210306.csv which contains:
DOC_NUMBER|COMMENTS|ITEM|QTY|SUPPLIER
P-100-1234|JANE|5059585896978|2|"JOES SUPPLIES"
P-100-1234|JANE|5059585896985|2|"JOES SUPPLIES"
P-100-6666|TED|5059585896992|1|"ACTION TOYS"
must be split into POIMP_P-100-1234_JANE.csv containing
P-100-1234|JANE|5059585896978|2|"JOES SUPPLIES"
P-100-1234|JANE|5059585896985|2|"JOES SUPPLIES"
and POIMP_P-100-6666_TED.csv
P-100-6666|TED|5059585896992|1|"ACTION TOYS"
The problem I am trying to solve is preserving the quotes in just the SUPPLIER column
Since ConvertTo-Csv adds quotes to everything, I use a % { $_ -replace '"', ""} to remove them all before the out-file is created, but of course that removes them from the SUPPLIER column too.
Here is my script which perfectly splits the big file into smaller files by DOC_NUMBER and COMMENTS but removes all quotes:
$basePath = "C:\"
$archivePath = "$basePath\archive\"
$todaysDate = $(get-date -Format yyyyMMdd)
$todaysFiles = @(
(Get-ChildItem -Path $basePath | Where-Object { $_.Name -match 'POIMP_' + $todaysDate })
)
cd $basePath
foreach ($file in $todaysFiles ) {
$fileName = $file.ToString()
Import-Csv $fileName -delimiter "|" | Group-Object -Property "DOC_NUMBER","COMMENTS" |
Foreach-Object {
$newName = $_.Name -replace ",","_" -replace " ",""; $path=$fileName.SubString(0,8) + $newName+".csv" ; $_.group |
ConvertTo-Csv -NoTypeInformation -delimiter "|" | % { $_ -replace '"', ""} | out-file $path -fo -en ascii
}
Rename-Item $fileName -NewName ([io.path]::GetFileNameWithoutExtension("$fileName") + "_Original.csv")
Move-Item (Get-ChildItem -Path $basePath | Where-Object { $_.Name -match '_Original' }) $archivePath -force
}
And here is another script, which I found online and amended, that successfully leaves quotes in just the SUPPLIER column by first wrapping its values in a '¬¬' placeholder and then replacing the placeholder with quotes after all other quotes have been removed:
$ImportedCSV = Import-CSV "C:\POIMP_NL_20210306.csv" -delimiter "|"
$NewCSV = Foreach ($Entry in $ImportedCsv) {
$Entry.SUPPLIER = '¬¬' + $Entry.SUPPLIER + '¬¬'
$Entry
}
$NewCSV |
ConvertTo-Csv -NoTypeInformation -delimiter "|" | % { $_ -replace '"', ""} | % { $_ -replace '¬¬', '"'} | out-file "C:\updatedPO.csv" -fo -en ascii
I just can't merge these scripts to achieve the desired result as I can't seem to reference the correct object. I'd really appreciate your help! Thanks
Any good CSV reader should be able to handle quotes around csv fields, even when not really needed.
Having said that, it is your explicit wish to only have quotes around the field in the SUPPLIER column. (Note: in your example there is a trailing space after that column name.)
In this case, I think this would help.
Not only does it surround the SUPPLIER fields with quotes, but also saves the data as separate files using the values from column DOC_NUMBER and COMMENTS per group found in the csv
$path = 'D:\Test'
$fileIn = Join-Path -Path $path -ChildPath 'POIMP_NL_20210306.csv'
# import the csv file and group first two columns
Import-Csv -Path $fileIn -Delimiter '|' | Group-Object -Property "DOC_NUMBER","COMMENTS" | ForEach-Object {
$headerDone = $false
$data = foreach ($item in $_.Group) {
if (!$headerDone) {
$item.PsObject.Properties.Name -join '|'
$headerDone = $true
}
$item.SUPPLIER = '"{0}"' -f $item.SUPPLIER
$item.PsObject.Properties.Value -join '|'
}
# create a new filename like 'POIMP_P-100-1234_JANE.csv'
$fileOut = Join-Path -Path $path -ChildPath ('POIMP_{0}_{1}.csv' -f $_.Group[0].DOC_NUMBER, $_.Group[0].COMMENTS)
# save the data not using Export-Csv because that will add quotes around everything (in PowerShell 5)
$data | Set-Content -Path $fileOut -Force
}
Output
POIMP_P-100-1234_JANE.csv
DOC_NUMBER|COMMENTS|ITEM|QTY|SUPPLIER
P-100-1234|JANE|5059585896978|2|"JOES SUPPLIES"
P-100-1234|JANE|5059585896985|2|"JOES SUPPLIES"
POIMP_P-100-6666_TED.csv
DOC_NUMBER|COMMENTS|ITEM|QTY|SUPPLIER
P-100-6666|TED|5059585896992|1|"ACTION TOYS"
If you are on PowerShell 7 or later, you can use
$yourdata | ConvertTo-Csv -NoTypeInformation -QuoteFields "SUPPLIER" -Delimiter "|" |
Out-File ...
or you could use
$yourdata | Export-Csv -NoTypeInformation -QuoteFields "SUPPLIER" `
-Delimiter "|" -Path <path-to-output-file>.csv
You can also use -UseQuotes AsNeeded to let the converter add quoting where it thinks it makes sense, otherwise just specify the fields you want quoted.
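For PowerShell 7 or later, the grouping from the earlier answer and -QuoteFields can be combined. The following is a rough sketch under a few assumptions: the sample data is held in a here-string rather than read from POIMP_NL_20210306.csv, and the output folder under the temp directory is a made-up location for illustration.

```powershell
# Requires PowerShell 7 or later (-QuoteFields is not available in Windows PowerShell 5.1).
$csvText = @'
DOC_NUMBER|COMMENTS|ITEM|QTY|SUPPLIER
P-100-1234|JANE|5059585896978|2|"JOES SUPPLIES"
P-100-1234|JANE|5059585896985|2|"JOES SUPPLIES"
P-100-6666|TED|5059585896992|1|"ACTION TOYS"
'@

# Hypothetical output folder for the split files.
$outDir = Join-Path ([IO.Path]::GetTempPath()) 'poimp_split'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

$csvText | ConvertFrom-Csv -Delimiter '|' |
    Group-Object -Property DOC_NUMBER, COMMENTS |
    ForEach-Object {
        # Build a name like 'POIMP_P-100-1234_JANE.csv' from the group's first row.
        $first   = $_.Group[0]
        $fileOut = Join-Path $outDir ('POIMP_{0}_{1}.csv' -f $first.DOC_NUMBER, $first.COMMENTS)
        # Quote only the SUPPLIER column; other columns are written unquoted.
        $_.Group | Export-Csv -Path $fileOut -Delimiter '|' -QuoteFields 'SUPPLIER' -NoTypeInformation
    }

Get-ChildItem $outDir | Select-Object -ExpandProperty Name
```

This avoids both the placeholder trick and the manual -join of property values, at the cost of depending on PowerShell 7.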

Script for removing rows based on entries from specific column in CSV file

I have a file structured this way (tab separated):
HEADER_1 HEADER_2
entry_A entry_A
entry_B entry_C
entry_A entry_D
entry_D entry_A
What I need to do is: every time an entry from column one appears in column two (at any point), delete the whole row where the entry appears.
Desired output:
HEADER_1 HEADER_2
entry_B entry_C
entry_A entry_D
I tried with Sort-Object -Unique but the output is not correct, it just removes duplicate lines
To output only the rows whose Header_1 value never appears anywhere in the Header_2 column, you can do the following:
Windows PowerShell:
$data = Import-Csv file.csv -Delimiter "`t"
($data | where Header_1 -notin $data.Header_2 |
ConvertTo-Csv -NoType -Delimiter "`t") -replace '^"|"$|"(\t)"','$1' |
Set-Content file.csv
PowerShell 7:
$data = Import-Csv file.csv -Delimiter "`t"
$data | where Header_1 -notin $data.Header_2 |
Export-Csv -Path file.csv -NoType -Delimiter "`t" -UseQuotes AsNeeded
I feel like what you want to do is output rows where Header_2 has not yet appeared as a Header_1 value, which means you are ignoring future Header_1 values.
$list = [system.collections.generic.list[string]]@()
(Import-Csv file.csv -delimiter "`t" | Foreach-Object {
$list.Add($_.Header_1)
if ($_.Header_2 -notin $list) {
$_
}
} | ConvertTo-Csv -NoType -Delimiter "`t") -replace '^"|"$|"(\t)"','$1' |
Set-Content file.csv
You can go a route without using *-Csv commands and then you don't have to deal with qualifying text for PowerShell non-core versions.
$list = [system.collections.generic.list[string]]@()
# Parentheses make Get-Content finish reading before Set-Content rewrites the same file.
(Get-Content file.csv) | Foreach-Object {
$h1,$h2 = $_ -split '\t'
$list.Add($h1)
if ($h2 -notin $list) {
$_
}
} | Set-Content file.csv
You could also use the .NET System.Collections.Generic.HashSet class for O(1) lookups with Contains():
$data = Import-Csv -Path file.csv -Delimiter "`t"
$hashSet = New-Object -TypeName System.Collections.Generic.HashSet[string]
$keep = @()
foreach ($row in $data)
{
# Add() returns a Boolean; cast to [void] so it does not leak into the output.
[void]$hashSet.Add($row.HEADER_1)
if (-not($hashSet.Contains(($row.HEADER_2))))
{
$keep += $row
}
}
$keep | Export-Csv -Path file.csv -Delimiter "`t" -NoTypeInformation
Which results in a new file.csv:
"HEADER_1" "HEADER_2"
"entry_B" "entry_C"
"entry_A" "entry_D"
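To try the sequential filter without touching a file, the sample rows can be built in memory. This is a small sketch, assuming comma-separated sample text in a here-string (instead of the tab-separated file) purely to keep the example easy to paste. Note that HashSet[string] compares case-sensitively by default, unlike -notin.

```powershell
# Sample rows from the question, comma-separated here for readability (assumption).
$rows = @'
HEADER_1,HEADER_2
entry_A,entry_A
entry_B,entry_C
entry_A,entry_D
entry_D,entry_A
'@ | ConvertFrom-Csv

# Track every HEADER_1 value seen so far.
$seen = [System.Collections.Generic.HashSet[string]]::new()

# Keep a row only if its HEADER_2 value has not yet appeared as a HEADER_1 value.
$kept = foreach ($row in $rows) {
    [void]$seen.Add($row.HEADER_1)
    if (-not $seen.Contains($row.HEADER_2)) {
        $row
    }
}

$kept | Format-Table
```

Collecting the loop output directly into $kept also avoids rebuilding the array on every `+=`.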

How do I find text between two words and export it to txt.file

I have a CSV file which contains many lines and I want to take the text between <STR_0.005_Long>, and µm,5.000µm.
Example line from the CSV:
Straightness(Up/Down) <STR_0.005_Long>,4.444µm,5.000µm,,Pass,2.476µm,1.968µm,25,0.566µm,0.720µm
This is the script that I am trying to write:
$arr = @()
$path = "C:\Users\georgi\Desktop\5\test.csv"
$pattern = "(?<=.*<STR_0.005_Long>,)\w+?(?=µm,5.000µm*)"
$Text = Get-Content $path
$Text.GetType() | Format-Table -AutoSize
$Text[14] | Foreach {
if ([Regex]::IsMatch($_, $pattern)) {
$arr += [Regex]::Match($_, $pattern)
Out-File C:\Users\georgi\Desktop\5\test.txt -Append
}
}
$arr | Foreach {$_.Value} | Out-File C:\Users\georgi\Desktop\5\test.txt -Append
Use a Where-Object filter with your regular expression and simply output the match to the output file:
Get-Content $path |
Where-Object { $_ -match $pattern } |
ForEach-Object { $matches[0] } |
Out-File 'C:\Users\georgi\Desktop\5\test.txt'
Of course, since you have a CSV, you could simply use Import-Csv and export the value of that particular column:
Import-Csv $path | Select-Object -Expand 'column_name' |
Out-File 'C:\Users\georgi\Desktop\5\test.txt'
Replace column_name with the actual name of the column. If the CSV doesn't have a column header you can specify one via the -Header parameter:
Import-Csv $path -Header 'col1','col2','col3',... |
Select-Object -Expand 'col2' |
Out-File 'C:\Users\georgi\Desktop\5\test.txt'
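One thing to watch with the regular expression from the question: \w does not match the decimal point, so a value like 4.444 would not be captured by \w+?. Below is a sketch of an adjusted pattern, tested against a shortened copy of the example line. The character class [\d.]+ and the escaped dots are my assumptions about what the value looks like; the micro sign is built from its code point only to sidestep script-encoding problems.

```powershell
# Micro sign (U+00B5), built from its code point to avoid script-encoding issues.
$mu = [char]0x00B5

# Shortened copy of the example line from the question.
$line = "Straightness(Up/Down) <STR_0.005_Long>,4.444${mu}m,5.000${mu}m,,Pass,2.476${mu}m"

# Lookbehind for the tag, digits and dots for the value, lookahead for the unit and limit.
# Literal dots are escaped so that '.' does not match arbitrary characters.
$pattern = "(?<=<STR_0\.005_Long>,)[\d.]+(?=${mu}m,5\.000${mu}m)"

if ($line -match $pattern) {
    $Matches[0]   # → 4.444
}
```

The same $pattern can be dropped into the Where-Object / ForEach-Object pipeline shown above.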