I have to process some text and ran into some difficulties.
The text file .\text.txt is formatted like this:
name,
surname,
address,
name.
surname,
address,
etc.
What I want to achieve is to join the lines that end with "," like this:
name,surname,address
name,surname,address
etc
I was working on something like this:
$content= path to the text.txt
$result= path to the result file
Get-Content -Encoding UTF8 $content | ForEach-object {
if ( $_ -match "," ) {
....join the selected lines....
}
} |Set-Content -Encoding UTF8 $result
I also need to consider that a line ending with "," may be followed by an empty line, which should become a line break (CR) in $result.
You can do this by splitting the blocks of data on the empty newlines first:
# read the content of the file as one single multiline string
$content = Get-Content -Path 'Path\To\The\file.txt' -Raw -Encoding UTF8
# split on two or more newlines and dispose of empty blocks
$content -split '(\r?\n){2,}' | Where-Object { $_ -match '\S' } | ForEach-Object {
# trim the text block, split on newline and remove the trailing commas (or dots)
# output these joined with a comma
($_.Trim() -split '\r?\n' ).TrimEnd(",.") -join ','
} | Set-Content -Path 'Path\To\The\NEW_file.txt' -Encoding UTF8
Output:
name,surname,address
name,surname,address
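The split-then-join approach above can be tried without touching any files. The following is only a sketch: the $sample string is a made-up stand-in for the file content, and the split pattern uses a non-capturing group (?:...) as a small variation, so that -split does not also return the captured newlines.

```powershell
# Hypothetical in-memory sample standing in for the input file (an assumption).
$sample = "name,`r`nsurname,`r`naddress,`r`n`r`nname.`r`nsurname,`r`naddress,`r`n"

# Split into blocks on two or more newlines, skip blank blocks,
# then strip trailing commas/dots from each line and join with commas.
$joined = $sample -split '(?:\r?\n){2,}' |
    Where-Object { $_ -match '\S' } |
    ForEach-Object {
        ($_.Trim() -split '\r?\n').TrimEnd(',.') -join ','
    }

$joined
```

Each block becomes one output line, so $joined holds one string per record.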
All your terms end with a comma, so you could use a regex:
$content= "C:\test.txt"
$result= "path to the result file"
$CR = "`r`n"
$lines = Get-Content -Encoding UTF8 $content -raw
$option = [System.Text.RegularExpressions.RegexOptions]::Singleline
$lines = [regex]::new(',(?:\r?\n){2,}', $option).Replace($lines, $CR + $CR)
$lines = [regex]::new(',\r?\n', $option).Replace($lines, ",")
$lines | Out-File -FilePath $result -Encoding utf8
result:
name,surname,address
name1,surname,address
name,surname,address
name,surname,address
The piece of code below will give the required result.
$content= "Your file path"
$resultPath = "result file path"
Get-Content $content | foreach {
$data = $_
if($data -eq "address,")
{
$NewData = $data -replace ',',''
$data = $NewData + "`r`n"
}
$out = $out + $data
}
$out | Out-File $resultPath
I am having trouble splitting a line into an array using the "|" in a text file and reassembling it in a certain order. There are multiple lines like the original line in the text file.
This is the original line:
80055555|Lastname|Firstname|AidYear|DCDOCS|D:\BDMS_UPLOAD\800123456_11-13-2018 14-35-53 PM_1.pdf
I need it to look this way:
80055555|DCDOCS|Lastname|Firstname|AidYear|D:\BDMS_UPLOAD\800123456_11-13-2018 14-35-53 PM_1.pdf
Here is the code I am working with:
$File = 'c:\Names\Complete\complete.txt'
$Arr = $File -split '|'
foreach ($line in Get-Content $File)
{
$outputline = $Arr[0] + "|" + $Arr[4] + "|" + $Arr[1] + "|" + $Arr[2] + "|" +
"##" + $Arr[5] |
Out-File -filepath "C:\Names\Complete\index.txt" -Encoding "ascii" -append
}
You need to process every line of the file on its own and then split them.
$File = get-content "D:\test\1234.txt"
foreach ($line in $File){
$Arr = $line.Split('|')
[array]$OutputFile += $Arr[0] + "|" + $Arr[4] + "|" + $Arr[1] + "|" + $Arr[2] + "|" + "##" + $Arr[5]
}
$OutputFile | out-file -filepath "D:\test\4321.txt" -Encoding "ascii" -append
edit: Thx to LotPings for this alternate suggestion based on -join and the avoidance of += to build the array (which is inefficient, because it rebuilds the array on every iteration):
$File = get-content "D:\test\1234.txt"
$OutputFile = foreach($line in $File){($line.split('|'))[0,4,1,2,3,5] -Join '|'}
$OutputFile | out-file -filepath "D:\test\4321.txt" -Encoding "ascii"
To offer a more PowerShell-idiomatic solution:
# Sample input line.
$line = '80055555|Lastname|Firstname|AidYear|DCDOCS|D:\BDMS_UPLOAD\800123456_11-13-2018 14-35-53 PM_1.pdf'
# Split by '|', rearrange, then re-join with '|'
($line -split '\|')[0,4,1,2,3,5] -join '|'
Note how PowerShell's indexing syntax (inside [...]) is flexible enough to accept an arbitrary array (list) of indices to extract.
Also note how -split's RHS operand is \|, i.e., an escaped | char., given that | has special meaning there, because it is interpreted as a regex.
To put it all together:
$File = 'c:\Names\Complete\complete.txt'
Get-Content $File | ForEach-Object {
($_ -split '\|')[0,4,1,2,3,5] -join '|'
} | Out-File -LiteralPath C:\Names\Complete\index.txt -Encoding ascii
As for what you tried:
$Arr = $File -split '|'
Primarily, the problem is that the -split operation is applied to the input file path, not to the file's content.
Secondarily, as noted above, to split by a literal | char., \| must be passed to -split, because it expects a regex (regular expression).
Also, instead of using Out-File inside a loop with -Append, it is more efficient to use a single pipeline with ForEach-Object, as shown above.
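To make the difference between the escaped and unescaped delimiter concrete, here is a small sketch. The sample line 'a|b|c' and the variable names are assumptions for illustration only.

```powershell
$line = 'a|b|c'   # hypothetical sample line

# Unescaped: '|' is regex alternation between two empty patterns, so it
# matches the empty string and the line is not split into its three fields.
$wrong = $line -split '|'

# Escaped by hand, or via [regex]::Escape() when the delimiter is in a variable.
$right  = $line -split '\|'
$right2 = $line -split [regex]::Escape('|')

# The .NET String.Split() method treats the delimiter literally.
$right3 = $line.Split('|')

$right -join ','
```

[regex]::Escape() is handy when the delimiter comes from configuration and may contain other regex metacharacters.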
Since your input file is actually a CSV file without headers and where the fields are separated by the pipe symbol |, why not use Import-Csv like this:
$fileIn = 'C:\Names\Complete\complete.txt'
$fileOut = 'C:\Names\Complete\index.txt'
(Import-Csv -Path $fileIn -Delimiter '|' -Header 'Item','LastName','FirstName','AidYear','Type','FileName' |
ForEach-Object {
"{0}|{1}|{2}|{3}|{4}|{5}" -f $_.Item, $_.Type, $_.LastName, $_.FirstName, $_.AidYear, $_.FileName
}
) | Add-Content -Path $fileOut -Encoding Ascii
I have this code which works as it should:
Get-Content $path\$newName -Encoding OEM |ForEach-Object {$_ -replace '<Num:(\d{8,20})>$','$1'}| Set-Content $path\$txtName -Encoding UTF8
The string is replaced by the digits. But I would like to be able to use $1 outside the loop.
Like:
write-host $1
For example. But if I do this, nothing is output.
Any suggestions?
Thanks.
Based on VivekKumarSingh's script.
$InFile = '.\test.txt'
$OutFile= '.\test2.txt'
$RegEx = "<Num:(\d{8,20})>$"
$array = @()
Get-Content $InFile -Encoding OEM | ForEach-Object {
if ($_ -match $RegEx ){$array += $matches[1]}
$_ -replace $RegEx,"`$1"
} | Set-Content $OutFile -Encoding UTF8
$array
> gc .\test.txt
<Num:1234567890>
<Num:23456789101112>
> .\SO_50579315.ps1
1234567890
23456789101112
> gc .\test2.txt
1234567890
23456789101112
One way would be assigning $1 to an array like this -
$array = @()
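# $1 only has meaning inside the -replace replacement string; use -match and $Matches[1] to capture the value
Get-Content $path\$newName -Encoding OEM | ForEach-Object { if ($_ -match '<Num:(\d{8,20})>$') { $array += $Matches[1] }; $_ -replace '<Num:(\d{8,20})>$','$1' } | Set-Content $path\$txtName -Encoding UTF8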
You can use the values of $1 like $array[0], $array[1], $array[2].. and so on.
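As a hedged sketch of the same idea without += on an array: '$1' is a placeholder that only the -replace operator expands, whereas -match fills the automatic $Matches variable, which can be read afterwards. The sample lines below are made up for illustration.

```powershell
$pattern = '<Num:(\d{8,20})>$'

# Hypothetical sample lines standing in for the file content (an assumption).
$lines = '<Num:1234567890>', 'no number here', '<Num:23456789101112>'

# Collect the captures by letting the loop emit them, instead of using +=.
$captured = foreach ($line in $lines) {
    if ($line -match $pattern) { $Matches[1] }
}

# The replacement itself is unchanged; '$1' is expanded by -replace only.
$replaced = $lines -replace $pattern, '$1'

$captured
```

After this runs, $captured holds only the digits from the matching lines, while $replaced holds every line with the tag stripped where it matched.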
At the moment I have this code to replace all the headers of my csv file.
$Csv = Import-Csv "$treatmentfolder\2_1_traitement.csv"
$OldColumnHeaders = "Avis,N° invent.,Cd.Srv.Cl.,NumOrdre"
$NewColumnHeaders = "avis","num_inventaire","cd_srv_cl","num_ordre"
$i=0
ForEach ($header in $OldColumnHeaders){
if ($header -ne $NewColumnHeaders[$i]){
$Csv |
Select-Object *,@{n=$NewColumnHeaders[$i]; e={$header} } -Exclude $header |
Export-Csv -NoTypeInformation "$treatmentfolder\2_1_2_traitement.csv"
(gc "$treatmentfolder\2_1_2_traitement.csv") |
% {$_ -replace '"', ""} |
out-file "$treatmentfolder\2_2_traitement.csv" -Fo -Encoding UTF8
$Csv= Import-Csv "$treatmentfolder\2_2_traitement.csv"
}
$i += 1
}
The problem is that I get an error saying that "avis" already exists as a column header, even though the two names differ in the case of the 'a' (uppercase versus lowercase). How can I replace this header then?
I am working on PowerShell code that replaces a set of characters in a text file. The folder contains a lot of text files; is there a way to do this for all the files in the folder?
The issue is that running the code creates a new file (New_NOV_1995.txt), but no characters are changed in the new file.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}
$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file = 'C:\FilePath\NOV_1995_NEW.txt'
Get-Content -Path $original_file | ForEach-Object {
$line = $_
$lookupTable.GetEnumerator() | ForEach-Object {
if ($line -match $_.Key)
{
$line = $line -replace $_.Key, $_.Value
}
}
$line
} | Set-Content -Path $destination_file
While something like this would work, performance might be a problem. My only testing was on a tiny file containing the $lookupTable.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}
$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file = 'C:\FilePath\NOV_1995_NEW.txt'
$originalContent = Get-Content -Path $original_file
$lookupTable.GetEnumerator() | % {
$originalContent = $originalContent -replace $_.Key,$_.Value
}
$originalContent | Out-File -FilePath $destination_file
Your code as you have it there actually works for me. There may still be an encoding issue with your files. Does your file look right when you just read it into the console with Get-Content $path? If it does not, you might need to play with the -Encoding parameter of the Set-Content and Get-Content cmdlets.
Improving on your current logic.
I changed your $lookuptable to a pair of psobjects. Since you are making the same replacement for the most part I combined them into a single regex.
The next part I hummed and hawed about but since, after my proposed change, you are only doing two replacements I figure you could just chain the two into a single replacement line. Otherwise you could have a foreach-object in there but I think this is simpler and faster.
This way we don't need to test for a match. -replace is doing the testing for us.
$toPipe = [pscustomobject]@{
Pattern = '¿|Ù|À|Ú|³'
Replacement = "|"
}
$toHypen = [pscustomobject]@{
Pattern = 'Ä'
Replacement = "-"
}
$path = "c:\temp\test\test"
Get-ChildItem -Path $path | ForEach-Object{
(Get-Content $_.FullName) -replace $toPipe.Pattern,$toPipe.Replacement -replace $toHypen.Pattern,$toHypen.Replacement |
Set-Content $_.FullName
}
Note that this will change the original files. Testing is encouraged.
Set-Content and Get-Content are not the best when it comes to performance, so you might need to consider using [IO.File]::ReadAllLines($file) and its partner static method [IO.File]::WriteAllLines($file, $lines), which takes the path and the lines to write.
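A minimal sketch of the [IO.File] variant, under some assumptions: the scratch file name is hypothetical, and the two box-drawing stand-ins are built from code points so the example does not depend on how the script file itself is encoded. Note that these .NET methods resolve relative paths against the process working directory, not the current PowerShell location, so full paths are safer.

```powershell
# Hypothetical scratch file in the temp folder, so no real data is touched.
$file = Join-Path ([IO.Path]::GetTempPath()) 'io_file_demo.txt'

# Build the sample characters from code points (assumed stand-ins for the
# mis-decoded box-drawing characters in the question).
$horizontal = [string][char]0x00C4   # 'Ä'
$vertical   = [string][char]0x00B3   # '³'
[IO.File]::WriteAllLines($file, [string[]]@("$vertical$horizontal$vertical"))

# Read, replace, write back: WriteAllLines needs the path and the lines.
$lines = [IO.File]::ReadAllLines($file)
$fixed = $lines -replace $vertical, '|' -replace $horizontal, '-'
[IO.File]::WriteAllLines($file, [string[]]$fixed)

# Read the result back and clean up the scratch file.
$result = [IO.File]::ReadAllLines($file)
Remove-Item -LiteralPath $file
$result
```

Both methods default to UTF-8 here; if the real files use another encoding, the overloads that accept a [Text.Encoding] argument would be the place to specify it.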
I have files which need to be modified according to a mapping provided in a CSV file. I want to read each line of my txt file and, if a specified value exists in that line, replace other strings in the line according to the mapping. For that purpose I have used a hashtable. Here is my PowerShell script:
$file ="path\map.csv"
$mapping = Import-CSV $file -Encoding UTF8 -Delimiter ";"
$table = $mapping | Group-Object -AsHashTable -AsString -Property Name
$original_file = "path\input.txt"
$destination_file = "path\output.txt"
$content = Get-Content $original_file
foreach ($line in $content){
foreach ($e in $table.GetEnumerator()) {
if ($line -like "$($e.Name)") {
$line = $line -replace $e.Values.old_category, $e.Values.new_category
$line = $line -replace $e.Values.old_type, $e.Values.new_type
}
}
}
Set-Content -Path $destination_file -Value $content
My map.csv looks as follows:
Name;new_category;new_type;old_category;old_type
alfa;new_category1;new_type1;old_category1;old_type1
beta;new_category2;new_type2;old_category2;old_type2
gamma;new_category3;new_type3;old_category3;old_type3
And my input.txt content is:
bla bla "bla"
buuu buuu 123456 "test"
"gamma" "old_category3" "old_type3"
alfa
When I run this script it creates exactly the same output as initial file. Can someone tell me why it didn't change the line where "gamma" appears according to my mapping ?
Thanks in advance
A couple of things to change.
Firstly, there is no need to change $mapping to a hash; Import-Csv already gives you an object array to work with.
Secondly, if you want to update the elements of $content, you need to use a for loop so that you can directly access and modify them. In a foreach loop, $line is a separate variable that receives each element in turn; you were modifying that variable but never writing the result back to $content. Also note the wildcards added around the name in the -like comparison: without them, -like only matches a line that is exactly equal to the name.
Below should work:
$file ="map.csv"
$mapping = Import-CSV $file -Encoding UTF8 -Delimiter ";"
$original_file = "input.txt"
$destination_file = "output.txt"
$content = Get-Content $original_file
for($i=0; $i -lt $content.length; $i++) {
foreach($map in $mapping) {
if ($content[$i] -like "*$($map.Name)*") {
$content[$i] = $content[$i] -replace $map.old_category, $map.new_category
$content[$i] = $content[$i] -replace $map.old_type, $map.new_type
}
}
}
Set-Content -Path $destination_file -Value $content
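The foreach-versus-for point can be seen in isolation with a small sketch. The two-element array below is a made-up stand-in for the Get-Content output, and ToUpper() stands in for the real replacements.

```powershell
$content = @('alpha', 'beta')   # hypothetical stand-in for Get-Content output

# Assigning to the loop variable changes only $line, not the array.
foreach ($line in $content) { $line = $line.ToUpper() }
$afterForeach = $content -join ','

# Writing through the index changes the array element itself.
for ($i = 0; $i -lt $content.Count; $i++) { $content[$i] = $content[$i].ToUpper() }
$afterFor = $content -join ','

# Alternative: let foreach emit the new lines and capture its output.
$lower = foreach ($line in $content) { $line.ToLower() }

$afterForeach
$afterFor
```

One further caveat worth keeping in mind: -replace treats its first operand as a regex, so if the old_category or old_type values could ever contain characters such as . or (, wrapping them in [regex]::Escape() would be a reasonable precaution.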