I need to read in a CSV file and find and replace certain characters in the first line of the file only. I have used ForEach-Object, but that processes the entire file. Any thoughts on how this can best be achieved?
Here is the code :
Get-Content c:\output.csv | ForEach-Object { $_ -replace "objectGUID", 'StudentID' } | Set-Content c:\output2.csv
This won't fix the problem of having to process the entire file, but it should substantially reduce the time it takes if the file is large.
$Updated = $false
Get-Content c:\output.csv -ReadCount 1000 |
ForEach-Object {
if ($Updated)
{
$_ | Add-Content c:\output2.csv
}
else {
$_[0] = $_[0] -replace "objectGUID", 'StudentID'
$_ | Set-Content c:\output2.csv
$Updated = $true
}
}
Edit: if it's only 3000 rows this should be sufficient:
$FileContent = @(Get-Content c:\output.csv)  # @() keeps this an array even if the file has a single line
$FileContent[0] = $FileContent[0] -replace 'objectGUID', 'StudentID'
$FileContent | Set-Content c:\output2.csv
Ok, Get-Content makes this simple enough if all you want to do is change the first line of a text file.
GC c:\output.csv|select -first 1|%{$_ -replace "objectGUID", 'StudentID'}|Out-File C:\Output2.csv
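# no -ReadCount here: with -ReadCount 1000, Select -Skip 1 would skip the first 1000-line chunk, not the first line
GC C:\output.csv|Select -skip 1|Out-File C:\Output2.csv -Append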
That will pull the first line only, replacing the text you wanted and write it to a new file (assuming you don't already have an Output2.csv file). After that it reads in the rest of the file skipping the first line and adds that to the same file. You can delete the original file after that and rename the output file if you feel the need.
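If the file is large enough that reading it twice or holding it in memory is a concern, a single-pass copy with the .NET reader and writer classes is another option. This is only a minimal sketch: the temp-folder paths and the three sample rows are made up so it can run on its own, and the replacement is the one from the question.

```powershell
# Hypothetical sample data in the temp folder so the sketch is self-contained.
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'output.csv'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'output2.csv'
Set-Content -Path $inPath -Value 'objectGUID,Name', '1,objectGUID', '2,Bob'

$reader = [System.IO.File]::OpenText($inPath)
$writer = [System.IO.File]::CreateText($outPath)
try {
    # Rewrite the header line only.
    $header = $reader.ReadLine()
    if ($null -ne $header) {
        $writer.WriteLine(($header -replace 'objectGUID', 'StudentID'))
    }
    # Copy every remaining line unchanged, one line at a time.
    while ($null -ne ($line = $reader.ReadLine())) {
        $writer.WriteLine($line)
    }
}
finally {
    # Close both files even if something above throws.
    $reader.Dispose()
    $writer.Dispose()
}

Get-Content $outPath
```

The second sample row contains objectGUID on purpose: it should come through untouched, which shows that only the header is rewritten.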
Good day,
with the script below I would like to produce the following output txt from my input txt.
Input:
Klaus;Müller;Straße;PLZ;Ort;;;;;DE12345;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE12345678;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE999999;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777777;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777779;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE777777987;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777779765;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE77777797634;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777779763465;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE77777797623435435;
Output:
Klaus;Müller;Straße;PLZ;Ort;;;;;DE12345;DE12345678;DE999999;DE7777777;DE7777779;DE777777987;DE7777779765;DE77777797634;DE7777779763465;DE77777797623435435;
The script takes the last value from the following lines and appends them to the first line at the end and adds semicolons:
Import-Csv input.txt -delimiter ";" -Header (1..20)
1..9 | %{$data[0].($_+10) = $data[$_].10}
($data[0] | convertto-csv -delimiter ";" -NoType | select -skip 1) -replace '"' | out-file output.txt
gc test_neu.txt
If I save this into a .ps1 file it doesn't work. Can anyone tell me why?
You don't assign Import-Csv to anything. The first line should be $data = Import-Csv input.txt -delimiter ";" -Header (1..20), and your last line should be gc output.txt. Also use the dot notation (.\input.txt) to locate the input.txt file in the current directory. With these fixes, your script works:
$data = Import-Csv .\input.txt -delimiter ";" -Header (1..20)
1..9 | %{$data[0].($_+10) = $data[$_].10}
($data[0] | convertto-csv -delimiter ";" -NoType | select -skip 1) -replace '"' | out-file output.txt
gc output.txt
this seems to do what you want. [grin] it expects that the source lines are all to be combined.
i presume you can handle saving things to a file, so i leave that to you.
what it does ...
fakes reading in a text file
when ready to work with real data, replace the entire #region/#endregion block with a call to Get-Content.
iterates thru the collection by index number
if the line is the 1st, set $NewString to that entire value
else, add the last data item of the line to the existing $NewString value with a trailing ;
the .Where({$_}) filters out any blank items.
display the string
the code ...
#region >>> fake reading in a text file
# in real life, use Get-Content
$InStuff = @'
Klaus;Müller;Straße;PLZ;Ort;;;;;DE12345;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE12345678;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE999999;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777777;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777779;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE777777987;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777779765;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE77777797634;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE7777779763465;
Klaus;Müller;Straße;PLZ;Ort;;;;;DE77777797623435435;
'@ -split [System.Environment]::NewLine
#endregion >>> fake reading in a text file
foreach ($Index in 0..$InStuff.GetUpperBound(0))
{
if ($Index -eq 0)
{
$NewString = $InStuff[$Index]
}
else
{
$NewString += $InStuff[$Index].Split(';').Where({$_})[-1] + ';'
}
}
$NewString
output ...
Klaus;Müller;Straße;PLZ;Ort;;;;;DE12345;DE12345678;DE999999;DE7777777;DE7777779;DE777777987;DE7777779765;DE77777797634;DE7777779763465;DE77777797623435435;
Just in case you don't know how many lines there are going to be on the input file:
$fmt='$1$2'
gc .\input.txt | %{$_ -replace '(^.*;)(.*;$)',$fmt;$fmt='$2'} | sc output.txt -NoNewline
gc output.txt
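The $fmt trick in that one-liner is easy to miss, so here is the same idea spelled out on in-memory data. This is a sketch only: the three short sample lines are made up, and the result is joined into a variable instead of written with sc -NoNewline.

```powershell
# Hypothetical, shortened sample lines (same shape as the input in the question).
$lines = 'A;B;;;DE1;', 'A;B;;;DE22;', 'A;B;;;DE333;'

# Group 1 = everything up to the second-to-last semicolon,
# group 2 = the last field plus its trailing semicolon.
$pattern = '(^.*;)(.*;$)'

# The first line keeps both groups (the whole line).
$fmt = '$1$2'

$merged = -join ($lines | ForEach-Object {
    $_ -replace $pattern, $fmt
    # ForEach-Object runs this block in the caller's scope, so from the
    # second line on only group 2 (the last field) is kept.
    $fmt = '$2'
})

$merged
```

Because the format string is switched after the first line has been processed, the number of input lines does not need to be known in advance.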
I want to do this
read the file
go through each line
if the line matches the pattern, do some changes with that line
save the content to another file
For now I use this script:
$file = [System.IO.File]::ReadLines("C:\path\to\some\file1.txt")
$output = "C:\path\to\some\file2.txt"
ForEach ($line in $file) {
if($line -match 'some_regex_expression') {
$line = $line.replace("some","great")
}
Out-File -append -filepath $output -inputobject $line
}
As you can see, here I write line by line. Is it possible to write the whole file at once ?
Good example is provided here :
(Get-Content c:\temp\test.txt) -replace '\[MYID\]', 'MyValue' | Set-Content c:\temp\test.txt
But my problem is that I have additional IF statement...
So, what could I do to improve my script ?
You could do it like this:
Get-Content -Path "C:\path\to\some\file1.txt" | foreach {
if($_ -match 'some_regex_expression') {
$_.replace("some","great")
}
else {
$_
}
} | Out-File -filepath "C:\path\to\some\file2.txt"
Get-Content reads a file line by line (array of strings) by default so you can just pipe it into a foreach loop, process each line within the loop and pipe the whole output into your file2.txt.
In this case an array or an ArrayList (lists are better for large collections) would be the most elegant solution. Simply add the strings to the list until the ForEach loop ends, then flush the list to a file in one go.
This is an ArrayList example:
$file = [System.IO.File]::ReadLines("C:\path\to\some\file1.txt")
$output = "C:\path\to\some\file2.txt"
$outputData = New-Object System.Collections.ArrayList
ForEach ($line in $file) {
if($line -match 'some_regex_expression') {
$line = $line.replace("some","great")
}
[void]$outputData.Add($line)  # Add() returns the new index; discard it so it is not emitted as output
}
$outputData |Out-File $output
I think the if statement can be avoided in a lot of cases by using regular expression groups (e.g. (.*)) and placeholders (e.g. $1, $2 etc.).
As in your example:
(Get-Content .\File1.txt) -Replace 'some(_regex_expression)', 'great$1' | Set-Content .\File2.txt
And for the "good example", where [MYID] might be somewhere inline:
(Get-Content c:\temp\test.txt) -Replace '^(.*)\[MYID\](.*)$', '$1MyValue$2' | Set-Content c:\temp\test.txt
(see also How to replace first and last part of each line with powershell)
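When the per-line logic is too involved for a single -replace, the switch statement with -Regex and -File is another way to keep the if/else out of the loop body. This is not from the answers above, just a hedged sketch: the temp-folder file names and the two sample lines are assumptions, and the pattern and replacement are the placeholders from the question.

```powershell
# Hypothetical sample files in the temp folder so the sketch is self-contained.
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'file1.txt'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'file2.txt'
Set-Content -Path $inPath -Value 'some_regex_expression here', 'some other line'

# switch -File reads the file line by line; each line is available as $_.
$result = switch -Regex -File $inPath {
    'some_regex_expression' { $_.Replace('some', 'great') }  # matching lines are changed
    default                 { $_ }                           # all other lines pass through
}

# Everything is written in one call instead of appending line by line.
$result | Set-Content -Path $outPath
Get-Content $outPath
```

The second sample line contains "some" but does not match the pattern, so it is written unchanged.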
I know that I can use:
gc c:\FileWithEmptyLines.txt | where {$_ -ne ""} > c:\FileWithNoEmptyLines.txt
to remove empty lines. But how can I remove them with -replace?
I found a nice one-liner here >> http://www.pixelchef.net/remove-empty-lines-file-powershell. I just tested it with several blank lines, including lines with newlines only as well as lines with just spaces, just tabs, and combinations.
(gc file.txt) | ? {$_.trim() -ne "" } | set-content file.txt
See the original for some notes about the code. Nice :)
This piece of code from Randy Skretka works fine for me, but I still had a newline at the end of the file.
(gc file.txt) | ? {$_.trim() -ne "" } | set-content file.txt
So I finally added this:
$content = [System.IO.File]::ReadAllText("file.txt")
$content = $content.Trim()
[System.IO.File]::WriteAllText("file.txt", $content)
You can use -match instead of -eq if you also want to exclude lines that only contain whitespace characters:
@(gc c:\FileWithEmptyLines.txt) -match '\S' | out-file c:\FileWithNoEmptyLines
Not specifically using -replace, but you get the same effect parsing the content using -notmatch and regex.
(get-content 'c:\FileWithEmptyLines.txt') -notmatch '^\s*$' > c:\FileWithNoEmptyLines.txt
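The -match and -notmatch answers rely on the fact that these operators act as filters when the left-hand side is an array, and return a Boolean when it is a single string. A small sketch on made-up in-memory lines (no file involved) shows the difference:

```powershell
# Hypothetical sample: two real lines, an empty line, spaces only, a tab only.
$lines = 'first', '', '   ', "`t", 'second'

# Array on the left: the operator returns the elements that pass the test.
$kept = $lines -notmatch '^\s*$'
$kept

# Single string on the left: the operator returns True/False instead.
$single   = 'only line'
$asScalar = $single -notmatch '^\s*$'      # a Boolean
$asArray  = @($single) -notmatch '^\s*$'   # the line itself, because @() forces an array
```

This is why wrapping Get-Content in @( ) or ( ) matters for files that might contain only one line.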
To resolve this with RegEx, you need to use the multiline flag (?m):
((Get-Content file.txt -Raw) -replace "(?m)^\s*`r`n",'').trim() | Set-Content file.txt
If you actually want to filter blank lines from a file then you may try this:
(gc $source_file).Trim() | ? {$_.Length -gt 0}
You can't do this by replacing: you have to replace SOMETHING with SOMETHING, and here you have neither.
This will remove empty lines or lines with only whitespace characters (tabs/spaces).
[IO.File]::ReadAllText("FileWithEmptyLines.txt") -replace '\s+\r\n+', "`r`n" | Out-File "c:\FileWithNoEmptyLines.txt"
(Get-Content c:\FileWithEmptyLines.txt) |
Foreach { $_ -Replace "Old content", " New content" } |
Set-Content c:\FileWithEmptyLines.txt;
file
PS /home/edward/Desktop> Get-Content ./copy.txt
[Desktop Entry]
Name=calibre
Exec=~/Apps/calibre/calibre
Icon=~/Apps/calibre/resources/content-server/calibre.png
Type=Application*
Start by getting the content from the file and trimming any white space from each line of the text document. The result is passed to Where-Object, which goes through the array and keeps each member with a string length greater than 0. That output is passed to Set-Content to replace the content of the file you started with. It would probably be better to make a new file...
The last thing to do is read back the newly made file's content and see your awesomeness.
(Get-Content ./copy.txt).Trim() | Where-Object{$_.length -gt 0} | Set-Content ./copy.txt
Get-Content ./copy.txt
This removes trailing whitespace and blank lines from file.txt
PS C:\Users\> (gc file.txt) | Foreach {$_.TrimEnd()} | where {$_ -ne ""} | Set-Content file.txt
Get-Content returns a fixed-size array of lines. You can convert it to a mutable list and delete the necessary lines by index. You can get the particular indexes with a match. After that you can write the result to a file with Set-Content. With this approach you avoid the empty lines that -replace leaves behind when you replace something with "". Note that I don't guarantee perfect performance; I'm not a professional PowerShell developer.
$fileLines = @(Get-Content $filePath)  # @() keeps this an array even for a one-line file
# take only the first match so that LineNumber is a single value
$neccessaryLine = Select-String -Path $filePath -Pattern 'something' | Select-Object -First 1
if (-Not $neccessaryLine) { exit }
$neccessaryLineIndex = $neccessaryLine.LineNumber - 1  # LineNumber is 1-based, the list is 0-based
$updatedFileContent = [System.Collections.ArrayList]::new($fileLines)
$updatedFileContent.RemoveAt($neccessaryLineIndex)
$updatedFileContent | Set-Content $filePath  # or pass a different path to write a new file
Set-Content -Path "File.txt" -Value (get-content -Path "File.txt" | Select-String -Pattern '^\s*$' -NotMatch)
This works for me, originally got the line from here and added Joel's suggested '^\s*$': Using PowerShell to remove lines from a text file if it contains a string
I am trying to get all the lines from an Input file starting with %% and paste it into Output file using powershell.
Used the following code, however I am only getting last line in Output file starting with %% instead of all the lines starting with %%.
I have only started to learn powershell, please help
$Clause = Get-Content "Input File location"
$Outvalue = $Clause | Foreach {
if ($_ -ilike "*%%*")
{
Set-Content "Output file location" $_
}
}
You are looping over the lines in the file, and setting each one as the whole content of the file, overwriting the previous file each time.
You need to either switch to using Add-Content instead of Set-Content, which will append to the file, or change the design to:
Get-Content "input.txt" | Foreach-Object {
if ($_ -like "%%*")
{
$_ # just putting this on its own, sends it on out of the pipeline
}
} | Set-Content Output.txt
Which you would more typically write as:
Get-Content "input.txt" | Where-Object { $_ -like "%%*" } | Set-Content Output.txt
and in the shell, you might write as
gc input.txt |? {$_ -like "%%*"} | sc output.txt
Where the whole file is filtered, and then all the matching lines are sent into Set-Content in one go, not calling Set-Content individually for each line.
NB. PowerShell is case insensitive by default, so -like and -ilike behave the same.
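To see the overwrite-versus-append difference described above in isolation, here is a small sketch. The temp-folder file name and the three sample strings are made up; the loop body is the one from the question, once with Set-Content and once with Add-Content.

```powershell
# Hypothetical scratch file in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'percent-demo.txt'
Remove-Item $path -ErrorAction SilentlyContinue

# Set-Content inside the loop: every matching line replaces the whole file.
'%%a', 'skip', '%%b' | ForEach-Object {
    if ($_ -like '%%*') { Set-Content $path $_ }
}
$overwritten = @(Get-Content $path)   # only the last matching line survives

Remove-Item $path

# Add-Content inside the loop: every matching line is appended.
'%%a', 'skip', '%%b' | ForEach-Object {
    if ($_ -like '%%*') { Add-Content $path $_ }
}
$appended = @(Get-Content $path)      # both matching lines are present

"Set-Content kept $($overwritten.Count) line(s), Add-Content kept $($appended.Count) line(s)"
```

Appending works, but it opens the file once per line, which is why filtering first and piping into a single Set-Content is usually preferred.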
For a small file, Get-Content is nice. But if you start trying to do this on heavier files, Get-Content will eat your memory and leave you hanging.
Keeping it REALLY simple for other PowerShell starters out there, you'll be better covered (and get better performance) by reading and writing through streams. So, something like this would do the job:
$inputfile = "C:\Users\JohnnyC\Desktop\inputfile.txt"
$outputfile = "C:\Users\JohnnyC\Desktop\outputfile.txt"
$reader = [io.file]::OpenText($inputfile)
$writer = [io.file]::CreateText($outputfile)
while($reader.EndOfStream -ne $true) {
$line = $reader.Readline()
if ($line -like '%%*') {
$writer.WriteLine($line);
}
}
$writer.Dispose();
$reader.Dispose();
I have a text file in the following format:
.....
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
...
FIELD,MFRPartNumber,ABC123,,,
...
FIELD,XPARTNUMBER,ABC123
...
FIELD,InternalPartNumber,3214567
...
ENTRY,PartNumber2,,,
...
...
the ... indicates there is other data between these fields. The ONLY thing I can be certain of is that the field starting with ENTRY is a new set of records. The rows starting with FIELD can be in any order, and not all of them may be present in each group of data.
I need to read in a chunk of data
Search for any field matching the string ABC123
If ABC123 found, search for the existence of the InternalPartNumber field & return that row of data.
I have not seen a way to use Get-Content that can read in a variable number of rows as a set & be able to search it.
Here is the code I currently have, which will read a file, searching for a string & replacing it with another. I hope this can be modified to be used in this case.
$ftype = "*.txt"
$fnames = gci -Path $filefolder1 -Filter $ftype -Recurse|% {$_.FullName}
$mfgPartlist = Import-Csv -Path "C:\test\mfrPartList.csv"
foreach ($file in $fnames) {
$contents = Get-Content -Path $file
foreach ($partnbr in $mfgPartlist) {
$oldString = $partnbr.OldValue  # use the current row, not the whole list
$newString = $partnbr.NewValue
if (Select-String -Path $file -SimpleMatch $oldString -Debug -Quiet) {
$stringData = $contents -imatch $oldString
$stringData = $stringData -replace "[\n\r]","|"
foreach ($dataline in $stringData) {
$file +"|"+$stringData+"|"+$oldString+"|"+$newString|Out-File "C:\test\Datachanges.txt" -Width 2000 -Append
}
$contents = $contents -replace $oldString, $newString
Set-Content -Path $file -Value $contents
}
}
}
Is there a way to read & search a text file in "chunks" using Powershell? Or to do a Read-ahead & determine what to search?
Assuming your file isn't too big to read into memory all at once:
$Text = Get-Content testfile.txt -Raw
($Text -split '(?ms)^(?=ENTRY)') |
foreach {
if ($_ -match '(?ms)^FIELD\S+ABC123')
{$_ -replace '(?ms).+(^Field\S+InternalPartNumber.+?$).+','$1'}
}
FIELD,InternalPartNumber,3214567
That reads the entire file in as a single multiline string, and then splits it at the beginning of any line that starts with 'ENTRY'. Then it tests each segment for a FIELD line that contains 'ABC123', and if it does, removes everything except the FIELD line for the InternalPartNumber.
This is not my best work as I have just got back from vacation. You could use a while loop reading the text and set an entry flag to gobble up the text in chunks. However if your files are not too big then you could just read up the text file at once and use regex to split up the chunks and then process accordingly.
$pattern = "ABC123"
$matchedRowToReturn = "InternalPartNumber"
$fileData = Get-Content "d:\temp\test.txt" | Where-Object{$_ -match '^(entry|field)'} | Out-String
$parts = $fileData | Select-String '(?smi)(^Entry).*?(?=^Entry|\Z)' -AllMatches | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
$parts | Where-Object{$_ -match $pattern} | Select-String "$matchedRowToReturn.*$" | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
What this will do is read in the text file, drop any lines that are not entry or field related, join the rest into one long string and split it up into chunks that start with lines beginning with the word "Entry".
Then we drop those "parts" that do not contain the $pattern. Of the remaining that match extract the InternalPartNumber line and present.
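For files that are too big to read at once, the loop-with-a-flag idea mentioned above can be sketched roughly like this. It is only an illustration: the sample lines are made up from the format in the question, Get-MatchingRow is a hypothetical helper name, and for a real file the $lines array would be replaced by something like [System.IO.File]::ReadLines($path).

```powershell
# Hypothetical sample data in the format described in the question.
$lines = @(
    'ENTRY,PartNumber1,,,'
    'FIELD,IntCode,123456'
    'FIELD,MFRPartNumber,ABC123,,,'
    'FIELD,InternalPartNumber,3214567'
    'ENTRY,PartNumber2,,,'
    'FIELD,MFRPartNumber,XYZ999,,,'
    'FIELD,InternalPartNumber,7654321'
)

# Hypothetical helper: if any line of the chunk contains $Pattern,
# emit the chunk's FIELD row(s) for $FieldName.
function Get-MatchingRow {
    param([string[]]$Chunk, [string]$Pattern, [string]$FieldName)
    if ($Chunk -match [regex]::Escape($Pattern)) {
        $Chunk -match "^FIELD,$([regex]::Escape($FieldName)),"
    }
}

$chunk = [System.Collections.Generic.List[string]]::new()
$found = @(
    foreach ($line in $lines) {
        # A new ENTRY line closes the chunk collected so far.
        if ($line -like 'ENTRY*' -and $chunk.Count -gt 0) {
            Get-MatchingRow -Chunk $chunk -Pattern 'ABC123' -FieldName 'InternalPartNumber'
            $chunk.Clear()
        }
        $chunk.Add($line)
    }
    # The last chunk has no following ENTRY line, so process it here.
    if ($chunk.Count -gt 0) {
        Get-MatchingRow -Chunk $chunk -Pattern 'ABC123' -FieldName 'InternalPartNumber'
    }
)

$found
```

Only one chunk is held in memory at a time, so the size of the whole file matters much less than with the -Raw or Out-String approaches.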