PowerShell: Remove all lines except those containing certain strings

I can do this one by one with bookmarking and other Notepad++ features, but I will be doing this frequently to edit documents. I have used the PowerShell below for removing all lines except those containing a certain string, but how would I do it for, say, 50 strings?
$SourceFile = 'C:\PATH\TO\FILE.csv'
$Pattern = 'word'
(Get-Content $SourceFile) | % {if ($_ -match $Match){$_}} | Set-Content $SourceFile

I guess $Match should be $Pattern in your example.
You can specify multiple keywords in your pattern, like this:
$SourceFile = 'C:\PATH\TO\FILE.csv'
$Pattern = 'word|excel|powerpoint'
(Get-Content $SourceFile) | Where-Object { $_ -match $Pattern } | Set-Content $SourceFile
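If you end up with many keywords (the 50-string case), you don't have to type the pattern by hand. A minimal sketch, assuming the keywords live in a hypothetical keywords.txt with one keyword per line; [regex]::Escape keeps any regex metacharacters in the keywords from being interpreted:
$SourceFile = 'C:\PATH\TO\FILE.csv'
# hypothetical keyword list, one keyword per line
$Keywords = Get-Content 'C:\PATH\TO\keywords.txt'
# escape each keyword, then join them into a single alternation pattern: word1|word2|...
$Pattern = ($Keywords | ForEach-Object { [regex]::Escape($_) }) -join '|'
(Get-Content $SourceFile) | Where-Object { $_ -match $Pattern } | Set-Content $SourceFile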

Related

Remove lines which do not match / filter CSV file

I have CSV files with a WSUS report (they contain server names).
I also have a txt file with the server names that are in my scope.
I would like to remove entries from my CSV files if they do not match the txt file.
I found a solution that does the opposite of what I need (I want to keep lines if they match):
$SourceFile = 'C:\temp\wsus.csv'
$scope = Get-Content C:\Temp\windows_server.txt
foreach ($Pattern in $scope)
{
(Get-Content $SourceFile) | Where-Object { $_ -notmatch $Pattern } | Set-Content $SourceFile
}
I was hoping that changing -notmatch to -match would make it work, but it doesn't.
Best Regards,
Krzysztof
You basically need to write it exactly the other way round:
Get-Content $sourceFile | where {
    foreach ($pattern in $scope) {
        if ($_ -match $pattern) {
            return $true
        }
    }
}
Another alternative would be using Select-String:
Get-Content $sourceFile | Select-String -Pattern $scope
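If you want to use either variant in place of the original loop, the full script could look roughly like this (a sketch; note that Select-String emits MatchInfo objects, so .Line is used to get the plain text back before writing the file):
$SourceFile = 'C:\temp\wsus.csv'
$scope = Get-Content C:\Temp\windows_server.txt
# keep only lines that match at least one server name from the scope file
(Get-Content $SourceFile) |
    Select-String -Pattern $scope |
    ForEach-Object { $_.Line } |
    Set-Content $SourceFile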
Also, when working with CSV, you should consider using the appropriate cmdlets (like Import-Csv).
Assuming your wsus.csv looks anything like this:
Server,Something you need to know,Something you want to forget
Srv01,Main storage,blahblah
Srv02,Mail server,more blah
and your windows_server.txt simply holds servernames, each on a separate line, then you could do this:
$SourceFile = 'C:\temp\wsus.csv'
$scope = Get-Content C:\Temp\windows_server.txt
$csv = Import-Csv -Path $SourceFile | Where-Object {$scope -contains $_.Server }
# now export the scoped csv (I'd suggest to a new file, but you can overwrite the $SourceFile if you must)
$csv | Export-Csv -Path 'C:\temp\ScopedWsus.csv' -NoTypeInformation

Two files: keep lines with identical first n characters only

There are 2 text files in the CWD, a.txt and b.txt. From a.txt, I would like to delete all lines whose first 5 characters are NOT present in b.txt as the first 5 characters of any line. (Or, put another way: keep only those lines in a.txt whose first 5 characters are present in b.txt as the first 5 characters of any line.) Content after the 5th character to the end of the line is irrelevant.
For example: a.txt
abcde000dsdsddsdsdsdsdsd
0123456xxx
kkk
xyzxyzxyzfeeeee
kkkkkkkkkkk
and b.txt:
012345aabbcc
kkkkkkkhhkkvv
nnnnnnn5777nnnn77567
Intended result (lines in a.txt whose first 5 characters are present in b.txt):
0123456xxx
kkkkkkkkkkk
When I run the code, it gives me an empty results.txt, but no error messages. What am I missing?
$pattern = "^[5]"
$set1 = Get-Content -Path a.txt
$results = New-Object -TypeName System.Text.StringBuilder
Get-Content -Path b.txt | foreach {
    if ($_ -match $pattern) {
        [void]$results.AppendLine($_)
    }
}
$results.ToString() | Out-File -FilePath .\results.txt -Encoding ascii
Your code doesn't work because your pattern doesn't match anything. The regular expression ^[5] means "the character '5' at the beginning of the string" (the square brackets define a character class), not "5 characters at the beginning of the string". The latter would be ^.{5}. Also, you never match the content of a.txt against the content of b.txt.
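For example, a quick demonstration of the difference between the two patterns:
'abcde000dsdsd' -match '^[5]'   # False - looks for a literal '5' at the start of the string
'abcde000dsdsd' -match '^.{5}'  # True  - matches any 5 characters at the start
'5xxxx'         -match '^[5]'   # True  - the line happens to start with the character '5'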
There are several ways to do what you want:
Extract the first 5 characters from each line of b.txt to an array and compare the lines of a.txt against that array. Esperento57's answer sort of uses this approach, but in a way that requires PowerShell v3 or newer. A variant that'll work on all PowerShell versions could look like this:
$pattern = '^(.{5}).*'
$ref = (Get-Content 'b.txt') -match $pattern -replace $pattern, '$1' |
    Get-Unique
Get-Content 'a.txt' | Where-Object {
    $ref -contains ($_ -replace $pattern, '$1')
} | Set-Content 'results.txt'
Since lookups in arrays are comparatively slow and don't scale well (they get significantly slower as the number of elements in the array increases), you could also put the reference values in a hashtable so you can do index lookups (which are significantly faster):
$pattern = '^(.{5}).*'
$ref = @{}
(Get-Content 'b.txt') -match $pattern -replace $pattern, '$1' |
    ForEach-Object { $ref[$_] = $true }
Get-Content 'a.txt' | Where-Object {
    $ref.ContainsKey(($_ -replace $pattern, '$1'))
} | Set-Content 'results.txt'
Another alternative would be to build a second regular expression from the substrings extracted from b.txt and compare the content of a.txt against that expression:
$pattern = '^(.{5}).*'
$list = (Get-Content 'b.txt') -match $pattern -replace $pattern, '$1' |
    Get-Unique |
    ForEach-Object { [regex]::Escape($_) }
$ref = '^({0})' -f ($list -join '|')
(Get-Content 'a.txt') -match $ref | Set-Content 'results.txt'
Note that each of these approaches will ignore lines shorter than 5 characters.
Try something like this:
$listB = Get-Content "c:\temp\b.txt" | where {$_.Length -gt 4} | select @{N="First5";E={$_.Substring(0, 5)}}
Get-Content "c:\temp\a.txt" | where {$_.Length -gt 4 -and $_.Substring(0, 5) -in $listB.First5}
If performance is a concern, consider using hashtables as an index:
$Pattern = '^(.{5}).*'
$a = @{}; $b = @{}
Get-Content -Path a.txt | Where {$_ -Match $Pattern} | ForEach {$a[$Matches[1]] = @($a[$Matches[1]] + $_)}
Get-Content -Path b.txt | Where {$_ -Match $Pattern} | ForEach {$b[$Matches[1]] = @($b[$Matches[1]] + $_)}
$a.Keys | Where {$b.Keys -Contains $_} | ForEach {$a.$_} | Set-Content results.txt

Need to output multiple rows to CSV file

I am using the following script, which iterates through hundreds of text files looking for specific instances of a regex pattern within them. I need to add a second data point to the array that tells me which file the pattern matched in.
In the script below, the [Regex]::Matches($str, $Pattern) | % { $_.Value } piece returns multiple rows per file, which cannot easily be output to a file.
What I would like to know is: how would I output a 2-column CSV file, one column with the file name (which should be $_.FullName), and one column with the regex results? The code of where I am at now is below.
$FolderPath = "C:\Test"
$Pattern = "(?i)(?<=\b^test\b)\s+(\w+)\S+"
$Lines = @()
Get-ChildItem -Recurse $FolderPath -File | ForEach-Object {
    $_.FullName
    $str = Get-Content $_.FullName
    $Lines += [Regex]::Matches($str, $Pattern) |
        % { $_.Value } |
        Sort-Object |
        Get-Unique
}
$Lines = $Lines.Trim().ToUpper() -replace '[\r\n]+', ' ' -replace ";", '' |
    Sort-Object |
    Get-Unique # Cleaning up data in array
I can think of two ways, but the simplest is to use a hashtable (dictionary). Another way is to create PSObjects to fill your $Lines variable. I am going to go with the simple way so you only use one variable, the hashtable.
$FolderPath = "C:\Test"
$Pattern = "(?i)(?<=\b^test\b)\s+(\w+)\S+"
$Results = @{}
Get-ChildItem -Recurse $FolderPath -File |
    ForEach-Object {
        $str = Get-Content $_.FullName
        $Line = [regex]::matches($str,$Pattern) | % { $_.Value } | Sort-Object | Get-Unique
        $Line = $Line.Trim().ToUpper() -Replace '[\r\n]+', ' ' -Replace ";",'' | Sort-Object | Get-Unique # Cleaning up data in array
        $Results[$_.FullName] = $Line
    }
$Results.GetEnumerator() | Select @{L="File";E={$_.Key}}, @{L="Matches";E={$_.Value}} | Export-Csv -NoType -Path <Path to save CSV>
Your results will be in $Results. $Results.Keys contains the file names. $Results.Values has the results from the expression. You can reference the results of a particular file by its key: $Results["File path"]. Of course, it will error if the key does not exist.
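For completeness, the PSObject approach mentioned above could look roughly like this (a sketch reusing $FolderPath and $Pattern from the script above; the output path is hypothetical):
$Rows = Get-ChildItem -Recurse $FolderPath -File | ForEach-Object {
    $str = Get-Content $_.FullName
    $found = [regex]::Matches($str, $Pattern) | % { $_.Value } | Sort-Object -Unique
    foreach ($value in $found) {
        # one row per file/match combination
        [pscustomobject]@{ File = $_.FullName; Match = $value }
    }
}
$Rows | Export-Csv -NoTypeInformation -Path 'C:\Test\results.csv'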

Change and save .nc files

I have a massive amount of .nc files (text files) where I need to change different lines based on their line number and content.
Example:
So far I have:
Get-ChildItem I:\temp *.nc -recurse | ForEach-Object {
    $c = ($_ | Get-Content)
    $c = $c -replace "S355J2","S235JR2"
    $c = $c.GetType() | Format-Table -AutoSize
    $c = $c -replace $c[3],$c[4]
    [IO.File]::WriteAllText($_.FullName, ($c -join "`r`n"))
}
This is not working, however, since it writes only a few lines of PowerShell formatting output to each file instead of the original (changed) content.
I don't know what you expect $c = $c.GetType() | Format-Table -AutoSize to do, but it most likely doesn't do whatever it is you're expecting.
If I understand your question correctly you essentially want to
remove the line pos,
replace the code S355J2 with S235JR2, and
remove a section SI if it exists.
The following code should work:
Get-ChildItem I:\temp *.nc -Recurse | ForEach-Object {
    (Get-Content $_.FullName | Out-String) -replace 'pos\r\n\s+' -replace 'S355J2', 'S235JR2' -replace '(?m)^SI\r\n(\s+.*\n)+' |
        Set-Content $_.FullName
}
Out-String mangles the content of the input file into a single string, and the daisy-chained replacement operations modify that string before it's written back to the file. The expression (?m)^SI\r\n(\s+.*\n)+ matches a line beginning with SI and followed by one or more indented lines. The (?m) modifier is to allow matching start-of-line in a multiline string, otherwise ^ would only match the beginning of the string.
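As a small illustration with made-up content (the original example from the question isn't shown here), the SI pattern removes the SI line together with its indented block:
# hypothetical sample content
$sample = "HEADER`r`nSI`r`n  line1`r`n  line2`r`nNEXT`r`n"
$sample -replace '(?m)^SI\r\n(\s+.*\n)+'
# output: HEADER and NEXT - the SI line and its indented lines are gone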
Edit: If you need to replace variable text in the 3rd line with the text from the 4th line (thus duplicating the 4th line) you're indeed better off working with an array for that. Delay the mangling of the string array until after that replacement:
Get-ChildItem I:\temp *.nc -Recurse | ForEach-Object {
    $txt = @(Get-Content $_.FullName)
    $txt[3] = $txt[4]
    ($txt | Out-String) -replace 'S355J2', 'S235JR2' -replace '(?m)^SI\r\n(\s+.*\n)+' |
        Set-Content $_.FullName
}

PowerShell: adding a line into a .txt file

I have a text (.txt) file with following content:
Car1
Car2
Car3
Car4
Car5
To change Car1 to random text I used this script:
Get-ChildItem "C:\Users\boris.magdic\Desktop\q" -Filter *.TXT |
    Foreach-Object{
        $content = Get-Content $_.FullName
        $content | ForEach-Object { $_ -replace "Car1", "random_text" } | Set-Content $_.FullName
    }
This is working ok, but now I want to add one text line under Car2 in my text file.
How can I do that?
Just chain another -replace and use a new line!
Get-ChildItem "C:\Users\boris.magdic\Desktop\q" -Filter *.TXT |
    Foreach-Object{
        $file = $_.FullName
        $content = Get-Content $file
        $content | ForEach-Object { $_ -replace "Car1", "random_text" -replace "(Car2)","`$1`r`nOtherText" } | Set-Content $file
    }
The first thing is that | Set-Content $_.FullName would not work, since the file object does not exist in that pipe. So one simple thing to do is to save the file path in a variable for use later in the pipeline. You can also use the ForEach($file in (Get-ChildItem....)) construct.
The specific change to get what you want is the second -replace. We place what we want to match in parentheses so that we can reference it in the replacement string with $1. We use a backtick to ensure PowerShell does not treat it as a variable.
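A quick illustration of that capture-group replacement on a single string:
'Car2' -replace "(Car2)", "`$1`r`nOtherText"
# result:
# Car2
# OtherText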
We can remove some redundancy as well, since -replace works against the file's lines as a whole:
Get-ChildItem "c:\temp" -Filter *.TXT |
    Foreach-Object{
        $file = $_.FullName
        (Get-Content $file) -replace "Car1", "random_text" -replace "(Car2)","`$1`r`nOtherText" | Set-Content $file
    }
While this does work with your sample text, I want to point out that more complicated strings might require more finesse to ensure you make the correct changes. Also, the replacements we are using are regex-based, which is not strictly necessary for this specific example.
.Replace()
So if you were just doing simple replacements, we can update your original logic:
Get-ChildItem "C:\Users\boris.magdic\Desktop\q" -Filter *.TXT |
    Foreach-Object{
        $file = $_.FullName
        $content = Get-Content $_.FullName
        $content | ForEach-Object { $_.replace("Car1", "random_text").replace("Car2","Car2`r`nOtherText") } | Set-Content $file
    }
So that is just simple text replacement, chained using the string method .Replace().
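One difference to keep in mind when choosing between the two: the -replace operator is a case-insensitive regex match by default, while .Replace() is a literal, case-sensitive string replacement. For example:
'car1 CAR1' -replace 'Car1', 'X'    # 'X X'        (case-insensitive regex)
'car1 CAR1'.Replace('Car1', 'X')    # 'car1 CAR1'  (literal, case-sensitive - no change)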