How to select every other line from a file? - powershell

How can I select every other line from a file?
For example, my file contains the strings
string1
string2
string3
string4
I want to get
string2
string4
I tried it this way:
Get-Content -Path "E:\myfile.txt" | Select-String
but I don't know how to do this with Select-String.

If you literally want to select these two lines, then I guess this is the shortest way to do that:
(Get-Content -Path "E:\myfile.txt")[1,3]
or
Get-Content -Path "E:\myfile.txt" | Select-Object -Index 1,3
However, if you mean you want to select only the even numbered lines from the file, you could do this:
# return only the even lines (for odd lines, start the loop with $i = 0)
$text = Get-Content -Path "E:\myfile.txt"; for ($i = 1; $i -lt @($text).Count; $i += 2) { $text[$i] }
Or by using Select-String
# return only the even lines (for odd lines, remove the ! exclamation mark)
(Select-String -Path "E:\myfile.txt" -Pattern '.*' | Where-Object {!($_.LineNumber % 2)}).Line

Get-Content -Path "~\Desktop\strings.txt" | Select-String -Pattern "string2|string4"

You can use the Where-Object cmdlet to filter a stream of objects (strings in this case):
Get-Content -Path "E:\myfile.txt" | Where-Object {$_ -match '[24]$'}
# or
Get-Content -Path "E:\myfile.txt" | Where-Object {$_ -like '*[24]'}
# or
Get-Content -Path "E:\myfile.txt" | Where-Object {$_.EndsWith('2') -or $_.EndsWith('4')}
If you want only even-numbered lines from the file:
Get-Content -Path "E:\myfile.txt" | Where-Object {$_.ReadCount % 2 -eq 0}
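The approaches above can be compared side by side on a throwaway file. This is a minimal sketch, not anyone's original code: the temporary file and the variable names ($tmp, $byIndex, $byReadCount, $byLineNumber) are assumptions made for illustration.

```powershell
# Create a throwaway file holding the four sample lines from the question.
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value 'string1', 'string2', 'string3', 'string4'

# Approach 1: index into the array returned by Get-Content (indices are 0-based).
$byIndex = (Get-Content -Path $tmp)[1, 3]

# Approach 2: filter on ReadCount, the 1-based line number Get-Content attaches to each line.
$byReadCount = Get-Content -Path $tmp | Where-Object { $_.ReadCount % 2 -eq 0 }

# Approach 3: filter on the LineNumber property of the objects Select-String returns.
$byLineNumber = (Select-String -Path $tmp -Pattern '.*' |
    Where-Object { $_.LineNumber % 2 -eq 0 }).Line

Remove-Item -Path $tmp
```

All three variables should end up holding string2 and string4. Indexing is the shortest form when the positions are known in advance; the ReadCount and LineNumber filters work for files of any length.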

Related

How to strip out leading time stamp?

I have some log files.
Some of the UPDATE SQL statements are getting errors, but not all.
I need to know all the statements that are getting errors so I can find the pattern of failure.
I can sort all the log files and get the unique lines, like this:
$In = "C:\temp\data"
$Out1 = "C:\temp\output1"
$Out2 = "C:\temp\output2"
Remove-Item $Out1\*.*
Remove-Item $Out2\*.*
# Get the log files from the last 90 days
Get-ChildItem $In -Filter *.log | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-90)} |
Foreach-Object {
$content = Get-Content $_.FullName
#filter and save content to a file
$content | Where-Object {$_ -match 'STATEMENT'} | Sort-Object -Unique | Set-Content $Out1\$_
}
# merge all the files, sort unique, write to output
Get-Content $Out1\* | Sort-Object -Unique | Set-Content $Out2\output.txt
Works great.
But some of the logs have a date-time stamp in the first 24 characters of each line. I need to strip that out; otherwise all those lines are unique.
If it helps, all the files either have the leading timestamp or they don't. The lines are not mixed within a single file.
Here is what I have so far:
# Get the log files from the last 90 days
Get-ChildItem $In -Filter *.log | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-90)} |
Foreach-Object {
$content = Get-Content $_.FullName
#filter and save content to a file
$s = $content | Where-Object {$_ -match 'STATEMENT'}
# strip datetime from front if exists
If (Where-Object {$s.Substring(0,1) -Match '/d'}) { $s = $s.Substring(24) }
$s | Sort-Object -Unique | Set-Content $Out1\$_
}
# merge all the files, sort unique, write to output
Get-Content $Out1\* | Sort-Object -Unique | Set-Content $Out2\output.txt
But it just writes the lines out without stripping the leading characters.
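The regex /d should be \d: \ is the regex escape character, and character-class shortcuts such as d (for a digit; see the note below) must be prefixed with it.
In addition, the Where-Object call inside the If condition receives no pipeline input, so it outputs nothing, the condition is always false, and Substring(24) never runs.
Use a single pipeline that passes the Where-Object output to a ForEach-Object call, where you can perform the conditional removal of the timestamp prefix: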
$content |
Where-Object { $_ -match 'STATEMENT' } |
ForEach-Object { if ($_[0] -match '\d') { $_.Substring(24) } else { $_ } } |
Set-Content $Out1\$_
Note: Strictly speaking, \d matches everything that the Unicode standard considers a digit, not just the ASCII-range digits 0 to 9; to limit matching to the latter, use [0-9].
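The conditional stripping can be tried on in-memory data before pointing it at real logs. This is a sketch under stated assumptions: the sample lines and their timestamp format are invented for illustration (the question only says the prefix is 24 characters long), and the length check is an added safeguard, not part of the answer above.

```powershell
# Hypothetical sample lines: two carry a 24-character timestamp prefix, one does not.
$lines = @(
    '2023-01-15 10:22:33.123 STATEMENT: UPDATE t SET a = 1'
    'STATEMENT: UPDATE t SET a = 1'
    '2023-01-16 08:00:00.000 STATEMENT: UPDATE t SET a = 1'
)

$stripped = $lines |
    Where-Object { $_ -match 'STATEMENT' } |
    ForEach-Object {
        # Strip the prefix only when the line starts with an ASCII digit
        # and is long enough for Substring(24) not to throw.
        if ($_ -match '^[0-9]' -and $_.Length -ge 24) { $_.Substring(24) } else { $_ }
    } |
    Sort-Object -Unique
```

Once the prefix is gone, the three sample lines are identical, so Sort-Object -Unique leaves a single line in $stripped.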

Redundant code, how can I have multiple arguments per line

This script works, I want to condense it so if I add more lines to find and replace in the file I'm not being redundant.
Get-ChildItem C:\Users\JonSa\Desktop -Filter callcounts.xml | ForEach-Object {
(Get-Content $_.FullName) |
Foreach-Object {$_ -replace "#aXXXXX.ac1.vbspbx.com", ""} |
Set-Content $_.FullName
}
Get-ChildItem C:\Users\JonSa\Desktop -Filter callcounts.xml | ForEach-Object {
(Get-Content $_.FullName) |
Foreach-Object {$_ -replace "sip:", ""} |
Set-Content $_.FullName
}
I would like to accomplish this with fewer lines that leaves room for more arguments.
With only one file, you don't need Get-ChildItem and ForEach-Object.
With the -Raw parameter, you can apply the replacement to the whole file at once.
You can also chain several -replace operations one after the other.
For the same replacement (here, none), you can combine the patterns with the alternation operator | (OR).
An empty replacement can be omitted with the -replace operator (but not with the .Replace() method).
$File = 'C:\Users\JonSa\Desktop\callcounts.xml'
(Get-Content $File -Raw) -replace '#aXXXXX\.ac1\.vbspbx\.com|sip:' |
Set-Content $File
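To leave room for more search strings, the list can live in an array that is joined into one alternation. The following is a sketch, assuming every match is simply removed; $patterns, $regex and the sample $text are illustrative names and data, not part of the original script.

```powershell
# Literal strings to remove; add more entries here as needed.
$patterns = '#aXXXXX.ac1.vbspbx.com', 'sip:'

# Escape each literal so regex metacharacters such as '.' match literally,
# then join the results into a single alternation: a|b|c
$regex = ($patterns | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Hypothetical sample text standing in for the content of callcounts.xml.
$text = '<from>sip:1001#aXXXXX.ac1.vbspbx.com</from>'

# -replace with no replacement operand removes every match.
$result = $text -replace $regex
```

For a real file, $text would come from Get-Content $File -Raw and $result would be piped to Set-Content $File, as in the answer above.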

Get-ChildItem Zero Results Output

I have an issue that I cannot resolve no matter which way I am wrapping it up. I am including my latest code which is not giving me the desired outcome and the code for a solution that does work but for only one file at a time. I cannot work out how to loop through each of the files automatically however.
In a nutshell, I have a directory with many CSV files. Some of the entries within the CSV files have a negative value (-), and I need to remove this minus sign in all instances.
Now what works is if I use the following (on a single file)
$content = Get-Content "L:\Controls\BCR\0001cash.csv" | ForEach {$_ -replace $variable, ""} | Set-Content "L:\controls\bcr\0001temp.csv"
What I am trying to do is iterate through many thousands of these files automatically, without having to refer to them individually.
I started with:
$Directory = "L:\Controls\BCR\"
$variable = "-"
$suffix = ".tmp"
To define the directory, minus symbol that I want to remove and the suffix of the file I want to change to...
$Files = Get-ChildItem $Directory | Where-Object {$_.Extension -like "*csv*"} | Where-Object {$_.Name -like "*cash*"}
Is obtaining each of the files that I wish to work with
And I am then working with
ForEach ($File in $Files) { Get-Content $Files | ForEach {$_ -replace $variable, ""} | Set-Content {$_.Basename + $Variable}
The results however are nothing...
At a loss? Anyone???
$Directory = "L:\Controls\BCR\"
$variable = "-"
$suffix = ".tmp"
$Files = Get-ChildItem $Directory | Where-Object {$_.Extension -like "*csv*"} | Where-Object {$_.Name -like "*cash*"}
$process = ForEach ($File in $Files) { Get-Content $Files | ForEach {$_ -replace $variable, ""} | Set-Content {$_.BaseName + $suffix}
}
You are using the wrong variable in the Get-Content call ($Files instead of $File). You can also simplify your script:
$Directory = "L:\Controls\BCR\"
$variable = "-"
$suffix = ".tmp"
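Get-ChildItem $Directory -Filter '*cash*csv' |
ForEach-Object {
# In the argument to Set-Content, $_ is still the current file from ForEach-Object
(Get-Content $_.FullName -Raw) -replace $variable |
Set-Content (Join-Path $_.DirectoryName ($_.BaseName + $suffix))
}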
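The same per-file loop can be tried safely on a throwaway directory. This is a sketch under assumptions: the directory, file names and CSV rows are made up for illustration, and a foreach statement is used so that the current file has an explicit name ($file).

```powershell
# Build a throwaway directory with two sample CSV files (hypothetical names and data).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir '0001cash.csv') -Value 'a,-10', 'b,20'
Set-Content -Path (Join-Path $dir '0002cash.csv') -Value 'c,-3.5'

$suffix = '.tmp'
foreach ($file in Get-ChildItem -Path $dir -Filter '*cash*.csv') {
    # Read one file at a time ($file, not the whole collection), remove the minus signs,
    # and write the result next to the original with the new suffix.
    $target = Join-Path $file.DirectoryName ($file.BaseName + $suffix)
    (Get-Content -Path $file.FullName) -replace '-' | Set-Content -Path $target
}

# Read both results back before cleaning up.
$first = Get-Content -Path (Join-Path $dir '0001cash.tmp')
$second = Get-Content -Path (Join-Path $dir '0002cash.tmp')
Remove-Item -Path $dir -Recurse
```

Building the target path from $file.DirectoryName keeps the output next to the source file regardless of the current working directory.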

Line Count ForEach-Object on multiple files

I'm trying to figure out how to incorporate a line count that gets added to each file in a loop. The count needs to be put into the footer of each file as the loop processes it. Another concern is that the count needs to include the header and footer lines (i.e. 8 lines + 1 header + 1 footer = 10). The code I'm using is below. I know that the code to count the lines is Get-Content $mypath | Measure-Object -Line | {$linecount = $_.Count}, but I don't know how to properly incorporate it. Any suggestions?
Get-ChildItem $destinationfolderpath -Recurse -Filter *.txt | ForEach-Object -Begin { $seq = 0 } -Process {
$seq++
$seq1 = "{0:D4}" -f $seq; $header="File Sequence Number $seq1"
$footer="File Sequence Number $seq1 and Line Count $looplinecount"
$header + "`n" + (Get-Content $_.FullName | Out-String) + $footer | Set-Content -Path $_.FullName
}
Load the content of the file into a variable within the loop, run Measure-Object -Line on that variable, add 2 (one for the header line, one for the footer line), and drop that into a sub-expression for the footer:
Get-ChildItem $destinationfolderpath -Recurse -Filter *.txt | ForEach-Object -Begin { $seq = 0 } -Process {
$seq++
$seq1 = "{0:D4}" -f $seq
$header="File Sequence Number $seq1"
$Content=Get-Content $_.FullName | Out-String
$footer="File Sequence Number $seq1 and Line Count $(($content|measure -line|select -expand lines)+2)"
"$header`n$Content$footer" | Set-Content -Path $_.FullName
}
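A variant of the same idea for a single file, sketched under assumptions: the temporary file and its eight rows are made up for illustration, and the array's Count property is swapped in for Measure-Object -Line, which is a different technique from the answer above.

```powershell
# Throwaway file with 8 body lines (hypothetical data).
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value (1..8 | ForEach-Object { "row $_" })

$seq1 = '{0:D4}' -f 1

# @() guarantees an array even for a one-line file, so .Count is the number of body lines.
$body = @(Get-Content -Path $path)
$lineCount = $body.Count + 2   # body lines + 1 header + 1 footer

$header = "File Sequence Number $seq1"
$footer = "File Sequence Number $seq1 and Line Count $lineCount"

# Passing an array to Set-Content writes one element per line.
Set-Content -Path $path -Value (@($header) + $body + $footer)

$written = @(Get-Content -Path $path)
Remove-Item -Path $path
```

Because Get-Content returns one array element per line, empty lines included, the count in the footer matches the number of lines actually written to the file.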

Using PowerShell to remove lines from a text file if it contains a string

I am trying to remove all the lines from a text file that contains a partial string using the below PowerShell code:
Get-Content C:\new\temp_*.txt | Select-String -pattern "H|159" -notmatch | Out-File C:\new\newfile.txt
The actual string is H|159|28-05-2005|508|xxx, it repeats in the file multiple times, and I am trying to match only the first part as specified above. Is that correct? Currently I am getting empty as output.
Am I missing something?
If you want to write the result back to the same file, you can do it as follows:
Set-Content -Path "C:\temp\Newtext.txt" -Value (get-content -Path "c:\Temp\Newtext.txt" | Select-String -Pattern 'H\|159' -NotMatch)
Escape the | character with a backslash. The backslash is the regex escape character; PowerShell's backtick is not, and inside a single-quoted string it is just a literal character.
get-content c:\new\temp_*.txt | select-string -pattern 'H\|159' -notmatch | Out-File c:\new\newfile.txt
Another option for writing to the same file, building on the existing answers. Wrap the pipeline in parentheses so that the file is read completely before the content is written back to it.
(get-content c:\new\sameFile.txt | select-string -pattern 'H\|159' -notmatch) | Set-Content c:\new\sameFile.txt
You don't need Select-String in this case, just filter the lines out with Where-Object
Get-Content C:\new\temp_*.txt |
Where-Object { -not $_.Contains('H|159') } |
Set-Content C:\new\newfile.txt
String.Contains does a literal string comparison instead of a regex match, so you don't need to escape the pipe character. It is also faster.
The pipe character | has a special meaning in regular expressions. a|b means "match either a or b". If you want to match a literal | character, you need to escape it:
... | Select-String -Pattern 'H\|159' -NotMatch | ...
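The difference is easy to see on a few in-memory lines. This is a sketch with invented sample data (only the first line comes from the question); it also shows two ways to avoid hand-written escaping, -SimpleMatch and [regex]::Escape.

```powershell
# Hypothetical sample lines; the first is the record format from the question.
$lines = 'H|159|28-05-2005|508|xxx', 'D|200|keep me', 'H|160|keep me too'

# Unescaped, 'H|159' means "H or 159", so every line containing an H is dropped as well.
$unescaped = @($lines | Select-String -Pattern 'H|159' -NotMatch | ForEach-Object Line)

# -SimpleMatch treats the pattern as a literal string, so no escaping is needed.
$simple = @($lines | Select-String -Pattern 'H|159' -SimpleMatch -NotMatch | ForEach-Object Line)

# [regex]::Escape produces the escaped form (H\|159) for use as a regex.
$escaped = @($lines | Select-String -Pattern ([regex]::Escape('H|159')) -NotMatch | ForEach-Object Line)
```

With the unescaped pattern only the D line survives, which is consistent with the empty output in the question if every line in the file contains an H or 159. The other two variants keep both non-matching lines.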
This is probably a long way around a simple problem, but it does allow me to remove lines containing any of a number of matches. I did not have a partial match that could be used, and I needed to do this on over 1000 files.
This post helped me get to where I needed to be, thank you.
$ParentPath = "C:\temp\test"
$Files = Get-ChildItem -Path $ParentPath -Recurse -Include *.txt
$Match1 = "matchtext1"
$Match2 = "matchtext2"
$Match3 = "matchtext3"
$Match4 = "matchtext4"
$Match5 = "matchtext5"
$Match6 = "matchtext6"
$Match7 = "matchtext7"
$Match8 = "matchtext8"
$Match9 = "matchtext9"
$Match10 = "matchtext10"
foreach ($File in $Files) {
$FullPath = $File.FullName
$OldContent = Get-Content $FullPath
$NewContent = $OldContent `
| Where-Object {$_ -notmatch $Match1} `
| Where-Object {$_ -notmatch $Match2} `
| Where-Object {$_ -notmatch $Match3} `
| Where-Object {$_ -notmatch $Match4} `
| Where-Object {$_ -notmatch $Match5} `
| Where-Object {$_ -notmatch $Match6} `
| Where-Object {$_ -notmatch $Match7} `
| Where-Object {$_ -notmatch $Match8} `
| Where-Object {$_ -notmatch $Match9} `
| Where-Object {$_ -notmatch $Match10}
Set-Content -Path $FullPath -Value $NewContent
Write-Output $File
}
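The ten Where-Object stages can be collapsed into one by joining the search strings into a single regex. This is a sketch, assuming the match texts are literal strings; $matchTexts, $regex and the sample lines are illustrative and not part of the script above.

```powershell
# Literal strings whose lines should be removed; extend the list as needed.
$matchTexts = 1..10 | ForEach-Object { "matchtext$_" }

# Escape each literal and join them into one alternation, so a single
# -notmatch test covers all of them.
$regex = ($matchTexts | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Hypothetical sample content standing in for Get-Content $FullPath.
$lines = 'keep this line', 'drop matchtext3 here', 'also keep', 'matchtext10 goes too'
$kept = $lines | Where-Object { $_ -notmatch $regex }
```

Adding another string to remove then means adding one entry to $matchTexts instead of another variable and another pipeline stage.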
If anyone runs into this issue while doing what Robert Brooker suggested:
*These files have different encodings. Left file: Unicode (UTF-8) with signature. Right file: Unicode (UTF-8) without signature. You can resolve the difference by saving the right file with the encoding Unicode (UTF-8) with signature.*
use -Encoding UTF8 with Set-Content, like this (in Windows PowerShell 5.1 this writes the signature, i.e. a BOM; in PowerShell 6 and later, use -Encoding utf8BOM to get one):
(get-content c:\new\sameFile.txt | select-string -pattern 'H\|159' -notmatch) | Set-Content c:\new\sameFile.txt -Encoding UTF8