I have a PowerShell script that should fill in the empty field shown in the picture. I can't work out how to select that element and fill it with a number.
I tried to find the name with a regex but didn't succeed.
Here is part of the code:
$krokPattern = "https://kazdykrokpomaha.ozp.cz/index.php?kroky/index"
$ie.navigate($krokPattern)
while($ie.Busy) { Start-Sleep -Milliseconds 100 }
[regex]$regex = "krok-\d{4}-\d{2}-\d{2}"
$stering = Select-String -Path $krokPattern -Pattern $regex
Image - how it looks like
You can do something like the following with -replace. Just replace the value assigned to $number with whatever value you deem appropriate. However, a proper parser for the language in the file is going to be best.
$regex = [regex]'(?<=type=")[^"]+(?=" name="krok-\d{4}-\d{2}-\d{2}")'
$number = 24
(Get-Content index.html) -replace $regex,$number | Set-Content index.html
Explanation:
Since -replace uses regex matching, we can build off of your current idea. See the following for the $regex breakdown. The goal is to match all characters between the double quotes after type= and before name="krok-####-##-##".
(?<=): Positive Lookbehind
type=": matches the characters type=" literally
[^"]+: matches a single character that is not " one or more times (+).
(?=): Positive Lookahead
" name="krok-\d{4}-\d{2}-\d{2}": matches " name="krok- literally, followed by 4 digits, a literal -, 2 digits, a literal -, 2 digits, and a final ".
The characters that match $regex are replaced by $number.
See Regex Demo for example and deeper explanation.
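To see what the pattern actually touches, here is a minimal sketch that runs the same -replace against a single in-memory line instead of index.html. The sample markup in $sample is an assumption made for illustration; the real page may look different.

```powershell
# Same pattern as above: text between type=" and " name="krok-YYYY-MM-DD"
$regex  = [regex]'(?<=type=")[^"]+(?=" name="krok-\d{4}-\d{2}-\d{2}")'
$number = 24

# Hypothetical input line (assumption, not taken from the real page)
$sample = '<input type="text" name="krok-2019-05-01">'

# Only the characters between the lookbehind and the lookahead are replaced
$result = $sample -replace $regex, $number
$result
```

With this pattern it is the value of the type attribute that gets replaced; if a different attribute should receive the number, the lookbehind has to be adjusted accordingly.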
Related
Beginner here. I am working on an error log file and library; the step I am currently on is pulling specific information from a txt file.
The code I have currently is...
$StatusErr = "Type 1","Type 2"
for ($i=0; $i -lt $StatusErr.length; $i++) {
    get-content C:\blah\Logs\StatusErrors.TXT |
        select-string $StatusErr[$i] |
        add-content C:\blah\Logs\StatusErrorsresult.txt
}
while it is working, I need it to display as
Type-1-Description
2-Description
Type-1-Description
2-Description
Type-1-Description
2-Description
etc.
it is currently displaying as
Type 1 = Type-1-Description
Type 1 = Type-1-Description
Type 1 = Type-1-Description
Type 2 = 2-Description
Type 2 = 2-Description
Type 2 = 2-Description
I am unsure how to change the arrangement and remove unneeded spaces and the = sign
You need to search for both patterns in a single Select-String call in order to get matching lines in order.
While the -Pattern parameter does accept an array of patterns, in this case a single regex will do.
You need to use a regex pattern in order to capture and output only part of the lines that match.
$StatusErrRegex = '(?<=Type [12]\s*=\s*)[^ ]+'
get-content C:\blah\Logs\StatusErrors.TXT |
select-string $StatusErrRegex |
foreach-object { $_.Matches.Value } |
set-content C:\blah\Logs\StatusErrorsresult.txt
Note that I've replaced add-content with set-content, as I'm assuming you don't want to append to a preexisting file. set-content writes all objects it receives via the pipeline to the output file.
Select-String outputs Microsoft.PowerShell.Commands.MatchInfo instances whose .Matches property provides access to the part of the line that was matched.
For an explanation of the regex and the ability to experiment with it, see this regex101.com page.
Additional notes:
Select-String, like PowerShell in general, is case-insensitive by default; add the -CaseSensitive switch, if needed.
(?<=...) is a (positive) lookbehind assertion, whose matching text doesn't become part of what the regex captures.
\s* matches zero or more whitespace characters; \s+ would match one or more.
[^ ]+ matches one or more (+) characters that are not (^) spaces ( ), and thereby captures the run of non-space characters to the right of the = sign.
To match any of multiple words at the start of the pattern, use a regex alternation (|), e.g. '(?<=(type|data) [12]\s*=\s*)[^ ]+'
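The behaviour described in these notes can be tried without any files. The following is a small sketch that pipes a few in-memory lines (made up for illustration, modelled on the question's output) through the same pattern.

```powershell
# Sample lines; the middle one should be ignored by the pattern
$lines = @(
    'Type 1 = Type-1-Description'
    'Other = ignored'
    'Type 2 = 2-Description'
)

# One Select-String call keeps the input order; .Matches.Value is only
# the part to the right of "=", because the lookbehind is not captured
$result = $lines |
    Select-String '(?<=Type [12]\s*=\s*)[^ ]+' |
    ForEach-Object { $_.Matches.Value }

$result
```

Because both Type 1 and Type 2 are handled by one pattern, the output follows the order of the input lines rather than being grouped by type.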
I have a big file consisting of "before" and "after" cases for every item, as follows:
case1 (BEF) ACT
(AFT) BLK
case2 (BEF) ACT
(AFT) ACT
case3 (BEF) ACT
(AFT) CLC
...
I need to select all of the strings which have (BEF) ACT on the "first" string and (AFT) BLK on the "second" and place the result to a file.
The idea is to create a clause like
IF (stringX.LineNumber consists of "(BEF) ACT" AND stringX+1.LineNumber consists of (AFT) BLK)
{OutFile $stringX+$stringX+1}
Sorry for the syntax, I've just started working with PS :)
$logfile = 'c:\temp\file.txt'
$matchphrase = '\(BEF\) ACT'
$linenum=Get-Content $logfile | Select-String $matchphrase | ForEach-Object {$_.LineNumber+1}
$linenum
#I've worked out how to get a line number after the line with first required phrase
Create a new file with a result as follows:
string with "(BEF) ACT" following with a string with "(AFT) BLK"
Select-String -SimpleMatch -CaseSensitive '(BEF) ACT' c:\temp\file.txt -Context 0,1 |
  ForEach-Object {
    $lineAfter = $_.Context.PostContext[0]
    # $lineAfter is $null if the match is on the last line of the file
    if ($lineAfter -and $lineAfter.Contains('(AFT) BLK')) {
      $_.Line, $lineAfter # output
    }
  } # | Set-Content ...
-SimpleMatch performs string-literal substring matching, which means you can pass the search string as-is, without needing to escape it.
However, if you needed to further constrain the search, such as to ensure that it only occurs at the end of a line ($), you would indeed need a regular expression with the (implied) -Pattern parameter: '\(BEF\) ACT$'
Also note PowerShell is generally case-insensitive by default, which is why switch -CaseSensitive is used.
Note how Select-String can accept file paths directly - no need for a preceding Get-Content call.
-Context 0,1 captures 0 lines before and 1 line after each match, and includes them in the [Microsoft.PowerShell.Commands.MatchInfo] instances that Select-String outputs.
Inside the ForEach-Object script block, $_.Context.PostContext[0] retrieves the line after the match and .Contains() performs a literal substring search in it.
Note that .Contains() is a method of the .NET System.String type, and such methods - unlike PowerShell - are case-sensitive by default, but you can use an optional parameter to change that.
If the substring is found on the subsequent line, both the line at hand and the subsequent one are output.
The above looks for all matching pairs in the input file; if you only wanted to find the first pair, append | Select-Object -First 2 to the end of the pipeline, after the ForEach-Object call.
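As a sketch of the case-sensitivity point, the variant below checks the following line case-insensitively. It uses .IndexOf() with a System.StringComparison argument instead of .Contains(); that substitution is my own choice, made because this overload is available in both Windows PowerShell and PowerShell 7. The temporary file and its contents are made up for illustration.

```powershell
# Build a small sample file (contents are an assumption for this demo)
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value @(
    'case1 (BEF) ACT'
    '      (AFT) BLK'
    'case2 (BEF) ACT'
    '      (AFT) ACT'
)

$pairs = Select-String -SimpleMatch -CaseSensitive '(BEF) ACT' $tmp -Context 0,1 |
    ForEach-Object {
        $lineAfter = $_.Context.PostContext[0]
        # Case-insensitive literal search; -ge 0 means "substring found"
        if ($lineAfter -and
            $lineAfter.IndexOf('(aft) blk', [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            $_.Line, $lineAfter
        }
    }

Remove-Item $tmp
$pairs
```

Only the case1 pair is output here, since the line after case2 contains (AFT) ACT rather than (AFT) BLK.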
Another way of doing this is to read the $logFile in as a single string and use a RegEx match to get the parts you want:
$logFile = 'c:\temp\file.txt'
$outFile = 'c:\temp\file2.txt'
# read the content of the logfile as a single string
$content = Get-Content -Path $logFile -Raw
$regex = [regex] '(case\d+\s+\(BEF\)\s+ACT\s+\(AFT\)\s+BLK)'
$match = $regex.Match($content)
($output = while ($match.Success) {
    $match.Value
    $match = $match.NextMatch()
}) | Set-Content -Path $outFile -Force
When used the result is:
case1 (BEF) ACT
(AFT) BLK
case7 (BEF) ACT
(AFT) BLK
Regex details:
( Match the regular expression below and capture its match into backreference number 1
case Match the characters “case” literally
\d Match a single digit 0..9
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( Match the character “(” literally
BEF Match the characters “BEF” literally
\) Match the character “)” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
ACT Match the characters “ACT” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( Match the character “(” literally
AFT Match the characters “AFT” literally
\) Match the character “)” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
BLK Match the characters “BLK” literally
)
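The Match()/NextMatch() loop can also be expressed with the Matches() method, which returns all matches at once. The sketch below uses an in-memory string in place of the file content (the sample text is an assumption) and drops the outer capture group, which is not needed when only the whole match is read.

```powershell
# Stand-in for Get-Content -Raw: one string containing line breaks
$content = @(
    'case1 (BEF) ACT'
    '      (AFT) BLK'
    'case2 (BEF) ACT'
    '      (AFT) ACT'
) -join "`n"

$regex = [regex]'case\d+\s+\(BEF\)\s+ACT\s+\(AFT\)\s+BLK'

# Matches() yields every match; \s+ spans the line break between the two lines
$result = $regex.Matches($content) | ForEach-Object { $_.Value }
$result
```

Each returned value is a two-line string, because \s+ also matches the newline between the (BEF) line and the (AFT) line.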
My other answer completes your own Select-String-based solution attempt. Select-String is versatile, but slow, though it is appropriate for processing files too large to fit into memory as a whole, given that it processes files line by line.
However, PowerShell offers a much faster line-by-line processing alternative: switch -File - see the solution below.
Theo's helpful answer, which reads the entire file into memory first, will probably perform best overall, depending on file size, but it comes at the cost of increased complexity, due to relying heavily on direct use of .NET functionality.
$(
  $firstLine = ''
  switch -CaseSensitive -Regex -File t.txt {
    '\(BEF\) ACT' { $firstLine = $_; continue }
    '\(AFT\) BLK' {
      # Pair found, output it.
      # If you don't want to look for further pairs,
      # append `; break` inside the block.
      if ($firstLine) { $firstLine, $_ }
      # Look for further pairs.
      $firstLine = ''; continue
    }
    default { $firstLine = '' }
  }
) # | Set-Content ...
Note: The enclosing $(...) is only needed if you want to send the output directly to the pipeline to a cmdlet such as Set-Content; it is not needed for capturing the output in a variable: $pair = switch ...
-Regex interprets the branch conditionals as regular expressions.
$_ inside a branch's action script block ({ ... }) refers to the line at hand.
The overall approach is:
$firstLine stores the 1st line of interest once found, and when the 2nd line's pattern is found and $firstLine is set (is nonempty), the pair is output.
The default handler resets $firstLine, to ensure that only two consecutive lines that contain the strings of interest are considered.
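To illustrate why the default handler matters, here is a sketch that runs the same switch against a temporary file in which an unrelated line sits between a (BEF) ACT line and an (AFT) BLK line. The file contents are made up for the demo, and the result is captured in a variable, so the enclosing $(...) is not needed.

```powershell
# Sample file: the second BEF/AFT pair is interrupted by a 'noise' line
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value @(
    'case1 (BEF) ACT'
    '      (AFT) BLK'
    'case2 (BEF) ACT'
    'noise'
    '      (AFT) BLK'
)

$firstLine = ''
$pairs = switch -CaseSensitive -Regex -File $tmp {
    '\(BEF\) ACT' { $firstLine = $_; continue }
    '\(AFT\) BLK' {
        if ($firstLine) { $firstLine, $_ }   # consecutive pair found
        $firstLine = ''; continue
    }
    default { $firstLine = '' }              # any other line breaks the pair
}

Remove-Item $tmp
$pairs
```

Only the case1 pair is output: the 'noise' line triggers the default branch, which clears $firstLine before the last (AFT) BLK line is reached.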
In a file I have
datasource =(Description=(failover=on)(load_balance=off) transport_connect_timeout=1)
I want to pass the value using $datasource.
While I use
$datasource = Get-Content "c:\file" | Select-String -Pattern datasource
this gives me the whole line
datasource =(Description=(failover=on)(load_balance=off)transport_connect_timeout=1)
but I need only
(Description=(failover=on)(load_balance=off) transport_connect_timeout=1)
please help me. Thanks in advance.
Here's one approach:
$fullValue = "datasource =(Description=(failover=on)(load_balance=off) transport_connect_timeout=1)"
($fullValue -split "=" | Select-Object -Skip 1) -join "="
Split the string on the equals signs
Grab all but the first split string
Join them all back together again using the equals sign
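A slightly shorter variant of the same idea, offered as a sketch rather than a replacement: the -split operator accepts a maximum number of substrings, so splitting into at most two parts avoids the need to rejoin anything.

```powershell
$fullValue = 'datasource =(Description=(failover=on)(load_balance=off) transport_connect_timeout=1)'

# Split on the first "=" only (at most 2 parts), then take the second part
$value = ($fullValue -split '=', 2)[1]
$value
```

The later equals signs stay untouched because the split stops after the first one.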
Select-String uses regular expressions with the -Pattern.
I'd use a more advanced one with a positive look behind and a capture group.
$datasource = sls .\file.txt -Patt '(?<=datasource =)(.*)$'|% {$_.Matches.groups[1].value}
RegEx explanation from regex101.com
(?<=datasource =)(.*)$
Positive Lookbehind (?<=datasource =)
Assert that the Regex below matches
datasource = matches the characters datasource = literally (case sensitive)
1st Capturing Group (.*)
.*
. matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times,
as many times as possible, giving back as needed (greedy)
$ asserts position at the end of a line
The pipe to % {$_.Matches.groups[1].value} iterates all matches and returns only the content of the capture group [1]
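The same pattern can be tried on an in-memory string, which makes it easy to compare the capture group with the overall match. This is only a sketch; the sample line is copied from the question.

```powershell
$line = 'datasource =(Description=(failover=on)(load_balance=off) transport_connect_timeout=1)'

$m = $line | Select-String -Pattern '(?<=datasource =)(.*)$'

# Content of capture group 1, as in the one-liner above
$fromGroup = $m.Matches[0].Groups[1].Value

# The lookbehind and $ are zero-width, so the whole match is the same text
$fromMatch = $m.Matches[0].Value

$fromGroup
```

Since both values are identical here, the capture group is optional for this particular pattern.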
Let's say I have a test file named testfile.txt containing the below line:
one (two) "three"
I want to use PowerShell to say that if the entire string exists, place a line directly underneath it with the value:
four (five) "six"
(Notice that it includes both spaces, brackets and double quotes. This is important, as the problem I am having is I think with escaping the brackets and double quotes).
So the result would be:
one (two) "three"
four (five) "six"
I thought the easiest way of doing it would be to say that if the first string is found, replace it with the first string itself again, and the new string forming a new line included in the same command. I had difficulty putting the strings in line so I tried using a herestring variable whereby an entire text block with formatting is read. It still does not parse the full string with quotes into the pipeline. I'm new to powershell so don't hold back if you see something stupid.
$herestring1 = @"
one (two) "three"
"@
$herestring2 = @"
one (two) "three"
four (five) "six"
"@
if((Get-Content testfile.txt) | select-string $herestring1) {
    "Match found - replacing string"
    (Get-Content testfile.txt) | ForEach-Object { $_ -replace $herestring1,$herestring2 } | Set-Content ./testfile.txt
    "Replaced string successfully"
}
else {
    "No match found"
}
The above just gives "No match found" every time. This is because it does not find the first string in the file.
I have tried variations using backtick [ ` ] and doubling quotes to try to escape, but I thought the point in a here string was that it should parse the text block including all formatting so I should not have to.
If I change the file to contain only:
one two three
and then change the herestring accordingly to:
$herestring1 = @"
one two three
"@
$herestring2 = @"
one two three
four five six
"@
Then it works ok and I get the string replaced as I want.
As Martin points out, you can use -SimpleMatch with Select-String to avoid parsing it as a regular expression.
But -replace will still be using a regex.
You can escape the pattern for RegEx using [RegEx]::Escape():
$herestring1 = @"
one (two) "three"
"@
$herestring2 = @"
one (two) "three"
four (five) "six"
"@
$pattern1 = [RegEx]::Escape($herestring1)
if((Get-Content testfile.txt) | select-string $pattern1) {
    "Match found - replacing string"
    (Get-Content testfile.txt) | ForEach-Object { $_ -replace $pattern1,$herestring2 } | Set-Content ./testfile.txt
    "Replaced string successfully"
}
else {
    "No match found"
}
Regular expressions interpret parentheses () (what you are calling brackets) as special. By default, spaces are not special, but they can be with certain regex options. Double quotes are no problem.
In regex, the escape character is backslash \, and this is independent of any escaping you do for the PowerShell parser using backtick `.
[RegEx]::Escape() will ensure anything special to regex is escaped so that a regex pattern will interpret it as literal, so your pattern will end up looking like this: one\ \(two\)\ "three"
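A minimal sketch of the difference, using the -match operator on an in-memory string rather than the file: the unescaped text fails to match itself, while the escaped version matches.

```powershell
$literal = 'one (two) "three"'

# Used as a regex, (two) is a capture group, so the pattern looks for
# 'one two "three"' without parentheses and does not match
$matchesUnescaped = $literal -match $literal

# After escaping, parentheses and spaces are treated literally
$escaped = [regex]::Escape($literal)
$matchesEscaped = $literal -match $escaped

$escaped
$matchesUnescaped
$matchesEscaped
```

This is the same reason the original script always reported "No match found".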
Just use the Select-String cmdlet with the -SimpleMatch switch:
# ....
if((Get-Content testfile.txt) | select-string -SimpleMatch $herestring1) {
# ....
-SimpleMatch
Indicates that the cmdlet uses a simple match rather than a regular
expression match. In a simple match, Select-String searches the input
for the text in the Pattern parameter. It does not interpret the value
of the Pattern parameter as a regular expression statement.
Source.
I have a text file with lines in this format:
FirstName,LastName,SSN,$x.xx,$x.xx,$x.xx
FirstName,MiddleInitial,LastName,SSN,$x.xx,$x.xx,$x.xx
The lines could be in either format. For example:
Joe,Smith,123-45-6789,$150.00,$150.00,$0.00
Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00
I want to basically turn everything before the SSN into a single field for the name thus:
Joe Smith,123-45-6789,$150.00,$150.00,$0.00
Jane F Doe,987-65-4321,$250.00,$500.00,$0.00
How can I do this using PowerShell? I think I need to use ForEach-Object and at some point replace "," with " ", but I don't know how to specify the pattern. I also don't know how to use a ForEach-Object with a $_.Where so that I can specify the "SkipUntil" mode.
Thanks very much!
Mathias is correct; you want to use the -replace operator, which uses regular expressions. I think this will do what you want:
$string -replace ',(?=.*,\d{3}-\d{2}-\d{4})',' '
The regular expression uses a lookahead (?=) to look for any commas that are followed by any number of any character (. is any character, * is any number of them including 0) that are then followed by a comma immediately followed by a SSN (\d{3}-\d{2}-\d{4}). The concept of "zero-width assertions", such as this lookahead, simply means that it is used to determine the match, but it is not actually returned as part of the match.
That's how we're able to match only the commas in the names themselves, and then replace them with a space.
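Applied to the two sample lines from the question, this can be sketched as follows. The lines are held in an array here instead of being read from a file; -replace operates on each element.

```powershell
# Single quotes keep the $ signs literal
$lines = @(
    'Joe,Smith,123-45-6789,$150.00,$150.00,$0.00'
    'Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00'
)

# Replace only commas that still have ",<SSN>" somewhere after them
$result = $lines -replace ',(?=.*,\d{3}-\d{2}-\d{4})', ' '
$result
```

The comma directly in front of the SSN is left alone, because no further comma-plus-SSN sequence follows it.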
I know it's answered, and neatly so, but I tried to come up with an alternative to using a regex - count the number of commas in a line, then replace either the first one, or the first two, commas in the line.
But strings can't count how many times a character appears in them without using the regex engine(*), and replacements can't be done a specific number of times without using the regex engine(**), so it's not very neat:
$comma = [regex]","
Get-Content data.csv | ForEach {
    $numOfCommasToReplace = $comma.Matches($_).Count - 4
    $comma.Replace($_, ' ', $numOfCommasToReplace)
} | Out-File data2.csv
Avoiding the regex engine entirely, just for fun, gets me things like this:
Get-Content .\data.csv | ForEach {
    $1,$2,$3,$4,$5,$6,$7 = $_ -split ','
    if ($7) {"$1 $2 $3,$4,$5,$6,$7"} else {"$1 $2,$3,$4,$5,$6"}
} | Out-File data2.csv
(*) ($line -as [char[]] -eq ',').Count
(**) while ( #counting ) { # split/mangle/join }
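For reference, here is a sketch of the count-based approach run against in-memory lines, with one addition of my own: a guard that passes a line through unchanged when it has too few commas, so the replacement count never drops to zero or below. The third sample line is made up to exercise that guard.

```powershell
$comma = [regex]','

$lines = @(
    'Joe,Smith,123-45-6789,$150.00,$150.00,$0.00'
    'Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00'
    'malformed,line'   # hypothetical short line
)

$result = $lines | ForEach-Object {
    # The last 4 commas separate SSN and amounts; the rest belong to the name
    $numOfCommasToReplace = $comma.Matches($_).Count - 4
    if ($numOfCommasToReplace -gt 0) {
        $comma.Replace($_, ' ', $numOfCommasToReplace)
    } else {
        $_   # nothing to merge; leave the line as it is
    }
}
$result
```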