question about powershell text manipulation

I apologise for asking a very basic question, as I am a beginner in scripting.
I was wondering why I am getting different results from two different sources with the same formatting. Below are my samples:
file1.txt
Id Name Members
122 RCP_VMWARE-DMZ-NONPROD DMZ_NPROD01_111
DMZ_NPROD01_113
123 RCP_VMWARE-DMZ-PROD DMZ_PROD01_110
DMZ_PROD01_112
124 RCP_VMWARE-DMZ-INT.r87351 DMZ_TEMPL_210.r
DMZ_DECOM_211.r
125 RCP_VMWARE-LAN-NONPROD NPROD02_20
NPROD03_21
NPROD04_22
NPROD06_24
file2.txt
Id Name Members
4 HPUX_PROD HPUX_PROD.3
HPUX_PROD.4
HPUX_PROD.5
I'm trying to display the Name column, and with this code I'm able to display file1.txt correctly:
PS C:\Share> gc file1.txt |Select-Object -skip 1 | foreach-object { $_.split(" ")[1]} | ? {$_.trim() -ne "" }
RCP_VMWARE-DMZ-NONPROD
RCP_VMWARE-DMZ-PROD
RCP_VMWARE-DMZ-INT.r87351
RCP_VMWARE-LAN-NONPROD
However, with file2.txt I'm getting a different output:
PS C:\Share> gc .\file2.txt |Select-Object -skip 1 | foreach-object { $_.split(" ")[1]} | ? {$_.trim() -ne "" }
4
Changing the code to $_.split(" ")[2] displays the output correctly.
However, I would like to have a single piece of code that works for both situations. I would appreciate your help in sorting this out. Thank you in advance.

This happens because the latter file has a different format.
When examined carefully, one notices there are two spaces between 4 and HPUX_PROD strings:
Id Name Members
4 HPUX_PROD HPUX_PROD.3
^^^^
On the first file, there is a single space between number and string:
Id Name Members
122 RCP_VMWARE-DMZ-NONPROD DMZ_NPROD01_111
^^^
How to fix the issue depends on whether you need to handle both file formats, or whether the second file simply contains a typing error.

The existing answers are helpful, but let me try to break it down conceptually:
.Split(" ") splits the input string by each individual space character, whereas what you're looking for is to split by runs of (one or more) spaces, given that your column values can be separated by more than one space.
For instance, 'a  b'.split(' ') (note the two spaces) results in 3 array elements - 'a', '', 'b' - because the empty string between the two spaces is considered an element too.
The .NET [string] type's .Split() method is based on verbatim strings or character sets and therefore doesn't allow you to express the concept of "one or more spaces" as a split criterion, whereas PowerShell's regex-based -split operator does.
Conveniently, -split's unary form (see below) has this logic built in: it splits each input string by any nonempty run of whitespace, while also ignoring leading and trailing whitespace, which in your case obviates the need for a regex altogether.
This answer compares and contrasts the -split operator with string type's .Split() method, and makes the case for routinely using the former.
Therefore, a working solution (for both input files) is:
Get-Content .\file2.txt | Select-Object -Skip 1 |
    ForEach-Object { if ($value = (-split $_)[1]) { $value } }
Note:
If the column of interest contains a value (at least one non-whitespace character), so must all preceding columns in order for the approach to work. Also, column values themselves must not have embedded whitespace (which is true for your sample input).
The if conditional both extracts the 2nd column value ((-split $_)[1]) and assigns it to a variable ($value = ), whose value then implicitly serves as a Boolean:
Any nonempty string is implicitly $true, in which case the extracted value is output in the associated block ({ $value }); conversely, an empty string results in no output.
For a general overview of PowerShell's implicit to-Boolean conversions, see the bottom section of this answer.
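The difference between .Split(" ") and unary -split described above can be seen in a minimal sketch. The sample line is hypothetical: a leading space and a double space are assumed for illustration, and the line is not copied from the actual files.

```powershell
# Hypothetical sample line: one leading space, two spaces after the Id (assumed)
$line = ' 4  HPUX_PROD HPUX_PROD.3'

# .Split(' ') splits on every single space, so empty strings become elements
$dotNetFields = $line.Split(' ')

# Unary -split splits on runs of whitespace and ignores leading/trailing whitespace
$psFields = -split $line

$dotNetFields.Count   # 5 elements: '', '4', '', 'HPUX_PROD', 'HPUX_PROD.3'
$psFields[1]          # HPUX_PROD
```

With .Split(' '), the index of the Name column shifts whenever the amount of padding changes; with unary -split, index 1 stays the second non-blank field.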

Since this sort-of looks like csv output with spaces as delimiter (but not quite), I think you could use ConvertFrom-Csv on this:
# read the file as string array, trim each line and filter only the lines that
# when split on 1 or more whitespace characters has more than one field
# then replace the spaces by a comma and treat it as CSV
# return the 'Name' column only
(((Get-Content -Path 'D:\Test\file1.txt').Trim() |
    Where-Object { @($_ -split '\s+').Count -gt 1 }) -replace '\s+', ',' |
    ConvertFrom-Csv).Name
Shorter, but because you are only after the Name column, this works too:
((Get-Content -Path 'D:\Test\file2.txt').Trim() -replace '\s+', ',' | ConvertFrom-Csv).Name -ne ''
Output for file1
RCP_VMWARE-DMZ-NONPROD
RCP_VMWARE-DMZ-PROD
RCP_VMWARE-DMZ-INT.r87351
RCP_VMWARE-LAN-NONPROD
Output for file2
HPUX_PROD

Related

Parse info from Text File - Powershell

Beginner here. I am working on an error log file and library; the step I am currently on is to pull specific information from a txt file.
The code I have currently is...
$StatusErr = "Type 1","Type 2"
for ($i=0; $i -lt $StatusErr.length; $i++) {
    get-content C:\blah\Logs\StatusErrors.TXT |
        select-string $StatusErr[$i] |
        add-content C:\blah\Logs\StatusErrorsresult.txt
}
While it is working, I need it to display as:
Type-1-Description
2-Description
Type-1-Description
2-Description
Type-1-Description
2-Description
etc.
It is currently displayed as:
Type 1 = Type-1-Description
Type 1 = Type-1-Description
Type 1 = Type-1-Description
Type 2 = 2-Description
Type 2 = 2-Description
Type 2 = 2-Description
I am unsure how to change the arrangement and remove the unneeded spaces and the = sign.

You need to search for both patterns in a single Select-String call in order to get matching lines in order.
While the -Pattern parameter does accept an array of patterns, in this case a single regex will do.
You need to use a regex pattern in order to capture and output only part of the lines that match.
$StatusErrRegex = '(?<=Type [12]\s*=\s*)[^ ]+'
get-content C:\blah\Logs\StatusErrors.TXT |
    select-string $StatusErrRegex |
    foreach-object { $_.Matches.Value } |
    set-content C:\blah\Logs\StatusErrorsresult.txt
Note that I've replaced add-content with set-content, as I'm assuming you don't want to append to a preexisting file. set-content writes all objects it receives via the pipeline to the output file.
Select-String outputs Microsoft.PowerShell.Commands.MatchInfo instances whose .Matches property provides access to the part of the line that was matched.
For an explanation of the regex and the ability to experiment with it, see this regex101.com page.
Additional notes:
Select-String, like PowerShell in general, is case-insensitive by default; add the -CaseSensitive switch, if needed.
(?<=...) is a (positive) lookbehind assertion, whose matching text doesn't become part of what the regex captures.
\s* matches zero or more whitespace characters; \s+ would match one or more.
[^ ]+ matches one or more (+) characters that are not (^) spaces ( ), and thereby captures the run of non-space characters to the right of the = sign.
To match any of multiple words at the start of the pattern, use a regex alternation (|), e.g. '(?<=(type|data) [12]\s*=\s*)[^ ]+'
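To try the lookbehind regex without touching any files, here is a minimal sketch that pipes in-memory strings to Select-String. The sample lines (including the "Type 3" line) are hypothetical stand-ins for the contents of StatusErrors.TXT.

```powershell
# Hypothetical sample lines standing in for StatusErrors.TXT
$sampleLines = @(
    'Type 1 = Type-1-Description'
    'Type 2 = 2-Description'
    'Type 3 = ignored'
)

# Same regex as in the answer: capture only the text to the right of "Type 1 =" / "Type 2 ="
$StatusErrRegex = '(?<=Type [12]\s*=\s*)[^ ]+'

$descriptions = $sampleLines |
    Select-String $StatusErrRegex |
    ForEach-Object { $_.Matches.Value }

$descriptions   # Type-1-Description, then 2-Description, in input order
```

Because a single Select-String call processes the lines in input order, the results come out interleaved the way the input is, rather than grouped by pattern.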

PowerShell script that searches for a string in a .txt and if it finds it, looks for the next line containing another string and does a job with it

I have the line
Select-String -Path ".\*.txt" -Pattern "6,16" -Context 20 | Select-Object -First 1
that would return 20 lines of context looking for a pattern of "6,16".
I need to look for the next line containing the string "ID number:" after the line of "6,16", read what is the text right next to "ID number:", find if this exact text exists in another "export.txt" file located in the same folder (so in ".\export.txt"), and see if it contains "6,16" on the same line as the one containing the text in question.
I know it may seem confusing, but what I mean is for example:
example.txt:5218: ID number:0002743284
shows whether this is true:
export.txt:9783: 0002743284 *some text on the same line for example* 6,16

If I understand the question correctly, you're looking for something like:
Select-String -List -Path *.txt -Pattern '\b6,16\b' -Context 0, 20 |
    ForEach-Object {
        if ($_.Context.PostContext -join "`n" -match '\bID number:(\d+)') {
            Select-String -List -LiteralPath export.txt -Pattern "$($Matches[1]).+$($_.Pattern)"
        }
    }
Select-String's -List switch limits the matching to one match per input file; -Context 0,20 also includes the 20 lines following the matching one in the output (but none (0) before).
Note that I've placed \b, a word-boundary assertion at either end of the search pattern, 6,16, to rule out accidental false positives such as 96,169.
$_.Context.PostContext contains the array of lines following the matching line (which itself is stored in $_.Line):
-join "`n" joins them into a multi-line string, so as to ensure that the subsequent -match operation reports the captured results in the automatic $Matches variable, notably reporting the ID number of interest in $Matches[1], the text captured by the first (and only) capture group ((\d+)).
The captured ID is then used in combination with the original search pattern to form a regex that looks for both on the same line, and is passed to a second Select-String call that searches through export.txt
Note: An object representing the matching line, if any, is output by default; to return just $true or $false, replace -List with -Quiet.

There's a lot wrong with what you're expecting and the code you've tried so let's break it down and get to the solution. Kudos for attempting this on your own. First, here's the solution, read below this code for an explanation of what you were doing wrong and how to arrive at the code I've written:
# Get matching lines plus the following line from the example.txt seed file
$seedMatches = Select-String -Path .\example.txt -Pattern "6,\s*16" -Context 0, 2
# Obtain the ID number from the line following each match
$idNumbers = foreach( $match in $seedMatches ) {
    $postMatchFields = $match.Context.PostContext -split ":\s*"
    # Note: .IndexOf(object) is case-sensitive when looking for strings
    # Returns -1 if not found
    $idFieldIndex = $postMatchFields.IndexOf("ID number")
    # Return the "ID number" to `$idNumbers` if "ID number" is found in $postMatchFields
    if( $idFieldIndex -gt -1 ) {
        $postMatchFields[$idFieldIndex + 1]
    }
}
# Match lines in export.txt where both the $id and "6,16" appear
$exportMatches = foreach( $id in $idNumbers ) {
    Select-String -Path .\export.txt -Pattern "^(?=.*\b$id\b)(?=.*\b6,\s*16\b).*$"
}
mklement0's answer essentially condenses this into less code, but I wanted to break this down fully.
First, Select-String -Path ".\*.txt" will look in all .txt files in the current directory. You'll want to narrow that down to a specific naming pattern you're looking for in the seed file (the file we want to find the ID to look for in the other files). For this example, I'll use example.txt and export.txt for the paths which you've used elsewhere in your question, without using globbing to match on filenames.
Next, -Context gives context of the surrounding lines from the match. You only care about the next line match so 0, 1 should suffice for -Context (0 lines before, 1 line after the match).
Finally, I've added \s* to the -Pattern to match on whitespace, should the 16 ever be padded from the ,. So now we have our Select-String command ready to go:
$seedMatches = Select-String -Path .\example.txt -Pattern "6,\s*16" -Context 0, 2
Next, we will need to loop over the matching results from the seed file. You can use foreach or ForEach-Object, but I'll use foreach in the example below.
For each $match in $seedMatches we'll need to get the $idNumbers from the lines following each match. When $match is ToString()'d, it will spit out the matched line and any surrounding context lines. Since we only have one line following the match for our context, we can grab $match.Context.PostContext for this.
Now we can get the $idNumber. We can split example.txt:5218: ID number:0002743284 into an array of strings by using the -split operator to split the string on the :\s* pattern (\s* matches any or no whitespace). Once we have this, we can get the index of "ID number" and get the value of the field immediately following it. Now we have our $idNumbers. I'll also add some protection below to ensure the ID number field is actually found before continuing.
$idNumbers = foreach( $match in $seedMatches ) {
    $postMatchFields = $match.Context.PostContext -split ":\s*"
    # Note: .IndexOf(object) is case-sensitive when looking for strings
    # Returns -1 if not found
    $idFieldIndex = $postMatchFields.IndexOf("ID number")
    # Return the "ID number" to `$idNumbers` if "ID number" is found in $postMatchFields
    if( $idFieldIndex -gt -1 ) {
        $postMatchFields[$idFieldIndex + 1]
    }
}
Now that we have $idNumbers, we can look in export.txt for each ID number and "6,\s*16" on the same line, once again using Select-String. This time, I'll put the code first since it's nothing new, then explain the regex a bit:
$exportMatches = foreach( $id in $idNumbers ) {
    Select-String -Path .\export.txt -Pattern "^(?=.*\b$id\b)(?=.*\b6,\s*16\b).*$"
}
$exportMatches will now contain the lines which contain both the target ID number and the 6,16 value on the same line. Note that order wasn't specified so the expression uses positive lookaheads to find both the $id and 6,16 values regardless of their order in the string. I won't break down the exact expression but if you plug ^(?=.*\b0123456789\b)(?=.*\b6,\s*16\b).*$ into https://regexr.com it will break down and explain the regex pattern in detail.
The full code is at the top of this answer.
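The order-independent behaviour of the two positive lookaheads can be checked on in-memory strings. This is a minimal sketch: the three sample lines are hypothetical and only the ID is taken from the question.

```powershell
# ID taken from the question's example; the lines below are hypothetical
$id = '0002743284'
$pattern = "^(?=.*\b$id\b)(?=.*\b6,\s*16\b).*$"

$lines = @(
    '0002743284 some text 6,16'     # ID first, value second
    '6, 16 some text 0002743284'    # value first (padded), ID second
    '0002743284 some text 96,169'   # false positive that \b is meant to exclude
)

# Keep only the lines where both lookaheads succeed
$hits = $lines | Where-Object { $_ -match $pattern }
$hits
```

The first two lines are kept regardless of which value comes first; the third is rejected because 6,16 only appears inside 96,169, where the word-boundary assertions fail.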

Question regarding incrementing a string value in a text file using Powershell

Just beginning with Powershell. I have a text file that contains the string "CloseYear/2019" and looking for a way to increment the "2019" to "2020". Any advice would be appreciated. Thank you.

If the question is how to update text within a file, you can do the following, which replaces the specified text with new text. The file (t.txt) is read with Get-Content, the targeted text is updated with the String class Replace method, and the file is rewritten using Set-Content.
(Get-Content t.txt).Replace('CloseYear/2019','CloseYear/2020') | Set-Content t.txt
Additional Considerations:
General incrementing would require an object type that supports incrementing. You can isolate the numeric data using -split, increment it, and create a new, joined string. This solution assumes working with 32-bit integers but can be updated to other numeric types.
$str = 'CloseYear/2019'
-join ($str -split "(\d+)" | Foreach-Object {
    if ($_ -as [int]) {
        [int]$_ + 1
    }
    else {
        $_
    }
})
Putting it all together, the following would result in incrementing all complete numbers (123 as opposed to 1 and 2 and 3 individually) in a text file. Again, this can be tailored to target more specific numbers.
$contents = Get-Content t.txt -Raw # Raw to prevent an array output
-join ($contents -split "(\d+)" | Foreach-Object {
    if ($_ -as [int]) {
        [int]$_ + 1
    }
    else {
        $_
    }
}) | Set-Content t.txt
Explanation:
-split uses regex matching to split on the matched result resulting in an array. By default, -split removes the matched text. Creating a capture group using (), ensures the matched text displays as is and is not removed. \d+ is a regex mechanism matching a digit (\d) one or more (+) successive times.
Using the -as operator, we can test that each item in the split array can be cast to [int]. If successful, the if statement will evaluate to true, the text will be cast to [int], and the integer will be incremented by 1. If the -as operator is not successful, the pipeline object will remain as a string and just be output.
The -join operator just joins the resulting array (from the Foreach-Object) into a single string.
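If only the number after CloseYear/ should be incremented, one possible variation (not the approach used above) is [regex]::Replace with a script block acting as the match evaluator. This is a sketch under the assumption that the year always directly follows "CloseYear/"; the variable name $updated is made up for illustration.

```powershell
$str = 'CloseYear/2019'

# Swapped-in technique: the script block is called once per match and
# its output replaces the matched digits. The lookbehind limits the
# match to digits that directly follow "CloseYear/" (an assumption).
$updated = [regex]::Replace($str, '(?<=CloseYear/)\d+', {
    param($match)
    [string]([int]$match.Value + 1)
})

$updated   # CloseYear/2020
```

Unlike the split-and-join version, other numbers in the same text are left untouched.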

AdminOfThings' answer is very detailed and the correct answer.
I wanted to provide another answer for options.
Depending on what your end goal is, you might need to convert the date to a datetime object for future use.
Example:
$yearString = 'CloseYear/2019'
#convert to datetime
[datetime]$dateConvert = [datetime]::new((($yearString -split "/")[-1]),1,1)
#add year
$yearAdded = $dateConvert.AddYears(1)
#if you want to display "CloseYear" with the new date and write-host
$out = "CloseYear/{0}" -f $yearAdded.Year
Write-Host $out
This approach would allow you to use $dateConvert and $yearAdded as a datetime allowing you to accurately manipulate dates and cultures, for example.

How can I replace every comma with a space in a text file before a pattern using PowerShell

I have a text file with lines in this format:
FirstName,LastName,SSN,$x.xx,$x.xx,$x.xx
FirstName,MiddleInitial,LastName,SSN,$x.xx,$x.xx,$x.xx
The lines could be in either format. For example:
Joe,Smith,123-45-6789,$150.00,$150.00,$0.00
Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00
I want to basically turn everything before the SSN into a single field for the name thus:
Joe Smith,123-45-6789,$150.00,$150.00,$0.00
Jane F Doe,987-65-4321,$250.00,$500.00,$0.00
How can I do this using PowerShell? I think I need to use ForEach-Object and at some point replace "," with " ", but I don't know how to specify the pattern. I also don't know how to use a ForEach-Object with a $_.Where so that I can specify the "SkipUntil" mode.
Thanks very much!

Mathias is correct; you want to use the -replace operator, which uses regular expressions. I think this will do what you want:
$string -replace ',(?=.*,\d{3}-\d{2}-\d{4})',' '
The regular expression uses a lookahead (?=) to look for any commas that are followed by any number of any character (. is any character, * is any number of them, including 0) that are then followed by a comma immediately followed by an SSN (\d{3}-\d{2}-\d{4}). The concept of "zero-width assertions", such as this lookahead, simply means that it is used to determine the match, but is not actually returned as part of the match.
That's how we're able to match only the commas in the names themselves, and then replace them with a space.
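Applied to the two sample lines from the question, the replacement can be sketched as follows. The variable names $lines and $result are illustrative only; single quotes are used so that the $ in the amounts is not treated as a variable.

```powershell
# Sample lines from the question (single-quoted so $150.00 stays literal)
$lines = @(
    'Joe,Smith,123-45-6789,$150.00,$150.00,$0.00'
    'Jane,F,Doe,987-65-4321,$250.00,$500.00,$0.00'
)

# -replace works element-wise on arrays: only commas that still have
# ",<SSN>" somewhere to their right are replaced with a space
$result = $lines -replace ',(?=.*,\d{3}-\d{2}-\d{4})', ' '

$result
# Joe Smith,123-45-6789,$150.00,$150.00,$0.00
# Jane F Doe,987-65-4321,$250.00,$500.00,$0.00
```

The comma directly in front of the SSN is not replaced, because no further ",<SSN>" sequence follows it.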

I know it's answered, and neatly so, but I tried to come up with an alternative to using a regex - count the number of commas in a line, then replace either the first one, or the first two, commas in the line.
But strings can't count how many times a character appears in them without using the regex engine(*), and replacements can't be done a specific number of times without using the regex engine(**), so it's not very neat:
$comma = [regex]","
Get-Content data.csv | ForEach {
    $numOfCommasToReplace = $comma.Matches($_).Count - 4
    $comma.Replace($_, ' ', $numOfCommasToReplace)
} | Out-File data2.csv
Avoiding the regex engine entirely, just for fun, gets me things like this:
Get-Content .\data.csv | ForEach {
    $1,$2,$3,$4,$5,$6,$7 = $_ -split ','
    if ($7) {"$1 $2 $3,$4,$5,$6,$7"} else {"$1 $2,$3,$4,$5,$6"}
} | Out-File data2.csv
(*) ($line -as [char[]] -eq ',').Count
(**) while ( #counting ) { # split/mangle/join }

How to Split DistinguishedName?

I have a list of folks and their DN from AD (I do not have direct access to that AD). Their DNs are in format:
$DNList = 'CN=Bob Dylan,OU=Users,OU=Dept,OU=Agency,OU=NorthState,DC=myworld,DC=com',
'CN=Ray Charles,OU=Contractors,OU=Dept,OU=Agency,OU=NorthState,DC=myworld,DC=com',
'CN=Martin Sheen,OU=Users,OU=Dept,OU=Agency,OU=WaySouth,DC=myworld,DC=com'
I'd like to make $DNList return the following:
OU=Users,OU=Dept,OU=Agency,OU=NorthState,DC=myworld,DC=com
OU=Contractors,OU=Dept,OU=Agency,OU=NorthState,DC=myworld,DC=com
OU=Users,OU=Dept,OU=Agency,OU=WaySouth,DC=myworld,DC=com

I decided to turn my comment into an answer:
$DNList | ForEach-Object {
    $_ -replace '^.+?(?<!\\),',''
}
This will correctly handle escaped commas that are part of the first component.
We do a non-greedy match for one or more characters at the beginning of the string, then look for a comma that is not preceded by a backslash (so that the dot will match the backslash and comma combination and keep going).
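To illustrate why the negative lookbehind matters, here is a small sketch comparing it with a plain "up to the first comma" pattern. The DN with an escaped comma is a hypothetical example, not one from the question's list.

```powershell
# Hypothetical DN whose CN contains an escaped comma
$dn = 'CN=Dylan\, Bob,OU=Users,DC=myworld,DC=com'

# Lookbehind version: skips the comma that is preceded by a backslash
$parent = $dn -replace '^.+?(?<!\\),', ''

# Naive version: stops at the very first comma, escaped or not
$naive = $dn -replace '^.*?,', ''

$parent   # OU=Users,DC=myworld,DC=com
$naive    # " Bob,OU=Users,DC=myworld,DC=com" (part of the CN is left behind)
```

For DNs without escaped commas, both patterns give the same result.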

You can remove the first element with a replacement like this:
$DNList -replace '^.*?,(..=.*)$', '$1'
^.*?, is the shortest match from the beginning of the string to a comma.
(..=.*)$ matches the rest of the string (starting with two characters after the comma followed by a = character) and groups them, so that the match can be referenced in the replacement as $1.

You have 7 items per user, comma separated and you want rid of the first one.
So, split each item in the array using commas as the delimiter, return matches 1-6 (0 being the first item that you want to skip), then join with commas again e.g.
$DNList = $DNList|foreach{($_ -split ',')[1..6] -join ','}
If you then enter $DNList it returns
OU=Users,OU=Dept,OU=Agency,OU=NorthState,DC=myworld,DC=com
OU=Contractors,OU=Dept,OU=Agency,OU=NorthState,DC=myworld,DC=com
OU=Users,OU=Dept,OU=Agency,OU=WaySouth,DC=myworld,DC=com

Similar to Graham's answer, but with the hardcoded array indices removed, so it will just remove the CN portion without worrying about how long the DN is.
$DNList | ForEach-Object{($_ -split "," | Select-Object -Skip 1) -join ","}
Ansgar most likely has a good reason, but you can just use a regex to remove everything before the first comma:
$DNList -replace "^.*?,"
Update based on briantist
To keep this answer distinct while still making it work: this regex can still have issues, but I doubt these characters will appear in a username.
$DNList -replace "^.*?,(?=OU=)"
Regex uses a look ahead to be sure the , is followed by OU=
Similarly you could do this
($DNList | ForEach-Object{($_ -split "(,OU=)" | Select-Object -Skip 1) -join ""}) -replace "^,"
