Only split on the first occurrence of a character - PowerShell

If I have a string like
foo:bar baz:count
and I want to split on the first occurrence of : and get an array returned which contains only two elements:
A string which is the element before the first colon.
A string which is everything after the first colon.
How can I achieve this in PowerShell?

The -split operator lets you specify the maximum number of substrings to return:
'foo:bar baz:count' -split ':',2
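As a small sketch of how that result might be consumed, the two substrings can be assigned straight to two variables. The variable names $left and $right are only illustrative, and the second half shows what to expect when the input contains no colon at all:

```powershell
# Split on the first ':' only; the limit 2 caps the result at two substrings.
$left, $right = 'foo:bar baz:count' -split ':', 2
$left    # foo
$right   # bar baz:count

# No ':' in the input: -split returns a single element, so $right ends up $null.
$left, $right = 'nocolon' -split ':', 2
$null -eq $right   # True
```

Keep in mind that -split treats the delimiter as a regular expression, so a delimiter such as '.' would need escaping first (for example with [regex]::Escape()).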

Use IndexOf() to find the first occurrence of ':'.
Take the substring from the beginning up to the index of ':'.
Take the rest of the string, from just after the ':' to the end.
Code:
$foobar = "foo:bar baz:count"
$pos = $foobar.IndexOf(":")
$leftPart = $foobar.Substring(0, $pos)
$rightPart = $foobar.Substring($pos+1)
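One thing the snippet above does not cover: IndexOf() returns -1 when the delimiter is missing, and Substring(0, -1) then throws. A minimal sketch of a guarded version follows. The function name Split-AtFirst and its parameters are hypothetical, not part of the answer:

```powershell
# Hypothetical helper (not from the answer): split at the first occurrence of a delimiter.
function Split-AtFirst {
    param(
        [string]$InputString,
        [string]$Delimiter
    )
    $pos = $InputString.IndexOf($Delimiter)
    if ($pos -lt 0) {
        # Delimiter not found: return the whole string as the only element.
        return , @($InputString)
    }
    # The unary comma keeps the two-element array intact on the pipeline.
    return , @(
        $InputString.Substring(0, $pos),
        $InputString.Substring($pos + $Delimiter.Length)
    )
}

$parts = Split-AtFirst -InputString 'foo:bar baz:count' -Delimiter ':'
$parts[0]   # foo
$parts[1]   # bar baz:count
```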

Related

How to find the positions of all instances of a string in a specific line of a txt file?

Say that I have a .txt file with lines of multiple dates/times:
5/5/2020 5:45:45 AM
5/10/2020 12:30:03 PM
And I want to find the position of all slashes in one line, then move on to the next.
So for the first line I would want it to return the value:
1 3
And for the second line I would want:
1 4
How would I go about doing this?
I currently have:
$firstslashpos = Get-Content .\Documents\LoggedDates.txt | ForEach-Object{
$_.IndexOf("/")}
But that gives me only the first "/" on each line, and gives me that result for all lines at once. I need it to loop where I can figure out the space between each "/" for each line.
Sorry if I worded this badly.
You can indeed use the String.IndexOf() method for this!
function Find-SubstringIndex
{
param(
[string]$InputString,
[string]$Substring
)
$indices = @()
# start at position zero
$offset = 0
# Keep calling IndexOf() to find the next occurrence of the substring
# stop when IndexOf() returns -1
while(($i = $InputString.IndexOf($Substring, $offset)) -ne -1){
# Keep track of the index at which the substring was found
$indices += $i
# Update the offset, we'll want to start searching for the next index _after_ this one
$offset = $i + $Substring.Length
}
# Output the collected indices to the caller
return $indices
}
Now you can do:
Get-Content listOfDates.txt |ForEach-Object {
$indices = Find-SubstringIndex -InputString $_ -Substring '/'
Write-Host "Found slash at indices: $($indices -join ',')"
}
A concise solution is to use [regex]::Matches(), which finds all matches of a given regular expression in a given string and returns a collection of match objects that also indicate the index (character position) of each match:
# Create a sample file.
@'
5/5/2020 5:45:45 AM
5/10/2020 12:30:03 PM
'@ > sample.txt
Get-Content sample.txt | ForEach-Object {
# Get the indices of all '/' instances.
$indices = [regex]::Matches($_, '/').Index
# Output them as a list (string), separated with spaces.
"$indices"
}
The above yields:
1 3
1 4
Note:
Input lines that contain no / instances at all will result in empty lines.
If, rather than strings, you want to output the indices as arrays (collections), use
, [regex]::Matches($_, '/').Index as the only statement in the ForEach-Object script block. The unary form of the array constructor operator (,) ensures, by way of a transient auxiliary array, that the collection returned by the method call is output as a whole. If you omit the leading comma, the indices are output one by one, resulting in a flat array when collected in a variable.
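The question also asks for the space between the slashes on each line. A rough sketch of how that could be derived from the indices, using one of the sample lines; the variable names $line, $indices and $gaps are only illustrative:

```powershell
$line = '5/10/2020 12:30:03 PM'

# @() guarantees an array even when there is only one match or none.
$indices = @([regex]::Matches($line, '/') | ForEach-Object { $_.Index })

# Distance from each slash to the next one.
$gaps = @(for ($k = 1; $k -lt $indices.Count; $k++) {
    $indices[$k] - $indices[$k - 1]
})

"$indices"   # 1 4
"$gaps"      # 3
```

To process a file, the same statements could go inside the ForEach-Object block, with $_ in place of $line.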

In PowerShell, how do I copy the last alphabet characters from a string which also has numbers in it to create a variable?

For example, if the string is blahblah02baboon, I need to get "baboon" separated from the rest, so that the variable contains only the characters "baboon". Every string I need to do this with has alphabet characters first, then 2 numbers, then more alphabet characters, so it should be the same process every time.
Any advice would be greatly appreciated.
My advice is to learn about regular expressions.
'blahblah02baboon' -replace '\D*\d*(\w*)', '$1'
Or use regex
$MyString = "01baaab01blah02baboon"
# Match any character which is not a digit
$Result = [regex]::matches($MyString, "\D+")
# Take the last result
$LastResult = $Result[$Result.Count-1].Value
# Output
Write-Output "My last result = $LastResult"
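Since the question says the format is always letters, two digits, then letters, another option is to capture the trailing letters with the -match operator and read them from the automatic $Matches variable. This is only a sketch, and the pattern assumes plain ASCII letters:

```powershell
$MyString = 'blahblah02baboon'

# Two digits followed by letters up to the end of the string;
# the parentheses capture just the letters.
if ($MyString -match '\d{2}([A-Za-z]+)$') {
    $Suffix = $Matches[1]
}
$Suffix   # baboon
```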

PowerShell Trim bug with String containing "< char >$< repeated char >"?

If I use the Trim() method on a string containing -char-$-repeated char-, e.g. "BL$LA" or "LA$AB", Trim() strips the repeated char after the $ as well.
For example:
$a = 'BL$LA'
$b = $a.Trim("BL$")
returns A not LA, but
$a = 'BM$LA'
$b = $a.Trim("BM$")
returns LA.
Any reason why? Or am I missing something?
The Trim() method removes all characters in the given argument (the string is automatically cast to a character array) from beginning and end of the string object. Your second example only seems to be doing what you want, because the remainder of the string does not have any of the characters-to-be-trimmed in it.
Demonstration:
PS C:\> $a = 'BL$LA'
PS C:\> $a.Trim("BL$")
A
PS C:\> $a = 'LxB$LA'
PS C:\> $a.Trim("BL$")
xB$LA
To remove a given substring from beginning and end of a string you need something like this instead:
$a -replace '^BL\$|BL\$$'
Regular expression breakdown:
^ matches the beginning of a string.
$ matches the end of a string.
BL\$ matches the literal character sequence "BL$".
...|... is an alternation (match any of these (sub)expressions).
If you just want to remove text up to and including the first $ from the beginning of a string you could also do something like this:
$a -replace '^.*?\$'
Regular expression breakdown:
^ matches the beginning of a string.
\$ matches a literal $ character.
.*? matches all characters up to the next (sub)expression (shortest/non-greedy match).
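When the text to remove is held in a variable, the escaping can be left to [regex]::Escape() instead of being written by hand. A small sketch, assuming the goal is to strip one literal prefix; the $prefix and $pattern variables are introduced here for illustration:

```powershell
$a = 'BL$LA'
$prefix = 'BL$'

# [regex]::Escape turns 'BL$' into 'BL\$', so the $ is matched literally.
$pattern = '^' + [regex]::Escape($prefix)
$viaRegex = $a -replace $pattern
$viaRegex   # LA

# Non-regex alternative: check for the prefix, then cut it off by length.
$viaSubstring = if ($a.StartsWith($prefix)) { $a.Substring($prefix.Length) } else { $a }
$viaSubstring   # LA
```

Both variants remove the prefix once as a whole, unlike Trim(), which keeps removing any of the listed characters.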

Using PowerShell To Count Sentences In A File

I am having an issue with my PowerShell Program counting the number of sentences in a file I am using. I am using the following code:
foreach ($Sentence in (Get-Content file))
{
$i = $Sentence.Split("?")
$n = $Sentence.Split(".")
$Sentences += $i.Length
$Sentences += $n.Length
}
The total number of sentences I should get is 61 but I am getting 71, could someone please help me out with this? I have Sentences set to zero as well.
Thanks
foreach ($Sentence in (Get-Content file))
{
$i = $Sentence.Split("[?\.]")
$Sentences = $i.Length
}
I edited your code a bit.
The . that you were using needs to be escaped, otherwise Powershell recognises it as a Regex dotall expression, which means "any character"
So you should split the string on "[?\.]" or similar.
When counting sentences, what you are looking for is where each sentence ends. Splitting, though, returns a collection of sentence fragments around those end characters, with the ends themselves represented by the gaps between elements. Therefore, the number of sentences equals the number of gaps, which is one less than the number of fragments in the split result.
Of course, as Keith Hill pointed out in a comment above, the actual splitting is unnecessary when you can count the ends directly.
foreach( $Sentence in (Get-Content test.txt) ) {
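# Split at every occurrence of '.' and '?', and count the gaps.
# The [char[]] cast forces the Split(char[]) overload; newer PowerShell versions
# may otherwise treat '.?' as one literal two-character separator.
$Split = $Sentence.Split( [char[]]'.?' )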
$SplitSentences += $Split.Count - 1
# Count every occurrence of '.' and '?'.
$Ends = [char[]]$Sentence -match '[.?]'
$CountedSentences += $Ends.Count
}
Contents of test.txt file:
Is this a sentence? This is a
sentence. Is this a sentence?
This is a sentence. Is this a
very long sentence that spans
multiple lines?
Also, to clarify on the remarks to Vasili's answer: the PowerShell -split operator interprets a string as a regular expression by default, while the .NET Split method only works with literal string values.
For example:
'Unclosed [bracket?' -split '[?]' will treat [?] as a regular expression character class and match the ? character, returning the two strings 'Unclosed [bracket' and ''
'Unclosed [bracket?'.Split( '[?]' ) will, in Windows PowerShell, call the Split(char[]) overload and match each [, ?, and ] character, returning the three strings 'Unclosed ', 'bracket', and ''
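To count the sentence ends directly, without splitting at all, [regex]::Matches() can be applied to the whole text at once. A minimal sketch using the test.txt contents shown above, held in a here-string so that it runs without the file:

```powershell
$text = @'
Is this a sentence? This is a
sentence. Is this a sentence?
This is a sentence. Is this a
very long sentence that spans
multiple lines?
'@

# Count every '.' and '?' in the whole text.
$count = [regex]::Matches($text, '[.?]').Count
$count   # 5
```

For a real file, Get-Content test.txt -Raw reads the contents as a single string that can be passed in the same way.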

how to get the required strings from a text using perl

Here is the text to trim:
/home/netgear/Desktop/WGET-1.13/wget-1.13/src/cmpt.c:388,error,resourceLeak,Resource leak: fr
From the above text I need to get the data next to ":". How do I get 388,error,resourceLeak,Resource leak: fr?
You can use split to separate a string into a list based on a delimiter. In your case the delimiter should be a colon:
my @parts = split ':', $text;
As the text you want to extract can also contain a :, use the limit argument to stop after the first one:
my @parts = split ':', $text, 2;
$parts[1] will then contain the text you wanted to extract. You could also pass the result into a list, discarding the first element:
my (undef, $extract) = split ':', $text, 2;
Aside from @RobEarl's suggestion of using split, you could use a regular expression to do this.
my ($match) = $text =~ /^[^:]+:(.*?)$/;
Regular expression:
^ the beginning of the string
[^:]+ any character except: ':' (1 or more times)
: match ':'
( group and capture to \1:
.*? any character except \n (0 or more times)
) end of \1
$ before an optional \n, and the end of the string
$match will now hold the result of capture group #1:
388,error,resourceLeak,Resource leak: fr