How to find the positions of all instances of a string in a specific line of a txt file? - powershell

Say that I have a .txt file with lines of multiple dates/times:
5/5/2020 5:45:45 AM
5/10/2020 12:30:03 PM
And I want to find the position of all slashes in one line, then move on to the next.
So for the first line I would want it to return the value:
1 3
And for the second line I would want:
1 4
How would I go about doing this?
I currently have:
$firstslashpos = Get-Content .\Documents\LoggedDates.txt | ForEach-Object{
$_.IndexOf("/")}
But that gives me only the first "/" on each line, and gives me that result for all lines at once. I need it to loop where I can figure out the space between each "/" for each line.
Sorry if I worded this badly.

You can indeed use the String.IndexOf() method for this!
function Find-SubstringIndex
{
    param(
        [string]$InputString,
        [string]$Substring
    )
    $indices = @()
    # Start at position zero
    $offset = 0
    # Keep calling IndexOf() to find the next occurrence of the substring;
    # stop when IndexOf() returns -1
    while(($i = $InputString.IndexOf($Substring, $offset)) -ne -1){
        # Keep track of the index at which the substring was found
        $indices += $i
        # Update the offset, we'll want to start searching for the next index _after_ this one
        $offset = $i + $Substring.Length
    }
    # Output the collected indices to the caller
    return $indices
}
Now you can do:
Get-Content listOfDates.txt |ForEach-Object {
$indices = Find-SubstringIndex -InputString $_ -Substring '/'
Write-Host "Found slash at indices: $($indices -join ',')"
}

A concise solution is to use [regex]::Matches(), which finds all matches of a given regular expression in a given string and returns a collection of match objects that also indicate the index (character position) of each match:
# Create a sample file.
@'
5/5/2020 5:45:45 AM
5/10/2020 12:30:03 PM
'@ > sample.txt
Get-Content sample.txt | ForEach-Object {
# Get the indices of all '/' instances.
$indices = [regex]::Matches($_, '/').Index
# Output them as a list (string), separated with spaces.
"$indices"
}
The above yields:
1 3
1 4
Note:
Input lines that contain no / instances at all will result in empty lines.
If, rather than strings, you want to output the indices as arrays (collections), use
, [regex]::Matches($_, '/').Index as the only statement in the ForEach-Object script block. The unary form of , (the array constructor operator) ensures, by way of a transient auxiliary array, that the collection returned by the method call is output as a whole. If you omit the , the indices are output one by one, resulting in a flat array when collected in a variable.
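The question also asks for the space between the slashes on each line. A minimal sketch of that step, building on [regex]::Matches(); the sample lines are inlined instead of read from a file, and the names $lines, $results and $gaps are only illustrative:

```powershell
# Sample lines from the question (inlined here instead of reading a file).
$lines = '5/5/2020 5:45:45 AM', '5/10/2020 12:30:03 PM'

$results = foreach ($line in $lines) {
    # Zero-based positions of every '/' in the current line.
    [int[]] $indices = [regex]::Matches($line, '/').Index

    # Distance between each pair of neighbouring slashes.
    $gaps = for ($i = 1; $i -lt $indices.Count; $i++) {
        $indices[$i] - $indices[$i - 1]
    }

    [pscustomobject]@{
        Line    = $line
        Indices = $indices
        Gaps    = @($gaps)
    }
}

# One summary string per line, e.g. "1 3 -> gap(s): 2"
$results | ForEach-Object { "$($_.Indices) -> gap(s): $($_.Gaps)" }
```

Because each line produces its own object, the per-line indices stay grouped rather than being flattened into one array.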

Related

Construct a formatted list of strings in PowerShell

I have several lists consisting of strings like this (imagine them as a tree of sort):
$list1:
data-pool
data-pool.house 1
data-pool.house 1.door2
data-pool.house 1.door3
data-pool.house2
data-pool.house2.door 1
To make them easier to parse as a tree, how can I indent them based on how many . characters occur, while dropping the repetitive text earlier in the line? For example:
data-pool
  house 1
    door2
    door3
  house2
    door 1
The way I approached it counting the occurrences of .s with .split('.').Length-1 to determine the amount of needed indents and then adding the spaces using .IndexOf('.') and .Substring(0, <position>) feels overly complicated - or then I just can't wrap my head around how to do it in a less complicated way.
I think this should work as long as the number of nodes from line to line are ordered, what I mean by this is that it will not look "pretty" if for example the current node has n elements and the next node has n+2 or more.
To put it into perspective, using this list as an example:
$list = @'
data-pool
data-pool.house 1
data-pool.house 1.door2
data-pool.house 1.door3
data-pool.house 1.door4.something1
data-pool.house2
data-pool.house2.door 1
data-pool.house2.door 2.something1
data-pool.house2.door 3.something1.something2
data-pool.house3
data-pool.house3.door 1
data-pool.house3.door 2
'@ -split '\r?\n'
The indent function splits each line of your list on . as the delimiter. If the split yields a single element, the line is output unchanged. Otherwise, $IndentType (two spaces by default) is repeated once per level, where the level is the number of elements in the split array minus 1, and the result is prepended to the last element of the split array.
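function indent {
    param(
        [string]$Value,
        [string]$IndentType = '  '
    )
    # Split on literal dots; the last element is the node name
    $out = $Value -split '\.'
    $level = $out.Count - 1
    # No indent for a single element, otherwise one $IndentType per level
    '{0}{1}' -f ($null,($IndentType*$level))[[int]($out.Count -gt 1)], $out[-1]
}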
$list.ForEach({ indent $_ })
Sample:
data-pool
  house 1
    door2
    door3
      something1
  house2
    door 1
      something1
        something2
  house3
    door 1
    door 2
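As a shorter variant of the same idea, the regex-based -replace operator can do the indenting in one expression. This is only a sketch under the same assumption (one indent step per . in the line); the two-space indent and the variable names are illustrative:

```powershell
# A few of the sample lines from the question.
$list = @(
    'data-pool'
    'data-pool.house 1'
    'data-pool.house 1.door2'
    'data-pool.house2'
    'data-pool.house2.door 1'
)

# Replace every "<segment>." prefix with two spaces, which leaves only the
# last segment, indented once per dot. -replace works on each array element.
$tree = $list -replace '[^.]*\.', '  '
$tree
```

Each match of '[^.]*\.' consumes one parent segment together with its trailing dot, so the number of replacements equals the nesting level.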
The approach to get the last element of the string is shown below:
## String
$string = "data-pool.house 1"
## split the string into an array every "."
$split = $string.split(".")
## return the last element of the array
write-host $split[-1] -ForegroundColor Green
Then to test against each string
$myArray = ("data-pool", "data-pool.house 1", "data-pool.house 1.door2", "data-pool.house 1.door3", "data-pool.house2", "data-pool.house2.door 1")
ForEach($Name in $myArray) {
## String
$Name = $Name.ToString()
$string = $Name
$split = $string.split(".")
## return the last element of the array
write-host $split[-1] -ForegroundColor Green
}

powershell: concatenate an extension to each element of an array [duplicate]

I have an array and when I try to append a string to it the array converts to a single string.
I have the following data in an array:
$Str
451 CAR,-3 ,7 ,10 ,0 ,3 , 20 ,Over: 41
452 DEN «,40.5,0,7,0,14, 21 ,  Cover: 4
And I want to append the week of the game in this instance like this:
$Str = "Week"+$Week+$Str
I get a single string:
Week16101,NYG,42.5 ,3 ,10 ,3 ,3 , 19 ,Over 43 102,PHI,- 1,14,7,0,3, 24 ,  Cover 4 103,
Of course I'd like the append to occur on each row.
Instead of a for loop you could also use the Foreach-Object cmdlet (if you prefer using the pipeline):
$str = "apple","lemon","toast"
$str = $str | ForEach-Object {"Week$_"}
Output:
Weekapple
Weeklemon
Weektoast
Another option for PowerShell v4+
$str = $str.ForEach({ "Week" + $Week + $_ })
Something like this will work for prepending/appending text to each line in an array.
Set array $str:
$str = "apple","lemon","toast"
$str
apple
lemon
toast
Prepend text now:
for ($i=0; $i -lt $Str.Count; $i++) {
$str[$i] = "yogurt" + $str[$i]
}
$str
yogurtapple
yogurtlemon
yogurttoast
This works for prepending/appending static text to each line. If you need to insert a changing variable this may require some modification. I would need to see more code in order to recommend something.
Another solution, which is fast and concise, albeit a bit obscure.
It uses the regex-based -replace operator with regex '^' which matches the position at the start of each input string and therefore effectively prepends the replacement string to each array element (analogously, you could use '$' to append):
# Sample array.
$array = 'one', 'two', 'three'
# Prepend 'Week ' to each element and create a new array.
$newArray = $array -replace '^', 'Week '
$newArray then contains 'Week one', 'Week two', 'Week three'
To show an equivalent foreach solution, which is syntactically simpler than a for solution (but, like the -replace solution above, invariably creates a new array):
[array] $newArray = foreach ($element in $array) { 'Week ' + $element }
Note: The [array] cast is needed to ensure that the result is always an array; without it, if the input array happens to contain just one element, PowerShell would assign the modified copy of that element as-is to $newArray; that is, no array would be created.
As for what you tried:
"Week"+$Week+$Str
Because the LHS of the + operation is a single string, simple string concatenation takes place, which means that the array in $str is stringified, which by default concatenates the (stringified) elements with a space character.
A simplified example:
PS> 'foo: ' + ('bar', 'baz')
foo: bar baz
Solution options:
For per-element operations on an array, you need one of the following:
A loop statement, such as foreach or for.
Michael Timmerman's answer shows a for solution, which - while syntactically more cumbersome than a foreach solution - has the advantage of updating the array in place.
A pipeline that performs per-element processing via the ForEach-Object cmdlet, as shown in Martin Brandl's answer.
An expression that uses the .ForEach() array method, as shown in Patrick Meinecke's answer.
An expression that uses an operator that accepts arrays as its LHS operand and then operates on each element, such as the -replace solution shown above.
Tradeoffs:
Speed:
An operator-based solution is fastest, followed by for / foreach, .ForEach(), and, the slowest option, ForEach-Object.
Memory use:
Only the for option with indexed access to the array elements allows in-place updating of the input array; all other methods create a new array.[1]
[1] Strictly speaking, what .ForEach() returns isn't a .NET array, but a collection of type [System.Collections.ObjectModel.Collection[psobject]], but the difference usually doesn't matter in PowerShell.
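Applied to the question's own scenario, where the prefix contains the $Week variable, the options above might look like the following sketch. The sample rows are shortened, and the comma placed after the week number is an assumption made for readability:

```powershell
$Week = 16
$Str  = '451 CAR,-3,7,10', '452 DEN,40.5,0,7'

# foreach creates a new array; the [array] cast keeps the result an array
# even if $Str holds a single element.
[array] $withWeek = foreach ($line in $Str) { "Week$Week,$line" }

# The same result via the .ForEach() array method (PowerShell v4+).
$viaMethod = $Str.ForEach({ "Week$Week,$_" })

# The same result via -replace; '^' matches the start of each element.
$viaReplace = $Str -replace '^', "Week$Week,"

$withWeek
```

Note that with -replace the replacement string is itself interpreted, so a prefix that contains a literal $ would need escaping there.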

How to cut specific string?

I have a string with different length. I want to cut a specific word in my string.
Please help, I am new to PowerShell.
I tried this code, it's still not what I need.
$String = "C:\Users\XX\Documents\Data.txt"
$Cut = $String.Substring(22,0)
$Cut
My expectation is that I can return the word Data.
Assuming the string is always the same format (i.e. a path ending in a filename), then there are quite a few ways to do this, such as using regular expressions. Here is a slightly less conventional method:
# Define the path
$filepath = "C:\Users\XX\Documents\Data.txt"
# Create a dummy fileinfo object
$fileInfo = [System.IO.FileInfo]$filePath
# Get the file name property
$fileInfo.BaseName
Of course, you could do all of this in one step:
([System.IO.FileInfo]"C:\Users\XX\Documents\Data.txt").BaseName
If the path is an existing one, you could use
(Get-Item $String).BaseName
Otherwise
(Split-Path $String -Leaf) -Replace '\.[^\.]*$'
While the simplest way in this specific example is Substring(startPosition, length), to extract the file name in general you would probably want something like this:
(("C:\Users\XX\Documents\Data.txt".split("\\"))[-1].Split("."))[0]
Explanation:
("C:\Users\XX\Documents\Data.txt".split("\\"))[-1]
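That part splits the path by \ and returns the last item. Escaping the backslash is not needed: in Windows PowerShell, .Split("\") and .Split("\\") behave the same, because the argument is treated as a set of separator characters. In PowerShell 7, a multi-character string argument is matched as a whole, so use .Split("\") there. The result is Data.txt, so you still have to separate the name from the extension. You can do this by splitting on . and choosing the first element returned.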
There are a number of ways of doing it, depending on your input:
Method 1 - Hard-coding using the sub-string function.
$String = "C:\Users\XX\Documents\Data.txt"
$Cut = $String.Substring(22,4)
$Cut
The above approach will work for a single input but will become difficult to manage for multiple inputs of different lengths.
Method 2 - Using the split method
$String = "C:\Users\XX\Documents\Data.txt"
$cut = $String.Split("\")[-1].split(".")[0]
$cut
The Split method splits a string into substrings. The index [-1] returns the last value returned by the Split method.
The second split is to return the word Data from the word Data.txt.
Method 3 - If the input is a file path
$string = Get-ChildItem $env:USERPROFILE\Desktop -File | select -First 1
$Cut = $String.BaseName
If you can use PowerShell 6 or later, Split-Path supports the -LeafBase switch:
#Requires -Version 6.0
Split-Path $String -LeafBase
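For PowerShell versions without -LeafBase, the two -replace steps shown earlier can be wrapped in a small helper. This is a sketch only; the function name Get-BaseNameFromString is hypothetical, and it works purely on the string, so the path does not have to exist:

```powershell
# String-only helper (hypothetical name): no file system access involved.
function Get-BaseNameFromString {
    param([string] $Path)
    # 1) Strip everything up to and including the last \ or /.
    # 2) Strip the final extension, if there is one.
    $Path -replace '^.*[\\/]' -replace '\.[^.]*$'
}

Get-BaseNameFromString 'C:\Users\XX\Documents\Data.txt'        # Data
Get-BaseNameFromString 'C:\Users\XX\Documents\archive.tar.gz'  # archive.tar
Get-BaseNameFromString 'Data'                                  # Data
```

Only the last extension is removed, which matches what BaseName returns for a file such as archive.tar.gz.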

Extract the nth to nth characters of an string object

I have a filename and I wish to extract two portions of this and add into variables so I can compare if they are the same.
$name = "FILE_20161012_054146_Import_5785_1234.xml"
So I want...
$a = 5785
$b = 1234
if ($a -eq $b) {
# do stuff
}
I have tried to extract the 36th up to the 39th character
Select-Object {$_.Name[35,36,37,38]}
but I get
{5, 7, 8, 5}
Have considered splitting but looks messy.
There are several ways to do this. One of the most straightforward, as PetSerAl suggested is with .Substring():
$_.name.Substring(35,4)
Another way is with square brackets, as you tried to do, but indexing gives you an array of [char] objects, not a string. You can use -join to turn them back into a string, and a range to make the indexing easier:
$_.name[35..38] -join ''
For what you're doing, matching a pattern, you could also use a regular expression with capturing groups:
if ($_.name -match '_(\d{4})_(\d{4})\.xml$') {
if ($Matches[1] -eq $Matches[2]) {
# ...
}
}
This way can be very powerful, but you need to learn more about regex if you're not familiar. In this case it's looking for an underscore _ followed by 4 digits (0-9), followed by an underscore, and four more digits, followed by .xml at the end of the string. The digits are wrapped in parentheses so they are captured separately to be referenced later (in $Matches).
Yet another approach: returns 1234 substring four times.
$FileName = "FILE_20161012_054146_Import_5785_1234.xml"
$FileName.Substring(33,4)   # Substring method (zero-based)
-join $FileName[33..36]     # indexing from the beginning (zero-based)
-join $FileName[-8..-5]     # reverse indexing, e.g. $FileName[-1] returns the last character
# Split on _ and . (depends only on the filename "pattern template", not on token lengths);
# the [char[]] cast keeps the per-character split in PowerShell 7 as well
$FileArr = $FileName.Split([char[]]"_.")
$FileArr[$FileArr.Count - 2]
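Putting the regex approach together with the comparison the question asks for could look like this sketch. The second file name is a made-up example in which both numbers match, and the property names A, B and Equal are illustrative:

```powershell
$names = 'FILE_20161012_054146_Import_5785_1234.xml',
         'FILE_20161012_054146_Import_4321_4321.xml'   # hypothetical name

$report = foreach ($name in $names) {
    # Capture the two 4-digit groups directly before ".xml".
    if ($name -match '_(\d{4})_(\d{4})\.xml$') {
        [pscustomobject]@{
            Name  = $name
            A     = $Matches[1]
            B     = $Matches[2]
            Equal = $Matches[1] -eq $Matches[2]
        }
    }
}

$report | Format-Table -AutoSize
```

Names that do not fit the pattern are simply skipped, so the fixed character offsets never have to be counted by hand.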

Using PowerShell To Count Sentences In A File

I am having an issue with my PowerShell Program counting the number of sentences in a file I am using. I am using the following code:
foreach ($Sentence in (Get-Content file))
{
$i = $Sentence.Split("?")
$n = $Sentence.Split(".")
$Sentences += $i.Length
$Sentences += $n.Length
}
The total number of sentences I should get is 61 but I am getting 71, could someone please help me out with this? I have Sentences set to zero as well.
Thanks
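foreach ($Sentence in (Get-Content file))
{
    # -split takes a regex; the character class matches either ? or .
    $i = $Sentence -split '[?.]'
    # Each terminator sits between two fragments, so count the gaps
    $Sentences += $i.Length - 1
}
I edited your code a bit.
The .Split() method treats its argument as literal characters, not as a regular expression, so a pattern belongs with the -split operator instead. There, a bare . means "any character" and must be escaped or placed inside a character class.
So you should split the string on '[?.]' or similar.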
When counting sentences, what you are looking for is where each sentence ends. Splitting, though, returns a collection of sentence fragments around those end characters, with the ends themselves represented by the gap between elements. Therefore, the number of sentences will equal the number of gaps, which is one less than the number of fragments in the split result.
Of course, as Keith Hill pointed out in a comment above, the actual splitting is unnecessary when you can count the ends directly.
foreach( $Sentence in (Get-Content test.txt) ) {
    # Split at every occurrence of '.' and '?', and count the gaps.
    # The [char[]] cast keeps the per-character split in PowerShell 7 as well.
    $Split = $Sentence.Split( [char[]]'.?' )
    $SplitSentences += $Split.Count - 1
    # Count every occurrence of '.' and '?'.
    $Ends = [char[]]$Sentence -match '[.?]'
    $CountedSentences += $Ends.Count
}
Contents of test.txt file:
Is this a sentence? This is a
sentence. Is this a sentence?
This is a sentence. Is this a
very long sentence that spans
multiple lines?
Also, to clarify on the remarks to Vasili's answer: the PowerShell -split operator interprets a string as a regular expression by default, while the .NET Split method only works with literal string values.
For example:
'Unclosed [bracket?' -split '[?]' will treat [?] as a regular expression character class and match the ? character, returning the two strings 'Unclosed [bracket' and ''
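'Unclosed [bracket?'.Split( '[?]' ) will, in Windows PowerShell, call the Split(char[]) overload and match each [, ?, and ] character, returning the three strings 'Unclosed ', 'bracket', and ''. (PowerShell 7 binds a Split(string) overload instead and looks for the literal text [?].)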
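Counting the sentence ends directly, without any splitting, can also be done on the whole text at once with [regex]::Matches(). A minimal sketch, with the contents of test.txt inlined as a here-string (for a real file, Get-Content test.txt -Raw would supply the same kind of single string):

```powershell
$text = @'
Is this a sentence? This is a
sentence. Is this a sentence?
This is a sentence. Is this a
very long sentence that spans
multiple lines?
'@

# Count every '.' or '?' in the whole text at once.
$sentenceCount = [regex]::Matches($text, '[.?]').Count
$sentenceCount   # 5
```

Working on the whole text avoids the per-line loop and also handles sentences that span several lines.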