I have a filename and want to extract two portions of it into variables so I can compare whether they are the same.
$name = 'FILE_20161012_054146_Import_5785_1234.xml'
So I want...
$a = 5785
$b = 1234
if ($a -eq $b) {
# do stuff
}
I have tried to extract the 36th up to the 39th character
Select-Object {$_.Name[35,36,37,38]}
but I get
{5, 7, 8, 5}
Have considered splitting but looks messy.
There are several ways to do this. One of the most straightforward, as PetSerAl suggested, is with .Substring():
$_.name.Substring(35,4)
Another way is with square brackets, as you tried to do, but indexing gives you an array of [char] objects, not a string. You can use -join to turn the characters back into a string, and a range to make the indexing easier:
$_.name[35..38] -join ''
For what you're doing, matching a pattern, you could also use a regular expression with capturing groups:
if ($_.name -match '_(\d{4})_(\d{4})\.xml$') {
if ($Matches[1] -eq $Matches[2]) {
# ...
}
}
This way can be very powerful, but you need to learn more about regex if you're not familiar. In this case it's looking for an underscore _ followed by 4 digits (0-9), followed by an underscore, and four more digits, followed by .xml at the end of the string. The digits are wrapped in parentheses so they are captured separately to be referenced later (in $Matches).
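As a self-contained sketch of the capture-group approach above, using the sample filename from the question. The function name Get-TrailingNumber and the shape of the object it returns are made up for illustration and are not part of any module.

```powershell
# Hypothetical helper (name and output shape are illustrative assumptions).
function Get-TrailingNumber {
    param([string] $FileName)

    # Two groups of exactly four digits, each preceded by "_", right before ".xml".
    if ($FileName -match '_(\d{4})_(\d{4})\.xml$') {
        [pscustomobject]@{
            First  = $Matches[1]
            Second = $Matches[2]
            Same   = $Matches[1] -eq $Matches[2]
        }
    }
}

$result = Get-TrailingNumber 'FILE_20161012_054146_Import_5785_1234.xml'
$result.First    # 5785
$result.Second   # 1234
$result.Same     # False
```

If the name does not fit the pattern, the function returns nothing, so the caller can test the result with a plain if.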
Yet another approach: each of the following four expressions returns the substring 1234.
$FileName = "FILE_20161012_054146_Import_5785_1234.xml"
# $FileName
$FileName.Substring(33,4) # Substring method (zero-based)
-join $FileName[33..36] # indexing from beginning (zero-based)
-join $FileName[-8..-5] # reverse indexing:
# e.g. $FileName[-1] returns the last character
$FileArr = $FileName.Split([char[]] '_.') # Split on "_" or "." (depends only on filename "pattern template"); the cast keeps this working in PowerShell 6+
$FileArr[$FileArr.Count -2] # does not depend on lengths of tokens
I have an array and when I try to append a string to it the array converts to a single string.
I have the following data in an array:
$Str
451 CAR,-3 ,7 ,10 ,0 ,3 , 20 ,Over: 41
452 DEN «,40.5,0,7,0,14, 21 , Cover: 4
And I want to append the week of the game in this instance like this:
$Str = "Week"+$Week+$Str
I get a single string:
Week16101,NYG,42.5 ,3 ,10 ,3 ,3 , 19 ,Over 43 102,PHI,- 1,14,7,0,3, 24 , Cover 4 103,
Of course I'd like the append to occur on each row.
Instead of a for loop you could also use the ForEach-Object cmdlet (if you prefer using the pipeline):
$str = "apple","lemon","toast"
$str = $str | ForEach-Object {"Week$_"}
Output:
Weekapple
Weeklemon
Weektoast
Another option for PowerShell v4+
$str = $str.ForEach({ "Week" + $Week + $_ })
Something like this will work for prepending/appending text to each line in an array.
Set array $str:
$str = "apple","lemon","toast"
$str
apple
lemon
toast
Prepend text now:
for ($i=0; $i -lt $Str.Count; $i++) {
$str[$i] = "yogurt" + $str[$i]
}
$str
yogurtapple
yogurtlemon
yogurttoast
This works for prepending/appending static text to each line. If you need to insert a changing variable this may require some modification. I would need to see more code in order to recommend something.
Another solution, which is fast and concise, albeit a bit obscure.
It uses the regex-based -replace operator with regex '^' which matches the position at the start of each input string and therefore effectively prepends the replacement string to each array element (analogously, you could use '$' to append):
# Sample array.
$array = 'one', 'two', 'three'
# Prepend 'Week ' to each element and create a new array.
$newArray = $array -replace '^', 'Week '
$newArray then contains 'Week one', 'Week two', 'Week three'
To show an equivalent foreach solution, which is syntactically simpler than a for solution (but, like the -replace solution above, invariably creates a new array):
[array] $newArray = foreach ($element in $array) { 'Week ' + $element }
Note: The [array] cast is needed to ensure that the result is always an array; without it, if the input array happens to contain just one element, PowerShell would assign the modified copy of that element as-is to $newArray; that is, no array would be created.
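The effect of the [array] cast described in the note can be seen with a one-element input. This is only a small sketch; the variable names are chosen for illustration.

```powershell
$single = , 'one'   # a one-element array

# Without the cast, the lone output object is assigned as-is (a string).
$noCast = foreach ($element in $single) { 'Week ' + $element }

# With the cast, the result is always an array.
[array] $withCast = foreach ($element in $single) { 'Week ' + $element }

$noCast -is [array]     # False
$withCast -is [array]   # True
$withCast.Count         # 1
```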
As for what you tried:
"Week"+$Week+$Str
Because the LHS of the + operation is a single string, simple string concatenation takes place, which means that the array in $str is stringified, which by default concatenates the (stringified) elements with a space character.
A simplified example:
PS> 'foo: ' + ('bar', 'baz')
foo: bar baz
Solution options:
For per-element operations on an array, you need one of the following:
A loop statement, such as foreach or for.
Michael Timmerman's answer shows a for solution, which - while syntactically more cumbersome than a foreach solution - has the advantage of updating the array in place.
A pipeline that performs per-element processing via the ForEach-Object cmdlet, as shown in Martin Brandl's answer.
An expression that uses the .ForEach() array method, as shown in Patrick Meinecke's answer.
An expression that uses an operator that accepts arrays as its LHS operand and then operates on each element, such as the -replace solution shown above.
Tradeoffs:
Speed:
An operator-based solution is fastest, followed by for / foreach, .ForEach(), and, the slowest option, ForEach-Object.
Memory use:
Only the for option with indexed access to the array elements allows in-place updating of the input array; all other methods create a new array.[1]
[1] Strictly speaking, what .ForEach() returns isn't a .NET array, but a collection of type [System.Collections.ObjectModel.Collection[psobject]], but the difference usually doesn't matter in PowerShell.
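The memory-use point can be checked with a short sketch: an operator-based solution hands back a different array object, while the indexed for loop changes the existing one. The variable $alias is introduced here only to observe the in-place update.

```powershell
$array = 'one', 'two', 'three'
$alias = $array   # second reference to the same array object

# Operator-based: returns a new array, the original is untouched.
$new = $array -replace '^', 'Week '
[object]::ReferenceEquals($array, $new)   # False
$array[0]                                 # one

# for loop with indexed access: modifies the existing array in place.
for ($i = 0; $i -lt $array.Count; $i++) {
    $array[$i] = 'Week ' + $array[$i]
}
$alias[0]                                 # Week one
```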
The problem is to find the position of the very first occurrence of any of the elements of an array.
$terms = @("#", ";", "$", "|");
$StringToBeSearched = "ABC$DEFG#";
The expected output needs to be: 3, as '$' occurs before any of the other $terms in the $StringToBeSearched variable
Also, the idea is to do it in the least expensive way.
# Define the characters to search for as an array of [char] instances ([char[]])
# Note the absence of `@(...)`, which is never needed for array literals,
# and the absence of `;`, which is only needed to place *multiple* statements
# on the same line.
[char[]] $terms = '#', ';', '$', '|'
# The string to search through.
# Note the use of '...' rather than "...",
# to avoid unintended expansion of "$"-prefixed tokens as
# variable references.
$StringToBeSearched = 'ABC$DEFG#'
# Use the [string] type's .IndexOfAny() method to find the first
# occurrence of any of the characters in the `$terms` array.
$StringToBeSearched.IndexOfAny($terms) # -> 3
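One detail worth handling in calling code: .IndexOfAny() returns -1 when none of the characters occur. A minimal sketch (the variable names $found and $missing are illustrative):

```powershell
[char[]] $terms = '#', ';', '$', '|'

$found   = 'ABC$DEFG#'.IndexOfAny($terms)   # 3 (position of '$')
$missing = 'ABCDEFG'.IndexOfAny($terms)     # -1 (none of the characters occur)

if ($missing -eq -1) { 'no term found' }
```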
Why does the following result in an array with 7 elements with 5 blank? I'd expect only 2 elements.
Where are the 5 blank elements coming from?
$a = 'OU=RAH,OU=RAC'
$b = $a.Split('OU=')
$b.Count
$b
<#
Outputs:
7
RAH,
RAC
#>
In order to split by strings (rather than a set of characters) and/or regular expressions, use PowerShell's -split operator:
PS> ('OU=RAH,OU=RAC' -split ',?OU=') -ne '' # parentheses not strictly needed
RAH
RAC
-split by default interprets its RHS as a regular expression, and ,?OU= matches both OU= by itself and ,OU=, resulting in the desired splitting, returning the tokens as an array.
For all features supported by -split, including literal string matching, limiting the number of tokens returned, and use of script blocks, see Get-Help about_split.
Since the input starts with a match, however, -split considers the first element of the split to be the empty string. By passing the resulting array of tokens to -ne '', we filter out these empty strings.
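A few of the -split features mentioned above, as a hedged sketch. The three-element input string is borrowed from another answer in this thread, and the variable names are illustrative.

```powershell
$dn = 'OU=FOO,OU=RAH,OU=RAC'

# Regex split; the leading empty token is filtered out with -ne ''.
$tokens = ($dn -split ',?OU=') -ne ''
$tokens -join '|'      # FOO|RAH|RAC

# Literal (non-regex) matching via the SimpleMatch option; 0 means "no limit".
$literal = 'a.b.c' -split '.', 0, 'SimpleMatch'
$literal -join '|'     # a|b|c

# Limit the number of tokens returned to 2.
$limited = 'a,b,c' -split ',', 2
$limited -join '|'     # a|b,c
```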
By contrast, in Windows PowerShell use of the .NET (FullCLR, up to 4.x) String.Split() method, as you've tried, works very differently:
'OU=RAH,OU=RAC'.Split('OU=')
OU= is interpreted as an array of characters, any of which, individually acts as separator - irrespective of the order in which the characters are specified. Leading, adjacent, and trailing separators are by default considered to separate empty tokens, so you get an array of 7 tokens:
@( '', '', '', 'RAH,', '', '', 'RAC')
Note to PowerShell Core users (PowerShell versions 6 and above):
The .NET Core String.Split() method now does have a scalar [string] overload that looks for an entire string as the separator, which PowerShell Core selects by default; to get the character-array behavior described, you must cast to [char[]] explicitly:
'OU=RAH,OU=RAC'.Split([char[]] 'OU=')
If you construct the .Split() method call carefully, you can specify strings, but note that you still don't get regular-expression support:
PS> 'OU=RAH,OU=RAC'.Split([string[]] 'OU=', 'RemoveEmptyEntries')
RAH,
RAC
works to split by literal string OU=, removing empty entries, but as you can see, that doesn't allow you to account for the ,
You can take this further by specifying an array of strings to split by, which works in this simple case, but ultimately doesn't give you the same flexibility as the regular expressions that PowerShell's -split operator provides:
PS> 'OU=RAH,OU=RAC'.Split([string[]] ('OU=', ',OU='), 'RemoveEmptyEntries')
RAH
RAC
Note that specifying an (array of) strings requires the 2-argument form of the method call, meaning you must also specify a System.StringSplitOptions enumeration value. Use 'None' to not apply any options (as of this writing, the only true option that is supported is 'RemoveEmptyEntries', as used above).
(The type-safe way to specify the option is to use, e.g., [System.StringSplitOptions]::None; however, passing the option name as a string is a convenient shortcut; e.g., 'None'.)
It splits the string on each character in the separator, so it is splitting on 'O', 'U' & '='.
As @mklement0 has commented, my earlier answer would not work in all cases. So here is an alternate way to get the expected items.
$a.Split(',') |% { $_.Split('=') |? { $_ -ne 'OU' } }
This code will split the string, first on , then each item will be split on = and ignore the items that are OU, eventually returning the expected values:
RAH
RAC
This will work even in case of:
$a = 'OU=FOO,OU=RAH,OU=RAC'
generating 3 items FOO, RAH & RAC
To get only 2 strings as expected you could use the following line:
$a.Split('OU=', [System.StringSplitOptions]::RemoveEmptyEntries)
Which will give output as:
RAH,
RAC
And if you use (note the comma in the separator)
$a.Split(',OU=', [System.StringSplitOptions]::RemoveEmptyEntries)
you will get
RAH
RAC
This is probably what you want. :)
Never mind. Just realised it looks for strings on either side of 'O', 'U', and '='.
There are therefore 5 blank chars (in front of the first 'O', between 'O' and 'U', between 'U' and '=', between the second 'O' and 'U', between the second 'U' and '=').
String.Split() is character oriented. It splits on O, U, = as three separate places.
Think of it as intending to be used for 1,2,3,4,5. If you had ,2,3,4, it would imply there were empty spaces at the start and end. If you had 1,2,,,5 it would imply two empty spaces in the middle.
You can see with something like:
PS C:\> $a = 'OU=RAH,OU=RAC'
PS C:\> $a.Split('RAH')
OU=
,OU=
C
The spaces are R_A_H and R_A. When a separator sits at the very start or end of the string, it introduces a blank token there.
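The comma-separated cases described above can be checked directly. In this sketch the [char[]] cast is used so that the character-oriented overload is selected in both Windows PowerShell and PowerShell 6+; the variable names are illustrative.

```powershell
# Separators at the start and end produce empty first and last tokens.
$edges = ',2,3,4,'.Split([char[]] ',')
$edges.Count                    # 5
$edges[0] -eq ''                # True

# Adjacent separators produce empty tokens in the middle.
$middle = '1,2,,,5'.Split([char[]] ',')
$middle.Count                   # 5
($middle -eq '').Count          # 2
```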
PowerShell's -split operator is string oriented.
PS D:\t> $a = 'OU=RAH,OU=RAC'
PS D:\t> $a -split 'OU='
RAH,
RAC
You might do better to split on the comma, then replace out OU=, or vice versa, e.g.
PS D:\t> $a = 'OU=RAH,OU=RAC'
PS D:\t> $a.Replace('OU=','').Split(',')
RAH
RAC
I am getting a string from VSO (using TFPT.exe) that can be either the item number or the item number plus a letter
"830" or "830a"
How can I break off the letter if it exists - and convert the number to int
$a = 830
#or
$a = 830
$b = "a"
I tried to test whether "830" was a number, but I guess because it comes in as a string, I don't know how to ask: could this string be an int?
Assuming only the one set of numbers you can -match that pretty easily with regex. Where \d+ will match a group of consecutive digits.
PS C:\temp> "830a" -match "\d+"
True
PS C:\temp> $matches[0]
830
Knowing that you could incorporate something like this in your code.
$b = If($a -match "\d+"){[int]$matches[0]}
Obviously it would be more appropriate to use better variable names but this is just proof of concept. As written, this would cause an issue if the alpha characters were in the middle of the string. As long as the digits are grouped together it will work either way.
The other way you could do this would be to replace all of the characters that are not digits.
$a = "830adasdf"
$a = $a -replace "\D" -as [int]
\D meaning any non digit character. -as [int] will perform the cast.
In either case [int] will cast the remaining digit string as an integer.
If you could guarantee that the only extra content is a letter on the end, then you could use the string method .TrimEnd() as well. It removes all characters found at the end of a string that appear in a given char array. Let's give it an array of all letters. In practice this had an issue with case, so we take the string, convert it to uppercase and then remove any trailing letters (character codes 65..90 are A through Z).
"830z".ToUpper().TrimEnd([char[]](65..90)) -as [int]
It actually seems to convert the number array to char automatically so this would do just the same
"830z".ToUpper().TrimEnd(65..90) -as [int]
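To return both the integer and the optional letter in one step, the regex approach above can be extended with two capture groups. This is a sketch under the assumption that the input is digits followed by at most one letter; the function name Split-ItemNumber and the property names are hypothetical.

```powershell
# Hypothetical helper (name and output shape are illustrative assumptions).
function Split-ItemNumber {
    param([string] $Item)

    # Digits, optionally followed by exactly one letter; -match ignores case by default.
    if ($Item -match '^(?<num>\d+)(?<ver>[a-z]?)$') {
        [pscustomobject]@{
            Number  = [int] $Matches['num']
            Version = $Matches['ver']    # empty string when there is no letter
        }
    }
}

$withLetter = Split-ItemNumber '830a'
$plain      = Split-ItemNumber '830'

$withLetter.Number    # 830
$withLetter.Version   # a
$plain.Version -eq '' # True
```

Input that does not fit the assumed shape (for example "83a0") produces no output, which makes malformed values easy to detect.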
This is the best I have been able to come up with. It seems to work, but doesn't seem the most efficient way...
$t = $parent.Substring($parent.Length-1)
if($t -in @("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"))
{
[int]$parentSRP = $parent.Substring(0,$parent.Length-1)
$parentVer = $parent.Substring($parent.Length-1,1)
}
else{[int]$parentSRP = $parent}
I am having an issue with my PowerShell Program counting the number of sentences in a file I am using. I am using the following code:
foreach ($Sentence in (Get-Content file))
{
$i = $Sentence.Split("?")
$n = $Sentence.Split(".")
$Sentences += $i.Length
$Sentences += $n.Length
}
The total number of sentences I should get is 61 but I am getting 71, could someone please help me out with this? I have Sentences set to zero as well.
Thanks
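foreach ($Sentence in (Get-Content file))
{
# -split takes a regular expression; the character class matches '?' or '.'
$i = $Sentence -split '[?.]'
# n terminators produce n + 1 fragments, so subtract one per line
$Sentences += $i.Length - 1
}
I edited your code a bit.
Calling .Split("?") and .Split(".") separately returns, for each call, one more fragment than there are terminators on the line, so every line adds two extra to the total.
The -split operator takes a regular expression, so a single character class such as '[?.]' splits on both characters at once. Outside a character class, an unescaped . is a regex metacharacter that matches any character and must be written as \. there. The .Split() method, by contrast, does not use regular expressions at all, so a pattern passed to it is treated as a set of literal characters.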
When counting sentences, what you are looking for is where each sentence ends. Splitting, though, returns a collection of sentence fragments around those end characters, with the ends themselves represented by the gap between elements. Therefore, the number of sentences will equal the number of gaps, which is one less than the number of fragments in the split result.
Of course, as Keith Hill pointed out in a comment above, the actual splitting is unnecessary when you can count the ends directly.
foreach( $Sentence in (Get-Content test.txt) ) {
# Split at every occurrence of '.' and '?', and count the gaps.
$Split = $Sentence.Split( [char[]] '.?' ) # the cast selects the per-character overload in PowerShell 6+ as well
$SplitSentences += $Split.Count - 1
# Count every occurrence of '.' and '?'.
$Ends = [char[]]$Sentence -match '[.?]'
$CountedSentences += $Ends.Count
}
Contents of test.txt file:
Is this a sentence? This is a
sentence. Is this a sentence?
This is a sentence. Is this a
very long sentence that spans
multiple lines?
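Counting the ends directly can also be done in a single call over the whole text. The sketch below uses the sample lines above held in an array (an assumption made so the example is self-contained) and the .NET [regex]::Matches() method.

```powershell
$lines = 'Is this a sentence? This is a',
         'sentence. Is this a sentence?',
         'This is a sentence. Is this a',
         'very long sentence that spans',
         'multiple lines?'

# Count every '.' and '?' across all lines at once.
$text  = $lines -join "`n"
$count = [regex]::Matches($text, '[.?]').Count
$count   # 5
```

With a real file, the text could instead come from Get-Content with the -Raw switch (PowerShell 3 and later), which reads the file as one string.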
Also, to clarify on the remarks to Vasili's answer: the PowerShell -split operator interprets a string as a regular expression by default, while the .NET Split method only works with literal string values.
For example:
'Unclosed [bracket?' -split '[?]' will treat [?] as a regular expression character class and match the ? character, returning the two strings 'Unclosed [bracket' and ''
'Unclosed [bracket?'.Split( '[?]' ) will, in Windows PowerShell, call the Split(char[]) overload and match each [, ?, and ] character, returning the three strings 'Unclosed ', 'bracket', and ''
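Both behaviors can be reproduced side by side. In this sketch the [char[]] cast is added so that the method call behaves the same in Windows PowerShell and PowerShell 6+; the variable names are illustrative.

```powershell
$s = 'Unclosed [bracket?'

# Operator: '[?]' is a regex character class that matches only '?'.
$viaOperator = $s -split '[?]'

# Method: each of '[', '?' and ']' is a separate literal separator.
$viaMethod = $s.Split([char[]] '[?]')

$viaOperator.Count   # 2
$viaMethod.Count     # 3
```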