Splitting numbers and letters from a string misses zeroes - powershell

I am processing a time value that arrives in a string that starts with a T then is followed by a series of numbers immediately followed by the unit for each number. For example, 8 hours and 22 minutes will come in as T8H22M and 3 hours, 2 minutes, and 1 second will be T3H2M1S.
I need to split this encoded value into separate H/M/S columns, but I am having issues getting zeroes to work properly. I am using this code from another Question here with similar (but not identical) requirements.
This script:
Write-Host $('T8H22M' -split { [bool]($_ -as [double]) })
Write-Host $('T8H22M' -split { ! [bool]($_ -as [double]) })
Write-Host $('T8H0M' -split { [bool]($_ -as [double]) })
Write-Host $('T8H0M' -split { ! [bool]($_ -as [double]) })
Write-Host $('T0H' -split { $_ -eq '0' })
Produces this output:
T H M
8 22
T H0M
8
T H
As you can see, any time the numerical value is zero the string just doesn't split. This is a real problem as zero-time comes in as T0H which just won't parse using the method above.
How can I modify the code above to also split out zero values?

You could use a regex to extract the values. For example:
'T3H2M1S','T8H22M','T8H0M','T0H' |
ForEach-Object {
if($_ -match '^T((?<hours>\d+)H)*((?<minutes>\d+)M)*((?<seconds>\d+)S)*$') {
[PsCustomObject]@{
TimeString = $matches.0
Hours = [Int]$matches.hours
Minutes = [Int]$matches.minutes
Seconds = [Int]$matches.seconds
}
}
}
This produces an object for each time string, with Hours, Minutes and Seconds properties corresponding to the related part of that string:
TimeString Hours Minutes Seconds
---------- ----- ------- -------
T3H2M1S 3 2 1
T8H22M 8 22 0
T8H0M 8 0 0
T0H 0 0 0
Of course, you can change the contents of the if to manipulate the values however you like, not just creating an object - the key is that the RegEx should* split it correctly for you.
* - I say 'should' because I only tested with the example strings you give, so be sure to test more thoroughly yourself.

The time values you have resemble the ISO 8601 duration format except they all lack the leading P.
You can use the ToTimeSpan() method of the .NET class System.Xml.XmlConvert to convert these to TimeSpan objects:
'T3H2M1S','T8H22M','T8H0M','T0H' | ForEach-Object {
[System.Xml.XmlConvert]::ToTimeSpan("P$_") # prepend a 'P'
}
This returns TimeSpan objects:
Days : 0
Hours : 8
Minutes : 22
Seconds : 0
Milliseconds : 0
Ticks : 301200000000
TotalDays : 0,348611111111111
TotalHours : 8,36666666666667
TotalMinutes : 502
TotalSeconds : 30120
TotalMilliseconds : 30120000
Days : 0
Hours : 8
Minutes : 0
Seconds : 0
Milliseconds : 0
Ticks : 288000000000
TotalDays : 0,333333333333333
TotalHours : 8
TotalMinutes : 480
TotalSeconds : 28800
TotalMilliseconds : 28800000
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 0
Ticks : 0
TotalDays : 0
TotalHours : 0
TotalMinutes : 0
TotalSeconds : 0
TotalMilliseconds : 0
Now you can use any of the properties to format as you like.
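As a rough sketch of that last step, the snippet below formats one converted value as hh:mm:ss. The variable names are illustrative and not from the question, and it assumes durations stay under 24 hours, since the hh specifier does not include whole days.

```powershell
# Convert one of the sample strings by prepending the missing 'P'.
$ts = [System.Xml.XmlConvert]::ToTimeSpan('P' + 'T8H22M')

# Custom TimeSpan format; the colons must be escaped with a backslash.
$formatted = $ts.ToString('hh\:mm\:ss')   # 08:22:00

# Individual components are also available as properties.
$hours   = $ts.Hours     # 8
$minutes = $ts.Minutes   # 22
```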

To complement the answers from @boxdog and @Theo, here is an answer to the actual question of why the zeros are missing and "How can I modify the code above to also split out zero values?"
Quoting the description of the <ScriptBlock> parameter:
<ScriptBlock>
An expression that specifies rules for applying the delimiter. The
expression must evaluate to $true or $false. Enclose the script block
in braces.
The point here is that the 0 evaluates to $False ("falsy"):
if (0) { 'True' } else { 'False' }
False
if (1) { 'True' } else { 'False' }
True
[Bool]0
False
[Bool]1
True
In other words, you need to compare the result of the conversion against $Null to get what you are looking for:
Write-Host $('T8H22M' -split { $Null -ne ($_ -as [double]) })
T H M
Write-Host $('T8H22M' -split { $Null -eq ($_ -as [double]) })
8 22
Write-Host $('T8H0M' -split { $Null -ne ($_ -as [double]) })
T H M
Write-Host $('T8H0M' -split { $Null -eq ($_ -as [double]) })
8 0
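The difference between the two tests can also be seen without -split at all. A minimal sketch, with illustrative variable names that are not part of the original code:

```powershell
$zero   = '0' -as [double]   # converts successfully to the number 0
$letter = 'H' -as [double]   # conversion fails, so -as returns $null

$zeroAsBool     = [bool]$zero         # False: the number 0 is falsy
$zeroIsNumber   = $null -ne $zero     # True: the conversion succeeded
$letterIsNumber = $null -ne $letter   # False: nothing was converted
```

Casting to [bool] cannot tell "converted to 0" apart from "could not convert", while the $null comparison can.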
That said, I recommend going with the solution from @boxdog or @Theo.

Related

Sort an array containing a lot of dates quickly

I have a huge array which contains dates. Each date has the following form: dd.MM.yyyy. I know how to sort the array with Sort-Object, but the sorting takes a lot of time. I found another way of sorting arrays, but it doesn't work as expected.
My former code to sort the array was like this.
$data | Sort-Object { [System.DateTime]::ParseExact($_, "dd.MM.yyyy", $null) }
But as I said before, this way of sorting is too slow. The Sort() method from System.Array seems to be much faster.
[Array]::Sort([array]$array)
This code sorts an array containing strings much faster than Sort-Object. Is there a way to make the above sorting method sort by date, as the Sort-Object call does?
The .NET method will work for dates if you make sure that the array is of type DateTime.
Meaning you should use
[DateTime[]]$dateArray
instead of
[Array]$dateArray
when you create it. Then you can use
[Array]::Sort($dateArray)
to perform the sort itself.
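Put together, that approach might look like the sketch below. The sample values and the variable names $data, $culture and $sorted are assumptions for illustration; the strings are parsed once up front so that the array really holds DateTime values.

```powershell
# Sample input in dd.MM.yyyy form (illustrative values).
$data = '24.12.2013', '01.02.2014', '15.08.1999'
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Parse each string once and store the results in a typed DateTime array.
[DateTime[]]$dateArray = foreach ($s in $data) {
    [DateTime]::ParseExact($s, 'dd.MM.yyyy', $culture)
}

# In-place sort using the .NET method.
[Array]::Sort($dateArray)

# Convert back to the original text form if needed.
$sorted = $dateArray | ForEach-Object { $_.ToString('dd.MM.yyyy', $culture) }
# $sorted: 15.08.1999, 24.12.2013, 01.02.2014
```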
Your input data are date strings with a date format that doesn't allow sorting in "date" order. You must convert the strings either to actual dates
Get-Date $_
[DateTime]::ParseExact($_, "dd.MM.yyyy", $null)
or change the format of the string dates to ISO format, which does allow sorting in date order.
'{2}-{1}-{0}' -f ($_ -split '\.')
'{0}-{1}-{2}' -f $_.Substring(6,4), $_.Substring(3,2), $_.Substring(0,2)
$_ -replace '(\d+)\.(\d+)\.(\d+)', '$3-$2-$1'
At some point you must do one of these conversions, either when creating the data or when sorting.
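To illustrate why the reordered form sorts correctly as plain text, here is a small sketch using the Substring() variant. The sample values and the names $keys and $sortedKeys are illustrative only.

```powershell
# Sample input in dd.MM.yyyy form (illustrative values).
$dates = '24.12.2013', '01.02.2014', '15.08.1999'

# Rearrange dd.MM.yyyy into yyyy-MM-dd, which orders correctly as a string.
$keys = $dates | ForEach-Object {
    '{0}-{1}-{2}' -f $_.Substring(6,4), $_.Substring(3,2), $_.Substring(0,2)
}

# An ordinary string sort now yields chronological order.
$sortedKeys = $keys | Sort-Object
# $sortedKeys: 1999-08-15, 2013-12-24, 2014-02-01
```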
I ran some performance tests on each conversion, and string transformation using the Substring() method seems to be the fastest way:
PS C:\> $dates = 1..10000 | % {
>> $day = Get-Random -Min 1 -Max 28
>> $month = Get-Random -Min 1 -Max 12
>> $year = Get-Random -Min 1900 -Max 2014
>> '{0:d2}.{1:d2}.{2}' -f $day, $month, $year
>> }
>>
PS C:\> Measure-Command { $dates | sort {Get-Date $_} }
Days : 0
Hours : 0
Minutes : 0
Seconds : 1
Milliseconds : 520
Ticks : 15200396
TotalDays : 1,75930509259259E-05
TotalHours : 0,000422233222222222
TotalMinutes : 0,0253339933333333
TotalSeconds : 1,5200396
TotalMilliseconds : 1520,0396
PS C:\> Measure-Command { $dates | sort {'{2}-{1}-{0}' -f ($_ -split '.')} }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 413
Ticks : 4139027
TotalDays : 4,79054050925926E-06
TotalHours : 0,000114972972222222
TotalMinutes : 0,00689837833333333
TotalSeconds : 0,4139027
TotalMilliseconds : 413,9027
PS C:\> Measure-Command { $dates | sort {$_ -replace '(\d+)\.(\d+).(\d+)', '$3-$2-$1'} }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 348
Ticks : 3488962
TotalDays : 4,03815046296296E-06
TotalHours : 9,69156111111111E-05
TotalMinutes : 0,00581493666666667
TotalSeconds : 0,3488962
TotalMilliseconds : 348,8962
PS C:\> Measure-Command { $dates | sort {[DateTime]::ParseExact($_, "dd.MM.yyyy", $null)} }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 340
Ticks : 3408966
TotalDays : 3,9455625E-06
TotalHours : 9,46935E-05
TotalMinutes : 0,00568161
TotalSeconds : 0,3408966
TotalMilliseconds : 340,8966
PS C:\> Measure-Command { $dates | sort {'{0}-{1}-{2}' -f $_.Substring(6,4), $_.Substring(3,2), $_.Substring(0,2)} }
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 292
Ticks : 2926835
TotalDays : 3,38754050925926E-06
TotalHours : 8,13009722222222E-05
TotalMinutes : 0,00487805833333333
TotalSeconds : 0,2926835
TotalMilliseconds : 292,6835

PowerShell: any way to search a large text file faster?

$File="C:\temp\test\ID.txt"
$line="ART.023.AGA_203.PL"
Measure-Command {$Sel = Select-String -pattern $line -path $File }
Measure-Command {
$reader = New-Object System.IO.StreamReader($File)
$content = $reader.ReadToEnd().Split("`n")
$results = $content | select-string -Pattern $line
}
Measure-Command {
$content= get-content $File
$results = $content | select-string -Pattern $line
$results
}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 197
Ticks : 1970580
TotalDays : 2.28076388888889E-06
TotalHours : 5.47383333333333E-05
TotalMinutes : 0.0032843
TotalSeconds : 0.197058
TotalMilliseconds : 197.058
Days : 0
Hours : 0
Minutes : 0
Seconds : 4
Milliseconds : 135
Ticks : 41350664
TotalDays : 4.78595648148148E-05
TotalHours : 0.00114862955555556
TotalMinutes : 0.0689177733333333
TotalSeconds : 4.1350664
TotalMilliseconds : 4135.0664
Days : 0
Hours : 0
Minutes : 0
Seconds : 4
Milliseconds : 926
Ticks : 49265692
TotalDays : 5.70204768518518E-05
TotalHours : 0.00136849144444444
TotalMinutes : 0.0821094866666667
TotalSeconds : 4.9265692
TotalMilliseconds : 4926.5692
I want to search for about 10,000 different $line values in $File.
The search is very slow. Is there a faster way?
Example:
Searching for the keyword in $line should return the matching line from $File.
Keyword: ART.023.AGA_203.PL
Line in file: 2,45433;ART.023.AGA_203.PL;dddd;wwww;tt;
How does this compare to your other methods?
Measure-Command {
Get-Content $file -ReadCount 1000 |
foreach {$_ -match $line}
}
Note: when comparison testing operations that do disk reads like this, always run multiple tests and discard the first one. If the disk has any on-board read cache, the first test can pre-load the cache for subsequent tests and skew the results.
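For the stated goal of checking roughly 10,000 keywords, one option is to read the file once and test each line against a hash set rather than running one search per keyword. The sketch below is only an illustration under assumptions: it assumes the keyword is always the second semicolon-separated field (as in the example line), and it uses in-memory sample lines and made-up extra values so that it runs on its own. In practice the lines would come from the file.

```powershell
# Keywords to look for; the second entry is a made-up placeholder.
$keywords = New-Object 'System.Collections.Generic.HashSet[string]'
[void]$keywords.Add('ART.023.AGA_203.PL')
[void]$keywords.Add('ART.999.HYPOTHETICAL')

# Sample lines standing in for the file contents (the second line is invented).
$lines = @(
    '2,45433;ART.023.AGA_203.PL;dddd;wwww;tt;',
    '3,11111;ART.000.NOT_WANTED;aaaa;bbbb;cc;'
)

# Single pass: extract the second field and look it up in the set.
$hits = foreach ($l in $lines) {
    $id = $l.Split(';')[1]
    if ($keywords.Contains($id)) { $l }
}
```

The set lookup does not get slower as keywords are added, so the cost is dominated by reading the file once.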

In Powershell what is the most efficient way to generate a range interval?

Here is one example, but there must be a more efficient way:
1..100|%{$temp=$_;$temp%=3;if ($temp -eq 0){$_} }
1..100 | Where-Object {$_ % 3 -eq 0}
I would guess that the "most efficient" way would be to use a plain old for loop:
for($i=3; $i -le 100; $i +=3){$i}
Though that's not very elegant. You could create a function:
function range($start,$end,$interval) {for($i=$start; $i -le $end; $i +=$interval){$i}}
Timing this against your method (using more pithy version of other answer):
# ~> measure-command {1..100 | Where-Object {$_ % 3 -eq 0}}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 7
Ticks : 76020
TotalDays : 8.79861111111111E-08
TotalHours : 2.11166666666667E-06
TotalMinutes : 0.0001267
TotalSeconds : 0.007602
TotalMilliseconds : 7.602
# ~> measure-command{range 3 100 3}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 0
Ticks : 6197
TotalDays : 7.1724537037037E-09
TotalHours : 1.72138888888889E-07
TotalMinutes : 1.03283333333333E-05
TotalSeconds : 0.0006197
TotalMilliseconds : 0.6197
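For reference, a short usage sketch of the range function defined above. The function is repeated so the snippet runs on its own, and the variable names are illustrative.

```powershell
# Same function as above: emit $start, $start + $interval, ... up to $end.
function range($start, $end, $interval) {
    for ($i = $start; $i -le $end; $i += $interval) { $i }
}

$multiplesOfThree = range 3 20 3     # 3, 6, 9, 12, 15, 18
$tens             = range 10 50 10   # 10, 20, 30, 40, 50
```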

How to use the index property of the DfsrIdRecordInfo WMI class for pagination of WMI queries

edit: Should I have posted this on serverfault instead? There is not even a dfs-r category on stackoverflow, but I thought this was more of a scripting\programming question. Let me know if I should put this on serverfault instead.
I attempted to use the DfsrIdRecordInfo class to retrieve all files of a somewhat large (6000 file) DFSR database and was getting WMI quota errors.
Doubling and even tripling the wmi quotas on the server did not solve this.
I found looking here that the index property of this class is: "The run-time index of the record. This value is used to partition the result of a large query." which sounded like exactly what I wanted, but the behavior of this property is not what I expected.
I found that when I try to do paging with this property it does not retrieve all of the records as per the following example with powershell.
I tested this on a DFSR database with less than 700 files that does not throw a quota error. Because this is a small database I can get all the files like this in less than a second:
$DFSRFiles =
gwmi `
-Namespace root\microsoftdfs `
-ComputerName 'dfsrserver' `
-Query "SELECT *
FROM DfsrIdRecordInfo
WHERE replicatedfolderguid = '$guid'"
PS F:\> $DFSRFiles.count
680
So I have 680 files in this DFSR DB. Now if I try to use the index property for pagination like this:
$starttime = Get-Date;
$i = 0 #index counter
$DfsrIdRecordInfoArr = @()
while ($i -lt 1000) {
$starttimepage = Get-Date
$StartRange = $i
$EndRange = $i += 500
Write-Host -ForegroundColor Green "On range: $StartRange - $EndRange"
$DFSRFiles =
gwmi `
-Namespace root\microsoftdfs `
-ComputerName 'dfsrserver' `
-Query "SELECT *
FROM DfsrIdRecordInfo
WHERE index >= $StartRange
AND index <= $EndRange
AND replicatedfolderguid = '$guid'"
$DfsrIdRecordInfoArr += $DFSRFiles
Write-Host -ForegroundColor Green "Returned $($DFSRFiles.count) objects from range"
(Get-Date) - $starttimepage
write-host -fo yellow "DEBUG: i = $i"
}
(get-date) - $starttime
PS F:\> $DfsrIdRecordInfoArr.count
517
So it only returns 517 files.
Here is the full output of my debug messages. You can also see searching this way takes a super long time:
On range: 0 - 500
Returned 501 objects from range
Days : 0
Hours : 0
Minutes : 1
Seconds : 29
Milliseconds : 540
Ticks : 895409532
TotalDays : 0.001036353625
TotalHours : 0.024872487
TotalMinutes : 1.49234922
TotalSeconds : 89.5409532
TotalMilliseconds : 89540.9532
DEBUG: i = 500
On range: 500 - 1000
Returned 16 objects from range
Days : 0
Hours : 0
Minutes : 1
Seconds : 35
Milliseconds : 856
Ticks : 958565847
TotalDays : 0.00110945121180556
TotalHours : 0.0266268290833333
TotalMinutes : 1.597609745
TotalSeconds : 95.8565847
TotalMilliseconds : 95856.5847
DEBUG: i = 1000
Days : 0
Hours : 0
Minutes : 3
Seconds : 5
Milliseconds : 429
Ticks : 1854295411
TotalDays : 0.00214617524421296
TotalHours : 0.0515082058611111
TotalMinutes : 3.09049235166667
TotalSeconds : 185.4295411
TotalMilliseconds : 185429.5411
Am I doing something stupid? I was thinking that "run-time index" means the index property is not statically attached to the records and is generated anew for each record every time a query is run because the index properties of objects in $DFSRFiles do not match those in $DfsrIdRecordInfoArr.
But if the index property is different for every query then I would have duplicates in $DfsrIdRecordInfoArr which I do not. All the records are unique, but it just doesn't return all of them.
Is the index property totally useless for my purpose? Perhaps when it says "...partition the result of a large query" this means it is to be used on records that have already been returned from WMI not the WMI query itself.
Any guidance would be appreciated. Thanks in advance.

Converting time 121.419419 to readable minutes/seconds

I'd like to calculate the time my script runs, but my result from get-date is in totalseconds.
How can I convert this to something like 31:14:12, being hours:minutes:seconds?
PS> $ts = New-TimeSpan -Seconds 1234567
PS> '{0:00}:{1:00}:{2:00}' -f $ts.Hours,$ts.Minutes,$ts.Seconds
06:56:07
or
PS> "$ts" -replace '^\d+?\.'
06:56:07
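Note that both variants above drop whole days (1234567 seconds is a little over 14 days). If the hours should keep counting past 24, as in the 31:14:12 example from the question, a possible sketch is to take the whole hours from TotalHours instead. The variable names are illustrative.

```powershell
# 112452 seconds is 31 hours, 14 minutes and 12 seconds.
$ts = New-TimeSpan -Seconds 112452

# Whole hours including any full days, rather than the 0-23 Hours component.
$wholeHours = [int][math]::Floor($ts.TotalHours)

$formatted = '{0:00}:{1:00}:{2:00}' -f $wholeHours, $ts.Minutes, $ts.Seconds
# $formatted: 31:14:12
```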
All you have to do is use the Measure-Command cmdlet to get the time:
PS > measure-command { sleep 5}
Days : 0
Hours : 0
Minutes : 0
Seconds : 5
Milliseconds : 13
Ticks : 50137481
TotalDays : 5.80294918981481E-05
TotalHours : 0.00139270780555556
TotalMinutes : 0.0835624683333333
TotalSeconds : 5.0137481
TotalMilliseconds : 5013.7481
The above output might be good enough for you, or you can format it appropriately, as the output of Measure-Command is a TimeSpan object. Or you can use ToString:
PS > (measure-command { sleep 125}).tostring()
00:02:05.0017446