Conditional Multiplication in Loop - powershell

I've got a script that goes through a CSV file containing two formats of data (XY:ZABC or 0.xyz). The values are then saved in a CSV file with one column and a variable number of rows. I am trying to set up my script so that, for numbers of the form 0.xyz, it multiplies by 1440 and then stores the result in $Values. The numbers of the form XY:ZABC will be stored as they are in $Values as well.
$Values = @(Get-Content *\source.csv -Raw) -split '\s+' |
Where-Object {$_ -like '*:*' -or '0.*'}
"UniqueActiveFaults" | Out-File *\IdealOutput.csv
$Values | Sort-Object -Unique | Out-File *\IdealOutput.csv
I've tried to do this by adding the following code:
foreach ($i in $Values) {
if ($i -lt 1) {$i*1440}
}
I've also tried to do it with a do {$i*1440} while ($I -lt 1) loop, but the result is the number 0.xyz shown 1440 times. I believe it's due to the data type that $Values holds, but I'm not sure.
Sample data:
0.12345
00:9090 90:4582
0.12346
0.1145
0.145654
0.5648
01:9045 90:4500
90:4546
BA: 1117 BA:2525

In your code, $Values is an array of strings. The "multiply" operation on a string is to repeat it. To treat it like a number, cast to float before multiplying.
foreach ($i in $Values) {
if ($i -lt 1) {[float]$i * 1440}
}
As Tony Hinkle pointed out, this loop will simply output the result of the operation to the caller (or the console if you don't pipe it). If you want your array to reflect the change, you have to store it back.
for ($i = 0; $i -lt $Values.Length; $i++) {
if ($Values[$i] -lt 1) { $Values[$i] = [float]$Values[$i] * 1440 }
}
Be aware this will leave some of your values array as strings and some as floats. Depending on what you do with it, you might have to do further casts.
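Putting the pieces together, here is one possible end-to-end sketch (my illustration, not the asker's or answerer's exact code; the paths are placeholders, and it filters on the string's shape instead of a numeric comparison so the XY:ZABC entries are never cast to float):
$Values = (Get-Content .\source.csv -Raw) -split '\s+' |
    Where-Object { $_ -like '*:*' -or $_ -like '0.*' }

$Converted = foreach ($v in $Values) {
    if ($v -like '0.*') { [float]$v * 1440 }   # assumed intent: 0.xyz values scaled by 1440
    else                { $v }                 # XY:ZABC values passed through unchanged
}

"UniqueActiveFaults" | Out-File .\IdealOutput.csv
$Converted | Sort-Object -Unique | Out-File .\IdealOutput.csv -Append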

When you use $i*1440 that is simply telling PowerShell to multiply the two values and return the product. If you want to change the value of $i, you need to use $i = $i * 1440.
You may have other issues as well, but this is assuming that you are getting the correct values assigned to $i from the input.

Related

Outputs and easy algorithm question Object Oriented

$a = dir
foreach ($file in $a)
{
if (($file.index % 2) -eq 0)
# Hopefully this works; it's supposed to (ideally) print every other file
{
Write-Host $file.name
}
}
The -eq 0 check... I'm not sure whether that prints out every other file. I do not know exactly how the files are numbered, or how you attach a number to a file. Do you treat every file as an object and number them? Then write a function based on the numbers appended to the files?
I'm fairly new to this; I'm used to HTML and CSS.
If you have a more proficient approach, I'm open to that too.
Your script almost works.
I removed the dir alias in favor of Get-ChildItem, and sorted the results as requested.
The -File switch for Get-ChildItem excludes folders. I'm guessing that's what you want, but remove it otherwise.
Since there isn't an easy way to get the current position inside foreach, I used a for loop instead, but it's the same idea. If you want to try it with foreach, you could set a Boolean variable to $true and then negate (-not) it on each iteration (see the sketch after this answer).
$Path = 'C:\yourpath'
$Files = Get-ChildItem -Path $Path -File |
Sort-Object -Property 'Name' -Descending
for ($i = 0; $i -lt $Files.Count; $i++) {
if ($i % 2 -eq 0) {
Write-Host $Files[$i].Name
}
}
If you're using this output further, it's highly recommended to write results to an object rather than the console window.
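As an aside, here is a rough sketch of the foreach-with-toggle idea mentioned above (my own illustration, not part of the original answer):
$printThis = $true
foreach ($file in Get-ChildItem -Path $Path -File | Sort-Object -Property 'Name' -Descending) {
    if ($printThis) { Write-Host $file.Name }   # print this one...
    $printThis = -not $printThis                # ...then flip the toggle so the next is skipped
}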
Why not simply use a for loop and increment the index counter by 2?
for ($i = 0; $i -lt $a.Count; $i += 2) {
Write-Host $a[$i].Name
}

A better way to slice an array (or a list) in powershell

How can I export mail addresses to CSV files in batches of 30 users each?
I have already tried this:
$users = Get-ADUser -Filter * -Properties Mail
$nbCsv = [int][Math]::Ceiling($users.Count/30)
For($i=0; $i -le $nbCsv; $i++){
$arr=@()
For($j=(0*$i);$j -le ($i + 30);$j++){
$arr+=$users[$j]
}
$arr|Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ([int]$i)) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
}
It works, but I think there is a better way to achieve this task.
Have you got some ideas?
Thank you.
If you want a subset of an array, you can just use .., the range operator. The first 30 elements of an array would be:
$users[0..29]
Typically, you don't have to worry about going past the end of the array, either (see mkelement0's comment below). If there are 100 items and you're calling $array[90..119], you'll get the last 10 items in the array and no error. You can use variables and expressions there, too:
$users[$i..($i + 29)]
That's the $ith value and the next 29 values after the $ith value (if they exist).
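For instance, a quick demonstration of that out-of-range behavior (my own example):
$a = 1..10
$a[8..12]    # returns 9 and 10; indexes past the end are silently skipped (outside strict mode)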
Also, this pattern should be avoided in PowerShell:
$array = @()
loop-construct {
$array += $value
}
Arrays are fixed-size in .NET, and therefore fixed-size in PowerShell. That means that adding an element to an array with += really means "create a brand new array, copy every element over, put the one new item at the end, and then throw away the old array." It generates tremendous memory pressure, and if you're working with more than a couple hundred items it will be significantly slower.
Instead, just do this:
$array = loop-construct {
$value
}
Strings are similarly immutable and have the same problem with the += operator. If you need to build a string via concatenation, you should use the StringBuilder class.
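As a quick illustration of the preferred pattern (a sketch reusing the $users collection from the question):
# Collect the loop's output directly; PowerShell builds the array once.
$mailAddresses = foreach ($user in $users) {
    $user.Mail
}

# Likewise for strings: append to a StringBuilder instead of concatenating with +=.
$sb = [System.Text.StringBuilder]::new()
foreach ($user in $users) { [void]$sb.AppendLine($user.Mail) }
$text = $sb.ToString()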
Ultimately, however, here is how I would write this:
$users = Get-ADUser -Filter * -Properties Mail
$exportFileTemplate = Join-Path -Path $PSScriptRoot -ChildPath 'ASSFAM{0:d2}.csv'
$batchSize = 30
$batchNum = 0
$row = 0
while ($row -lt $users.Count) {
$users[$row..($row + $batchSize - 1)] | Export-Csv ($exportFileTemplate -f $batchNum) -Encoding UTF8 -NoTypeInformation
$row += $batchSize
$batchNum++
}
$row and $batchNum could be rolled into one variable, technically, but this is a bit more readable, IMO.
I'm sure you could also write this with Select-Object and Group-Object, but that's going to be fairly complicated compared to the above, and Group-Object isn't entirely known for its performance prior to PowerShell 6.
If you are using Powershell's strict mode, which may be required in certain configurations or scenarios, then you'll need to check that you don't enumerate past the end of the array. You could do that with:
while ($row -lt $users.Count) {
$users[$row..([System.Math]::Min(($row + $batchSize - 1),($users.Count - 1)))] | Export-Csv ($exportFileTemplate -f $batchNum) -Encoding UTF8 -NoTypeInformation
$row += $batchSize
$batchNum++
}
I believe that's correct, but I may have an off-by-one error in that logic. I have not fully tested it.
Bacon Bits' helpful answer shows how to simplify your code with the help of .., the range operator, but it would be nice to have a general-purpose chunking (partitioning, batching) mechanism; however, as of PowerShell 7.0, there is no built-in feature.
GitHub feature suggestion #8270 proposes adding a -ReadCount <int> parameter to Select-Object, analogous to the parameter of the same name already defined for Get-Content.
If you'd like to see this feature implemented, show your support for the linked issue there.
With that feature in place, you could do the following:
$i = 0
Get-ADUser -Filter * -Properties Mail |
Select-Object -ReadCount 30 | # WISHFUL THINKING: output 30-element arrays
ForEach-Object {
$_ | Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ++$i) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
}
In the interim, you could use custom function Select-Chunk (source code below): replace Select-Object -ReadCount 30 with Select-Chunk -ReadCount 30 in the snippet above.
Here's a simpler demonstration of how it works:
PS> 1..7 | Select-Chunk -ReadCount 3 | ForEach-Object { "$_" }
1 2 3
4 5 6
7
The above shows that the ForEach-Object script block receives the following three arrays, via $_, in sequence:
(1, 2, 3), (4, 5, 6), and (, 7)
(When you stringify an array, by default you get a space-separated list of its elements; e.g., "$(1, 2, 3)" yields 1 2 3).
Select-Chunk source code:
The implementation uses a [System.Collections.Generic.Queue[object]] instance to collect the inputs in batches of fixed size.
function Select-Chunk {
<#
.SYNOPSIS
Chunks pipeline input.
.DESCRIPTION
Chunks (partitions) pipeline input into arrays of a given size.
By design, each such array is output as a *single* object to the pipeline,
so that the next command in the pipeline can process it as a whole.
That is, for the next command in the pipeline $_ contains an *array* of
(at most) as many elements as specified via -ReadCount.
.PARAMETER InputObject
The pipeline input objects bind to this parameter one by one.
Do not use it directly.
.PARAMETER ReadCount
The desired size of the chunks, i.e., how many input objects to collect
in an array before sending that array to the pipeline.
0 effectively means: collect *all* inputs and output a single array overall.
.EXAMPLE
1..7 | Select-Chunk 3 | ForEach-Object { "$_" }
1 2 3
4 5 6
7
The above shows that the ForEach-Object script block receives the following
three arrays: (1, 2, 3), (4, 5, 6), and (, 7)
#>
[CmdletBinding(PositionalBinding = $false)]
[OutputType([object[]])]
param (
[Parameter(ValueFromPipeline)]
$InputObject
,
[Parameter(Mandatory, Position = 0)]
[ValidateRange(0, [int]::MaxValue)]
[int] $ReadCount
)
begin {
$q = [System.Collections.Generic.Queue[object]]::new($ReadCount)
}
process {
$q.Enqueue($InputObject)
if ($q.Count -eq $ReadCount) {
, $q.ToArray()
$q.Clear()
}
}
end {
if ($q.Count) {
, $q.ToArray()
}
}
}

While loop does not produce pipeline output

It appears that a While loop does not produce an output that can continue in the pipeline. I need to process a large (many GiB) file. In this trivial example, I want to extract the second field, sort on it, then get only the unique values. What am I not understanding about the While loop and pushing things through the pipeline?
In the *NIX world this would be a simple:
cut -d "," -f 2 rf.txt | sort | uniq
In PowerShell this would be not quite as simple.
The source data.
PS C:\src\powershell> Get-Content .\rf.txt
these,1,there
lines,3,paragraphs
are,2,were
The script.
PS C:\src\powershell> Get-Content .\rf.ps1
$sr = New-Object System.IO.StreamReader("$(Get-Location)\rf.txt")
while ($line = $sr.ReadLine()) {
Write-Verbose $line
$v = $line.split(',')[1]
Write-Output $v
} | sort
$sr.Close()
The output.
PS C:\src\powershell> .\rf.ps1
At C:\src\powershell\rf.ps1:7 char:3
+ } | sort
+ ~
An empty pipe element is not allowed.
+ CategoryInfo : ParserError: (:) [], ParseException
+ FullyQualifiedErrorId : EmptyPipeElement
You're making it a bit more complicated than it needs to be. You have a CSV without headers. The following should work:
Import-Csv .\rf.txt -Header f1,f2,f3 | Select-Object -ExpandProperty f2 -Unique | Sort-Object
Nasir's workaround looks like the way to go here.
If you want to know what was going wrong in your code, the answer is that while loops (and do/while/until loops) don't consistently return values to the pipeline the way that other statements in PowerShell do (actually that is true, and I'll keep the examples of that, but scroll down for the real reason it wasn't working for you).
ForEach-Object -- a cmdlet, not a built-in language feature/statement; does return objects to the pipeline.
1..3 | % { $_ }
foreach -- statement; does return.
foreach ($i in 1..3) { $i }
if/else -- statement; does return.
if ($true) { 1..3 }
for -- statement; does return.
for ( $i = 0 ; $i -le 3 ; $i++ ) { $i }
switch -- statement; does return.
switch (2)
{
1 { 'one' }
2 { 'two' }
3 { 'three' }
}
But for some reason, these other loops seem to act unpredictably.
Loops forever, repeatedly outputting $i (0; no incrementing going on):
$i = 0; while ($i -le 3) { $i }
Returns nothing, but $i does get incremented:
$i = 0; while ($i -le 3) { $i++ }
If you wrap the expression in parentheses, it seems it does get returned:
$i = 0; while ($i -le 3) { ($i++) }
But as it turns out (I'm learning a bit as I go here), while's strange return semantics have nothing to do with your error; you just can't pipe statements into functions/cmdlets, regardless of their return value.
foreach ($i in 1..3) { $i } | measure
will give you the same error.
You can "get around" this by making the entire statement a sub-expression with $():
$( foreach ($i in 1..3) { $i } ) | measure
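Applied to your original script, that would look something like this (a sketch of the same subexpression approach; not tested against your real data):
$sr = New-Object System.IO.StreamReader("$(Get-Location)\rf.txt")
$( while ($line = $sr.ReadLine()) {
       $line.Split(',')[1]    # emit the second field into the subexpression's output
   } ) | Sort-Object -Unique
$sr.Close()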
That would work for you in this case. Or in your while loop instead of using Write-Output, you could just add your item to an array and then sort it after:
$arr = @()
while ($line = $sr.ReadLine()) {
Write-Verbose $line
$v = $line.split(',')[1]
$arr += $v
}
$arr | sort
I know you're dealing with a large file here, so maybe you're thinking that by piping to sort line by line you'll be avoiding a large memory footprint. In many cases piping does work that way in PowerShell, but the thing about sorting is that you need the whole set to sort it, so the Sort-Object cmdlet will be "collecting" each item you pass to it anyway and then do the actual sorting at the end; I'm not sure you can avoid that at all. Admittedly, letting Sort-Object do that instead of building the array yourself might be more efficient depending on how it's implemented, but I don't think you'll be saving much on RAM.
Another solution:
Get-Content -Path C:\temp\rf.txt | select @{Name="Mycolumn";Expression={($_ -split "," )[1]}} | select Mycolumn -Unique | sort

Fastest way to parse thousands of small files in PowerShell

I have over 16000 inventory log files ranging in size from 3-5 KB on a network share.
Sample file looks like this:
## System Info
SystemManufacturer:=:Dell Inc.
SystemModel:=:OptiPlex GX620
SystemType:=:X86-based PC
ChassisType:=:6 (Mini Tower)
## System Type
isLaptop=No
I need to put them into a DB, so I started parsing them and creating a custom object for each one that I can later use to check for duplicates, normalize, etc.
An initial parse with the code snippet below took about 7.5 minutes.
Foreach ($invlog in $invlogs) {
$content = gc $invlog.FullName -ReadCount 0
foreach ($line in $content) {
if ($line -match '^#|^\s*$') { continue }
$invitem,$value=$line -split ':=:'
[PSCustomObject]@{Name=$invitem;Value=$value}
}
}
I started optimizing it, and after several rounds of trial and error ended up with this, which takes 2 minutes and 4 seconds:
Foreach ($invlog in $invlogs) {
foreach ($line in ([System.IO.File]::ReadLines("$($invlog.FullName)") -match '^\w') ) {
$invitem,$value=$line -split ':=:'
[PSCustomObject]@{name=$invitem;Value=$value} #2.04mins
}
}
I also tried using a hashtable instead of a PSCustomObject, but to my surprise it took much longer (5 minutes 26 seconds):
Foreach ($invlog in $invlogs) {
$hash=@{}
foreach ($line in ([System.IO.File]::ReadLines("$($invlog.FullName)") -match $propertyline) ) {
$invitem,$value=$line -split ':=:'
$hash[$invitem]=$value #5.26mins
}
}
What would be the fastest method to use here?
See if this is any faster:
Foreach ($invlog in $invlogs) {
@(gc $invlog.FullName -ReadCount 0) -notmatch '^#|^\s*$' |
foreach {
$invitem,$value=$_ -split ':=:'
[PSCustomObject]@{Name=$invitem;Value=$value}
}
}
The -match and -notmatch operators, when applied to an array, return all the elements that satisfy the match, so you can eliminate having to test every line for the lines to exclude.
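A quick illustration of that array behavior (my own example, not from the answer):
# Against an array, -notmatch filters the elements instead of returning a Boolean.
'## System Info', 'SystemModel:=:OptiPlex GX620', '' -notmatch '^#|^\s*$'
# Output: SystemModel:=:OptiPlex GX620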
Are you really wanting to create a PS Object for every line, or just one for every file?
If you want one object per file, see if this is any quicker:
The multi-line regex eliminates the line array, and a filter is used in place of the foreach to create the hash entries.
$regex = [regex]'(?ms)^(\w+):=:([^\r]+)'
filter make-hash { @{$_.groups[1].value = $_.groups[2].value} }
Foreach ($invlog in $invlogs) {
$regex.matches([io.file]::ReadAllText($invlog.fullname)) | make-hash
}
The objective of switching to the multi-line regex and [io.file]::ReadAllText() is to simplify what PowerShell is doing with the file input internally. The result of [io.file]::ReadAllText() will be a string object, which is a much simpler type of object than the array of strings that [io.file]::ReadAllLines() will produce, and requires less overhead to construct internally. A filter is essentially just the Process block of a function - it will run once for every object that comes to it from the pipeline, so it emulates the action of ForEach-Object, but actually runs slightly faster (I don't know the internals well enough to tell you exactly why). Both of these changes require more coding and only result in a marginal increase in performance. In my testing, switching to the multi-line regex gained about .1 ms per file, and changing from ForEach-Object to the filter gained another .1 ms. You probably don't see these techniques used very often because of the low return compared to the additional coding work required, but it becomes significant when you start to multiply those fractions of a ms by 160K iterations.
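To illustrate the filter-vs-function point (a minimal example of my own, not from the answer):
# A filter is shorthand for a function whose whole body is the Process block.
filter Twice { $_ * 2 }
function Twice2 { process { $_ * 2 } }

1..3 | Twice     # 2 4 6
1..3 | Twice2    # same output; per the answer above, the filter form is marginally faster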
Try this:
Foreach ($invlog in $invlogs) {
$output = @{}
foreach ($line in ([IO.File]::ReadLines("$($invlog.FullName)") -ne '') ) {
if ($line.Contains(":=:")) {
$item, $value = $line.Split(":=:") -ne ''
$output[$item] = $value
}
}
New-Object PSObject -Property $output
}
As a general rule, Regex is sometimes cool but always slower.
Wouldn't you want an object per system, and not per key-value pair? :S
Like this. By replacing Get-Content with the .NET method you could probably save some time.
Get-ChildItem -Filter *.txt -Path <path to files> | ForEach-Object {
$ht = @{}
Get-Content $_ | Where-Object { $_ -match ':=:' } | ForEach-Object {
$ht[($_ -split ':=:')[0].Trim()] = ($_ -split ':=:')[1].Trim()
}
[pscustomobject]$ht
}
ChassisType SystemManufacturer SystemType SystemModel
----------- ------------------ ---------- -----------
6 (Mini Tower) Dell Inc. X86-based PC OptiPlex GX620

Slow Excel automation?

I'm using the following PowerShell script to remove the NumberFormat of the cells in a column for a lot of files, so all the fractions will be displayed. The column may contain decimal, text or date values, etc.; only the cells with a decimal/currency format (matching 0* or *#*) need to be changed.
However, it's slow (it checks/updates only two or three cells per second). Is there a better/faster way to do it?
$WorkBook = $Excel.Workbooks.Open($fileName)
$WorkSheet = $WorkBook.Worksheets.Item(1)
$cell = $WorkSheet.Cells
$ColumnIndex = 10 # The column may have decimal, text or date, etc.
$i = 2
while ($cell.Item($i, 1).value2 -ne $Null)
# Would replacing this with a lookup of the last row of column 1 cut the time in half? How?
{
$c = $cell.Item($i, $ColumnIndex)
if (($c.NumberFormat -like "0*") -or $c.NumberFormat -like "*#*")
{
"$i : $($c.NumberFormat) $($c.value2) "
$c.NumberFormat = $Null
}
$i++
}
Update:
Would using the .NET Microsoft.Office.Interop.Excel API directly be much faster?
Or converting the files to xlsx format and using System.IO.Package.IO?
I improved the speed after reading the comments. Thanks all.
Try to reduce access to the cells as much as possible. I deleted the output line "$i : $($c.NumberFormat) $($c.value2)" and changed
if (($c.NumberFormat -like "0*") -or $c.NumberFormat -like "*#*")
to
$f = $c.NumberFormat
if ($f -like "0*" -or $f -like "*#*")
I also used $lastRow = $cell.SpecialCells(11, 1).Row to get the last row number and changed the loop to while ($i -le $lastRow).
Setting $Excel.ScreenUpdating = $false also helped reduce the time.
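Putting those changes together, the loop might look roughly like this (a sketch of the optimizations described above, reusing $Excel, $cell and $ColumnIndex from the original script; not the exact final code):
$lastRow = $cell.SpecialCells(11, 1).Row   # last used row of the sheet
$Excel.ScreenUpdating = $false

for ($i = 2; $i -le $lastRow; $i++) {
    $c = $cell.Item($i, $ColumnIndex)
    $f = $c.NumberFormat                   # read the COM property only once per cell
    if ($f -like "0*" -or $f -like "*#*") {
        $c.NumberFormat = $null
    }
}

$Excel.ScreenUpdating = $true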