Using System.Random in PowerShell to Retrieve a Random List from an Array - powershell

I need a list of generated six-digit numbers.
First I generated a list of numbers.
$Numbers += 0..999999 | ForEach-Object {
if ( ($_ -as [string]).Length -eq 1 ) {
"00000$_"
}elseif ( ($_ -as [string]).Length -eq 2 ) {
"0000$_"
}elseif ( ($_ -as [string]).Length -eq 3 ) {
"000$_"
}elseif ( ($_ -as [string]).Length -eq 4 ) {
"00$_"
}elseif ( ($_ -as [string]).Length -eq 5 ) {
"0$_"
}else{$_ -as [string]}
}
Next, I shuffle the list.
$RandomNumbers = $Numbers | Sort-Object { Get-Random }
And that's exactly what takes too long for me.
Is there a way to have the numbers sorted faster?
I thought about "System.Random", but I couldn't get it to work.

You can use Get-Random not only to generate random numbers but also to efficiently shuffle a collection:
$RandomNumbers = Get-Random -InputObject $Numbers -Count $Numbers.Count
This takes less than a second on my machine for one million numbers, whereas your original code took 27 seconds.
As expected, the $RandomNumbers array only contains unique numbers, which I have verified using ($RandomNumbers | Sort-Object -unique).Count which outputs 1000000.
If you have PowerShell 7+ available, you can slightly simplify the code by replacing the -Count parameter with the -Shuffle parameter:
$RandomNumbers = Get-Random -InputObject $Numbers -Shuffle
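Since the question asks about System.Random specifically, here is a minimal sketch of a Fisher-Yates shuffle that calls it directly. The function name Invoke-ArrayShuffle is made up for illustration, and the sample uses only 1000 numbers to keep it quick. Get-Random as shown above is likely still the simpler choice.

```powershell
# Hypothetical helper (not a built-in): in-place Fisher-Yates shuffle using System.Random.
function Invoke-ArrayShuffle {
    param(
        [object[]] $Items,
        [System.Random] $Rng = (New-Object System.Random)
    )
    for ($i = $Items.Length - 1; $i -gt 0; $i--) {
        $j = $Rng.Next($i + 1)      # random index in 0..$i (upper bound is exclusive)
        $tmp = $Items[$i]
        $Items[$i] = $Items[$j]
        $Items[$j] = $tmp
    }
    $Items
}

# Small sample: zero-padded six-digit strings for 0..999.
$Numbers  = foreach ($n in 0..999) { $n.ToString('d6') }
$Shuffled = Invoke-ArrayShuffle -Items $Numbers.Clone()
```

Each element is swapped exactly once, so the cost grows linearly with the array size instead of paying for a full sort.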

The problem here is not that Get-Random is particularly slow, it's that you're generating way more data than necessary. As jdweng comments:
Only generate the minimum number of random numbers that you need.
If you only need 1000 numbers between 0 and 999999, call Get-Random only 1000 times:
$RandomNumbers = 1..1000 |ForEach-Object {
'{0:d6}' -f (Get-Random -Min 0 -Max 1000000)
}
If the numbers have to be distinct, you can use a HashSet<int> or a hashtable to keep track of unique values until you reach the desired amount:
$NumberCount = 1000
$newNumbers = @{}
while($newNumbers.psbase.Count -lt $NumberCount){
$newNumbers[(Get-Random -Min 0 -Max 1000000)] = $true
}
$RandomNumbers = $newNumbers.psbase.Keys |ForEach-Object ToString d6
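The HashSet<int> variant mentioned above could look roughly like this. It is a sketch under the same assumptions as the hashtable version (1000 distinct values from 0 to 999999); the variable name $seen is made up here.

```powershell
$NumberCount = 1000
$seen = New-Object 'System.Collections.Generic.HashSet[int]'

while ($seen.Count -lt $NumberCount) {
    # Add() returns $false for a duplicate; [void] discards that return value.
    [void] $seen.Add((Get-Random -Minimum 0 -Maximum 1000000))
}

# Format each distinct value as a zero-padded six-digit string.
$RandomNumbers = $seen | ForEach-Object { $_.ToString('d6') }
```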

Related

Powershell Get-Random with Constraints

I'm currently using the Get-Random function of Powershell to randomly pull a set number of rows from a csv. I need to create a constraint that says if one id is pulled, find the other ids that match it and pull their value.
Here is what I currently have:
$chosenOnes = Import-CSV C:\Temp\pk2.csv | sort{Get-Random} | Select -first 6
$i = 1
$count = $chosenOnes | Group-Object householdID
foreach ($row in $count)
{
if ($row.count -gt 1)
{
$students = $row.Group.Student
foreach ($student in $students)
{
$name = $student.tostring()
#...do something
$i = $i + 1
}
}
else
{
$name = $row.Group.Student
if($i -le 5)
{
#...do something
}
else
{
#...do something
}
$i = $i + 1
}
}
Example dataset
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
In this example, there are 7 rows but I'm randomly selecting 6 in my code. What needs to happen is that when ID=7751 is selected, I must return both rows where ID=7751. The IDs cannot be statically set in the code.
Use Get-Random directly, with -Count, to extract a given number of random elements from a collection.
$allRows = Import-CSV C:\Temp\pk2.csv
$chosenHouseholdIDs = ($allRows | Get-Random -Count 6).householdID
Then filter all rows by whether their householdID column contains one of the 6 randomly selected rows' householdID values (PSv3+ syntax), using the -in array-containment operator:
$allRows | Where-Object householdID -in $chosenHouseholdIDs
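For a quick try without a file, the two steps can be sketched against the example dataset from the question. Note the assumptions: the column is called ID here (as in the sample data, not householdID as in the code), the rows are inlined with ConvertFrom-Csv, and only 3 rows are drawn so the effect is visible.

```powershell
# Example dataset from the question, inlined instead of Import-Csv.
$allRows = @'
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
'@ | ConvertFrom-Csv

# Draw 3 random rows and keep their distinct IDs.
$chosenIDs = ($allRows | Get-Random -Count 3).ID | Select-Object -Unique

# One pass over all rows: keep every row whose ID was drawn.
$result = $allRows | Where-Object ID -in $chosenIDs
```

Whenever one of the Doe rows is drawn, both rows with ID 7751 end up in $result.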
Optional reading: performance considerations:
$allRows | Get-Random -Count 6 is not only conceptually simpler, but also much faster than $allRows | Sort-Object { Get-Random } | Select-Object -First 6
Using the Time-Command function to compare the performance of two approaches, using a 1000-row test file with 10 columns yields the following sample timings on my Windows 10 VM in Windows PowerShell - note that the Sort-Object { Get-Random }-based solution is more than 15(!) times slower:
Factor Secs (100-run avg.) Command                                                        TimeSpan
------ ------------------- -------                                                        --------
1.00   0.007               $allRows | Get-Random -Count 6                                 00:00:00.0072520
15.65  0.113               $allRows | Sort-Object { Get-Random } | Select-Object -First 6 00:00:00.1134909
Similarly, a single pass through all rows to find matching IDs via array-containment operator -in performs much better than looping over the randomly selected IDs and searching all rows for each.
I tried sticking with your beginning and came up with this.
$Array = Import-CSV C:\test\StudtentTest.csv
$Array | Sort{Get-Random} | select -first 2 | %{
$id = $_.id
$Array | ?{$_.id -eq $id} | %{
$_
}
}
$Array will be your parsed CSV.
We pipe it in, sort it randomly, and select the first 2 rows (in this case).
Save the ID of each object into $id, then search the array for that ID and display every row that matches.
If the same ID matches more than one row, you end up with something like:
ID name
-- ----
7751 John Doe
7751 Tim Doe
1634 Charles Dickens

Get random items from hashtable but the total of values has to be equal to a set number

I'm trying to build a simple "task distributor" for the house tasks between me and my wife. Although the concept will be really useful at work too so I need to learn it properly.
My hashtable:
$Taches = @{
"Balayeuse plancher" = 20
"Moppe plancher" = 20
"Douche" = 15
"Litières" = 5
"Poele" = 5
"Comptoir" = 5
"Lave-Vaisselle" = 10
"Toilette" = 5
"Lavabos" = 10
"Couvertures lit" = 5
"Poubelles" = 5
}
The total value for all the items is 105 (minutes).
So roughly 50 minutes each if we split it in two.
My goal:
I want to select random items from that hashtable and build two different hashtables - one for me and my wife, each having a total value of 50 (So it's fair). For example 20+20+10 or 5+5+5+15+20, etc. The hard part is that ALL tasks have to be accounted for between the two hashtables and they can only be present ONCE in each of them (no use in cleaning the same thing twice!).
What would be the best option?
For now I successfully achieved a random hashtable of a total value of 50 like this:
do {
$Me = $null
$sum = $null
$Me = @{}
$Me = $Taches.GetEnumerator() | Get-Random -Count 5
$Me | ForEach-Object { $Sum += $_.value }
} until ($sum -eq 50)
Result example :
Name Value
---- -----
Poubelles 5
Balayeuse plancher 20
Douche 15
Poele 5
Toilette 5
It works but boy does it feel like it's a roundabout and crooked way of doing it. I'm sure there is a better approach? Plus I'm lacking important things. ALL the tasks have to be accounted for and not be present twice. This is quite complicated although it looked simple at first!
You cannot maximise both randomness and fairness at the same time, so one has to give. I think you should not risk being unfair to your wife, so fairness must prevail!
Fairness at the expense of randomness
This approach sorts the items in descending time order and then randomly assigns them to each person unless that assignment would be unfair.
The fairness calculation here is that the maximum time difference should be at most the duration of the quickest task.
$DescendingOrder = $Taches.Keys | Sort-Object -Descending { $Taches[$_] }
$Measures = $Taches.Values | Measure-Object -Sum -Minimum
$UnfairLimit = ($Measures.Sum + $Measures.Minimum) / 2
$Person1 = @{}
$Person2 = @{}
$Total1 = 0
$Total2 = 0
foreach ($Item in $DescendingOrder) {
$Time = $Taches[$Item]
$Choice = Get-Random 2
if (($Choice -eq 0) -and (($Total1 + $Time) -gt $UnfairLimit)) {
$Choice = 1
}
if (($Choice -eq 1) -and (($Total2 + $Time) -gt $UnfairLimit)) {
$Choice = 0
}
if ($Choice -eq 0) {
$Person1[$Item] = $Time
$Total1 += $Time
} else {
$Person2[$Item] = $Time
$Total2 += $Time
}
}
An example run:
PS> $Person1 | ConvertTo-Json
{
"Comptoir": 5,
"Lavabos": 10,
"Litières": 5,
"Couvertures lit": 5,
"Douche": 15,
"Lave-Vaisselle": 10
}
and the other person:
PS> $Person2 | ConvertTo-Json
{
"Moppe plancher": 20,
"Toilette": 5,
"Balayeuse plancher": 20,
"Poubelles": 5,
"Poele": 5
}
Randomness at the expense of fairness
This approach is to randomize the list, go through each item and then assign it to the person who has the least time allocated to them so far.
Earlier decisions might mean that later decisions end up being unfair.
$RandomOrder = $Taches.Keys | Sort-Object { Get-Random }
$Person1 = @{}
$Person2 = @{}
$Total1 = 0
$Total2 = 0
foreach ($Item in $RandomOrder) {
$Time = $Taches[$Item]
if ($Total1 -lt $Total2) {
$Person1[$Item] = $Time
$Total1 += $Time
} else {
$Person2[$Item] = $Time
$Total2 += $Time
}
}
An example run:
PS> $Person1 | ConvertTo-Json
{
"Poele": 5,
"Douche": 15,
"Couvertures lit": 5,
"Lave-Vaisselle": 10,
"Balayeuse plancher": 20,
"Toilette": 5
}
and the other person:
PS> $Person2 | ConvertTo-Json
{
"Lavabos": 10,
"Comptoir": 5,
"Poubelles": 5,
"Litières": 5,
"Moppe plancher": 20
}
You should probably write the algorithm to always have you take the extra task in a rounding error (Happy Wife, Happy Life).
This is probably over-engineered, but I was intrigued by the question, and learned some French in the process.
$Taches = @{
"Balayeuse plancher" = 20
"Moppe plancher" = 20
"Douche" = 15
"Litières" = 5
"Poele" = 5
"Comptoir" = 5
"Lave-Vaisselle" = 10
"Toilette" = 5
"Lavabos" = 10
"Couvertures lit" = 5
"Poubelles" = 5
}
$target = 0
$epsilon = 5
# copy if you don't want to destroy original list (not needed probably)
# put all entries in first list.
# randomly move entry to p2 if count over target +/- epsilon
# randomly move entry from p2 if count under target +/- epsilon
# (unless you know you can always get exactly target and not loop forever trying)
$p1 = @{} # person 1
$p2 = @{} # person 2
$p1Total = 0 # optimization to not have to walk entire list and recalculate constantly
$p2Total = 0 # might as well track this too...
$Taches.Keys | % {
$p1.Add($_, $Taches[$_])
$p1Total += $Taches[$_]
$target += $Taches[$_]
}
$target = $target / 2
$done = $false
while (-not $done)
{
if ($p1Total -gt ($target+$epsilon))
{
$item = $p1.Keys | Get-Random
$value = $p1[$item]
$p1.Remove($item)
$p2.Add($item, $value)
$p1Total -= $value
$p2Total += $value
continue
}
elseif ($p1Total -lt ($target-$epsilon))
{
$item = $p2.Keys | Get-Random
$value = $p2[$item]
$p2.Remove($item)
$p1.Add($item, $value)
$p1Total += $value
$p2Total -= $value
continue
}
$done = $true
}
"Final result"
"p1"
$p1Total
$p1
"`np2"
$p2Total
$p2
Yet another approach:
$MinSum = ($Taches.Values | Measure-Object -Minimum ).Minimum
$HalfSum = ($Taches.Values | Measure-Object -Sum ).Sum / 2
do {
$sum = 0
$All = $Taches.GetEnumerator() |
Get-Random -Count $Taches.Keys.Count
$Me = $All | ForEach-Object {
if ( $Sum -lt $HalfSum - $MinSum ) {
$Sum += $_.value
@{ $_.Key = $_.Value }
}
}
Write-Host "$sum " -NoNewline # debugging output
} until ($sum -eq 50 )
$Em = $Taches.Keys | ForEach-Object {
if ( $_ -notin $Me.Keys ) {
@{ $_ = $Taches.$_ }
}
}
# show "fairness" (task count vs. task cost)
$Me.Values | Measure-Object -Sum | Select-Object -Property Count, Sum
$Em.Values | Measure-Object -Sum | Select-Object -Property Count, Sum
Sample output(s):
PS D:\PShell> D:\PShell\SO\54610011.ps1
50
Count Sum
----- ---
4 50
7 55
PS D:\PShell> D:\PShell\SO\54610011.ps1
65 65 50
Count Sum
----- ---
6 50
5 55
Great answers guys, learned a lot. Here is what I ended up doing thanks to "Fischfreund" on Reddit (https://www.reddit.com/r/PowerShell/comments/aovs8s/get_random_items_from_hashtable_but_the_total_of/eg3ytds).
His approach is amazingly simple yet I didn't think of it at all.
First hashtable: get a random selection of 5 items until the sum is 50. Then create a second hashtable from the items that are not in the first hashtable! I assign that first hashtable, containing 5 items, to my wife so I'm the one who always has an extra task (as suggested by Kory ;)). Phew, I'm safe.
$Taches = @{
"Balayeuse plancher" = 20
"Moppe plancher" = 20
"Douche" = 15
"Litières" = 5
"Poele" = 5
"Comptoir" = 5
"Lave-Vaisselle" = 10
"Toilette" = 5
"Lavabos" = 10
"Couvertures lit" = 5
"Poubelles" = 5
}
do {
$Selection1 = $Taches.GetEnumerator() | Get-Random -Count 5
} until (($Selection1.Value | measure -Sum ).Sum -eq 50)
$Selection2 = $Taches.GetEnumerator() | Where-Object {$_ -notin $Selection1}
$Selection1 | select-object @{Name="Personne";expression={"Wife"} },Name,Value
""
$Selection2 | select-object @{Name="Personne";expression={"Me"} },Name,Value
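To convince yourself that every task lands in exactly one selection, a check along these lines may help. It is only a sketch: it repeats the task list so it runs on its own, compares by key rather than by whole entry, and the names $sum1, $sum2, $allKeys and $complete are made up for illustration.

```powershell
$Taches = @{
    "Balayeuse plancher" = 20; "Moppe plancher" = 20; "Douche" = 15
    "Litières" = 5; "Poele" = 5; "Comptoir" = 5; "Lave-Vaisselle" = 10
    "Toilette" = 5; "Lavabos" = 10; "Couvertures lit" = 5; "Poubelles" = 5
}

# Same idea as above: redraw 5 tasks until they add up to 50 minutes.
do {
    $Selection1 = $Taches.GetEnumerator() | Get-Random -Count 5
} until (($Selection1.Value | Measure-Object -Sum).Sum -eq 50)

# Everything not drawn goes to the other person (compared by task name).
$Selection2 = $Taches.GetEnumerator() | Where-Object { $_.Key -notin $Selection1.Key }

$sum1 = ($Selection1.Value | Measure-Object -Sum).Sum
$sum2 = ($Selection2.Value | Measure-Object -Sum).Sum

# Complete and disjoint: 11 keys in total, all of them distinct.
$allKeys  = @($Selection1.Key) + @($Selection2.Key)
$complete = ($allKeys.Count -eq $Taches.Count) -and
            (@($allKeys | Sort-Object -Unique).Count -eq $Taches.Count)
```

With this task list the totals come out as 50 and 55 minutes, because the overall sum is 105.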

Appending to object with value already in object

So, I have the following code that is part of my spoof database creator - which works:
$Start_dateMin = get-date -year 2015 -month 1 -day 1
$Start_dateMax = get-date
for ( $i = 0; $i -le (Get-Random -Minimum 1 -Maximum 2); $i++ ) {
<# Create job data array #>
$jobData = @{
ReceivedDate = (new-object datetime (Get-Random -min $Start_dateMin.ticks -max $Start_dateMax.ticks)).ToString("yyyy/MM/dd")
}
<# Add to job data array #>
$jobData += @{
StartDate = (Get-Date $jobData.ReceivedDate).AddDays((Get-Random -Minimum 20 -Maximum 50)).ToString("yyyy/MM/dd")
EndDate = (Get-Date $jobData.ReceivedDate).AddDays((Get-Random -Minimum 100 -Maximum 500)).ToString("yyyy/MM/dd")
ClosingDate = (Get-Date $jobData.ReceivedDate).AddDays((Get-Random -Minimum 30 -Maximum 100)).ToString("yyyy/MM/dd")
UpdatedDate = (Get-Date $jobData.ReceivedDate).AddDays((Get-Random -Minimum 5 -Maximum 30)).ToString("yyyy/MM/dd")
JobStatusID = if (( Get-Date $jobData.ReceivedDate ) -ge ((Get-Date).AddDays(-100))) { 5 } else { 1 }
}
<# Add to job data array #>
$jobData += @{
jobOutcomeID = if ( $jobData.JobStatusID -eq 1 -and (Get-Random -Minimum 0 -Maximum 100) -lt 70 ) { 1 }
elseif ( $jobData.JobStatusID -eq 1 -and (Get-Random -Minimum 0 -Maximum 100) -lt 85 ) { 2 }
else { (Get-Random -InputObject $JobOutcomeList -count 1).itemArray[0] }
}
}
$jobData | format-table
There's no issue with this; all the data is populated correctly. Here's some sample output:
Name Value
---- -----
JobStatusID True
UpdatedDate 2017/01/30
ClosingDate 2017/04/23
EndDate 2017/09/16
jobOutcomeID 2
StartDate 2017/02/28
ReceivedDate 2017/01/14
It is correctly using the received date from the declaration and creating dates based off that.
The second snippet of code does not work:
for ( $i = 0; $i -lt (Get-Random -Minimum 1 -Maximum 2); $i++) {
$jobCandData = @{
DateSelected = (Get-Date).AddDays((Get-Random -Minimum 1 -Maximum 30)).ToString("yyyy/MM/dd");
JobCanOutcome = 7
}
$jobCandData += @{
CandNote = if ( $jobCandDate.JobCanOutcome -eq 7 ) { "test" };
DateSubmitted = if ((Get-Random -Minimum 1 -Maximum 100) -gt 80) { (Get-Date $jobCandData.DateSelected).AddDays((Get-Random -Minimum 1 -Maximum 30)).ToString("yyyy/MM/dd") };
}
$jobCandData | Format-Table
}
This does not work, everything in the append call does not populate, here is a sample result:
Name Value
---- -----
DateSubmitted
CandNote
DateSelected 2018/03/16
JobCanOutcome 7
As far as I can see there is no difference. The second snippet of code is running in a foreach loop foreach (job in jobs ) { } these jobs are created from the first section of code. The two variables DateSubmitted and CandNote do not populate with or without this foreach loop.
Basically, I am just trying to understand what it is I am missing here and why it is not populating correctly.
It's a simple typo that Set-StrictMode -Version 1 (or higher) would have caught, as PetSerAl recommends in a comment on the question:
The original variable name is $jobCandData, with a final a, yet you try to refer to it as $jobCandDate, with a final e.
By default, with no strict mode in effect, a reference such as $jobCandDate.JobCanOutcome - i.e., an attempt to access a property of a nonexistent variable simply yields $null, which in your case means that conditional $jobCandDate.JobCanOutcome -eq 7 always returns $False and therefore never assigns a value to the CandNote hashtable entry.
By contrast, assigning to the DateSubmitted entry works fine - it's just that it will only assign a value if (Get-Random -Minimum 1 -Maximum 100) -gt 80 happens to be $True, which is on average only true about 19% of the time.
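A small sketch of how strict mode surfaces the typo. The hashtable is reduced to a single entry and the $caught flag is made up for the demonstration.

```powershell
Set-StrictMode -Version 1

$jobCandData = @{ JobCanOutcome = 7 }

try {
    # Typo on purpose: $jobCandDate (final e) was never assigned.
    $null = $jobCandDate.JobCanOutcome
    $caught = $false
} catch {
    # Under strict mode, referencing the unset variable raises an error instead of yielding $null.
    $caught = $true
}

Set-StrictMode -Off
```

Without the Set-StrictMode line, the same reference quietly evaluates to $null, which is why the original script ran without any visible error.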

Conditional Logic and Casting in Powershell

I'm creating a script that imports 2 columns of CSV data, sorts by one column cast as type int, and shows only the values between 0 and 10,000. I've been able to get up to the sorting part, and I am able to show only greater than 0. When I try to add "-and -lt 10000" various ways, I am unable to get any useful data. One attempt gave me the data as if it were string again, though.
This only gives me > 0 but sorts as type int. Half way there!:
PS C:\> $_ = Import-Csv .\vc2.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where {($_.Minutes -gt 0)}
This gives me 10000 > x > 0 but sorts as string:
PS C:\> $_ = Import-Csv .\vc.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where {($_.Minutes -gt 0) -and ($_.Minutes -lt 10)}
Here and here are where I tried recasting as int and it gave me many errors:
PS C:\> $_ = Import-Csv .\vc.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where {[int]{($_.Minutes -gt 0) -and ($_.Minutes -lt 10000)}}
PS C:\> $_ = Import-Csv .\vc.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where { ({[int]$_.Minutes} -gt 0) -and ({[int]$_.Minutes} -lt 10000) }
Error: Cannot convert the "($_.Minutes -gt 0) -and ($_.Minutes -lt 10000)" value of type "System.Management.Automation.ScriptBlock" to type "System.Int32".
What is the proper syntax for this?
PowerShell usually coerces arguments of binary operators to the type of the left operand. This means when doing $_.Minutes -gt 10 the 10 gets converted to a string, because the fields in a parsed CSV are always strings. You can either switch the operands around: 10 -lt $_.Minutes or add a cast: [int]$_.Minutes -gt 10 or +$_.Minutes -gt 10.
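The effect of the left operand's type can be seen with a single made-up row. The value '9' stands in for a Minutes field as Import-Csv would deliver it, that is, as a string.

```powershell
# A stand-in for one parsed CSV row; Minutes is a string, as with Import-Csv.
$row = [pscustomobject]@{ Minutes = '9' }

# Left operand is a string, so 10 becomes '10' and the comparison is textual.
$asString  = $row.Minutes -gt 10

# Cast the left operand: numeric comparison, 9 -gt 10.
$asInt     = [int]$row.Minutes -gt 10

# Number on the left: the right operand is converted to a number, 10 -lt 9.
$swapped   = 10 -lt $row.Minutes
```

Here $asString is $true because '9' sorts after '10' as text, while $asInt and $swapped are both $false.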
Usually, when dealing with CSVs that contain non-string data that I want to use as such, I tend to just add a post-processing step, e.g.:
Import-Csv ... | ForEach-Object {
$_.Minutes = [int]$_.Minutes
$_.Date = [datetime]$_.Date
...
}
Afterwards the data is much nicer to handle, without excessive casts and conversions.
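Put together for the question's case, the post-processing idea might look like the following sketch. The sample rows are invented, the data is inlined with ConvertFrom-Csv instead of read from vc.csv, and the converted object is emitted again with $_ so it continues down the pipeline.

```powershell
# Invented sample data with the two columns from the question.
$rows = @'
User_Name,Minutes
alice,250
bob,0
carol,12000
dave,9
'@ | ConvertFrom-Csv | ForEach-Object {
    $_.Minutes = [int]$_.Minutes   # convert once, up front
    $_                             # pass the updated row along
}

# No casts needed any more: both filter and sort work numerically.
$filtered = $rows |
    Where-Object { $_.Minutes -gt 0 -and $_.Minutes -lt 10000 } |
    Sort-Object Minutes
```

With this sample, $filtered contains dave (9) followed by alice (250).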
The problem is the use of the { and } brackets in the Where statement. Those are being interpreted as script blocks.
Where { ({[int]$_.Minutes} -gt 0) -and ({[int]$_.Minutes} -lt 10000) }
Try using ( and ) or excluding them altogether.
Where { (([int]$_.Minutes) -gt 0) -and (([int]$_.Minutes) -lt 10000) }
The way you're assigning values to $_ is also weird.
$_ represents the current value in the pipeline.
$list = @(1,2,3)
$list | foreach { $_ }
1
2
3
By assigning "$_" a value yourself, you lose that value as soon as you place it in the pipeline.
Try something like:
$mycsv = import-csv .\vc.csv; $mycsv | select ...etc

Output Values from gc into Hash Table

Trying to make a hash table with 2 categories: Users and Passwords.
This is my code thus far but the issue is the output only displays the command and does not execute it.
for ($i=1; $i -le 10; $i++){
$caps = [char[]] "ABCDEFGHJKMNPQRSTUVWXY"
$lows = [char[]] "abcdefghjkmnpqrstuvwxy"
$nums = [char[]] "2346789"
$spl = [char[]] "!@#$%^&*?+"
$first = $lows | Get-Random -count 1;
$second = $caps | Get-Random -count 1;
$third = $nums | Get-Random -count 1;
$forth = $lows | Get-Random -count 1;
$fifth = $spl | Get-Random -count 1;
$sixth = $caps | Get-Random -count 1;
$pwd = [string](@($first) + @($second) + @($third) + @($forth) + @($fifth) + @($sixth))
Out-File C:\Users\Administrator\Documents\L8_userpasswords.txt -InputObject $pwd -Append
}
$here = @'
$users=Get-Content C:\\Users\\Administrator\\Desktop\\L8_users.txt
$passwords=Get-Content C:\\Users\\Administrator\\Documents\\L8_userpasswords.txt
'@
convertfrom-stringdata -stringdata $here
This is the output I am getting:
PS C:\Users\Administrator> C:\Users\Administrator\Documents\l8.ps1
Name Value
---- -----
$users Get-Content C:\Users\Administrator\Desktop\Lab8_users.txt
$passwords Get-Content C:\Users\Administrator\Documents\L8_userpasswords.txt
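The output above is what ConvertFrom-StringData is expected to produce: a single-quoted here-string is literal text, and the cmdlet only splits each key=value line into a hashtable entry without evaluating anything. A minimal sketch with made-up keys:

```powershell
# Single-quoted here-string: nothing inside is expanded or executed.
$here = @'
users=Get-Date
answer=1 + 1
'@

$table = ConvertFrom-StringData -StringData $here

# Both values are plain strings: the command name and the text '1 + 1'.
$usersValue  = $table['users']
$answerValue = $table['answer']
```

To get the results of the commands into a hashtable, the commands have to run in normal PowerShell code and their output be assigned to the entries.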
I think you want this, which will turn the list of users and passwords into a HashTable, and then cast it to a PSCustomObject, which will have two properties: Users and Passwords.
$Data = [PSCustomObject]@{
Users = Get-Content -Path C:\Users\Administrator\Desktop\L8_users.txt;
Passwords = Get-Content -Path C:\Users\Administrator\Desktop\L8_userpasswords.txt;
}
$Data;
Or hey, you could probably just replace the entire script with a one liner:
GC C:\Users\Administrator\Desktop\L8_users.txt|%{[PSCustomObject]@{User=$_;Password=[System.Web.Security.Membership]::GeneratePassword(10,3)}}
Unless you are super attached to your password generation loop that is. [System.Web.Security.Membership]::GeneratePassword(X,Y) will generate complex passwords where X is the length and Y is the number of special characters (the rest will be a random mix of upper case letters, lower case letters, and numbers). So in my code (10,3) is a 10 character password with 3 non-alphanumeric characters.
You want it saved to a file? Pipe that to Export-CSV. Or assign it to a variable by prefixing it with something like $UserList = <code>.
Or if you really, really want a Hash Table you could make an empty one and then alter it just a little to add each pair to the table like this:
$UserList = @{}
GC C:\Users\Administrator\Desktop\L8_users.txt|%{$UserList.add($_,[System.Web.Security.Membership]::GeneratePassword(10,3))}
Assuming that L8_users.txt and L8_userpasswords.txt contain the same number of items, you could do something like this:
$users = Get-Content 'C:\Users\Administrator\Desktop\L8_users.txt'
$passwords = Get-Content 'C:\Users\Administrator\Documents\L8_userpasswords.txt'
$userpasswords = @{}
for ($i = 0; $i -lt $users.Length; $i++) {
$userpasswords[$users[$i]] = $passwords[$i]
}
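A slightly more defensive sketch of the same pairing, with inline arrays standing in for the two Get-Content calls. The sample names and passwords are invented, and [ordered] is used only so the entries keep the file order.

```powershell
# Stand-ins for the Get-Content results. Wrapping in @() matters with real files:
# a one-line file yields a single string, whose .Length is its character count.
$users     = @('alice', 'bob', 'carol')
$passwords = @('aB3c!D', 'xY7z?Q', 'mN4p+R')

# Make the "same number of items" assumption explicit.
if ($users.Count -ne $passwords.Count) {
    throw 'User and password lists differ in length.'
}

$userpasswords = [ordered]@{}
for ($i = 0; $i -lt $users.Count; $i++) {
    $userpasswords[$users[$i]] = $passwords[$i]
}
```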