I'm looping through a CSV file with a ForEach-Object loop to grab info and update the in_stock status on WooCommerce, but WooCommerce only sees one entry. I'm not a programmer; I'm still learning PowerShell, and for the life of me I can't understand the logic of loops and their output. I know it reads the entries in the CSV, but I think it's just overwriting the previous entry.
Another issue I'm having is setting in_stock to true or false for each object individually: if one is true, then all false entries are also set to true. I can't seem to figure out how to assign true | false correctly.
I've been looking up PowerShell using the MS docs on it and how to append hashtables but I'm still not finding the answers or examples that will point me in the right direction. I've gone so far as to purchase PowerShell tutorials offsite and still haven't found a way to do this properly.
$website = "https://www.mywebsite.com"
$params += @{
type= @();
name = @();
SKU = @();
catalog_visibility = @();
regular_price = @();
in_stock = @();
categories = @();
}
$csv = Import-Csv C:\test\test-upload.csv
$csv | Select-Object -Property Type, SKU, Name, 'Visibility in catalog',
'Tax status', 'In stock?', Stock, 'Backorders allowed?',
'Allow customer reviews?', 'Regular price', Categories | ForEach-Object{
$params.type += $_.type
$params.SKU += $_.SKU
$params.name += $_.name
$params.catalog_visibility += $_.'Visibility in catalog'
$params.categories += $_.Categories
$params.regular_price += $_.'Regular price'
$params.in_stock += $_.'In stock?'
if ($params.in_stock = 0) {$params.in_stock -replace 0, $false}
elseif($params.in_stock = 1) {$params.in_stock -replace 1, $true}
}
foreach($key in $params.keys){
Write-Output $params[$key]
}
I'm looking to get something like this
{
"name": "part 1",
"type": "simple",
"SKU": "0001",
"regular_price": "21.99",
"in_stock: false",
"categories: category 1",
"catalog_visibility": "hidden",
},
{
"name": "part 2",
"type": "simple",
"SKU": "0002",
"regular_price": "11.99",
"in_stock: true",
"categories: category 2",
"catalog_visibility": "hidden",
}
and what I am actually getting is
{
"name": "part 1 part 2",
"type": "simple simple ",
"SKU": "0001 0002",
"regular_price": "21.99 11.99",
"in_stock: true true",
"categories: category 1 category 1",
"catalog_visibility": "hidden hidden",
}
I would really appreciate it if someone could point me in the right direction and give me a few tips on best practice
Since you're new to programming let's talk a little bit about arrays and hashtables.
Arrays are like lists (sometimes they are called lists too), specifically, ordered lists by position.
Hashtables are a type of dictionary, whereby you have a Key that corresponds to a Value.
In PowerShell the syntax you're using for creating an array is @() (that one's empty, it could contain items) and the syntax you use for creating a hashtable is @{} (also empty, could contain values).
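To make the difference concrete, here is a minimal sketch of both literals with some items in them. The variable names and values are made up for illustration and are not part of your script.

```powershell
# An array: an ordered list, accessed by position (starting at 0)
$fruits = @('apple', 'orange', 'pear')
$fruits[1]            # orange
$fruits.Count         # 3

# A hashtable: keys mapped to values, accessed by key
$part = @{ SKU = '0001'; name = 'part 1' }
$part['SKU']          # 0001
$part.name            # part 1 (dot notation also works for keys)
```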
You don't show your initial definition of $params, but based on the rest of the code I'm going to assume it's like this:
$params = @()
Then, you have this:
$params += @{
type= @();
name = @();
SKU = @();
catalog_visibility = @();
regular_price = @();
in_stock = @();
categories = @();
}
So what this would mean is that you took your array, $params, and added a new item to it. The new item is the hashtable literal you defined here. All the names you added, like type, name, SKU, etc. are Keys.
According to your desired output, it does look like you want an array of hashtables, so I think that part is correct.
But note that the values you assigned to them are all empty arrays. This is curious because what you showed as your desired output has each hashtable with those keys being singular values, so I think that's one issue, and in fact it's clouding the area where the problem really is.
So let's skip ahead to the body of the loop, where you use this pattern:
$params.type += $_.type
$params.SKU += $_.SKU
$params.name += $_.name
$params.catalog_visibility += $_.'Visibility in catalog'
$params.categories += $_.Categories
$params.regular_price += $_.'Regular price'
$params.in_stock += $_.'In stock?'
Remember that $params is an array, so you should have items in it starting at position 0, like $params[0], $params[1], etc. To change the SKU of the second hashtable in the array, you'd use $params[1].SKU or $params[1]['SKU'].
But what you're doing is just $params.SKU. In many languages, and indeed in PowerShell before v3, this would throw an error. The array itself doesn't have a property named SKU. In PowerShell v3 though the dot . operator was enhanced to allow it to introspect into an array and return each item's property with the given name, that is:
$a = @('apple','orange','pear')
$a.PadLeft(10,'~')
is the same as if we had done:
$a = @('apple','orange','pear')
$a | ForEach-Object { $_.PadLeft(10,'~') }
It's very useful but might be confusing you here.
So back to your object, $params is an array with, so far, only a single hashtable in it. And in your loop you aren't adding anything to $params.
Instead you ask for $params.SKU, which in this case will be the SKU of every hashtable in the array, but there's only one hashtable, so you only get one SKU.
Then you add to the SKU itself:
$params.SKU += $_.SKU
Here's the part where setting SKU initially to an empty array is hiding your issue. If SKU were a string, += would simply glue the new text onto the old, which would look wrong right away; but since it's an array, you're taking this new value and appending it to the array of SKUs stored as the value in the single hashtable you're working against.
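The effect can be reproduced in isolation. This is a minimal sketch that uses one hashtable directly and two made-up rows in place of your CSV; $rows and $single are hypothetical names.

```powershell
# Hypothetical sample rows standing in for the CSV
$rows = @(
    [pscustomobject]@{ SKU = '0001' }
    [pscustomobject]@{ SKU = '0002' }
)

$single = @{ SKU = @() }     # one hashtable whose value is an array
foreach ($row in $rows) {
    $single.SKU += $row.SKU  # appends to the same array every time
}

$single.SKU.Count            # 2: both SKUs ended up in one entry
"$($single.SKU)"             # 0001 0002
```

That matches the "0001 0002" you are seeing: there is only ever one entry, and it keeps growing.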
Where to go from here
don't use arrays for your values in this case
create a new hashtable in each iteration of your loop, then add that new hashtable to the $params array
Let's take a look:
$params = @()
$csv = Import-Csv C:\test\test-upload.csv
$csv | Select-Object -Property Type, SKU, Name, 'Visibility in catalog',
'Tax status', 'In stock?', Stock, 'Backorders allowed?',
'Allow customer reviews?', 'Regular price', Categories | ForEach-Object {
$params += @{ # new hashtable here
type = $_.type
SKU = $_.SKU
name = $_.name
catalog_visibility = $_.'Visibility in catalog'
categories = $_.Categories
regular_price = $_.'Regular price'
}
}
This is the main problem you have. I left out the in-stock part because I'm going to explain that logic separately.
$params.in_stock = $_.'In stock?'
if ($params.in_stock = 0) {$params.in_stock -replace 0, $false}
elseif($params.in_stock = 1) {$params.in_stock -replace 1, $true}
}
It looks like your CSV has an In stock? column that can be 0 or 1 for false/true.
First thing I'll address is that = in PowerShell is always assignment. Testing for equality is -eq, so:
$params.in_stock = $_.'In stock?'
if ($params.in_stock -eq 0) {$params.in_stock -replace 0, $false}
elseif($params.in_stock -eq 1) {$params.in_stock -replace 1, $true}
}
Next, let's talk about true/false values; they're called Boolean or bool for short, and you should usually use this data type to represent them. Any time you do a comparison for example like $a -eq 5 you're returning a bool.
There's strong support for converting other types to bool, for instance if you want to evaluate a number as bool, 0 is false, and all other values are true. For strings, a $null value or an empty string is false, all other values are true. Note that if you have a string "0" that is true because the string has a value.
That also means that the number 0 is not the same as the string '0', but PowerShell does attempt to do conversions between types, usually trying to convert the right side's type to the left side for comparison, so PowerShell will tell you 0 -eq '0' is true (same with '0' -eq 0).
And for your situation, reading from a CSV, those values will end up as strings, but because of the above, your equality tests will work anyway (it's just worth knowing the details).
The issue with your use of -replace though, is that it's a string operation, so even if it works, you're going to end up with the string representation of a boolean, not the actual bool, even though you said to use $true and $false directly (and this is again because of type conversion; -replace needs a string there, PowerShell converts your bool to string to satisfy it).
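A quick way to see this for yourself is the sketch below; $value and $result are made-up names, and '1' stands in for a value read from the CSV.

```powershell
$value  = '1'
$result = $value -replace 1, $true   # both operands are converted to strings
$result                   # True
$result.GetType().Name    # String
[bool]'False'             # True, because any non-empty string is truthy
```

The last line is why a string that merely looks like a boolean is risky: the text 'False' still converts to $true.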
So, after that long-winded explanation, what makes sense then is this:
$params.in_stock = $_.'In stock?'
if ($params.in_stock -eq 0) {
$params.in_stock = $false
} elseif($params.in_stock -eq 1) {
$params.in_stock = $true
}
in fact, the elseif isn't necessary since you can only have 2 values:
$params.in_stock = $_.'In stock?'
if ($params.in_stock -eq 0) {
$params.in_stock = $false
} else {
$params.in_stock = $true
}
Even further though, we can use conversions to not need a conditional at all. Remember what I said about converting strings to numbers, and numbers to bool.
0 -as [bool] # gives false
"0" -as [bool] # gives true (whoops)
"0" -as [int] # gives the number 0
"0" -as [int] -as [bool] # false!
Now, we can do this:
$params.in_stock = $_.'In stock?' -as [int] -as [bool]
cool! Let's put it back into the other code:
$params = @()
$csv = Import-Csv C:\test\test-upload.csv
$csv | Select-Object -Property Type, SKU, Name, 'Visibility in catalog',
'Tax status', 'In stock?', Stock, 'Backorders allowed?',
'Allow customer reviews?', 'Regular price', Categories | ForEach-Object {
$params += @{ # new hashtable here
type = $_.type
SKU = $_.SKU
name = $_.name
catalog_visibility = $_.'Visibility in catalog'
categories = $_.Categories
regular_price = $_.'Regular price'
in_stock = $_.'In stock?' -as [int] -as [bool]
}
}
Deeper dive!
Piping: you're doing some calls like the Import-Csv call and assigning its output to a variable, then piping that variable into another command. That's fine, it's not wrong, but you could also just pipe the first command's output directly into the second like so:
$params = @()
Import-Csv C:\test\test-upload.csv |
Select-Object -Property Type, SKU, Name, 'Visibility in catalog',
'Tax status', 'In stock?', Stock, 'Backorders allowed?',
'Allow customer reviews?', 'Regular price', Categories |
ForEach-Object {
$params += @{ # new hashtable here
type = $_.type
SKU = $_.SKU
name = $_.name
catalog_visibility = $_.'Visibility in catalog'
categories = $_.Categories
regular_price = $_.'Regular price'
in_stock = $_.'In stock?' -as [int] -as [bool]
}
}
I updated the formatting a little to show that you can use a line break after a pipe |, which can look a little cleaner.
About Select-Object: its purpose is to take objects with a certain set of properties and give you back new objects with a more limited set of properties (or sometimes with brand-new ones). It has other uses around changing the number of objects or filtering the array in other ways, which aren't relevant here.
But I bring this up because all the properties (columns) you're selecting are by name, and therefore must exist on the input object. And since you refer to each one directly later, as opposed to displaying the entire thing, there's no reason to use Select-Object to filter down the properties, so that entire call can be removed:
$params = @()
Import-Csv C:\test\test-upload.csv |
ForEach-Object {
$params += @{ # new hashtable here
type = $_.type
SKU = $_.SKU
name = $_.name
catalog_visibility = $_.'Visibility in catalog'
categories = $_.Categories
regular_price = $_.'Regular price'
in_stock = $_.'In stock?' -as [int] -as [bool]
}
}
Nice! Looking slim.
About arrays and +=. This is ok in most cases to be honest, but you should know that each time you do this, in reality a new array is being created and all of the original items plus the new item are being copied into it. This doesn't scale, but again it's fine in most use cases.
What you should also know is that the output from a pipeline (like any command, or your main script code, or the body of ForEach-Object) is all sent to the next command in the pipeline (or back out the left side if there's nothing else). This can be any number of items, and you can use assignment to get all of those values, like:
$a = Get-ChildItem $env:HOME # get all of items in the directory
$a will be an array if there's more than one item, and during processing it doesn't continually create and destroy arrays.
So how is this relevant to you? It means you don't have to make $params an empty array and append to it, just return your new hashtables in each loop iteration, and then assign the output of your pipeline right to $params!
$params = Import-Csv C:\test\test-upload.csv |
ForEach-Object {
@{ # new hashtable here
type = $_.type
SKU = $_.SKU
name = $_.name
catalog_visibility = $_.'Visibility in catalog'
categories = $_.Categories
regular_price = $_.'Regular price'
in_stock = $_.'In stock?' -as [int] -as [bool]
} # output is implicit
}
And now we've got your script down to a single pipeline (you could make it a single line but I prefer multi-line formatting).
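Since your desired output looks like JSON, here is a hedged sketch of how the result could be checked and serialized. The sample rows are made up to stand in for Import-Csv output, [ordered] is my addition to keep the keys in a predictable order, and I'm only assuming that JSON is what you want to send; the exact payload WooCommerce expects is not something I can confirm from your question.

```powershell
# Hypothetical rows standing in for Import-Csv output (all values are strings)
$rows = @(
    [pscustomobject]@{ Name = 'part 1'; SKU = '0001'; 'In stock?' = '0' }
    [pscustomobject]@{ Name = 'part 2'; SKU = '0002'; 'In stock?' = '1' }
)

$params = $rows | ForEach-Object {
    [ordered]@{                      # [ordered] keeps the keys in this order
        name     = $_.Name
        SKU      = $_.SKU
        in_stock = $_.'In stock?' -as [int] -as [bool]
    }
}

$params.Count              # 2: one hashtable per row
$params[0].in_stock        # False
$json = $params | ConvertTo-Json   # a JSON array with one object per row
$json
```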
So what you are doing is a lot of += to try and create an array, but you're doing it at the wrong level. What you want to do is create a hashtable (or quite possibly a PSCustomObject) for each item in the CSV, and capture them as an array of objects (be they hashtable objects, or PSCustomObject objects). So, let's try and restructure things a little to do that. I'm ditching the template, we don't care, we're defining it for each object anyway. I'm going to output a hashtable for each item in the ForEach-Object loop, and capture it in $params. This should give you the results you want.
$website = "https://www.mywebsite.com"
$csv = Import-Csv C:\test\test-upload.csv
$params = $csv | Select-Object -Property Type, SKU, Name, 'Visibility in catalog', 'Tax status', 'In stock?', Stock, 'Backorders allowed?', 'Allow customer reviews?', 'Regular price', Categories | ForEach-Object{
@{
type = $_.type
SKU = $_.SKU
name = $_.name
catalog_visibility = $_.'Visibility in catalog'
categories = $_.Categories
regular_price = $_.'Regular price'
in_stock = [boolean][int]($_.'In stock?')
}
}
Related
I am trying to write a script that downloads a website's information. I am able to download the information, but I cannot seem to get the filtering working. I have a series of values that I want skipped stored in $TakeOut, but it does not recognize the values in the if -eq $TakeOut. I have to write a line for each value.
What I am wondering is, if there is a way to use a $value as over time there will be a considerable amount of values to skip.
This works but is not practical in the long run.
if ($R.innerText -eq "Home") {Continue}
Something like this would be preferable.
if ($R.innerText -eq $TakeOut) {Continue}
Here is a sample of my code.
#List of values to skip
$TakeOut = @()
$TakeOut = (
"Help",
"Home",
"News",
"Sports",
"Terms of use",
"Travel",
"Video",
"Weather"
)
#Retrieve website information
$Results = ((Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links)
#Filter and format to new table of values
$objects = @()
foreach($R in $Results) {
if ($R.innerText -eq $TakeOut) {Continue}
$objects += New-Object -Type PSObject -Prop @{'InnerText'= $R.InnerText;'href'=$R.href;'Title'=$R.href.split('/')[4]}
}
#output to file
$objects | ConvertTo-HTML -As Table -Fragment | Out-String >> $list_F
You cannot meaningfully use an array as the RHS of an -eq operation (the array will be implicitly stringified, which won't work as intended).
PowerShell has operators -contains and -in to test membership of a value in an array (using -eq on a per-element basis - see this answer for background); therefore:
if ($R.innerText -in $TakeOut) {Continue}
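The difference between the operators is easy to check at the prompt. A minimal sketch with a shortened, made-up skip list:

```powershell
$TakeOut = 'Help', 'Home', 'News'

$eqResult       = 'Home' -eq $TakeOut        # False: the array is turned into the string "Help Home News"
$inResult       = 'Home' -in $TakeOut        # True
$containsResult = $TakeOut -contains 'Home'  # True: same test with the operands swapped
$lowerResult    = 'home' -in $TakeOut        # True: comparison is case-insensitive by default

$eqResult, $inResult, $containsResult, $lowerResult
```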
Generally, your code can be streamlined (PSv3+ syntax):
$TakeOut =
"Help",
"Home",
"News",
"Sports",
"Terms of use",
"Travel",
"Video",
"Weather"
#Retrieve website information
$Results = (Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links
#Filter and format to new table of values
$objects = foreach($R in $Results) {
if ($R.innerText -in $TakeOut) {Continue}
[pscustomobject] @{
InnerText = $R.InnerText
href = $R.href
Title = $R.href.split('/')[4]
}
}
#output to file
$objects | ConvertTo-HTML -As Table -Fragment >> $list_F
Note the absence of @(...), which is never needed for array literals.
Building an array in a loop with += is slow (and verbose); simply use the foreach statement as an expression, which returns the loop body's outputs as an array.
[pscustomobject] @{ ... } is PSv3+ syntactic sugar for constructing custom objects; in addition to being faster than a New-Object call, it has the added advantage of preserving property order.
You could write the whole thing as a single pipeline:
#Retrieve website information
(Invoke-WebRequest -Uri "https://www.msn.com/en-ca/").Links | ForEach-Object {
#Filter and format to new table of values
if ($_.innerText -in $TakeOut) {return}
[pscustomobject] @{
InnerText = $_.InnerText
href = $_.href
Title = $_.href.split('/')[4]
}
} | ConvertTo-HTML -As Table -Fragment >> $list_F
Note the need to use return instead of continue to move on to the next input.
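The two skip mechanisms can be compared side by side. This is a small sketch with made-up input ('a', 'b', 'c') and a made-up $skip value rather than real links:

```powershell
$skip = 'b'

# foreach statement: continue jumps to the next item
$fromLoop = foreach ($item in 'a', 'b', 'c') {
    if ($item -in $skip) { continue }
    $item
}

# ForEach-Object: the script block runs once per input, so return ends only this run
$fromPipeline = 'a', 'b', 'c' | ForEach-Object {
    if ($_ -in $skip) { return }
    $_
}

"$fromLoop"       # a c
"$fromPipeline"   # a c
```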
I've got the following:
# Compare the 2 lists and return the ones that exist in the top and children (meaning they're redundant).
$redundantUsers = $usersAssignedToThisGroup | where { $usersAssignedToGroupsInThisGroup -contains $_ }
# Build the results to output
$results += New-Object PSObject -property @{
group = $group.name #group assigned out of scope
users = @($redundantUsers)
}
I'm expecting to be able to call my script like:
$users = ./MyScript -myParam "something" | Select users
Then I'm expecting to be able to type $users[1] in the console and get the first element of that array.
But instead, I'm having to do $users[0].users[1].
How do I change my data and or call to do what I want?
I'm essentially just trying to let the script return the data in a useful way.
Let's say I have an object like this:
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
}
and I want to end up with:
$test = @{
THIS_IS_THE_FIRST_COLUMN = "ValueInFirstColumn";
THIS_IS_THE_SECOND_COLUMN = "ValueInSecondColumn"
}
without manually coding the new column names.
This shows me the values I want:
$test.PsObject.Properties | where-object { $_.Name -eq "Keys" } | select -expand value | foreach{ ($_.substring(0,1).toupper() + $_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()} | Out-Host
which results in:
THIS_IS_THE_FIRST_COLUMN
THIS_IS_THE_SECOND_COLUMN
but now I can't seem to figure out how to assign these new values back to the object.
You can modify hashtable $test in place as follows:
foreach($key in @($test.Keys)) { # !! @(...) is required - see below.
$value = $test[$key] # save value
$test.Remove($key) # remove old entry
# Recreate the entry with the transformed name.
$test[($key -creplace '(?<!^)\p{Lu}', '_$&').ToUpper()] = $value
}
@($test.Keys) creates an array from the existing hashtable keys; @(...) ensures that the key collection is copied to a static array, because using the .Keys property directly in a loop that modifies the same hashtable would break.
The loop body saves the value for the input key at hand and then removes the entry under its old name.[1]
The entry is then recreated under its new key name using the desired name transformation:
$key -creplace '(?<!^)\p{Lu}', '_$&' matches every uppercase letter (\p{Lu}) in a given key, except at the start of the string ((?<!^)), and replaces it with _ followed by that letter (_$&); converting the result to uppercase (.ToUpper()) yields the desired name.
[1] Removing the old entry before adding the renamed one avoids problems with single-word names such as Simplest, whose transformed name, SIMPLEST, is considered the same name due to the case-insensitivity of hashtables in PowerShell. Thus, assigning a value to entry SIMPLEST while entry Simplest still exists actually targets the existing entry, and the subsequent $test.Remove($key) would then simply remove that entry, without having added a new one.
Tip of the hat to JosefZ for pointing out the problem.
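To see what the name transformation does on its own, here is a small sketch that applies it to a few sample names. Only the first name comes from the question; 'Simplest' and 'UPN' are extra test inputs I picked.

```powershell
$names = 'ThisIsTheFirstColumn', 'Simplest', 'UPN'

$converted = foreach ($name in $names) {
    # Prefix every uppercase letter except the first character with _, then upper-case everything
    ($name -creplace '(?<!^)\p{Lu}', '_$&').ToUpper()
}

$converted
# THIS_IS_THE_FIRST_COLUMN
# SIMPLEST
# U_P_N   (runs of capitals get split letter by letter)
```

The last case is worth knowing about if any of your keys contain acronyms.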
I wonder if it is possible to do it in place on the original object?
($test.PsObject.Properties|Where-Object {$_.Name -eq "Keys"}).IsSettable says False. Hence, you need to do it in two steps as follows:
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
}
$auxarr = $test.PsObject.Properties |
Where-Object { $_.Name -eq "Keys" } |
select -ExpandProperty value
$auxarr | ForEach-Object {
$aux = ($_.substring(0,1).toupper() +
$_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()
$test.ADD( $aux, $test.$_)
$test.Remove( $_)
}
$test
A two-step approach is necessary because calling the Remove and Add methods in a single pipeline leads to the following error:
select : Collection was modified; enumeration operation may not execute.
Edit. Unfortunately, the above solution would fail in the case of a one-word PascalCase key, e.g. for Simplest = "ValueInSimplest". Here's the improved script:
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
Simplest = "ValueInSimplest" # the simplest (one word) PascalCase
}
$auxarr = $test.PsObject.Properties |
Where-Object { $_.Name -eq "Keys" } |
select -ExpandProperty value
$auxarr | ForEach-Object {
$aux = ($_.substring(0,1).toupper() +
$_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()
$newvalue = $test.$_
$test.Remove( $_)
$test.Add( $aux, $newvalue)
}
$test
This seems to work. I ended up putting stuff in a new hashtable, though.
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
}
$test2=@{}
$test.PsObject.Properties |
where-object { $_.Name -eq "Keys" } |
select -expand value | foreach{ $originalPropertyName=$_
$prop=($_.substring(0,1).toupper() + $_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()
$test2.Add($prop,$test[$originalPropertyName])
}
$test2
I am retrieving two CSVs from an API, one called students.csv similar to:
StudentNo,PreferredFirstnames,PreferredSurname,UPN
111, john, smith, john@email.com
222, jane, doe, jane@email.com
one called rooms.csv:
roomName, roomNo, students
room1, 1, {@{StudentNo=111; StudentName=john smith; StartDate=2018-01-01T00:00:00; EndDate=2018-07-06T00:00:00},....
room2, 2,{@{StudentNo=222; StudentName=jane doe; StartDate=2018-01-01T00:00:00; EndDate=2018-07-06T00:00:00},...
The third column in rooms.csv is an array as provided by the API
What is the best way to consolidate the two into
StudentNo,PreferredFirstnames,PreferredSurname,UPN, roomName
111, john, smith, john@email.com, room1
222, jane, doe, jane@email.com, room2
I'm thinking something like...
$rooms = Import-Csv rooms.csv
$students = Import-Csv students.csv
$combined = $students | select-object StudentNo,PreferredSurname,PreferredFirstnames,UPN,
@{Name="roomName";Expression={ ForEach ($r in $rooms) {
if ($r.Students.StudentNo.Contains($_.StudentNo) -eq "True")
{return $r.roomName}}}}
This works, but is the foreach the right way to go? Am I mixing things up, or is there a more efficient way?
--- Original Post ---
With all of this information I need to compare the student data and update AzureAD and then compile a list of data including first name, last name, upn, room and others that are retrieved from AzureAD.
My issue is "efficiency". I have code that mostly works but it takes hours to run. Currently I am looping through students.csv and then for each student looping through rooms.csv to find the room they're in, and obviously waiting for multiple api calls in-between all this.
What is the most efficient way to find the room for each student? Is importing the CSV as a custom PSObject comparable to using hash tables?
I was able to get your proposed code to work but it requires some tweaks to the code and data:
There must be some additional step where you are deserializing the students column of rooms.csv to a collection of objects. It appears to be a ScriptBlock that evaluates to an array of HashTables, but some changes to the CSV input are still needed:
The StartDate and EndDate properties need to be quoted and cast to [DateTime].
At least for rooms that contain multiple students, the value must be quoted so Import-Csv doesn't interpret the , separating array elements as an additional column.
The downside of using CSV as an intermediate format is the original property types are lost; everything becomes a [String] upon import. Sometimes it's desirable to cast back to the original type for efficiency purposes, and sometimes it's absolutely necessary in order for certain operations to work. You could cast those properties every time you use them, but I prefer to cast them once immediately after import.
With those changes rooms.csv becomes...
roomName, roomNo, students
room1, 1, "{@{StudentNo=111; StudentName='john smith'; StartDate=[DateTime] '2018-01-01T00:00:00'; EndDate=[DateTime] '2018-07-06T00:00:00'}}"
room2, 2, "{@{StudentNo=222; StudentName='jane doe'; StartDate=[DateTime] '2018-01-01T00:00:00'; EndDate=[DateTime] '2018-07-06T00:00:00'}}"
...and the script becomes...
# Replace the [String] property "students" with an array of [HashTable] property "Students"
$rooms = Import-Csv rooms.csv `
| Select-Object `
-ExcludeProperty 'students' `
-Property '*', @{
Name = 'Students'
Expression = {
$studentsText = $_.students
$studentsScriptBlock = Invoke-Expression -Command $studentsText
$studentsArray = @(& $studentsScriptBlock)
return $studentsArray
}
}
# Replace the [String] property "StudentNo" with an [Int32] property of the same name
$students = Import-Csv students.csv `
| Select-Object `
-ExcludeProperty 'StudentNo' `
-Property '*', @{
Name = 'StudentNo'
Expression = { [Int32] $_.StudentNo }
}
$combined = $students `
| Select-Object -Property `
'StudentNo', `
'PreferredSurname', `
'PreferredFirstnames', `
'UPN', `
@{
Name = "roomName";
Expression = {
foreach ($r in $rooms)
{
if ($r.Students.StudentNo -contains $_.StudentNo)
{
return $r.roomName
}
}
#TODO: Return text indicating room not found?
}
}
The reason this can be slow is because you are performing a linear search - two of them, in fact - for every student object; first through the collection of rooms (foreach), then through the collection of students in each room (-contains). This can quickly turn into a lot of iterations and equality comparisons because in every room to which the current student is not assigned you are iterating the entire collection of that room's students, on and on until you do find the room for that student.
One easy optimization you can make when performing a linear search is to sort the items you're searching (in this case, the Students property will be ordered by the StudentNo property of each student)...
# Replace the [String] property "students" with an array of [HashTable] property "Students"
$rooms = Import-Csv rooms.csv `
| Select-Object `
-ExcludeProperty 'students' `
-Property '*', @{
Name = 'Students'
Expression = {
$studentsText = $_.students
$studentsScriptBlock = Invoke-Expression -Command $studentsText
$studentsArray = @(& $studentsScriptBlock) `
| Sort-Object -Property @{ Expression = { $_.StudentNo } }
return $studentsArray
}
}
...and then when you're searching that same collection if you come across an item that is greater than the one you're searching for you know the remainder of the collection can't possibly contain what you're searching for and you can immediately abort the search...
@{
Name = "roomName";
Expression = {
foreach ($r in $rooms)
{
# Requires $r.Students to be sorted by StudentNo
foreach ($roomStudentNo in $r.Students.StudentNo)
{
if ($roomStudentNo -eq $_.StudentNo)
{
# Return the matched room name and stop searching this and further rooms
return $r.roomName
}
elseif ($roomStudentNo -gt $_.StudentNo)
{
# Stop searching this room
break
}
# $roomStudentNo is less than $_.StudentNo; keep searching this room
}
}
#TODO: Return text indicating room not found?
}
}
Better yet, with a sorted collection you can also perform a binary search, which is faster than a linear search*. The Array class already provides a BinarySearch static method, so we can accomplish this in less code, too...
@{
Name = "roomName";
Expression = {
foreach ($r in $rooms)
{
# Requires $r.Students to be sorted by StudentNo
if ([Array]::BinarySearch($r.Students.StudentNo, $_.StudentNo) -ge 0)
{
return $r.roomName
}
}
#TODO: Return text indicating room not found?
}
}
The way I would approach this problem, however, is to use a [HashTable] mapping a StudentNo to a room. There is a little preprocessing required to build the [HashTable] but this will provide constant-time lookups when retrieving the room for a student.
function GetRoomsByStudentNoTable()
{
$table = @{ }
foreach ($room in $rooms)
{
foreach ($student in $room.Students)
{
#NOTE: It is assumed each student belongs to at most one room
$table[$student.StudentNo] = $room
}
}
return $table
}
# Replace the [String] property "students" with an array of [HashTable] property "Students"
$rooms = Import-Csv rooms.csv `
| Select-Object `
-ExcludeProperty 'students' `
-Property '*', @{
Name = 'Students'
Expression = {
$studentsText = $_.students
$studentsScriptBlock = Invoke-Expression -Command $studentsText
$studentsArray = @(& $studentsScriptBlock)
return $studentsArray
}
}
# Replace the [String] property "StudentNo" with an [Int32] property of the same name
$students = Import-Csv students.csv `
| Select-Object `
-ExcludeProperty 'StudentNo' `
-Property '*', @{
Name = 'StudentNo'
Expression = { [Int32] $_.StudentNo }
}
$roomsByStudentNo = GetRoomsByStudentNoTable
$combined = $students `
| Select-Object -Property `
'StudentNo', `
'PreferredSurname', `
'PreferredFirstnames', `
'UPN', `
@{
Name = "roomName";
Expression = {
$room = $roomsByStudentNo[$_.StudentNo]
if ($room -ne $null)
{
return $room.roomName
}
#TODO: Return text indicating room not found?
}
}
You can ameliorate the hit of building $roomsByStudentNo by doing so at the same time as importing rooms.csv...
# Replace the [String] property "students" with an array of [HashTable] property "Students"
$rooms = Import-Csv rooms.csv `
| Select-Object `
-ExcludeProperty 'students' `
-Property '*', @{
Name = 'Students'
Expression = {
$studentsText = $_.students
$studentsScriptBlock = Invoke-Expression -Command $studentsText
$studentsArray = @(& $studentsScriptBlock)
return $studentsArray
}
} `
| ForEach-Object -Begin {
$roomsByStudentNo = @{ }
} -Process {
foreach ($student in $_.Students)
{
#NOTE: It is assumed each student belongs to at most one room
$roomsByStudentNo[$student.StudentNo] = $_
}
return $_
}
*Except for on small arrays
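If you want to try the hashtable idea without the CSV files, the following is a self-contained sketch. The in-memory $rooms and $students are made-up stand-ins for the imported data (student 333 is invented), and it assumes, as above, that each student belongs to at most one room.

```powershell
# Hypothetical in-memory data standing in for the two imported CSV files
$rooms = @(
    [pscustomobject]@{ roomName = 'room1'; Students = @(@{ StudentNo = 111 }, @{ StudentNo = 333 }) }
    [pscustomobject]@{ roomName = 'room2'; Students = @(@{ StudentNo = 222 }) }
)
$students = @(
    [pscustomobject]@{ StudentNo = 111; UPN = 'john@email.com' }
    [pscustomobject]@{ StudentNo = 222; UPN = 'jane@email.com' }
)

# Build the index once: StudentNo -> room
$roomsByStudentNo = @{}
foreach ($room in $rooms) {
    foreach ($student in $room.Students) {
        $roomsByStudentNo[$student.StudentNo] = $room
    }
}

# Each lookup is now a single keyed access instead of a nested loop
$combined = foreach ($s in $students) {
    [pscustomobject]@{
        StudentNo = $s.StudentNo
        UPN       = $s.UPN
        roomName  = $roomsByStudentNo[$s.StudentNo].roomName
    }
}

$combined[1].roomName   # room2
```

Note that hashtable keys are type-sensitive: the number 111 and the string '111' are different keys, which is one reason to cast StudentNo to [Int32] on both sides right after import.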
I have a way of doing arrays in other languages like this:
$x = "David"
$arr = @()
$arr[$x]["TSHIRTS"]["SIZE"] = "M"
This generates an error.
You are trying to create an associative array (hash). Try out the following sequence of commands:
$arr=@{}
$arr["david"] = @{}
$arr["david"]["TSHIRTS"] = @{}
$arr["david"]["TSHIRTS"]["SIZE"] ="M"
$arr.david.tshirts.size
Note the difference between hashes and arrays
$a = @{} # hash
$a = @() # array
Arrays are indexed by integer position only (negative numbers count back from the end); they cannot be indexed by a string such as "David".
from powershell.com:
PowerShell supports two types of multi-dimensional arrays: jagged arrays and true multidimensional arrays.
Jagged arrays are normal PowerShell arrays that store arrays as elements. This is very cost-effective storage because dimensions can be of different size:
$array1 = 1,2,(1,2,3),3
$array1[0]
$array1[1]
$array1[2]
$array1[2][0]
$array1[2][1]
True multi-dimensional arrays always resemble a square matrix. To create such an array, you will need to access .NET. The next line creates a two-dimensional array with 10 and 20 elements resembling a 10x20 matrix:
$array2 = New-Object 'object[,]' 10,20
$array2[4,8] = 'Hello'
$array2[9,16] = 'Test'
$array2
For a 3-dimensional array (10x20x10):
$array3 = New-Object 'object[,,]' 10,20,10
To extend what manojlds said above: you can nest hashtables. It may not be a true multi-dimensional array, but it might give you some ideas about how to structure the data. An example:
$hash = @{}
$computers | %{
$hash.Add(($_.Name),(@{
"Status" = ($_.Status)
"Date" = ($_.Date)
}))
}
What's cool about this is that you can reference things like:
($hash."Name1").Status
Also, it is far faster than arrays for finding stuff. I use this to compare data rather than use matching in Arrays.
$hash.ContainsKey("Name1")
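A self-contained sketch of the same idea follows. The `$computers` sample data is invented for illustration, and the indexer assignment is used instead of `.Add()` so that a repeated name overwrites the entry rather than throwing:

```powershell
# Hypothetical sample data standing in for $computers.
$computers = @(
    [PSCustomObject]@{ Name = 'Name1'; Status = 'Online';  Date = '2020-01-01' }
    [PSCustomObject]@{ Name = 'Name2'; Status = 'Offline'; Date = '2020-01-02' }
)

$hash = @{}
foreach ($computer in $computers) {
    # Indexer assignment overwrites an existing key; .Add() would throw on a duplicate.
    $hash[$computer.Name] = @{
        Status = $computer.Status
        Date   = $computer.Date
    }
}

$known   = $hash.ContainsKey('Name1')   # key that was added
$unknown = $hash.ContainsKey('Name3')   # key that was never added
$status  = $hash['Name2'].Status        # nested value lookup
```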
Hope some of that helps!
-Adam
Knowing that PowerShell pipes objects between cmdlets, it is more common in PowerShell to use an array of PSCustomObjects:
$arr = @(
[PSCustomObject]@{Name = 'David'; Article = 'TShirt'; Size = 'M'}
[PSCustomObject]@{Name = 'Eduard'; Article = 'Trouwsers'; Size = 'S'}
)
Or for older PowerShell Versions (PSv2):
$arr = #(
New-Object PSObject -Property @{Name = 'David'; Article = 'TShirt'; Size = 'M'}
New-Object PSObject -Property @{Name = 'Eduard'; Article = 'Trouwsers'; Size = 'S'}
)
And filter your selection like:
$arr | Where {$_.Name -eq 'David' -and $_.Article -eq 'TShirt'} | Select Size
Or, using the simplified syntax (PowerShell 3.0 and later):
$arr | Where Name -eq 'David' | Where Article -eq 'TShirt' | Select Size
Or (just get the size):
$arr.Where{$_.Name -eq 'David' -and $_.Article -eq 'TShirt'}.Size
Addendum 2020-07-13
Syntax and readability
As mentioned in the comments, using an array of custom objects is more straightforward and saves typing. If you want to take this further, you might even use the ConvertFrom-Csv (or the Import-Csv) cmdlet to build the array:
$arr = ConvertFrom-Csv @'
Name,Article,Size
David,TShirt,M
Eduard,Trouwsers,S
'@
Or more readable:
$arr = ConvertFrom-Csv @'
Name,   Article,   Size
David,  TShirt,    M
Eduard, Trouwsers, S
'@
Note: values that contain spaces or special characters need to be double quoted
Or use an external cmdlet like ConvertFrom-SourceTable which reads fixed width table formats:
$arr = ConvertFrom-SourceTable '
Name   Article   Size
David  TShirt    M
Eduard Trouwsers S
'
Indexing
The disadvantage of using an array of custom objects is that a lookup has to scan the array, which is slower than a hash table, where a key is found through its hash.
Note that the advantage of using an array of custom objects is that you can easily search for anything else, e.g. everybody that wears a TShirt with size M:
$arr | Where Article -eq 'TShirt' | Where Size -eq 'M' | Select Name
To build a hash table index from the array of objects:
$h = @{}
$arr | ForEach-Object {
If (!$h.ContainsKey($_.Name)) { $h[$_.Name] = @{} }
If (!$h[$_.Name].ContainsKey($_.Article)) { $h[$_.Name][$_.Article] = @{} }
$h[$_.Name][$_.Article] = $_ # Or: $h[$_.Name][$_.Article]['Size'] = $_.Size
}
$h.david.tshirt.size
M
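When a single-level index is enough, a possible shortcut is `Group-Object -AsHashTable`, sketched here on the same sample data. Each value in the resulting hashtable is the collection of rows sharing that name, hence the `[0]`:

```powershell
$arr = ConvertFrom-Csv @'
Name,Article,Size
David,TShirt,M
Eduard,Trouwsers,S
'@

# One hashtable entry per distinct Name; each value holds the matching rows.
$byName = $arr | Group-Object -Property Name -AsHashTable -AsString

$davidSize = $byName['David'][0].Size
```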
Note: with Set-StrictMode enabled, referencing a hash table key that doesn't exist will cause an error:
Set-StrictMode -Version 2
$h.John.tshirt.size
PropertyNotFoundException: The property 'John' cannot be found on this object. Verify that the property exists.
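One way to sidestep that error is to test each level with `ContainsKey()` before descending. The sketch below wraps this in a helper; `Get-NestedValue` is a hypothetical name introduced here for illustration, not a built-in cmdlet:

```powershell
Set-StrictMode -Version 2

$h = @{ David = @{ TShirt = @{ Size = 'M' } } }

# Hypothetical helper: walk a list of keys, returning $null as soon as one is missing.
function Get-NestedValue {
    param([hashtable]$Table, [string[]]$Path)
    $current = $Table
    foreach ($key in $Path) {
        if ($current -isnot [hashtable] -or -not $current.ContainsKey($key)) { return $null }
        $current = $current[$key]
    }
    return $current
}

$found   = Get-NestedValue -Table $h -Path 'David', 'TShirt', 'Size'
$missing = Get-NestedValue -Table $h -Path 'John', 'TShirt', 'Size'
```

With this data `$found` is 'M' and `$missing` is `$null`, with no exception raised.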
Here is a simple array of arrays (a jagged array) of strings.
$psarray = @(
('Line' ,'One' ),
('Line' ,'Two')
)
foreach($item in $psarray)
{
$item[0]
$item[1]
}
Output:
Line
One
Line
Two
Two-dimensional arrays can also be defined as a jagged array this way:
$array = New-Object system.Array[][] 5,5
This has the nice feature that
$array[0]
outputs a one-dimensional array, containing $array[0][0] to $array[0][4].
Depending on your situation you might prefer it over $array = New-Object 'object[,]' 5,5.
(I would have commented to CB above, but stackoverflow does not let me yet)
You could also use System.Collections.ArrayList to make an array of arrays, or whatever you want.
Here is an example:
$resultsArray= New-Object System.Collections.ArrayList
[void] $resultsArray.Add(@(@('$hello'),2,0,0,0,0,0,0,1,1))
[void] $resultsArray.Add(@(@('$test', '$testagain'),3,0,0,1,0,0,0,1,2))
[void] $resultsArray.Add("ERROR")
[void] $resultsArray.Add(@(@('$var', '$result'),5,1,1,0,1,1,0,2,3))
[void] $resultsArray.Add(@(@('$num', '$number'),3,0,0,0,0,0,1,1,2))
One problem, if you would call it a problem: you cannot set a size limit. Also, Add() returns the index of the new element, so you need [void] (or another way to discard it) to keep those numbers out of the script's output.
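As an alternative sketch (a different collection type, not what the answer above uses), the generic `System.Collections.Generic.List[object]` has an `Add()` method that returns nothing, so no `[void]` is needed. The sample rows are invented:

```powershell
# Generic list: Add() has no return value, so nothing leaks into the output.
$results = New-Object 'System.Collections.Generic.List[object]'
$results.Add(@('$hello', 2, 0))
$results.Add(@('$test', 3, 1))

$count = $results.Count     # number of rows stored
$name  = $results[1][0]     # first field of the second row
```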
Using the .NET syntax (as CB pointed out above) also adds type consistency to your 'tabular' array. If you define a typed array and try to store a different type, PowerShell will alert you:
$a = New-Object 'byte[,]' 4,4
$a[0,0] = 111 # OK
$a[0,1] = 1111 # Error: 1111 does not fit in a byte
Of course, PowerShell will help you with the obvious conversions:
$a = New-Object 'string[,]' 2,2
$a[0,0] = "1111" # OK
$a[0,1] = 111 # OK as well: converted to the string "111"
Another thread pointed here for how to add to a multidimensional array in PowerShell. I don't know if there is some reason not to use this method, but it worked for my purposes.
$array = @()
$array += ,@( "1", "test1","a" )
$array += ,@( "2", "test2", "b" )
$array += ,@( "3", "test3", "c" )
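The leading comma is what makes this work. A small sketch of the difference, with made-up values: without it, `+=` concatenates the two arrays and the inner values are flattened into the outer one; with it, the inner array is wrapped so it is appended as a single element.

```powershell
# Without the comma, += concatenates: the inner values become separate elements.
$flat = @()
$flat += @('1', 'test1')
$flatCount = $flat.Count

# With the comma, each inner array is appended as one element (one row).
$rows = @()
$rows += ,@('1', 'test1', 'a')
$rows += ,@('2', 'test2', 'b')
$rowCount = $rows.Count
$cell     = $rows[1][1]
```

Here `$flatCount` is 2 (two strings), while `$rowCount` is 2 rows of three values each and `$cell` is 'test2'.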
I found a pretty neat way of building an array of arrays.
$GroupArray = @()
foreach ( $Array in $ArrayList ){
$GroupArray += @($Array , $null)
}
$GroupArray = $GroupArray | Where-Object {$_ -ne $null}
Borrowing from above:
$arr = ConvertFrom-Csv @'
Name,Article,Size
David,TShirt,M
Eduard,Trouwsers,S
'@
Print the $arr:
$arr
Name   Article   Size
----   -------   ----
David  TShirt    M
Eduard Trouwsers S
Now select 'David'
$arr.Where({$_.Name -eq "david"})
Name  Article Size
----  ------- ----
David TShirt  M
Now if you want to know the Size of 'David'
$arr.Where({$_.Name -eq "david"}).size
M