delete objects from array when their path property equals object in another array - powershell

I have an $array of PSCustomObjects, each with path, days, filter and recurse properties.
I run Test-Path on the path of each PSCustomObject and, if it returns false, I save only that path in another variable such as $faultypath.
Now I want to remove all objects from $array whose path is in $faultypath.
I tried things like the .Remove() method on $array, but that doesn't work and gave me this error (example pic from the web): https://i0.wp.com/www.sapien.com/blog/wp-content/uploads/2014/11/image8.png
So I tried creating a new array, but it's giving me a hard time because I don't know how to iterate over the faulty paths correctly, so that each correct object is only added to the new array once (when I tried it, the correct object was there multiple times). I can't show you the code for this because I already edited it too many times and now it's just a mess.
This is what $array and $faultypath look like:
$array = @(
    [pscustomobject]@{
        path = "\\server\daten\Alle Adressen\Dokumente 70"
        filter = "*.pdf"
        days = "90"
        recurse = "false"
    }
    [pscustomobject]@{
        path = "\\server\Tobit\itacom\ERP2UMS"
        filter = "*.fax"
        days = "7"
        recurse = "false"
    }
)
[string[]]$faultypath = @()
$pfade | % { if (!(Test-Path $_.path)) { $faultypath += $_.path } }
How can I subtract everything that is in $faultypath from $array?

For PowerShell 3 or higher
$faultyPath = $pfade | Where-Object { -not (Test-Path $_.Path) } | ForEach-Object Path
$array | Where-Object Path -notin $faultyPath
For PowerShell 2 or lower
$faultyPath = $pfade | Where-Object { -not (Test-Path $_.Path) } | ForEach-Object { $_.Path }
$array | Where-Object { $faultyPath -notcontains $_.Path }
This is potentially an expensive array comparison if both sets are large. In that case dictionaries or hashtables will provide better performance for the comparison.
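As a rough sketch of the hash-based variant mentioned above (not part of the original answer): the faulty paths are loaded once into a HashSet, so each lookup no longer scans the whole list. The sample objects and paths are made up for illustration, and the ::new() syntax assumes PowerShell 5 or later.

```powershell
# Hypothetical sample data standing in for $array and $faultyPath.
$array = @(
    [pscustomobject]@{ Path = 'C:\a'; Days = 90 }
    [pscustomobject]@{ Path = 'C:\b'; Days = 7 }
    [pscustomobject]@{ Path = 'C:\c'; Days = 30 }
)
$faultyPath = 'C:\b', 'C:\c'

# Build a case-insensitive set once; Contains() is then a hash lookup.
$faultySet = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$faultyPath,
    [System.StringComparer]::OrdinalIgnoreCase
)

# Keep only the objects whose Path is not in the set.
$kept = $array | Where-Object { -not $faultySet.Contains($_.Path) }
$kept
```

For a handful of paths the -notin version is simpler and just as good; the set only pays off when both collections are large.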

Related

PowerShell script - Loop list of folders to get file count and sum of files for each folder listed

I want to get the file count & the sum of files for each individual folder listed in DGFoldersTEST.txt.
However, I’m currently getting the sum of all 3 folders.
And now I'm getting 'Index was outside the bounds of the array' error message.
$DGfolderlist = Get-Content -Path C:\DiskGroupsFolders\DGFoldersTEST.txt
$FolderSize = @()
$int=0
Foreach ($DGfolder in $DGfolderlist)
{
$FolderSize[$int] =
Get-ChildItem -Path $DGfolderlist -File -Recurse -Force -ErrorAction SilentlyContinue |
Measure-Object -Property Length -Sum |
Select-Object -Property Count, @{Name='Size(MB)'; Expression={('{0:N2}' -f($_.Sum/1mb))}}
Write-Host $DGfolder
Write-Host $FolderSize[$int]
$int++
}
To explain the error: you're trying to assign a value at index $int of your $FolderSize array. However, arrays created with the array subexpression operator @(..) are initialized with a Length of 0, hence the error. This differs from initializing them with a specific Length:
$arr = @()
$arr.Length # 0
$arr[0] = 'hello' # Error
$arr = [array]::CreateInstance([object], 10)
$arr.Length # 10
$arr[0] = 'hello' # all good
As for how to approach your code: since you don't know in advance how many items your loop will output, initializing an array with a specific Length is not possible. PowerShell offers the += operator for adding elements to an array, but this is an expensive operation and not a good idea: arrays are of a fixed size, so each time we append an element a new array has to be created. See this answer for more information and better approaches.
You can simply let PowerShell capture the output of your loop by assigning the variable to the loop itself:
$FolderSize = foreach ($DGfolder in $DGfolderlist) {
    Get-ChildItem -Path $DGfolder -File -Recurse -Force -ErrorAction SilentlyContinue |
        Measure-Object -Property Length -Sum |
        Select-Object @(
            @{ Name = 'Folder'; Expression = { $DGfolder }}
            'Count'
            @{ Name = 'Size(MB)'; Expression = { ($_.Sum / 1mb).ToString('N2') }}
        )
}
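If the results have to be collected explicitly rather than captured from the loop, a generic list is one common way to avoid the cost of +=. The following is only a sketch under that assumption: the folder names are hypothetical, no file system access takes place, and ::new() assumes PowerShell 5 or later.

```powershell
# Hypothetical folder names; this sketch does not touch the file system.
$folders = 'C:\Data\A', 'C:\Data\B', 'C:\Data\C'

# A List grows in place, unlike a fixed-size array.
$results = [System.Collections.Generic.List[object]]::new()
foreach ($folder in $folders) {
    # Add() appends to the existing list instead of copying it.
    $results.Add([pscustomobject]@{ Folder = $folder; NameLength = $folder.Length })
}
$results.Count
```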

PowerShell: Find unique values from multiple CSV files

let's say that I have several CSV files and I need to check a specific column and find values that exist in one file, but not in any of the others. I'm having a bit of trouble coming up with the best way to go about it as I wanted to use Compare-Object and possibly keep all columns and not just the one that contains the values I'm checking.
So I do indeed have several CSV files and they all have a Service Code column, and I'm trying to create a list for each Service Code that only appears in one file. So I would have "Service Codes only in CSV1", "Service Codes only in CSV2", etc.
Based on some testing and a semi-related question, I've come up with a workable solution, but with all of the nesting and For loops, I'm wondering if there is a more elegant method out there.
Here's what I do have:
$files = Get-ChildItem -LiteralPath "C:\temp\ItemCompare" -Include "*.csv"
$HashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $files.Count; $i++){
$TempHashSet = [System.Collections.Generic.HashSet[String]]::New([String[]](Import-Csv $files[$i])."Service Code")
$HashList.Add($TempHashSet)
}
$FinalHashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $HashList.Count; $i++){
$UniqueHS = [System.Collections.Generic.HashSet[String]]::New($HashList[$i])
For ($j = 0; $j -lt $HashList.Count; $j++){
#Skip the check when the HashSet would be compared to itself
If ($j -eq $i){Continue}
$UniqueHS.ExceptWith($HashList[$j])
}
$FinalHashList.Add($UniqueHS)
}
It seems a bit messy to me using so many different .NET references, and I know I could make it cleaner with a tag to say using namespace System.Collections.Generic, but I'm wondering if there is a way to make it work using Compare-Object which was my first attempt, or even just a simpler/more efficient method to filter each file.
I believe I found an "elegant" solution based on Group-Object, using only a single pipeline:
# Import all CSV files.
Get-ChildItem $PSScriptRoot\csv\*.csv -File -PipelineVariable file | Import-Csv |
# Add new column "FileName" to distinguish the files.
Select-Object *, @{ label = 'FileName'; expression = { $file.Name } } |
# Group by ServiceCode to get a list of files per distinct value.
Group-Object ServiceCode |
# Filter by ServiceCode values that exist only in a single file.
# Sort-Object -Unique takes care of possible duplicates within a single file.
Where-Object { ( $_.Group.FileName | Sort-Object -Unique ).Count -eq 1 } |
# Expand the groups so we get the original object structure back.
ForEach-Object Group |
# Format-Table requires sorting by FileName, for -GroupBy.
Sort-Object FileName |
# Finally pretty-print the result.
Format-Table -Property ServiceCode, Foo -GroupBy FileName
Test Input
a.csv:
ServiceCode,Foo
1,fop
2,fip
3,fap
b.csv:
ServiceCode,Foo
6,bar
6,baz
3,bam
2,bir
4,biz
c.csv:
ServiceCode,Foo
2,bla
5,blu
1,bli
Output
FileName: b.csv
ServiceCode Foo
----------- ---
4 biz
6 bar
6 baz
FileName: c.csv
ServiceCode Foo
----------- ---
5 blu
Looks correct to me. The values 1, 2 and 3 are duplicated between multiple files, so they are excluded. 4, 5 and 6 exist only in single files, while 6 is a duplicate value only within a single file.
Understanding the code
Maybe it is easier to understand how this code works, by looking at the intermediate output of the pipeline produced by the Group-Object line:
Count Name Group
----- ---- -----
2 1 {@{ServiceCode=1; Foo=fop; FileName=a.csv}, @{ServiceCode=1; Foo=bli; FileName=c.csv}}
3 2 {@{ServiceCode=2; Foo=fip; FileName=a.csv}, @{ServiceCode=2; Foo=bir; FileName=b.csv}, @{ServiceCode=2; Foo=bla; FileName=c.csv}}
2 3 {@{ServiceCode=3; Foo=fap; FileName=a.csv}, @{ServiceCode=3; Foo=bam; FileName=b.csv}}
1 4 {@{ServiceCode=4; Foo=biz; FileName=b.csv}}
1 5 {@{ServiceCode=5; Foo=blu; FileName=c.csv}}
2 6 {@{ServiceCode=6; Foo=bar; FileName=b.csv}, @{ServiceCode=6; Foo=baz; FileName=b.csv}}
Here the Name contains the unique ServiceCode values, while Group "links" the data to the files.
From here it should already be clear how to find values that exist only in single files. If duplicate ServiceCode values within a single file weren't allowed, we could even simplify the filter to Where-Object Count -eq 1. Since it was stated that dupes within single files may exist, we need the Sort-Object -Unique to count multiple equal file names within a group as only one.
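To make the difference between the two filters concrete, here is a small in-memory sketch. The rows are hypothetical stand-ins for the imported CSV data (only the ServiceCode and FileName columns are kept), so nothing is read from disk.

```powershell
# Hypothetical rows: code 1 appears in two files, code 6 appears twice in one file.
$rows = @(
    [pscustomobject]@{ ServiceCode = '1'; FileName = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '1'; FileName = 'c.csv' }
    [pscustomobject]@{ ServiceCode = '4'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '6'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '6'; FileName = 'b.csv' }
)
$groups = $rows | Group-Object ServiceCode

# Simplified filter: only correct if a code never repeats within one file.
$byCount = $groups | Where-Object Count -eq 1

# Filter from the answer: counts distinct file names per group.
$byFile = $groups | Where-Object {
    @($_.Group.FileName | Sort-Object -Unique).Count -eq 1
}

$byCount.Name   # only 4: code 6 is dropped because its group has two rows
$byFile.Name    # 4 and 6: both live in a single file
```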
It is not completely clear what you expect as an output.
If this is just about the ServiceCodes that intersect, then this is actually a duplicate of:
Comparing two arrays & get the values which are not common
Union and Intersection in PowerShell?
But taking that you actually want the related object and files, you might use this approach:
$HashTable = @{}
ForEach ($File in Get-ChildItem .\*.csv) {
    ForEach ($Object in (Import-Csv $File)) {
        $HashTable[$Object.ServiceCode] = $Object |Select-Object *,
            @{ n='File'; e={ $File.Name } },
            @{ n='Count'; e={ $HashTable[$Object.ServiceCode].Count + 1 } }
    }
}
$HashTable.Values |Where-Object Count -eq 1
Here is my take on this fun exercise, I'm using a similar approach as yours with the HashSet but adding [System.StringComparer]::OrdinalIgnoreCase to leverage the .Contains(..) method:
using namespace System.Collections.Generic
# Generate Random CSVs:
$charset = 'abABcdCD0123xXyYzZ'
$ran = [random]::new()
$csvs = @{}
foreach($i in 1..50) # Create 50 CSVs for testing
{
$csvs["csv$i"] = foreach($z in 1..50) # With 50 Rows
{
$index = (0..2).ForEach({ $ran.Next($charset.Length) })
[pscustomobject]@{
ServiceCode = [string]::new($charset[$index])
Data = $ran.Next()
}
}
}
# Get Unique 'ServiceCode' per CSV:
$result = @{}
foreach($key in $csvs.Keys)
{
# Get all unique `ServiceCode` from the other CSVs
$tempHash = [HashSet[string]]::new(
[string[]]($csvs[$csvs.Keys -ne $key].ServiceCode),
[System.StringComparer]::OrdinalIgnoreCase
)
# Filter the unique `ServiceCode`
$result[$key] = foreach($line in $csvs[$key])
{
if(-not $tempHash.Contains($line.ServiceCode))
{
$line
}
}
}
# Test if the code worked,
# If something is returned from here means it didn't work
foreach($key in $result.Keys)
{
$tmp = $result[$result.Keys -ne $key].ServiceCode
foreach($val in $result[$key])
{
if($val.ServiceCode -in $tmp)
{
$val
}
}
}
I was able to get the unique items as follows:
# Get all items of CSVs in a single variable with adding the file name at the last column
$CSVs = Get-ChildItem "C:\temp\ItemCompare\*.csv" | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName
$FileName = $_.Name
$CSV | Select-Object *,@{N='Filename';E={$FileName}}
}
Foreach($line in $CSVs){
$ServiceCode = $line.ServiceCode
$file = $line.Filename
if (!($CSVs | where {$_.ServiceCode -eq $ServiceCode -and $_.filename -ne $file})){
$line
}
}

PowerShell - Convert Property Names from Pascal Case to Upper Case With Underscores

Let's say I have an object like this:
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
}
and I want to end up with:
$test = @{
THIS_IS_THE_FIRST_COLUMN = "ValueInFirstColumn";
THIS_IS_THE_SECOND_COLUMN = "ValueInSecondColumn"
}
without manually coding the new column names.
This shows me the values I want:
$test.PsObject.Properties | where-object { $_.Name -eq "Keys" } | select -expand value | foreach{ ($_.substring(0,1).toupper() + $_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()} | Out-Host
which results in:
THIS_IS_THE_FIRST_COLUMN
THIS_IS_THE_SECOND_COLUMN
but now I can't seem to figure out how to assign these new values back to the object.
You can modify hashtable $test in place as follows:
foreach($key in @($test.Keys)) { # !! @(...) is required - see below.
$value = $test[$key] # save value
$test.Remove($key) # remove old entry
# Recreate the entry with the transformed name.
$test[($key -creplace '(?<!^)\p{Lu}', '_$&').ToUpper()] = $value
}
@($test.Keys) creates an array from the existing hashtable keys; @(...) ensures that the key collection is copied to a static array, because using the .Keys property directly in a loop that modifies the same hashtable would break.
The loop body saves the value for the input key at hand and then removes the entry under its old name.[1]
The entry is then recreated under its new key name using the desired name transformation:
$key -creplace '(?<!^)\p{Lu}', '_$&' matches every uppercase letter (\p{Lu}) in a given key, except at the start of the string ((?<!^)), and replaces it with _ followed by that letter (_$&); converting the result to uppercase (.ToUpper()) yields the desired name.
[1] Removing the old entry before adding the renamed one avoids problems with single-word names such as Simplest, whose transformed name, SIMPLEST, is considered the same name due to the case-insensitivity of hashtables in PowerShell. Thus, assigning a value to entry SIMPLEST while entry Simplest still exists actually targets the existing entry, and the subsequent $test.Remove($key) would then simply remove that entry, without having added a new one.
Tip of the hat to JosefZ for pointing out the problem.
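Both points can be checked in isolation. The snippet below is a small sketch with made-up sample names: it applies the same regex to two key names and then shows that a hashtable literal treats Simplest and SIMPLEST as one key.

```powershell
# Sample key names for illustration only.
$names = 'ThisIsTheFirstColumn', 'Simplest'

# Insert "_" before every uppercase letter except the first, then uppercase.
$renamed = $names | ForEach-Object { ($_ -creplace '(?<!^)\p{Lu}', '_$&').ToUpper() }
$renamed   # THIS_IS_THE_FIRST_COLUMN, SIMPLEST

# Hashtable literals are case-insensitive, so this assignment
# updates the existing entry instead of adding a second one.
$h = @{ Simplest = 'ValueInSimplest' }
$h['SIMPLEST'] = 'changed'
$h.Count   # 1
```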
I wonder if it is possible to do it in place on the original object?
($test.PsObject.Properties|Where-Object {$_.Name -eq "Keys"}).IsSettable says False. Hence, you need to do it in two steps as follows:
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
}
$auxarr = $test.PsObject.Properties |
Where-Object { $_.Name -eq "Keys" } |
select -ExpandProperty value
$auxarr | ForEach-Object {
$aux = ($_.substring(0,1).toupper() +
$_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()
$test.ADD( $aux, $test.$_)
$test.Remove( $_)
}
$test
The two-step approach is necessary because an attempt to call the Remove and Add methods within a single pipeline leads to the following error:
select : Collection was modified; enumeration operation may not execute.
Edit. Unfortunately, the above solution would fail in the case of a one-word Pascal Case key, e.g. for Simplest = "ValueInSimplest". Here's the improved script:
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
Simplest = "ValueInSimplest" # the simplest (one word) PascalCase
}
$auxarr = $test.PsObject.Properties |
Where-Object { $_.Name -eq "Keys" } |
select -ExpandProperty value
$auxarr | ForEach-Object {
$aux = ($_.substring(0,1).toupper() +
$_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()
$newvalue = $test.$_
$test.Remove( $_)
$test.Add( $aux, $newvalue)
}
$test
This seems to work. I ended up putting stuff in a new hashtable, though.
$test = @{
ThisIsTheFirstColumn = "ValueInFirstColumn";
ThisIsTheSecondColumn = "ValueInSecondColumn"
}
$test2=@{}
$test.PsObject.Properties |
where-object { $_.Name -eq "Keys" } |
select -expand value | foreach{ $originalPropertyName=$_
$prop=($_.substring(0,1).toupper() + $_.substring(1) -creplace '[^\p{Ll}\s]', '_$&').Trim("_").ToUpper()
$test2.Add($prop,$test[$originalPropertyName])
}
$test2

powershell prevent duplicate object keys

This is a follow up to this question
If I have 2 json files
file1.json
{
"foo": {
"honk": 42
}
}
file2.json
{
"foo": {
"honk": 9000,
"toot": 9000
}
}
And I create an object using ConvertFrom-Json
$bar = @(Get-ChildItem . -Filter *.json -Recurse | Get-Content -Raw |ConvertFrom-Json)
Powershell will happily take both, and overwrite foo.
foo
---
@{honk=42}
@{honk=9000; toot=9000}
The contents of $bar.foo are merged
$bar.foo
honk
----
42
9000
How can I error if importing duplicate objects?
Each JSON file is imported as a separate object, so there's nothing overwritten really. You just get a list of objects.
To throw an error when you get multiple objects with the same top-level property you can group the objects by property name and throw an error if you get a count >1.
$bar | Group-Object { $_.PSObject.Properties.Name } |
Where-Object { $_.Count -gt 1 } |
ForEach-Object { throw "Duplicate object $($_.Name)" }
When importing to an array, every object is unique. In this example it isn't ideal to leave the objects in an array, since there is no way to predictably iterate over them, since some objects might contain multiple keys.
{
"foo": 42
}
vs
{
"bar": 9000,
"buzz": 9000
}
This will cause heartache when trying to loop through all objects.
Instead, I took all array items and combined them into 1 powershell object. Since powershell objects are basically hashes, and hashes by design must have all keys unique, powershell will automatically error if overwriting a key.
function Load-Servers {
$allObjects = @(
Get-ChildItem '.\servers' -Filter *.json -Recurse | Get-Content -Raw | ConvertFrom-Json
)
$object = New-Object PSObject
Foreach ($o in $allObjects) {
$o.psobject.members | ? {$_.Membertype -eq "noteproperty" } | %{$object | add-member $_.Name $_.Value }
}
return $object
}
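The behaviour this relies on can be seen without any JSON files. The following minimal sketch (property name and values borrowed from the question's sample data) adds the same note property twice and uses -ErrorAction Stop so the second attempt can be caught.

```powershell
$object = New-Object PSObject
$object | Add-Member -NotePropertyName foo -NotePropertyValue 42

try {
    # Adding a member whose name already exists fails unless -Force is given.
    $object | Add-Member -NotePropertyName foo -NotePropertyValue 9000 -ErrorAction Stop
    $duplicateRejected = $false
}
catch {
    $duplicateRejected = $true
}

$duplicateRejected   # True
$object.foo          # 42 (the first value is kept)
```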

powershell: Check if any of a bunch of properties is set

I'm importing a csv-file which looks like this:
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
Now I want to check, if any of the value-properties in one group is set.
I can do
Import-Csv .\test.csv | where {$_.Value1.1 -or $_.Value1.2 -or $_.Value1.3}
or
Import-Csv .\test.csv | foreach {
if ($_.Value1 -or $_.Value2 -or $_.Value3) {
Write-Output $_
}
}
But my "real" csv-file contains about 200 columns and I have to check 31 properties x 5 different object types that are mixed up in this csv. So my code will be really ugly.
Is there anything like
where {$_.Value1.*}
or
where {$ArrayWithPropertyNames}
?
You could easily use the Get-Member cmdlet to get the properties which have the correct prefix (just use * as a wildcard after the prefix).
So to achieve what you want you could just filter the data based on whether any of the properties with the correct prefix contains data.
The script below uses your sample data, with a row4 added, and filters the list to find all items which have a value in any property starting with value1.
$csv = @"
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
row4,v1.1,,v1.3
"@
$data = ConvertFrom-csv $csv
$data | Where {
$currentDataItem = $_
$propertyValues = $currentDataItem |
# Gets all the properties with the correct prefix
Get-Member 'value1*' -MemberType NoteProperty |
# Gets the values for each of those properties
Foreach { $currentDataItem.($_.Name) } |
# Only keep the property value if it has a value
Where { $_ }
# Could just return $propertyValues, but this makes the intention clearer
$hasValueOnPrefixedProperty = $propertyValues.Length -gt 0
Write-Output $hasValueOnPrefixedProperty
}
Alternate solution:
$PropsToCheck = 'Value1*'
Import-csv .\test.csv |
Where {
(($_ | Select $PropsToCheck).psobject.properties.value) -contains ''
}
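Another possible variant, sketched here as an assumption rather than taken from the answers above, reads the properties through PSObject.Properties and filters them by name with -like. The sample data is a shortened, made-up version of the CSV from the question.

```powershell
# Shortened sample data for illustration.
$data = @'
id,value1.1,value1.2,Value2.1
row1,v1.1,,
row2,,,v2.1
'@ | ConvertFrom-Csv

$prefix = 'value1*'   # wildcard for the property group to check

$matching = $data | Where-Object {
    $row = $_
    # Keep the row if at least one matching property has a non-empty value.
    @($row.PSObject.Properties |
        Where-Object { $_.Name -like $prefix -and $_.Value }).Count -gt 0
}

$matching.id   # row1
```

Changing $prefix to 'Value2*' would select row2 instead, so the same filter can be reused for each property group.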