How do I return a custom object in PowerShell that's formatted as a table?

I'm pretty new to PowerShell, so I won't be surprised at all if I'm going about this all wrong. I'm trying to create a function that, when executed, prints results formatted as a table. Ideally, it would also be possible to pipe those results to another function for further analysis.
Here's what I have so far. This is a simple function that iterates through a list of paths and collects the name of the directory and the number of items in that directory, putting the data in a hashtable, and returning an array of hashtables:
function Check-Paths(){
$paths =
"C:\code\DirA",
"C:\code\DirB"
$dirs = @()
foreach ($path in $paths){
if (Test-Path $path){
$len = (ls -path $path).length
}
else{
$len = 0
}
$dirName = ($path -split "\\")[-1]
$dirInfo = @{DirName = $dirName; NumItems = $len}
$dirs += $dirInfo
}
return $dirs
}
That seems straightforward enough. However, when I go run the command, this is what I get:
PS > Check-Paths
Name                           Value
----                           -----
DirName                        DirA
NumItems                       0
DirName                        DirB
NumItems                       0
What I want is this:
DirName NumItems
------- --------
DirA           0
DirB           0
I could just hack my function to use a write statement, but I think there must be a much better way to do this. Is there a way to get the data formatted as a table, ideally in a form that can also be piped to another function?

How 'bout using
return new-object psobject -Property $dirs
That would return an object whose properties match the items in the hashtable. Then you can use the built-in PowerShell formatting cmdlets to make it look the way you want. Since you only have two properties, it will be formatted as a table by default.
EDIT: Here's how the whole thing would look (After the various suggestions):
function Check-Paths(){
$paths =
"C:\code\DirA",
"C:\code\DirB"
$dirs = @()
foreach ($path in $paths){
if (Test-Path $path){
$len = (ls -path $path).length
}
else{
$len = 0
}
$dirName = ($path -split "\\")[-1]
new-object psobject -property @{DirName = $dirName; NumItems = $len}
}
}
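Because the function now emits objects rather than hashtables, its output can be filtered or sorted before it is formatted. The following is only a sketch of that idea: the function name Get-DirItemCount, its -Path parameter, and the throwaway directories are assumptions made so the example runs on its own, not part of the answer above.

```powershell
# Hypothetical, self-contained variant of the function above: the paths are a
# parameter so the sketch can run against throwaway directories.
function Get-DirItemCount {
    param([string[]]$Path)
    foreach ($p in $Path) {
        # @(...) guarantees an array, so .Count is reliable even for one item
        $count = if (Test-Path $p) { @(Get-ChildItem -Path $p).Count } else { 0 }
        New-Object PSObject -Property @{
            DirName  = Split-Path $p -Leaf
            NumItems = $count
        }
    }
}

# Demo data (assumed): one directory with two files, one path that does not exist
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
$dirA = New-Item -ItemType Directory -Path (Join-Path $root 'DirA')
'x' | Set-Content (Join-Path $dirA.FullName 'one.txt')
'y' | Set-Content (Join-Path $dirA.FullName 'two.txt')
$missing = Join-Path $root 'DirB'

$report = Get-DirItemCount -Path $dirA.FullName, $missing

# The objects can be filtered and sorted before any formatting happens
$nonEmpty = $report | Where-Object { $_.NumItems -gt 0 }
$report | Sort-Object NumItems -Descending | Format-Table DirName, NumItems -AutoSize

# Clean up the demo directories
Remove-Item $root -Recurse -Force
```

Naming the columns in Format-Table fixes their order, which a plain hashtable passed to New-Object does not guarantee.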

Here's a one-liner that will give you the number of children for each folder.
"C:\code\DirA", "C:\code\DirB" | ? {Test-Path $_} | Get-Item | select -property Name, @{ Name="NumOfItems" ; Expression = {$_.GetFileSystemInfos().Count} }
It passes an array of strings to Where-Object to select the ones that exist. The path strings that exist are passed to Get-Item to get DirectoryInfo objects, which are passed to Select-Object to create PSCustomObject objects. Each PSCustomObject has two properties: the name of the directory and the number of children.
If you want the outputted table columns closer together you can pipe the above to: Format-Table -AutoSize
Example usage and output:
dir | ? {$_.PsIsContainer} | select -property Name, @{ Name="NumOfItems" ; Expression = {$_.GetFileSystemInfos().Count} } | Format-Table -AutoSize
Name         NumOfItems
----         ----------
Desktop              12
Favorites             3
My Documents          3
Start Menu            2

Related

PowerShell script - Loop list of folders to get file count and sum of files for each folder listed

I want to get the file count & the sum of files for each individual folder listed in DGFoldersTEST.txt.
However, I’m currently getting the sum of all 3 folders.
And now I'm getting 'Index was outside the bounds of the array' error message.
$DGfolderlist = Get-Content -Path C:\DiskGroupsFolders\DGFoldersTEST.txt
$FolderSize = @()
$int=0
Foreach ($DGfolder in $DGfolderlist)
{
$FolderSize[$int] =
Get-ChildItem -Path $DGfolderlist -File -Recurse -Force -ErrorAction SilentlyContinue |
Measure-Object -Property Length -Sum |
Select-Object -Property Count, @{Name='Size(MB)'; Expression={('{0:N2}' -f($_.Sum/1mb))}}
Write-Host $DGfolder
Write-Host $FolderSize[$int]
$int++
}
To explain the error: you're trying to assign a value at index $int of your $FolderSize array, but an array created with the array subexpression operator @(..) is initialized with a Length of 0, so no index exists yet. This differs from initializing an array with a specific Length:
$arr = @()
$arr.Length # 0
$arr[0] = 'hello' # Error
$arr = [array]::CreateInstance([object], 10)
$arr.Length # 10
$arr[0] = 'hello' # all good
As for how to approach your code: since you don't know in advance how many items your loop will output, initializing an array with a specific Length is not practical. PowerShell offers the += operator for adding elements to an array, but this is an expensive operation: arrays have a fixed size, so each append creates a new array. See this answer for more information and better approaches.
You can simply let PowerShell capture the output of your loop by assigning the variable to the loop itself:
$FolderSize = foreach ($DGfolder in $DGfolderlist) {
Get-ChildItem -Path $DGfolder -File -Recurse -Force -ErrorAction SilentlyContinue |
Measure-Object -Property Length -Sum |
Select-Object @(
@{ Name = 'Folder'; Expression = { $DGfolder }}
'Count'
@{ Name = 'Size(MB)'; Expression = { ($_.Sum / 1mb).ToString('N2') }}
)
}
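As a small, self-contained illustration of the two collection patterns discussed in this answer, here is a sketch that uses plain numbers instead of folders. The variable names are made up for the example, and the `::new()` syntax assumes PowerShell 5 or later.

```powershell
# Collect results by assigning the loop itself; @(...) keeps the result an
# array even when the loop emits zero or one item.
$squares = @(foreach ($n in 1..5) { $n * $n })

# Alternative when items are added conditionally or from several places:
# a generic list grows in place instead of being copied on every append.
$list = [System.Collections.Generic.List[object]]::new()
foreach ($n in 1..5) {
    if ($n % 2) { $list.Add($n * $n) }   # keep squares of odd numbers only
}

$squares -join ','   # 1,4,9,16,25
$list -join ','      # 1,9,25
```

Either pattern avoids the repeated array copies that += causes inside a loop.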

PowerShell: Find unique values from multiple CSV files

Let's say that I have several CSV files and I need to check a specific column and find values that exist in one file, but not in any of the others. I'm having a bit of trouble coming up with the best way to go about it, as I wanted to use Compare-Object and possibly keep all columns and not just the one that contains the values I'm checking.
So I do indeed have several CSV files and they all have a Service Code column, and I'm trying to create a list for each Service Code that only appears in one file. So I would have "Service Codes only in CSV1", "Service Codes only in CSV2", etc.
Based on some testing and a semi-related question, I've come up with a workable solution, but with all of the nesting and For loops, I'm wondering if there is a more elegant method out there.
Here's what I do have:
$files = Get-ChildItem -LiteralPath "C:\temp\ItemCompare" -Include "*.csv"
$HashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $files.Count; $i++){
$TempHashSet = [System.Collections.Generic.HashSet[String]]::New([String[]](Import-Csv $files[$i])."Service Code")
$HashList.Add($TempHashSet)
}
$FinalHashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $HashList.Count; $i++){
$UniqueHS = [System.Collections.Generic.HashSet[String]]::New($HashList[$i])
For ($j = 0; $j -lt $HashList.Count; $j++){
#Skip the check when the HashSet would be compared to itself
If ($j -eq $i){Continue}
$UniqueHS.ExceptWith($HashList[$j])
}
$FinalHashList.Add($UniqueHS)
}
It seems a bit messy to me using so many different .NET references, and I know I could make it cleaner with a tag to say using namespace System.Collections.Generic, but I'm wondering if there is a way to make it work using Compare-Object which was my first attempt, or even just a simpler/more efficient method to filter each file.
I believe I found an "elegant" solution based on Group-Object, using only a single pipeline:
# Import all CSV files.
Get-ChildItem $PSScriptRoot\csv\*.csv -File -PipelineVariable file | Import-Csv |
# Add new column "FileName" to distinguish the files.
Select-Object *, @{ label = 'FileName'; expression = { $file.Name } } |
# Group by ServiceCode to get a list of files per distinct value.
Group-Object ServiceCode |
# Filter by ServiceCode values that exist only in a single file.
# Sort-Object -Unique takes care of possible duplicates within a single file.
Where-Object { ( $_.Group.FileName | Sort-Object -Unique ).Count -eq 1 } |
# Expand the groups so we get the original object structure back.
ForEach-Object Group |
# Format-Table requires sorting by FileName, for -GroupBy.
Sort-Object FileName |
# Finally pretty-print the result.
Format-Table -Property ServiceCode, Foo -GroupBy FileName
Test Input
a.csv:
ServiceCode,Foo
1,fop
2,fip
3,fap
b.csv:
ServiceCode,Foo
6,bar
6,baz
3,bam
2,bir
4,biz
c.csv:
ServiceCode,Foo
2,bla
5,blu
1,bli
Output
FileName: b.csv
ServiceCode Foo
----------- ---
4 biz
6 bar
6 baz
FileName: c.csv
ServiceCode Foo
----------- ---
5 blu
Looks correct to me. The values 1, 2 and 3 are duplicated between multiple files, so they are excluded. 4, 5 and 6 exist only in single files, while 6 is a duplicate value only within a single file.
Understanding the code
Maybe it is easier to understand how this code works, by looking at the intermediate output of the pipeline produced by the Group-Object line:
Count Name Group
----- ---- -----
    2 1    {@{ServiceCode=1; Foo=fop; FileName=a.csv}, @{ServiceCode=1; Foo=bli; FileName=c.csv}}
    3 2    {@{ServiceCode=2; Foo=fip; FileName=a.csv}, @{ServiceCode=2; Foo=bir; FileName=b.csv}, @{ServiceCode=2; Foo=bla; FileName=c.csv}}
    2 3    {@{ServiceCode=3; Foo=fap; FileName=a.csv}, @{ServiceCode=3; Foo=bam; FileName=b.csv}}
    1 4    {@{ServiceCode=4; Foo=biz; FileName=b.csv}}
    1 5    {@{ServiceCode=5; Foo=blu; FileName=c.csv}}
    2 6    {@{ServiceCode=6; Foo=bar; FileName=b.csv}, @{ServiceCode=6; Foo=baz; FileName=b.csv}}
Here the Name contains the unique ServiceCode values, while Group "links" the data to the files.
From here it should already be clear how to find values that exist only in single files. If duplicate ServiceCode values within a single file weren't allowed, we could even simplify the filter to Where-Object Count -eq 1. Since it was stated that dupes within single files may exist, we need the Sort-Object -Unique to count multiple equal file names within a group as only one.
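To try the grouping approach without creating files, the same test input can be fed in from strings. This is only a sketch: the $files table, the $rows and $uniqueRows variables, and the use of ConvertFrom-Csv in place of Import-Csv are assumptions made for the example (it needs PowerShell 3 or later).

```powershell
# In-memory stand-ins for a.csv, b.csv and c.csv (same rows as the test input)
$files = [ordered]@{
    'a.csv' = "ServiceCode,Foo`n1,fop`n2,fip`n3,fap"
    'b.csv' = "ServiceCode,Foo`n6,bar`n6,baz`n3,bam`n2,bir`n4,biz"
    'c.csv' = "ServiceCode,Foo`n2,bla`n5,blu`n1,bli"
}

# Parse each "file" and tag every row with the name of the file it came from
$rows = foreach ($name in $files.Keys) {
    $files[$name] | ConvertFrom-Csv |
        Select-Object *, @{ Name = 'FileName'; Expression = { $name } }
}

# Keep only groups whose rows all come from one file, then expand the groups
$uniqueRows = $rows |
    Group-Object ServiceCode |
    Where-Object { @($_.Group.FileName | Sort-Object -Unique).Count -eq 1 } |
    ForEach-Object { $_.Group }

$uniqueRows | Sort-Object FileName, ServiceCode | Format-Table ServiceCode, Foo, FileName
```

With this input the surviving rows are the ones with ServiceCode 4 and 6 from b.csv and 5 from c.csv, matching the output shown earlier.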
It is not completely clear what you expect as an output.
If this is just the ServiceCodes that intersect then this is actually a duplicate with:
Comparing two arrays & get the values which are not common
Union and Intersection in PowerShell?
But taking that you actually want the related object and files, you might use this approach:
$HashTable = @{}
ForEach ($File in Get-ChildItem .\*.csv) {
ForEach ($Object in (Import-Csv $File)) {
$HashTable[$Object.ServiceCode] = $Object |Select-Object *,
@{ n='File'; e={ $File.Name } },
@{ n='Count'; e={ $HashTable[$Object.ServiceCode].Count + 1 } }
}
}
$HashTable.Values |Where-Object Count -eq 1
Here is my take on this fun exercise, I'm using a similar approach as yours with the HashSet but adding [System.StringComparer]::OrdinalIgnoreCase to leverage the .Contains(..) method:
using namespace System.Collections.Generic
# Generate Random CSVs:
$charset = 'abABcdCD0123xXyYzZ'
$ran = [random]::new()
$csvs = @{}
foreach($i in 1..50) # Create 50 CSVs for testing
{
$csvs["csv$i"] = foreach($z in 1..50) # With 50 Rows
{
$index = (0..2).ForEach({ $ran.Next($charset.Length) })
[pscustomobject]@{
ServiceCode = [string]::new($charset[$index])
Data = $ran.Next()
}
}
}
# Get Unique 'ServiceCode' per CSV:
$result = @{}
foreach($key in $csvs.Keys)
{
# Get all unique `ServiceCode` from the other CSVs
$tempHash = [HashSet[string]]::new(
[string[]]($csvs[$csvs.Keys -ne $key].ServiceCode),
[System.StringComparer]::OrdinalIgnoreCase
)
# Filter the unique `ServiceCode`
$result[$key] = foreach($line in $csvs[$key])
{
if(-not $tempHash.Contains($line.ServiceCode))
{
$line
}
}
}
# Test if the code worked,
# If something is returned from here means it didn't work
foreach($key in $result.Keys)
{
$tmp = $result[$result.Keys -ne $key].ServiceCode
foreach($val in $result[$key])
{
if($val.ServiceCode -in $tmp)
{
$val
}
}
}
I was able to get the unique items as follows:
# Get all items of CSVs in a single variable with adding the file name at the last column
$CSVs = Get-ChildItem "C:\temp\ItemCompare\*.csv" | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName
$FileName = $_.Name
$CSV | Select-Object *,@{N='Filename';E={$FileName}}
}
Foreach($line in $CSVs){
$ServiceCode = $line.ServiceCode
$file = $line.Filename
if (!($CSVs | where {$_.ServiceCode -eq $ServiceCode -and $_.filename -ne $file})){
$line
}
}

delete objects from array when their path property equals object in another array

I have an $array of PSCustomObjects which contain a path,days,filter and recurse property
I Test-Path the path of each PSCustomObject and if it's false, I save only the path in another variable like $failpath
Now I want to remove all Objects inside $array when the path is inside $failpath
I tried things like the .remove() method for the $array, but that doesn't work and gave me this error (example pic from web): https://i0.wp.com/www.sapien.com/blog/wp-content/uploads/2014/11/image8.png
So I tried creating a new array, but it's giving me a hard time because I don't know how to iterate over the failed paths correctly, so that each valid object is added to the new array only once (when I tried it, the valid objects appeared multiple times). I can't show you the code for this because I have edited it too many times and now it's just a mess.
This is what $array and $faultypath look like:
$array = @(
[pscustomobject]@{
path = "\\server\daten\Alle Adressen\Dokumente 70"
filter = "*.pdf"
days = "90"
recurse = "false"
}
[pscustomobject]@{
path = "\\server\Tobit\itacom\ERP2UMS"
filter = "*.fax"
days = "7"
recurse = "false"
}
)
[string[]]$faultypath = @()
$pfade | % { if (!(Test-Path $_.path)) { $faultypath += $_.path } }
How can I subtract everything that is in $faultypath from $array?
For PowerShell 3 or higher
$faultyPath = $pfade | Where-Object { -not (Test-Path $_.Path) } | ForEach-Object Path
$array | Where-Object Path -notin $faultyPath
For PowerShell 2 or lower
$faultyPath = $pfade | Where-Object { -not (Test-Path $_.Path) } | ForEach-Object { $_.Path }
$array | Where-Object { $faultyPath -notcontains $_.Path }
This is potentially an expensive array comparison if both sets are large. In that case dictionaries or hashtables will provide better performance for the comparison.
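For larger inputs, one possible shape of that lookup-based comparison is sketched below with a HashSet. The sample objects and paths are hypothetical stand-ins for $array and $faultyPath, and the `::new()` syntax assumes PowerShell 5 or later.

```powershell
# Hypothetical sample data standing in for $array and $faultyPath
$array = @(
    [pscustomobject]@{ Path = '\\server\share\a'; Days = 90 }
    [pscustomobject]@{ Path = '\\server\share\b'; Days = 7 }
    [pscustomobject]@{ Path = '\\server\share\c'; Days = 30 }
)
$faultyPath = '\\server\share\b', '\\SERVER\share\c'

# A HashSet gives constant-time lookups; the comparer makes them
# case-insensitive, like the default behaviour of -notin.
$faultySet = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$faultyPath,
    [System.StringComparer]::OrdinalIgnoreCase
)

# Keep only the objects whose path is not in the set
$valid = $array | Where-Object { -not $faultySet.Contains($_.Path) }
$valid.Path   # \\server\share\a
```

The filter itself stays a single Where-Object; only the membership test changes from a linear array scan to a hash lookup.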

Count number of files in each subfolder, ignoring files with certain name

Consider the following directory tree
ROOT
BAR001
foo_1.txt
foo_2.txt
foo_ignore_this_1.txt
BAR001_a
foo_3.txt
foo_4.txt
foo_ignore_this_2.txt
foo_ignore_this_3.txt
BAR001_b
foo_5.txt
foo_ignore_this_4.txt
BAR002
baz_1.txt
baz_ignore_this_1.txt
BAR002_a
baz_2.txt
baz_ignore_this_2.txt
BAR002_b
baz_3.txt
baz_4.txt
baz_5.txt
baz_ignore_this_3.txt
BAR002_c
baz_ignore_this_4.txt
BAR003
lor_1.txt
The structure will always be like this, so no deeper subfolders. I'm working on a script to count the number of files:
for each BARXXX folder
for each BARXXX_Y folder
textfiles with "ignore_this" in the name, should be ignored in the count
For the example above, this would result into:
Folder Filecount
---------------------
BAR001 2
BAR001_a 2
BAR001_b 1
BAR002 1
BAR002_a 1
BAR002_b 3
BAR002_c 0
BAR003 1
I now have:
Function Filecount {
param(
[string]$dir
)
$childs = Get-ChildItem $dir | where {$_.Attributes -eq 'Directory'}
Foreach ($childs in $child) {
Write-Host (Get-ChildItem $dir | Measure-Object).Count;
}
}
Filecount -dir "C:\ROOT"
(Not ready yet but building) This however, does not work. $child seems to be empty. Please tell me what I'm doing wrong.
Well, to start, you're running ForEach ($childs in $child), this syntax is backwards, so that will cause you some issues! If you swap it, so that you're running:
ForEach ($child in $childs)
You'll get the following output:
>2
>2
>1
>1
>1
>3
>0
Alright, I'm back now with the completed answer. For one, instead of using Write-Host, I'm using a PowerShell custom object to let PowerShell do the hard work for me. I'm setting FolderName equal to $child.Name, and then running a GCI on $child.FullName to get the file count. I've added an extra parameter called $ignoreme, which takes a wildcard pattern for the names you want to ignore.
Here's the complete answer now. Keep in mind that my file structure was a bit different than yours, so my file count is different at the bottom as well.
Function Filecount {
param(
[string]$dir="C:\TEMP\Example",
[string]$ignoreme = "*_*"
)
$childs = Get-ChildItem $dir | where {$_.Attributes -eq 'Directory'}
Foreach ($child in $childs) {
[pscustomobject]@{FolderName=$child.Name;ItemCount=(Get-ChildItem $child.FullName | ? Name -notlike $ignoreme | Measure-Object).Count}
}
}
>Filecount | ft -AutoSize
>FolderName ItemCount
>---------- ---------
>BAR001 2
>BAR001_A 1
>BAR001_b 2
>BAR001_C 0
>BAR002 0
>BAR003 0
If you're using PowerShell v 2.0, use this method instead.
Function Filecount {
param(
[string]$dir="C:\TEMP\Example",
[string]$ignoreme = "*_*"
)
$childs = Get-ChildItem $dir | where {$_.Attributes -eq 'Directory'}
Foreach ($child in $childs) {
$ObjectProperties = @{
FolderName=$child.Name
# PowerShell 2.0 needs the script block form of Where-Object
ItemCount=(Get-ChildItem $child.FullName | ? { $_.Name -notlike $ignoreme } | Measure-Object).Count}
New-Object PSObject -Property $ObjectProperties
}
}
I like that way of creating an object 1RedOne, haven't seen that before, thanks.
We can improve the performance of the code in a few ways: by using the Filter Left principle, which states that the provider for any cmdlet is inherently more efficient than running things through PowerShell; by performing fewer loops; and by removing an unnecessary step:
Function Filecount
{
param
(
[string]$dir = ".",
[parameter(mandatory=$true)]
[string]$ignoreme
)
Get-ChildItem -Recurse -Directory -Path $dir | ForEach-Object `
{
[pscustomobject]@{FolderName=$_.Name;ItemCount=(Get-ChildItem -Recurse -Exclude "*$ignoreme*" -Path $_.FullName).count}
}
}
So, firstly we can use the -Directory switch of Get-ChildItem in the top-level directory (this is available in v3.0 and above, but not in v2.0).
Then we can pipe the output of this directly in to the next loop, without storing it first.
Then we can replace another Where-Object with a provider -Exclude.
Finally, we can remove the Measure-Object as a simple count of the array will do:
Filecount "ROOT" "ignore_this" | ft -a
FolderName ItemCount
---------- ---------
BAR001 2
BAR001_a 2
BAR001_b 1
BAR002 1
BAR002_a 1
BAR002_b 3
BAR002_c 0
BAR003 1
Cheers Folks!

compare columns in two csv files

With all of the examples out there you would think I could have found my solution. :-)
Anyway, I have two csv files; one with two columns, one with 4. I need to compare one column from each one using powershell. I thought I had it figured out but when I did a compare of my results, it comes back as false when I know it should be true. Here's what I have so far:
$newemp = Import-Csv -Path "C:\Temp\newemp.csv" -Header login_id, lastname, firstname, other | Select-Object "login_id"
$ps = Import-Csv -Path "C:\Temp\Emplid_LoginID.csv" | Select-Object "login id"
If ($newemp -eq $ps)
{
write-host "IDs match" -foregroundcolor green
}
Else
{
write-host "Not all IDs match" -backgroundcolor yellow -foregroundcolor black
}
I had to specify headers for the first file because it doesn't have any. What's weird is that I can call each variable to see what it holds, and they end up with the same info, but for some reason the comparison still comes up as false. This occurs even if there is only one row (not counting the header row).
I started to parse them as arrays but wasn't quite sure that was the right thing. What's important is that I compare row1 of the first file with row1 of the second file. I can't just do a simple -match or -contains.
EDIT: One annoying thing is that the variables seem to hold the header row as well. When I call each one, the header is shown. But if I call both variables, I only see one header but two rows.
I just added the following check but getting the same results (False for everything):
$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps -PassThru | ForEach-Object { $_.InputObject }
Using latkin's answer from here I think this would give you the result set you're looking for. As per latkin's comment, the property comparison is redundant for your purposes but I left it in as it's good to know. Additionally the header is specified even for the csv with headers to prevent the header row being included in the comparison.
$newemp = Import-Csv -Path "C:\Temp\_sotemp\Book1.csv" -Header loginid |
Select-Object "loginid"
$ps = Import-Csv -Path "C:\Temp\_sotemp\Book2.csv" -Header loginid |
Select-Object "loginid"
#get list of (imported) CSV properties
$props1 = $newemp | gm -MemberType NoteProperty | select -expand Name | sort
$props2 = $ps | gm -MemberType NoteProperty | select -expand Name | sort
#first check that properties match
#omit this step if you know for sure they will be
if(Compare-Object $props1 $props2){
throw "Properties are not the same! [$props1] [$props2]"
}
#pass properties list to Compare-Object
else{
Compare-Object $newemp $ps -Property $props1
}
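The question asks for row 1 of one file to be compared with row 1 of the other, which is stricter than a set comparison. One way to do that is an explicit index loop over the expanded column values. This is only a sketch: the inline CSV text, the $left, $right and $mismatch variables, and the use of ConvertFrom-Csv instead of Import-Csv are assumptions made so it runs on its own.

```powershell
# Hypothetical stand-ins for the two imported files; note the different
# column names ("login_id" vs "login id"), as in the question.
$newemp = "abc123,patel,John,x`nzxy321,smith,Kohn,y" |
    ConvertFrom-Csv -Header login_id, lastname, firstname, other
$ps = "login id`nabc123`nzzz999" | ConvertFrom-Csv

# Expand each column to plain strings so the property names no longer matter
$left  = @($newemp | Select-Object -ExpandProperty login_id)
$right = @($ps     | Select-Object -ExpandProperty 'login id')

# Compare position by position; a missing row on either side counts as a mismatch
$mismatch = for ($i = 0; $i -lt [Math]::Max($left.Count, $right.Count); $i++) {
    if ($left[$i] -ne $right[$i]) {
        [pscustomobject]@{ Row = $i + 1; Left = $left[$i]; Right = $right[$i] }
    }
}

if ($mismatch) { 'Not all IDs match' } else { 'IDs match' }
$mismatch | Format-Table -AutoSize
```

Comparing strings rather than the imported objects avoids the trap in the question, where -eq is applied to two arrays of objects and does not test them for equality.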
In the second line, I see a space in "login id", while the first line uses "login_id" without one. Could that be the issue? Try using the same header name in both .csv files. With matching headers, it works without specifying -Header or Select-Object. Below is my experiment based on your input.
emp.csv
loginid firstname lastname
------------------------------
abc123 John patel
zxy321 Kohn smith
sdf120 Maun scott
tiy123 Dham rye
k2340 Naam mason
lk10j5 Shaan kelso
303sk Doug smith
empids.csv
loginid
-------
abc123
zxy321
sdf120
tiy123
PS C:\>$newemp = Import-csv C:\scripts\emp.csv
PS C:\>$ps = Import-CSV C:\scripts\empids.csv
PS C:\>$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps | foreach { $_.InputObject}
Shows the difference objects that are not in $ps
loginid firstname lastname SideIndicator
------- --------- -------- -------------
k2340 Naam mason <=
lk10j5 Shaan kelso <=
303sk Doug smith <=
I am not sure if this is what you are looking for, but I have used PowerShell to do some CSV formatting for myself.
$test = Import-Csv .\Desktop\Vmtools-compare.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {
$check = "no"
break
}
}
if ($check -ne "no") {$n}
}
}
this is how my excel csv file looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
8
0
so basically script takes each number under Name column and then checks it against prod column. If the number is there then it won't display else it will display that number.
I have also done it the opposite way:
$test = Import-Csv c:\test.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {echo $n}
}
}
}
this is how my excel csv looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
3
5
2
so script shows the matching entries only.
You can play around with the code to look at different columns.
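Both nested loops above can also be written with the -in and -notin operators (PowerShell 3 or later). The sketch below assumes the same prod/name data, loaded from an inline string instead of a file; the variable names are made up for the example.

```powershell
# Same data as the CSV shown above, parsed from a string for the sketch
$test = "prod,name`n1,3`n2,5`n3,8`n4,2`n5,0" | ConvertFrom-Csv

# Values in the name column that never appear in the prod column
$onlyInName = $test.name | Where-Object { $_ -notin $test.prod }

# Values in the name column that also appear in the prod column
$inBoth = $test.name | Where-Object { $_ -in $test.prod }

$onlyInName -join ','   # 8,0
$inBoth -join ','       # 3,5,2
```

The results match the two outputs listed above, and a different pair of columns can be compared by changing only the two property names.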