I'm importing a csv-file which looks like this:
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
Now I want to check whether any of the value properties in one group is set.
I can do
Import-Csv .\test.csv | where {$_.'value1.1' -or $_.'value1.2' -or $_.'value1.3'}
or
Import-Csv .\test.csv | foreach {
if ($_.'value1.1' -or $_.'value1.2' -or $_.'value1.3') {
Write-Output $_
}
}
But my "real" csv-file contains about 200 columns and I have to check 31 properties x 5 different object types that are mixed up in this csv. So my code will be really ugly.
Is there anything like
where {$_.Value1.*}
or
where {$ArrayWithPropertyNames}
?
You could easily use the Get-Member cmdlet to get the properties which have the correct prefix (just use * as a wildcard after the prefix).
So to achieve what you want you could just filter the data based on whether any of the properties with the correct prefix contains data.
The script below uses your sample data, with a row4 added, and filters the list to find all items which have a value in any property starting with value1.
$csv = @"
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
row4,v1.1,,v1.3
"@
$data = ConvertFrom-csv $csv
$data | Where {
$currentDataItem = $_
$propertyValues = $currentDataItem |
# Gets all the properties with the correct prefix
Get-Member 'value1*' -MemberType NoteProperty |
# Gets the values for each of those properties
Foreach { $currentDataItem.($_.Name) } |
# Only keep the property value if it has a value
Where { $_ }
# Could just return $propertyValues, but this makes the intention clearer
$hasValueOnPrefixedProperty = $propertyValues.Length -gt 0
Write-Output $hasValueOnPrefixedProperty
}
Alternate solution:
$PropsToCheck = 'Value1*'
Import-csv .\test.csv |
Where {
# Keep the row if at least one matching property is non-empty
@(($_ | Select $PropsToCheck).psobject.properties.value | Where { $_ }).Count -gt 0
}
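Since the real file mixes several object types, the same check could be wrapped in a small helper that takes the prefix as a wildcard pattern. This is only a sketch: Test-AnyPropertySet is a made-up name, and it assumes that an empty string or a missing field counts as "not set".

```powershell
# Hypothetical helper: $true if any property whose name matches $Pattern is non-empty.
function Test-AnyPropertySet {
    param(
        [Parameter(Mandatory)] [psobject] $InputObject,
        [Parameter(Mandatory)] [string]   $Pattern
    )
    foreach ($prop in $InputObject.PSObject.Properties) {
        # -like is case-insensitive, so 'value2.*' also matches 'Value2.1'
        if ($prop.Name -like $Pattern -and $prop.Value) { return $true }
    }
    return $false
}

# Sample data from the question
$rows = ConvertFrom-Csv @'
id,value1.1,value1.2,value1.3,Value2.1,Value2.2,Value3.1,Value3.2
row1,v1.1,,v1.3
row2,,,,v2.1,v2.2
row3,,,,,,,v3.2
'@

$group1 = @($rows | Where-Object { Test-AnyPropertySet -InputObject $_ -Pattern 'value1.*' })
$group2 = @($rows | Where-Object { Test-AnyPropertySet -InputObject $_ -Pattern 'value2.*' })
```

With the sample data, $group1 should contain only row1 and $group2 only row2.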
Let's say that I have several CSV files and I need to check a specific column and find values that exist in one file but not in any of the others. I'm having a bit of trouble coming up with the best way to go about it, as I wanted to use Compare-Object and possibly keep all columns, not just the one that contains the values I'm checking.
So I do indeed have several CSV files and they all have a Service Code column, and I'm trying to create a list for each Service Code that only appears in one file. So I would have "Service Codes only in CSV1", "Service Codes only in CSV2", etc.
Based on some testing and a semi-related question, I've come up with a workable solution, but with all of the nesting and For loops, I'm wondering if there is a more elegant method out there.
Here's what I do have:
$files = Get-ChildItem -LiteralPath "C:\temp\ItemCompare" -Include "*.csv"
$HashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $files.Count; $i++){
$TempHashSet = [System.Collections.Generic.HashSet[String]]::New([String[]](Import-Csv $files[$i])."Service Code")
$HashList.Add($TempHashSet)
}
$FinalHashList = [System.Collections.Generic.List[System.Collections.Generic.HashSet[String]]]::New()
For ($i = 0; $i -lt $HashList.Count; $i++){
$UniqueHS = [System.Collections.Generic.HashSet[String]]::New($HashList[$i])
For ($j = 0; $j -lt $HashList.Count; $j++){
#Skip the check when the HashSet would be compared to itself
If ($j -eq $i){Continue}
$UniqueHS.ExceptWith($HashList[$j])
}
$FinalHashList.Add($UniqueHS)
}
It seems a bit messy to me using so many different .NET references. I know I could make it cleaner with a using namespace System.Collections.Generic statement, but I'm wondering if there is a way to make it work using Compare-Object, which was my first attempt, or even just a simpler or more efficient method to filter each file.
I believe I found an "elegant" solution based on Group-Object, using only a single pipeline:
# Import all CSV files.
Get-ChildItem $PSScriptRoot\csv\*.csv -File -PipelineVariable file | Import-Csv |
# Add new column "FileName" to distinguish the files.
Select-Object *, @{ label = 'FileName'; expression = { $file.Name } } |
# Group by ServiceCode to get a list of files per distinct value.
Group-Object ServiceCode |
# Filter by ServiceCode values that exist only in a single file.
# Sort-Object -Unique takes care of possible duplicates within a single file.
Where-Object { ( $_.Group.FileName | Sort-Object -Unique ).Count -eq 1 } |
# Expand the groups so we get the original object structure back.
ForEach-Object Group |
# Format-Table requires sorting by FileName, for -GroupBy.
Sort-Object FileName |
# Finally pretty-print the result.
Format-Table -Property ServiceCode, Foo -GroupBy FileName
Test Input
a.csv:
ServiceCode,Foo
1,fop
2,fip
3,fap
b.csv:
ServiceCode,Foo
6,bar
6,baz
3,bam
2,bir
4,biz
c.csv:
ServiceCode,Foo
2,bla
5,blu
1,bli
Output
FileName: b.csv
ServiceCode Foo
----------- ---
4 biz
6 bar
6 baz
FileName: c.csv
ServiceCode Foo
----------- ---
5 blu
Looks correct to me. The values 1, 2 and 3 are duplicated between multiple files, so they are excluded. 4, 5 and 6 exist only in single files, while 6 is a duplicate value only within a single file.
Understanding the code
Maybe it is easier to understand how this code works, by looking at the intermediate output of the pipeline produced by the Group-Object line:
Count Name Group
----- ---- -----
2 1 {@{ServiceCode=1; Foo=fop; FileName=a.csv}, @{ServiceCode=1; Foo=bli; FileName=c.csv}}
3 2 {@{ServiceCode=2; Foo=fip; FileName=a.csv}, @{ServiceCode=2; Foo=bir; FileName=b.csv}, @{ServiceCode=2; Foo=bla; FileName=c.csv}}
2 3 {@{ServiceCode=3; Foo=fap; FileName=a.csv}, @{ServiceCode=3; Foo=bam; FileName=b.csv}}
1 4 {@{ServiceCode=4; Foo=biz; FileName=b.csv}}
1 5 {@{ServiceCode=5; Foo=blu; FileName=c.csv}}
2 6 {@{ServiceCode=6; Foo=bar; FileName=b.csv}, @{ServiceCode=6; Foo=baz; FileName=b.csv}}
Here the Name contains the unique ServiceCode values, while Group "links" the data to the files.
From here it should already be clear how to find values that exist only in single files. If duplicate ServiceCode values within a single file weren't allowed, we could even simplify the filter to Where-Object Count -eq 1. Since it was stated that dupes within single files may exist, we need Sort-Object -Unique to count multiple equal file names within a group as only one.
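To see the difference between the two filters without touching any files, here is a small in-memory sketch. The rows mirror the test input above (without the Foo column), and the variable names are made up for the illustration.

```powershell
# In-memory stand-in for the imported rows, already tagged with their file name
$rows = @(
    [pscustomobject]@{ ServiceCode = '1'; FileName = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '2'; FileName = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '3'; FileName = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '6'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '6'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '3'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '2'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '4'; FileName = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '2'; FileName = 'c.csv' }
    [pscustomobject]@{ ServiceCode = '5'; FileName = 'c.csv' }
    [pscustomobject]@{ ServiceCode = '1'; FileName = 'c.csv' }
)

$groups = $rows | Group-Object ServiceCode

# Simplified filter: only correct if a code never repeats within one file
$byCount = @($groups | Where-Object Count -eq 1 | ForEach-Object Name)

# Filter from the answer: counts distinct file names per code
$byFile = @($groups |
    Where-Object { @($_.Group.FileName | Sort-Object -Unique).Count -eq 1 } |
    ForEach-Object Name)
```

Here $byCount misses code 6, because it appears twice in b.csv, while $byFile keeps it.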
It is not completely clear what you expect as output.
If it is just the ServiceCodes that intersect, then this is actually a duplicate of:
Comparing two arrays & get the values which are not common
Union and Intersection in PowerShell?
But taking that you actually want the related object and files, you might use this approach:
$HashTable = @{}
ForEach ($File in Get-ChildItem .\*.csv) {
ForEach ($Object in (Import-Csv $File)) {
$HashTable[$Object.ServiceCode] = $Object |Select-Object *,
@{ n='File'; e={ $File.Name } },
@{ n='Count'; e={ $HashTable[$Object.ServiceCode].Count + 1 } }
}
}
$HashTable.Values |Where-Object Count -eq 1
Here is my take on this fun exercise. I'm using a similar approach to yours with the HashSet, but adding [System.StringComparer]::OrdinalIgnoreCase so that the .Contains(..) method compares case-insensitively:
using namespace System.Collections.Generic
# Generate Random CSVs:
$charset = 'abABcdCD0123xXyYzZ'
$ran = [random]::new()
$csvs = @{}
foreach($i in 1..50) # Create 50 CSVs for testing
{
$csvs["csv$i"] = foreach($z in 1..50) # With 50 Rows
{
$index = (0..2).ForEach({ $ran.Next($charset.Length) })
[pscustomobject]@{
ServiceCode = [string]::new($charset[$index])
Data = $ran.Next()
}
}
}
# Get Unique 'ServiceCode' per CSV:
$result = @{}
foreach($key in $csvs.Keys)
{
# Get all unique `ServiceCode` from the other CSVs
$tempHash = [HashSet[string]]::new(
[string[]]($csvs[$csvs.Keys -ne $key].ServiceCode),
[System.StringComparer]::OrdinalIgnoreCase
)
# Filter the unique `ServiceCode`
$result[$key] = foreach($line in $csvs[$key])
{
if(-not $tempHash.Contains($line.ServiceCode))
{
$line
}
}
}
# Test if the code worked,
# If something is returned from here means it didn't work
foreach($key in $result.Keys)
{
$tmp = $result[$result.Keys -ne $key].ServiceCode
foreach($val in $result[$key])
{
if($val.ServiceCode -in $tmp)
{
$val
}
}
}
I was able to get the unique items as follows:
# Get all items of CSVs in a single variable with adding the file name at the last column
$CSVs = Get-ChildItem "C:\temp\ItemCompare\*.csv" | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName
$FileName = $_.Name
$CSV | Select-Object *,#{N='Filename';E={$FileName}}
}
Foreach($line in $CSVs){
$ServiceCode = $line.ServiceCode
$file = $line.Filename
if (!($CSVs | where {$_.ServiceCode -eq $ServiceCode -and $_.filename -ne $file})){
$line
}
}
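The loop above rescans the whole collection for every line, which can get slow with large files. A possible alternative, sketched here with made-up in-memory rows instead of real CSV files, is to first record which files each ServiceCode appears in and then filter in a single pass. The variable names are illustrative only.

```powershell
# Stand-in for $CSVs: rows that already carry their source file name
$rows = @(
    [pscustomobject]@{ ServiceCode = '1'; Filename = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '2'; Filename = 'a.csv' }
    [pscustomobject]@{ ServiceCode = '2'; Filename = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '4'; Filename = 'b.csv' }
    [pscustomobject]@{ ServiceCode = '4'; Filename = 'b.csv' }
)

# Pass 1: map each ServiceCode to the set of files it occurs in
$filesPerCode = @{}
foreach ($row in $rows) {
    if (-not $filesPerCode.ContainsKey($row.ServiceCode)) {
        $filesPerCode[$row.ServiceCode] = [System.Collections.Generic.HashSet[string]]::new()
    }
    $null = $filesPerCode[$row.ServiceCode].Add($row.Filename)
}

# Pass 2: keep the rows whose code occurs in exactly one file
$unique = @($rows | Where-Object { $filesPerCode[$_.ServiceCode].Count -eq 1 })
```

With these rows, $unique holds the row for code 1 from a.csv and both rows for code 4 from b.csv.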
This is a follow-up question from PowerShell | EVTX | Compare Message with Array (Like)
I changed the tactic slightly, now I am collecting all the services installed,
$7045 = Get-WinEvent -FilterHashtable @{ Path="1system.evtx"; Id = 7045 } | select @{N=’Timestamp’; E={$_.TimeCreated.ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ssZ')}},
Id,
@{N=’Machine Name’; E={$_.MachineName}},
@{N=’Service Name’; E={$_.Properties[0].Value}},@{N=’Image Path’; E={$_.Properties[1].Value}},
@{N=’RunAsUser’; E={$_.Properties[4].Value}},@{N=’Installed By’; E={$_.UserId}}
Now I check each object for suspicious traits and, if any are found, add a 'Suspicious' column with the value 'Yes'. I want to leave the decision up to the analyst, and I'm pretty sure the bad guys might use something we've not seen before.
foreach ($Evt in $7045)
{
if ($Evt.'Image Path' -match $sus)
{
$Evt | Add-Member -MemberType NoteProperty -Name 'Suspicious' -Value 'Yes'
}
}
Now, I'm unable to get PowerShell to display all columns unless I specifically Select them
$7045 | Format-Table
Same goes for CSV Export. The first two don't include the Suspicious Column but the third one does but that's because I'm explicitly asking it to.
$7045 | select * | Export-Csv -Path test.csv -NoTypeInformation
$7045 | Export-Csv -Path test.csv -NoTypeInformation
$7045 | Select-Object Timestamp, Id, 'Machine Name', 'Service Name', 'Image Path', 'RunAsUser', 'Installed By', Suspicious | Export-Csv -Path test.csv -NoTypeInformation
I read the Export-Csv documentation on Microsoft's site and searched Stack Overflow for tips. I think it has something to do with PowerShell checking the first row and then comparing whether each property exists on the second row, and so on.
Thank you
Part of the issue is how objects are displayed in the console: the first object's properties determine the columns that are displayed.
The bigger problem is that Export-Csv does not export properties that are missing from the first object unless they are explicitly added to all objects or the objects are reconstructed. One easy way to do that is Select-Object, as you pointed out in the question.
Given the following example:
$test = @(
[pscustomobject]@{
A = 'ValA'
}
[pscustomobject]@{
A = 'ValA'
B = 'ValB'
}
[pscustomobject]@{
C = 'ValC'
D = 'ValD'
E = 'ValE'
}
)
Format-Table will not display the properties B to E:
$test | Format-Table
A
-
ValA
ValA
Format-List can display the objects properly, because each property with its corresponding value gets its own console line in the display:
PS /> $test | Format-List
A : ValA
A : ValA
B : ValB
C : ValC
D : ValD
E : ValE
Export-Csv and ConvertTo-Csv will also miss properties B to E:
$test | ConvertTo-Csv
"A"
"ValA"
"ValA"
You have different options as a workaround. You could add the Suspicious property to all objects, using $null as the value for events that are not suspicious.
Another workaround is to use Select-Object explicitly calling the Suspicious property (this works because you know the property is there and you know it's Name).
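As a rough sketch of the first workaround: the objects and the $sus pattern below are made-up stand-ins for the real $7045 events, so only the shape of the loop matters here.

```powershell
# Made-up sample objects standing in for the $7045 events
$events = @(
    [pscustomobject]@{ 'Service Name' = 'SvcA'; 'Image Path' = 'C:\Windows\System32\svchost.exe' }
    [pscustomobject]@{ 'Service Name' = 'SvcB'; 'Image Path' = 'C:\Temp\cmd.exe /c whoami' }
)
$sus = 'cmd\.exe'   # assumed sample pattern

foreach ($evt in $events) {
    # Every object gets the property, so the first object already defines the column
    $value = if ($evt.'Image Path' -match $sus) { 'Yes' } else { $null }
    $evt | Add-Member -MemberType NoteProperty -Name 'Suspicious' -Value $value
}

$csv = $events | ConvertTo-Csv -NoTypeInformation
```

Because the property now exists on every object, the header line in $csv includes the Suspicious column without an explicit Select-Object.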
If you did not know how many properties your objects had, a dynamic way to solve this would be to discover their properties using the PSObject intrinsic member.
using namespace System.Collections.Generic
function ConvertTo-NormalizedObject {
[CmdletBinding()]
param(
[Parameter(ValueFromPipeline, Mandatory)]
[object[]] $InputObject
)
begin {
$list = [List[object]]::new()
$props = [HashSet[string]]::new([StringComparer]::InvariantCultureIgnoreCase)
}
process {
foreach($object in $InputObject) {
$list.Add($object)
foreach($property in $object.PSObject.Properties) {
$null = $props.Add($property.Name)
}
}
}
end {
$list | Select-Object ([object[]] $props)
}
}
Usage:
# From Pipeline
$test | ConvertTo-NormalizedObject | Format-Table
# From Positional / Named parameter binding
ConvertTo-NormalizedObject $test | Format-Table
Lastly, a pretty easy way of doing it thanks to Select-Object -Unique:
$prop = $test.ForEach{ $_.PSObject.Properties.Name } | Select-Object -Unique
$test | Select-Object $prop
Using $test for this example, the result would become:
A B C D E
- - - - -
ValA
ValA ValB
ValC ValD ValE
Continuing from my previous answer, you can add a column Suspicious straight away if you take out the Where-Object filter and simply add another calculated property to the Select-Object cmdlet:
# create a regex for the suspicious executables:
$sus = '(powershell|cmd|psexesvc)\.exe'
# alternatively you can join the array items like this:
# $sus = ('powershell.exe','cmd.exe','psexesvc.exe' | ForEach-Object {[regex]::Escape($_)}) -join '|'
$7045 = Get-WinEvent -FilterHashtable @{ LogName = 'System';Id = 7045 } |
Select-Object Id,
@{N='Timestamp';E={$_.TimeCreated.ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ssZ')}},
@{N='Machine Name';E={$_.MachineName}},
@{N='Service Name'; E={$_.Properties[0].Value}},
@{N='Image Path'; E={$_.Properties[1].Value}},
@{N='RunAsUser'; E={$_.Properties[4].Value}},
@{N='Installed By'; E={$_.UserId}},
@{N='Suspicious'; E={
if ($_.Properties[1].Value -match $sus) { 'Yes' } else {'No'}
}}
$7045 | Export-Csv -Path 'X:\Services.csv' -UseCulture -NoTypeInformation
Because you have many columns, this will not fit the console width anymore if you do $7045 | Format-Table, but the CSV file will hold all columns you wanted.
I added switch -UseCulture to the Export-Csv cmdlet, which makes sure you can simply double-click the csv file so it opens correctly in your Excel.
As a side note: please do not use those curly so-called 'smart quotes' in code, as they may lead to unforeseen errors. Straighten these ’ thingies and use normal double or single quotes (" and ')
In my PowerShell script, I'm working with a CSV file that looks like this (with a number of rows and columns that can vary, but there will always be at least the headers and the first 2 columns):
OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1
I basically list servers in the first column and users in the first row (CSV header). This represents a user "access granting" matrix to servers (1 for "give access", 0 for "remove access", and void for "don't change").
I'm looking for a way to extract only the rows that include a value equal to "1" or "0" between (and including) the 3rd and last column. (= to eventually get the list of servers where access rights should be changed)
So taking the above example, I only want the following lines returned:
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Windows;hostname5;1;1;1
Any hints to make this possible? Or the opposite (getting the ones without any 0 or 1)?
Even if it means using "Get-Content" instead of "Import-CSV". I don't care about the 1st (headers) row; I know how to exclude that.
Thank you!
--- Final solution, thanks to @Tomalak's answer:
$AccessMatrix = Import-CSV $CSVfile -delimiter ';'
$columns = $AccessMatrix | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$AccessMatrix = $AccessMatrix | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col.trim() -eq "1" -OR $row.$col.trim() -eq "0") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The following uses Get-Member to select the names of all columns after the first two.
Then, using ForEach-Object, we can output only those rows that have a value in any of those columns.
$data = ConvertFrom-Csv "OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1" -Delimiter ";"
$columns = $data | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$data | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The break statement stops the execution of the inner foreach loop because there is no point in further checking as soon as the first column with any value is found.
This is equivalent to the above, if you prefer Where-Object:
$data | Where-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
return $true
}
}
}
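One caveat, as far as I know: Get-Member returns members sorted by name, so -Skip 2 skips the two alphabetically first columns, not necessarily the first two columns of the file. That happens to work for OS;IP;user0;..., but a user column named admin would sort before IP. A sketch that keeps the file's column order by reading the names from PSObject.Properties instead (the admin column and the shortened sample data are assumptions for the illustration):

```powershell
# Sample data with a user column that sorts before "IP" and "OS"
$text = @'
OS;IP;admin;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;hostname4;;;
'@
$data = ConvertFrom-Csv $text -Delimiter ';'

# PSObject.Properties keeps the column order of the file
$columns = $data[0].PSObject.Properties.Name | Select-Object -Skip 2

# Keep only the rows where at least one user column is exactly "0" or "1"
$changed = @($data | Where-Object {
    foreach ($col in $columns) {
        if ($_.$col -in '0', '1') { return $true }
    }
})
```

With this input, $columns is admin, user1, user3 and $changed contains only the hostname2 row.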
Is it possible to display the results of a PowerShell Compare-Object in two columns showing the differences of reference vs difference objects?
For example using my current cmdline:
Compare-Object $Base $Test
Gives:
InputObject SideIndicator
987654 =>
555555 <=
123456 <=
In reality the list is rather long. For easier data reading is it possible to format the data like so:
Base Test
555555 987654
123456
So each column shows which elements exist in that object vs the other.
For bonus points it would be fantastic to have a count in the column header like so:
Base(2) Test(1)
555555 987654
123456
Possible? Sure. Feasible? Not so much. PowerShell wasn't really built for creating this kind of tabular output. What you can do is collect the differences in a hashtable as nested arrays by input file:
$ht = @{}
Compare-Object $Base $Test | ForEach-Object {
$value = $_.InputObject
switch ($_.SideIndicator) {
'=>' { $ht['Test'] += @($value) }
'<=' { $ht['Base'] += @($value) }
}
}
then transpose the hashtable:
$cnt = $ht.Values |
ForEach-Object { $_.Count } |
Sort-Object |
Select-Object -Last 1
$keys = $ht.Keys | Sort-Object
0..($cnt-1) | ForEach-Object {
$props = [ordered]@{}
foreach ($key in $keys) {
$props[$key] = $ht[$key][$_]
}
New-Object -Type PSObject -Property $props
} | Format-Table -AutoSize
To include the item count in the header name change $props[$key] to $props["$key($($ht[$key].Count))"].
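For reference, here are both steps combined into one self-contained sketch with the counts in the headers. The $Base and $Test values are sample data based on the question, with an extra common value 111111 added. It assumes there is at least one difference; with none, the range 0..-1 would still run and would need a guard.

```powershell
$Base = '555555', '123456', '111111'
$Test = '987654', '111111'

# Collect the differences per side
$ht = @{ Base = @(); Test = @() }
Compare-Object $Base $Test | ForEach-Object {
    $value = $_.InputObject          # inside switch, $_ is the SideIndicator
    switch ($_.SideIndicator) {
        '=>' { $ht['Test'] += @($value) }
        '<=' { $ht['Base'] += @($value) }
    }
}

# Longest column decides the number of rows
$rowCount = ($ht.Values | ForEach-Object { $_.Count } | Measure-Object -Maximum).Maximum

# Transpose: one object per row, one property per side
$table = @(0..($rowCount - 1) | ForEach-Object {
    $i = $_
    $props = [ordered]@{}
    foreach ($key in ($ht.Keys | Sort-Object)) {
        $props["$key($($ht[$key].Count))"] = $ht[$key][$i]
    }
    [pscustomobject]$props
})

$table | Format-Table -AutoSize
```

This yields two rows with the columns Base(2) and Test(1). The second Test cell is empty because indexing past the end of an array returns $null.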
How does one access data imported from a CSV file by using dynamic note property names? That is, one doesn't know the column names beforehand. They do match a pattern and are extracted from the CSV file when the script runs.
As for an example, consider a CSV file:
"Header 1","Header A","Header 3","Header B"
0,0,0,0
1,2,3,4
5,6,7,8
I'd like to extract only columns that end with a letter. To do this, I read the header row and extract names with a regex like so,
$reader = new-object IO.StreamReader("C:\tmp\data.csv")
$line = $reader.ReadLine()
$headers = @()
$line.Split(",") | % {
$m = [regex]::match($_, '("Header [A-Z]")')
if($m.Success) { $headers += $m.value } }
This will get all the column names I care about:
"Header A"
"Header B"
Now, to access a CSV file I import it like so,
$csvData = import-csv "C:\tmp\data.csv"
Import-CSV will create a custom object that has properties as per the header row. One can access the fields by NoteProperty names like so,
$csvData | % { $_."Header A" } # Works fine
This obviously requires one to know the column name in advance. I'd like to use the column names I extracted and stored in $headers. How would I do that?
Some things I've tried so far
$csvData | % { $_.$headers[0] } # Error: Cannot index into a null array.
$csvData | % { $np = $headers[0]; $_.$np } # Doesn't print anything.
$csvData | % { $_.$($headers[0]) } # Doesn't print anything.
I could change the script so that it writes another script that does know the column names. Is that my only solution?
I think you want this:
[string[]]$headers = $csvdata | gm -MemberType "noteproperty" |
?{ $_.Name -match "Header [a-zA-Z]$"} |
select -expand Name
$csvdata | select $headers
Choose the headers that match the condition (in this case, ones ending with a letter) and then get the csv data for those headers.
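The attempts in the question most likely printed nothing because the regex there also captures the surrounding double quotes, so the stored names ("Header A" with quotes) do not match the property names that Import-Csv creates (Header A without quotes). Once the names are clean, dynamic access through a variable works. A minimal sketch using the sample data, with ConvertFrom-Csv standing in for Import-Csv:

```powershell
$csvData = ConvertFrom-Csv @'
"Header 1","Header A","Header 3","Header B"
0,0,0,0
1,2,3,4
5,6,7,8
'@

# Property names come without quotes; -match on an array returns the matching elements
$headers = $csvData[0].PSObject.Properties.Name -match 'Header [A-Za-z]$'

# Parentheses make PowerShell evaluate the index before the property lookup
$firstColumn = $csvData | ForEach-Object { $_.($headers[0]) }
```

Here $headers holds Header A and Header B, and $firstColumn holds the values of the Header A column.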
The first thing (and the only one, sorry) that came to mind is:
$csvData | % { $_.$(( $csvData | gm | ? { $_.membertype -eq "noteproperty"} )[0].name) }
to get the first column's values, and
$csvData | % { $_.$(( $csvData | gm | ? { $_.membertype -eq "noteproperty"} )[1].name) }
for the second column, and so on.
Is this what you need?
You can use a custom script to parse the csv manually:
$content = Get-Content "C:\tmp\data.csv"
$header = $content | Select -first 1
$columns = $header.Split(",")
$indexes = @()
for($i = 0; $i -lt $columns.Count; $i++)
{
# a column qualifies if its name, without the surrounding quotes, ends with a letter: "[A-Za-z]$"
if ($columns[$i].Trim('"') -match "[A-Za-z]$")
{
$indexes += $i
}
}
$outputFile = "C:\tmp\outdata.csv"
Remove-Item $outputFile -ea 0
foreach ($line in $content)
{
$output = ""
$rowcol = $line.Split(",")
[string]::Join(",", ($indexes | foreach { $rowcol[$_] })) | Add-Content $outputFile
}