Issue with Array subtraction in Powershell

I have two CSVs as follows:
File1:
A
20180809000
20180809555
20180809666
20180809777
20180809888
File2:
A
20180809000
20180809555
20180809666
20180809777
I want to find the difference File1 - File2, which should output 20180809888. I tried the following:
$a1= Import-Csv -Path $file1 | select A
$a2 = Import-Csv -Path $file2 | select A
$a1| where {$a2 -notcontains $_}
But it outputs the entire file 1:
A
--------------
20180809000
20180809555
20180809666
20180809777
20180809888
I tried intersection also, but that outputs null.

The simplest solution is to use:
> Compare-Object (Get-Content .\File1.csv) (Get-Content .\File2.csv) -PassThru
20180809888
Or using Import-Csv
> Compare-Object (Import-Csv .\File1.csv).A (Import-Csv .\File2.csv).A -Passthru
20180809888
Or
> (Compare-Object (Import-Csv .\File1.csv) (Import-Csv .\File2.csv) -Passthru).A
20180809888

Your last line should be the following:
$a1.A.where{$_ -notin $a2.A}
To preserve the column, you can do the following for the last line:
$a1.where{$_.A -notin $a2.A}
The problem with this approach arises if the second file has more data than the first file. Then you would need to do something like this for your last line:
$a1 | compare $a2 | select -expand inputobject

select A still returns objects with a property named A, not the bare values, so -notcontains ends up comparing objects rather than strings.
# Returns an object list with property A
Import-Csv -Path $file | select A # (shorthand for Select-Object -Property A)
# A
# ---
# value1
# value2
# ...
You can get the array of values of property A using dot notation, e.g.:
# Returns the list of values of the A property
(Import-Csv -Path $file).A
# value1
# value2
# ...
The following should work:
$a1= (Import-Csv -Path $file1).A
$a2 = (Import-Csv -Path $file2).A
$a1 | where {$a2 -notcontains $_}
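To see the difference between comparing row objects and comparing their values, here is a minimal sketch. It builds the two lists in memory with ConvertFrom-Csv instead of reading files, and the shortened sample data and the variable names $byObject and $byValue are assumptions for illustration only.

```powershell
# In-memory stand-ins for File1 and File2 (shortened sample data)
$a1 = @'
A
20180809000
20180809555
20180809888
'@ | ConvertFrom-Csv

$a2 = @'
A
20180809000
20180809555
'@ | ConvertFrom-Csv

# Comparing row objects: no object in $a2 is "equal" to an object in $a1,
# so every row of $a1 passes the filter
$byObject = @($a1 | Where-Object { $a2 -notcontains $_ })

# Comparing the values of column A: only the value missing from $a2 remains
$byValue = @($a1.A | Where-Object { $a2.A -notcontains $_ })
```

With this data, $byObject should still hold all three rows, while $byValue should hold only 20180809888.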

Related

Powershell - Check if Value in CSV exists

I want to check if a string exists in a csv file.
I'm trying to use if ($PCname -in $logFileLocation) { write-output "true" } else { write-output "false" }
However this always returns false.
How can I check for a value within a csv file?

You can integrate a query within the result of dir ( Get-Childitem )
$Query = "yourString"
$List = Get-Childitem *.CSV |
Select-Object Name,@{Name="MatchesQuery"; Expression={($_ | Get-Content -raw) -match $Query}}
$List
Attention: $Query will be interpreted as a regular expression.
That way you get a list of all files, each flagged with whether it matches your criteria. You can then filter like this:
$List | where {$_.MatchesQuery -eq $true}
Depending on your CSV, it might be better and more reliable not to search the raw output of Get-Content, but to use ConvertFrom-CSV and select the column you want to search in.
This version reads CSV and searches the column "ComputerName" for your $Query
$Query = "yourString"
$List = Get-Childitem *.CSV | Select-Object Name,@{Name="MatchesQuery"; Expression={@($_ | Get-Content -raw | ConvertFrom-CSV -Delimiter ";" | where {$_.ComputerName -eq "$Query"}).Count -gt 0}}
$List

this could be helpful:
$find = "some_string"
Get-Content C:\temp\demo.csv | Select-String $find
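For a single file, the original test fails because -in compares $PCname against the path string itself, not against the file's contents. A minimal sketch of checking one column instead is shown below. The temporary file, its LastSeen column, and the sample values are hypothetical; the ComputerName column name is borrowed from the answer above.

```powershell
# Hypothetical sample file written to the temp folder
$logFileLocation = Join-Path ([IO.Path]::GetTempPath()) 'pc-log-demo.csv'
Set-Content -Path $logFileLocation -Value 'ComputerName,LastSeen', 'PC-001,2018-08-09', 'PC-002,2018-08-10'

$PCname = 'PC-002'

# Load the rows and test the value against the ComputerName column
$names   = (Import-Csv -Path $logFileLocation).ComputerName
$found   = $PCname -in $names
$missing = 'PC-999' -in $names

if ($found) { Write-Output "true" } else { Write-Output "false" }

Remove-Item -Path $logFileLocation
```

Unlike Select-String, this is an exact match on one column, so a name that only appears as part of another value is not reported.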

Specific match between two CSV files

I have two CSV files like this:
CSV1:
Name
test;
test & example;
test & example & testagain;
CSV2:
Name
test1;
test&example;
test & example&testagain;
I want to compare each line of CSV1 with each line of CSV2 and, if the first 5 letters match, write the result.
I'm able to compare them but only if match perfectly:
$CSV1 = Import-Csv -Path ".\client.csv" -Delimiter ";"
$CSV2 = Import-Csv ".\client1.csv" -Delimiter ";"
foreach ($record in $CSV1) {
$result = $CSV2 | Where {$_.name -like $record.name}
$result
}

You can do so with Compare-Object and a custom property definition.
Compare-Object $CSV1 $CSV2 -Property {$_.name -replace '^(.{5}).*', '$1'} -PassThru
$_.name -replace '^(.{5}).*', '$1' will take the first 5 characters from the property name (or less if the string is shorter than 5 characters) and remove the rest. This property is then used for comparing the records from $CSV1 and $CSV2. The parameter -PassThru makes the cmdlet emit the original data rather than objects with just the custom property. In theory you could also use $_.name.Substring(0, 5) instead of a regular expression replacement for extracting the first 5 characters. However, that would throw an error if the name is shorter than 5 characters like in the first record from $CSV1.
By default Compare-Object outputs the differences between the input objects, so you also need to add the parameters -IncludeEqual and -ExcludeDifferent to get just the matching records.
Pipe the result through Select-Object * -Exclude SideIndicator to remove the property SideIndicator from the output.
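Putting the pieces of this answer together, a sketch of the full command might look like the following. The data is built in memory from the question's samples (without the trailing semicolons, as they would be after importing with -Delimiter ";"), and the names $firstFive and $equalRows are assumptions for illustration. As far as I can tell, Compare-Object pairs rows up rather than producing every combination, so this may return fewer rows than the loop-based answers.

```powershell
$CSV1 = @'
Name
test
test & example
test & example & testagain
'@ | ConvertFrom-Csv

$CSV2 = @'
Name
test1
test&example
test & example&testagain
'@ | ConvertFrom-Csv

# Compare on the first 5 characters (or the whole name if it is shorter)
$firstFive = { $_.Name -replace '^(.{5}).*', '$1' }

# Keep only records whose key matches, then drop the SideIndicator property
$equalRows = Compare-Object $CSV1 $CSV2 -Property $firstFive -PassThru -IncludeEqual -ExcludeDifferent |
    Select-Object * -ExcludeProperty SideIndicator

$equalRows
```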

foreach ($record in $CSV1) {
$CSV2 | Where {"$($_.name)12345".SubString(0, 5) -eq "$($record.name)12345".SubString(0, 5)} |
ForEach {[PSCustomObject]@{Name1 = $Record.Name; Name2 = $_.Name}}
}
or:
... | Where {($_.name[0..4] -Join '') -eq ($record.name[0..4] -Join '')} | ...
Using this Join-Object cmdlet:
$CSV1 | Join $CSV2 `
-Using {($Left.name[0..4] -Join '') -eq ($Right.name[0..4] -Join '')} `
-Property @{Name1 = {$Left.Name}; Name2 = {$Right.Name}}
All the above result in:
Name1                       Name2
-----                       -----
test & example;             test & example&testagain;
test & example & testagain; test & example&testagain;

Need to output multiple rows to CSV file

I am using the following script that iterates through hundreds of text files looking for specific instances of the regex expression within. I need to add a second data point to the array, which tells me the object the pattern matched in.
In the below script the [Regex]::Matches($str, $Pattern) | % { $_.Value } piece returns multiple rows per file, which cannot be easily output to a file.
What I would like to know is, how would I output a 2 column CSV file, one column with the file name (which should be $_.FullName), and one column with the regex results? The code of where I am at now is below.
$FolderPath = "C:\Test"
$Pattern = "(?i)(?<=\b^test\b)\s+(\w+)\S+"
$Lines = @()
Get-ChildItem -Recurse $FolderPath -File | ForEach-Object {
$_.FullName
$str = Get-Content $_.FullName
$Lines += [Regex]::Matches($str, $Pattern) |
% { $_.Value } |
Sort-Object |
Get-Unique
}
$Lines = $Lines.Trim().ToUpper() -replace '[\r\n]+', ' ' -replace ";", '' |
Sort-Object |
Get-Unique # Cleaning up data in array

I can think of two ways, but the simplest is to use a hashtable (dict). Another way is to create psobjects to fill your Lines variable. I am going to go with the simple way so you only need one variable, the hashtable.
$FolderPath = "C:\Test"
$Pattern = "(?i)(?<=\b^test\b)\s+(\w+)\S+"
$Results = @{}
Get-ChildItem -Recurse $FolderPath -File |
ForEach-Object {
$str = Get-Content $_.FullName
$Line = [regex]::matches($str,$Pattern) | % { $_.Value } | Sort-Object | Get-Unique
$Line = $Line.Trim().ToUpper() -Replace '[\r\n]+', ' ' -Replace ";",'' | Sort-Object | Get-Unique # Cleaning up data in array
$Results[$_.FullName] = $Line
}
$Results.GetEnumerator() | Select @{L="Folder";E={$_.Key}}, @{L="Matches";E={$_.Value}} | Export-Csv -NoType -Path <Path to save CSV>
Your results will be in $Results. $Results.Keys contains the file paths (the "Folder" column above) and $Results.Values contains the matches for each file. You can look up the results for a particular file by its key, e.g. $Results["<full file path>"]. By default, indexing with a key that does not exist returns $null rather than raising an error.
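The other route mentioned above, creating one object per match, could be sketched as follows. It writes one CSV row per file/match pair, which avoids storing an array in a single cell (as far as I know, Export-Csv writes the type name, such as System.Object[], for array values). The temporary folder, the sample files, the simplified pattern, and the output name matches.csv are all assumptions for illustration, not the asker's real data.

```powershell
# Hypothetical sample data in a temporary folder
$folder = Join-Path ([IO.Path]::GetTempPath()) ("regex-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $folder | Out-Null
Set-Content -Path (Join-Path $folder 'a.txt') -Value 'test alpha', 'test beta', 'other line'
Set-Content -Path (Join-Path $folder 'b.txt') -Value 'test gamma'

# Assumed, simplified pattern: the word following "test" at the start of a line
$pattern = '(?im)^test\s+(\w+)'

# Emit one object per match, carrying the file name alongside the matched text
$rows = Get-ChildItem -Path $folder -File -Recurse | ForEach-Object {
    $file = $_
    $text = Get-Content -Path $file.FullName -Raw
    [regex]::Matches($text, $pattern) | ForEach-Object {
        [PSCustomObject]@{
            File  = $file.FullName
            Match = $_.Groups[1].Value.ToUpper()
        }
    }
}

# Two-column CSV: File, Match (duplicates removed)
$csvPath = Join-Path $folder 'matches.csv'
$rows | Sort-Object File, Match -Unique | Export-Csv -Path $csvPath -NoTypeInformation

$csvRows = Import-Csv -Path $csvPath
Remove-Item -Path $folder -Recurse
```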

Get first two items positionally from Import-CSV row

I have a series of files that have changed some header naming and column counts over time. However, the files always have the first column as the start date and second column as the end date.
I would like to get just these two columns, but the name has changed over time.
What I have tried is this:
$FileContents=Import-CSV -Path "$InputFilePath"
foreach ($line in $FileContents)
{
$StartDate=$line[0]
$EndDate=$line[1]
}
...but $FileContents is (I believe) an array of a type (objects?) that I'm not sure how to positionally access in PowerShell. Any help would be appreciated.
Edit: The files switched from comma delimiter to pipe delimiter a while back and there are 1000s of files to work with, so I use Import-CSV because it can implicitly read either format.

You could use the -Header parameter to give the first two columns of the CSV the header names you want. Then you skip the first line, which holds the old header.
$FileContents = Import-CSV -Path "$InputFilePath" -Header "StartDate","EndDate" | Select-Object "StartDate","EndDate" -Skip 1
foreach ($line in $FileContents) {
$StartDate = $line.StartDate
$EndDate = $line.EndDate
}
Here's an example:
Example.csv
a,b,c
1,2,3
4,5,6
Import-CSV -Path Example.csv -Header "StartDate","EndDate" | Select-Object "StartDate","EndDate" -Skip 1
StartDate EndDate
--------- -------
1         2
4         5

If you use Import-Csv, PowerShell will indeed create an object for each row. The "columns" are called properties. You can select properties with Select-Object, but you have to name the properties you want to select. Since you don't know the property names in advance, you can read them from the first object. Note that Get-Member lists members sorted by name, so it does not reliably give you the column order; PSObject.Properties keeps the properties in the order of the CSV columns.
Use the following sample code and apply it to your script:
$csv = @'
1,2,3,4,5
a,b,c,d,e
g,h,i,j,k
'@
$csv = $csv | ConvertFrom-Csv
# PSObject.Properties preserves column order; Get-Member would sort the names alphabetically
$properties = $csv[0].PSObject.Properties | Select-Object -First 2 -ExpandProperty Name
$csv | Select-Object -Property $properties
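If you only need the first two values of each row by position, whatever the headers are called, one possible sketch is to read the property values in order. The sample data, the header names Begin/Finish/Extra, and the variable names below are made up for illustration.

```powershell
# Sample rows with arbitrary header names
$rows = @'
Begin,Finish,Extra
2020-01-01,2020-01-31,x
2020-02-01,2020-02-29,y
'@ | ConvertFrom-Csv

$dates = foreach ($row in $rows) {
    # Property values come back in column order
    $values = @($row.PSObject.Properties.Value)
    [PSCustomObject]@{
        StartDate = $values[0]
        EndDate   = $values[1]
    }
}

$dates
```

The same loop body should work on rows returned by Import-Csv, since both cmdlets produce the same kind of objects.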

How about this:
$FileContents=get-content -Path "$InputFilePath"
for ($i=0;$i -lt $FileContents.count;$i++){
$textrow = ($FileContents[$i]).split(",")
$StartDate=$textrow[0]
$EndDate=$textrow[1]
#do what you want with the variables
write-host $startdate
write-host $EndDate
}
This assumes you are referencing a comma-delimited CSV file.

Another solution with ForEach-Object (% is an alias) and -split:
Get-Content "example.csv" | select -skip 1 | %{$row=$_ -split ',', 3; [pscustomobject]@{NewCol1=$row[0];NewCol2=$row[1]}}

You can also build the split into the select with calculated properties, like this:
Get-Content "example.csv" | select @{N="Newcol1";E={($_ -split ',', 3)[0]}}, @{N="Newcol2";E={($_ -split ',', 3)[1]}} -skip 1

With ConvertFrom-Csv:
Get-Content "example.csv" | ConvertFrom-Csv -Delimiter ',' -Header col1, col2 | select -skip 1

Using Import-CSV in Powershell, ignoring commented lines

I think that I must be missing something obvious because I'm trying to use Import-CSV to import CSV files that have commented out lines (always beginning with a # as the first character) at the top of the file, so the file looks like this:
#[SpecialCSV],,,,,,,,,,,,,,,,,,,,
#Version,1.0.0,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,
#[Table],,,,,,,,,,,,,,,,,,,,
Header1,Header2,Header3,Header4,Header5,Header6,Header7,...
Data1,Data2,Data3,Data4,Data5,Data6,Data7,...
I'd like to ignore those first 5 lines, but still use Import-csv to get the rest of the information nicely in to Powershell.
Thanks

Simple - just use Select-String to exclude commented lines with a regex, and pipe to ConvertFrom-Csv:
Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv
The difference between Import-Csv and ConvertFrom-Csv is that the former takes input from a file and the latter takes pipeline input; otherwise they do the same thing: convert CSV data to an array of PSCustomObjects. So, by using ConvertFrom-Csv you can do this without modifying the CSV file or using a temp file. You can assign the results to an array or pipe to a Foreach-Object block just as you'd do with Import-Csv:
$array = Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv
or
Get-Content <path to CSV file> | Select-String '^[^#]' | ConvertFrom-Csv | %{
<whatever you want do with the data>
}

CSV has no notion of "comments" - it's just flat data. You'll need to use Get-Content and inspect each line. If a line starts with #, ignore it, otherwise process it.
If you're OK with using a temp file:
Get-content special.csv |where-object{!$_.StartsWith("#")}|add-content -path $(join-path -path $env:temp -childpath "special-filtered.csv");
$mydata = import-csv -path $(join-path -path $env:temp -childpath "special-filtered.csv");
remove-item -path $(join-path -path $env:temp -childpath "special-filtered.csv")
$mydata |format-table -autosize; #Just for illustration
Edit: Forgot about convertfrom-csv. It gets much simpler this way.
$mydata = Get-Content special.csv |
Where-Object { !$_.StartsWith("#") } |
ConvertFrom-Csv

If you feed convertfrom-csv csv data as an array of lines it seems to automatically filter out comments. I frequently use convertfrom-csv this way but I haven't seen it documented.
cat data.csv | convertfrom-csv #skips commented lines automagically
("co1,col2,col3", "abc,def,ghi", "#this,is,a,comment", "abc1,def1,ghi1")|convertfrom-csv
co1 col2 col3
--- ---- ----
abc def ghi
abc1 def1 ghi1
However, the following will not skip comments:
"co1,col2,col3
abc,def,ghi
#this,is,a,comment
abc1,def1,ghi1
"|convertfrom-csv
co1 col2 col3
--- ---- ----
abc def ghi
#this is a
abc1 def1 ghi1
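Rather than relying on that undocumented behaviour, a sketch that filters comment lines explicitly is shown below. It splits a multi-line string into lines first, so it covers the single-string case too. The sample text and variable names are assumptions.

```powershell
# A single multi-line string containing a commented line
$text = "col1,col2,col3`nabc,def,ghi`n#this,is,a,comment`nabc1,def1,ghi1"

# Split into lines, drop empty lines and lines starting with '#', then convert
$rows = $text -split '\r?\n' |
    Where-Object { $_ -and $_ -notmatch '^#' } |
    ConvertFrom-Csv

$rows
```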

Where-object will work after import-csv as well. You just have to reference the first column from csv in the clause.
e.g.:
$EscapeCharacter = '#'
$FilteredData = Import-Csv -Path "$($Home)\Documents\sample.csv" -Delimiter "`t" -Encoding UTF8 | Where-Object {$_.coll1 -notlike "$EscapeCharacter*"}
The sample of tab delimited csv:
coll1 coll2
#Kotehulky SomeValue
Cakovice OtherValue
