Comparing two files: Single column in FirstFile - Multiple columns in SecondFile - powershell

I've figured out how to compare single columns in two files, but I can't figure out how to compare two files when the first has one column and the second has multiple columns. Both contain emails.
First file.csv (contains a single column with emails)
john@email.com
jack@email.com
jill@email.com
Second file.csv (contains multiple columns with emails)
john@email.nl,john@email.eu,john@email.com
jill@email.se,jill@email.com,jill@email.us
I would like to compare them and output the difference. This would result in:
Output.csv
jack@email.com
Anyone able to help me? :)
Single-column comparison that outputs the difference:
#Line extracts emails from list
$SubscribedMails = import-csv .\subscribed.csv | Select-Object -Property email
#Line extracts emails from list
$ValidEmails = import-csv .\users-emails.csv | Select-Object -Property email
$compare = Compare-Object $SubscribedMails $ValidEmails -property email -IncludeEqual | where-object {$_.SideIndicator -eq "<="} | Export-csv .\nonvalid-emails.csv –NoTypeInformation
(Get-Content .\nonvalid-emails.csv) | ForEach-Object { $_ -replace ',"<="' } > .\nonvalid-emails.csv

Since the first file already contains one email address per line, you can read it in directly.
Take the second file and split each line on the commas.
This produces a new array of separate addresses.
Judging from your expected output, you only want addresses that are in the first CSV but not in the second.
Your code could look like this:
$firstFile = Get-Content 'FirstFile.csv'
$secondFile = (Get-Content 'SecondFile.csv').Split(',')
foreach ($item in $firstFile) {
    if ($item -notin $secondFile) {
        # Export-Csv on a bare string would export its Length property, so append the raw line instead
        Add-Content -Path 'output.csv' -Value $item
    }
}
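The same split-and-filter idea can be tried without touching any files. This is only a minimal sketch: the sample arrays stand in for the two files from the question, and the variable names ($firstLines, $secondLines, $known, $missing) are made up for illustration.

```powershell
# Sample data standing in for the two files (hypothetical; mirrors the question)
$firstLines  = 'john@email.com', 'jack@email.com', 'jill@email.com'
$secondLines = 'john@email.nl,john@email.eu,john@email.com',
               'jill@email.se,jill@email.com,jill@email.us'

# Flatten every row of the second file into one list of trimmed addresses
$known = $secondLines | ForEach-Object { $_ -split ',' } | ForEach-Object { $_.Trim() }

# Keep only the addresses that never appear in the second file
$missing = @($firstLines | Where-Object { $_ -notin $known })

# Wrap each address in an object so the CSV output gets an "email" column
$missing | ForEach-Object { [pscustomobject]@{ email = $_ } } | ConvertTo-Csv -NoTypeInformation
```

Swapping ConvertTo-Csv for Export-Csv with a path should write the same rows to disk. The -notin operator needs PowerShell 3.0 or later.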

If you want to keep your existing code, consider a script like this:
#Line extracts emails from list
$SubscribedMails = import-csv .\subscribed.csv | Select-Object -Property email
Rename-Item .\users-emails.csv users-emails.csv.bk
(Get-Content .\users-emails.csv.bk).replace(',', "`r`n") | Set-Content .\users-emails.csv
#Line extracts emails from list
$ValidEmails = import-csv .\users-emails.csv | Select-Object -Property email
$compare = Compare-Object $SubscribedMails $ValidEmails -property email -IncludeEqual | where-object {$_.SideIndicator -eq "<="} | Export-csv .\nonvalid-emails.csv –NoTypeInformation
(Get-Content .\nonvalid-emails.csv) | ForEach-Object { $_ -replace ',"<="' } > .\nonvalid-emails.csv
Remove-Item .\users-emails.csv
Rename-Item .\users-emails.csv.bk users-emails.csv
or, more simply:
#Line extracts emails from list
$SubscribedMails = import-csv .\subscribed.csv | Select-Object -Property email
(Get-Content .\users-emails.csv).replace(',', "`r`n") | Set-Content .\users-emails.csv.bk
#Line extracts emails from list
$ValidEmails = import-csv .\users-emails.csv.bk | Select-Object -Property email
$compare = Compare-Object $SubscribedMails $ValidEmails -property email -IncludeEqual | where-object {$_.SideIndicator -eq "<="} | Export-csv .\nonvalid-emails.csv –NoTypeInformation
(Get-Content .\nonvalid-emails.csv) | ForEach-Object { $_ -replace ',"<="' } > .\nonvalid-emails.csv
Remove-Item .\users-emails.csv.bk

None of the suggestions so far works :(
Still hoping :)
Will delete comment when happy :p

Can you try this?
$One = (Get-Content .\FirstFile.csv).Split(',')
$Two = (Get-Content .\SecondFile.csv).Split(',')
$CsvPath = '.\Output.csv'
$Diff = @()
(Compare-Object ($One | Sort-Object) ($Two | Sort-Object) | `
Where-Object {$_.SideIndicator -eq '<='}).InputObject | `
ForEach-Object {$Diff += New-Object PSObject -Property @{email=$_}}
$Diff | Export-Csv -Path $CsvPath -NoTypeInformation
Output.csv will contain entries that exist in FirstFile but not in SecondFile.

Related

Powershell - Combine CSV files and append a column

I'm trying (badly) to work through combining CSV files into one file and prepending a column that contains the file name. I'm new to PowerShell, so hopefully someone can help here.
I tried initially to do the well documented approach of using Import-Csv / Export-Csv, but I don't see any options to add columns.
Get-ChildItem -Filter *.csv | Select-Object -ExpandProperty FullName | Import-Csv | Export-Csv CombinedFile.txt -UseQuotes Never -NoTypeInformation -Append
Next I'm trying to loop through the files and append the name, which kind of works, but for some reason this stops after the first row is generated. Since it's not a CSV process, I have to use the switch to skip the first title row of each file.
$getFirstLine = $true
Get-ChildItem -Filter *.csv | Where-Object {$_.Name -NotMatch "Combined.csv"} | foreach {
$filePath = $_
$collection = Get-Content $filePath
foreach($lines in $collection) {
$lines = ($_.Basename + ";" + $lines)
}
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "Combined.csv" $linesToWrite
}
This is where the -PipelineVariable parameter comes in real handy. You can set a variable to represent the current iteration in the pipeline, so you can do things like this:
Get-ChildItem -Filter *.csv -PipelineVariable File | Where-Object {$_.Name -NotMatch "Combined.csv"} | ForEach-Object { Import-Csv $File.FullName } | Select *,@{l='OriginalFile';e={$File.Name}} | Export-Csv Combined.csv -Notypeinfo
Merging your CSVs into one and adding a column for the file's name can be done as follows, using a calculated property on Select-Object:
Get-ChildItem -Filter *.csv | ForEach-Object {
    $fileName = $_.Name
    Import-Csv $_.FullName | Select-Object @{
        Name = 'FileName'
        Expression = { $fileName }
    }, *
} | Export-Csv path/to/merged.csv -NoTypeInformation
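To see what the calculated property does without any files on disk, here is a small sketch. The file names and contents in $files are invented, and ConvertFrom-Csv stands in for Import-Csv.

```powershell
# Two tiny in-memory "files" (hypothetical names and contents)
$files = [ordered]@{
    'a.csv' = "Id,Value`n1,x`n2,y"
    'b.csv' = "Id,Value`n3,z"
}

$merged = foreach ($entry in $files.GetEnumerator()) {
    $fileName = $entry.Key
    # The calculated property adds FileName as the first column; * keeps the rest
    $entry.Value -split "`n" | ConvertFrom-Csv |
        Select-Object @{ Name = 'FileName'; Expression = { $fileName } }, *
}

# Each row now carries the name of the file it came from
$merged
```

Putting the calculated property before the * is what makes FileName the leading column; reversing the order should append it instead.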

Powershell Compare 2 csv files on 2 objects and upload a 3 file

I want to compare 2 csv files test1 and test2. I want to export a test3 file only if the Department or the country has changed with the changed details. I have the following code but this compares the 2 whole files not just the Country and Department:
Compare-Object -ReferenceObject (Get-Content -Path C:\scripts\test1.csv) -DifferenceObject (Get-Content -Path C:\scripts\test2.csv) -PassThru | Where-Object{ $_.SideIndicator -eq "=>" } | % { $_ -Replace ',', ";"} | Out-File -FilePath C:\scripts\test3.csv
This is test1:
This is test2:
So test3 should be the same as test2. Does anyone know how to compare the 2 files on just the changes to the Country or Department?
Without using Compare-Object, you can do this like below:
$csv1 = Import-Csv -Path 'C:\scripts\test1.csv'
$csv2 = Import-Csv -Path 'C:\scripts\test2.csv'
$csv3 = foreach ($item in $csv2) {
$compare = $csv1 | Where-Object { $_.email -eq $item.email }
if ($compare.Country -ne $item.Country -or $compare.Department -ne $item.Department) {
# output the object from $csv2 to be collected in the new $csv3
$item
}
}
$csv3 | Export-Csv -Path 'C:\scripts\test3.csv' -Delimiter ';' -NoTypeInformation
If you do want to use Compare-Object, this also works:
$csv1 = Import-Csv -Path 'C:\scripts\test1.csv'
$csv2 = Import-Csv -Path 'C:\scripts\test2.csv'
Compare-Object -ReferenceObject $csv1 -DifferenceObject $csv2 -Property Country,Department -PassThru |
Where-Object{ $_.SideIndicator -eq "=>" } | Select-Object * -ExcludeProperty SideIndicator |
Export-Csv -Path 'C:\scripts\test3.csv' -Delimiter ';' -NoTypeInformation
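If the files get large, looking up each row with Where-Object can become slow. One possible variation, sketched here with invented sample rows in place of test1.csv and test2.csv, is to index the old rows by email in a hashtable first. It assumes email is unique per row, as the first snippet above also does.

```powershell
# Hypothetical sample rows standing in for test1.csv and test2.csv
$old = @'
email,Country,Department
a@x.com,NL,Sales
b@x.com,BE,IT
'@ | ConvertFrom-Csv

$new = @'
email,Country,Department
a@x.com,NL,Sales
b@x.com,DE,IT
c@x.com,FR,HR
'@ | ConvertFrom-Csv

# Index the old rows by email so each lookup is a single hashtable access
$byEmail = @{}
foreach ($row in $old) { $byEmail[$row.email] = $row }

$changed = @(foreach ($row in $new) {
    $prev = $byEmail[$row.email]
    # Emit the row when it is new or when either tracked column differs
    if (-not $prev -or $prev.Country -ne $row.Country -or $prev.Department -ne $row.Department) {
        $row
    }
})
```

$changed can then be piped to Export-Csv with -Delimiter ';' in the same way as $csv3 above.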

Where-Object leaving blank rows

I'm again stuck on something that should be so simple. I have a CSV file in which I need to do a few string modifications and export it back out. The data looks like this:
FullName
--------
\\server\project\AOI
\\server\project\AOI\Folder1
\\server\project\AOI\Folder2
\\server\project\AOI\Folder3\User
I need to do the following:
Remove the "\\server\project" from each line but leave the rest of the line
Delete all rows which do not have a Folder (e.g., in the example above, the first row would be deleted but the other three would remain)
Delete any row with the word "User" in the path
Add a column called T/F with a value of "FALSE" for each record
Here is my initial attempt at this:
Get-Content C:\Folders.csv |
% {$_.replace('\\server\project\','')} |
Where-Object {$_ -match '\\'} |
#Removes User Folders rows from CSV
Where-Object {$_ -notmatch 'User'} |
Out-File C:\Folders-mod.csv
This works to a certain extent, except it deletes my header row and I have not found a way to add a column using Get-Content. For that, I have to use Import-Csv, which is fine, but it seems inefficient to be constantly reloading the same file. So I tried rewriting the above using Import-Csv instead of Get-Content:
$Folders = Import-Csv C:\Folders.csv
foreach ($Folder in $Folders) {
$Folder.FullName = $Folder.FullName.Replace('\\server\AOI\', '') |
Where-Object {$_ -match '\\'} |
Where-Object {$_ -notmatch 'User Files'}
}
$Folders | Export-Csv C:\Folders-mod.csv -NoTypeInformation
I haven't added the coding for adding the new column yet, but this keeps the header. However, I end up with a bunch of empty rows where the Where-Object deletes the line, and the only way I can find to get rid of them is to run the output file through a Get-Content command. This all seems overly complicated for something that should be simple.
So, what am I missing?
Thanks to TheMadTechnician for pointing out what I was doing wrong. Here is my final script (with additional column added):
$Folders= Import-CSV C:\Folders.csv
ForEach ($Folder in $Folders)
{
$Folder.FullName = $Folder.FullName.replace('\\server\project\','')
}
$Folders | Where-Object {$_ -match '\\' -and $_ -notmatch 'User'} |
Select-Object *,@{Name='T/F';Expression={'FALSE'}} |
Export-CSV C:\Folders.csv -NoTypeInformation
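For anyone wondering where the blank rows in the earlier attempt came from: a Where-Object placed inside the assignment filters the value, not the row, so a non-matching value just becomes empty while the object stays in the collection. A small sketch with made-up sample rows:

```powershell
$rows = @'
FullName
\\server\project\AOI
\\server\project\AOI\Folder1
'@ | ConvertFrom-Csv

# Filtering inside the assignment: a non-matching value becomes empty, the row survives
foreach ($r in $rows) {
    $r.FullName = $r.FullName.Replace('\\server\project\', '') | Where-Object { $_ -match '\\' }
}

# Filtering the collection itself is what actually drops rows
$kept = @($rows | Where-Object { $_.FullName })
```

After the loop $rows still holds both objects, one of them with an empty FullName, which is the blank line seen in the exported file.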
I would do this with a Table Array and pscustomobject.
#Create an empty Array
$Table = @()
#Manipulate the data
$Fullname = Get-Content C:\Folders.csv |
    ForEach-Object {$_.replace('\\server\project\', '')} |
    Where-Object {$_ -match '\\'} |
    #Removes User Folders rows from CSV
    Where-Object {$_ -notmatch 'User'}
#Define custom objects
Foreach ($name in $Fullname) {
    $Table += [pscustomobject]@{'Fullname' = $name; 'T/F' = 'FALSE'}
}
#Export results to new csv
$Table | Export-CSV C:\Folders-mod.csv -NoTypeInformation
here's yet another way to do it ... [grin]
$FileList = @'
FullName
\\server\project\AOI
\\server\project\AOI\Folder1
\\server\project\AOI\Folder2
\\server\project\AOI\Folder3\User
'@ | ConvertFrom-Csv
$ThingToRemove = '\\server\project'
$FileList |
    Where-Object {
        # toss out any blank lines
        $_ -and
        # toss out any lines with "user" in them
        $_ -notmatch 'User'
    } |
    ForEach-Object {
        [PSCustomObject]@{
            FullName = $_.FullName -replace [regex]::Escape($ThingToRemove)
            'T/F' = $False
        }
    }
output ...
FullName     T/F
--------     ---
\AOI         False
\AOI\Folder1 False
\AOI\Folder2 False
notes ...
putting a slash in the property name is ... icky [grin]
that requires wrapping the property name in quotes every time you need to access it. try another name - perhaps "Correct".
you can test for blank array items [lines] with $_ all on its own
the [regex]::Escape() stuff is really quite handy
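A quick sketch of why the escaping matters here: -replace treats its pattern as a regular expression, so the raw backslashes in a UNC prefix would not be read as literal text (a sequence like \p would be taken as a regex escape). The variable names below are just for illustration.

```powershell
$prefix = '\\server\project'
$path   = '\\server\project\AOI\Folder1'

# [regex]::Escape doubles each backslash so the pattern matches the prefix literally
$escaped = [regex]::Escape($prefix)

# With no replacement string given, -replace simply removes the match
$trimmed = $path -replace $escaped
```

The .Replace() string method used in other answers does a literal replacement and needs no escaping, but it is case-sensitive, whereas -replace ignores case by default.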

Combining CSV files in Powershell - different headings

I need to take a slew of csv files from a directory and get them into an array in Powershell (to eventually manipulate and write back to a CSV).
The problem is there are 5 file types. I need around 8 columns from each. The columns are essentially the same, but have different headings.
Is there an easy way to do this? I started creating a custom object with my 8 fields, looping through the files importing each one, looking at the filename (which tells me the column names I need) and then a bunch of ifs to add it to my custom object array.
I was wondering if there is a simpler way...like with a template saying which columns from each file.
I wound up doing this. It may not have been the most efficient, but it works. I wound up writing out each file separately and combining at the end, as PS really got bogged down (over a million rows combined).
$Newcsv = @()
$path = "c:\scrap\BWFILES\"
$files = gci -path $path -recurse -filter *.csv | Where-Object { ! ($_.psiscontainer) }
$counter=1
foreach($file in $files)
{
    $csv = Import-Csv $file.FullName
    if ($file.Name -like '*SAV*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"SV"}},DMBRCH,DMACCT,DMSHRT
    }
    if ($file.Name -like '*TIME*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"TM"}},TMBRCH,TMACCT,TMSHRT
    }
    if ($file.Name -like '*TRAN*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"TR"}},DMBRCH,DMACCT,DMSHRT
    }
    if ($file.Name -like '*LN*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"LN"}},LNBRCH,LNNOTE,LNSHRT
    }
    # $() is needed so the Name property, not the whole object, is expanded in the string
    $Newcsv | Export-Csv "C:\scrap\$($file.Name)$counter.csv" -force -notypeinformation
    $counter++
}
# Keep the header row of the first file only
$getFirstLine = $true
get-childItem "c:\scrap\*.csv" | foreach {
    $filePath = $_
    $lines = Get-Content $filePath
    $linesToWrite = switch($getFirstLine) {
        $true {$lines}
        $false {$lines | Select -Skip 1}
    }
    $getFirstLine = $false
    Add-Content "c:\scrap\combined.csv" $linesToWrite
}
With a hashtable for reference, a little RegEx matching, and using the automatic variable $Matches in a ForEach-Object loop (alias % used) that could all be shortened to:
$path = "c:\scrap\BWFILES\"
$Reference = @{
    'SAV' = 'SV'
    'TIME' = 'TM'
    'TRAN' = 'TR'
    'LN'='LN'
}
Set-Content -Value "PRODUCT,BRCH,ACCT,SHRT" -Path 'c:\scrap\combined.csv'
gci -path $path -recurse -filter *.csv | Where-Object { !($_.psiscontainer) -and $_.Name -match ".*(SAV|TIME|TRAN|LN).*"}|%{
    $Product = $Reference[($Matches[1])]
    Import-CSV $_.FullName | Select-Object @{Name="PRODUCT";Expression={$Product}},*BRCH,@{l='Acct';e={$_.LNNOTE, $_.DMACCT, $_.TMACCT|?{$_}}},*SHRT | ConvertTo-Csv -NoTypeInformation | Select -Skip 1 | Add-Content 'c:\scrap\combined.csv'
}
That should produce the exact same file. Only kind of tricky part was the LNNOTE/TMACCT/DMACCT field since obviously you can't just do the same as like *SHRT.
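The "template" idea from the question could also be written out explicitly, which avoids the wildcard trick for the account field. This is only a sketch: the $template layout and the Convert-Row helper are hypothetical, and only two of the file types are shown, using the column names from the script above.

```powershell
# Hypothetical template: file-name keyword -> product code and source column names
$template = @{
    SAV = @{ Product = 'SV'; Brch = 'DMBRCH'; Acct = 'DMACCT'; Shrt = 'DMSHRT' }
    LN  = @{ Product = 'LN'; Brch = 'LNBRCH'; Acct = 'LNNOTE'; Shrt = 'LNSHRT' }
}

# Hypothetical helper: maps one imported row onto the common column names
function Convert-Row {
    param($Row, $Map)
    [pscustomobject]@{
        PRODUCT = $Map.Product
        BRCH    = $Row.($Map.Brch)   # property name is looked up dynamically
        ACCT    = $Row.($Map.Acct)
        SHRT    = $Row.($Map.Shrt)
    }
}

# Made-up loan row to show the mapping
$loanRow = [pscustomobject]@{ LNBRCH = '01'; LNNOTE = '12345'; LNSHRT = 'SMITH' }
$result  = Convert-Row -Row $loanRow -Map $template['LN']
```

In a real run the map would be picked from the file name, for example with the same -match and $Matches[1] lookup used above.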

Powershell csv remove lines

I have a CSV file (file1) that looks like: (User dirs and the size)
Initials,Size
User1,10
User2,100
User3,131
User4,140
I have another CSV file (file2) that looks like: (VIP users)
User2
User4
Now what I'm trying to do, is to update file1, so it looks like:
User1,10
User3,131
User2 and User4 are removed because they are in file2.
I can get them removed, but at the same time I remove the size for all users, so my output is only the Users:
User1
User3
My code:
$SourcePath = "\\server1\info\SYSINFO\UsrSize"
$DestinationFile = "\\server1\info\SYSINFO\UsrSize\OverLimit\UsersOverLimit1.log"
$VIP_Exclusion_List = "\\server1\info\SYSINFO\UsrSize\OverLimit\_VIP_EXCLUSION_LIST.txt"
$Database = "\\server1\info\SYSINFO\UsrSize\OverLimit\_UsersOverLimitDATABASE.log"
$INT_SizeToLookFor = 100
dir $SourcePath -Filter usr*.txt | import-csv -delimiter "`t" |
Where-Object {[INT] $_."Size excl. Backup/Pst" -ge $INT_SizeToLookFor} |
Select-Object Initials,"Size excl. Backup/Pst" | convertto-csv -NoTypeInformation | % { $_ -replace '"', ""} | out-file $DestinationFile ;
$Userlist = import-csv $DestinationFile | Select-Object Initials |
convertto-csv -NoTypeInformation | % { $_ -replace '"', ""};
compare-object ($Userlist) (get-content $VIP_Exclusion_List) |
select-object inputObject | convertto-csv -NoTypeInformation |
% { $_ -replace '"', ""} | out-file "\\server1\info\SYSINFO\UsrSize\OverLimit\UsersOverLimitThisTime.log";
If the files are small-ish and you don't care too much about performance, then the following would be a trivial way:
$data = Import-Csv file1
# file2 has no header row, so read it as plain lines rather than with Import-Csv
$vips = Get-Content file2
$data = $data | ?{ $vips -notcontains $_.Initials }
$data | Export-Csv file1_new -NoTypeInformation
A faster way would be to add the names to remove to a set, but given the things you're talking about here I doubt you'll get into the range of a few thousand or million users.
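For what the set-based version might look like, here is a rough sketch using a .NET HashSet. The sample data mirrors the question, $vipNames stands in for the lines of file2, and the case-insensitive comparer is an assumption about how the initials should be matched. The ::new() syntax needs PowerShell 5.0 or later.

```powershell
$data = @'
Initials,Size
User1,10
User2,100
User3,131
User4,140
'@ | ConvertFrom-Csv

# VIP names, one per line, as Get-Content would return them
$vipNames = 'User2', 'User4'

# A case-insensitive set gives constant-time membership checks
$vipSet = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::OrdinalIgnoreCase)
foreach ($name in $vipNames) { [void]$vipSet.Add($name) }

# Keep whole rows, so the Size column is preserved
$kept = @($data | Where-Object { -not $vipSet.Contains($_.Initials) })
```

Because the filter works on the imported objects rather than on a list of names, the Size column stays attached to each remaining user.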
I solved it using this code:
$ArrayVIP = get-content $VIP_Exclusion_List
select-string $DestinationFile -pattern $ArrayVIP -notmatch |
select -expand line |
out-file $DestinationFile
Taken from here: Removing lines from a CSV