Compare 2 csv files and match based on 1 column then export new file that contains fields from both - powershell

I have 2 CSV files. Each has different headers, a different number of columns, and a different number of entries.
Here are some examples of the first couple of lines:
CSV 1
ID,Last_Name,First_Name,Middle_Name,Email_Addr,Title,Gender
###1,smith,bill,p,smith@soso.com,boss,m
###2,smith2,billy,p,smith2@soso.com,someguy,m
CSV 2
ID,Name Id,Last Name,First Name,Middle Name,Gender
###2,ID1010,smith2,billy,p,M
I am trying to import them and compare the ID column. When a match is found, I want a new CSV file with all info from CSV 1 and the matched Name Id from CSV 2.
New CSV Example:
ID,Last_Name,First_Name,Middle_Name,Email_Addr,Title,Gender,Name Id
###1,smith,bill,p,smith@soso.com,boss,m,
###2,smith2,billy,p,smith2@soso.com,someguy,m,ID1010
I've been looking and came across a Stack Overflow question from about a year ago that seemed to be on the right track, but I can't seem to get the code modified for my needs. Here is what I have tried.
$csv1 = Import-Csv -Path C:\STAFF\test1sky.csv
$csv2 = Import-Csv -Path C:\STAFF\test1power.csv
ForEach($Record in $csv2){
$MatchedValue = (Compare-Object $csv1 $Record -Property "ID" -IncludeEqual -ExcludeDifferent -PassThru).value
$Record = Add-Member -InputObject $Record -Type NoteProperty -Name "Name Id" -Value $MatchedValue
}
$csv2|Export-Csv 'C:\STAFF\combined.csv' -NoTypeInformation
I get the correct header in the new file, but the Name Id values never come through.
Any idea where I went wrong? I may be on the wrong path completely and there may be an easier way, but I need to be able to do this nightly without user interaction. Any help is appreciated!

Let's try to simplify this. Add the 'Name ID' field to all records in CSV1. Then loop through it, and get the matches, and update the field. Something like:
$CSV1 = Import-Csv C:\Path\To\File1.csv
$CSV2 = Import-Csv C:\Path\To\File2.csv
$CSV1|ForEach{$_|Add-Member 'Name ID' $Null}
ForEach($Record in $CSV1){
$Record.'Name ID' = $CSV2|Where{$_.ID -eq $Record.ID}|Select -Expand 'Name ID'
}

$CSV1 = import-csv C:\Path\To\File1.csv
$CSV2 = import-csv C:\Path\To\File2.csv
# Adds a property (column) named "Name ID" to each object from the CSV import
$CSV1|ForEach{$_|Add-Member 'Name ID' $Null}
ForEach($Record in $CSV1){
# Gets the Last_Name value from the current CSV1 record for comparing to CSV2
$NameValue = $Record."Last_Name"
# Gets the object from the CSV2 import whose "Last Name" matches that value
$NameObject = $CSV2 | Where-Object { $_."Last Name" -eq $NameValue }
# Sets the "Name ID" field on the CSV1 record to the Name ID from CSV2
$Record."Name ID" = $NameObject."Name ID"
}
You can easily grab additional fields from CSV2 by adding more assignments that read from the $NameObject object. Note that CSV2's headers use spaces where CSV1's use underscores:
$Record."Middle_Name" = $NameObject."Middle Name"
Since you have the entire matching object from $CSV2 inside the loop, you can read any of its fields, either by piping it to Select-Object -ExpandProperty, like this:
$firstName = $NameObject | Select-Object -ExpandProperty "First Name"
$firstName.Length
but I prefer to call it directly from the object, as it looks cleaner, like this:
$NameObject."First Name".Length

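The operation you are looking for is called a relational join. Sometimes it's called an inner join, and sometimes just a join. Because your desired output keeps rows from CSV 1 that have no match in CSV 2, the variant you need is a left outer join. My knowledge of joins comes from SQL, not from PowerShell.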
Here's a description of "Join-Object". It seems to be what you are looking for.
http://blogs.msdn.com/b/powershell/archive/2012/07/13/join-object.aspx
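For readers who would rather not depend on an external function, here is a minimal sketch of the same join using a hashtable lookup. It is an illustration under assumptions, not the Join-Object implementation: the inline sample data, the plain IDs 1 and 2, and the variable names are placeholders standing in for the two files from the question.

```powershell
# Inline sample data standing in for the two files (placeholder IDs 1 and 2).
$csv1 = @'
ID,Last_Name,First_Name,Middle_Name,Email_Addr,Title,Gender
1,smith,bill,p,smith@soso.com,boss,m
2,smith2,billy,p,smith2@soso.com,someguy,m
'@ | ConvertFrom-Csv

$csv2 = @'
ID,Name Id,Last Name,First Name,Middle Name,Gender
2,ID1010,smith2,billy,p,M
'@ | ConvertFrom-Csv

# Build a lookup table from CSV 2: ID -> Name Id.
$nameIdById = @{}
foreach ($row in $csv2) {
    $nameIdById[$row.ID] = $row.'Name Id'
}

# Keep every CSV 1 row and append the matching Name Id (empty when no match).
$joined = $csv1 | Select-Object *, @{ Name = 'Name Id'; Expression = { $nameIdById[$_.ID] } }

# Show the result as CSV text; Export-Csv would write it to a file instead.
$joined | ConvertTo-Csv -NoTypeInformation
```

The lookup is built once, so each CSV 1 row costs a single hashtable access instead of a scan of CSV 2, which matters for a nightly job over large files.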

Need help on a power shell script to filter the CSV on Columns and display the filtered data in new sheet

I am new to PowerShell but learning it fast. So far I have made a script that fetches data from a URL and creates a CSV on the desktop; I then remove the first row from the CSV and save the result to the desktop as a second CSV. I want to filter on a column and copy the filtered data to a new sheet. I may have to declare an array of values to filter for, and I need help with that. So far I have made the script below:
# This script downloads the asset analysis server list with each server's site address and country, and saves the output as a CSV in C:\Users\vtarwani\Desktop.
Invoke-WebRequest -Uri "Link to URL" -OutFile "C:\Users\tarwaniv\Desktop\file1.csv"
$import = get-content "C:\Users\tarwaniv\Desktop\file1.csv"
$import | select-Object -Skip 1 | Set-Content "C:\Users\tarwaniv\Desktop\file2.csv"
Import-csv -Path "C:\Users\tarwaniv\Desktop\file2.csv" -Header "#Active Servers", "Street Address" , "City", "Country" | Where-Object {$_.Country -eq "UNITED STATES"}
I agree with @Theo and @Olaf. This is difficult to follow. Hopefully you'll be able to build off an example like this:
$WebFile = "C:\Users\tarwaniv\Desktop\file1.csv"
$Out_Dir = "C:\Users\tarwaniv\Desktop\"
$Headers = "#Active Servers", "Street Address" , "City", "Country"
Invoke-WebRequest -Uri "Link to URL" -OutFile $WebFile
Import-csv -Path $WebFile -Header $Headers |
Select-Object -Skip 1 |
Group-Object -Property Country |
ForEach-Object{
$OutFile = $Out_Dir + $_.Name + ".csv"
$_.Group | Export-Csv -Path $OutFile -NoTypeInformation
}
Note: using the -Skip parameter on Select-Object should skip the first record coming from Import-Csv.
Use Group-Object to group on the Country. That will output its own objects, but the name property is what you grouped on, in this case Country, and the Group property has the original objects in it. So we can use these 2 properties to name an output file then export the objects to the new Csv file.
If you want to output only a subset of the columns coming from the initial file, add the -Property parameter to the Select-Object command, like:
Select-Object -Skip 1 -Property <PropertiesToSelect>
Note: You can declare an array of properties similar to $Headers above.
If you want to filter on additional columns, and assuming they were included in the Select-Object command, just build on the Where-Object clause.
Please bear in mind I couldn't really test this, so consider it a starting point. That said let me know how it turns out. Thanks.
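Since the question asks about declaring an array of values to filter for, here is a small sketch of that step combined with the Group-Object export. The sample rows, the Server column name, the $wantedCountries list, and the scratch output folder are all assumptions made for illustration; the real data would come from Import-Csv.

```powershell
# Hypothetical sample rows; the real data would come from Import-Csv.
$servers = @'
Server,Street Address,City,Country
srv01,1 Main St,Austin,UNITED STATES
srv02,2 High St,London,UNITED KINGDOM
srv03,3 Rue A,Paris,FRANCE
srv04,4 Oak Ave,Boston,UNITED STATES
'@ | ConvertFrom-Csv

# Array of values to keep.
$wantedCountries = 'UNITED STATES', 'FRANCE'

# -in tests whether the left-hand value occurs in the right-hand array.
$filtered = $servers | Where-Object { $_.Country -in $wantedCountries }

# One output file per country, written to a scratch folder (assumed location).
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'country-export'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

$filtered | Group-Object -Property Country | ForEach-Object {
    $outFile = Join-Path $outDir ($_.Name + '.csv')
    $_.Group | Export-Csv -Path $outFile -NoTypeInformation
}
```

The -in operator needs PowerShell 3.0 or later; on older versions the equivalent test is $wantedCountries -contains $_.Country.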

Modifying Column Within Array

I'm reading in a CSV file which contains 25,000 records, and am reading each column into a psobject. Here is what I have so far:
$file = Import-CSV .\server.csv
$tempobj = New-Object psobject -Property @{
'Name' = $file.Name
'Group' = $file.Group
}
When this is run, I get the results I want: $file.Name contains all the server names, and $file.Group contains the groups for the servers. However, my issue is that I need to edit the name of each server without interfering with .Group. Here is an example of what a server name looks like as is:
WindowsAuthServer @{wdk9870WIN}
I need to remove WindowsAuthServer @{ and WIN} from each server name, leaving only the server name, or for this example, wdk9870.
I tried using the -replace operator ($tempobj.Name -replace "WindowsAuthServer @{",""), but it requires that I save the results to a new array, which then messes up or removes .Group entirely.
Is there a different way to go about doing this? I'm lost.
Suppose your server.csv looks like this:
"Name","Group"
"WindowsAuthServer @{wdk9870WIN}","Group1"
"WindowsAuthServer @{wdk9880WIN}","Group2"
"WindowsAuthServer @{wdk9890WIN}","Group1"
"WindowsAuthServer @{wdk9900WIN}","Group1"
And you want to change the values in the Name column only, then something like this would probably do it:
Import-Csv .\server.csv | ForEach-Object {
New-Object psobject -Property @{
'Name' = ([regex]'@\{(\w+)WIN\}').Match($_.Name).Groups[1].Value
'Group' = $_.Group
}
}
This will output:
Name Group
---- -----
wdk9870 Group1
wdk9880 Group2
wdk9890 Group1
wdk9900 Group1
If you want, you can simply pipe this info to the Export-Csv cmdlet to save as a new CSV file. For that, just append | Export-Csv -Path .\server_updated.csv -NoTypeInformation to the code.
Hope that helps
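As an alternative sketch, the Name property can be rewritten in place on the objects that Import-Csv returns, so Group is never touched and no second array is needed. The two inline sample rows are placeholders for server.csv, and the anchored pattern assumes every name has exactly the WindowsAuthServer @{...WIN} shape shown in the question.

```powershell
# Inline stand-in for server.csv.
$servers = @'
"Name","Group"
"WindowsAuthServer @{wdk9870WIN}","Group1"
"WindowsAuthServer @{wdk9880WIN}","Group2"
'@ | ConvertFrom-Csv

# Rewrite Name on each row; $1 is the text captured between "@{" and "WIN}".
foreach ($server in $servers) {
    $server.Name = $server.Name -replace '^WindowsAuthServer @\{(\w+)WIN\}$', '$1'
}

# Group is unchanged; only Name was updated.
$servers
```

A name that does not match the pattern is left unchanged by -replace, which makes unexpected rows easy to spot in the output.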

Export results of (2) cmdlets to separate columns in the same CSV

I'm new to PS, so your patience is appreciated.
I'm trying to grab data from (2) separate CSV files and then dump them into a new CSV with (2) columns. Doing this for (1) is easy, but I don't know how to do it for more.
This works perfectly:
Import-CSV C:\File1.csv | Select "Employee" | Export-CSV -Path D:\Result.csv -NoTypeInformation
If I add another Import-CSV, then it simply overwrites the existing data:
Import-CSV C:\File2.csv | Select "Department" | Export-CSV -Path D:\Result.csv -NoTypeInformation
How can I get columns A and B populated with the info result from these two commands? Thanks for your help.
I would have chosen this option:
$1 = Import-Csv -Path "C:\Users\user\Desktop\1.csv" | Select "Employee"
$2 = Import-Csv -Path "C:\Users\user\Desktop\2.csv" | Select "Department"
$merged = @()
for ($i=0 ; $i -lt $1.Count ; $i++){
# Pair row $i of the first file with row $i of the second
$object = [pscustomobject]@{
Employees = $1[$i].Employee
Department = $2[$i].Department}
$merged += $object
}
$merged | Export-Csv -Path "C:\Users\user\Desktop\3.csv" -NoTypeInformation -Force
I'll explain how I would do this, but I do it this way because I'm more comfortable working with objects than with hashtables. Someone else may offer an answer using hashtables which would probably work better.
First, I would define an array to hold your data, which can later be exported to CSV:
$report = @()
Then, I would import your CSV to an object that can be iterated through:
$firstSet = Import-CSV .\File1.csv
Then I would iterate through this, importing each row into an object that has the two properties I want. In your case these are Employee and Department (potentially more which you can add easily).
foreach($row in $firstSet)
{
$employeeName = $row.Employee
$employee = [PSCustomObject]@{
Employee = $employeeName
Department = ""
}
$report += $employee
}
And, as you can see in the example above, add this object to your report.
Then, import the second CSV file into a second object to iterate through (for good form I would actually do this at the beginning of the script, when you import your first one):
$secondSet = Import-CSV .\File2.csv
Now here is where it gets interesting. Based on just the information you have provided, I am assuming that all employees in the one file are in the same order as the departments in the other files. So for example, if I work for the "Cake Tasting Department", and my name is on row 12 of File 1, row 12 of File 2 says "Cake Tasting Department".
In this case it's fairly easy. You would just roll through both lists and update the report:
$i = 0
foreach($row in $secondSet)
{
$dept = $row.Department
$report[$i].Department = $dept
$i++
}
After this, your $report object will contain all of your employees in one column and departments in the other. Then you can export it to CSV:
$report | Export-CSV .\Result.csv -NoTypeInformation
This works if, as I said, your data aligns across both files. If not, then you need to get a little fancier:
foreach($row in $secondSet)
{
$emp = $row.Employee
$dept = $row.Department
$report | Where-Object {$_.Employee -eq $emp} | ForEach-Object {$_.Department = $dept}
}
Technically you could just do it this way anyway, but it depends on a few things. First, whether you have data to match on in a column that exists in both files (in my example you obviously don't, otherwise you wouldn't need to do this in the first place, but you could match on other fields you may have, like EmployeeID or DoB). Second, on the uniqueness of individual records: if you have multiple matching records in your first file, you will have a problem. Duplicates in the second file are expected, since there is more than one person in each department.
Anyway, I hope this helps. As I said there is probably a 'better' way to do this, but this is how I would do it.
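One possible tightening of the row-aligned case, offered as a sketch only: collect the loop's output directly instead of growing an array with +=, which copies the array on every pass. The inline data and variable names are illustrative, and the sketch still assumes row N of one file belongs with row N of the other.

```powershell
# Inline stand-ins for File1.csv and File2.csv (illustrative values).
$employees = @'
Employee
Alice
Bob
'@ | ConvertFrom-Csv

$departments = @'
Department
Cake Tasting
Accounting
'@ | ConvertFrom-Csv

# Force arrays so indexing behaves even if a file has only one row.
$employees   = @($employees)
$departments = @($departments)

# Assigning the loop's output builds the collection without "+=".
$report = for ($i = 0; $i -lt $employees.Count; $i++) {
    [PSCustomObject]@{
        Employee   = $employees[$i].Employee
        Department = $departments[$i].Department
    }
}

$report | ConvertTo-Csv -NoTypeInformation
```

If the second file is shorter than the first, the extra employees simply get an empty Department rather than an error.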

Select specific column based on data supplied using Powershell

I have a csv file that may have unknown headers, one of the columns will contain email addresses for example.
Is there a way to select only the column that contains the email addresses and save it as a list to a variable?
One csv could have the header say email, another could say emailaddresses, another could say email addresses another file might not even have the word email in the header. As you can see, the headers are different. So I want to be able to detect the correct column first and use that data further in the script. Once the column is identified based on the data it contains, select that column only.
I've tried the where-object and select-string cmdlets. With both, the output is the entire array and not just the data in the column I am wanting.
$CSV = import-csv file.csv
$CSV | Where {$_ -like "*@domain.com"}
This outputs the entire array as all rows will contain this data.
Sample Data for visualization
id,first_name,bagel,last_name
1,Base,bcruikshank0@homestead.com,Cruikshank
2,Regan,rbriamo1@ebay.co.uk,Briamo
3,Ryley,rsacase2@mysql.com,Sacase
4,Siobhan,sdonnett3@is.gd,Donnett
5,Patty,pesmonde4@diigo.com,Esmonde
Bagel is obviously the column we are trying to find. We will pretend that we have no knowledge of the column's name or position ahead of time.
Find column dynamically
# Import the CSV
$data = Import-CSV $path
# Take the first row and get its columns
$columns = $data[0].psobject.properties.name
# Cycle the columns to find the one that has an email address for a row value
# Use a VERY crude regex to validate an email address.
$emailColumn = $columns | Where-Object{$data[0].$_ -match '^\S+@\S+\.\S+$'}
# Example of using the found column(s) to display data.
$data | Select-Object $emailColumn
Basically, read in the CSV like normal and use the first row's data to try to figure out which column holds the email address. One caveat: if more than one column matches, all of them are returned.
To enforce only 1 result a simple pipe to Select-Object -First 1 will handle that. Then you just have to hope the first one is the "right" one.
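A slightly more defensive variation, sketched here under assumptions: instead of trusting the first row alone, check a sample of rows and accept a column only when every non-empty sampled value looks like an email address. The pattern, the sample size of 50, and the variable names are illustrative choices, and the pattern is still a crude check rather than real address validation.

```powershell
# Inline stand-in for the sample data above.
$data = @'
id,first_name,bagel,last_name
1,Base,bcruikshank0@homestead.com,Cruikshank
2,Regan,rbriamo1@ebay.co.uk,Briamo
3,Ryley,rsacase2@mysql.com,Sacase
'@ | ConvertFrom-Csv

# Crude shape check: something@something.something with no spaces.
$emailPattern = '^[^@\s]+@[^@\s]+\.[^@\s]+$'

$columns = $data[0].PSObject.Properties.Name
$sample  = $data | Select-Object -First 50

# Keep a column only if it has values and all of them match the pattern.
$emailColumn = $columns | Where-Object {
    $column = $_
    $values = @($sample | ForEach-Object { $_.$column } | Where-Object { $_ })
    ($values.Count -gt 0) -and (@($values -notmatch $emailPattern).Count -eq 0)
} | Select-Object -First 1

# The addresses as a plain list in a variable.
$emails = $data.$emailColumn
```

Sampling several rows guards against a first row that happens to be blank or atypical in the email column.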
If you're using Import-Csv, the result is a collection of PSCustomObject instances, one per row.
$CsvObject = Import-Csv -Path 'C:\Temp\Example.csv'
$Header = ($CsvObject | Get-Member | Where-Object { $_.Name -like '*email*' }).Name
$CsvObject.$Header
This filters for the header containing email, then selects that column from the object.
Edit for requirement:
$Str = @((Get-Content -Path 'C:\Temp\Example.csv') -like '*@domain.com*')
$Headers = @((Get-Content -Path 'C:\Temp\Example.csv' -TotalCount 1) -split ',')
$Str | ConvertFrom-Csv -Delimiter ',' -Header $Headers
Other method:
$PathFile="c:\temp\test.csv"
$columnName=$null
$content=Get-Content $PathFile
foreach ($item in $content)
{
$SplitRow= $item -split ','
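$Cpt=0..($SplitRow.Count - 1) | where {$SplitRow[$_] -match '^\S+@\S+\.\S+$'} | select -first 1
# Compare against $null explicitly: index 0 is a valid column but evaluates as false
if ($null -ne $Cpt)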
{
$columnName=($content[0] -split ',')[$Cpt]
break
}
}
if ($columnName)
{
import-csv "c:\temp\test.csv" | select $columnName
}
else
{
"No email column found"
}

Extracting Tables from .CSV for Use in Email

I have a .csv file that contains well over 1000 entries of former employees of the company I work for, and I am attempting to split it up by employee into a format that I can insert into an email template. To exacerbate things, many (but not all) of the employees have more than one entry, and the .csv contains a considerable amount of information that is completely unnecessary for the people I will be sending the emails to. Ideally I would like to exclude these columns, but I am having quite a bit of difficulty with it. It seems to me that format-table will be the cmdlet I will be needing, but I am not 100% certain. Thus far I have tried:
import-csv C:\filepath\xx.csv | format-table -groupby (key value)
which does return the information organized in the way I would like, however I do not know how to split the information from there so I can only send the information for the specific employee. This also includes quite a bit of extraneous information, as previously mentioned. I have also tried:
import-csv C:\filepath\xx.csv | format-table -groupby (key value) -property (properties I want,separated by commas)
However, this is still returning the same extraneous information. I also tried using a foreach loop to iterate through the .csv, store the information I want in variables, then store those variables in an array, then pipe that array into format-table, which results in a mess. I have also tried putting the .csv into a hashtable grouped by the key value I want, however that results in the rest of the information being put into a string. I believe I should be able to split this string using regex, however I am not at all familiar with regex and have no idea where to start with that. I'm pretty well stumped at this point. Any help would be greatly appreciated.
EDIT
After following @4c74356b41's suggestion (for which I am very grateful), I am able to output a list of tables with only the pertinent information. However, I am still unable to split this list of tables into the individual tables I need. I currently have a .txt file which I am using as a template, which I would like to add the table for the individual users to. I have so far had success with using get-content to retrieve the contents of the .txt, -replace to replace several other fields on the template from information in the original .csv (eg name, manager, etc), and then out-file to store the edited template into a temp file, which is then saved as a draft to Outlook. I attempted to add another foreach loop inside the previous foreach loop to add the tables into the template, but that is returning templates with the line 'Microsoft.PowerShell.Commands.Internal.Format.FormatStartData' in place of the table. It is also creating a draft for each entry rather than just one for each unique user id. Here is all the code I have so far:
$olFolderDrafts = 16
$ol = New-Object -comObject Outlook.Application
$ns = $ol.GetNameSpace("MAPI")
$file= import-csv H:\filepath\term_rep.csv
$data= import-csv H:\filepath\term_rep.csv | select-object "Inactive Emp Name","User ID","Loc","Equipment Descr","Tag Number","Serial Number"
$table= $data | Format-Table -GroupBy "User ID"
foreach($i in $file)
{$user = $i | select-object -expandproperty "Inactive Emp Name"
$uid= $i | select-object -expandproperty "User ID"
$manager = $i | Select-Object -expandproperty "Manager"
$loc= $i | select -ExpandProperty "Loc"
$tagnum= $i | select -ExpandProperty "Tag Number"
$sernum= $i | select -ExpandProperty "Serial Number"
$path="C:\filepath\term_email.txt"
$newpath = [system.io.path]::GetTempFileName()
(Get-content $path) -replace "USER_NAME",$user `
-replace "USER_ID",$uid | out-file $newpath
foreach($id in $table){
(Get-Content $newpath) -replace "EQUIPMENT_LIST", $id | out-file $newpath
}
$subject="Equipment for user $uid"
$body= get-content $newpath
$mail = $ol.CreateItem(0)
$mail.To = $manager
$mail.CC = $null
$mail.Subject = $subject
$mail.Body = $body -join "`n"
$mail.save()}
To reduce noise and duplicates I would find a column that is specific to a user, such as UserName, or Email. Then group by that, and select the first item in each group. Then you could pipe to Select to reduce noise from extraneous columns. Something like:
import-csv C:\filepath\xx.csv | Group UserName | ForEach{$_.Group[0]} | Select FirstName,LastName,Email,UserName,SeparationDate
Then you can do with it what you want... Pipe to another ForEach loop to work with each record one at a time, or pipe it to Export-CSV to generate a new CSV with reduced clutter that you could work with. You could even have PowerShell create the emails for you, depending on how fancy you want to get.
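If each person's full set of rows is needed rather than only the first row, one option is to keep the whole group. The sketch below assumes the column names from the question; the sample values and the $perUser shape are hypothetical.

```powershell
# Hypothetical rows; column names follow the question's CSV.
$rows = @'
User ID,Inactive Emp Name,Manager,Equipment Descr
u100,Ann Lee,boss1@example.com,Laptop
u100,Ann Lee,boss1@example.com,Monitor
u200,Raj Rao,boss2@example.com,Phone
'@ | ConvertFrom-Csv

# One summary object per user, holding every equipment row for that user.
$perUser = $rows | Group-Object -Property 'User ID' | ForEach-Object {
    [PSCustomObject]@{
        UserId    = $_.Name                      # the value that was grouped on
        Manager   = $_.Group[0].Manager          # taken from the user's first row
        Equipment = @($_.Group.'Equipment Descr')
    }
}

$perUser | ForEach-Object { '{0}: {1}' -f $_.UserId, ($_.Equipment -join ', ') }
```

Each object in $perUser then corresponds to exactly one email, regardless of how many rows the user had in the file.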
If you need further help, please update your question with an example of what you want your desired output to be, or what exactly you are trying to accomplish with each entry.
Edit: Ok, so when it comes down to it you want to insert a table of things for any records matching a specific user into an email, so that you can get their stuff back from their manager after they're separated from the company. Cool, we can do that. To start off with, you're using the Outlook ComObject, and generating your email that way. Awesome, I've done it myself, and you can do some great stuff that way! I highly recommend using HTML here. It makes the email look more professional, and lets us insert a nice looking table into the email instead of some formatted text that will look funny due to spacing once it's in the email.
So, let's back up from the issue just a hair, and redo your email template. Pop open MS Word and open your template. Now, go to File and Save As. Select 'Web Page, Filtered' as your document type, and name it the same thing you already had.
So, rather than loading the list twice, then exporting, and blah, blah, blah, we're going to shorten things a little. We load your CSV once. For that matter, we don't need to read the body of the email once for each user, so let's just load it up one time here before the loop as well. Then we pipe the User ID field to Select-Object using the -Unique switch, so that we only get one email generated per user. So, the script up to there looks like this:
$olFolderDrafts = 16
$ol = New-Object -comObject Outlook.Application
$ns = $ol.GetNameSpace("MAPI")
$file= import-csv H:\filepath\term_rep.csv
$path="C:\filepath\term_email.txt"
$body = (Get-content $path) -join "`n"
foreach($i in ($file.'User ID'|Select -Unique))
{
Now, there's no need to assign all of those variables like you were, so I'm skipping that part. So we'll find the first instance of the current user, and save that to a variable.
$User = $File|Where{$_.'User ID' -eq $i}|Select -First 1
Once we've got that we will find all of their records in the list, select only the properties that you are interested in, and convert the output to an HTML table using ConvertTo-HTML.
$Equip = $File|Where{$_.'User ID' -eq $i}|select-object "Inactive Emp Name","User ID","Loc","Equipment Descr","Tag Number","Serial Number"|ConvertTo-Html -Property '*' -As Table -Fragment
Now that is a pretty long line, but most of it is just selecting the right properties. The magic happens in the last bit where we tell it to convert the data to HTML. Specifically, we tell it that we want everything (specified by -Property '*'), that we want it converted as a table, and that this is just a fragment, so it doesn't try to put all the preceding and following tags in there (so no <HTML> and </HTML> type tags, since it's being injected into the middle of existing HTML).
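To see what the fragment looks like in isolation, here is a small self-contained sketch. The two sample rows and the one-line $template are made up for illustration; only the ConvertTo-Html call mirrors the line above.

```powershell
# Made-up rows for one user.
$rows = @'
User ID,Equipment Descr,Tag Number
u100,Laptop,T-1
u100,Monitor,T-2
'@ | ConvertFrom-Csv

# -Fragment emits only the <table> markup, with no <html>, <head> or <body> wrapper.
$fragment = $rows | ConvertTo-Html -As Table -Fragment
$tableHtml = $fragment -join "`n"

# A stand-in template with the same placeholders as the real one.
$template = '<p>Equipment for USER_ID:</p>EQUIPMENT_LIST'

# String.Replace treats the replacement literally, so a "$" in the data is safe.
$body = $template.Replace('USER_ID', 'u100').Replace('EQUIPMENT_LIST', $tableHtml)
$body
```

The sketch uses the .Replace() string method for the table because the -replace operator interprets sequences such as $1 in the replacement text, which could matter if equipment descriptions ever contain a dollar sign.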
Then we do the replaces, make the email, assign the properties, yada, yada, yada, you pretty much already had all this...
$HTMLbody = $body -replace "USER_NAME",$user.'Inactive Emp Name' -replace "USER_ID",$user.'User ID' -replace "EQUIPMENT_LIST", $Equip
$subject="Equipment for user " + $User.'User ID'
$mail = $ol.CreateItem(0)
$mail.To = $User.Manager
$mail.Subject = $subject
$mail.HTMLBody = $HTMLbody
$mail.save()
}
Save it, and done! Loop to the next user, and repeat. So put that together and you get:
$olFolderDrafts = 16
$ol = New-Object -comObject Outlook.Application
$ns = $ol.GetNameSpace("MAPI")
$file= import-csv H:\filepath\term_rep.csv
$path="C:\temp\email.htm"
$body = (Get-content $path) -join "`n"
foreach($i in ($file.'User ID'|Select -Unique))
{
$User = $File|Where{$_.'User ID' -eq $i}|Select -First 1
$Equip = $File|Where{$_.'User ID' -eq $i}|select-object "Inactive Emp Name","User ID","Loc","Equipment Descr","Tag Number","Serial Number"|ConvertTo-Html -Property '*' -As Table -Fragment
$HTMLbody = $body -replace "USER_NAME",$user.'Inactive Emp Name' -replace "USER_ID",$user.'User ID' -replace "EQUIPMENT_LIST", $Equip
$subject="Equipment for user " + $User.'User ID'
$mail = $ol.CreateItem(0)
$mail.To = $User.Manager
$mail.Subject = $subject
$mail.HTMLBody = $HTMLbody
$mail.save()
}
Say your csv has some columns: name, surname, bla-bla, bla-bla, bla-bla.
$data = import-csv C:\filepath\xx.csv | select name, surname
$data | format-table -groupby (key value)