I have a .csv file which looks like:
employeenumber;phone;mobile;fax;userid;Email
99999991;+1324569991;+234569991;+5234569991;user01;user1@domain.com
99999992;+1234569992;+234569992;;user02;user2@domain.com
99999993;+1234569993;+234569993;;user03;user3@domain.com
99999993;+12345699933;;;user03;user3@domain.com
99999993;;;+5234569993;user03;user3@domain.com
99999994;+1234569994;;;user04;user4@domain.com
As you can see, most employeenumbers occur once, but some lines share the same employeenumber.
Is there any way to merge the lines with the same employeenumber in PowerShell?
Desired output:
employeenumber;phone;mobile;fax;userid;Email
99999991;+1324569991;+234569991;+5234569991;user01;user1@domain.com
99999992;+1234569992;+234569992;;user02;user2@domain.com
99999993;+1234569993 / +12345699933;+234569993;+5234569993;user03;user3@domain.com
99999994;+1234569994;;;user04;user4@domain.com
Thank you
I've taken a shot at it. I believe my answer is easier to read than Mjolinor's.
I use Group-Object to split the CSV entries into singletons (employeenumber occurs once) and $duplicates (employeenumber occurs more than once). Then I loop over $duplicates and merge the values found in the phone, mobile and fax fields, using a '/' character as you've indicated.
# $csv = Get-Content .\CSVNeedstoMerge.csv   # uncomment to load the raw lines from disk
$csvValues = $csv | ConvertFrom-Csv -Delimiter ';'
$duplicates = $csvValues | Group-Object EmployeeNumber | ? Count -gt 1
$objs = New-Object System.Collections.ArrayList
# Records whose employeenumber occurs once are copied unchanged.
# [void] discards the index that ArrayList.Add() returns.
$csvValues | Group-Object EmployeeNumber | ? Count -eq 1 | % { [void]$objs.Add($_.Group[0]) }
ForEach ($duplicate in $duplicates){
    # Merge one group of duplicates: join the non-empty phone, mobile and fax values with '/'
    [void]$objs.Add([pscustomobject]@{employeenumber=($duplicate.Group.employeenumber | select -Unique) -as [int];
        phone=($duplicate.Group.phone | ? Length -gt 0) -join '/';
        mobile=($duplicate.Group.mobile | ? Length -gt 0) -join '/';
        fax=($duplicate.Group.fax | ? Length -gt 0) -join '/';
        userid = $duplicate.Group.userid | select -Unique
        email = $duplicate.Group.email | select -Unique })
}
$objs | Sort EmployeeNumber
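To make the grouping step easier to follow, here is a minimal sketch of what Group-Object returns and how the values of one duplicated key can be joined. The in-memory sample rows and the variable names ($rows, $single, $multi, $mergedPhone) are made up for illustration and are not part of the answer above.

```powershell
# Hypothetical in-memory sample standing in for the CSV file
$rows = @'
employeenumber;phone
1;+100
2;+200
2;+201
'@ | ConvertFrom-Csv -Delimiter ';'

# Each group exposes Name (the key), Count, and Group (the matching records)
$groups = $rows | Group-Object employeenumber

# Split the groups by how many records share the key
$single = @($groups | Where-Object Count -eq 1)
$multi  = @($groups | Where-Object Count -gt 1)

# Join the non-empty phone values of the duplicated key
$mergedPhone = ($multi[0].Group.phone | Where-Object { $_ }) -join ' / '
$mergedPhone
```

Member enumeration ($multi[0].Group.phone) collects the phone value of every record in the group, which is what makes the one-line join possible.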
I'll give that a shot:
(@'
employeenumber;phone;mobile;fax;userid;Email
99999991;+1324569991;+234569991;+5234569991;user01;user1@domain.com
99999992;+1234569992;+234569992;;user02;user2@domain.com
99999993;+1234569993;+234569993;;user03;user3@domain.com
99999993;+12345699933;;;user03;user3@domain.com
99999993;;;+5234569993;user03;user3@domain.com
99999994;+1234569994;;;user04;user4@domain.com
'@).split("`n") |
foreach {$_.trim()} | Set-Content test.csv
$ht = @{}
# Property names come from the header line of the file
$props = (Get-Content test.csv -TotalCount 1).split(';')
# For a repeated employeenumber, each non-empty field overwrites the value stored so far
Import-Csv test.csv -Delimiter ';' |
foreach {
    if ( $ht.ContainsKey($_.employeenumber) )
    {
        foreach ($prop in $props )
        {
            if ($_.$prop )
            {$ht[$_.employeenumber].$prop = $_.$prop }
        }
    }
    else { $ht[$_.employeenumber] = $_ }
}
$ht.values | sort employeenumber
employeenumber : 99999991
phone          : +1324569991
mobile         : +234569991
fax            : +5234569991
userid         : user01
Email          : user1@domain.com

employeenumber : 99999992
phone          : +1234569992
mobile         : +234569992
fax            :
userid         : user02
Email          : user2@domain.com

employeenumber : 99999993
phone          : +12345699933
mobile         : +234569993
fax            : +5234569993
userid         : user03
Email          : user3@domain.com

employeenumber : 99999994
phone          : +1234569994
mobile         :
fax            :
userid         : user04
Email          : user4@domain.com
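Note that in the output above the phone for 99999993 holds only the last value seen, because later non-empty fields replace earlier ones. If the ' / ' joined form from the question is wanted, a variant along the following lines might work. It is only a sketch: the in-memory sample and the names $merged, $old and $new are assumptions for illustration, not part of the answer.

```powershell
# Hypothetical in-memory sample; in practice this would come from Import-Csv -Delimiter ';'
$rows = @'
employeenumber;phone;mobile;fax;userid;Email
99999993;+1234569993;+234569993;;user03;user3@domain.com
99999993;+12345699933;;;user03;user3@domain.com
99999993;;;+5234569993;user03;user3@domain.com
99999994;+1234569994;;;user04;user4@domain.com
'@ | ConvertFrom-Csv -Delimiter ';'

$merged = [ordered]@{}
foreach ($row in $rows) {
    $key = $row.employeenumber
    if (-not $merged.Contains($key)) {
        $merged[$key] = $row          # first record for this key becomes the base
        continue
    }
    foreach ($prop in $row.PSObject.Properties) {
        $old = $merged[$key].($prop.Name)
        $new = $prop.Value
        if (-not $new) { continue }   # nothing to merge for an empty field
        if (-not $old) {
            $merged[$key].($prop.Name) = $new
        }
        elseif (($old -split ' / ') -notcontains $new) {
            # Append only values that are not already present
            $merged[$key].($prop.Name) = "$old / $new"
        }
    }
}
$result = @($merged.Values)
$result
```

The result can be piped to Export-Csv -Delimiter ';' -NoTypeInformation to write it back in the original format.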
In my existing CSV file I have a column called "SharePoint ID" and it looks like this:
1.ylkbq
2.KlMNO
3.
4.MSTeam
6.
7.MSTEAM
8.LMNO83
I'm wondering how I can create a new column in my CSV called "SharePoint Email" and append "@gmail.com" only to the actual IDs such as "ylkbq", "KlMNO" and "LMNO83", rather than to every row including the blank ones. "MSTEAM" should also not be carried over to the new column, since it's not an ID.
$file = "C:\AuditLogSearch\New folder\OriginalFile.csv"
$file2 = "C:\AuditLogSearch\New folder\newFile23.csv"
$add = "@GMAIL.COM"
$properties = @{
Name = 'Sharepoint Email'
Expression = {
switch -Regex ($_.'SharePoint ID') {
#Not sure what to do here
}
}
}, '*'
Import-Csv -Path $file |
Select-Object $properties |
Export-Csv $file2 -NoTypeInformation
Using calculated properties with Select-Object this is how it could look:
$add = "@GMAIL.COM"
$expression = {
    switch($_.'SharePoint ID')
    {
        {[string]::IsNullOrWhiteSpace($_) -or $_ -match 'MSTeam'}
        {
            # Null value or matches MSTeam, leave this Null
            break
        }
        Default # We can assume these are IDs, append $add
        {
            $_.Trim() + $add
        }
    }
}
Import-Csv $file | Select-Object *, @{
    Name = 'SharePoint Email'
    Expression = $expression
} | Export-Csv $file2 -NoTypeInformation
Sample Output
Index SharePoint ID SharePoint Email
----- ------------- ----------------
1     ylkbq         ylkbq@GMAIL.COM
2     KlMNO         KlMNO@GMAIL.COM
3
4     MSTeam
5
6     MSTEAM
7     LMNO83        LMNO83@GMAIL.COM
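Since I misread the point at first, here is a more concise expression that reduces it to a single if statement:
$expression = {
    # Outside a switch, $_ is the whole row, so both tests must read the 'SharePoint ID' property
    if(-not [string]::IsNullOrWhiteSpace($_.'SharePoint ID') -and $_.'SharePoint ID' -notmatch 'MSTeam')
    {
        $_.'SharePoint ID'.Trim() + $add
    }
}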
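As a quick way to try the if-based expression without touching any files, the following sketch runs it against a few in-memory rows. The sample rows and the $rows and $result names are assumptions made for illustration.

```powershell
$add = '@GMAIL.COM'

# Hypothetical sample rows standing in for Import-Csv $file
$rows = @'
Index,SharePoint ID
1,ylkbq
2,
3,MSTeam
4,LMNO83
'@ | ConvertFrom-Csv

$result = $rows | Select-Object *, @{
    Name       = 'SharePoint Email'
    Expression = {
        $id = $_.'SharePoint ID'
        # Only real IDs get an address; blanks and MSTeam entries stay empty
        if (-not [string]::IsNullOrWhiteSpace($id) -and $id -notmatch 'MSTeam') {
            $id.Trim() + $add
        }
    }
}
$result | Format-Table
```

Because -notmatch is case-insensitive by default, both "MSTeam" and "MSTEAM" are skipped.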
How do I fix this:
There are 2 objects in this group,
one contains a name under 'Move-In (Name)', one doesn't:
$rest.Group | select * | Where-Object {$_.SERV_UNIT -eq '2704'}
Account_no : 12345
SERV_UNIT : 2704
FINALREAD :
Move-In (Name) : OWNER / CURRENT TENANT
Move-In (Home Phone #) :
Move-In (Business Phone #) :
Move-In (Email) :
Account_no : 12345
SERV_UNIT : 2704
FINALREAD :
CHANNEL_ID :
Move-In (Name) :
Move-In (Home Phone #) :
Move-In (Business Phone #) :
Move-In (Email) :
I want to select only the one with a move in name.
However, with the code I've written, because they're a part of the same group, neither of them is picked up:
$rest = $allinfo | Group-Object account_no | Where-Object { $_.Group.FINALREAD -contains '' -and $_.Group.'Move-In (Name)' -notcontains ''} | Select * -Unique
$rest now equals nothing. I want it to have one result, the first one with 'Owner' as the Move-in name.
To get the correct result I have to do this:
$rest = $allinfo | Group-Object account_no | Where-Object { $_.Group.FINALREAD -contains ''} | Select * -Unique
$newRest = @()
foreach ($row in $rest.group) {
if ($row.'Move-In (Name)' -notcontains '')
{
$newRest += $row
}
}
Is it possible to do that all within this one line?
$rest = $allinfo | Group-Object account_no | Where-Object { $_.Group.FINALREAD -contains '' -and $_.Group.'Move-In (Name)' -notcontains ''} | Select * -Unique
If not, it's alright, I have an answer.
Thanks for help
Maybe you can remove the records you don't need (empty Move-In name) first, then group by account and keep the account groups that contain one or more blank FINALREAD values. Note that the first Where-Object runs before grouping, so it tests the row's own property rather than $_.Group:
$rest = $allinfo | Where-Object { $_.'Move-In (Name)' } | Group-Object account_no | Where-Object { $_.Group.FINALREAD -contains '' }
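A minimal sketch of the filter-then-group idea on in-memory data follows. The second account (67890) and its values are invented purely for illustration, and $rows is a hypothetical name for the flattened result.

```powershell
# Hypothetical sample: the two rows from the question plus one made-up account
$allinfo = @(
    [pscustomobject]@{ Account_no = '12345'; SERV_UNIT = '2704'; FINALREAD = ''; 'Move-In (Name)' = 'OWNER / CURRENT TENANT' }
    [pscustomobject]@{ Account_no = '12345'; SERV_UNIT = '2704'; FINALREAD = ''; 'Move-In (Name)' = '' }
    [pscustomobject]@{ Account_no = '67890'; SERV_UNIT = '3001'; FINALREAD = '2020-01-01'; 'Move-In (Name)' = 'SOMEONE' }
)

$rest = $allinfo |
    Where-Object { $_.'Move-In (Name)' } |              # drop rows without a move-in name
    Group-Object Account_no |
    Where-Object { $_.Group.FINALREAD -contains '' }    # keep groups with a blank FINALREAD

# Flatten the surviving groups back into rows
$rows = @($rest.Group)
$rows
```

Filtering before grouping means a row with an empty name can no longer disqualify the rest of its group.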
I have 2 csv files
First file:
firstName,secondName
1234,Value1
2345,Value1
3456,Value1
4567,Value3
7645,Value3
Second file:
firstName,fileSplitter,Csv2ColumnOne,Csv2ColumnTwo,Csv2ColumnThree
1234,,1234,abc,Value1
1234,,1234,asd,Value1
3456,,3456,qwe,Value1
4567,,4567,mnb,Value1
I want to insert column secondName in the second file in between columns firstName and fileSplitter.
The result should look like this:
firstName,secondName,fileSplitter,Csv2ColumnOne,Csv2ColumnTwo,Csv2ColumnThree
1234,Value1,,1234,abc,Value1
1234,Value1,,1234,asd,Value1
3456,Value1,,3456,qwe,Value1
4567,Value3,,4567,mnb,Value1
I'm trying the following code:
Function InsertColumnInBetweenColumns
{
Param ($FirstFileFirstColumnTitle, $firstFile, [string]$1stColumnName, [string]$2ndColumnName, [string]$columnMergedFileBeforeInput)
Write-Host "Creating hash table with columns values `"$1stColumnName`" `"$2ndColumnName`" From $OimFileWithMatches"
$hashFirstFileTwoColumns = @{}
Import-Csv $firstFile | ForEach-Object {$hashFirstFileTwoColumns[$_.$1stColumnName] = $_.$2ndColumnName}
Write-Host "Complete."
Write-Host "Appending Merge file with column `"$2ndColumnName`" from file $secondCsvFileWithLocalPath"
Import-Csv $outputCsvFileWithLocalPath | Select-Object $columnMergedFileBeforeInput, @{n=$2ndColumnName; e={
if ($hashFirstFileTwoColumns.ContainsKey($_.$FirstFileFirstColumnTitle)) {
$hashFirstFileTwoColumns[$_.$FirstFileFirstColumnTitle]
} Else {
'Not Found'
}}}, * | Export-Csv "$outputCsvFileWithLocalPath-temp" -NoType -Force
Move-Item "$outputCsvFileWithLocalPath-temp" $outputCsvFileWithLocalPath -Force
Write-Host "Complete."
Write-Host ""
}
This function will be called in a for loop for each column found in the first file (can contain an indefinite number). For testing, I am only using 2 columns from the first file.
I'm getting the following error:
Select : Property cannot be processed because property "firstName" already exists.
At C:\Scripts\Tests\Compare2CsvFilesOutput1WithMatchesOnly.ps1:490 char:43
+ Import-Csv $outputCsvFileWithLocalPath | Select $columnMergedFileBeforeInput, @ ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (@{firstName=L...ntName=asdfas}:PSObject) [Select-Object], PSArgumentException
+ FullyQualifiedErrorId : AlreadyExistingUserSpecifiedPropertyNoExpand,Microsoft.PowerShell.Commands.SelectObjectCommand
I know the issue is where it says Select-Object $columnMergedFileBeforeInput,.
How can I get the loop statement to insert the column in between the before column (name is specified), and append the rest using *?
Update
Just an FYI: changing the line Select-Object $columnMergedFileBeforeInput, @{n=$2ndColumnName..... to Select-Object @{n=$2ndColumnName..... works, but it attaches the columns out of order. That is why I'm trying to insert the column in between. Maybe if I do it this way but insert the columns in reverse order using the for loop, it would work...
Not sure if this is the most efficient way to do it, but it should do the trick. It just adds the property to the record from file2, then reorders the output so secondName is the second column. You can output results to csv where required too (ConvertTo-Csv).
$file1 = Import-Csv -Path file1.csv
$file2 = Import-Csv -Path file2.csv
$results = @()
ForEach ($record In $file2) {
Add-Member -InputObject $record -MemberType NoteProperty -Name secondName -Value $($file1 | ? { $_.firstName -eq $record.firstName } | Select -ExpandProperty secondName)
$results += $record
}
$results | Select-Object -Property firstName,secondName,fileSplitter,Csv2ColumnOne,Csv2ColumnTwo,Csv2ColumnThree
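For larger files, scanning $file1 once per record can get slow. One possible alternative, sketched below under stated assumptions, builds a hashtable lookup first and derives the column order from the data instead of hard-coding it. The in-memory samples and the names $lookup, $newColumn and $props are illustrative and not taken from the answer.

```powershell
# Hypothetical in-memory samples standing in for file1.csv and file2.csv
$file1 = @'
firstName,secondName
1234,Value1
3456,Value1
4567,Value3
'@ | ConvertFrom-Csv

$file2 = @'
firstName,fileSplitter,Csv2ColumnOne,Csv2ColumnTwo,Csv2ColumnThree
1234,,1234,abc,Value1
3456,,3456,qwe,Value1
4567,,4567,mnb,Value1
'@ | ConvertFrom-Csv

# firstName -> secondName lookup, built once
$lookup = @{}
foreach ($r in $file1) { $lookup[$r.firstName] = $r.secondName }

# Calculated property that reads the new column from the lookup
$newColumn = @{ Name = 'secondName'; Expression = { $lookup[$_.firstName] } }

# Keep the original column order and slot the new column in right after firstName
$props = [System.Collections.Generic.List[object]]::new()
foreach ($name in $file2[0].PSObject.Properties.Name) {
    $props.Add($name)
    if ($name -eq 'firstName') { $props.Add($newColumn) }
}

$result = $file2 | Select-Object -Property $props.ToArray()
$result | Format-Table
```

PSObject.Properties preserves the column order of the file, whereas Get-Member returns member names sorted alphabetically.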
I've created the following function. It finds the match column (in this case "firstname") and inserts the matching column into the new array right after the column on which the match is made.
function Add-ColumnAfterMatchingColumn{
[CmdletBinding()]
param(
[string]$MainFile,
[string]$MatchingFile,
[string]$MatchColumnName,
[string]$MatchingColumnName
)
# Import data from two files
$file1 = Import-Csv -Path $MainFile
$file2 = Import-Csv -Path $MatchingFile
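# Find column names. Get-Member returns them sorted alphabetically, so after the
# reversal they are in reverse-alphabetical order, not the file's original column order
$columnnames = $file2 | gm | where {$_.MemberType -like "NoteProperty"} | Select Name | %{$_.Name}
[array]::Reverse($columnnames)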
# Find $MatchColumnName index and put the $MatchingColumnName after it
$MatchColumnNameIndex = [array]::IndexOf($columnnames, $MatchColumnName)
if($MatchColumnNameIndex -eq -1){
$MatchColumnNameIndex = 0
}
$columnnames = $columnnames[0..$MatchColumnNameIndex] + $MatchingColumnName + $columnnames[($MatchColumnNameIndex+1)..($columnnames.Length -1)]
$returnObject = @()
foreach ($item in $file2){
# Find corresponding value MatchingColumnName in $file1 and add it to the current item
$item | Add-Member -Name "$MatchingColumnName" -Value ($file1 | ?{$_."$($MatchColumnName)" -eq $item."$($MatchColumnName)"})."$MatchingColumnName" -MemberType NoteProperty
# Add current item to the returnObject array, in the correct order
$newItem = New-Object psobject
foreach ($columnname in [string[]]$columnnames){
$newItem | Add-Member -Name $columnname -Value $item."$columnname" -MemberType NoteProperty
}
$returnObject += $newItem
}
return $returnObject
}
When you run this function you will have the following output:
Add-ColumnAfterMatchingColumn -MainFile C:\Temp\file1.csv -MatchingFile C:\Temp\file2.csv -MatchColumnName "firstname" -MatchingColumnName "secondname" | ft
firstName secondname fileSplitter Csv2ColumnTwo Csv2ColumnThree Csv2ColumnOne
--------- ---------- ------------ ------------- --------------- -------------
1234 Value1 abc Value1 1234
1234 Value1 asd Value1 1234
3456 Value1 qwe Value1 3456
4567 Value3 mnb Value1 4567
I already have a variable $test that holds a set of users.
Each user entry has 10 columns that represent email addresses.
How can I return only the values that match a specific pattern from the $test variable?
Example of a user entry:
Alias             : User01
EmailAddresses_1  : X500:/o=bla bla bla b
EmailAddresses_2  : x500:/o=bla1 bla1 bla1 bla1
EmailAddresses_3  : smtp:USR1@testdomain1.com
EmailAddresses_4  : smtp:user01@testdomain1.com
EmailAddresses_5  : smtp:user1@testdomain2.com
EmailAddresses_6  : SMTP:user001@testdomain1.com
EmailAddresses_7  : SIP:usr01@testdomain1.com
EmailAddresses_8  : smtp:u1@testdomain2.com
EmailAddresses_9  :
EmailAddresses_10 :
As you can see, some columns are populated with different values and others are empty.
How can I return only the columns with a specific value, assuming I only have the variable to work with?
For example, all the user entries with only the values that start with "SIP:".
A little guiding light is appreciated.
If you are looking for the Alias of all the users that have a property name that starts with EmailAddresses and contains a specific value, this might help you out:
$Test = [PSCustomObject]@{
    Alias = 'User01'
    EmailAddresses_1 = 'X500:/o=bla bla bla b'
    EmailAddresses_2 = 'x500:/o=bla1 bla1 bla1 bla1'
    EmailAddresses_3 = 'smtp:USR1@testdomain1.com'
    EmailAddresses_4 = 'smtp:user01@testdomain1.com'
    EmailAddresses_5 = 'smtp:user1@testdomain2.com'
    EmailAddresses_6 = 'SMTP:user001@testdomain1.com'
    EmailAddresses_7 = 'SIP:usr01@testdomain1.com'
    EmailAddresses_8 = 'smtp:u1@testdomain2.com'
    EmailAddresses_9 = $null
    EmailAddresses_10 = $null
}
$SearchString = 'SIP:'
$Found = Foreach ($T in $Test) {
    # Inspect the properties of the current user, not of the whole collection
    $Properties = $T | Get-Member | Where {($_.MemberType -EQ 'NoteProperty') -and ($_.Name -like 'EmailAddresses*')}
    Foreach ($P in $Properties) {
        if ($T.($P.Name) -like "$SearchString*") {
            $T.Alias
        }
    }
}
$Found | Select -Unique
After clarification in the comments, this might be more what you're looking for:
$SearchString = 'SIP:'
$Test | Select Alias,
@{Name='EmailAddres1';Expression={if ($_.EmailAddresses_1 -like "$SearchString*"){$_.EmailAddresses_1}}},
@{Name='EmailAddres2';Expression={if ($_.EmailAddresses_2 -like "$SearchString*"){$_.EmailAddresses_2}}},
@{Name='EmailAddres3';Expression={if ($_.EmailAddresses_3 -like "$SearchString*"){$_.EmailAddresses_3}}},
@{Name='EmailAddres4';Expression={if ($_.EmailAddresses_4 -like "$SearchString*"){$_.EmailAddresses_4}}},
@{Name='EmailAddres5';Expression={if ($_.EmailAddresses_5 -like "$SearchString*"){$_.EmailAddresses_5}}},
@{Name='EmailAddres6';Expression={if ($_.EmailAddresses_6 -like "$SearchString*"){$_.EmailAddresses_6}}},
@{Name='EmailAddres7';Expression={if ($_.EmailAddresses_7 -like "$SearchString*"){$_.EmailAddresses_7}}},
@{Name='EmailAddres8';Expression={if ($_.EmailAddresses_8 -like "$SearchString*"){$_.EmailAddresses_8}}},
@{Name='EmailAddres9';Expression={if ($_.EmailAddresses_9 -like "$SearchString*"){$_.EmailAddresses_9}}},
@{Name='EmailAddres10';Expression={if ($_.EmailAddresses_10 -like "$SearchString*"){$_.EmailAddresses_10}}}
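Writing one calculated property per column gets unwieldy as the number of columns grows. As a sketch of an alternative, the loop below walks every EmailAddresses* property and keeps only the values that match. The shortened sample object and the $out and $result names are assumptions for illustration, and unlike the Select statement above it keeps the original property names.

```powershell
# Hypothetical, shortened user entry
$Test = [pscustomobject]@{
    Alias            = 'User01'
    EmailAddresses_1 = 'smtp:user01@testdomain1.com'
    EmailAddresses_2 = 'SIP:usr01@testdomain1.com'
    EmailAddresses_3 = $null
}
$SearchString = 'SIP:'

$result = foreach ($user in $Test) {
    # Start each output object with the alias, then copy the address columns
    $out = [ordered]@{ Alias = $user.Alias }
    foreach ($p in $user.PSObject.Properties) {
        if ($p.Name -like 'EmailAddresses*') {
            # Non-matching values are blanked so every object keeps the same columns
            $out[$p.Name] = if ($p.Value -like "$SearchString*") { $p.Value } else { $null }
        }
    }
    [pscustomobject]$out
}
$result
```

Keeping the non-matching columns as empty values means the result still exports cleanly with Export-Csv.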
I have scraped two files from a website in order to list the companies in my city.
The first lists : name, city, phone number, email
The second lists : name, city, phone number
If I merge them I will have duplicate lines. For example:
> "Firm1";"Los Angeles";"000000";"info@firm1.lol"
> "Firm1";"Los Angeles";"000000";""
> "Firm2";"Los Angeles";"111111";""
> "Firm3";"Los Angeles";"000000";"contact@firm3.lol"
> "Firm3";"Los Angeles";"000000";""
> ...
Is there a way to merge the two files and keep the most complete information, like this:
> "Firm1";"Los Angeles";"000000";"info@firm1.lol"
> "Firm2";"Los Angeles";"111111";""
> "Firm3";"Los Angeles";"000000";"contact@firm3.lol"
> ...
Assuming you have a file like this called 'firm.csv':
"Firm1";"Los Angeles";"000000";"info@firm1.lol"
"Firm1";"Los Angeles";"000000";""
"Firm2";"Los Angeles";"111111";""
"Firm3";"Los Angeles";"000000";"contact@firm3.lol"
"Firm3";"Los Angeles";"000000";""
You can load it using :
$firms = import-csv C:\temp\firm.csv -Header 'Firm','Town','Tel','Mail' -Delimiter ';'
Then
$firms | Sort-Object -Unique -Property 'Firm'
According to Joey's comment I improved the solution :
$firms | Group-Object -Property 'firm' | % {$_.group | Sort-Object -Property mail -Descending | Select-Object -first 1}
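To see why sorting each group by mail in descending order keeps the most complete record, here is a small self-contained sketch. The in-memory sample (with the Firm3 lines deliberately swapped so the empty one comes first) and the $result name are assumptions for illustration.

```powershell
# Hypothetical in-memory sample standing in for firm.csv
$firms = @'
"Firm1";"Los Angeles";"000000";"info@firm1.lol"
"Firm1";"Los Angeles";"000000";""
"Firm2";"Los Angeles";"111111";""
"Firm3";"Los Angeles";"000000";""
"Firm3";"Los Angeles";"000000";"contact@firm3.lol"
'@ | ConvertFrom-Csv -Header 'Firm','Town','Tel','Mail' -Delimiter ';'

$result = $firms | Group-Object -Property Firm | ForEach-Object {
    # An empty string sorts before any address, so descending order puts the filled-in record first
    $_.Group | Sort-Object -Property Mail -Descending | Select-Object -First 1
}
$result | Format-Table
```

Unlike Sort-Object -Unique, this does not depend on which duplicate happens to come first in the file.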
EDIT: just realized the two files don't contain the same headers. Here is an update.
$main = Import-Csv firm1.csv -Header 'Firm','Town','Tel','Mail' -Delimiter ";"
$alt = Import-Csv firm2.csv -Header 'Firm','Town','Tel' -Delimiter ";"
foreach ($f in $alt)
{
$found = $false
foreach($g in $main)
{
if ($g.Firm -eq $f.Firm -and $g.Town -eq $f.Town)
{
$found = $true
if ($g.Tel -eq "")
{
$g.Tel = $f.Tel
}
}
}
if ($found -eq $false)
{
$main += $f
}
}
# Everything is merged into the $main array
$main
There must be a better approach, but this is one costly way to do it.
$firms = import-csv C:\firm.csv -Header 'Firm','Town','Tel','Mail' -Delimiter ';'
$Result = @()
ForEach($i in $firms){
$found = 0;
ForEach($m in $Result){
if($m.Firm -eq $i.Firm){
$found = 1
if( $i.Mail.length -ne 0 )
{
$m.Mail = $i.Mail
}
break;
}
}
if($found -eq 0){
$Result += [pscustomobject] @{Firm=$i.Firm; Town=$i.Town; Tel=$i.Tel; Mail=$i.Mail}
}
}
$Result | export-csv C:\out.csv