PowerShell - Trying to combine data from 2 CSV files into one based on column value - powershell

Long-time listener, first-time caller.
Normally I am pretty good at finding and digging and getting what I need and then modifying it to suit. This one seems to be a little trickier than what I have managed to pull off before. I am self taught in PowerShell mostly out of curiosity to see what I can do.
I am trying to create a report from data from 2 CSVs, and "most" of the data in the 2 CSVs are identical. There is simply 1 column of data in one of the CSVs that I want to add to the other one. I live regularly in the world of excel and I can do this with a formula in a matter of seconds [=VLOOKUP(H8,C:C,2,FALSE)] but accomplishing the same goal in PowerShell seems to be eluding me.
As I mentioned, I tend to try and find others who have done similar things and modify their code. The best-sounding one I found is here (Combine data from 2 CSV files into 1 new CSV using Powershell), and I am still trying to play with the code from that post. Sometimes I find something and stick with it too long when there might be another command, one I am not familiar with, that is better suited to the job, and I might just need a pointer in that direction.
But here is a visual representation of what I am trying to do.
And every email address in File 2, is present in File 1.

Use Import-Csv to parse both CSV input files into arrays of [pscustomobject] instances for OOP processing.
For file 2, build a hashtable that maps the Email column values to their License values.
Then use a calculated property with Select-Object to append a property to the objects parsed from file 1, using the hashtable to map each Email property to the License value from file 2; if there is no hashtable entry for a given Email property value, $null is returned, which in the context of exporting to CSV (with Export-Csv) amounts to an empty field (column value).
# Import file 2 and create a hashtable that maps each Email
# column value to the License column value.
$ht = @{}
Import-Csv File2 | ForEach-Object { $ht[$_.Email] = $_.License }
# Import file 1 and append a License column that contains
# the license value from file 2 if the Email column value matches.
Import-Csv File1 |
    Select-Object *, @{ Name='License'; Expression={ $ht[$_.Email] } }
    # | Export-Csv ... # complete as needed
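To see the lookup in action without touching any files, here is a minimal sketch that uses in-memory sample data. The names, email addresses, and license values are made up for illustration; only the Email and License column names come from the answer above. Bob has no entry in the second data set, so his License ends up as $null, which becomes an empty field in the CSV output.

```powershell
# Hypothetical stand-ins for file 1 and file 2 (sample data, not from the question).
$file1 = @'
Name,Email
Alice,alice@example.com
Bob,bob@example.com
'@ | ConvertFrom-Csv

$file2 = @'
Email,License
alice@example.com,E3
'@ | ConvertFrom-Csv

# Map each Email to its License (hashtable literals use case-insensitive keys).
$ht = @{}
foreach ($row in $file2) { $ht[$row.Email] = $row.License }

# Append a License property to every row of file 1 via a calculated property.
$merged = $file1 |
    Select-Object *, @{ Name = 'License'; Expression = { $ht[$_.Email] } }

# Render as CSV text; with real files this would be Export-Csv instead.
$merged | ConvertTo-Csv -NoTypeInformation
```

This is the same idea as the Excel VLOOKUP: the hashtable plays the role of the lookup range, and the calculated property is the formula applied to each row.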

Related

Create Record using Headers from a .csv

<EDIT: I kind of have it working, but in order to get it to work, my template csv has to have a blank line for every line I am going to be adding to it. So, if I could figure out how to add lines to the imported empty (just a header row) csv file, I could then use export-csv at the end. (It would be somewhat slower, but it would at least work.)>
I am creating a .csv file in PowerShell. The output file has 140 columns. Many of them are null.
I started out just doing
$out = 'S-'+$Snum+',,,,,TRUE,,,,,'+'S-'+$Snum+',"'
$out = $out + '{0:d9}' -f $item.SupplierCode2
until I had filled all the columns with the correct value. But, the system that is reading the output keeps changing the column locations. So, I wanted to take the header row from the template for the system and use that to name the columns. Then, if the columns change location, it won't matter because I will be referring to it by name.
Because there are so many columns, I'm trying to avoid a solution that has me enter all the column names. By using a blank .csv with just the headers, I can just paste that into the csv whenever it changes and I won't have to change my code.
So, I started by reading my csv file in so I can use the headers.
$TempA = Import-Csv -Path $Pathta -Encoding Default
Then I was hoping I could do something like this:
$TempA.'Supplier Key' = "S-$Snum"
$TempA.'Auto Complete' = "TRUE"
$TempA.'Supplier ID' = "S-$Snum"
$tempA.'Supplier Data - Supplier Reference ID' = '{0:d9}' -f $item.SupplierCode2
I would only need to fill in the fields that have values, everything else would be null.
Then I was thinking I could write out this record to a file. My old write looked like this
$writer2.WriteLine($out)
I wanted to write the line from the new csv line instead
$writer2.WriteLine($TempA)
I'd rather use streams if I can because the files are large and using add-Content really slows things down.
I know I need to do something to add a line to $TempA and I would like each loop to start with a new line (with all nulls) because there are times when certain lines only have a small subset of the values populated.
Clearly, I'm not taking the correct approach here. I'd really appreciate any advice anyone can give me.
Thank you.
If you only want to fill in certain fields, and don't mind using Export-Csv you can use the -append and -force switches, and it will put the properties in the right places. For example, if you had the template CSV file with only the column names in it you could do:
$Output = ForEach($item in $allItems){
    [PSCustomObject]@{
        'Supplier Key' = "S-$Snum"
        'Auto Complete' = "TRUE"
        'Supplier ID' = "S-$Snum"
        'Supplier Data - Supplier Reference ID' = '{0:d9}' -f $item.SupplierCode2
    }
}
$Output | Export-Csv -Path $Pathta -Append -Force
That would create objects with only the four properties that you are interested in, and then output them to the CSV in the correct columns, adding commas as needed to create blank values for all other columns.
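If you would rather keep writing through your own stream, a different approach (not the one above) is to build a blank record from the header row on each pass and fill in only the fields you need. The sketch below is just an illustration under some assumptions: the header line is a shortened, made-up version of your 140 columns, none of the header names contain commas, and $Snum and $item hold sample values.

```powershell
# Hypothetical, shortened header row. With a real template it could be read
# with: Get-Content -Path $Pathta -TotalCount 1
$headerLine = '"Supplier Key","Name","Auto Complete","Supplier ID","Supplier Data - Supplier Reference ID"'

# Assumes no header name contains a comma.
$columns = $headerLine -split ',' | ForEach-Object { $_.Trim('"') }

# Sample values standing in for the loop variables in the question.
$Snum = 42
$item = [pscustomobject]@{ SupplierCode2 = 1234 }

# Start every record with all columns set to $null, in template order.
$row = [ordered]@{}
foreach ($name in $columns) { $row[$name] = $null }
$record = [pscustomobject]$row

# Fill in only the fields that have values, by name.
$record.'Supplier Key' = "S-$Snum"
$record.'Auto Complete' = 'TRUE'
$record.'Supplier ID' = "S-$Snum"
$record.'Supplier Data - Supplier Reference ID' = '{0:d9}' -f $item.SupplierCode2

# Element 0 is the header line, element 1 is the data line.
$csvLines = @($record | ConvertTo-Csv -NoTypeInformation)
$csvLines
```

With this shape, the data line ($csvLines[1]) is a plain string, so it could be passed to something like $writer2.WriteLine() inside your loop, and a column that moves in the template only changes the header line, not your code.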

Windows Powershell - trouble importing CSV and iterating

I'm new to Powershell (of course), and having troubles with a seemingly simple process. I have found a couple of examples that I think I am following, but they aren't working for me.
What I am trying to do: add a bunch of users to the local Windows OS, by reading from a CSV file (has names, usernames, passwords, etc).
My understanding is that the 'Import-CSV' cmdlet is supposed to return an object-like thing you can iterate over:
"The result of an Import-Csv command is a collection of strings that
form a table-like custom object."
When I perform that step, saving it to a variable, it seems that there is only ever 1 row present. And if I don't provide the "-Header" parameter, I get errors about a 'member is already present'... even if I include the header in the CSV file (my original file did not include a header row in the CSV file.)
I have tried various methods trying to get a Count of the imported CSV results, just trying to see what the data is, but I'm not having any luck. (MS Docs say you can use the Count property.)
MS Docs (https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/import-csv?view=powershell-7.2) say this about "Import-CSV":
Outputs
Object
This cmdlet returns the objects described by the content in the CSV
file.
...
Notes
Because the imported objects are CSV versions of the object type...
The result of an Import-Csv command is a collection of strings that
form a table-like custom object. Each row is a separate string, so you
can use the Count property of the object to count the table rows. The
columns are the properties of the object and items in the rows are the
property values.
An example of my input CSV file:
"ISA","LOG","Consulting & Other","Vendor","Isalog","alsdkjfalsdjflasdkfjalsdkfjlaksdjflkasdfj"
"Bry","Link","Bry Link","Vendor","Bry","asdkfjalsdjflaksdjflasdkjflaksdfj"
"Michael","Had","Premier Service Of Western","Vendor","Michael","alsdkfjalskdjflaksdjflaksdfjalksdfj"
Code of one example that I am testing:
param ($InputFile)
Write-Host "Provided input file: $InputFile"
$CSV = Import-CSV -Path $InputFile -Header 'FirstName', 'LastName', 'FirmName', 'Type', 'Username', 'Password'
foreach($LINE in $CSV)
{
    $NewUser="$($LINE.USERNAME)"
    $NewPass="$($LINE.PASSWORD)"
    $SecurePass=ConvertTo-SecureString -AsPlainText -Force -String "$NewPass"
    Write-Host "User = $NewUser"
    #New-LocalUser -Name $NewUser -Password $SecurePass
}
And a screenshot of my script plus the run results:
Running on: Windows server 2019 datacenter.
Powershell version: 5.1
The ultimate answer was that the character encoding of the CSV file I was using as input was causing problems for PowerShell. Specifically, the line-ending encoding.
My original file was created on a Mac. The line-ending encoding was 'Macintosh (CR)'. The files that worked OK were created on this Windows machine and used the line-ending encoding 'Windows (CR LF)'.
Thanks to Olaf who got me thinking about this issue and made me investigate that area further.
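For anyone who hits the same thing and cannot re-save the file, one possible workaround is to split the raw text on any kind of line ending yourself and hand the resulting lines to ConvertFrom-Csv. This is only a sketch: the text below is a made-up, shortened version of my data with CR-only endings, and with a real file the raw text would come from something like Get-Content -Raw.

```powershell
# Simulated file content with classic Mac (CR-only) line endings.
$raw = '"Bry","Link","Vendor"' + "`r" + '"Michael","Had","Vendor"' + "`r"

# Split on CRLF, a lone CR, or a lone LF, and drop the empty trailing entry.
$lines = $raw -split '\r\n|\r|\n' | Where-Object { $_ -ne '' }

# Parse the individual lines; the header names here are illustrative.
$rows = @($lines | ConvertFrom-Csv -Header 'FirstName', 'LastName', 'Type')

$rows.Count
$rows | ForEach-Object { Write-Host "User = $($_.FirstName)" }
```

Because each line is passed in separately, the parser no longer depends on recognizing the line endings in the file.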

Rename Files with Index(Excel)

Anyone have any ideas on how to rename files by finding an association with an index file?
I have a file/folder structure like the following:
Folder name = "Doe, John EO11-123"
Several files under this folder
The index file(MS Excel) has several columns. It contains the names in 2 columns(First and Last). It also has a column containing the number EO11-123.
What I would like to do is write maybe a script to look at the folder names in a directory, compare/find an associated value in the index file(like that number EO11-123) and then rename all the files under the folder using a 4th column value in the index.
So,
Folder name = "Doe, John EO11-123", index column1 contains same value "EO11-123", use column2 value "111111_000000" and rename all the files under that directory folder to "111111_000000_0", "111111_000000_1", "111111_000000_2" and so on.
Is this possible with PowerShell or VBScript?
Ok, I'll answer your questions in your comment first. Importing the data into PowerShell allows you to make an array in powershell that you can match against, or better yet make a HashTable to reference for your renaming purposes. I'll get into that later, but it's way better than trying to have PowerShell talk to Excel and use Excel's search functions because this way it's all in PowerShell and there's no third party application dependencies. As for importing, that script is a function that you can load into your current session, so you run that function and it will automatically take care of the import for you (it opens Excel, then opens the XLS(x) file, saves it as a temp CSV file, closes Excel, imports that CSV file into PowerShell, and then deletes the temp file).
Now, you did not state what your XLS file looks like, so I'm going to assume it's got a header row, and looks something like this:
FirstName | Last Name | Identifier | FileCode
Joe | Shmoe | XA22-573 | JS573
John | Doe | EO11-123 | JD123
If that's not your format, you'll need to either adapt my code, or your file, or both.
So, how do we do this? First, download, save, and if needed unblock the script to Import-XLS. Then we will dot source that file to load the function into the current PowerShell session. Once we have the function we will run it and assign the results to a variable. Then we can make an empty hashtable, and for each record in the imported array create an entry in the hashtable where the 'Identifier' property (in your example above that would be the one that has the value "EO11-123" in it), make that the Key, then make the entire record the value. So, so far we have this:
#Load function into current session
. C:\Path\To\Import-XLS.ps1
$RefArray = Import-XLS C:\Path\To\file.xls
$RefHash = @{}
$RefArray | ForEach { $RefHash.Add($_.Identifier, $_) }
Now you should be able to reference the identifier to access any of the properties for the associated record such as:
PS C:\> $RefHash['EO11-123'].FileCode
JD123
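If you want to try the lookup without Excel or Import-XLS, here is a small sketch that builds the same kind of hashtable from in-memory text. It assumes the sample layout shown earlier (the table I made up), with ConvertFrom-Csv standing in for the imported spreadsheet.

```powershell
# Stand-in for the result of Import-XLS, using the assumed sample layout.
$RefArray = @'
FirstName,Last Name,Identifier,FileCode
Joe,Shmoe,XA22-573,JS573
John,Doe,EO11-123,JD123
'@ | ConvertFrom-Csv

# Key each record by its Identifier; the whole record is the value.
$RefHash = @{}
foreach ($record in $RefArray) { $RefHash[$record.Identifier] = $record }

# Any property of the matching record is now one lookup away.
$code = $RefHash['EO11-123'].FileCode
$code

# ContainsKey lets you skip folders whose identifier is not in the index.
$missing = $RefHash.ContainsKey('ZZ99-000')
$missing
```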
Now, we just need to extract that name from the folder, and rename all the files in it. Pretty straight forward from here.
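Get-ChildItem c:\Path\to\Folders -Directory | Where-Object { $_.Name -match "(?<= )(\S+)$" } |
    ForEach-Object {
        $Files = Get-ChildItem $_.FullName
        # Look up the record by the identifier captured from the folder name
        $NewName = $RefHash[$Matches[1]].FileCode
        # Start at 0 so the first file is renamed too
        For ($i = 0; $i -lt $Files.Count; $i++) {
            # Braces keep the underscore out of the variable name
            $Files[$i] | Rename-Item -NewName "${NewName}_$i"
        }
    }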
Edit: Ok, let's break down the rename process here. It is a lot of piping here, so I'll try and take it step by step. First off we have Get-ChildItem that gets a list of folders for the path you specify. That part's straight forward enough. Then it pipes to a Where statement, that filters the results checking each one's name to see if it matches the Regular Expression "(?<= )(\S+)$". If you are unfamiliar with how regular expressions work you can see a fairly good breakdown of it at https://regex101.com/r/zW8sW1/1. What that does is matches any folders that have more than one "word" in the name, and captures the last "word". It saves that in the automatic variable $Matches, and since it captured text, that gets assigned to $Matches[1]. Now the code breaks down here because your CSV isn't laid out like I had assumed, and you want the files named differently. We'll have to make some adjustments on the fly.
So, those folders that pass the filter will get piped into a ForEach loop (which I had a typo in previously and had a ( instead of {, that's fixed now). So for each of those folders it starts off by getting a list of files within that folder and assigning them to the variable $Files. It also sets up the $NewName variable, but since you don't have a column in your CSV named 'FileCode' that line won't work for you. It uses the $Matches automatic variable that I mentioned earlier to reference the hashtable that we set up with all of the Identifier codes, and then looks at a property of that specific record to set up the new name to assign to files. Since what you want and what I assumed are different, and your CSV has different properties, we'll re-work both the previous Where statement and this line a little bit. Here's how that bit of the script will now read:
Get-ChildItem c:\Path\to\Folders -directory | Where{$_.Name -match "^(.+?), .*? (\S+)$"}|
    ForEach{
        $Files = Get-ChildItem $_.FullName
        $NewName = $Matches[2] + "_" + $Matches[1]
That now matches the folder name in the Where statement and captures 2 things. The first thing it grabs is everything at the beginning of the name before the comma. Then it skips everything until it gets to the last piece of text at the end of the name and captures everything after the last space. New breakdown on RegEx101: https://regex101.com/r/zW8sW1/2
So you want the ID_LName, which can be gotten from the folder name, there's really no need to even use your CSV file at this point I don't think. We build the new name of the files based off the automatic $Matches variable using the second capture group and the first capture group and putting an underscore between them. Then we just iterate through the files with a For loop basing it off how many files were found. So we start with the first file in the array $Files (record 0), add that to the $NewName with an underscore, and use that to rename the file.
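Putting the pieces together, here is a rough end-to-end sketch you can run safely, since it works in a throwaway folder under the temp directory. The folder name comes from your example; the three sample files and their names are made up. One assumption that differs from your stated target names: I keep each file's original extension, so drop the Extension part if you really want bare names.

```powershell
# Throwaway test area with one folder named like the example.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("rename-demo-" + [guid]::NewGuid())
$folder = New-Item -ItemType Directory -Force -Path (Join-Path $root 'Doe, John EO11-123')
'a', 'b', 'c' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $folder.FullName "$_.txt") | Out-Null
}

Get-ChildItem -LiteralPath $root -Directory |
    Where-Object { $_.Name -match '^(.+?), .*? (\S+)$' } |
    ForEach-Object {
        # $Matches[1] is the last name, $Matches[2] is the trailing ID.
        $newName = $Matches[2] + '_' + $Matches[1]
        $files = @(Get-ChildItem -LiteralPath $_.FullName -File | Sort-Object Name)
        for ($i = 0; $i -lt $files.Count; $i++) {
            # Braces stop PowerShell from reading "$newName_" as the variable name.
            $files[$i] | Rename-Item -NewName "${newName}_$i$($files[$i].Extension)"
        }
    }

# Collect the resulting names, then clean up the test area.
$renamed = @(Get-ChildItem -LiteralPath $folder.FullName -File |
    Sort-Object Name | Select-Object -ExpandProperty Name)
$renamed
Remove-Item -LiteralPath $root -Recurse -Force
```

Wrapping Get-ChildItem in @( ) keeps $files an array even when a folder holds a single file, so the index-based loop behaves the same either way.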

Simple PowerShell Script to loop through 1 CSV file to create a new CSV file from another

I know the title sounds confusing, but once I describe this, I'm certain there is a very easy way to perform what I need to do. I'm very new to PowerShell and am trying to perform a specific task that seems rather difficult to find a good answer for on the Web.
I have spent the past several days searching through methods of concatenating the data and joining the files together, but nothing that was specific enough to this task. All examples show how to display data, but nothing that loops through and adds data together to create a new csv file. If anything, I've over-researched this issue to the point of having to pose this message to see where I can get my brain de-cluttered with all of the useless options I've already tried...
I have two csv files. I call them csv's, but they are really just a single column of information each.
Files:
Users.csv
Offices.csv
The Users.csv file has a list of network user names. The Offices.csv file has a list of numbers that correspond to office locations.
What I want to have happen is to use a loop that will take each user from the users.csv file and create a new line in a separate csv file that adds each of the offices to it.
EXAMPLE:
Users.csv
NTNAME
domain\user1
domain\user2
domain\user3
Offices.csv
OFFICES
0001
0023
0043
0067
When combined, I would like a csv file that looks like this:
NTNAME,OFFICES
domain\user1,0001
domain\user1,0023
domain\user1,0043
domain\user1,0067
domain\user2,0001
domain\user2,0023
domain\user2,0043
domain\user2,0067
domain\user3,0001
domain\user3,0023
domain\user3,0043
domain\user3,0067
Any help you can give would be greatly appreciated...
Borrowing Shay's awesome CSV field enumeration code:
$offices = Import-Csv 'C:\path\to\offices.csv'
Import-Csv 'C:\path\to\users.csv' | % {
    foreach ($prop in $_.PSObject.Properties) {
        $offices | select @{n=$prop.Name;e={$prop.Value}}, OFFICES
    }
} | Export-Csv 'C:\path\to\combined.csv' -NoTypeInformation
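If the property enumeration feels like too much magic, the same result can be sketched as two plain nested loops, since what you want is every user paired with every office. This version uses the sample data from the question as in-memory text, so nothing here reads or writes your real files.

```powershell
# Sample data from the question, parsed from text instead of files.
$users = @'
NTNAME
domain\user1
domain\user2
domain\user3
'@ | ConvertFrom-Csv

$offices = @'
OFFICES
0001
0023
0043
0067
'@ | ConvertFrom-Csv

# Pair every user with every office (a cross join).
$combined = foreach ($user in $users) {
    foreach ($office in $offices) {
        [pscustomobject]@{ NTNAME = $user.NTNAME; OFFICES = $office.OFFICES }
    }
}

# With real files, pipe $combined to Export-Csv -NoTypeInformation instead.
$combined | ConvertTo-Csv -NoTypeInformation
```

The office numbers stay strings the whole way through, so the leading zeros are preserved.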

looping through a csv

just wondering if i could do this in powershell, or even a c#/vb.net command line program.
I have data that looks like this:
(source: kalleload.net)
I have a Teams Table. It looks like this:
| id | teamname | teamcity |
so for example, C2 has the value "Atlanta Braves". I need to split this up into "Atlanta" and "Braves". Data is consistent. for example "New York Mets" is actually "NewYork Mets".
So I need to go through columns C and D and insert all the teams into the db (no duplicates).
One line of PowerShell will read in the CSV file and create a custom object for each home and away team listing (with a property for the city name and for the team name). The last command in the pipeline will eliminate the duplicates.
$TeamsAndCities = import-csv -path c:\mycsvfile.csv | foreach-object { $_.away, $_.home | select-object @{Name='City';Expression={$_.split(' ')[0]}}, @{Name='Team';Expression={$_.split(' ')[1]}} } | select-object -unique
You can do database access from PowerShell as well, but that might be suited to a new question with some more details about the database you are connecting to.
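Here is a slightly more spelled-out sketch of the same idea that runs against in-memory sample rows. The away and home column names match the one-liner above; the id and date columns and the game rows themselves are made up. It uses Sort-Object -Unique on City and Team rather than Select-Object -Unique, which also leaves the list sorted.

```powershell
# Hypothetical sample rows; only the away/home column names come from above.
$games = @'
id,date,away,home
1,2009-04-05,Atlanta Braves,NewYork Mets
2,2009-04-06,NewYork Mets,Atlanta Braves
3,2009-04-07,Chicago Cubs,Atlanta Braves
'@ | ConvertFrom-Csv

$teams = $games |
    ForEach-Object { $_.away, $_.home } |
    ForEach-Object {
        # Split at the first space only: city on the left, team on the right.
        $city, $team = $_ -split ' ', 2
        [pscustomobject]@{ City = $city; Team = $team }
    } |
    Sort-Object -Property City, Team -Unique

$teams
```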
I rarely code in VBA/VB but...
Something like
Dim rngAwayTeam As Range, rngHomeTeam As Range
Set rngAwayTeam = Worksheets("YourWorksheet").Range("C2")
Set rngHomeTeam = Worksheets("YourWorksheet").Range("D2")
Dim rowOffset As Integer
' Start at 0 so the first data row (row 2) is included
rowOffset = 0
' Offset(rows, columns): a column offset of 0 stays in the same column
Do While (rngAwayTeam.Offset(rowOffset, 0).Text <> "")
    'Do something with rngAwayTeam.Offset(rowOffset, 0).Text
    'and rngHomeTeam.Offset(rowOffset, 0).Text
    rowOffset = rowOffset + 1
Loop
There are other ways I'm sure, but, here is what I would do.
Yes that is an excel macro. Again, I rarely use VBA or .Net, just trying to help you out the best I can. You could just use a C# COM object for the database side of things. (Still new, can't comment.)
You can do it in C# console application quite easily.
All you have to do is loop through each line in the file, adding it to an array using split on the comma (,).
Then you can use your array to display the values or retrieve a specific value on a row.