I'm trying (badly) to work through combining CSV files into one file and prepending a column that contains the file name. I'm new to PowerShell, so hopefully someone can help here.
I initially tried the well-documented approach of using Import-Csv / Export-Csv, but I don't see any option to add columns.
Get-ChildItem -Filter *.csv | Select-Object -ExpandProperty FullName | Import-Csv | Export-Csv CombinedFile.txt -UseQuotes Never -NoTypeInformation -Append
Next I'm trying to loop through the files and prepend the name, which kind of works, but for some reason it stops after the first row is generated. Since it's not a CSV-aware process, I have to use the switch to skip the title row of each file after the first.
$getFirstLine = $true
Get-ChildItem -Filter *.csv | Where-Object {$_.Name -NotMatch "Combined.csv"} | foreach {
    $filePath = $_
    $collection = Get-Content $filePath
    foreach($lines in $collection) {
        $lines = ($_.Basename + ";" + $lines)
    }
    $linesToWrite = switch($getFirstLine) {
        $true {$lines}
        $false {$lines | Select -Skip 1}
    }
    $getFirstLine = $false
    Add-Content "Combined.csv" $linesToWrite
}
This is where the -PipelineVariable parameter comes in real handy. You can set a variable to represent the current iteration in the pipeline, so you can do things like this:
Get-ChildItem -Filter *.csv -PipelineVariable File | Where-Object {$_.Name -NotMatch "Combined.csv"} | ForEach-Object { Import-Csv $File.FullName } | Select *,@{l='OriginalFile';e={$File.Name}} | Export-Csv Combined.csv -NoTypeInformation
Merging your CSVs into one and adding a column for the file's name can be done as follows, using a calculated property on Select-Object:
Get-ChildItem -Filter *.csv | ForEach-Object {
    $fileName = $_.Name
    Import-Csv $_.FullName | Select-Object @{
        Name       = 'FileName'
        Expression = { $fileName }
    }, *
} | Export-Csv path/to/merged.csv -NoTypeInformation
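One caveat, assuming merged.csv ends up in the same folder you are scanning: exclude it from Get-ChildItem so it doesn't get merged into itself on a later run (the -PipelineVariable answer above does the same with Combined.csv). For example:
Get-ChildItem -Filter *.csv |
    Where-Object { $_.Name -ne 'merged.csv' } |   # skip the output file itself
    ForEach-Object {
        $fileName = $_.Name
        Import-Csv $_.FullName |
            Select-Object @{ Name = 'FileName'; Expression = { $fileName } }, *
    } | Export-Csv merged.csv -NoTypeInformation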
In this script I'm getting a collection of CSV files, performing a replace, storing the results in an (initially empty) array, and attempting to export it to CSV.
$CSVFiles = Get-ChildItem "C:\GALIC\Test\Test2\WindowsLists\*.csv" -Exclude M*
$AllJobsList = $CSVFiles | ForEach { (Import-CSV $_ -Delimiter ',' | Select 'Agent', 'Name', 'Folder' | Where-Object {$_.Agent -like "*AGENTGROUP*"})}
$UpdatedGroupsList = @()
$AllJobsList | Export-Csv -Path "C:\GALIC\Test\Test2\WindowsLists\FullJobs-Test.csv" -NoTypeInformation -Force
$CSVContent = Get-Content "C:\GALIC\Test\Test2\WindowsLists\FullJobs-Test.csv"
foreach($line in $CSVContent)
{
    if($line.Contains('|') -and $line.Contains('HOSTG'))
    {
        #Write-Host $line
        $null = $line.Replace('|', '').Replace('HOSTG', '')
        #Write-Host $LineReplace
        $UpdatedGroupsList += $line
    }
}
$UpdatedGroupsList | Export-CSV -Path "C:\GALIC\Test\Test2\WindowsLists\UpdatedFullJobs.csv" -NoTypeInformation -Force
($CSVContent on down is what's giving me issues.)
After opening the CSV file, the content looks nothing like what I'm expecting. Any ideas/suggestions?
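Two things stand out in that block: the value returned by Replace() is assigned to $null and thrown away (strings are immutable, so Replace() returns a new string rather than changing $line), and piping plain strings to Export-CSV writes only their Length property, which would explain output that looks nothing like the source. A rough sketch of one way around both issues, reusing the same paths and writing with Set-Content since the kept lines are already CSV text:
$CSVContent = Get-Content "C:\GALIC\Test\Test2\WindowsLists\FullJobs-Test.csv"
$UpdatedGroupsList = foreach ($line in $CSVContent)
{
    if ($line.Contains('|') -and $line.Contains('HOSTG'))
    {
        # keep the string Replace() returns instead of discarding it
        $line.Replace('|', '').Replace('HOSTG', '')
    }
}
# the kept lines are already comma-separated text, so write them out as-is
$UpdatedGroupsList | Set-Content -Path "C:\GALIC\Test\Test2\WindowsLists\UpdatedFullJobs.csv" -Force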
I use PowerShell to automate extracting selected data from a CSV file.
My $target_servers list also contains the same server name twice, but each of those rows has different data.
Here is my code:
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
foreach($server in $target_servers) {
    Import-Csv $path\Serverlist_Template.csv | Where-Object {$_.Hostname -Like $server} | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
}
After executing the above code it extracts the CSV data based on the TXT file, but my problem is that some of the results are duplicated.
I am expecting around 28 results but it gave me around 49.
As commented, -Append is the culprit here; you should check that the records you are about to add are not already present in the output file:
# read the Hostname column of the target csv file as array to avoid duplicates
$existingHostsNames = @((Import-Csv -Path "$path/windows_prd.csv").Hostname)
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
foreach($server in $target_servers) {
    Import-Csv "$path\Serverlist_Template.csv" |
        Where-Object { ($_.Hostname -eq $server) -and ($existingHostsNames -notcontains $_.Hostname) } |
        Export-Csv -Path "$path/windows_prd.csv" -Append -NoTypeInformation
}
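As a side note, that loop re-imports Serverlist_Template.csv once for every server in the text file. Assuming exact Hostname matches are what you want (as with -eq above), a variation that reads the template once and filters in a single pass would also sidestep the duplicated server names:
# same paths as above; -in / -notin require PowerShell 3.0 or later
$existingHostsNames = @((Import-Csv -Path "$path/windows_prd.csv").Hostname)
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt

Import-Csv "$path\Serverlist_Template.csv" |
    Where-Object { ($_.Hostname -in $target_servers) -and ($_.Hostname -notin $existingHostsNames) } |
    Export-Csv -Path "$path/windows_prd.csv" -Append -NoTypeInformation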
You can convert your data to an array of objects and then use select -Unique, like this:
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
$data = @()
foreach($server in $target_servers) {
    $data += Import-Csv $path\Serverlist_Template.csv | Where-Object {$_.Hostname -Like $server}
}
$data | select -Unique | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
This will only work if duplicated rows have the same value in every column. If not, you can pass to select only the column names that matter to you. For example:
$data | select Hostname -Unique | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
That will give you a list of unique hostnames.
I'm trying to merge CSV files in PowerShell. I've read numerous answers here but I'm stuck on this problem.
I have a list of CSV files with two difficulties:
[A] each file has a metadata line; the headers are on the second line.
[B] each file has the same structure, but sometimes quotes surround a column to escape its content.
Thanks to this question: Merging multiple CSV files into one using PowerShell,
I'm able to solve these two problems individually.
However, I'm stuck on combining the solutions.
Partial solution A
Skips every file's metadata line, as well as the header of each subsequent file
Adapting the answer from kemiller2002:
$sourcefilefolderPath = "C:\CSV_folder"
$destinationfilePath = "C:\appended_files.csv"
$getHeader = $true
Get-ChildItem -Path $sourcefilefolderPath -Filter *.csv -Recurse | foreach {
    $filePath = $_.FullName
    $lines = Get-Content $filePath
    $linesToWrite = switch($getHeader) {
        $true {$lines | Select -Skip 1} # skips only the metadata line
        $false {$lines | Select -Skip 2} # skips both the metadata line and the headers
    }
    $getHeader = $false
    Add-Content $destinationfilePath $linesToWrite
}
The problem: Import-Csv $destinationfilePath gives inconsistent results, as the quoting can be different for each source file.
Partial solution B
successfully handles randomly quoted columns
Solution provided by stinkyfriend.
Import-Csv seems to import the data gracefully when the column quoting, even if it differs from one column to another, is consistent across the lines of the source file.
I could not combine this solution with the one above.
Get-ChildItem -Path $sourcefilefolderPath -File -Filter *.csv -Recurse |
    Select-Object -ExpandProperty FullName |
    Import-Csv |
    Export-Csv $destinationfilePath -NoTypeInformation -Append
Thanks a lot for your help!
Solution C
produces a blank file on my PC
Using the suggestion from Mathias R. Jessen:
Get-ChildItem -Path $sourcefilefolderPath -File -Filter *.csv -Recurse | foreach {
    Write-Host $_.FullName |
    Get-Content $_.FullName | Select-Object -Skip 1 | ConvertFrom-Csv |
    Export-Csv $destinationfilePath -NoTypeInformation -Append
}
--- EDIT ---
RESULT
I could solve the problem by creating appended_files.csv from the first matching source file and then appending to it.
$pattern_sourceFile = "*.csv*"
$list_files = Get-ChildItem -Path $sourcefilefolderPath -File -Recurse | Where {
    $_.FullName -like $pattern_sourceFile }

Get-Content $list_files[0].FullName |
    Select-Object -Skip 1 | # skips metadata line
    ConvertFrom-Csv | Export-Csv $destinationfilePath -NoTypeInformation

$list_files |
    Select-Object -Skip 1 | # skips $list_files[0]
    foreach { Get-Content $_.FullName |
        Select-Object -Skip 1 | # skips metadata line
        ConvertFrom-Csv |
        Export-Csv $destinationfilePath -NoTypeInformation -Append }
Use ConvertFrom-Csv instead of Import-Csv; that way you can still control how many lines to skip:
Get-Content $file | Select -Skip 1 | ConvertFrom-Csv
So you'll end up with something like:
$sourcefilefolderPath = "C:\CSV_folder"
$destinationfilePath = "C:\appended_files.csv"
Get-ChildItem -Path $sourcefilefolderPath -Filter *.csv -Recurse | foreach {
    Get-Content $_.FullName | Select-Object -Skip 1 | ConvertFrom-Csv | Export-Csv -Path $destinationfilePath -NoTypeInformation -Append
}
I'm a bit new to PowerShell. I have a working script that writes -Line, -Character and -Word counts to a CSV file. I can't figure out how to add the full name of the file to the CSV.
get-childitem -recurse -Path C:\Temp\*.* | foreach-object { $name = $_.FullName; get-content $name | Measure-Object -Line -Character -Word} | Export-Csv -Path C:\Temp\FileAttributes.csv
I've tried using Write-Host and Select-Object, but I'm not sure about the syntax.
I've been using the following as a reference.
Use Select-Object with a calculated property:
Get-ChildItem -Recurse -Path C:\Temp\*.* | ForEach-Object {
    $fullName = $_.FullName
    Get-Content $fullName | Measure-Object -Line -Character -Word |
        Select-Object @{ Name = 'FullName'; Expression = { $fullName } }, *
} | Export-Csv -Path C:\Temp\FileAttributes.csv
Note:
Pass -ExcludeProperty Property to Select-Object to omit the empty Property column.
Pass -NoTypeInformation to Export-Csv to suppress the virtually useless first line (the type annotation) in the CSV.
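Putting both notes together, the complete pipeline might look something like this:
Get-ChildItem -Recurse -Path C:\Temp\*.* | ForEach-Object {
    $fullName = $_.FullName
    Get-Content $fullName | Measure-Object -Line -Character -Word |
        Select-Object @{ Name = 'FullName'; Expression = { $fullName } }, * -ExcludeProperty Property
} | Export-Csv -Path C:\Temp\FileAttributes.csv -NoTypeInformation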
I need to take a slew of CSV files from a directory and get them into an array in PowerShell (to eventually manipulate and write back to a CSV).
The problem is there are 5 file types. I need around 8 columns from each. The columns are essentially the same, but have different headings.
Is there an easy way to do this? I started by creating a custom object with my 8 fields, looping through the files, importing each one, looking at the filename (which tells me which column names I need), and then using a bunch of ifs to add the data to my custom object array.
I was wondering if there is a simpler way... like with a template saying which columns to take from each file.
I wound up doing this. It may not have been the most efficient, but it works. I wound up writing out each file separately and combining them at the end, as PowerShell really got bogged down (over a million rows combined).
$Newcsv = @()
$path = "c:\scrap\BWFILES\"
$files = gci -path $path -recurse -filter *.csv | Where-Object { ! ($_.psiscontainer) }
$counter = 1
foreach($file in $files)
{
    $csv = Import-Csv $file.FullName
    if ($file.Name -like '*SAV*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"SV"}},DMBRCH,DMACCT,DMSHRT
    }
    if ($file.Name -like '*TIME*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"TM"}},TMBRCH,TMACCT,TMSHRT
    }
    if ($file.Name -like '*TRAN*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"TR"}},DMBRCH,DMACCT,DMSHRT
    }
    if ($file.Name -like '*LN*')
    {
        $Newcsv = $csv | Select-Object @{Name="PRODUCT";Expression={"LN"}},LNBRCH,LNNOTE,LNSHRT
    }
    $Newcsv | Export-Csv "C:\scrap\$($file.Name)$counter.csv" -Force -NoTypeInformation
    $counter++
}
$getFirstLine = $true
Get-ChildItem "c:\scrap\*.csv" | foreach {
    $filePath = $_
    $lines = Get-Content $filePath
    $linesToWrite = switch($getFirstLine) {
        $true {$lines}
        $false {$lines | Select -Skip 1}
    }
    $getFirstLine = $false
    Add-Content "c:\scrap\combined.csv" $linesToWrite
}
With a hashtable for reference, a little regex matching, and the automatic variable $Matches in a ForEach-Object loop (alias % used), that could all be shortened to:
$path = "c:\scrap\BWFILES\"
$Reference = @{
    'SAV'  = 'SV'
    'TIME' = 'TM'
    'TRAN' = 'TR'
    'LN'   = 'LN'
}
Set-Content -Value "PRODUCT,BRCH,ACCT,SHRT" -Path 'c:\scrap\combined.csv'
gci -path $path -recurse -filter *.csv | Where-Object { !($_.psiscontainer) -and $_.Name -match ".*(SAV|TIME|TRAN|LN).*" } | % {
    $Product = $Reference[($Matches[1])]
    Import-CSV $_.FullName |
        Select-Object @{Name="PRODUCT";Expression={$Product}}, *BRCH, @{l='Acct';e={$_.LNNOTE, $_.DMACCT, $_.TMACCT | ?{$_}}}, *SHRT |
        ConvertTo-Csv -NoTypeInformation |
        Select -Skip 1 |
        Add-Content 'c:\scrap\combined.csv'
}
}
That should produce the exact same file. The only kind of tricky part was the LNNOTE/TMACCT/DMACCT field, since obviously you can't just do the same as with *SHRT.
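To make that last point concrete, here is that 'Acct' calculated property on its own, with a made-up sample row for illustration:
# each source file has only one of these three columns, so the other two come
# back $null/empty; Where-Object { $_ } keeps whichever one actually has a value
$acctColumn = @{ Label = 'Acct'; Expression = { $_.LNNOTE, $_.DMACCT, $_.TMACCT | Where-Object { $_ } } }

# sample row as it would come out of a TIME file (only TMACCT is populated)
[pscustomobject]@{ TMBRCH = '001'; TMACCT = '12345'; TMSHRT = 'X' } | Select-Object $acctColumn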