Merge CSV files from a directory into a single CSV file with PowerShell

How can I run one single PowerShell script that does the following in series?
Adds the filename of each CSV file in a directory as a column at the end of that file, using this script:
Get-ChildItem *.csv | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName -Delimiter ","
$FileName = $_.Name
$CSV | Select-Object *,@{N='Filename';E={$FileName}} | Export-CSV $_.FullName -NTI -Delimiter ","}
Merges all CSV files in the directory into a single CSV file.
Keeps only the header (first row) of the first CSV and excludes the header rows of all other files.
Similar to what kemiller2002 has done here, except in one script with CSV inputs and a CSV output.

Bill's answer allows you to combine CSVs, but doesn't tack file names onto the end of each row. I think the best way to do that would be to use the PipelineVariable common parameter to add that within the ForEach loop.
Get-ChildItem \inputCSVFiles\*.csv -PipelineVariable File |
ForEach-Object { Import-Csv $_ | Select *,@{l='FileName';e={$File.Name}}} |
Export-Csv \outputCSVFiles\newOutputFile.csv -NoTypeInformation
That should accomplish what you're looking for.

This is the general pattern:
Get-ChildItem \inputCSVFiles\*.csv |
ForEach-Object { Import-Csv $_ } |
Export-Csv \outputCSVFiles\newOutputFile.csv -NoTypeInformation
Make sure the output CSV file has a different filename pattern, or use a different directory name (like in this example).
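The warning about the output file can also be handled in the script itself. The following is only a sketch: the function name `Merge-CsvWithFileName` and its parameters are made up for illustration and are not part of either answer. It lists the input files before writing anything and skips a previous output file, so the result can live in the same directory as the inputs.

```powershell
# Hypothetical helper: merges all CSVs in a folder and adds a FileName column.
function Merge-CsvWithFileName {
    param(
        [Parameter(Mandatory)] [string] $InputDir,
        [Parameter(Mandatory)] [string] $OutputFile
    )
    # Resolve to a full path so it can be compared against FileInfo.FullName.
    $outFull = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($OutputFile)

    # Enumerate the inputs up front and drop a previous output file, if any,
    # so a second run does not merge the old result into the new one.
    $files = @(Get-ChildItem -Path $InputDir -Filter '*.csv' -File |
        Where-Object { $_.FullName -ne $outFull })

    $files | ForEach-Object {
        $name = $_.Name
        Import-Csv -LiteralPath $_.FullName |
            Select-Object *, @{ Name = 'FileName'; Expression = { $name } }
    } | Export-Csv -LiteralPath $outFull -NoTypeInformation
}
```

Called as `Merge-CsvWithFileName -InputDir .\inputCSVFiles -OutputFile .\inputCSVFiles\merged.csv`, it should be safe to run repeatedly. Like the answers above, it assumes every input file has the same columns.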

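If your CSV files don't always have the same header, you can take the header from the first file and append only the data rows of every file:
$Dir="c:\temp\"
# get the header line of the first CSV file found
$header=Get-ChildItem $Dir -file -Filter "*.csv" | select -First 1 | Get-Content -head 1
# write the header plus all rows except each file's header into a new file
# (result.csv lands in the same folder, so delete it before running this again)
$header, (Get-ChildItem $Dir -file -Filter "*.csv" | %{Get-Content $_.fullname | select -skip 1}) | Out-File "c:\temp\result.csv"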
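Because this approach treats the files as plain text, a file whose columns differ from the first one is appended without any complaint. A minimal sketch of the same idea that at least warns about it is shown here; `Merge-CsvText` and its parameter names are hypothetical, and the explicit UTF8 encoding is an assumption about what the consumer of the file expects.

```powershell
# Hypothetical helper: text-level merge that keeps the first file's header.
function Merge-CsvText {
    param(
        [Parameter(Mandatory)] [string[]] $Path,        # input files, in merge order
        [Parameter(Mandatory)] [string]   $Destination  # must not be one of the inputs
    )
    # Header line of the first file; it becomes the header of the result.
    $header = Get-Content -LiteralPath $Path[0] -TotalCount 1

    $lines = @(foreach ($p in $Path) {
        $first = Get-Content -LiteralPath $p -TotalCount 1
        if ($first -ne $header) {
            Write-Warning "Header of '$p' differs from the first file: $first"
        }
        # Emit every line except the file's own header.
        Get-Content -LiteralPath $p | Select-Object -Skip 1
    })

    @($header) + $lines | Set-Content -LiteralPath $Destination -Encoding UTF8
}
```

For example, `Merge-CsvText -Path (Get-ChildItem c:\temp -File -Filter *.csv).FullName -Destination c:\out\result.csv` writes the result outside the input folder, which avoids picking it up on the next run.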

Related

PowerShell CSV, take a specific row from each line and combine it into one CSV

I have 300 CSV files all separated in a directory.
I want to pull the rows that match one specific criterion from each CSV and put them into another file using PowerShell.
This is the line I have, but it doesn't seem to work.
Get-ChildItem -Filter "*Results.csv" | Get-Content | Where-Object {$_.NAME -eq "Cage,Johnny"} | Add-Content "test.csv"
I filtered for the specific CSVs I wanted in my directory with gci, read the content of each with Get-Content, kept the rows where the value in the NAME column is Johnny Cage, and used Add-Content to write them to test.csv, but it doesn't work.
Any help would be great!
You need to deserialize your CSV text into objects with properties that can be referenced. Then you can compare the Name property. You can do the following if all your csv files have the same headers.
Get-ChildItem -Filter "*Results.csv" | Foreach-Object {
Import-Csv $_.FullName |
Where-Object {$_.NAME -eq "Cage,Johnny"} } |
Export-Csv "test.csv"
If your CSV files contain different headers, then you have a couple of options. First, you could create your output CSV with all possible headers that exist across all files (or just the headers you want, as long as they are the same across all files). Second, you could just output your data rows and have a broken CSV.
# Broken CSV Approach
Get-ChildItem -Filter "*Results.csv" | Foreach-Object {
Import-Csv $_.FullName |
Where-Object {$_.NAME -eq "Cage,Johnny"}} | Foreach-Object {
$_ | ConvertTo-Csv -Notype | Select-Object -Skip 1
} | Add-Content test.csv
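The first option, an output CSV that carries every header found across the files, is described above without code. One way it could look is sketched below; the function name `Merge-CsvWithAllHeaders` and its parameters are invented for this example. It relies on Select-Object adding any property a row lacks as an empty value, and it loads all rows into memory because the full column list is only known after every file has been read.

```powershell
# Hypothetical helper: merges CSVs with differing headers into one consistent CSV.
function Merge-CsvWithAllHeaders {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string]   $Destination
    )
    # Read every row from every file.
    $rows = @(foreach ($p in $Path) { Import-Csv -LiteralPath $p })

    # Collect the column names in first-seen order (-notcontains is case-insensitive).
    $columns = New-Object System.Collections.Generic.List[string]
    foreach ($row in $rows) {
        foreach ($prop in $row.PSObject.Properties) {
            if ($columns -notcontains $prop.Name) { $columns.Add($prop.Name) }
        }
    }

    # Project every row onto the full column list; missing columns come out empty.
    $rows | Select-Object -Property $columns.ToArray() |
        Export-Csv -LiteralPath $Destination -NoTypeInformation
}
```

The merged file can then be filtered as before, for instance with `Import-Csv merged.csv | Where-Object { $_.NAME -eq "Cage,Johnny" }`. The sketch assumes the inputs contain at least one data row.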
I think I got it.
Get-ChildItem -Filter *Results.csv |
ForEach-Object{
Import-Csv $_.NAME | ? { $_.EMPLID -eq "Cage,Johnny"}
} | Export-Csv "test.csv"

How to prepend the filename to every record in a CSV?

How do we prepend the filename to ALL the csv files in a specific directory?
I've got a bunch of csv files that each look like this:
ExampleFile.Csv
2323, alex, gordon
4382, liza, smith
The output I'd like is:
ExampleFile.Csv, 2323, alex, gordon
ExampleFile.Csv, 4382, liza, smith
I've attempted the following solution:
Get-ChildItem *.csv | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName -Delimiter ","
$FileName = $_.Name
$CSV | Select-Object *,@{E={$FileName}} | Export-CSV $_.FullName -NTI -Delimiter ","
}
However, this did not work because it was altering the first row. (My data does not have a header row). Also, this script will append to each record at the end rather than prepend at the beginning.
I think you're missing the column header name. Take a look at the duplicate (or original, rather) and see Shay's answer. Your Select-Object should look like:
$CSV | Select-Object @{Name='FileName';Expression={"$filename"}},* | Export-Csv -Path $FileName -NoTypeInformation -Delimiter ','
That worked fine for me with multiple CSVs in a directory when using the rest of your sample code verbatim.
If your files do not have headers and the column count is unknown or unpredictable, you can read each line with Get-Content, make the changes, and then use Set-Content to make the update.
Get-ChildItem *.csv | ForEach-Object {
$Filename = $_.Name
$Fullname = $_.FullName
$contents = Get-Content -Path $Fullname | Foreach-Object {
"{0}, {1}" -f $Filename,$_
}
$contents | Set-Content -Path $Fullname
}

Export-Csv adding unwanted header double quotes

I have a source CSV file (without a header, all columns delimited by a comma) which I am trying to split into separate CSV files based on the value in the first column, using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.
First of all, Export-Csv always writes a header row, and in Windows PowerShell it always wraps every value in double quotes; both are part of how the cmdlet structures a CSV file. If you don't want them, you can treat the output as plain text instead:
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
But this output is not a "real" csv file.
Hope that helps.
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
Another solution, which:
uses plain numbers instead of named headers (they aren't wanted in the output anyway),
avoids unnecessary temporary variables,
removes only the field-delimiting double quotes.
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}
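On PowerShell 7.0 and later, ConvertTo-Csv and Export-Csv accept a -UseQuotes parameter, so the quotes do not have to be stripped afterwards. The sketch below assumes that version (the parameter does not exist in Windows PowerShell 5.1); the function name `Split-CsvByFirstColumn` and its parameters are made up for this example, and the 12 numeric headers mirror the answer above.

```powershell
# Requires PowerShell 7.0 or later; -UseQuotes is not available in Windows PowerShell 5.1.
# Hypothetical helper: splits a headerless 12-column CSV by the value in column 1.
function Split-CsvByFirstColumn {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $OutputDir
    )
    Import-Csv -LiteralPath $Path -Header (1..12) |
        Group-Object -Property '1' |
        ForEach-Object {
            $target = Join-Path $OutputDir ($_.Name + '.csv')
            $_.Group |
                ConvertTo-Csv -UseQuotes Never |   # no quotes around any field
                Select-Object -Skip 1 |            # drop the generated header row
                Set-Content -LiteralPath $target
        }
}
```

`-UseQuotes Never` leaves a field unquoted even if it contains a comma, which would corrupt the row; `-UseQuotes AsNeeded` is the safer choice when the data is not as regular as in the sample.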

merge several .csv files into one array

I'm trying to load several .csv files (with the same columns) into the same array from different directories.
$csv1 = Import-Csv "PATH1"
$csv1 = Import-Csv "PATH2"
$csv1 | Export-Csv C:\test.csv
This just outputs the last .csv loaded, what would be the best way to do this?
When in doubt, read the documentation. The Import-Csv cmdlet accepts an array of path strings as input, so all you need to do (assuming that all your CSVs have the same fields) is something like this:
$src = 'C:\path\to\input1.csv', 'C:\path\to\input2.csv', ...
$dst = 'C:\path\to\output.csv'
Import-Csv $src | Export-Csv $dst -NoType
If you want an additional column with the path of the source file you need some additional steps, though:
$src | ForEach-Object {
$path = $_
Import-Csv $path | Select-Object *,@{n='Path';e={$path}}
} | Export-Csv $dst -NoType

Bulk Merge of CSV Files in PowerShell using parent folder name

After beginning this task at the command line I realised I need to get down and dirty with PowerShell. I have about 100 folders, and each folder has a few thousand CSV files that I would like to merge together inside each folder. Ideally the merged CSV file in each folder would use the parent folder's name. For example, here is a top-level folder containing the 100 folders:
E:\CSVFolders
The subfolders are named in a semi-random fashion like this:
E:\CSVFolders\Folder1
E:\CSVFolders\Folder18
So far I am at this point:
# Merge csv files and use the parent folder name
Import-Csv (Get-ChildItem File*.csv) |
Export-Csv $folderName.csv -NoTypeInformation -Encoding UTF8
I am struggling to make the script enumerate the subfolders and then use their name as the basis for the merged CSV file so if anyone is able to shed light on this I would appreciate it!
Use two loops:
Get-ChildItem 'E:\CSVFolders' | Where-Object {
$_.PSIsContainer
} | ForEach-Object {
$csv = Join-Path $_.FullName ($_.Name + '.csv')
Get-ChildItem $_.FullName -Filter File*.csv | ForEach-Object {
Import-Csv $_.FullName
} | Export-Csv $csv -NoType -Encoding UTF8
}
You can also group the files by directory, like this:
Get-ChildItem "c:\temp" -file -Filter "*.csv" -Recurse |
group DirectoryName |
%{$dir=$_.Name; $_.Group.FullName | %{import-csv -path $_} | export-csv "$dir\global.csv" -NoTypeInformation}
Short version using aliases (not for purists):
gci "c:\temp" -file -Filter "*.csv" -Rec |
group DirectoryName |
%{$dir=$_.Name; $_.Group.FullName | %{ipcsv -path $_} | epcsv "$dir\global.csv" -NoType}