Powershell merge CSV based on cell value in a row

Powershell merge CSV based on cell value in a row - powershell

I need to merge a bunch of CSV files into one big CSV and selecting only for the rows that contain value "XX" in column 2
I created the script to merge the CSV (that also removes the first 8 lines of the header) but I don't know how to filter the rows to be copied into the merged file with the condition above.
$InFiles = Get-ChildItem -LiteralPath '.\' -File | Where-Object { $_.Extension -eq '.csv' }
ForEach ($File In $InFiles)
{
$File | Get-Content | Select-Object -Skip 8 | Out-File -LiteralPath '.\out\OutFile.csv' -Append -Encoding ascii
}

Related

PowerShell remove last column of pipe delimited text file

I have a folder of pipe delimited text files that I need to remove the last column on. I'm not seasoned in PS but I found enough through searches to help. I have two pieces of code. The first creates new text files in my destination path, keeps the pipe delimiter, but doesn't remove the last column. There are 11 columns. Here is that script:
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
ForEach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt"))
{
(Get-Content $File) | Foreach-Object { $_.split()[0..9] -join '|' } | Out-File $OutputFolder\$($File.Name)
}
Then this second code I tried creates the new text files on my destination path, it DOES get rid of the last column, but it loses the pipe delimiter. Ugh.
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
ForEach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt"))
{
Import-Csv $File -Header col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11 -Delimiter '|' |
Foreach-Object {"{0} {1} {2} {3} {4} {5} {6} {7} {8} {9}" -f $_.col1,$_.col2,$_.col3,$_.col4,$_.col5,$_.col6,$_.col7,$_.col8,$_.col9,$_.col10} | Out-File $destination\$($File.Name)
}
I have no clue on what I'm doing wrong. I have no preference in which way I get this done but I need to keep the delimiter and the have the last column removed. Any help would be greatly appreciated.

In your plain-text processing attempt with Get-Content, you simply need to split each line by | first (.Split('|')), before extracting the fields of interest with a range operation (..) and joining them back with |:
Get-Content $File |
Foreach-Object { $_.Split('|')[0..9] -join '|' } |
Out-File $OutputFolder\$($File.Name)
In your Import-Csv-based attempt, you can take advantage of the fact that it will only read as many columns as you supply column names for, via -Header:
# Pass only 10 column names to -Header
Import-Csv $File -Header (0..9).ForEach({ 'col' + $_ }) -Delimiter '|' |
ConvertTo-Csv -Delimiter '|' | # convert back to CSV with delimiter '|'
Select-Object -Skip 1 | # skip the header row
Out-File $destination\$($File.Name)
Note that ConvertTo-Csv, just like Export-Csv by default double-quotes each field in the resulting CSV data / file.
In Windows PowerShell, you cannot avoid this, but in PowerShell (Core) 7+ you can control this behavior with -UseQuotes Never, for instance.

You can give this a try, should be more efficient than using Import-Csv, however note, this should always exclude the last column of your files no matter how many columns they have and assuming they're pipe delimited:
$OutputFolder = "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Load_To_IMS"
foreach ($File in (Get-ChildItem "D:\DC_Costing\Vendor Domain\CostUpdate_Development_Stage_To_IMS\*.txt")) {
[IO.File]::ReadAllLines($File.FullName) | & {
process{
-join ($_ -split '(?=\|)' | Select-Object -SkipLast 1)
}
} | Set-Content (Join-Path $OutputFolder -ChildPath $File.Name)
}

Delete empty rows in multiple csv files by using Powershell [duplicate]

This question already has answers here:
delete empty rows at end of the csv file using Powershell
(2 answers)
Closed 1 year ago.
To delete empty rows in multiple CSV files by using Powershell. this is the code I am trying with for each loop. But it is writing empty files. The resulting file should open in excel and notepad++ with proper format.
Get-ChildItem $paramDest -Filter *Test*.csv | ForEach-Object {
$content = Get-Content $_.FullName | Where { $_.Replace("","","").trim() -ne "" }
Out-File -InputObject $content -FilePath $_.FullName
}
I have some other set of files, the code should work for both these kinds of files. I am okay to have 2 separate codes for these 2 separate files.
Here another file sample format

Try with this
Input
Command
Import-Csv -Path "C:\sample.csv" | Where-Object { $_.PSObject.Properties.Value -ne '' } | Export-Csv -Path "C:\Sample_clean.csv" -NoTypeInformation
Output*
Edit: To update Multiple files, thanks to #Santiago Squarzon
Get-ChildItem $paramDest -Filter *Test*.csv |
Foreach-Object {
# get the content of file
$content = Get-Content $_.FullName
# replace all "," with empty string and exclude those lines and save the output to original file
$content | where { $_.Replace('","','').trim('"') -ne '' } | Set-Content $_.FullName
}

Export-Csv adding unwanted header double quotes

I have got a source CSV file (without a header, all columns delimited by a comma) which I am trying split out into separate CSV files based upon the value in the first column and using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.

First of all the .csv file needs headers and the quote marks as a csv file structure. But if you don't want them then you can go on with a text file or...
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
But this output is not a "real" csv file.
Hope that helps.

For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }

Another solution,
using no named headers but simply numbers (as they aren't wanted in output anyway)
avoiding unneccessary temporary files.
removing only field delimiting double quotes.
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}

Merges csv files from directory into a single csv file PowerShell

How can I run one single PowerShell script that does the following in series?
Adds a the filename of all csv files in a directory as a column in the end of each file using this script:
Get-ChildItem *.csv | ForEach-Object {
$CSV = Import-CSV -Path $_.FullName -Delimiter ","
$FileName = $_.Name
$CSV | Select-Object *,#{N='Filename';E={$FileName}} | Export-CSV $_.FullName -NTI -Delimiter ","}
Merges all csv files in the directory into a single csv file
Keeping only a header (first row) only from first csv and excluding all other first rows from files.
Similiar to what kemiller2002 has done here, except one script with csv inputs and a csv output.

Bill's answer allows you to combine CSVs, but doesn't tack file names onto the end of each row. I think the best way to do that would be to use the PipelineVariable common parameter to add that within the ForEach loop.
Get-ChildItem \inputCSVFiles\*.csv -PipelineVariable File |
ForEach-Object { Import-Csv $_ | Select *,#{l='FileName';e={$File.Name}}} |
Export-Csv \outputCSVFiles\newOutputFile.csv -NoTypeInformation
That should accomplish what you're looking for.

This is the general pattern:
Get-ChildItem \inputCSVFiles\*.csv |
ForEach-Object { Import-Csv $_ } |
Export-Csv \outputCSVFiles\newOutputFile.csv -NoTypeInformation
Make sure the output CSV file has a different filename pattern, or use a different directory name (like in this example).

If your csv files dont have always same header you can do it :
$Dir="c:\temp\"
#get header first csv file founded
$header=Get-ChildItem $Dir -file -Filter "*.csv" | select -First 1 | Get-Content -head 1
#header + all rows without header into new file
$header, (Get-ChildItem $Dir -file -Filter "*.csv" | %{Get-Content $_.fullname | select -skip 1}) | Out-File "c:\temp\result.csv"

How to parse csv file, look for trigger and split into new files with powershell

I have a CSV file which is structured like this:
"SA1";"21020180123155514000000000000000002"
"SA2";"21020180123155514000000000000000002";"210"
"SA4";"21020180123155514000000000000000002";"210";"200000001"
"SA5";"21020180123155514000000000000000002";"210";"200000001";"140000001";"ZZ"
"SA1";"21020180123155522000000000000000002"
"SA2";"21020180123155522000000000000000002";"210"
"SA4";"21020180123155522000000000000000002";"210";"200000001"
"SA5";"21020180123155522000000000000000002";"210";"200000001";"140000671";"ZZ"
"SA1";"21020180123155567000000000000000002"
"SA2";"21020180123155567000000000000000002";"210"
"SA4";"21020180123155567000000000000000002";"210";"200000001"
"SA5";"21020180123155567000000000000000002";"210";"200000001";"140000001";"ZZ"
So the Value in the second field (separator ';') marks the data which belongs together and value 140000001 or 140000671 is the trigger.
So the result should be:
1st file: 140000001.txt
"SA1";"21020180123155514000000000000000002"
"SA2";"21020180123155514000000000000000002";"210"
"SA4";"21020180123155514000000000000000002";"210";"200000001"
"SA5";"21020180123155514000000000000000002";"210";"200000001";"140000001";"ZZ"
"SA1";"21020180123155567000000000000000002"
"SA2";"21020180123155567000000000000000002";"210"
"SA4";"21020180123155567000000000000000002";"210";"200000001"
"SA5";"21020180123155567000000000000000002";"210";"200000001";"140000001";"ZZ"
2nd file: 140000671.txt
"SA1";"21020180123155522000000000000000002"
"SA2";"21020180123155522000000000000000002";"210"
"SA4";"21020180123155522000000000000000002";"210";"200000001"
"SA5";"21020180123155522000000000000000002";"210";"200000001";"140000671";"ZZ"
For now I found a snippet which splits the big file by the second field:
$src = "C:\temp\ORD001.txt"
$dstDir = "C:\temp\files\"
Remove-Item -Path "$dstDir\\*"
$header = Get-Content -Path $src | select -First 1
Get-Content -Path $src | select -Skip 1 | foreach {
$file = "$(($_ -split ";")[1]).txt"
Write-Verbose "Wrting to $file"
$file = $file.Replace('"',"")
if (-not (Test-Path -Path $dstDir\$file))
{
Out-File -FilePath $dstDir\$file -InputObject $header -Encoding ascii
}
$file -replace '"', ""
Out-File -FilePath $dstDir\$file -InputObject $_ -Encoding ascii -Append
}
For the rest I'm standing in the dark.
Please help.

The Import-CSV cmdlet will work here, if you don't already know about it. I would use that, as it returns all the rows as different objects in an array, with the properties being the column values. And you don't have to manually remove the quotes and such. Assuming the second column is a date time value, and should be unique for each group of 4 consecutive rows, then this will work:
$src = "C:\temp\ORD001.txt"
$dstDir = "C:\temp\files\"
Remove-Item -Path "$dstDir\*"
$csv = Import-CSV $src -Delimiter ';'
$DateTimeGroups = $csv | Group-Object -Property 'ColumnTwoHeader'
foreach ($group in $DateTimeGroups) {
$filename = $group.Group.'ColumnFiveHeader' | select -Unique
$group.Group | Export-CSV "$dstDir\$filename.txt" -Append -NoTypeInformation
}
However, this will break if two of those "groups of 4 consecutive rows" have the same value for the second column and the fifth column. There isn't a way to fix this unless you are certain that there will always be 4 consecutive rows in each time group. In which case:
$src = "C:\temp\ORD001.txt"
$dstDir = "C:\temp\files\"
Remove-Item -Path "$dstDir\*"
$csv = Import-CSV $src -Delimiter ';'
if ($csv.count % 4 -ne 0) {
Write-Error "CSV does not have a proper number of rows. Attempting to continue will be bad :)"
return
}
for ($i = 0 ; $i -lt $csv.Count ; $i=$i+4) {
$group = $csv[$i..($i+4)]
$group | Export-Csv "$dstDir\$($group[3].'ColumnFiveHeader').txt" -Append -NoTypeInformation
}
Just be sure to replace Column2Header and Column5Header with the appropriate values.

If performance is not a concern, combining Import-Csv / Export-Csv with Group-Object allows the most concise, direct expression of your intent, using PowerShell's ability to convert CSV to objects and back:
$src = "C:\temp\ORD001.txt" # Input CSV file
$dstDir = "C:\temp\files" # Output directory
# Delete previous output files, if necessary.
Remove-Item -Path "$dstDir\*" -WhatIf
# Import the source CSV into custom objects with properties named for the columns.
# Note: The assumption is that your CSV header line defines columns "Col1", "Col2", ...
Import-Csv $src -Delimiter ';' |
# Group the resulting objects by column 2
Group-Object -Property Col2 |
ForEach-Object { # Process each resulting group.
# Determine the output filename via the group's last row's column 5 value.
$outFile = '{0}\{1}.txt' -f $dstDir, $_.Group[-1].Col5
# Append the group at hand to the target file.
$_.Group | Export-Csv -Append -Encoding Ascii $outFile -Delimiter ';' -NoTypeInformation
}
Note:
The assumption - in line with your sample data - is that it is always the last row in a group of lines sharing the same column-2 value whose column 5 contains the root of the output filename (e.g., 140000001)

Sorry but I don't have a Header Column. It's a semikolon seperated txt file for an interface

You can simply read the file with Get-Content, and then search for the trigger in the line.
I hope this small example can help:
$file = Get-Content CSV_File.txt
$140000001 = #()
$140000671 = #()
$bTrig = #()
foreach($line in $file){
$bTrig += $line
if($line -match ';"140000001";'){
$140000001 += $bTrig
$bTrig = #()
}
elseif($line -match ';"140000671";'){
$140000671 += $bTrig
$bTrig = #()
}
}
if($bTrig.Count -ne 0){Write-Warning "No trigger for $bTrig"}
$140000001 | Out-File 140000001.txt -Encoding ascii
$140000671 | Out-File 140000671.txt -Encoding ascii

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Powershell merge CSV based on cell value in a row - powershell

Related

PowerShell remove last column of pipe delimited text file

Delete empty rows in multiple csv files by using Powershell [duplicate]

Export-Csv adding unwanted header double quotes

Merges csv files from directory into a single csv file PowerShell

How to parse csv file, look for trigger and split into new files with powershell

Categories

Resources