Merging two CSVs and then re-ordering columns on output - powershell

I have this first CSV:
Server,Info
server1,item1
server1,item1
and this 2nd CSV:
Server,Info
server2,item2
server2,item2
And I am trying to get this output:
Server,Server,Info,Info
server1,server2,item1,item2
server1,server2,item1,item2
As you can see, the problem is that the headers of the two CSVs have the same names, which causes a problem if I parse them into objects and loop over the keys.
So I am trying to merge them and then reorder them as strings, but I can't figure out how to do it in the last for loop:
$file1 = Get-Content ".\Powershell test\A.csv"
$file2 = Get-Content ".\Powershell test\B.csv"
$content = for ($i = 0; $i -lt $file1.Length; $i++) {
'{0},{1}' -f $file1[$i].Trim(), $file2[$i].Trim()
}
$content | Out-File ".\Powershell test\merged.csv"
$firstFileParsed = Import-Csv -Path ".\Powershell test\B.csv"
$secondFileParsed = Import-Csv -Path ".\Powershell test\B.csv"
$secondFilePath = ".\Powershell test\B.csv"
$contentOf2ndFile = Get-Content $secondFilePath
$csvColumnNames = (Get-Content '.\Powershell test\B.csv' |
Select-Object -First 1).Split(",")
$newColumns = @()
foreach($header in $csvColumnNames) {
$newColumns += $header
}
$newColumns = $newColumns -join ","
$contentOf2ndFile[0] = $newColumns
$contentOf2ndFile | Out-File ".\Powershell test\temp.csv"
$tempObject = Import-Csv -Path ".\Powershell test\temp.csv"
$tempFile = Get-Content ".\Powershell test\temp.csv"
$array = @()
$tempArr = @()
for ($i = 0; $i -lt $file1.Length; $i++) {
$tempArr1 = $file1[$i] -split ","
$tempArr2 = $tempFile[$i] -split ","
for ($j = 0; $j -lt $tempArr1.Length; $j++) {
$tempArr += $tempArr1[$j] + "," + $tempArr2[$j]
$tempArr
}
$array += $tempArr
}
$array | Out-File '.\Powershell test\merged.csv'

What you suggest is not very useful or even valid CSV. IMHO only two results would make sense:
This:
Server1,Info1,Server2,Info2
server1,item1,server2,item2
server1,item1,server2,item2
Or this:
Server,Info
server1,item1
server1,item1
server2,item2
server2,item2
First approach:
$csv1 = Import-Csv ".\Powershell test\A.csv"
$csv2 = Import-Csv ".\Powershell test\B.csv"
$merged = for($i = 0; $i -lt $csv1.Count; $i++) {
$new = new-object psobject
$entry1 = $csv1[$i]
$entry1 | Get-Member -Type NoteProperty | foreach {
Add-Member -InputObject $new -MemberType NoteProperty -Name ($_.Name + "1") -Value $entry1.($_.Name)
}
$entry2 = $csv2[$i]
$entry2 | Get-Member -Type NoteProperty | foreach {
Add-Member -InputObject $new -MemberType NoteProperty -Name ($_.Name + "2") -Value $entry2.($_.Name)
}
$new
}
$merged | Export-Csv ".\Powershell test\merged.csv" -NoTypeInformation
Second approach:
$csv1 = Import-Csv ".\Powershell test\A.csv"
$csv2 = Import-Csv ".\Powershell test\B.csv"
$merged = $csv1 + $csv2
$merged | Export-Csv ".\Powershell test\merged.csv" -NoTypeInformation
UPDATE
If you want exactly your output (and the files are certain to have the same headers and line count), you could use unique headers first, and then simply rename them later:
$csv1 = Import-Csv ".\Powershell test\A.csv"
$csv2 = Import-Csv ".\Powershell test\B.csv"
$merged = for($i = 0; $i -lt $csv1.Count; $i++) {
$new = New-Object PSObject
("Server", "Info") | foreach {
Add-Member -InputObject $new -MemberType NoteProperty -Name ($_ + "1") -Value $csv1[$i].$_
Add-Member -InputObject $new -MemberType NoteProperty -Name ($_ + "2") -Value $csv2[$i].$_
}
$new
}
$header = $true
$merged | ConvertTo-Csv -NoTypeInformation | foreach {
if ($header) {
$header = $false
# remove the numbers from the headers
$_ -replace "\d", ""
}
else { $_ }
} | Out-File ".\Powershell test\merged.csv"
Explanations:
Count is available in PowerShell for all collections and is safer than Length, which is a property of arrays only. In this case, both should work.
In the loop, a new empty object is created (with New-Object) and then populated by adding the members of the parsed CSV objects (with Add-Member). A counter is added to the property names to make them unique.
The collection of these objects ($merged) is then converted to CSV, the numbers in the header line removed, and everything saved to file.
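For comparison, the same interleaving can be done on the raw lines, which is closer to the question's original attempt. This is a minimal sketch and not part of the answer above: the function name Merge-CsvLinesInterleaved is hypothetical, and it assumes both inputs have the same number of lines and columns and that no field contains a comma inside quotes.

```powershell
# Hypothetical helper: interleave the columns of two sets of CSV lines.
# Assumes equal line and column counts and no commas inside quoted fields.
function Merge-CsvLinesInterleaved {
    param(
        [string[]]$Left,
        [string[]]$Right
    )
    for ($i = 0; $i -lt $Left.Count; $i++) {
        $a = $Left[$i].Trim() -split ','
        $b = $Right[$i].Trim() -split ','
        # Pair column j of the left line with column j of the right line
        $pairs = for ($j = 0; $j -lt $a.Count; $j++) {
            $a[$j], $b[$j]
        }
        $pairs -join ','
    }
}

$left  = 'Server,Info', 'server1,item1', 'server1,item1'
$right = 'Server,Info', 'server2,item2', 'server2,item2'
$mergedLines = Merge-CsvLinesInterleaved -Left $left -Right $right
```

With real files, the two arguments could come from Get-Content on A.csv and B.csv, and the result could be piped to Set-Content.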

As it appears that there are several use cases for discerning unrelated property keys instead of merging them, I have added a new feature. The -Unify parameter (formerly -Merge, which remains as an alias) of the Join-Object cmdlet now accepts one or two dynamic keys to distinguish unrelated column pairs in a join.
The -Unify (alias -Merge) parameter defines how to unify the left and
right object with respect to the unrelated common properties. The
common properties can be discerned (<String>[,<String>]) or merged
(<ScriptBlock>). By default the unrelated common properties will be
merged using the expression: {$LeftOrVoid.$_, $RightOrVoid.$_}
<String>[,<String>]
If the value is not a ScriptBlock, it is presumed to be
a string array with one or two items defining the left and right key
format. If the item includes an asterisk (*), the asterisk will be
replaced with the property name; otherwise the item will be used to
prefix the property name.
Note: A consecutive number will be automatically added to a common
property name if it is already used.
...
Example:
$Csv1 = ConvertFrom-Csv 'Server,Info
server1,item1
server1,item1'
$Csv2 = ConvertFrom-Csv 'Server,Info
server2,item2
server2,item2'
$Csv1 | Join $Csv2 -Unify *1, *2
Result:
Server1 Server2 Info1 Info2
------- ------- ----- -----
server1 server2 item1 item2
server1 server2 item1 item2

Related

PowerShell append to first available row in a headered column in csv

Fairly new to PowerShell and having challenges with appending data in the first available row for each headered column in a csv file.
I would like to use a separate foreach for each column, because each column's data is independent of the others. The column headers are $headers = "Scope", "Drawing", "Submittal", "Database", "Estimate", "Sequence", and each foreach locates its items and appends them to its column. The problem is that each foreach appends its data on a new row, because the previous row was already written by another column's loop, so the appended data ends up on a diagonal.
A separate foreach is used for each column because each category searches for and filters files independently.
Below is what is happening in the CSV file:
| Scope | Drawing | Submittal | Database | Estimate | Sequence |
| ------| ------- |---------- |--------- |--------- |--------- |
| DATA01| empty | empty | empty | empty | empty |
| empty | DATA11 | empty | empty | empty | empty |
| empty | empty | DATA21 | empty | empty | empty |
| empty | empty | empty | DATA31 | empty | empty |
| empty | empty | empty | empty | DATA41 | empty |
| empty | empty | empty | empty | empty | DATA51 |
This is the desired result for the CSV file:
| Scope | Drawing | Submittal | Database | Estimate | Sequence |
| ------| ------- |---------- |--------- |--------- |--------- |
| DATA01| DATA11 | DATA21 | DATA31 | DATA41 | DATA51 |
Here is part of the code that is being worked on:
# Creates the CSV if it does not already exist
$headers = "Scope", "Mechanical Drawing", "Controls Submittal", "Database", "Estimate", "Sequence of Operations"
$psObject = New-Object psobject
foreach($header in $headers)
{
Add-Member -InputObject $psobject -MemberType noteproperty -Name $header -Value ""
}
$psObject | Export-Csv $CsvFile -NoTypeInformation
foreach ($file in $ScopeList)
{
$hash=@{
"Scope" = $file.Fullname
}
$NewItem = New-Object PSObject -Property $hash
Export-Csv $CsvFile -inputobject $NewItem -append -Force
}
foreach ($file in $DrawingList)
{
$hash=@{
"Drawing" = $file.Fullname
}
$NewItem = New-Object PSObject -Property $hash
Export-Csv $CsvFile -inputobject $NewItem -append -Force
}
foreach ($file in $SubtmittalList)
{
$hash=@{
"Submittal" = $file.Fullname
}
$NewItem = New-Object PSObject -Property $hash
Export-Csv $CsvFile -inputobject $NewItem -append -Force
}
foreach ($file in $DatabaseList)
{
$hash=@{
"Database" = $file.Fullname
}
$NewItem = New-Object PSObject -Property $hash
Export-Csv $CsvFile -inputobject $NewItem -append -Force
}
foreach ($file in $EstimateList)
{
$hash=@{
"Estimate" = $file.Fullname
}
$NewItem = New-Object PSObject -Property $hash
Export-Csv $CsvFile -inputobject $NewItem -append -Force
}
foreach ($file in $SequenceList)
{
$hash=@{
"Sequence" = $file.Fullname
}
$NewItem = New-Object PSObject -Property $hash
Export-Csv $CsvFile -inputobject $NewItem -append -Force
}
The PowerShell version being used is 5.1. Windows 10 OS.
Could someone help me understand how to append on the same row but a different column without erasing another column's existing row of data? Would this be something that could be done with splatting or looking at each variable ${named}List?
If I understand the question properly, you have 6 arrays of '$file' items that need to be combined into a CSV file.
Then, instead of using 6 foreach loops, just use one indexed loop and create objects from the various lists.
If all lists have the same number of items:
$result = for ($i = 0; $i -lt $ScopeList.Count; $i++) {
[PsCustomObject]@{
Scope = $ScopeList[$i].FullName
Drawing = $DrawingList[$i].FullName
Submittal = $SubtmittalList[$i].FullName
Database = $DatabaseList[$i].FullName
Estimate = $EstimateList[$i].FullName
Sequence = $SequenceList[$i].FullName
}
}
# now save the result as CSV file
$result | Export-Csv -Path 'X:\Path\to\TheResult.csv' -NoTypeInformation
If the lists are not all of the same length, you need to do some extra work:
# get the maximum number of items of your lists
$maxItems = (($ScopeList, $DrawingList, $SubtmittalList, $DatabaseList, $EstimateList, $SequenceList) |
Measure-Object -Property Count -Maximum).Maximum
# loop over the maximum number of items and check each list
# if the index $i does not exceed the max number of items for that list
$result = for ($i = 0; $i -lt $maxItems; $i++) {
[PsCustomObject]@{
Scope = if ($i -lt $ScopeList.Count) {$ScopeList[$i].FullName} else { $null}
Drawing = if ($i -lt $DrawingList.Count) {$DrawingList[$i].FullName} else { $null}
Submittal = if ($i -lt $SubtmittalList.Count) {$SubtmittalList[$i].FullName} else { $null}
Database = if ($i -lt $DatabaseList.Count) {$DatabaseList[$i].FullName} else { $null}
Estimate = if ($i -lt $EstimateList.Count) {$EstimateList[$i].FullName} else { $null}
Sequence = if ($i -lt $SequenceList.Count) {$SequenceList[$i].FullName} else { $null}
}
}
# now save the result as CSV file
$result | Export-Csv -Path 'X:\Path\to\TheResult.csv' -NoTypeInformation
As Theo posted, this mostly worked and brought the result closer to the goal.
Within each foreach loop that built its independent list, another list was created to filter that list, and the filtered lists were used in the section below from Theo's answer.
# loop over the maximum number of items and check each list
# if the index $i does not exceed the max number of items for that list
$result = for ($i = 0; $i -lt $maxItems; $i++) {
[PsCustomObject]@{
Scope = if ($i -lt $ScopeList.Count) {$ScopeList[$i].FullName} else { $null}
Drawing = if ($i -lt $DrawingList.Count) {$DrawingList[$i].FullName} else { $null}
Submittal = if ($i -lt $SubtmittalList.Count) {$SubtmittalList[$i].FullName} else { $null}
Database = if ($i -lt $DatabaseList.Count) {$DatabaseList[$i].FullName} else { $null}
Estimate = if ($i -lt $EstimateList.Count) {$EstimateList[$i].FullName} else { $null}
Sequence = if ($i -lt $SequenceList.Count) {$SequenceList[$i].FullName} else { $null}
}
}
# now save the result as CSV file
$result | Export-Csv -Path 'X:\Path\to\TheResult.csv' -NoTypeInformation
As an alternative way to get $maxItems, it could default to 1 unless the user enters the keyword "list" when prompted for other info. That raises another challenge: how to limit values to greater than or equal to 1...
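One way to restrict a value to 1 or greater is a validation attribute on a function parameter. This is a minimal sketch, assuming the count is passed to a function; Get-MaxItems and its parameter are hypothetical names, not part of the scripts above.

```powershell
# Hypothetical function: rejects any value below 1 at parameter binding time.
function Get-MaxItems {
    param(
        [ValidateRange(1, [int]::MaxValue)]
        [int]$MaxItems = 1
    )
    $MaxItems
}

Get-MaxItems               # falls back to the default value
Get-MaxItems -MaxItems 5   # accepted
# Get-MaxItems -MaxItems 0 # would fail with a parameter validation error
```

The validation runs when the argument is bound, so the function body never sees an out-of-range value.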

How to modify csv headers with powershell when there are no rows?

I have a PowerShell script that reads a CSV file, combines two rows into one based on a condition, modifies the column names, and exports a new CSV file. The problem is that the script doesn't work when the CSV file has no rows, although the column names still need to be modified. I don't really know PowerShell, so any help will be greatly appreciated.
$csv = import-csv "<CSVFilePath>"
$fars = @()
$nears = @()
$combined = @()
foreach($line in $csv){
if(<CONDITION>) {
$fars += $line
} else {
$nears += $line
}
}
foreach($far in $fars){
foreach($near in $nears) {
if(<CONDITION>) {
$o = New-Object -Type PSObject
$far.PSObject.Properties | ForEach-Object {
$far_field = $_.Name + " FAR"
$o | Add-Member -Name $far_field -Type NoteProperty -Value $_.Value
}
$near.PSObject.Properties | ForEach-Object {
$near_field = $_.Name + " NEAR"
$o | Add-Member -Name $near_field -Type NoteProperty -Value $_.Value
}
$combined += $o
}
}
}
$combined | ConvertTo-Csv -NoTypeInformation | ForEach-Object { $_ -replace '"', ""} | out-file "<CSVFilePath>" -fo -en ascii
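One possible way to handle the empty-file case is to read the header line directly and write only the renamed header when there are no data rows. This is a minimal sketch, assuming a comma-delimited header with no commas inside quoted names; Get-CombinedHeader is a hypothetical helper, and the column order (all FAR columns, then all NEAR columns) follows the order in which the script above adds the properties.

```powershell
# Hypothetical helper: build the combined header from the original header line.
# Assumes a comma-delimited header with no commas inside quoted names.
function Get-CombinedHeader {
    param([string]$HeaderLine)
    # Split the header and strip surrounding whitespace and quotes
    $names = $HeaderLine -split ',' | ForEach-Object { $_.Trim().Trim('"') }
    $far   = $names | ForEach-Object { "$_ FAR" }
    $near  = $names | ForEach-Object { "$_ NEAR" }
    (@($far) + @($near)) -join ','
}

$combinedHeader = Get-CombinedHeader -HeaderLine 'Id,Price'
```

In the script, the header line would have to be read (for example with Get-Content and -First 1) before the file is overwritten, and the combined header written out only when $combined is empty.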

Monitor multiple file sizes and export result to csv every minute using PowerShell

What I am attempting to do is read the file size from 4 files every minute and then output the results to a CSV file. I would like each file size written to their own column. I have attempted this with the code below but I am not getting the results I need. When the file sizes are written to the CSV they are added to a single column, each size on their own row.
If I expand my array ($Process) to two records it forces all of the file lengths in the first record to the first row, each in their own column (which is what I want).
The second record does the same.
Is there any way to force a single record to one line and their values in their own columns by making an adjustment to the code? I appreciate your help. Thank you.
$Csv = 'c:\users\rob\LogFileSize.csv'
$today = get-date -f "ddd"
$day = $today.ToLower()
$date = Get-Date
$numColstoExport=5
$a = dir C:\RTscada\bin\ErrorLogs\error1_Log_$day.txt
$a | Add-Member -MemberType AliasProperty -Name FileLength -Value Length
$b = dir C:\RTscada\bin\ErrorLogs\error2_Log_$day.txt
$b| Add-Member -MemberType AliasProperty -Name FileLength -Value Length
$c = dir C:\RTscada\bin\ErrorLogs\error3_Log_$day.txt
$c | Add-Member -MemberType AliasProperty -Name FileLength -Value Length
$d = dir C:\RTscada\bin\ErrorLogs\error4_Log_$day.txt
$d | Add-Member -MemberType AliasProperty -Name FileLength -Value Length
$error1 = $a.Length
$error2 = $b.Length
$error3 = $c.Length
$error4 = $d.Length
# Array of date and file lengths
$Process = @($date, $error1, $error2, $error3, $error4)
$holdarr=@()
$pNames=@("Date", "Error1", "Error2","Error2","Error4")
foreach ($row in $Process){
$obj = new-object PSObject
for ($i=0;$i -lt $numColstoExport; $i++){
$obj | Add-Member -MemberType NoteProperty -Name $pNames[$i] -Value $row[$i]
}
$holdarr+=$obj
$obj=$null
}
$holdarr | export-csv $Csv -NoTypeInformation -Append
This should do the trick:
function MonitorLogFiles {
Param(
$LogFiles,
$Csv
)
$FirstCsvLine = 'Date'
foreach ($LogFile in $LogFiles) {
$FirstCsvLine = $FirstCsvLine + ",$(Split-Path $LogFile -leaf)"
}
$FirstCsvLine | Out-File $Csv -Encoding UTF8
while ($True) {
$CurrentCsvLine = (Get-Date -format "dd-MMM-yyyy HH:mm").ToString()
foreach ($LogFile in $LogFiles) {
$Size = (Get-Item $LogFile).Length / 1KB
$CurrentCsvLine = $CurrentCsvLine + ",$Size KB"
}
$CurrentCsvLine | Out-File $Csv -append -Encoding UTF8
Start-Sleep -Seconds 60
}
}
MonitorLogFiles -LogFiles C:\test.txt,C:\Test2.txt -Csv c:\test.csv
You can do something like this:
$holdarr|Select-Object -Property Name, Value | Export-Csv $Csv -NoTypeInformation -Append
Note: list every column you want in the output in the Select-Object property list.
Hope it helps.
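The one-value-per-row result in the question comes from looping over the scalar values one at a time. This is a minimal sketch of building one object per sample instead, so that each export appends a single row; New-SizeRecord is a hypothetical helper, the Error1..ErrorN column names mirror the question, and missing files are recorded as empty values.

```powershell
# Hypothetical helper: one object (one CSV row) per sample.
function New-SizeRecord {
    param([string[]]$Path)
    # Ordered so the columns keep their insertion order in the CSV
    $record = [ordered]@{ Date = Get-Date -Format s }
    $n = 1
    foreach ($p in $Path) {
        $record["Error$n"] = if (Test-Path -LiteralPath $p) {
            (Get-Item -LiteralPath $p).Length
        } else {
            $null
        }
        $n++
    }
    [pscustomobject]$record
}
```

Piping the result of New-SizeRecord to Export-Csv with -NoTypeInformation and -Append once per minute would then add one row per sample, with each file size in its own column.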

Try to determine the max number of characters in each column

I have written a script that tries to determine the maximum number of characters in each column. This is what I wrote:
$path = 'folder path'
$file = Get-ChildItem $path\*
$FileContent = foreach ($files in $file) {
$FileHeader = @( (Get-Content $files -First 1).Split($delimiter) )
$importcsv = @( Import-Csv $files -Delimiter "$delimiter" )
for ($i=0; $i -lt $FileHeader.Length; $i++) {
#select each column
$Column = @( $importcsv | select $FileHeader[$i] )
#find the max no. of character
$MaxChar = @(($Column[$i] |
Select -ExpandProperty $FileHeader[$i] |
Measure-Object -Maximum -Property Length).Maximum)
$output = New-Object PSObject
$output | Add-Member NoteProperty FullName ($files.FullName)
$output | Add-Member NoteProperty FileName ($files.Name)
$output | Add-Member NoteProperty Time (Get-Date -Format s)
$output | Add-Member NoteProperty FileHeader ($($FileHeader[$i]))
$output | Add-Member NoteProperty MaxCharacter ($($MaxChar[$i]))
Write-Output $output
}
}
The script above is only part of the whole, so $delimiter is already defined. Finally, I export the result as CSV.
The script runs without any error, but the output file only contains the maximum number of characters for the first column/header; the remaining columns/headers are missing.
The ideal result would show the maximum number of characters for each column/header.
Is something wrong with my loop?
My boss is trying to create an automated process that extracts all the information from the raw data and uses it to upload to a database. The part of the script not shown determines the delimiter of the raw file, and $CleanHeader is a cleaned version of $FileHeader (special characters removed, capital letters converted to lowercase); those clean headers will be used as the column headers of the database table. He also wants to know the maximum number of characters in each column, so that it can be used to size the columns when creating the table (he knows this part can be done in SQL), but he asked me whether it can be done in PowerShell.
This should work:
$ht = @{}
# import a CSV and iterate over its rows
Import-Csv $f.FullName -Delimiter "$delimiter" | ForEach-Object {
# iterate over the columns of each row
$_.PSObject.Properties | ForEach-Object {
# update the hashtable if the length of the current column value is greater
# than the value in the hashtable
if ($_.Value.Length -gt $ht[$_.Name]) {
$ht[$_.Name] = $_.Value.Length
}
}
}
# create an object for each key in the hashtable
$date = Get-Date -Format s
$ht.Keys | ForEach-Object {
New-Object -Type PSObject -Property @{
FullName = $f.FullName
Name = $f.Name
Time = $date
FileHeader = $_
MaxCharacter = $ht[$_]
}
}
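The same idea can be wrapped in a reusable function that keeps the columns in their original order. This is a minimal sketch rather than the answer's own code: Get-MaxColumnLength is a hypothetical name, and empty or missing values are counted as length 0.

```powershell
# Hypothetical helper: longest value per column for a set of CSV rows.
function Get-MaxColumnLength {
    param([object[]]$Rows)
    # Ordered dictionary so columns are reported in header order
    $max = [ordered]@{}
    foreach ($row in $Rows) {
        foreach ($prop in $row.PSObject.Properties) {
            # String interpolation turns $null into an empty string
            $len = "$($prop.Value)".Length
            if (-not $max.Contains($prop.Name) -or $len -gt $max[$prop.Name]) {
                $max[$prop.Name] = $len
            }
        }
    }
    $max
}

$rows   = 'Name,City', 'Al,Amsterdam', 'Barbara,Rome' | ConvertFrom-Csv
$result = Get-MaxColumnLength -Rows $rows
```

For a file, the rows could come from Import-Csv with the detected -Delimiter, and each key/value pair of the result could then be turned into one output object per column.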
$FileHeader[$i] was returning the column name with quotes: "ColumnName" instead of ColumnName.
To fix this, trim the quotes on the line where you read the header:
$FileHeader = @( (Get-Content $files -First 1).Split($delimiter).Trim('"') )

How to add columns into multiple CSV files in a single directory without adding quotation marks?

I have thousands of CSV files in a single directory. I'm looking for a way to add 2 columns (including headers) to each file.
There are some conditions:
Column #5 always contains the value 0
Column #6 should contain the file name without its extension (ABC)
INPUT FILE EXAMPLE (FILENAME IS ABC.CSV)
HEADER1,HEADER2,HEADER3,HEADER4
04/22/2012,47.64,47.97,47.05
04/23/2012,47.6,48.2,47.4
04/24/2012,48.13,48.33,47.84
04/25/2012,47.81,48.14,47.59
04/26/2012,47.83,48.21,47.49
04/27/2012,47.2,47.31,46.84
04/28/2012,47.01,47.05,46.33
The code I've posted below has one problem:
it adds quotation marks ("04/22/2012","47.64","47.97","47.05","0","ABC") to every value in the new file.
OUTPUT FILE EXAMPLE I NEED
HEADER1,HEADER2,HEADER3,HEADER4,HEADER5,HEADER6
04/22/2012,47.64,47.97,47.05,0,ABC
04/23/2012,47.6,48.2,47.4,0,ABC
04/24/2012,48.13,48.33,47.84,0,ABC
04/25/2012,47.81,48.14,47.59,0,ABC
04/26/2012,47.83,48.21,47.49,0,ABC
04/27/2012,47.2,47.31,46.84,0,ABC
04/28/2012,47.01,47.05,46.33,0,ABC
$files = Get-ChildItem ".\" -filter "*.csv"
for ($i=0; $i -lt $files.Count; $i++) {
$outfile = $files[$i].FullName + "out"
$csv = Import-Csv $files[$i].FullName
$newcsv = @()
foreach ( $row in $csv ) {
$row | Add-Member -MemberType NoteProperty -Name 'HEADER 5' -Value '0'
$row | Add-Member -MemberType NoteProperty -Name 'HEADER 6' -Value $files[$i].BaseName
$newcsv += $row
}
$newcsv | Export-Csv $files[$i].FullName -NoTypeInformation
}
And one more question: because I have thousands of files in the directory, is this code efficient enough to finish the task as fast as possible?
Somebody has already suggested improving the code: instead of looping through the rows, consider building the members with Select-Object:
$csv = $csv | select-object *,@{n="HEADER5";e={0}},@{n="HEADER6";e={$file.BaseName}}
But I don't know how to implement this suggestion in my code.
Not tested:
$InputFolder = 'c:\SomeFolder'
$OutputFolder = 'c:\SomeOtherFolder'
Get-ChildItem $InputFolder -Filter *.* |
where {-not $_.psiscontainer} |
foreach {
$FileName = $_.Name
$BaseName = $_.Basename
$data = Get-Content $_.FullName -ReadCount 0
"$($data[0]),Header5,Header6" | Set-Content $OutputFolder\$FileName
$data[1..($data.Length -1)] -replace '$',",0,$BaseName" |
Add-Content $OutputFolder\$FileName
}
I think my original answer depended too much on v3/v4 stuff, how about something like the following:
$files = Get-ChildItem ".\" | Where-Object { $_.Extension -eq ".csv" }
for ($i=0; $i -lt $files.Count; $i++) {
$outfile = $files[$i].FullName + "out"
$csv = Import-Csv $files[$i].FullName
$newcsv = @()
foreach ( $row in $csv ) {
$row | Add-Member -MemberType NoteProperty -Name 'HEADER5' -Value '0'
$row | Add-Member -MemberType NoteProperty -Name 'HEADER6' -Value $files[$i].BaseName
$newcsv += $row
}
( $newcsv | ConvertTo-Csv -NoTypeInformation ) | Foreach-Object { $_ -replace '"', '' } | Out-File $outfile
}
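The Select-Object suggestion quoted in the question could be applied per file roughly as follows. This is a minimal sketch, not taken from the answers above: Add-CsvColumns is a hypothetical name, the output goes to a separate folder that must already exist, and stripping every double quote assumes that no field contains a comma or a quote.

```powershell
# Hypothetical helper: append HEADER5 (always 0) and HEADER6 (file base name).
function Add-CsvColumns {
    param(
        [string]$InputFolder,
        [string]$OutputFolder
    )
    foreach ($file in Get-ChildItem -Path $InputFolder -Filter '*.csv') {
        $baseName = $file.BaseName
        # Calculated properties add both columns without a per-row loop.
        # The -replace step removes all quotes (assumes no quoted commas).
        Import-Csv -Path $file.FullName |
            Select-Object *, @{ n = 'HEADER5'; e = { 0 } }, @{ n = 'HEADER6'; e = { $baseName } } |
            ConvertTo-Csv -NoTypeInformation |
            ForEach-Object { $_ -replace '"', '' } |
            Set-Content -Path (Join-Path $OutputFolder $file.Name)
    }
}
```

Streaming each file through one pipeline also avoids growing an array with += for every row, which tends to matter when there are thousands of files.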