I have a CSV file with 2 columns and multiple rows. I have attached the contents of the CSV file used for testing, which has 2 columns and 30 rows holding data from 2 persons. In the real application the number of rows grows with the number of tests made, so in practice the file will have 2 columns and (15 × N) rows, where N is the number of persons.
I need the output to have N + 1 rows (including the header) and 15 columns (parameters). Can someone help me with a program to convert it using PowerShell?
A gentle reminder that I took only 2 readings for testing. In the application the number of readings is not known in advance.
This is my input text file, which is to be converted. Each parameter and its corresponding value are separated by a comma.
Name,Test
Age,18
Gender,Male
Time 1,379
Time 2,290
Time 3,305
Time 4,290
Time 5,319
Time 6,340
Time 7,436
Time 8,263
Time 9,290
Time 10,381
Responses,0
Average Reaction Time,329
Name,Test
Age,18
Gender,Male
Time 1,365
Time 2,340
Time 3,254
Time 4,270
Time 5,249
Time 6,350
Time 7,309
Time 8,527
Time 9,356
Time 10,407
Responses,1
Reaction Time,375
My code snippet for splitting on commas and transposing columns and rows:
import-csv $file -delimiter "," | export-csv $outfile
(gc $outfile | select -Skip 1) | sc $outfile
$filedata = import-csv $outfile -Header Parameter , Value
$filedata | export-csv $outfile -NoTypeInformation
$Csv = import-csv $outfile
$Rows = @()
$Rows += $csv.Parameter -join ","
$Rows += $Csv.Value -join ","
Get-Process | Tee-Object -Variable ExportMe | Format-Table
$Rows | Set-Content $outfile
This is my current CSV file
Name,Age,Gender,Time 1,Time 2,Time 3,Time 4,Time 5,Time 6,Time 7,Time 8,Time 9,Time 10,Responses,Average Time,Name,Age,Gender,Time 1,Time 2,Time 3,Time 4,Time 5,Time 6,Time 7,Time 8,Time 9,Time 10,Responses,Average Time
Test,18,Male,379,290,305,290,319,340,436,263,290,381,0,329,Test,18,Male,365,340,254,270,249,350,309,527,356,407,1,375
I am expecting an output CSV like this
Name,Age,Gender,Time 1,Time 2,Time 3,Time 4,Time 5,Time 6,Time 7,Time 8,Time 9,Time 10,Responses,Average Time
Test,18,Male,379,290,305,290,319,340,436,263,290,381,0,329
Test,18,Male,365,340,254,270,249,350,309,527,356,407,1,375
I have also attached a snap of my actual and received output.
Thanks in Advance
I'd strongly suggest not trying to format the CSV by hand.
If you know that there are always exactly 15 rows of properties per person, you can do a nested loop to "chop up" your csv:
# import original csv
$rows = Import-Csv $file -Header Name,Value
# outer loop increments by 15 (span of one person) every time
$objects = for($i = 0;$i -lt $rows.Count;$i += 15){
# prepare an ordered dictionary to hold the properties
$props = [ordered]@{}
# generate an inner loop from the offset to offset+14
$i..($i+14)|%{
# copy each row to our dictionary
$props[$rows[$_].Name] = $rows[$_].Value
}
# cast our dictionary to an object
[pscustomobject]$props
}
# convert back to csv
$objects |Export-Csv $outfile -NoTypeInformation
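If the number of properties per person is not guaranteed to be exactly 15, one variation is to start a new record whenever a Name row appears. The following is a minimal sketch under that assumption (every record begins with a Name row); the sample lines are made up and only stand in for the real file:

```powershell
# Hypothetical sample data: two short records, each starting with a 'Name' row.
$lines = @(
    'Name,Test'
    'Age,18'
    'Responses,0'
    'Name,Test2'
    'Age,19'
    'Responses,1'
)
# -Header makes ConvertFrom-Csv treat the first line as data, not as column names.
$rows = $lines | ConvertFrom-Csv -Header Name, Value

$props = $null
$objects = @(
    foreach ($row in $rows) {
        # A 'Name' row marks the start of a new person: emit the previous one first.
        if ($row.Name -eq 'Name') {
            if ($null -ne $props) { [pscustomobject]$props }
            $props = [ordered]@{}
        }
        # Assumption: the first row of the input is a 'Name' row, so $props exists here.
        $props[$row.Name] = $row.Value
    }
    # Emit the last person, who has no following 'Name' row.
    if ($null -ne $props) { [pscustomobject]$props }
)

# One CSV row per person; swap in Export-Csv to write to a file instead.
$objects | ConvertTo-Csv -NoTypeInformation
```

Note that the CSV columns are taken from the first object, so every record should use the same parameter names.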
Related
I am trying to merge two CSV files that share a common column name, but one has 22 rows and the other just 16.
1st CSV
Name            Service_StatusA
IEClient        running
IE Nomad        running
Data Usage      running
Print Spooler   running
Server          running

2nd CSV
Name            Service_StatusB
IEClient        Manual
IE Nomad        running
Print Spooler   Manual
Server          running
I want to merge these into a single CSV:
Name            Service_StatusA   Service_StatusB
IEClient        running           Manual
IE Nomad        running           running
Data Usage      running
Print Spooler   running           Manual
Server          running           running
$file1 = Import-Csv -Path .\PC1.csv
$file2 = Import-Csv -Path .\PC2.csv
$report = @()
foreach ($line in $file1)
{
$match = $file2 | Where-Object {$_.Name -eq $line.Name}
if ($match)
{
$row = "" | Select-Object 'Name','Service_StatusA','Service_StatusB'
$row.Name = $line.Name
$row.'Service_StatusA' = $line.'Service_StatusA'
$row.'Service_StatusB' = $match.'Service_StatusB'
$report += $row
}
}
$report | export-csv .\mergetemp.csv -notype -force
How can I compare the row values before merging?
In SQL database terms you want a left join, and your code is doing an inner join. In set terms, you are doing an intersection of 1.csv and 2.csv (only the rows which appear in both) but you want to be doing a union of 1.csv + the intersection (all rows from 1.csv with only matching lines from 2.csv).
You want every row in the first csv to be a row in the output csv. That should be the start - always output something in your loop. At the moment you output from the if() test. You want matching rows in the second csv to have their data added in if they exist, but not to change the amount of output.
$file1 = Import-Csv -Path .\PC1.csv
$file2 = Import-Csv -Path .\PC2.csv
$report = foreach ($line in $file1)
{
# always make an output line for each row in file1
$row = "" | Select-Object 'Name','Service_StatusA','Service_StatusB'
$row.Name = $line.Name
$row.'Service_StatusA' = $line.'Service_StatusA'
# if there is a matching line in file2, add its data in
$match = $file2 | Where-Object {$_.Name -eq $line.Name}
if ($match)
{
$row.'Service_StatusB' = $match.'Service_StatusB'
}
# always output a row for each row in file1
$row
}
$report | export-csv .\mergetemp.csv -notype -force
(It is possible that what you want is a SQL outer join where rows in 2.csv that are not in 1.csv also create an output row, but your example does not show that).
(I took out $report += because it's more code that runs slower, which is an annoying combination).
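For completeness, the outer-join variant mentioned above could look roughly like the sketch below. It is not a drop-in replacement: the sample data is invented, and indexing the second file in a hashtable is a swapped-in technique (instead of the Where-Object scan) that assumes the Name values are unique within each file.

```powershell
# Hypothetical in-memory stand-ins for PC1.csv and PC2.csv.
$file1 = @'
Name,Service_StatusA
IEClient,running
Data Usage,running
'@ | ConvertFrom-Csv

$file2 = @'
Name,Service_StatusB
IEClient,Manual
Server,running
'@ | ConvertFrom-Csv

# Index file2 by Name so each lookup is a hashtable hit rather than a full scan.
$lookup = @{}
foreach ($line in $file2) { $lookup[$line.Name] = $line }

$report = @(
    # Left side: every row of file1 produces an output row.
    foreach ($line in $file1) {
        $match   = $lookup[$line.Name]
        $statusB = if ($match) { $match.Service_StatusB } else { $null }
        [pscustomobject]@{
            Name            = $line.Name
            Service_StatusA = $line.Service_StatusA
            Service_StatusB = $statusB
        }
        # Remove matched names so only file2-only rows remain afterwards.
        $lookup.Remove($line.Name)
    }
    # Right side: rows that exist only in file2.
    foreach ($line in $lookup.Values) {
        [pscustomobject]@{
            Name            = $line.Name
            Service_StatusA = $null
            Service_StatusB = $line.Service_StatusB
        }
    }
)

$report | ConvertTo-Csv -NoTypeInformation
```

Dropping the second foreach loop turns this back into the left join shown earlier.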
I need to sort CSV files by their first column (the column may differ between files).
My CSV files have more than a million records, and the command below takes 10 minutes to run.
Is there any other way to optimize the code to speed up the execution?
$CsvFile = "D:\Performance\10_lakh_records.csv"
$OutputFile ="D:\Performance\output.csv"
Import-Csv $CsvFile | Sort-Object { $_.psobject.Properties.Value[1] } | Export-Csv -Encoding default -Path $OutputFile -NoTypeInformation
You could try using the [array]::Sort() static method, which might prove faster than Sort-Object, although it takes an extra step to first build a one-dimensional array of all the values to sort on.
Try
$CsvFile = "D:\Performance\10_lakh_records.csv"
$OutputFile = "D:\Performance\output.csv"
# import the data
$data = Import-Csv -Path $CsvFile
# determine the column name to sort on. In this demo the first column
# of course, if you know the column name you don't need that and can simply use the name as-is
$column = $data[0].PSObject.Properties.Name[0]
# use the Sort(Array, Array) overload method to sort the data by the
# values of the column you have chosen.
# see https://learn.microsoft.com/en-us/dotnet/api/system.array.sort?view=net-5.0#System_Array_Sort_System_Array_System_Array_
[array]::Sort($data.$column, $data)
$data | Export-Csv -Encoding default -Path $OutputFile -NoTypeInformation
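One thing to keep in mind with this approach: Import-Csv yields every value as a string, so the keys sort as text. If the sort column holds numbers, a typed key array may be what you want. The following is a small sketch with invented sample data; the column names Id and Label are hypothetical:

```powershell
# Hypothetical sample data with a numeric first column.
$data = @'
Id,Label
10,ten
9,nine
100,hundred
'@ | ConvertFrom-Csv

# Build a typed key array; the [int[]] cast converts '10', '9', '100' to numbers.
[int[]]$keys = $data.Id

# Sort the keys and reorder $data in place to match them.
[array]::Sort($keys, $data)

# The rows are now in numeric rather than text order.
$data.Id -join ','
```

The cast fails if any value in the column is not a valid integer, so this only applies to clean numeric columns.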
I would like to read data from CSV or other text files. Only row 1 should be read: a few of its columns should be saved to variables, and after saving, the row should be deleted. So far I have done it like this:
Get-ChildItem -Path C:\path | ForEach-Object -Process {
$YourContent = Get-Content -Path $_.FullName
$YourVariable = $YourContent | Select-Object -First 1
$YourContent | Select-Object -Skip 1 | Set-Content -Path $_.FullName
}
My problem is that my variable prints out like this:
Elvis;867.5390;elvis@geocities.com
So I would like to save each value to its own column. Example of what the CSV could look like:
Elvis | 867.5309 | Elvis@Geocities.com
Sammy | 555.1234 | SamSosa@Hotmail.com
Use Import-Csv instead of Get-Content:
Import-Csv file.csv -Delimiter ";" -Header A, B, C
here's one way to do what i think you want.
the 1st 8 lines make a file to work with. [grin]
line 10 reads in that file
lines 11-13 convert the 1st line into an object & remove the unwanted property
lines 14-15 grab all BUT the 1st line & send it to overwrite the source file
the remaining lines show what was done [grin]
Code:
$FileName = "$env:TEMP\Pimeydentimo.txt"
# create a file to work with
@'
Alfa;123.456;Some unwanted info;Alfa@example.com
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
'@ | Set-Content -LiteralPath $FileName
$InStuff = Get-Content -LiteralPath $FileName
$TempObject = $InStuff[0] |
ConvertFrom-Csv -Delimiter ';' -Header 'Name', 'Number', 'DropThisOne', 'Email' |
Select-Object -Property * -ExcludeProperty DropThisOne
$InStuff[1..$InStuff.GetUpperBound(0)] |
Set-Content -LiteralPath $FileName
$InStuff
'=' * 30
$TempObject
'=' * 30
Get-Content -LiteralPath $FileName
output ...
Alfa;123.456;Some unwanted info;Alfa@example.com
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
==============================
Name Number  Email
---- ------  -----
Alfa 123.456 Alfa@example.com
==============================
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
Thanks for the answers!
I'll try to clarify a bit more what I was trying to do. The answers might do it already, but I'm not yet that good at PowerShell and am still learning a lot.
If I have a CSV or any other text file, I want to read the first row of the file. The row contains more than one piece of information. I also want to save each piece of information to its own variable. After saving the information to variables, I would like to delete the row.
Example:
Car Model Year
Ford Fiesta 2015
Audi A6 2018
In this example, I would like to save Ford, Fiesta and 2015 to variables ($Car, $Model, $Year) from row 1 and then delete that row. The 2nd row should not be deleted, because it is used later on.
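A minimal sketch of that, assuming the file has no header line and uses a semicolon as the delimiter (as in the earlier Elvis example); the temp-file path and sample rows are made up for the demonstration:

```powershell
# Hypothetical demo file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'cars-demo.txt'
Set-Content -LiteralPath $path -Value 'Ford;Fiesta;2015', 'Audi;A6;2018'

# Read the whole file first, so it is safe to overwrite it afterwards.
$content = @(Get-Content -LiteralPath $path)

# Split the first row and assign each field to its own variable.
$car, $model, $year = $content[0] -split ';'

# Write everything except the first row back to the same file.
$content | Select-Object -Skip 1 | Set-Content -LiteralPath $path

"$car / $model / $year"
```

If the first row has more fields than there are variables, the last variable receives all the remaining fields as an array.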
Edit 1:
So I've figured out how to get the unique headers in CSV 2 to append to CSV 1.
$header = ($table | Get-Member -MemberType NoteProperty).Name
$header_add = ($table_add | Get-Member -MemberType NoteProperty).Name
$header_diff = $header + $header_add
$header_diff = ($header_diff | Sort-Object -Unique)
$header_diff = (Compare-Object -ReferenceObject $header -DifferenceObject $header_diff -PassThru)
$header is an array of headers from CSV 1 ($table). $header_add is an array of headers from CSV 2 ($table_add). $header_diff houses the unique headers in CSV 2 by the end of the code block.
So as far as I'm aware, my next step would be:
$append = ($table_add | Select-Object $header_diff)
My problem now is how to append these objects to my CSV 1 ($table) object. I don't quite see a way for Add-Member to do this in a particularly nice fashion.
Original:
Here's the headers for the two CSV files I'm trying to combine.
CSV 1:
Date, Name, Assigned Router, City, Country, # of Calls , Calls in , Calls out
CSV 2:
Date, Name, Assigned Router, City, Country, # of Minutes, Minutes in, Minutes out
So a quick rundown of what these files are; both files contain call information for a set of names for one day (the date column has the same date for each row; this is because this eventually gets sent to a master .xlsx file with all dates combined). All of the columns up to Country contain the same values in the same order in both files. The files simply separate the # of calls and # of minutes data. I was wondering if there was a convenient way to move the unlike columns from one CSV to another.
I've tried using something along the lines of:
Import-Csv (Get-ChildItem <directory> -Include <common pattern in file pair>) | Export-Csv <output path> -NoTypeInformation
This didn't combine all of the matching headers and append the unique ones afterwards. Only the first file that's processed kept its unique headers. The second file that was processed had all of those headers and data discarded in the output. Shared header data in the second CSV was added as additional rows.
An example of the failed output I described:
PS > $small | Format-Table
Column_1 Column_2 Column_3
-------- -------- --------
1 a a
1 b b
1 c c
PS > $small_add | Format-Table
Column_1 Column_4 Column_5
-------- -------- --------
1 x x
1 y y
1 z z
PS > Import-Csv (Get-ChildItem ./*.* -Include "small*.csv") | Select-Object * -unique | Format-Table
Column_1 Column_2 Column_3
-------- -------- --------
1 a a
1 b b
1 c c
1
1
1
I was wondering if I could do something like the following algorithm:
Import-Csv CSV_1 and CSV_2 to separate variables
Compare CSV_2 headers to CSV_1 headers, storing the unlike headers in CSV_2 into a separate variable
Select-Object all CSV_1 headers and unlike CSV_2 headers
Pipe the Select-Object output to Export-Csv
The only other method I could think of is doing it line by line, where I would:
Import-Csv both
remove all of the shared columns from CSV_2
change it from the custom object Powershell uses for CSVs to a string
append each line of CSV_2 to each line of CSV_1
It feels a bit unrefined and inflexible (flexibility can probably be dealt with by how columns/headers are isolated so there's no problem appending strings).
* This answer focuses on a high-level-of-abstraction OO solution.
* The OP's own solution relies more on string processing, which has the potential to be faster.
# The input file paths.
$files = 'csv1.csv', 'csv2.csv'
$outFile = 'csvMerged.csv'
# Read the 2 CSV files into collections of custom objects.
# Note: This reads the entire files into memory.
$doc1 = Import-Csv $files[0]
$doc2 = Import-Csv $files[1]
# Determine the column (property) names that are unique to document 2.
$doc2OnlyColNames = (
Compare-Object $doc1[0].psobject.properties.name $doc2[0].psobject.properties.name |
Where-Object SideIndicator -eq '=>'
).InputObject
# Initialize an ordered hashtable that will be used to temporarily store
# each document 2 row's unique values as key-value pairs, so that they
# can be appended as properties to each document-1 row.
$htUniqueRowD2Props = [ordered] @{}
# Process the corresponding rows one by one, construct a merged output object
# for each, and export the merged objects to a new CSV file.
$i = 0
$(foreach($rowD1 in $doc1) {
# Get the corresponding row from document 2.
$rowD2 = $doc2[$i++]
# Extract the values from the unique document-2 columns and store them in the ordered
# hashtable.
foreach($pname in $doc2OnlyColNames) { $htUniqueRowD2Props.$pname = $rowD2.$pname }
# Add the properties represented by the hashtable entries to the
# document-1 row at hand and output the augmented object (-PassThru).
$rowD1 | Add-Member -NotePropertyMembers $htUniqueRowD2Props -PassThru
}) | Export-Csv -NoTypeInformation -Encoding Utf8 $outFile
To put the above to the test, you can use the following sample input:
# Create sample input CSV files
@'
Date,Name,Assigned Router,City,Country,# of Calls,Calls in,Calls out
dt,nm,ar,ct,cy,cc,ci,co
dt2,nm2,ar2,ct2,cy2,cc2,ci2,co2
'@ > csv1.csv
# Same column layout and data as above through column 'Country', then different.
@'
Date,Name,Assigned Router,City,Country,# of Minutes,Minutes in,Minutes out
dt,nm,ar,ct,cy,mc,mi,mo
dt2,nm2,ar2,ct2,cy2,mc2,mi2,mo2
'@ > csv2.csv
The code should produce the following content in csvMerged.csv:
"Date","Name","Assigned Router","City","Country","# of Calls","Calls in","Calls out","# of Minutes","Minutes in","Minutes out"
"dt","nm","ar","ct","cy","cc","ci","co","mc","mi","mo"
"dt2","nm2","ar2","ct2","cy2","cc2","ci2","co2","mc2","mi2","mo2"
Edit 1:
# Read 2 CSVs into PowerShell CSV object
$table = Import-Csv test.csv
$table_add = Import-Csv test_add.csv
# Isolate unique headers in second CSV
$unique_headers = (Compare-Object -ReferenceObject $table[0].PSObject.Properties.Name -DifferenceObject $table_add[0].PSObject.Properties.Name | Where-Object SideIndicator -eq "=>").InputObject
# Convert CSVs to strings, with second CSV only containing unique columns
$table_str = ($table | ConvertTo-Csv -NoTypeInformation)
$table_add_str = ($table_add | Select-Object $unique_headers | ConvertTo-Csv -NoTypeInformation)
# Append CSV 2's unique columns to CSV 1
# Set line counter
$line = 0
# Concatenate CSV 2 lines to the end of CSV 1 lines until one or both are out of lines
While (($table_str[$line] -ne $null) -and ($table_add_str[$line] -ne $null)) {
If ($line -eq 0) {
$table_sum_str = $table_str[$line] + "," + $table_add_str[$line]
}
If ($line -ne 0) {
$table_sum_str = $table_sum_str + "`n" + ($table_str[$line] + "," + $table_add_str[$line])
}
$line = $line + 1
}
$table_sum_str | Set-Content -Path $outpath -Encoding UTF8
Using Measure-Command, the above code mostly takes between 14 and 17 milliseconds to run on my machine. Running Measure-Command on mklement's solution yields effectively the same times, judging by eye.
Note that for both solutions, the data in the 2 CSV files must be in the same order. If you want to combine 2 CSVs that have complementary data but in different orders, you need to use mklement's object-oriented approach and add a mechanism to match the rows by a key such as location or name.
Original:
For those who don't want to use a hash table to do this:
# Make sure you're in same directory as files:
# CSV 1
$table = Import-Csv test.csv
# CSV 2
$table_add = Import-Csv test_add.csv
# Get array with CSV 1 headers
$header = ($table | Get-Member -MemberType NoteProperty).Name
# Get array with CSV 2 headers
$header_add = ($table_add | Get-Member -MemberType NoteProperty).Name
# Add arrays of both headers together
$header_diff = $header + $header_add
# Sort the headers, remove duplicate headers (first couple ones), keep unique ones
$header_diff = ($header_diff | Sort-Object -Unique)
# Remove all of CSV 1's unique headers and shared headers
$header_diff = (Compare-Object -ReferenceObject $header -DifferenceObject $header_diff -PassThru)
# Generate a CSV table containing only CSV 2's unique headers
$table_diff = ($table_add | Select-Object $header_diff)
# Convert CSV 1 from a custom PSObject to a string
$table_str = ($table | Select-Object * | ConvertTo-Csv)
# Convert CSV 2 (unique headers only) from custom PSObject to a string
$table_diff_str = ($table_diff | Select-Object * | ConvertTo-Csv)
# Set line counter
$line = 0
# Set flag for if headers have been processed
$headproc = 0
# Concatenate CSV 2 lines to the end of CSV 1 lines until one or both are out of lines.
While (($table_str[$line] -ne $null) -and ($table_diff_str[$line] -ne $null)) {
If ($headproc -eq 1) {
$table_sum_str = $table_sum_str + "`n" + ($table_str[$line] + "," + $table_diff_str[$line])
}
If ($headproc -eq 0) {
$table_sum_str = $table_str[$line] + "," + $table_diff_str[$line]
$headproc = 1
}
$line = $line + 1
}
$table_sum_str | ConvertFrom-Csv | Select-Object * | Export-Csv -Path "./test_sum.csv" -Encoding UTF8 -NoTypeInformation
Ran a quick comparison using Measure-Command between this and mklement0's script.
PS > Measure-Command {./self.ps1}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 26
Ticks : 267771
TotalDays : 3.09920138888889E-07
TotalHours : 7.43808333333333E-06
TotalMinutes : 0.000446285
TotalSeconds : 0.0267771
TotalMilliseconds : 26.7771
PS > Measure-Command {./mklement.ps1}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 18
Ticks : 185058
TotalDays : 2.141875E-07
TotalHours : 5.1405E-06
TotalMinutes : 0.00030843
TotalSeconds : 0.0185058
TotalMilliseconds : 18.5058
I assume speed differences are because I spend time creating a separate CSV PSObject to isolate columns instead of comparing them directly. mklement's also has the advantage of keeping the columns in the same order.
I have a CSV file called test.csv ($testCSV).
There are many columns in this file but I would simply like to select the first 10 columns and put these 10 columns in to another CSV file.
Please note that I DO NOT HAVE ANY COLUMN HEADERS so can not select columns based on a column name.
The below line of code will get the first 10 ROWS of the file:
$first10Rows = Get-Content $testCSV | select -First 10
However I need all the data for the first 10 COLUMNS and I am struggling to find a solution.
I have also had a look at splitting the file and attempting to return the first column as follows:
$split = ( Get-Content $testCSV) -split ','
$FirstColumn = $split[0]
I had hoped the $split[0] would return the entire first column but it only returns the very first field in the file.
Any help in solving this problem is very much appreciated.
Thanks in advance.
******UPDATE******
I am using the method as answered below by vonPryz to solve this problem, i.e.:
Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b
However I am now also trying to import the CSV file only where column b is not null by adding this extra bit of code:
Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b | where b -notmatch $null
I need to do this to speed up the script as there are tens of thousands of lines where column b is null and I do not need to import these lines.
However, the above code returns no data, either meaning the code must be wrong or it thinks the field b is not null. An example of 2 lines of the text file is:
1,2,3
x,,z
And I only want the line(s) where the second column is occupied.
I hope I've explained that well and again, any help is appreciated.
*******************ANSWER********************
Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b | Where-Object { $_.b -ne '' }
Thanks!
Lack of column headers is no problem. The Import-Csv cmdlet lets you specify headers with the -Header parameter. Assuming the test data is saved as C:\temp\headerless.csv and contains
val11,val12,val13,val14
val21,val22,val23,val24
val31,val32,val33,val34
Importing it as CSV is trivial:
Import-Csv -Delimiter "," -Header @("a","b","c","d") -Path C:\temp\headerless.csv
#Output
a b c d
- - - -
val11 val12 val13 val14
val21 val22 val23 val24
val31 val32 val33 val34
Selecting just columns a and b is not hard either:
Import-Csv -Delimiter "," -Header @("a","b","c","d") -Path C:\temp\headerless.csv | select a,b | ft -auto
#Output
a b
- -
val11 val12
val21 val22
val31 val32
To start I want to mention that vonPryz's answer is a superb way of dealing with this. I just wanted to chime in about what you were trying to do and why it was not working.
You had the right idea: you were splitting the data on commas. However, you were not doing this on every line, just on the file as a whole, which was the source of your woes.
Get-Content $testCSV | ForEach-Object{
$split = $_ -split ","
$FirstColumn = $split[0]
}
That would split each line individually and then you could have populated the $FirstColumn variable.
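To take this one step further, the per-line split can also collect a whole column, or the first N fields of each line, without any headers. The following is a rough sketch with invented sample lines and N = 2; it assumes a plain comma-separated file with no quoted fields that contain commas, since a naive -split would break those:

```powershell
# Hypothetical sample lines standing in for Get-Content $testCSV.
$lines = 'a1,b1,c1,d1', 'a2,b2,c2,d2'

# Collect the first field of every line into an array.
$firstColumn = foreach ($line in $lines) { ($line -split ',')[0] }

# Keep only the first N fields of every line and re-join them as CSV text.
$n = 2
$firstN = foreach ($line in $lines) { ($line -split ',')[0..($n - 1)] -join ',' }

$firstColumn
$firstN
```

Piping $firstN to Set-Content would write the reduced lines to a new file.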