Parse existing CSV file and strip certain columns

I have a CSV file with about 50 columns and 1000 rows. I don't need all of this information, so I would like to parse the file and remove the columns I do not want.
I want to keep every row (each row is a user), but there are about a dozen columns whose data I need to remove. How can I do this?
Ex.
User1 | Name | Age | Location | Gender | HairColor
keep | keep | remove | remove | keep | remove

If you want the columns removed completely:
$CSV = Import-Csv $Path | Select-Object -Property User1, Name, Gender
$CSV | Export-Csv $NewPath -NoTypeInformation
Or this, if it's easier:
$CSV = Import-Csv $Path | Select-Object -Property * -ExcludeProperty Age, Location, HairColor
$CSV | Export-Csv $NewPath -NoTypeInformation
If you want the columns to remain but be empty:
$CSV = Import-Csv $Path | Select-Object -Property User1, Name, @{n='Age';e={}}, @{n='Location';e={}}, Gender, @{n='HairColor';e={}}
$CSV | Export-Csv $NewPath -NoTypeInformation
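When the list of unwanted columns is long, it may be easier to hold it in a variable. The following is a minimal sketch of that idea, not part of the original answer; the sample rows and the $drop variable are hypothetical, and ConvertFrom-Csv stands in for Import-Csv so that no file is needed. Note that Windows PowerShell 5.1 only honours -ExcludeProperty when -Property is also given, which is why -Property * is kept.

```powershell
# Hypothetical sample data standing in for Import-Csv $Path.
$rows = @'
User1,Name,Age,Location,Gender,HairColor
u1,Ann,31,Oslo,F,Brown
u2,Bob,45,Rome,M,Black
'@ | ConvertFrom-Csv

# Columns to drop, held in one place (hypothetical variable name).
$drop = 'Age', 'Location', 'HairColor'

# -Property * selects everything; -ExcludeProperty then removes the listed columns.
$trimmed = $rows | Select-Object -Property * -ExcludeProperty $drop

# Column names that survive, in their original order.
$trimmed[0].PSObject.Properties.Name
```

Piping $trimmed to Export-Csv $NewPath -NoTypeInformation would then write the reduced file, as in the answer above.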

Related

export csv rows where duplicate values found in column

I have a csv file where I am trying to export rows into another csv file only where the values in the id column have duplicates.
I have the following csv file...
"id","blablah"
"valOne","valTwo"
"valOne","asdfdsa"
"valThree","valFour"
"valFive","valSix"
"valFive","qwreweq"
"valSeven","valEight"
I need the output csv file to look like the following...
"valOne","valTwo"
"valOne","asdfdsa"
"valFive","valSix"
"valFive","qwreweq"
Here is the code I have so far:
$inputCsv = Import-CSV './test.csv' -delimiter ","
#$output = @()
$inputCsv | Group-Object -prop id, blablah | Where-Object {$_.id -gt 1} |
Select-Object
#@{n='id';e={$_.Group[0].id}},
#@{n='blablah';e={$_.Group[0].blablah}}
#Export-Csv 'C:\scripts\powershell\output.csv' -NoTypeInformation
#Write-Host $output
#$output | Export-Csv 'C:\scripts\powershell\output.csv' -NoTypeInformation
I've searched multiple how-tos but can't seem to find the right syntax. Can anyone help with this?

Group on the id property only. Any group whose Count is greater than 1 contains duplicates, so expand those groups back into rows and export them.
$inputCsv = Import-CSV './test.csv' -delimiter ","
$inputCsv |
Group-Object -Property ID |
Where-Object count -gt 1 |
Select-Object -ExpandProperty group |
Export-Csv output.csv -NoTypeInformation
output.csv will contain
"id","blablah"
"valOne","valTwo"
"valOne","asdfdsa"
"valFive","valSix"
"valFive","qwreweq"

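If the duplicate rows should stay in their original file order, one option is to collect the duplicated id values first and then filter the original rows against that list. This is a sketch under that assumption rather than the answer's method; the $dupIds and $dupes names are hypothetical, and the data is the sample from the question.

```powershell
# Sample rows from the question, parsed in memory instead of from test.csv.
$rows = @'
"id","blablah"
"valOne","valTwo"
"valOne","asdfdsa"
"valThree","valFour"
"valFive","valSix"
"valFive","qwreweq"
"valSeven","valEight"
'@ | ConvertFrom-Csv

# Every id value that occurs more than once.
$dupIds = $rows |
    Group-Object -Property id |
    Where-Object Count -gt 1 |
    Select-Object -ExpandProperty Name

# Filter the original rows, which keeps them in input order.
$dupes = $rows | Where-Object { $dupIds -contains $_.id }

# Show the result as CSV text without writing a file.
$dupes | ConvertTo-Csv -NoTypeInformation
```
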
PowerShell. Group-object usage in one file

I am trying to combine several rows into one when their key cell (ID) is the same, writing the data from all rows that share a key into a single output row.
Before (table view)
ID | Name | DateTime | Duration | Call_Type
1234509 | Mike | 2020-01-02T01:22:33 | | Start_Call
1234509 | | 2020-01-02T01:32:33 | 600 | End_call
After (table view)
ID | Name | DateTime | Duration | Start_Call | End_call
1234509 | Mike | 2020-01-02T01:22:33 | 600 | 2020-01-02T01:22:33 | 2020-01-02T01:32:33
Before (CSV)
ID;Name;DateTime;Duration;Call_Type
1234509;Mike;2020-01-02T01:22:33;;Start_Call
1234509;;2020-01-02T01:32:33;600;End_call
After (CSV)
ID;Name;Duration;Start_Call;End_call
1234509;Mike;600;2020-01-02T01:22:33;2020-01-02T01:32:33
How can I use
$csv | Group-Object ID
here to get the data in the After form shown above?

After grouping by ID with Group-Object, you can iterate each group and create a new System.Management.Automation.PSCustomObject with the properties you want to export in your output CSV file.
For ID, we simply use the grouping key. For Name and Duration, we choose the first object whose value for that property is not $null or empty, tested with [string]::IsNullOrEmpty(). For Start_Call and End_Call, we choose the DateTime of the object whose Call_Type property has that value.
The filtering is done by Where-Object. To take only the first match and return the bare property value, we use -First and -ExpandProperty from Select-Object.
$csv = Import-Csv -Path .\data.csv -Delimiter ";"
$groups = $csv | Group-Object -Property ID
# A foreach statement cannot be piped directly, so it is wrapped in a script block.
& {
    foreach ($group in $groups)
    {
        # One output row per ID.
        [PSCustomObject]@{
            ID = $group.Name
            Name = $group.Group | Where-Object {-not [string]::IsNullOrEmpty($_.Name)} | Select-Object -First 1 -ExpandProperty Name
            Duration = $group.Group | Where-Object {-not [string]::IsNullOrEmpty($_.Duration)} | Select-Object -First 1 -ExpandProperty Duration
            Start_Call = $group.Group | Where-Object {$_.Call_Type -eq "Start_Call"} | Select-Object -First 1 -ExpandProperty DateTime
            End_Call = $group.Group | Where-Object {$_.Call_Type -eq "End_Call"} | Select-Object -First 1 -ExpandProperty DateTime
        }
    }
} | Export-Csv -Path .\output.csv -Delimiter ";" -NoTypeInformation
output.csv
"ID";"Name";"Duration";"Start_Call";"End_Call"
"1234509";"Mike";"600";"2020-01-02T01:22:33";"2020-01-02T01:32:33"
If you want to remove the quotes from the CSV file, you can use the -UseQuotes parameter of Export-Csv. However, this requires PowerShell 7. If you're using an earlier PowerShell version, you can use some of the recommendations from "How to remove all quotations mark in the csv file using powershell script?".
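As a rough illustration of the -UseQuotes parameter mentioned above, the following sketch converts one hypothetical row without quotes. It assumes PowerShell 7 or later, and it uses ConvertTo-Csv (which accepts the same parameter) so that the result can be inspected without writing a file.

```powershell
# Requires PowerShell 7+; Windows PowerShell 5.1 does not have -UseQuotes.
# Hypothetical row shaped like the output of the script above.
$row = [PSCustomObject]@{
    ID       = '1234509'
    Name     = 'Mike'
    Duration = '600'
}

# Never: no field is quoted. AsNeeded would quote only fields that require it.
$lines = $row | ConvertTo-Csv -Delimiter ';' -UseQuotes Never
$lines
```

With Export-Csv the call would look the same, for example ending the pipeline with Export-Csv -Path .\output.csv -Delimiter ';' -UseQuotes Never.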

Getting only a repeating files from directory and subdirectories

I'm trying to write a script that finds non-unique files.
The script should take one .csv file containing each file's Name, LastWriteTime, and Length. From that file, I want to create another .csv containing only those records whose combination of Name+Length+LastWriteTime is NON-unique.
I tried the following script, which uses $csvfile containing the file list:
$csvdata = Import-Csv -Path $csvfile -Delimiter '|'
$csvdata |
Group-Object -Property Name, LastWriteTime, Length |
Where-Object -FilterScript { $_.Count -gt 1 } |
Select-Object -ExpandProperty Group -Unique |
Export-Csv $csvfile2 -Delimiter '|' -NoTypeInformation -Encoding Unicode
$csvfile was created by:
{
Get-ChildItem -Path $mainFolderPath -Recurse -File |
Sort-Object $sortMode |
Select-Object Name, LastWriteTime, Length, Directory |
Export-Csv $csvfile -Delimiter '|' -NoTypeInformation -Encoding Unicode
}
(Get-Content $csvfile) |
ForEach-Object { $_ -replace '"' } |
Out-File $csvfile -Encoding Unicode
But somehow $csvfile2 ends up containing only one (the first) non-unique record. Does anyone have an idea how to improve the script so that it lists all non-unique records?

You need to use -Property * -Unique to get a list of unique objects. However, you cannot use -Property and -ExpandProperty at the same time here, because you want the latter parameter to apply to the input objects ($_) and the former parameter to apply to an already expanded property of those input objects ($_.Group).
Expand the property Group first, then select the unique objects:
... |
Select-Object -ExpandProperty Group |
Select-Object -Property * -Unique |
...
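Put together, the corrected pipeline might look like the following sketch. The three sample rows are hypothetical, and ConvertFrom-Csv replaces Import-Csv so that the example needs no file. The two a.txt rows share Name, LastWriteTime, and Length but live in different directories.

```powershell
# Hypothetical file list in the same pipe-delimited layout as $csvfile.
$files = @'
Name|LastWriteTime|Length|Directory
a.txt|2020-01-01 10:00|10|C:\one
a.txt|2020-01-01 10:00|10|C:\two
b.txt|2020-01-02 11:00|20|C:\one
'@ | ConvertFrom-Csv -Delimiter '|'

$dupes = $files |
    Group-Object -Property Name, LastWriteTime, Length |
    Where-Object Count -gt 1 |
    # Step 1: turn each group back into its member rows.
    Select-Object -ExpandProperty Group |
    # Step 2: drop rows that are identical in every column.
    Select-Object -Property * -Unique

$dupes | Format-Table
```

With this input, both a.txt rows should remain, because they differ in Directory, while b.txt is dropped because its group has only one member.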

PowerShell gather info

I'm looking for a way to read a CSV file with the headers 'Customer' and 'Cars ID'. The issue is that customers appear more than once. I need a way to list each unique customer together with all of its car IDs. If possible, I'd like to export the customer name with all of its car IDs listed under it, so that if I have 3 unique customers, each customer is exported separately.
#Get unique customer
$GetUniqCustomer = Import-Csv $File |
Sort-Object {$_.customer} -Unique |
Select {$_.customer}
From here on I'm not sure how to do what I've described.
The following lists all carid values for a specific customer.
Import-Csv $File | Where-Object {
$_.customer -eq $Customer
} | Select {$_."carid"}

For your particular scenario, you want to use Group-Object rather than Sort-Object -Unique. Build new custom objects from the grouped information and export them to the output CSV file.
Import-Csv $File | Group-Object customer | ForEach-Object {
    New-Object -Type PSObject -Property @{
        Customer = $_.Name
        Cars = ($_.Group | Select-Object -Expand carid) -join ';'
    }
} | Export-Csv 'C:\path\to\output.csv' -NoType
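A variation on the same idea, sketched with hypothetical sample data: [PSCustomObject] with a hashtable literal keeps the properties in the order they are written, which a plain hashtable passed to New-Object does not guarantee, so the CSV columns come out in a predictable order.

```powershell
# Hypothetical rows standing in for Import-Csv $File.
$rows = @'
customer,carid
Alice,A1
Bob,B1
Alice,A2
'@ | ConvertFrom-Csv

$summary = $rows | Group-Object -Property customer | ForEach-Object {
    [PSCustomObject]@{
        Customer = $_.Name
        # Member enumeration collects carid from every row in the group.
        Cars     = $_.Group.carid -join ';'
    }
}

# Show the result as CSV text without writing a file.
$summary | ConvertTo-Csv -NoTypeInformation
```

Replacing the last line with Export-Csv and a real path would write the same rows to disk.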

Possible to combine .csv where-object filters?

I'm trying to filter a .csv file based on a location column. The column has various location entries and I only need information from the rows that contain certain locations, that information then gets exported out to a separate .csv file. I can get it to work by searching the .csv file multiple times with each location filter, but I haven't had any luck when trying to combine it into 1 search.
What I have now is:
$csv = Import-Csv "${filepath}\temp1.csv"
$csv | Where-Object location -like "co*" | select EmployeeNumber | Export-Csv "${filepath}\disablelist.csv" -NoTypeInformation
$csv | Where-Object location -like "cc*" | select EmployeeNumber | Export-Csv "${filepath}\disablelist.csv" -Append -NoTypeInformation
$csv | Where-Object location -like "dc*" | select EmployeeNumber | Export-Csv "${filepath}\disablelist.csv" -Append -NoTypeInformation
$csv | Where-Object location -like "mf*" | select EmployeeNumber | Export-Csv "${filepath}\disablelist.csv" -Append -NoTypeInformation
What I'd like to have is something like below. I don't get any errors with it, but all I get is a blank .csv file:
$locations = "co*","cc*","dc*","mf*"
$csv = Import-Csv "${filepath}\temp1.csv"
$csv | Where-Object location -like $locations | select EmployeeNumber | Export-Csv "${filepath}\disablelist.csv" -NoTypeInformation
I've been lurking here for a while and I'm usually able to frankenstein a script together from what I find, but I can't seem to find anything on this. Thanks for your help.

You can replace multiple -like tests with a single -match test using a regex with alternation:
$csv = Import-Csv "${filepath}\temp1.csv"
$csv | Where-Object {$_.location -match '^(co|cc|dc|mf)'} |
select EmployeeNumber |
Export-Csv "${filepath}\disablelist.csv" -NoTypeInformation
You can build that regex from a string array:
$locations = 'co','cc','dc','mf'
$LocationMatch = '^({0})' -f ($locations -join '|')
$csv = Import-Csv "${filepath}\temp1.csv"
$csv | Where-Object { $_.location -match $LocationMatch } |
select EmployeeNumber |
Export-Csv "${filepath}\disablelist.csv" -NoTypeInformation
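If the prefixes might ever contain regex metacharacters (a dot or a plus sign, for example), they can be escaped before being joined. The following sketch assumes hypothetical sample rows and uses [regex]::Escape for that. With the four prefixes from the question, the resulting pattern is unchanged.

```powershell
$locations = 'co', 'cc', 'dc', 'mf'

# Escape each prefix so it is matched literally, then join with | for alternation.
$escaped = $locations | ForEach-Object { [regex]::Escape($_) }
$LocationMatch = '^({0})' -f ($escaped -join '|')

# Hypothetical rows standing in for temp1.csv.
$csv = @'
EmployeeNumber,location
1,Columbus
2,dcx
3,Madrid
4,MFG
'@ | ConvertFrom-Csv

# -match is case-insensitive by default, so MFG matches the mf prefix.
$hits = $csv |
    Where-Object { $_.location -match $LocationMatch } |
    Select-Object -ExpandProperty EmployeeNumber
$hits
```

Here the rows for Columbus, dcx, and MFG are selected, and Madrid is filtered out.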
