Splitting CSV file by two columns - powershell

Starting with a 500,000 line CSV, I need to split the file by day and hour (the second and third columns). I've tried to modify the grouping to include the hour, and while I see the hour get added to my filename, I get no results in the exported file.
The foreach doing the work:
foreach ($group in $data | Group Day,hour) {
$data | Where-Object { $_.Day -and $_.Hour -eq $group.Name }
ConvertTo-Csv -NoTypeInformation |
foreach {$_.Replace('"','')} |
Out-File "$Path\Testfile_$($group.name -replace $regexA, '').csv"
Sample Data:
Bob,1/27/2012,8:00,Basic,Operations
Charlie,2/3/2012,9:00,Advanced,Production
Bill,3/7/2012,10:00,Advanced,Production

You could import the CSV, determine the output filename on the fly, and append each record to the matching file:
Import-Csv 'C:\path\to\input.csv' | ForEach-Object {
$filename = ('output_{0}_{1}.csv' -f $_.Day, $_.Hour) -replace '[/:]'
$_ | Export-Csv "C:\path\to\$filename" -Append -NoType
}
Note that Export-Csv -Append requires PowerShell v3 or newer.
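If you would rather keep the Group-Object approach from the question, note that each group already carries its own records in its Group property, so there is no need to filter $data a second time. A minimal sketch, assuming the columns are named Name, Day and Hour; the inline sample rows, the temp output folder, and the output_ file prefix are illustrative only:

```powershell
# Hypothetical sample rows standing in for the Import-Csv result
$data = @(
    [pscustomobject]@{ Name = 'Bob';     Day = '1/27/2012'; Hour = '8:00' }
    [pscustomobject]@{ Name = 'Charlie'; Day = '2/3/2012';  Hour = '9:00' }
    [pscustomobject]@{ Name = 'Bill';    Day = '3/7/2012';  Hour = '10:00' }
    [pscustomobject]@{ Name = 'Dana';    Day = '3/7/2012';  Hour = '10:00' }
)

# Illustrative output folder under the temp directory
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'split_by_day_hour'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

foreach ($group in $data | Group-Object Day, Hour) {
    # Values holds the grouping values in order: Day first, then Hour
    $name = ('output_{0}_{1}.csv' -f $group.Values[0], $group.Values[1]) -replace '[/:]'
    # Group holds exactly the records that belong to this Day/Hour pair
    $group.Group | Export-Csv (Join-Path $outDir $name) -NoTypeInformation
}
```

Because each file is written once from a complete group, this variant does not need -Append and can be re-run without accumulating rows.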

Related

powershell foreach shows duplicate result

I use powershell to automate extracting of selected data from a CSV file.
My $target_servers list also contains the same server name twice, but with different data in each row.
Here is my code:
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
foreach($server in $target_servers) {
Import-Csv $path\Serverlist_Template.csv | Where-Object {$_.Hostname -Like $server} | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
}
After executing the above code it extracts CSV data based on a TXT file, but my problem is some of the results are duplicated.
I am expecting around 28 results but it gave me around 49.
As commented, -Append is the culprit here and you should check if the newly added records are not already present in the output file:
# read the Hostname column of the target csv file as array to avoid duplicates
$existingHostsNames = @((Import-Csv -Path "$path/windows_prd.csv").Hostname)
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
foreach($server in $target_servers) {
Import-Csv "$path\Serverlist_Template.csv" |
Where-Object {($_.Hostname -eq $server) -and ($existingHostsNames -notcontains $_.HostName)} |
Export-Csv -Path "$path/windows_prd.csv" -Append -NoTypeInformation
}
You can collect your data into an array of objects and then use select -Unique, like this:
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
$data = @()
foreach($server in $target_servers) {
$data += Import-Csv $path\Serverlist_Template.csv| Where-Object {$_.Hostname -Like $server}
}
$data | select -Unique | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
This works only if the duplicated rows have the same value in every column. If not, you can pass select the names of the columns that matter to you. For example:
$data | select Hostname -Unique | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
This gives you a list of unique hostnames.
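Another way to avoid the duplicates, sketched here under the assumption that they come from server names listed more than once in the text file: de-duplicate the name list first, read the template once, and write the output in a single pass without -Append. The inline sample data below is made up for illustration:

```powershell
# Hypothetical stand-ins for the Get-Content and Import-Csv results
$target_servers = 'srv01', 'srv02', 'srv01'
$template = @(
    [pscustomobject]@{ Hostname = 'srv01'; Role = 'web' }
    [pscustomobject]@{ Hostname = 'srv01'; Role = 'db' }
    [pscustomobject]@{ Hostname = 'srv02'; Role = 'app' }
    [pscustomobject]@{ Hostname = 'srv03'; Role = 'app' }
)

# Remove repeated names so no template row can be selected twice
$uniqueServers = $target_servers | Select-Object -Unique

# One pass over the template: keep rows whose Hostname is in the list
$selected = @($template | Where-Object { $uniqueServers -contains $_.Hostname })

# In the real script, export once and without -Append, for example:
# $selected | Export-Csv -Path "$path\windows_prd.csv" -NoTypeInformation
$selected | Format-Table -AutoSize
```

Both srv01 rows are kept because they are distinct rows in the template; only the repeated lookup is removed. Dropping -Append also prevents rows from piling up when the script is run more than once.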

Export-Csv adding unwanted header double quotes

I have a source CSV file (without a header, all columns delimited by a comma) which I am trying to split out into separate CSV files based upon the value in the first column, using that column value as the output file name.
Input file:
S00000009,2016,M04 01/07/2016,0.00,0.00,0.00,0.00,0.00,0.00,750.00,0.00,0.00
S00000009,2016,M05 01/08/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000009,2016,M06 01/09/2016,0.00,0.00,0.00,0.00,0.00,0.00,600.00,0.00,0.00
S00000010,2015,W28 05/10/2015,2275.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
S00000010,2015,W41 04/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000010,2015,W42 11/01/2016,0.00,0.00,0.00,0.00,0.00,0.00,568.75,0.00,0.00
S00000012,2015,W10 01/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W11 08/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
S00000012,2015,W12 15/06/2015,0.00,0.00,0.00,0.00,0.00,0.00,650.00,0.00,0.00
My PowerShell script looks like this:
Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def |
Group-Object -Property "service_id" |
Foreach-Object {
$path = $_.Name + ".csv";
$_.group | Export-Csv -Path $path -NoTypeInformation
}
Output files:
S00000009.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000009","2016","M04 01/07/2016","0.00","0.00","0.00","0.00","0.00","0.00","750.00","0.00","0.00"
"S00000009","2016","M05 01/08/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
"S00000009","2016","M06 01/09/2016","0.00","0.00","0.00","0.00","0.00","0.00","600.00","0.00","0.00"
S00000010.csv:
"service_id","year","period","cash_exp","cash_inc","cash_def","act_exp","act_inc","act_def","comm_exp","comm_inc","comm_def"
"S00000010","2015","W28 05/10/2015","2275.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00","0.00"
"S00000010","2015","W41 04/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
"S00000010","2015","W42 11/01/2016","0.00","0.00","0.00","0.00","0.00","0.00","568.75","0.00","0.00"
It is generating the new files using the header value in column 1 (service_id).
There are 2 problems.
The output CSV file contains a header row which I don't need.
The columns are enclosed with double quotes which I don't need.
First of all, Export-Csv by default writes a header row and wraps every field in double quotes; that is how the cmdlet formats a CSV file. If you don't want them, you can post-process the output as plain text:
$temp = Import-Csv INPUT_FILE.csv -Header service_id,year,period,cash_exp,cash_inc,cash_def,act_exp,act_inc,act_def,comm_exp,comm_inc,comm_def | Group-Object -Property "service_id" |
Foreach-Object {
$path=$_.name+".csv"
$temp0 = $_.group | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$temp1 = $temp0.replace("""","")
$temp1 > $path
}
But this output is not a "real" csv file.
Hope that helps.
For your particular scenario you could probably use a simpler approach. Read the input file as a plain text file, group the lines by splitting off the first field, then write the groups to output files named after the groups:
Get-Content 'INPUT_FILE.csv' |
Group-Object { $_.Split(',')[0] } |
ForEach-Object { $_.Group | Set-Content ($_.Name + '.csv') }
Another solution, which uses plain numbers instead of named headers (they aren't wanted in the output anyway), avoids unnecessary temporary files, and removes only the field-delimiting double quotes:
Import-Csv INPUT_FILE.csv -Header (1..12) |
Group-Object -Property "1" | Foreach-Object {
($_.Group | ConvertTo-Csv -NoType | Select-Object -Skip 1).Trim('"') -replace '","',',' |
Set-Content -Path ("{0}.csv" -f $_.Name)
}
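On PowerShell 7 or newer, ConvertTo-Csv and Export-Csv accept a -UseQuotes parameter, which removes the need for string replacement. A minimal sketch, assuming PowerShell 7+; the two sample rows and the three column names are shortened stand-ins for the real input:

```powershell
# Hypothetical shortened input (3 columns instead of 12)
$lines = 'S00000009,2016,M04 01/07/2016', 'S00000010,2015,W28 05/10/2015'

$result = $lines | ConvertFrom-Csv -Header service_id, year, period |
    Group-Object service_id |
    ForEach-Object {
        # -UseQuotes Never suppresses quoting; -Skip 1 drops the header row
        $text = $_.Group | ConvertTo-Csv -UseQuotes Never | Select-Object -Skip 1
        # In the real script, write one file per service id instead:
        # Set-Content -Path ($_.Name + '.csv') -Value $text
        $text
    }
$result
```

Keep in mind that output written without quotes can no longer be parsed reliably if any field contains a comma or a double quote.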

How do I find text between two words and export it to txt.file

I have a CSV file which contains many lines and I want to take the text between <STR_0.005_Long>, and µm,5.000µm.
Example line from the CSV:
Straightness(Up/Down) <STR_0.005_Long>,4.444µm,5.000µm,,Pass,2.476µm,1.968µm,25,0.566µm,0.720µm
This is the script that I am trying to write:
$arr = @()
$path = "C:\Users\georgi\Desktop\5\test.csv"
$pattern = "(?<=.*<STR_0.005_Long>,)\w+?(?=µm,5.000µm*)"
$Text = Get-Content $path
$Text.GetType() | Format-Table -AutoSize
$Text[14] | Foreach {
if ([Regex]::IsMatch($_, $pattern)) {
$arr += [Regex]::Match($_, $pattern)
Out-File C:\Users\georgi\Desktop\5\test.txt -Append
}
}
$arr | Foreach {$_.Value} | Out-File C:\Users\georgi\Desktop\5\test.txt -Append
Use a Where-Object filter with your regular expression and simply output the match to the output file:
Get-Content $path |
Where-Object { $_ -match $pattern } |
ForEach-Object { $matches[0] } |
Out-File 'C:\Users\georgi\Desktop\5\test.txt'
Of course, since you have a CSV, you could simply use Import-Csv and export the value of that particular column:
Import-Csv $path | Select-Object -Expand 'column_name' |
Out-File 'C:\Users\georgi\Desktop\5\test.txt'
Replace column_name with the actual name of the column. If the CSV doesn't have a column header you can specify one via the -Header parameter:
Import-Csv $path -Header 'col1','col2','col3',... |
Select-Object -Expand 'col2' |
Out-File 'C:\Users\georgi\Desktop\5\test.txt'
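Note that \w does not match the decimal point, so the pattern from the question cannot capture a value such as 4.444. A small sketch of an adjusted pattern, checked only against the example line from the question (the escaped dots and the [\d.]+ character class are my changes, not the asker's):

```powershell
$line = 'Straightness(Up/Down) <STR_0.005_Long>,4.444µm,5.000µm,,Pass,2.476µm,1.968µm,25,0.566µm,0.720µm'

# The lookbehind anchors on the label, the lookahead on the unit and nominal value;
# [\d.]+ allows digits and the decimal point
$pattern = '(?<=<STR_0\.005_Long>,)[\d.]+(?=µm,5\.000µm)'

$value = [regex]::Match($line, $pattern).Value
$value
```

The same $pattern can be dropped into the Where-Object / $matches[0] pipeline shown earlier in this answer.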

Parse CSV against lines of a txt file and log lines not found

As a continuation of a script I'm running, working on the following.
I have a CSV file that has formatted information, example as follows:
File named Import.csv:
Name,email,x,y,z
\I\RS\T\Name1\c\x,email@jksjks,d,f
\I\RS\T\Name2\d\f,email@jsshjs,d,f
...
This file is large.
I also have another file called Note.txt.
Name1
Name2
Name3
...
With help from @mathias-r-jessen:
$Dir = PathToFile
$import = Import-Csv $Dir\import.csv
$NoteFile = "$Dir\Note.txt"
$Note = GC $NoteFile
$Import |Where-Object {$Note -contains $_.Name.Split('\')[4]} |Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
This code quickly and effortlessly parses the big csv and extracts every line that contains any of the lines in the $note file.
My next question is: how do I log any lines in the $note file that were not found in the CSV file?
I tried the following:
$result = $Import |Where-Object {$Note -contains $_.Name.Split('\')[4]} |Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
$Note | Where-Object {$result.Name.Split('\')[4] -notcontains $Note} | out-file $dir\not-found.log -append
This seems to return every line in $note.
@mathias-r-jessen any help you can provide would be appreciated.
You could use a Switch to do that.
Switch($Import){
{$Note -contains $_.Name.Split('\')[4]} {$_ | Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append; continue}
default {$_ | Export-csv "$Dir\Not-Found.csv" -NoType -Append}
}
The continue in the first option makes it so that if the first case is a match it performs the relevant action, and then continues to the next record. If the first case doesn't match it moves on to the default action, which outputs it to a different file.
I solved it by using the following:
$result = $Import |Where-Object {$Note -contains $_.Name.Split('\')[4]}
$result | Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
$matches = $note | where-object { $result.Name -match $_}
compare-object $note $matches |where-object {$_.SideIndicator -like "<="} | select -ExpandProperty InputObject | Out-file "$Dir\Not_found.txt" -Append
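A more direct way to find the note entries that never appear in the CSV is to collect the name segment of every row once and filter the note list against that set. A sketch with made-up sample data; in the real script $Import and $Note come from Import-Csv and Get-Content as above:

```powershell
# Hypothetical stand-ins for the imported CSV and the note file
$Import = @(
    [pscustomobject]@{ Name = '\I\RS\T\Name1\c\x'; email = 'a' }
    [pscustomobject]@{ Name = '\I\RS\T\Name2\d\f'; email = 'b' }
)
$Note = 'Name1', 'Name2', 'Name3'

# The element at index 4 of the split path is the name segment
$namesInCsv = $Import | ForEach-Object { $_.Name.Split('\')[4] } | Select-Object -Unique

# Note entries that do not occur in any CSV row
$notFound = @($Note | Where-Object { $namesInCsv -notcontains $_ })

# In the real script: $notFound | Out-File "$Dir\not-found.log"
$notFound
```

This also avoids assigning to $matches, which PowerShell itself overwrites after every -match operation.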

Breaking up CSV Files

So I am looking at breaking up a CSV using PowerShell. The CSV is delimited by |, which isn't a problem, and I am looking to break it up into multiple smaller CSVs while retaining the original. The breaks would occur based on the value in a single column containing one of a list of values.
What I have done so far is to import the csv (delimited by |) and then
foreach($line in $csv) {
if($columnValue -like $target1) {
export-csv filename1.csv -Delimiter `| $line -append)}
elseif($columnValue -like $target2) {
export-csv filename2.csv -Delimiter `| $line -append)}
etc.
However, I do not think it is exporting correctly, and I do not want there to be quotes (yes, I know this is standard, but I do not want them). I also want the header from the original CSV to be applied to the child CSVs, and it's not being applied.
Sorry if there's a better way to format the code, I'm still new here.
Here is where I suggest the awesomeness of the Switch statement. It tests each record against every condition and runs the script block of each condition that matches.
Switch($csv){
{$_.column -match $target1} {$_ | Export-CSV filename1.csv -append -delimiter '|'}
{$_.column -match $target2} {$_ | Export-CSV filename2.csv -append -delimiter '|'}
{$_.column -match $target3} {$_ | Export-CSV filename3.csv -append -delimiter '|'}
}
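Alternatively, import the file once and filter it per target value:
$data = Import-Csv $csvfile -Delimiter '|'
$data | ?{$_.val -eq $criteria1} | Export-Csv -Path "File1.csv" -Delimiter '|' -NoTypeInformation
$data | ?{$_.val -eq $criteria2} | Export-Csv -Path "File2.csv" -Delimiter '|' -NoTypeInformation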
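Grouping on the column avoids writing one filter per target value, and every child file gets the original header automatically. A sketch assuming a pipe-delimited input whose split column is called Region (a hypothetical name), with inline sample text in place of the real file; stripping every double quote is only safe if no field contains quotes or the delimiter:

```powershell
# Hypothetical pipe-delimited input; 'Region' stands in for the real split column
$csv = @'
Id|Region|Amount
1|East|10
2|West|20
3|East|30
'@ | ConvertFrom-Csv -Delimiter '|'

# Illustrative output folder under the temp directory
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'split_by_column'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

foreach ($group in $csv | Group-Object Region) {
    # ConvertTo-Csv emits the header line first, then the group's rows
    $lines = $group.Group | ConvertTo-Csv -Delimiter '|' -NoTypeInformation
    # Remove the quotes that ConvertTo-Csv adds around each field
    $lines -replace '"' | Set-Content (Join-Path $outDir ($group.Name + '.csv'))
}
```

The original file is only read, never modified, so it is retained as the question requires.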