Parse CSV while comparing against lines of a txt file and log lines not found - PowerShell

As a continuation of a script I'm running, I'm working on the following.
I have a CSV file that has formatted information, example as follows:
File named Import.csv:
Name,email,x,y,z
\I\RS\T\Name1\c\x,email@jksjks,d,f
\I\RS\T\Name2\d\f,email@jsshjs,d,f
...
This file is large.
I also have another file called Note.txt.
Name1
Name2
Name3
...
With help from @mathias-r-jessen
$Dir = PathToFile
$import = Import-Csv $Dir\import.csv
$NoteFile = "$Dir\Note.txt"
$Note = GC $NoteFile
$Import |Where-Object {$Note -contains $_.Name.Split('\')[4]} |Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
This code quickly parses the big CSV and extracts every row whose name matches any of the lines in the $Note file.
My next question is: how do I log any lines in the $Note file that were not found in the CSV file?
I tried the following:
$result = $Import |Where-Object {$Note -contains $_.Name.Split('\')[4]} |Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
$Note | Where-Object {$result.Name.Split('\')[4] -notcontains $Note} | out-file $dir\not-found.log -append
This seems to return every line in $note.
@mathias-r-jessen any help you can provide would be appreciated.

You could use a Switch to do that.
Switch($Import){
{$Note -contains $_.Name.Split('\')[4]} {$_ | Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append; continue}
default {$_ | Export-csv "$Dir\Not-Found.csv" -NoType -Append}
}
The continue in the first option makes it so that if the first case is a match it performs the relevant action, and then continues to the next record. If the first case doesn't match it moves on to the default action, which outputs it to a different file.
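To see how switch walks over a collection and how continue skips the remaining cases for the current item, here is a small sketch. The sample values and the $matched / $unmatched variables are made up for illustration; the real script would pipe to Export-Csv instead of collecting into arrays.

```powershell
# Hypothetical stand-ins for the name segments in the CSV and for Note.txt.
$names = 'Name1', 'Other', 'Name2'
$Note  = 'Name1', 'Name2'

$matched   = @()
$unmatched = @()

# switch enumerates the array and evaluates the cases once per element.
switch ($names) {
    { $Note -contains $_ } { $matched += $_; continue }   # continue: go to the next element
    default                { $unmatched += $_ }           # reached only when no case matched
}

"Matched:   $($matched -join ', ')"
"Unmatched: $($unmatched -join ', ')"
```

Note that the default branch here collects CSV rows that had no match in the note list, which is not the same thing as note lines that had no match in the CSV.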

I solved it by using the following:
$result = $Import |Where-Object {$Note -contains $_.Name.Split('\')[4]}
$result | Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
# $matches is an automatic variable in PowerShell, so a different name is used here
$found = $note | Where-Object { $result.Name -match $_ }
Compare-Object $note $found | Where-Object {$_.SideIndicator -eq "<="} | Select-Object -ExpandProperty InputObject | Out-File "$Dir\Not_found.txt" -Append
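Another way to get at the note lines that never show up in the CSV is to collect the name segments first and then filter the note list against them. This is only a sketch: it assumes the name of interest is always the fifth backslash-separated segment ([4]) as in the question, and the sample rows and the $foundNames / $notFound variables are invented for the example.

```powershell
# Sample data standing in for Import.csv and Note.txt (hypothetical values).
$Import = @'
Name,email,x,y
\I\RS\T\Name1\c\x,email1,d,f
\I\RS\T\Name2\d\f,email2,d,f
'@ | ConvertFrom-Csv
$Note = 'Name1', 'Name2', 'Name3'

# Collect the distinct name segments that actually occur in the CSV.
$foundNames = $Import | ForEach-Object { $_.Name.Split('\')[4] } | Select-Object -Unique

# Keep only the note lines that never occur in the CSV.
$notFound = @($Note | Where-Object { $foundNames -notcontains $_ })

# In the real script this could be piped to Out-File "$Dir\not-found.log".
$notFound
```

Unlike -match, -notcontains compares whole strings, so a note line such as Name1 would not accidentally count as found because of a CSV entry named Name10.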

Related

Powershell - Combine CSV files and append a column

I'm trying (badly) to work through combining CSV files into one file and prepending a column that contains the file name. I'm new to PowerShell, so hopefully someone can help here.
I tried initially to do the well documented approach of using Import-Csv / Export-Csv, but I don't see any options to add columns.
Get-ChildItem -Filter *.csv | Select-Object -ExpandProperty FullName | Import-Csv | Export-Csv CombinedFile.txt -UseQuotes Never -NoTypeInformation -Append
Next I'm trying to loop through the files and append the name, which kind of works, but for some reason this stops after the first row is generated. Since it's not a CSV process, I have to use the switch to skip the first title row of each file.
$getFirstLine = $true
Get-ChildItem -Filter *.csv | Where-Object {$_.Name -NotMatch "Combined.csv"} | foreach {
$filePath = $_
$collection = Get-Content $filePath
foreach($lines in $collection) {
$lines = ($_.Basename + ";" + $lines)
}
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "Combined.csv" $linesToWrite
}
This is where the -PipelineVariable parameter comes in real handy. You can set a variable to represent the current iteration in the pipeline, so you can do things like this:
Get-ChildItem -Filter *.csv -PipelineVariable File | Where-Object {$_.Name -NotMatch "Combined.csv"} | ForEach-Object { Import-Csv $File.FullName } | Select *,@{l='OriginalFile';e={$File.Name}} | Export-Csv Combined.csv -Notypeinfo
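If -PipelineVariable is new to you, this small sketch shows the idea without touching any files. The file names are just strings and the inner 1..2 stands in for the rows that Import-Csv would emit; both are placeholders invented for the example.

```powershell
# Each string plays the role of a file object coming from Get-ChildItem.
$labels = @(
    'first.csv', 'second.csv' |
        ForEach-Object -Process { $_ } -PipelineVariable File |
        ForEach-Object { 1..2 } |                 # stands in for the rows of that file
        ForEach-Object { "$File row $_" }         # $File still refers to the upstream item
)

$labels
```

Because the pipeline streams one item at a time, $File holds the "file" that produced the row currently passing through the later stages, which is what makes the calculated OriginalFile column above possible.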
Merging your CSVs into one and adding a column for the file's name can be done as follows, using a calculated property on Select-Object:
Get-ChildItem -Filter *.csv | ForEach-Object {
$fileName = $_.Name
Import-Csv $_.FullName | Select-Object @{
Name = 'FileName'
Expression = { $fileName }
}, *
} | Export-Csv path/to/merged.csv -NoTypeInformation

Remove unnecessary commas in a column in csv file by using PowerShell

I am trying to remove unnecessary commas in a column in a CSV file. For now, I know of a few problem values and have hard-coded them, but I want the code to be dynamic. Any suggestions are greatly appreciated.
$FilePath = "C:\Test\"
Get-ChildItem $FilePath -Filter *.csv | ForEach-Object {
(Get-Content $_.FullName -Raw) | Foreach-Object {
$_ -replace ',"Frederick, Fred",' , ',"Frederick Fred",' `
-replace ',"Brian, Josiah",' , ',"Brian Josiah",' `
-replace ',"Lisinopril ,Tablet / 20MG",' , ',"Lisinopril Tablet / 20MG",'
} | Set-Content $_.FullName
}
Try this. Note that I worked with the CSV sample you gave here, so it might not work with other CSV files.
Also make sure that you change %YOURCSVFILE% to the real path of your file.
#import the csv
$csv = Import-Csv -Path %YOURCSVFILE% -Delimiter ','
#going each row and replacing commas
foreach ($desc in $csv){
$desc.Desc = $desc.Desc -replace ',',''
}
#exporting the csv
$csv | Export-csv -NoTypeInformation "noCommas.csv"
Here are a few more alternatives for you:
Method 1. Loop through the rows with foreach(..) and capture the output:
$result = foreach ($row in (Import-Csv -Path 'D:\Test\FileWithCommasInDescription.csv')) {
$row.Desc = $row.Desc -replace ','
$row # output the updated item
}
$result | Export-Csv -Path 'D:\Test\FileWithoutCommasInDescription.csv' -NoTypeInformation
Method 2. Use ForEach-Object and the automatic variable $_. Pipe the results through:
Import-Csv -Path 'D:\Test\FileWithCommasInDescription.csv' | ForEach-Object {
$_.Desc = $_.Desc -replace ','
$_ # output the updated item
} | Export-Csv -Path 'D:\Test\FileWithoutCommasInDescription.csv' -NoTypeInformation
Method 3. Use a calculated property:
Import-Csv -Path 'D:\Test\FileWithCommasInDescription.csv' |
Select-Object ID, @{Name = 'Desc'; Expression = {$_.Desc -replace ','}}, Nbr -ExcludeProperty Desc |
Export-Csv -Path 'D:\Test\FileWithoutCommasInDescription.csv' -NoTypeInformation
All will result in a new CSV file
"ID","Desc","Nbr"
"12","Frederick Fred","11"
"21","Brian Josiah","31"
"13","Lisinopril Tablet / 20MG","17"
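If you don't know in advance which column holds the stray commas, one possible variation is to loop over every property of every row. This is a sketch only: it assumes the file is valid CSV with the comma-containing values quoted, as in the sample, and the two sample rows are copied from the output above.

```powershell
# In-memory sample using the ID, Desc and Nbr columns from the question.
$csv = @'
ID,Desc,Nbr
12,"Frederick, Fred",11
13,"Lisinopril ,Tablet / 20MG",17
'@ | ConvertFrom-Csv

foreach ($row in $csv) {
    # PSObject.Properties lists every column, so no column name is hard-coded.
    foreach ($prop in $row.PSObject.Properties) {
        $prop.Value = $prop.Value -replace ','
    }
}

$csv | ConvertTo-Csv -NoTypeInformation
```

With a real file you would start from Import-Csv and finish with Export-Csv -NoTypeInformation instead of the here-string and ConvertTo-Csv.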

powershell foreach shows duplicate result

I use powershell to automate extracting of selected data from a CSV file.
My $target_servers list also contains the same server name twice, but that server has different data in each of its rows.
Here is my code:
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
foreach($server in $target_servers) {
Import-Csv $path\Serverlist_Template.csv | Where-Object {$_.Hostname -Like $server} | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
}
After executing the above code it extracts CSV data based on a TXT file, but my problem is some of the results are duplicated.
I am expecting around 28 results but it gave me around 49.
As commented, -Append is the culprit here and you should check if the newly added records are not already present in the output file:
# read the Hostname column of the target csv file as array to avoid duplicates
$existingHostsNames = @((Import-Csv -Path "$path/windows_prd.csv").Hostname)
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
foreach($server in $target_servers) {
Import-Csv "$path\Serverlist_Template.csv" |
Where-Object {($_.Hostname -eq $server) -and ($existingHostsNames -notcontains $_.HostName)} |
Export-Csv -Path "$path/windows_prd.csv" -Append -NoTypeInformation
}
You can collect your data into an array of objects and then use select -Unique, like this:
$target_servers = Get-Content -Path D:\Users\Tools\windows\target_prd_servers.txt
$data = @()
foreach($server in $target_servers) {
$data += Import-Csv $path\Serverlist_Template.csv| Where-Object {$_.Hostname -Like $server}
}
$data | select -Unique | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
This works only if the duplicated rows have the same value in every column. If not, you can pass select the names of the columns that matter to you. For example:
$data | select Hostname -Unique | Export-Csv -Path $path/windows_prd.csv -Append -NoTypeInformation
This gives you a list of unique hostnames.
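If the extra rows come from the repeated name in the server list, as the question suggests, another option is to read the CSV once and test each row against the deduplicated list. This is a sketch with invented sample data; the Role column and the $wanted / $selected variables are not from the original script.

```powershell
# The list repeats srv1, as described in the question (sample values).
$target_servers = 'srv1', 'srv2', 'srv1'

# Stand-in for Serverlist_Template.csv; srv1 has two different rows.
$rows = @'
Hostname,Role
srv1,web
srv1,db
srv2,app
srv3,cache
'@ | ConvertFrom-Csv

# Deduplicate the names, then check every CSV row exactly once.
$wanted   = $target_servers | Select-Object -Unique
$selected = @($rows | Where-Object { $wanted -contains $_.Hostname })

$selected | ConvertTo-Csv -NoTypeInformation
```

Both srv1 rows are kept because they differ, but neither is written twice. Exporting $selected with a single Export-Csv call without -Append also avoids piling up rows from earlier runs.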

Blank first line when using "select-string -pattern" to strip lines from file

I have a simple text file that looks like this...
A,400000051115,null,null,null,null,null,null,null,20190312,090300,Answer Machine,2019,3,14,10,0
A,400000051117,null,null,null,null,null,null,null,20190312,090300,Confirmed,2019,3,14,10,30
A,400000051116,null,null,null,null,null,null,null,20190312,090300,Answer Machine,2019,3,14,11,0
A,400000051114,null,null,null,null,null,null,null,20190312,090300,Wants to Cancel,2019,3,14,9,0
A,400000051117,null,null,null,null,null,null,null,20190312,091800,SMS Sent,2019,3,14,10,30
A,400000051116,null,null,null,null,null,null,null,20190312,091800,SMS Sent,2019,3,14,11,0
A,400000051115,null,null,null,null,null,null,null,20190312,091800,SMS Sent,2019,3,14,10,0
A,400000051116,null,null,null,null,null,null,null,20190312,093000,Appointment Cancelled/Rescheduled Via SMS,2019,3,14,11,0
I need to save all the lines except those that have "SMS Sent" in them to a new file. I am using the following...
get-content $SourceFile.FullName | select-string -pattern 'SMS Sent' -notmatch | Out-File $targetFile
Why in the resulting file do I get a blank first line?
If you change Out-File $targetFile to Out-Host or even just omit that last segment in the pipeline, you will see a blank line in the console output, too.
The output analog of Get-Content is Set-Content, so if you change Out-File $targetFile to Set-Content $targetFile the first line is no longer blank.
Also, since you're working with a CSV file you could use Import-CSV to read the data and Where-Object to filter on that specific column, although a little extra work is required to specify the headers and omit them from the output file...
$csvHeaders = 1..17 | ForEach-Object -Process { "Column $_" }
$csvHeaders[11] = 'Status'
Import-Csv -Path $SourceFile.FullName -Header $csvHeaders `
| Where-Object -Property 'Status' -NE -Value 'SMS Sent' `
| ConvertTo-Csv -NoTypeInformation `
| Select-Object -Skip 1 `
| Set-Content $targetFile
...which writes...
"A","400000051115","null","null","null","null","null","null","null","20190312","090300","Answer Machine","2019","3","14","10","0"
"A","400000051117","null","null","null","null","null","null","null","20190312","090300","Confirmed","2019","3","14","10","30"
"A","400000051116","null","null","null","null","null","null","null","20190312","090300","Answer Machine","2019","3","14","11","0"
"A","400000051114","null","null","null","null","null","null","null","20190312","090300","Wants to Cancel","2019","3","14","9","0"
"A","400000051116","null","null","null","null","null","null","null","20190312","093000","Appointment Cancelled/Rescheduled Via SMS","2019","3","14","11","0"
...to $targetFile. Note that all of the values are quoted now. If your input file does have headers then you could use simply...
Import-Csv -Path $SourceFile.FullName `
| Where-Object -Property 'Status' -NE -Value 'SMS Sent' `
| Export-Csv -NoTypeInformation -LiteralPath $targetFile
In either case the output file will not contain a leading blank line.
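If you'd rather keep the file as plain text lines, you can also make sure that only strings reach the end of the pipeline. As far as I can tell, the blank line comes from Out-File formatting the MatchInfo objects that Select-String emits, so treat that explanation as an assumption. The three sample lines below are shortened, made-up versions of the ones in the question.

```powershell
$lines = 'A,1,Answer Machine', 'A,2,SMS Sent', 'A,3,Confirmed'

# Option 1: filter the strings directly, no Select-String involved.
$kept = @($lines | Where-Object { $_ -notmatch 'SMS Sent' })

# Option 2: keep Select-String but unwrap each MatchInfo to its Line text.
$keptViaSelectString = @(
    $lines | Select-String -Pattern 'SMS Sent' -NotMatch | ForEach-Object { $_.Line }
)

$kept
```

Either result is an array of plain strings that can be piped to Set-Content $targetFile.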

Splitting CSV file by two columns

Starting with a 500,000 line CSV, I need to split the file by day and hour (the second and third columns). I've tried to modify the grouping to include the hour, and while I see the hour get added to my filename, I get no results in the exported file.
The foreach doing the work:
foreach ($group in $data | Group Day,hour) {
$data | Where-Object { $_.Day -and $_.Hour -eq $group.Name }
ConvertTo-Csv -NoTypeInformation |
foreach {$_.Replace('"','')} |
Out-File "$Path\Testfile_$($group.name -replace $regexA, '').csv"
Sample Data:
Bob,1/27/2012,8:00,Basic,Operations
Charlie,2/3/2012,9:00,Advanced,Production
Bill,3/7/2012,10:00,Advanced,Production
You could import the CSV, determine the output filename on the fly, and append each record to the matching file:
Import-Csv 'C:\path\to\input.csv' | ForEach-Object {
$filename = ('output_{0}_{1}.csv' -f $_.Day, $_.Hour) -replace '[/:]'
$_ | Export-Csv "C:\path\to\$filename" -Append -NoType
}
Note that Export-Csv -Append requires PowerShell v3 or newer.
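If you want to stay with the grouping idea from the question, here is a sketch that uses the rows each group already carries instead of filtering $data again. The Name, Level and Dept headers and the extra Ann row are assumptions added for the example; only Day and Hour come from the question. It builds the file names in memory and leaves the actual export as a comment.

```powershell
$data = @'
Name,Day,Hour,Level,Dept
Bob,1/27/2012,8:00,Basic,Operations
Charlie,2/3/2012,9:00,Advanced,Production
Bill,3/7/2012,10:00,Advanced,Production
Ann,1/27/2012,8:00,Basic,Production
'@ | ConvertFrom-Csv

$files = foreach ($group in $data | Group-Object Day, Hour) {
    # With two grouping properties, Values holds the day and the hour separately.
    $day  = $group.Values[0]
    $hour = $group.Values[1]
    $name = ('output_{0}_{1}.csv' -f $day, $hour) -replace '[/:]'

    # To write the file: $group.Group | Export-Csv "C:\path\to\$name" -NoTypeInformation
    [pscustomobject]@{ FileName = $name; RowCount = $group.Group.Count }
}

$files
```

Each output file is then written once per group rather than appended to once per record.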