PowerShell: Write specific rows from files to formatted CSV

The following code gives me the correct output in the console, but I need it in a CSV file:
$array = @{}
$files = Get-ChildItem "C:\Temp\Logs\*"
foreach ($file in $files) {
    foreach ($row in (Get-Content $file | select -Last 2)) {
        if ($row -like "Total peak job memory used:*") {
            $sp_memory = $row.Split(" ")[5]
            $array.Add(($file.BaseName), ([double]$sp_memory))
            break
        }
    }
}
$array.GetEnumerator() | sort Value -Descending | Format-Table -AutoSize
current output (console):
required output (csv):
In order to increase performance I would like to avoid the array and write output directly to csv (no append).
Thanks in advance!

Change your last line to this:
$array.GetEnumerator() | sort Value -Descending |
    select @{l='FileName'; e={$_.Name}}, @{l='Memory (MB)'; e={$_.Value}} |
    Export-Csv -Path $env:USERPROFILE\Desktop\Output.csv -NoTypeInformation
This will give you a csv file named Output.csv on your desktop.
I am using calculated properties to change the column headers to FileName and Memory (MB), and piping the output of $array to the Export-Csv cmdlet.
Just to let you know, your variable $array is of type Hashtable which won't store duplicate keys. If you need to store duplicate key/value pairs, you can use arrays. Just suggesting! :)
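If you do want to drop the intermediate hashtable entirely, here is a minimal sketch that streams one object per file straight into Export-Csv (the output path C:\Temp\Output.csv is an assumption; note that Sort-Object still has to buffer all objects before it can sort):

Get-ChildItem "C:\Temp\Logs\*" | ForEach-Object {
    $file = $_
    # scan only the last two lines of each log for the memory line
    foreach ($row in (Get-Content $file.FullName | Select-Object -Last 2)) {
        if ($row -like "Total peak job memory used:*") {
            # emit one object per file; its property names become the CSV headers
            [pscustomobject]@{
                FileName      = $file.BaseName
                'Memory (MB)' = [double]$row.Split(" ")[5]
            }
            break
        }
    }
} | Sort-Object 'Memory (MB)' -Descending |
    Export-Csv "C:\Temp\Output.csv" -NoTypeInformation

This also sidesteps the duplicate-key limitation, since nothing is keyed on the file name.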

Related

Why Import-Csv's Sort-Object is slow for 1 million records

I need to sort on the first column (the column may differ) of my CSV files.
As my CSV files have more than a million records, executing the command below takes about 10 minutes.
Is there any other way to optimize the code to speed up execution?
$CsvFile = "D:\Performance\10_lakh_records.csv"
$OutputFile ="D:\Performance\output.csv"
Import-Csv $CsvFile | Sort-Object { $_.psobject.Properties.Value[1] } | Export-Csv -Encoding default -Path $OutputFile -NoTypeInformation
You could try using the [array]::Sort() static method, which might prove faster than Sort-Object, although it does take an extra step to first get a one-dimensional array of all the values to sort on.
Try
$CsvFile = "D:\Performance\10_lakh_records.csv"
$OutputFile = "D:\Performance\output.csv"
# import the data
$data = Import-Csv -Path $CsvFile
# determine the column name to sort on; in this demo, the first column.
# Of course, if you already know the column name, you can simply use it as-is.
$column = $data[0].PSObject.Properties.Name[0]
# use the Sort(Array, Array) overload method to sort the data by the
# values of the column you have chosen.
# see https://learn.microsoft.com/en-us/dotnet/api/system.array.sort?view=net-5.0#System_Array_Sort_System_Array_System_Array_
[array]::Sort($data.$column, $data)
$data | Export-Csv -Encoding default -Path $OutputFile -NoTypeInformation
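One caveat worth noting: Import-Csv reads every field as a string, so [array]::Sort() on $data.$column sorts lexically ('10' lands before '2'). If the sort column is numeric, a sketch along these lines, assuming all values parse as integers (use [double[]] for decimals), builds a typed key array first:

# build a parallel array of numeric keys, then sort the rows by those keys
$keys = [int[]]$data.$column
[array]::Sort($keys, $data)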

Need to remove specific portion from rows in a csv using powershell

I have a csv file with two columns and multiple rows, which has the information of files with folder location and its corresponding size, like below
"Folder_Path","Size"
"C:\MSSQL\DATA\UsersData\FTP.txt","21345"
"C:\MSSQL\DATA\UsersData\Norman\abc.csv","78956"
"C:\MSSQL\DATA\UsersData\Market_Database\123.bak","1234456"
What I want to do is remove the "C:\MSSQL\DATA\" part from every row in the CSV, keeping the rest of the folder path (starting from UsersData) and all other data intact, since this prefix is repetitive. So my CSV should look like this:
"Folder_Path","Size"
"UsersData\FTP.txt","21345"
"UsersData\Norman\abc.csv","78956"
"UsersData\Market_Database\123.bak","1234456"
What I am running is below:
Import-Csv ".\abc.csv" |
    Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path'.Split('C:\MSSQL\DATA\*')[0]}}, * |
    Export-Csv '.\output.csv' -NoTypeInformation
Any help is appreciated!
Seems like a job for a simple string replace:
Get-Content "abc.csv" | ForEach-Object { $_.Replace("C:\MSSQL\DATA\", "") } | Set-Content "output.csv"
or:
[System.IO.File]::WriteAllText("output.csv", [System.IO.File]::ReadAllText("abc.csv" ).Replace("C:\MSSQL\DATA\", ""))
This should work:
Import-Csv ".\abc.csv" |
Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path' -replace '^.*\\(.*\\.*)$', '$1'}}, Size |
Export-Csv '.\output.csv' -NoTypeInformation
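One caution: that regex captures only the last two path segments, so the deeper sample row would come out as "Norman\abc.csv" rather than "UsersData\Norman\abc.csv". If the goal is strictly to strip the fixed prefix, a sketch using a literal (escaped) replace keeps paths of any depth intact:

Import-Csv '.\abc.csv' |
    Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path' -replace ('^' + [regex]::Escape('C:\MSSQL\DATA\')), ''}}, Size |
    Export-Csv '.\output.csv' -NoTypeInformation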

Count unique numbers in CSV (PowerShell or Notepad++)

How to find the count of unique numbers in a CSV file? When I use the following command in PowerShell ISE
1,2,3,4,2 | Sort-Object | Get-Unique
I can get the unique numbers but I'm not able to get this to work with CSV files. If for example I use
$A = Import-Csv C:\test.csv | Sort-Object | Get-Unique
$A.Count
it returns 0. I would like to count unique numbers for all the files in a given folder.
My data looks similar to this:
Col1,Col2,Col3,Col4
5,,7,4
0,,9,
3,,5,4
And the result should be 6 unique values (preferably written inside the same CSV file).
Or would it be easier to do it with Notepad++? So far I have found examples only on how to count the unique rows.
You can try the following (PSv3+):
PS> (Import-CSV C:\test.csv |
ForEach-Object { $_.psobject.properties.value -ne '' } |
Sort-Object -Unique).Count
6
The key is to extract all property (column) values from each input object (CSV row), which is what $_.psobject.properties.value does;
-ne '' filters out empty values.
Note that, given that Sort-Object has a -Unique switch, you don't need Get-Unique (you need Get-Unique only if your input already is sorted).
That said, if your CSV file is structured as simply as yours, you can speed up processing by reading it as a text file (PSv2+):
PS> (Get-Content C:\test.csv | Select-Object -Skip 1 |
ForEach-Object { $_ -split ',' -ne '' } |
Sort-Object -Unique).Count
6
Get-Content reads the CSV file as an array of lines (strings).
Select-Object -Skip 1 skips the header line.
$_ -split ',' -ne '' splits each line into values by commas and weeds out empty values.
As for what you tried:
Import-CSV C:\test.csv | Sort-Object | Get-Unique:
Fundamentally, Sort-Object emits the input objects as a whole (just in sorted order), it doesn't extract property values, yet that is what you need.
Because no -Property argument is passed to Sort-Object to base the sorting on, it compares the custom objects that Import-Csv emits as a whole, by their .ToString() values, which happen to be empty,[1] so they all compare the same, and in effect no sorting happens.
Similarly, Get-Unique also determines uniqueness by .ToString() here, so that, again, all objects are considered the same and only the very first one is output.
[1] This may be surprising, given that using a custom object in an expandable string does yield a value: compare $obj = [pscustomobject] @{ foo = 'bar' }; $obj.ToString(); '---'; "$obj". This inconsistency is discussed in this GitHub issue.
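To cover the folder-wide part of the question, here is a minimal sketch, assuming all CSVs in the folder share this simple structure (the folder path C:\test is a placeholder); it writes one summary CSV rather than back into each source file:

Get-ChildItem 'C:\test\*.csv' | ForEach-Object {
    # count the distinct non-empty values in this file
    $count = (Get-Content $_.FullName | Select-Object -Skip 1 |
        ForEach-Object { $_ -split ',' -ne '' } |
        Sort-Object -Unique).Count
    [pscustomobject]@{ File = $_.Name; UniqueValues = $count }
} | Export-Csv 'C:\test\UniqueCounts.csv' -NoTypeInformation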

Powershell, Loop through CSV files and search for a string in a row, then Export

I have a directory on a server called 'servername'. In that directory, I have subdirectories whose name is a date. In those date directories, I have about 150 .csv file audit logs.
I have a partially working script that starts from inside the date directory, enumerates and loops through the CSVs, and searches for a string in a column. I'm trying to get it to export the row for each match, then go on to the next file.
$files = Get-ChildItem '\\servername\volume\dir1\audit\serverbeingaudited\20180525'
ForEach ($file in $files) {
    $Result = If (Import-Csv $file.FullName | Where { $_.'path/from' -like "*01May18.xlsx*" }) {
        $result | Export-Csv -Path c:\temp\output.csv -Append
    }
}
What I am doing is searching the 'path\from' column for a string - like a file name. The column contains data that is always some form of \folder\folder\folder\filename.xls. I am searching for a specific filename and for all instances of that file name in that column in that file.
My issue is getting that row exported; output.csv is always empty. I'd also like to start a directory 'up' and go through each date directory, parse, export, then go on to the next directory and files.
If I break it down to just one file and take it out of the If, it seems to give me a result, so I think I'm getting something wrong in the If or the ForEach, but apparently that's above my pay grade; I can't figure it out.
Thanks in advance for any assistance,
RichardX
The issue is your If block: when you say $Result = If () {$Result | ...}, you are saying that the new $Result is equal to what's returned from the If statement. Since $Result hasn't been defined yet, this is $Result = If () {$null | ...}, which is why you are getting a blank line.
The If block isn't even needed: you already filter your CSV with Where-Object, so just keep passing those objects down the pipeline to the export.
Since it sounds like you are running this against all the child folders of the parent, you could just use the -Recurse parameter of Get-ChildItem:
Get-ChildItem '\\servername\volume\dir1\audit\serverbeingaudited\' -File -Recurse |
    ForEach-Object {
        Import-Csv $_.FullName |
            Where-Object { $_.'path/from' -like "*01May18.xlsx*" }
    } | Export-Csv -Path c:\temp\output.csv
(I used a ForEach-Object loop rather than foreach just to demonstrate objects being passed down the pipeline in another way.)
Edit: Removed -Append per Bill_Stewart's suggestion. This will write out all entries for the recursed folders in one run, and will overwrite on the next run.
I don't see a need for appending the CSV file? How about:
Get-ChildItem '\\servername\volume\dir1\audit\serverbeingaudited\20180525' | ForEach-Object {
Import-Csv $_.FullName | Where-Object { $_.'path/from' -like '*01May18.xlsx*' }
} | Export-Csv 'C:\Temp\Output.csv' -NoTypeInformation
Assuming your CSVs are in the same format and your search text is unlikely to appear in any other column, you could use Select-String instead of Import-Csv. Instead of converting string to object and back to string again, you can just process strings. You would need to add an additional line to fake the header row, something like this:
$files = Get-ChildItem '\\servername\volume\dir1\audit\serverbeingaudited\20180525'
$result = @()
$result += Get-Content $files[0] -TotalCount 1
$result += ($files | Select-String -Pattern '01May18\.xlsx').Line
$result | Out-File 'c:\temp\output.csv'
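To start one directory up, as asked, the same string-based approach can be pointed at all the date folders at once. A sketch, assuming every audit CSV shares the header of the first file found:

$files = Get-ChildItem '\\servername\volume\dir1\audit\serverbeingaudited' -File -Recurse -Filter '*.csv'
$result = @()
$result += Get-Content $files[0].FullName -TotalCount 1   # reuse one header line
$result += ($files | Select-String -Pattern '01May18\.xlsx').Line
$result | Out-File 'c:\temp\output.csv'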

Parse line of text and match with parse of CSV

As a continuation of a script I'm running, I'm working on the following.
I have a CSV file that has formatted information, example as follows:
File named Import.csv:
Name,email,x,y,z
\I\RS\T\Name1\c\x,email@jksjks,d,f
\I\RS\T\Name2\d\f,email@jsshjs,d,f
...
This file is large.
I also have another file called Note.txt.
Name1
Name2
Name3
...
I'm trying to get the content of Import.csv, and for each line in Note.txt, if that line matches any line in Import.csv, copy the matching line into a CSV with append, continuing for every line that matches.
I need to find the best way to do it without having it import the CSV multiple times, since it is large.
What I got does the opposite though, I think:
$Dir = PathToFile
$import = Import-Csv $Dir\import.csv
$NoteFile = "$Dir\Note.txt"
$Note = GC $NoteFile
$Name = (($Import.Name).Split("\"))[4]
foreach ($j in $import) {
    foreach ($i in $Note) {
        $j | where { $Name -eq "$i" } | Export-Csv "$Dir\Result.csv" -NoTypeInfo -Append
    }
}
This takes too long and I'm not getting the extraction I need.
This takes too long and I'm not getting the extraction I need.
That's because you only assign $name once, outside of the outer foreach loop, so you're basically performing the same X comparisons for each line in the CSV.
I would rewrite the nested loops as a single Where-Object filter, using the -contains operator:
$Import | Where-Object { $Note -contains $_.Name.Split('\')[4] } | Export-Csv "$Dir\Result.csv" -NoTypeInformation -Append
Group the imported data by your distinguishing feature, filter the groups by name, then expand the remaining groups and write the data to the output file:
Import-Csv "$Dir\import.csv" |
    Group-Object { $_.Name.Split('\')[4] } |
    Where-Object { $Note -contains $_.Name } |
    Select-Object -ExpandProperty Group |
    Export-Csv "$Dir\Result.csv" -NoTypeInformation