How to add information to a row in a CSV with PowerShell?

I have a CSV file with 3 columns: SID, SAMAccountName, Enabled.
I also have a folder containing files whose names are the combination "UVHD-" + SID.
I am trying to update the CSV file with Length and LastWriteTime columns, so that it looks like this, for example:
SID SAMAccountName Enabled Length LastWriteTime
S-... FelixR False 205520896 02/02/2021 9:13:40
I tried many things and all failed; this is the best I could get:
Import-Csv $path\SID-ListTEST2.csv | select -ExpandProperty SID | ForEach-Object { Get-ChildItem -Path $path\"UVHD-"$_.vhdx | Export-Csv $path\SID-ListTEST2.csv -Append | where $_ }

Use calculated properties:
(
  Import-Csv $path\SID-ListTEST2.csv |
    Select-Object *,
      @{
        Name = 'LastWriteTime'
        Expression = { (Get-Item "$path\UVHD-$($_.SID).vhdx").LastWriteTime }
      }
) # | Export-Csv -NoTypeInformation -Encoding utf8 $path\SID-ListTEST2.csv
Outputs to the display; remove the # from the last line to export to a CSV file instead.
Note the (...) around the pipeline, which ensures that all output is collected up front, which is the prerequisite for saving the results back to the original input file. Note that the original character encoding isn't necessarily preserved - use -Encoding to specify the desired one.
This adds one additional property, LastWriteTime; construct the other ones analogously.
For improved performance, you could cache the result of the Get-Item call, so that it doesn't have to be repeated in every calculated property: In the simplest case, use ($script:file = Get-Item ...) in the first calculated property, which you can then reuse as $script:file (or just $file) in the subsequent ones. Note that the $script: scope modifier is necessary, because the script blocks of calculated properties run in child scopes.[1]
Note that if no matching file exists, the Get-Item call fails silently and the property value defaults to $null.
[1] Therefore, the more robust - but more cumbersome - approach would be to use Set-Variable -Scope 1 file (Get-Item ...) instead of $script:file = Get-Item ..., to ensure that the variable is created in the immediate parent scope, whatever it happens to be.
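A minimal sketch of the caching approach described above, assuming the same file layout as in the question and that $path is already defined (the Length property name comes from the desired output shown in the question):

(
  Import-Csv $path\SID-ListTEST2.csv |
    Select-Object *,
      @{
        Name = 'Length'
        # Cache the Get-Item result in the script scope for reuse below.
        Expression = { ($script:file = Get-Item "$path\UVHD-$($_.SID).vhdx").Length }
      },
      @{
        Name = 'LastWriteTime'
        # Reuse the cached file object instead of calling Get-Item again.
        # Caveat: if a row's file is missing, $script:file retains the
        # previous row's value.
        Expression = { $script:file.LastWriteTime }
      }
) # | Export-Csv -NoTypeInformation -Encoding utf8 $path\SID-ListTEST2.csv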

Related

PowerShell, can't get LastWriteTime

I have this working, but need LastWriteTime and can't get it.
Get-ChildItem -Recurse | Select-String -Pattern "CYCLE" | Select-Object Path, Line, LastWriteTime
I get an empty column and zero Date-Time data
Select-String's output objects, which are of type Microsoft.PowerShell.Commands.MatchInfo, only contain the input file path (string), no other metadata such as LastWriteTime.
To obtain it, use a calculated property, combined with the common -PipelineVariable parameter,
which allows you to reference the input file at hand in the calculated property's expression script block as a System.IO.FileInfo instance as output by Get-ChildItem, whose .LastWriteTime property value you can return:
Get-ChildItem -File -Recurse -PipelineVariable file |
  Select-String -Pattern "CYCLE" |
  Select-Object Path,
                Line,
                @{
                  Name = 'LastWriteTime'
                  Expression = { $file.LastWriteTime }
                }
Note how the pipeline variable, $file, must be passed without the leading $ (i.e., as file) as the -PipelineVariable argument. -PipelineVariable can be abbreviated to -pv.
LastWriteTime is a property of System.IO.FileSystemInfo, which is the base type of the items Get-ChildItem returns for the Filesystem provider (which is System.IO.FileInfo for files). Path and Line are properties of Microsoft.PowerShell.Commands.MatchInfo, which contains information about the match, not the file you passed in. Select-Object operates on the information piped into it, which comes from the previous expression in the pipeline, your Select-String in this case.
You can't do this as a (well-written) one-liner if you want the file name, line match, and the last write time of the actual file to be returned. I recommend using an intermediary PSCustomObject for this and we can loop over the found files and matches individually:
# Use -File to only get file objects
$foundMatchesInFiles = Get-ChildItem -Recurse -File | ForEach-Object {
  # Assign $PSItem/$_ to $file since we will need it in the second loop
  $file = $_
  # Run Select-String on each found file
  $file | Select-String -Pattern CYCLE | ForEach-Object {
    [PSCustomObject]@{
      Path              = $_.Path
      Line              = $_.Line
      FileLastWriteTime = $file.LastWriteTime
    }
  }
}
Note: I used a slightly altered name of FileLastWriteTime to exemplify that this comes from the returned file and not the match provided by Select-String, but you could use LastWriteTime if you wish to retain the original property name.
Now $foundMatchesInFiles will be a collection of custom objects, one per match of CYCLE, each containing the path of the file the match occurred in (as returned by Select-String), the matching line, and the last write time of the file itself as returned by the initial Get-ChildItem.
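For example, to display the merged results and export them (the output file name here is just an illustration):

# Display the merged results as a table, then export them like any objects.
$foundMatchesInFiles | Format-Table Path, Line, FileLastWriteTime -AutoSize
$foundMatchesInFiles | Export-Csv .\cycleMatches.csv -NoTypeInformation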
Additional considerations
You could also use Select-Object and computed properties, but IMO the above is a more concise approach when merging properties from unrelated objects together. While not a poor approach, Select-Object outputs data with a type containing the original object type name (e.g. Selected.Microsoft.PowerShell.Commands.MatchInfo). The code may work fine, but it can cause some confusion when others who may consume this object in the future inspect the output members. LastWriteTime, for example, belongs to FileSystemInfo, not MatchInfo. Another developer may not understand where the property came from at first if it has the MatchInfo type referenced. It is generally a better design to create a new object with the merged properties.
That said, this is a minor issue which largely comes down to stylistic preference and whether this object might be consumed by others aside from you. I write modules and scripts that many other teams in my organization consume, so this is a concern for me. It may not be for you. @mklement0's answer is an excellent example of how to use computed properties with Select-Object to achieve the same functional result as this answer.

Compare folder to a hash file

There are a lot of questions and answers about comparing the hashes of two folders for integrity, like this one. Assume I have a folder that I copied to a backup medium (external drive, flash, optical disc) and would like to delete the original to save space.
What is the best way to save the original folder's hashes (before deletion), in a text file perhaps, and check the backup's integrity much later against that file?
Note that if you delete the originals first and later find that the backup lacks integrity, so to speak, all you'll know is that something went wrong; the non-corrupted data will be gone.
You can record 2 columns, RelativePath (the file path relative to the input directory) and Hash, and save them to a CSV file with Export-Csv:
$inputDir = 'C:\path\to\dir' # Note: specify a *full* path.
$prefixLen = $inputDir.Length + 1

Get-ChildItem -File -Recurse -LiteralPath $inputDir |
  Get-FileHash |
  Select-Object @{
                  Name       = 'RelativePath'
                  Expression = { $_.Path.Substring($prefixLen) }
                },
                Hash |
  Export-Csv originalHashes.csv -NoTypeInformation -Encoding utf8
Note: In PowerShell (Core) 7+, neither -NoTypeInformation nor -Encoding utf8 is needed, though the file will then have no UTF-8 BOM; use -Encoding utf8bom if you want one. Conversely, in Windows PowerShell you invariably get a BOM.
Note:
The Microsoft.PowerShell.Commands.FileHashInfo instances output by Get-FileHash also have an .Algorithm property naming the hashing algorithm that was used ('SHA256' by default, or as specified via the -Algorithm parameter).
If you want this property included (its value will be the same for all CSV rows), simply add Algorithm to the array of properties passed to Select-Object above, as shown in the sketch after these notes.
Note how a hashtable (@{ ... }) passed as the first property argument to Select-Object serves as a calculated property that derives the relative path from each .Path property value (which contains the full path).
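A sketch of the same command with the Algorithm column included:

Get-ChildItem -File -Recurse -LiteralPath $inputDir |
  Get-FileHash |
  Select-Object @{
                  Name       = 'RelativePath'
                  Expression = { $_.Path.Substring($prefixLen) }
                },
                Algorithm, # same value on every row; 'SHA256' by default
                Hash |
  Export-Csv originalHashes.csv -NoTypeInformation -Encoding utf8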
You can later apply the same command to the backup directory tree, saving to, say, backupHashes.csv, and compare the two CSV files with Compare-Object:
Compare-Object (Import-Csv -LiteralPath originalHashes.csv) `
               (Import-Csv -LiteralPath backupHashes.csv) `
               -Property RelativePath, Hash
Note: There's no strict need to involve files in the operation: one or both output collections can be captured in memory and used directly in the comparison. Just omit the Export-Csv call in the command above and save to a variable ($originalHashes = Get-ChildItem ...), as in the sketch below.
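A sketch of the purely in-memory variant (both paths are placeholders):

# Assumption: adjust both paths to your actual original and backup trees.
$inputDir  = 'C:\path\to\dir'
$backupDir = 'D:\path\to\backup'

$prefixLen = $inputDir.Length + 1
$originalHashes = Get-ChildItem -File -Recurse -LiteralPath $inputDir |
  Get-FileHash |
  Select-Object @{ Name = 'RelativePath'; Expression = { $_.Path.Substring($prefixLen) } }, Hash

$prefixLen = $backupDir.Length + 1
$backupHashes = Get-ChildItem -File -Recurse -LiteralPath $backupDir |
  Get-FileHash |
  Select-Object @{ Name = 'RelativePath'; Expression = { $_.Path.Substring($prefixLen) } }, Hash

# Any output indicates a missing, extra, or corrupted file.
Compare-Object $originalHashes $backupHashes -Property RelativePath, Hash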

Create Csv with loop and output

This basically works
foreach ($cprev in $CopyPreventeds) {
  Write-Host "prevented copy $($cprev.Name)"
  $cprev | Select-Object Path, Name, Length, LastWrite, DestinationNewer |
    Export-Csv '.\prevented.csv' -NoTypeInformation
}
But only the last object is written to the CSV. How can I write all of them to a new CSV while also showing output to the user in PowerShell?
Maybe I'm missing something?
While I appreciate that a solution has already been proposed in the comments, I have to ask: given the narrow scope of the question, why use an obscure, albeit clever, technique, and/or repeatedly invoke Export-Csv?
The question doesn't mention sparing a variable. Moreover, there doesn't appear to be a need for the foreach loop:
$CopyPreventeds |
  Select-Object Path, Name, Length, LastWrite, DestinationNewer |
  Export-Csv '.\prevented.csv' -NoTypeInformation
In the above, $CopyPreventeds already exists and remains so, unmolested, after the export. You would only need to output it again for the benefit of an interactive user, all while taking advantage of PowerShell's intuitive pipeline and features.
Moreover, since the iteration variable $cprev isn't needed, you are down one variable.
Note: You don't need -Append because you are streaming into a single Export-Csv command, as opposed to repeatedly invoking it.
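For completeness, the minimal fix to the original loop would have been to add -Append, i.e. the repeated-invocation approach advised against above; a sketch:

# Remove any previous output file first, since -Append accumulates across runs.
Remove-Item '.\prevented.csv' -ErrorAction Ignore
foreach ($cprev in $CopyPreventeds) {
  Write-Host "prevented copy $($cprev.Name)"
  # -Append prevents each iteration from overwriting the previous one.
  $cprev | Select-Object Path, Name, Length, LastWrite, DestinationNewer |
    Export-Csv '.\prevented.csv' -NoTypeInformation -Append
}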
There are at least 2 ways (probably many more) you could conveniently output to an interactive user.
1: Echo a header, something like "The following copies were prevented:", then echo the variable $CopyPreventeds, presumably as a table.
Note that, given the multiple points at which you seem interested in only a subset of properties, you may want to trim those objects beforehand:
$CopyPreventeds = $CopyPreventeds |
  Select-Object Path, Name, Length, LastWrite, DestinationNewer

$CopyPreventeds | Export-Csv '.\prevented.csv' -NoTypeInformation

Write-Host "The following copies were prevented:"
$CopyPreventeds | Format-Table -AutoSize | Out-Host
Note: More than 4 properties in a [PSCustomObject] (resulting from Select-Object), where custom formatting hasn't been defined, will by default output as a list, so use Format-Table to overcome that. Out-Host is then used to prevent pipeline pollution.
2: Return to using a ForEach-Object loop for the output, placed between the Select-Object and Export-Csv commands:
$CopyPreventeds |
  Select-Object Path, Name, Length, LastWrite, DestinationNewer |
  ForEach-Object {
    "Prevented Copy : {0}, {1}, {2}, {3}, {4}" -f $_.Path, $_.Name, $_.Length, $_.LastWrite, $_.DestinationNewer |
      Write-Host
    $_
  } |
  Export-Csv '.\prevented.csv' -NoTypeInformation
In this example, when you are done outputting to the screen (admittedly a little messy), you emit $_ from the loop, thus piping it to Export-Csv just the same.
Note: there are a number of ways to construct strings; I chose the -f operator here because it's a little cleaner than embedding numerous $() subexpressions. And of course this assumes you want the prefix on every line, which I personally think is gratuitous, so I'd choose something more like #1.

PowerShell performance tuning for aggregation operation on big delimited files

I have a delimited file with 350 columns. The delimiter is \034 (the field-separator character).
I have to extract a particular column's values and count each distinct value's occurrences in the file. If a distinct value's count is greater than or equal to 2, I need to output it to a file.
The source file is 1 GB. I have written the following command. It is very slow.
Get-Content E:\Test\test.txt | Foreach {($_ -split '\034')[117]} | Group-Object -Property { $_ } | %{ if($_.Count -ge 2) { Select-Object -InputObject $_ -Property Name,Count} } | Export-csv -Path "E:\Test\test2.csv" -NoTypeInformation
Please help!
I suggest using a switch statement to process the input file quickly (by PowerShell standards):
# Get an array of all the column values of interest.
$allColValues = switch -File E:\Test\test.txt {
  default { # each input line
    # For better performance with *literal* separators,
    # use the .Split() *method*.
    # Generally, however, use of the *regex*-based -split *operator* is preferable.
    $_.Split([char] 0x1c)[117] # hex 0x1c is octal 034
  }
}
# Group the column values, and only output those that occur at least
# twice.
$allColValues | Group-Object -NoElement | Where-Object Count -ge 2 |
  Select-Object Name, Count | Export-Csv E:\Test\test2.csv -NoTypeInformation
Tip of the hat to Mathias R. Jessen for suggesting the -NoElement switch, which streamlines the Group-Object call by only maintaining abstract group information; that is, only the grouping criteria (as reflected in .Name), not also the individual objects that make up the group (as normally reflected in .Group), are returned via the output objects.
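To illustrate -NoElement with a trivial example:

'a','b','a' | Group-Object -NoElement
# Count Name
# ----- ----
#     2 a
#     1 b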
As for what you tried:
Get-Content with line-by-line streaming in the pipeline is slow, both generally (the object-by-object passing introduces overhead) and, specifically, because Get-Content decorates each line it outputs with ETS (Extended Type System) metadata.
GitHub issue #7537 proposes adding a way to opt-out of this decoration.
At the expense of memory consumption and potentially additional work for line-splitting, the -Raw switch reads the entire file as a single, multi-line string, which is much faster (see the sketch after these notes).
Passing -Property { $_ } to Group-Object isn't necessary - just omit it. Without a -Property argument, the input objects are grouped as a whole.
Chaining Where-Object and Select-Object - rather than filtering via an if statement in a ForEach-Object call combined with multiple Select-Object calls - is not only conceptually clearer, but performs better.
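A sketch of the -Raw variant mentioned above, which trades memory (the whole 1 GB file is read at once) for speed:

# Read the entire file as one string, split it into lines in memory,
# then extract column 117 from each line.
$lines = (Get-Content -Raw E:\Test\test.txt) -split '\r?\n'
$allColValues = foreach ($line in $lines) {
  if ($line) { $line.Split([char] 0x1c)[117] } # skip a trailing empty line
}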

Import-CSV returns null when piped files

I have a piece of code that should grab all the .csv files out of a directory and import them, using the pipe character as the delimiter:
$apeasy = dir .\APEasy\*.csv | Import-CSV -delimiter '|'
The problem is this returns null, without exception, no matter what I do.
The weird thing is that this works:
dir .\APEasy\*.csv
It returns a FileInfo object, which SHOULD be getting piped into Import-CSV as the file to import. In addition, these two commands work:
$csvFiles = dir .\Processed_Data_Review -Filter *.txt | Import-CSV -header(1..19) -delimiter "$([char]0x7C)"
dir .\LIMS -Filter *.csv | Import-CSV | ? {$_.SampleName -like "????-*"}| Export-CSV -Path .\lims_output.txt -NoTypeInformation
I really have no idea what's going on here. I'm dealing with a basic pipe-delimited file with quotation marks around every field (which is fine; I can import the data with those). Nothing special going on here. The file is THERE; Import-CSV just isn't GETTING it for some reason.
So my question is this: What could cause a file grabbed by 'dir' to fail to be piped into Import-CSV?
EDIT: The overall goal of this is to read the CSV files in a directory without knowing their name in advance, and output specific columns into a variety of output files.
EDIT: This is the line of code as it stands right now:
$apeasy = Get-ChildItem .\APEasy\*.csv | Select-Object -ExpandProperty FullName | Import-CSV -delimiter "$([char]0x7C)"
Isolating the Get-ChildItem statement, and isolating Get-ChildItem plus Select-Object, both return what they should: a list of CSV files in the directory, and an array of their full paths, respectively. Still, when they get piped into Import-CSV, they disappear. Get-Member on the variable reports that it's empty.
Import-Csv accepts only strings (paths) from the pipeline, so in order to pipe directly to it you need to first expand the paths:
dir .\APEasy\*.csv |
select -expand fullname |
Import-CSV -delimiter '|'
Although cmdlets like Get-Content work that way, in that they can accept the Path parameter by property name (and LiteralPath by value, which makes sense), Import-Csv is a little inconsistent: it only accepts the path to import by value:
-Path <String[]>
    Specifies the path to the CSV file to import. You can also pipe a path to Import-Csv.

    Required?                    false
    Position?                    1
    Default value                None
    Accept pipeline input?       true (ByValue)
    Accept wildcard characters?  false
So you could use
Get-ChildItem ... | Select-Object -ExpandProperty FullName | Import-Csv
but piping the file objects themselves won't work directly.
This is a known bug in Import-Csv - even though you should be able to pipe Get-ChildItem (a.k.a dir) output directly to Import-Csv, that is still broken as of Windows PowerShell v5.1 / PowerShell Core v6.0.2.
Outputting the files' .PSPath property value is a way of working around the problem (.FullName works too): the input objects' .PSPath property is what should be bound to Import-Csv's -LiteralPath parameter, but currently isn't.
Since you're collecting all input in memory anyway, you can simply use member-access enumeration (PSv3+) to access the .PSPath property on all matching files:
$apeasy = (Get-ChildItem .\APEasy\*.csv).PSPath | Import-CSV -delimiter '|'
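Alternatively, a sketch of a per-file workaround that binds -LiteralPath explicitly and so also sidesteps the bug:

$apeasy = Get-ChildItem .\APEasy\*.csv | ForEach-Object {
  # Bind the path explicitly; -LiteralPath avoids wildcard interpretation.
  Import-Csv -LiteralPath $_.FullName -Delimiter '|'
}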