How can gci (get child item) print file contents? - powershell

Searching the web for a findstr equivalent for PowerShell, I found this site, which suggests using the cmdlets gci (get child items) and select-string. However, gci doesn't print the content of a file; it prints the directory content. How does the pipelining work in this case? How can gci and select-string filter the content of a file (without piping it to get-content first)?

Select-String accepts pipeline input. When you pipe FileInfo objects, they bind to the InputObject parameter. The following two commands are equivalent:
PS> Get-ChildItem C:\test.txt | Select-String -Pattern logfile
PS> Select-String -InputObject (Get-ChildItem C:\test.txt) -Pattern logfile

The select-string cmdlet receives a System.IO.FileInfo object from the pipeline. Thus it is able to determine which of its parameters are file names and which are the strings to look for. See Select-String at TechNet.
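As a minimal sketch of the behavior described above, the snippet below pipes a single FileInfo object into Select-String and shows that the file's lines are searched, not its name. The scratch file name and its contents are assumptions made up for the demo.

```powershell
# Hypothetical scratch file; the name and contents are made up for this demo.
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) "gci-demo-$PID.txt"
Set-Content -Path $demoFile -Value 'first line', 'the logfile is here', 'last line'

# Get-ChildItem emits a FileInfo object; Select-String opens that file and searches its lines.
$hits = @(Get-ChildItem -Path $demoFile | Select-String -Pattern 'logfile')

# Each hit is a MatchInfo object carrying the file path, line number, and matching line.
$hits | ForEach-Object { '{0}:{1}: {2}' -f $_.Path, $_.LineNumber, $_.Line }

# Clean up the scratch file.
Remove-Item -Path $demoFile
```

No Get-Content call is needed anywhere in the pipeline; the MatchInfo objects already carry the matching line text.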

Related

Powershell how to get-content across several subfolders

I'm working on a script to output some data from multiple files based on a string search. It outputs the string found, followed by the following six characters. I can get this to work for an exact location. However, I want to search across files inside multiple subfolders in the path. Using the below script, I get PermissionDenied errors...
[regex] $pattern = '(?<=(a piece of text))(?<chunk>.*)'
Get-Content -Path 'C:\Temp\*' |
    ForEach-Object {
        if ($_ -match $pattern) {
            $smallchunk = $matches.chunk.substring(0, 6)
        }
    }
"$smallchunk" | Out-File 'C:\Temp\results.txt'
If I change -Path to one of the subfolders, it works fine, but I need it to go inside each subfolder and execute the get-content.
e.g., look inside...
C:\Temp\folder1\*
C:\Temp\folder2\*
C:\Temp\folder3\*
And so on...
Following up on boxdog's suggestion of Select-String, the only limitation would be folder recursion. Unfortunately, Select-String only allows the searching of multiple files in one directory.
So, the way around this is piping the output of Get-ChildItem with a -Recurse switch into Select-String:
$pattern = "(?<=(a piece of text))(?<chunk>.*)"
Get-ChildItem -Path "C:\Temp\" -Exclude "results.txt" -File -Recurse |
    Select-String -Pattern $pattern |
    ForEach-Object -Process {
        $_.Matches[0].Groups['chunk'].Value.Substring(0,6)
    } | Out-File -FilePath "C:\Temp\results.txt"
If you need the result saved to $smallchunk, you can still assign it inside the loop.
Abraham Zinala's helpful answer is the best solution to your problem, because letting Select-String search your files' content is faster and more memory-efficient than reading and processing each line with Get-Content.
As for what you tried:
Using the below script I get PermissionDenied errors...
These stem from directories being among the file-system items matched by C:\Temp\*, which Get-Content cannot read.
If your files have distinct filename extensions that your directories don't, one option is to pass them to the (rarely used with Get-Content) -Include parameter; e.g.:
Get-Content -Path C:\Temp\* -Include *.txt, *.c
However, as with Select-String, this limits you to a single directory's content, and it gives you no general way to limit processing to files when extension-based filtering isn't possible.
For recursive listing, you can use Get-ChildItem with -Recurse, as in Abraham's answer, and pipe the file-info objects to Get-Content:
Get-ChildItem -Recurse C:\Temp -Include *.txt, *.c | Get-Content
If you want to simply limit output to files, whatever their name is, use the -File switch (similarly, -Directory limits output to directories):
Get-ChildItem -File -Recurse C:\Temp | Get-Content

Why does "get-childItem -recurse | select-string foo" not result in an error if there are subdirectories?

Trying to use select-string on a directory results in an error:
PS C:\> select-string pattern P:\ath\to\directory
select-string : The file P:\ath\to\directory cannot be read: Access to the path 'P:\ath\to\directory' is denied.
However, when I use get-childItem -recurse, the command finishes without problem:
PS C:\> get-childItem -recurse P:\ath\to | select-string pattern
This surprises me because get-childItem -recurse also passes directories down the pipeline. So, I'd have assumed that select-string raises the same error when it processes the first directory.
Since this is not the case, I wonder where or how the directory is filtered out in the pipeline.
TL;DR: object magic.
When Get-ChildItem is run, it returns a collection of objects, which is passed to Select-String. The cmdlet's source is available on GitHub. Its ProcessRecord() method processes the input objects and contains a few checks for the object type and for whether a path is a directory. Like so:
if (_inputObject.BaseObject is FileInfo fileInfo)
...
if (expandedPathsMaybeDirectory && Directory.Exists(filename))
...
Thus, it doesn't matter that Get-ChildItem's collection contains both DirectoryInfo and FileInfo objects; the cmdlet is smart enough to figure out what it's trying to consume.
The -Path parameter of Select-String expects paths to files:
-Path
Specifies the path to the files to search. Wildcards are permitted. The default location is the local directory.
Specify files in the directory, such as log1.txt, *.doc, or *.*. If you specify only a directory, the command fails.
That's why you get an error, when you pass a path to a directory to the -Path parameter.
When you pipe objects to Select-String, they do not necessarily need a Path attribute that can be mapped to -Path of Select-String:
You can pipe any object that has a ToString method to Select-String.
So you can also pipe raw strings to Select-String:
'test' | Select-String -Pattern 'test'
This line will return test.
Get-ChildItem returns objects of type System.IO.FileInfo or System.IO.DirectoryInfo. Select-String will use the Path attribute of System.IO.FileInfo to parse the content of the file, and it will use the string representation of System.IO.DirectoryInfo to parse exactly this string representation (the path itself). That's why you don't get errors in your pipeline.
New-Item -Name 'test' -Type Directory | Select-String -Pattern 'test'
This line will return the path to the just created directory test.

List file names in a folder matching a pattern, excluding file content

I am using the command below to recursively list all files in a folder that contain $pattern:
Get-ChildItem $targetDir -recurse | Select-String -pattern "$pattern" | group path | select name
But it seems to list both files that have $pattern in their name and files that have it in their content; e.g., when I run the above with $pattern="SAMPLE" I get:
C:\tmp\config.include
C:\tmp\README.md
C:\tmp\specs\SAMPLE.data.nuspec
C:\tmp\specs\SAMPLE.Connection.nuspec
Now:
C:\tmp\config.include
C:\tmp\README.md
indeed contain the SAMPLE keyword, but I don't care about that; I only need the command to list files whose names match the pattern, not files whose content matches it. What am I missing?
Based on the below answers I have also tried:
$targetDir="C:\tmp\"
Get-ChildItem $targetDir -recurse | where {$_.name -like "SAMPLE"} | group path | select name
and:
$targetDir="C:\tmp\"
Get-ChildItem $targetDir -recurse | where {$_.name -like "SAMPLE"} | select name
but it does not return any results.
Select-String is doing what you told it to. Emphasis mine.
The Select-String cmdlet searches for text and text patterns in input strings and files.
So if you just want to match file names, use the -Filter parameter of Get-ChildItem or post-process with Where-Object:
Get-ChildItem -Path $path -Recurse -Filter "*sample*"
That should return all files and folders that have sample in their name. If you just wanted files or directories you would be able to use the switches -File or -Directory to return those specific object types.
If your pattern is more complicated than a simple word then you might need to use Where-Object like in Itchydon's answer with something like -match giving you access to regex.
The grouping logic in your code should be redundant since you are returning single files that all have unique paths. Therefore I have not included that here. If you just want the paths then you can pipe into Select-Object -Expand FullName or just (Get-ChildItem -Path $path -Recurse -Filter "*sample*").Fullname
get-ChildItem $targetDir -recurse | where {$_.name -like $pattern} | select name
To complement Matt's helpful answer:
Specifically, because what you're piping to Select-String are [System.IO.FileInfo] objects - which is what Get-ChildItem outputs - rather than strings, it is the contents of the files represented by these objects that is being searched.
Assuming that you need to match only the file name part of each file's path and that your pattern can be expressed as a wildcard expression, you do not need Select-String at all and can instead use Get-ChildItem with -Filter, as in Matt's answer, or the slower, but slightly more powerful -Include.
Caveat:
Select-String -Pattern accepts a regular expression (e.g., .*sample.*; see Get-Help about_Regular_Expressions),
whereas Get-ChildItem -Filter/-Include accepts a wildcard expression (e.g., *sample*; see Get-Help about_Wildcards) - they are different things.
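The distinction can be checked with the -like operator (wildcard expressions) and the -match operator (regular expressions). The sketch below is only an illustration; the file name is borrowed from the sample output in the question, and the variable names are hypothetical. It also suggests why a bare "SAMPLE" with -like returns nothing.

```powershell
# Hypothetical file name, borrowed from the question's sample output.
$name = 'SAMPLE.data.nuspec'

# Wildcard expressions: without wildcard characters, -like must match the whole string.
$likeBare = $name -like 'SAMPLE'       # False
$likeWild = $name -like '*SAMPLE*'     # True

# Regular expressions: -match finds the pattern anywhere in the string.
$matchBare     = $name -match 'SAMPLE'      # True
$matchAnchored = $name -match '^SAMPLE\.'   # True; anchors and escapes are regex-only syntax

# Print the four results.
$likeBare, $likeWild, $matchBare, $matchAnchored
```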
On a side note: If your intent is to match files only, you can tell Get-ChildItem to restrict output to files (as opposed to potentially also directories) using -File (analogously, you can limit output to directories with -Directory).
Group-Object path (group path) will not work as intended, because the .Path property of the match-information objects output by Select-String contains the full filename, so you'd be putting each file in its own group - essentially, a no-op.
When using just Get-ChildItem, the equivalent property name would be .FullName, but what you're looking for is to group by parent path (the containing directory's path, .DirectoryName), I presume; therefore:
... | Group-Object DirectoryName | Select-Object Name
This outputs the full path of each directory that contains at least 1 file with a matching file name.
(Note that the Name in Select-Object Name refers to the .Name property of the group objects returned by Group-Object, which in this case is the value of the .DirectoryName property on the input objects.)
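A self-contained sketch of the grouping described above, run against a throwaway directory tree. The tree layout loosely mirrors the question's sample output; the temp location and variable names are assumptions for the demo.

```powershell
# Hypothetical scratch tree loosely mirroring the question's layout.
$root  = Join-Path ([System.IO.Path]::GetTempPath()) "group-demo-$PID"
$specs = Join-Path $root 'specs'
New-Item -ItemType Directory -Path $specs -Force | Out-Null
Set-Content -Path (Join-Path $root 'README.md') -Value 'mentions SAMPLE in its content only'
Set-Content -Path (Join-Path $specs 'SAMPLE.data.nuspec') -Value 'x'
Set-Content -Path (Join-Path $specs 'SAMPLE.Connection.nuspec') -Value 'x'

# Match on file names only (file contents are never read), then group by containing directory.
$dirs = @(Get-ChildItem -Path $root -Recurse -File -Filter '*SAMPLE*' |
    Group-Object DirectoryName |
    Select-Object -ExpandProperty Name)

# One entry: the full path of the specs directory.
$dirs

# Clean up the scratch tree.
Remove-Item -Path $root -Recurse -Force
```

README.md is not reported even though its content mentions SAMPLE, because only the names are filtered.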
To complement the excellent answer by @mklement0, you can ask Powershell to print the full path by appending a pipe as follows:
Get-ChildItem -Recurse -ErrorAction SilentlyContinue -Force -Filter "*sample*" | %{$_.FullName}
Note: Searching some folders can raise access errors for security reasons, hence the SilentlyContinue option.
I went through the answer by @Itchydon
but couldn't follow the use of '-like' $pattern.
I was trying to list files having 32 characters (letters and numbers) in the filename.
PS C:\> Get-ChildItem C:\Users\ -Recurse | where {$_.name -match "[a-zA-Z0-9]{32}"} | select name
or
PS C:\> Get-ChildItem C:\Users\010M\Documents\WindowsPowerShell -Recurse | Where-Object {$_.name -match "[A-Z0-9]{32}"} | select name
So, in this case it doesn't matter whether you use where or where-object.
You can use Select-String directly to search for files whose content matches a certain string. By default it prints the file name, line number, content, and so on for each match, but the underlying objects have named properties that you can select or omit. The one you need here is Filename; pipe the output to Select-Object and choose Filename.
So, to select all *.MSG files that contain the pattern "Subject: WebServices Restarted", you can do the following:
Select-String -Path .\*.MSG -Pattern 'Subject: WebServices Restarted' -List | select-object Filename
Also, to remove these files on the fly, you could pipe into a ForEach statement with the rm command as follows:
Select-String -Path .\*.MSG -Pattern 'Subject: WebServices Restarted' -List | select-object Filename | foreach { rm $_.FileName }

Printing recursive file and folder count in powershell?

I am trying to compare two sets of folders to determine discrepancies in file and folder counts. I have found a command that will output the data I am looking for, but cannot find a way to print it to a file. Here is the command I am using currently:
dir -recurse | ?{ $_.PSIsContainer } | %{ Write-Host $_.FullName (dir $_.FullName | Measure-Object).Count }
This is getting me the desired data but I need to find a way to print this to a text file. Any help would be greatly appreciated.
The problem is the use of the Write-Host cmdlet, which bypasses almost all pipeline handling. In this case, it is also unnecessary, as any output that isn't used by a cmdlet is automatically passed into the pipeline (or to the console if there's nothing further).
Here is your code rewritten to output a string to the pipeline instead of using Write-Host. This uses PowerShell's string subexpression operator $(). At the console, it will look the same, but it can be piped to a file or other cmdlet.
gci -Recurse -Directory | %{ "$($_.FullName) $((gci $_.FullName).Count)" }
You may also find it useful to put the data into a PSCustomObject. Once you have the object, you can do further processing such as sorting or filtering based on the count.
$folders = gci -Recurse -Directory | %{ [PSCustomObject]@{Name=$_.FullName; Count=(dir $_.FullName).Count }}
$folders | sort Count
$folders | where Count -ne 0
Some notes on idioms: dir is an alias for Get-ChildItem, as is gci. Using gci's -Directory parameter is the best way to list only directories, rather than the PSIsContainer check. Finally, Measure-Object is unnecessary. You can take the Count of the file listing directly.
See also Write-Host Considered Harmful from the inventor of PowerShell
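To tie this back to the original goal of getting the counts into a text file, here is a minimal sketch that writes one line per folder to a report file. The scratch tree, the report path, and the use of Set-Content (rather than another file-writing cmdlet) are assumptions chosen for the demo.

```powershell
# Hypothetical scratch tree: folder 'a' holds two files, folder 'b' is empty.
$root = Join-Path ([System.IO.Path]::GetTempPath()) "count-demo-$PID"
$dirA = Join-Path $root 'a'
$dirB = Join-Path $root 'b'
New-Item -ItemType Directory -Path $dirA, $dirB -Force | Out-Null
Set-Content -Path (Join-Path $dirA '1.txt'), (Join-Path $dirA '2.txt') -Value 'x'

# Hypothetical report location.
$report = Join-Path ([System.IO.Path]::GetTempPath()) "count-report-$PID.txt"

# Emit one string per folder into the pipeline, then write those strings to the report file.
Get-ChildItem -Path $root -Recurse -Directory |
    Sort-Object FullName |
    ForEach-Object { "$($_.FullName) $(@(Get-ChildItem -LiteralPath $_.FullName).Count)" } |
    Set-Content -Path $report

# Read the report back: one line per folder, path followed by item count.
$lines = @(Get-Content -Path $report)
$lines

# Clean up the scratch tree and the report.
Remove-Item -Path $root -Recurse -Force
Remove-Item -Path $report
```

Because the strings travel through the pipeline instead of going through Write-Host, any file-writing cmdlet or redirection can capture them.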

piping files to get-content

I'm trying to find a single line of code recursively using powershell.
To look for the line "TODO" in a known file I can do:
get-content ActivityLibrary\Accept.cs | select-string TODO
But I don't want to explicitly type every directory\file. I would like to pipe a series of filenames from get-childitem like this:
gci -filter *.cs -name -recurse | gc | select-string TODO
But then I see this error:
Get-Content : The input object cannot be bound to any parameters for the command either because the command does not take pipeline input or the input and its properties do not match any of the parameters that take pipeline input.
At line:1 char:37
What am I doing wrong?
You need to remove the -Name switch. It outputs just file names, not file objects. And you can also pipe directly to Select-String and drop 'gc'.
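A sketch of the corrected pipeline, run against a throwaway tree so it can be tried safely. The folder and file names echo the question; the temp location and file contents are assumptions for the demo.

```powershell
# Hypothetical scratch tree standing in for the question's project folder.
$root = Join-Path ([System.IO.Path]::GetTempPath()) "todo-demo-$PID"
$lib  = Join-Path $root 'ActivityLibrary'
New-Item -ItemType Directory -Path $lib -Force | Out-Null
Set-Content -Path (Join-Path $lib 'Accept.cs') -Value '// TODO: handle rejection', 'class Accept {}'
Set-Content -Path (Join-Path $root 'Program.cs') -Value 'class Program {}'

# No -Name switch, so FileInfo objects flow down the pipeline and Select-String reads each file.
$todos = @(Get-ChildItem -Path $root -Filter '*.cs' -Recurse | Select-String -Pattern 'TODO')

# Show file name, line number, and the matching line for each hit.
$todos | ForEach-Object { '{0}:{1}: {2}' -f $_.Filename, $_.LineNumber, $_.Line.Trim() }

# Clean up the scratch tree.
Remove-Item -Path $root -Recurse -Force
```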