Why is Get-ChildItem with wildcards not easy to use? [duplicate] - powershell

From documentation:
-Include
Retrieves only the specified items. The value of this parameter qualifies the Path parameter. Enter a path element or pattern, such as "*.txt". Wildcards are permitted.
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory, such as C:\Windows\*, where the wildcard character specifies the contents of the C:\Windows directory.
My first understanding was:
c:\test\a.txt
c:\test\b.txt
So to get 'a.txt' and 'b.txt' I can write:
gci -Path "c:\test\*" -Include "*.txt"
And this works. But now consider such hierarchy:
c:\test\a.txt
c:\test\b.txt
c:\test\c.txt\c.txt
The same command returns:
a.txt, b.txt, c.txt
The actual logic seems to be:
-Include is used to match all entities specified by -Path. If a matched element is a file, return it. If a matched element is a folder, look inside and return the matching first-level children.
Also, the documentation says:
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory...
This is wrong as well. E.g.
gci -Path "c:\test" -Include "*.txt"
It returns nothing, while without -Include I get the folder content. So -Include is definitely "effective". What really happens here? -Path specifies "c:\test", and -Include tries to match this path. Since "*.txt" does not match "test", nothing is returned. But look at this:
gci -Path "c:\test" -Include "*t"
It returns a.txt, b.txt and c.txt as "*t" matched "test" and matched all child items.
After all, even knowing how -Include works now, I don't understand when to use it. Why do I need it to look inside subfolders? Why should it be so complex?

You're confusing the use of -Include. The -Include flag is applied to the path, not the contents of the path. Without the -Recurse flag, the only path in question is the path you specify. This is why the last example you gave works: the path c:\test ends in a "t" and hence matches "*t".
You can verify this by trying the following
gci -path "c:\test" -in *e*
This will still produce all of the children in the directory yet it matches none of their names.
The reason that -include is more effective with the recurse parameter is that you end up applying the wildcard against every path in the hierarchy.

Try the -filter parameter (it has support for only one extension):
dir -filter *.txt
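Since -Filter accepts only one pattern, one way to cover several extensions is to run one query per pattern. The following is a minimal sketch, not a definitive recipe; the temporary sandbox directory and the file names in it are assumptions made up for illustration.

```powershell
# Throwaway sandbox (hypothetical file names) so the sketch runs without touching real data.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $root | Out-Null
foreach ($name in 'a.txt', 'b.log', 'c.csv') {
    New-Item -ItemType File -Path (Join-Path $root $name) | Out-Null
}

# -Filter takes a single pattern, so issue one query per pattern and collect the results.
$patterns = '*.txt', '*.log'
$found = foreach ($pattern in $patterns) {
    Get-ChildItem -LiteralPath $root -Filter $pattern -File
}
$names = $found | Sort-Object Name | ForEach-Object { $_.Name }
$names

# Clean up the sandbox.
Remove-Item -LiteralPath $root -Recurse -Force
```

The -File switch requires PowerShell 3.0 or later; on older versions, filtering on PSIsContainer with Where-Object is the usual substitute.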

Tacking on to JaredPar's answer, in order to do pattern matching with Get-ChildItem, you can use common shell wildcards.
For example:
get-childitem "c:\test\t?st.txt"
where the "?" is a wildcard matching any one character or
get-childitem "c:\test\*.txt"
which will match any file name ending in ".txt".
This should get you the "simpler" behavior you were looking for.

I just asked a similar question and got three quick replies concerning the Get-Help for Get-ChildItem.
The answer is in the full description of the command (Get-Help Get-ChildItem -full):
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory, such as C:\Windows\*, where the wildcard character specifies the contents of the C:\Windows directory.
So the following would work without recurse.
PS C:\foo> Get-childitem -path "c:\foo\*" -Include *.txt
From Stack Overflow question PowerShell Scripting - Get-ChildItem.
I hope this helps :-)

Including \* at the end of the path should work around the issue
PS C:\logfiles> Get-ChildItem .\* -include *.log
This should return .log files from the current working directory (C:\logfiles)
Alex's example above indicates that a directory with the name foo.log would also be returned. When I tried it, it wasn't, but it's 6 years later and that could be due to PowerShell updates.
However, I think you can use the child item's Mode to exclude directories.
PS C:\logfiles> Get-Childitem .\* -include *.log | where-object {$_.mode -notmatch "d"}
This should exclude anything with the 'directory' mode set.
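Matching "d" in the Mode string works, but the PSIsContainer property expresses the same test directly. Here is a small sketch of that variant, assuming a made-up sandbox that contains an empty directory named foo.log next to a file named app.log.

```powershell
# Throwaway sandbox (hypothetical layout) with a directory whose name looks like a log file.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path (Join-Path $root 'foo.log') -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $root 'app.log') | Out-Null

# PSIsContainer is $true for directories, so negating it keeps files only.
$logFiles = Get-ChildItem -Path (Join-Path $root '*') -Include '*.log' |
    Where-Object { -not $_.PSIsContainer }
$logFiles | ForEach-Object { $_.Name }

# Clean up the sandbox.
Remove-Item -LiteralPath $root -Recurse -Force
```

Whether the foo.log directory itself comes back from Get-ChildItem before the filter may depend on the PowerShell version, which is why the Where-Object step is kept regardless.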

Get-ChildItem -Include only works with -Recurse or a wildcard in the path. I consider this a bug [Thought it was different in PS 6].

Related

Get-ChildItem Exclude and File parameters don't work together

I can't figure out why these two parameters of the Get-ChildItem cmdlet don't work together. To make my question as clear as possible, look at the following example. From the PowerShell ISE command pane:
Type 'dir' --> All files and sub-folders in the current directory are displayed.
Type 'dir -File' --> Original list minus sub-folders is displayed.
Type 'dir -Exclude "*.txt"' --> Original list minus .txt files is displayed.
Type 'dir -File -Exclude "*.txt"' --> NOTHING is displayed.
I would expect the original list minus sub-folders and .txt files. But regardless of what argument I use for '-Exclude', I get no items listed.
I have looked at the Get-ChildItem -full documentation, and the related articles here (Stack Overflow) and at other reliable resources, and still don't understand why this fails. Even the classic "-Include '*.txt' -Exclude 'A*'" example fails when you add "-File". How can I use -File and -Exclude together?

Whilst dir is an alias for Get-ChildItem, I find it best to use the full cmdlet names when providing answers.
Using the full cmdlet name, the following should work for you:
Get-ChildItem * -Exclude "*.txt" -File
What you see above is the PowerShell cmdlet to get all items in the path specified (using the * assumes you want all items from the current location).
You can also use -Path and provide the location of the path to where you want to get the items as well, such as:
Get-ChildItem -Path "C:\Path\Folder" -Exclude "*.txt" -File
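If combining -File and -Exclude keeps misbehaving, a version-independent alternative is to let Get-ChildItem return the files and do the exclusion in Where-Object. The sketch below assumes a hypothetical sandbox with one sub-folder and two files; it is an illustration, not the answer's own method.

```powershell
# Throwaway sandbox (hypothetical names): one sub-folder, one .txt file, one .exe file.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path (Join-Path $root 'sub') -Force | Out-Null
foreach ($name in 'notes.txt', 'tool.exe') {
    New-Item -ItemType File -Path (Join-Path $root $name) | Out-Null
}

# -File drops the sub-folder; Where-Object then drops anything named *.txt.
$files = Get-ChildItem -LiteralPath $root -File |
    Where-Object { $_.Name -notlike '*.txt' }
$files | ForEach-Object { $_.Name }

# Clean up the sandbox.
Remove-Item -LiteralPath $root -Recurse -Force
```

The trade-off is that the exclusion runs after the items have been fetched, which is rarely noticeable for a single directory.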

How does the Get-ChildItem -Exclude Parameter work?

How does the Get-ChildItem -Exclude parameter work? What rules does it follow?
The Get-Help for Get-ChildItem isn't detailed at all:
Omits the specified items. The value of this parameter qualifies the
Path parameter. Enter a path element or pattern, such as "*.txt".
Wildcards are permitted.
And on Stack Overflow and elsewhere the general consensus seems to be that it's too difficult to use and we should all just pipe the output of Get-ChildItem to Where-Object instead.
While I'm willing to use Where-Object I'm curious as to the rules -Exclude follows.
For example, I have a folder with the following sub-folders:
HsacFixtures
HsacFixturesBuild
RestFixture
RestFixtureBuild
If I execute the following command:
Get-ChildItem $rootFolderPath -Exclude HsacFixturesBuild -Directory
it returns the results expected:
HsacFixtures
RestFixture
RestFixtureBuild
However, if I add a -Recurse parameter:
Get-ChildItem $rootFolderPath -Exclude HsacFixturesBuild -Directory -Recurse
Then it returns sub-folders in the HsacFixturesBuild folder.
I've also tried HsacFixturesBuild\ and HsacFixturesBuild\*, which have the same results.
So does -Exclude only apply to immediate children, and not to grand-children or deeper sub-folders?

-Exclude omits child objects based on the Name property:
For files, gets the name of the file. For directories, gets the name of the last directory in the hierarchy if a hierarchy exists. Otherwise, the Name property gets the name of the directory.
It does not use the FullName property, which gets the full path of the directory or file.
So even though the object is a grandchild in a recursive call, -Exclude only looks at the object's Name, not its FullName. A descendant is therefore omitted only when its own name happens to match the exclude pattern.
Source Get-ChildItem
Example 3: Get all child items using an inclusion and exclusion
This command lists the .txt files in the Logs subdirectory, except for
those whose names start with the letter A. It uses the wildcard
character (*) to indicate the contents of the Logs subdirectory, not
the directory container. Because the command does not include the
Recurse parameter, the command does not include the content of
directory automatically; you need to specify it.
Windows PowerShell
PS C:\> Get-ChildItem -Path "C:\Windows\Logs\*" -Include "*.txt" -Exclude "A*"

With respect to Get-ChildItem, the -Exclude parameter works on the object's name.
This can be understood better from the following example.
Consider the following folder structure: [image: folder structure]
First we use the -Exclude parameter without -Recurse: [image: command and output]
As the output shows, the folder was excluded based on the object's name.
Now we add the -Recurse parameter to the above statement: [image: command and output]
Now we can see that the sub-folders of the excluded folder are still present, because the exclusion was applied to each object's name.
Hope this helps.
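Because -Exclude compares only each item's Name, dropping a whole subtree from a recursive listing usually means filtering on FullName afterwards. The following sketch shows one way to do that, assuming a made-up sandbox that mirrors the folder names from the question.

```powershell
# Throwaway sandbox (hypothetical layout) mirroring the question's folder names.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
foreach ($relative in 'HsacFixtures', 'RestFixture') {
    New-Item -ItemType Directory -Path (Join-Path $root $relative) -Force | Out-Null
}
New-Item -ItemType Directory -Path (Join-Path (Join-Path $root 'HsacFixturesBuild') 'bin') -Force | Out-Null

$excluded = 'HsacFixturesBuild'
# Escape the separator because -split treats its right-hand side as a regular expression.
$separator = [regex]::Escape([string][System.IO.Path]::DirectorySeparatorChar)

# Keep a directory only if no segment of its full path equals the excluded name.
# Caveat: this also rejects everything if an ancestor of $root carries that name.
$dirs = Get-ChildItem -LiteralPath $root -Directory -Recurse |
    Where-Object { ($_.FullName -split $separator) -notcontains $excluded }
$dirs | ForEach-Object { $_.Name }

# Clean up the sandbox.
Remove-Item -LiteralPath $root -Recurse -Force
```

Comparing whole path segments rather than using a substring match avoids accidentally dropping folders such as HsacFixtures, whose name merely starts the same way.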

How do I prevent Get-ChildItem from traversing a particular directory?

Let me start by saying that I've looked at Unable to exclude directory using Get-ChildItem -Exclude parameter in Powershell and How can I exclude multiple folders using Get-ChildItem -exclude?. Neither of these has an answer that solves my problem.
I need to search a directory recursively for files with a certain extension. For simplicity, let's just say I need to find *.txt. Normally, this command would suffice:
Get-ChildItem -Path 'C:\mysearchdir\' -Filter '*.txt' -Recurse
But I have a major problem. There's a node_modules directory buried somewhere inside C:\mysearchdir\, and NPM creates extremely deep nested directories. (The detail of it being an NPM managed directory is only important because this means the depth is beyond my control.) This results in the following error:
Get-ChildItem : The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.
I believe this error bubbles up from the limitations in the .NET IO libraries.
I can't search in the other directories around it very easily. It's not at the top of the directory; it's deeper in, say at C:\mysearchdir\dir1\dir2\dir3\node_modules, and there are directories I need to search at all those levels. So just searching the other directories around it is going to be cumbersome and not very maintainable as more files and directories are added.
I've tried the -Exclude parameter without any success. That isn't surprising, since I just read that -Exclude is only applied after the results are fetched. I can't find any real info on using -Filter (as is noted in this answer).
Is there any way I can get Get-ChildItem to work, or am I stuck writing my own recursive traversal?

Oh, man, I feel dumb. I was facing the same problem as you. I was working with @DarkLite1's answer, trying to parse it, when I got to the "-EA SilentlyContinue" part.
FACEPALM!
That's all you need!
This worked for me, try it out:
Get-ChildItem -Path 'C:\mysearchdir\' -Filter '*.txt' -Recurse -ErrorAction SilentlyContinue
Note: This will not exclude node_modules from a search, just hide any errors generated by traversing the long paths. If you need to exclude it entirely, you're going to need a more complicated solution.
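Maybe you could try something like this:
$Source = 'S:\Prod'
# Directories to leave out of the results, together with everything beneath them
$Exclude = @('S:\Prod\Dir 1', 'S:\Prod\Dir 2')
# -EV e collects errors in $e; -EA SilentlyContinue keeps them off the console
Get-ChildItem -LiteralPath $Source -Directory -Recurse -PipelineVariable Dir -EV e -EA SilentlyContinue |
Where {($Exclude | Where {($Dir.FullName -eq "$_") -or ($Dir.FullName -like "$_\*")}).count -eq 0}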
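Both approaches above still walk into node_modules and only hide the errors or discard the results afterwards. If the directory must not be traversed at all, a hand-written recursion can prune it. The sketch below is one possible shape for that; the function name Get-ChildItemPruned and its parameter names are hypothetical, not part of PowerShell.

```powershell
# Hypothetical helper: recursive file search that never descends into the named directories.
function Get-ChildItemPruned {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [string] $Filter = '*',
        [string[]] $SkipDirectory = @('node_modules')
    )

    # Emit the matching files that sit directly in this directory.
    Get-ChildItem -LiteralPath $Path -Filter $Filter -File -ErrorAction SilentlyContinue

    # Recurse into each sub-directory unless its name is on the skip list.
    Get-ChildItem -LiteralPath $Path -Directory -ErrorAction SilentlyContinue |
        Where-Object { $SkipDirectory -notcontains $_.Name } |
        ForEach-Object {
            Get-ChildItemPruned -Path $_.FullName -Filter $Filter -SkipDirectory $SkipDirectory
        }
}
```

With the paths from the question, a call might look like Get-ChildItemPruned -Path 'C:\mysearchdir\' -Filter '*.txt'. The sketch does not guard against directory junctions or symbolic links that loop back on themselves.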

Different result from same command in powershell 3.0

Given that Get-ChildItem -Path *.exe will show all the executables in the current directory, why doesn't Get-ChildItem -File -Include *.exe return the same result? Both commands are executed in the same directory, first command (with -Path) returns a list of executables but the second command (with -File) doesn't. (gci -File will list everything including the exe)
Get-ChildItem -File | gm #=> FileInfo
Get-ChildItem *.* | gm #=> DirectoryInfo and FileInfo
All the commands below return objects of type FileInfo
Get-ChildItem -File
Get-ChildItem *.* -Include *.exe
Get-ChildItem -Path *.exe
But mixing -File and -Include/-Exclude returns nothing, even though the -include is looking for a filetype:
Get-ChildItem -File -Include *.exe #=> Returns nothing
What am I missing here?

From TechNet:
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory, such as C:\Windows\*, where the wildcard character specifies the contents of the C:\Windows directory.
In other words, when you use the Include parameter, it does not automatically consider all files and directories unless you use the Path or the Recurse parameters. Notice that when using only the Path parameter, you must include a wildcard to force it to consider the files and directories underneath that path. I cannot think of why this is.
To get your examples to work, you would use one of the following (I'm dropping the File parameter because it seems redundant):
Get-ChildItem -Path * -Include *.exe
Get-ChildItem -Include *.exe -Recurse
The gist of my answer to your question is an opinion, though: from what I've seen, the Include parameter should be removed, or its behavior repaired to match the default behavior of the Get-ChildItem cmdlet when used without parameters. There may be a good explanation for why it works this way, but I'm unaware of it.
If you drop the Include parameter from your examples, the behavior/results make more sense (to me):
Get-ChildItem -Path *.exe
In this case, we would only need the Exclude parameter to effectively cover all filtering requirements. Something like:
Get-ChildItem -Path *.exe -Exclude *system*
