Confused with -Include parameter of the Get-ChildItem cmdlet - powershell

From documentation:
-Include
Retrieves only the specified items. The value of this parameter qualifies the Path parameter. Enter a path element or pattern, such as "*.txt". Wildcards are permitted.
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory, such as C:\Windows\*, where the wildcard character specifies the contents of the C:\Windows directory.
My first understanding was:
c:\test\a.txt
c:\test\b.txt
So to get 'a.txt' and 'b.txt' I can write:
gci -Path "c:\test\*" -Include "*.txt"
And this works. But now consider this hierarchy:
c:\test\a.txt
c:\test\b.txt
c:\test\c.txt\c.txt
The same command returns:
a.txt, b.txt, c.txt
The actual logic seems to be:
-Include is matched against every item specified by -Path. If a matched item is a file, it is returned. If a matched item is a folder, the cmdlet looks inside it and returns the matching first-level children.
Also, the documentation says:
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory...
This is wrong as well. E.g.
gci -Path "c:\test" -Include "*.txt"
It returns nothing, while without -Include I get the folder content. So -Include is definitely "effective". What really happens here? -Path specifies "c:\test", and -Include tries to match this path. Since "*.txt" does not match "test", nothing is returned. But look at this:
gci -Path "c:\test" -Include "*t"
It returns a.txt, b.txt and c.txt, because "*t" matches "test" and also matches all the child items.
After all this, even knowing how -Include works, I don't understand when to use it. Why do I need it to look inside subfolders? Why should it be so complex?

You're confusing the use of -include. The -include flag is applied to the path, not the contents of the path. Without the use of the recursive flag, the only path that is in question is the path you specify. This is why the last example you gave works, the path c:\test has a t in the path and hence matches "*t".
You can verify this by trying the following
gci -Path "c:\test" -Include "*e*"
This will still produce all of the children in the directory yet it matches none of their names.
The reason that -include is more effective with the recurse parameter is that you end up applying the wildcard against every path in the hierarchy.
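The two forms that the documentation calls "effective" can be compared on a throwaway directory. This is a minimal sketch, assuming a scratch folder under the temp path; the folder and file names are made up for illustration.

```powershell
# Build a scratch directory with two .txt files and one .log file (hypothetical names).
$dir = Join-Path ([IO.Path]::GetTempPath()) ("gci-include-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
'a.txt', 'b.txt', 'c.log' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null
}

# Form 1: the path ends in a wildcard, so -Include is tested against each child.
$viaWildcard = Get-ChildItem -Path (Join-Path $dir '*') -Include '*.txt'

# Form 2: -Recurse walks the tree, and -Include is tested against every item visited.
$viaRecurse = Get-ChildItem -Path $dir -Recurse -Include '*.txt'

$viaWildcard.Name   # a.txt and b.txt
$viaRecurse.Name    # a.txt and b.txt
```

Both forms give -Include a set of child paths to test, which is why they behave as most people expect.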

Try the -filter parameter (it has support for only one extension):
dir -filter *.txt
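Since -Filter accepts a single pattern, one way to cover several extensions is to filter on the Extension property afterwards. A minimal sketch, assuming a scratch folder with made-up file names:

```powershell
# Scratch directory with three files of different types (hypothetical names).
$dir = Join-Path ([IO.Path]::GetTempPath()) ("gci-filter-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
'a.txt', 'b.log', 'c.csv' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null
}

# -Filter takes exactly one pattern.
$txtOnly = Get-ChildItem -Path $dir -File -Filter '*.txt'

# For several extensions, test the Extension property in Where-Object instead.
$txtOrLog = Get-ChildItem -Path $dir -File |
    Where-Object { $_.Extension -in '.txt', '.log' }

$txtOnly.Name    # a.txt
$txtOrLog.Name   # a.txt and b.log
```

The -File switch and the -in operator both need PowerShell 3.0 or later.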

Tacking on to JaredPar's answer, in order to do pattern matching with Get-ChildItem, you can use common shell wildcards.
For example:
get-childitem "c:\test\t?st.txt"
where the "?" is a wildcard matching any one character or
get-childitem "c:\test\*.txt"
which will match any file name ending in ".txt".
This should get you the "simpler" behavior you were looking for.

I just asked a similar question and got three quick replies concerning the Get-Help for Get-ChildItem.
The answer is in the full description of the command (Get-Help Get-ChildItem -full):
The Include parameter is effective only when the command includes the Recurse parameter or the path leads to the contents of a directory, such as C:\Windows\*, where the wildcard character specifies the contents of the C:\Windows directory.
So the following would work without recurse.
PS C:\foo> Get-childitem -path "c:\foo\*" -Include *.txt
From Stack Overflow question PowerShell Scripting - Get-ChildItem.
I hope this helps :-)

Including \* at the end of the path should work around the issue
PS C:\logfiles> Get-ChildItem .\* -include *.log
This should return .log files from the current working directory (C:\logfiles)
Alex's example above indicates that a directory with the name foo.log would also be returned. When I tried it, it wasn't, but that was 6 years later, so the difference could come from PS updates.
However, I think you can use the child item's Mode to exclude directories.
PS C:\logfiles> Get-Childitem .\* -include *.log | where-object {$_.mode -notmatch "d"}
This should exclude anything with the 'directory' mode set.
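An alternative to matching on the Mode string is to test the PSIsContainer property, which the file system provider sets to $true for directories. A minimal sketch, assuming a scratch folder with made-up names, including an empty directory called foo.log:

```powershell
# Scratch layout (hypothetical names): files app.log and notes.txt,
# plus an empty directory that happens to be called foo.log.
$dir = Join-Path ([IO.Path]::GetTempPath()) ("gci-mode-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path (Join-Path $dir 'foo.log') -Force | Out-Null
'app.log', 'notes.txt' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null
}

# PSIsContainer is $true for directories, so this keeps files only.
$logFiles = Get-ChildItem -Path (Join-Path $dir '*') -Include '*.log' |
    Where-Object { -not $_.PSIsContainer }

$logFiles.Name   # app.log
```

This does not depend on whether a given PowerShell version returns the foo.log directory itself.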

Get-ChildItem -Include only works with -Recurse or a wildcard in the path. I consider this a bug [Thought it was different in PS 6].

Related

How to recursively append to file name in powershell?

I have multiple .txt files in folders/their sub-folders.
I want to append _old to their file names.
I tried:
Get-ChildItem -Recurse | Rename-Item -NewName {$_.name -replace '.txt','_old.txt' }
This results in:
Some files get updated correctly
Some files get updated incorrectly - they get _old twice - example: .._old_old.txt
There are few errors: Rename-Item : Source and destination path must be different.
To prevent already renamed files from accidentally reentering the file enumeration and therefore getting renamed multiple times, enclose your Get-ChildItem call in (), the grouping operator, which ensures that all output is collected first[1], before sending the results through the pipeline:
(Get-ChildItem -Recurse) |
Rename-Item -NewName { $_.name -replace '\.txt$', '_old.txt' }
Note that I've used \.txt$ as the regex[2], so as to ensure that only a literal . (\.) followed by string txt at the end ($) of the file name is matched, so as to prevent false positives (e.g., a file named Atxt.csv or even a directory named AtxtB would accidentally match your original regex).
Note: The need to collect all Get-ChildItem output first arises from how the PowerShell pipeline fundamentally works: objects are (by default) sent to the pipeline one by one, and processed by a receiving command as they're being received. This means that, without (...) around Get-ChildItem, Rename-Item starts renaming files before Get-ChildItem has finished enumerating files, which causes problems. See this answer for more information about how the PowerShell pipeline works.
Tip of the hat to Matthew for suggesting inclusion of this information.
However, I suggest optimizing your command as follows:
(Get-ChildItem -Recurse -File -Filter *.txt) |
Rename-Item -NewName { $_.BaseName + '_old' + $_.Extension }
-File limits the output to files (doesn't also return directories).
-Filter is the fastest way to limit results to a given wildcard pattern.
$_.BaseName + '_old' + $_.Extension uses simple string concatenation via the sub-components of a file name.
An alternative is to stick with -replace:
$_.Name -replace '\.[^.]+$', '_old$&'
Note that if you wanted to run this repeatedly and needed to exclude files renamed in a previous run, add -Exclude *_old.txt to the Get-ChildItem call.
[1] Due to a change in how Get-ChildItem is implemented in PowerShell [Core] 6+ (it now internally sorts the results, which invariably requires collecting them all first), the (...) enclosure is no longer strictly necessary, but this could be considered an implementation detail, so for conceptual clarity it's better to continue to use (...).
[2] PowerShell's -replace operator operates on regexes (regular expressions); it doesn't perform literal substring searches the way that the [string] type's .Replace() method does.
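The difference between the patterns discussed above can be checked on plain strings, without touching any files. A small sketch; the sample file names are made up for illustration.

```powershell
# Original pattern: '.' is a regex wildcard and there is no end anchor.
$falsePositive = 'Atxt.csv'  -replace '.txt', '_old.txt'     # _old.txt.csv
$renamedTwice  = 'a_old.txt' -replace '.txt', '_old.txt'     # a_old_old.txt

# Anchored pattern: a literal '.' followed by 'txt' at the end of the name.
$anchoredHit   = 'notes.txt' -replace '\.txt$', '_old.txt'   # notes_old.txt
$anchoredMiss  = 'Atxt.csv'  -replace '\.txt$', '_old.txt'   # Atxt.csv (unchanged)

# Generic form: insert '_old' before whatever the last extension is.
# $& in the replacement stands for the matched text.
$generic       = 'notes.txt' -replace '\.[^.]+$', '_old$&'   # notes_old.txt

$falsePositive, $renamedTwice, $anchoredHit, $anchoredMiss, $generic
```

The second line shows how a file that has already been renamed picks up another _old if it enters the pipeline again.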
The command below returns ALL files and folders from the current directory and its sub-folders.
Get-ChildItem -Recurse
Because of this, you are also returning the files you have already renamed with the _old suffix.
What you need to do is use the -Include and -Exclude parameters of the Get-ChildItem cmdlet to select only the files that meet your include criteria and to ignore files that already have the _old suffix, for example:
Get-ChildItem -Recurse -Include "*.txt" -Exclude "*_old.txt"
Then pipe the results into your Rename-Item command.
An explanation of the Get-ChildItem cmdlet can be found here:
https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-childitem?view=powershell-7

Get-ChildItem Exclude and File parameters don't work together

I can't figure out why these two parameters of the Get-ChildItem cmdlet don't work together. To make my question as clear as possible, look at the following example. From the Powershell ISE command pane:
Type 'dir' --> All files and sub-folders in the current directory are displayed.
Type 'dir -File' --> Original list minus sub-folders is displayed.
Type 'dir -Exclude "*.txt"' --> Original list minus .txt files is displayed.
Type 'dir -File -Exclude "*.txt"' --> NOTHING is displayed.
I would expect the original list minus sub-folders and .txt files. But regardless of what argument I use for '-Exclude', I get no items listed.
I have looked at the Get-ChildItem -full documentation, and the related articles here (Stack Overflow) and at other reliable resources, and still don't understand why this fails. Even the classic "-Include '*.txt' -Exclude 'A*'" example fails when you add "-File". How can I use -File and -Exclude together?
Whilst dir is an alias for Get-ChildItem, I find it best to use the full cmdlets when providing answers.
To use proper PowerShell cmdlets it would be best for you to use the following:
Get-ChildItem * -Exclude "*.txt" -File
The command above gets all items matching the specified path (the * means all items in the current location).
You can also use -Path to specify the location whose items you want, such as:
Get-ChildItem -Path "C:\Path\Folder" -Exclude "*.txt" -File
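If the combination of -File and -Exclude still misbehaves on a given PowerShell version, the exclusion can be moved into Where-Object, which does not depend on how -Exclude interacts with the path. A minimal sketch, assuming a scratch folder with made-up names:

```powershell
# Scratch directory: two files and one sub-folder (hypothetical names).
$dir = Join-Path ([IO.Path]::GetTempPath()) ("gci-exclude-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path (Join-Path $dir 'sub') -Force | Out-Null
'a.txt', 'b.log' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null
}

# -File drops the sub-folder; Where-Object drops the .txt files.
$result = Get-ChildItem -Path $dir -File |
    Where-Object { $_.Name -notlike '*.txt' }

$result.Name   # b.log
```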

How does the Get-ChildItem -Exclude Parameter work?

How does the Get-ChildItem -Exclude parameter work? What rules does it follow?
The Get-Help for Get-ChildItem isn't detailed at all:
Omits the specified items. The value of this parameter qualifies the
Path parameter. Enter a path element or pattern, such as "*.txt".
Wildcards are permitted.
And on Stackoverflow and elsewhere the general consensus seems to be it's too difficult to use and we should all just pipe the output of Get-ChildItem to Where-Object instead.
While I'm willing to use Where-Object I'm curious as to the rules -Exclude follows.
For example, I have a folder with the following sub-folders:
HsacFixtures
HsacFixturesBuild
RestFixture
RestFixtureBuild
If I execute the following command:
Get-ChildItem $rootFolderPath -Exclude HsacFixturesBuild -Directory
it returns the results expected:
HsacFixtures
RestFixture
RestFixtureBuild
However, if I add a -Recurse parameter:
Get-ChildItem $rootFolderPath -Exclude HsacFixturesBuild -Directory -Recurse
Then it returns sub-folders in the HsacFixturesBuild folder.
I've also tried HsacFixturesBuild\ and HsacFixturesBuild\*, which have the same results.
So does -Exclude only apply to immediate children, and not to grand-children or deeper sub-folders?
-Exclude omits child objects based on the Name property:
For files, gets the name of the file. For directories, gets the name of the last directory in the hierarchy if a hierarchy exists. Otherwise, the Name property gets the name of the directory.
It does not use the FullName property, which gets the full path of the directory or file.
So even though the object is a grandchild in a recursive call, -Exclude looks only at the object's Name, not its FullName. The children of an excluded folder are therefore still returned unless their own names also match the exclude pattern.
Source Get-ChildItem
Example 3: Get all child items using an inclusion and exclusion
This command lists the .txt files in the Logs subdirectory, except for
those whose names start with the letter A. It uses the wildcard
character (*) to indicate the contents of the Logs subdirectory, not
the directory container. Because the command does not include the
Recurse parameter, the command does not include the content of
directory automatically; you need to specify it.
Windows PowerShell
PS C:\> Get-ChildItem -Path "C:\Windows\Logs\*" -Include "*.txt" -Exclude "A*"
With respect to Get-ChildItem, the -Exclude parameter works on the object's name.
This can be understood better from the following example.
First, use the -Exclude parameter without -Recurse: the folder is excluded, based on the object's name.
Now add the -Recurse parameter to the statement: the sub-folders of the excluded folder are still present, because the exclusion is applied to each object's name.
Hope this helps.
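Because -Exclude tests only the Name of each item, pruning a whole subtree needs a test against the full path. One possible approach is to split FullName into its segments and drop every item that has the excluded folder anywhere in its path. A minimal sketch, reusing the folder names from the question inside a made-up scratch root; the bin sub-folder is hypothetical.

```powershell
# Scratch tree: HsacFixtures, RestFixture, and HsacFixturesBuild\bin.
$root = Join-Path ([IO.Path]::GetTempPath()) ("gci-prune-" + [guid]::NewGuid())
$build = Join-Path $root 'HsacFixturesBuild'
New-Item -ItemType Directory -Path (Join-Path $root 'HsacFixtures') -Force | Out-Null
New-Item -ItemType Directory -Path (Join-Path $root 'RestFixture') -Force | Out-Null
New-Item -ItemType Directory -Path (Join-Path $build 'bin') -Force | Out-Null

# Split each full path on \ or / and keep only the items whose path
# has no segment equal to the excluded folder name.
$excluded = 'HsacFixturesBuild'
$kept = Get-ChildItem -Path $root -Directory -Recurse |
    Where-Object { ($_.FullName -split '[\\/]') -notcontains $excluded }

$kept.Name   # HsacFixtures and RestFixture
```

Comparing whole segments rather than using a substring match keeps a sibling such as HsacFixtures from being dropped by accident.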

Different result from same command in powershell 3.0

Given that Get-ChildItem -Path *.exe will show all the executables in the current directory, why doesn't Get-ChildItem -File -Include *.exe return the same result? Both commands are executed in the same directory, first command (with -Path) returns a list of executables but the second command (with -File) doesn't. (gci -File will list everything including the exe)
Get-ChildItem -File | gm #=> FileInfo
Get-ChildItem *.* | gm #=> DirectoryInfo and FileInfo
All the commands below return objects of type FileInfo
Get-ChildItem -File
Get-ChildItem *.* -Include *.exe
Get-ChildItem -Path *.exe
But mixing -File and -Include/-Exclude returns nothing, even though the -include is looking for a filetype:
Get-ChildItem -File -Include *.exe #=> Returns nothing
What am I missing here?
From TechNet:
The Include parameter is effective only when the command includes the
Recurse parameter or the path leads to the contents of a directory,
such as C:\Windows\*, where the wildcard character specifies the
contents of the C:\Windows directory.
In other words, when you use the Include parameter, it does not automatically consider all files and directories unless you use a wildcard in the Path parameter or add the Recurse parameter. Note that when using just the Path parameter, you must include a wildcard to force it to consider the files and directories underneath that path. I cannot think of why this is.
To get your examples to work, you would use one of the following (I'm dropping the File parameter because it seems redundant):
Get-ChildItem -Path * -Include *.exe
Get-ChildItem -Include *.exe -Recurse
The gist of my answer to your question is an opinion, though: from what I've seen, the Include parameter should be removed, or its behavior repaired to match the default behavior of the Get-ChildItem cmdlet when used without parameters. There may be a good explanation for why it works this way, but I'm unaware of it.
If you drop the Include parameter from your examples, the behavior/results make more sense (to me):
Get-ChildItem -Path *.exe
In this case, we would only need the Exclude parameter to effectively cover all filtering requirements. Something like:
Get-ChildItem -Path *.exe -Exclude *system*