I am currently writing a little script/program that will identify and sort certain files in a Windows directory. I am using the ls -n command to output a list of files to later be used by grep for Windows. However, using the following command:
ls -n >test.txt
leaves off the file extensions for file names in the output file. When I use ls -n inside the Powershell console (no output redirection), the file extensions are in the output.
Does anyone know what the issue is or how to do this properly with Powershell?
This works fine for me:
PS C:\Users\fission\Desktop\test> dir
Directory: C:\Users\fission\Desktop\test
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 2011-06-19 3:22 PM 1250 capture.pcap
-a--- 2013-09-26 5:21 PM 154205 fail.pml
-a--- 2013-09-25 12:53 PM 1676383 hashfxn.exe
PS C:\Users\fission\Desktop\test> ls -n >test.txt
PS C:\Users\fission\Desktop\test> type test.txt
capture.pcap
fail.pml
hashfxn.exe
test.txt
As you can see, test.txt includes the extensions of the other files.
But may I make a suggestion? Piping text output to a file and then grepping it isn't very "idiomatic" in PowerShell. It runs counter to a central theme of PowerShell: one should pass objects, not text. You might consider working with the output of Get-ChildItem directly, e.g. by storing it in a variable or piping it to Select-Object.
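As a minimal sketch of that object-based approach (the .txt filter and the variable names are only illustrative assumptions, not something from the question):

```powershell
# Keep the file objects in a variable instead of saving a text listing.
$files = Get-ChildItem -Path . | Where-Object { -not $_.PSIsContainer }

# Filter on a real property (hypothetical choice: the .txt extension)
# rather than grepping a saved listing.
$textFiles = $files | Where-Object { $_.Extension -eq '.txt' }

# Show only the properties of interest; the objects stay usable downstream.
$textFiles | Select-Object Name, Length, LastWriteTime
```

Because $textFiles still holds file objects, later steps can sort or filter on Length or LastWriteTime without parsing any text.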
Don't use aliases in scripts, because you can't depend upon them being set the same everywhere.
This will get you a listing of all files (and no directories) in the current directory, sort it alphabetically, and write it to test.txt.
Get-ChildItem |
Where-Object { !$_.PSIsContainer } |
Select-Object -ExpandProperty Name |
Sort-Object | Out-File test.txt
If you're searching for strings within those files, you can use select-string instead of grep, to keep it completely within PowerShell.
Get-ChildItem |
Where-Object { !$_.PSIsContainer } |
Select-String PATTERN
On my Windows 10 PC, there are three files, 10GB each, that I want to merge via cat file_name_prefix* >> some_file.zip. However, the output file grew as much as 38GB large before I aborted the operation via Ctrl+C. Is this expected behavior? If not, where am I making a mistake?
Cat is an alias of Get-Content which assumes text files by default - the output size is probably due to this conversion. You can try adding the -raw switch for binary files - this might work? (not sure)
It's definitely possible to "cat" binary files together from a CMD shell using the copy command, as below.
copy /b part1.bin+part2.bin+part3.bin some_file.zip
(The 3 part*.bin are the files to be combined into some_file.zip).
PowerShell's cat, a.k.a. Get-Content, reads text file content into an array of strings by default. If you don't specify a charset, it also checks the file for a BOM to handle encodings properly. That means it won't work with binary files.
To combine binary files in PowerShell 6+ you need to use the -AsByteStream parameter
Get-Content -AsByteStream file_name_prefix* |
Set-Content -AsByteStream some_file.zip
# or, naming the files explicitly:
Get-Content -AsByteStream file1, file2, file3 |
Set-Content -AsByteStream some_file.zip
Older PowerShell doesn't have that option so the only thing you can use is -Raw
Get-Content -Raw file_name_prefix* | Set-Content -Raw some_file.zip
However it'll be very slow because the input files are still treated as text files and read line-by-line. For speed you'll need to use other solutions, like calling Win32 APIs directly from PowerShell
Update:
As mentioned, only Get-Content has -Raw, not Set-Content, and it's unsuitable for binary content anyway. In older PowerShell you need to use -Encoding Byte instead:
Get-Content -Encoding Byte file_name_prefix* | Set-Content -Encoding Byte some_file.zip
See
Fast and simple binary concatenate files in Powershell
Concatenate files using PowerShell
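For the faster route mentioned above, one possible sketch uses .NET file streams instead of Win32 calls. Join-BinaryFile is a hypothetical helper name, not a built-in cmdlet, and it assumes the parts are passed in the order they should be joined:

```powershell
function Join-BinaryFile {
    param(
        [Parameter(Mandatory = $true)] [string[]] $Path,        # input parts, in order
        [Parameter(Mandatory = $true)] [string]   $Destination  # combined output file
    )
    # Resolve to full paths: .NET's working directory can differ from PowerShell's.
    $target = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)
    $out = [System.IO.File]::Create($target)
    try {
        foreach ($p in $Path) {
            $in = [System.IO.File]::OpenRead((Resolve-Path -LiteralPath $p).ProviderPath)
            try     { $in.CopyTo($out) }   # raw byte copy, no text decoding
            finally { $in.Dispose() }
        }
    }
    finally { $out.Dispose() }
}

# Usage with the names from the question (collect the parts before creating the target):
# $parts = Get-ChildItem file_name_prefix* | Sort-Object Name | ForEach-Object { $_.FullName }
# Join-BinaryFile -Path $parts -Destination some_file.zip
```

The bytes go from stream to stream without ever being turned into strings, so nothing is re-encoded and the output should be exactly the sum of the input sizes.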
It is probably going in a loop, recursively concatenating all files including the result to the result file (with the glob wildcard).
You can add an extension in the glob, temporarily save it as another extension and move it to the correct one. (As suggested in: https://stackoverflow.com/a/53079166/12657997)
E.g. when you have 3 files:
a.txt with a inside
b.txt with b inside
c.txt with c inside
cat *.txt > res.csv ; mv res.csv res.txt
cat .\res.txt
a
b
c
Edit
The cat command shown above, combined with the output redirect >, inflates the size of the resulting text file, as @mklement0 points out.
According to the documentation (https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-content?view=powershell-7.1):
-Encoding
Specifies the type of encoding for the target file. The default value is utf8NoBOM.
However, the output redirect changes the encoding, as explained in this post: https://stackoverflow.com/a/40098904/12657997
To illustrate this I've converted the a.txt, b.txt and c.txt to zip files (now they are in a binary format).
cat -Encoding Byte *.zip > res.csv ; mv res.csv res2.txt
cat -Raw *.zip > res.csv ; mv res.csv res3.txt
ls .
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 15/03/2021 21:29 109 a.zip
-a---- 15/03/2021 21:29 109 b.zip
-a---- 15/03/2021 21:29 109 c.zip
-a---- 15/03/2021 21:39 2282 res2.txt
-a---- 15/03/2021 21:41 668 res3.txt
We can see that the output roughly doubles in size for res3.txt (for every UTF-8 byte read, UTF-16 writes 2).
The -Encoding Byte output, in combination with the output redirect, will make it even worse.
Why does Get-Item not work on some directories? For example gi $env:USERPROFILE\AppData returns "Could not find item", but ls $env:USERPROFILE\AppData works fine and can list files?
I want to use gi to pass a string to it to turn it into an object that has other members like LastWriteTime. If I use ls for Get-ChildItem I get the children, i.e. files in the directory, but not the directory.
I can work around this by using a filter on the parent like this: ls -h $env:USERPROFILE | ? {$_.Name -match "AppData"} | select Name,LastWriteTime - but there has to be a better way and it does not explain why gi does not work directly.
The AppData directory has the hidden attribute set:
PS C:\> attrib $env:USERPROFILE\AppData
H C:\Users\username\AppData
The hidden attribute means that Get-Item ignores it by default. The workaround is to use -Force:
PS C:\> Get-Item $env:USERPROFILE\AppData -Force
Directory: C:\Users\username
Mode LastWriteTime Length Name
---- ------------- ------ ----
d--h-- 11/25/2019 12:41 PM AppData
I'm using the following Powershell command:
Get-ChildItem -Recurse *.txt
But if there's multiple results the output will be like this:
Directory: C:\TestFolder\myfolder\
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- d/m/yyyy hh:MM PM 1234 dragons.txt
Directory: C:\TestFolder\anotherfolder\
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- d/m/yyyy hh:MM PM 66550 jabberwocky.txt
But I want to get grouped results in some form.
Maybe like this:
Mode LastWriteTime Length Directory Name
---- ------------- ------ --------- ----
-a--- d/m/yyyy hh:MM PM 1234 C:\TestFolder\myfolder\ dragons.txt
-a--- d/m/yyyy hh:MM PM 66550 C:\TestFolder\anotherfolder\ jabberwocky.txt
Or this:
Length FullPath
------ --------
1234 C:\TestFolder\myfolder\dragons.txt
66550 C:\TestFolder\anotherfolder\jabberwocky.txt
You probably get the idea. How can I accomplish this, preferably in a simple and elegant manner?
I tried Get-ChildItem -Recurse *.txt | Format-Table but that doesn't do much. I've also checked the most relevant similar questions suggested by Stack Overflow (i.e. "Recurse with PowerShell's Get-ChildItem", and others), but haven't been able to distill a solution so far.
Addendum:
I used help group and found that group is actually the exact alias for the Cmdlet I thought I was looking for: Group-Object. If I do this:
Get-ChildItem -Recurse *.txt | Group-Object "FullName"
I get:
Count Name Group
----- -------- -----
1 C:\TestFold... {C:\TestFolder\myfolder\dragons.txt}
1 C:\TestFold... {C:\TestFolder\anotherfolder\jabberwocky.txt}
But this requires me to simplify with an additional step to:
Get-ChildItem -Recurse *.txt | Group-Object "FullName" | Select-Object "Name"
Which gets:
Name
----
C:\TestFolder\myfolder\dragons.txt
C:\TestFolder\anotherfolder\jabberwocky.txt
If I really want extra properties, then I guess I want to "group on multiple properties", making the question effectively a duplicate of this other SO question.
However, all this "grouping" seems like overkill. Is there not a direct way of using Get-ChildItem to get the output I want?
PowerShell has its own way of displaying System.IO.DirectoryInfo and System.IO.FileInfo objects. If you don't want to see that then you just need to use Select-Object.
Get-ChildItem -Recurse c:\temp | select Mode,LastWriteTime,Length,Directory,Name
Group-Object is completely unnecessary. Given your need, I can see why Group-Object seemed appealing, but its power is not needed here the way it is in the linked question. What you really want is to change how PowerShell displays those objects. Format-Table on its own does not help for the same reason: it takes PowerShell's default output and makes a table of it. If you pass the property names to Format-Table, you get the same result as with Select-Object.
Get-ChildItem -Recurse c:\temp | Format-Table Mode,LastWriteTime,Length,Directory,Name
Please... Please... don't use that line if you intend to use the output in other functions. Format-cmdlets break objects and are used for the purpose of displaying data only.
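To get closer to the second layout in the question (Length plus a full path), a calculated property on Select-Object is one option. This is only a sketch: the FullPath column name comes from the question, and $rows is an illustrative variable name.

```powershell
# Files only, recursively; rename FullName to FullPath via a calculated property.
$rows = Get-ChildItem -Recurse -Filter *.txt |
    Where-Object { -not $_.PSIsContainer } |
    Select-Object Length, @{ Name = 'FullPath'; Expression = { $_.FullName } }

# $rows still holds objects, so it can be sorted, filtered or exported later.
$rows
```

Unlike the Format-Table variant, the result keeps its Length and FullPath properties for any command further down the pipeline.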
If you are just trying to get a list of files recursively with their full path names, don't use Group or Select. All of these commands pretend to be a spreadsheet of objects displayed in a text console.
Instead, use ForEach-Object (alias "%{ }") to dump the raw string data to the console. Example:
Get-ChildItem -Recurse *.txt | %{ $_.fullname }
(Incidentally, the above is roughly equivalent to the Linux command: find . -name "*.txt")
If you want to see which properties are accessible from the ForEach-Object script block, you can issue this command:
Get-ChildItem my_filepath | get-member
Alternatively, you could pipe the output of Get-ChildItem to Export-Csv command and open it in notepad.
Get-ChildItem -Recurse -file *.txt |
Select FullName |
Export-Csv "files.csv"
notepad files.csv
Alternatively, use cmd:
cmd /c dir /b /s *.txt
I'm trying this but it doesn't print anything:
Dir -Recurse "C:\temp" | Select Fullname
Looks like this command just selects file names. I want to see them in console.
Take a look at Get-Childitem
Dir -Recurse c:\path\ | Get-Childitem
Concerning your code in the question.
Your command should have worked as is. You are, in fact, already calling Get-ChildItem. If you check Get-Alias you will see what I'm trying to tell you.
PS C:\users\Cameron\Downloads> Get-Alias dir
CommandType Name ModuleName
----------- ---- ----------
Alias dir -> Get-ChildItem
Your code translates to
Get-ChildItem -Recurse "C:\temp" | Select Fullname
Again, I'm not sure why your code does not generate output since that is perfectly fine on a folder that contains files or directories. Might be an issue with the positional parameter maybe? What is your PowerShell version? ( Use Get-Host).
The code you have would send all file paths to console. Did you want that output somewhere else?
About the accepted answer
I'm pretty sure this code will duplicate output if there are folders in the path, since each directory is piped to the second Get-ChildItem, which then lists its contents again.
Dir -Recurse c:\path\ | Get-Childitem
Consider the following folder tree
C:\TEMP\TEST
│ File1.txt
│ File2.txt
│
└───Folder1
File3.txt
Consider the two commands run against that folder tree.
PS C:\users\Cameron\Downloads> Dir -Recurse c:\temp\test | Select Fullname
FullName
--------
C:\temp\test\Folder1
C:\temp\test\File1.txt
C:\temp\test\File2.txt
C:\temp\test\Folder1\File3.txt
PS C:\users\Cameron\Downloads> Dir -Recurse c:\temp\test | Get-Childitem | Select Fullname
FullName
--------
C:\temp\test\Folder1\File3.txt
C:\temp\test\File1.txt
C:\temp\test\File2.txt
C:\temp\test\Folder1\File3.txt
The second command shows two files called File3.txt when in reality there is only one.
get-childitem | format-list > filename.txt
This will give you a text file with name, size, last modified, etc.
If you want specific properties of each item, such as only the name of the file, the command is:
get-childitem | format-list name > filename.txt
This will give you the same text file, but with just the names of the files listed.
It might also be worth mentioning the -force switch which is required to see hidden items.
Is there a way to determine how wildcard matching is done in Get-ChildItem?
Various articles (1, 2) suggest that it is done through the WildcardPattern class, but I don't think this is the case. For example, suppose you have a file in C:\test\test2\testfile.txt. Then Get-ChildItem -Path "C:\*\testfile.txt" will not find the file while WildcardPattern::IsMatch will. Wildcard "*" matching in Get-ChildItem seems to be on directory level: so "\*\" will never match more than one level, like "\A\B\".
So if WildcardPattern class isn't used, then what is?
From what I know, it's using the WildcardPattern as you describe. However, the cmdlet Get-ChildItem limits it to the current directory (characters except \), so it won't conflict with the -Recurse switch that goes to unlimited levels.
With "C:\*\testfile.txt", the asterisk plays a role just for the first level directory (e.g test). The file you're looking for is not there and the output you get is expected. Add another asterisk for the second level and you'll get the desired output (e.g "C:\*\*\testfile.txt"). You can also add the Recurse switch to start searching from the current location, all the way downwards.
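The difference can be illustrated with a small sketch. Test-SegmentMatch is a hypothetical helper that only mimics the per-level behaviour described above; it is not how the FileSystem provider is actually implemented.

```powershell
# WildcardPattern on its own is not path-aware: * also spans backslashes.
$pattern = New-Object System.Management.Automation.WildcardPattern 'C:\*\testfile.txt'
$pattern.IsMatch('C:\test\test2\testfile.txt')   # True, as the question observed

# Hypothetical helper: match one path segment at a time, like Get-ChildItem appears to.
function Test-SegmentMatch {
    param([string] $Pattern, [string] $Path)
    $patternParts = $Pattern -split '\\'
    $pathParts    = $Path -split '\\'
    # A wildcard segment may not swallow extra directory levels.
    if ($patternParts.Count -ne $pathParts.Count) { return $false }
    for ($i = 0; $i -lt $patternParts.Count; $i++) {
        # -like applies wildcard matching (case-insensitive) to a single segment.
        if ($pathParts[$i] -notlike $patternParts[$i]) { return $false }
    }
    return $true
}

Test-SegmentMatch 'C:\*\testfile.txt'   'C:\test\test2\testfile.txt'   # False
Test-SegmentMatch 'C:\*\*\testfile.txt' 'C:\test\test2\testfile.txt'   # True
```

Under this reading the pattern class is the same in both cases; what changes is that each directory level is matched separately, so every level needs its own asterisk.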
Either would work:
gci c:\test\*\testfile.txt
or
gci c:\*\testfile.txt -recurse
Example:
PS C:\temp\test2> dir
Directory: C:\temp\test2
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 4/4/2013 10:41 PM 0 testfile.txt
PS C:\temp\test2> cd \
PS C:\> gci c:\*\testfile.txt -recurse -ea SilentlyContinue
Directory: C:\Temp\test2
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 4/4/2013 10:41 PM 0 testfile.txt