Multiple counts from Powershell Get-ChildItem?

How would I use powershell to count the number of files and the number of folders under a folder, emulating the Windows "properties" of that folder?
The FAQ says cross-posting is OK, so the full question can be found at: https://superuser.com/questions/605911/multiple-counts-from-powershell-get-childitem

Basically you need to enumerate all files and folders for a given path and maintain a count of each object type:
$files=$folders=0
$path = "d:\temp"
dir $path -recurse | foreach { if($_.psiscontainer) {$folders+=1} else {$files+=1} }
"'$path' contains $files files, $folders folders"

Edit:
Edited to improve efficiency and stop loops...
This can also be done by using the Measure-Object cmdlet with the Get-ChildItem cmdlet:
gci "C:\" -rec | Measure -property psiscontainer -max -sum | Select `
#{Name="Path"; Expression={$directory.FullName}},
#{Name="Files"; Expression={$_.Count - $_.Sum}},
#{Name="Folders"; Expression={$_.Sum}}
Output:
Path Files Folders
---- ----- -------
C:\Test 470 19
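On PowerShell 3.0 and later you can also let Get-ChildItem do the type filtering with its -File and -Directory switches. A rough equivalent of the counts above (two passes over the tree, so not necessarily faster) would be:
$path = "d:\temp"
$files   = (Get-ChildItem -Path $path -Recurse -File).Count
$folders = (Get-ChildItem -Path $path -Recurse -Directory).Count
"'$path' contains $files files, $folders folders"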

Related

Search a specified path for multiple .xml files within the same folder

I am seeking help creating a PowerShell script which will search a specified path for multiple .xml files within the same folder.
The script should provide the full path of the file(s) if found.
The script should also provide a date.
Here's my code:
$Dir = Get-ChildItem C:\windows\system32 -Recurse
#$Dir | Get-Member
$List = $Dir | where {$_.Extension -eq ".xml"}
$List | Format-Table Name
$folder = "C:\Windows\System32"
$results = Get-ChildItem -Path $folder -File -Include "*.xml" | Select Name, FullName, LastWriteTime
This will return only the XML files and display the file name, the full path to the file, and the last time it was written to. The -File switch requires PowerShell 3.0 or later, so if you are doing this on Windows 7 or Windows Server 2008 R2 you will have to make sure WMF has been updated first. Without -File, the line would look like this:
#Powershell 2.0
$results = Get-ChildItem -Path $folder -Include "*.xml" | Where {$_.PSIsContainer -eq $false} | Select Name, FullName, LastWriteTime
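One caveat worth flagging for both snippets above: in my experience -Include only takes effect when the path ends in a wildcard or -Recurse is used, so for a single folder -Filter is the safer choice, e.g.:
$results = Get-ChildItem -Path $folder -Filter "*.xml" | Where {$_.PSIsContainer -eq $false} | Select Name, FullName, LastWriteTime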
I like the Select method mentioned above for the simpler syntax, but if for some reason you just want the file names with their absolute path and without the column header that comes with piping to Select (perhaps because it will be used as input to another script, or piped to another function) you could do the following:
$folder = 'C:\path\to\folder'
Get-ChildItem -Path $folder -Filter *.xml -File -Name | ForEach-Object {
    # -Name returns names relative to $folder, so join them back onto the folder path
    Join-Path $folder $_
}
I'm not sure if Select lets you leave out the header.
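For what it's worth, Select-Object -ExpandProperty FullName does emit bare strings with no header, so this (using the same $folder as above) should also do the trick:
Get-ChildItem -Path $folder -Filter *.xml -File | Select-Object -ExpandProperty FullName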
You could also take a look at this answer to give you some more ideas or things to try if you need the results sorted, or the file extension removed:
https://stackoverflow.com/a/31049571/10193624
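As a rough sketch of those two ideas (sorting the results and dropping the extension via the file objects' BaseName property):
Get-ChildItem -Path $folder -Filter *.xml -File |
    Sort-Object Name |
    Select-Object BaseName, FullName, LastWriteTime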
I was able to make a few changes to export the results to a .txt file; it provides results, but I only want to isolate the same .xml files.
$ParentFolder = "C:\software"
$FolderHash = #{}
$Subfolders = Get-ChildItem -Path $ParentFolder
foreach ($EventFolder in $Subfolders) {
$XMLFiles = Get-ChildItem -Path $EventFolder.fullname -Filter *.xml*
if ($XMLFiles.Count -gt 1) {
$FolderHash += #{$EventFolder.FullName = $EventFolder.LastWriteTime}
}
}
$FolderHash
Judging from your self-answer you want a list of directories that contain more than one XML file without recursively searching those directories. In that case your code could be simplified to something like this:
Get-ChildItem "${ParentFolder}\*\*.xml" |
Group-Object Directory |
Where-Object { $_.Count -ge 2 } |
Select-Object Name, #{n='LastWriteTime';e={(Get-Item $_.Name).LastWriteTime}}

Powershell to display duplicate files

I have a task to check whether new files have been imported for the day into a shared folder and to alert on any duplicate files; no recursive check is needed.
The code below displays details, with size, for all files that are one day old. However, I need only the files with the same size, as I cannot compare them by name.
$Files = Get-ChildItem -Path E:\Script\test |
    Where-Object { $_.CreationTime -gt (Get-Date).AddDays(-1) }
$Files | Select-Object -Property Name, hash, LastWriteTime, @{N='SizeInKb'; E={ [double]('{0:N2}' -f ($_.Length/1kb)) }}
I didn't like the big DOS-like script answer written here, so here's an idiomatic way of doing it for Powershell:
From the folder you want to find the duplicates, just run this simple set of pipes
Get-ChildItem -Recurse -File `
| Group-Object -Property Length `
| ?{ $_.Count -gt 1 } `
| %{ $_.Group } `
| Get-FileHash `
| Group-Object -Property Hash `
| ?{ $_.Count -gt 1 } `
| %{ $_.Group }
Which will show all files and their hashes that match other files.
Each line does the following:
get files
from current directory (use -Path $directory otherwise)
recursively (if not wanted, remove -Recurse)
group based on file size
discard groups with less than 2 files
grab all those files
get hashes for each
group based on hash
discard groups with less than 2 files
get all those files
Add | %{ $_.path } to just show the paths instead of the hashes.
Add | %{ $_.path -replace "$([regex]::escape($(pwd)))",'' } to only show the relative path from the current directory (useful in recursion).
For the question-asker specifically, don't forget to whack in | Where-Object {$_.CreationTime -gt (Get-Date).AddDays(-1)} after the gci so you're not comparing files you don't want to consider, which might get very time-consuming if you have a lot of coincidentally same-length files in that shared folder.
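Putting that together with the pipeline above (using the folder from the question and skipping -Recurse, since no recursive check is needed), the date-filtered version would look something like this:
Get-ChildItem -Path E:\Script\test -File |
    Where-Object { $_.CreationTime -gt (Get-Date).AddDays(-1) } |
    Group-Object -Property Length |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group } |
    Get-FileHash |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group }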
Finally, if you're like me and just wanted to find dupes based on name, as google will probably take you here too:
gci -Recurse -file | Group-Object name | Where-Object { $_.Count -gt 1 } | select -ExpandProperty group | %{ $_.fullname }
All the examples here take into account only timestamp, length, and name. That is definitely not enough.
Imagine this example
You have two files, c:\test_path\test.txt and c:\test_path\temp\test.txt.
The first one contains 12345, the second contains 54321. Judged by name, length, and timestamp alone, these files would be considered identical even though they are not.
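A quick way to see this for yourself (using the example paths above) is to create the two files and hash them; the lengths match but the hashes do not:
New-Item -ItemType Directory -Path C:\test_path\temp -Force | Out-Null
Set-Content -Path C:\test_path\test.txt -Value '12345'
Set-Content -Path C:\test_path\temp\test.txt -Value '54321'
Get-FileHash C:\test_path\test.txt, C:\test_path\temp\test.txt | Format-Table Hash, Path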
I have created a duplicate checker based on hash calculation. I wrote it off the top of my head, so it is rather crude (but I think you get the idea, and it would be easy to optimize):
Edit: I've decided the source code was "too crude" (a nicer word for incorrect) and have improved it (removed superfluous code):
# The current directory where the script is executed
$path = (Resolve-Path .\).Path
$hash_details = @{}
$duplicities = @{}
# Remove unique record by size (different size = different hash)
# You can select only those you need with e.g. "*.jpg"
$file_names = Get-ChildItem -path $path -Recurse -Include "*.*" | ? {( ! $_.PSIsContainer)} | Group Length | ? {$_.Count -gt 1} | Select -Expand Group | Select FullName, Length
# I'm using SHA256 due to SHA1 collisions found
$hash_details = ForEach ($file in $file_names) {
    Get-FileHash -Path $file.FullName -Algorithm SHA256
}
# Just a counter for the hash table key
$counter = 0
ForEach ($first_file_hash in $hash_details) {
    ForEach ($second_file_hash in $hash_details) {
        If (($first_file_hash.Hash -eq $second_file_hash.Hash) -and ($first_file_hash.Path -ne $second_file_hash.Path)) {
            $duplicities.Add($counter, $second_file_hash)
            $counter += 1
        }
    }
}
# Output the duplicate files
If ($duplicities.Count -gt 0) {
    # Write-Output $duplicities.Values
    Write-Output "Duplicate files found:" $duplicities.Values.Path
    $duplicities.Values | Out-File -Encoding UTF8 duplicate_log.txt
} Else {
    Write-Output 'No duplicates found'
}
I have created a test structure:
PS C:\prg\PowerShell\_Snippets\_file_operations\duplicities> Get-ChildItem -path $path -Recurse
Directory: C:\prg\PowerShell\_Snippets\_file_operations\duplicities
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 9.4.2018 9:58 test
-a--- 9.4.2018 11:06 2067 check_for_duplicities.ps1
-a--- 9.4.2018 11:06 757 duplicate_log.txt
Directory: C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test
Mode LastWriteTime Length Name
---- ------------- ------ ----
d---- 9.4.2018 9:58 identical_file
d---- 9.4.2018 9:56 t
-a--- 9.4.2018 9:55 5 test.txt
Directory: C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test\identical_file
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 9.4.2018 9:55 5 test.txt
Directory: C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test\t
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 9.4.2018 9:55 5 test.txt
(The file in ..\duplicities\test\t is different from the others.)
The result of running the script:
The console output:
PS C:\prg\PowerShell\_Snippets\_file_operations\duplicities> .\check_for_duplicities.ps1
Duplicate files found:
C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test\identical_file\test.txt
C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test\test.txt
The duplicate_log.txt file contains more detailed information:
Algorithm Hash Path
--------- ---- ----
SHA256 5994471ABB01112AFCC18159F6CC74B4F511B99806DA59B3CAF5A9C173CACFC5 C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test\identical_file\test.txt
SHA256 5994471ABB01112AFCC18159F6CC74B4F511B99806DA59B3CAF5A9C173CACFC5 C:\prg\PowerShell\_Snippets\_file_operations\duplicities\test\test.txt
Conclusion
As you can see, the different file is correctly omitted from the result set.
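As an aside, the nested comparison loop in the script above can be replaced by grouping the already-computed hashes. Just a sketch (PowerShell 3.0+ for the member enumeration), reusing the $hash_details variable:
$duplicates = $hash_details | Group-Object Hash | Where-Object { $_.Count -gt 1 }
$duplicates | ForEach-Object { $_.Group.Path }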
Since it is the file contents that you are determining to be duplicates, it's more prudent to just hash the files and compare the hashes.
Name, size, and timestamp would not be prudent attributes for this use case, since only the hash tells you whether the files have the same content.
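For example (the file names here are placeholders), comparing two files by content boils down to comparing their hashes:
$hashA = (Get-FileHash -Path 'C:\temp\a.txt' -Algorithm SHA256).Hash
$hashB = (Get-FileHash -Path 'C:\temp\b.txt' -Algorithm SHA256).Hash
if ($hashA -eq $hashB) { 'Same content' } else { 'Different content' }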
See these discussions
Need a way to check if two files are the same? Calculate a hash of
the files. Here is one way to do it:
https://blogs.msdn.microsoft.com/powershell/2006/04/25/duplicate-files
Duplicate File Finder and Remover
And now the moment you have been waiting for....an all PowerShell file
duplicate finder and remover! Now you can clean up all those copies of
pictures, music files, and videos. The script opens a file dialog box
to select the target folder, recursively scans each file for duplicates...
https://gallery.technet.microsoft.com/scriptcenter/Duplicate-File-Finder-and-78f40ae9
This might be helpful for you.
$files = Get-ChildItem 'E:\SC' |
    Where-Object { $_.CreationTime -gt (Get-Date).AddDays(-1) } |
    Group-Object -Property Length
foreach ($filegroup in $files)
{
    if ($filegroup.Count -ne 1)
    {
        foreach ($file in $filegroup.Group)
        {
            Invoke-Item $file.FullName
        }
    }
}

Get total number of files and Sub folders in a folder

I have a task to carry out three times a day on a WS2012R2 server: get the disk size and the total number of files and folders, including subdirectories, from a folder on a remote server.
Currently I get this information by RDP'ing to the target, navigating to the folder, and right-clicking it to copy the info.
I have already tried the PowerShell script :
Get-ChildItem E:\Data -Recurse -File | Measure-Object | %{$_.Count}
and other PowerShell scripts.
These produced countless errors about not having permission for some subdirectories, or simply gave results I didn't want.
I have tried VBscript but VBscript simply cannot get this information.
You can just access the count property:
$items = Get-ChildItem E:\Data -Recurse
($items | Where { -not $_.PSIsContainer }).Count #files
($items | Where { $_.PSIsContainer }).Count #folders
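If the access-denied errors mentioned in the question get in the way, adding -ErrorAction SilentlyContinue to the Get-ChildItem call suppresses them (at the cost of silently skipping the directories you cannot read):
$items = Get-ChildItem E:\Data -Recurse -ErrorAction SilentlyContinue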
To summarise the comments so far.
You can get the size of the C drive with:
$Drive = "C"
Get-PSDrive -Name $Drive | Select-Object @{N="Used Space on $Drive Drive (GB)"; E={[Math]::Round($_.Used/1GB)}}
But you cannot reliably use code like the snippet below to count the files and folders for the whole C: drive, although it works for simple file structures and simple drives.
I use the code for my other drives and for my home path; just change $Path to e.g. "$Env:HomePath".
There's the path-length problem, which is a .NET thing and won't be fixed until everyone (and PowerShell) is using .NET 4.6.2. Then there's the fact that it is you doing the counting, not the operating system.
[Int]$NumFolders = 0
[Int]$NumFiles = 0
$Path = "C:\"
$Objects = Get-ChildItem $Path -Recurse
$Size = $Objects | Measure-Object -Property Length -Sum
$NumFiles = ($Objects | Where { -not $_.PSIsContainer }).Count
$NumFolders = ($Objects | Where { $_.PSIsContainer }).Count
$Summary = New-Object PSCustomObject -Property @{ "Path" = $Path; "Files" = $NumFiles; "Folders" = $NumFolders; "Size in MB" = ([Math]::Round($Size.Sum / 1MB)) }
$Summary | Format-List
To run this on remote computers, I would recommend using New-PsSession to the computer/computers, then Invoke-Command to run the code using the new sessions.
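A rough sketch of that remoting pattern (the computer name is a placeholder, the path is the one from the question, and it assumes PowerShell remoting is enabled on the target):
$session = New-PSSession -ComputerName 'SERVER01'
Invoke-Command -Session $session -ScriptBlock {
    $Objects = Get-ChildItem 'E:\Data' -Recurse -ErrorAction SilentlyContinue
    $Files   = $Objects | Where-Object { -not $_.PSIsContainer }
    New-Object PSCustomObject -Property @{
        "Files"      = $Files.Count
        "Folders"    = ($Objects | Where-Object { $_.PSIsContainer }).Count
        "Size in MB" = [Math]::Round(($Files | Measure-Object -Property Length -Sum).Sum / 1MB)
    }
}
Remove-PSSession $session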

Determine recursively both COUNT and SUM of all extensions in a folder

I would like to be able to select a remote folder and scan it recursively for all file extensions. For each extension discovered, I would need a total count as well as the total size for that file type.
I've found a script here that works for a single file extension using the -include switch, but rather than running the script scores of times, it would be nice to simply run once and collect all extensions.
$hostname=hostname
$directory = "D:\foo"
$FolderItems = Get-ChildItem $directory -recurse -Include *.txt
$Measurement = $FolderItems | Measure-Object -property length -sum
$colitems = $FolderItems | measure-Object -property length -sum
"$hostname;{0:N2}" -f ($colitems.sum / 1MB) + "MB;" + $Measurement.count + " files;"
I think I need to use Get-ChildItem $directory | Group-Object -Property Extension to somehow list the extensions, if that's helpful.
The ideal output would be something like this:
Extension, Size (MB), Count
jpg,1.72,203
txt,0.23,105
xlsx,156.12,456
I'm using PowerShell v4.0 on a Windows 7 machine to remotely connect to the server. I could run the script locally, but the Windows 2008 R2 machine only has v3.0.
Does anyone have any ideas?
This is one approach:
#Get all items
Get-ChildItem -Path $directory -Recurse |
    #Get only files
    Where-Object { !$_.PSIsContainer } |
    #Group by extension
    Group-Object Extension |
    #Get data
    Select-Object @{n="Extension"; e={$_.Name -replace '^\.'}}, @{n="Size (MB)"; e={[math]::Round((($_.Group | Measure-Object Length -Sum).Sum / 1MB), 2)}}, Count
Extension Size (MB) Count
--------- --------- -----
mkv 164,03 1
xlsx 0,03 3
dll 0,32 5
lnk 0 1
url 0 1
txt 0 1
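If you need that summary in the comma-separated form shown in the question, you can pipe the same result to Export-Csv (the output path here is just an example):
Get-ChildItem -Path $directory -Recurse |
    Where-Object { !$_.PSIsContainer } |
    Group-Object Extension |
    Select-Object @{n="Extension"; e={$_.Name -replace '^\.'}}, @{n="Size (MB)"; e={[math]::Round((($_.Group | Measure-Object Length -Sum).Sum / 1MB), 2)}}, Count |
    Export-Csv -Path "D:\foo\extensions.csv" -NoTypeInformation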

Comparing 2 folders and saving the paths into a variable?

I've been given the task of comparing 2 folders, FolderA and FolderB and noting any files that exist in A but not in B.
Sorry for not explaining myself fully. Maybe it would help if I explain our situation. A company sales employee has left our company to go to a competitor. He has files on his work laptop's local hard drive. We are trying to establish whether there are any files that exist on his computer but not in the shared network folder.
I need to produce a list of any files (along with their paths) that are present on his laptop but not on the share network location. The file structure between the laptop local hard drive and the shared network location are different. What's the best way to go about this?
$folderAcontent = "C:\temp\test1"
$folderBcontent = "C:\temp\test2"
$FolderAContents = Get-ChildItem $folderAcontent -Recurse | where-object {!$_.PSIsContainer}
$FolderBContents = Get-ChildItem $folderBcontent -Recurse | where-object {!$_.PSIsContainer}
$FolderList = Compare-Object -ReferenceObject ($FolderAContents ) -DifferenceObject ($FolderBContents) -Property name
$FolderList | fl *
Use the Compare-Object cmdlet:
Compare-Object (gci $folderAcontent) (gci $folderBcontent)
If you want to list the files that are only in $folderAcontent, select the results with the <= SideIndicator:
Compare-Object (gci $folderAcontent) (gci $folderBcontent) | where {$_.SideIndicator -eq "<="}
Assuming that the filenames in both directories are the same, you can do something like the following:
$folderAcontent = "C:\temp\test1"
$folderBcontent = "C:\temp\test2"
ForEach($File in Get-ChildItem -Recurse -LiteralPath $FolderA | where {$_.psIsContainer -eq $false} | Select-Object Name)
{
if(!(Test-Path "$folderBcontent\$File"))
{
write-host "Missing File: $folderBcontent\$File"
}
}
The above will only work for files (not subdirectories) present in folder A
Try:
#Set locations
$laptopfolder = "c:\test1"
$serverfolder = "c:\test2"
#Get contents
$laptopcontents = Get-ChildItem $laptopfolder -Recurse | where {!$_.PSIsContainer}
$servercontents = Get-ChildItem $serverfolder -Recurse | where {!$_.PSIsContainer}
#Compare on name and length and find changed files on laptop
$diff = Compare-Object $laptopcontents $servercontents -Property name, length -PassThru | where {$_.sideindicator -eq "<="}
#Output differences
$diff | Select-Object FullName
If you add LastWriteTime after Length in the Compare-Object cmdlet, it will compare the modified date too (in case a file was updated but is still the same size). Just be aware that it only looks for different dates, not whether a file is newer or older. :)
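A sketch of that variant, using the same variables as above and comparing on name, length, and last write time:
$diff = Compare-Object $laptopcontents $servercontents -Property Name, Length, LastWriteTime -PassThru |
    Where-Object { $_.SideIndicator -eq "<=" }
$diff | Select-Object FullName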