PowerShell - Find duplicate files inside ZIPs and CABs

I am trying to write a script that finds duplicate files inside compressed files.
The compressed files can be ZIP or CAB (I also need help extracting CAB files, because that part currently does not work).
So far, the script extracts the ZIPs to a temp folder (I don't know how to extract CABs), and if there is a .vip file inside, it needs to be extracted to the same folder as well. Currently all files end up in the same temp folder; what I need is to extract each ZIP/CAB into a folder named after the original archive, even if it has a .vip inside (the ZIPs/CABs are not flat). In the next step, I need to find duplicate files and display all the duplicates and where they were found.
The script below is not working:
$tempFolder = Join-Path ([IO.Path]::GetTempPath()) (New-GUID).ToString('n')
$compressedfiles = Get-ChildItem -path C:\Intel\* -Include "*.zip","*.CAB"
foreach ($file in $compressedfiles) {
if ($file -like "*.zip") {
$zip = [System.IO.Compression.ZipFile]::ExtractToDirectory($file, $tempFolder)
$test = Get-ChildItem -path $tempFolder\* -Include "*.vip"
if ($test) {
$zip = [System.IO.Compression.ZipFile]::ExtractToDirectory($test, $tempFolder)
}
}
}
$Files=gci -File -Recurse -path $tempFolder | Select-Object -property FullName
$MatchedSourceFiles=@()
ForEach ($SourceFile in $Files)
{
$MatchingFiles=@()
$MatchingFiles=$Files |Where-Object {$_.name -eq $SourceFile.name}
if ($MatchingFiles.Count -gt 0)
{
$NewObject=[pscustomobject][ordered]@{
File=$SourceFile.FullName
#MatchingFiles=$MatchingFiles
}
$MatchedSourceFiles+=$NewObject
}
}
$MatchedSourceFiles
Remove-Item $tempFolder -Force -Recurse

Building on what you have already tried, you could do this like:
Add-Type -AssemblyName System.IO.Compression.FileSystem
$tempFolder = Join-Path -Path ([IO.Path]::GetTempPath()) -ChildPath (New-GUID).Guid
$compressedfiles = Get-ChildItem -Path 'C:\Intel' -Include '*.zip','*.CAB' -File -Recurse
$MatchedSourceFiles = foreach ($file in $compressedfiles) {
switch ($file.Extension) {
'.zip' {
# the destination folder should NOT already exist here
$null = [System.IO.Compression.ZipFile]::ExtractToDirectory($file.FullName, $tempFolder)
Get-ChildItem -Path $tempFolder -Filter '*.vip' -File -Recurse | ForEach-Object {
$null = [System.IO.Compression.ZipFile]::ExtractToDirectory($_.FullName, $tempFolder)
}
}
'.cab' {
# the destination folder MUST exist for expanding .cab files
$null = New-Item -Path $tempFolder -ItemType Directory -Force
expand.exe $file.FullName -F:* $tempFolder > $null
}
}
# now see if there are files with duplicate names
Get-ChildItem -Path $tempFolder -File -Recurse | Group-Object Name |
Where-Object { $_.Count -gt 1 } | ForEach-Object {
foreach ($item in $_.Group) {
# output objects to be collected in $MatchedSourceFiles
[PsCustomObject]@{
SourceArchive = $file.FullName
DuplicateFile = '.{0}' -f $item.FullName.Substring($tempFolder.Length) # relative path
}
}
}
# delete the temporary folder
$tempFolder | Remove-Item -Force -Recurse
}
# display on screen
$MatchedSourceFiles
# save as CSV file
$MatchedSourceFiles | Export-Csv -Path 'X:\DuplicateFiles.csv' -UseCulture -NoTypeInformation
The output would be something like this:
SourceArchive    DuplicateFile
-------------    -------------
D:\Test\test.cab .\test\CA2P30.BA0
D:\Test\test.cab .\test\dupes\CA2P30.BA0
D:\Test\test.zip .\test\CA2P3K.DAT
D:\Test\test.zip .\test\dupes\CA2P3K.DAT
D:\Test\test.zip .\test\CA2P60.BA0
D:\Test\test.zip .\test\dupes\CA2P60.BA0
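If each archive should land in a folder named after the archive itself, as the question asks, the same idea can be sketched as follows. This is only a minimal illustration: it builds its own throwaway sample.zip in the temp folder (the file names a.dat and b.dat are made up), covers ZIP files only, and leaves out the .vip and .cab handling.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Build a small sample archive so the sketch runs on its own (hypothetical data)
$work  = Join-Path ([IO.Path]::GetTempPath()) ([Guid]::NewGuid().ToString('n'))
$src   = Join-Path $work 'src'
$dupes = Join-Path $src 'dupes'
$null  = New-Item -Path $dupes -ItemType Directory -Force
Set-Content -Path (Join-Path $src 'a.dat')   -Value 'one'
Set-Content -Path (Join-Path $dupes 'a.dat') -Value 'two'
Set-Content -Path (Join-Path $src 'b.dat')   -Value 'three'
$zipPath = Join-Path $work 'sample.zip'
[System.IO.Compression.ZipFile]::CreateFromDirectory($src, $zipPath)

# Extract into a folder that carries the archive's own name
$zip    = Get-Item -LiteralPath $zipPath
$target = Join-Path $work $zip.BaseName
[System.IO.Compression.ZipFile]::ExtractToDirectory($zip.FullName, $target)

# Report every file name that occurs more than once inside that folder
$duplicates = Get-ChildItem -Path $target -File -Recurse |
    Group-Object -Property Name |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object {
        foreach ($item in $_.Group) {
            [PsCustomObject]@{
                SourceArchive = $zip.FullName
                DuplicateFile = '.{0}' -f $item.FullName.Substring($work.Length)
            }
        }
    }
$duplicates

# Clean up the sample data
Remove-Item -LiteralPath $work -Recurse -Force
```

Because every archive gets its own target folder, the extracted files of different archives cannot overwrite each other, and the folder name in the relative path already tells you which archive a duplicate came from.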

Related

copy newer files without keeping folders

I have a folder with a number of subfolders containing files and want to copy all files to the root folder but only overwrite if newer.
In powershell I can do -
Get-ChildItem D:\VaM\Custom\Atom\Person\Morphs\temp2\female -Recurse -file | Copy-Item -Destination D:\VaM\Custom\Atom\Person\Morphs\female
But this will overwrite all files, I only want to overwrite files if the copied file is newer.
robocopy can overwrite only older files, but it keeps the folder structure.
Try this
$root = 'D:\VaM\Custom\Atom\Person\Morphs\temp2\female'
[bool]$Delete = $false
Get-ChildItem $root -Recurse -File |
Where-Object {$_.DirectoryName -ne $root } | # Do not touch files already seated in root
ForEach-Object {
$rootNameBrother = Get-Item "$root\$($_.Name)" -ea 0
if($rootNameBrother -and $rootNameBrother.LastWriteTime -lt $_.LastWriteTime) {
# RootFile with same name exists and is Older - Copy and override
Copy-Item -Path $_.FullName -Destination $rootNameBrother.FullName -Force
}
elseif ($rootNameBrother -and $rootNameBrother.LastWriteTime -ge $_.LastWriteTime) {
# RootFile with same name exists and is Newer or same Age
# Delete non root File if allowed
if($Delete) { Remove-Item $_.FullName -Force }
}
}
Set...
$Delete = $true
...if you wish to delete non-root files that could not be copied because root already contained a file with the same name and a later (or equal) modified date.
You also can set the
$VerbosePreference = "Continue"
$WhatIfPreference = $true
variables, just to be safe when you execute the script for the first time.
If you wish to delete all empty subfolders, you can run this:
$allFolders =`
Get-ChildItem $root -Recurse -Directory |
    ForEach-Object {
        # Add a new Depth script property
        $_ | Add-Member -PassThru -Force -MemberType ScriptProperty -Name Depth -Value {
            # Get the depth of the folder by looping through each character and counting the backslashes
            (0..($this.FullName.Length - 1) | ForEach {$this.FullName.Substring($_,1)} | Where-Object {$_ -eq "\"}).Count
        }
    }
# Sort all folders by the new Depth property (deepest first) and loop through them
$allFolders | Sort -Property Depth -Descending |
    ForEach-Object {
        # if .GetFileSystemInfos() returns no items, the folder is empty
        if($_.GetFileSystemInfos().Count -eq 0) {
            Remove-Item $_.FullName -Force # Remove folder
        }
    }
You can do it like this:
$source = 'D:\VaM\Custom\Atom\Person\Morphs\temp2\female'
$destination = 'D:\VaM\Custom\Atom\Person\Morphs\female'
Get-ChildItem -Path $source -Recurse -File | ForEach-Object {
# try and get the existing file in the destination folder
$destFile = Get-Item -Path (Join-Path -Path $destination -ChildPath $_.Name) -ErrorAction SilentlyContinue
if (!$destFile -or $_.LastWriteTime -gt $destFile.LastWriteTime) {
# copy the file if it either did not exist in the destination or if this file is newer
Write-Host "Copying file $($_.Name)"
$_ | Copy-Item -Destination $destination -Force
}
}
I ended up doing this:
Get-ChildItem G:\VaM\Custom\Atom\Person\Morphs\temp2\ -Recurse |
Where-Object { $_.PSIsContainer -eq $true } |
Foreach-Object { robocopy $_.FullName G:\VaM\Custom\Atom\Person\Morphs\female /xo /ndl /np /mt /nfl}
It runs through the directory structure and copies the contents of each directory to the destination, but only overwrites older files.

PowerShell works only when running in a console and not as a PS1 executable file

I am fairly new to PowerShell and am having trouble getting a PS1 executable file to work. Running the script in a PowerShell console works completely fine: it copies the items and creates the correct log filename.
The expectation would be to Right-click the PS1 file containing the script, run the script with "Run with PowerShell", and then allow the script to finish with a log file populated and files copied when user prompt selects yes.
At this point, there are no error messages, but the PS1 script file gets replaced by a ton of unrecognizable symbols/characters, and the log file is created as "Box Project Files.ps1JobFileLocations.log" instead of "JobFileLocations.log".
The PowerShell version being used is 5.1. Windows 10 OS. Set-ExecutionPolicy was set to Unrestricted and confirmed as Unrestricted for CurrentUser and LocalMachine. Unblock-File was also tried.
Below is the script that works in a PowerShell Console but not as a PS1 executable file.
# Drawing Tag Searches
$MechDWGFilterList = @('*IFC*','*mech*', '*permit*', '*final*')
$DatabaseFilterList = @('*field*','*software*')
# Root folder and destination folder
$JobNumber = '*'+(Read-Host -Prompt 'Enter in job number')+'*'
$srcRoot = 'C:\Users\username\Box\'
$JobRoot = (Get-ChildItem -Path $srcRoot -Filter "*Active Projects*" -Recurse -Directory -Depth 1).Fullname
$dstRoot = $MyInvocation.MyCommand.Path
# Find job number pdf file
$JobFolder = (Get-ChildItem -Path $JobRoot -Filter "$JobNumber" -Recurse -Directory -Depth 0).Fullname
$Logfile = $dstRoot+"JobFileLocations.log"
$reply = Read-Host -Prompt "Make a copy of relevant project files to local drive?[y/n]"
# Find sub-folder from job folder
$ProposalFolder = (Get-ChildItem -Path $JobFolder -Filter "*Proposals*" -Recurse -Directory).Fullname
$MechDWGFolder = (Get-ChildItem -Path $JobFolder -Filter "*Plans*" -Recurse -Directory).Fullname
$SubmittalFolder = (Get-ChildItem -Path $JobFolder -Filter "*Submittal*" -Recurse -Directory).Fullname
$DatabaseFolder = (Get-ChildItem -Path $JobFolder -Filter "*Backup*" -Recurse -Directory).Fullname
$EstimateFolder = (Get-ChildItem -Path $JobFolder -Filter "*Estimate*" -Recurse -Directory).Fullname
# Find files from list
$ProposalList = Get-ChildItem -Path $ProposalFolder -Filter '*proposal*.pdf' -r | Sort-Object -Descending -Property LastWriteTime | Select -First 1
$MechDWGList = Get-ChildItem -Path $MechDWGFolder -Filter *.pdf -r | Sort-Object -Descending -Property LastWriteTime
$SubmittalList = Get-ChildItem $SubmittalFolder -Filter '*submittal*.pdf' -r | Sort-Object -Descending -Property LastWriteTime | Select -First 1
$DatabaseList = Get-ChildItem $DatabaseFolder -Filter *.zip -r | Sort-Object -Descending -Property LastWriteTime | Select -First 1
$EstimateList = Get-ChildItem -Path $EstimateFolder -Filter *.xl* -r | Sort-Object -Descending -Property LastWriteTime
# Log file path location and copy file to local directory
# Function to add items to a log text file
Function LogWrite
{
Param ([string]$logstring)
Add-content $Logfile -value $logstring
}
# Log file path location and copy file to local directory
LogWrite "::==========================================::`n|| Project Document Paths ||`n::==========================================::"
LogWrite "`nNote: If a section has more than one file path, files are listed from most recent to oldest.`n"
LogWrite "----------Scope Document/Proposal(s)----------"
foreach ($file in $ProposalList)
{
LogWrite $file.FullName
if ( $reply -match "[yY]" )
{
Copy-Item -Path $($file.FullName) -Destination $dstRoot
}
}
LogWrite "`n-------------Mechanical Drawing(s)------------"
foreach ($file in $MechDWGList)
{
# Where the file name contains one of these filters
foreach($filter in $MechDWGFilterList)
{
if($file.Name -like $filter)
{
LogWrite $file.FullName
if ( $reply -match "[yY]" )
{
Copy-Item -Path $($file.FullName) -Destination $dstRoot
}
}
}
}
LogWrite "`n-------------Controls Submittal(s)------------"
foreach ($file in $SubmittalList)
{
LogWrite $file.FullName
if ( $reply -match "[yY]" )
{
Copy-Item -Path $($file.FullName) -Destination $dstRoot
}
}
LogWrite "`n-------------------Database-------------------"
foreach ($file in $DatabaseList)
{
LogWrite $file.FullName
if ( $reply -match "[yY]" )
{
Copy-Item -Path $($file.FullName) -Destination $dstRoot
}
}
LogWrite "`n------------------Estimate(s)-----------------"
foreach ($file in $EstimateList)
{
LogWrite $file.FullName
if ( $reply -match "[yY]" )
{
Copy-Item -Path $($file.FullName) -Destination $dstRoot
}
}
# If running in the console, wait for input before closing.
if ($Host.Name -eq "ConsoleHost")
{
Write-Host "Press any key to continue..."
$Host.UI.RawUI.ReadKey("NoEcho,IncludeKeyUp") > $null
}
Could someone help me understand what is wrong with running the script as a PS1 file?
The problem with your script is this particular line:
$dstRoot = $MyInvocation.MyCommand.Path
$MyInvocation.MyCommand.Path resolves the rooted filesystem path to the script itself - which is why you get Box Project Files.ps1 (presumably the name of the script) in the log path.
To get the path of the parent directory of any file path, you can use Split-Path -Parent:
$dstRoot = Split-Path -LiteralPath $MyInvocation.MyCommand.Path -Parent
That being said, since Windows PowerShell 3.0, both the script file path and its directory have been available via the $PSCommandPath and $PSScriptRoot automatic variables, respectively, so you can simplify the code to just:
$dstRoot = $PSScriptRoot
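One thing to watch after this change: a directory path such as $PSScriptRoot normally has no trailing backslash, so the concatenation $dstRoot+"JobFileLocations.log" from the question would glue the file name onto the folder name. A minimal sketch, assuming the log should sit next to the script, builds the path with Join-Path instead (the fallback to the current location is my own addition for interactive use):

```powershell
# $PSScriptRoot is empty when the code is pasted into a console,
# so fall back to the current folder in that case (assumption for illustration)
$dstRoot = if ($PSScriptRoot) { $PSScriptRoot } else { (Get-Location).Path }

# Join-Path inserts the path separator between folder and file name
$Logfile = Join-Path -Path $dstRoot -ChildPath 'JobFileLocations.log'
$Logfile
```

With $dstRoot pointing at a folder rather than at the script file, the Copy-Item calls in the question also stop overwriting the script itself.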

Trying to use powershell to put files in folders based on their extension and the name of the folder

I have a directory with three files: .xlsx, .docx, and .txt. I also have folders in that same directory called xlsx, docx and txt. Basically, I am trying to put each file into its corresponding folder, as a way to practice my PowerShell skills. I'm very new to PowerShell and have tried the following. I can tell it's wrong, but I'm not quite sure why.
$folders = Get-ChildItem -Directory
$files = Get-ChildItem -File
foreach ($file in $files) {
foreach ($folder in $folders) {
if ("*$file.extension*" -like "*$folder.Name*") {
move-item $file -Destination "C:\Users\userA\Desktop\$folder.name"
}
}
}
Try the code below. The Where-Object cmdlet finds the folder that corresponds to each file. I remove the dot because the extension includes it otherwise.
$folders = Get-ChildItem -Directory
$files = Get-ChildItem -File
foreach ($file in $files) {
    # find the folder whose name equals the extension without its dot
    $folder = $folders | Where-Object { $_.Name -Like $file.Extension.Replace(".","") }
    # skip files that have no matching folder
    if ($folder) {
        Move-Item -LiteralPath $file.FullName -Destination $folder.FullName
    }
}
In your example, be careful how your strings are actually being interpreted. With "*$item.Name*", only $item is expanded and the text ".Name" is appended literally, so you do not get the property value. In this case you need the subexpression syntax "*$($item.Name)*" to get the correct string.
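A quick way to see the difference is to build both strings from the same object. The snippet below is just an illustration; $item is a made-up stand-in object with a single Name property, not one of the folders from the question.

```powershell
# Hypothetical stand-in object with a Name property
$item = [pscustomobject]@{ Name = 'docx' }

# Only $item is expanded here; '.Name' stays in the string as plain text
$wrong = "*$item.Name*"

# The subexpression $( ... ) is evaluated first, so the property value is used
$right = "*$($item.Name)*"

$wrong
$right   # *docx*
```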
Here are some adjustments to your approach that make it work. Breaking the -Destination parameter out to a separate variable $newpath lets you set a debug statement there so you can easily examine what it's creating.
$folders = Get-ChildItem -Directory
$files = Get-ChildItem -File
foreach ($file in $files) {
foreach ($folder in $folders) {
if ($file.extension.trim(".") -like $folder.Name) {
$newpath = ("{0}\{1}" -f $folder.FullName, $file.Name)
move-item $file -Destination $newpath
}
}
}
You could even create target folders for extensions if they do not exist yet:
$SourceFolder = 'C:\sample'
$TargetFolder = 'D:\sample'
Get-ChildItem -Path $SourceFolder -File |
    ForEach-Object {
        # target folder is named after the extension without its leading dot
        $DestinationFolder = Join-Path -Path $TargetFolder -ChildPath $_.Extension.TrimStart('.')
        if(-not (Test-Path -Path $DestinationFolder)){
            New-Item -Path $DestinationFolder -ItemType Directory | Out-Null
        }
        Copy-Item -Path $_.FullName -Destination $DestinationFolder -Force
    }

Duplicate file name as folder, insert file

I am trying to use PowerShell to:
1. scan folder D://Mediafolder for names of media files
2. create a folder for each media file scanned, with the same name
3. insert each media file into the matching folder.
I can find no documentation or thread of this, and I am more fluent in Linux than Windows. I've tried many times to piece this together, but to no avail.
Hope this will help :)
This will create a folder for each file with the same name, so if you have a file called xyz.txt, it will create a folder called xyz and move the file to this folder.
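$path = "D:\MediaFolder"
# -File skips folders that already exist in $path
$items = Get-ChildItem $path -File
Foreach ($item in $items)
{
    # BaseName is the file name without its extension, even if the name contains several dots
    $folderName = $item.BaseName
    $null = New-Item "$path\$folderName" -ItemType Directory -Force
    Move-Item -LiteralPath $item.FullName -Destination "$path\$folderName"
}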
File Sorting based on extension should do the job:
$folder_path = read-host "Enter the folder path without space"
$file = gci $folder_path -Recurse | ? {-not $_.psiscontainer}
$file | group -property extension | % {if(!(test-path(join-path $folder_path -child $_.name.replace('.','')))){new-item -type directory $(join-path $folder_path -child $_.name.replace('.','')).toupper()}}
$file | % { move-item $_.fullname -destination $(join-path $folder_path -child $_.extension.replace(".",""))}
$a = Get-ChildItem $folder_path -recurse | Where-Object {$_.PSIsContainer -eq $True}
$a | Where-Object {$_.GetFiles().Count -eq 0} | Remove-Item -Force
This will iterate over the files in the media_dir and move those with the extensions in media_types to a folder with the same basename. When you are satisfied that the files will be moved to the correct directory, remove the -WhatIf from the Move-Item statement.
PS C:\src\t> type .\ms.ps1
$media_dir = 'C:\src\t\media'
$new_dir = 'C:\src\t\newmedia'
$media_types = @('.mp3', '.mp4', '.jpeg')
Get-ChildItem -Path $media_dir |
ForEach-Object {
$base_name = $_.BaseName
if ($media_types -contains $_.Extension) {
if (-not (Test-Path $new_dir\$base_name)) {
New-Item -Path $new_dir\$base_name -ItemType Directory | Out-Null
}
Move-Item $_.FullName $new_dir\$base_name -WhatIf
}
}

How to create folder structure skeleton using Powershell?

We have a folder/subfolder structure for our application.
Whenever we add new modules, we have to copy the folder structure exactly, without copying the files.
How can I copy the folders and subfolders in the hierarchy while leaving the files alone?
This is a bit 'hacky' as Powershell doesn't handle this very well... received wisdom says to use xCopy or Robocopy but if you really need to use Powershell, this seems to work OK:
$src = 'c:\temp\test\'
$dest = 'c:\temp\test2\'
$dirs = Get-ChildItem $src -recurse | where {$_.PSIsContainer}
foreach ($dir in $dirs)
{
$target = ($dir.Fullname -replace [regex]::Escape($src), $dest)
if (!(test-path $target))
{
New-Item -itemtype "Directory" $target -force
}
}
$source = "<yoursourcepath>"
$destination = "<yourdestinationpath>"
Get-ChildItem -Path $source -Recurse -Force |
Where-Object { $_.psIsContainer } |
ForEach-Object { $_.FullName -replace [regex]::Escape($source), $destination } |
ForEach-Object { $null = New-Item -ItemType Container -Path $_ }
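Both answers above derive the target path with -replace. As a variation, here is a minimal sketch that takes the relative path with Substring() instead, so only the leading source prefix is ever cut off. Copy-FolderSkeleton is a hypothetical helper name I made up for this example, not a built-in cmdlet.

```powershell
# Copy-FolderSkeleton is a hypothetical helper name, not a built-in cmdlet
function Copy-FolderSkeleton {
    param(
        [Parameter(Mandatory)] [string] $Source,
        [Parameter(Mandatory)] [string] $Destination
    )
    # Resolve to a full path without a trailing separator so Substring() yields a clean relative path
    $root = (Resolve-Path -LiteralPath $Source).ProviderPath.TrimEnd('\', '/')
    Get-ChildItem -LiteralPath $root -Recurse -Directory -Force | ForEach-Object {
        # Path of this folder relative to the source root
        $relative = $_.FullName.Substring($root.Length).TrimStart('\', '/')
        $target   = Join-Path -Path $Destination -ChildPath $relative
        # -Force creates missing parent folders and does not fail if the folder already exists
        $null = New-Item -ItemType Directory -Path $target -Force
    }
}

# Example call with the paths from the first answer:
# Copy-FolderSkeleton -Source 'c:\temp\test' -Destination 'c:\temp\test2'
```

Only directories are enumerated, so no file is ever copied; running it a second time is harmless because existing folders are left as they are.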