Get PathTwo Counts In Dynamic PathOne - powershell

I need to find all the PathTwos in PathOne under the directory Path; currently I can get all the PathOnes by using:
$path = Get-ChildItem "C:\Path\" | ?{ $_.PSIsContainer }
Since PathOne is dynamic (its name may be anything), this lets me loop through all the possible paths. Now, PathOne may have 2 or more folders, like PathTwo1, PathTwo2 and PathTwo3. I need to know how many folders are in each dynamic PathOne. Originally I thought that I could loop over the PathOnes, get the name of each dynamic path and then loop inside it, counting all the PathTwos and returning every PathOne with more than one; unfortunately that doesn't return what I need.
I've tried:
A loop within a loop: creates a mess and doesn't return the correct result.
Use C:\Path\.\ to get the count of the folders within PathOne, by jumping to whatever the next folder would be.
Based on comments, example:
C:\Path\PathOne1\PathTwo1
C:\Path\PathOne1\PathTwo2
C:\Path\PathOne2\PathTwo1
C:\Path\PathOne2\PathTwo2
C:\Path\PathOne2\PathTwo3
C:\Path\PathOne3\PathTwo1 # don't want because only one PathTwo
I don't care how many PathOnes there are, but I do need every PathOne that has more than one PathTwo.

You can get the desired result with this command:
# Get all items two levels deep.
Get-Item C:\Path\*\* |
# Get only directories.
Where-Object PSIsContainer |
# Group them by parent.
Group-Object {$_.Parent.FullName} -NoElement |
# Choose groups with a Count of more than one.
Where-Object Count -gt 1 |
# Select name of parent directory
Select-Object -ExpandProperty Name
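Since the title asks for counts, a variation on the same pipeline can keep the Count column next to each parent path. This is only a sketch: the function name Get-MultiChildParent is made up for illustration, and it assumes PowerShell 3.0 or later for the -Directory switch.

```powershell
# Hypothetical helper (name invented for this example).
# Lists each first-level folder under $Root that contains more than one
# subfolder, together with the number of subfolders.
function Get-MultiChildParent {
    param([Parameter(Mandatory)][string]$Root)

    Get-ChildItem -LiteralPath $Root -Directory |          # every PathOne
        Get-ChildItem -Directory |                         # every PathTwo beneath it
        Group-Object { $_.Parent.FullName } -NoElement |   # group by parent folder
        Where-Object Count -gt 1 |                         # keep parents with 2+ subfolders
        Select-Object Name, Count
}
```

Calling Get-MultiChildParent -Root 'C:\Path' on the example tree from the question should list PathOne1 and PathOne2 with their counts and leave out PathOne3.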

Related

Move Files Based on Segments of Filename

I am trying to build a script to move old PDFs into an archive folder from their source folder.
I have organized ~15,000 PDFs into a series of folders based on their numerical name. The next challenge is that there are multiple revisions of the same file, e.g.:
27850_rev0.pdf
27850_rev1.pdf
27850_rev2.pdf
What is the best approach to keeping the highest rev number in the source folder and moving all lower revisions to an archive?
Any help is appreciated.
Thanks,
You can use an expression with Group-Object to isolate all the files that start with that root filename, i.e. 27850*. If you then sort those files you know the last one is the highest revision number:
Get-ChildItem 'C:\temp\06-11-21' -Filter *.txt |
Group-Object -Property { $_.Name.Split('_')[0] } |
ForEach-Object{
$_.Group | Sort-Object Name | Select-Object -SkipLast 1 |
Copy-Item -Destination 'C:\temp\06-11-21_backup'
}
I used a few text files in this example, but it should work just the same.
Note: Obviously you'll have to change the folders and filters...
Group-Object returns GroupInfo objects, so to get the original objects in each group I reference $_.Group.
This does depend on the naming format being static. If you have underscores elsewhere in file names we'll likely have a problem. However, we can always adjust the expression.
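One thing to watch with sorting by Name: the comparison is textual, so 27850_rev10.pdf sorts before 27850_rev2.pdf. A possible adjustment is to parse the revision number and sort on it as an integer. The sketch below assumes names of the form <base>_rev<N>.<ext>; the function name Get-OldRevision and the folder paths in the usage note are hypothetical.

```powershell
# Hypothetical helper (name invented for this example).
# Returns every input object except the highest revision per base name.
# Works on anything with a Name property; names that do not match the
# assumed "<base>_rev<N>.<ext>" pattern are ignored.
function Get-OldRevision {
    param([Parameter(Mandatory)][object[]]$File)

    $File | ForEach-Object {
        if ($_.Name -match '^(?<base>.+)_rev(?<rev>\d+)\.[^.]+$') {
            [pscustomobject]@{
                File = $_
                Base = $Matches['base']
                Rev  = [int]$Matches['rev']   # integer, so 10 sorts after 2
            }
        }
    } |
    Group-Object Base |
    ForEach-Object {
        # Sort numerically, drop the newest revision, emit the rest.
        $_.Group | Sort-Object Rev | Select-Object -SkipLast 1 |
            ForEach-Object { $_.File }
    }
}
```

Usage could look like Get-OldRevision -File (Get-ChildItem 'C:\source' -Filter *.pdf) | Move-Item -Destination 'C:\archive' -WhatIf, where -WhatIf previews the moves without performing them.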

Powershell "if more than one, then delete all but one"

Is there a way to do something like this in Powershell:
"If more than one file includes a certain set of text, delete all but one"
Example:
"...Cam1....jpg"
"...Cam2....jpg"
"...Cam2....jpg"
"...Cam3....jpg"
Then I would want one of the two "...Cam2....jpg" deleted, while the other one should stay.
I know that I can use something like
gci *Cam2* | del
but I don't know how I can make one of these files stay.
Also, for this to work, I need to look through all the files to see if there are any duplicates, which defeats the purpose of automating this process with a Powershell script.
I searched for a solution to this for a long time, but I just can't find something that is applicable to my scenario.
Get a list of files into a collection and use range operator to select a subset of its elements. To remove all but first element, start from index one. Like so,
$cams = gci "*cam2*"
if($cams.Count -gt 1) {
$cams[1..$cams.Count] | remove-item
}
Expanding on the idea of commenter boxdog:
# Find all duplicately named files.
$dupes = Get-ChildItem c:\test -file -recurse | Group-Object Name | Where-Object Count -gt 1
# Delete all duplicates except the 1st one per group.
$dupes | ForEach-Object { $_.Group | Select-Object -Skip 1 | Remove-Item -Force }
I've split this up into two sub tasks to make it easier to understand. Also it is a good idea to always separate directory iteration from file deletion, to avoid inconsistent results.
First statement uses Group-Object to group files by names. It outputs a Count property containing the number of files per group. Then Where-Object is used to get only groups that contain more than one file, which will be the dupes. The result is stored in variable $dupes, which is an array that looks like this:
Count Name Group
----- ---- -----
2 file1.txt {C:\test\subdir1\file1.txt, C:\test\subdir2\file1.txt}
2 file2.txt {C:\test\subdir1\file2.txt, C:\test\subdir2\file2.txt}
The second statement uses ForEach-Object to iterate over all groups of duplicates. From the Group-Object call in the first statement we got a Group property that contains an array of FileInfo objects. Using Select-Object -Skip 1 we select all but the first element of this array, and those elements are passed to Remove-Item to delete the files.

PowerShell: Find similar filenames in a directory

In a purely hypothetical situation of a person that downloaded some TV episodes, but is wondering if he/she accidentally downloaded an HDTV, a WEBRip and a WEB-DL version of an episode, how could PowerShell find these 'duplicates' so the lower quality versions can be automagically deleted?
First, I'd get all the files in the directory:
$Files = Get-ChildItem -Path $Directory -Exclude '*.nfo','*.srt','*.idx','*.sub' |
Sort-Object -Property Name
I exclude the non-video extensions for now, since they would cause false positives. I would still have to deal with them though (during the delete phase).
At this point, I would likely use a ForEach construct to parse through the files one by one and look for files that have the same episode number. If there are any, they should be looked at.
Assuming a common spaces equals dots notation here, a typical filename would be AwesomeSeries.S01E01.HDTV.x264-RLSGRP
To compare, I need to get only the episode number. In the above case, that means S01E01:
If ($File.BaseName -match 'S*(\d{1,2})(x|E)(\d{1,2})') { $EpisodeNumber = $Matches[0] }
In the case of S01E01E02 I would simply add a second if-statement, so I'm not concerned with that for now.
$EpisodeNumber should now contain S01E01. I can use that to discover if there are any other files with that episode number in $Files. I can do that with:
$Files -match $EpisodeNumber
This is where my trouble starts. The above will also return the file I'm processing. I could at this point handle the duplicates immediately, but then I would have to do the Get-ChildItem again because otherwise the same match would be returned when the ForEach construct gets to the duplicate file which would then result in an error.
I could store the files I wish to delete in an array and process them after the ForEach construct is over, but then I'd still have to filter out all the duplicates. After all, in the ForEach loop,
AwesomeSeries.S01E01.HDTV.x264-RLSGRP
would first match
AwesomeSeries.S01E01.WEB-DL.x264.x264-RLSGRP, only for
AwesomeSeries.S01E01.WEB-DL.x264.x264-RLSGRP
to match
AwesomeSeries.S01E01.HDTV.x264-RLSGRP afterwards.
So maybe I should process every episode number only once, but how?
I get the feeling I'm being very inefficient here and there must be a better way to do this, so I'm asking for help. Can anyone point me in the right direction?
Filter the $Files array to exclude the current file when matching:
($Files | Where-Object {$_.FullName -ne $File.FullName}) -match $EpisodeNumber
Regarding the duplicates in the array at the end, you can use Select-Object -Unique to only get distinct entries.
Since you know how to get the episode number let's use that to group the files together.
$Files = Get-ChildItem -Path $Directory -Exclude '*.nfo','*.srt','*.idx','*.sub' | Select-Object FullName, @{Name="EpisodeIndex";Expression={
# We do not have to do it like this, but if your detection logic gets more complicated then having
# this Select-Object block will be a cleaner option than a one-line calculated property
If ($_.BaseName -match 'S*(\d{1,2})(x|E)(\d{1,2})'){$Matches[0]}
}}
# Group the files by season episode index (that have one). Return groups that have more than one member as those would need attention.
$Files | Where-Object{$_.EpisodeIndex } | Group-Object -Property EpisodeIndex |
Where-Object{$_.Count -gt 1} | ForEach-Object{
# Expand the group members
$_.Group
# Not sure how you plan on dealing with it.
}
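As one way of dealing with each group, the release tags could be ranked and everything but the best-ranked name per episode reported for deletion. This is only a sketch under stated assumptions: the quality order HDTV < WEBRip < WEB-DL is a guess at what the question intends, the function name Get-LowerQualityDuplicate is invented, and it works on name strings from a single series, since it groups on the episode number alone.

```powershell
# Hypothetical helper (name invented for this example).
# Given file names, returns the names that have a better-ranked sibling
# with the same SxxEyy episode number.
function Get-LowerQualityDuplicate {
    param(
        [Parameter(Mandatory)][string[]]$Name,
        # Assumed quality order, lowest to highest; adjust as needed.
        [string[]]$QualityRank = @('HDTV', 'WEBRip', 'WEB-DL')
    )

    $Name | ForEach-Object {
        if ($_ -match 'S\d{1,2}E\d{1,2}') {
            $episode = $Matches[0]
            $current = $_
            # Rank is the index of the quality tag found in the name, or -1 if none.
            $rank = -1
            for ($i = 0; $i -lt $QualityRank.Count; $i++) {
                if ($current -like "*.$($QualityRank[$i]).*") { $rank = $i }
            }
            [pscustomobject]@{ Name = $current; Episode = $episode; Rank = $rank }
        }
    } |
    Group-Object Episode |
    Where-Object Count -gt 1 |
    ForEach-Object {
        # Keep the highest-ranked entry, emit the others.
        $_.Group | Sort-Object Rank | Select-Object -SkipLast 1 |
            ForEach-Object { $_.Name }
    }
}
```

Because each episode number is processed once as a group, the "A matches B, then B matches A" problem from the question does not arise.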

PowerShell find most recent file

I'm new to powershell and scripting in general. Doing lots of reading and testing and this is my first post.
Here is what I am trying to do. I have a folder that contains sub-folders for each report that runs daily. A new sub-folder is created each day.
The file names in the sub-folders are the same with only the date changing.
I want to get a specific file from yesterday's folder.
Here is what I have so far:
Get-ChildItem -filter "MBVOutputQueriesReport_C12_Custom.html" -recurse -path D:\BHM\Receive\ | where(get-date).AddDays(-1)
Both parts (before and after pipe) work. But when I combine them it fails.
What am I doing wrong?
0,1,2,3,4,5 | Where { $_ -gt 3 }
this will compare the incoming number from the pipeline ($_) with 3 and allow things that are greater than 3 to get past it - whenever the $_ -gt 3 test evaluates to $True.
0,1,2,3,4,5 | where { $_ }
this has nothing to compare against - in this case, it casts the value to boolean - 'truthy' or 'falsey' and will allow everything 'truthy' to get through. 0 is dropped, the rest are allowed.
Get-ChildItem | where Name -eq 'test.txt'
without the {} is a syntax where it expects Name is a property of the thing coming through the pipeline (in this case file names) and compares those against 'test.txt' and only allows file objects with that name to go through.
Get-ChildItem | where Length
In this case, the property it's looking for is Length (the file size) and there is no comparison given, so it's back to doing the "casting to true/false" thing from earlier. This will only show files with some content (non-0 length), and will drop 0 size files, for example.
ok, that brings me to your code:
Get-ChildItem | where(get-date).AddDays(-1)
With no {} and only one thing given to Where, it's expecting the parameter to be a property name, and is casting the value of that property to true/false to decide what to do. This is saying "filter where the things in the pipeline have a property named "09/08/2016 14:12:06" (yesterday's date with the current time) and the value of that property is 'truthy'". No files have a property called (yesterday's date), so that question reads $null for every file, and Where drops everything from the pipeline.
You can do as Jimbo answers, and filter comparing the file's write time against yesterday's date. But if you know the files and folders are named in date order, you can save -recursing through the entire folder tree and looking at everything, because you know what yesterday's file will be called.
You didn't say how the folders are named, but you could use an approach like
$yesterday = (Get-Date).AddDays(-1).ToString('MM-dd-yyyy')
Get-ChildItem "D:\BHM\Receive\$yesterday\MBVOutputQueriesReport_C12_Custom.html"
# (or whatever date pattern gets you directly to that file)
or
Get-ChildItem | sort -Property CreationTime -Descending | Select -Skip 1 -First 1
to get the 'last but one' thing, ordered by reverse created date.
Read output from get-date | Get-Member -MemberType Property and then apply Where-Object docs:
Get-ChildItem -filter "MBVOutputQueriesReport_C12_Custom.html" -recurse -path D:\BHM\Receive\ |
Where-Object {$_.LastWriteTime.Date -eq (get-date).AddDays(-1).Date}
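Try this (note that it compares only the day of the month, so it also matches files written on the same day of earlier months):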
where {$_.lastwritetime.Day -eq ((get-date).AddDays(-1)).Day}
You could pipe the results to the Sort command, and pipe that to Select to just get the first result.
Get-ChildItem -filter "MBVOutputQueriesReport_C12_Custom.html" -recurse -path D:\BHM\Receive\ | Sort LastWriteTime -Descending | Select -First 1
You can do something like this:
$time = (get-date).AddDays(-1).Day
Get-ChildItem -Filter "MBVOutputQueriesReport_C12_Custom.html" -Recurse -Path D:\BHM\Receive\ | Where-Object { $_.LastWriteTime.Day -eq $time }

How to get Select-Object to return a raw type (e.g. String) rather than PSCustomObject?

The following code gives me an array of PSCustomObjects, how can I get it to return an array of Strings?
$files = Get-ChildItem $directory -Recurse | Select-Object FullName | Where-Object {!($_.psiscontainer)}
(As a secondary question, what's the psiscontainer part for? I copied that from an example online)
Post-Accept Edit: Two great answers, wish I could mark both of them. Have awarded the original answer.
You just need to pick out the property you want from the objects. FullName in this case.
$files = Get-ChildItem $directory -Recurse | Select-Object FullName | Where-Object {!($_.psiscontainer)} | foreach {$_.FullName}
Edit: Explanation for Mark, who asks, "What does the foreach do? What is that enumerating over?"
Sung Meister's explanation is very good, but I'll add a walkthrough here because it could be helpful.
The key concept is the pipeline. Picture a series of pingpong balls rolling down a narrow tube one after the other. These are the objects in the pipeline. Each stage of pipeline--the code segments separated by pipe (|) characters--has a pipe going into it and pipe going out of it. The output of one stage is connected to the input of the next stage. Each stage takes the objects as they arrive, does things to them, and sends them back out into the output pipeline or sends out new, replacement objects.
Get-ChildItem $directory -Recurse
Get-ChildItem walks through the filesystem creating FileSystemInfo objects that represent each file and directory it encounters, and puts them into the pipeline.
Select-Object FullName
Select-Object takes each FileSystemInfo object as it arrives, grabs the FullName property from it (which is a path in this case), puts that property into a brand new custom object it has created, and puts that custom object out into the pipeline.
Where-Object {!($_.psiscontainer)}
This is a filter. It takes each object, examines it, and sends it back out or discards it depending on some condition. Your code here has a bug, by the way. The custom objects that arrive here don't have a psiscontainer property. This stage doesn't actually do anything. Sung Meister's code is better.
foreach {$_.FullName}
Foreach, whose long name is ForEach-Object, grabs each object as it arrives, and here, grabs the FullName property, a string, from it. Now, here is the subtle part: Any value that isn't consumed, that is, isn't captured by a variable or suppressed in some way, is put into the output pipeline. As an experiment, try replacing that stage with this:
foreach {'hello'; $_.FullName; 1; 2; 3}
Actually try it out and examine the output. There are five values in that code block. None of them are consumed. Notice that they all appear in the output. Now try this:
foreach {'hello'; $_.FullName; $x = 1; 2; 3}
Notice that one of the values is being captured by a variable. It doesn't appear in the output pipeline.
To get the string for the file name you can use
$files = Get-ChildItem $directory -Recurse | Where-Object {!($_.psiscontainer)} | Select-Object -ExpandProperty FullName
The -ExpandProperty parameter allows you to get back an object based on the type of the property specified.
Further testing shows that this did not work with V1, but that functionality is fixed as of the V2 CTP3.
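The difference between the two forms can be checked directly by looking at the types that come out of each. This is a small sketch using a stand-in object with a FullName property rather than real files; the sample path is made up.

```powershell
# Stand-in for a file object; only the FullName property matters here.
$sample = [pscustomobject]@{ FullName = 'C:\temp\a.txt'; Length = 10 }

# Plain Select-Object wraps the property in a new custom object.
$wrapped = $sample | Select-Object FullName

# -ExpandProperty emits the property value itself.
$expanded = $sample | Select-Object -ExpandProperty FullName

$wrapped.GetType().Name            # PSCustomObject
$expanded.GetType().Name           # String
$wrapped.FullName -eq $expanded    # True
```

So with the plain form the string is still there, but one property access away.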
For Question #1
I have removed "select-object" portion - it's redundant and moved "where" filter before "foreach" unlike dangph's answer - Filter as soon as possible so that you are dealing with only a subset of what you have to deal with in the next pipe line.
$files = Get-ChildItem $directory -Recurse | Where-Object {!$_.PsIsContainer} | foreach {$_.FullName}
That code snippet essentially reads
Get all files and directories recursively (Get-ChildItem $directory -Recurse)
Filter out directories (Where-Object {!$_.PsIsContainer})
Return the full path of each file only (foreach {$_.FullName})
Save all the full paths into $files
Note that for foreach {$_.FullName}, in PowerShell any value in a script block ({...}) that is not captured or suppressed is emitted to the pipeline, in this case $_.FullName, which is of type string
If you really need to get a raw object, you don't need to do anything after getting rid of "select-object". If you were to use Select-Object but want to access raw object, use "PsBase", which is a totally different question(topic) - Refer to "What's up with PSBASE, PSEXTENDED, PSADAPTED, and PSOBJECT?" for more information on that subject
For Question #2
And also filtering by !$_.PsIsContainer means that you are excluding a container level objects - In your case, you are doing Get-ChildItem on a FileSystem provider(you can see PowerShell providers through Get-PsProvider), so the container is a DirectoryInfo(folder)
PsIsContainer means different things under different PowerShell providers;
e.g. for the Registry provider, the container is of type Microsoft.Win32.RegistryKey
Try this:
>pushd HKLM:\SOFTWARE
>ls | gm
[UPDATE] to following question: What does the foreach do? What is that enumerating over?
To clarify, "foreach" is an alias for "Foreach-Object"
You can find out through,
get-help foreach
-- or --
get-alias foreach
Now in my answer, "foreach" is enumerating each object instance of type FileInfo returned from previous pipe (which has filtered directories). FileInfo has a property called FullName and that is what "foreach" is enumerating over.
And you reference object passed through pipeline through a special pipeline variable called "$_" which is of type FileInfo within the script block context of "foreach".
For V1, add the following filter to your profile:
filter Get-PropertyValue([string]$name) { $_.$name }
Then you can do this:
gci . -r | ?{!$_.psiscontainer} | Get-PropertyValue fullname
BTW, if you are using the PowerShell Community Extensions you already have this.
Regarding the ability to use Select-Object -Expand in V2, it is a cute trick but not obvious and really isn't what Select-Object nor -Expand was meant for. -Expand is all about flattening like LINQ's SelectMany and Select-Object is about projection of multiple properties onto a custom object.