powershell filter to remove .pdf extension in the name of a file - powershell

I am trying to use PowerShell to get all child items in a folder. The code I am using is:
Get-ChildItem -Recurse -path C:\clntfiles
This code gives output like:
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 4/29/2015 9:11 AM 6919044 HD 100616 Dec2014.pdf
-a--- 5/1/2015 11:42 AM 7091019 HD 101642 Jan2015.pdf
I don't want the Mode, LastWriteTime, and Length columns; I want only the name of the file, without the .pdf extension.
The output should look like:
Dec2014
Jan2015
I am not sure how to filter that. Please advise.

I'll start by posting something similar to Leptonator's answer, but simplified by using the Select-Object command (alias Select used in code because it's habit, and I'm lazy).
$files = Get-ChildItem -Recurse -path C:\clntfiles | Select -ExpandProperty BaseName
Now that gets you the file names without extension. But, you actually asked for only part of the file names, as the first file name is "HD 100616 Dec2014.pdf" and you specified that you actually only want "Dec2014" to be returned. We can do that a couple different ways, but my favorite of them would be a RegEx match (because RegEx is awesome, and I think the LastIndexOf/SubString combo is overly complicated imho).
So, a RegEx match of "\w+$" will get what you want. That is broken down like this:
\w means any letter, digit, or underscore
+ means 1 or more of them
$ means the end of the string/line
So that's 1 or more word characters at the end of the string. We pipe our array of file names into a ForEach-Object loop (alias ForEach used out of habit), and then we have:
$Files | ForEach{ [RegEx]::Matches($_,"\w+$")}
Now, this outputs a [System.Text.RegularExpressions.Match] object, which is more than you want, but it does have a property Value which is exactly what you asked for! So we use Select -Expand again for that property and the output is precisely what you asked for:
$files = Get-ChildItem -Recurse -path C:\clntfiles | Select -ExpandProperty BaseName
$files | ForEach{[regex]::Matches($_,"\w+$")} | Select -Expand Value
RegEx matches are really handy, and if you learn about them you can simplify that quite a bit more like this:
gci C:\clntfiles -Rec | ?{$_.BaseName -match "(\w+)$"} | %{$Matches[1]}
That one-liner, as well as the two-line version above it, should output:
Dec2014
Jan2015
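To try the regex without touching the file system, here is a minimal sketch. It assumes the two base names from the question as hard-coded sample strings (the variable names $baseNames and $lastWords are made up for illustration):

```powershell
# Sample base names copied from the question; no files are read here.
$baseNames = 'HD 100616 Dec2014', 'HD 101642 Jan2015'

# -match fills the automatic $Matches table; group 1 holds the trailing word.
$lastWords = foreach ($name in $baseNames) {
    if ($name -match '(\w+)$') { $Matches[1] }
}

$lastWords
```

Names that do not end in a word character are simply skipped, because the if only emits a value when the match succeeds.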

Something like this should do it for you..
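$files = Get-ChildItem -Recurse -path C:\clntfiles
# $null goes on the left: with an array on the left, -ne filters elements instead of testing for null
if ($null -ne $files)
{
    foreach ($file in $files)
    {
        $file.BaseName
    }
}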
In my folder, it shows:
> 2014-03-28_exeresult_file
> 2014-03-30_exeresult_file
> 2014-03-31_exeresult_file
> 2014-04-02_exeresult_file
> 2014-04-03_exeresult_file
> 2014-04-04_exeresult_file
> 2014-04-06_exeresult_file
> 2014-04-08_exeresult_file
and they are indeed .txt files
Hope this helps!

Use the following: Get-ChildItem -Recurse -Name -Path C:\clntfiles. This will get you only the file names.
Working solution:
$names = Get-ChildItem -name
foreach($n in $names) {$n.Substring(0,$n.IndexOf("."))}
You can also use LastIndexOf if the file name itself contains a period, so that only the final extension is cut off.
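Note that Substring with IndexOf throws for a name that has no period at all, because IndexOf returns -1. As a hedged alternative, the .NET method [System.IO.Path]::GetFileNameWithoutExtension handles both cases. The sample names below are assumptions for illustration only:

```powershell
# Hypothetical sample names: one normal, one with two periods, one with none.
$names = 'HD 100616 Dec2014.pdf', 'archive.tar.gz', 'README'

# GetFileNameWithoutExtension strips only the last extension and
# returns the name unchanged when there is no extension.
$baseNames = foreach ($n in $names) {
    [System.IO.Path]::GetFileNameWithoutExtension($n)
}

$baseNames
```

This gives the same result as the BaseName property of a file object, but works on plain strings such as those returned by Get-ChildItem -Name.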

Related

Is there a way to display the latest file of multiple paths with information in a table format?

I check every day whether a CSV file has been exported to a specific folder (path). At the moment there are 14 different paths with 14 different files to check. The files are stored in the folder and are not deleted, so I have to distinguish between a lot of files using LastWriteTime. I would like code that displays the results in table format. I would be happy with something like this:
Name LastWriteTime Length
ExportCSV1 21.09.2022 00:50 185
ExportCSV2 21.09.2022 00:51 155
My code looks like this:
$Paths = @('Path1', 'Path2', 'Path3', 'Path4', 'Path5', 'Path6', 'Path7', 'Path8', 'Path9', 'Path10', 'Path11', 'Path12', 'Path13', 'Path13')
foreach ($Path in $Paths){
Get-ChildItem $path | Where-Object {$_.LastWriteTime}|
select -last 1
Write-host $Path
}
pause
This way I want to make sure that the files are being sent each day.
I get the results that I want, but it is not easy to look at the results individually.
I am new to PowerShell and would very much appreciate your help. Thank you in advance.
Continuing from my comments, here is how you could do this:
$Paths = @('Path1', 'Path2', 'Path3', 'Path4', 'Path5', 'Path6', 'Path7', 'Path8', 'Path9', 'Path10', 'Path11', 'Path12', 'Path13', 'Path13')
$Paths | ForEach-Object {
Get-ChildItem $_ | Where-Object {$_.LastWriteTime} | Select-Object -Last 1
} | Format-Table -Property Name, LastWriteTime, Length
If you want to keep using foreach() instead, you have to wrap it in a scriptblock {…} to be able to chain everything to Format-Table:
. {
foreach ($Path in $Paths){
Get-ChildItem $path | Where-Object {$_.LastWriteTime} | Select-Object -Last 1
}
} | Format-Table -Property Name, LastWriteTime, Length
Here the . operator is used to run the scriptblock immediately, without creating a new scope. If you want to create a new scope (e.g. to define temporary variables that exist only within the scriptblock), you could use the call operator & instead.
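One caveat: Where-Object {$_.LastWriteTime} only keeps items whose LastWriteTime is non-empty; it does not sort. Select-Object -Last 1 therefore returns the last item in the default (name) order, which is not necessarily the newest file. A minimal sketch of sorting first, assuming a throwaway demo folder in the temp directory (the folder and file names are made up for illustration):

```powershell
# Hypothetical demo folder so the sketch is self-contained.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ("latest-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $demo
'old' | Set-Content (Join-Path $demo 'b_old.csv')
'new' | Set-Content (Join-Path $demo 'a_new.csv')
# Backdate one file so the newest file is NOT the last one by name.
(Get-Item (Join-Path $demo 'b_old.csv')).LastWriteTime = (Get-Date).AddDays(-1)

$Paths = @($demo)
$latest = $Paths | ForEach-Object {
    # Sort by write time, then take the last (newest) item per path.
    Get-ChildItem -Path $_ -File | Sort-Object LastWriteTime | Select-Object -Last 1
}
$latest | Format-Table -Property Name, LastWriteTime, Length

Remove-Item $demo -Recurse
```

Without the Sort-Object step, this demo would report b_old.csv even though a_new.csv was written more recently.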

create file index manually using powershell, tab delimited

Sorry in advance for the probably trivial question, I'm a powershell noob, please bear with me and give me advice on how to get better.
I want to achieve a file index index.txt that contains the list of all files in current dir and subdirs in this format:
./dir1/file1.txt 07.05.2020 16:16 1959281
where
dirs listed are relative (i.e. this will be run remotely and to save space, the relative path is good enough)
the delimiter is a tab \t
the date format is day.month.fullyear hours:minutes:seconds, last written (this is the case for me, but I'm guessing this would be different on system setting and should be enforced)
(the last number is the size in bytes)
I almost get there using this command in powershell (maybe that's useful to someone else as well):
get-childitem . -recurse | select fullname,LastWriteTime,Length | Out-File index.txt
with this result
FullName LastWriteTime Length
-------- ------------- ------
C:\Users\user1\Downloads\test\asdf.txt 07.05.2020 16:19:29 1490
C:\Users\user1\Downloads\test\dirtree.txt 07.05.2020 16:08:44 0
C:\Users\user1\Downloads\test\index.txt 07.05.2020 16:29:01 0
C:\Users\user1\Downloads\test\test.txt 07.05.2020 16:01:23 814
C:\Users\user1\Downloads\test\text2.txt 07.05.2020 15:55:45 1346
So the questions that remain are: How to...
get rid of the headers?
enforce this date format?
tab delimit everything?
get control of what newline character is used (\n or \r or both)?
Another approach could be this:
$StartDirectory = Get-Location
Get-ChildItem -Path $StartDirectory -recurse |
    Select-Object -Property @{Name='RelPath';Expression={$_.FullName.toString() -replace [REGEX]::Escape($StartDirectory.ToString()),'.'}},
        @{Name='LastWriteTime';Expression={$_.LastWriteTime.toString('dd.MM.yyyy HH:mm:ss')}},
        Length |
    Export-Csv -Path Result.csv -NoTypeInformation -Delimiter "`t"
I recommend using proper CSV files if you have structured data like this. The resulting CSV file will be saved in the current working directory.
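If the header row has to go while keeping the CSV approach, one possible sketch is to use ConvertTo-Csv and skip the first line before writing the file. The sample rows below are assumptions standing in for the Select-Object output above; note that ConvertTo-Csv puts fields in double quotes by default:

```powershell
# Hypothetical rows standing in for the output of the Select-Object step.
$rows = @(
    [pscustomobject]@{ RelPath = '.\asdf.txt'; LastWriteTime = '07.05.2020 16:19:29'; Length = 1490 }
    [pscustomobject]@{ RelPath = '.\test.txt'; LastWriteTime = '07.05.2020 16:01:23'; Length = 814 }
)

# ConvertTo-Csv emits the header as its first line; -Skip 1 drops it.
$lines = $rows |
    ConvertTo-Csv -NoTypeInformation -Delimiter "`t" |
    Select-Object -Skip 1

$lines
```

The resulting strings can then be piped to Out-File or Set-Content instead of using Export-Csv.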
If the path you are running this from is NOT the current script path, do:
$path = 'D:\Downloads' # 'X:\SomeFolder\SomeWhere'
Set-Location $path
first.
Next, this ought to do it:
Get-ChildItem . -Recurse -File | ForEach-Object {
"{0}`t{1:dd.MM.yyyy HH:mm}`t{2}" -f ($_ | Resolve-Path -Relative), $_.LastWriteTime, $_.Length
} | Out-File 'index.txt'
On Windows the newline will be \r\n (CRLF)
If you want control over that, this should do:
$newline = "`n" # for example
# capture the lines as string array in variable $lines
$lines = Get-ChildItem . -Recurse -File | ForEach-Object {
"{0}`t{1:dd.MM.yyyy HH:mm}`t{2}" -f ($_ | Resolve-Path -Relative), $_.LastWriteTime, $_.Length
}
# join the array with the chosen newline and save to file
$lines -join $newline | Out-File 'index.txt' -NoNewline
Because your requirement is to NOT have column headers in the output file, I'm using Out-File here instead of Export-Csv

Powershell - How to create array of filenames based on filename?

I'm looking to create an array of files (pdf's specifically) based on filenames in Powershell. All files are in the same directory. I've spent a couple of days looking and can't find anything that has examples of this or something that is close but could be changed. Here is my example of file names:
AR - HELLO.pdf
AF - HELLO.pdf
RT - HELLO.pdf
MH - HELLO.pdf
AR - WORLD.pdf
AF - WORLD.pdf
RT - WORLD.pdf
HT - WORLD.pdf
....
I would like to combine all files ending in 'HELLO' into an array and 'WORLD' into another array and so on.
I'm stuck pretty early on in the process as I'm brand new to creating scripts, but here is my sad start:
Get-ChildItem *.pdf |
Where BaseName -match '(.*) - (\w+)'
Updated Info...
I do not know the name after the " - " so using regex is working.
My ultimate goal is to combine PDF's based on the matching text after the " - " in the filename and the most basic code for this is:
$file1 = "1 - HELLO.pdf"
$file2 = "2 - HELLO.PDF"
$mergedfile = "HELLO.PDF"
Merge-PDF -InputFile $file1, $file2 -OutputFile $mergedfile
I have also gotten the Merge-PDF to work using this code which merges all PDF's in the directory:
$Files = Get-ChildItem *.pdf
$mergedfiles = "merged.pdf"
Merge-PDF -InputFile $Files -OutputFile $mergedfiles
Using this code from @Mathias, the $suffix portion of -OutputFile works, but the -InputFile portion returns the error "Exception calling "Close" with "0" argument(s)":
$groups = Get-ChildItem *.pdf | Group-Object {$_.BaseName -replace '^.*\b(\w+)$','$1'} -AsHashTable
foreach($suffix in $groups.Keys) {Merge-PDF -InputFile $(@($groups[$suffix])) -OutputFile "$suffix.pdf"}
For the -InputFile I've tried a lot of different varieties and I keep getting the "0" arguments error. The values in the Hashtable seem to be correct so I'm not sure why this isn't working.
Thanks
This should do the trick:
$HELLO = Get-ChildItem *HELLO.pdf |Select -Expand Name
$WORLD = Get-ChildItem *WORLD.pdf |Select -Expand Name
If you want to group file names by the last word in the base name and you don't know them up front, regex is indeed an option:
$groups = Get-ChildItem *.pdf |Group-Object {$_.BaseName -replace '^.*\b(\w+)$','$1'} -AsHashTable
And then you can do:
$groups['HELLO'].Name
for all the file names ending with the word HELLO, or, to iterate over all of them:
foreach($suffixGroup in $groups.GetEnumerator()){
Write-Host "There are $($suffixGroup.Value.Count) files ending in $($suffixGroup.Key)"
}
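The grouping can be tried on plain strings first. This is a minimal sketch, assuming hard-coded sample names from the question instead of real files. It also adds the -AsString switch of Group-Object, which makes the hashtable keys plain strings so that lookups such as $groups['HELLO'] behave predictably:

```powershell
# Sample file names from the question; no files are read here.
$names = 'AR - HELLO.pdf', 'AF - HELLO.pdf', 'AR - WORLD.pdf', 'HT - WORLD.pdf'

# Group by the last word of the base name; -AsString yields string keys.
$groups = $names | Group-Object {
    [System.IO.Path]::GetFileNameWithoutExtension($_) -replace '^.*\b(\w+)$', '$1'
} -AsHashTable -AsString

# Build one array of input names per suffix.
foreach ($suffix in $groups.Keys) {
    $inputFiles = @($groups[$suffix])
    "$suffix.pdf <- $($inputFiles -join ', ')"
}
```

With real files, the same pattern applies to the objects returned by Get-ChildItem, using $_.BaseName as in the answer above.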
Another option is to get all items with Get-ChildItem and use Where-Object to filter.
$fileNames = Get-ChildItem | Select-Object -ExpandProperty FullName
#then filter (EndsWith is case-sensitive, so the casing must match the actual file names)
$fileNames | Where-Object {$_.EndsWith("HELLO.pdf")}
#or use the aliases if you want to do less typing:
$fileNames = gci | select -exp FullName
$fileNames | ? {$_.EndsWith("HELLO.pdf")}
Just wanted to show more options, especially the Where-Object cmdlet, which comes in useful when you're calling cmdlets that don't have parameters to filter.
Side note:
You may be asking what -ExpandProperty does.
If you just call gci | select FullName (without -ExpandProperty), you will get back an array of PSCustomObjects (each of them with one property called FullName). With -ExpandProperty you get the plain string values instead.
This can be confusing for people who don't really see that the objects are typed as it is not visible just by looking at the PowerShell script.
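The difference is easy to see with a small sketch. The two sample objects below are assumptions standing in for what Get-ChildItem would return:

```powershell
# Hypothetical stand-ins for file objects, each with a FullName property.
$items = @(
    [pscustomobject]@{ FullName = 'C:\docs\AR - HELLO.pdf' }
    [pscustomobject]@{ FullName = 'C:\docs\AR - WORLD.pdf' }
)

# Without -ExpandProperty: objects that still wrap the value in a property.
$wrapped = $items | Select-Object FullName

# With -ExpandProperty: just the string values.
$plain = $items | Select-Object -ExpandProperty FullName

$wrapped[0] -is [string]   # False
$plain[0] -is [string]     # True
```

String methods such as EndsWith only work directly on the $plain values; with $wrapped you would have to go through $_.FullName first.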

Edit an object non-destructively in PowerShell

I am trying to format the resulting object without destroying it, but all my efforts and research have failed me. Any tips are welcome.
My code looks like this:
Set-Location 'C:\Temp'
$Files = Get-ChildItem -File | Select-Object FullName, Length
And what I get, is this:
FullName Length
-------- ------
C:\Temp\CleanupScript.txt 10600
C:\Temp\Columns.csv 4214
C:\Temp\Content.html 271034
C:\Temp\Content.txt 271034
C:\Temp\DirSizes.csv 78
What I want is this:
FullName Length
-------- ------
Temp\CleanupScript.txt 10600
Temp\Columns.csv 4214
Temp\Content.html 271034
Temp\Content.txt 271034
Temp\DirSizes.csv 78
When I tried this:
$Files = Get-ChildItem -File | Select-Object FullName, Length | % { $_.FullName.Remove(0, 3) }
I got the right result, but I lost the Length column.
PS C:\Temp> $Files
Temp\CleanupScript.txt
Temp\Columns.csv
Temp\Content.html
Temp\Content.txt
Temp\DirSizes.csv
Please help.
Big thanks
Patrik
The easiest way to do this is to construct the property you want in the Select command, such as:
$Files = Get-ChildItem -File | Select @{l='FullName';e={$_.FullName.Substring(3)}},Length
The format for this is a hashtable with two entries. The keys are label (or name) and expression. You can shorten them to l (or n) and e. The label entry defines the name of the property you are constructing, and the expression defines the value.
If you want to retain all of the original methods and properties of the objects you should add a property to them rather than using calculated properties. You can do that with Add-Member as such:
$Files = GCI -File | %{Add-Member -inputobject $_ -notepropertyname 'ShortPath' -notepropertyvalue $_.FullName.Substring(3) -PassThru}
Then you can use that property by name like $Files | FT ShortPath,Length -Auto, while still retaining the ability to use the file's methods like CopyTo() and whatnot.
I would recommend using a calculated property and Split-Path -NoQualifier; e.g.:
Get-ChildItem -File | Select-Object `
    @{Name = "NameNoQualifier"; Expression = {Split-Path $_.FullName -NoQualifier}},
    Length
For help on calculated properties, see the help for Select-Object.
(Aside: To correct your terminology a bit, this is not modifying objects non-destructively but rather outputting new objects containing the properties you want formatted how you want them.)
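For a quick check of how a calculated property keeps the other columns, here is a minimal sketch on in-memory sample objects (assumed stand-ins for the Get-ChildItem output). Instead of a fixed Substring(3), it strips a leading drive letter with a regex, which is an illustrative variation and not taken from the answers above:

```powershell
# Hypothetical stand-ins for file objects with FullName and Length.
$files = @(
    [pscustomobject]@{ FullName = 'C:\Temp\Columns.csv'; Length = 4214 }
    [pscustomobject]@{ FullName = 'C:\Temp\DirSizes.csv'; Length = 78 }
)

# Calculated property: same property name, drive prefix removed; Length passes through.
$result = $files | Select-Object @{
    Name       = 'FullName'
    Expression = { $_.FullName -replace '^[A-Za-z]:\\', '' }
}, Length

$result | Format-Table -AutoSize
```

Each output object has exactly two properties, FullName and Length, so the Length column is no longer lost.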

Powershell - Strange output when using Get-ChildItem to search within files

I have a problem I am hoping someone could help with....
I have a powershell script containing the lines shown below:
$output = Get-ChildItem -path $target -recurse | Select-String -pattern hello | group path | select name
Write-Output "Output from the string match is $output"
The error I am getting:
Output from the string match Microsoft.Powershell.Commands.GroupInfo Microsoft.Powershell.Commands.GroupInfo
When I run this command on its own (i.e. not within a script) it works perfectly and returns the two files in that location that contain the word "hello".
It appears that it knows there are two things it has found because it prints the "Microsoft.Powershell.Commands.GroupInfo" text twice (as shown above in the error). But why is it printing this and not the paths to the files, as it should?
There must be something obvious I am overlooking, but I don't know what.
Your help is much appreciated, thanks
You're seeing this because $output is an array of Selected.Microsoft.PowerShell.Commands.GroupInfo objects -- the objects returned by Group-Object when passed to Select-Object (without Select-Object they would just be Microsoft.PowerShell.Commands.GroupInfo objects instead). You can confirm the type of objects in $output by running:
$output | Get-Member
Check the TypeName that is displayed at the top of the output.
When you run these commands interactively in the console, you are seeing the paths because PowerShell knows how to display GroupInfo objects in the console so that they are human-readable. Note that when you just call $output in the console, you see a "Name" header underlined with dash characters -- this is PowerShell interpreting the GroupInfo object you gave it and displaying the Name property for you in the console.
The problem occurs when you try to output the $output array inside a string. Then PowerShell is not able to use its more advanced formatting logic and instead merely tries to convert each object to a string to insert into your string. When it does that, it doesn't have enough logic to know that what you really want to appear in your string is the Name property of these GroupInfo objects, so instead it just prints out a string with the type name of each of the objects in the $output array. That's why you see the type name twice.
The simple solution to this problem is the -ExpandProperty parameter for Select-Object. This does what it says -- it expands the property you asked for with Select-Object and returns just that property, not the parent object. So the Name property of a GroupInfo object is a string. If you call Select-Object Name, you get a GroupInfo object with the Name property. If you call Select-Object -ExpandProperty Name, you get just the Name property as a String object. Which is what I expect that you want in this case.
So try this instead:
$output = Get-ChildItem -path $target -recurse | Select-String -pattern hello | group path | select -ExpandProperty name
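Once $output holds plain strings, it can be embedded in the message. A minimal sketch, assuming two made-up paths in place of the real search results; the -join inside a subexpression controls the separator explicitly instead of relying on the default single space:

```powershell
# Hypothetical paths standing in for the expanded Name values.
$output = 'C:\logs\a.txt', 'C:\logs\b.txt'

# $( ... ) evaluates the expression inside the string; -join picks the separator.
$message = "Output from the string match is: $($output -join ', ')"

Write-Output $message
```

Using "`n" as the separator instead of ', ' would put each path on its own line.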
A foreach would be appropriate here I believe. Try this:
$output = Get-ChildItem -path $target -recurse | where {$_.name -like "*hello*"} | select name
foreach ($file in $output) {
write-host $file.name
}
Or this:
# Select-String returns MatchInfo objects, which expose Path and Filename rather than Name
$output = Get-ChildItem -path $target -recurse | select-string -pattern "hello" | select -Unique Path
foreach ($file in $output) {
    write-output $file.Path
}