Cataloging file names over 256 characters - PowerShell

I am brand new to scripting (or coding of any sort). I had an issue where I wanted to generate CSV files to catalog directories and certain file names to aid in my work. I was able to put something together that works for what I need, with one exception: long names return the following error:
ERROR: The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.
Here is my script:
Write-Host "Andy's File Lister v2.2"
$drive = Read-Host "R or Q?"
$client = Read-Host "What is the client's name as it appears on the R or Q drive?"
$path= "${drive}:\${client}"
Get-ChildItem $path -Recurse -dir | Select-Object FullName | Export-CSV $home\downloads\"$client directories.csv"
Get-ChildItem $path -Recurse -Include *.pdf, *.jp*, *.xl*, *.doc* | Select-Object FullName | Export-CSV $home\downloads\"$client files.csv"
Write-Host "Check your downloads folder."
Pause
As I said, I am brand new to this. Is there a different command I could use, or a way to tell the script to skip directory names or files over a certain length?
Thanks!

You can check the value of the .Length property of each item's .FullName, and if it's 256 characters or more, send it to Out-Null:
Ex.
$items = Get-ChildItem -Path C:\users\myusername\desktop\myfolder
foreach ($item in $items)
{
    if ($item.FullName.Length -lt 256)
    {
        # do some stuff with $item here
    }
    else
    {
        # path is 256+ characters; discard it
        $item | Out-Null
    }
}
If you want to check the parent folder's path as well, you could also test $item.Parent.FullName.Length in your processing.
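Applied back to the original script, a minimal sketch of that filtering (same $path and $client variables as above; -ErrorAction SilentlyContinue is my addition, to quiet the errors Get-ChildItem itself raises while enumerating over-long paths):
# Skip any directory whose full path is 248+ characters
Get-ChildItem $path -Recurse -dir -ErrorAction SilentlyContinue |
    Where-Object { $_.FullName.Length -lt 248 } |
    Select-Object FullName |
    Export-CSV "$home\downloads\$client directories.csv"
# Skip any file whose full path is 260+ characters
Get-ChildItem $path -Recurse -Include *.pdf, *.jp*, *.xl*, *.doc* -ErrorAction SilentlyContinue |
    Where-Object { $_.FullName.Length -lt 260 } |
    Select-Object FullName |
    Export-CSV "$home\downloads\$client files.csv"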

I think you should close your strings in the two Export-CSV lines. Instead of quoting only part of the path, quote the whole destination argument so the parser treats it as a single string.
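For example, the first export could be written as:
Get-ChildItem $path -Recurse -dir | Select-Object FullName | Export-CSV "$home\downloads\$client directories.csv"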


PowerShell: capture first filename from a folder

Newbie to PowerShell here.
I need to capture the first file name from a directory, but my current script captures all file names. Please suggest changes to my code below.
# Storing path of the desired folder
$path = "C:\foo\bar\"
$contents = Get-ChildItem -Path $path -Force -Recurse
$contents.Name
The result is the following:
test-01.eof
test-02.eof
test-03.eof
I just want one file (any one) from this list, so the expected result should be:
test-01.eof
You could use Select-Object with the -First parameter and set it to 1:
$path = "C:\foo\bar\"
$contents = Get-ChildItem -Path $path -Force -Recurse -File | Select-Object -First 1
I've added the -File switch to Get-ChildItem as well, since you only want to return files.
$path = "C:\foo\bar\"
$contents = Get-ChildItem -Path $path -Force -Recurse
$contents # lists out all details of all files
$contents.Name # lists out all files
$contents[0].Name # will return 1st file name
$contents[1].Name # will return 2nd file name
$contents[2].Name # will return 3rd file name
The count starts from 0. $contents here is an array (a list), and the integer you put in [] is the position of the item in that array. So when you type $contents[9], you are telling PowerShell to get the tenth item (index 9) from the array $contents. This is how you pick items out of a list. In most programming languages the count begins at 0, not 1. It is a bit confusing for someone who is entering the coding world, but you get used to it.
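For instance, here is a small sketch that walks the same $contents array by index:
for ($i = 0; $i -lt $contents.Count; $i++) {
    # $i runs from 0 up to Count - 1
    "$i -> $($contents[$i].Name)"
}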
Please use the command below, which is simple and helpful. Note it drops -Recurse, which would only put extra load on the machine or PowerShell (noticeable when the code is large and reused elsewhere).
# Storing path of the desired folder
$path = "C:\foo\bar\"
$contents = Get-ChildItem -Path $path | sort | Select-Object -First 1
$contents.Name
Output will be as expected:
test-01.eof
Select-Object -First selects the given number of objects (or rows) from the start of the pipeline, so setting it to 1 outputs just the first row's data.

How to get Get-ChildItem to handle path with non-breaking space

I have the following code, which works for most files. The input file (FoundLinks.csv) is a UTF-8 file with one file path per line: full paths of files on a particular drive that I need to process.
$inFiles = @()
$inFiles += @(Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv")
foreach ($inFile in $inFiles) {
    Write-Host ("Processing: " + $inFile)
    $objFile = Get-ChildItem -LiteralPath $inFile
    New-Object PSObject -Prop @{
        FullName   = $objFile.FullName
        ModifyTime = $objFile.LastWriteTime
    }
}
But even though I've used -LiteralPath, it still fails to process files that have a non-breaking space in the file name.
Processing: q:\Executive\CLC\Budget\Co  2018 Budget - TO Bob (GA Prophix).xlsx
Get-ChildItem : Cannot find path 'Q:\Executive\CLC\Budget\Co  2018 Budget - TO Bob (GA Prophix).xlsx'
because it does not exist.
At ListFilesWithModifyTime.ps1:6 char:29
+ $objFile = Get-ChildItem <<<< -LiteralPath $inFile
+ CategoryInfo : ObjectNotFound: (Q:\Executive\CL...A Prophix).xlsx:String) [Get-ChildItem], ItemNotFound
Exception
+ FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand
I know my input file has the non-breaking space in the path because I'm able to open it in Notepad, copy the offending path, paste into Word, and turn on paragraph marks. It shows a normal space followed by a NBSP just before 2018.
Is PowerShell not reading in the NBSP? Am I passing it wrong to -LiteralPath? I'm at my wit's end. I saw this solution, but in that case they are supplying the path as a literal in the script, so I can't see how I could use that approach.
I've also tried: -Encoding UTF8 parameter on Get-Content, but no difference.
I'm not even sure how I can check $inFile in the code just to confirm if it still contains the NBSP.
Grateful for any help to get unstuck!
Confirmed that $inFile has NBSP
Thank you all! As per @TheMadTechnician, I have updated the code like this, and also reduced my input file to only the one file having a problem.
$inFiles = @()
$inFiles += @(Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" -Encoding UTF8)
foreach ($inFile in $inFiles) {
    Write-Host ("Processing: " + $inFile)
    # list out all chars to confirm it has an NBSP
    $inFile.ToCharArray() | % { "{0} -> {1}" -f $_, [int]$_ }
    $objFile = Get-ChildItem -LiteralPath $inFile
    New-Object PSObject -Prop @{
        FullName   = $objFile.FullName
        ModifyTime = $objFile.LastWriteTime
    }
}
And so now I can confirm that $inFile in fact still contains the NBSP just as it gets passed to Get-ChildItem. Yet Get-ChildItem says the file does not exist.
More things I've tried:
Same if I use Get-Item instead of Get-ChildItem
Same if I use -Path instead of -LiteralPath
Windows explorer and Excel can deal with the file successfully.
I'm on a Windows 7 machine, PowerShell 2.
Thanks again for all the responses!
It's still unclear why Sandra's code didn't work: PowerShell v2+ is capable of retrieving files with paths containing non-ASCII characters; perhaps a non-NTFS filesystem with different character encoding was involved?
However, the following workaround turned out to be effective:
$objFile = Get-ChildItem -Path ($inFile -replace ([char] 0xa0), '?')
The idea is to replace the non-breaking space character (Unicode U+00A0, hex 0xA0) in the input file path with wildcard character ?, which represents any single character.
For Get-ChildItem to perform wildcard matching, -Path rather than -LiteralPath must be used (note that -Path is actually the default if you pass a path argument positionally, as the first argument).
Hypothetically, the wildcard-based paths could match multiple files; if that were the case, the individual matches would have to be examined to identify the specific match that has a non-breaking space in the position of the ?.
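A sketch of what that disambiguation could look like (hypothetical; it assumes $inFile still holds the original path containing the NBSP):
$candidates = Get-ChildItem -Path ($inFile -replace ([char] 0xa0), '?')
# String -eq is case-insensitive in PowerShell, so this keeps only the
# candidate whose full name exactly matches the NBSP-containing input path.
$objFile = $candidates | Where-Object { $_.FullName -eq $inFile }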
Get-ChildItem is for listing children so you would be giving it a directory, but it seems you are giving it a file, so when it says it cannot find the path, it's because it can't find a directory with that name.
Instead, you would want to use Get-Item -LiteralPath to get each individual item (these would be the same items you would get if you ran Get-ChildItem on the parent directory).
I think swapping in Get-Item would make your code work as is.
After testing, I think the above is in fact false, so sorry for that, but I will leave the below in case it's helpful, even though it may not solve your immediate problem.
But let's take a look at how it can be simplified with the pipeline.
First, you're starting with an empty array, then calling a command (Get-Content) which likely already returns an array, wrapping that in an array, then concatenating it to the empty one.
You could just do:
$inFiles = Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv"
Yes, there is a chance that $inFiles will contain only a single item and not an array at all.
But the nice thing is that foreach won't mind one bit!
You can do something like this and it just works:
foreach ($string in "a literal single string") {
Write-Host $string
}
But Get-Item (and Get-ChildItem for that matter) accept pipeline input, so they accept multiple items.
That means you could do this:
$inFiles = Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" | Get-Item
foreach ($inFile in $inFiles) {
    Write-Host ("Processing: " + $inFile)
    New-Object PSObject -Prop @{
        FullName   = $inFile.FullName
        ModifyTime = $inFile.LastWriteTime
    }
}
But even more than that, there is a pipeline-aware cmdlet for processing items, called ForEach-Object, to which you pass a [ScriptBlock], in which $_ represents the current item, so we could do it like this:
Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" |
    Get-Item |
    ForEach-Object -Process {
        Write-Host ("Processing: " + $_)
        New-Object PSObject -Prop @{
            FullName   = $_.FullName
            ModifyTime = $_.LastWriteTime
        }
    }
All in one pipeline!
But further, you're creating a new object with the 2 properties you want.
PowerShell has a nifty cmdlet called Select-Object which takes an input object and returns a new object containing only the properties you want; this would make for a cleaner syntax:
Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" |
Get-Item |
Select-Object -Property FullName,LastWriteTime
This is the power of the pipeline passing real objects from one command to another.
I realize this last example does not write the processing message to the screen; however, you could re-add that if you wanted:
Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" |
Get-Item |
ForEach-Object -Process {
Write-Host("Processing: " + $_)
$_ | Select-Object -Property FullName,LastWriteTime
}
But you might also consider that many cmdlets support verbose output and try to just add -Verbose to some of your existing cmdlets. Sadly, it won't really help in this case.
One final note: when you pass items such as the FileInfo objects from Get-ChildItem to the filesystem cmdlets via the pipeline, they bind to -LiteralPath (through their PSPath property) rather than -Path, so your special characters are still safe. Bare strings piped in, by contrast, bind to -Path.
I just ran into the same issue. It looks like Get-ChildItem (aka gci) expects the path in Unicode (UTF-16), so either convert the CSV file to Unicode or convert the lines that include the path to Unicode within your script.
Tested on PS 5.1.22621.608
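A minimal sketch of the first option, assuming the CSV has been re-saved as UTF-16 (which PowerShell's -Encoding parameter calls Unicode):
$inFiles = @(Get-Content -Path "C:\Users\sw_admin\FoundLinks.csv" -Encoding Unicode)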

PowerShell: check if a file exists in multiple folders and output the result

I have a few hundred folders that look like this:
\\uat.xxx.com\FileExport\New Collections\LCTS
\\uat.xxx.com\FileExport\New Collections\GBSS
\\uat.xxx.com\FileExport\New Collections\TRGS
etc
I need to check them for a specific file e.g. "Results 20150722New.dat"
I need to know which folders do not contain the file. It would be nice if that could be output to a file, e.g. log.txt, but mainly I need a list of the folders that do not contain it.
I have been trying to use Test-Path but am really not getting anywhere. Any chance someone could help me make a start on this?
As a one-time operation you can find the names (strings) of all directories that do not contain such a file name using:
Get-ChildItem "\\uat.xxx.com\FileExport\New Collections\" |
Where {$_.PSIsContainer } |
ForEach { if (-not(Test-Path "$($_.FullName)\Results 20150722New.dat")) {Echo $_.FullName } }
Optionally specify -Recurse switch to search folders recursively.
If there's a need to manipulate the results later, I'd prefer to save DirectoryInfo objects to a collection instead of converting them to strings with the Echo cmdlet.
$dirs_not_containing_file = @()
$dirs_not_containing_file +=
Get-ChildItem "\\uat.xxx.com\FileExport\New Collections\" |
Where {$_.PSIsContainer } |
ForEach { if (-not(Test-Path "$($_.FullName)\Results 20150722New.dat")) {$_} }
I split the second statement across multiple lines for readability.
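To also get the list into a log file, as the question mentions, the same pipeline can simply end in Out-File (a sketch; the log path is just an example):
Get-ChildItem "\\uat.xxx.com\FileExport\New Collections\" |
    Where {$_.PSIsContainer } |
    ForEach { if (-not(Test-Path "$($_.FullName)\Results 20150722New.dat")) {$_.FullName} } |
    Out-File "$home\log.txt"   # example output location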

How to use PowerShell to list duplicate files in a folder structure that exist in one of the folders

I have a source tree, say c:\s, with many sub-folders. One of the sub-folders is called "c:\s\Includes" which can contain one or more .cs files recursively.
I want to make sure that none of the .cs files in the c:\s\Includes... path exist in any other folder under c:\s, recursively.
I wrote the following PowerShell script which works, but I'm not sure if there's an easier way to do it. I've had less than 24 hours experience with PowerShell so I have a feeling there's a better way.
I can assume at least PowerShell 3 being used.
I will accept any answer that improves my script, but I'll wait a few days before accepting the answer. When I say "improve", I mean it makes it shorter, more elegant or with better performance.
Any help from anyone would be greatly appreciated.
The current code:
$excludeFolder = "Includes"
$h = @{}
foreach ($i in ls $pwd.path *.cs -r -file | ? DirectoryName -notlike ("*\" + $excludeFolder + "\*")) { $h[$i.Name]=$i.DirectoryName }
ls ($pwd.path + "\" + $excludeFolder) *.cs -r -file | ? { $h.Contains($_.Name) } | Select @{Name="Duplicate";Expression={$h[$_.Name] + " has file with same name as " + $_.Fullname}}
1
I stared at this for a while, determined to write it without studying the existing answers, but I'd already glanced at the first sentence of Matt's answer mentioning Group-Object. After some different approaches, I get basically the same answer, except his is long-form and robust with regex character escaping and setup variables, mine is terse because you asked for shorter answers and because that's more fun.
$inc = '^c:\\s\\includes'
$cs = (gci -R 'c:\s' -File -I *.cs) | group name
$nopes = $cs |?{($_.Group.FullName -notmatch $inc)-and($_.Group.FullName -match $inc)}
$nopes | % {$_.Name; $_.Group.FullName}
Example output:
someFile.cs
c:\s\includes\wherever\someFile.cs
c:\s\lib\factories\alt\someFile.cs
c:\s\contrib\users\aa\testing\someFile.cs
The concept is:
Get all the .cs files in the whole source tree
Split them into groups of {filename: {files which share this filename}}
For each group, keep only those where the set of files contains any file with a path that matches the include folder and contains any file with a path that does not match the includes folder. This step covers
duplicates (if a file only exists once it cannot pass both tests)
duplicates across the {includes/not-includes} divide, instead of being duplicated within one branch
handles triplicates, n-tuplicates, as well.
Edit: I added the ^ to $inc to say it has to match at the start of the string, so the regex engine can fail faster for paths that don't match. Maybe this counts as premature optimization.
2
After that pretty dense attempt, the shape of a cleaner answer is much much easier:
Get all the files, split them into include, not-include arrays.
Nested for-loop testing every file against every other file.
Longer, but enormously quicker to write (it runs slower, though) and I imagine easier to read for someone who doesn't know what it does.
$sourceTree = 'c:\\s'
$allFiles = Get-ChildItem $sourceTree -Include '*.cs' -File -Recurse
$includeFiles = $allFiles | where FullName -imatch "$($sourceTree)\\includes"
$otherFiles = $allFiles | where FullName -inotmatch "$($sourceTree)\\includes"
foreach ($incFile in $includeFiles) {
foreach ($oFile in $otherFiles) {
if ($incFile.Name -ieq $oFile.Name) {
write "$($incFile.Name) clash"
write "* $($incFile.FullName)"
write "* $($oFile.FullName)"
write "`n"
}
}
}
3
Because code-golf is fun. If the hashtables are faster, what about this even less tested one-liner...
$h=@{};gci c:\s -R -file -Filt *.cs|%{$h[$_.Name]+=@($_.FullName)};$h.Values|?{$_.Count-gt1-and$_-like'c:\s\includes*'}
Edit: explanation of this version: It's doing much the same solution approach as version 1, but the grouping operation happens explicitly in the hashtable. The shape of the hashtable becomes:
$h = @{
    'fileA.cs' = @('c:\cs\wherever\fileA.cs', 'c:\cs\includes\fileA.cs')
    'file2.cs' = @('c:\cs\somewhere\file2.cs')
    'file3.cs' = @('c:\cs\includes\file3.cs', 'c:\cs\x\file3.cs', 'c:\cs\z\file3.cs')
}
It hits the disk once for all the .cs files, iterates the whole list to build the hashtable. I don't think it can do less work than this for that bit.
It uses +=, so it can add files to the existing array for that filename, otherwise it would overwrite each of the hashtable lists and they would be one item long for only the most recently seen file.
It uses @() - because when it hits a filename for the first time, $h[$_.Name] won't return anything, and the script needs to put an array into the hashtable at that point, not a string. If it were +=$_.FullName then the first file would go into the hashtable as a string, and the += next time would do string concatenation, which is no use to me. This forces every file into a one-item array, so the first file seen for each name starts an array in the hashtable. The least-code way to get this result is with +=@(..), but that churn of creating a throwaway array for every single file is needless work. Maybe changing it to longer code which does less array creation would help?
Changing the section
%{$h[$_.Name]+=@($_.FullName)}
to something like
%{if (!$h.ContainsKey($_.Name)){$h[$_.Name]=@()};$h[$_.Name]+=$_.FullName}
(I'm guessing, I don't have much intuition for what's most likely to be slow PowerShell code, and haven't tested).
After that, using $h.Values isn't going over every file for a second time, it's going over every array in the hashtable - one per unique filename. That's got to happen to check the array size and prune the not-duplicates, but the -and operation short circuits - when the Count -gt 1 test fails, the bit on the right checking the path name doesn't run.
If the array has two or more files in it, the -and $_ -like ... executes and pattern matches to see if at least one of the duplicates is in the includes path. (Bug: if all the duplicates are in c:\cs\includes and none anywhere else, it will still show them).
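An untested sketch of a fix for that bug: also require the group to contain at least one path outside the includes folder.
# -like / -notlike applied to an array return the matching elements,
# so each test passes only if at least one path satisfies it
$h.Values | ?{ $_.Count -gt 1 -and ($_ -like 'c:\s\includes*') -and ($_ -notlike 'c:\s\includes*') }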
--
4
This is edited version 3 with the hashtable initialization tweak, and now it keeps track of seen files in $s, and then only considers those it's seen more than once.
$h=@{};$s=@{};gci 'c:\s' -R -file -Filt *.cs|%{if($h.ContainsKey($_.Name)){$s[$_.Name]=1}else{$h[$_.Name]=@()}$h[$_.Name]+=$_.FullName};$s.Keys|%{if ($h[$_]-like 'c:\s\includes*'){$h[$_]}}
Assuming it works, that's what it does, anyway.
--
Edit, branching off the topic: I keep thinking there ought to be a way to do this with the things in the System.Data namespace. Does anyone know if you can connect System.Data.DataTable().ReadXML() to gci | ConvertTo-Xml without reams of boilerplate?
I'd do more or less the same, except I'd build the hashtable from the contents of the includes folder and then run over everything else to check for duplicates:
$root = 'C:\s'
$includes = "$root\includes"
$includeList = @{}
Get-ChildItem -Path $includes -Filter '*.cs' -Recurse -File |
% { $includeList[$_.Name] = $_.DirectoryName }
Get-ChildItem -Path $root -Filter '*.cs' -Recurse -File |
? { $_.FullName -notlike "$includes\*" -and $includeList.Contains($_.Name) } |
% { "Duplicate of '{0}': {1}" -f $includeList[$_.Name], $_.FullName }
I'm not as impressed with this as I would like but I thought that Group-Object might have a place in this question so I present the following:
$base = 'C:\s'
$unique = "$base\includes"
$extension = "*.cs"
Get-ChildItem -Path $base -Filter $extension -Recurse |
Group-Object Name |
Where-Object{($_.Count -gt 1) -and (($_.Group).FullName -match [regex]::Escape($unique))} |
ForEach-Object {
$filename = $_.Name
($_.Group).FullName -notmatch [regex]::Escape($unique) | ForEach-Object{
"'{0}' has file with same name as '{1}'" -f (Split-Path $_),$filename
}
}
Collect all the files matching the extension filter $extension. Group the files based on their names. Then, of those groups, find every group where there is more than one file with that particular name and at least one of the group members is in the directory $unique. Take those groups and print out all the files that are not from the unique directory.
From Comment
For what it's worth, this is what I used for testing to create a bunch of files. (I know folder 9 is empty - Get-Random's -Maximum is exclusive, so the folder index only runs 0 through 8.)
$base = "E:\Temp\dev\cs"
Remove-Item "$base\*" -Recurse -Force
0..9 | %{[void](New-Item -ItemType directory "$base\$_")}
1..1000 | %{
$number = Get-Random -Minimum 1 -Maximum 100
$folder = Get-Random -Minimum 0 -Maximum 9
[void](New-Item -Path $base\$folder -ItemType File -Name "$number.txt" -Force)
}
After looking at all the others, I thought I would try a different approach.
$includes = "C:\s\includes"
$root = "C:\s"
# First script
Measure-Command {
[string[]]$filter = ls $includes -Filter *.cs -Recurse | % name
ls $root -include $filter -Recurse -Filter *.cs |
Where-object{$_.FullName -notlike "$includes*"}
}
# Second Script
Measure-Command {
$filter2 = ls $includes -Filter *.cs -Recurse
ls $root -Recurse -Filter *.cs |
Where-object{$filter2.name -eq $_.name -and $_.FullName -notlike "$includes*"}
}
In my first script, I get all the include file names into a string array. Then I use that string array as the -Include parameter on Get-ChildItem. In the end, I filter the includes folder out of the results.
In my second script, I enumerate everything and then filter after the pipe.
Remove the Measure-Command to see the results; I was using that to check the speed. With my dataset, the first one was 40% faster.
$FilesToFind = Get-ChildItem -Recurse 'c:\s\includes' -File -Include *.cs | Select -ExpandProperty Name
Get-ChildItem -Recurse C:\S -File -Include *.cs | ? { $_.Name -in $FilesToFind -and $_.Directory -notmatch '^c:\\s\\includes' } | Select Name, Directory
Create a list of file names to look for.
Find all files that are in the list but not part of the directory the list was generated from
Print their name and directory

PowerShell - Assigning unique file names to duplicated files using a list inside a .csv or .txt

I have limited experience with Powershell doing very basic tasks by itself (such as simple renaming or moving files), but I've never created one that has the need to actually extract information from inside a file and apply that data directly to a file name.
I'd like to create a script that can reference a simple .csv or text file containing a list of unique identifiers, and have it assign those to a batch of duplicated files (they all have the same contents) whose names differ only in a 3-digit number prepended to a generic name.
For example, let's say my list of files are something like this:
001_test.txt
002_test.txt
003_test.txt
004_test.txt
005_test.txt
etc.
Then my .csv contains an alphabetical list of what I would like those to become:
Alpha.txt
Beta.txt
Charlie.txt
Delta.txt
Echo.txt
etc.
I tried looking at similar examples, but I'm failing miserably trying to tailor them to get it to do the above.
EDIT: I didn't save what I already modified, but here is the baseline script I was messing with:
$file_server = Read-Host "Enter the file server IP address"
$rootFolder = 'C:\TEMP\GPO\source\5'
Get-ChildItem -LiteralPath $rootFolder -Directory |
Where-Object { $_.Name -as [System.Guid] } |
ForEach-Object {
$directory = $_.FullName
(Get-Content "$directory\gpreport.xml") |
ForEach-Object { $_ -replace "99.999.999.999", $file_server } |
Set-Content "$directory\gpreport.xml"
# ... etc
}
I think this is to replace a string inside a file though. I need to replace the file name itself using a list from another file (that is not getting renamed), while not changing the contents of the files that are being renamed.
So you want to rename similar files with names listed in a text file. OK, here's what you are going to need for my solution (aliases listed in parentheses): Get-Content (GC), Get-ChildItem (GCI), Where-Object (?), Rename-Item, ForEach-Object (%).
$NewNames = GC c:\temp\Namelist.txt #Path, including file name, to list of new names
$Name = "dog.txt" #File name without the 001_ prefix
$Path = "C:\Temp" #Path to search
$i=0
GCI $path | ?{$_.Name -match "\d{3}_$Name"}|%{Rename-Item $_.FullName $NewNames[$i];$i++}
Tested as working. That gets your list of new names and saves it as an array. Then it defines your file name, path, and sets $i to 0 as a counter. Then, for each file that matches your pattern, it renames the file based on item number $i in the array of new names, increments $i by one, and moves on to the next file.
I haven't tested this, but it should be pretty close. It assumes you have a CSV with a column named FileNames and that you have at least as many names in that list as there are on disk.
$newNames = Import-Csv newfilenames.csv | Select -ExpandProperty FileNames
$existingFiles = Get-ChildItem c:\someplace
for ($i = 0; $i -lt $existingFiles.count; $i++)
{
Rename-Item -Path $existingFiles[$i].FullName -NewName $newNames[$i]
}
Basically, you create two arrays and use a basic for loop to step through the list of files on disk, pulling the new name from the corresponding index in the $newNames array.
Does your CSV file map the identifiers to the file names?
Identifier,NewName
001,Alpha
002,Beta
If so, you'll need to look up the identifier before renaming the file:
# Define the naming convention
$Suffix = '_test'
$Extension = 'txt'
# Get the files and what to rename them to
$Files = Get-ChildItem "*$Suffix.$Extension"
$Csv = Import-Csv 'Names.csv'
# Rename the files
foreach ($File in $Files) {
$NewName = ($Csv | Where-Object { $File.Name -match '^' + $_.Identifier } | Select-Object -ExpandProperty NewName)
Rename-Item $File "$NewName.$Extension"
}
If your CSV file is just a sequential list of filenames, logicaldiagram's answer is probably more along the lines of what you're looking for.