Only convert files with the string "DUPLICATE" in the name - powershell

I'm trying to make a script that converts PDFs to TIF.
It copies the right files from one folder to another (thanks to the community's previous help).
Next it converts all of the PDFs to TIFF.
Lastly it renames the .tiff files to .tif.
What I want to do now is convert only the PDFs with "DUPLICATE" in the file name to TIFF, and then remove "DUPLICATE" from the new TIFF's file name.
Does anyone know how to do that?
gci X:\IT\PDFtoTIFF\1 -filter {VKF*} | Move-Item -destination X:\IT\PDFtoTIFF\2
$tool = 'C:\Program Files (x86)\GPLGS\gswin32c.exe'
$pdfs = get-childitem . -recurse | where {$_.Extension -match "pdf"}
foreach($pdf in $pdfs)
{
$tiff = $pdf.FullName.split('.')[0] + '.tiff'
if(test-path $tiff)
{
"tiff file already exists " + $tiff
}
else
{
'Processing ' + $pdf.Name
$param = "-sOutputFile=$tiff"
& $tool -q -dNOPAUSE -sDEVICE=tiffg4 $param -r300 $pdf.FullName -c quit
}
}
Dir *.tiff | rename-item -newname { $_.name -replace ".tiff",".tif" }
More details:
The script needs to work like this:
All files in the folder \\itgsrv028\invoices$\INST that start with VKF need to be moved to this folder: \\itgsrv028\invoices$\INST\V3
(This is currently working in the script)
Only convert the files with "DUPLICAAT" in the name to TIFF
Rename VKF_320150309DUPLICAAT.Tiff to 320150309.tif
Example:
These files in the folder:
VKF_320150309.PDF
VKF_320150309DUPLICAAT.PDF
Need to become:
VKF_320150309.PDF
VKF_320150309DUPLICAAT.PDF
320150309.TIF (Converted from: VKF_320150309DUPLICAAT.PDF)

About using only "DUPLICAAT": You have to change your filtering a bit, to include a match for "DUPLICAAT" in there, like this:
$pdfs = get-childitem . -recurse | where {$_.Extension -match "pdf" -and $_.basename -match "DUPLICAAT"}
About building a new name for the TIFF: You can use a capture group in a regular expression to pull out the part you want from between the known prefix and suffix. With your VKF_320150309DUPLICAAT.PDF as an example, you can build the TIFF file name like this:
$tiff="$($pdf.directory)\$($pdf.basename -replace "VKF_([\w\s]+)DUPLICAAT",'$1').tiff"
This combines three things: the -replace operator applied to a string, $(expression) subexpressions that are replaced by their evaluated values inside a string, and a literal path separator and extension in that same string. It resolves as follows:
The whole thing is an expandable string, as indicated by the double quotes wrapped around it.
The first $(expression) evaluates to the value of $pdf.directory, which is the path to the parent folder without a trailing backslash. With $pdf equal to X:\IT\PDFtoTIFF\2\VKF_320150309DUPLICAAT.PDF this returns X:\IT\PDFtoTIFF\2.
The second $(expression) evaluates $pdf.basename -replace "VKF_([\w\s]+)DUPLICAAT",'$1'. With the same PDF this becomes "VKF_320150309DUPLICAAT" -replace "VKF_([\w\s]+)DUPLICAAT",'$1'. The parenthesized part of the pattern matches "320150309"; that value is assigned to $1, which then replaces the whole matched region. Thus the name is stripped of both "VKF_" and "DUPLICAAT" in one go.
The two resulting strings are joined with a backslash in between and .tiff appended, giving X:\IT\PDFtoTIFF\2\320150309.tiff.
I hope this helps you build better scripts that work with strings in PowerShell.
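As a quick check of the capture-group replacement described above, here is a minimal sketch that works on plain strings, so it can be run without any PDFs on disk. The variables $baseName, $directory, and $newBase are illustrative stand-ins for $pdf.BaseName and $pdf.Directory, and the added ^ and $ anchors are an assumption to keep the pattern from matching in the middle of a longer name.

```powershell
# Stand-ins for $pdf.BaseName and $pdf.Directory (illustrative values)
$baseName  = 'VKF_320150309DUPLICAAT'
$directory = 'X:\IT\PDFtoTIFF\2'

# -replace is case-insensitive by default; $1 is the text captured by the parentheses
$newBase = $baseName -replace '^VKF_([\w\s]+)DUPLICAAT$', '$1'

# Build the target path by string expansion
$tiff = "$directory\$newBase.tiff"
$tiff
```

A name that does not match the pattern is returned unchanged by -replace, so it is worth filtering on "DUPLICAAT" first, as the Where-Object clause above does.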

Related

How do I remove a "base" portion of a path?

I have a series of folders in a source directory that I need to recreate in another environment. The source folder look like this:
C:\temp\stuff\folder01\
C:\temp\stuff\folder02\
C:\temp\stuff\folder03\
C:\temp\stuff\folder02\dog
C:\temp\stuff\folder02\cat
I need to remove the base portion of the path - C:\temp\stuff - and grab the rest of t he path to concatenate with the destination base folder in order to create the folder structure somewhere else.
I have following script. The problem seems to be with the variable $DIR. $DIR never gets assigned the expected "current path minus the base path".
The variable $tmp is assigned something close to the expected folder name. It contains the folder name minus vowels, is split across multiple lines, and includes a bunch of leading whitespace. For example, a folder named folder01 would look like the following:
f
ld
r01
Here's the PowerShell script:
$SOURCES_PATH = "C:\temp\stuff"
$DESTINATION_PATH = "/output001"
Get-ChildItem "$SOURCES_PATH" -Recurse -Directory |
Foreach-Object {
$tmp = $_.FullName.split("$SOURCES_PATH/")
$DIR = $_.FullName.split("$SOURCES_PATH/")[1]
$destination = "$DESTINATION_PATH/$DIR"
*** code omitted ***
}
I suspect the $DIR appears to be unassigned because the [1] element is whitespace but I'm don't know why there is whitespace and what's happening to the folder name.
What's going on and how do I fix this?
String.Split("somestring") will split on every occurrence of any of the characters in "somestring", which is why you're seeing the paths being split into many more parts than you're expecting.
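To see the per-character splitting in isolation, here is a small sketch on a hard-coded path. It passes an explicit char[] so the behavior is the same regardless of which Split overload a given PowerShell version would pick for a plain string argument (newer versions running on .NET Core may bind a whole-string overload instead). The variable names are illustrative only.

```powershell
$fullName = 'C:\temp\stuff\folder01'

# Passing a char[] makes the "split on any of these characters" behavior explicit
$separators = 'C:\temp\stuff/'.ToCharArray()
$parts = $fullName.Split($separators)

# Most pieces are empty strings; only fragments between separator characters survive
$nonEmpty = $parts | Where-Object { $_ -ne '' }
$nonEmpty -join ','
```

Every character that also appears in the separator string (including the f and e of "folder01") is consumed, which is why only scraps of the folder name remain.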
I'd suggest using -replace '^base' to remove the leading part of the path:
$SOURCES_PATH = "C:\temp\stuff"
$DESTINATION_PATH = "/output001"
Get-ChildItem "$SOURCES_PATH" -Recurse -Directory |Foreach-Object {
# This turns "C:\temp\stuff\folder01" into "\folder01"
$relativePath = $_.FullName -replace "^$([regex]::Escape($SOURCES_PATH))"
# This turns "\folder01" into "/folder01"
$relativeWithForwardSlash = $relativePath.Replace('\','/')
# Concatenate destination root and relative path
$rootedByDestination = "${DESTINATION_PATH}${relativeWithForwardSlash}"
# Create/Copy $rootedByDestination here
}
-replace is a regular expression operator, which is why I run [regex]::Escape() against the input path, to escape the backslashes :)
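The effect of [regex]::Escape() can be checked on plain strings. This is a minimal sketch with illustrative variable names, not part of the original script:

```powershell
$sourcesPath = 'C:\temp\stuff'
$fullName    = 'C:\temp\stuff\folder02\dog'

# Escape doubles each backslash so the regex engine treats it as a literal character
$pattern = '^' + [regex]::Escape($sourcesPath)

# With no replacement operand, -replace removes the matched text
$relative = $fullName -replace $pattern
$relative
```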
Consider replacing the source path with an empty string. Then you can either concatenate what's left onto the destination path, or use Path.Combine to take care of the concatenation and any separator drama:
Get-ChildItem $SOURCES_PATH -Recurse -Directory | ForEach-Object {
# Trim the leading backslash: Path.Combine returns a rooted second argument as-is
$relative = $_.FullName.Replace($SOURCES_PATH, '').TrimStart('\')
$destination = [System.IO.Path]::Combine($DESTINATION_PATH, $relative)
}

How can I replace \\server\Share$ to D: using powershell

I'm trying to gather a report of long directory paths to give to each user who has them, so that they can use it to shorten those folder paths.
How can I replace \\server\Share$ with X:? I tried the code below, but nothing changes. I only get results if I replace a single character or a single string such as "\\server", but not the combination "\\server\Share$". Can someone tell me what I'm doing wrong?
$results= "\\\server\Share$\super\long\directory\path\"
$usershare="\\\server\Share$"
$Results | ForEach-Object { $_.FullName = $_.FullName -replace "$usershare", 'X:' }
The output I need, which is what the users will see on their systems, is:
X:\super\long\directory\path\
Because the $userShare variable contains characters that have special meaning in Regular Expressions (and -replace uses Regex), you need to [Regex]::Escape() that string.
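A minimal sketch of the difference, using plain strings with two leading backslashes rather than real Get-ChildItem output. The variable names $path, $unescaped, and $escaped are illustrative.

```powershell
$userShare = '\\server\Share$'
$path      = '\\server\Share$\super\long\directory\path\'

# Unescaped: "\\" means one literal backslash, "\S" means any non-whitespace character,
# and "$" anchors to the end of the string, so nothing matches and the path is unchanged
$unescaped = $path -replace $userShare, 'X:'

# Escaped: every character in the share name is matched literally
$escaped = $path -replace ([regex]::Escape($userShare)), 'X:'
$escaped
```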
First thing to notice is that you start the UNC paths with three backslashes, where you should only have two.
Next is that your $results variable is simply declared as a string and should probably be the result of a Get-ChildItem command.
I guess what you want to do is something like this:
$uncPath = "\\server\Share$\super\long\directory\path\" # the UNC folder path
$usershare = "\\server\Share$"
$results = Get-ChildItem -Path $uncPath | ForEach-Object {
$_.FullName -replace ([regex]::Escape($usershare)), 'X:'
}
Hope that helps

Windows cmd command for stripping versions from filenames?

Need Windows cmd command to rename files to names without version numbers, e.g.:
filename.exa.1 => filename.exa
filename_a.exb.23 => filename_a.exb
filename_b.exc.4567 => filename_b.exc
Filenames are variable in number of characters, and the primary extension is always 3 characters.
I once had a Solaris script "stripv" to accomplish this. I could enter "stripv *" in a directory and get a nice clean set of non-versioned files. If the command would result in duplicate filenames because multiple versions exist, then it would just skip the operation altogether.
TIA
Don't know how to do it in CMD, but here is some Powershell that would work for you:
# Quick way to get an array of filenames. You could also create a proper array,
# or read each line into an array from a file.
$filepaths = @"
C:\full\path\to\filename.exa.1
C:\full\path\to\filename_a.exb.23
\\server\share\path\to\filename_b.exc.4567
"@ -split "\r?\n"
# For each path in $filepaths
$filepaths | Foreach-Object {
$path = $_
# Split-Path -Leaf gets only the filename
# -Replace expression just means to match on the ".number" at the end of the
# filename and replace it with an empty string (effectively removing it)
$newFilename = ( Split-Path -Leaf $path ) -Replace '\.\d+$', ''
# Warning output
Write-Warning "Renaming '${path}' to '${newFilename}'"
# Rename the file to the new name
Rename-Item -Path $path -NewName $newFilename
}
Basically, this code creates an array of full paths to files. For each path, it strips the filename from the full path and replaces the .number pattern at the end with nothing, which removes it from the filename. Now that we have the new filename, we use Rename-Item to rename the file to the new name.
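The replacement pattern can be tried on the example names from the question before touching any files. This is a small sketch; $names and $stripped are illustrative variables.

```powershell
$names = 'filename.exa.1', 'filename_a.exb.23', 'filename_b.exc.4567'

# \.\d+$ matches a dot followed by one or more digits at the very end of the name
$stripped = $names | ForEach-Object { $_ -replace '\.\d+$', '' }
$stripped -join ','
```

Names without a trailing numeric version are left untouched, because -replace returns the input unchanged when the pattern does not match.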
Supply the folder name to this script block's $Folder variable, and it will enumerate the items within that folder, locate the last '.' character within the file name, and rename it as everything prior to the '.'.
E.g.: Filename.123.wrcrw.txt.123 would be renamed as Filename.123.wrcrw.txt or in your case, your files would lose the extraneous characters from the final '.' onwards. If the new name for the file already exists, it will write a warning stating that it could not rename the file, and continue on without trying.
$Folder = "C:\ProgramData\Temp"
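# Skip names without a '.', because Substring would throw when LastIndexOf returns -1
Get-ChildItem -Path $Folder | Where-Object { $_.Name -like '*.*' } | Foreach {
$NewName = $_.Name.Substring(0,$_.Name.LastIndexOf('.'))
IF (!(Test-Path "$Folder\$NewName"))
{
Rename-Item $_.FullName -NewName $NewName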
}
Else
{
Write-Warning "$($_.Name) cannot be renamed, $NewName already exists."
}
}
This should effectively mimic the behaviour you described for stripv *. This could easily be turned into a function with name stripv and added to your PowerShell profile to make it available at the command-line interactively and used in the same way as your Solaris script.
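One possible shape for such a function is sketched below. Note the swapped-in technique: unlike the script block above, it only touches names that end in a dot followed by digits (the \.\d+$ pattern from the other answer), which is an assumption about what stripv should do. The parameter name -Folder and the skip-on-conflict behavior are also choices made for this sketch, not a description of the original Solaris script.

```powershell
function stripv {
    param([string]$Folder = '.')

    # Snapshot the listing first so renames do not affect the enumeration
    $files = @(Get-ChildItem -LiteralPath $Folder -File |
        Where-Object { $_.Name -match '\.\d+$' })

    foreach ($file in $files) {
        $newName = $file.Name -replace '\.\d+$', ''
        $target  = Join-Path $file.DirectoryName $newName

        if (Test-Path -LiteralPath $target) {
            # Leave the file alone rather than overwrite an existing one
            Write-Warning "$($file.Name) cannot be renamed, $newName already exists."
        }
        else {
            Rename-Item -LiteralPath $file.FullName -NewName $newName
        }
    }
}
```

After adding it to a profile, it could be called as stripv C:\ProgramData\Temp, or as stripv with no argument for the current directory.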

How do I replace substrings that start with 'X' and end with '.tif'?

I have a string that includes both data and the names of image files, delimited by tabs.
The names of the image files are 41 characters long and end with the file extension .tif (example: X1126225548817153725411111_PPPPP_00333.tif)
I would like to remove the substrings that match these criteria, but I'm not sure which string tricks to use.
You can try the following to rename these files:
get-childitem "YourDirectory\*.tif" |
foreach { $newName = ($_.BaseName).TrimStart("X")
Rename-Item $_.FullName $newName }
Basename removes the file extension, and TrimStart("X") removes the leading "X".
I figured it out. I was using the wrong wildcard for regular expressions
Here's the code:
PS C:\Users\mharper> $data[1] -Replace "X......................................tif`t" , "" `
-Replace "`t`t" , "`t"
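As an alternative to counting dots, a quantified pattern can express the same idea. This is only a sketch under the assumption that a file name never contains a tab and that no other field contains an X followed later by .tif; the sample line and the variables $line and $clean are made up for illustration.

```powershell
# A tab-delimited sample: data, image file name, data
$line = "alpha`tX1126225548817153725411111_PPPPP_00333.tif`tbeta"

# X, then any run of non-tab characters, then a literal ".tif" and the trailing tab
$clean = $line -replace 'X[^\t]*\.tif\t', ''
$clean
```

Because the trailing tab is part of the match, the remaining fields stay separated by a single tab and no second clean-up pass is needed for this sample.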

Reversing a file name based on delimiter then truncating part

I need to rename many hundreds of files to follow a new naming convention, but I'm having awful trouble. This really needs to be scripted in PowerShell or VBS so we can automate the task on a regular basis.
Original File Name
Monday,England.txt
New File Name
EnglanMo
Convention Rules:
The file name is reversed around the delimiter (,) to England,Monday and then truncated to 6 and 2 characters:
Englan,Mo
The delimiter is then removed:
englanmo.txt
Say we had Wednesday,Spain.txt: Spain is only 5 characters, so it is not shortened:
SpainWe.txt
All the txt files can be accessed in one directory, or from a CSV, whatever is easiest.
Without having the exact details of your file paths, where it'll run, etc. you'll have to adapt this to point at the appropriate path(s).
$s= "Monday,England.txt";
#$s = "Wednesday,Spain.txt";
$nameparts = $s.split(".")[0].split(",");
if ($nameparts[1].length -gt 6) {
$newname = $nameparts[1].substring(0,6);
} else {
$newname = $nameparts[1];
}
if ($nameparts[0].length -gt 2) {
$newname += $nameparts[0].substring(0,2);
} else {
$newname += $nameparts[0];
}
$newname = $newname.toLower() + "."+ $s.split(".")[1];
$newname;
get-item $s |rename-item -NewName $newname;
I'm certain this isn't the most efficient/elegant way to do this, but it works with both of your test cases.
Use Get-ChildItem to grab the files, then on files that match your criteria, use regular expressions to capture the first two characters of the day of the week and the first six characters of the location, then use those captures to create a new filename. This is my best guess. Use -WhatIf on the Move-Item cmdlet until you get the regular expression and the destination path correct.
Get-ChildItem C:\Path\To\Files *.txt |
Where-Object { $_.BaseName -match '^([^,]{2})[^,]*,(.{1,6})' } |
Move-Item -WhatIf -Destination {
$newFileName = '{0}{1}.txt' -f $matches[2],$matches[1]
Join-Path C:\Path\To\Files $newFileName.ToLower()
}
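The capture-group idea can also be tried on plain strings before pointing it at real files. The sketch below uses a hypothetical helper, Get-ShortName, that is not part of the answer above; it assumes exactly one comma in the name and lower-cases the result, as the first answer does.

```powershell
function Get-ShortName {
    param([string]$FileName)

    # Group 1: up to 2 characters of the day, group 2: up to 6 of the location,
    # group 3: the extension including its dot
    if ($FileName -match '^([^,]{1,2})[^,]*,([^.]{1,6})[^.]*(\.[^.]+)$') {
        ('{0}{1}{2}' -f $Matches[2], $Matches[1], $Matches[3]).ToLower()
    }
}

Get-ShortName 'Monday,England.txt'
Get-ShortName 'Wednesday,Spain.txt'
```

Names that do not fit the pattern produce no output, which makes it easy to spot files that need manual attention.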
I think you should be able to achieve this by splitting the string into arrays in PowerShell and then reordering the array elements to get your reversed name.
For example:
$fileNameExtension = "Monday,England.txt"
$fileName = $fileNameExtension.Split(".")   # gives you an array: [Monday,England] [txt]
$fileparts = $fileName[0].Split(",")        # gives you an array: [Monday] [England]
# Create the new substring parts; you can now pull items from the array in any order you need.
# Check the length before calling Substring, because it throws if the string is too short.
$part1 = $fileparts[1].Substring(0, [Math]::Min(6, $fileparts[1].Length))
$part2 = $fileparts[0].Substring(0, [Math]::Min(2, $fileparts[0].Length))
# Now construct the new file name by rebuilding the string
$newfileName = $part1 + $part2 + "." + $fileName[1]