Read a file and then based on file content names, move the file from one folder to another using Powershell script - powershell

I need to read a file (e.g. file.txt) which has file names as its content. File names are separated by unique character (e.g. '#'). So my file.txt looks something like:
ABC.txt#
CDE.csv#
XYZ.txt#
I need to read its content line by line based on its extension. I have 1 source folder and 1 destination folder. Below is my scenario that I need to achieve:
If extension = txt then
check if that file name exists in destination_folder1 or destination_folder2
if that file exists then
copy that file from source_folder1 to destination_folder1
else delete that file from destination_folder1
Else display msg as "Invalid file"
I am new to powershell scripting. can someone pls help? Thanks in advance.

It will make my job easier if we assume the following pseudocode. Then you can take the elements I demonstrate and change them to fit your needs.
If the string from "file.txt" contains the file extension "txt" then continue.
If the file does not exist in the destination folder then copy the file from the source folder to the destination folder.
Use Get-Content to read a text file.
Get-Content .\file.txt
Get-Content processes files line by line. This has a few consequences:
Each line in our input text file will trigger our code.
Each time our code triggers, it will have input that looks like this: ABC.txt#
We can focus on solving the problem for one line.
If we need to evaluate strings, I suggest using regular expressions.
Remember, we are operating on a single line from the text file:
ABC.txt#
We need to detect the file extension.
A good place to start would be the end of the string.
In regular expressions, the end of a string is represented by $
So let's start there.
Here is our regular expression so far:
$
The next thing that would be useful is if we accounted for that # symbol. We can do that by adding it before $
#$
If there was a different character, we would add that instead: ;$ Keep in mind that there are reserved characters in regular expressions. So we might need to escape certain characters with a backslash: \$$
Now we have to account for the file extension.
We have three letters, we don't know what they are.
Regular expressions have a special escape sequence (a shorthand character class) that matches any word character, meaning a letter, a digit, or an underscore: \w
Let's add three of those.
\w\w\w#$
Now, while crafting regular expressions, it is a good idea to limit the text we're looking for.
As humans, we know we're looking for .txt# But, so far, the computer only knows about txt# with no dot. So it would accept .txt#, .xlsx#, and anythingGoes# as matches. We limited the right side of our string. Now let's limit the left side.
We're only interested in three characters. And the left side is bounded by a . So let's add that to our regular expression. I'll also mention that a period is a reserved character in regular expressions. So, we will have to escape it.
\.\w\w\w#$
So if we're looking at text like this
ABC.txt#
then our regular expression will output text like this
.txt#
Now, .txt# is a pretty good result. But we can make our job a little easier by limiting the result to just the file extension.
There are several ways of doing this. But I suggest using regular expression groups.
We create a group by surrounding our target with parentheses.
\.(\w\w\w)#$
This now produces output like:
txt
From here, we can just make intuitive comparisons like if txt = txt.
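As a rough sketch of that comparison, here is the grouped regular expression applied to the sample line from the question. The variable names `$line` and `$extension` are made up for this illustration.

```powershell
# Hypothetical sample line, as it would arrive from Get-Content.
$line = 'ABC.txt#'

# -match fills the automatic $Matches variable; group 1 holds the extension.
if ($line -match '\.(\w\w\w)#$') {
    $extension = $Matches[1]
    if ($extension -eq 'txt') {
        Write-Output "Text file detected: $line"
    }
}
```

The single-quoted pattern keeps PowerShell from trying to expand the trailing $ before the regex engine sees it.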
Another piece of the puzzle is testing whether a file already exists. For this we can use the Test-Path and Join-Path cmdlets.
$destination = ".\destination 01"
$exampleFile = "ABC.txt"
$destinationFilePath = Join-Path -Path $destination -ChildPath $exampleFile
Test-Path -Path $destinationFilePath
With these concepts, it is possible to write a working example.
# Folder locations.
$source = ".\source"
$destination = ".\destination 01"
# Load input file.
Get-Content .\file.txt |
Where-Object {
# Enter our regular expression.
# I've added an extra group to capture the file name.
# The $matches automatic variable is created when the -match comparison operator is used.
if ($_ -match '([\w ]+\.(\w\w\w))#$')
{
# Which file extensions are we interested in processing?
# Here $matches[2] represents the file extension: ex "txt".
# We use a switch statement to handle each type of file extension.
# Accept new file types by creating new switch cases.
switch ($matches[2])
{
"txt" {$true; Break}
#"csv" {$true; Break}
#"pdf" {$true; Break}
default {$false}
}
}
else { $false }
} |
ForEach-Object {
# Here $matches[1] is the file name captured from the input file.
$sourceFilePath = Join-Path -Path $source -ChildPath $matches[1]
$destinationFilePath = Join-Path -Path $destination -ChildPath $matches[1]
$fileExists = Test-Path -Path $destinationFilePath
# Copy the source file to the destination if the destination doesn't exist.
if (!$fileExists)
{ Copy-Item -Path $sourceFilePath -Destination $destinationFilePath }
}
Note on Copy-Item
Copy-Item has known issues.
Issue #10458 | PowerShell | GitHub
Issue #2581 | PowerShell | GitHub
You can substitute robocopy which is more reliable.
Robocopy - Wikipedia
The robocopy syntax is:
robocopy <source> <destination> [<file>[ ...]] [<options>]
where <source> and <destination> can be folders only.
So, if you want to copy a file, you have to write it like this:
robocopy .\source ".\destination 01" ABC.txt
We can invoke robocopy using Start-Process and the variables we already have.
# Copy the source file to the destination if the destination doesn't exist.
if (!$fileExists)
{
Start-Process -FilePath "robocopy.exe" -ArgumentList "`"$source`" `"$destination`" `"$($matches[1])`" /unilog+:.\robolog.txt" -WorkingDirectory (Get-Location) -NoNewWindow
}
Using Get-ChildItem
You use file.txt as input. If you wanted to gather a list of files on disk instead, you can use Get-ChildItem.
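A minimal sketch of that idea, assuming the files of interest sit in the current folder and end in .txt (both are assumptions for illustration, as is the variable name `$textFiles`):

```powershell
# Lists the .txt files in the current folder; -File skips subdirectories.
# Wrapping the call in @() guarantees an array even when nothing matches.
$textFiles = @(Get-ChildItem -Path . -Filter '*.txt' -File)

foreach ($file in $textFiles) {
    Write-Output $file.Name
}
```

Each item is a FileInfo object, so properties such as Name, Extension, and FullName are available without any regular expressions.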
Multiple Conditions
You wrote "destination_folder1 or destination_folder2". If you need multiple conditions, you can construct them from three things:
Use the if statement. Inside the test condition, combine conditions with the logical -or operator. Group each condition in parentheses to make the whole test easier to read.
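Put together, those three pieces might look like the sketch below. The folder names and the file name are placeholders, and `$existsSomewhere` is a name invented for this example.

```powershell
# Hypothetical folder and file names, used only for illustration.
$destination1 = '.\destination 01'
$destination2 = '.\destination 02'
$fileName     = 'ABC.txt'

# Each Test-Path call is wrapped in parentheses so -or combines two Boolean results.
$existsSomewhere = (Test-Path -Path (Join-Path -Path $destination1 -ChildPath $fileName)) -or
                   (Test-Path -Path (Join-Path -Path $destination2 -ChildPath $fileName))

if ($existsSomewhere) {
    Write-Output "$fileName exists in at least one destination folder."
}
else {
    Write-Output "$fileName was not found in either destination folder."
}
```

Storing the combined result in a variable first keeps the if line short and makes the condition easy to log or debug.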
Functions
If you need to move a piece of code around, you can use a function. Just remember to create parameters for the inputs to the function. Then call a PowerShell function without parentheses or commas:
# Calling a PowerShell function.
myFunction parameterOne parameterTwo parameterThree
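For completeness, here is a sketch of what defining such a function with parameters could look like. `Test-FileInFolder` and its parameter names are hypothetical and do not come from the original script.

```powershell
# Hypothetical helper: returns $true when the file exists in the given folder.
function Test-FileInFolder {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [Parameter(Mandatory)] [string] $FileName
    )
    Test-Path -LiteralPath (Join-Path -Path $Folder -ChildPath $FileName)
}

# Called without parentheses or commas; named parameters document the call.
$found = Test-FileInFolder -Folder '.\destination 01' -FileName 'ABC.txt'
```

Positional calls such as `Test-FileInFolder '.\destination 01' 'ABC.txt'` work too, but naming the parameters tends to be easier to read later.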
Writing Output
You can use Write-Output to send text to the console.
Write-Output "Invalid File"
Further Reading
Here are some references which you might find useful.
about_Comparison_Operators - PowerShell | Microsoft Docs
about_Pipelines - PowerShell | Microsoft Docs
about_Switch - PowerShell | Microsoft Docs
Regular-Expressions.info - Regex Tutorial, Examples and Reference - Regexp Patterns
Where-Object (Microsoft.PowerShell.Core) - PowerShell | Microsoft Docs

Related

How can I add a line break for every tilde found within the contents of several files found at a path?

I would like to use PowerShell to add a line break for every tilde it finds in a file.
The source could contain many .in files which contain tildes.
I have this script so far, and could benefit by some assistance in how to tweak it.
This will work for one file, but not for many:
(Get-Content -Path '.\amalgamatedack.in') |
ForEach-Object {$_.Replace('~', "~`r`n")} |
Set-Content -Path '.\amalgamatedack.in'
You can use Get-ChildItem to find all your .in files, then follow the same logic, just replace the input and output hardcoded file name for the absolute path of each file (.FullName property).
Your code could also benefit by using Get-Content -Raw, assuming these files are not very big and they fit in memory, reading the content as single multi-line string is always faster.
# If you need to search recursively for the files use `-Recurse`
Get-ChildItem path\to\sourcefolder -Filter *.in | ForEach-Object {
($_ | Get-Content -Raw).Replace('~', "~`r`n") |
Set-Content -Path $_.FullName
}

Script returning error: "Get-Content : An object at the specified path ... does not exist, or has been filtered by the -Include or -Exclude parameter"

EDIT
I think I now know what the issue is - The copy numbers are not REALLY part of the filename. Therefore, when the array pulls it and then is used to get the match info, the file as it is in the array does not exist, only the file name with no copy number.
I tried writing a rename script but the same issue exists... only the few files I manually renamed (so they don't contain copy numbers) were renamed (successfully) by the script. All others are shown not to exist.
How can I get around this? I really do not want to manually work with 23000+ files. I am drawing a blank..
HELP PLEASE
I am trying to narrow down a folder full of emails (copies) with the same name "SCADA Alert.eml", "SCADA Alert[1].eml"...[23110], based on contents. And delete the emails from the folder that meet specific content criteria.
When I run it I keep getting the error in the subject line above. It only sees the first file and the rest it says do not exist...
The script reads through the folder, creates an array of names (does this correctly).
Then it creates a variable, $email, and assigns it the content of that file, for each $filename in the array.
(this is where it breaks)
Then it should match the specific string I am looking for against the content of the $email var and return true or false. If true, I want it to remove the email, $filename, from the folder.
Thus narrowing down the email I have to review.
Any help here would be greatly appreciated.
This is what I have so far... (Folder is in the root of C:)
$array = Get-ChildItem -name -Path $FolderToRead #| Get-Content | Tee C:\Users\baudet\desktop\TargetFile.txt
Foreach ($FileName in $array){
$FileName # Check File
$email = Get-Content $FolderToRead\$FileName
$email # Check Content
$ContainsString = "False" # Set Var
$ContainsString # Verify Var
$ContainsString = %{$email -match "SYS$,ROC"} # Look for String
$ContainsString # Verify result of match
#if ($ContainsString -eq "True") {
#Remove-Item $FolderToRead\$element
#}
}
Here's a PowerShell-idiomatic solution that also resolves your original problems:
Get-ChildItem -File -LiteralPath $FolderToRead | Where-Object {
(Get-Content -Raw -LiteralPath $_.FullName) -match 'SYS\$,ROC'
} | Remove-Item -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
Note how the $ character in the RHS regex of the -match operator is \-escaped in order to use it verbatim (rather than as metacharacter $, the end-of-input anchor).
Also, given that $ is also used in PowerShell's string interpolation, it's better to use '...' strings (single-quoted, verbatim strings) to represent regexes, assuming no actual up-front string expansion is needed before the regex engine sees the resulting string - see this answer for more information.
As for what you tried:
The error message stemmed from the fact that Get-Content $FolderToRead\$FileName binds the file-name argument, $FolderToRead\$FileName, implicitly (positionally) to Get-Content's -Path parameter, which expects PowerShell wildcard patterns.
Since your file names literally contain [ and ] characters, they are misinterpreted by the (implied) -Path parameter, which can be avoided by using the -LiteralPath parameter instead (which must be specified explicitly, as a named argument).
%{$email -match "SYS$,ROC"} is unnecessarily wrapped in a ForEach-Object call (% is a built-in alias); while that doesn't do any harm in this case, it adds unnecessary overhead;
$email -match "SYS$,ROC" is enough, though it needs to be corrected to
$email -match 'SYS\$,ROC', as explained above.
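The difference between -Path and -LiteralPath can be seen in a small experiment. The sketch below creates a scratch folder and file purely for demonstration; the variable names are made up, and only the file name pattern comes from the question.

```powershell
# Scratch folder and file created only for this demonstration.
$folder = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $folder | Out-Null
$file = Join-Path -Path $folder -ChildPath 'SCADA Alert[1].eml'
[System.IO.File]::WriteAllText($file, 'SYS$,ROC')

# -Path treats [1] as a wildcard set matching the single character "1",
# so it looks for "SCADA Alert1.eml", which does not exist.
$viaPath = Test-Path -Path $file

# -LiteralPath takes the name verbatim, brackets included.
$viaLiteralPath = Test-Path -LiteralPath $file
$content = Get-Content -Raw -LiteralPath $file

# Clean up the scratch folder.
Remove-Item -LiteralPath $folder -Recurse
```

Comparing `$viaPath` with `$viaLiteralPath` shows why only the files without copy numbers appeared to exist.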
[System.IO.Directory]::EnumerateFiles($Folder) |
Where-Object {$true -eq [System.IO.File]::ReadAllText($_, [System.Text.Encoding]::UTF8).Contains('SYS$,ROC') } |
ForEach-Object {
Write-Host "Removing $($_)"
#[System.IO.File]::Delete($_)
}
Your mistakes:
%{$email -match "SYS$,ROC"} - What is % intended to do here? It is an alias for ForEach-Object, which is not needed.
%{$email -match "SYS$,ROC"} - Why use -match? A regex match is slower than -like or String.Contains() for a plain substring test.
%{$email -match "SYS$,ROC"} - When using $ inside double quotes, you should escape it with a single backtick (I have `$100). Otherwise, everything after $ is treated as a variable name: Hello, $username; It's $($weather.ToString()) today!
Write debug output the right way: use Write-Debug, Write-Verbose, Write-Host, Write-Warning, Write-Error, or Write-Information.
Can be better:
Avoid using Get-ChildItem when you only need file names, because it returns full objects with attributes (mtime, atime, ctime, etc.), and gathering that extra information costs additional work per file. When you need only a list of files, use the native .NET EnumerateFiles method from System.IO.Directory. This is a significant performance boost on huge numbers of files.
Use ReadAllText, ReadAllLines, or ReadAllBytes from the System.IO.File static class to be more specific instead of using the universal Get-Content.
Use pipelines ;-)
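The points above about -match, -like, String.Contains(), and escaping $ can be tried side by side. This is only a sketch; the sample text and variable names are invented for illustration.

```powershell
# Hypothetical sample text containing the string from the question.
$text = 'Alarm SYS$,ROC raised'

# Regex: $ is the end-of-input anchor, so it must be escaped with a backslash.
$regexHit = $text -match 'SYS\$,ROC'

# Wildcard: $ has no special meaning, but the pattern needs * on both sides.
$likeHit = $text -like '*SYS$,ROC*'

# Plain, case-sensitive substring test.
$containsHit = $text.Contains('SYS$,ROC')

# In a double-quoted string, a backtick keeps $ literal.
$price = "I have `$100"
```

Single-quoted strings sidestep the interpolation question entirely, which is why they are used for the patterns here.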

How do I copy a list of files and rename them in a PowerShell Loop

We are copying a long list of files from their different directories into a single location (same server). Once there, I need to rename them.
I was able to move the files until I found out that there are duplicates in the list of file names to move (and rename). It would not allow me to copy the file multiple times into the same destination.
Here is the list of file names after the move:
"10.csv",
"11.csv",
"12.csv",
"13.csv",
"14.csv",
"15.csv",
"16.csv",
"17.csv",
"18.csv",
"19.csv",
"20.csv",
"Invoices_Export(16) - Copy.csv" (this one's name should be "Zebra.csv")
I wrote a couple of foreach loops, but it is not working exactly correctly.
The script moves the files just fine. It is the rename that is not working the way I want. The first file does not rename; the other files rename. However, they leave the moved file in place too.
This script requires a csv that has 3 columns:
Path of the file, including the file name (eg. c:\temp\smefile.txt)
Destination of the file, including the file name (eg. c:\temp\smefile.txt)
New name of the file. Just the name and extension.
# Variables
$Path = (import-csv C:\temp\Test-CSV.csv).Path
$Dest = (import-csv C:\temp\Test-CSV.csv).Destination
$NN = (import-csv C:\temp\Test-CSV.csv).NewName
#Script
foreach ($D in $Dest) {
$i -eq 0
Foreach ($P in $Path) {
Copy-Item $P -destination C:\Temp\TestDestination -force
}
rename-item -path "$D" -newname $NN[$i] -force
$i += 1
}
There was no error per se, just not the outcome that I expected.
Welcome to Stack Overflow!
There are a couple ways to approach the duplicate names situation:
Check if the file exists already in the destination with Test-Path. If it does, start a while loop that appends a number to the end of the name and check if that exists. Increment the number you append after each check with Test-Path. Keep looping until Test-Path comes back $false and then break out of the loop.
Write an error message and skip that row in the CSV.
I'm going to show a refactored version of your script with approach #2 above:
$csv = Import-Csv 'C:\temp\Test-CSV.csv'
foreach ($row in $csv)
{
$fullDestinationPath = Join-Path -Path $row.Destination -ChildPath $row.NewName
if (Test-Path $fullDestinationPath)
{
Write-Error ("The path '$fullDestinationPath' already exists. " +
"Skipping row for $($row.Path).")
continue
}
# You may also want to check if $row.Path exists before attempting to copy it
Copy-Item -Path $row.Path -Destination $fullDestinationPath
}
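Approach #1 is not shown in the refactored script, but it could be sketched roughly as follows. `Get-UniquePath` is a hypothetical helper name, and the " (1)" numbering style is just one possible convention.

```powershell
# Hypothetical helper for approach #1: returns a destination path that is not yet taken.
function Get-UniquePath {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [Parameter(Mandatory)] [string] $FileName
    )
    $baseName  = [System.IO.Path]::GetFileNameWithoutExtension($FileName)
    $extension = [System.IO.Path]::GetExtension($FileName)
    $candidate = Join-Path -Path $Folder -ChildPath $FileName
    $counter   = 1

    # Keep appending an incrementing number until Test-Path reports the name is free.
    while (Test-Path -LiteralPath $candidate) {
        $candidate = Join-Path -Path $Folder -ChildPath "$baseName ($counter)$extension"
        $counter++
    }
    $candidate
}
```

Inside the loop over the CSV rows, the copy would then become something like `Copy-Item -Path $row.Path -Destination (Get-UniquePath -Folder $row.Destination -FileName $row.NewName)`.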
Now that your question is answered, here are some thoughts for improving your code:
Avoid using acronyms and abbreviations in identifiers (variable names, function names, etc.) when possible. Remember that code is written for humans and someone else has to be able to understand your code; make everything as obvious as possible. Someone else will have to read your code eventually, even if it's Future-You™!
Don't Repeat Yourself (called the "DRY" principle). As Lee_daily mentioned in the comments, you don't need to import the CSV file three times. Import it once into a variable and then use the variable to access the properties.
Try to be consistent. PowerShell is case-insensitive, but you should pick a style and stick to it (i.e. ForEach or foreach, Rename-Item or rename-item, etc.). I would recommend PascalCase as PowerShell cmdlets are all in PascalCase.
Wrap literal paths in single quotes (or double quotes if you need string interpolation). Paths can have spaces in them and without quotes, PowerShell interprets a space as you are passing another argument.
$i -eq 0 is not an assignment statement, it is a boolean expression. When you run $i -eq 0, PowerShell will return $true or $false because you are asking it if the value stored in $i is 0. To assign the value 0 to $i, you need to write it like this: $i = 0.
There's nothing wrong with $i += 1, but it could be shortened to $i++, if you want to.
When you can, try to check for common issues that may come up with your code. Always think about what can go wrong. "If I copy a file, what can go wrong? Does the source file or folder exist? Is the name pulled from the CSV a valid path name or does it contain characters that are invalid in a path (like :)?" This is called defensive programming and it will save you so so many headaches. As with anything in life, be careful not to go overboard. Only check for likely scenarios; rare edge-cases should just raise errors.
Write some decent logs so you can see what happened at runtime. PowerShell provides a pair of great cmdlets called Start-Transcript and Stop-Transcript. These cmdlets log all the output that was sent to the PowerShell console window, in addition to some system information like the version of PowerShell installed on the machine. Very handy!

Copying files defined in a list from network location

I'm trying to teach myself enough PowerShell or batch programming to figure out how to achieve the following. I've searched and watched a couple of hours of YouTube tutorials but can't quite piece it all together (I don't get tokens, for example, but they seem necessary in the for loop). I'm also not sure whether the below is best achieved with robocopy or xcopy.
Task:
Define a list of files to retrieve in a csv (file name will be listed as a 13 digit number, extension will be UNKNOWN, but will usually be .jpg but might occasionally be .png - could this be achieved with a wildcard?)
list would read something like:
9780761189931
9780761189988
9781579657159
For each line in this text file, do:
Search a network folder and all subfolders
If exact filename is found, copy to an arbitrary target (say a new folder created on desktop)
(Not 100% necessary, but nice to have) Once the For loop has completed, output a list of files copied into a text file in the newly created destination folder
I gather that I'll maybe need to do a couple of things first, like define variables for the source and destination folders? I found the below elsewhere but couldn't quite get my head around it.
set src_folder=O:\2017\By_Month\Covers
set dst_folder=c:\Users\%USERNAME%\Desktop\GetCovers
for /f "tokens=*" %%i in (ISBN.txt) DO (
xcopy /K "%src_folder%\%%i" "%dst_folder%"
)
Thanks in advance!
This solution is in powershell, by the way.
To get all subfiles of a folder, use Get-ChildItem and the pipeline, and you can then compare the name to the insides of your CSV (which you can get using import-CSV, by the way).
Get-ChildItem -path $src_folder -recurse | foreach{$_.fullname}
I'd personally then use a function to edit the name as a string, but I know this probably isn't the best way to do it. Create a function outside of the pipeline, and have it return a modified path in such a way that you can continue the previous line like this:
Get-ChildItem -Path $src_folder -Recurse -File | ForEach-Object { $_.CopyTo((edit-path $_.FullName)) }
Where "edit-path" is your function that takes in the path and modifies it to return your destination path. The -File switch keeps directories out of the pipeline, since only files have a CopyTo method. Also, you can use robocopy or xcopy instead of CopyTo, but Copy-Item is native to PowerShell and doesn't require much string manipulation (and in my experience, the less of that, the better).
Edit: Here's a function that could do the trick:
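function edit-path {
    Param([string] $path)
    # Strip the source folder prefix, then re-root the remainder under the destination folder.
    $relative_path = $path.Substring($src_folder.Length).TrimStart('\')
    $modified_path = Join-Path -Path $dst_folder -ChildPath $relative_path
    return $modified_path
}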
Edit: Here's how to integrate the importing from CSV, so that the copy only happens to files that are written in the CSV (which I had left out, oops):
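# Assumes the list has no header row; -Header names its single column "Name".
$csv = Import-Csv $CSV_path -Header Name
Get-ChildItem -Path $src_folder -Recurse -File | Where-Object { $csv.Name -contains $_.BaseName } | ForEach-Object { $_.CopyTo((edit-path $_.FullName)) }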
Note that you have to put the whole CSV path in the $CSV_path variable, and depending on how the contents of that file are written, you may have to use $_.fullname, or other parameters.
This seems like an average enough problem:
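# Assumes the list has no header row; -Header names its single column "ISBN".
$Arr = (Import-Csv -Path $CSVPath -Header ISBN).ISBN
Get-ChildItem -Path $Folder -Recurse -File |
Where-Object -FilterScript { $Arr -contains $PSItem.Name.Substring(0, ($PSItem.Name.Length - $PSItem.Extension.Length)) } |
ForEach-Object -Process {
    # Copy the matching file, then record its name in the results list.
    Copy-Item -LiteralPath $PSItem.FullName -Destination $env:UserProfile\Desktop
    $PSItem.Name | Out-File -FilePath $env:UserProfile\Desktop\Results.txt -Append
}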
I'm not great with string manipulation so the string bit is a bit confusing, but here's everything spelled out.

Powershell concatenating text to a variable

My source files all reside in one folder whose path is contained in a variable named $template.
I need to specify the exact filename as each file goes to a different destination.
My goal is to merely concatenate the filename to the variable.
Example:
$template = "D:\source\templatefiles\"
Filename1 is: "graphic-183.jpg"
I have tried:
Join-Path $template graphic-183.jpg
Issuing this at the cli appears to do what I want.
But now, how do I reference this concatenated file path short of creating a new variable for each file? It isn't as simple as for-nexting my way through a list as depending on the filename that determines where the file goes.
I am toying with case else, elseIf, but surely it isn't this hard.
The bottom line is, I just want to prefix the folder path to each filename and hard code the destination as it will always be the same each time the script is run.
edit
I just edited this as I forgot to mention how I am trying to use this.
In my script I intend to have lines like:
Copy-Item -Path $template filename.ext -Destination $destfolder
It's the highlighted part above that I am trying to join $template to the filename.
Thanks for any advice.
-= Bruce D. Meyer
maybe this is what you want?
you can call cmdlets in place, using parentheses, like so:
Copy-Item -Path (Join-Path $template filename.ext) -Destination $destfolder
this causes PowerShell to go from "argument mode" to "expression mode" - i.e., it returns the output of the Join-Path cmdlet as an expression.
and yes, David's and Ansgar's suggestions are also helpful - try this to get full paths only:
(get-childitem $template) | select fullname
You could build the path like this:
$template = "D:\source\templatefiles\"
Copy-Item -Path "${template}filename.ext" ...
However, I think David's suggestion might be a better solution for your problem. You could map filenames to destination folders with a hash table and do something like this:
$locations = @{
    "foo" = "C:\some"
    "bar" = "C:\other"
    ...
}
Get-ChildItem $template | % { Copy-Item $_.FullName $locations[$_.Name] }