Check if file has already been moved? - PowerShell

So right now I have a program that automatically moves files from one folder to another, but only once: if a file lands in that folder again, it shouldn't be moved. The application is executed every 30 minutes, so what I have right now is: if LastWriteTime is older than 30 minutes, don't move the file.
# Check if the file is older than 30 minutes
$olderthan = @(Get-ChildItem -Path "$src\$_.pdf" | ? { $_.LastWriteTime -ge $date } -ov olderthan)
if (-not $olderthan) {
    # It's older than 30 minutes: move no file
    $timesall = @(Get-ChildItem -Path "$src\$_.pdf" | Select-Object -Property BaseName)
    Write-LogRecord -Typ WARNING "'$($timesall.BaseName)' file(s) are not being moved because they're older than 30 minutes"
    $timesall = 0
} else {
    # Move file
}
And yes, it works, but are there other, better ways to do it?
Thanks in advance!

The alternative to inspecting file attributes is to do file tracking. I'll assume that the files do not continue to live in the destination folder (otherwise you could use Test-Path to see if a file exists before moving).
To me, the most straightforward tracking system would be to create a parallel folder into which you put files with the same names. Assuming a file has not been submitted before, you would copy A.txt into your destination and also create an A.txt in your tracking path (which could be an empty file, or not, see below). Now your test is to see if a file with the same name exists in your tracking folder.
Note: this method lets you easily reprocess a file by removing it from the tracking folder. It also just works when the scheduler does not fire, for whatever reason.
If you need more complex options, like accommodating a file that has changed, you could store fingerprint information, like size and a hash, in your tracking file. Your test could then also inspect those.
Lastly, at some point you'd probably want to groom your tracking folder. Using LastWriteTime and removing everything older than, say, 1 month (or whatever is right for your circumstances) would keep the tracking folder from getting too big. You could run this every time after the transfers, or on a separate schedule.
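Put together, the tracking-folder idea above might look like the sketch below. The folder paths and the one-month grooming window are placeholders to adapt to your setup.

```powershell
# Sketch of the tracking-folder approach; $src, $dst and $track are assumed paths.
$src   = 'C:\inbox'
$dst   = 'C:\outbox'
$track = 'C:\tracking'

Get-ChildItem -Path $src -Filter *.pdf | ForEach-Object {
    $marker = Join-Path $track $_.Name
    if (Test-Path $marker) {
        # Already processed on an earlier run; skip it.
        return
    }
    Move-Item -Path $_.FullName -Destination $dst
    # Create an empty marker file so the next run skips this name.
    New-Item -Path $marker -ItemType File | Out-Null
}

# Grooming: drop markers older than a month so the tracking folder stays small.
$cutoff = (Get-Date).AddMonths(-1)
Get-ChildItem -Path $track | Where-Object { $_.LastWriteTime -lt $cutoff } | Remove-Item
```

Removing a marker file by hand makes the corresponding source file eligible for processing again, which is the easy-reprocessing property mentioned above.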

Related

Powershell: Go through all files (PDF's) in a directory and move them based on what's written in the first 6 bytes

I am currently trying to write a PowerShell script that does the following:
Go through all PDF files in the directory the script is in
Check the first few bytes of those PDF files
If those bytes say something along the lines of "PK", move them to a different location
If the bytes say something else (e.g. PDF1.4), don't move them at all and go on to the next one.
Context: We have around 70k PDF files that can't be opened. After checking them with a certain tool, it looks like around 99% of those are damaged and the remaining 1% are zip files.
The first bytes of a zipped PDF file start with "PK"; the first bytes of a broken PDF file start with PDF1.4, for example.
I need to unzip all zip files and relocate them. Going through 70k PDF files by hand is kinda painful, so I'm looking for a way to automate it.
I know I'm supposed to provide a code sample, but the truth is that I am absolutely lost. I have written a few PowerShell scripts before, but I have no idea how to do something like this.
So, if anyone could kindly point me in the right direction or give me a useful function, I would really appreciate it a lot.
You can use Get-Content to get your first 6 bytes as you asked.
We can then tie that into a loop over all the documents and a simple if statement to decide what to do next, e.g. move the file to another directory.
EDITED BASED ON YOUR COMMENT:
$pdfDirectory = 'C:\Temp\struktur_id_1225\ext_dok'
$newLocation = 'C:\Path\To\New\Folder'
Get-ChildItem "$pdfDirectory" -Filter "*.pdf" | ForEach-Object {
    if ((Get-Content $_.FullName | Select-Object -First 1) -like "*PDF-1.5*") {
        $HL7 = $_.FullName.Replace("ext_dok", "MDM")
        $HL7 = $HL7.Replace(".pdf", ".hl7")
        Move-Item $_.FullName $newLocation
        Move-Item $HL7 $newLocation
    }
}
Try using the above, which is also a bit easier to edit.
$pdfDirectory will need to be set to the folder containing the PDF Files
$newLocation will obviously be the new directory!
And you will still need to change the -like "*PDF-1.5*" pattern to suit your search (note that -like uses * as its wildcard, not %)!
It should do the rest for you, give it a shot
Another Edit
I have mimicked your folder structure on my computer, and placed a few PDF files and matching HL7 files and the script is working perfectly.
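Since the underlying question is really about the file's first bytes rather than its text content, a byte-level signature check is more robust than -like on the first line. A minimal sketch, assuming the same placeholder paths as above (-Encoding Byte is Windows PowerShell 5.1 syntax; PowerShell 7+ uses -AsByteStream instead):

```powershell
$pdfDirectory = 'C:\Temp\struktur_id_1225\ext_dok'
$newLocation  = 'C:\Path\To\New\Folder'

Get-ChildItem $pdfDirectory -Filter *.pdf | ForEach-Object {
    # Read only the first two bytes of the file
    # (on PowerShell 7+ use: Get-Content $_.FullName -AsByteStream -TotalCount 2)
    $bytes = Get-Content $_.FullName -Encoding Byte -TotalCount 2
    # 0x50 0x4B is "PK", the ZIP file signature
    if ($bytes.Count -ge 2 -and $bytes[0] -eq 0x50 -and $bytes[1] -eq 0x4B) {
        Move-Item $_.FullName -Destination $newLocation
    }
}
```

Checking raw bytes sidesteps any encoding guesswork Get-Content would otherwise do when reading the file as text.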
Get-Content is not suited for reading PDF content; you'd want to use iTextSharp to read PDFs.
Download iTextSharp (found in releases) and put itextsharp.dll somewhere easy to find (e.g. the folder your script is located in).
You can install the .nupkg using Install-Package, or simply use an archive tool to extract the contents of the .nupkg file (it's basically a .zip file).
The code below adds every word on page 1 of each PDF, separated by whitespace, to an array. You can then test whether the array contains your keyword.
Add-Type -Path "C:\path\to\itextsharp.dll"
$pdfs = Get-ChildItem "C:\path\to\pdfs" *.pdf
foreach ($pdf in $pdfs) {
    $reader = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList $pdf.FullName
    # Split the extracted page text on whitespace
    $text = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, 1) -split '\s+'
    foreach ($line in $text) {
        # do your test here
    }
    $reader.Close()
}

Powershell IF ELSE based on File Size

I was trying to find a way in PowerShell to move a file based on its size, and could not find exactly what I was looking for. I found how to move files of only a certain size and how to do other if/then statements, but not how to move a file to different locations based on its size.
Why did I need/want to do this? An exe I am running creates an output file even if it has no data, so sometimes the file is empty and sometimes it has data. When it has data I need it sent to someone; when it's empty I just want it in a backup folder for reference.
This part let me select a file based on size (-cle is the case-sensitive variant of -le, less than or equal to; plain -le works too for numbers):
$BlankFiles = Get-ChildItem c:\test\*.rej | where { $_.Length -cle 0kb}
This part let me check if an empty file exists (after lots of reading I went with System.IO.File over Test-Path):
[System.IO.File]::Exists($BlankFiles)
Putting this all into an if/else statement was the part I struggled with. The answer I came up with is below.
I am mainly posting this since I could not find this exact scenario, in case anyone sees a problem with the approach that I missed.
Here is the solution I came up with, and in all the tests I did it appears to be working as intended. Note: I only need to do this on one file at a time, which is why this works and why I left out recursion or loops.
If the file is blank, it moves it to a backup folder with the date appended; if it has data, it makes a copy with the date appended in the backup folder and moves the file, date appended, to a different location accessible to the necessary users.
I considered counting the lines in the file rather than checking its size, but it appears that the blank file sometimes contains a line break and sometimes doesn't, so I went with the size method instead.
$BlankFiles = Get-ChildItem c:\test\*.rej | Where-Object { $_.Length -le 0kb }
$date = Get-Date
$fndate = $date.ToString("MMddyyyy")
if ([System.IO.File]::Exists($BlankFiles)) {
    Move-Item c:\test\*.rej "c:\test\blankfiles-$fndate.rej"
}
else {
    Copy-Item c:\test\*.rej "c:\test\realfiles-$fndate.rej"
    Move-Item c:\test\*.rej "c:\user\accessible\realfiles-$fndate.rej" -Force
}
If anyone sees any issues with doing it this way, or has better suggestions, please say so; but as I mentioned, from my tests it appears to be working wonderfully, and I thought I would share.
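A slightly tighter variant of the same idea works directly off the object Get-ChildItem returns, avoiding the detour through [System.IO.File]::Exists. This is only a sketch with the same placeholder paths, still assuming a single .rej file at a time:

```powershell
$fndate = (Get-Date).ToString("MMddyyyy")
# Grab the (single) .rej file, if any
$file = Get-ChildItem c:\test\*.rej | Select-Object -First 1

if ($file) {
    if ($file.Length -le 0) {
        # Empty: archive it with the date appended
        Move-Item $file.FullName "c:\test\blankfiles-$fndate.rej"
    } else {
        # Has data: keep a dated backup, then move the original for the users
        Copy-Item $file.FullName "c:\test\realfiles-$fndate.rej"
        Move-Item $file.FullName "c:\user\accessible\realfiles-$fndate.rej" -Force
    }
}
```

Branching on $file.Length inside a single if/else also means the script does nothing at all when no .rej file exists, rather than falling into the "has data" branch by accident.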

how to check if file with same name but with different extension exists in a directory Powershell

I am trying to find two things here. I have thousands of files in a folder; let's take one file as an example, and we can apply the same logic to all files:
Check if a file with the same name but a different extension exists.
If it exists, compare the LastWriteTime timestamps to find out which file is newer.
For example, if I have a file culture.txt, I am supposed to have a corresponding file culture.log.
If I have culture.txt but culture.log is missing, that's an issue, so I want to output the names of all .txt files for which the corresponding .log file is missing.
If both culture.txt and culture.log are available, I want to check whether culture.txt was generated after culture.log. If culture.txt was generated before culture.log, there is an issue, so I need to output the names of such .txt files with a message like "culture.txt was generated before culture.log - please rerun the program".
Anyone who can help would be appreciated. Thank you.
A little more help needed on the same question, if I can get it. The code suggested by Esperento is working fine, but the requirement has been updated. In a folder, I have multiple files with multiple extensions, not limited to just .txt and .log; I can have .doc, .docx, .xls and many other files in the same folder.
Now about the updated requirement. I have to look at file names with 3 specific extensions only. One of them is a program file, which is obviously generated first, say Culture.prog. When I run Culture.prog, two files are generated: culture.log first, then culture.txt.
So the timestamp on .prog is older than .log, and the timestamp on .log is older than .txt, which is generated last.
We have to check the availability of the 2 corresponding files (.log and .prog) in reference to the .txt file only, which is generated last.
So the first check is whether the 2 corresponding files exist for the .txt file. The next check is that the timestamps of these 3 files are in order. We only have to output when one of the conditions is not satisfied; otherwise it's fine not to output anything. For example, if for culture.txt the .log or .prog file is missing, we have to output which file (or both) is missing. If the timestamp of the .txt file is older than the .log and/or .prog, we have to output that fact. I hope my request is clear. Thank you.
try this:
# List files and group them by name without extension
Get-ChildItem "C:\temp\test" -File -Filter "*.*" | Group-Object BaseName |
ForEach-Object {
    $group = $_.Group
    # If the group contains only one file, its partner is missing
    if ($_.Count -eq 1) {
        "'{0}' is missing its partner" -f $group.Name
    }
    # Otherwise compare creation times within the group and report the ordering
    else {
        $group | ForEach-Object {
            $file = $_
            $group | ForEach-Object {
                if ($_.CreationTime -gt $file.CreationTime) {
                    "'{0}' has been generated before '{1}'" -f $file.Name, $_.Name
                }
            }
        }
    }
} | Out-File "C:\temp\test\result.txt"
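For the updated prog -> log -> txt requirement, a sketch driven by the .txt files could look like the following. The folder path matches the question; the exact output messages are my own wording:

```powershell
$folder = 'C:\temp\test'

Get-ChildItem $folder -File -Filter *.txt | ForEach-Object {
    $base = $_.BaseName
    $log  = Join-Path $folder "$base.log"
    $prog = Join-Path $folder "$base.prog"

    # First check: both companion files must exist for this .txt file
    foreach ($companion in $log, $prog) {
        if (-not (Test-Path $companion)) {
            "{0}: missing '{1}'" -f $_.Name, (Split-Path $companion -Leaf)
        }
    }

    # Second check: timestamps must run prog -> log -> txt
    if ((Test-Path $log) -and (Test-Path $prog)) {
        $logTime  = (Get-Item $log).LastWriteTime
        $progTime = (Get-Item $prog).LastWriteTime
        if ($_.LastWriteTime -lt $logTime -or $logTime -lt $progTime) {
            "{0}: files were not generated in prog -> log -> txt order" -f $_.Name
        }
    }
}
```

Because the loop iterates only over .txt files, all the other extensions in the folder (.doc, .docx, .xls, ...) are ignored automatically, and nothing is printed when everything is in order.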

Can I keep a PowerShell script from deleting moved and cut files?

My need is to delete all files older than 14 days in a public folder. I have cobbled together a PowerShell script that just about does the trick as I need it. The only problem is, if the user moves a file into the folder - as opposed to copying it - my script will delete that file if it was last accessed more than 14 days ago, even if it was moved into the public folder the same day. The same thing happens with cut and paste. So this is a pretty serious problem.
Here is my script:
# Delete all files older than "file_age" days, at "path".
$path = "C:\Users\emcguire\Desktop\Test"
$file_age = -14
$current_date = Get-Date
$date_to_delete = $current_date.AddDays($file_age)
Get-ChildItem $path -Recurse | Where-Object { $_.LastAccessTime -lt $date_to_delete } | Remove-Item
I am pretty new to PowerShell, so I may be missing something very obvious. Is there an easy way to check for files that were moved into the folder but do not have their access timestamp changed? Is there a better way to approach this?
I appreciate any help!
The LastAccessTime property is notoriously unreliable and best avoided; use LastWriteTime wherever possible first. Additionally, all those properties are cached, meaning they aren't refreshed when you read them. Use code like this to call the Refresh method so you have fresh file system info before you query the property:
$file = "c:\somefile.txt"
$fileObj = New-Object System.IO.FileInfo $file
$fileObj.Refresh()
As you want to base your actions on how long the file has been in the folder, you may want to use the CreationTime attribute instead. I added a link below to the list of available properties in case there's a better one for your needs.
For reference on the refresh method
For reference on properties to value
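Following that suggestion, the original script only needs the property swapped; a sketch with the same path as the question:

```powershell
# Delete all files created (not merely accessed) more than 14 days ago.
$path = "C:\Users\emcguire\Desktop\Test"
$date_to_delete = (Get-Date).AddDays(-14)

Get-ChildItem $path -Recurse |
    Where-Object { $_.CreationTime -lt $date_to_delete } |
    Remove-Item
```

One caveat worth testing: a move within the same volume generally preserves CreationTime, while a copy (or a move across volumes) resets it, so check which operations your users actually perform before relying on this.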

Powershell script to move files based on a source list (.txt)

I have thousands of files in a directory (.pdf, .xls, .doc) and they all follow a similar naming convention (the "type" is always a constant string, e.g. billing or invoice):
accountname_accountnumber_type.pdf
accountname_accountnumber_type.doc
accountname_accountnumber_type.xls
The task at hand is to receive a random list of account names and account numbers (the "type" is always a constant, e.g. billing, invoice, shipping or order, and the file formats vary) and move the matching files from directory A into directory B. I can get the list into a .csv file to match accountname_accountnumber_type.
I have been trying to create a PowerShell script to reference the accountname_accountnumber and move those items from directory A to directory B, with no luck.
SAMPLE: I found something a bit simpler, but I want to be able to edit this to create a new destination directory and not halt if a file from the list is not found. Also, if I could have it pick from a .txt list, I think that would be easier than pasting everything.
$src_dir = "C:\DirA\"
$dst_dir = "D:\DirB-mm-dd-yyyy\" # This requires the destination dir to already exist; I need the code to create a new directory, ideally named after the date the script runs
$file_list = "accountname1_accountnumber001_type", # If I could import a csv here instead
    "accountname2_accountnumber002_type",
    "accountname3_accountnumber003_type",
    "accountname4_accountnumber004_type",
    "accountname5_accountnumber005_type",
    "accountname6_accountnumber006_type"
foreach ($file in $file_list) # This errors out and stops the script if the file is not in the source; I need it to continue, ideally with an error output
{
    Move-Item "$src_dir$file" $dst_dir
}
The files can be in any format. I am trying to get the code to match ONLY the accountname and accountnumber, since those two define the exact customer. Whether it is invoice, billing or shipping doesn't matter, since they want all files associated with that customer moved.
For example, there could be 4 files of each type for every account, and the format may vary between pdf, doc and xls; I need to move all files based on their first two indicators (accountname, accountnumber).
alice_001_invoice.pdf
alice_001_billing.doc
alice_001_shipping.pdf
alice_001_order.xls
George_245_invoice.pdf
George_245_billing.doc
George_245_shipping.pdf
George_245_order.xls
Bob_876_invoice.pdf
Bob_876_billing.doc
Bob_876_shipping.pdf
Bob_876_order.xls
Horman_482_invoice.pdf
Horman_482_billing.doc
Horman_482_shipping.pdf
Horman_482_order.xls
CSV:
accountname,accountnumber
Alice,001
George,245
Bob,876
Horman,482
How about this:
$CurrentDate = [DateTime]::Now.ToString("MM-dd-yyyy")
$DestinationDir = "D:\DirB-$CurrentDate"
New-Item $DestinationDir -ItemType Directory -ErrorAction SilentlyContinue
$AccountToMove = Import-Csv $CSVPath
foreach ($Account in $AccountToMove) {
    $FilePattern = "*$($Account.AccountName)*$($Account.AccountNumber)*"
    Get-ChildItem $SourceDir | Where-Object Name -like $FilePattern | Move-Item -Destination $DestinationDir
}
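To cover the other two asks, reading the names from a .txt list and continuing with a warning when nothing matches, a sketch along these lines might work; the list path and directory names are placeholders:

```powershell
$src_dir = 'C:\DirA'
$dst_dir = "D:\DirB-$((Get-Date).ToString('MM-dd-yyyy'))"
$list    = 'C:\DirA\accounts.txt'   # one accountname_accountnumber per line

# Create the dated destination directory; ignore the error if it already exists
New-Item $dst_dir -ItemType Directory -ErrorAction SilentlyContinue | Out-Null

foreach ($name in Get-Content $list) {
    # Match every file for this account, whatever the type or extension
    $found = Get-ChildItem -Path $src_dir -Filter "$name*"
    if (-not $found) {
        # Report the miss and keep going instead of halting the script
        Write-Warning "No files found for '$name'"
        continue
    }
    $found | Move-Item -Destination $dst_dir
}
```

Using Write-Warning plus continue gives you the error output you asked for while letting the loop process the rest of the list.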
The code part about moving files to subdirectories, which you have already edited out of the post, doesn't make much sense with your business rules. As you never showed the sample CSV file contents, it's all guessing.
For easier processing, assume you have the following source files. Edit your post to show the CSV file contents and where you would like the files moved.
C:\some\path\A\Alice_001_bill.doc
C:\some\path\A\Alice_001_invoice.xls
C:\some\path\A\Bob_002_invoice.pdf
C:\some\path\A\Bob_002_invoice.doc
C:\some\path\A\Eve_003_bill.xls
C:\some\path\A\Eve_003_invoice.doc