Rename Files with Index (Excel) - PowerShell

Anyone have any ideas on how to rename files by finding an association with an index file?
I have a file/folder structure like the following:
Folder name = "Doe, John EO11-123"
Several files under this folder
The index file (MS Excel) has several columns. It contains the names in two columns (First and Last). It also has a column containing the number EO11-123.
What I would like to do is write a script to look at the folder names in a directory, compare/find an associated value in the index file (like that number EO11-123), and then rename all the files under the folder using a 4th column value in the index.
So,
Folder name = "Doe, John EO11-123", index column1 contains same value "EO11-123", use column2 value "111111_000000" and rename all the files under that directory folder to "111111_000000_0", "111111_000000_1", "111111_000000_2" and so on.
Is this possible with PowerShell or VBScript?

Ok, I'll answer the questions in your comment first. Importing the data into PowerShell lets you build an array that you can match against, or better yet a hashtable to reference for your renaming purposes. I'll get into that later, but it's far better than trying to have PowerShell talk to Excel and use Excel's search functions, because this way everything stays in PowerShell and there are no third-party application dependencies. As for importing, that script is a function that you can load into your current session, so you run that function and it will automatically take care of the import for you (it opens Excel, opens the XLS(x) file, saves it as a temp CSV file, closes Excel, imports that CSV file into PowerShell, and then deletes the temp file).
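For reference, here is a rough sketch of what a function like that does under the hood; the function name and parameter are illustrative (this is not the actual Import-XLS implementation), and it assumes Excel is installed on the machine:
function Import-XlsViaExcel {
    param([string]$Path)
    $excel = New-Object -ComObject Excel.Application    # requires Excel to be installed
    $excel.Visible = $false
    $workbook = $excel.Workbooks.Open($Path)
    $tempCsv = Join-Path $env:TEMP ("{0}.csv" -f [guid]::NewGuid())
    $workbook.SaveAs($tempCsv, 6)                       # 6 = xlCSV file format
    $workbook.Close($false)
    $excel.Quit()
    [Runtime.InteropServices.Marshal]::ReleaseComObject($excel) | Out-Null
    $data = Import-Csv $tempCsv                         # bring the rows in as PowerShell objects
    Remove-Item $tempCsv                                # clean up the temp file
    $data
}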
Now, you did not state what your XLS file looks like, so I'm going to assume it's got a header row, and looks something like this:
FirstName | LastName | Identifier | FileCode
Joe | Shmoe | XA22-573 | JS573
John | Doe | EO11-123 | JD123
If that's not your format, you'll need to either adapt my code, or your file, or both.
So, how do we do this? First, download, save, and if needed unblock the script to Import-XLS. Then we will dot source that file to load the function into the current PowerShell session. Once we have the function we will run it and assign the results to a variable. Then we can make an empty hashtable, and for each record in the imported array create an entry in the hashtable where the 'Identifier' property (in your example above that would be the one that has the value "EO11-123" in it), make that the Key, then make the entire record the value. So, so far we have this:
So, how do we do this? First, download, save, and if needed unblock the Import-XLS script. Then we will dot source that file to load the function into the current PowerShell session. Once we have the function we will run it and assign the results to a variable. Then we can make an empty hashtable, and for each record in the imported array create a hashtable entry whose key is the 'Identifier' property (in your example above, the one that has the value "EO11-123" in it) and whose value is the entire record. So far we have this:
#Load function into current session
. C:\Path\To\Import-XLS.ps1
$RefArray = Import-XLS C:\Path\To\file.xls
$RefHash = @{}
$RefArray | ForEach{ $RefHash.Add($_.Identifier, $_) }
Now you should be able to reference the identifier to access any of the properties for the associated record such as:
PS C:\> $RefHash['EO11-123'].FileCode
JD123
Now, we just need to extract that name from the folder, and rename all the files in it. Pretty straight forward from here.
Get-ChildItem c:\Path\to\Folders -Directory | Where{$_.Name -match "(?<= )(\S+)$"}|
ForEach{
    $Files = Get-ChildItem $_.FullName
    $NewName = $RefHash[$Matches[1]].FileCode
    For($i = 0; $i -lt $Files.Count; $i++){
        $Files[$i] | Rename-Item -NewName "${NewName}_$i"
    }
}
Edit: Ok, let's break down the rename process here. There is a lot of piping, so I'll try to take it step by step. First off we have Get-ChildItem, which gets a list of folders for the path you specify. That part's straightforward enough. Then it pipes to a Where statement that filters the results, checking each one's name to see if it matches the regular expression "(?<= )(\S+)$". If you are unfamiliar with how regular expressions work you can see a fairly good breakdown of it at https://regex101.com/r/zW8sW1/1. What that does is match any folder that has more than one "word" in its name, and capture the last "word". That capture is saved in the automatic variable $Matches, and since it captured text, it gets assigned to $Matches[1]. Now the code breaks down here because your CSV isn't laid out like I had assumed, and you want the files named differently. We'll have to make some adjustments on the fly.
So, those folders that pass the filter get piped into a ForEach loop (which I had a typo in previously, with a ( instead of a {; that's fixed now). For each of those folders it starts off by getting a list of files within that folder and assigning them to the variable $Files. It also sets up the $NewName variable, but since you don't have a column in your CSV named 'FileCode' that line won't work for you. It uses the $Matches automatic variable that I mentioned earlier to reference the hashtable that we set up with all of the Identifier codes, and then looks at a property of that specific record to build the new name to assign to the files. Since what you want and what I assumed are different, and your CSV has different properties, we'll re-work both the previous Where statement and this line a little bit. Here's how that bit of the script will now read:
Get-ChildItem c:\Path\to\Folders -directory | Where{$_.Name -match "^(.+?), .*? (\S+)$"}|
ForEach{
$Files = Get-ChildItem $_.FullName
$NewName = $Matches[2] + "_" + $Matches[1]
That now matches the folder name in the Where statement and captures 2 things. The first thing it grabs is everything at the beginning of the name before the comma. Then it skips everything until it gets to the last piece of text at the end of the name and captures everything after the last space. New breakdown on RegEx101: https://regex101.com/r/zW8sW1/2
So you want ID_LName, which can be taken straight from the folder name; at this point I don't think there's really any need to use your CSV file at all. We build the new name of the files from the automatic $Matches variable, using the second capture group and the first capture group with an underscore between them. Then we just iterate through the files with a For loop based on how many files were found. So we start with the first file in the array $Files (record 0), add its index to $NewName with an underscore, and use that to rename the file.
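Putting the adjusted pieces together, a minimal sketch of the full loop might look like this (the folder path is a placeholder, and unlike the snippets above I've kept each file's original extension, which you can drop if you really want bare names):
Get-ChildItem C:\Path\to\Folders -Directory | Where{ $_.Name -match "^(.+?), .*? (\S+)$" } |
ForEach{
    $Files   = Get-ChildItem $_.FullName
    $NewName = $Matches[2] + "_" + $Matches[1]    # e.g. "EO11-123_Doe"
    For($i = 0; $i -lt $Files.Count; $i++){
        # keep the original extension so the renamed files stay usable
        $Files[$i] | Rename-Item -NewName ("{0}_{1}{2}" -f $NewName, $i, $Files[$i].Extension)
    }
}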

Related

Choose which CSV to import when running a PowerShell script

I get a CSV every week that our finance team puts in a shared drive. I have a script for that CSV that I run once I get it.
The first command of the script is of course Import-Csv.
The problem is, the finance team insists on naming the file differently each time plus they don't always put it in the same location within the drive.
As a result, I have to first hunt for the file, put it into the directory that the script points to and then rename the file.
I've tried talking to the team about putting it in the same location and making sure the filename is the same but they only follow the instructions for a couple of weeks before just doing whatever.
Ideally, I'd like for it so that when I run the script, there would be a popup that would ask me to pick a CSV (Similar to how it looks when you do "Save As" on an Office Document).
Anyway for this to be done within PowerShell?
You can access .Net classes and interface with the forms library to instantiate and take input from the standard FileOpen dialog. Something like below:
Using Namespace System.Windows.Forms
Add-Type -AssemblyName System.Windows.Forms   # load the forms library so the dialog type is available
$FileBrowser = [OpenFileDialog]::new()
$FileBrowser.InitialDirectory = 'c:\temp'
$FileBrowser.Filter = 'Comma Separated Values (*.csv)|*.csv'
[Void]$FileBrowser.ShowDialog()
$CsvFile = $FileBrowser.FileName
Then use $CsvFile in the Import-Csv command.
You can change the .InitialDirectory property to make navigating a little more convenient.
Use the .Filter property to limit the file open display to CSV files, to make things that much more convenient.
Also, use the [Void] cast to prevent the status return (usually 'OK' or 'Cancel') from echoing to the screen.
Note: A simple Google search will turn up many examples. I refined some of the work from here. That will also document some of the other properties if you want to explore etc.
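If you would rather guard against the dialog being cancelled instead of voiding the result, something like the following works; the cancel handling is my own addition, not part of the answer above:
Add-Type -AssemblyName System.Windows.Forms
$FileBrowser = [System.Windows.Forms.OpenFileDialog]::new()
$FileBrowser.InitialDirectory = 'c:\temp'
$FileBrowser.Filter = 'Comma Separated Values (*.csv)|*.csv'
if ($FileBrowser.ShowDialog() -eq [System.Windows.Forms.DialogResult]::OK) {
    $Data = Import-Csv $FileBrowser.FileName   # only import when a file was actually picked
}
else {
    Write-Warning 'No CSV selected; nothing to import.'
}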
If you are willing to settle for a selection box that doesn't look as nice as the Save As dialog, you can use Out-Gridview. Something along these lines might help.
$filenames =
@(Get-ChildItem -Path C:\temp -Recurse -Filter *.csv |
Sort-Object LastWriteTime -Descending |
Out-GridView -Title 'Choose a file' -PassThru)
$csvfile = $filenames[0].FullName
Import-Csv $csvfile | More
The -Path specifies a directory that contains all the locations where your csv file might be delivered. The sort is just to put the recently written files at the top of the grid, which supposedly makes selection easier. The @() wrapper merely makes sure the result stored in $filenames is an array.
You would do something else with the results of Import-Csv.
Steven's response certainly satisfies your original question, but an alternative would be to let PowerShell do the work. If you know the drive, and you know the name of the file this week, you can pass the name to your script and let it search the drive, filtering on the specific csv file you need. Make it recursive, and open the only file that matches. Sorry, I didn't have time yesterday to include code. Here's a function that returns the full file path when provided with a top-level search path and a filename with possible wildcards.
function gfp { $result=gci $args[0] -recurse -include $args[1]; return ($result.DirectoryName + "\" + $result.Name) }
Example: gfp "d:\rootfolder" "thisweeksfilename.csv"
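For what it's worth, the one-liner assumes exactly one file matches; if a wildcard pattern could hit several files, a small variant (my own tweak, not part of the answer above) that returns the newest match's FullName is safer:
# usage as written, assuming a single match
Import-Csv (gfp "d:\rootfolder" "thisweeksfilename.csv")

# variant that tolerates multiple matches by taking the most recently written one
function gfp2 {
    (Get-ChildItem $args[0] -Recurse -Include $args[1] |
        Sort-Object LastWriteTime -Descending |
        Select-Object -First 1).FullName
}
Import-Csv (gfp2 "d:\rootfolder" "*finance*.csv")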

How do I copy a list of files and rename them in a PowerShell Loop

We are copying a long list of files from their different directories into a single location (same server). Once there, I need to rename them.
I was able to move the files until I found out that there are duplicates in the list of file names to move (and rename). It would not allow me to copy the file multiple times into the same destination.
Here is the list of file names after the move:
"10.csv",
"11.csv",
"12.csv",
"13.csv",
"14.csv",
"15.csv",
"16.csv",
"17.csv",
"18.csv",
"19.csv",
"20.csv",
"Invoices_Export(16) - Copy.csv" (this one's name should be "Zebra.csv")
I wrote a couple of foreach loops, but they are not working exactly right.
The script moves the files just fine. It is the rename that is not working the way I want: the first file does not rename, and the other files rename but leave the moved file in place too.
This script requires a csv that has 3 columns:
Path of the file, including the file name (eg. c:\temp\smefile.txt)
Destination of the file, including the file name (eg. c:\temp\smefile.txt)
New name of the file. Just the name and extension.
# Variables
$Path = (import-csv C:\temp\Test-CSV.csv).Path
$Dest = (import-csv C:\temp\Test-CSV.csv).Destination
$NN = (import-csv C:\temp\Test-CSV.csv).NewName
#Script
foreach ($D in $Dest) {
$i -eq 0
Foreach ($P in $Path) {
Copy-Item $P -destination C:\Temp\TestDestination -force
}
rename-item -path "$D" -newname $NN[$i] -force
$i += 1
}
There were no error per se, just not the outcome that I expected.
Welcome to Stack Overflow!
There are a couple ways to approach the duplicate names situation:
Check if the file already exists in the destination with Test-Path. If it does, start a while loop that appends a number to the end of the name and checks whether that exists, incrementing the appended number after each Test-Path check. Keep looping until Test-Path comes back $false and then break out of the loop (a sketch of this approach follows the refactored script below).
Write an error message and skip that row in the CSV.
I'm going to show a refactored version of your script with approach #2 above:
$csv = Import-Csv 'C:\temp\Test-CSV.csv'
foreach ($row in $csv)
{
$fullDestinationPath = Join-Path -Path $row.Destination -ChildPath $row.NewName
if (Test-Path $fullDestinationPath)
{
Write-Error ("The path '$fullDestinationPath' already exists. " +
"Skipping row for $($row.Path).")
continue
}
# You may also want to check if $row.Path exists before attempting to copy it
Copy-Item -Path $row.Path -Destination $fullDestinationPath
}
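And here is the minimal sketch of approach #1 promised above; it would slot into the same foreach ($row in $csv) loop in place of the error-and-skip block, and the _1, _2 suffix format is just my own choice:
$fullDestinationPath = Join-Path -Path $row.Destination -ChildPath $row.NewName
$counter = 1
while (Test-Path $fullDestinationPath)
{
    # append an incrementing number before the extension until the name is unique,
    # e.g. Zebra.csv -> Zebra_1.csv -> Zebra_2.csv
    $baseName  = [IO.Path]::GetFileNameWithoutExtension($row.NewName)
    $extension = [IO.Path]::GetExtension($row.NewName)
    $fullDestinationPath = Join-Path -Path $row.Destination -ChildPath ("{0}_{1}{2}" -f $baseName, $counter, $extension)
    $counter++
}
Copy-Item -Path $row.Path -Destination $fullDestinationPath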
Now that your question is answered, here are some thoughts for improving your code:
Avoid using acronyms and abbreviations in identifiers (variable names, function names, etc.) when possible. Remember that code is written for humans and someone else has to be able to understand your code; make everything as obvious as possible. Someone else will have to read your code eventually, even if it's Future-You™!
Don't Repeat Yourself (called the "DRY" principle). As Lee_daily mentioned in the comments, you don't need to import the CSV file three times. Import it once into a variable and then use the variable to access the properties.
Try to be consistent. PowerShell is case-insensitive, but you should pick a style and stick to it (e.g. ForEach or foreach, Rename-Item or rename-item, etc.). I would recommend PascalCase, as PowerShell cmdlets are all in PascalCase.
Wrap literal paths in single quotes (or double quotes if you need string interpolation). Paths can have spaces in them and without quotes, PowerShell interprets a space as you are passing another argument.
$i -eq 0 is not an assignment statement, it is a boolean expression. When you run $i -eq 0, PowerShell will return $true or $false because you are asking it if the value stored in $i is 0. To assign the value 0 to $i, you need to write it like this: $i = 0.
There's nothing wrong with $i += 1, but it could be shortened to $i++, if you want to.
When you can, try to check for common issues that may come up with your code. Always think about what can go wrong. "If I copy a file, what can go wrong? Does the source file or folder exist? Is the name pulled from the CSV a valid path name, or does it contain characters that are invalid in a path (like :)?" This is called defensive programming and it will save you so many headaches. As with anything in life, be careful not to go overboard. Only check for likely scenarios; rare edge-cases should just raise errors.
Write some decent logs so you can see what happened at runtime. PowerShell provides a pair of great cmdlets called Start-Transcript and Stop-Transcript. These cmdlets log all the output that was sent to the PowerShell console window, in addition to some system information like the version of PowerShell installed on the machine. Very handy!
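For example, wrapping the whole script in a transcript only takes one line at each end (the log path here is just an example):
Start-Transcript -Path 'C:\Logs\copy-and-rename.log' -Append   # captures console output plus session info
# ... Import-Csv / Copy-Item work goes here ...
Stop-Transcript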

how to check if file with same name but with different extension exists in a directory Powershell

I am trying to find two things here. I have thousands of files in a folder; let's take one file as an example and we can apply the same logic to all files:
Whether a file with the same name but a different extension exists.
If it exists, I need to compare the lastwritetime or the timestamp to find out which file is newer.
For example, if I have a file culture.txt, I am supposed to have a corresponding file culture.log.
If I have culture.txt but culture.log file is missing, then its an issue, so I want to output names of all .txt file for which corresponding .log files are missing.
If both culture.txt and culture.log are available, then I want to check whether culture.txt was generated after culture.log. If culture.txt was generated before culture.log, there is an issue, so I need to output the names of such .txt files with a message saying "Culture.txt was generated before culture.log - Please rerun the program".
Anyone who can help would be appreciated. Thank You.
A little more help needed on the same question, if I may. The code suggested by Esperento is working fine, but the requirement has been updated. In a folder, I have multiple files with multiple extensions, not limited to just .txt and .log. I can have .doc, .docx, .xls, and many other files in the same folder.
Now about the updated requirement. I have to look for file names with 3 specific extensions only. One of them is the program file, which should obviously be generated first. Let's say Culture.prog; then when I run Culture.prog, two files will be generated: culture.log first and culture.txt after it.
So obviously, the timestamp on the .prog is older than the .log, and the timestamp on the .log is older than the .txt, which is generated last.
We have to check the availability of the 2 corresponding files (.log and .prog) in reference to the .txt file only, which is generated last.
So, the first check is whether the 2 corresponding files are available for the .txt file. The next check is that the timestamps of these 3 files are in the expected order. We only have to output something if one of the conditions is not satisfied; otherwise it's fine if we don't output anything. For example, if for culture.txt the .log or .prog file is missing, we have to output which file is missing (or both). If the timestamp of the .txt file is older than the .log and/or .prog, we have to output that fact. I hope I am clear in my request. Thank you.
try this:
#list files and group them by name without extension
Get-ChildItem "C:\temp\test" -File -Filter "*.*" | group BaseName |
%{
    $group = $_.Group
    # if there is only one file with this base name, its companion is missing
    if ($_.Count -eq 1)
    {
        "'{0}' is missing its companion file" -f $group.Name
    }
    # else compare creation times within the group and report files generated out of order
    else
    {
        $group | % { $file = $_; $group | % { if ($_.CreationTime -gt $file.CreationTime) { "'{0}' has been generated before '{1}'" -f $file.Name, $_.Name } } }
    }
} | Out-File "C:\temp\test\result.txt"
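The code above only covers the original two-extension case. For the updated requirement, a minimal sketch along the same lines might look like this; it assumes the three extensions are exactly .prog, .log, and .txt, checks them relative to each .txt file, and uses LastWriteTime (swap in CreationTime if that is what your program sets):
$folder = 'C:\temp\test'
Get-ChildItem $folder -File -Filter '*.txt' | ForEach-Object {
    $txt  = $_
    $prog = Join-Path $folder ($txt.BaseName + '.prog')
    $log  = Join-Path $folder ($txt.BaseName + '.log')

    # check that both companion files exist for this .txt
    if (-not (Test-Path $prog)) { "'{0}': corresponding .prog file is missing" -f $txt.Name }
    if (-not (Test-Path $log))  { "'{0}': corresponding .log file is missing"  -f $txt.Name }

    # if both exist, check the expected timestamp order: .prog -> .log -> .txt
    if ((Test-Path $prog) -and (Test-Path $log)) {
        $progTime = (Get-Item $prog).LastWriteTime
        $logTime  = (Get-Item $log).LastWriteTime
        if ($txt.LastWriteTime -lt $logTime -or $logTime -lt $progTime) {
            "'{0}': files were not generated in the expected order - please rerun the program" -f $txt.Name
        }
    }
} | Out-File (Join-Path $folder 'result.txt')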

Getting most recent objects in a folder excluding certain strings

I have folders that contain files, for the sake of the question, named as follows:
a-001.txt
a-002.txt
b-001.txt
b-002.txt
d-001.txt
d-002.txt
Now I am using PowerShell to initially order these files so that the top of the list is the most recent file in the folder:
d-002.txt
b-002.txt
a-001.txt
a-002.txt
b-001.txt
d-001.txt
EDIT: I then store the top X most recent files in a variable. However, I want to ignore anything that starts with A if I already have one that begins with A in my array, but still ensure I end up with X files which are the most recent. I.e. from above, if X were 4 I would want to end up with:
d-002.txt
b-002.txt
a-001.txt
b-001.txt
This is a simple example; the folders I am dealing with contain thousands of files with more complex naming conventions, but the logic is the same. How can I handle this in PowerShell?
Leaving out the logic for any other Sort-Object and Select-Object criteria, since you already have that addressed, I present the following.
Get-ChildItem $somePath | Select-Object *,@{Label="Prefix";Expression={(($_.Name) -Split "-",2)[0]}} | Group-Object Prefix | ForEach-Object{
$_.Group | Select-Object -First 1 -Property Fullname
}
What happens here is that we add a property to the output of Get-ChildItem called "Prefix". Now, your criteria might be more complicated but given the sample I assumed the files were being grouped by the contents of the name before the first "-". So we take every file name and build its prefix based on that. The magic comes from Group-Object which will group all items and then we just select the first one. In your case that would be the newest X amount. Let me know if you are having trouble integrating this.
Aside from the grouping logic, any sorting and whatnot would need to exist before the Select-Object in our example above.
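To tie it together with the recency requirement, a minimal sketch (with $somePath and $x as placeholders) could sort by LastWriteTime first, keep one file per prefix, and then take the top $x; the exact de-duplication rule may need adjusting to match your real naming convention:
$x = 4
Get-ChildItem $somePath -File |
    Sort-Object LastWriteTime -Descending |
    Select-Object *, @{Label='Prefix';Expression={($_.Name -split '-',2)[0]}} |
    Group-Object Prefix |
    ForEach-Object { $_.Group | Select-Object -First 1 } |   # newest file for each prefix
    Sort-Object LastWriteTime -Descending |
    Select-Object -First $x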
FYI for other readers
There were issues with the OP's actual data, since the above code didn't work exactly. We worked it out in chat, and using the same logic we were able to address the OP's concern. The test data in the question and my answer work as intended.

Powershell - Splitting string into separate components

I am writing a script which will basically do the following:
Read from a text file some arguments:
DriveLetter ThreeLetterCode ServerName VolumeLetter Integer
Eg. W MSS SERVER01 C 1
These values happen to form a folder destination W:\MSS\, and a filename which works in the following naming convention:
SERVERNAME_VOLUMELETTER_VOL-b00X-iYYY.spi - where the X is the integer above
The value Y I need to work out later, as this happens to be the value of the incremental image (backups) and I need to work out the latest incremental.
So at the moment --> Count lines in file, and loop for this many lines.
$lines = Get-Content -Path PostBackupCheck-Textfile.txt | Measure-Object -Line
for ($i=0; $i -le $lines.Lines; $i++)
Within this loop I need to do a Get-Content to read the line I am currently looking at (i.e. line 0, line 1, line 2), since there will be multiple lines in the format I wrote at the beginning, and split the line into an array so that each part of the naming convention above ends up in a[0], a[1], a[2], etc.
The reason for this is that I then need to sort the folder that contains these files, find the latest file by date, take the _iXXX.spi part and place it into the array value a[X] so I then have a complete filename to mount. This value will replace iYYY.spi.
It's a little complex because I also have to make sure that when I do a Get-ChildItem with -Include, before I sort it all by date, I am only including the filename that matches the arguments fed to it from the text file:
So,
SERVER01_C_VOL-b001-iYYY.spi and not anything else.
i.e. not SERVER01_D_VOL-b001-iYYY.spi
Then take the iYYY value from the sort on the Get-ChildItem -Include and place that into the appropriate array item.
I've literally no idea where to start, so any ideas are appreciated!
Hopefully I've explained in enough detail. I have also placed the code on Pastebin: http://pastebin.com/vtFifTW6
This doesn't need to be that complex. You can start by operating over lines in your file with a simple pipeline:
Get-Content PostBackupCheck-Textfile.txt |
Foreach-Object {
$drive, $folder, $server, $volume, [int]$i = -split $_
...
}
The line inside the loop splits the current input line at spaces and assigns appropriate variables. This saves you the trouble of handling an array there. Everything that follows needs to be in said loop as well.
You can then construct the file name pattern:
$filename = "${server}_${volume}_VOL-b$($i.ToString('000'))-i*.spi"
which you can use to find all fitting files and sort them by date:
$lastFile = Get-ChildItem $filename | sort LastWriteTime | select -last 1
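If you then need the actual increment number (the iYYY part) rather than just the file object, a small regex against the selected file's name will pull it out; the pattern below is an assumption based on the naming convention in the question:
# still inside the Foreach-Object loop, after $lastFile has been found
if ($lastFile -and $lastFile.Name -match '-i(\d+)\.spi$') {
    $latestIncrement = $Matches[1]                                   # e.g. '017' from ...-i017.spi
    $mountName = "${server}_${volume}_VOL-b$($i.ToString('000'))-i${latestIncrement}.spi"
}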