PowerShell Script - Report specific string from filename - powershell

I am currently trying to build a simple script that would report back a string from a filename
I have a directory full of files named as follows:
C:\[20141002134259308][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0447744][10.208.15.40_54343][ABC_01].txt
C:\[20141002134239815][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0011042][10.168.40.34_57350][ABC_01].txt
C:\[20141002134206386][302de103-6dc8-4e29-b303-5fdbd39c60c3][u1603381][10.132.171.132_54385][ABC_01].txt
C:\[2014100212260259][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0010217][10.173.0.132_49921][ABC_01].txt
So, I'd like to extract from each filename the user IDs, which start with the letter U followed by seven digits, then create a new txt or csv file that lists all of these IDs. That's it.

As Patrice pointed out, you really should try and do it yourself and come to us with the relevant piece of code you tried and the error that you are getting. That said, I'm bored, and this is really easy. I'd use a regex match against the name of the file, and then for each one that matched I'd output the captured string:
Get-ChildItem 'C:\Path\To\Files\*.txt' | Where{$_.Name -match "\[(U\d{7})\]"} | ForEach{$Matches[1]}
That will return:
U0447744
U0011042
u1603381
U0010217
If you want to output it to a file, just pipe that to Out-File, and specify the full path and name of the file you want to save that in.
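To make the Out-File step concrete, here is a minimal sketch of the same regex wrapped in a small pipeline function. The function name Get-UserIdFromName is made up for illustration, and it takes plain name strings rather than file objects so it can be tried without touching the disk.

```powershell
# Hypothetical helper: emits the user ID captured from each file name.
function Get-UserIdFromName {
    param([Parameter(ValueFromPipeline)][string]$Name)
    process {
        # -match is case-insensitive, so 'u1603381' is caught as well as 'U0447744'
        if ($Name -match '\[(U\d{7})\]') {
            $Matches[1]   # first capture group: the ID without the brackets
        }
    }
}

# Two of the sample names from the question, without the drive prefix
$sample = @(
    '[20141002134259308][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0447744][10.208.15.40_54343][ABC_01].txt'
    '[20141002134206386][302de103-6dc8-4e29-b303-5fdbd39c60c3][u1603381][10.132.171.132_54385][ABC_01].txt'
)
$sample | Get-UserIdFromName
```

Against a real directory, something like (Get-ChildItem 'C:\Path\To\Files\*.txt').Name | Get-UserIdFromName | Out-File 'C:\Path\To\UserIds.txt' should give one ID per line; the output path here is only a placeholder.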

Related

PowerShell `Select-String` selecting some but not others

I have a folder with a list of Excel files in .xls and .xlsx formats. I have verified the strings exist in some of the files. When I run the code below, I can find some string patterns but not others. For example, I can find 'randomword' but not '5625555555' or 'P-888452'. When I run -NotMatch on '5625555555' or 'P-888452' I do get a list of file names that do not match (although they are returned duplicated across many rows), so I know the pattern is registering. What could be happening here? Why does it play nice with some strings (mostly letters, it seems) but not others (those that contain integers)?
gci "path" -Filter "*.xls" -Recurse -File | Select-String '\bANYTEXTORINT\b' | Select FileName
I also do not get an error when I run code. Just a return completed with white text and no results. I do get results for the 'randomword' though. Three files get returned that contain that pattern.

PowerShell -match Unexpected Results

I've written a simple PowerShell script that is designed to take a file name and then move the file into a particular folder.
The files in question are forms scanned in as PDF documents.
Within the file name, I have defined a unique string of characters used to help identify which form it is and then my script will sort the file into the correct folder.
I've captured the file name as a string variable.
I am using -match to evaluate the file name string variable and my issue is that match is acting like...well -like.
For example, my script looks for (F1) in the string and if this returns true my script will move the file into a folder named IT Account Requests.
This all works well until my script finds a file with (F10) in the name, as 'match' will evaluate the string and find a match for F1 also.
How can I use 'match' to return true for an exact string block?
I know this may sound like a fairly basic newbie question to ask, but how do I use -match to tell the difference between the two file types?
I've scoured the web looking to learn how to force -match to do what I would like but maybe I need a re-think here and use something other than 'match' to gain the result I need?
I appreciate your time reading this.
Code Example:
$Master_File_name = "Hardware Request (F10).pdf"
if ($Master_File_name -match "(F1)"){Write-Output "yes"}
if ($Master_File_name -match "(F10)"){Write-Output "yes"}
Both if statements return 'yes'
-match does a regular expression based match against your string, meaning that the right-hand side argument is a regex pattern, not a literal string.
In regex, (F1) means "match on F and 1, and capture the substring as a separate group".
To match on the literal string (F1), escape the pattern either manually:
if($Master_File_Name -match '\(F1\)'){Write-Output 'yes'}
or have it done for you automatically using the Regex.Escape() method:
if($Master_File_Name -match [regex]::Escape('(F1)')){Write-Output 'yes'}
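As a quick check of the difference, the sketch below runs the file name from the question against the unescaped and the escaped patterns. The variable names other than $Master_File_name are made up for illustration.

```powershell
$Master_File_name = 'Hardware Request (F10).pdf'

# Unescaped: the parentheses form a capture group, so this only looks for "F1"
$unescaped = $Master_File_name -match '(F1)'

# Escaped: the parentheses are literal, so the name must contain "(F1)" exactly
$escapedF1  = $Master_File_name -match [regex]::Escape('(F1)')
$escapedF10 = $Master_File_name -match [regex]::Escape('(F10)')

"unescaped (F1): $unescaped"    # True, because "F10" contains "F1"
"escaped (F1):   $escapedF1"    # False
"escaped (F10):  $escapedF10"   # True
```

With the escaped patterns, only the (F10) test succeeds for this file, which is what the sorting script needs.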

how to check if file with same name but with different extension exists in a directory Powershell

I am trying to find two things here. I have thousands of files in a folder. Let's take one file as an example; the same logic can then be applied to all files.
Whether a file with the same name but a different extension exists.
If it exists, I need to compare the LastWriteTime (the timestamp) to find out which file is newer.
For example, if I have a file culture.txt I supposed to have a corresponding file culture.log.
If I have culture.txt but culture.log file is missing, then its an issue, so I want to output names of all .txt file for which corresponding .log files are missing.
If both culture.txt and culture.log are available, then I want to check if the culture.txt was generated after culture.log. If culture.txt is generated before culture.log, there is an issue so, I need to output the names of such .txt files with this issue saying "Culture.txt was generated before culture.log- Please rerun the program".
Anyone who can help would be appreciated. Thank You.
A little more help needed on same question if I can get. The code suggested by Esperento is completely working fine but the requirement is updated. In a folder, I have multiple files with multiple extensions and not limited to just .txt and .log. I can have .doc, .docx, .xls and many other files in the same folder.
Now about the updated requirement. I have to look for file names with 3 specific extensions only. One of them is the program file, which obviously has to be generated first. Let's say Culture.prog. When I run Culture.prog, two files are generated: culture.log first and then culture.txt.
So the timestamp on the prog file is older than the one on the log file, and the timestamp on the log file is older than the one on the txt file, which is generated last.
We have to check that the 2 corresponding files (log and prog) are available, using only the .txt file (the one generated last) as the reference.
So the first check is whether the 2 corresponding files exist for the .txt file. The next check is whether the timestamps of these 3 files are in the expected order. We only have to output something if one of the conditions is not satisfied; otherwise it is fine to output nothing. For example, if the .log or .prog file is missing for culture.txt, we have to output which file is missing (or that both are). If the timestamp of the txt file is older than that of the log and/or prog file, we have to output that fact. I hope my request is clear. Thank you.
try this:
# List the files and group them by name without extension
Get-ChildItem "C:\temp\test" -File -Filter "*.*" | Group-Object BaseName |
    ForEach-Object {
        $group = $_.Group
        # Only one file with this base name: its counterpart is missing
        if ($_.Count -eq 1)
        {
            "The counterpart of '{0}' is missing" -f $group.Name
        }
        # Otherwise compare every pair in the group and report each file created before another
        else
        {
            $group | ForEach-Object { $file = $_; $group | ForEach-Object { if ($_.CreationTime -gt $file.CreationTime) { "'{0}' has been generated before '{1}'" -f $file.Name, $_.Name } } }
        }
    } | Out-File "C:\temp\test\result.txt"
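For the updated requirement (prog, then log, then txt), a rough sketch could look like the following. This is not Esperento's code: the function name Get-ChainProblem and the message wording are made up, and it assumes LastWriteTime is the timestamp that matters, as the question suggests.

```powershell
# Hypothetical helper: reports problems for every .txt file in $Path.
function Get-ChainProblem {
    param([Parameter(Mandatory)][string]$Path)

    Get-ChildItem -LiteralPath $Path -File |
        Where-Object { $_.Extension -eq '.txt' } |
        ForEach-Object {
            $txt = $_
            # Look up the companion files; the variables stay empty when a file is absent
            $prog = Get-Item -LiteralPath (Join-Path $Path "$($txt.BaseName).prog") -ErrorAction SilentlyContinue
            $log  = Get-Item -LiteralPath (Join-Path $Path "$($txt.BaseName).log")  -ErrorAction SilentlyContinue

            if (-not $prog) { "'$($txt.BaseName).prog' is missing for '$($txt.Name)'" }
            if (-not $log)  { "'$($txt.BaseName).log' is missing for '$($txt.Name)'" }

            # Expected order of last write times: prog, then log, then txt
            if ($prog -and $log -and $log.LastWriteTime -lt $prog.LastWriteTime) {
                "'$($log.Name)' is older than '$($prog.Name)'"
            }
            if ($log -and $txt.LastWriteTime -lt $log.LastWriteTime) {
                "'$($txt.Name)' is older than '$($log.Name)'"
            }
            if ($prog -and $txt.LastWriteTime -lt $prog.LastWriteTime) {
                "'$($txt.Name)' is older than '$($prog.Name)'"
            }
        }
}
```

Nothing is emitted for a .txt file whose companions exist and are in order, so Get-ChainProblem -Path 'C:\temp\test' | Out-File followed by a result path outside that folder would give a report that lists only the problem cases.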

Search for string in multiple files go to next file after 1st occurrence in powershell

I have 10k of html files and about 3k of them include youtube video embed code, so i search for the string to get list of those files like this:
$LocalListOfYTPosts = Get-ChildItem "$outputfolder*.html" -recurse |
Select-String -pattern "https://www.youtube.com/embed" |
group path |
select name -ExpandProperty Name
Youtube embed code is included only once per file, so i thought that i could speed things up if search will go to next file after 1st occurrence of the string (I'm not sure if that's actually correct)
Is there a way to tell to powershell to go to next file if the string has been found in in the current one?
(Comment -> Answer)
Try the -List option on Select-String:
Returns only the first match in each input file. By default, Select-String returns a MatchInfo object for each match it finds.
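Applied to the question's search, that could look roughly like this. The wrapper function Get-FileContainingText and its parameter names are made up for illustration; -List and -SimpleMatch are regular Select-String switches, and -SimpleMatch is added here only because the URL is a literal string rather than a regex.

```powershell
# Hypothetical wrapper: returns the path of every .html file under $Folder that contains $Text.
function Get-FileContainingText {
    param(
        [Parameter(Mandatory)][string]$Folder,
        [Parameter(Mandatory)][string]$Text
    )
    Get-ChildItem -LiteralPath $Folder -Filter '*.html' -Recurse -File |
        # -List stops reading a file after its first match, so each file yields at most one result
        Select-String -Pattern $Text -SimpleMatch -List |
        Select-Object -ExpandProperty Path
}
```

Because each file yields at most one match, the group step from the original pipeline is no longer needed, e.g. $LocalListOfYTPosts = Get-FileContainingText -Folder $outputfolder -Text 'https://www.youtube.com/embed'.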

Rename Files with Index(Excel)

Anyone have any ideas on how to rename files by finding an association with an index file?
I have a file/folder structure like the following:
Folder name = "Doe, John EO11-123"
Several files under this folder
The index file(MS Excel) has several columns. It contains the names in 2 columns(First and Last). It also has a column containing the number EO11-123.
What I would like to do is write maybe a script to look at the folder names in a directory, compare/find an associated value in the index file(like that number EO11-123) and then rename all the files under the folder using a 4th column value in the index.
So,
Folder name = "Doe, John EO11-123", index column1 contains same value "EO11-123", use column2 value "111111_000000" and rename all the files under that directory folder to "111111_000000_0", "111111_000000_1", "111111_000000_2" and so on.
This possible with powershell or vbscript?
Ok, I'll answer your questions in your comment first. Importing the data into PowerShell allows you to make an array in powershell that you can match against, or better yet make a HashTable to reference for your renaming purposes. I'll get into that later, but it's way better than trying to have PowerShell talk to Excel and use Excel's search functions because this way it's all in PowerShell and there's no third party application dependencies. As for importing, that script is a function that you can load into your current session, so you run that function and it will automatically take care of the import for you (it opens Excel, then opens the XLS(x) file, saves it as a temp CSV file, closes Excel, imports that CSV file into PowerShell, and then deletes the temp file).
Now, you did not state what your XLS file looks like, so I'm going to assume it's got a header row, and looks something like this:
FirstName | Last Name | Identifier | FileCode
Joe | Shmoe | XA22-573 | JS573
John | Doe | EO11-123 | JD123
If that's not your format, you'll need to either adapt my code, or your file, or both.
So, how do we do this? First, download, save, and if needed unblock the script to Import-XLS. Then we will dot source that file to load the function into the current PowerShell session. Once we have the function we will run it and assign the results to a variable. Then we can make an empty hashtable, and for each record in the imported array create an entry in the hashtable where the 'Identifier' property (in your example above that would be the one that has the value "EO11-123" in it), make that the Key, then make the entire record the value. So, so far we have this:
#Load function into current session
. C:\Path\To\Import-XLS.ps1
$RefArray = Import-XLS C:\Path\To\file.xls
$RefHash = @{}
$RefArray | ForEach{ $RefHash.Add($_.Identifier, $_) }
Now you should be able to reference the identifier to access any of the properties for the associated record such as:
PS C:\> $RefHash['EO11-123'].FileCode
JD123
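If you want to try the lookup without Excel or the Import-XLS script, here is a small sketch that builds the same kind of hashtable from hand-written records. The records mirror the assumed layout above (with the LastName property written without a space); they are stand-ins, not real data.

```powershell
# Stand-in records; in the real script these would come from Import-XLS
$RefArray = @(
    [pscustomobject]@{ FirstName = 'Joe';  LastName = 'Shmoe'; Identifier = 'XA22-573'; FileCode = 'JS573' }
    [pscustomobject]@{ FirstName = 'John'; LastName = 'Doe';   Identifier = 'EO11-123'; FileCode = 'JD123' }
)

# Key each record by its Identifier so it can be looked up directly
$RefHash = @{}
foreach ($record in $RefArray) {
    $RefHash[$record.Identifier] = $record
}

$RefHash['EO11-123'].FileCode      # JD123
$RefHash.ContainsKey('ZZ99-000')   # False: no record with that identifier
```

A hashtable lookup like this avoids scanning the whole array for every folder, which is the main reason to prefer it over matching against the array directly.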
Now, we just need to extract that identifier from the folder name, and rename all the files in the folder. Pretty straightforward from here.
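Get-ChildItem c:\Path\to\Folders -Directory | Where{$_.Name -match "(?<= )(\S+)$"} |
    ForEach{
        # @() keeps $Files an array even when the folder holds a single file
        $Files = @(Get-ChildItem $_.FullName)
        # Index the hashtable with the captured identifier itself (single quotes would not expand it)
        $NewName = $RefHash[$Matches[1]].FileCode
        # Start at 0 so the first file is renamed too
        For($i = 0; $i -lt $Files.Count; $i++){
            # ${NewName} stops PowerShell from reading "$NewName_" as the variable name
            $Files[$i] | Rename-Item -NewName "${NewName}_$i"
        }
    }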
Edit: Ok, let's break down the rename process here. It is a lot of piping here, so I'll try and take it step by step. First off we have Get-ChildItem that gets a list of folders for the path you specify. That part's straight forward enough. Then it pipes to a Where statement, that filters the results checking each one's name to see if it matches the Regular Expression "(?<= )(\S+)$". If you are unfamiliar with how regular expressions work you can see a fairly good breakdown of it at https://regex101.com/r/zW8sW1/1. What that does is matches any folders that have more than one "word" in the name, and captures the last "word". It saves that in the automatic variable $Matches, and since it captured text, that gets assigned to $Matches[1]. Now the code breaks down here because your CSV isn't laid out like I had assumed, and you want the files named differently. We'll have to make some adjustments on the fly.
So, those folders that pass the filter will get piped into a ForEach loop (which I had a typo in previously and had a ( instead of {, that's fixed now). So for each of those folders it starts off by getting a list of files within that folder and assigning them to the variable $Files. It also sets up the $NewName variable, but since you don't have a column in your CSV named 'FileCode' that line won't work for you. It uses the $Matches automatic variable that I mentioned earlier to reference the hashtable that we set up with all of the Identifier codes, and then looks at a property of that specific record to set up the new name to assign to files. Since what you want and what I assumed are different, and your CSV has different properties, we'll re-work both the previous Where statement and this line a little bit. Here's how that bit of the script will now read:
Get-ChildItem c:\Path\to\Folders -Directory | Where{$_.Name -match "^(.+?), .*? (\S+)$"} |
    ForEach{
        $Files = Get-ChildItem $_.FullName
        $NewName = $Matches[2] + "_" + $Matches[1]
That now matches the folder name in the Where statement and captures 2 things. The first thing it grabs is everything at the beginning of the name before the comma. Then it skips everything until it gets to the last piece of text at the end of the name and captures everything after the last space. New breakdown on RegEx101: https://regex101.com/r/zW8sW1/2
So you want the ID_LName, which can be gotten from the folder name, there's really no need to even use your CSV file at this point I don't think. We build the new name of the files based off the automatic $Matches variable using the second capture group and the first capture group and putting an underscore between them. Then we just iterate through the files with a For loop basing it off how many files were found. So we start with the first file in the array $Files (record 0), add that to the $NewName with an underscore, and use that to rename the file.
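Putting the revised pieces together, the whole rename could be sketched as below. The function name Rename-FilesFromFolderName is made up, and two details are assumptions rather than something stated above: each file keeps its original extension, and the files are sorted by name so the numbering is predictable.

```powershell
# Hypothetical helper: renames the files in each "LName, FName ID" folder under $Root to ID_LName_0, ID_LName_1, ...
function Rename-FilesFromFolderName {
    param([Parameter(Mandatory)][string]$Root)

    Get-ChildItem -LiteralPath $Root -Directory | ForEach-Object {
        # "Doe, John EO11-123" gives $Matches[1] = 'Doe' and $Matches[2] = 'EO11-123'
        if ($_.Name -match '^(.+?), .*? (\S+)$') {
            $newName = $Matches[2] + '_' + $Matches[1]

            # @() keeps this an array even for a single file; sorting fixes the numbering order
            $files = @(Get-ChildItem -LiteralPath $_.FullName -File | Sort-Object Name)

            for ($i = 0; $i -lt $files.Count; $i++) {
                # Assumption: keep the original extension on the renamed file
                $target = '{0}_{1}{2}' -f $newName, $i, $files[$i].Extension
                $files[$i] | Rename-Item -NewName $target
            }
        }
    }
}
```

Running Rename-FilesFromFolderName -Root 'c:\Path\to\Folders' on a copy of the data first is a good idea, since a rename like this is not easy to undo.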