find string from a group of files - powershell

I have a group of txt files that contain similar strings, like this:
Windows 7 Professional Service Pack 1
Product Part No.: *****
Installed from 'Compliance Checked Product' media.
Product ID: 0000-0000-0000 match to CD Key data
CD Key: xxxxx-xxxxx-xxxxx-xxxxx-xxxxx
Computer Name: COMP001
Registered Owner: ABC
Registered Organization:
Microsoft Office Professional Edition 2003
Product ID: 00000-00000-00000-00000
CD Key: xxxxx-xxxxx-xxxxx-xxxxx-xxxxx
How can I extract all the Office keys in one pass and save them to another file?
My code:
$content = Get-ChildItem -Path 'S:\New folder' -Recurse |
where Name -like "b*" |
select name
Get-Content $content
I get a list of file names, but Get-Content fails when I pass it that list.

The code you posted doesn't work, because $content contains a list of custom objects with one property (name) containing just the file name without path. Since you're apparently not listing the files in the current working directory, but some other folder (S:\New Folder), you need the full path to those files (property FullName) if you want to be able to read them. Also, the property isn't expanded automatically. You must either expand it when enumerating the files:
$content = Get-ChildItem -Path 'S:\New folder' -Recurse |
Where-Object { $_.Name -like "b*" } |
Select-Object -Expand FullName
Get-Content $content
or when passing the value to Get-Content:
$content = Get-ChildItem -Path 'S:\New folder' -Recurse |
Where-Object { $_.Name -like "b*" } |
Select-Object FullName
Get-Content $content.FullName
With that out of the way, none of the code you have even attempts to extract the data you're looking for. Assuming that the license information blocks in your files are always separated by 2 or more consecutive line breaks, you could split the content of the files at consecutive line breaks and extract the information with a regular expression like this:
Get-ChildItem -Path 'S:\New folder' -Recurse | Where-Object {
-not $_.PSIsContainer -and
$_.Name -like "b*"
} | ForEach-Object {
# split content of each file into individual license information fragments
(Get-Content $_.FullName | Out-String) -split '(\r?\n){2,}' | Where-Object {
# filter for fragments that contain the string "Microsoft Office" and
# match the line beginning with "CD Key: " in those fragments
$_ -like '*Microsoft Office*' -and
$_ -match '(?m)(?<=^CD Key: ).*'
} | ForEach-Object {
# remove leading/trailing whitespace from the extracted key
$matches[0].Trim()
}
} | Set-Content 'C:\output.txt'
(\r?\n){2,} is a regular expression that matches 2 or more consecutive line breaks (both Windows and Unix style).
(?m)(?<=^CD Key: ).* is a regular expression that matches a line beginning with the string CD Key: and returns the rest of the line after that string. (?<=...) is a so-called positive lookbehind assertion that is used for matching a pattern without including it in the returned value. (?m) is a regular expression option that allows ^ to match the beginning of a line inside a multiline string instead of just the beginning of the string.
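The two expressions can be tried on an in-memory string before pointing them at real files. The following is a minimal sketch: the sample text and both keys are made up, and the split pattern uses a non-capturing group, `(?:...)`, which is a small variation on the answer's pattern so that the separators themselves are not returned as extra fragments.

```powershell
# Hypothetical sample: two license blocks separated by one blank line.
$sample = @(
    'Windows 7 Professional Service Pack 1'
    'CD Key: AAAAA-AAAAA-AAAAA-AAAAA-AAAAA'
    'Computer Name: COMP001'
    ''
    'Microsoft Office Professional Edition 2003'
    'Product ID: 00000-00000-00000-00000'
    'CD Key: BBBBB-BBBBB-BBBBB-BBBBB-BBBBB'
) -join "`r`n"

# Split into blocks, keep only the Office block, pull out its key.
$officeKeys = $sample -split '(?:\r?\n){2,}' | Where-Object {
    $_ -like '*Microsoft Office*' -and
    $_ -match '(?m)(?<=^CD Key: ).*'
} | ForEach-Object {
    # Trim() also drops a trailing carriage return that .* can pick up
    $Matches[0].Trim()
}

$officeKeys   # BBBBB-BBBBB-BBBBB-BBBBB-BBBBB
```

Only the key from the block containing "Microsoft Office" is returned. The Windows key is skipped because its block fails the -like test.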

Try something like this:
Get-ChildItem "c:\temp" -file -filter "*.txt" |
%{select-string -Path $_.FullName -Pattern "CD Key:" } | select line | export-csv "c:\temp\found.csv" -notype

If you also want the computer name, use -Context, which takes N lines before and M lines after each match (for example, -Context 3,2 captures 3 lines before and 2 after):
Get-ChildItem "c:\temp" -file -filter "*.txt" |
%{select-string -Path $_.FullName -Pattern "CD Key:" -context 6,0 } | where {$_.Context.PreContext[0] -like 'Computer Name:*'} |
select Line, @{Name="Computer";E={($_.Context.PreContext[0] -split ':')[1] }} | export-csv "c:\temp\found.csv" -notype

Or classically:
Get-ChildItem "c:\temp" -file -filter "*.txt" | foreach{
$CurrentFile=$_.FullName
#split each row of the current file into 2 columns, using ':' as the delimiter
$KeysValues=get-content $CurrentFile | ConvertFrom-String -Delimiter ":" -PropertyNames Key, Value
#if the file contains a CD Key row, it is a file we want
if ($null -ne $KeysValues -and $KeysValues[2].Key -eq 'CD Key')
{
#build an object with the requested values
$Object=[pscustomobject]@{
File=$CurrentFile
ComputerName=$KeysValues[3].Value
OfficeKey=$KeysValues[7].Value
}
#send the object to standard output
$Object
}
} | export-csv "c:\temp\found.csv" -notype

Related

How can I search for multiple string patterns in text files within a directory

I have a textbox that takes an input and searches a drive.
Drive for example is C:/users/me
Let's say I have multiple files and subdirectories in there, and I would like to check whether the following strings exist in each file: "ssn" and "DOB".
Once the user has entered the two strings, I split the input by space so I can loop through the resulting array. Here is my current code, but I'm stuck on how to proceed.
gci "C:\Users\me" -Recurse | where { ($_ | Select-String -pattern ('SSN') -SimpleMatch) -or ($_ | Select-String -pattern ('DOB') -SimpleMatch ) } | ft CreationTime, Name -Wrap -GroupBy Directory | Out-String
The code above works if I paste it manually into PowerShell, but I'm having trouble recreating it in a script.
The code below does not return all the files it should.
foreach ($x in $StringArrayInputs) {
if($x -eq $lastItem){
$whereClause = ($_ | Select-String -Pattern $x)
}else{
$whereClause = ($_ | Select-String -Pattern $x) + '-or'
}
$files= gci $dv -Recurse | Where { $_ | Select-String -Pattern $x -SimpleMatch} | ft CreationTime, Name -Wrap -GroupBy Directory | Out-String
}
Select-String's -Pattern parameter accepts an array of strings (any one of which triggers a match), so piping directly to a single Select-String call should do:
$files= Get-ChildItem -File -Recurse $dv |
Select-String -List -SimpleMatch -Pattern $StringArrayInputs |
Get-Item |
Format-Table CreationTime, Name -Wrap -GroupBy Directory |
Out-String
Note:
Using -File with Get-ChildItem makes it return only files, not also directories.
Using -List with Select-String is an optimization that ensures that at most one match per file is looked for and reported.
Passing Select-String's output to Get-Item automatically binds the .Path property of the former's output to the -Path parameter of the latter.
Strictly speaking, binding to -Path subjects the argument to interpretation as a wildcard expression, which, however, is generally not a concern - except if the path contains [ characters.
If that is a possibility, insert a pipeline segment with Select-Object @{ Name='LiteralPath'; Expression='Path' } before Get-Item, which ensures binding to -LiteralPath instead.
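The notes above can be checked against a throwaway directory. This is only a sketch: the scratch folder, file names and file contents are invented for illustration, and $terms stands in for $StringArrayInputs.

```powershell
# Hypothetical scratch directory and files, created only for this demo.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("ss-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'SSN: 123', 'DOB: 1970'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'nothing to see'
Set-Content -Path (Join-Path $dir 'c.txt') -Value 'dob unknown'

$terms = 'SSN', 'DOB'   # any one of these triggers a match

$hits = Get-ChildItem -File -Recurse $dir |
    Select-String -List -SimpleMatch -Pattern $terms |   # at most one match per file
    Get-Item |                                           # binds the match's .Path
    Sort-Object Name

$hits.Name   # a.txt and c.txt (matching is case-insensitive by default)

Remove-Item -Recurse -Force $dir
```

a.txt contains both terms but is listed once because of -List. b.txt contains neither term and is not returned.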
I followed your examples and combined both search terms into a single regex. I escaped the regex to avoid accidental use of regex metacharacters (such as a dot matching any character).
It works with my test files, but your files may behave differently. You may need to add "-Encoding UTF8" (or whichever encoding is appropriate) so that region-specific characters are read correctly.
$String = Read-Host "Enter multiple strings separated by space to search for"
$escapedRegex = ([Regex]::Escape($String)) -replace "\\ ","|"
Get-ChildItem -Recurse -Attributes !Directory | Where-Object {
$_ | Get-Content | Select-String -Pattern $escapedRegex
} | Format-Table CreationTime, Name -Wrap -GroupBy Directory | Out-String
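To see what the escape-then-replace step produces, here is a small sketch with a made-up input string ($userInput is a hypothetical stand-in for the Read-Host result). [Regex]::Escape turns each space into a backslash followed by a space, and the -replace then swaps that pair for the alternation operator.

```powershell
$userInput = 'a.b DOB'   # hypothetical user input: two search terms

$pattern = [regex]::Escape($userInput) -replace '\\ ', '|'
$pattern                                   # a\.b|DOB

$literalDot = 'xx a.b xx' -match $pattern  # True: literal "a.b" is found
$wildcard   = 'xx aXb xx' -match $pattern  # False: the dot is no longer a wildcard
$secondTerm = 'my dob'    -match $pattern  # True: -match is case-insensitive
```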

Using Powershell, how to return a list of files based on the existence of duplicate files with a different naming convention?

There are multiple .webp files in a project folder. Some .webp files are the original pictures and some serve as thumbnails (their size is different). The naming convention is: original files are just called NAME.webp and thumbnails are NAME-thumb.webp.
I am trying to return all .webp files based on whether the corresponding thumbnail exists. So if picture SAMPLE.webp has a SAMPLE-thumb.webp, don't add this file to the list. But if SAMPLE.webp doesn't have a corresponding SAMPLE-thumb.webp, then do add it to the list.
This is what I've tried so far:
$example = Get-ChildItem -File $dir\*.webp |
Group-Object { $_.BaseName } |
Where-Object { $_.Name -NotContains "-thumb" } |
ForEach-Object Group
You can get this without the grouping with a Where-Object and testing paths.
Get-ChildItem -File $dir\*.webp |
Where-Object {$_.Name -notmatch "-thumb" -and -not(Test-Path ($_.FullName -replace '\.webp$','-thumb.webp'))}
This should get you a list of all the files that do not have a corresponding thumbnail file.
You can do the following:
(Get-ChildItem $dir\*.webp -File |
Group-Object {$_.BaseName -replace '-thumb$'} |
Where Count -eq 1).Group
Grouping needs a common key. Stripping the trailing -thumb from the BaseName property creates one. If a file does not exist as both filename and filename-thumb, the resulting GroupInfo has a Count of 1.
Using the syntax ().Group returns all file objects. If you want to process code against each file, you may use Foreach-Object instead:
Get-ChildItem $dir\*.webp -File |
Group-Object {$_.BaseName -replace '-thumb$'} |
Where Count -eq 1 | Foreach-Object {
$_.Group
}
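The grouping idea can be tried on plain strings first. In this sketch the base names are invented and stand in for the BaseName values of real files.

```powershell
# Hypothetical base names standing in for $_.BaseName of real .webp files.
$baseNames = 'cat', 'cat-thumb', 'dog', 'owl', 'owl-thumb'

$unpaired = ($baseNames |
    Group-Object { $_ -replace '-thumb$' } |   # 'cat' and 'cat-thumb' share the key 'cat'
    Where-Object Count -eq 1).Group

$unpaired   # dog
```

One caveat: a thumbnail whose original is missing also ends up in a group of one, so it would be returned as well.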

Correction in sub folder names by replacing first two characters, if needed

I am using the PowerShell script below, which successfully traverses all my case folders within the main folder named test. What it cannot do is rename each subfolder where required, as shown in the current and desired output. The script should first sort the subfolders by their current numbering and then assign proper serial numbers as the folder-name prefix, replacing the incorrect ones.
I have hundreds of such cases and their sub folders which need to be renamed properly.
The below output shows two folders named "352" and "451" (take them as order IDs for now) and each of these folders have some sub-folders with a 2 digit prefix in their names. But as you can notice they are not properly serialized.
$Search = Get-ChildItem -Path "C:\Users\User\Desktop\test" -Filter "??-*" -Recurse -Directory | Select-Object -ExpandProperty FullName
$Search | Set-Content -Path 'C:\Users\User\Desktop\result.txt'
Below is my current output:
C:\Users\User\Desktop\test\Case-352\02-Proceedings
C:\Users\User\Desktop\test\Case-352\09-Corporate
C:\Users\User\Desktop\test\Case-352\18-Notices
C:\Users\User\Desktop\test\Case-451\01-Contract
C:\Users\User\Desktop\test\Case-451\03-Application
C:\Users\User\Desktop\test\Case-451\09-Case Study
C:\Users\User\Desktop\test\Case-451\14-Violations
C:\Users\User\Desktop\test\Case-451\21-Verdict
My desired output is as follows:
C:\Users\User\Desktop\test\Case-352\01-Proceedings
C:\Users\User\Desktop\test\Case-352\02-Corporate
C:\Users\User\Desktop\test\Case-352\03-Notices
C:\Users\User\Desktop\test\Case-451\01-Contract
C:\Users\User\Desktop\test\Case-451\02-Application
C:\Users\User\Desktop\test\Case-451\03-Case Study
C:\Users\User\Desktop\test\Case-451\04-Violations
C:\Users\User\Desktop\test\Case-451\05-Verdict
Thank you so much. If my desired functionality can be extended to this script, it will be of great help.
Syed
You can do the following based on what you have posted:
$CurrentParent = $null
$Search = Get-ChildItem -Path "C:\Users\User\Desktop\test" -Filter '??-*' -Recurse -Directory | Where Name -match '^\d\d-\D' | Foreach-Object {
if ($_.Parent.Name -eq $CurrentParent) {
$Increment++
} else {
$CurrentParent = $_.Parent.Name
$Increment = 1
}
$CurrentNumber = "{0:d2}" -f $Increment
Join-Path $_.Parent.FullName ($_.Name -replace '^\d\d',$CurrentNumber)
}
$Search | Set-Content -Path 'C:\Users\User\Desktop\result.txt'
I added Where to filter more granularly beyond what -Filter allows.
-match and -replace both use regex to perform the matching. \d is a digit. \D is a non-digit. ^ matches the position at the beginning of the string.
The string format operator -f is used to maintain the 2-digit requirement. If you happen to reach 3-digit numbers, then 3 digit numbers will be output instead.
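The operators described above can be checked in isolation. This is a short sketch that uses folder names from the question as plain strings, plus one made-up name (2021-Archive) to show what the Where filter rejects.

```powershell
$two   = '{0:d2}' -f 3      # 03
$same  = '{0:d2}' -f 12     # 12
$three = '{0:d2}' -f 123    # 123 (d2 is a minimum width, not a maximum)

# Swap the leading two digits for the new serial number.
$renamed = '09-Corporate' -replace '^\d\d', ('{0:d2}' -f 2)   # 02-Corporate

# The Where filter: two digits, a hyphen, then a non-digit.
$accepted = '09-Corporate' -match '^\d\d-\D'   # True
$rejected = '2021-Archive' -match '^\d\d-\D'   # False: the third character is a digit
```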
You can take this further to perform a rename operation:
$CurrentParent = $null
Get-ChildItem . -Filter '??-*' -Recurse -Directory | Where Name -match '^\d\d-\D' | Foreach-Object {
if ($_.Parent.Name -eq $CurrentParent) {
$Increment++
} else {
$CurrentParent = $_.Parent.Name
$Increment = 1
}
$CurrentNumber = "{0:d2}" -f $Increment
$NewName = $_.Name -replace '^\d\d',$CurrentNumber
$_ | Where Name -ne $NewName | Rename-Item -NewName $NewName -WhatIf
}
Comparing the current name against $NewName skips folders that are already numbered correctly, so no rename is attempted for them. Remove the -WhatIf if you are happy with the results.

Copy files listed in a txt document, keeping multiple files of the same name in PowerShell

I have a bunch of lists of documents generated in powershell using this command:
Get-ChildItem -Recurse |
Select-String -Pattern "acrn164524" |
group Path |
select Name > test.txt
In this example it generates a list of files containing the string acrn164524 the output looks like this:
Name
----
C:\data\logo.eps
C:\data\invoice.docx
C:\data\special.docx
InputStream
C:\datanew\special.docx
I have been using
Get-Content "test.txt" | ForEach-Object {
Copy-Item -Path $_ -Destination "c:\destination\" -Recurse -Container -Force
}
However, this is an issue if two or more files have the same name and also throws a bunch of errors for any lines in the file that are not a path.
Sorry if I was not clear enough: I would like to keep files with the same name by appending something to the end of the file name.
You seem to want the files, not the output of Select-String. So let's keep the files.
Get-ChildItem -Recurse -File | Where-Object {
$_ | Select-String acrn164524 -Quiet
} | Select-Object -ExpandProperty FullName | Out-File test.txt
Here:
-File makes Get-ChildItem return only actual files. Think about using a filter like *.txt to reduce the workload further.
-Quiet makes Select-String return $true or $false, which is perfect for Where-Object.
Instead of Select-Object -ExpandProperty X in order to retrieve an array of raw property values (as opposed to an array of PSObjects, which is what Select-Object would normally do), it's simpler to use ForEach-Object X instead.
Get-ChildItem -Recurse -File | Where-Object {
$_ | Select-String acrn164524 -Quiet
} | ForEach-Object FullName | Out-File test.txt
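The difference between the three forms shows up in the type of what comes out. This is a minimal sketch: the objects and paths are made up and merely stand in for what Get-ChildItem would return.

```powershell
# Hypothetical stand-ins for file objects; only FullName matters here.
$items = [pscustomobject]@{ FullName = 'C:\data\a.txt' },
         [pscustomobject]@{ FullName = 'C:\data\b.txt' }

$wrapped  = $items | Select-Object FullName                  # objects with one property
$expanded = $items | Select-Object -ExpandProperty FullName  # raw strings
$viaEach  = $items | ForEach-Object FullName                 # raw strings, shorter to type

$wrapped[0].GetType().Name    # PSCustomObject
$expanded[0].GetType().Name   # String
$viaEach[0].GetType().Name    # String
```

Only the last two forms yield plain paths, which is what Out-File needs to write one path per line without a table header.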

Need PS Get-Childitem to not truncate name results

This is what I'm running:
Get-Childitem $("C:\Powershell Tests\Group 1") -Recurse -Force | where { -not$_.PSIsContainer } | group name -NoElement | sort name > "C:\Powershell Tests\Group 1.txt"
I later compare this text file with the names in another one to see what the differences are between the two.
In the text file I'm getting the name truncated with "..."
What can I add so that it doesn't truncate so that I can compare?
PowerShell outputs objects, not text.
If you want to output the file's names, then select the names and output them:
Get-ChildItem "C:\PowerShell Tests\Group 1" -Recurse -Force |
Where-Object { -not $_.PSIsContainer } |
Select-Object -ExpandProperty Name |
Sort-Object -Unique |
Out-File "C:\Powershell Tests\Group 1.txt"
Notes:
you don't need the subexpression operator, $( ), for the parameter to Get-ChildItem.
I removed your call to Group-Object. (It looked to me like you want a sorted list of unique file names.)