Deleting the entire CSV row if text in a column matches a specific path or file name - powershell

I'm new to PowerShell, so please explain things a little if you can. I'm trying to export the contents of a directory, along with some other information, to a CSV.
The CSV file contains information about the files; however, I just need to match the FileName column (which contains the full path). If it matches, I need to delete the entire row.
$folder1 = 'OldFiles'
$folder2 = 'Log Files\January'
$file1 = '_updatehistory.txt'
$file2 = 'websites.config'
In the CSV file, if any of these is matched, the entire row must be deleted. The CSV file contains FileName in this manner:
**FileName**
C:\Installation\New Applications\Root
I've tried doing this:
Import-csv -Path "C:\CSV\Recursion.csv" | Where-Object { $_.FileName -ne $folder2} | Export-csv -Path "C:\CSV\RecursionUpdated.csv" -NoTypeInformation
But it's not working out. I would really appreciate help here.

It looks like you want to match only parts of the full path, so you should use -like or -match operators (or their negated variants) which can do non-exact matching:
$excludes = '*\OldFiles', '*\Log Files\January', '*\_updatehistory.txt', '*\websites.config'
Import-csv -Path "C:\CSV\Recursion.csv" |
Where-Object {
# $matchesExclude will be $true if at least one exclude pattern matches
# against FileName. Otherwise it will be $null.
$matchesExclude = foreach( $exclude in $excludes ) {
# Output $true if pattern matches, which will be captured in $matchesExclude.
if( $_.FileName -like $exclude ) { $true; break }
}
# This outputs $true if the filename is not excluded, thus Where-Object
# passes the row along the pipeline.
-not $matchesExclude
} | Export-csv -Path "C:\CSV\RecursionUpdated.csv" -NoTypeInformation
This code makes heavy use of PowerShell's implicit output behaviour. For example, the literal $true in the foreach loop body is implicit output that is automatically captured in $matchesExclude. Without the assignment $matchesExclude = foreach ..., the value would have been written to the console instead (unless it was captured somewhere else in the call stack).
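To make the implicit-output behaviour concrete, here is a minimal sketch that works on made-up sample paths instead of the CSV. The helper name Test-Excluded is hypothetical and only serves to isolate the capture pattern.

```powershell
# Hypothetical helper: returns $true if $Path matches any wildcard pattern.
function Test-Excluded {
    param(
        [string]   $Path,
        [string[]] $Patterns
    )
    # The bare $true below is implicit output; the assignment captures it.
    # If no pattern matches, the loop outputs nothing and $hit stays $null.
    $hit = foreach ($pattern in $Patterns) {
        if ($Path -like $pattern) { $true; break }
    }
    # Convert $true / $null into a proper boolean.
    [bool] $hit
}

$excludes = '*\OldFiles', '*\websites.config'

Test-Excluded -Path 'C:\Installation\OldFiles'        -Patterns $excludes   # matches the first pattern
Test-Excluded -Path 'C:\Installation\Root\readme.txt' -Patterns $excludes   # matches no pattern
```

Wrapping the loop in a function keeps the Where-Object filter short, for example Where-Object { -not (Test-Excluded -Path $_.FileName -Patterns $excludes) }.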

Related

Duplicate lines in a text file multiple times based on a string and alter duplicated lines

SHORT: I am trying to duplicate lines in all files in a folder based on a certain string and then replace original strings in duplicated lines only.
Contents of the original text file (there are double quotes in the file):
"K:\FILE1.ini"
"K:\FILE1.cfg"
"K:\FILE100.cfg"
I want to duplicate the entire line 4 times only if a string ".ini" is present in a line.
After duplicating the line, I want to change the string in those duplicated lines (original line stays the same) to: for example, ".inf", ".bat", ".cmd", ".mov".
So the expected result of the script is as follows:
"K:\FILE1.ini"
"K:\FILE1.inf"
"K:\FILE1.bat"
"K:\FILE1.cmd"
"K:\FILE1.mov"
"K:\FILE1.cfg"
"K:\FILE100.cfg"
Those files are small, so using streams is not necessary.
I am at the beginning of my PowerShell journey, but thanks to this community, I already know how to replace string in files recursively:
$directory = "K:\PS"
Get-ChildItem $directory -file -recurse -include *.txt |
ForEach-Object {
# -replace takes a regex, so escape the dot to match a literal ".ini"
(Get-Content $_.FullName) -replace '\.ini', '.inf' |
Set-Content $_.FullName
}
but I have no idea how to duplicate certain lines multiple times and handle multiple string replacements in those duplicated lines.
Yet ;)
Could you point me in the right direction?
To achieve this with the operator -replace you can do:
#Define strings to replace pattern with
$2replace = @('.inf','.bat','.cmd','.mov','.ini')
#Get files, use filter instead of include = faster
get-childitem -path [path] -recurse -filter '*.txt' | %{
$cFile = $_
#add new strings to array newData
$newData = @(
#Read file
get-content $_.fullname | %{
#If line matches .ini
If ($_ -match '\.ini'){
$cstring = $_
#Add new strings
$2replace | %{
#Output new strings
$cstring -replace '\.ini',$_
}
}
#output current string
Else{
$_
}
}
)
#Write to disk
$newData | set-content $cFile.fullname
}
This gives you the following output:
$newdata
"K:\FILE1.inf"
"K:\FILE1.bat"
"K:\FILE1.cmd"
"K:\FILE1.mov"
"K:\FILE1.ini"
"K:\FILE1.cfg"
"K:\FILE100.cfg"

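The output above places the original .ini line last, whereas the question's expected result keeps it first. A minimal sketch of one way to keep the original line in place, run against in-memory sample lines rather than files; the helper name Expand-IniLine is hypothetical.

```powershell
# Hypothetical helper: emits the original line, then one copy per new extension.
function Expand-IniLine {
    param(
        [string]   $Line,
        [string[]] $Extensions = @('.inf', '.bat', '.cmd', '.mov')
    )
    # Always output the line itself first so the original stays in place.
    $Line
    if ($Line -match '\.ini') {
        foreach ($ext in $Extensions) {
            # '\.ini' escapes the dot so only a literal ".ini" is replaced.
            $Line -replace '\.ini', $ext
        }
    }
}

$sample = '"K:\FILE1.ini"', '"K:\FILE1.cfg"', '"K:\FILE100.cfg"'
$result = $sample | ForEach-Object { Expand-IniLine -Line $_ }
$result
```

To apply it to a file, the same call could sit between Get-Content and Set-Content, as in the answer's loop.
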
Scanning log file using ForEach-Object and replacing text is taking a very long time

I have a PowerShell script that scans log files and replaces text when a match is found. The lookup list currently has 500 entries, and I plan to double or triple it. The log files range from 400KB to 800MB in size.
Currently, with the code below, a 42MB file takes 29 minutes. Can anyone see a way to make this faster?
I tried replacing ForEach-Object with ForEach-ObjectFast, but that made the script take significantly longer. I also tried changing the first ForEach-Object to a for loop, but it still took ~29 minutes.
$lookupTable = @{
'aaa:bbb:123'='WORDA:WORDB:NUMBER1'
'bbb:ccc:456'='WORDB:WORDBC:NUMBER456'
}
Get-Content -Path $inputfile | ForEach-Object {
$line=$_
$lookupTable.GetEnumerator() | ForEach-Object {
if ($line -match $_.Key)
{
$line = $line -replace $_.Key, $_.Value
}
}
$line
}|Set-Content -Path $outputfile
Since you say your input file could be 800MB in size, reading and updating the entire content in memory may not be feasible.
The way to go, then, is a fast line-by-line method, and the fastest I know of is switch:
# hardcoded here for demo purposes.
# In real life you get/construct these from the Get-ChildItem
# cmdlet you use to iterate the log files in the root folder..
$inputfile = 'D:\Test\test.txt'
$outputfile = 'D:\Test\test_new.txt' # absolute full file path because we use .Net here
# because we are going to Append to the output file, make sure it doesn't exist yet
if (Test-Path -Path $outputfile -PathType Leaf) { Remove-Item -Path $outputfile -Force }
$lookupTable = @{
'aaa:bbb:123'='WORDA:WORDB:NUMBER1'
}
# create a regex string from the Keys of your lookup table,
# merging the strings with a pipe symbol (the regex 'OR').
# your Keys could contain characters that have special meaning in regex, so we need to escape those
$regexLookup = '({0})' -f (($lookupTable.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|')
# create a StreamWriter object to write the lines to the new output file
# Note: use an ABSOLUTE full file path for this
$streamWriter = [System.IO.StreamWriter]::new($outputfile, $true) # $true for Append
switch -Regex -File $inputfile {
$regexLookup {
# do the replacement using the value in the lookup table.
# because in one line there may be multiple matches to replace
# get a System.Text.RegularExpressions.Match object to loop through all matches
$line = $_
$match = [regex]::Match($line, $regexLookup)
while ($match.Success) {
# $match.Value is the literal text found in the line, so use it directly as the
# lookup key and escape it before using it as the -replace pattern
$line = $line -replace [regex]::Escape($match.Value), $lookupTable[$match.Value]
$match = $match.NextMatch()
}
$streamWriter.WriteLine($line)
}
default { $streamWriter.WriteLine($_) } # write unchanged
}
# dispose of the StreamWriter object
$streamWriter.Dispose()
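As a variation on the inner while loop, the combined pattern can also be applied in one pass per line with Regex.Replace and a MatchEvaluator. This is a different technique from the one in the answer, shown here as a sketch on made-up in-memory lines; no timing claims are implied.

```powershell
$lookupTable = @{
    'aaa:bbb:123' = 'WORDA:WORDB:NUMBER1'
    'bbb:ccc:456' = 'WORDB:WORDBC:NUMBER456'
}

# One alternation pattern built from the escaped keys.
$pattern = ($lookupTable.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|'
$regex   = [regex]::new($pattern)

# The script block is converted to a MatchEvaluator delegate. It receives each
# match and returns the replacement text looked up by the matched key.
$evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
    param($match)
    $lookupTable[$match.Value]
}

# Sample lines invented for illustration.
$sampleLines = 'x aaa:bbb:123 y bbb:ccc:456', 'nothing to replace here'
$replaced = foreach ($line in $sampleLines) {
    $regex.Replace($line, $evaluator)
}
$replaced
```

Inside the switch block, the same $regex.Replace($_, $evaluator) call could feed $streamWriter.WriteLine directly.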

Powershell Files fetch

I am looking for some help creating a PowerShell script.
I have a folder containing lots of files, and I need only those files whose content meets both of the following conditions:
It must contain a string matching any of the patterns listed in file1 (the content of file1 is -IND 23042528525 or INDE 573626236 or DSE3523623; there can be more strings like these).
It must also contain a date between 03152022 and 03312022, in the format mmddyyyy.
The files could be old, so the creation time is irrelevant.
Then save the result in a CSV containing the paths of the files that fulfil the above two conditions.
Currently I am using the command below, which only gives me the files fulfilling the first condition.
$table = Get-Content C:\Users\username\Downloads\ISIN.txt
Get-ChildItem `
-Path E:\data\PROD\server\InOut\Backup\*.txt `
-Recurse |
Select-String -Pattern ($table)|
Export-Csv C:\Users\username\Downloads\File_Name.csv -NoTypeInformation
To test if a file contains a certain keyword from a range of keywords, you can use regex for that. If you also want to find at least one valid date in format 'MMddyyyy' in that file, you need to do some extra work.
Try below:
# read the keywords from the file. Ensure special characters are escaped and join them with '|' (regex 'OR')
$keywords = (Get-Content -Path 'C:\Users\username\Downloads\ISIN.txt' | ForEach-Object {[regex]::Escape($_)}) -join '|'
# create a regex to capture the date pattern (8 consecutive digits)
$dateRegex = [regex]'\b(\d{8})\b' # \b means word boundary
# and a datetime variable to test if a found date is valid
$testDate = Get-Date
# set two variables to the start and end date of your range (dates only, times set to 00:00:00)
$rangeStart = (Get-Date).AddDays(1).Date # tomorrow
$rangeEnd = [DateTime]::new($rangeStart.Year, $rangeStart.Month, 1).AddMonths(1).AddDays(-1) # end of the month
# find all .txt files and loop through. Capture the output in variable $result
$result = Get-ChildItem -Path 'E:\data\PROD\server\InOut\Backup' -Filter '*.txt' -File -Recurse |
ForEach-Object {
$content = Get-Content -Path $_.FullName -Raw
# first check if any of the keywords can be found
if ($content -match $keywords) {
# now check if a valid date pattern 'MMddyyyy' can be found as well
$dateFound = $false
$match = $dateRegex.Match($content)
while ($match.Success -and !$dateFound) {
# we found a matching pattern. Test if this is a valid date and if so
# set the $dateFound flag to $true and exit the while loop
if ([datetime]::TryParseExact($match.Groups[1].Value,
'MMddyyyy',[CultureInfo]::InvariantCulture,
[System.Globalization.DateTimeStyles]::None,
[ref]$testDate)) {
# check if the found date is in the set range
# this tests INCLUDING the start and end dates
$dateFound = ($testDate -ge $rangeStart -and $testDate -le $rangeEnd)
}
$match = $match.NextMatch()
}
# finally, if we also successfully found a date pattern, output the file
if ($dateFound) { $_.FullName }
elseif ($content -match '\bUNKNOWN\b') {
# here you output again, because unknown was found instead of a valid date in range
$_.FullName
}
}
}
# result is now either empty or a list of file fullnames
$result | set-content -Path 'C:\Users\username\Downloads\MatchedFiles.txt'
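The code above uses a range from tomorrow to the end of the current month for demonstration. Below is a minimal sketch of the date test in isolation, using the fixed range from the question (03152022 through 03312022). The helper name Test-DateInRange and the sample strings are hypothetical.

```powershell
# Hypothetical helper: does $Text contain a valid MMddyyyy date within the range?
function Test-DateInRange {
    param(
        [string]   $Text,
        [datetime] $RangeStart,
        [datetime] $RangeEnd
    )
    $parsed = [datetime]::MinValue
    # Look at every standalone run of exactly 8 digits.
    foreach ($m in [regex]::Matches($Text, '\b\d{8}\b')) {
        $ok = [datetime]::TryParseExact($m.Value, 'MMddyyyy',
            [System.Globalization.CultureInfo]::InvariantCulture,
            [System.Globalization.DateTimeStyles]::None,
            [ref] $parsed)
        # Range test is inclusive of both the start and the end date.
        if ($ok -and $parsed -ge $RangeStart -and $parsed -le $RangeEnd) {
            return $true
        }
    }
    return $false
}

# Fixed range taken from the question.
$start = [datetime]::new(2022, 3, 15)
$end   = [datetime]::new(2022, 3, 31)

Test-DateInRange -Text 'IND 23042528525 sent 03202022' -RangeStart $start -RangeEnd $end   # date inside the range
Test-DateInRange -Text 'INDE 573626236 sent 04012022'  -RangeStart $start -RangeEnd $end   # date outside the range
```

Setting $rangeStart and $rangeEnd in the script above to the same two values would apply the question's range to the real files.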

Export CSV. Folder, subfolder and file into separate column

I created a script that lists all the folders, subfolders and files and export them to csv:
$path = "C:\tools"
Get-ChildItem $path -Recurse |select fullname | export-csv -Path "C:\temp\output.csv" -NoTypeInformation
But I would like each folder, subfolder and file in the path to be written into a separate column in the CSV.
Something like this:
c:\tools\test\1.jpg
Column1 | Column2 | Column3
tools | test | 1.jpg
I will be grateful for any help.
Thank you.
You can split the Fullname property using the Split() method. The tricky part is that you need to know the maximum path depth in advance, as the CSV format requires that all rows have the same number of columns (even if some columns are empty).
# Process directory $path recursively
$allItems = Get-ChildItem $path -Recurse | ForEach-Object {
# Split on directory separator (typically '\' for Windows and '/' for Unix-like OS)
$FullNameSplit = $_.FullName.Split( [IO.Path]::DirectorySeparatorChar )
# Create an object that contains the split path and the path depth.
# This is implicit output that PowerShell captures and adds to $allItems.
[PSCustomObject] @{
FullNameSplit = $FullNameSplit
PathDepth = $FullNameSplit.Count
}
}
# Determine highest column index from maximum depth of all paths.
# Minus one, because we'll skip root path component.
$maxColumnIndex = ( $allItems | Measure-Object -Maximum PathDepth ).Maximum - 1
$allRows = foreach( $item in $allItems ) {
# Create an ordered hashtable
$row = [ordered]@{}
# Add all path components to hashtable. Make sure all rows have same number of columns.
foreach( $i in 1..$maxColumnIndex ) {
$row[ "Column$i" ] = if( $i -lt $item.FullNameSplit.Count ) { $item.FullNameSplit[ $i ] } else { $null }
}
# Convert hashtable to object suitable for output to CSV.
# This is implicit output that PowerShell captures and adds to $allRows.
[PSCustomObject] $row
}
# Finally output to CSV file
$allRows | Export-Csv -Path "C:\temp\output.csv" -NoTypeInformation
Notes:
The syntax Select-Object @{ Name = ...; Expression = ... } creates a calculated property.
$allRows = foreach captures and assigns all output of the foreach loop to variable $allRows, which will be an array if the loop outputs more than one object. This works with most other control statements as well, e.g. if and switch.
Within the loop I could have created a [PSCustomObject] directly (and used Add-Member to add properties to it) instead of first creating a hashtable and then converting to [PSCustomObject]. The chosen way should be faster as no additional overhead for calling cmdlets is required.
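The calculated-property syntax from the first note can be illustrated with a short sketch. It assumes the maximum depth is known in advance (three columns here) and uses made-up sample paths in place of Get-ChildItem output.

```powershell
# Sample paths standing in for Get-ChildItem output (made up for illustration).
$items = 'C:\tools\test\1.jpg', 'C:\tools\readme.txt' |
    ForEach-Object { [PSCustomObject] @{ FullName = $_ } }

# Each hashtable defines one calculated property: a column name plus a
# script block that computes its value from the current object ($_).
# Index 0 is the drive ('C:'), so the columns start at index 1.
$rows = $items | Select-Object -Property @(
    @{ Name = 'Column1'; Expression = { $_.FullName.Split('\')[1] } }
    @{ Name = 'Column2'; Expression = { $_.FullName.Split('\')[2] } }
    @{ Name = 'Column3'; Expression = { $_.FullName.Split('\')[3] } }
)
$rows | ConvertTo-Csv -NoTypeInformation
```

Indexing past the end of the split array yields $null, so shorter paths simply get empty trailing columns. The hashtable approach in the answer remains the better fit when the depth is not known up front.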
While a file with rows containing a variable number of items is not actually a CSV file, you can roll your own and Microsoft Excel can read it.
=== Get-DirCsv.ps1
Get-Childitem -File |
ForEach-Object {
$NameParts = $_.FullName -split '\\'
$QuotedParts = [System.Collections.ArrayList]::new()
foreach ($NamePart in $NameParts) {
$QuotedParts.Add('"' + $NamePart + '"') | Out-Null
}
Write-Output $($QuotedParts -join ',')
}
Use this to capture the output to a file with:
.\Get-DirCsv.ps1 | Out-File -FilePath '.\dir.csv' -Encoding ascii

Powershell - Assigning unique file names to duplicated files using list inside a .csv or .txt

I have limited experience with Powershell doing very basic tasks by itself (such as simple renaming or moving files), but I've never created one that has the need to actually extract information from inside a file and apply that data directly to a file name.
I'd like to create a script that can reference a simple .csv or text file containing a list of unique identifiers and have it assign those to a batch of duplicated files (they all have the same contents) that share a slightly different name in the form of a 3-digit number appended as the prefix of a generic name.
For example, let's say my list of files are something like this:
001_test.txt
002_test.txt
003_test.txt
004_test.txt
005_test.txt
etc.
Then my .csv contains an alphabetical list of what I would like those to become:
Alpha.txt
Beta.txt
Charlie.txt
Delta.txt
Echo.txt
etc.
I tried looking at similar examples, but I'm failing miserably trying to tailor them to get it to do the above.
EDIT: I didn't save what I already modified, but here is the baseline script I was messing with:
$file_server = Read-Host "Enter the file server IP address"
$rootFolder = 'C:\TEMP\GPO\source\5'
Get-ChildItem -LiteralPath $rootFolder -Directory |
Where-Object { $_.Name -as [System.Guid] } |
ForEach-Object {
$directory = $_.FullName
(Get-Content "$directory\gpreport.xml") |
ForEach-Object { $_ -replace "99.999.999.999", $file_server } |
Set-Content "$directory\gpreport.xml"
# ... etc
}
I think this is to replace a string inside a file though. I need to replace the file name itself using a list from another file (that is not getting renamed), while not changing the contents of the files that are being renamed.
So you want to rename similar files using the names listed in a text file. OK, here's what you are going to need for my solution (aliases listed in parentheses): Get-Content (GC), Get-ChildItem (GCI), Where (?), Rename-Item, ForEach (%)
$NewNames = GC c:\temp\Namelist.txt #Path, including file name, to list of new names
$Name = "dog.txt" #File name without the 001_ prefix
$Path = "C:\Temp" #Path to search
$i=0
GCI $path | ?{$_.Name -match "\d{3}_$Name"}|%{Rename-Item $_.FullName $NewNames[$i];$i++}
Tested as working. That gets your list of new names and saves it as an array. Then it defines your file name, path, and sets $i to 0 as a counter. Then for each file that matches your pattern it renames it based off of item number $i in the array of new names, and then increments $i up one number and moves to the next file.
I haven't tested this, but it should be pretty close. It assumes you have a CSV with a column named FileNames and that you have at least as many names in that list as there are on disk.
$newNames = Import-Csv newfilenames.csv | Select -ExpandProperty FileNames
$existingFiles = Get-ChildItem c:\someplace
for ($i = 0; $i -lt $existingFiles.count; $i++)
{
Rename-Item -Path $existingFiles[$i].FullName -NewName $newNames[$i]
}
Basically, you create two arrays and use a basic for loop to step through the list of files on disk, pulling the new name from the corresponding index in the newNames array.
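Both index-based approaches depend on the files and the names lining up by position. A minimal sketch of that pairing step on plain strings, with an explicit sort and a count check, might look like this; the helper name Get-RenamePlan and the $folder variable in the comment are hypothetical.

```powershell
# Hypothetical helper: pairs files (sorted by name) with new names by position.
function Get-RenamePlan {
    param(
        [string[]] $OldNames,
        [string[]] $NewNames
    )
    # @() keeps the result an array even when there is only one file.
    $sorted = @($OldNames | Sort-Object)
    if ($NewNames.Count -lt $sorted.Count) {
        throw "Need $($sorted.Count) new names but got $($NewNames.Count)."
    }
    for ($i = 0; $i -lt $sorted.Count; $i++) {
        [PSCustomObject] @{ OldName = $sorted[$i]; NewName = $NewNames[$i] }
    }
}

$plan = Get-RenamePlan -OldNames '002_test.txt', '001_test.txt', '003_test.txt' `
                       -NewNames 'Alpha.txt', 'Beta.txt', 'Charlie.txt'
$plan

# To apply the plan inside a folder, preview it first with -WhatIf:
# $plan | ForEach-Object { Rename-Item -LiteralPath (Join-Path $folder $_.OldName) -NewName $_.NewName -WhatIf }
```

Building the plan separately from the rename makes it easy to inspect the pairs before anything on disk changes.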
Does your CSV file map the identifiers to the file names?
Identifier,NewName
001,Alpha
002,Beta
If so, you'll need to look up the identifier before renaming the file:
# Define the naming convention
$Suffix = '_test'
$Extension = 'txt'
# Get the files and what to rename them to
$Files = Get-ChildItem "*$Suffix.$Extension"
$Csv = Import-Csv 'Names.csv'
# Rename the files
foreach ($File in $Files) {
$NewName = ($Csv | Where-Object { $File.Name -match '^' + $_.Identifier } | Select-Object -ExpandProperty NewName)
Rename-Item $File "$NewName.$Extension"
}
If your CSV file is just a sequential list of filenames, logicaldiagram's answer is probably more along the lines of what you're looking for.
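For the identifier-based variant, the Where-Object scan over the CSV for every file can be replaced by a hashtable built once. This is a sketch under the assumption that the CSV has the Identifier and NewName columns shown above; the in-memory CSV text and the helper name Get-NewFileName are hypothetical.

```powershell
# In-memory stand-in for Names.csv (same columns as the sample above).
$csv = @'
Identifier,NewName
001,Alpha
002,Beta
'@ | ConvertFrom-Csv

# Build the lookup once: identifier -> new base name.
$nameById = @{}
foreach ($row in $csv) { $nameById[$row.Identifier] = $row.NewName }

# Hypothetical helper: maps '001_test.txt' to 'Alpha.txt'; outputs nothing if unknown.
function Get-NewFileName {
    param([string] $FileName, [hashtable] $Lookup)
    # Group 1 is the 3-digit prefix, group 2 is the file extension.
    if ($FileName -match '^(\d{3})_.*(\.[^.]+)$' -and $Lookup.ContainsKey($Matches[1])) {
        $Lookup[$Matches[1]] + $Matches[2]
    }
}

Get-NewFileName -FileName '001_test.txt' -Lookup $nameById
Get-NewFileName -FileName '002_test.txt' -Lookup $nameById
```

A file whose prefix is missing from the CSV produces no new name, so a caller can skip it instead of passing an empty name to Rename-Item.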