PowerShell Workflow Do...While loop keeps looping

I am tinkering with Workflows to process many files concurrently, and I have written a rough proof of concept prior to the actual implementation.
The problem is that the loop never stops looping. The idea of the script is to get a list of files into an array (let's call it "Original"), create new arrays (let's call them "LoopingArray") to process the files in batches of x files, remove the processed items from the array to create a new array to process, and so on, until the original array is empty.
For each item a txt file is created, so when the "Original" array is empty, the Do..While should stop. But it doesn't; it keeps creating files over and over. What am I doing wrong?
Workflow Test-Workflow {
    $SourceFolder = "c:\test\whatever"
    $Files = [System.IO.Directory]::EnumerateFiles($SourceFolder, '*.*', 'AllDirectories')
    $ArrayCount = $Files | Measure-Object | Select-Object Count
    Do {
        $NewArray = $Files | Select-Object -First 20
        $Files = $Files | Where-Object { $NewArray -notcontains $_ }
        $ArrayCount = $Files | Measure-Object | Select-Object Count
        ForEach -Parallel ($it in $NewArray)
        {
            $Name = Get-Random
            $Filepath = "C:\temp\test\" + "$Name" + ".txt"
            $it | Out-File -FilePath $Filepath
        }
    } Until ($ArrayCount.Count -eq "0")
}
Thanks!
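For comparison, here is a sketch of the same batching idea that avoids rebuilding the remaining-files array on every pass: materialize the enumerator into a real array once, then slice it by offset. Treat it as an untested sketch (Test-BatchWorkflow is a made-up name), not a confirmed fix for the looping behaviour the question asks about:
Workflow Test-BatchWorkflow {
    $SourceFolder = "c:\test\whatever"
    # Materialize the lazy enumerator into a concrete array so Count is stable
    $Files = @([System.IO.Directory]::EnumerateFiles($SourceFolder, '*.*', 'AllDirectories'))
    $BatchSize = 20
    for ($Offset = 0; $Offset -lt $Files.Count; $Offset += $BatchSize) {
        # Take the next batch by offset instead of rebuilding the array each pass
        $Batch = $Files | Select-Object -Skip $Offset -First $BatchSize
        ForEach -Parallel ($it in $Batch) {
            $Name = Get-Random
            $it | Out-File -FilePath ("C:\temp\test\" + "$Name" + ".txt")
        }
    }
}
Because the loop counter is driven by a fixed array length rather than a shrinking pipeline, there is no termination condition to get wrong.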

Related

How to sort array based on naming structure

I've built a small report which essentially just does row counts for Excel files within a share. However, there is now a requirement for the report to display the directory counts in a specific order.
I cannot fathom how I'd go about that.
# Searching location
$searchinfolder = '\\Report\testing\'
# Creation of array.
$data = @()
# Get child items that are not folder objects and whose directory is not "Positions"
$Files = Get-ChildItem -Path $searchinfolder -Recurse | Where { ! $_.Directory.Name -ne "Positions" }
Foreach ($File in $Files) {
    # Main section. Gets the csv files and does a row count after removing the top 2 and last 3 lines.
    $fileStats = Get-Content $File.FullName | Select-Object -Skip 2 | Select-Object -SkipLast 3 | Measure-Object -Line
    $linesInFile = $fileStats.Lines - 1
    # Added a counter because arrays start at 0.. need to start at 1.
    $linesInFile++
    # Only gets files with data in them.
    if ($linesInFile -gt 0) {
        $data += @(
            [pscustomobject]@{
                Filename  = $File.FullName;
                Rowcount  = $linesInFile;
                Directory = $File.Directory.Name
            })
    }
}
# Group by directory and get the total sum for each file.
$data = $data | Group-Object Directory | ForEach-Object {
    [PSCustomObject]@{
        Directory = $_.Group.Directory | Get-Unique
        Rowcount  = ($_.Group.Rowcount | Measure-Object -Sum).Sum
    }
}
So for example, let's say the folder structure we're scraping is Cat, Dog, Goat, Programmer, Lama, Mouse.
Let's say all the folders but one contain files. How would I go about having the $data array arranged in a specific order of my choosing? Furthermore, how would you go about setting the order and just skipping to the next assigned directory if the current directory is empty?
See below my attempt at pseudo-code trying to explain this.
Foreach ($item in $data) {
    if ($item.directory -eq "cat")        { $item = $array[0] }
    if ($item.directory -eq "dog")        { $item = $array[1] }
    if ($item.directory -eq "goat")       { $item = $array[2] }
    if ($item.directory -eq "Programmer") { $item = $array[3] }
    if ($item.directory -eq "Lama")       { $item = $array[4] }
    if ($item.directory -eq "Mouse")      { $item = $array[5] }
}
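For what it's worth, a common way to get a fixed ordering without a chain of if statements is to declare the desired order once and sort on each item's index in that list. This is only a sketch against the $data objects built above, with the $order list assumed from the example folder names:
# Desired directory order; anything not listed sorts to the end.
# Note: IndexOf is case-sensitive, so match the casing used in $data.
$order = 'Cat', 'Dog', 'Goat', 'Programmer', 'Lama', 'Mouse'

# Sort-Object with a scriptblock: sort key is each item's position in $order
$data = $data | Sort-Object {
    $i = $order.IndexOf($_.Directory)
    if ($i -lt 0) { [int]::MaxValue } else { $i }
}
Since empty directories never produce a row in $data in the first place, they simply drop out of the ordering, which covers the skip-to-the-next-directory case as well.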

Repeat foreach loop after all iterations completed in PowerShell

Can anyone here help me repeat the code from the beginning after all iterations in the foreach loop have completed? The code below gets all the files containing the 'qwerty' pattern, feeds the list into a foreach loop, and displays the filename and last 10 lines of each file. The code should terminate if there is no new/updated file within a certain amount of time.
$today = (Get-Date).Date
$FILES = Get-ChildItem -Path C:\Test\ |
    Where-Object { $_.LastWriteTime -ge $today } |
    Select-String -Pattern "qwerty" |
    Select-Object FileName -Unique
foreach ($i in $FILES) {
    Write-Host $i -ForegroundColor Red
    Get-Content -Path \\XXXXXX\$i -Tail 10
    Start-Sleep 1
}
You can use this:
for ($r = 0; $r -lt $numberOfTimesToRepeat; $r++) {
    $today = (Get-Date).Date
    $FILES = Get-ChildItem -Path C:\Test\ |
        Where-Object { $_.LastWriteTime -ge $today } |
        Select-String -Pattern "qwerty" |
        Select-Object FileName -Unique
    foreach ($i in $FILES) {
        Write-Host $i -ForegroundColor Red
        Get-Content -Path \\XXXXXX\$i -Tail 10
        Start-Sleep 1
    }
}
PS: Set $numberOfTimesToRepeat to the number of times you want to repeat before running this.
If I understand the question properly, you would like to test for files in a certain folder containing a certain string. For each of these files, the last 10 lines should be displayed.
The first difficulty comes from the fact that you want to do this inside a loop and test new or updated files.
That means you need to keep track of files you have already tested and only display new or updated files. The code below uses a Hashtable $alreadyChecked for that so we can test if a file is either new or updated.
If no new or updated files are found during a certain time, the code should end. To do that, I'm using two other variables: $endTime and $checkTime.
- $checkTime gets updated on every iteration, making it the current time.
- $endTime only gets updated if files were found.
$today = (Get-Date).Date
$sourceFolder = 'D:\Test'
$alreadyChecked = @{}   # a Hashtable to keep track of files already checked
$maxMinutes = 5         # the max time in minutes to perform the loop when no new files or updates are added
$endTime = (Get-Date).AddMinutes($maxMinutes)
do {
    $checkTime = Get-Date
    $files = Get-ChildItem -Path $sourceFolder -File |
        # only files created today and that have not been checked already
        Where-Object { $_.LastWriteTime -ge $today -and
                       (!$alreadyChecked.ContainsKey($_.FullName) -or
                       $alreadyChecked[$_.FullName].LastWriteTime -ne $_.LastWriteTime) } |
        ForEach-Object {
            $filetime = $_.LastWriteTime
            $_ | Select-String -Pattern "qwerty" -SimpleMatch |   # -SimpleMatch if you don't use a regex match
                Select-Object Path, FileName, @{Name = 'LastWriteTime'; Expression = { $filetime }}
        }
    if ($files) {
        foreach ($item in $files) {
            Write-Host $item.Filename -ForegroundColor Red
            Write-Host (Get-Content -Path $item.Path -Tail 10)
            Write-Host
            # update the Hashtable to keep track of files already done
            $alreadyChecked[$item.Path] = $item | Select-Object FileName, LastWriteTime
            Start-Sleep 1
        }
        # files were found, so push back the time to check for no updates/new files
        $endTime = (Get-Date).AddMinutes($maxMinutes)
    }
    # exit the loop if no new or updated files have been found during $maxMinutes time
} while ($checkTime -le $endTime)
For this demo I'm using 5 minutes as the time to wait before the loop expires when no new or updated files are found, but you can change that to suit your needs.

Using Array and get-childitem to find filenames with specific ids

In the most basic sense, I have a SQL query which returns an array of IDs, which I've stored in a variable, $ID. I then want to perform a Get-ChildItem on a specific folder for any filenames that contain any of the IDs in that variable ($ID). There are three possible filenames that could exist:
$ID.xml
$ID_input.xml
$ID_output.xml
Once I have the results of Get-ChildItem, I want to output them to a text file and delete the files from the folder. The part I'm having trouble with is filtering the results of Get-ChildItem so that only files whose names contain the IDs from the SQL output are returned.
I found another way of doing this, which works fine, by using foreach ($i in $id), building the desired filenames from that, and performing a Remove-Item on them:
# Build list of XML files
$XMLFile = foreach ($I in $ID)
{
    "$XMLPath\$I.xml", "$XMLPath\$I`_output.xml", "$XMLPath\$I`_input.xml"
}
# Delete XML files
$XMLFile | Remove-Item -Force
However, this produces a lot of errors in the shell, as it tries to delete files that don't exist but whose IDs do exist in the database. I also can't figure out how to produce a text output of the files that were actually deleted this way, so I'd like to get back to the Get-ChildItem approach, if possible.
Any ideas would be greatly appreciated. If you require more info, just ask.
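One minimal way to patch the foreach approach above is to filter the constructed paths with Test-Path so only files that actually exist get deleted; the filtered list then doubles as the deletion log. A sketch building on the question's own $XMLFile list (the log path is just an example):
# Keep only the candidate paths that actually exist on disk
$existing = $XMLFile | Where-Object { Test-Path -LiteralPath $_ }

# Record what is about to be removed, then remove it
$existing | Out-File "C:\temp\deleted_xml_files.txt"
$existing | Remove-Item -Force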
You can find all *.xml files with Get-ChildItem to minimize the number of files to test, and then use a regex to match the filenames. It's faster than a loop with multiple tests, but harder to read if you're not familiar with regex.
$id = 123,111
# Create regex pattern (search pattern)
$regex = "^($(($id | ForEach-Object { [regex]::Escape($_) }) -join '|'))(?:_input|_output)?$"
$filesToDelete = Get-ChildItem -Path "c:\users\frode\Desktop\test" -Filter "*.xml" | Where-Object { $_.BaseName -match $regex }
# Save list of files
$filesToDelete | Select-Object -ExpandProperty FullName | Out-File "deletedfiles.txt" -Append
# Remove files (remove -WhatIf when ready)
$filesToDelete | Remove-Item -Force -WhatIf
Regex demo: https://regex101.com/r/dS2dJ5/2
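For example, with $id = 123,111 the pattern expands to ^(123|111)(?:_input|_output)?$, so base names like 123, 123_input, and 111_output match, while 1234 or 123_backup do not.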
Try this:
Clear-Host
$ID = "a", "b", "c"
$filesToDelete = New-Object System.Collections.ArrayList
$files = Get-ChildItem e:\
foreach ($I in $ID)
{
    # ${I} is needed so the underscore isn't parsed as part of the variable name
    ($files | Where-Object { $_.Name -eq "$I.xml" }).FullName | ForEach-Object { [void]$filesToDelete.Add($_) }
    ($files | Where-Object { $_.Name -eq "${I}_input.xml" }).FullName | ForEach-Object { [void]$filesToDelete.Add($_) }
    ($files | Where-Object { $_.Name -eq "${I}_output.xml" }).FullName | ForEach-Object { [void]$filesToDelete.Add($_) }
}
$filesToDelete | Select-Object -Unique | ForEach-Object { Remove-Item $_ -Force }

Compare contents of 6 objects and delete which are not matching

I have 6 files which are created dynamically (so I don't know the contents). I need to compare these 6 files (more exactly, compare one file with the 5 others) and see which contents in file 1 match the other 5. The contents that match should be saved; the others need to be deleted.
I coded something like the below, but it is deleting everything (including the ones that match).
$lines = Get-Content "C:\snaps.txt"
$check1 = Get-Content "C:\Previous_day_latest.txt"
$check2 = Get-Content "C:\this_week_saved_snaps.txt"
$check3 = Get-Content "C:\all_week_latest_snapshots.txt"
$check4 = Get-Content "C:\each_month_latest.txt"
$check5 = Get-Content "C:\exclusions.txt"
foreach ($l in $lines)
{
    if (($l -notmatch $check1) -and ($l -notmatch $check2) -and ($l -notmatch $check3) -and ($l -notmatch $check4))
    {
        Remove-Item -Path "C:\$l.txt"
    }
    else
    {
        # nothing
    }
}
foreach ($ch in $check5)
{
    Remove-Item -Path "C:\$ch.txt"
}
Contents of 6 files will be as shown below:
$lines
testinstance-01-07-15-08-00
testinstance-10-07-15-23-00
testinstance-13-02-15-13-00
testinstance-15-06-15-23-00
testinstance-19-01-15-23-00
testinstance-23-05-15-20-00
testinstance-27-03-15-23-00
testinstance-28-02-15-23-00
testinstance-29-07-15-08-00
testinstance-30-04-15-23-00
testinstance-30-06-15-23-00
testinstance-31-01-15-23-00
testinstance-31-12-14-23-00
$check1
testinstance-29-07-15-08-00
$check2
testinstance-23-05-15-20-00
testinstance-27-03-15-23-00
$check3
testinstance-01-07-15-23-00
testinstance-13-02-15-13-00
testinstance-19-01-15-23-00
$check4
testinstance-28-02-15-23-00
testinstance-30-04-15-23-00
testinstance-30-06-15-23-00
testinstance-31-01-15-23-00
$check5
testinstance-31-12-14-23-00
I've read about Compare-Object, but I'm not sure how it can be implemented in my case, as the contents of all 5 files will be different and all those contents should be saved from deletion. Can someone please guide me to achieve what I described? Any help would be really appreciated.
I would create an array of the files to check so you can simply add new files without modifying other parts of your script.
I use the where cmdlet, which filters all lines that are in the reference file using the -in operator, and finally overwrites the file:
$referenceFile = 'C:\snaps.txt'
$compareFiles = @(
    'C:\Previous_day_latest.txt',
    'C:\this_week_saved_snaps.txt',
    'C:\all_week_latest_snapshots.txt',
    'C:\each_month_latest.txt',
    'C:\exclusions.txt'
)
# get the content of the reference file
$referenceContent = (gc $referenceFile)
foreach ($file in $compareFiles)
{
    # get the content of the file to check
    $content = (gc $file)
    # filter all contents from the file to check which are in the reference file and save it
    $content | where { $_ -in $referenceContent } | sc $file
}
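Note that the -in operator used above requires PowerShell 3.0 or later; on older versions you can flip the operands and use $referenceContent -contains $_ instead, as the next answer does.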
You can use the -contains operator to compare array contents. If you open all the files you want to check and store them in an array, you can compare that with the reference file:
$lines = Get-Content "C:\snaps.txt"
$check1 = "C:\Previous_day_latest.txt"
$check2 = "C:\this_week_saved_snaps.txt"
$check3 = "C:\all_week_latest_snapshots.txt"
$check4 = "C:\each_month_latest.txt"
$check5 = "C:\exclusions.txt"
$checklines = @()
(1..5) | ForEach-Object {
    $comp = Get-Content (Get-Variable "check$_").Value
    $checklines += $comp
}
$matchingLines = $lines | ? { $checklines -contains $_ }
If you switch the -contains to -notcontains you'll see the three lines that don't match
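A small caveat with this approach: avoid storing the result in a variable named $matches, because $matches is an automatic variable that PowerShell repopulates after every successful -match comparison; a name like $matchingLines is safer.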
The other answers here are great, but I wanted to show you that Compare-Object could still work; you just need to use it in a loop. To show something else as well, I included a simple use of Join-Path for building the array of checks. Basically we are saving some typing for when you move your files to a production area: you update one path instead of many.
$rootPath = "C:\"
$fileNames = "Previous_day_latest.txt", "this_week_saved_snaps.txt", "all_week_latest_snapshots.txt", "each_month_latest.txt", "exclusions.txt"
$lines = Get-Content (Join-path $rootPath "snaps.txt")
$checks = $fileNames | ForEach-Object{Join-Path $rootPath $_}
ForEach($check in $checks){
Compare-Object -ReferenceObject $lines -DifferenceObject (Get-Content $check) -IncludeEqual |
Where-Object{$_.SideIndicator -eq "=="} |
Select-Object -ExpandProperty InputObject |
Set-Content $check
}
So we take each file path and use Compare-Object in a loop comparing each to the $lines array. Using -IncludeEqual we find the lines that both files share and write those back to the file.
Depending on how many checks you have and where they are, it might be easier to build the $checks array with this line:
$checks = Get-ChildItem "C:\" -Filter "*.txt" | Select-Object -Expand FullName

Need to add the full path of where the text was referenced from

So far I have a hash table with 2 values in it. Right now, the code below exports all the unique lines and gives me a count of how many times each line was referenced in 100's of XML files. This is one part.
I now need to find out which subfolder had the XML file in it that contains each unique line referenced in the hash table. Is this possible?
$ht = @{}
Get-ChildItem -Recurse -Filter *.xml | Get-Content | % { $ht[$_] = $ht[$_] + 1 }
$ht
# To export to CSV:
$ht.GetEnumerator() | select key, value | Export-Csv D:\output.csv
To get the file path into your output, you need to assign it to a variable in the first pipe.
Is this something similar to what you need?
$ht = @{}
Get-ChildItem -Recurse -Filter *.xml | % { $path = $_.FullName; Get-Content $path } | % { $ht[$_] = $ht[$_] + $path + ";" }
The code above will return a hash table that maps each config line to a semicolon-separated list of the paths it was found in (one entry per occurrence).
EDIT:
If you need to return three elements (unique line, count, and an array of paths where it was found), it gets more complicated. Here is code that will return an array of PSObjects; each contains the info for one unique line in the XML files.
$ht = @()
$files = Get-ChildItem -Recurse -Filter *.xml
foreach ($file in $files) {
    $path = $file.FullName
    $lines = Get-Content $path
    foreach ($line in $lines) {
        if ($match = $ht | where { $_.Line -eq $line }) {
            $match.Count = $match.Count + 1
            $match.Paths += $path
        } else {
            $ht += New-Object PSObject -Property @{
                Count = 1
                Paths = @(,$path)
                Line  = $line }
        }
    }
}
$ht
I'm sure it can be shortened and optimized, but hopefully it is enough to get you started.
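On that optimization point: scanning the whole $ht array with where for every line makes the above roughly O(n²) in the number of unique lines. A hashtable keyed on the line text gives constant-time lookups instead. Here's a sketch of that variant (same three output fields, untested against real data):
$index = @{}    # maps line text -> PSObject with Line, Count and Paths
Get-ChildItem -Recurse -Filter *.xml | ForEach-Object {
    $path = $_.FullName
    foreach ($line in (Get-Content $path)) {
        if ($index.ContainsKey($line)) {
            # seen before: bump the count and record this file's path too
            $index[$line].Count += 1
            $index[$line].Paths += $path
        } else {
            $index[$line] = New-Object PSObject -Property @{
                Count = 1
                Paths = @(,$path)
                Line  = $line
            }
        }
    }
}
$index.Values    # same shape as $ht in the answer above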