PowerShell script has been running for days when doing a comparison

I have a PowerShell script that works fine for a smaller amount of data, but I am trying to run my CSV against a folder which has multiple folders and files within it. The folder is nearly 800GB in size and contains about 180 folders.
I want to see whether each file exists in the folder. I can search for a file manually within Windows and it does not take too long to return a result, but my CSV has 3000 rows and I do not wish to do this 3000 times.
The script has been running for 6 days and has not generated a file with data yet; the output file is still 0KB. I am running it via Task Scheduler.
Script is below.
$myFolder = Get-ChildItem 'C:\Test\TestData' -Recurse -ErrorAction SilentlyContinue -Force
$myCSV = Import-Csv -Path 'C:\Test\differences.csv' | % {$_.'name' -replace "\\", ""}
$compare = Compare-Object -ReferenceObject $myCSV -DifferenceObject $myFolder
Write-Output "`n_____MISSING FILES_____`n"
$compare
Write-Output "`n_____MISSING FILES DETAILS____`n"
foreach ($y in $compare) {
    if ($y.SideIndicator -eq "<=") {
        Write-Output "$($y.InputObject) Is present in the CSV but not in Missing folder."
    }
}
I then created another script, run via Task Scheduler, which calls the script above and pipes its output to a file:
C:\test\test.ps1 | Out-File 'C:\test\Results.csv'
Is there a better way of doing this?
Thanks

Is there a better way of doing this?
Yes!
1. Add each file name on disk to a HashSet[string] - the HashSet type is SUPER FAST at determining whether it contains a specific value or not, much faster than Compare-Object.
2. Loop over your CSV records and check if each file name exists in the set from step 1.
# 1. Build our file name index using a HashSet
$fileNames = [System.Collections.Generic.HashSet[string]]::new()
Get-ChildItem 'C:\Test\TestData' -Recurse -ErrorAction SilentlyContinue -Force | ForEach-Object {
    [void]$fileNames.Add($_.Name)
}

# 2. Check each CSV record against the file name index
Import-Csv -Path 'C:\Test\differences.csv' | ForEach-Object {
    $referenceName = $_.name -replace '\\'
    if (-not $fileNames.Contains($referenceName)) {
        "${referenceName} is present in CSV but not on disk"
    }
}
Another option is to use the hash set from step 1 in a Where-Object filter:
$csvRecordsMissingFromDisk = Import-Csv -Path 'C:\Test\differences.csv' | Where-Object { -not $fileNames.Contains(($_.name -replace '\\')) }
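Since the Where-Object variant keeps the whole CSV records rather than just the names, they can be written straight back out as CSV. A minimal sketch, reusing the Results path from your Task Scheduler wrapper above (that path is an assumption; adjust as needed):
# Persist the records whose files were not found on disk, one row per missing file
$csvRecordsMissingFromDisk | Export-Csv -Path 'C:\test\Results.csv' -NoTypeInformation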

Related

Powershell: ForEach Copy-Item doesn't rename properly when retrieving data from array

I am pretty new to PowerShell and I need some help. I have a .bat file that I want to copy as many times as there are usernames in my array and then also rename at the same time. This is because the code in the .bat file remains the same, but for it to work on the client PC it has to have the username as a prefix in the filename.
This is the code that I have tried:
$usernames = Import-Csv C:\Users\Admin\Desktop\usernames.csv
$file = Get-ChildItem -Path 'C:\Users\Admin\Desktop\generatedbat\' -Recurse
foreach ($username in $usernames)
{
    ForEach-Object {Copy-Item $file.FullName ('C:\Users\Admin\Desktop\generatedbat\' + $username + $File.BaseName + ".bat")}
}
This copies everything and it kind of works but I have one problem.
Instead of having this filename: JohnR-VPNNEW_up.bat
I get this: @{Username=JohnR}-VPNNEW_up.bat
Any help? Thanks!
So you have one .bat file C:\Users\Admin\Desktop\generatedbat\VPNNEW_up.bat you want to copy to the same directory with new names taken from the usernames.csv --> Username column.
Then try
# get an array of just the UserNames column in the csv file
$usernames = (Import-Csv -Path 'C:\Users\Admin\Desktop\usernames.csv').Username
# get the file as object so you can use its properties
$originalFile = Get-Item -Path 'C:\Users\Admin\Desktop\generatedbat\VPNNEW_up.bat'
foreach ($username in $usernames) {
    $targetFile = Join-Path -Path $originalFile.DirectoryName -ChildPath ('{0}-{1}' -f $username, $originalFile.Name)
    $originalFile | Copy-Item -Destination $targetFile -WhatIf
}
I have added the -WhatIf switch so you can test this first. If what is displayed in the console window looks OK, remove that -WhatIf safety switch and run the code again so the files are actually copied.
I kept the code the same but instead of using a .csv file I just used a .txt file and it worked perfectly.
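For reference, a minimal sketch of that variation, assuming a plain-text file (the usernames.txt name is hypothetical) with one user name per line:
# read one user name per line from a plain text file instead of a CSV
$usernames = Get-Content -Path 'C:\Users\Admin\Desktop\usernames.txt'
$originalFile = Get-Item -Path 'C:\Users\Admin\Desktop\generatedbat\VPNNEW_up.bat'
foreach ($username in $usernames) {
    $targetFile = Join-Path -Path $originalFile.DirectoryName -ChildPath ('{0}-{1}' -f $username, $originalFile.Name)
    $originalFile | Copy-Item -Destination $targetFile -WhatIf   # remove -WhatIf to actually copy
}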

Script to scan computers from a list and identify which ones have a software title installed

I have narrowed down a little more exactly what my end game is.
I have pre-created the file that I want the results to write to.
Here is a rough script of what I want to do:
$computers = Get-content "C:\users\nicholas.j.nedrow\desktop\scripts\lists\ComputerList.txt"
# Ping all computers in ComputerList.txt.
# Need WinEst_Computers.csv file created with 3 Columns; Computer Name | Online (Y/N) | Does File Exist (Y/N)
$output = foreach ($comp in $computers) {
    $TestConn = Test-Connection -cn $comp -BufferSize 16 -Count 1 -ea 0 -quiet
    if ($TestConn -match "False")
    {
        #Write "N" to "Online (Y/N)" Column in primary output .csv file
    }
    if ($TestConn -match "True")
    {
        #Write "Y" to "Online (Y/N)" Column in primary output .csv file
    }

    #For computers that return a "True" ping value:
    #Search for WinEst.exe application on C:\
    Get-ChildItem -Path "\\$comp\c$\program files (x86)\WinEst.exe" -ErrorAction SilentlyContinue |
    if ("\\$comp\c$\program files (x86)\WinEst.exe" -match "False")
    {
        #Write "N" to "Does File Exist (Y/N)" Column in primary output .csv file
    }
    if ("\\$comp\c$\program files (x86)\WinEst.exe" -match "True")
    {
        #Write "Y" to "Does File Exist (Y/N)" Column in primary output .csv file
    }

    Select @{n='ComputerName';e={$comp}},Name
}
$output | Out-file "C:\users\nicholas.j.nedrow\desktop\scripts\results\CSV Files\WinEst_Computers.csv"
What I need help with is the following:
How do I get each result to write to the appropriate line (i.e. computer name, online, file exists?), or would it be easier to do one column at a time:
--Write all PCs to Column A
--Ping each machine and record the results in Column B
--Search each machine for the .exe and record the results.
Any suggestions? Sorry I keep changing things. Just trying to figure out the best way to do this.
You are using the foreach statement, which has the syntax foreach ($itemVariable in $collectionVariable) { }. If $computers is your collection, then your current item cannot also be $computers inside your foreach.
Get-Item does not return a property computerName. Therefore you cannot explicitly select it with Select-Object. However, you can use a calculated property to add a new property to the custom object that Select-Object outputs.
If your CSV file has a row of header(s), it is simpler to use Import-Csv to read the file. If it is just a list of computer names, then Get-Content works well.
If you are searching for a single file and you know the exact path, then just stick with -Path or -LiteralPath and forget -Include. -Include is not intuitive and isn't explained well in the online documentation.
If you are piping output to Export-Csv using a single pipeline, there's no need for -Append unless you already have an existing CSV with data you want to retain. However, if you choose to pipe to Export-Csv during each loop iteration, -Append would be necessary to retain the output.
Here is some updated code using the recommendations:
$computers = Get-content "C:\users\nicholas.j.nedrow\desktop\scripts\lists\ComputerList.txt"
$output = foreach ($comp in $computers) {
    Get-Item -Path "\\$comp\c$\program files (x86)\WinEst.exe" -ErrorAction SilentlyContinue |
        Select @{n='ComputerName';e={$comp}},Name
}
$output | Export-Csv -Path "C:\users\nicholas.j.nedrow\desktop\scripts\results\CSV Files\WinEst_Computers.csv" -NoType
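If you also want the three-column layout described in the question (Computer Name | Online (Y/N) | Does File Exist (Y/N)), one way is to emit a [PSCustomObject] per computer. This is only a rough sketch along those lines, not tested against your environment; the paths and column names come from the question, everything else is an assumption:
$computers = Get-Content "C:\users\nicholas.j.nedrow\desktop\scripts\lists\ComputerList.txt"
$output = foreach ($comp in $computers) {
    # one ping to decide the Online column
    $online = Test-Connection -ComputerName $comp -Count 1 -Quiet -ErrorAction SilentlyContinue
    # only check for the file when the machine responded
    $fileExists = $false
    if ($online) {
        $fileExists = Test-Path -Path "\\$comp\c$\program files (x86)\WinEst.exe"
    }
    $onlineYN = if ($online) { 'Y' } else { 'N' }
    $existsYN = if ($fileExists) { 'Y' } else { 'N' }
    [PSCustomObject]@{
        'Computer Name'         = $comp
        'Online (Y/N)'          = $onlineYN
        'Does File Exist (Y/N)' = $existsYN
    }
}
$output | Export-Csv -Path "C:\users\nicholas.j.nedrow\desktop\scripts\results\CSV Files\WinEst_Computers.csv" -NoTypeInformation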

How to select [n] Items from a CSV list to assign them to a variable and afterwards remove those items and save the file using PowerShell

I'm parsing a CSV file to get the names of folders which I need to copy to another location. Because there are hundreds of them, I need to select the first 10 or so and run the copy routine, and to avoid copying them again I remove them from the list and save the file.
I'll run this as a daily scheduled task to avoid having to wait for the folders to finish copying. I'm having a problem using the 'Select' and 'Skip' options in the code (see below): if I remove those lines the folders are copied (I'm using empty folders to test), but if I leave them in, nothing happens when I run this in PowerShell.
I looked around in other questions about similar issues but did not find anything that answers this particular issue selecting and skipping rows in the CSV.
$source_location = 'C:\Folders to Copy'
$folders_Needed = gci $source_location
Set-Location -Path $source_location
$Dest = 'C:\Transferred Folders'
$csv_name = 'C:\List of Folders.csv'
$csv_Import = Get-Content $csv_name
foreach ($csv_n in $csv_Import | Select-Object -First 3) {
    foreach ($folder_Tocopy in $folders_Needed) {
        if ("$folder_Tocopy" -contains "$csv_n") {
            Copy-Item -Path $folder_Tocopy -Destination $Dest -Recurse -Verbose
        }
    }
    $csv_Import | Select-Object -Skip 3 | Out-File -FilePath $csv_name
}
It should work with Skip/First as in your example, but I cannot really test it without your sample data. Also, it seems wrong that you write the same output to the csv file at every iteration of the loop. And I assume it's not really a csv file but just a plain text file, a list of folders? Just folder names or full paths? (I assume the former.)
Anyway, here is my suggested update to the script (see comments):
$source_location = 'C:\Folders to Copy'
$folders_Needed = Get-ChildItem $source_location
$Dest = 'C:\Transferred Folders'
$csv_name = 'C:\List of Folders.csv'
$csv_Import = @(Get-Content $csv_name)
# optional limit
# set this to $csv_Import.Count if you want to copy all folders
$limit = 10
# loop over the csv entries
for ($i = 0; $i -lt $csv_Import.Count -and $i -lt $limit; $i++) {
    # current line in the csv file
    $csv_n = $csv_Import[$i]
    # copy the folder(s) whose name matches the csv entry
    $folders_Needed | where {$_.Name -eq $csv_n} | Copy-Item -Destination $Dest -Recurse -Verbose
    # update the csv file (skip all processed entries)
    $csv_Import | Select-Object -Skip ($i + 1) | Out-File -FilePath $csv_name
}
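If the per-iteration rewrite of the list file bothers you, a rough alternative sketch is to copy one batch selected with Select-Object -First and rewrite the list once at the end (same variables as above; untested against your data):
# copy the first $limit entries as one batch
$batch = $csv_Import | Select-Object -First $limit
foreach ($csv_n in $batch) {
    $folders_Needed | where {$_.Name -eq $csv_n} | Copy-Item -Destination $Dest -Recurse -Verbose
}
# rewrite the list once, with the processed entries removed
$csv_Import | Select-Object -Skip $limit | Out-File -FilePath $csv_name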

Compress File per file, same name

I hope you are all safe in this time of COVID-19.
I'm trying to generate a script that goes to the directory and compresses each file to .zip with the same name as the file, for example:
sample.txt -> sample.zip
sample2.txt -> sample2.zip
but I'm having difficulties. I'm not that used to PowerShell; I'm learning and improving this script. In the end it will be a script that deletes files older than X days, compresses each file, and uploads them via FTP. The part that deletes files older than X days I've already managed; now I'm stuck on this part.
This is my latest attempt:
param
(
    #Future accept input
    [string] $InputFolder,
    [string] $OutputFolder
)

#test folder
$InputFolder = "C:\Temp\teste"
$OutputFolder = "C:\Temp\teste"

$Name2 = Get-ChildItem $InputFolder -Filter '*.csv' | select Name
Set-Variable SET_SIZE -option Constant -value 1
$i = 0
$zipSet = 0

Get-ChildItem $InputFolder | ForEach-Object {
    $zipSetName = ($Name2[1]) + ".zip "
    Compress-Archive -Path $_.FullName -DestinationPath "$OutputFolder\$zipSetName"
    $i++;
    $Name2++
    if ($i -eq $SET_SIZE) {
        $i = 0;
        $zipSet++;
    }
}
You can simplify things a bit. It looks like most of the issues are because, in your script example, $Name2 will contain a different set of items than Get-ChildItem $InputFolder returns in the loop (i.e. it may contain objects other than .csv files).
The best way to deal with this is to use variables holding the full file objects (i.e. you don't need to use | select Name). So I get all the CSV file objects right away and store them in the variable $CsvFiles.
We can additionally use the special variable $_ inside the ForEach-Object, which represents the current object. We can also use $_.BaseName to get the name without the extension (assuming that's what you want; otherwise use $_.Name to get a zip named like xyz.csv.zip).
So a simplified version of the code can be:
$InputFolder= "C:\Temp\teste"
$OutputFolder="C:\Temp\teste"
#Get files to process
$CsvFiles = Get-ChildItem $InputFolder -Filter '*.csv'
#loop through all files to zip
$CsvFiles | ForEach-Object {
    $zipSetName = $_.BaseName + ".zip"
    Compress-Archive -Path $_.FullName -DestinationPath "$OutputFolder\$zipSetName"
}
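And if you later want to pass the folders in, as the param() block in your attempt suggests, the same loop can be wrapped as a parameterized script. A sketch only, with your test folders as default values (untested):
param
(
    [string] $InputFolder = 'C:\Temp\teste',
    [string] $OutputFolder = 'C:\Temp\teste'
)

# zip every csv file in $InputFolder into $OutputFolder, one zip per file
Get-ChildItem -Path $InputFolder -Filter '*.csv' | ForEach-Object {
    $zipSetName = $_.BaseName + '.zip'
    Compress-Archive -Path $_.FullName -DestinationPath (Join-Path $OutputFolder $zipSetName)
}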

PowerShell script to execute if threshold exceeded

First off, sorry for the long post - I'm trying to be detailed!
I'm looking to automate a workaround for an issue I discovered. I have a worker process that periodically bombs once the "working" directory has more than 100,000 files in it. Preventatively, I can stop the process, rename the working directory to "HOLD", and create a new working dir to keep it going. Then I move files from the HOLD folder(s) back into the working dir a little at a time until it's caught up.
What I would like to do is automate the entire process via Task Scheduler with 2 PowerShell scripts.
----SCRIPT 1----
Here's the condition:
If file count in working dir is greater than 60,000
I find that [System.IO.Directory]::EnumerateFiles($Working) is faster than Get-ChildItem.
The actions:
Stop-Service for Service1, Service2, Service3
Rename-Item -Path "C:\Prod\Working\" -NewName "Hold" (or "Hold1", "Hold2", "Hold3", etc. if the folder already exists). I'm not particular about the numbering as long as it is consistent, so if it's easier to let the system name it HOLD, HOLD(1), HOLD(2), etc., or to append the date after HOLD, then that's fine.
New-Item C:\Prod\Working -type directory
Start-Service Service1, Service2, Service3
---SCRIPT 2----
Condition:
If file count in working dir is less than 50,000
Actions:
Move 5,000 files from the HOLD* folder(s) -- move 5k files from the HOLD folder until it is empty, then skip the empty folder and start moving files from HOLD1. This process should be dynamic and repeat for the next folders.
Before it comes up, I'm well aware it would be easier to simply move the files from the working folder to a Hold folder, but the size of the files can be very large and moving them always seems to take much longer.
I greatly appreciate any input and I'm eager to see some solid answers!
EDIT
Here's what I'm running for Script 2 (courtesy of Bacon):
#Setup
$restoreThreshold = 30000; # Ensure there's enough room so that restoring $restoreBatchSize
$restoreBatchSize = 500;   # files won't push $Working's file count above $restoreThreshold
$Working = "E:\UnprocessedTEST\"
$HoldBaseDirectory = "E:\"

while (@(Get-ChildItem -File -Path $Working).Length -lt $restoreThreshold - $restoreBatchSize)
{
    $holdDirectory = Get-ChildItem -Path $HoldBaseDirectory -Directory -Filter '*Hold*' |
        Select-Object -Last 1;

    if ($holdDirectory -eq $null)
    {
        # There are no Hold directories to process; don't keep looping
        break;
    }

    # Restore the first $restoreBatchSize files from $holdDirectory and store the count of files restored
    $restoredCount = Get-ChildItem $holdDirectory -File |
        Select-Object -First $restoreBatchSize |
        Move-Item -Destination $Working -PassThru |
        Measure-Object |
        Select-Object -ExpandProperty 'Count';

    # If less than $restoreBatchSize files were restored then $holdDirectory is now empty; delete it
    if ($restoredCount -lt $restoreBatchSize)
    {
        Remove-Item -Path $holdDirectory;
    }
}
The first script could look like this:
$rotateThreshold = 60000;

$isThresholdExceeded = @(
    Get-ChildItem -File -Path $Working `
        | Select-Object -First ($rotateThreshold + 1) `
).Length -gt $rotateThreshold;
#Alternative: $isThresholdExceeded = @(Get-ChildItem -File -Path $Working).Length -gt $rotateThreshold;

if ($isThresholdExceeded)
{
    Stop-Service -Name 'Service1', 'Service2', 'Service3';
    try
    {
        $newName = 'Hold_{0:yyyy-MM-ddTHH-mm-ss}' -f (Get-Date);
        Rename-Item -Path $Working -NewName $newName;
    }
    finally
    {
        New-Item -ItemType Directory -Path $Working -ErrorAction SilentlyContinue;
        Start-Service -Name 'Service1', 'Service2', 'Service3';
    }
}
The reason for assigning $isThresholdExceeded the way I am is because we don't care what the exact count of files is, just whether it's above or below that threshold. As soon as we know that threshold has been exceeded we don't need any further results from Get-ChildItem (or the same for [System.IO.Directory]::EnumerateFiles($Working)), so as an optimization Select-Object will terminate the pipeline on the element after the threshold is reached. In a directory with 100,000 files on an SSD I found this to be almost 40% faster than allowing Get-ChildItem to enumerate all files (4.12 vs. 6.72 seconds). Other implementations using foreach or ForEach-Object proved to be slower than @(Get-ChildItem -File -Path $Working).Length.
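If you prefer the [System.IO.Directory]::EnumerateFiles($Working) approach mentioned in the question, the same early-termination trick applies, since EnumerateFiles streams its results lazily. A sketch only, which I have not benchmarked, so treat the combination as an assumption:
$rotateThreshold = 60000;
$isThresholdExceeded = @(
    [System.IO.Directory]::EnumerateFiles($Working) |
        Select-Object -First ($rotateThreshold + 1)
).Length -gt $rotateThreshold;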
As for generating the new name for the 'Hold' directories, you could save and update an identifier somewhere, or just generate new names with an incrementing suffix until you find one that's not in use. I think it's easier to just base the name on the current time. As long as the script doesn't run more than once a second you'll know the name is unique, they'll sort just as well as numerals, plus it gives you a little diagnostic information (the time that directory was rotated out) for free.
Here's some basic code for the second script:
$restoreThreshold = 50000;
$restoreBatchSize = 5000;

# Ensure there's enough room so that restoring $restoreBatchSize
# files won't push $Working's file count above $restoreThreshold
while (@(Get-ChildItem -File -Path $Working).Length -lt $restoreThreshold - $restoreBatchSize)
{
    $holdDirectory = Get-ChildItem -Path $HoldBaseDirectory -Directory -Filter 'Hold_*' `
        | Select-Object -First 1;

    if ($holdDirectory -eq $null)
    {
        # There are no Hold directories to process; don't keep looping
        break;
    }

    # Restore the first $restoreBatchSize files from $holdDirectory and store the count of files restored
    $restoredCount = Get-ChildItem -File -Path $holdDirectory.FullName `
        | Select-Object -First $restoreBatchSize `
        | Move-Item -Destination $Working -PassThru `
        | Measure-Object `
        | Select-Object -ExpandProperty 'Count';

    # If less than $restoreBatchSize files were restored then $holdDirectory is now empty; delete it
    if ($restoredCount -lt $restoreBatchSize)
    {
        Remove-Item -Path $holdDirectory.FullName;
    }
}
As noted in the comment before the while loop, the condition ensures that the count of files in $Working is at least $restoreBatchSize files away from $restoreThreshold, so that restoring $restoreBatchSize files won't exceed the threshold in the process. If you don't care about that, or the chosen threshold already accounts for that, you can change the condition to compare against $restoreThreshold instead of $restoreThreshold - $restoreBatchSize. Alternatively, leave the condition the same and change $restoreThreshold to 55000.
The way I've written the loop, on each iteration at most $restoreBatchSize files will be restored from the first 'Hold_*' directory it finds, then the file count in $Working is reevaluated. Considering that, as I understand it, there are files being added and removed from $Working external to this script and simultaneous to its execution, this might be the safest approach and also the simplest approach. You could certainly enhance this by calculating how far below $restoreThreshold you are and performing the necessary number of batch restores, from one or more 'Hold_*' directories, all in one iteration of the loop.
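A rough sketch of that enhancement, which restores the whole deficit in one pass and then cleans up any 'Hold_*' directories it emptied (untested; it assumes the same $Working, $HoldBaseDirectory and $restoreThreshold variables as above):
$deficit = $restoreThreshold - @(Get-ChildItem -File -Path $Working).Length;
if ($deficit -gt 0)
{
    # pull files from as many Hold_* directories as needed; timestamped names sort oldest first
    Get-ChildItem -Path $HoldBaseDirectory -Directory -Filter 'Hold_*' |
        Get-ChildItem -File |
        Select-Object -First $deficit |
        Move-Item -Destination $Working;

    # remove any Hold_* directories that are now empty
    Get-ChildItem -Path $HoldBaseDirectory -Directory -Filter 'Hold_*' |
        Where-Object { @(Get-ChildItem -File -Path $_.FullName).Length -eq 0 } |
        Remove-Item;
}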