PowerShell, more efficient way to find duplicate folders - powershell

I wrote a little function to scan each folder in $PSModulePath to see if duplicate folder names exist across the various paths there (I've found this problem happening quite often in my PowerShell environments!). The logic is simple, and I was wondering whether some PowerShell gurus have more compact / faster / more efficient ways to achieve a sweep like this (I quite often find that those better at PowerShell seem to have 2-line solutions to something that takes me 15 lines! :-) ).
I'm just taking a path in $PSModulePath and creating an array of the subfolder names there, then looking at the subfolders of the other paths in $PSModulePath, comparing them one by one against the array I made for the first path, and then repeating for the other paths.
function Find-ModuleDuplicates {
    $hits = ""
    $ModPaths = $env:PSModulePath -Split ";" -replace "\\+$", "" | sort
    foreach ($i in $ModPaths) {
        foreach ($j in $ModPaths) {
            if ($j -notlike "*$i*") {
                $arr_i = (gci $i -Dir).Name
                $arr_j = (gci $j -Dir).Name
                foreach ($x in $arr_j) {
                    if ($arr_i -contains $x) {
                        $hits += "Module '$x' in '$i' has a duplicate`n"
                    }
                }
            }
        }
    }
    if ($hits -ne "") { echo "" ; echo $hits }
    else { "`nNo duplicate Module folders were found`n" }
}

The following is a solution using Group-Object.
$env:PSModulePath.Split(";") | gci -Directory | group Name |
where Count -gt 1 | select Count,Name,#{ n = "ModulePath"; e = { $_.Group.Parent.FullName } }
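The same pipeline spelled out with full cmdlet and parameter names, in case the aliases are unfamiliar (a sketch; it assumes $env:PSModulePath is the usual semicolon-delimited list of existing folders):
$env:PSModulePath -split ';' |
    Get-ChildItem -Directory |
    Group-Object -Property Name |
    Where-Object { $_.Count -gt 1 } |
    Select-Object Count, Name, @{ Name = 'ModulePath'; Expression = { $_.Group.Parent.FullName } }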

Related

Outputs and an easy algorithm question (object-oriented)

$a = dir
foreach ($file in $a)
{
    # Hopefully this works; it's supposed to (ideally) print every other file
    if (($file.index % 2) -eq 0)
    {
        Write-Host $file.name
    }
}
The ($file.index % 2) -eq 0 test... I'm not sure that actually prints every other file. I don't know exactly how the files are numbered, or how you associate a number with a file. Do you treat every file as an object and number them, and then test against the number attached to each file?
I'm fairly new to this; I'm used to HTML and CSS.
If you have a more proficient answer, I'm open to that too.
Your script almost works.
Removed alias for dir, and sorted results as requested.
The -File switch for Get-ChildItem excludes folders. I guess that's what you want, but remove it otherwise.
Since there's no easy way to get the current position inside foreach, I used a for loop instead, but it's the same idea. If you want to try it with foreach, you could set a Boolean variable to $true and flip it with -not (!) on each iteration (a sketch of that approach follows the code below).
$Path = 'C:\yourpath'
$Files = Get-ChildItem -Path $Path -File |
    Sort-Object -Property 'Name' -Descending
for ($i = 0; $i -lt $Files.Count; $i++) {
    if ($i % 2 -eq 0) {
        Write-Host $Files[$i].Name
    }
}
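If you do want the foreach variant with the Boolean toggle mentioned above, a minimal sketch could look like this (it reuses the same $Files collection):
$print = $true
foreach ($File in $Files) {
    if ($print) {
        Write-Host $File.Name
    }
    # flip the toggle so every other file is skipped
    $print = -not $print
}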
If you're using this output further, it's highly recommended to write results to an object rather than the console window.
Why not simply use a for loop and increment the index counter by 2?
for ($i = 0; $i -lt $a.Count; $i += 2) {
    Write-Host $a[$i].Name
}

Merging / counting unique paths in PowerShell

I am trying to figure out a way to merge all the parent directory paths into one path, so imagine I have this data in a txt file:
\BANANA\APPLE\BERRIES\GRAPES\
\BANANA\APPLE\BERRIES\
\BANANA\APPLE\BERRIES\GRAPES\PEACH\
\BANANA\APPLE\
\BANANA\
\BANANA\APPLE\BERRIES\GRAPES\PEACH\AVOCADO\
I want the output of my loop to be just:
\BANANA\APPLE\BERRIES\GRAPES\PEACH\AVOCADO\
Because it is the longest path containing all the other previous paths.
But I am trying to write a loop over all the unique paths in a file that contains all the parent folders, as follows:
rm UNIQUE_PATHS.txt
# "LINE:"+$line
$count = 0
foreach ($line in gc COUNT_DIR.txt){
    foreach ($line2 in gc COUNT_DIR.txt){
        # $line -contains $checking
        if ($line2.contains($line2)) {
            "COMPARING:"+$line2+" AND "+$line
            $count = $count+1
        }
        if ($count -eq 1){
            $line+$count >> UNIQUE_PATHS.txt
        }
    }
}
cat UNIQUE_PATHS.txt
So it looks like my count of the unique paths is not working. What would be a better script for this?
like this?
$Content = Get-Content "C:\temp\COUNT_DIR.txt"
$Content | %{
    $Current = $_
    $Founded = $Content | where { $_ -ne $Current -and $_.contains($Current) } | select -First 1
    if ($Founded -eq $null)
    {
        $Current
    }
}
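With the sample data above, only \BANANA\APPLE\BERRIES\GRAPES\PEACH\AVOCADO\ should come out, because every other line is contained in at least one longer line and is therefore filtered away.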

Why Isn't This Counting Correctly | PowerShell

Right now, I have a CSV file which contains 3,800+ records. This file contains a list of server names, followed by an abbreviation stating if the server is a Windows server, Linux server, etc. The file also contains comments or documentation, where each line starts with "#", stating it is a comment. What I have so far is as follows.
$file = Get-Content .\allsystems.csv
$arraysplit = @()
$arrayfinal = @()
[int]$windows = 0
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing.Split(":")
        $arrayfinal = @($arraysplit[0], $arraysplit[1])
    }
}
foreach ($item in $arrayfinal){
    if ($item[1] -contains 'NT'){
        $windows++
    }
    else {
        continue
    }
}
$windows
The goal of this script is to count the total number of Windows servers. My issue is that the first "foreach" block works fine, but the second one results in "$Windows" being 0. I'm honestly not sure why this isn't working. Two example lines of data are as follows:
example:LNX
example2:NT
If the goal is to count the Windows servers, why do you need the array?
Can't you just say something like:
foreach ($thing in $file)
{
    if ($thing -notmatch "^#" -and $thing -match "NT") { $windows++ }
}
$arrayfinal = @($arraysplit[0], $arraysplit[1])
This replaces the array on every iteration.
Changing it to += gave another issue: it simply appended each individual element. I used this post's info to fix it, essentially forcing a jagged (2D) array: How to create array of arrays in powershell?.
$file = Get-Content .\allsystems.csv
$arraysplit = @()
$arrayfinal = @()
[int]$windows = 0
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing.Split(":")
        $arrayfinal += ,$arraysplit
    }
}
foreach ($item in $arrayfinal){
    if ($item[1] -contains 'NT'){
        $windows++
    }
    else {
        continue
    }
}
$windows
1
I also changed the file around and added more instances of both NT and other random garbage. Seems it works fine.
I'd avoid making another foreach loop just to bump the occurrence count. Your $arrayfinal is also overwritten every time, so I used an ArrayList instead.
$file = Get-Content "E:\Code\PS\myPS\2018\Jun\12\allSystems.csv"
$arrayFinal = New-Object System.Collections.ArrayList($null)
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing -split ":"
        if($arraysplit[1] -match "NT" -or $arraysplit[1] -match "Windows")
        {
            $arrayfinal.Add($arraysplit[1]) | Out-Null
        }
    }
}
Write-Host "Entries with 'NT' or 'Windows' $($arrayFinal.Count)"
I'm not sure whether you want to keep 'Example', 'example2'... so I have skipped adding them to $arrayfinal, assuming the goal is to count "NT" or "Windows" occurrences.
The goal of this script is to count the total number of Windows servers.
I'd suggest the easy way: using cmdlets built for this.
$csv = Get-Content -Path .\file.csv |
    Where-Object { -not $_.StartsWith('#') } |
    ConvertFrom-Csv
@($csv.servertype).Where({ $_.Equals('NT') }).Count
# Compatibility mode:
# ($csv.servertype | Where-Object { $_.Equals('NT') }).Count
Replace servertype and 'NT' with whatever that header/value is called.
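If you also want a breakdown by server type rather than just the Windows total, Group-Object can produce the counts in one step (a sketch, assuming the same servertype header as above):
$csv | Group-Object -Property servertype | Select-Object Name, Count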

While loop does not produce pipeline output

It appears that a While loop does not produce an output that can continue in the pipeline. I need to process a large (many GiB) file. In this trivial example, I want to extract the second field, sort on it, then get only the unique values. What am I not understanding about the While loop and pushing things through the pipeline?
In the *NIX world this would be a simple:
cut -d "," -f 2 rf.txt | sort | uniq
In PowerShell this is not quite as simple.
The source data.
PS C:\src\powershell> Get-Content .\rf.txt
these,1,there
lines,3,paragraphs
are,2,were
The script.
PS C:\src\powershell> Get-Content .\rf.ps1
$sr = New-Object System.IO.StreamReader("$(Get-Location)\rf.txt")
while ($line = $sr.ReadLine()) {
    Write-Verbose $line
    $v = $line.split(',')[1]
    Write-Output $v
} | sort
$sr.Close()
The output.
PS C:\src\powershell> .\rf.ps1
At C:\src\powershell\rf.ps1:7 char:3
+ } | sort
+ ~
An empty pipe element is not allowed.
+ CategoryInfo : ParserError: (:) [], ParseException
+ FullyQualifiedErrorId : EmptyPipeElement
You're making it a bit more complicated than it needs to be. You have a CSV without headers. The following should work:
Import-Csv .\rf.txt -Header f1,f2,f3 | Select-Object -ExpandProperty f2 -Unique | Sort-Object
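Since the goal is sorted unique values anyway, Sort-Object can also handle the de-duplication (a sketch, with the same assumed headers):
Import-Csv .\rf.txt -Header f1,f2,f3 | Select-Object -ExpandProperty f2 | Sort-Object -Unique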
Nasir's workaround looks like the way to go here.
If you want to know what was going wrong in your code, the answer is that while loops (and do/while/until loops) don't consistently return values to the pipeline the way that other statements in PowerShell do (actually that is true, and I'll keep the examples of that, but scroll down for the real reason it wasn't working for you).
ForEach-Object -- a cmdlet, not a built-in language feature/statement; does return objects to the pipeline.
1..3 | % { $_ }
foreach -- statement; does return.
foreach ($i in 1..3) { $i }
if/else -- statement; does return.
if ($true) { 1..3 }
for -- statement; does return.
for ( $i = 0 ; $i -le 3 ; $i++ ) { $i }
switch -- statement; does return.
switch (2)
{
    1 { 'one' }
    2 { 'two' }
    3 { 'three' }
}
But for some reason, these other loops seem to act unpredictably.
Loops forever, repeatedly returning $i (which stays 0; no incrementing going on):
$i = 0; while ($i -le 3) { $i }
Returns nothing, but $i does get incremented:
$i = 0; while ($i -le 3) { $i++ }
If you wrap the expression in parentheses, it does get returned:
$i = 0; while ($i -le 3) { ($i++) }
But as it turns out (I'm learning a bit as I go here), while's strange return semantics have nothing to do with your error; you just can't pipe statements into functions/cmdlets, regardless of their return value.
foreach ($i in 1..3) { $i } | measure
will give you the same error.
You can "get around" this by making the entire statement a sub-expression with $():
$( foreach ($i in 1..3) { $i } ) | measure
That would work for you in this case. Or, in your while loop, instead of using Write-Output you could just add each item to an array and sort it afterwards:
$arr = @()
while ($line = $sr.ReadLine()) {
    Write-Verbose $line
    $v = $line.split(',')[1]
    $arr += $v
}
$arr | sort
I know you're dealing with a large file here, so maybe you're thinking that by piping to sort line by line you'll be avoiding a large memory footprint. In many cases piping does work that way in PowerShell, but the thing about sorting is that you need the whole set to sort it, so the Sort-Object cmdlet will be collecting every item you pass to it anyway and then do the actual sorting at the end; I'm not sure you can avoid that at all. Admittedly, letting Sort-Object do that instead of building the array yourself might be more efficient depending on how it's implemented, but I don't think you'll be saving much on RAM.
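If you'd rather keep the StreamReader and still pipe straight into Sort-Object, one option (a sketch under the same assumptions as the original script) is to wrap the loop in a script block, since a called script block can sit in a pipeline even though a bare while statement cannot:
$sr = New-Object System.IO.StreamReader("$(Get-Location)\rf.txt")
& {
    while ($line = $sr.ReadLine()) {
        # emit just the second field of each line
        $line.Split(',')[1]
    }
} | Sort-Object -Unique
$sr.Close()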
Another solution:
Get-Content -Path C:\temp\rf.txt | select @{ Name = "Mycolumn"; Expression = { ($_ -split ",")[1] } } | select Mycolumn -Unique | sort

File Output in Powershell without Extension

Here is what I have so far:
Get-ChildItem "C:\Folder" | Foreach-Object {$_.Name} > C:\Folder\File.txt
When you open the output from above, File.txt, you see this:
file1.txt
file2.mpg
file3.avi
file4.txt
How do I get the output so it drops the extension and only shows this:
file1
file2
file3
file4
Thanks in advance!
EDIT
Figured it out with the help of the fellows below me. I ended up using:
Get-ChildItem "C:\Folder" | Foreach-Object {$_.BaseName} > C:\Folder\File.txt
Get-ChildItem "C:\Folder" | Select BaseName > C:\Folder\File.txt
Pass the file name to the GetFileNameWithoutExtension method to remove the extension:
Get-ChildItem "C:\Folder" | `
ForEach-Object { [System.IO.Path]::GetFileNameWithoutExtension($_.Name) } `
> C:\Folder\File.txt
I wanted to comment on @MatthewMartin's answer, which splits the incoming file name on the . character and returns the first element of the resulting array. This will work for names with zero or one ., but produces incorrect results for anything else:
file.ext1.ext2 yields file
powershell.exe is good for me. Let me explain to thee..doc yields powershell
The reason is that it returns everything before the first '.', when it should really be everything before the last '.'. To fix this, once we have the name split into segments separated by '.', we take every segment except the last and join them back together with '.' as the separator. If the name does not contain a '.' at all, we return the entire name.
ForEach-Object {
    $segments = $_.Name.Split('.')
    if ($segments.Length -gt 1) {
        $segmentsExceptLast = $segments | Select-Object -First ($segments.Length - 1)
        return $segmentsExceptLast -join '.'
    } else {
        return $_.Name
    }
}
A more efficient approach would be to walk backwards through the name character-by-character. If the current character is a ., return the name up to but not including the current character. If no . is found, return the entire name.
ForEach-Object {
    $name = $_.Name;
    for ($i = $name.Length - 1; $i -ge 0; $i--) {
        if ($name[$i] -eq '.') {
            return $name.Substring(0, $i)
        }
    }
    return $name
}
The [String] class already provides a method to do this same thing, so the above can be reduced to...
ForEach-Object {
    $i = $_.Name.LastIndexOf([Char] '.');
    if ($i -lt 0) {
        return $_.Name
    } else {
        return $_.Name.Substring(0, $i)
    }
}
All three of these approaches will work for names with zero, one, or multiple . characters, but, of course, they're a lot more verbose than the other answers. In fact, LastIndexOf() is what GetFileNameWithoutExtension() uses internally, while BaseName is functionally the same as calling $_.Name.Substring(), except that it takes advantage of the already-computed extension.
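A quick console check of the multi-dot case (illustrative only) shows that both the static method and the manual LastIndexOf() approach strip just the final extension:
PS> [System.IO.Path]::GetFileNameWithoutExtension('file.ext1.ext2')
file.ext1
PS> 'file.ext1.ext2'.Substring(0, 'file.ext1.ext2'.LastIndexOf('.'))
file.ext1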
And now for a FileInfo version, since everyone else beat me to a Path.GetFileNameWithoutExtension solution.
Get-ChildItem "C:\" | `
where { ! $_.PSIsContainer } | `
Foreach-Object {([System.IO.FileInfo]($_.Name)).Name.Split('.')[0]}
(ls).BaseName > C:\Folder\File.txt
Use the BaseName property instead of the Name property:
Get-ChildItem "C:\Folder" | Select-Object BaseName > C:\Folder\File.txt