Merging and counting unique paths in PowerShell

I am trying to figure out a way to merge all the parent directory paths into one path. Imagine I have this data in a txt file:
\BANANA\APPLE\BERRIES\GRAPES\
\BANANA\APPLE\BERRIES\
\BANANA\APPLE\BERRIES\GRAPES\PEACH\
\BANANA\APPLE\
\BANANA\
\BANANA\APPLE\BERRIES\GRAPES\PEACH\AVOCADO\
I want the output of my loop to be just:
\BANANA\APPLE\BERRIES\GRAPES\PEACH\AVOCADO\
Because it is the longest path containing all the other previous paths.
But I am trying to do this with a loop over all the unique paths in a file containing all the parent folders above, as follows:
rm UNIQUE_PATHS.txt
#"LINE:"+$line
$count=0
foreach ($line in gc COUNT_DIR.txt){
    foreach ($line2 in gc COUNT_DIR.txt){
        # $line -contains $checking
        if ($line2.contains($line2)) {
            "COMPARING:"+$line2+" AND "+$line
            $count = $count+1
        }
        if ($count -eq 1){
            $line+$count >> UNIQUE_PATHS.txt
        }
    }
}
cat UNIQUE_PATHS.txt
So it looks like my count of the unique paths is not working. What would be a better script for this?

like this?
$Content = get-content "C:\temp\COUNT_DIR.txt"
$Content | %{
    $Current = $_
    $Founded = $Content | where {$_ -ne $Current -and $_.contains($Current)} | select -First 1
    if($Founded -eq $null)
    {
        $Current
    }
}
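A variant of the same idea, as a rough sketch (the input file path and the case-insensitive comparison are assumptions): sort the paths by length, longest first, and keep only those that are not a prefix of a path already kept. For the sample data this leaves only \BANANA\APPLE\BERRIES\GRAPES\PEACH\AVOCADO\.
# Sketch only: keep just the deepest ("leaf") paths from COUNT_DIR.txt
$paths  = Get-Content 'C:\temp\COUNT_DIR.txt' | Sort-Object Length -Descending
$leaves = [System.Collections.Generic.List[string]]::new()
foreach ($p in $paths) {
    $covered = $false
    foreach ($kept in $leaves) {
        # if a longer path we already kept starts with $p, then $p is just a parent folder
        if ($kept.StartsWith($p, [System.StringComparison]::OrdinalIgnoreCase)) { $covered = $true; break }
    }
    if (-not $covered) { $leaves.Add($p) }
}
$leaves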

Related

How to find a unique line in a txt file?

I have a LARGE list of hashes. I need to find out which ones only appear once as most are duplicates.
EX: the last line 238db2..... only appears once
ac6b51055fdac5b92934699d5b07db78
ac6b51055fdac5b92934699d5b07db78
7f5417a85a63967d8bba72496faa997a
7f5417a85a63967d8bba72496faa997a
1e78ba685a4919b7cf60a5c60b22ebc2
1e78ba685a4919b7cf60a5c60b22ebc2
238db202693284f7e8838959ba3c80e8
I tried the following, but it just listed one of each of the duplicates instead of identifying the line that only appears once:
foreach ($line in (Get-Content "C:\hashes.txt" | Select-Object -Unique)) {
    Write-Host "Line '$line' appears $(($line | Where-Object {$_ -eq $line}).count) time(s)."
}
You could use a Hashtable and a StreamReader.
The StreamReader reads the file line by line, and the Hashtable stores each line as Key, with a Value of $true (if the line is a duplicate) or $false (if it is unique).
$reader = [System.IO.StreamReader]::new('D:\Test\hashes.txt')
$hash = @{}
while($null -ne ($line = $reader.ReadLine())) {
    $hash[$line] = $hash.ContainsKey($line)
}
# clean-up the StreamReader
$reader.Dispose()
# get the unique line(s) by filtering for value $false
$result = $hash.Keys | Where-Object {-not $hash[$_]}
Given your example, $result will contain 238db202693284f7e8838959ba3c80e8
Given that you're dealing with a large file, Get-Content is best avoided.
A switch statement with the -File parameter allows efficient line-by-line processing, and given that duplicates appear to be grouped together already, they can be detected by keeping a running count of identical lines.
$count = 0 # keeps track of the count of identical lines occurring in sequence
switch -File 'C:\hashes.txt' {
    default {
        if ($prevLine -eq $_ -or $count -eq 0) { # duplicate or first line.
            if ($count -eq 0) { $prevLine = $_ }
            ++$count
        }
        else { # current line differs from the previous one.
            if ($count -eq 1) { $prevLine } # non-duplicate -> output
            $prevLine = $_
            $count = 1
        }
    }
}
if ($count -eq 1) { $prevLine } # output the last line, if a non-duplicate.
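If the duplicates were not guaranteed to be grouped together, a sorted copy of the file could be fed to the same switch; a minimal sketch, with an assumed path for the temporary copy:
# Hypothetical pre-sort so identical hashes become adjacent before the sequential count runs
Get-Content 'C:\hashes.txt' | Sort-Object | Set-Content 'C:\hashes.sorted.txt'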
$values = Get-Content .\hashes.txt # Read the values from the hashes.txt file
$groups = $values | Group-Object | Where-Object { $_.Count -eq 1 } # Group the values by their distinct values and filter for groups with a single value
foreach ($group in $groups) {
    foreach ($value in $group.Values) {
        Write-Host "$value" # Output the value of each group
    }
}
To handle very large files you could try this.
$chunkSize = 1000 # Set the chunk size to 1000 lines
$lineNumber = 0 # Initialize a line number counter
# Use a do-while loop to read the file in chunks
do {
    # Read the next chunk of lines from the file
    $values = Get-Content .\hashes.txt | Select-Object -Skip $lineNumber -First $chunkSize
    # Group the values by their distinct values and filter for groups with a single value
    $groups = $values | Group-Object | Where-Object { $_.Count -eq 1 }
    foreach ($group in $groups) {
        foreach ($value in $group.Values) {
            Write-Host "$value" # Output the value of each group
        }
    }
    # Increment the line number counter by the chunk size
    $lineNumber += $chunkSize
} while ($values.Count -eq $chunkSize)
Or this
# Create an empty dictionary
$dict = New-Object System.Collections.Hashtable
# Read the file line by line
foreach ($line in Get-Content .\hashes.txt) {
    # Check if the line is already in the dictionary
    if ($dict.ContainsKey($line)) {
        # Increment the value of the line in the dictionary
        $dict.Item($line) += 1
    } else {
        # Add the line to the dictionary with a count of 1
        $dict.Add($line, 1)
    }
}
# Filter the dictionary for values with a count of 1
$singles = $dict.GetEnumerator() | Where-Object { $_.Value -eq 1 }
# Output the values of the single items
foreach ($single in $singles) {
    Write-Host $single.Key
}

Powershell script to search and replace text in a file using two columns in a separate reference file

I want a script that can check for the name of each keyset (column A) in Sample.csv and then replace the current command (column B) with the new command (column C) in the source text file.
CSV file: Sample.csv
A. | B. | C.
Manock | 2B | 2ab
Sterling | 3F | 3sf
Source file text: Source.txt
keyset "Manock"
(
key("SELECT")
command ("display/app=%disapp% "2B")
);
So desired output:
keyset "Manock"
(
key("SELECT")
command ("display/app=%disapp% "2ab")
);
Powershell Script:
New-Item -Path "C:\Users\e076200\Desktop\ks_update\source.txt" -ItemType File -Force
$data = Get-Content C:\Users\e076200\Desktop\ks_update\source.ddl
Add-Content -Value $data -Path "C:\Users\e076200\Desktop\ks_update\source.txt"
$foundline = $false
$a = 0
$Etxt = foreach($line in Get-Content C:\Users\e076200\Desktop\ks_update\source.txt)
{
    if ($line -match 'keyset "Manock"' )
    {
        $a = 0
        $foundline = $true
    }
    $a = $a + 1
    if($line -match "display/app" -and $a -eq 5 -and $foundline -eq $true)
    {
        $line = $line.replace('2b' , '2ab')
        $line
    }
    else
    {
        $line
    }
}
$Etxt | Set-Content C:\Users\e076200\Desktop\ks_update\source.txt -Force
$users = Import-CSV -Path C:\Users\e076200\Desktop\ks_update\sample.csv
I've figured out how to find and replace one line in the file directly. I've also figured out how to import the csv. I need help on how to make the logic parameterized and use column A of CSV as the match piece and column c as the replacement piece.
Script Explanation.
New-Item -Path "C:\Users\e076200\Desktop\ks_update\source.txt" -ItemType File -Force
New-Item creates a new text file at the location defined by -Path, using the name given at the end (source.txt).
-ItemType defines the type of item to create; -Force overwrites any existing file.
$data = Get-Content C:\Users\e076200\Desktop\ks_update\source.ddl
Retrieves ddl and stores in variable.
Add-Content -Value $data -Path "C:\Users\e076200\Desktop\ks_update\source.txt"
Transfers content from variable to new text file created.
$foundline = $false
conditional variable defined for when keyset identifier is found.
$a = 0
counter defined for if statement.
$Etxt = foreach($line in Get-Content C:\Users\e076200\Desktop\ks_update\source.txt)
$Etxt collects the output of the foreach loop.
$line is the variable holding each line of the txt file.
{
if ($line -match 'keyset "Manock"' )
{
$a = 0
$foundline = $true
}
If keyset identifier is found, set counter to 0 and set conditional variable to true
$a= $a + 1
if($line -match "display/app" -and $a -eq 5 -and $foundline -eq $true)
{
$line = $line.replace('2b' , '2ab')
$line
Once the keyset identifier is matched, the counter starts from 0 and increments on each line, up to line 5 where the item to be replaced sits.
For redundancy, the line is also checked for the expected identifier ("display/app") on that line.
If the redundancy check is met and the counter is 5, the word is replaced via the line.replace method.
The overwritten data is returned in $line.
}
else
{
$line
}
Else retain line
}
$Etxt | Set-Content C:\Users\e076200\Desktop\ks_update\source.txt -Force
Updated text file
$users = Import-CSV -Path C:\Users\e076200\Desktop\ks_update\sample.csv
Imports Reference csv file
Please make explanation as dumbed down as possible. Thank you.
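One possible direction for the parameterized logic, as a rough sketch only: track which keyset block the current line belongs to, look that keyset up in the imported CSV, and swap the column B value for the column C value. The column names A, B and C (and that Sample.csv is comma-delimited) are assumptions based on the sample table above.
# Rough sketch - assumes Sample.csv has headers A, B, C matching the sample table
$map   = Import-Csv -Path C:\Users\e076200\Desktop\ks_update\sample.csv
$lines = Get-Content C:\Users\e076200\Desktop\ks_update\source.txt
$currentKeyset = ''
$Etxt = foreach ($line in $lines) {
    # remember which keyset block we are currently inside
    if ($line -match 'keyset "([^"]+)"') { $currentKeyset = $Matches[1] }
    $row = $map | Where-Object { $_.A -eq $currentKeyset }
    if ($row -and $line -match 'display/app') {
        $line.Replace($row.B, $row.C)   # swap the old command (B) for the new command (C)
    }
    else {
        $line
    }
}
$Etxt | Set-Content C:\Users\e076200\Desktop\ks_update\source.txt -Force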

Powershell/ Print by filename

My English may not be perfect but I do my best.
I'm trying to write a PowerShell script where the file name ends with a number and the file should be printed exactly that many times.
Is this somehow possible?
With the script below it only prints once, for whatever reason.
param (
    [string]$file = "C:\Scans\temp\*.pdf",
    [int]$number_of_copies = 1
)
foreach ($onefile in (Get-ChildItem $file -File)) {
    $onefile -match '\d$' | Out-Null
    for ($i = 1; $i -le [int]$number_of_copies; $i++) {
        cmd /C "lpr -S 10.39.33.204 -P optimidoc ""$($onefile.FullName)"""
    }
}
There is no need for parameter $number_of_copies when the number of times it should be printed is taken from the file's BaseName anyway.
I would change your code to:
param (
    [string]$path = 'C:\Scans\temp'
)
Get-ChildItem -Path $path -Filter '*.pdf' -File |
    # filter only files that end with a number and capture that number in $matches[1]
    Where-Object { $_.BaseName -match '(\d+)$' } |
    # loop through the files and print
    ForEach-Object {
        for ($i = 1; $i -le [int]$matches[1]; $i++) {
            cmd /C "lpr -S 10.39.33.204 -P optimidoc ""$($_.FullName)"""
        }
    }
Inside the ForEach-Object, on each iteration, the $_ automatic variable represents the current FileInfo object.
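So, for a hypothetical file named scan_5.pdf, $matches[1] would be 5 and the lpr command would run five times.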
P.S. Your script prints each file only once because you set parameter $number_of_copies to 1 as default value, but the code never changes that to the number found in the file name.
BTW. Nothing wrong with your English

Why Isn't This Counting Correctly | PowerShell

Right now, I have a CSV file which contains 3,800+ records. This file contains a list of server names, followed by an abbreviation stating if the server is a Windows server, Linux server, etc. The file also contains comments or documentation, where each line starts with "#", stating it is a comment. What I have so far is as follows.
$file = Get-Content .\allsystems.csv
$arraysplit = @()
$arrayfinal = @()
[int]$windows = 0
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing.Split(":")
        $arrayfinal = @($arraysplit[0], $arraysplit[1])
    }
}
foreach ($item in $arrayfinal){
    if ($item[1] -contains 'NT'){
        $windows++
    }
    else {
        continue
    }
}
$windows
The goal of this script is to count the total number of Windows servers. My issue is that the first "foreach" block works fine, but the second one results in "$Windows" being 0. I'm honestly not sure why this isn't working. Two example lines of data are as follows:
example:LNX
example2:NT
If the goal is to count the Windows servers, why do you need the array?
Can't you just say something like this?
foreach ($thing in $file)
{
    if ($thing -notmatch "^#" -and $thing -match "NT") { $windows++ }
}
$arrayfinal = @($arraysplit[0], $arraysplit[1])
This replaces the array for every run.
Changing it to += gave another issue. It simply appended each individual element. I used this post's info to fix it, sort of forcing a 2d array: How to create array of arrays in powershell?.
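As a quick illustration of that unary-comma trick (a minimal sketch):
$outer = @()
$outer += ,@(1, 2)   # $outer.Count is 1; $outer[0] is the array 1,2
$outer += ,@(3, 4)   # $outer.Count is 2
With that in place, the revised script becomes: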
$file = Get-Content .\allsystems.csv
$arraysplit = @()
$arrayfinal = @()
[int]$windows = 0
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing.Split(":")
        $arrayfinal += ,$arraysplit
    }
}
foreach ($item in $arrayfinal){
    if ($item[1] -contains 'NT'){
        $windows++
    }
    else {
        continue
    }
}
$windows
1
I also changed the file around and added more instances of both NT and other random garbage. Seems it works fine.
I'd avoid making another foreach loop just for bumping the occurrence count. Your $arrayfinal also gets rewritten every time, so I used an ArrayList.
$file = Get-Content "E:\Code\PS\myPS\2018\Jun\12\allSystems.csv"
$arrayFinal = New-Object System.Collections.ArrayList($null)
foreach ($thing in $file){
    if ($thing.StartsWith("#")) {
        continue
    }
    else {
        $arraysplit = $thing -split ":"
        if($arraysplit[1] -match "NT" -or $arraysplit[1] -match "Windows")
        {
            $arrayfinal.Add($arraysplit[1]) | Out-Null
        }
    }
}
Write-Host "Entries with 'NT' or 'Windows' $($arrayFinal.Count)"
I'm not sure if you want to keep 'Example', 'example2'... so I have skipped adding them to arrayfinal, assuming the goal is to count "NT" or "Windows" occurrences.
The goal of this script is to count the total number of Windows servers.
I'd suggest the easy way: using cmdlets built for this.
$csv = Get-Content -Path .\file.csv |
    Where-Object { -not $_.StartsWith('#') } |
    ConvertFrom-Csv
@($csv.servertype).Where({ $_.Equals('NT') }).Count
# Compatibility mode:
# ($csv.servertype | Where-Object { $_.Equals('NT') }).Count
Replace servertype and 'NT' with whatever that header/value is called.
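Applied to the colon-delimited sample data above, that could look like the following sketch (the header names are made up, since the file itself has none):
$csv = Get-Content -Path .\allsystems.csv |
    Where-Object { -not $_.StartsWith('#') } |
    ConvertFrom-Csv -Delimiter ':' -Header 'servername', 'servertype'
@($csv.servertype).Where({ $_ -eq 'NT' }).Count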

Powershell to count columns in a file

I need to test the integrity of file before importing to SQL.
Each row of the file should have the exact same amount of columns.
These are "|" delimited files.
I also need to ignore the first line as it is garbage.
If every row does not have the same number of columns, then I need to write an error message.
I have tried using something like the following with no luck:
$colCnt = "c:\datafeeds\filetoimport.txt"
$file = (Get-Content $colCnt -Delimiter "|")
$file = $file[1..($file.count - 1)]
Foreach($row in $file){
    $row.Count
}
Counting rows is easy. Columns is not.
Any suggestions?
Yep, read the file skipping the first line. For each line, split it on the pipe and count the results. If the count isn't the same as the previous line's, throw an error and stop.
$colCnt = "c:\datafeeds\filetoimport.txt"
[int]$LastSplitCount = $Null
Get-Content $colCnt | ?{$_} | Select -Skip 1 | %{
    if($LastSplitCount -and !($_.split("|").Count -eq $LastSplitCount)){
        "Process stopped at line number $($_.psobject.Properties.value[5]) for column count mis-match."
        break
    }
    elseif(!$LastSplitCount){
        $LastSplitCount = $_.split("|").Count
    }
}
That should do it, and if it finds a bad column count it will stop and output something like:
Process stopped at line number 5 for column count mis-match.
Edit: Added a Where catch to skip blank lines ( ?{$_} )
Edit2: Ok, if you know what the column count should be then this is even easier.
Get-Content $colCnt | ?{$_} | Select -Skip 1 | %{
    if(!($_.split("|").Count -eq 210)){
        "Process stopped at line number $($_.psobject.Properties.value[5]), incorrect column count of: $($_.split("|").Count)."
        break
    }
}
If you want it to return all lines that don't have 210 columns, just remove the break and let it run.
A more generic approach, including a RegEx filter:
$path = "path\to\folder"
$regex = "regex"
$expValue = 450
$files = Get-ChildItem $path | Where-Object {$_.Name -match $regex}
Foreach( $f in $files) {
    $filename = $f.Name
    echo $filename
    $a = Get-Content $f.FullName;
    $i = 1;
    $e = 0;
    echo "Starting...";
    foreach($line in $a)
    {
        if ($line.length -ne $expValue){
            echo $filename
            $a | Measure-Object -Line
            echo "Long:"
            echo $line.Length;
            echo "Line Nº: "
            echo $i;
            $e = $e + 1;
        }
        $i = $i+1;
    }
    echo "Finished";
    if ($e -ne 0){
        echo $e "errors found";
    }else{
        echo "No errors"
        echo ""
    }
}
echo "All files examined"
Another possibility:
$colCnt = "c:\datafeeds\filetoimport.txt"
$DataLine = (Get-Content $colCnt -TotalCount 2)[1]
$DelimCount = ([char[]]$DataLine -eq '|').count
$MatchString = '.*' + ('|.*' * $DelimCount )
$test = Select-String -Path $colCnt -Pattern $MatchString -NotMatch |
where { $_.linenumber -ne 1 }
That will find the number of delimiter characters in the second line, and build a regex pattern that can be used with Select-String.
The -NotMatch switch will make it return any lines that don't match that pattern as MatchInfo objects that will have the filename, line number and content of the problem lines.
Edit: Since the first line is "garbage" you probably don't care if it didn't match so I added a filter to the result to drop that out.
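If you then want to report the problem rows, the MatchInfo objects in $test can be turned into messages; a small follow-up sketch:
if ($test) {
    # each MatchInfo carries the line number and the offending line
    $test | ForEach-Object { Write-Warning "Column count mismatch at line $($_.LineNumber): $($_.Line)" }
}
else {
    "All data rows have the expected number of columns."
}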