Remove text from a dynamic file based on data in a static file - powershell

I have a PowerShell script that generates data that is sent to the file Dynamic.txt. The script generates a list of servers that meet very specific criteria, and the list is then processed. However, about 20 servers meet the criteria that I do not want in the list.
These servers are my static list. I can remove them from the list using the ForEach-Object {$_ -replace "xxx", ""} command, but this is messy and I want cleaner code. How can I remove data from Dynamic.txt based on data in Static.txt?

To remove entries from one text file based on entries in another text file, read both files into variables and write back only the lines that are not in the static list:
$dynamic = Get-Content .\Dynamic.txt
$static = Get-Content .\Static.txt
$dynamic | Where-Object { $static -notcontains $_ } | Set-Content .\Dynamic.txt

You could use the Compare-Object cmdlet.
The Compare-Object cmdlet compares two sets of objects. One set of
objects is the Reference set, and the other set is the Difference set.
Here's some example code.
Contents of colors.txt:
red
green
blue
pink
Contents of notcolors.txt:
green
Command and output:
compare-object (Get-Content "notcolors.txt") (Get-Content "colors.txt") | FL
InputObject : red
SideIndicator : =>
InputObject : blue
SideIndicator : =>
InputObject : pink
SideIndicator : =>
Simply selecting InputObject from the results should give you the correct list of servers.
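As a minimal sketch of that last step, the snippet below uses in-memory arrays in place of the two files (an assumption, so that it runs on its own) and keeps only the entries whose SideIndicator is '=>', that is, the ones found only in the difference set. The variable names are illustrative.

```powershell
# Stand-ins for Get-Content "notcolors.txt" and Get-Content "colors.txt"
$notColors = @('green')
$colors    = @('red', 'green', 'blue', 'pink')

# '=>' marks items found only in the difference set ($colors)
$remaining = Compare-Object -ReferenceObject $notColors -DifferenceObject $colors |
    Where-Object { $_.SideIndicator -eq '=>' } |
    Select-Object -ExpandProperty InputObject

$remaining
```

Filtering on SideIndicator matters when the reference list might contain names that are missing from the other file; without it those names would show up in the result too.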
This is PowerShell, so there are other ways too. You could use a filter somewhere in the script that might go something like this (assuming the objects in the pipeline have a Name property and serverlist.txt holds one server name per line):
...| Where-Object { $_.Name -notin (Get-Content serverlist.txt) } | ...

Say the content of diff.txt should be the difference between fileA.txt and fileB.txt (the lines in fileA.txt that are not in fileB.txt). Then use the code below:
$fileA = 'fileA.txt'
$fileB = 'fileB.txt'
$diff = 'diff.txt'
$fileAContent = Get-Content $fileA -Encoding UTF8
$fileBContent = Get-Content $fileB -Encoding UTF8
$fileAContent | Where-Object { $fileBContent -notcontains $_ } | Set-Content $diff

Related

Powershell CSV removing rows and then remove from whole file if A column matches

I've created the following small script to remove two or more strings from a CSV.
Each row is a log of a given person and an answer they gave.
The CSV has X columns.
The column named FIRST identifies the person.
When I delete a row matching an answer, I also need to delete that person from the whole CSV if any of their rows contained one of the two strings.
What I've made so far removes the rows containing the answers, but the person is still left in the CSV with other answers. I want to remove the person fully if the questions have been answered.
Can somebody help me out with the additions or changes needed to make this happen?
INPUT File
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
SCRIPT
$CSVfile = "C:\Temp\Test\Test.csv"
$CSVfile_filtered = "C:\Temp\Test\Test.csv"
$regex001 = "AA"
$regex002 = "BB"
$filterArray = @($regex001,$regex002)
Get-Content $CSVfile | Select-String -pattern $filterArray -notmatch | Set-Content $CSVfile_filtered
The file should then remove 10005, 10011 and both lines of 10007. But my version only removes one of the 10007 since it only matches one of the two patterns.
Using more of PowerShell's built-in cmdlets can make this a little easier to manage.
# Assuming searching only properties ADDR and ADDR2
$filter = 'AA','BB'
# Grouping by First and Last values to easily remove duplicates
# -match uses regex so | is needed for an OR of multiple items
Import-Csv Test.csv | Group-Object First,Last |
    Where {!($_.Group.ADDR,$_.Group.ADDR2 -match ($filter -join '|'))} |
    Foreach-Object Group |
    Export-Csv output.csv -NoType
You would think strictly using text manipulation would be simpler, but it adds other scenarios to consider:
You will need to track users that have duplicate entries and potentially back track to remove them (if not grouping). This could require reading the file contents twice.
Your header row could match the string you want to filter so you will need to add it to the output if filtering removes it.
Keeping the scenarios above in mind, you can still use a grouping concept:
$filter = 'AA','BB'
$file = Get-Content Test.csv
# $file[0] is the header row
# -split string uses regex and splits at the second comma
# -split results' [0] element is First,Last values
$file[0],($file |
    Select-Object -Skip 1 |
    Group-Object {($_ -split '(?<=^[^,]*,[^,]*),')[0]} |
    where {!($_.Group -match ($filter -join '|'))} |
    Foreach-Object Group) | Set-Content output.csv
If I got it right you could do something like this:
$SearchPattern = 'AA', 'BB'
$INPUTCSV = @'
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
'@ | ConvertFrom-Csv
$ActualSearchPattern =
    $INPUTCSV |
        Where-Object {
            $_.LAST -in $SearchPattern -or
            $_.ADDR -in $SearchPattern -or
            $_.ADDR2 -in $SearchPattern -or
            $_.GENDER -in $SearchPattern -or
            $_.HOME -in $SearchPattern -or
            $_.Work -in $SearchPattern
        } |
        Select-Object -ExpandProperty FIRST
$INPUTCSV |
    Where-Object -Property FIRST -NotIn -Value $ActualSearchPattern |
    Format-Table -AutoSize
There might be more sophisticated or more elegant ways, but I cannot think of one at the moment. ;-)
There is a nice PowerShell module you can use to manipulate the content of a CSV or XLSX file: ImportExcel
This gives you a lot of options to manipulate the sheets, columns, etc.

Powershell - Finding the output of get-contents and searching for all occurrences in another file using wild cards

I'm trying to combine the output of two separate files, but I'm stuck on the wildcard or contains-style Select-String search for the names from file A (Names) in file B (name-rank).
The contents of file A are:
adam
george
william
assa
kate
mark
The contents of file B are:
12-march-2020,Mark-1
12-march-2020,Mark-2
12-march-2020,Mark-3
12-march-2020,william-4
12-march-2020,william-2
12-march-2020,william-7
12-march-2020,kate-54
12-march-2020,kate-12
12-march-2020,kate-44
And I need to match on every occurrence of the names after the '-' so my ordered output should look like this which is a combination of both files as the output:
mark
Mark-1
Mark-2
Mark-3
william
william-2
william-4
william-7
Kate
kate-12
kate-44
kate-54
So far I only have the following and I'd be grateful for any pointers or assistance please.
import-csv (c:\temp\names.csv) |
select-string -simplematch (import-csv c:\temp\names-rank.csv -header "Date", "RankedName" | select RankedName) |
set-content c:\temp\names-and-ranks.csv
I imagine the select-string isn't going to be enough and I need to write a loop instead.
The data in your example does not give much to work with, and the desired output is not that intuitive; most of the time in PowerShell you would combine the data into a much richer output at the end.
But anyway, with what is given here and what you want, the code below will get what you need. I have left comments in the code for you.
$pathDir='C:\Users\myUser\Downloads\trash'
$names="$pathDir\names.csv"
$namesRank="$pathDir\names-rank.csv"
$nameImport = Import-Csv -Path $names -Header names
$nameRankImport= Import-Csv -Path $namesRank -Header date,rankName
# create an empty array to collect the result
$list=@()
foreach($name in $nameImport){
    # get all the matching names
    $match=$nameRankImport.RankName -like "$($name.names)*"
    # add the name from the first list
    $list+=($name.names)
    # if there are any matches, add them too
    if($match){
        $list+=$match
    }
}
# Because it's a one-column list of strings, Export-Csv will not show us what we want, so use Set-Content
$list | Set-Content -Path "$pathDir\names-and-ranks.csv" -Force
For this I would use a combination of Group-Object and Where-Object to first group all "RankedName" items by the name before the dash, then filter on those names to be part of the names we got from the 'names.csv' file and output the properties you need.
# read the names from the file as string array
$names = Get-Content -Path 'c:\temp\names.csv' # just a list of names, so really not a CSV
# import the CSV file and loop through
Import-Csv -Path 'c:\temp\names-rank.csv' -Header "Date", "RankedName" |
    Group-Object { ($_.RankedName -split '-')[0] } | # group on the name before the dash in the 'RankedName' property
    Where-Object { $_.Name -in $names } | # use only the groups that have a name that can be found in the $names array
    ForEach-Object {
        $_.Name # output the group name (which is one of the $names)
        $_.Group.RankedName -join [environment]::NewLine # output the group's 'RankedName' property joined with a newline
    } |
    Set-Content -Path 'c:\temp\names-and-ranks.csv'
Output:
Mark
Mark-1
Mark-2
Mark-3
william
william-4
william-2
william-7
kate
kate-54
kate-12
kate-44

Extracting a portion of a string then using it to match with other strings in Powershell

I previously asked for assistance parsing a text file and have been using this code for my script:
import-csv $File -header Tag,Date,Value|
Where {$_.Tag -notmatch '(_His_|_Manual$)'}|
Select-Object *,@{Name='Building';Expression={"{0} {1}" -f $($_.Tag -split '_')[1..2]}}|
Format-table -Groupby Building -Property Tag,Date,Value
I've realized since then that, while the code filters out any tags containing _His or _Manual, I need to also filter any tags associated with _Manual. For example, the following tags are present in my text file:
L01_B111_BuildingName1_MainElectric_111A01ME_ALC,13-Apr-17 08:45,64075
L01_B111_BuildingName1_MainElectric_111A01ME_Cleansed,13-Apr-17 08:45,64075
L01_B111_BuildingName1_MainElectric_111A01ME_Consumption,13-Apr-17 08:45,10.4
L01_B333_BuildingName3_MainWater_333E02MW_Manual,1-Dec-16 18:00:00,4.380384E+07
L01_B333_BuildingName3_MainWater_333E02MW_Cleansed,1-Dec-16 18:00:00,4.380384E+07
L01_B333_BuildingName3_MainWater_333E02MW_Consumption,1-Dec-16 18:00:00,25.36
The 333E02MW_Manual string would be excluded using my current code, but how could I also exclude 333E02MW_Cleansed and 333E02MW_Consumption? I feel I would need something that will allow me to extract the 8-digit code before each _Manual instance and then use it to find any other strings with a {MatchingCode}
xxx_xxxx_xxxxxxxxxxx_xxxxxxxxxx_MatchingCode_Cleansed
xxx_xxxx_xxxxxxxxxxx_xxxxxxxxxx_MatchingCode_Consumption
I know there are the -like -contains and -match operators and I've seen these posts on using substrings and regex, but how could I extract the MatchingCode to actually have something to match to? This post seems to come closest to my goal, but I'm not sure how to apply it to PowerShell.
You can find every tag that ends with _Manual and create a regex pattern that matches any of the parts before _Manual. For example:
$Data = Import-Csv -Path $File -Header Tag,Date,Value
# Create a regex that matches any prefix that has a _Manual row (matches using the value before _Manual)
$ExcludeManualPattern = ($Data | Foreach-Object { if($_.Tag -match '^(.*?)_Manual$') { [regex]::Escape($Matches[1]) } }) -join '|'
$Data | Where-Object { $_.Tag -notmatch '_His_' -and $_.Tag -notmatch $ExcludeManualPattern } |
    Select-Object -Property *,@{Name='Building';Expression={"{0} {1}" -f $($_.Tag -split '_')[1..2]}} |
    Format-Table -GroupBy Building -Property Tag,Date,Value
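One edge case is worth guarding against, sketched here with made-up tag names: if no tag ends in _Manual, the joined pattern is an empty string, and an empty regex matches every input, so -notmatch would drop every row. Checking the pattern before using it avoids that. The $tags, $prefixes, $pattern and $kept names are illustrative, not part of the original script.

```powershell
# Hypothetical tags; only the A_1 prefix has a _Manual row
$tags = 'A_1_Manual', 'A_1_Cleansed', 'B_2_Cleansed'

# Collect the regex-escaped prefix of every tag that ends in _Manual
$prefixes = foreach ($tag in $tags) {
    if ($tag -match '^(.*?)_Manual$') { [regex]::Escape($Matches[1]) }
}
$pattern = $prefixes -join '|'

# Only filter when there is something to exclude;
# an empty pattern would match (and therefore exclude) everything
$kept = if ($pattern) {
    $tags | Where-Object { $_ -notmatch $pattern }
} else {
    $tags
}

$kept
```

With the sample tags above, only the B_2 tag survives, because both A_1 rows share the prefix of a _Manual row.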

Using Powershell to compare two files and then output only the different string names

So I am a complete beginner at Powershell but need to write a script that will take a file, compare it against another file, and tell me what strings are different in the first compared to the second. I have had a go at this but I am struggling with the outputs as my script will currently only tell me on which line things are different, but it also seems to count lines that are empty too.
To give some context for what I am trying to achieve, I would like to have a static file of known good Windows processes ($Authorized) and I want my script to pull a list of current running processes, filter by the process name column so to just pull the process name strings, then match anything over 1 character, sort the file by unique values and then compare it against $Authorized, plus finally either outputting the different process strings found in $Processes (to the ISE Output Pane) or just to output the different process names to a file.
I have spent today attempting the following in Powershell ISE and also Googling around to try and find solutions. I heard 'fc' is a better choice instead of Compare-Object but I could not get that to work. I have thus far managed to get it to work but the final part where it compares the two files it seems to compare line by line, for which would always give me false positives as the line position of the process names in the file supplied would change, furthermore I only want to see the changed process names, and not the line numbers which it is reporting ("The process at line 34 is an outlier" is what currently gets outputted).
I hope this makes sense, and any help on this would be very much appreciated.
Get-Process | Format-Table -Wrap -Autosize -Property ProcessName | Out-File c:\users\me\Desktop\Processes.txt
$Processes = 'c:\Users\me\Desktop\Processes.txt'
$Output_file = 'c:\Users\me\Desktop\Extracted.txt'
$Sorted = 'c:\Users\me\Desktop\Sorted.txt'
$Authorized = 'c:\Users\me\Desktop\Authorized.txt'
$regex = '.{1,}'
select-string -Path $Processes -Pattern $regex |% { $_.Matches } |% { $_.Value } > $Output_file
Get-Content $Output_file | Sort-Object -Unique > $Sorted
$dif = Compare-Object -ReferenceObject $(Get-Content $Sorted) -DifferenceObject $(get-content $Authorized) -IncludeEqual
$lineNumber = 1
foreach ($difference in $dif)
{
if ($difference.SideIndicator -ne "==")
{
Write-Output "The Process at Line $linenumber is an Outlier"
}
$lineNumber ++
}
Remove-Item c:\Users\me\Desktop\Processes.txt
Remove-Item c:\Users\me\Desktop\Extracted.txt
Write-Output "The Results are Stored in $Sorted"
From the length and complexity of your script, I feel like I'm missing something, but your description seems clear.
Running process names:
$ProcessNames = @(Get-Process | Select-Object -ExpandProperty Name)
...which aren't blank:
$ProcessNames = $ProcessNames | Where-Object {$_ -ne ''}
List of authorised names from a file:
$AuthorizedNames = Get-Content 'c:\Users\me\Desktop\Authorized.txt'
Compare:
$UnAuthorizedNames = $ProcessNames | Where-Object { $_ -notin $AuthorizedNames }
Optional output to file:
$UnAuthorizedNames | Set-Content out.txt
Or in the shell:
@(gps).Name -ne '' |? { $_ -notin (gc authorized.txt) } | sc out.txt
1 2   3     4       5      6       7                      8
1. @() forces something to be an array, even if it only returns one thing
2. gps is a default alias of Get-Process
3. using .Property on an array takes that property value from every item in the array
4. using an operator on an array filters the array by whether the items pass the test
5. ? is an alias of Where-Object
6. -notin tests if one item is not in a collection
7. gc is an alias of Get-Content
8. sc is an alias of Set-Content (in Windows PowerShell; the alias was removed in PowerShell 6 and later)
You should use Set-Content instead of Out-File and > because it handles character encoding nicely, and they don't. And because Get-Content/Set-Content sounds like a memorable matched pair, and Get-Content/Out-File doesn't.

Remove String in one text file from another text file

I have two text files. One has all servers on the domain (servers_all.txt) and the other has only the virtual servers (virtual_servers.txt). I want to get the difference between the two so I can find the physical servers. I have been trying Compare-Object to no avail:
Compare-Object (gc servers_all.txt) (gc .\VIRTUAL_SERVERS.TXT) -IncludeEqual
Is there a way I can remove the servers listed in virtual_servers.txt from servers_all.txt?
Both txt files are formatted the same way with only the server names in a single column ie:
ServerA
ServerB
ServerC
Compare-Object is somewhat obtuse (to put it mildly). I avoid it.
$VirtualServers = Get-Content .\VIRTUAL_SERVERS.TXT;
$NonVirtualServers = Get-Content servers_all.txt | Where-Object { $_ -notin $VirtualServers };
The only thing to beware of here is file formatting errors like leading or trailing whitespace. If you want to handle that, you could do something like:
$VirtualServers = Get-Content .\VIRTUAL_SERVERS.TXT | ForEach-Object { $_.Trim() };
$NonVirtualServers = Get-Content servers_all.txt | ForEach-Object { $_.Trim() } | Where-Object { $_ -notin $VirtualServers };
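If the lists grow large, -notin rescans the whole virtual list for every server. A possible alternative, sketched under the assumption of PowerShell 5 or later (for the ::new() syntax), is to load the trimmed names into a case-insensitive HashSet and test membership with Contains. The sample arrays stand in for the two files and are invented for illustration.

```powershell
# Stand-ins for the lines of VIRTUAL_SERVERS.TXT and servers_all.txt
$virtualLines = 'ServerB ', 'serverc'
$allLines     = 'ServerA', 'ServerB', ' ServerC'

# Trim the names and load them into a case-insensitive set
[string[]]$virtualNames = $virtualLines | ForEach-Object { $_.Trim() }
$virtualSet = [System.Collections.Generic.HashSet[string]]::new($virtualNames, [System.StringComparer]::OrdinalIgnoreCase)

# Keep only the servers that are not in the set
$physical = $allLines |
    ForEach-Object { $_.Trim() } |
    Where-Object { -not $virtualSet.Contains($_) }

$physical
```

For a few hundred server names the difference is unlikely to be noticeable, so the simpler -notin version is a reasonable default.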
$all = 'a','b','c','d'
$virtual = 'b','c'
Compare-Object $all $virtual -PassThru
->
a
d
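Compare-Object reports differences from both sides, so the result above is clean only because every virtual server also appears in the full list. A hedged sketch for the case where the second list may contain names that are missing from the first (the extra 'x' entry is invented for illustration): -PassThru attaches a SideIndicator property to each returned item, and filtering on it keeps only the items unique to the first list.

```powershell
$all     = 'a', 'b', 'c', 'd'
$virtual = 'b', 'c', 'x'   # 'x' does not appear in $all

# '<=' marks items found only in the reference set ($all)
$physical = Compare-Object $all $virtual -PassThru |
    Where-Object { $_.SideIndicator -eq '<=' }

$physical
```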