Extracting a portion of a string then using it to match with other strings in Powershell

Extracting a portion of a string then using it to match with other strings in Powershell - powershell

I previously asked for assistance parsing a text file and have been using this code for my script:
import-csv $File -header Tag,Date,Value|
Where {$_.Tag -notmatch '(_His_|_Manual$)'}|
Select-Object *,#{Name='Building';Expression={"{0} {1}" -f $($_.Tag -split '_')[1..2]}}|
Format-table -Groupby Building -Property Tag,Date,Value
I've realized since then that, while the code filters out any tags containing _His or _Manual, I need to also filter any tags associated with _Manual. For example, the following tags are present in my text file:
L01_B111_BuildingName1_MainElectric_111A01ME_ALC,13-Apr-17 08:45,64075
L01_B111_BuildingName1_MainElectric_111A01ME_Cleansed,13-Apr-17 08:45,64075
L01_B111_BuildingName1_MainElectric_111A01ME_Consumption,13-Apr-17 08:45,10.4
L01_B333_BuildingName3_MainWater_333E02MW_Manual,1-Dec-16 18:00:00,4.380384E+07
L01_B333_BuildingName3_MainWater_333E02MW_Cleansed,1-Dec-16 18:00:00,4.380384E+07
L01_B333_BuildingName3_MainWater_333E02MW_Consumption,1-Dec-16 18:00:00,25.36
The 333E02MW_Manual string would be excluded using my current code, but how could I also exclude 333E02MW_Cleansed and 333E02MW_Consumption? I feel I would need something that will allow me to extract the 8-digit code before each _Manual instance and then use it to find any other strings with a {MatchingCode}
xxx_xxxx_xxxxxxxxxxx_xxxxxxxxxx_MatchingCode_Cleansed
xxx_xxxx_xxxxxxxxxxx_xxxxxxxxxx_MatchingCode_Consumption
I know there are the -like -contains and -match operators and I've seen these posts on using substrings and regex, but how could I extract the MatchingCode to actually have something to match to? This post seems to come closest to my goal, but I'm not sure how to apply it to PowerShell.

You can find every tag that ends with _Manual and create a regex pattern that matches any of the parts before _Manual. Ex.
$Data = Import-Csv -Path $File -Header Tag,Date,Value
#Create regex that matches any prefixes that has a manual row (matches using the value before _Manual)
$ExcludeManualPattern = ($Data | Foreach-Object { if($_.Tag -match '^(.*?)_Manual$') { [regex]::Escape($Matches[1]) } }) -join '|'
$Data | Where-Object { $_.Tag -notmatch '_His_' -and $_.Tag -notmatch $ExcludeManualPattern } |
Select-Object -Property *,#{Name='Building';Expression={"{0} {1}" -f $($_.Tag -split '_')[1..2]}}|
Format-table -GroupBy Building -Property Tag,Date,Value

Related

Powershell CSV removing rows and then remove from whole file if A column matches

I've created the following small script to remove 2++ strings from a CSV.
Each row is a log of a given person and a answer they give.
The CSV has X columns.
The column named FIRST identifies the person.
What I need to do is when I delete a row matching the answer, I also need to delete the person from the whole CSV if it had one of the two strings.
What I've made so far, removes the row of people having the answers but the person is still left in the overall CSV with other answers. I want to remove the person fully if the questions have been answered.
Can somebody help me out with making the addition or changes to make this happen?
INPUT File
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
SCRIPT
$CSVfile = "C:\Temp\Test\Test.csv"
$CSVfile_filtered = "C:\Temp\Test\Test.csv"
$regex001 = "AA"
$regex002 = "BB"
$filterArray = #($regex001,$regex002)
Get-Content $CSVfile | Select-String -pattern $filterArray -notmatch | Set-Content $CSVfile_filtered
The file should then remove 10005, 10011 and both lines of 10007. But my version only removes one of the 10007 since it only matches one of the two patterns.

Using more of PowerShell's built-in cmdlets can make this a little easier to manage.
# Assuming searching only properties ADDR and ADDR2
$filter = 'AA','BB'
# Grouping by First and Last values to easily remove duplicates
# -match uses regex so | is needed for an OR of multiple items
Import-Csv Test.csv | Group-Object First,Last |
Where {!($_.Group.ADDR,$_.Group.ADDR2 -match ($filter -join '|'))} |
Foreach-Object Group |
Export-Csv output.csv -NoType
You would think strictly using text manipulation would be simpler, but it adds other scenarios to consider:
You will need to track users that have duplicate entries and potentially back track to remove them (if not grouping). This could require reading the file contents twice.
Your header row could match the string you want to filter so you will need to add it to the output if filtering removes it.
Keeping the scenarios above in mind, you can still use a grouping concept:
$filter = 'AA','BB'
$file = Get-Content Test.csv
# $file[0] is the header row
# -split string uses regex and splits at the second comma
# -split results' [0] element is First,Last values
$file[0],($file |
Select-Object -Skip 1 |
Group-Object {($_ -split '(?<=^[^,]*,[^,]*),')[0]} |
where {!($_.Group -match ($filter -join '|'))} |
Foreach-Object Group) | Set-Content output.csv

If I got it right you could do something like this:
$SearchPattern = 'AA', 'BB'
$INPUTCSV = #'
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
'# | ConvertFrom-Csv
$ActualSearchPattern =
$INPUTCSV |
Where-Object {
$_.LAST -in $SearchPattern -or
$_.ADDR -in $SearchPattern -or
$_.ADDR2 -in $SearchPattern -or
$_.GENDER -in $SearchPattern -or
$_.HOME -in $SearchPattern -or
$_.Work -in $SearchPattern
} |
Select-Object -ExpandProperty FIRST
$INPUTCSV |
Where-Object -Property FIRST -NotIn -Value $ActualSearchPattern |
Format-Table -AutoSize
There might be more sophisticated or more elegant ways but I cannot think about one at the moment. ;-)

There is a nice PowerShell module you can use to manipulate the content of a csv or xlsx file: ImportExcel
This give you a lot of options to manipulate the sheets, columns etc.

Powershell - Finding the output of get-contents and searching for all occurrences in another file using wild cards

I'm trying to get the output of two separate files although I'm stuck on the wild card or contains select-string search from file A (Names) in file B (name-rank).
The contents of file A is:
adam
george
william
assa
kate
mark
The contents of file B is:
12-march-2020,Mark-1
12-march-2020,Mark-2
12-march-2020,Mark-3
12-march-2020,william-4
12-march-2020,william-2
12-march-2020,william-7
12-march-2020,kate-54
12-march-2020,kate-12
12-march-2020,kate-44
And I need to match on every occurrence of the names after the '-' so my ordered output should look like this which is a combination of both files as the output:
mark
Mark-1
Mark-2
Mark-3
william
william-2
william-4
william-7
Kate
kate-12
kate-44
kate-54
So far I only have the following and I'd be grateful for any pointers or assistance please.
import-csv (c:\temp\names.csv) |
select-string -simplematch (import-csv c:\temp\names-rank.csv -header "Date", "RankedName" | select RankedName) |
set-content c:\temp\names-and-ranks.csv
I imagine the select-string isn't going to be enough and I need to write a loop instead.

The data you give in the example does not give you much to work with, and the desired output is not that intuitive, most of the time with Powershell you would like to combine the data in to a much richer output at the end.
But anyway, with what is given here and what you want, the code bellow will get what you need, I have left comments in the code for you
$pathDir='C:\Users\myUser\Downloads\trash'
$names="$pathDir\names.csv"
$namesRank="$pathDir\names-rank.csv"
$nameImport = Import-Csv -Path $names -Header names
$nameRankImport= Import-Csv -Path $namesRank -Header date,rankName
#create an empty array to collect the result
$list=#()
foreach($name in $nameImport){
#get all the match names
$match=$nameRankImport.RankName -like "$($name.names)*"
#add the name from the First list
$list+=($name.names)
#if there are any matches, add them too
if($match){
$list+=$match
}
}
#Because its a one column string, Export-CSV will now show us what we want
$list | Set-Content -Path "$pathDir\names-and-ranks.csv" -Force

For this I would use a combination of Group-Object and Where-Object to first group all "RankedName" items by the name before the dash, then filter on those names to be part of the names we got from the 'names.csv' file and output the properties you need.
# read the names from the file as string array
$names = Get-Content -Path 'c:\temp\names.csv' # just a list of names, so really not a CSV
# import the CSV file and loop through
Import-Csv -Path 'c:\temp\names-rank.csv' -Header "Date", "RankedName" |
Group-Object { ($_.RankedName -split '-')[0] } | # group on the name before the dash in the 'RankedName' property
Where-Object { $_.Name -in $names } | # use only the groups that have a name that can be found in the $names array
ForEach-Object {
$_.Name # output the group name (which is one of the $names)
$_.Group.RankedName -join [environment]::NewLine # output the group's 'RankedName' property joined with a newline
} |
Set-Content -Path 'c:\temp\names-and-ranks.csv'
Output:
Mark
Mark-1
Mark-2
Mark-3
william
william-4
william-2
william-7
kate
kate-54
kate-12
kate-44

Powershell - How to create array of filenames based on filename?

I'm looking to create an array of files (pdf's specifically) based on filenames in Powershell. All files are in the same directory. I've spent a couple of days looking and can't find anything that has examples of this or something that is close but could be changed. Here is my example of file names:
AR - HELLO.pdf
AF - HELLO.pdf
RT - HELLO.pdf
MH - HELLO.pdf
AR - WORLD.pdf
AF - WORLD.pdf
RT - WORLD.pdf
HT - WORLD.pdf
....
I would like to combine all files ending in 'HELLO' into an array and 'WORLD' into another array and so on.
I'm stuck pretty early on in the process as I'm brand new to creating scripts, but here is my sad start:
Get-ChildItem *.pdf
Where BaseName -match '(.*) - (\w+)'
Updated Info...
I do not know the name after the " - " so using regex is working.
My ultimate goal is to combine PDF's based on the matching text after the " - " in the filename and the most basic code for this is:
$file1 = "1 - HELLO.pdf"
$file2 = "2 - HELLO.PDF"
$mergedfile = "HELLO.PDF"
Merge-PDF -InputFile $file1, $file2 -OututFile $mergedfile
I have also gotten the Merge-PDF to work using this code which merges all PDF's in the directory:
$Files = Get-ChildItem *.pdf
$mergedfiles = "merged.pdf"
Merge-PDF -InputFile $Files -OutputFile $mergedfiles
Using this code from #Mathias the $suffix portion of the -OutputFile works but the -InputFile portion is returning an error "Exception calling "Close" with "0" argument(s)"
$groups = Get-ChildItem *.pdf |Group-Object {$_.BaseName -replace
'^.*\b(\w+)$','$1'} -AsHashTable
foreach($suffix in $groups.Keys) {Merge-PDF -InputFile $(#($groups[$suffix]))
-OutputFile "$suffix.pdf"}
For the -InputFile I've tried a lot of different varieties and I keep getting the "0" arguments error. The values in the Hashtable seem to be correct so I'm not sure why this isn't working.
Thanks

This should do the trick:
$HELLO = Get-ChildItem *HELLO.pdf |Select -Expand Name
$WORLD = Get-ChildItem *WORLD.pdf |Select -Expand Name
If you want to group file names by the last word in the base name and you don't know them up front, regex is indeed an option:
$groups = Get-ChildItem *.pdf |Group-Object {$_.BaseName -replace '^.*\b(\w+)$','$1'} -AsHashTable
And then you can do:
$groups['HELLO'].Name
for all the file names ending with the word HELLO, or, to iterate over all of them:
foreach($suffixGroup in $groups.GetEnumerator()){
Write-Host "There are $($suffixGroup.Value.Count) files ending in $($suffixGroup.Key)"
}

Another option is to get all items with Get-ChildItem and use Where-Object to filter.
$fileNames = Get-ChildItem | Select-Object -ExpandProperty FullName
#then filter
$fileNames | Where-Object {$_.EndsWith("HELLO.PDF")}
#or use the aliases if you want to do less typing:
$fileNames = gci | select -exp FullName
$fileNames | ? {$_.EndsWith("HELLO.PDF")}
Just wanted to show more options -especially the Where-Object cmdlet which comes in useful when you're calling cmdlets that don't have parameters to filter.
Side note:
You may be asking what -ExpandProperty does.
If you just call gci | select -exp FullName, you will get back an array of PSCustomObjects (each of them with one property called FullName).
This can be confusing for people who don't really see that the objects are typed as it is not visible just by looking at the PowerShell script.

Powershell: delete duplicate entry in arraylist

In my Powershellscript I read some data from a csv-File in an Arraylist.
In the second step I eliminate every line without the specific char: (.
At the third step I want to eliminate every double entries.
Example for my list:
Klein, Jürgen (Klein01); salesmanagement national
Klein, Jürgen (Klein01); salesmanagement national
Meyer, Gerlinde (Meyer02); accounting
Testuser
Admin1
Müller, Kai (Muell04); support international
I use the following script:
$Arrayusername = New-Object System.Collections.ArrayList
$NewArraylistuser = New-Object System.Collections.ArrayList
$Arrayusername = Get-Content -Path "C:\Temp\User\Userlist.csv"
for ($i=0; $i -le $Arrayusername.length; $i++)
{
if ($Arrayusername[$i] -like "*(*")
{
$NewArraylistuser.Add($Arrayusername_ads[$i])
}
$Array_sorted = $NewArraylistuser | sort
$Array_sorted | Get-Unique
}
But the variable $Array_sorted still has double entries.
I don´t find the mistake.

Some Ideas how you could change your code:
Use the existing Command to import .csv files with the Delimiter ;.
Filter the output with Where-Object to only include Names with (.
Select only unique objects with Select-Object, or if you want to sort the Object, use the Sort-Object with the same paramets.
Something like this should work:
Import-csv -Delimiter ';' -Header "Name","Position" -Path "C:\Temp\User\Userlist.csv" | Where-Object {$_.Name -like "*(*"} | Sort-Object -Unique -Property Name,Position

Switch Names from one side to other

I have to set description to a list of sames provided in csv format.
I know I need samaccountnames so i am trying to pull up samaccount from named, unfortunately the names in csv are in reverse order with a header as name
example
Name
Snow, Jon
Starc,arya
lannister,jamie
In a nutshell, I tried
Import-Csv C:\list.csv |
foreach {
$_.Name = "{1}, {0}" -f ($_.Name -split ', ')
$_
No luck, any help is appreciated.
The names should come as -
Jon snow
Arya starc
Jamie lannister
so I can query AD for sam's

To have lastname and firstname you can do this:
$names = $string.split(",")
[array]::Reverse($names)
$names
So if I understood correctly, you want to skip the header (first line). Try changing the following:
Import-Csv C:\list.csv | Select-Object -Skip 1 | ConvertFrom-Csv -Header Name
(or increase '-Skip' amount according to how many lines you want to skip)

The question you have here to me reads a little vague, but I wanted to offer the below. If you specify the header of the CSV when you import it, you can output whatever Property you want, in whatever order:
$test = Import-Csv -Path "C:\temp\test.txt" -Header Last_Name,First_Name
$test | % {"$($_.First_Name) $($_.Last_Name)"}

It was
Import-Csv c:\list.csv |
foreach{
$last,$first = $_.name -split ","
new-object psobject -Property #{name = "$first,$last"}
}

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Extracting a portion of a string then using it to match with other strings in Powershell - powershell

Related

Powershell CSV removing rows and then remove from whole file if A column matches

Powershell - Finding the output of get-contents and searching for all occurrences in another file using wild cards

Powershell - How to create array of filenames based on filename?

Powershell: delete duplicate entry in arraylist

Switch Names from one side to other

Categories

Resources