When a word matches retrieve the varying string after it - powershell

I have a query which looks like this:
FROM TableA
INNER JOIN TableB
ON TableA.xx = TableB.xx
INNER JOIN TableC
ON TableA.yy = TableC.yy
I am trying to write a script which selects the tables which come after the word "JOIN".
The script that I wrote now is:
$data = Get-Content -Path query1.txt
$dataconv = "$data".ToLower() -replace '\s+', ' '
$join = 0
$overigetabellen = ($dataconv) | foreach {
if ($_ -match "join (.*)") {
$join++
$join = $matches[1].Split(" ")[0]
#Write-Host "Table(s) on which is joined:" $join"."
$join
}
}
$overigetabellen
This gives me only the first table, so TableB.
Can anyone help me get the second table in the output as well?

Process your data with Select-String:
$data | Select-String -AllMatches -Pattern '(?<=join\s+)\S+' |
Select-Object -Expand Matches |
Select-Object -Expand Groups |
Select-Object -Expand Value
(?<=...) is a so-called positive lookbehind assertion that is used for matching the pattern without being included in the returned string (meaning the returned matches are just the table names without the JOIN before them).
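To illustrate the lookbehind on its own, here is a minimal sketch that applies the same pattern through the .NET [regex] class. The inline $query text stands in for the contents of query1.txt, and the variable names are chosen for illustration only.

```powershell
# Sample query text (stands in for the contents of query1.txt).
$query = @'
FROM TableA
INNER JOIN TableB
ON TableA.xx = TableB.xx
INNER JOIN TableC
ON TableA.yy = TableC.yy
'@

# (?<=join\s+) requires "join" plus whitespace before the match,
# but does not make it part of the match itself.
$tables = [regex]::Matches($query, '(?<=join\s+)\S+', 'IgnoreCase') |
    ForEach-Object { $_.Value }

$tables
```

Because the lookbehind is not part of the match, each match value is just the table name, so no capture group or string splitting is needed afterwards.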

This is my naive attempt to find the desired table names.
Split the data input on whitespace into an array, find the indices of the word "JOIN", and then access the following indices after the word "JOIN."
$data = Get-Content -Path query1.txt
$indices = @()
$output = @()
$dataarray = $data -split '\s+'
$singleIndex = -1
Do{
$singleIndex = [array]::IndexOf($dataarray,"JOIN",$singleIndex + 1)
If($singleIndex -ge 0){$indices += $singleIndex}
}While($singleIndex -ge 0)
foreach ($index in $indices) {
$output += $dataarray[$index + 1]
}
Outputs:
TableB
TableC
You can adjust for capitalization (I saw you set your input to all lowercase), etc., as needed if you expect varying input files.

Related

How do you group unique values from imported csv in a foreach loop

I've got a txt file with the following content:
#test.txt
'ALDHT21;MIMO;1111;BOK;Tree'
'ALDHT21;MIMO;1211;BOK;Tree'
'PRGHT21;AIMO;1351;STE;Water'
'PRGHT21;AIMO;8888;FRA;Stone'
'ABCDT22;DIDO;8888;STE;Stone'
'PRA2HT21;ADDO;8888;STE;Stone'
';ADDO;1317;STE;Stone'
To make it easier to explain, let's give the above content headers:
'Group;Code;ID;Signature;Type'
With the help of Powershell, I'm trying to create a foreach loop of each unique "Signature" to return two variables with unique data from rows where the "Signature" exists in and then mashed together with some delimiters.
Based on the file content, here are the expected results:
First loop:
$Signature = "BOK"
$Groups = "Tree:ALDHT21"
$Codes = "Tree:MIMO"
Next loop:
$Signature = "FRA"
$Groups = "Stone:PRGHT21"
$Codes = "Stone:AIMO"
Last loop:
$Signature = "STE"
$Groups = "Stone:PRA2HT21,Stone:ABCDT22,Water:PRGHT21"
$Codes = "Stone:ADDO,Stone:DIDO,Water:AIMO"
Notice the last loop should skip the last entry in the file because it contains an empty Group.
My attempt didn't quite hit the mark and I'm struggling to find a good way to accomplish this:
$file = "C:\temp\test.txt"
$uniqueSigs = (gc $file) -replace "'$|^'" | ConvertFrom-Csv -Delimiter ';' -Header Group,Code,ID,Signature,Type | group Signature
foreach ($sigs in $uniqueSigs) {
$Groups = ""
foreach ($Group in $sigs.Group) {
$Groups += "$($Group.Type):$($Group.Group),"
}
$Groups = $Groups -replace ",$"
[PSCustomObject] @{
Signatur = $sigs.Name
Groups = $Groups
}
$Codes = ""
foreach ($Group in $sigs.Group) {
$Codes += "$($Group.Type):$($Group.Code),"
}
$Codes = $Codes -replace ",$"
[PSCustomObject] @{
Signatur = $sigs.Name
Codes = $Codes
}
$Signature = $sigs.Name
If ($Group.Group){
write-host "$Signature "-" $Groups "-" $Codes "
}
}
Result from my bad attempt:
BOK - Tree:ALDHT21,Tree:ALDHT21 - Tree:MIMO,Tree:MIMO
FRA - Stone:PRGHT21 - Stone:AIMO
Any help appreciated. :)
Your variables are somewhat confusingly named; the following streamlined solution uses fewer variables and perhaps produces the desired result:
$file = "test.txt"
(Get-Content $file) -replace "'$|^'" | ConvertFrom-Csv -Delimiter ';' -Header Group,Code,ID,Signature,Type |
Group-Object Signature |
ForEach-Object {
# Create and output an object with group information.
# Skip empty .Group properties among the group's member objects.
# Get the concatenation of all .Group and .Code column
# values each, skipping empty groups and eliminating duplicates.
$groups = (
$_.Group.ForEach({ if ($_.Group) { "$($_.Type):$($_.Group)" } }) |
Select-Object -Unique
) -join ","
$codes = (
$_.Group.ForEach({ "$($_.Type):$($_.Code)" }) |
Select-Object -Unique
) -join ","
# Create and output an object comprising the signature
# and the concatenated groups and codes.
[PSCustomObject] @{
Signature = $_.Name
Groups = $groups
Codes = $codes
}
# Note: This is just *for-display* output.
# Don't use Write-Host to output *data*.
Write-Host ($_.Name, $groups, $codes -join ' - ')
}
Output:
BOK - Tree:ALDHT21 - Tree:MIMO
FRA - Stone:PRGHT21 - Stone:AIMO
STE - Water:PRGHT21,Stone:ABCDT22,Stone:PRA2HT21 - Water:AIMO,Stone:DIDO,Stone:ADDO
Signature Groups Codes
--------- ------ -----
BOK Tree:ALDHT21 Tree:MIMO
FRA Stone:PRGHT21 Stone:AIMO
STE Water:PRGHT21,Stone:ABCDT22,Stone:PRA2HT21 Water:AIMO,Stone:DIDO,Stone:ADDO
Note that the for-display Write-Host output surprisingly precedes the default output formatting for the [pscustomobject] instances, which is due to the asynchronous behavior of the implicitly applied Format-Table formatting explained in this answer.

PowerShell: list CSV file rows where at least one value between the 3rd and last column is equal to "0" or "1"

In my PowerShell script, I'm working with a CSV file that looks like this (with a number of rows and columns that can vary, but there will always be at least the headers and the first 2 columns):
OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1
I basically list servers in the first column and users in the first row (CSV header). This represents a user "access granting" matrix to servers (1 for "give access", 0 for "remove access", and void for "don't change").
I'm looking for a way to extract only the rows that include a value equal to "1" or "0" between (and including) the 3rd and last column. (= to eventually get the list of servers where access rights should be changed)
So taking the above example, I only want the following lines returned:
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Windows;hostname5;1;1;1
Any hints to make this possible? Or the opposite (getting the ones without any 0 or 1)?
Even if it means using "Get-Content" instead of "Import-CSV". I don't care about the 1st (headers) row; I know how to exclude that.
Thank you!
--- Final solution, thanks to @Tomalak's answer:
$AccessMatrix = Import-CSV $CSVfile -delimiter ';'
$columns = $AccessMatrix | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$AccessMatrix = $AccessMatrix | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col.trim() -eq "1" -OR $row.$col.trim() -eq "0") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The following uses Get-Member to select the names of all columns after the first two.
Then, using ForEach-Object, we can output only those rows that have a value in any of those columns.
$data = ConvertFrom-Csv "OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1" -Delimiter ";"
$columns = $data | Get-Member -MemberType NoteProperty | Select-Object -Skip 2 -ExpandProperty Name
$data | ForEach-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
$row # this pushes the $row onto the pipeline
break
}
}
}
The break statement stops the execution of the inner foreach loop because there is no point in further checking as soon as the first column with any value is found.
This is equivalent to the above, if you prefer Where-Object:
$data | Where-Object {
$row = $_
foreach ($col in $columns) {
if ($row.$col -ne "") {
return $true
}
}
}
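The checks above accept any non-empty value. As a hedged sketch of a stricter variant that keeps only rows containing an actual "0" or "1", the following reads the column values through PSObject.Properties, which keeps the columns in file order. Get-Member, as far as I know, returns names sorted alphabetically, so skipping its first two entries only works here because "IP" and "OS" happen to sort before the user columns. The variable names $csvText and $changed are made up for this example.

```powershell
# Sample data from the question, split into lines for ConvertFrom-Csv.
$csvText = @'
OS;IP;user0;user1;user3
Windows;10.0.0.1;;;
Linux;hostname2;0;;1
Linux;10.0.0.3;;0;0
Linux;hostname4;;;
Windows;hostname5;1;1;1
'@
$data = $csvText -split '\r?\n' | ConvertFrom-Csv -Delimiter ';'

$changed = $data | Where-Object {
    # Skip the first two columns (OS, IP) and collect the remaining values.
    $userValues = $_.PSObject.Properties |
        Select-Object -Skip 2 |
        ForEach-Object { $_.Value }
    # With an array on the left, -match returns the matching elements.
    @(@($userValues) -match '^\s*[01]\s*$').Count -gt 0
}

$changed.IP
```

The regular expression tolerates surrounding whitespace, which mirrors the .trim() call in the final solution above.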

What's the best way in PowerShell to parse these strings?

I'm getting two strings passed into my script:
"Project1,Project2,Project3,Project4"
"web,batch,web,components"
The strings come from a tool in our DevOps toolchain and I have no control over the input format. String 1 could be any number of projects. String 2 will be the same number of entries with the "type" of the project in string 1.
I need to emit one string for each distinct type in the second string that contains the projects from the first string:
"Project1,Project3"
"Project2"
"Project4"
I know I can do it with a bunch of nested foreach loops. Is there a way to do this with a hashtable and/or arrays?
You can turn the original input strings into arrays with the -split operator:
$ProjectNames = "Project1,Project2,Project3,Project4" -split ','
$ProjectTypes = "web,batch,web,components" -split ','
Then create an empty hash table to contain the type-to-projectname mappings
$ProjectsByType = @{}
Finally iterate over the two arrays to group the project names by type:
for($i = 0; $i -lt $ProjectNames.Count; $i++){
if(-not $ProjectsByType.ContainsKey($ProjectTypes[$i])){
# Create key and entry as array if it doesn't already exist
$ProjectsByType[$ProjectTypes[$i]] = @()
}
# Add the project to the appropriate project type key
$ProjectsByType[$ProjectTypes[$i]] += $ProjectNames[$i]
}
Now you can produce your desired strings grouped by project type:
$ProjectsByType.Keys |ForEach-Object {
$ProjectsByType[$_] -join ','
}
You could also create objects from the two arrays and use Group-Object to group them:
$Projects = for($i = 0; $i -lt $ProjectNames.Count; $i++){
New-Object psobject -Property @{
Name = $ProjectNames[$i]
Type = $ProjectTypes[$i]
}
}
$Projects |Group-Object -Property Type
This is more interesting if you want to do further processing of the projects. If you just need the strings, the first approach is easier.
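For completeness, a small sketch of how the Group-Object variant could be taken all the way to the requested strings. It uses the [pscustomobject] cast (PowerShell 3.0 or later) instead of New-Object, and $Strings is a name introduced here for illustration.

```powershell
$ProjectNames = "Project1,Project2,Project3,Project4" -split ','
$ProjectTypes = "web,batch,web,components" -split ','

# Pair each project name with the type at the same position.
$Projects = for ($i = 0; $i -lt $ProjectNames.Count; $i++) {
    [pscustomobject]@{
        Name = $ProjectNames[$i]
        Type = $ProjectTypes[$i]
    }
}

# One comma-separated string of project names per distinct type.
$Strings = $Projects | Group-Object -Property Type | ForEach-Object {
    $_.Group.Name -join ','
}

$Strings
```

Note that Group-Object orders the groups by key, so the strings may not come out in the order in which the types first appear in the input.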
There isn't really an elegant way of combining two arrays that way with built-in methods. A somewhat convoluted way would be the following:
$projects = $projectString -split ','
$types = $typeString -split ','
0..($projects.Count - 1) | group { $types[$_] } | % { $projects[$_.Group] -join ',' }
However, this first generates indices into the arrays to group and format them later, which is inherently a bit iffy (and not very understandable). I tend to pre-process the data to actually reflect what I'm operating on:
$projects = $projectString -split ','
$types = $typeString -split ','
$projectsWithType = 0..($projects.Count - 1) | % {
[pscustomobject]@{
Project = $projects[$_]
Type = $types[$_]
}
}
$projectsWithType | group Type | % { $_.Group.Project -join ',' }
This makes the actual data munging task much clearer.
Another option, which iterates over the first list only once:
$projects = "Project1,Project2,Project3,Project4" -split ','
$types = "web,batch,web,components" -split ','
$linenumber = 0
$projects |%{New-Object psObject -Property @{Project=$_;TypeProject= $types[$linenumber]};$linenumber++} |
group TypeProject |
select Name, @{N="Projects";E={$_.Group.Project -join ","}}

powershell concat all columns in a row

I have 20+ columns in a CSV file, like this:
empid ename deptid mgrid hiredon col6 .... col20
10 a 10 5 10-may-2010
11 b 10 5 08-aug-2005
12 c 11 3 11-dec-2008
I would like to get the output as CSV, like this:
empid, all_other_details
10 , {ename:a;deptid:10;mgrid:5; like this for all 19 columns }
Except for the employee ID, all other columns should be wrapped into a string containing key:value pairs. Is there a way to join all the columns without mentioning each column as $_. ?
I have come up with this; I hope the comments are self-explanatory.
It should work with 2 or more columns.
Delimiters can be changed (on my computer, CSV delimiter is ; not , for example, and I know it can be different with other Cultures).
#declare delimiters
$CSVdelimiter = ";"
$detailsDelimiter = ","
#load file in array
$data = Get-Content "Book1.csv"
#isolate headers
$headers = $data[0].Split($CSVdelimiter)
#declare row counter
$rowCount = 0
#declare results array with headers
$results = @($headers[0] + "$CSVdelimiter`details")
#for each row except first
$data | Select-Object -Skip 1 | % {
#split on $csvDelimiter
$rowArray = $_.Split($CSVdelimiter)
#declare details array
$details = @()
#for each column except first
for($i = 1; $i -lt $rowArray.Count; $i++) {
#add to details array (header:value)
$details += $headers[$i] + ":" + $rowArray[$i]
}
#join details array with $detailsDelimiter to build new row
#append to first column value
#add to results array
$results += "$($rowArray[0])$CSVdelimiter{$($details -join $detailsDelimiter)}"
#increment row counter
$rowCount++
}
#output results to new csv file
$results | Out-File "Book2.csv"
Output looks like this :
empid;details
10;{ename:a,deptid:10,mgrid:5,hiredon:10-may-2010}
11;{ename:b,deptid:10,mgrid:5,hiredon:08-aug-2005}
12;{ename:c,deptid:11,mgrid:3,hiredon:11-dec-2008}
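An alternative sketch, assuming the file can be parsed as regular CSV: after ConvertFrom-Csv (or Import-Csv), every row exposes its columns through PSObject.Properties, so the key:value pairs can be built without naming any column except empid. The sample data is shortened to five columns and uses ";" as the delimiter, both of which are assumptions made for this example.

```powershell
# Shortened sample data; a real file would be read with Import-Csv.
$rows = @'
empid;ename;deptid;mgrid;hiredon
10;a;10;5;10-may-2010
11;b;10;5;08-aug-2005
12;c;11;3;11-dec-2008
'@ -split '\r?\n' | ConvertFrom-Csv -Delimiter ';'

$result = foreach ($row in $rows) {
    # Build "name:value" for every column except empid.
    $pairs = $row.PSObject.Properties |
        Where-Object Name -ne 'empid' |
        ForEach-Object { '{0}:{1}' -f $_.Name, $_.Value }

    [pscustomobject]@{
        empid             = $row.empid
        all_other_details = '{' + ($pairs -join ';') + '}'
    }
}

$result
```

The objects in $result can then be piped to Export-Csv -NoTypeInformation to write the two-column output file.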
Try this:
$csv = Get-Content .\input_file.csv
$keys = $csv[0] -split '\s+'
$c = $keys.count - 1
$keys = ($keys[1..$c] | % {$i = -1}{$i += 1; "$($_):{$i}"}) -join '; '
$csv[1..($csv.count -1)] | % {
$a = $_ -split '\s+'
New-Object psobject -Property @{
empid = $a[0]
all_other_details = "{$($keys -f $a[1..$c])}"
}
} | Export-Csv output_file.csv -NoTypeInformation

Is the Name property of output of Group-Object always string?

The following script can transform (pivot) the array by the third column (x, y). However, it needs to concatenate the first two columns for the Group-Object command, and then the Name of each output group needs to be split to get the original values back.
This can be error-prone if the data contains the separator character, and it does not seem efficient, since extra string concatenation and split operations are needed. Is there a more direct way (like the SQL GROUP BY clause) in PowerShell?
$a =@('a','b','x',10),
@('a','b','y',20),
@('c','e','x',50),
@('c','e','y',30)
# $a | % { "[$_]"}
$a | %{
new-object PsObject -prop @{
label = "$($_[0]),$($_[1])" # Concatenate for grouping
value = @{ $_[2] = $_[3] }
}
} |
group label | % {
$l = @($_.Name -split ",") + # then split to restore
@($_.Group.value.x, $_.Group.value.y)
"[$l]"
}
Yes, the "Name" property of GroupInfo is always a string.
The easiest way to find the distinct values is to sample the first item in each group:
$a |Group-Object -Property {$_[0]},{$_[1]} |ForEach-Object {
$Group = $_.Group
# The first item in each group
$SampleItem = $Group | Select-Object -First 1
# Now we can inspect the key values, $SampleItem[0] and $SampleItem[1]
Write-Host ('This group has {0} and {1} as primary keys:' -f $SampleItem[0..1]) -ForegroundColor Green
$Group |ForEach-Object {
# echo each array in group
Write-Host ($_ -join ' ')
}
}
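As a possible alternative to sampling the first item, each GroupInfo object also has a Values property that holds the grouping values themselves rather than their combined string form. The following is a sketch of the pivot built on that property, under the assumption that every group has at most one x row and one y row. Key1, Key2, and $valueByColumn are names invented for this example.

```powershell
$a = @('a','b','x',10),
     @('a','b','y',20),
     @('c','e','x',50),
     @('c','e','y',30)

$pivoted = $a | Group-Object -Property { $_[0] }, { $_[1] } | ForEach-Object {
    # Map the third column (x / y) to the fourth column for this group.
    $valueByColumn = @{}
    foreach ($item in $_.Group) { $valueByColumn[$item[2]] = $item[3] }

    [pscustomobject]@{
        Key1 = $_.Values[0]   # first grouping value, no string splitting
        Key2 = $_.Values[1]   # second grouping value
        x    = $valueByColumn['x']
        y    = $valueByColumn['y']
    }
}

$pivoted
```

Since no separator character is involved, data containing commas in the key columns does not need special handling in this variant.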