Why is my search on 2 fields in Lucene (Zend) failing?

So, I have two fields I am indexing:
| NAME | TYPE |
|------|------|
|Texas Steak | Restaurant |
|Bone Digs Restaurant | Restaurant |
|Rain Sushi | Restaurant |
|Brazil Steakhouse | Restaurant |
|Alfredos | Restaurant |
|Mediterranean Corner | Restaurant |
|Korean Taqueria | Restaurant |
|Orlando Cantina | Restaurant |
|Boomer Bar and Grill | Restaurant |
|Top Spice | Restaurant |
|Tamarind | Restaurant |
I am indexing in the following fashion:
Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());
//let lucene search for numbers as well as words
try {
    $index = Zend_Search_Lucene::create('data/');
} catch(Zend_Search_Lucene_Exception $e){
    echo "Unable {$e->getMessage()}";
}
$mdlPage = new Application_Model_DbTable_Business();
$currentPages = $mdlPage->fetchAll();
if($currentPages->count() >= 0) {
    foreach ($currentPages as $p) {
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::text('business_name', $p->business_name));
        $doc->addField(Zend_Search_Lucene_Field::text('primary_category',$p->primary_category));
        $index->addDocument($doc);
    }
}
$index->commit();
$index->optimize();
$this->view->indexSize = $index->numDocs();
Now, I am querying the index in the following fashion:
$query = new Zend_Search_Lucene_Search_Query_MultiTerm();
$query->addTerm(new Zend_Search_Lucene_Index_Term($keyword,'business_name'),null);
$query->addTerm(new Zend_Search_Lucene_Index_Term($keyword,'primary_category'),null);
echo $query;
try {
    $results = $index->find($query);
}
catch (Zend_Search_Lucene_Exception $e) {
    echo "Unable {$e->getMessage()}";
}
Here is the odd thing. If I query for 'thai' or any other word that appears in the business_name field, I get results. But as soon as I query for 'restaurant', the page renders nothing but a white screen. If I change my code to this:
$query = new Zend_Search_Lucene_Search_Query_MultiTerm();
//$query->addTerm(new Zend_Search_Lucene_Index_Term($keyword,'business_name'),null);
$query->addTerm(new Zend_Search_Lucene_Index_Term($keyword,'primary_category'),null);
Then I do get results. I don't get what Lucene does not like. Is it my data? Is it how I am creating my query? Or is it how I am creating my index?

Related

Using PowerShell Core ConvertFrom-Markdown to parse values in a markdown table

I'm interested in using the ConvertFrom-Markdown cmdlet to parse values in a markdown table. The cmdlet uses the markdig markdown processor, which has an Abstract Syntax Tree that should be able to be traversed for this purpose.
How can we search/enumerate the Tokens in the following powershell snippet to return the rows and columns?
(@'
# header1
## header2
| Column1 | Column2 |
| ------- | ------- |
| Row1Column1 | Row1Column2 |
| Row2Column1 | Ro2Column2 |
'@ | ConvertFrom-Markdown).Tokens
The values that I see in the Tokens look promising, I can see Markdig.Extensions.Tables.TableCell in the Parent fields, but that's about as far as I can get.
Here's a way to do it.
Note: I'm not sure whether, for example, a Table can contain only TableRows, so the | where-object { ... } filters might not be necessary.
# set up some sample data
$md = @"
# header1
## header2
| Column1 | Column2 |
| ------- | ------- |
| Row1Column1 | Row1Column2 |
| Row2Column1 | Ro2Column2 |
"@ | ConvertFrom-Markdown
# walk the syntax tree
$mdDoc = $md.Tokens;
$mdTables = @( $mdDoc | where-object { $_ -is [Markdig.Extensions.Tables.Table] } );
foreach( $mdTable in $mdTables )
{
    write-host "table";
    $mdRows = @( $mdTable | where-object { $_ -is [Markdig.Extensions.Tables.TableRow] } );
    foreach( $mdRow in $mdRows )
    {
        write-host " row";
        write-host " header = $($mdRow.IsHeader)";
        $mdCells = @( $mdRow | where-object { $_ -is [Markdig.Extensions.Tables.TableCell] } );
        foreach( $mdCell in $mdCells )
        {
            write-host " cell";
            $mdInline = $mdCell.Inline;
            write-host " inline - $($mdInline.Content)";
        }
    }
}
Which gives the following output:
table
row
header = True
cell
inline - Column1
cell
inline - Column2
row
header = False
cell
inline - Row1Column1
cell
inline - Row1Column2
row
header = False
cell
inline - Row2Column1
cell
inline - Ro2Column2
Hopefully that'll be enough to get you started...
If you'd like to import the markdown tables into PowerShell arrays, you can parse them and build PSCustomObjects along the way:
$MarkDown = @"
# header1
## header2
| Column1 | Column2 |
| ------- | ------- |
| Row1Column1 | Row1Column2 |
| Row2Column1 | Ro2Column2 |

| Table2 Column1 |
| ------------- |
| T2 Row1 |
| T2 Row2 |
| T2 Row3 |
"@ | ConvertFrom-Markdown
$mdDoc = $Markdown.Tokens
[array]$tables = $null
$mdTables = @($mdDoc | where {$_ -is [Markdig.Extensions.Tables.Table]})
foreach ($mdTable in $mdTables) {
    [array]$table = $null
    $mdRows = @($mdTable | where {$_ -is [Markdig.Extensions.Tables.TableRow]})
    foreach ($mdRow in $mdRows) {
        $mdCells = @($mdRow | where-object { $_ -is [Markdig.Extensions.Tables.TableCell]})
        $mdCellsValues = @($mdCells.Inline.Content | foreach {$_.ToString()})
        if ($mdRow.IsHeader) {# don't use headers as values
            $CustomProperties = $mdCellsValues
        } else {# iterate through the custom object and populate it
            $thisrow = New-Object PSCustomObject | select $CustomProperties
            foreach ($i in 0..($CustomProperties.Count -1)) {
                $thisrow.($CustomProperties[$i]) = $mdCellsValues[$i]
            }
            $table += $thisrow
        }# endif
    }#end tablerows
    $tables += ,$table #add each table as a sub-array
}#end tables
$tables
The result is available as two sub-arrays:
C:\> $tables[0]
Column1 Column2
------- -------
Row1Column1 Row1Column2
Row2Column1 Ro2Column2
C:\> $tables[1]
Table2 Column1
-------------
T2 Row1
T2 Row2
T2 Row3

PowerShell: Iterating Multiple Variables not working as expected

I am trying to iterate over an array $dailyTasks to find 'Blank' i.e. '' values in the EmployeeName column and inject names from another array into those empty values.
Example of what the array looks like before the for loop starts:
| Task | EmployeeName | EmployeeName2 |
|-------|--------------|---------------|
| Task1 | | |
| Task2 | Person Y | |
| Task3 | | |
| Task4 | Person Z | Person X |
This is my for loop code that produces an undesired result. $randomisedUsers is an Object[]
$randomisedUsers | Group-Object { $_ -in ($randomisedUsers | Select-Object -Last 2) } | ForEach-Object {
    if ($_.Name -eq 'True') {
        for ($i = 0; $i -lt $dailyTasks.Count; $i++) {
            if ($dailyTasks[$i].Task -eq 'Task4') {
                $dailyTasks[$i].EmployeeName = $_.Group.EmployeeName[0]
                $dailyTasks[$i].EmployeeName2 = $_.Group.EmployeeName[1]
            }
        }
    } else {
        for ($i = 0; $i -lt $dailyTasks.Count; $i++) {
            if ($dailyTasks[$i].EmployeeName -eq '') {
                if ($_.Count -gt '1') {
                    for ($x = 0; $x -lt $_.Group.EmployeeName.Count; $x++) {
                        $dailyTasks[$i].EmployeeName = $_.Group.EmployeeName[$x]
                    }
                } else {
                    $dailyTasks[$i].EmployeeName = $_.Group.EmployeeName
                }
            }
        }
    }
}
Result:
| Task | EmployeeName | EmployeeName2 |
|-------|--------------|---------------|
| Task1 | Person A | |
| Task2 | Person Y | |
| Task3 | Person A | |
| Task4 | Person Z | Person X |
The problem here is that $_.Group.EmployeeName contains two objects, but for whatever reason the result table doesn't populate Person B in the array:
$_.Group.EmployeeName
{Person A, Person B}
The desired result in this case is:
| Task | EmployeeName | EmployeeName2 |
|-------|--------------|---------------|
| Task1 | Person A | |
| Task2 | Person Y | |
| Task3 | Person B | |
| Task4 | Person Z | Person X |
I'm not completely sure where I'm going wrong in my for loops, and I've been stuck on this for a while...
TIA
I would personally use something like this:
$csv = @'
Task,EmployeeName,EmployeeName2
Task1,,
Task2,Person Y,
Task3,,
Task4,Person Z,Person X
'@ | ConvertFrom-Csv
$fillEmployees = [System.Collections.ArrayList]@(
    'Person A'
    'Person B'
)
foreach($line in $csv)
{
    if([string]::IsNullOrWhiteSpace($line.EmployeeName))
    {
        $line.EmployeeName = $fillEmployees[0]
        $fillEmployees.RemoveAt(0)
    }
}
The flow is quite simple: if the loop finds an EmployeeName value that is null or whitespace, it replaces that value with the item at index 0 of $fillEmployees and then removes that item from the list.
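The same flow can also be sketched with a queue instead of RemoveAt(0), plus a guard for the case where the replacement names run out before the blank rows do. This is only a sketch under assumptions: the sample rows mirror the table in the question, and the names $dailyTasks and $fillQueue are chosen here for illustration.

```powershell
# Hypothetical sample data mirroring the table in the question
$dailyTasks = @'
Task,EmployeeName,EmployeeName2
Task1,,
Task2,Person Y,
Task3,,
Task4,Person Z,Person X
'@ | ConvertFrom-Csv

# Queue of replacement names; Dequeue() returns the next name and removes it
$fillQueue = [System.Collections.Generic.Queue[string]]::new()
'Person A', 'Person B' | ForEach-Object { $fillQueue.Enqueue($_) }

foreach ($task in $dailyTasks) {
    # Stop once every replacement name has been handed out
    if ($fillQueue.Count -eq 0) { break }

    # Only rows with a blank EmployeeName receive a name from the queue
    if ([string]::IsNullOrWhiteSpace($task.EmployeeName)) {
        $task.EmployeeName = $fillQueue.Dequeue()
    }
}

$dailyTasks | Format-Table
```

With this data, Task1 receives Person A and Task3 receives Person B, while rows that already had a name are left alone.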
It's hard to tell what you're trying to accomplish with your code, but if you have an array of type System.Array filled with random names that will be used to fill these empty EmployeeName values, you can convert that array to an ArrayList, which lets you use the .RemoveAt(..) method:
PS /> $fillEmployees = 0..10 | ForEach-Object {"Employee {0}" -f [char](Get-Random -Minimum 65 -Maximum 90)}
PS /> $fillEmployees
Employee J
Employee S
Employee D
Employee P
Employee O
Employee E
Employee M
Employee K
Employee R
Employee F
Employee A
PS /> $fillEmployees.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
Attempting to Remove an item from an Array would result in the following:
PS /> $fillEmployees.RemoveAt(0)
Exception calling "RemoveAt" with "1" argument(s): "Collection was of a fixed size."
At line:1 char:1
...
...
However, if you cast it to an ArrayList (which copies the array rather than converting it in place):
PS /> $fillEmployees = [System.Collections.ArrayList]$fillEmployees
PS /> $fillEmployees.RemoveAt(0)
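To illustrate the "copy, not convert" point, here is a minimal sketch with made-up names ($names and $list are assumptions for the example). The cast builds a new ArrayList from the array's elements, so removing an item from the list leaves the source array as it was:

```powershell
# A fixed-size array of sample names (illustrative values)
$names = 'Person A', 'Person B', 'Person C'

# The cast creates a new ArrayList populated from the array
$list = [System.Collections.ArrayList]$names
$list.RemoveAt(0)

# The list shrank by one; the source array still has all three items
"List count:  $($list.Count)"
"Array count: $($names.Count)"
```

In the transcript above the same variable name is reused, so the original array simply becomes unreachable once the ArrayList is assigned to it.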

PowerShell: Expression only with Last item of an Array

I've been stuck on this for a little while. I've got an array of people, and I'm trying to take the last person and create a separate column containing only that person.
I've played around with @{Name = 'NAME'; Expression = {}} in Select-Object but I don't really know how to tackle it.
Current:
| Employee |
|---------------|
| John Doe |
| Jane West |
| Jordan Row |
| Paul Willson |
| Andrew Wright |
Desired Result:
| Employee | Employee2 |
|--------------|---------------|
| John Doe | |
| Jane West | |
| Jordan Row | |
| Paul Willson | Andrew Wright |
TIA!
So what I decided to do here is create 2 groups: one group contains all of the values except the last 2, and the other contains those last 2 values.
# create the sample array
$employees = @(
    'John Doe'
    'Jane West'
    'Jordan Row'
    'Paul Willson'
    'Andrew Wright'
)
$employees |
    # Separate objects into 2 groups: those contained in the last 2 values and those not contained in the last 2 values
    Group-Object {$_ -in ($employees | Select-Object -Last 2)} |
    ForEach-Object {
        switch ($_) {
            {$_.name -eq 'False'} { # 'False' Name of group where values are not one of the last 2
                # Iterate through all the values and assign them to Employee property. Leave Employee2 property blank
                $_.group | ForEach-Object {
                    [PSCustomObject]@{
                        Employee = $_
                        Employee2 = ''
                    }
                }
            }
            {$_.name -eq 'True'} { # 'True' Name of group where values are those of the last 2
                # Create an object that assigns the values to Employee and Employee2
                [PSCustomObject]@{
                    Employee = $_.group[0]
                    Employee2 = $_.group[1]
                }
            }
        }
    }
Output
Employee Employee2
-------- ---------
John Doe
Jane West
Jordan Row
Paul Willson Andrew Wright
Edit
Here is another way you can do it
$employees[0..($employees.Count-3)] | ForEach-Object {
    [PSCustomObject]@{
        Employee = $_
        Employee2 = ''
    }
}
[PSCustomObject]@{
    Employee = $employees[-2]
    Employee2 = $employees[-1]
}

Get rows from one datatable where Id not in another using Powershell

I have two DataTables in Powershell with differing columns but one common Id column.
I want to get the rows from DataTable A where the Id of the row doesn't appear in DataTable B.
|DataTable A |
|---------------------------|
|Id|SomeName|SomeDescription|
|--|--------|---------------|
|1 |Blah |Whatevs |
|2 |Foo |Bar |
|3 |Woo |Yeah |
|DataTable B |
|------------------------------------|
| Id | SomeOtherName | SomeOtherDesc |
|----|---------------|---------------|
| 1 | Blah blah | Yadda yadda |
| 2 | Foo foo | Bah bah |
The result I'd like:
|DataTable Result |
|---------------------------|
|Id|SomeName|SomeDescription|
|--|--------|---------------|
|3 |Woo |Yeah |
How is this best done in Powershell?
I quickly whipped up CSV tables rather than DataTables, but as long as the Id columns can be read as arrays, this should work:
$tableA = @'
Id,somename,somedescription
1,Blah,Whatevs
2,Foo,Bar
3,Woo,Yeah
'@
$tableB = @'
Id,somename,somedescription
1,Blah,Whatevs asd
2,Foo,Bar asd
'@
$importA = $tableA | Convertfrom-csv
$importB = $tableB | Convertfrom-csv
$importA | Where-Object { $importB.Id -notcontains $_.Id }
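Note that -notcontains scans the whole $importB.Id array for every row of table A. For larger tables, a possible variation (a sketch, not part of the answer above) is to collect the Ids of table B into a HashSet first. The variable names $idsInB and $onlyInA and the sample rows are assumptions for the example; the Ids are compared as strings because that is what ConvertFrom-Csv produces.

```powershell
# Sample data shaped like the two tables in the question
$importA = @'
Id,SomeName,SomeDescription
1,Blah,Whatevs
2,Foo,Bar
3,Woo,Yeah
'@ | ConvertFrom-Csv

$importB = @'
Id,SomeOtherName,SomeOtherDesc
1,Blah blah,Yadda yadda
2,Foo foo,Bah bah
'@ | ConvertFrom-Csv

# Collect the Ids of table B once, so each later lookup is a hash probe
$idsInB = [System.Collections.Generic.HashSet[string]]::new([string[]]$importB.Id)

# Keep only the rows of A whose Id is absent from B
$onlyInA = $importA | Where-Object { -not $idsInB.Contains($_.Id) }
$onlyInA
```

For tables of this size the difference is negligible; the one-liner with -notcontains is easier to read.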
Since you work with DataTables and the result should also be a DataTable, this should do it:
# DataTable A
$dtA = New-Object System.Data.DataTable
$dtA.Columns.Add([System.Data.DataColumn]::new("Id",[int]))
$dtA.Columns.Add([System.Data.DataColumn]::new("SomeName"))
$dtA.Columns.Add([System.Data.DataColumn]::new("SomeDescription"))
$row = $dtA.NewRow()
$row["Id"] = 1
$row["SomeName"] = "Blah"
$row["SomeDescription"] = "Whatevs"
$dtA.rows.Add($row)
$row = $dtA.NewRow()
$row["Id"] = 2
$row["SomeName"] = "Foo"
$row["SomeDescription"] = "Bar"
$dtA.rows.Add($row)
$row = $dtA.NewRow()
$row["Id"] = 3
$row["SomeName"] = "Woo"
$row["SomeDescription"] = "Yeah"
$dtA.rows.Add($row)
# DataTable B
$dtB = New-Object System.Data.DataTable
$dtB.Columns.Add([System.Data.DataColumn]::new("Id",[int]))
$dtB.Columns.Add([System.Data.DataColumn]::new("SomeOtherName"))
$dtB.Columns.Add([System.Data.DataColumn]::new("SomeOtherDesc"))
$row = $dtB.NewRow()
$row["Id"] = 1
$row["SomeOtherName"] = "Blah blah"
$row["SomeOtherDesc"] = "Yadda yadda"
$dtB.rows.Add($row)
$row = $dtB.NewRow()
$row["Id"] = 2
$row["SomeOtherName"] = "Foo foo"
$row["SomeOtherDesc"] = "Ba ba"
$dtB.rows.Add($row)
# create a clone of datatable A (no data, just the structure)
$dtResult = $dtA.Clone()
# Get the Id values that are in DataTable A, but not in DataTable B
$diff = Compare-Object -ReferenceObject $dtA.Id -DifferenceObject $dtB.Id -PassThru | Where-Object { $_.SideIndicator -eq '<=' }
$dtA | Where-Object { $diff -contains $_.Id } | ForEach-Object {
    # here, $_ is of type System.Data.DataRow
    # you cannot add this DataRow directly because it belongs to another DataTable,
    # to overcome that use the 'ItemArray' property to get an array of the values inside
    $null = $dtResult.Rows.Add($_.ItemArray)
}
$dtResult
Result (type System.Data.DataTable):
Id SomeName SomeDescription
-- -------- ---------------
3 Woo Yeah

Trouble filtering a Nested Hashtable inside a Switch

I've built a WPF form with 2 Combo Boxes, both filled with queries from this Nested Hashtable.
cboAgencies fills when the script runs and the form loads. cboOffices fills based on the selection made in cboAgencies and lists the office locations specific to the selected agency.
Or, it's supposed to.
Here's the relevant code:
$cboAgencies.Add_SelectionChanged({
    $Script:selectedAgency = $cboAgencies.SelectedItem.ToString()
    $Script:cboOffices.Items.Clear()
    $Script:selectedAdmin = $null
    Switch($Script:selectedAgency) {
        subdivision1 {
            $Script:selectedAdmin = 'division2'
            $Script:arr_Offices = ($Script:hash_AgencyOffices[$Script:selectedAdmin][$Script:selectedAgency].Keys | Sort-Object)
            $Script:arr_Offices | ForEach-Object {
                $Script:cboOffices.Items.Add($_) | Out-Null
            }
            break
        }
        subdivision2 {
            $Script:selectedAdmin = 'division2'
            $Script:arr_Offices = ($Script:hash_AgencyOffices[$Script:selectedAdmin][$Script:selectedAgency].Keys | Sort-Object)
            $Script:arr_Offices | ForEach-Object {
                $Script:cboOffices.Items.Add($_) | Out-Null
            }
            break
        }
        division1 {
            $Script:selectedAdmin = 'division1'
            $Script:arr_Offices = ($Script:hash_AgencyOffices[$Script:selectedAdmin][$Script:selectedAgency].Keys | Sort-Object)
            $Script:arr_Offices | ForEach-Object {
                $Script:cboOffices.Items.Add($_) | Out-Null
            }
            break
        }
    }
})
Note: The code is anonymized. There are no spaces in the real division or subdivision names in the hashtable. Just in case that is a point of concern.
Running this code using . .\script.ps1, I query the various variables and find that the following are all set correctly:
$cboAgencies.SelectedItem (and $Script:cboAgencies.SelectedItem)
$selectedAgency (and $Script:selectedAgency)
$selectedAdmin (and $Script:selectedAdmin)
But $arr_Offices is empty, meaning that the problem is happening in this line:
$Script:arr_Offices = ($Script:hash_AgencyOffices[$Script:selectedAdmin][$Script:selectedAgency].Keys | Sort-Object)
I've further confirmed this by placing this outside the Switch statement:
$Script:arr_Offices = ($Script:hash_AgencyOffices['division2']['subdivision1'] | Sort-Object)
$Script:arr_Offices | ForEach-Object {
    $Script:cboOffices.Items.Add($_) | Out-Null
}
And $arr_Offices and $cboOffices fill just fine.
What have I done wrong here?
From your linked question, $script:hash_AgencyOffices[$Script:selectedAdmin][$Script:selectedAgency] results in an array of strings; it will not have a .Keys property.
Your data structure is "Root hashtable which has keys, their values are hashtables, which have keys, their values are arrays which directly have content".
So you can use AgencyOffices['division2'].Keys and AgencyOffices['division2']['subdivision1'], but not AgencyOffices['division2']['subdivision1'].Keys.
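A small sketch of that structure, using made-up office names (the real hashtable contents are not shown in the question, so every value here is an assumption), shows what each level returns. Without strict mode, asking an array of strings for .Keys quietly yields $null, which is why the combo box ends up empty rather than the script raising an error.

```powershell
# Hypothetical stand-in for the real data: division -> agency -> array of offices
$hash_AgencyOffices = @{
    division1 = @{
        division1 = @('Office D')
    }
    division2 = @{
        subdivision1 = @('Office B', 'Office A')
        subdivision2 = @('Office C')
    }
}

# The middle level is a hashtable, so .Keys lists the agencies
$agencies = $hash_AgencyOffices['division2'].Keys | Sort-Object

# The innermost level is already the array of offices; sort it directly
$offices = $hash_AgencyOffices['division2']['subdivision1'] | Sort-Object

# Strings have no Keys property, so (without strict mode) this yields $null
$noKeys = $hash_AgencyOffices['division2']['subdivision1'].Keys

"Agencies: $($agencies -join ', ')"
"Offices:  $($offices -join ', ')"
"Keys on the array is null: $($null -eq $noKeys)"
```

Applied to the handler in the question, that means dropping .Keys from the three $Script:arr_Offices assignments inside the Switch, which matches the line that already works outside it.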