Multifield in Elastic4s 5.x - scala

I'm currently using Elastic4s v5.0, which still has the multiField construct for indexing a field in more than one way:
elasticClient.execute(
  createIndex("foo") mappings (
    mapping("bar").as(
      multiField("baz").as(
        textField("baz") analyzer myAnalyzer,
        textField("original") index NotAnalyzed
      )
    )
  )
)
However, I get the following error:
No handler for type [multi_field] declared on field []
The answer to ElasticSearch 5: MapperParserException with multi_field and the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html say to use "fields" instead, but I cannot find out how to do this in elastic4s.

In Elasticsearch any multi field has a primary field, which is kind of like a parent field, and then it has secondary fields. The primary field (primary and secondary are my terminology, by the way) is accessed as a, and the secondary fields are accessed as a.b, a.c, and so on.
This might not be how you would first imagine a multi field to be, because you might just think that there's a, b, c as siblings, like a kind of sequence. So it's worth understanding this.
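To make that structure concrete, this is roughly the raw mapping JSON that Elasticsearch 5 expects, sketched with the field names from the question (my_analyzer stands in for myAnalyzer, and keyword stands in for the not-analyzed variant):
{
  "mappings": {
    "bar": {
      "properties": {
        "baz": {
          "type": "text",
          "analyzer": "my_analyzer",
          "fields": {
            "original": { "type": "keyword" }
          }
        }
      }
    }
  }
}
Here baz is the primary field and baz.original is a secondary field.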
In elastic4s, you can just use .fields on any field you want, and then those fields will be combined with the parent to become a multi field. So your example, re-written, would be:
client.execute {
  createIndex("foo").mappings(
    mapping("bar").fields(
      textField("baz").fields(
        textField("inner1") analyzer PatternAnalyzer,
        textField("inner2") index NotAnalyzed
      )
    )
  )
}
Note that as is an alias for fields; I think fields is more readable, so I used it here.
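At query time the secondary fields are then addressed through the parent, as described above. A minimal sketch with the elastic4s 5.x query DSL (matchQuery and the index/type names are the only assumptions here):
client.execute {
  // the secondary field is addressed as parent.child, e.g. baz.inner1
  search("foo" / "bar").query(matchQuery("baz.inner1", "some text"))
}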

Related

Search jsonb fields in postgresql with Hasura

Is it possible to do a greater than search across a jsonb field using hasura?
It looks to be possible in PostgreSQL itself; see How can I do less than, greater than in JSON Postgres fields?
In Postgres I'm storing a table asset:
name: string
version: int
metadata: jsonb
The metadata looks like this:
{'length': 5}
I am able to find assets that match exactly using _contains:
{
  asset(where: {metadata: {_contains: {length: 5}}}) {
    name
    metadata
  }
}
I would like to be able to find assets with a length over 10.
I tried:
{
  asset(where: {metadata: {_gt: {length: 10}}}) {
    name
    metadata
  }
}
A. Possibility to do it on the GraphQL level directly
Hasura documentation: JSONB operators (_contains, _has_key, etc.) mentions only these five operators:
The _contains, _contained_in, _has_key, _has_keys_any and _has_keys_all operators are used to filter based on JSONB columns.
So the direct answer to your question: no, it's not possible on the GraphQL level in Hasura. (At least it's not possible yet. Who knows: maybe in future releases more operators will be implemented.)
B. Using derived views
But there is another way, the one explained in https://hasura.io/blog/postgres-json-and-jsonb-type-support-on-graphql-41f586e47536/#derived-data
This recommendation is repeated in https://github.com/hasura/graphql-engine/issues/6331:
We don't have operators like that for JSONB (might be solved by something like #5211) but you can use a view or computed field to flatten the text field from the JSONB column into a new column and then do a like on that.
The recipe is:
1. Create a view
CREATE VIEW assets -- note plural here; name the view according to your style guide
AS
SELECT
  name,
  version,
  metadata,
  (metadata->>'length')::int AS meta_len -- cast to another number type if needed
FROM asset;
2. Register this view
3. Use it in GraphQL queries as a usual table, e.g.:
query {
  assets(where: {meta_len: {_gt: 10}}) {
    name
    metadata
  }
}
C. Using SETOF-functions
1. Create SETOF-function
CREATE FUNCTION get_assets(min_length int DEFAULT 0)
RETURNS SETOF asset
LANGUAGE SQL
STABLE
AS $$
  SELECT * FROM asset
  WHERE (metadata->>'length')::int > min_length;
$$;
2. Register it in Hasura
3. Use it in queries:
query {
  get_assets(args: {min_length: 10}) {
    name
    metadata
  }
}
I think that was the last possible option. It will not give you the full "schemaless freedom" you may be looking for, but I don't know of other ways.

How to SET jsonb_column = json_build_array( string_column ) in Sequelize UPDATE?

I'm converting a one-to-one relationship into a one-to-many relationship. The old relationship was just a foreign key on the parent record. The new relationship will be an array of foreign keys on the parent record.
(Using Postgres dialect, BTW.)
First I'll add a new JSONB column, which will hold an array of UUIDs.
Then I'll run a query to update all existing rows such that the value from the old column is now stored in the new column (as the first element in an array).
Finally, I'll remove the old column.
I'm looking for help with step 2: writing the update statement that will update all rows, setting the value of the new column based on the value of the old column. Basically, I'm trying to figure out how to express this SQL query using Sequelize:
UPDATE "myTable"
SET "newColumn" = json_build_array("oldColumn")
-- ^^ this really works, btw
Where:
newColumn is type JSONB, and should hold an array (of UUIDs)
oldColumn is type UUID
names are double-quoted because they're mixed case in the DB (shrug)
Expressed using Sequelize sugar, that might be something like:
const { models } = require('../sequelize')
await models.MyModel.update({ newColumn: [ 'oldColumn' ] })
...except that would result in saving an array that contains the string "oldColumn" rather than an array whose first element is the value in that row's oldColumn column.
My experience, and the Sequelize documentation, is focused on working with individual rows via the standard instance methods. I could do that here, but it'd be a lot better to have the database engine do the work internally instead of forcing it to transfer every row to Node and then back again.
Looking for whatever is the most Sequelize-idiomatic way of doing this, if there is one.
Any help is appreciated.
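Not a definitive answer, but one Sequelize-idiomatic sketch that keeps the work in the database: Sequelize.fn and Sequelize.col let you pass a SQL function call over a column reference as an update value. Model and column names below are taken from the question; the empty where is how Model.update targets every row:
const { Sequelize } = require('sequelize')
const { models } = require('../sequelize')

// Build json_build_array("oldColumn") server-side for every row,
// mirroring the raw UPDATE statement above.
await models.MyModel.update(
  { newColumn: Sequelize.fn('json_build_array', Sequelize.col('oldColumn')) },
  { where: {} } // no filter: update all rows
)
This should issue a single UPDATE statement rather than round-tripping each row through Node.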

Elastic search - scala case class #timestamp

I want to save data into Elasticsearch using Spark.
I use this connector: https://www.elastic.co/guide/en/elasticsearch/hadoop/master/spark.html#spark-installation
I can save data using the saveToEsWithMeta method on an RDD with a case class. But when I want to set a field named #timestamp I have a problem. I added an attribute named #timestamp to my case class, but this attribute is saved under the name '$attimestamp' in Elasticsearch instead of '#timestamp'.
I found a workaround using a Map instead of a case class, but do you know a solution using a case class?
Thanks a lot,
Benoît
Maybe try this from the documentation you linked to:
For cases where the id (or other metadata fields like ttl or
timestamp) of the document needs to be specified, one can do so by
setting the appropriate mapping namely es.mapping.id. Following the
previous example, to indicate to Elasticsearch to use the field id as
the document id, update the RDD configuration (it is also possible to
set the property on the SparkConf though due to its global effect it
is discouraged):
EsSpark.saveToEs(rdd, "spark/docs", Map("es.mapping.id" -> "id"))
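An untested sketch along those lines: keep a legal Scala name in the case class and map it onto the timestamp metadata through configuration, the same way the quote maps id (here sc is an assumed SparkContext, and es.mapping.timestamp is the analogous setting for the timestamp metadata field):
import org.elasticsearch.spark.rdd.EsSpark

// Hypothetical case class: ts is a plain Scala name standing in for #timestamp
case class Event(id: String, ts: String, payload: String)

val rdd = sc.makeRDD(Seq(Event("1", "2017-05-01T00:00:00Z", "hello")))

EsSpark.saveToEs(rdd, "spark/docs", Map(
  "es.mapping.id"        -> "id",
  "es.mapping.timestamp" -> "ts"
))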

About Composite Index

When defining a composite index, e.g.
create table temptable (id integer, id2 integer, name string, INDEX ci using plain(id2, id));
The id and id2 are indexed in Elasticsearch as integer, but what I see from ES's _mapping is:
"ci" : {
"type" : "string",
"analyzer" : "standard"
},
Both id and id2 are copied to ci with type "string". Can you explain more about this (like whether the order is preserved), and perhaps a bit more about the whole composite index feature in Crate Data?
You found 2 bugs by doing this, which we'll try to fix asap. ;)
First, using a plain index type should result in the 'keyword' analyzer not the 'standard' one.
Second, a composite index over 2 non-string columns shouldn't result in a string-typed column but, if supported, in one of the same type as the origin columns.
I wrote "if supported" because for now we'd forbid defining a composite index over non-string columns, as we don't know what this would be for.
Our current match function implementation only supports string literals, so this function couldn't be used for querying the composite index.
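For context, a match query against the (string-typed) composite index would look something like this sketch, using the table and index column from the example above:
SELECT * FROM temptable
WHERE MATCH(ci, 'foo'); -- only a string literal is supported here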
Can you explain your use-case a bit?
Maybe creating an issue on GitHub would make sense for this possible enhancement.
The order of the columns used for defining the composite index doesn't matter at all; in the case of strings, the values of both are analyzed and the resulting terms are inserted/merged into the target field.
Thanks for reporting!

Zend Framework relationships - defining column names in findManyToManyRowset()?

I'm working on an application developed using Zend Framework. I have defined relationships in models, and can use them happily, e.g:
$rowset = $row->findManyToManyRowset('People', 'Jobs');
However, I've hit a problem: the rowset that is returned has column names that are the same in 'People' and 'Jobs', and therefore the array keys merge, losing some of the data from the final rowset.
I understand I can pass a Zend_Db_Select object to findManyToManyRowset() as one of the parameters, but can't find any documentation explaining how to use it in this case, e.g.:
$select = $this->select()->from(
    array(
        'p' => 'people',
        'j' => 'jobs'
    ),
    array(
        'person_id' => 'p.id',
        'job_id' => 'j.id',
        'person_code' => 'p.code',
        'job_code' => 'j.code'
    )
);
If i try to use the above code, I get a message such as:
Error: No reference rule "" from table People to table Jobs
Can anyone enlighten me on how this should be done? I know I could change my column names in the database, but I'd prefer a code change as opposed to re-designing my DB structure and updating all the related code.
Note: without some form of column aliasing as above, the rowset returned looks like this (i.e., it merges the columns with the same names):
[_data:protected] => Array
    (
        [id] => 1
        [code] => SX342
    )
Cheers,
Matt
I know this answer comes a little late, but here are some things to point out.
1) findManyToManyRowset($matchTable, $intersectionTable, $callerRefRule, $matchRefRule, $select); -- if you are passing a Zend_Db_Table_Select you are going to want to pass null for the rules.
2) The Zend_Db_Table_Select passed into findManyToManyRowset() should be created from $matchTable, and it is safe to assume that in the where clauses i is the alias for the intersection table and m is the alias for the match table.
3) In the case of collisions, m will win the key name in the associative array returned in PHP. The query executed looks like this:
SELECT
  `i`.*, `m`.*
FROM
  `intersectTable` AS `i`
INNER JOIN
  `matchTable` AS `m`
ON
  `i`.`fk_m` = `m`.`pk`
WHERE
  (`i`.`fk_o` = ?)
4) No matter what, the return value of findManyToManyRowset() will be a Rowset created from the $matchTable. So if you need to capture any information from the intersection table while also capturing the data for the match table, you will probably need to use a custom Zend_Db_Select and avoid the Zend_Db_Table mapping anyway.
So here's a working example, using "People" as the match table, "Workers" as the intersection table, and let's say "Clients" as the originating table. Assuming for this example that the tables link together something like:
People.id:... -> workers.person_id:client_id:job_id -> clients:id:...
$client = $clientTable->fetchRow(); // grab a random client
// fetch all people that have worked for the client ordered by their last name.
$client->findManyToManyRowset("People", "Workers", null, null,
$peopleTable->select()->order('m.lastname'));
// fetch all people that have worked for the client ordered by their hire date:
// `workers`.`hiredate`
$client->findManyToManyRowset("People", "Workers", null, null,
$peopleTable->select()->order('i.hiredate'));
My first recommendation is that you shouldn't give columns such generic names as id and code. These names are meaningless, and as you have discovered they also result in collisions when you fetch results in an associative array.
You're also using the Select interface incorrectly. You should specify only one table per from() call or join() call.
Finally, I never try to do complex queries via the Zend_Db_Table relationships interface. It's intended only for simple cases. If you have a more complex query, just write the SQL query explicitly.
See also How to do a joined query in the ZF tables interface?
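For completeness, an explicit query for the scenario in the question might look like this sketch. The workers intersection table and the join conditions are assumptions here; adjust them to your schema:
$db = Zend_Db_Table::getDefaultAdapter();

$select = $db->select()
    ->from(array('p' => 'people'),
           array('person_id' => 'id', 'person_code' => 'code'))
    ->join(array('w' => 'workers'), 'w.person_id = p.id', array()) // no columns needed from the intersection table
    ->join(array('j' => 'jobs'), 'j.id = w.job_id',
           array('job_id' => 'id', 'job_code' => 'code'));

$rows = $db->fetchAll($select);
Because every column is aliased explicitly, nothing collides when the rows come back as associative arrays.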