I have a Cypher query question about implementing a filtering mechanism.
Assume we are filtering PROFILE nodes. For simplicity, there are two types of filters, F1 and F2.
Each filter requires a separate MATCH clause in the query.
I need to retrieve the PROFILE nodes where (F1 or F2) = true, sorted by date.
At first I thought of something like the following:
START(node) match (Applying F1 filter)->profile WITH profile,node
match (Applying F2 filter)->profile2
But now I need to add the RETURN clause with ORDER BY.
Is there a way to aggregate the profiles from the previous match without overriding them? Something like:
START(node) match (Applying F1 filter)->profile WITH profile,node
match (Applying F2 filter)->profile return profile order by profile.date
Help will be appreciated
Thanks
I'm trying to filter a dataset by order status. This is my code:
df1=all_in_all_df.groupBy("productName") \
.agg(F.max('orderItemSubTotal')) \
.filter(col("orderStatus") == "CLOSED") \
.show()
But when I run the code, I get the following error:
AnalysisException: cannot resolve 'orderStatus' given input columns: [max(orderItemSubTotal), productName];
'Filter ('orderStatus = CLOSED)
Removing the .filter() call produces a result, but I need to filter the data.
The aggregation restricts the number of resulting columns to the ones used for the grouping (in group by clause) and the result of the aggregation.
Thus, there is no orderStatus column anymore.
If you want to filter on it, you have two options. Either filter before the aggregation, in which case only the filtered rows are taken into account by the aggregation. Or add orderStatus to the group by clause, in which case the aggregation is computed per status rather than globally, and every status is available in the result with its own aggregate.
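The difference between the two options can be sketched in plain Python, without Spark. The sample rows and the helper `max_by` are made up for illustration; only the column names come from the question.

```python
# Hypothetical sample rows; the column names follow the question.
rows = [
    {"productName": "A", "orderStatus": "CLOSED", "orderItemSubTotal": 10.0},
    {"productName": "A", "orderStatus": "OPEN", "orderItemSubTotal": 50.0},
    {"productName": "B", "orderStatus": "CLOSED", "orderItemSubTotal": 30.0},
    {"productName": "B", "orderStatus": "CLOSED", "orderItemSubTotal": 20.0},
]


def max_by(rows, key_fields):
    """Group rows by key_fields and keep the max orderItemSubTotal per group."""
    result = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        current = result.get(key, float("-inf"))
        result[key] = max(current, row["orderItemSubTotal"])
    return result


# Option 1: filter first, then aggregate. OPEN rows never reach the max.
closed_only = [r for r in rows if r["orderStatus"] == "CLOSED"]
max_closed = max_by(closed_only, ["productName"])

# Option 2: keep the status in the grouping key. Every status survives,
# each with its own max, and can still be filtered afterwards.
max_per_status = max_by(rows, ["productName", "orderStatus"])

print(max_closed)
print(max_per_status)
```

In PySpark terms, option 1 corresponds to calling `.filter(col("orderStatus") == "CLOSED")` before `.groupBy("productName")`, and option 2 to `.groupBy("productName", "orderStatus")`.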
First, the schema of the documents I am querying is as follows:
{x:10, y:"temp", z:20}
There are multiple other documents in the collection with the same schema as above.
Now, I have a list in which each element contains a pair of values for the keys x and y. It can be pictured as:
[{10,"temp"}, {20,"temp1"}, .....]
keys -> x y x y
I am aware that if I process the list in a loop and take each pair, I can construct a query like:
query.addCriteria(Criteria.where("x").is(10).and("y").is("temp"))
This will return the document if it matches the AND criteria, and I can query with every pair in the list this way. But this approach involves a high number of calls to the database, since there is one database call for each pair in the list.
To avoid this, is there any way to query for all the documents that match the AND criteria of any element in the list, in a single call, using the Spring Data MongoDB API? Framed differently, I want to avoid looping through the list and making multiple calls, if possible.
You could use Criteria.orOperator to return every document that matches at least one Criteria from your list.
Build your list of Criteria by looping over your list:
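import java.util.ArrayList;
import java.util.List;
import org.springframework.data.mongodb.core.query.Criteria;
// Item is a placeholder for your own pair type exposing x and y.
List<Criteria> criteriaList = new ArrayList<>();
for (Item item : yourList) {
    criteriaList.add(Criteria.where("x").is(item.x).and("y").is(item.y));
}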
Build your query using orOperator:
Query query = Query.query(new Criteria().orOperator(criteriaList.toArray(new Criteria[0])));
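For reference, the filter that such an OR of AND pairs amounts to is a single `$or` document. The sketch below builds that filter as a plain Python dict. The helper name `build_or_filter` and the sample pairs are illustrative assumptions, and the guard reflects the fact that MongoDB rejects an empty `$or` array.

```python
def build_or_filter(pairs):
    """Return a MongoDB filter matching documents whose (x, y) equals any pair."""
    if not pairs:
        # MongoDB requires $or to be given a non-empty array.
        raise ValueError("at least one (x, y) pair is required")
    # Each inner dict is an implicit AND over x and y.
    return {"$or": [{"x": x, "y": y} for x, y in pairs]}


# Sample pairs taken from the question's illustration.
pairs = [(10, "temp"), (20, "temp1")]
query_filter = build_or_filter(pairs)
print(query_filter)
```

One filter like this is sent in one round trip, regardless of how many pairs the list contains.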
I'm looking to translate an HQL query into its JPA Criteria API equivalent.
In HQL I have the following which seems to work.
WHERE (x, y) IN (
... sub query which selects two columns
)
In JPA Criteria API I don't see how I can match a tuple value, only a single value. For example, I know how to convert the following to JPA Criteria.
WHERE (x) IN (
... sub query which selects single column
)
Is this even possible using the Criteria API?
It's not possible in JPQL to match two fields with IN.
You can't do it with the JPA Criteria API either.
I know I can sort the results of a query based on the text score that each result has been assigned using MongoDB text search. But, given two different queries A and B that retrieve different documents D1 and D2, if score(A, D1) > score(B, D2) does it mean that D1 is more related to query A than D2 is to query B?
In other words, are the scores relative to the query or also valid absolutely?
given two different queries A and B that retrieve different documents D1 and D2, if score(A, D1) > score(B, D2) does it mean that D1 is more related to query A than D2 is to query B?
Assuming both queries are against equivalent text search indexes, the same scoring algorithm is used so this seems a correct inference to make.
Factors that can influence the scoring for queries would include text index options like:
field weights
language
text index version (eg: MongoDB 3.2 has text search enhancements associated with version 3 text indexes).
Before marking this question as a duplicate - please read through. I don't think a sufficiently conclusive and general answer has been given yet, as most questions have focused on specific examples.
The MongoDB documentation says that you can specify an aggregate key for the _id value of a $group operation. There are a number of previously answered questions about using MongoDB's aggregation framework to group over multiple fields in this way, e.g.:
{$group: {_id:{field_a:'$field_a', field_b:'$field_b'} } }
Q: In the most general sense, what does this action do?
If grouping documents by field A condenses any documents sharing the same value of field A into a single document, does grouping by fields A and B condense documents with matching values of both A and B into a single document?
Is the grouping operation sequential?
If so, does that imply any level of precedence between 'field_a' and 'field_b' depending on their ordering?
If grouping documents by field A condenses any documents sharing the same value of field A into a single document, does grouping by fields A and B condense documents with matching values of both A and B into a single document?
Let A = { a:A, b:B }, then that automatically follows from the assumption. You didn't make any assumption about the type of A, which is correct: the type doesn't matter. If the type of A is document, the usual comparison rules apply (equal content is considered equal).
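A plain-Python sketch of this behaviour, using made-up documents: grouping by a compound key puts two documents in the same group only when both field values match. The helper `group_count` and the `count` field are illustrative assumptions (the count plays the role of a `{$sum: 1}` accumulator); this emulates the grouping logic and is not MongoDB's implementation.

```python
from collections import Counter

# Hypothetical input documents.
docs = [
    {"field_a": 1, "field_b": "x"},
    {"field_a": 1, "field_b": "x"},
    {"field_a": 1, "field_b": "y"},
    {"field_a": 2, "field_b": "x"},
]


def group_count(docs, fields):
    """Group docs by the combined values of `fields` and count each group."""
    counts = Counter(tuple(d[f] for f in fields) for d in docs)
    return [{"_id": dict(zip(fields, key)), "count": n} for key, n in counts.items()]


groups = group_count(docs, ["field_a", "field_b"])
for g in groups:
    print(g)

# Listing the key fields in the other order yields the same partition:
# the group sizes are identical, only the layout of _id differs.
swapped = group_count(docs, ["field_b", "field_a"])
print(sorted(g["count"] for g in groups) == sorted(g["count"] for g in swapped))
```

The four documents collapse into three groups, because only the first two agree on both fields.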
Is the grouping operation sequential?
I'm not sure what that means. The aggregation pipeline runs accumulator functions on all items in each stage, so it certainly iterates the entire set, but I'd refrain from making assumptions about the exact order that happens in, i.e. from performing any non-associative operations.
If so, does that imply any level of precedence between 'field_a' and 'field_b' depending on their ordering?
No, documents are compared field-by-field and there are no strict guarantees on the ordering of fields ("attempts to...") in MongoDB. However, one can, in principle, create documents that contain multiple fields of the same name where the ordering might matter. But it's hard to do so, since most client interfaces don't allow different fields of equal name.