How to remove ranking of query results - pg-search

I have the following pg_search scope on my stories.rb model:
pg_search_scope :with_text,
:against => :title,
:using => { :tsearch => { :dictionary => "english" }},
:associated_against => { :posts => :contents }
I want the query to return the results ignoring any ranking (I care only about the date the story was last updated order DESC). I know that this is an easy question for most of the people who view it, but how do I turn off the rank ordering in pg_search?

I'm the author of pg_search.
You could do something like this, which uses ActiveRecord::QueryMethods#reorder
MyModel.with_text("foo").reorder("updated_at DESC")

Related

How to get document which meets one of two conditions [duplicate]

From the docs:
You can also chain multiple where() methods to create more specific queries (logical AND).
How can I perform an OR query?
Example:
Give me all documents where the field status is open OR upcoming
Give me all documents where the field status == open OR createdAt <= <somedatetime>
OR isn't supported as it's hard for the server to scale it (requires keeping state to dedup). The work around is to issue 2 queries, one for each condition, and dedup on the client.
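The client-side dedup step can be sketched like this. It is only a minimal illustration: `mergeById` is a hypothetical helper, and the two arrays stand in for the results of the two queries (for example built from `snapshot.docs.map(d => ({ id: d.id, ...d.data() }))`).

```javascript
// Merge the results of several queries and drop duplicates by document id.
// Each result set is assumed to be an array of { id, ...fields } objects.
function mergeById(...resultSets) {
  const seen = new Map();
  for (const docs of resultSets) {
    for (const doc of docs) {
      // Keep the first copy of each document, ignore later repeats.
      if (!seen.has(doc.id)) seen.set(doc.id, doc);
    }
  }
  return [...seen.values()];
}

// Hypothetical stand-ins for "status == open" and "createdAt <= somedatetime".
const openDocs = [
  { id: 'a', status: 'open', createdAt: 5 },
  { id: 'b', status: 'open', createdAt: 50 },
];
const oldDocs = [
  { id: 'a', status: 'open', createdAt: 5 },
  { id: 'c', status: 'closed', createdAt: 7 },
];

const merged = mergeById(openDocs, oldDocs);
console.log(merged.map(d => d.id)); // [ 'a', 'b', 'c' ]
```

Document 'a' matches both conditions but appears only once in the merged list.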
Edit (Nov 2019):
Cloud Firestore now supports IN queries which are a limited type of OR query.
For the example above you could do:
// Get all documents in 'foo' where status is open or upcoming
db.collection('foo').where('status','in',['open','upcoming']).get()
However it's still not possible to do a general OR condition involving multiple fields.
With the recent addition of IN queries, Firestore supports "up to 10 equality clauses on the same field with a logical OR"
A possible solution to (1) would be:
documents.where('status', 'in', ['open', 'upcoming']);
See Firebase Guides: Query Operators | in and array-contains-any
Another suggestion is to give each status a numeric value as well.
For example:
{ name: "a", statusValue: 10, status: "open" }
{ name: "b", statusValue: 20, status: "upcoming" }
{ name: "c", statusValue: 30, status: "close" }
You can then query with ref.where('statusValue', '<=', 20), and both 'a' and 'b' will be found.
This can reduce query cost and improve performance.
Note that it does not cover every case.
I would have no "status" field, but status related fields, updating them to true or false based on request, like
{ name: "a", status_open: true, status_upcoming: false, status_closed: false}
Alternatively, look at Firebase Cloud Functions. You could have a function that listens for status changes and updates the status-related properties, like
{ name: "a", status: "open", status_open: true, status_upcoming: false, status_closed: false}
Either way, your query could be just
...where('status_open','==',true)...
Hope it helps.
This doesn't solve all cases, but for "enum" fields, you can emulate an "OR" query by making a separate boolean field for each enum-value, then adding a where("enum_<value>", "==", false) for every value that isn't part of the "OR" clause you want.
For example, consider your first desired query:
Give me all documents where the field status is open OR upcoming
You can accomplish this by splitting the status: string field into multiple boolean fields, one for each enum-value:
status_open: bool
status_upcoming: bool
status_suspended: bool
status_closed: bool
To perform your "where status is open or upcoming" query, you then do this:
where("status_suspended", "==", false).where("status_closed", "==", false)
How does this work? Well, because it's an enum, you know one of the values must have true assigned. So if you can determine that all of the other values don't match for a given entry, then by deduction it must match one of the values you originally were looking for.
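Here is a small in-memory sketch of that deduction, not a Firestore call. The helper names (`toStatusFlags`, `excludedFlags`) and the sample documents are made up for illustration; the filter at the end plays the role of the chained where(..., "==", false) clauses.

```javascript
const STATUSES = ['open', 'upcoming', 'suspended', 'closed'];

// Expand a single status string into one boolean field per enum value.
function toStatusFlags(status) {
  const flags = {};
  for (const s of STATUSES) flags[`status_${s}`] = s === status;
  return flags;
}

// Field names that must be false for "status is one of `wanted`".
function excludedFlags(wanted) {
  return STATUSES.filter(s => !wanted.includes(s)).map(s => `status_${s}`);
}

const docs = [
  { name: 'a', ...toStatusFlags('open') },
  { name: 'b', ...toStatusFlags('upcoming') },
  { name: 'c', ...toStatusFlags('closed') },
];

// Equivalent of chaining one where(field, "==", false) per excluded value.
const mustBeFalse = excludedFlags(['open', 'upcoming']);
const matches = docs.filter(d => mustBeFalse.every(f => d[f] === false));

console.log(mustBeFalse);               // [ 'status_suspended', 'status_closed' ]
console.log(matches.map(d => d.name));  // [ 'a', 'b' ]
```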
See also
in/not-in/array-contains-in: https://firebase.google.com/docs/firestore/query-data/queries#in_and_array-contains-any
!=: https://firebase.googleblog.com/2020/09/cloud-firestore-not-equal-queries.html
I don't like everyone saying it's not possible.
It is, if you create another "hacky" field in the model to build a composite.
For instance, create an array on each document that holds all the logical OR elements,
then query with .where("field", arrayContains: [...])
you can bind two Observables using the rxjs merge operator.
Here you have an example.
import { Observable } from 'rxjs/Observable';
import 'rxjs/add/observable/merge';
...
getCombinatedStatus(): Observable<any> {
return Observable.merge(this.db.collection('foo', ref => ref.where('status','==','open')).valueChanges(),
this.db.collection('foo', ref => ref.where('status','==','upcoming')).valueChanges());
}
Then you can subscribe to the new Observable's updates using the method above:
this.getCombinatedStatus().subscribe(results => console.log(results));
I hope this can help you, greetings from Chile!!
We have the same problem just now, luckily the only possible values for ours are A,B,C,D (4) so we have to query for things like A||B, A||C, A||B||C, D, etc
As of like a few months ago firebase supports a new query array-contains so what we do is make an array and we pre-process the OR values to the array
if (a) {
[array addObject:@"a"];
}
if (b) {
[array addObject:@"b"];
}
if (a||b) {
[array addObject:@"a||b"];
}
etc
And we do this for all 4! values or however many combos there are.
THEN we can simply check the query [document arrayContains:@"a||c"] or whatever type of condition we need.
So if something only qualified for conditional A of our 4 conditionals (A,B,C,D) then its array would contain the following literal strings: @["A", "A||B", "A||C", "A||D", "A||B||C", "A||B||D", "A||C||D", "A||B||C||D"]
Then for any of those OR combinations we can just search array-contains on whatever we may want (e.g. "A||C")
Note: This is only a reasonable approach if you have a small number of possible values to compare with OR.
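The pre-processing step can be generated rather than written by hand. This is only a sketch: `orKeysFor` is a hypothetical helper, and it assumes the query side builds its key with the values in the same fixed order (for example "A||C", never "C||A").

```javascript
// Build every OR-combination key that a document with `value` should match.
// `allValues` is the full, ordered list of possible values.
function orKeysFor(value, allValues) {
  const keys = [];
  const n = allValues.length;
  // Walk every non-empty subset of allValues via a bitmask.
  for (let mask = 1; mask < (1 << n); mask++) {
    const subset = allValues.filter((_, i) => mask & (1 << i));
    // Keep only the subsets that include this document's value.
    if (subset.includes(value)) keys.push(subset.join('||'));
  }
  return keys;
}

const keys = orKeysFor('A', ['A', 'B', 'C', 'D']);
console.log(keys.length);           // 8
console.log(keys.includes('A||C')); // true
console.log(keys.includes('B||C')); // false
```

The number of keys doubles with each extra possible value, which is why this only works for small enums.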
More info on Array-contains here, since it's newish to firebase docs
If you have a limited number of fields, definitely create new fields with true and false like in the example above. However, if you don't know what the fields are until runtime, you have to just combine queries.
Here is a tags OR example...
// the ids of students in class
const students = [studentID1, studentID2,...];
// get all docs where student.studentID1 = true
const results = this.afs.collection('classes',
ref => ref.where(`students.${students[0]}`, '==', true)
).valueChanges({ idField: 'id' }).pipe(
switchMap((r: any) => {
// get all docs where student.studentID2...studentIDX = true
const docs = students.slice(1).map(
(student: any) => this.afs.collection('classes',
ref => ref.where(`students.${student}`, '==', true)
).valueChanges({ idField: 'id' })
);
return combineLatest(docs).pipe(
// combine results by reducing array
map((a: any[]) => {
const g: [] = a.reduce(
(acc: any[], cur: any) => acc.concat(cur)
).concat(r);
// filter out duplicates by 'id' field
return g.filter(
(b: any, n: number, a: any[]) => a.findIndex(
(v: any) => v.id === b.id) === n
);
}),
);
})
);
Unfortunately there is no other way to combine more than 10 items (use array-contains-any if < 10 items).
There is also no other way to avoid duplicate reads, as you don't know the ID fields that will be matched by the search. Luckily, Firebase has good caching.
For those of you that like promises...
const p = await results.pipe(take(1)).toPromise();
For more info on this, see this article I wrote.
OR isn't supported,
but if you need it you can do it in your own code.
For example, to query products where (size equals XL OR XXL) AND gender is Male:
productsCollectionRef
//1* first get query where can firestore handle it
.whereEqualTo("gender", "Male")
.addSnapshotListener((queryDocumentSnapshots, e) -> {
if (queryDocumentSnapshots == null)
return;
List<Product> productList = new ArrayList<>();
for (DocumentSnapshot snapshot : queryDocumentSnapshots.getDocuments()) {
Product product = snapshot.toObject(Product.class);
//2* then check your query OR Condition because firestore just support AND Condition
if (product.getSize().equals("XL") || product.getSize().equals("XXL"))
productList.add(product);
}
liveData.setValue(productList);
});
For Flutter dart language use this:
db.collection("projects").where("status", whereIn: ["public", "unlisted", "secret"]);
Actually, I found that @Dan McGrath's answer works. Here is a rewrite of his answer:
private void query() {
FirebaseFirestore db = FirebaseFirestore.getInstance();
db.collection("STATUS")
.whereIn("status", Arrays.asList("open", "upcoming")) // you can add up to 10 different values like : Arrays.asList("open", "upcoming", "Pending", "In Progress", ...)
.addSnapshotListener(new EventListener<QuerySnapshot>() {
@Override
public void onEvent(@Nullable QuerySnapshot queryDocumentSnapshots, @Nullable FirebaseFirestoreException e) {
for (DocumentSnapshot documentSnapshot : queryDocumentSnapshots) {
// I assume you have a model class called MyStatus
MyStatus status= documentSnapshot.toObject(MyStatus.class);
if (status!= null) {
// do something...
}
}
}
});
}

Different conditions for different fields in thinking-sphinx

I have:
ThinkingSphinx::Index.define :new_post, with: :real_time do
indexes title, sortable: true
indexes text
has state, type: :string
has forum_hidden, type: :boolean
has created_at, type: :timestamp
has publish_at, type: :timestamp
has reminde_at, type: :timestamp
has deleted_at, type: :timestamp
has content_category_ids, type: :integer, multi: true
end
Let's say I need to get all the records where @title=query with any publish_at value, or @text=query with publish_at = 1.month.ago..Time.current
That is, I need to combine these two requests:
NewPost.search(conditions: { title: query })
NewPost.search(conditions: { text: query }, with: { publish_at: 1.month.ago..Time.current })
The result is needed with excerpts
UPDATE
The publish_at interval for the @title and @text fields is always different and depends on the user's rights. For example, there can be a situation like this:
NewPost.search(conditions: { title: query }, with: { publish_at: 1.year.ago..Time.current })
NewPost.search(conditions: { text: query }, with: { publish_at: 6.month.ago..Time.current })
and all results that do not fall under these conditions should not be displayed at all
The main thing when you want to 'combine' multiple criteria is deciding on a calculation that computes a weight with that effect. Sphinx computes a weight and orders by it, rather than taking unions of multiple distinct queries.
For example
https://freelancing-gods.com/thinking-sphinx/searching.html#sorting
ThinkingSphinx.search(
:select => '*, WEIGHT() + IF(publish_at>NOW()-2592000,1000,0) AS custom_weight',
:order => 'custom_weight DESC'
)
would use the full-text search weight, but add 1000 if publish_at is within the last month (2592000 seconds is 30 days).
Uses the sphinx NOW() function
http://sphinxsearch.com/docs/current.html#expr-func-now
You might also want field weights
https://freelancing-gods.com/thinking-sphinx/searching.html#fieldweights
to boost title matches over 'text' matches
:field_weights => { :title=>10 }
Bringing it all together (if I got the Ruby syntax right...)
NewPost.search( query ,
:select => '*, weight() + IF(publish_at>NOW()-2592000,1000,0) as custom_weight',
:order => 'custom_weight DESC',
:field_weights => { :title=>10 }
)
... in theory title matches will be first. And recent ones towards start too. Not EXACTLY what you asked for, but close.

Order Posts by Most Votes (Overall, Last Month, etc.) with Laravel MongoDB

I am trying to understand more advanced functions of mongodb and laravel but having trouble with this. Currently I have my schema setup with a users, posts, and posts_votes collections. The posts_votes has a user_id, post_id and timestamp field.
In a relational DB, I would just left join the posts_votes collection, count, and order by that count. Exclude dates when need be and all that.
With MongoDB I am having difficulty because there's no left join equivalent. So I'd like to learn how to accomplish my goal in a more document-y way.
On my Post model in Laravel, I reference this way. So looking at an individual post, I can get the vote count, see if current user voted for a specific post, etc.
public function votes()
{
return $this->hasMany(PostVote::class, 'post_id');
}
And my current working query looks like this:
$posts = Post::forCategoryType($type)
->with('votes', 'author', 'businessType')
->where('approved', true)
->paginate(25);
The forCategoryType method is just extended scope I added. Here it is on the Post model/document class.
public function scopeForCategoryType($builder, $catType)
{
if ($catType->exists) {
return $builder->where('cat_id', $catType->id);
}
return $builder;
}
So when I look at posts like this one, it's close to what I want to accomplish, but I am not applying it properly. For instance, I changed my main query to look like this:
$posts = Post::forBusinessType($type)
->with('votes', 'author', 'businessType')
->where('approved', true)
->sortByVotes()
->paginate(25);
And created this new method on the Post model:
public function scopeSortByVotes($builder, $dir = 'desc')
{
return $builder->raw(function($collection) {
return $collection->aggregate([
['$group' => [
'_id' => ['post_id' => 'votes.$post_id', 'user_id' => 'votes.$user_id']
],
'vote_count' => ['$sum' => 1]
],
['$sort' => ['vote_count' => -1]]
]);
});
}
This returns the error exception: A pipeline stage specification object must contain exactly one field.
Not sure how to fix that (still looking), so then I tried:
return $collection->aggregate([
['$unwind' => '$votes'],
['$group' => [
'_id' => ['post_id' => ['$votes.post_id', 'user_id' => '$votes.user_id']],
'count' => ['$sum' => 1]
]
]
]);
returns an empty ArrayIterator, so then I tried:
public function scopeSortByVotes($builder, $dir = 'desc')
{
return $builder->raw(function($collection) {
return $collection->aggregate([
'$lookup' => [
'from' => 'community_posts_votes',
'localField' => 'post_id',
'foreignField' => '_id',
'as' => 'vote_count'
]
]);
});
}
But on this setup, I just get the list of posts unsorted. On version 3.2.8.
The default loads everything by most recent. But ultimately I want to be able to pull these posts based on how many votes they got lifetime, but also query based on which posts got the most votes in the last week, month, etc.
That example I shared has the grand total linked in the Post model and an array of all the user ids that voted on it. With the way I have things setup using a separate collection holding the user_id, post_id and timestamps of when the vote happened, can I still accomplish the same goal?
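For reference, here is a rough sketch of how the counting could be expressed against the separate votes collection. It is written as plain JavaScript data, not as tested Laravel code: `voteCountPipeline` and `countVotes` are hypothetical helpers, and the field names post_id and timestamp are assumed from the description above. Each stage object has exactly one key ($match, $group or $sort), which is what the "exactly one field" error is about: in the first attempt 'vote_count' sits beside '$group' in the same stage instead of inside it.

```javascript
// Build an aggregation pipeline (plain data) for the votes collection.
// `since` is an optional Date; leave it out for lifetime totals.
function voteCountPipeline(since) {
  const stages = [];
  if (since) stages.push({ $match: { timestamp: { $gte: since } } });
  stages.push({ $group: { _id: '$post_id', vote_count: { $sum: 1 } } });
  stages.push({ $sort: { vote_count: -1 } });
  return stages;
}

// In-memory equivalent of the same stages, to check the logic without a database.
function countVotes(votes, since) {
  const counts = new Map();
  for (const v of votes) {
    if (since && v.timestamp < since) continue;               // $match
    counts.set(v.post_id, (counts.get(v.post_id) || 0) + 1);  // $group + $sum
  }
  return [...counts]
    .map(([_id, vote_count]) => ({ _id, vote_count }))
    .sort((a, b) => b.vote_count - a.vote_count);             // $sort
}

// Hypothetical sample votes.
const votes = [
  { post_id: 'p1', user_id: 'u1', timestamp: new Date('2016-01-05') },
  { post_id: 'p1', user_id: 'u2', timestamp: new Date('2016-01-06') },
  { post_id: 'p1', user_id: 'u3', timestamp: new Date('2016-06-20') },
  { post_id: 'p2', user_id: 'u1', timestamp: new Date('2016-06-21') },
  { post_id: 'p2', user_id: 'u2', timestamp: new Date('2016-06-22') },
];

const lifetime = countVotes(votes);
const recent = countVotes(votes, new Date('2016-06-01'));
console.log(lifetime[0]); // { _id: 'p1', vote_count: 3 }
console.log(recent[0]);   // { _id: 'p2', vote_count: 2 }
```

The result is a list of post ids with counts; loading the matching posts in that order is a separate step that this sketch leaves out.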
Note: using this laravel mongodb library.

How do I get a Wyam pipeline of documents based of a comma-delimited meta value from a previous pipeline?

I have a Wyam pipeline called "Posts" filled with documents. Some of these documents have a Tags meta value, which is a comma-delimited list of tags. For example, let's say it has three documents, with Tags meta of:
gumby,pokey
gumby,oscar
oscar,kermit
I want a new pipeline filled with one document for each unique tag found in all documents in the "Posts" pipeline. These documents should have the tag in a meta value called TagName.
So, the above values should result in a new pipeline consisting of four documents, with the TagName meta values of:
gumby
pokey
oscar
kermit
Here is my solution. This technically works, but I feel like it's inefficient, and I'm pretty sure there has to be a better way.
Documents(c => c.Documents["Posts"]
.Select(d => d.String("Tags", string.Empty))
.SelectMany(s => s.Split(",".ToCharArray()))
.Select(s => s.Trim().ToLower())
.Distinct()
.Select(s => c.GetNewDocument(
string.Empty,
new List<KeyValuePair<string, object>>()
{
new KeyValuePair<string, object>("TagName", s)
}
))
)
So, I'm calling Documents and passing in a ContextConfig which:
Gets the documents from "Posts" (I have a collection of documents)
Selects the Tags meta value (now I have a collection of strings)
Splits this on the comma (a bigger collection of strings)
then trims and lower cases (still a collection of strings)
De-dupes it (a smaller collection of strings)
Then creates a new document for each value in the list, with an empty body and a TagName value for the string (I should end up with a collection of new documents)
Again, this works. But is there a better way?
That's actually not bad at all - part of the challenge here is getting the comma-separated list of tags into something that can be processed by a LINQ expression or similar. That part is probably unavoidable and accounts for 3 of the lines in your expression.
That said, Wyam does provide a little help here with the ToLookup() extension (see the bottom of this page: http://wyam.io/getting-started/concepts).
Here's how that might look (this code is from a self-contained LINQPad script and would need to be adjusted for use in a Wyam config file):
public void Main()
{
Engine engine = new Engine();
engine.Pipelines.Add("Posts",
new PostsDocuments(),
new Meta("TagArray", (doc, ctx) => doc.String("Tags")
.ToLowerInvariant().Split(',').Select(x => x.Trim()).ToArray())
);
engine.Pipelines.Add("Tags",
new Documents(ctx => ctx.Documents["Posts"]
.ToLookup<string>("TagArray")
.Select(x => ctx.GetNewDocument(new MetadataItems { { "TagName", x.Key } }))),
new Execute((doc, ctx) =>
{
Console.WriteLine(doc["TagName"]);
return null;
})
);
engine.Execute();
}
public class PostsDocuments : IModule
{
public IEnumerable<IDocument> Execute(IReadOnlyList<IDocument> inputs, IExecutionContext context)
{
yield return context.GetNewDocument(new MetadataItems { { "Tags", "gumby,pokey" } });
yield return context.GetNewDocument(new MetadataItems { { "Tags", "gumby,oscar" } });
yield return context.GetNewDocument(new MetadataItems { { "Tags", "oscar,kermit" } });
}
}
Output:
gumby
pokey
oscar
kermit
A lot of that is just housekeeping to set up the fake environment for testing. The important part that you're looking for is this:
engine.Pipelines.Add("Tags",
new Documents(ctx => ctx.Documents["Posts"]
.ToLookup<string>("TagArray")
.Select(x => ctx.GetNewDocument(new MetadataItems { { "TagName", x.Key } }))),
// ...
);
Note that we still have to do the work of getting the comma delimited tags list into an array - it's just happening earlier up in the "Posts" pipeline.

MapReduce gives odd result when count == 1

Stack: MongoDB 2.6.5, Mongoid 3.1.6, Ruby 2.1.1
I'm using Mongoid to do some MongoDB Map/Reduce stuff.
Here's the setup:
class User
include Mongoid::Document
has_many :given_bonuses, class_name: 'Bonus', inverse_of: :giver
has_many :received_bonuses, class_name: 'Bonus', inverse_of: :receiver
end
class Bonus
include Mongoid::Document
belongs_to :giver, class_name: 'User', inverse_of: :given_bonuses
belongs_to :receiver, class_name: 'User', inverse_of: :received_bonuses
end
I'm using the following Map/Reduce code:
map = %Q{
function() {
emit(this.receiver_id, this.giver_id)
}
}
reduce = %Q{
function(key, values) {
var result = {};
values.forEach(function(value) {
if(! result[value]) {
result[value] = 0;
}
result[value] += 1;
});
return result;
}
}
Bonus.all.map_reduce(map, reduce).out(inline: 1)
Let's say I have two or more bonus documents:
> Bonus.count
=> 2
> Bonus.all.to_a
=> [#<Bonus _id: 547612a21dbe8b7859000071, giver_id: "547612441dbe8bf35b000005", receiver_id: "547612531dbe8b4a7200006a">, #<Bonus _id: 547612a21dbe8b78590000f9, giver_id: "547612441dbe8bf35b000005", receiver_id: "547612531dbe8b4a7200006a">]
Then the Map/Reduce result is:
=> [{"_id"=>"547612531dbe8b4a7200006a", "value"=>{"ObjectId(\"547612441dbe8bf35b000005\")"=>2.0}}]
Notice that the value key points to a hash of the form {"ObjectID" => Number}. This is as it should be.
Now let's say I have only one bonus document:
> Bonus.count
=> 1
> Bonus.first
=> #<Bonus _id: 547612a21dbe8b7859000071, giver_id: "547612441dbe8bf35b000005", receiver_id: "547612531dbe8b4a7200006a">
Then the Map/Reduce result is:
=> [{"_id"=>"547612531dbe8b4a7200006a", "value"=>"547612441dbe8bf35b000005"}]
Notice that the schema of the result has changed. It should be:
=> [{"_id"=>"547612531dbe8b4a7200006a", "value"=>{"ObjectId(\"547612441dbe8bf35b000005\")"=>1.0}}]
What's going on here?
From the docs:
MongoDB will not call the reduce function for a key that has only a
single value. The values argument is an array whose elements are the
value objects that are “mapped” to the key.
In the first case, the key this.receiver_id has two values in its group, so the reduce function is invoked for that key.
In the second case, the emitted key has only one value in its group, so the reduce function isn't called at all.
The value you emit for the key (this.giver_id) is therefore returned exactly as emitted, without being reduced.
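One way to keep the result shape stable is to emit a value that already has the shape reduce returns, and have reduce merge those values. The sketch below is an assumption-laden illustration, not MongoDB itself: `mapReduce` is a tiny in-memory stand-in that skips reduce for single-value keys, and emit is passed to map as an argument here, whereas in MongoDB it is a global.

```javascript
// Minimal stand-in for the engine: groups emitted values by key and,
// like MongoDB, does not call reduce when a key has only one value.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    map.call(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduce(key, values),
  }));
}

// Emit { giver_id: 1 } instead of the bare giver_id.
function map(emit) {
  const counts = {};
  counts[this.giver_id] = 1;
  emit(this.receiver_id, counts);
}

// Merge count objects; the output has the same shape as each input.
function reduce(key, values) {
  const result = {};
  for (const counts of values) {
    for (const giver of Object.keys(counts)) {
      result[giver] = (result[giver] || 0) + counts[giver];
    }
  }
  return result;
}

const bonus = { giver_id: 'g1', receiver_id: 'r1' };
const one = mapReduce([bonus], map, reduce);
const two = mapReduce([bonus, bonus], map, reduce);
console.log(one[0].value); // { g1: 1 }
console.log(two[0].value); // { g1: 2 }
```

With one bonus or two, value is a hash of giver to count, because the emitted value and the reduced value share the same shape.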