Find all where parameter is within an array - Waterline - sails.js

In pseudocode, it would be:
Find all businesses where the outcodes array contains NG1
I'm having a hard time finding something that works, and Waterline throws its "Invalid usage" error at everything I try.
Business.find({
  or:{outcodes: {contains: 'NG1 4RQ' }}
})
For reference, my business model contains outcodes as an array:
outcodes: { type: 'array' },
Is anyone able to advise how I can achieve this? I'm stumped. I'm currently using Sails.js with the Waterline ORM.

The or is not working because its value needs to be an array. With only one criterion you don't need or at all, but here's an example that uses or and searches an array for a partial string:
Business.find({
  or: [ { outcodes: { contains: 'NG1' } } ]
}).exec(function(err, businesses){...});

Related

MongoDB Stitch realtime watch

What I intend to achieve is some sort of "live query" functionality.
So far I've tried using the "watch" method. According to the documentation:
You can open a stream of changes that match a filter by calling
collection.watch(delegate:) with a $match expression as the argument.
Whenever the watched collection changes and the ChangeEvent matches
the provided $match expression, the stream’s event handler fires with
the ChangeEvent object as its only argument
Passing the doc ids as an array works perfectly, but passing a query doesn't work:
this.stitch.db.collection<Queue>('queues')
  .watch({
    hospitalId: this.activehospitalid
  });
I've also tried this:
this.stitch.db.collection<Queue>('queues')
  .watch({
    $match: {
      hospitalId: this.activehospitalid
    }
  });
This throws an error on the console: "StitchServiceError: mongodb watch: filter is invalid (unknown top level operator: $match)". The intention is to watch all documents where the field "hospitalId" matches the provided value, or, in effect, to pass a query filter to the watch() method.
After a long search I found that it is possible to filter, but the query needs to be formatted differently:
this.stitch.db.collection<Queue>('queues')
  .watch({
    $or: [
      {
        "fullDocument.hospitalId": this.activehospitalid
      }
    ]
  });
For anyone else who might need this, please note the important fullDocument part of the query. I haven't found much documentation on this, but I hope it helps.
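A rough way to see why the fullDocument prefix matters: the filter appears to be matched against the change event rather than against the stored document, and the changed document sits under the event's fullDocument key. The sketch below emulates that matching in plain Python. The event shape is simplified, the sample values are made up, and the helper names get_path and matches are hypothetical; none of this is part of the Stitch SDK.

```python
from typing import Any

_MISSING = object()


def get_path(doc: dict, dotted: str) -> Any:
    """Resolve a dotted path such as 'fullDocument.hospitalId' in a nested dict."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def matches(event: dict, flt: dict) -> bool:
    """Equality-only matching: every dotted path in the filter must equal its value."""
    return all(get_path(event, path) == value for path, value in flt.items())


# Simplified change event (assumed shape): the document sits under "fullDocument".
event = {
    "operationType": "update",
    "documentKey": {"_id": 1},
    "fullDocument": {"_id": 1, "hospitalId": "h-42", "status": "waiting"},
}

print(matches(event, {"hospitalId": "h-42"}))               # False
print(matches(event, {"fullDocument.hospitalId": "h-42"}))  # True
```

The first filter fails because the event itself has no top-level hospitalId field; only the path through fullDocument reaches it.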

MongoDB: Bulk changing all field types in python

I have a ton of documents (around 10 million) and I need to change the type of some of their fields. The usual forEach approach (just looping through every value) seems to take forever and is clearly not viable in the timeframe I have (it took basically all night for one out of four updates).
I've heard that bulk writes may be able to do it, but I'm getting mixed messages. One answer on this site, for example, says that there's no built-in function for it (you would have to use a workaround), while others say that it can be done with updates in Python, using pymongo.
Is there a quicker way to mass-change field types (string -> double, string -> int) using Python? I can also work from the console, but I find even fewer solutions there.
Thanks
You can try using an aggregation query in the mongo shell, something like:
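db.your_collection.aggregate([
  {
    $addFields: {
      field1: {
        $convert: {
          input: "$field1",
          to: "double" // target type; use "int" for string -> int
        }
      }
    }
  },
  // $out replaces the collection with the converted documents
  { $out: "your_collection" }
])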
More info here: https://docs.mongodb.com/manual/reference/operator/aggregation/convert/
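From Python, one possible equivalent is an update that takes an aggregation pipeline, which keeps the conversion on the server instead of looping over documents on the client. This is a sketch under assumptions: it needs MongoDB 4.2 or later and PyMongo 3.9 or later (pipeline-style updates), and the connection string, database, collection, and field names are placeholders. build_convert_pipeline and convert_fields are hypothetical helpers, not pymongo functions.

```python
def build_convert_pipeline(field_types: dict) -> list:
    """Build a one-stage update pipeline that converts each field in place.

    field_types maps a field name to a $convert target type, e.g. "double".
    onError and onNull hand back the original value, so a value that cannot
    be converted is left as it was instead of aborting the update.
    """
    stage = {
        name: {
            "$convert": {
                "input": f"${name}",
                "to": target,
                "onError": f"${name}",
                "onNull": f"${name}",
            }
        }
        for name, target in field_types.items()
    }
    return [{"$set": stage}]


def convert_fields(collection, field_types: dict) -> int:
    """Run the conversion server-side; `collection` is a pymongo Collection."""
    result = collection.update_many({}, build_convert_pipeline(field_types))
    return result.modified_count


# Usage (placeholder names; needs pymongo and a running MongoDB 4.2+):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017")["your_db"]["your_collection"]
#   convert_fields(coll, {"field1": "double", "field2": "int"})

print(build_convert_pipeline({"field1": "double"}))
```

Unlike $out, this updates documents in place rather than rewriting the whole collection, so it is worth trying on a copy of the data first.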

Fuzzy search required for searching nested objects using feathers-vuex?

I am using feathers-vuex in a project and am not very familiar with the rest of the Feathers package. I am using it because, with the scaffolding CLI, it was very easy to get started and it just works. It has been a really good experience so far. However, this also means that I do not entirely understand what's going on under the hood. I am trying to use the find function to retrieve all records from MongoDB where a nested array contains a certain string. The questions are as follows:
So far, the only option that I can think of is a fuzzy search. Is that the way to do it? Or are there other possibilities?
Is my assumption that fuzzy search won't work because of the absence of hooks correct? Or have I misread the docs?
Any other general way of accomplishing this?
Does this mean that fuzzy search will not work using feathers-vuex or are there ways to accomplish this?
Now that you are set up, I recommend going through the basics guide and having a look at the other guides.
Depending on what you picked, you can also look at the MongoDB and Mongoose database adapter API documentation. In addition to the common query syntax, both adapters also support additional MongoDB queries, which you can find in the MongoDB documentation.
If you look at the MongoDB documentation on how to query arrays, you can see that querying for a value in an array can be done like this:
async function run() {
  await app.service('myservice').create({
    name: 'first',
    test: [ 'one', 'two' ]
  });
  await app.service('myservice').create({
    name: 'second',
    test: [ 'two', 'three' ]
  });
  let results = await app.service('myservice').find({
    query: { test: 'two' }
  });
  console.log(results); // will log `first` and `second`
  results = await app.service('myservice').find({
    query: { test: 'one' }
  });
  console.log(results); // will log only `first`
}
run();

call custom python function on every document in a collection Mongo DB

I want to call a custom Python function on an existing attribute of every document in a collection and store the result as a new key-value pair in that same document. Is there any way to do that (each call is independent of the others)?
I noticed cursor.forEach, but can't this be done efficiently using just Python?
A simple example would be to split the string in text and store the number of words as a new attribute.
def split_count(text):
    # some complex preprocessing...
    return len(text.split())
# Need something like this...
db.collection.update_many({}, {'$set': {"split": split_count('$text') }}, upsert=True)
But it seems like setting a new attribute in a document based on the value of another attribute in the same document is not possible this way yet. This post is old but the issues seem to be still open.
I found a way to call any custom python function on a collection using parallel_scan in PyMongo.
import threading
# db is assumed to be an existing pymongo Database handle.
def process_text(cursor):
    for row in cursor.batch_size(200):
        # Any complex preprocessing here...
        split_text = row['text'].split()
        db.collection.update_one({'_id': row['_id']},
                                 {'$set': {'split_text': split_text,
                                           'num_words': len(split_text)}},
                                 upsert=True)
def preprocess(num_threads=4):
    # Get up to max 'num_threads' cursors.
    cursors = db.collection.parallel_scan(num_threads)
    threads = [threading.Thread(target=process_text, args=(cursor,)) for cursor in cursors]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
This is not really faster than cursor.forEach (but not much slower either), but it lets me execute arbitrarily complex Python code and save the results from within Python itself.
Also, if I have an array of ints in one of the attributes, cursor.forEach converts them to floats, which I don't want. So I preferred this way.
But I would be glad to know if there are any better ways than this :)
It is quite unlikely that it will ever be efficient to do this kind of thing in Python, because every document has to make a round trip to the client machine and pass through the Python function there.
In your example code, you are passing the result of a function call to a MongoDB update query, which won't work. You can't run any Python code inside MongoDB queries on the db server.
As the answer to your linked question suggests, this type of action has to be performed in the mongo shell, e.g.:
db.collection.find().snapshot().forEach(
  function (elem) {
    // declare the variable locally instead of leaking a global
    var splitLength = elem.text.split(" ").length;
    db.collection.update(
      {
        _id: elem._id
      },
      {
        $set: {
          split: splitLength
        }
      }
    );
  }
);
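If the processing does have to stay in Python, one way to cut the cost of the round trips is to batch the writes instead of issuing one update_one per document. A minimal sketch, assuming pymongo is installed and that `collection` is a pymongo Collection whose documents carry a string field named text. iter_batches and add_word_counts are hypothetical helper names, and the batch size of 1000 is arbitrary. It reads through a plain cursor, since parallel_scan was removed in PyMongo 4.0.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def split_count(text: str) -> int:
    """Stand-in for arbitrarily complex Python preprocessing."""
    return len(text.split())


def iter_batches(items: Iterable, size: int) -> Iterator[List]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def add_word_counts(collection, batch_size: int = 1000) -> int:
    """Compute split_count for every document and write the results in batches."""
    from pymongo import UpdateOne  # imported lazily; only needed for real writes

    modified = 0
    # Fetch only _id and text to keep the transferred documents small.
    cursor = collection.find({"text": {"$exists": True}}, {"text": 1})
    for batch in iter_batches(cursor, batch_size):
        ops = [
            UpdateOne({"_id": doc["_id"]}, {"$set": {"split": split_count(doc["text"])}})
            for doc in batch
        ]
        # ordered=False lets the remaining operations run even if one fails.
        modified += collection.bulk_write(ops, ordered=False).modified_count
    return modified


print(list(iter_batches(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```

Each bulk_write call sends a whole batch in one request, so the documents still travel to the client once for reading, but the writes no longer cost one round trip each.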

Mongo find by regex: return only matching string

My application has the following stack:
Sinatra on Ruby -> MongoMapper -> MongoDB
The application puts several entries in the database. In order to crosslink to other pages, I've added some sort of syntax. e.g.:
Coffee is a black, caffeinated liquid made from beans. {Tea} is made from leaves. Both drinks are sometimes enjoyed with {milk}
In this example {Tea} will link to another DB entry about tea.
I'm trying to query MongoDB for all 'linked terms'. Usually in Ruby I would do something like this: /{([a-zA-Z0-9]+)}/ where the () will return the matched string. In Mongo, however, I get the whole record.
How can I get Mongo to return only the matched parts of the record I'm looking for? For the example above it would return:
["Tea", "milk"]
I'm trying to avoid pulling the entire record into Ruby and processing it there.
I'm not sure I understand the question, but you could try something like this:
db.yourColl.aggregate([
  {
    $match:{"yourKey":{$regex:'[a-zA-Z0-9]', "$options" : "i"}}
  },
  {
    $group:{
      _id:null,
      tot:{$push:"$yourKey"}
    }
  }])
If you don't want duplicates in tot, use $addToSet instead of $push.
The way I solved this problem was with the string aggregation operators: $indexOfCP to find the starting and ending indices, and $substrCP to extract the string I wanted. Since a record can contain several of these {} terms, you need one projection to identify all the indices in one shot and another projection to extract the words you need. Hope this helps.
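As a possible alternative to index arithmetic, newer servers can do the capture themselves. The sketch below is written in Python rather than Ruby and assumes MongoDB 4.2 or later, where the $regexFindAll aggregation operator is available. The field name body is a placeholder, and build_terms_pipeline and extract_terms are hypothetical helper names. The local extract_terms function only demonstrates what the pattern captures, using the sample sentence from the question.

```python
import re

# The + sits inside the parentheses so each match captures the whole term.
PATTERN = r"\{([a-zA-Z0-9]+)\}"


def extract_terms(text: str) -> list:
    """Client-side reference: return every term wrapped in curly braces."""
    return re.findall(PATTERN, text)


def build_terms_pipeline(field: str) -> list:
    """Pipeline that projects only the captured terms of `field` (MongoDB 4.2+).

    $regexFindAll returns one result per match, each with a `captures` array;
    $map keeps the first capture group of every match.
    """
    return [
        {
            "$project": {
                "terms": {
                    "$map": {
                        "input": {"$regexFindAll": {"input": f"${field}", "regex": PATTERN}},
                        "as": "m",
                        "in": {"$arrayElemAt": ["$$m.captures", 0]},
                    }
                }
            }
        }
    ]


sample = (
    "Coffee is a black, caffeinated liquid made from beans. "
    "{Tea} is made from leaves. Both drinks are sometimes enjoyed with {milk}"
)
print(extract_terms(sample))  # ['Tea', 'milk']

# Usage (placeholder names; needs pymongo and a running MongoDB 4.2+):
#   for doc in db.yourColl.aggregate(build_terms_pipeline("body")):
#       print(doc["terms"])
```

The same pipeline document can be passed to the aggregate call of other drivers, since it contains only plain operators and strings.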