How to get the ID of a file added to GridFS with the Rust MongoDB driver?

The mongodb 0.1.4 bindings for Rust provide a GridFS implementation.
Judging from the code and the examples, there is a put method, but it doesn't return an object ID.
My workaround is to put the file into GridFS and then open it again to retrieve the ID:
fn file_to_mongo(gridfs: &Store, fpath: &PathBuf) -> bson::oid::ObjectId {
    // put writes the file into GridFS but does not return the new ID.
    gridfs.put(fpath.to_str().unwrap().to_owned());
    // Reopen the file just to read the ID back out of its document.
    let mut file = gridfs.open(fpath.to_str().unwrap().to_owned()).unwrap();
    let id = file.doc.id.clone();
    file.close().unwrap();
    id
}
Is there a better way?

I don't have MongoDB running and I don't really know anything about it, but this at least has the right signature and compiles.
extern crate bson;
extern crate mongodb;

use mongodb::gridfs::{Store, ThreadedStore};
use mongodb::error::Result as MongoResult;
use std::{fs, io};

fn my_put(store: &Store, name: String) -> MongoResult<bson::oid::ObjectId> {
    let mut f = try!(fs::File::open(&name));
    let mut file = try!(store.create(name));
    try!(io::copy(&mut f, &mut file));
    try!(file.close());
    Ok(file.doc.id.clone())
}
Recall that most Rust libraries are open-source and you can even browse the source directly from the documentation. This function is basically just a hacked version of the existing put.
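For completeness, a minimal sketch of how my_put might be called. The connection details, database name, and file name are made up, and Store::with_db is assumed from the same 0.1.x gridfs module:

extern crate mongodb;

use mongodb::{Client, ThreadedClient};
use mongodb::gridfs::{Store, ThreadedStore};

fn main() {
    // Hypothetical setup: a local mongod and arbitrary database/file names.
    let client = Client::connect("localhost", 27017).expect("failed to connect");
    let store = Store::with_db(client.db("mydb"));
    let id = my_put(&store, "some_file.bin".to_owned()).expect("GridFS put failed");
    println!("stored file with id {:?}", id);
}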

Related

mongodb-rust-driver performs poorly when finding and fetching a large amount of data compared to go-driver

I have a database consisting of 85.4k documents with an average size of 4 KB.
I wrote a simple program in Go to find and fetch over 70k documents from the database using mongodb-go-driver:
package main

import (
    "context"
    "log"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    localC, _ := mongo.Connect(context.TODO(), options.Client().ApplyURI("mongodb://127.0.0.1:27017/?gssapiServiceName=mongodb"))
    localDb := localC.Database("sampleDB")
    collect := localDb.Collection("sampleCollect")
    localCursor, _ := collect.Find(context.TODO(), bson.M{
        "deleted": false,
    })
    log.Println("start")
    start := time.Now()
    var result []map[string]interface{} = make([]map[string]interface{}, 0)
    localCursor.All(context.TODO(), &result)
    log.Println(len(result))
    log.Println("done")
    log.Println(time.Now().Sub(start))
}
This completes in around 20 seconds:
2021/03/21 01:36:43 start
2021/03/21 01:36:56 70922
2021/03/21 01:36:56 done
2021/03/21 01:36:56 20.0242869s
After that, I tried to implement the same thing in Rust using mongodb-rust-driver:
use mongodb::{
    bson::{doc, Document},
    error::Error,
    options::FindOptions,
    Client,
};
use std::time::Instant;
use tokio::{self, stream::StreamExt};

#[tokio::main]
async fn main() {
    let client = Client::with_uri_str("mongodb://localhost:27017/")
        .await
        .unwrap();
    let db = client.database("sampleDB");
    let coll = db.collection("sampleCollect");
    let find_options = FindOptions::builder().build();
    let cursor = coll
        .find(doc! {"deleted": false}, find_options)
        .await
        .unwrap();
    let start = Instant::now();
    println!("start");
    let results: Vec<Result<Document, Error>> = cursor.collect().await;
    let es = start.elapsed();
    println!("{}", results.iter().len());
    println!("{:?}", es);
}
But it took almost a minute to complete the same task on a release build:
$ cargo run --release
Finished release [optimized] target(s) in 0.43s
Running `target\release\rust-mongo.exe`
start
70922
51.1356069s
Is this level of performance considered normal for Rust in this case, or have I made some mistake in my Rust code that could be improved?
EDIT: As suggested in the comments, here is the example document.
The discrepancy here was due to some known bottlenecks in the Rust driver that have since been addressed in the latest beta release (2.0.0-beta.3); so, upgrading your mongodb dependency to use that version should solve the issue.
Re-running your examples with 10k copies of the provided sample document, I now see the Rust one taking ~3.75s and the Go one ~5.75s on my machine.
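For reference, a minimal sketch of the same query against the 2.0 beta API, reusing the question's database and collection names; this assumes the cursor implements the futures Stream trait, so it can be drained with try_collect:

use futures::stream::TryStreamExt;
use mongodb::{bson::{doc, Document}, Client};
use std::time::Instant;

#[tokio::main]
async fn main() -> mongodb::error::Result<()> {
    let client = Client::with_uri_str("mongodb://localhost:27017/").await?;
    let coll = client.database("sampleDB").collection::<Document>("sampleCollect");
    let cursor = coll.find(doc! {"deleted": false}, None).await?;

    let start = Instant::now();
    // try_collect unwraps each Result item, so failures surface explicitly.
    let results: Vec<Document> = cursor.try_collect().await?;
    println!("{} documents in {:?}", results.len(), start.elapsed());
    Ok(())
}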

Updating data in MongoDB with Rust

I'm trying to update a field in a collection of a MongoDB database using Rust. I was using this code:
#[macro_use(bson, doc)]
extern crate bson;
extern crate mongodb;

use mongodb::{Client, ThreadedClient};
use mongodb::db::ThreadedDatabase;

fn main() {
    let client = Client::connect("ipaddress", 27017).unwrap();
    let coll = client.db("DEV").collection("DEV");
    let film_a = doc!{"DEVID" => "1"};
    let filter = film_a.clone();
    let update = doc!{"temp" => "5"};
    coll.update_one(filter, update, None).expect("failed");
}
This gives me an error saying update only works with the $ operator, which after some searching seems to mean I should use $set. I've been trying different versions of this but only get mismatched type errors and such.
coll.update_one({"DEVID": "1"},{$set:{"temp" => "5"}},None).expect("failed");
Where am I going wrong?
The DB looks like this.
db.DEVICES.find()
{ "_id" : ObjectId("59a7bb747a1a650f1814ef85"), "DEVID" : 1, "temp" : 0, "room_temp" : 0 }
{ "_id" : ObjectId("59a7bb827a1a650f1814ef86"), "DEVID" : 2, "temp" : 0, "room_temp" : 0 }
If someone is looking for the answer for a newer version of the driver, here it is, based on #PureW's answer, in an async version (the 2.x async API replaces the old ThreadedClient interface entirely):

use mongodb::{bson::{doc, Document}, Client};

#[tokio::main]
async fn main() -> mongodb::error::Result<()> {
    let client = Client::with_uri_str("mongodb://localhost:27017").await?;
    let coll = client.database("tmp").collection::<Document>("tmp");
    let filter = doc! {"DEVID": "1"};
    let update = doc! {"$set": {"temp": "5"}};
    coll.update_one(filter, update, None).await?;
    Ok(())
}
You're pretty much there. The following compiles and runs for me when I try your example (hint: You haven't enclosed "$set" in quotes):
#[macro_use(bson, doc)]
extern crate bson;
extern crate mongodb;

use mongodb::{Client, ThreadedClient};
use mongodb::db::ThreadedDatabase;

fn main() {
    let client = Client::connect("localhost", 27017).unwrap();
    let coll = client.db("tmp").collection("tmp");
    let filter = doc!{"DEVID" => "1"};
    let update = doc!{"$set" => {"temp" => "5"}};
    coll.update_one(filter, update, None).unwrap();
}
Another piece of advice: Using unwrap rather than expect might give you more precise errors.
As for using the mongodb library, I've stayed away from it, as the authors explicitly say it's not production-ready and even the update_one example in their documentation is broken.
Instead, I've used the wrapper over the battle-tested C library, with good results.

Procedural macro parsing weirdness in Rust

I'm trying to parse a macro similar to this one:
annoying!({
    hello({
        // some stuff
    });
})
I'm trying to do this with a procedural macro definition similar to the following, but I'm getting behaviour I didn't expect, and I'm not sure whether I'm doing something I'm not supposed to or whether I've found a bug. In the following example, I'm trying to find the line where each block is:
for the first block (the one just inside annoying!) it reports the correct line, but for the inner block the reported line is always 1, no matter where the code actually is.
#![crate_type="dylib"]
#![feature(macro_rules, plugin_registrar)]

extern crate syntax;
extern crate rustc;

use macro_result::MacroResult;
use rustc::plugin::Registry;
use syntax::ext::base::{ExtCtxt, MacResult};
use syntax::ext::quote::rt::ToTokens;
use syntax::codemap::Span;
use syntax::ast;
use syntax::parse::tts_to_parser;

mod macro_result;

#[plugin_registrar]
pub fn plugin_registrar(registry: &mut Registry) {
    registry.register_macro("annoying", macro_annoying);
}

pub fn macro_annoying(cx: &mut ExtCtxt, _: Span, tts: &[ast::TokenTree]) -> Box<MacResult> {
    let mut parser = cx.new_parser_from_tts(tts);
    let lo = cx.codemap().lookup_char_pos(parser.span.lo);
    let hi = cx.codemap().lookup_char_pos(parser.span.hi);
    println!("FIRST LO {}", lo.line); // real line for annoying! all cool
    println!("FIRST HI {}", hi.line); // real line for annoying! all cool
    let block_tokens = parser.parse_block().to_tokens(cx);
    let mut block_parser = tts_to_parser(cx.parse_sess(), block_tokens, cx.cfg());
    block_parser.bump(); // skip {
    block_parser.parse_ident(); // hello
    block_parser.bump(); // skip (
    // block lines
    let lo = cx.codemap().lookup_char_pos(block_parser.span.lo);
    let hi = cx.codemap().lookup_char_pos(block_parser.span.hi);
    println!("INNER LO {}", lo.line); // line 1? wtf?
    println!("INNER HI {}", hi.line); // line 1? wtf?
    MacroResult::new(vec![])
}
I think the problem might be that I'm creating a second parser to parse the inner block, and that might be making the Span types inside it go crazy, but I'm not sure that's the problem or how to proceed from here. The reason I'm creating this second parser is so I can recursively parse what's inside each of the blocks. I might be doing something I'm not supposed to, in which case a better suggestion would be very welcome.
I believe this is #15962 (and #16472); to_tokens has a generally horrible implementation. Specifically, anything non-trivial uses ToSource, which just turns the code into a string and then re-tokenises it (yes, it's not great at all!).
Until those issues are fixed, you should just handle the original tts directly as much as possible. You could approximate the right span using the .span of the parsed block (i.e. return value of parse_block), which will at least focus the user's attention on the right area.
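As a sketch against the same pre-1.0 API the question uses (names reused from the code above):

// Inside macro_annoying, before handing the tokens to a second parser:
let block = parser.parse_block();
// The block's own span still points at the original source, so use it
// for diagnostics instead of block_parser.span.
let block_span = block.span;
let lo = cx.codemap().lookup_char_pos(block_span.lo);
println!("block starts on line {}", lo.line);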

Does Rust have a dlopen equivalent

Does Rust have a way to make a program pluggable? In C, the plugins I create are .so files that I load with dlopen. Does Rust provide a native way of doing the same thing?
The Rust FAQ officially endorses libloading. Beyond that, there are three different options I know of:
Use the shared_library crate.
Use the dylib crate.
Use std::dynamic_lib, which is deprecated since Rust 1.5. (These docs are no longer available in version 1.32; it's likely the feature has been dropped altogether by now.)
I haven't tried any of these, so I cannot really say which is best or what the pros/cons are for the different variants. I'd strongly advise against using std::dynamic_lib at least, given that it's deprecated and will likely be made private at some point in the future.
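Since the FAQ endorses libloading, here is a minimal sketch of loading one symbol with it; the plugin path and symbol name are hypothetical, and this assumes a recent libloading where loading and lookup are unsafe:

use libloading::{Library, Symbol};

fn main() {
    unsafe {
        // Hypothetical plugin path; on Linux this would be an .so file.
        let lib = Library::new("/path/to/libplugin.so").expect("failed to load library");
        // Hypothetical C-ABI entry point exported by the plugin.
        let entry: Symbol<unsafe extern "C" fn(u32) -> u32> =
            lib.get(b"plugin_entry").expect("failed to find symbol");
        let result = entry(0);
        println!("plugin returned {}", result);
    }
}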
Exactly. And below is a complete use-case example:
use std::unstable::dynamic_lib::DynamicLibrary;
use std::os;

fn load_cuda_library() {
    let path = Path::new("/usr/lib/libcuda.so");
    // Make sure the path contains a / or the linker will search for it.
    let path = os::make_absolute(&path);
    let lib = match DynamicLibrary::open(Some(&path)) {
        Ok(lib) => lib,
        Err(error) => fail!("Could not load the library: {}", error)
    };
    // Load the cuInit symbol.
    let cuInit: extern fn(u32) -> u32 = unsafe {
        match lib.symbol("cuInit") {
            Err(error) => fail!("Could not load function cuInit: {}", error),
            Ok(cuInit) => cuInit
        }
    };
    let argument = 0;
    let expected_result = 0;
    let result = cuInit(argument);
    if result != expected_result {
        fail!("cuInit({:?}) != {:?} but equaled {:?}",
              argument, expected_result, result)
    }
}

fn main() {
    load_cuda_library();
}
Yes. There's a module std::unstable::dynamic_lib that enables dynamic loading of libraries. It's undocumented, though, as it's a highly experimental API (everything in std::unstable is undocumented). As #dbaupp suggests, the source is the best documentation (current version is af9368452).

What would be the opposite of hasFields?

I'm using logical deletes by adding a deletedAt field. If I want to get only the deleted documents, it would be something like r.table('clients').hasFields('deletedAt'). My method has a withDeleted parameter which determines whether deleted documents are included or not.
Finally, people in the #rethinkdb IRC channel suggested I use the filter method, and that did the trick:
query = adapter.table(table).filter(filters)
if withDeleted
  query = query.filter (doc) ->
    return doc.hasFields 'deletedAt'
else
  query = query.filter (doc) ->
    return doc.hasFields('deletedAt').not()
query.run connection, (err, results) ->
  ...
My question is why do I have to use filter and not something like:
query = adapter.table(table).filter(filters)
query = if withDeleted then query.hasFields 'deletedAt' else query.hasFields('deletedAt').not()
...
or something like that.
Thanks in advance.
The hasFields function can be called on both objects and sequences, but not cannot be called on a sequence.
This query:
query.hasFields('deletedAt')
Behaves the same as this one (on sequences of objects):
query.filter((doc) -> return doc.hasFields('deletedAt'))
However, this query:
query.hasFields('deletedAt').not()
Behaves like this:
query.filter((doc) -> return doc.hasFields('deletedAt')).not()
But that doesn't make sense: you want the not to be inside the filter, not after it. Like this:
query.filter((doc) -> return doc.hasFields('deletedAt').not())
One nice thing about RethinkDB is that, because queries are built up in the host language, it's very easy to define new fluent syntax just by defining functions in your language. For example, if you wanted to have a lacksFields function, you could define it in Python (sorry, I don't really know CoffeeScript) like so:
def lacks_fields(stream, *args):
    res = stream
    for arg in args:
        res = res.filter(lambda x: ~x.has_fields(arg))
    return res
Then you can use a nice fluent syntax like:
lacks_fields(stream, "foo", "bar", "buzz")