I'm working with Rails 4, Mongoid 4 and GridFS. I'm not able to connect to the GridFS filesystem:
class GridfsController < ApplicationController
  def serve
    gridfs_path = env["PATH_INFO"].gsub("/uploads/", "")
    begin
      gridfs_file = Mongo::GridFileSystem.new(Mongo::DB.new('database_name', Mongo::Connection.new('localhost'))).open(gridfs_path, 'r')
      self.response_body = gridfs_file.read
      self.content_type = gridfs_file.content_type
    rescue Exception => e
      self.status = :file_not_found
      self.content_type = 'text/plain'
      self.response_body = ''
      raise e
    end
  end
end
Getting this error
NameError (uninitialized constant GridfsController::Mongo):
app/controllers/gridfs_controller.rb:7:in `serve'
Mongoid doesn't use the "official" Ruby driver to talk to MongoDB and that's where Mongo::GridFileSystem comes from. Mongoid uses Moped to talk to MongoDB and Moped doesn't know anything about GridFS.
AFAIK the usual GridFS solution is to use mongoid-grid_fs to talk to GridFS:
self.response_body = Mongoid::GridFs[gridfs_path].data
or if you have the id instead of the path:
self.response_body = Mongoid::GridFs.get(gridfs_id).data
There is an implementation of the GridFS spec for the Moped driver: moped-gridfs
It's better than loading two drivers (moped and mongo-ruby-driver).
I'm trying to connect to IBM Cloud Object Storage from IBM Data Science Experience:
access_key = 'XXX'
secret_key = 'XXX'
bucket = 'mybucket'
host = 'lon.ibmselect.objstor.com'
service = 'mycos'
sqlCxt = SQLContext(sc)
hconf = sc._jsc.hadoopConfiguration()
hconf.set('fs.cos.mycos.access.key', access_key)
hconf.set('fs.cos.mycos.endpoint', 'http://' + host)
hconf.set('fs.cos.mycos.secret.key', secret_key)
hconf.set('fs.cos.service.v2.signer.type', 'false')
obj = 'mydata.tsv.gz'
rdd = sc.textFile('cos://{0}.{1}/{2}'.format(bucket, service, obj))
print(rdd.count())
This returns:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.io.IOException: No FileSystem for scheme: cos
I'm guessing I need to use the 'cos' scheme based on the stocator docs. However, the error suggests stocator isn't available or is an old version?
Any ideas?
Update 1:
I have also tried the following:
sqlCxt = SQLContext(sc)
hconf = sc._jsc.hadoopConfiguration()
hconf.set('fs.cos.impl', 'com.ibm.stocator.fs.ObjectStoreFileSystem')
hconf.set('fs.stocator.scheme.list', 'cos')
hconf.set('fs.stocator.cos.impl', 'com.ibm.stocator.fs.cos.COSAPIClient')
hconf.set('fs.stocator.cos.scheme', 'cos')
hconf.set('fs.cos.mycos.access.key', access_key)
hconf.set('fs.cos.mycos.endpoint', 'http://' + host)
hconf.set('fs.cos.mycos.secret.key', secret_key)
hconf.set('fs.cos.service.v2.signer.type', 'false')
service = 'mycos'
obj = 'mydata.tsv.gz'
rdd = sc.textFile('cos://{0}.{1}/{2}'.format(bucket, service, obj))
print(rdd.count())
However, this time the response was:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.io.IOException: No object store for: cos
at com.ibm.stocator.fs.ObjectStoreVisitor.getStoreClient(ObjectStoreVisitor.java:121)
...
Caused by: java.lang.ClassNotFoundException: com.ibm.stocator.fs.cos.COSAPIClient
The latest version of Stocator (v1.0.9), which supports the fs.cos scheme, is not yet deployed on Spark as a Service (it will be soon). Please use the Stocator scheme "fs.s3d" to connect to your COS.
Example:
endpoint = 'endpointXXX'
access_key = 'XXX'
secret_key = 'XXX'
prefix = "fs.s3d.service"
hconf = sc._jsc.hadoopConfiguration()
hconf.set(prefix + ".endpoint", endpoint)
hconf.set(prefix + ".access.key", access_key)
hconf.set(prefix + ".secret.key", secret_key)
bucket = 'mybucket'
obj = 'mydata.tsv.gz'
rdd = sc.textFile('s3d://{0}.service/{1}'.format(bucket, obj))
rdd.count()
Alternatively, you can use ibmos2spark. The lib is already installed on our service. Example:
import ibmos2spark
credentials = {
'endpoint': 'endpointXXXX',
'access_key': 'XXXX',
'secret_key': 'XXXX'
}
configuration_name = 'os_configs' # any string you want
cos = ibmos2spark.CloudObjectStorage(sc, credentials, configuration_name)
bucket = 'mybucket'
obj = 'mydata.tsv.gz'
rdd = sc.textFile(cos.url(obj, bucket))
rdd.count()
Stocator is on the classpath for Spark 2.0 and 2.1 kernels, but the cos scheme is not configured. You can access the config by executing the following in a Python notebook:
!cat $SPARK_CONF_DIR/core-site.xml
Look for the property fs.stocator.scheme.list. What I currently see is:
<property>
<name>fs.stocator.scheme.list</name>
<value>swift2d,swift,s3d</value>
</property>
I recommend that you raise a feature request against DSX to support the cos scheme.
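The same check can be scripted instead of read by eye. This is a minimal sketch using only the Python standard library; the function name stocator_schemes and the sample XML (wrapped in a <configuration> root, as Hadoop config files usually are) are assumptions for illustration, not part of DSX or Stocator.

```python
import xml.etree.ElementTree as ET
from typing import List


def stocator_schemes(core_site_xml: str) -> List[str]:
    """Return the schemes listed under fs.stocator.scheme.list, or [] if absent."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.stocator.scheme.list":
            value = prop.findtext("value") or ""
            return [s.strip() for s in value.split(",") if s.strip()]
    return []


# Sample content mirroring the property shown above (assumed file layout).
sample = """<configuration>
  <property>
    <name>fs.stocator.scheme.list</name>
    <value>swift2d,swift,s3d</value>
  </property>
</configuration>"""

schemes = stocator_schemes(sample)
print(schemes)
print("cos configured:", "cos" in schemes)
```

In a notebook you would pass the contents of $SPARK_CONF_DIR/core-site.xml instead of the sample string.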
It looks like the cos driver is not properly initialized. Try this configuration:
hconf.set('fs.cos.impl', 'com.ibm.stocator.fs.ObjectStoreFileSystem')
hconf.set('fs.stocator.scheme.list', 'cos')
hconf.set('fs.stocator.cos.impl', 'com.ibm.stocator.fs.cos.COSAPIClient')
hconf.set('fs.stocator.cos.scheme', 'cos')
hconf.set('fs.cos.mycos.access.key', access_key)
hconf.set('fs.cos.mycos.endpoint', 'http://' + host)
hconf.set('fs.cos.mycos.secret.key', secret_key)
hconf.set('fs.cos.service.v2.signer.type', 'false')
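Because the service name appears both in the property keys and in the URL, a typo in one place is easy to miss. The sketch below derives both from the same variables; the helper names cos_hadoop_conf and cos_url are hypothetical, and the property names are copied from the configuration above rather than from the Stocator documentation.

```python
def cos_hadoop_conf(service, endpoint, access_key, secret_key):
    """Build the Hadoop properties listed above for one COS service name."""
    prefix = "fs.cos.{0}".format(service)
    return {
        "fs.cos.impl": "com.ibm.stocator.fs.ObjectStoreFileSystem",
        "fs.stocator.scheme.list": "cos",
        "fs.stocator.cos.impl": "com.ibm.stocator.fs.cos.COSAPIClient",
        "fs.stocator.cos.scheme": "cos",
        prefix + ".access.key": access_key,
        prefix + ".endpoint": endpoint,
        prefix + ".secret.key": secret_key,
        # Copied verbatim from the configuration above.
        "fs.cos.service.v2.signer.type": "false",
    }


def cos_url(bucket, service, obj):
    """Build a cos://<bucket>.<service>/<object> URL."""
    return "cos://{0}.{1}/{2}".format(bucket, service, obj)


conf = cos_hadoop_conf("mycos", "http://lon.ibmselect.objstor.com", "XXX", "XXX")
for key in sorted(conf):
    print(key)  # in a notebook: hconf.set(key, conf[key])
print(cos_url("mybucket", "mycos", "mydata.tsv.gz"))
```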
UPDATE 1:
You also need to ensure the Stocator classes are on the classpath. You can use the packages system by executing pyspark as follows:
./bin/pyspark --packages com.ibm.stocator:stocator:1.0.24
This works with swift2d and cos scheme.
UPDATE 2:
Just follow the Stocator documentation (https://github.com/CODAIT/stocator). It contains all the details on how to install it, which branch to use, etc.
I found the same issue, and to solve it I just changed environment:
Within IBM Watson Studio, if you start a Jupyter notebook in an environment without a pre-configured Spark cluster, then you get that error. Installing PySpark is not enough.
Instead, if you start a notebook with the Spark cluster available, you will be just fine.
You have to set .config("spark.hadoop.fs.stocator.scheme.list", "cos") along with some other fs.cos... configurations.
Here's an end-to-end code example that works (tested with pyspark==2.3.2 and Python 3.7.3):
from pyspark.sql import SparkSession
stocator_jar = '/path/to/stocator-1.1.2-SNAPSHOT-IBM-SDK.jar'
cos_instance_name = '<myCosInstanceName>'
bucket_name = '<bucketName>'
s3_region = '<region>'
cos_iam_api_key = '*******'
iam_service_id = 'crn:v1:bluemix:public:iam-identity::<****************>'
spark_builder = (
SparkSession
.builder
.appName('test_app'))
spark_builder.config('spark.driver.extraClassPath', stocator_jar)
spark_builder.config('spark.executor.extraClassPath', stocator_jar)
spark_builder.config(f"fs.cos.{cos_instance_name}.iam.api.key", cos_iam_api_key)
spark_builder.config(f"fs.cos.{cos_instance_name}.endpoint", f"s3.{s3_region}.cloud-object-storage.appdomain.cloud")
spark_builder.config(f"fs.cos.{cos_instance_name}.iam.service.id", iam_service_id)
spark_builder.config("spark.hadoop.fs.stocator.scheme.list", "cos")
spark_builder.config("spark.hadoop.fs.cos.impl", "com.ibm.stocator.fs.ObjectStoreFileSystem")
spark_builder.config("fs.stocator.cos.impl", "com.ibm.stocator.fs.cos.COSAPIClient")
spark_builder.config("fs.stocator.cos.scheme", "cos")
spark_sess = spark_builder.getOrCreate()
dataset = spark_sess.range(1, 10)
dataset = dataset.withColumnRenamed('id', 'user_idx')
dataset.repartition(1).write.csv(
f'cos://{bucket_name}.{cos_instance_name}/test.csv',
mode='overwrite',
header=True)
spark_sess.stop()
print('done!')
Does Mongoid have any method like ActiveRecord::Base.connected??
I want to check whether the connection is accessible.
We wanted to implement a health check for our running Mongoid client that tells us whether the established connection is still alive. This is what we came up with:
Mongoid.default_client.database_names.present?
Basically it takes your current client and tries to query the databases on its connected server. If this server is down, you will run into a timeout, which you can catch.
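The pattern itself (run a cheap query, treat any failure as "down") is independent of the driver. Here is a minimal sketch of it in Python; the probe argument is a hypothetical stand-in for a call such as listing database names, and the two probe functions are invented purely so the example runs without a database.

```python
def connection_alive(probe):
    """Return True if probe() completes and yields a non-empty result."""
    try:
        return bool(probe())
    except Exception:
        # Any failure (timeout, refused connection, auth error) counts as "down".
        return False


def healthy_probe():
    # Hypothetical stand-in for a call that lists database names.
    return ["admin", "local"]


def failing_probe():
    # Hypothetical stand-in for a server that never answers.
    raise TimeoutError("no server available")


print(connection_alive(healthy_probe))
print(connection_alive(failing_probe))
```

Returning a boolean rather than re-raising keeps the health endpoint simple; log the exception inside the except branch if you need the reason.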
My solution:
def check_mongoid_connection
  mongoid_config = File.read("#{Rails.root}/config/mongoid.yml")
  config = YAML.load(mongoid_config)[Rails.env].symbolize_keys
  host, db_name, user_name, password = config[:host], config[:database], config[:username], config[:password]
  port = config[:port] || Mongo::Connection::DEFAULT_PORT
  db_connection = Mongo::Connection.new(host, port).db(db_name)
  db_connection.authenticate(user_name, password) unless (user_name.nil? || password.nil?)
  db_connection.collection_names
  return { status: :ok }
rescue Exception => e
  return { status: :error, data: { message: e.to_s } }
end
snrlx's answer is great.
I use the following in my Puma config file, FYI:
before_fork do
begin
# load configuration
Mongoid.load!(File.expand_path('../../mongoid.yml', __dir__), :development)
fail('Default client db check failed, is db connective?') unless Mongoid.default_client.database_names.present?
rescue => exception
# raise runtime error
fail("connect to database failed: #{exception.message}")
end
end
One thing to keep in mind: the default server_selection_timeout is 30 seconds, which is too long for a DB status check, at least in development. You can change this in your mongoid.yml.
I want to use MongoDB as a cache store for Infinispan, to persist the data evicted according to the eviction policy.
I am posting the snippet of code that causes the exception, along with the exception itself:
ConfigurationBuilder config = new ConfigurationBuilder();
MongoDBCacheStore strgBuilder = new MongoDBCacheStore();
ConfigurationBuilder b = new ConfigurationBuilder();
b.persistence()
.addStore(MongoDBCacheStoreConfigurationBuilder.class)
.host( "localhost" )
.port( 27017 )
.timeout( 1500 )
.acknowledgment( 0 )
.username( "" )
.password( "" )
.database( "infinispan_cachestore" )
.collection( "entries" );
/* DefaultCacheManager manager=new DefaultCacheManager(b.build());
Cache ch=manager.getCache();
ch.put("username","sogani"); */
final Configuration configcache = b.build();
MongoDBCacheStoreConfiguration store = (MongoDBCacheStoreConfiguration) configcache.persistence().stores().get(0);
The exception that I am getting is:
java.lang.NoSuchMethodException: org.infinispan.loaders.mongodb.configuration.MongoDBCacheStoreConfigurationBuilder.
Any pointers would be a great help.
Thanks.
The MongoDB cache store was not updated after the new persistence API was adopted in Infinispan. Try Infinispan 5.2.7.Final, or maybe 5.3.0.Final, or look into the adaptor52x stuff. Or, even better, try to reimplement it using the new CacheWriter interface and submit a PR; the existing code should give you some guidelines.
I am using Neo4j (embedded) Enterprise Edition 1.9.4 along with the Scala-Neo4j wrapper in my project. I tried to back up the Neo4j data using the Java backup API, as shown below:
def backup_data()
{
  val backupPath: File = new File("D:/neo4j-enterprise-1.9.4/data/backup/")
  val backup = OnlineBackup.from( "127.0.0.1" )
  if(backupPath.list().length > 0)
  {
    backup.incremental( backupPath.getPath() )
  }
  else
  {
    backup.full( backupPath.getPath() );
  }
}
It works fine for the full backup, but the incremental backup throws a NullPointerException.
Where did I go wrong?
EDIT
Building the GraphDatabase instance through Scala-Neo4j wrapper
class MyNeo4jClass extends SomethingClass with Neo4jWrapper with EmbeddedGraphDatabaseServiceProvider {
def neo4jStoreDir = "/tmp/temp-neo-test"
. . .
}
Stacktrace
Exception in thread "main" java.lang.NullPointerException
at org.neo4j.consistency.checking.OwnerChain$3.checkReference(OwnerChain.java:111)
at org.neo4j.consistency.checking.OwnerChain$3.checkReference(OwnerChain.java:106)
at org.neo4j.consistency.report.ConsistencyReporter$DiffReportHandler.checkReference(ConsistencyReporter.java:330)
at org.neo4j.consistency.report.ConsistencyReporter.dispatchReference(ConsistencyReporter.java:109)
at org.neo4j.consistency.report.PendingReferenceCheck.checkReference(PendingReferenceCheck.java:50)
at org.neo4j.consistency.store.DirectRecordReference.dispatch(DirectRecordReference.java:39)
at org.neo4j.consistency.report.ConsistencyReporter$ReportInvocationHandler.forReference(ConsistencyReporter.java:236)
at org.neo4j.consistency.report.ConsistencyReporter$ReportInvocationHandler.dispatchForReference(ConsistencyReporter.java:228)
at org.neo4j.consistency.report.ConsistencyReporter$ReportInvocationHandler.invoke(ConsistencyReporter.java:192)
at $Proxy17.forReference(Unknown Source)
at org.neo4j.consistency.checking.OwnerChain.check(OwnerChain.java:143)
at org.neo4j.consistency.checking.PropertyRecordCheck.checkChange(PropertyRecordCheck.java:57)
at org.neo4j.consistency.checking.PropertyRecordCheck.checkChange(PropertyRecordCheck.java:35)
at org.neo4j.consistency.report.ConsistencyReporter.dispatchChange(ConsistencyReporter.java:101)
at org.neo4j.consistency.report.ConsistencyReporter.forPropertyChange(ConsistencyReporter.java:382)
at org.neo4j.consistency.checking.incremental.StoreProcessor.checkProperty(StoreProcessor.java:61)
at org.neo4j.consistency.checking.AbstractStoreProcessor.processProperty(AbstractStoreProcessor.java:95)
at org.neo4j.consistency.store.DiffRecordStore$DispatchProcessor.processProperty(DiffRecordStore.java:207)
at org.neo4j.kernel.impl.nioneo.store.PropertyStore.accept(PropertyStore.java:83)
at org.neo4j.kernel.impl.nioneo.store.PropertyStore.accept(PropertyStore.java:43)
at org.neo4j.consistency.store.DiffRecordStore.accept(DiffRecordStore.java:159)
at org.neo4j.kernel.impl.nioneo.store.RecordStore$Processor.applyById(RecordStore.java:180)
at org.neo4j.consistency.store.DiffStore.apply(DiffStore.java:73)
at org.neo4j.kernel.impl.nioneo.store.StoreAccess.applyToAll(StoreAccess.java:174)
at org.neo4j.consistency.checking.incremental.IncrementalDiffCheck.execute(IncrementalDiffCheck.java:43)
at org.neo4j.consistency.checking.incremental.DiffCheck.check(DiffCheck.java:39)
at org.neo4j.consistency.checking.incremental.intercept.CheckingTransactionInterceptor.complete(CheckingTransactionInterceptor.java:160)
at org.neo4j.kernel.impl.transaction.xaframework.InterceptingXaLogicalLog$1.intercept(InterceptingXaLogicalLog.java:79)
at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog$LogDeserializer.readAndWriteAndApplyEntry(XaLogicalLog.java:1120)
at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.applyTransaction(XaLogicalLog.java:1292)
at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.applyCommittedTransaction(XaResourceManager.java:766)
at org.neo4j.kernel.impl.transaction.xaframework.XaDataSource.applyCommittedTransaction(XaDataSource.java:246)
at org.neo4j.com.ServerUtil.applyReceivedTransactions(ServerUtil.java:423)
at org.neo4j.backup.BackupService.unpackResponse(BackupService.java:453)
at org.neo4j.backup.BackupService.incrementalWithContext(BackupService.java:388)
at org.neo4j.backup.BackupService.doIncrementalBackup(BackupService.java:286)
at org.neo4j.backup.BackupService.doIncrementalBackup(BackupService.java:273)
at org.neo4j.backup.OnlineBackup.incremental(OnlineBackup.java:147)
at Saddahaq.User_node$.backup_data(User_node.scala:1637)
at Saddahaq.User_node$.main(User_node.scala:2461)
at Saddahaq.User_node.main(User_node.scala)
After the backup is taken, the backup target is checked for consistency. The incremental version of the consistency checker currently suffers from a bug that leads to the observed NPE.
Workaround: either always take full backups with backup.full or prevent consistency checking on incremental backups by using
backup.incremental(backupPath.getPath(), false);
I have a small flask application which I am deploying to Heroku.
My local configuration looks like this:
from flask import Flask
from flask.ext.mongoengine import MongoEngine
app = Flask(__name__)
app.debug = True
app.config["MONGODB_SETTINGS"] = {'DB': "my_app"}
app.config["SECRET_KEY"] = "secretpassword"
db = MongoEngine(app)
So, I know that I need to configure the app to use the Mongo URI method of connection, and I have my connection info:
mongodb://<user>:<password>@alex.mongohq.com:10043/app12345678
I am just a little stuck as to the syntax for modifying my app to connect through the URI.
So I got it working (finally):
from flask import Flask
from mongoengine import connect
app = Flask(__name__)
app.config["MONGODB_DB"] = 'app12345678'
connect(
'app12345678',
username='heroku',
password='a614e68b445d0d9d1c375740781073b4',
host='mongodb://<user>:<password>@alex.mongohq.com:10043/app12345678',
port=10043
)
Though I anticipate that various other configurations will work.
When you look at the flask-mongoengine code, you can see what configuration variables are available
So this should work:
app.config["MONGODB_HOST"] = 'alex.mongohq.com/app12345678'
app.config["MONGODB_PORT"] = 10043
app.config["MONGODB_DB"] = 'dbname'
app.config["MONGODB_USERNAME"] = 'user'
app.config["MONGODB_PASSWORD"] = 'password'
db = MongoEngine(app)
I'm not sure whether app12345678 is the app name or the database name. You might have to fiddle around a little to get the connection working. I had the same problem with Mongokit + MongoLab on Heroku :)
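One way to see which part of the connection string is which is to split it with the standard library. This is a rough sketch that assumes the usual single-host form mongodb://user:password@host:port/database; the function name split_mongo_uri and the credentials are made up, and it does not handle replica-set URIs with several hosts.

```python
from urllib.parse import urlsplit, unquote


def split_mongo_uri(uri):
    """Split a single-host MongoDB URI into its connection settings."""
    parts = urlsplit(uri)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        # The path component after the slash is the database name.
        "db": parts.path.lstrip("/") or None,
    }


info = split_mongo_uri("mongodb://user:secret@alex.mongohq.com:10043/app12345678")
print(info["host"], info["port"], info["db"])
```

With the URI from the question, the trailing app12345678 comes out as the database name, and the host and port can be passed separately.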
Also you could use the URI like this.
app.config["MONGODB_SETTINGS"] = {'DB': "my_app", "host": 'mongodb://<user>:<password>@alex.mongohq.com:10043/app12345678'}
I actually have no idea at what point "MONGODB_SETTINGS" is read, but it seemed to work when I tried it in the shell.
I figured out how to use the flask.ext.mongoengine.MongoEngine wrapper class to do this rather than mongoengine.connect():
from flask import Flask
from flask.ext.mongoengine import MongoEngine
app = Flask(__name__)
HOST = '<hostname>' # ex: 'oceanic.mongohq.com'
db_settings = {
'MONGODB_DB': '<database>',
'MONGODB_USERNAME': '<username>',
'MONGODB_PASSWORD': '<password>',
'MONGODB_PORT': <port>,
}
app.config.update(db_settings)
app.config["MONGODB_HOST"] = ('mongodb://%(MONGODB_USERNAME)s:%(MONGODB_PASSWORD)s@' +
    HOST + ':%(MONGODB_PORT)s/%(MONGODB_DB)s') % db_settings
db = MongoEngine(app)
if __name__ == '__main__':
app.run()
If you're using mongohq, app.config["MONGODB_HOST"] should match the Mongo URI under Databases->Admin->Overview.
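If the username or password contains characters such as @, : or /, a URI assembled by plain string formatting will not parse correctly. A small sketch of building it with percent-encoding follows; build_mongo_uri is a hypothetical helper and the credentials are invented for the example.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host, port, database):
    """Assemble a MongoDB URI, percent-encoding the credentials."""
    return "mongodb://{0}:{1}@{2}:{3}/{4}".format(
        quote_plus(username), quote_plus(password), host, port, database)


uri = build_mongo_uri("heroku", "p@ss:word/1", "alex.mongohq.com", 10043, "app12345678")
print(uri)
```

The resulting string can be assigned to app.config["MONGODB_HOST"] in place of the hand-formatted one.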
You can then follow MongoDB's tumblelog tutorial using this setup to write your first app called tumblelog.
Using python's nifty object introspection (python oh how I love you so), you can see how the MongoEngine wrapper class accomplishes this:
from flask.ext.mongoengine import MongoEngine
import inspect
print(inspect.getsource(MongoEngine))
...
conn_settings = {
'db': app.config.get('MONGODB_DB', None),
'username': app.config.get('MONGODB_USERNAME', None),
'password': app.config.get('MONGODB_PASSWORD', None),
'host': app.config.get('MONGODB_HOST', None),
'port': int(app.config.get('MONGODB_PORT', 0)) or None
}
...
self.connection = mongoengine.connect(**conn_settings)
...
self.app = app