Titan - How to Use 'Lucene' Search Backend

I am attempting to use the Lucene search backend with Titan. I am setting the index.search.backend property to lucene like so:
TitanFactory.Builder config = TitanFactory.build();
config.set("storage.backend", "hbase");
config.set("storage.hostname", "node1");
config.set("storage.hbase.table", "titan");
config.set("index.search.backend", "lucene");
config.set("index.search.directory", "/tmp/foo");
TitanGraph graph = config.open();
GraphOfTheGodsFactory.load(graph);
graph.getVertices().forEach(v -> System.out.println(v.toString()));
Of course, this does not work because this setting is of the GLOBAL_OFFLINE variety. The logs make me aware of this. Titan ignores my 'lucene' setting and then attempts to use Elasticsearch as the search backend.
WARN com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration
- Local setting index.search.backend=lucene (Type: GLOBAL_OFFLINE)
is overridden by globally managed value (elasticsearch). Use
the ManagementSystem interface instead of the local configuration to control
this setting.
After some reading, I understand that I need to use the ManagementSystem to set index.search.backend. I need some code that looks something like the following.
TitanManagement mgmt = graph.getManagementSystem(); // reuse one handle: each call opens a new management transaction
mgmt.set("index.search.backend", "lucene");
mgmt.set("index.search.directory", "/tmp/foo");
mgmt.commit();
I am confused about how to integrate this into my original example code above. Since this is a GLOBAL_OFFLINE setting, I cannot set it on an open graph. At the same time, I do not know how to get a graph unless I open one first. How do I set the search backend correctly?

There is no inmemory search backend. The supported search backends are Lucene, Solr, and Elasticsearch.
Lucene is a good option for a small-scale, single-machine search backend. You need to set two properties to do this, index.search.backend and index.search.directory:
index.search.backend=lucene
index.search.directory=/path/to/titansearchindexdir
As you've noted, the search backend is a GLOBAL_OFFLINE setting, so you should configure it before initially creating your graph. Since you've already created a titan table in HBase, either disable and drop the titan table, or point your graph configuration at a new storage.hbase.table.
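For example, here is a minimal sketch of the corrected flow, assuming you point at a fresh table (the name titan2 below is hypothetical) so the GLOBAL_OFFLINE setting is honored when the graph is first created:
TitanFactory.Builder config = TitanFactory.build();
config.set("storage.backend", "hbase");
config.set("storage.hostname", "node1");
config.set("storage.hbase.table", "titan2"); // fresh table, so no frozen global configuration yet
config.set("index.search.backend", "lucene"); // now honored on first open
config.set("index.search.directory", "/tmp/foo");
TitanGraph graph = config.open();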

Related

I can't find CloudWatch metric in Grafana UI query editor/builder

I'm trying to create a Grafana dashboard that will reflect my AWS RDS cluster metrics.
For simplicity I chose CloudWatch as a datasource; it works well for showing the 'direct' metrics from the RDS cluster.
The problem is that we've switched to RDS Proxy due to the high number of connections we are required to support.
Now I'm adjusting my dashboard to reflect a few metrics that are missing, the most important being the number of actual connections, which the AWS CloudWatch console presents with this query:
SELECT AVG(DatabaseConnections)
FROM SCHEMA("AWS/RDS", ProxyName,Target,TargetGroup)
WHERE Target = 'db:my-db-1'
AND ProxyName = 'my-db-rds-proxy'
AND TargetGroup = 'default'
The problem is that I can't find it anywhere in Grafana's CloudWatch query editor.
The only metric with "connections" is the standard DatabaseConnections, which represents the 'direct' connections to the RDS cluster and not the connections to the RDS Proxy.
Any ideas?
That UI editor is generated from a hardcoded list of metrics, which may not contain all metrics and dimensions (especially if they have been added recently), so in that case the UI doesn't offer them in the select box.
But that is not a problem, because that select box is not a standard select box. It is an input where you can write your own metric and dimension names. Just click there, write what you need, and hit Enter to add it (the same is applicable for dimensions).
Pro tip: don't use the UI query builder (that's for beginners); switch to Code and write your queries directly (the UI builder just builds that query under the hood anyway).
It would be nice if you created a Grafana PR adding the metrics and dimensions that are missing from the UI builder to metrics.go.
So, for whoever ends up here: you should use ClientConnections, with ProxyName as the dimension (which I didn't set initially).
I was also using an old Grafana version (7.3.5) which didn't have it built in.
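For illustration, the adjusted query in Code mode would look something like this (reusing the proxy name from the question; ClientConnections is reported under the ProxyName dimension):
SELECT AVG(ClientConnections)
FROM SCHEMA("AWS/RDS", ProxyName)
WHERE ProxyName = 'my-db-rds-proxy'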

Spring data mongodb federation attempts -- how can I get interface methods to use a custom configured mongotemplate?

In my application, I need to be able to connect to any number of mongodb hosts, and any number of databases in any of those hosts to support at least this basic level of query federation. This is specified by configuration, so, for any given installation of our app, I cannot know ahead of time how many collections I will need to access. I based my attempt on configuration that I saw in this Baeldung article with some modifications to suit my requirements. My configuration looks something like this yaml:
datasources:
  - name: source1
    uri: mongodb://user1:pass1@127.0.0.1:27017
    fq_collection: db1.coll1
  - name: source2
    uri: mongodb://user1:pass1@192.168.0.100:27017
    fq_collection: db2.coll2
And, depending on the installation, there could be any number of datasources entries. So, in my @Configuration class, I can iterate through these entries, which are injected via configuration properties. But I want to create a MongoTemplate for each of these, since I cannot rely on the default MongoTemplate. The solution I have attempted is to create a repository interface, and then to create a custom impl that will accept the configured MongoTemplate. When I use this code to create each repository instance with its template:
public MongoRepository<Item, String> mongoCustomRepositories(MongoTemplate template) {
    MyCustomMongoRepository customImpl = new MyCustomMongoRepositoryImpl(template);
    MongoRepositoryFactory repositoryFactory = new MongoRepositoryFactory(template);
    return repositoryFactory.getRepository(MyMongoRepository.class, customImpl);
}
And I call it from a @Bean method that returns the list of all of these repositories created from the config entries, so I can inject the repositories into service classes.
UPDATE/EDIT: OK, I set MongoDB profiling to 2 in order to log the queries. It turns out that the queries are in fact being sent to MongoDB, but the problem is that the collection names are not being set for the model. I can't believe I forgot about this, but I did: it was using the lower-camel-case model class name as the collection name, which ensures that no documents are retrieved. I have default collection names, but the specific collection names are set in the configuration, as the example YAML shows. I have a couple of ideas, but if anyone has a suggestion about how to set these dynamically, that would help a lot.
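One possible direction (just a sketch: the extra constructor parameter and the findByName method are hypothetical, but the explicit collection-name overload on MongoTemplate is standard Spring Data): pass the configured collection name into the custom impl and use the overloads that take a collection name, instead of letting Spring derive it from the class.
import java.util.List;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

public class MyCustomMongoRepositoryImpl implements MyCustomMongoRepository {

    private final MongoTemplate template;
    private final String collectionName; // e.g. "coll1", parsed from fq_collection

    public MyCustomMongoRepositoryImpl(MongoTemplate template, String collectionName) {
        this.template = template;
        this.collectionName = collectionName;
    }

    // Hypothetical query method: the explicit collection-name overload
    // bypasses the collection derived from the Item class name.
    public List<Item> findByName(String name) {
        Query query = new Query(Criteria.where("name").is(name));
        return template.find(query, Item.class, collectionName);
    }
}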
EDIT 2: I did a bunch of work and I have it almost working. You can see my work in my repo. However, in doing this, I uncovered a bug in spring-data-mongodb, and I filed an issue.

Orion contextBroker allows set read preference to Mongodb replicaset?

I'm reading the documentation of Orion Context Broker, and among the command line arguments I don't see any argument to set the read preference for my MongoDB replica set. In my application I need the read preference set to nearest, to avoid bottlenecks in periods of high query traffic. Does anyone know if this is possible?
The current Orion version (3.3.1) doesn't allow setting the read preference. There is an open issue in the Orion repository about implementing a -mongoUri CLI parameter to allow setting the MongoDB connection URI (so you could add, for instance, &readPreference=secondary to it).
Alternatively, you could hack the Orion source code to build a specific version with the readPreference value you want. Have a look at the composeMongoUri() function. It seems it is a matter of just adding uri += optionPrefix + "readPreference=<whatever you want>"; at the end.
It is not a smart solution (it is not flexible, and you would need to rebuild Orion every time you want to change the setting), but it could be a valid workaround while -mongoUri gets implemented.
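For reference, a standard MongoDB connection URI carrying the read preference asked about in the question would look like this (hosts, credentials, and database name are illustrative); this is the kind of value a -mongoUri parameter could eventually accept:
mongodb://user:pass@host1:27017,host2:27017,host3:27017/orion?replicaSet=rs0&readPreference=nearest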

Row level security using prisma and postgres

I am using prisma and yoga graphql servers with a postgres DB.
I want to implement authorization for my GraphQL queries. I saw solutions like graphql-shield that solve column-level security nicely, meaning I can define a permission and according to it block or allow a specific table or column of data (or, in GraphQL terms, block a whole entity or a specific field).
The part I am stuck on is row-level security, i.e. filtering rows by the data they contain. Say I want to allow a logged-in user to view only the data that is related to him; then, depending on the value in a user_id column, I would allow or block access to that row (the logged-in user is one example, but there are other use cases in this genre).
This type of security requires running a query to check which rows the current user has access to and I can't find a way (that is not horrible) to implement this with prisma.
If I were working without prisma, I would implement this at the level of each resolver, but since I am forwarding my queries to prisma, I do not control the internal resolvers on a nested query.
But I do want to work with prisma, so one idea we had was handling this at the DB level using a Postgres policy. This could work as follows:
Every query we run will be surrounded with "begin transaction" and "commit transaction".
Before the query I want to run "set local context.user_id to 5".
Then I want to run the query (and the policy will filter results according to current_setting('context.user_id')).
For this to work I would need prisma to allow me to either add pre/post queries to each query that runs or let me set a context for the db.
But these options are not available in prisma.
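Concretely, the database side of that idea would look something like this (a sketch only; the documents table and user_id column are illustrative):
-- One-time setup: enable RLS and filter rows by the per-transaction setting
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_documents ON documents
  USING (user_id = current_setting('context.user_id')::int);

-- Per request:
BEGIN;
SET LOCAL context.user_id TO '5';
SELECT * FROM documents;  -- the policy filters to user 5's rows
COMMIT;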
Any ideas?
You can use prisma-client instead of prisma-binding.
With prisma-binding, you define the top-level resolver and then delegate to prisma for all the nesting.
On the other hand, prisma-client only returns scalar values of a type, and you need to define the resolvers for the relations yourself, which means you have complete control over what you return, even for nested queries (see the documentation for an example).
I would suggest you use prisma-client to apply your security filters on the fields.
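As a rough sketch of that approach (the Document model, its owner relation, and the ctx shape are assumptions for illustration):
const resolvers = {
  Query: {
    // Top-level resolver: fetch only rows belonging to the logged-in user
    documents: (parent, args, ctx) =>
      ctx.prisma.documents({ where: { owner: { id: ctx.userId } } }),
  },
  Document: {
    // Relation resolver: you decide what nested data is returned
    owner: (parent, args, ctx) =>
      ctx.prisma.document({ id: parent.id }).owner(),
  },
};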
With the approach you're looking to take, I'd definitely recommend a look at Graphile. It approaches row-level security essentially the same way that you're thinking of. Unfortunately, it seems like Prisma doesn't help you move away from writing traditional REST-style controller methods in this regard.

Is meteor using the Mongo Oplog?

How can I check whether Meteor is using the oplog of my Mongo?
I have a Mongo cluster and have set two envs for my Meteor app:
MONGO_URL=mongodb://mongo/app?replicaSet=rs0
MONGO_OPLOG_URL=mongodb://mongo/local?authSource=app
How can I check whether the oplog is actually in use? Meteor can fall back to query polling, which is very inefficient, and I would like to verify that it's working properly with the oplog.
Any ideas?
Quoting the relevant bits from Meteor's OplogObserveDriver docs:
How to tell if your queries are using OplogObserveDriver
For now, we only have a crude way to tell how many observeChanges calls are using OplogObserveDriver, and not which calls they are.
This uses the facts package, an internal Meteor package that exposes real-time metrics for the current Meteor server. In your app, run meteor add facts, and add the {{> serverFacts}} template to your app. If you are using the autopublish package, Meteor will automatically publish all metrics to all users. If you are not using autopublish, you will have to tell Meteor which users can see your metrics by calling Facts.setUserIdFilter in server code; for example:
Facts.setUserIdFilter(function (userId) {
var user = Meteor.users.findOne(userId);
return user && user.admin;
});
(When running your app locally, Facts.setUserIdFilter(function () { return true; }); may be good enough!)
Now look at your app. The facts template will render a variety of metrics; the ones we're looking for are observe-drivers-oplog and observe-drivers-polling in the mongo-livedata section. If observe-drivers-polling is zero or not rendered at all, then all of your observeChanges calls are using OplogObserveDriver!
To set up oplog tailing, you need to set up a user on my_database, and an oplog_user on local. Then, specify the following URIs to connect to your replica set named test-shard (e.g. if there are 3 hosts named test-shard-[0-2]):
MONGO_URL="mongodb://user:PASS#test-shard-0.mongodb.net:27017,test-shard-1.mongodb.net:27017,test-shard-2.mongodb.net:27017/my_database?ssl=true&replicaSet=test-shard&authSource=admin"
MONGO_OPLOG_URL="mongodb://oplog_user:PASS#test-shard-0.mongodb.net:27017,test-shard-1.mongodb.net:27017,test-shard-2.mongodb.net:27017/local?ssl=true&replicaSet=test-shard&authSource=admin"
On MongoDB Atlas they require ssl=true, and all users authenticate through the admin database. On another deployment you might just authenticate through my_database, in which case you'd remove authSource=admin from MONGO_URL and write authSource=my_database in MONGO_OPLOG_URL. See this post for another example.
With MongoDB 3.6 and the Mongo node driver 3.0+, you may be able to use a succinct notation for DNS seedlist connections, e.g. on MongoDB Atlas, to specify the environment variables:
MONGO_URL="mongodb+srv://user:PASS#foo.mongodb.net/my_database"
MONGO_OPLOG_URL="mongodb+srv://oplog_user:PASS#foo.mongodb.net/local"
The link above explains how this notation fills in the ssl, replicaSet, and authSource arguments. This is a lot nicer than the long strings above, and also means you can scale your replica set up and down without needing to reconfigure anything.
As hwillson mentioned, use the facts-ui and facts-base packages (formerly facts) to see if there are any oplogObserveDrivers running in your app. If they are all pollingObserveDriver, then oplog tailing is not set up correctly.
If you are using Kadira APM to monitor your app's performance, you can see if oplogs are working by navigating to the "Live Queries" section and having a look at the "Oplog notifications" chart.
In my case oplogs were working, as values appeared in that chart. If oplogs weren't working, the chart would be empty.
This may be very late, but this is the only way that worked for me:
someCollection._driver.mongo._oplogHandle
If this is null, then the oplog is not enabled; otherwise you can use this handle to check for more details.
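For instance, a quick check from server code (or the meteor shell) might look like this; note that _driver and _oplogHandle are internal, undocumented APIs that may change between Meteor versions, and SomeCollection is a placeholder for one of your collections:
// Run on the server, e.g. in `meteor shell`
const handle = SomeCollection._driver.mongo._oplogHandle;
if (handle) {
  console.log('Oplog tailing is enabled');
} else {
  console.log('Oplog tailing is NOT enabled (falling back to polling)');
}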