How do I confirm I am reading the data from Mongo secondary server from Java - mongodb

For performance optimisation we are trying to read data from the Mongo secondary server for selected scenarios. I am using an inline query with "withReadPreference(ReadPreference.secondaryPreferred())" to read the data; the code snippet is below.
What I want to confirm is that the data we are getting really comes from the secondary server after executing the highlighted inline query. Is there any method available to check this from Java or Spring Boot?
public User read(final String userId) {
    final ObjectId objectId = new ObjectId(userId);
    final User user = collection
            .withReadPreference(ReadPreference.secondaryPreferred())
            .findOne(objectId)
            .as(User.class);
    return user;
}

Pretty much the same way in Java. Note that we use secondary(), not secondaryPreferred(); this guarantees reads from a secondary ONLY:
import com.mongodb.ReadPreference;

// This is your "regular" primary-preferred collection:
MongoCollection<BsonDocument> tcoll = db.getCollection("myCollection", BsonDocument.class);

// ... various operations on tcoll, then create a new
// handle that FORCES reads from a secondary and will time out and
// fail if no secondary can be found:
MongoCollection<BsonDocument> xcoll = tcoll.withReadPreference(ReadPreference.secondary());
BsonDocument f7 = xcoll.find(queryExpr).first();
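If you need to confirm at runtime which member actually served the read, one option is to register a CommandListener on the client and log the server address each command was routed to. This is only a sketch, assuming the 4.x synchronous Java driver; the connection string, database and collection names are placeholders. In Spring Boot the listener can typically be registered through a MongoClientSettingsBuilderCustomizer bean instead of building the client by hand.
import com.mongodb.ConnectionString;
import com.mongodb.MongoClientSettings;
import com.mongodb.ReadPreference;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.event.CommandFailedEvent;
import com.mongodb.event.CommandListener;
import com.mongodb.event.CommandStartedEvent;
import com.mongodb.event.CommandSucceededEvent;
import org.bson.Document;

public class ReadPreferenceCheck {

    // Logs the replica-set member (host:port) each query is sent to,
    // so you can verify whether a find was routed to a secondary.
    static class ReadTracingListener implements CommandListener {
        @Override
        public void commandStarted(CommandStartedEvent event) {
            if ("find".equals(event.getCommandName())) {
                System.out.println("find routed to "
                        + event.getConnectionDescription().getServerAddress());
            }
        }

        @Override
        public void commandSucceeded(CommandSucceededEvent event) { }

        @Override
        public void commandFailed(CommandFailedEvent event) { }
    }

    public static void main(String[] args) {
        // Replica-set URI is a placeholder; replace with your own hosts.
        MongoClientSettings settings = MongoClientSettings.builder()
                .applyConnectionString(new ConnectionString(
                        "mongodb://host1:27017,host2:27017/?replicaSet=rs0"))
                .addCommandListener(new ReadTracingListener())
                .build();

        try (MongoClient client = MongoClients.create(settings)) {
            Document doc = client.getDatabase("test")
                    .getCollection("users")
                    .withReadPreference(ReadPreference.secondaryPreferred())
                    .find()
                    .first();
            System.out.println(doc);
        }
    }
}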

Related

Is there any way to compress the data while using mongo persistence with NEventStore?

I'm working with C#, .NET Core and NEventStore (version 9.0.1), trying to evaluate the various persistence options that it supports out of the box.
More specifically, when trying to use the Mongo persistence, the payload is stored without any compression being applied.
Note: payload compression works perfectly when using the SQL persistence of NEventStore, but not with the Mongo persistence.
I'm using the code below to create and initialize the event store:
private IStoreEvents CreateEventStore(string connectionString)
{
    var store = Wireup.Init()
        .UsingMongoPersistence(connectionString,
            new NEventStore.Serialization.DocumentObjectSerializer())
        .InitializeStorageEngine()
        .UsingBsonSerialization()
        .Compress()
        .HookIntoPipelineUsing()
        .Build();
    return store;
}
And I'm using the code below for storing the events:
public async Task AddMessageTostore(Command command)
{
    using (var stream = _eventStore.CreateStream(command.Id))
    {
        stream.Add(new EventMessage { Body = command });
        stream.CommitChanges(Guid.NewGuid());
    }
}
The workaround I tried: by implementing the PreCommit(CommitAttempt attempt) and Select methods of IPipelineHook and using gzip compression logic, compression of the events in MongoDB was achieved.
Attaching data store images of both the SQL and Mongo persistence:
So, the questions are:
Is there some other option or setting I'm missing so that the events get compressed while saving (the fluent way of calling the Compress method)?
Is the workaround mentioned above sensible, or is it a performance overhead?
I also faced the same issue while using NEventStore.Persistence.MongoDB.
Even when I used the fluent Compress() method, the payload was not compressed in the Mongo persistence the way it is with the SQL persistence.
Finally, I achieved compression/decompression by customizing the logic inside the PreCommit(CommitAttempt attempt) and Select(ICommit committed) methods.
Code used for compression:
using (var stream = new MemoryStream())
{
    using (var compressedStream = new GZipStream(stream, CompressionMode.Compress))
    {
        var serializer = new JsonSerializer
        {
            TypeNameHandling = TypeNameHandling.None,
            ReferenceLoopHandling = ReferenceLoopHandling.Ignore
        };
        // Serialize the object through the gzip stream.
        var writer = new JsonTextWriter(new StreamWriter(compressedStream));
        serializer.Serialize(writer, this);
        writer.Flush();
    }
    // The GZipStream must be disposed before reading the buffer so the gzip footer is written.
    return stream.ToArray();
}
Code used for decompression:
using (var stream = new MemoryStream(bytes))
using (var decompressedStream = new GZipStream(stream, CompressionMode.Decompress))
{
    var serializer = new JsonSerializer
    {
        TypeNameHandling = TypeNameHandling.None,
        ReferenceLoopHandling = ReferenceLoopHandling.Ignore
    };
    // Deserialize back into the original command type ('bytes' and 'type' are the method's parameters).
    var reader = new JsonTextReader(new StreamReader(decompressedStream));
    var body = serializer.Deserialize(reader, type);
    return body as Command;
}
I'm not sure if this is the right approach, or whether it will have any impact on the performance of EventStore operations like Insert and Select.

Mongoose how to listen for collection changes

I need to build a Mongo updater process to download MongoDB data to local IoT devices (configuration data, etc.).
My goal is to watch some Mongo collections at a fixed interval (1 minute, for example). If a collection has changed (deletion, insertion or update), I will download the full collection to my device. The collections will have no more than a few hundred simple records, so it's not going to be a lot of data to download.
Is there any mechanism to find out whether a collection has changed since the last poll? Which Mongo features should be used in that case?
To listen for changes to your MongoDB collection, set up a Mongoose Model.watch.
const PersonModel = require('./models/person')
const personEventEmitter = PersonModel.watch()
personEventEmitter.on('change', change => console.log(JSON.stringify(change)))
const person = new PersonModel({name: 'Thabo'})
person.save()
// Triggers console log on change stream
// {_id: '...', operationType: 'insert', ...}
Note: this functionality is only available on a MongoDB replica set.
See the Mongoose Model docs for more.
If you want to listen for changes to your DB, use Connection.watch.
See the Mongoose Connection docs for more.
These functions listen for change events from MongoDB Change Streams, available as of MongoDB 3.6.
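Since this is plain MongoDB change streams underneath, the same subscription can also be opened directly from the MongoDB Java driver. A rough sketch, assuming the 4.x synchronous driver and a replica-set connection; the URI, database and collection names are placeholders:
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class WatchPeople {
    public static void main(String[] args) {
        // Change streams require a replica set (or sharded cluster).
        try (MongoClient client = MongoClients.create(
                "mongodb://host1:27017,host2:27017/?replicaSet=rs0")) {
            MongoCollection<Document> people =
                    client.getDatabase("test").getCollection("people");

            // Blocks and prints one event per insert/update/delete on the collection.
            people.watch().forEach(change ->
                    System.out.println(change.getOperationType() + ": " + change.getFullDocument()));
        }
    }
}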
I think the best solution would be using post-update middleware.
You can read more about that here:
http://mongoosejs.com/docs/middleware.html
I have the same requirement on an embedded device that works quite autonomously, and it always needs to adjust its operating parameters without having to reboot the system.
For this I created a configuration-manager class, and in its constructor I coded a "parameter monitor" which checks in the database only the parameters that are flagged for monitoring; of course, if a new configuration needs to be monitored, I tell the config-manager elsewhere in the code to reload that parameter.
As you can see, the process is very simple, and it can of course be improved to avoid overloading the config-manager with many updates and to prevent them from overlapping when the interval is very small.
Since there are many settings to be read, I open a cursor for a query as soon as the database connection is open. As the data stream sends me new documents, I create a proxy for each one so that it can be manipulated according to its type and the internals of the config-manager. I then check whether the property needs to be monitored; if so, I call an inner function named watch that I created to handle this. It reads the sub-property of the same name to see the default interval at which to check the database for updates, and registers a timeout for that task; each check re-creates the timeout with the updated interval, or stops checking if watch is no longer set.
this.connection.once('open', () => {
    let cursor = Config.find({}).cursor();
    cursor.on('data', (doc) => {
        this.config[doc.parametro] = criarProxy(doc.parametro, doc.valor);
        if (doc.watch) {
            console.log(sprintf("Preparing to monitor %s", doc.parametro));
            function watch(configManager, doc) {
                console.log("Monitoring parameter: %s", doc.parametro);
                if (doc.watch) setTimeout(() => {
                    Config.findOne({
                        parametro: doc.parametro
                    }).then((doc) => {
                        console.dir(doc);
                        if (doc) {
                            if (doc.valor != configManager.config[doc.parametro]) {
                                console.log("Monitored parameter %s has changed!", doc.parametro);
                                configManager.config[doc.parametro] = criarProxy(doc.parametro, doc.valor);
                            } else {
                                console.log("Monitored parameter %s has not changed", doc.parametro);
                            }
                            // Schedule the next check.
                            watch(configManager, doc);
                        } else {
                            console.log("Check the parameter: %s");
                        }
                    });
                }, doc.watch);
            }
            watch(this, doc);
        }
    });
    cursor.on('close', () => {
        if (process.env.DEBUG_DETAIL > 2) console.log("ConfigManager closed cursor data");
        resolv(); // 'resolv' comes from the enclosing scope (not shown here)
    });
    cursor.on('end', () => {
        if (process.env.DEBUG_DETAIL > 2) console.log("ConfigManager end data");
    });
});
As you can see, the code can still be improved a lot. If you want to suggest improvements for your environment or to make it more generic, please use the gist: https://gist.github.com/carlosdelfino/929d7918e3d3a6172fdd47a59d25b150

Parallel.Foreach and BulkCopy

I have a C# library which connects to 59 servers with the same database structure and imports data into the same table in my local DB. At the moment I am retrieving the data server by server in a foreach loop:
foreach (var systemDto in systems)
{
    var sourceConnectionString = _systemService.GetConnectionStringAsync(systemDto.Ip).Result;
    var dbConnectionFactory = new DbConnectionFactory(sourceConnectionString,
        "System.Data.SqlClient");
    var dbContext = new DbContext(dbConnectionFactory);
    var storageRepository = new StorageRepository(dbContext);
    var usedStorage = storageRepository.GetUsedStorageForCurrentMonth();

    var dtUsedStorage = new DataTable();
    dtUsedStorage.Load(usedStorage);

    var dcIp = new DataColumn("IP", typeof(string)) { DefaultValue = systemDto.Ip };
    var dcBatchDateTime = new DataColumn("BatchDateTime", typeof(string))
    {
        DefaultValue = batchDateTime
    };
    dtUsedStorage.Columns.Add(dcIp);
    dtUsedStorage.Columns.Add(dcBatchDateTime);

    using (var blkCopy = new SqlBulkCopy(destinationConnectionString))
    {
        blkCopy.DestinationTableName = "dbo.tbl";
        blkCopy.WriteToServer(dtUsedStorage);
    }
}
Because there are many systems to retrieve data from, I wonder whether it is possible to use a Parallel.ForEach loop. Will SqlBulkCopy lock the table during WriteToServer, and will the next WriteToServer wait until the previous one completes?
-- EDIT 1
I've changed the foreach to Parallel.ForEach, but I face one problem. Inside this loop I have an async method: _systemService.GetConnectionStringAsync(systemDto.Ip),
and this line returns an error:
System.NotSupportedException: A second operation started on this context before a previous asynchronous operation completed. Use 'await' to ensure that any asynchronous operations have completed before calling another method on this context. Any instance members are not guaranteed to be thread safe.
Any ideas how I can handle this?
In general, it will get blocked and will wait until the previous operation completes.
There are some factors that may affect whether SqlBulkCopy can be run in parallel or not.
I remember that when adding the parallel feature to my .NET Bulk Operations, I had a hard time making it work correctly in parallel, but it worked well when the table had no index (which is almost never the case).
Even when it worked, the performance gain was not very large.
Perhaps you will find more information here: MSDN - Importing Data in Parallel with Table Level Locking

Get connection used by DatabaseFactory.GetDatabase().ExecuteReader()

We have two different query strategies that we'd ideally like to operate in conjunction on our site without opening redundant connections. One strategy uses the Enterprise Library to pull Database objects and Execute_____(DbCommand)s on the Database, without directly selecting any sort of connection. Effectively like this:
Database db = DatabaseFactory.CreateDatabase();
DbCommand q = db.GetStoredProcCommand("SomeProc");
using (IDataReader r = db.ExecuteReader(q))
{
    List<RecordType> rv = new List<RecordType>();
    while (r.Read())
    {
        rv.Add(RecordType.CreateFromReader(r));
    }
    return rv;
}
The other, newer strategy, uses a library that asks for an IDbConnection, which it Close()es immediately after execution. So, we do something like this:
DbConnection c = DatabaseFactory.CreateDatabase().CreateConnection();
using (QueryBuilder qb = new QueryBuilder(c))
{
    return qb.Find<RecordType>(ConditionCollection);
}
But, the connection returned by CreateConnection() isn't the same one used by the Database.ExecuteReader(), which is apparently left open between queries. So, when we call a data access method using the new strategy after one using the old strategy inside a TransactionScope, it causes unnecessary promotion -- promotion that I'm not sure we have the ability to configure for (we don't have administrative access to the SQL Server).
Before we go down the path of modifying the query-builder-library to work with the Enterprise Library's Database objects ... Is there a way to retrieve, if existent, the open connection last used by one of the Database.Execute_______() methods?
Yes, you can get the connection associated with a transaction. Enterprise Library internally manages a collection of transactions and the associated database connections so if you are in a transaction you can retrieve the connection associated with a database using the static TransactionScopeConnections.GetConnection method:
using (var scope = new TransactionScope())
{
    IEnumerable<RecordType> records = GetRecordTypes();

    Database db = DatabaseFactory.CreateDatabase();
    DbConnection connection = TransactionScopeConnections.GetConnection(db).Connection;
}

public static IEnumerable<RecordType> GetRecordTypes()
{
    Database db = DatabaseFactory.CreateDatabase();
    DbCommand q = db.GetStoredProcCommand("GetLogEntries");
    using (IDataReader r = db.ExecuteReader(q))
    {
        List<RecordType> rv = new List<RecordType>();
        while (r.Read())
        {
            rv.Add(RecordType.CreateFromReader(r));
        }
        return rv;
    }
}

Error inserting document into mongodb from scala

Trying to insert into a MongoDB database from Scala. The code below doesn't create a DB or collection. I tried using the default test DB too. How do I perform CRUD operations?
object Store {
  def main(args: Array[String]) = {
    def addMongo(): Unit = {
      var mongo = new Mongo()
      var db = mongo.getDB("mybd")
      var coll = db.getCollection("somecollection")
      var obj = new BasicDBObject()
      obj.put("name", "Mongo")
      obj.put("type", "db")
      coll.insert(obj)
      coll.save(obj)
      println("Saved") // to print to console
    }
  }
}
At first glance things look OK in your code, although you have that stray def addMongo(): Unit = { declaration that is never invoked. I'll defer to the other suggestion on looking for errors there. Two items of note:
1) save() and insert() are complementary operations; you only need one. insert() will always attempt to create a new document, while save() will create one if the _id field isn't set, and update the document with that _id if it is.
2) Mongo clients do not wait for an answer to a write operation by default. It is very possible and likely that an error is occurring within MongoDB, causing your write to fail. The getLastError() command will return the result of the last write operation on the current connection. Because MongoDB's Java driver uses connection pools, you have to tell it to lock you onto a single connection for the duration of an operation you want to run 'safely' (i.e. check the result). This is the easiest way from the Java driver (the sample below is driver calls you would make from Scala):
mongo.requestStart()             // lock the connection in
coll.insert(obj)                 // attempt the insert
db.getLastError().throwOnError() // tells getLastError to throw an exception in case of an error
mongo.requestDone()              // release the connection lock
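Alternatively, the same "safe write" behaviour can be requested by setting an explicit WriteConcern, which saves you from managing requestStart()/requestDone() yourself. A minimal sketch, assuming the same 2.x-era Java driver classes used in the question (WriteConcern.SAFE was later renamed ACKNOWLEDGED):
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.Mongo;
import com.mongodb.WriteConcern;

public class SafeInsert {
    public static void main(String[] args) throws Exception {
        Mongo mongo = new Mongo(); // connects to localhost:27017
        DB db = mongo.getDB("mybd");
        DBCollection coll = db.getCollection("somecollection");

        BasicDBObject obj = new BasicDBObject("name", "Mongo").append("type", "db");

        // SAFE makes the driver wait for the server to acknowledge the write
        // and throw a MongoException on error, so no separate getLastError()
        // call is needed.
        coll.setWriteConcern(WriteConcern.SAFE);
        coll.insert(obj);
        System.out.println("Saved");
    }
}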
Take a look at this excellent writeup on MongoDB's Write Durability, which focuses specifically on the Java driver.
You may also want to take a look at the Scala driver I maintain (Casbah), which wraps the Java driver and provides more Scala-like functionality.
Among other things, we provide an execute-around-method version of the safe-write concept in safely(), which makes it a lot easier to test whether writes succeed.
You just missed the addMongo call in main. The fix is trivial:
import com.mongodb.{BasicDBObject, Mongo}

object Store {
  def main(args: Array[String]) = {
    def addMongo(): Unit = {
      var mongo = new Mongo()
      var db = mongo.getDB("mybd")
      var coll = db.getCollection("somecollection")
      var obj = new BasicDBObject()
      obj.put("name", "Mongo")
      obj.put("type", "db")
      coll.insert(obj)
      coll.save(obj)
      println("Saved") // to print to console
    }
    addMongo() // call it!
  }
}