Is there any way to compress the data while using mongo persistence with NEventStore?

I'm working with C#, .NET Core, and NEventStore (version 9.0.1), trying to evaluate the various persistence options it supports out of the box.
More specifically, when using the Mongo persistence, the payload is stored without any compression applied.
Note: payload compression works as expected with NEventStore's SQL persistence, but not with the Mongo persistence.
I'm using the code below to create and initialize the event store:
private IStoreEvents CreateEventStore(string connectionString)
{
    var store = Wireup.Init()
        .UsingMongoPersistence(connectionString,
            new NEventStore.Serialization.DocumentObjectSerializer())
        .InitializeStorageEngine()
        .UsingBsonSerialization()
        .Compress()
        .HookIntoPipelineUsing()
        .Build();
    return store;
}
And I'm using the code below for storing the events:
public async Task AddMessageTostore(Command command)
{
    using (var stream = _eventStore.CreateStream(command.Id))
    {
        stream.Add(new EventMessage { Body = command });
        stream.CommitChanges(Guid.NewGuid());
    }
}
The workaround: by implementing the PreCommit(CommitAttempt attempt) and Select methods of IPipelineHook and applying gzip compression logic there, compression of the events stored in MongoDB was achieved.
Attached are data store images of both the SQL and Mongo persistence.
So, the questions are:
Is there some other option or setting I'm missing so that the events get compressed while saving (i.e. the fluent way of calling the Compress() method)?
Is the workaround mentioned above sensible to do, or is it a performance overhead?

I also faced the same issue while using NEventStore.Persistence.MongoDB.
Even when using the fluent Compress() method, the payload is not compressed in the Mongo persistence the way it is in the SQL persistence.
I finally achieved the compression/decompression by customizing the logic inside the PreCommit(CommitAttempt attempt) and Select(ICommit committed) methods.
Code used for compression:
using (var stream = new MemoryStream())
{
    using (var compressedStream = new GZipStream(stream, CompressionMode.Compress))
    {
        var serializer = new JsonSerializer
        {
            TypeNameHandling = TypeNameHandling.None,
            ReferenceLoopHandling = ReferenceLoopHandling.Ignore
        };
        // Write the JSON straight into the gzip stream
        var writer = new JsonTextWriter(new StreamWriter(compressedStream));
        serializer.Serialize(writer, this);
        writer.Flush();
    } // disposing the GZipStream flushes the remaining compressed bytes
    return stream.ToArray();
}
Code used for decompression:
using (var stream = new MemoryStream(bytes))
using (var decompressedStream = new GZipStream(stream, CompressionMode.Decompress))
{
    var serializer = new JsonSerializer
    {
        TypeNameHandling = TypeNameHandling.None,
        ReferenceLoopHandling = ReferenceLoopHandling.Ignore
    };
    // Read the JSON back out of the gzip stream and deserialize it to the original type
    var reader = new JsonTextReader(new StreamReader(decompressedStream));
    var body = serializer.Deserialize(reader, type);
    return body as Command;
}
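For context, here is a minimal sketch of how such compress/decompress logic can be packaged as a pipeline hook. The class name GZipPipelineHook is illustrative, the override signatures assume NEventStore's PipelineHookBase helper class (implement IPipelineHook directly if your version differs), and it uses TypeNameHandling.All so the deserializer can rebuild the object without being told the type, unlike the snippets above, which pass the type in explicitly.

using System.IO;
using System.IO.Compression;
using System.Text;
using NEventStore;
using Newtonsoft.Json;

// Sketch only: gzip each event body before it is persisted, and un-gzip it when commits are read back.
public class GZipPipelineHook : PipelineHookBase
{
    public override bool PreCommit(CommitAttempt attempt)
    {
        foreach (var eventMessage in attempt.Events)
            eventMessage.Body = Compress(eventMessage.Body);
        return true; // returning true lets the commit proceed
    }

    public override ICommit Select(ICommit committed)
    {
        foreach (var eventMessage in committed.Events)
            eventMessage.Body = Decompress((byte[])eventMessage.Body);
        return committed;
    }

    private static byte[] Compress(object body)
    {
        var json = JsonConvert.SerializeObject(body, new JsonSerializerSettings
        {
            TypeNameHandling = TypeNameHandling.All // keep the type name so Decompress can rebuild the object
        });
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            using (var writer = new StreamWriter(gzip, Encoding.UTF8))
                writer.Write(json);
            return output.ToArray();
        }
    }

    private static object Decompress(byte[] bytes)
    {
        using (var input = new MemoryStream(bytes))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return JsonConvert.DeserializeObject(reader.ReadToEnd(),
                new JsonSerializerSettings { TypeNameHandling = TypeNameHandling.All });
        }
    }
}

With a hook like this in place, the wireup would register it via .HookIntoPipelineUsing(new GZipPipelineHook()) rather than the empty HookIntoPipelineUsing() call shown in the question.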
I'm not sure whether this is the right approach, or whether it will have any impact on the performance of EventStore operations such as insert and select.

Related

How do I confirm I am reading the data from Mongo secondary server from Java

For performance optimisation we are trying to read data from the Mongo secondary server for selected scenarios. I am using an inline query with withReadPreference(ReadPreference.secondaryPreferred()) to read the data; please find the code snippet below.
What I want to confirm is whether the data we get after executing the highlighted inline query really comes from the secondary server. Is there any method available to check this from Java or Spring Boot?
public User read(final String userId) {
    final ObjectId objectId = new ObjectId(userId);
    final User user = collection
        .withReadPreference(ReadPreference.secondaryPreferred())
        .findOne(objectId)
        .as(User.class);
    return user;
}
Pretty much the same way in Java. Note that we use secondary(), not secondaryPreferred(); this guarantees reads from a secondary ONLY:
import com.mongodb.ReadPreference;

{
    // This is your "regular" primary-preferred collection:
    MongoCollection<BsonDocument> tcoll = db.getCollection("myCollection", BsonDocument.class);

    // ... various operations on tcoll, then create a new
    // handle that FORCES reads from secondary and will time out and
    // fail if no secondary can be found:
    MongoCollection<BsonDocument> xcoll = tcoll.withReadPreference(ReadPreference.secondary());

    BsonDocument f7 = xcoll.find(queryExpr).first();
}

Jackson streaming api & WebFlux response body in Spring Cloud Gateway

I'm pretty new to Spring Cloud Gateway and Spring WebFlux.
I have a requirement where I need to gather all the IDs of the data elements in the response for logging purposes.
I have done so in a very memory-intensive way, by reading the DataBuffer into a byte array, parsing it and then wrapping it into a DataBuffer and passing it along.
However, this isn't very viable when the responses are big, and since I am already making use of the Jackson Streaming API it seems silly.
Anyone have any tips on how to achieve this? All the examples I have seen seem to do it in a similar way, by buffering the entire response in memory.
Current version (Groovy):
class DataIdResponseHandler extends ServerHttpResponseDecorator {

    final DataBufferFactory dataBufferFactory

    DataIdResponseHandler(ServerHttpResponse delegate) {
        super(delegate)
        dataBufferFactory = delegate.bufferFactory()
    }

    @Override
    Mono<Void> writeWith(Publisher<? extends DataBuffer> body) {
        Flux<? extends DataBuffer> fluxBody = (Flux<? extends DataBuffer>) body
        return super.writeWith(fluxBody.map { dataBuffer ->
            byte[] content = new byte[dataBuffer.readableByteCount()]
            dataBuffer.read(content)
            List<String> dataIds = DataIdParser.parseFromByteArray(content)
            DataIdCollector.add(dataIds)
            return dataBufferFactory.wrap(content)
        })
    }
}
A reactive version of the above method would be:
return super.writeWith(fluxBody.doOnNext { DataBuffer dataBuffer ->
    List<String> dataIds = DataIdParser.parseFromByteArray(dataBuffer.toString(Charset.defaultCharset()).bytes)
    Logger logger = LoggerFactory.getLogger(DataIdResponseFilter)
    logger.info(dataIds.toString())
})
Oddly enough, if you interact with the DataBuffer by calling asInputStream() and reading from it, the DataBuffer is drained and the actual response ends up empty. If you use the toString() method, which obviously also reads from the DataBuffer, the buffer, and hence the body, remains complete.

SignalR Core with Redis Pub\Sub and console application

I am using ASP.NET Core 2.1 with SignalR Core 1.0.1.
I have created the chat application that is described here:
https://learn.microsoft.com/en-us/aspnet/core/tutorials/signalr?view=aspnetcore-2.1&tabs=visual-studio
I have also configured SignalR to use Redis using
services.AddSignalR().AddRedis(Configuration["ConnectionStrings:Redis"]);
With the Redis server running and redis-cli monitor attached, I can see the following commands coming in:
1530086417.413730 [0 127.0.0.1:57436] "SUBSCRIBE" "SignalRCore.Hubs.ChatHub:connection:VAIbFqtNyPVaod18jmm_Aw"
1530086428.181854 [0 127.0.0.1:57435] "PUBLISH" "SignalRCore.Hubs.ChatHub:all" "\x92\x90\x81\xa4json\xc4W{\"type\":1,\"target\":\"ReceiveMessage\",\"arguments\":[{\"user\":\"user\",\"message\":\"message\"}]}\x1e"
Everything works fine until I try to push a message from another console application.
In that application I am using ServiceStack.Redis, and the code is the following:
var redisManager = new RedisManagerPool(configuration["ConnectionStrings:Redis"]);
using (var client = redisManager.GetClient())
{
    client.PublishMessage("SignalRCore.Hubs.ChatHub:all",
        "{\"type\":1,\"target\":\"ReceiveMessage\",\"arguments\":[{\"user\":\"FromConsole\",\"message\":\"Message\"}]");
}
The messages are not handled by the browser. I assume the problem is the additional framing information that SignalR uses:
"\x92\x90\x81\xa4json\xc4W{...}\x1e"
Related monitor record:
1530087843.512083 [0 127.0.0.1:49480] "PUBLISH" "SignalRCore.Hubs.ChatHub:all" "{\"type\":1,\"target\":\"ReceiveMessage\",\"arguments\":[{\"user\":\"FromConsole\",\"message\":\"Message\"}]"
Any ideas how I can supply this additional data for the publish?
Probably I should use something more suitable for my case instead of ServiceStack.Redis.
using Microsoft.AspNetCore.SignalR.Protocol;
using Microsoft.AspNetCore.SignalR.Redis.Internal;
using StackExchange.Redis;
using System.Collections.Generic;

static void Main(string[] args)
{
    using (var redis = ConnectionMultiplexer.Connect("127.0.0.1:6379"))
    {
        var sub = redis.GetSubscriber();
        var protocol = new JsonHubProtocol();
        var redisProtocol = new RedisProtocol(new List<JsonHubProtocol> { protocol });
        var bytes = redisProtocol.WriteInvocation("ReceiveMessage", new[] { "60344", "60344" });
        sub.Publish("SignalRChat.Hubs.ChatHub:all", bytes);
    }
}
How to find this? Search for ".Publish" in the SignalR source code and you will find RedisHubLifetimeManager:
https://github.com/aspnet/SignalR/blob/c852bdcc332ffb998ec6a5b226e35d5e74d24009/src/Microsoft.AspNetCore.SignalR.StackExchangeRedis/RedisHubLifetimeManager.cs
It uses RedisProtocol and MessagePack to write the bytes: header, footer, name, count, and so on.

JOliver EventStore Snapshotting

Say I have this code:
private void CreateSnapshots(IEnumerable<StreamHead> streams)
{
    foreach (StreamHead head in streams)
    {
        IAggregate aggregate = ???;
        IMemento memento = aggregate.GetSnapshot();
        var snapshot = new Snapshot(head.StreamId, head.SnapshotRevision + 1, memento);
        eventStore.AddSnapshot(snapshot);
        observer.Notify(new SnapshotTaken(head.StreamId, head.HeadRevision));
    }
}
How do I know which aggregate to load for the current stream? I'm also using CommonDomain; is there something in there?
Thanks
The snapshotting aspect of the EventStore needs a bit of love. I have tried to make the IStoreEvents interface geared toward working with an individual aggregate. I have also tried to ensure that snapshotting does not interfere or get in the way of normal use.
Since the release of v2.0, I have now turned my attention toward v2.1 and I will be able to make a few small API changes related to this. In the meantime, your best option is probably to bypass IStoreEvents altogether when doing snapshotting.
Another alternative is to have the snapshotting code run in-process with your regular code. When an aggregate that needs a snapshot is loaded, you could easily push a reference to that aggregate asynchronously to your snapshotting code. That way you don't actually have to do a load, because you already have the aggregate.
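As an illustration of that in-process idea, here is a minimal sketch. The SnapshotQueue class is hypothetical (not part of EventStore or CommonDomain), it assumes CommonDomain's IAggregate exposes Id and Version, and it reuses the Snapshot/AddSnapshot calls shown elsewhere in this thread; the relevant EventStore and CommonDomain namespaces are assumed to be imported.

using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: aggregates that were just loaded (and are due for a snapshot)
// are queued here, so the snapshotter never has to re-load them from the store.
public class SnapshotQueue : IDisposable
{
    private readonly BlockingCollection<IAggregate> _pending = new BlockingCollection<IAggregate>();
    private readonly IStoreEvents _eventStore;
    private readonly Thread _worker;

    public SnapshotQueue(IStoreEvents eventStore)
    {
        _eventStore = eventStore;
        _worker = new Thread(Drain) { IsBackground = true };
        _worker.Start();
    }

    // Called by the repository right after it loads an aggregate that is due for a snapshot.
    public void Enqueue(IAggregate aggregate)
    {
        _pending.Add(aggregate);
    }

    private void Drain()
    {
        foreach (IAggregate aggregate in _pending.GetConsumingEnumerable())
        {
            IMemento memento = aggregate.GetSnapshot();
            var snapshot = new Snapshot(aggregate.Id, aggregate.Version, memento);
            _eventStore.AddSnapshot(snapshot);
        }
    }

    public void Dispose()
    {
        _pending.CompleteAdding(); // stop accepting work and let Drain() finish
        _worker.Join();
    }
}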
I found a solution for me (this is most definitely a hack). It is still out-of-band snapshotting. Here's a sample of it in action.
private void CreateSnapshots(IEnumerable<StreamHead> streams)
{
    foreach (StreamHead head in streams)
    {
        // NOTE: This uses a patched version of EventStore that loads commit headers
        // in OptimisticEventStream.PopulateStream():
        //   this.identifiers.Add(commit.CommitId);
        //   this.headers = this.headers.Union(commit.Headers).ToDictionary(k => k.Key, k => k.Value);
        var stream = eventStore.OpenStream(head.StreamId, int.MinValue, int.MaxValue);

        // NOTE: Nasty hack, but it works: pull the aggregate type out of the commit headers.
        var aggregateType = stream.UncommittedHeaders.Where(p => p.Key == "AggregateType").First();
        var type = aggregateTypeResolver(aggregateType.Value.ToString());

        // Invoke IRepository.GetById<T>() for the resolved aggregate type via reflection.
        MethodInfo methodInfo = typeof(IRepository).GetMethod("GetById");
        MethodInfo method = methodInfo.MakeGenericMethod(type);
        object o = method.Invoke(repository, new object[] { head.StreamId, head.HeadRevision });
        var aggregate = (IAggregate)o;

        IMemento memento = aggregate.GetSnapshot();
        var snapshot = new Snapshot(head.StreamId, head.HeadRevision, memento);
        eventStore.AddSnapshot(snapshot);
        observer.Notify(new SnapshotTaken(head.StreamId, head.HeadRevision));
    }
}

ADO.NET - Bad Practice?

I was reading an article on MSDN several months ago and have recently started using the following snippet to execute ADO.NET code, but I get the feeling it could be bad. Am I overreacting, or is it perfectly acceptable?
private void Execute(Action<SqlConnection> action)
{
    SqlConnection conn = null;
    try {
        conn = new SqlConnection(ConnectionString);
        conn.Open();
        action.Invoke(conn);
    } finally {
        if (conn != null && conn.State == ConnectionState.Open) {
            try {
                conn.Close();
            } catch {
            }
        }
    }
}
public SomeThing GetSomethingById() {
    SomeThing aSomething = null;
    Execute(conn =>
    {
        using (SqlCommand cmd = conn.CreateCommand()) {
            cmd.CommandText = ....
            ...
            SqlDataReader reader = cmd.ExecuteReader();
            ...
            aSomething = new SomeThing(Convert.ToString(reader["aDbField"]));
        }
    });
    return aSomething;
}
What is the point of doing that when you can do this?
public SomeThing GetSomethingById(int id)
{
    using (var con = new SqlConnection(ConnectionString))
    {
        con.Open();
        using (var cmd = con.CreateCommand())
        {
            // prepare command
            using (var rdr = cmd.ExecuteReader())
            {
                // read fields
                return new SomeThing(data);
            }
        }
    }
}
You can promote code reuse by doing something like this.
public static void ExecuteToReader(string connectionString, string commandText, IEnumerable<KeyValuePair<string, object>> parameters, Action<IDataReader> action)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var cmd = con.CreateCommand())
        {
            cmd.CommandText = commandText;
            foreach (var pair in parameters)
            {
                var parameter = cmd.CreateParameter();
                parameter.ParameterName = pair.Key;
                parameter.Value = pair.Value;
                cmd.Parameters.Add(parameter);
            }
            using (var rdr = cmd.ExecuteReader())
            {
                action(rdr);
            }
        }
    }
}
You could use it like this:
// At the top, create an alias (the target type must be fully qualified in a using alias):
using DbParams = System.Collections.Generic.Dictionary<string, object>;

ExecuteToReader(
    connectionString,
    commandText,
    new DbParams() { { "key1", 1 }, { "key2", 2 } },
    reader =>
    {
        // ...
        // No need to dispose
    });
IMHO it is indeed a bad practice, since you're creating and opening a new database connection for every statement you execute.
Why is it bad:
Performance-wise (although connection pooling helps decrease the hit): you should open your connection, execute the statements that have to be executed, and only close the connection when you don't know when the next statement will be executed.
But certainly context-wise: how will you handle transactions? Where are your transaction boundaries? Your application layer knows when a transaction has to be started and committed, but with this way of working you're unable to span multiple statements within the same SQL transaction (see the sketch below).
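To make the transaction point concrete, here is a minimal sketch (the TransferFunds method and the table/column names are made up for illustration) of something the Execute(Action<SqlConnection>) wrapper above cannot express, because both statements must share one open connection and one SqlTransaction:

public void TransferFunds(string connectionString, int fromId, int toId, decimal amount)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var tx = con.BeginTransaction())
        {
            using (var debit = con.CreateCommand())
            {
                debit.Transaction = tx; // every command must be enlisted in the same transaction
                debit.CommandText = "UPDATE Accounts SET Balance = Balance - @amount WHERE Id = @id";
                debit.Parameters.AddWithValue("@amount", amount);
                debit.Parameters.AddWithValue("@id", fromId);
                debit.ExecuteNonQuery();
            }
            using (var credit = con.CreateCommand())
            {
                credit.Transaction = tx;
                credit.CommandText = "UPDATE Accounts SET Balance = Balance + @amount WHERE Id = @id";
                credit.Parameters.AddWithValue("@amount", amount);
                credit.Parameters.AddWithValue("@id", toId);
                credit.ExecuteNonQuery();
            }
            tx.Commit(); // the application layer decides when the unit of work commits
        }
    }
}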
This is a very reasonable approach to use.
By wrapping your connection logic into a method which takes an Action<SqlConnection>, you're helping prevent duplicated code and the potential for introduced error. Since we can now use lambdas, this becomes an easy, safe way to handle this situation.
That's acceptable. I created a SqlUtilities class two years ago that had a similar method. You can take it one step further if you like.
EDIT: I couldn't find the code, but I've typed up a small example (probably with many syntax errors ;))
SQLUtilities
public delegate T CreateMethod<T>(SqlDataReader reader);

public static T CreateEntity<T>(string query, CreateMethod<T> createMethod, params SqlParameter[] parameters)
{
    // Open the Sql connection (ConnectionString is assumed to be a field of this class)
    using (var conn = new SqlConnection(ConnectionString))
    using (var cmd = new SqlCommand(query, conn))
    {
        // Create a Sql command with the query/sp and parameters
        cmd.Parameters.AddRange(parameters);
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
            return createMethod(reader);
        // The using blocks replace the finally/dispose plumbing
    }
}
Calling code
private SomeThing Create(SqlDataReader reader) {
    SomeThing something = new SomeThing();
    something.ID = Convert.ToInt32(reader["ID"]);
    ...
    return something;
}

public SomeThing GetSomeThingByID(int id) {
    return SqlUtilities.CreateEntity<SomeThing>("something_getbyid", Create, ....);
}
Of course you could use a lambda expression instead of the Create method, and you could easily make a CreateCollection method and reuse the existing Create method.
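A CreateCollection along those lines might look like the sketch below; it is hypothetical, reuses the CreateMethod<T> delegate from above, and assumes the same ConnectionString field on the utilities class:

public static List<T> CreateCollection<T>(string query, CreateMethod<T> createMethod, params SqlParameter[] parameters)
{
    var results = new List<T>();
    using (var conn = new SqlConnection(ConnectionString))
    using (var cmd = new SqlCommand(query, conn))
    {
        cmd.Parameters.AddRange(parameters);
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            // Reuse the single-entity factory for every row
            while (reader.Read())
                results.Add(createMethod(reader));
        }
    }
    return results;
}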
However, if this is a new project, check out LINQ to Entities; it is far easier and more flexible than ADO.NET.
Well, in my opinion, check what you are doing before going through with it. Just because something works doesn't mean it is good programming practice. Find a concrete example of the benefit of using it. But if you are considering this for big projects, it would be nice to use a framework like NHibernate, because there are a lot of projects and even frameworks developed on top of it, like http://www.cuyahoga-project.org/.