JOliver EventStore Snapshotting - cqrs

Say I have this code:
private void CreateSnapshots(IEnumerable<StreamHead> streams)
{
    foreach (StreamHead head in streams)
    {
        IAggregate aggregate = ???;
        IMemento memento = aggregate.GetSnapshot();
        var snapshot = new Snapshot(head.StreamId, head.SnapshotRevision + 1, memento);
        eventStore.AddSnapshot(snapshot);
        observer.Notify(new SnapshotTaken(head.StreamId, head.HeadRevision));
    }
}
how do I know what aggregate to load for the current stream? I'm also using CommonDomain. Is there something in there?
Thanks

The snapshotting aspect of the EventStore needs a bit of love. I have tried to make the IStoreEvents interface geared toward working with an individual aggregate. I have also tried to ensure that snapshotting does not interfere with or get in the way of normal use.
Since the release of v2.0, I have turned my attention toward v2.1 and I will be able to make a few small API changes related to this. In the meantime, your best option is probably to bypass IStoreEvents altogether when doing snapshotting.
Another alternative is to have the snapshotting code run in-process with your regular code. When an aggregate that needs a snapshot is loaded, you could easily push a reference to that aggregate asynchronously to your snapshotting code. In this way, you don't actually have to do a load because you already have the aggregate.
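To make that concrete, here is a rough sketch of the in-process idea: a thin wrapper placed in front of the CommonDomain repository that hands already-loaded aggregates off to the snapshotter. The wrapper and the SnapshotEvery threshold are hypothetical, not types that ship with EventStore or CommonDomain.
public class SnapshottingRepository
{
    private const int SnapshotEvery = 100; // hypothetical threshold
    private readonly IRepository _inner;
    private readonly IStoreEvents _eventStore;

    public SnapshottingRepository(IRepository inner, IStoreEvents eventStore)
    {
        _inner = inner;
        _eventStore = eventStore;
    }

    public TAggregate GetById<TAggregate>(Guid id) where TAggregate : class, IAggregate
    {
        var aggregate = _inner.GetById<TAggregate>(id);

        // No extra load is required: the aggregate is already in memory,
        // so just push the snapshot work onto a background thread.
        if (aggregate.Version % SnapshotEvery == 0)
        {
            ThreadPool.QueueUserWorkItem(_ =>
                _eventStore.AddSnapshot(
                    new Snapshot(id, aggregate.Version, aggregate.GetSnapshot())));
        }

        return aggregate;
    }
}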

I found a solution that works for me (it is most definitely a hack). It is still out-of-band snapshotting. Here's a sample of it in action.
private void CreateSnapshots(IEnumerable<StreamHead> streams)
{
    foreach (StreamHead head in streams)
    {
        // NOTE: This uses a patched version of EventStore that loads commit headers
        // in OptimisticEventStream.PopulateStream():
        //   this.identifiers.Add(commit.CommitId);
        //   this.headers = this.headers.Union(commit.Headers).ToDictionary(k => k.Key, k => k.Value);
        var stream = eventStore.OpenStream(head.StreamId, int.MinValue, int.MaxValue);

        // NOTE: Nasty hack, but it works.
        var aggregateType = stream.UncommittedHeaders.Where(p => p.Key == "AggregateType").First();
        var type = aggregateTypeResolver(aggregateType.Value.ToString());
        MethodInfo methodInfo = typeof(IRepository).GetMethod("GetById");
        MethodInfo method = methodInfo.MakeGenericMethod(type);
        object o = method.Invoke(repository, new object[] { head.StreamId, head.HeadRevision });

        var aggregate = (IAggregate)o;
        IMemento memento = aggregate.GetSnapshot();
        var snapshot = new Snapshot(head.StreamId, head.HeadRevision, memento);
        eventStore.AddSnapshot(snapshot);
        observer.Notify(new SnapshotTaken(head.StreamId, head.HeadRevision));
    }
}

Related

Is there any way to compress the data while using mongo persistence with NEventStore?

I'm working with C#, .NET Core, and NEventStore (version 9.0.1), trying to evaluate the various persistence options that it supports out of the box.
More specifically, when using the Mongo persistence, the payload is stored without any compression being applied.
Note: payload compression works perfectly when using the SQL persistence of NEventStore, whereas it does not with the Mongo persistence.
I'm using the code below to create and initialize the event store:
private IStoreEvents CreateEventStore(string connectionString)
{
    var store = Wireup.Init()
        .UsingMongoPersistence(connectionString, new NEventStore.Serialization.DocumentObjectSerializer())
        .InitializeStorageEngine()
        .UsingBsonSerialization()
        .Compress()
        .HookIntoPipelineUsing()
        .Build();
    return store;
}
And I'm using the code below to store the events:
public async Task AddMessageTostore(Command command)
{
    using (var stream = _eventStore.CreateStream(command.Id))
    {
        stream.Add(new EventMessage { Body = command });
        stream.CommitChanges(Guid.NewGuid());
    }
}
The workaround I did: by implementing the PreCommit(CommitAttempt attempt) and Select methods of IPipelineHook and applying gzip compression logic there, compression of the events in MongoDB was achieved.
Attaching a data store image of both SQL and Mongo persistence.
So, the questions are:
Is there some other option or setting I'm missing so that the events get compressed while saving (the fluent way of calling the Compress method)?
Is the workaround mentioned above sensible, or is it a performance overhead?
I also faced the same issue while using NEventStore.Persistence.MongoDB.
Even when I used the fluent Compress method, the payload was not compressed with the Mongo persistence the way it is with the SQL persistence.
In the end, I achieved compression/decompression by customizing the logic inside the PreCommit(CommitAttempt attempt) and Select(ICommit committed) methods.
Code used for compression:
using (var stream = new MemoryStream())
{
    using (var compressedStream = new GZipStream(stream, CompressionMode.Compress))
    {
        var serializer = new JsonSerializer
        {
            TypeNameHandling = TypeNameHandling.None,
            ReferenceLoopHandling = ReferenceLoopHandling.Ignore
        };
        var writer = new JsonTextWriter(new StreamWriter(compressedStream));
        serializer.Serialize(writer, this);
        writer.Flush();
    }
    return stream.ToArray();
}
Code used for decompression:
using (var stream = new MemoryStream(bytes))
{
    var decompressedStream = new GZipStream(stream, CompressionMode.Decompress);
    var serializer = new JsonSerializer
    {
        TypeNameHandling = TypeNameHandling.None,
        ReferenceLoopHandling = ReferenceLoopHandling.Ignore
    };
    var reader = new JsonTextReader(new StreamReader(decompressedStream));
    var body = serializer.Deserialize(reader, type);
    return body as Command;
}
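For context, here is a sketch of how the two snippets above could be tied together in a pipeline hook and registered during wire-up. It assumes an NEventStore hook base class with the PreCommit(CommitAttempt attempt) and Select(ICommit committed) methods mentioned above; the Compress/Decompress helpers stand for the compression and decompression code shown here, and the exact way the event bodies are swapped is an assumption rather than documented API.
public class GzipPipelineHook : PipelineHookBase
{
    public override bool PreCommit(CommitAttempt attempt)
    {
        // Replace each event body with its gzip-compressed form before it is persisted.
        foreach (var eventMessage in attempt.Events)
            eventMessage.Body = Compress(eventMessage.Body); // hypothetical helper: the "Code used for compression" above
        return true; // continue with the commit
    }

    public override ICommit Select(ICommit committed)
    {
        // Reverse the transformation when commits are read back.
        foreach (var eventMessage in committed.Events)
            eventMessage.Body = Decompress((byte[])eventMessage.Body); // hypothetical helper: the "Code used for decompression" above
        return committed;
    }
}
// registered during wire-up, something like:
// .HookIntoPipelineUsing(new GzipPipelineHook())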
I'm not sure if this is the right approach, or whether it will have any impact on the performance of EventStore operations like insert and select.

What impact does changing a IReliableQueue to a IReliableConcurrentQueue have in an existing deployment?

I am working in a Service Fabric application that uses IReliableQueue. For the use cases of this system, IReliableConcurrentQueue makes sense to use, and some local testing (i.e. basically by just changing the code to use IReliableConcurrentQueue instead of IReliableQueue - queue name does not change) shows great performance improvements. However, I am worried about the impact of changing this in a production system (i.e. upgrading). I can't find any docs or online questions (unless I just missed them) about these considerations. For example, in this system, the existing IReliableQueue will almost always have items. So what happens to that data when I upgrade the SF application? Will it be available to dequeue in the IReliableConcurrentQueue? Or would data be lost? I know I can "just try it" but wanted to see if someone out there had done the same or could offer pointers to existing resources. Thanks!
Sorry for the late answer (which you probably don't need anymore, but still).
When we call the GetOrAddAsync method on IReliableStateManager we aren't just retrieving an interface to store values - we are actually creating an instance of a reliable collection. This basically means that the type of the interface we specify is very important.
Taking this into account, if we do this:
Service v. 1.0
// Somewhere in RunAsync for example
await this.StateManager.GetOrAddAsync<IReliableQueue<long>>("MyCollection")
Then doing this in the next version:
Service v. 1.1
// Somewhere in RunAsync for example
await this.StateManager.GetOrAddAsync<IReliableConcurrentQueue<long>>("MyCollection")
will throw an exception:
Returned reliable object of type Microsoft.ServiceFabric.Data.Collections.DistributedQueue`1[System.Int64] cannot be casted to requested type Microsoft.ServiceFabric.Data.Collections.IReliableConcurrentQueue`1[System.Int64]
and then:
System.ExecutionEngineException: 'Exception of type 'System.ExecutionEngineException' was thrown.'
The above exception looks like a bug so I have filed one.
UPDATE 2019.06.28
It turned out that the appearance of System.ExecutionEngineException isn't a bug but rather undocumented behavior of the Environment.FailFast method in combination with the Visual Studio debugger.
Please see my comment to the above issue.
This is what would happen.
There are plenty of ways to overcome this.
Here is the most obvious one:
Example
var migrate = false; // This flag indicates whether the migration was already done.
var migrateValues = new List<long>();

var applicationFlags = await this.StateManager
    .GetOrAddAsync<IReliableDictionary<string, bool>>("application-flags");

using (var transaction = this.StateManager.CreateTransaction())
{
    var flag = await applicationFlags
        .TryGetValueAsync(transaction, "queue-to-concurrent-queue-migration");

    if (!flag.HasValue || !flag.Value)
    {
        var queue = await this.StateManager
            .GetOrAddAsync<IReliableQueue<long>>("value-collection");

        for (;;)
        {
            var c = await queue.TryDequeueAsync(transaction);
            if (!c.HasValue)
            {
                break;
            }

            migrateValues.Add(c.Value);
        }

        migrate = true;
    }
}

if (migrate)
{
    await this.StateManager.RemoveAsync("value-collection");

    using (var transaction = this.StateManager.CreateTransaction())
    {
        var concurrentQueue = await this.StateManager
            .GetOrAddAsync<IReliableConcurrentQueue<long>>("value-collection");

        foreach (var i in migrateValues)
        {
            await concurrentQueue.EnqueueAsync(transaction, i);
        }

        await applicationFlags.AddOrUpdateAsync(
            transaction,
            "queue-to-concurrent-queue-migration",
            true,
            (s, b) => true);

        await transaction.CommitAsync();
    }
}
Please note that this code is just an illustrative example and should be properly tested before applying it to a real-life application.

Parallel.ForEach and BulkCopy

I have a C# library which connects to 59 servers with the same database structure and imports data into the same table in my local DB. At the moment I am retrieving data server by server in a foreach loop:
foreach (var systemDto in systems)
{
    var sourceConnectionString = _systemService.GetConnectionStringAsync(systemDto.Ip).Result;
    var dbConnectionFactory = new DbConnectionFactory(sourceConnectionString, "System.Data.SqlClient");
    var dbContext = new DbContext(dbConnectionFactory);
    var storageRepository = new StorageRepository(dbContext);
    var usedStorage = storageRepository.GetUsedStorageForCurrentMonth();

    var dtUsedStorage = new DataTable();
    dtUsedStorage.Load(usedStorage);

    var dcIp = new DataColumn("IP", typeof(string)) { DefaultValue = systemDto.Ip };
    var dcBatchDateTime = new DataColumn("BatchDateTime", typeof(string))
    {
        DefaultValue = batchDateTime
    };

    dtUsedStorage.Columns.Add(dcIp);
    dtUsedStorage.Columns.Add(dcBatchDateTime);

    using (var blkCopy = new SqlBulkCopy(destinationConnectionString))
    {
        blkCopy.DestinationTableName = "dbo.tbl";
        blkCopy.WriteToServer(dtUsedStorage);
    }
}
Because there are many systems to retrieve data from, I wonder if it is possible to use a Parallel.ForEach loop? Will SqlBulkCopy lock the table during WriteToServer, and will the next WriteToServer wait until the previous one completes?
-- EDIT 1
I've changed the foreach to Parallel.ForEach, but I face one problem. Inside this loop I have an async method call: _systemService.GetConnectionStringAsync(systemDto.Ip)
and this line throws an error:
System.NotSupportedException: A second operation started on this context before a previous asynchronous operation completed. Use 'await' to ensure that any asynchronous operations have completed before calling another method on this context. Any instance members are not guaranteed to be thread safe.
Any ideas how I can handle this?
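One possible way to sidestep that exception (a sketch only, not tested against this codebase) is to resolve all the connection strings on the calling thread first, so the shared EF context behind _systemService is never touched from multiple threads, and then parallelize only the bulk-copy work. ImportSystem below is a hypothetical helper holding the body of the original loop, minus the async call.
// Resolve the connection strings sequentially; the context behind
// _systemService is not thread safe, so keep it on a single thread.
var connectionStrings = new Dictionary<string, string>();
foreach (var systemDto in systems)
{
    connectionStrings[systemDto.Ip] = _systemService.GetConnectionStringAsync(systemDto.Ip).Result;
}

// Only thread-safe work (DataTable build + SqlBulkCopy) runs in parallel.
Parallel.ForEach(systems, systemDto =>
{
    // ImportSystem is a hypothetical helper containing the body of the original
    // foreach: build dtUsedStorage for this system and bulk copy it to dbo.tbl.
    ImportSystem(systemDto, connectionStrings[systemDto.Ip], destinationConnectionString, batchDateTime);
});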
In general, it will get blocked and will wait until the previous operation completes.
There are some factors that may affect whether SqlBulkCopy can be run in parallel or not.
I remember when adding the parallel feature to my .NET Bulk Operations, I had a hard time making it work correctly in parallel, but it worked well when the table had no index (which is likely never the case).
Even when it worked, the performance gain was not very large.
Perhaps you will find more information here: MSDN - Importing Data in Parallel with Table Level Locking
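For completeness, the table-level locking that article refers to is requested through SqlBulkCopyOptions when the SqlBulkCopy is constructed. A minimal sketch against the question's code (whether parallel loads actually block each other still depends on these options and on the target table's indexes):
using (var blkCopy = new SqlBulkCopy(destinationConnectionString, SqlBulkCopyOptions.TableLock))
{
    blkCopy.DestinationTableName = "dbo.tbl";
    blkCopy.BulkCopyTimeout = 0; // no timeout; useful for large batches
    blkCopy.WriteToServer(dtUsedStorage);
}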

RXJS : Idiomatic way to create an observable stream from a paged interface

I have a paged interface. Given a starting point, a request will produce a list of results and a continuation indicator.
I've created an observable that is built by constructing and flat-mapping an observable that reads the page. The result of this observable contains both the data for the page and a value to continue with. I pluck the data and flat-map it to the subscriber, producing a stream of values.
To handle the paging I've created a subject for the next page values. It's seeded with an initial value; then, each time I receive a response with a valid next page, I push to the pages subject and trigger another read, until there is no more to read.
Is there a more idiomatic way of doing this?
function records(start = 'LATEST', limit = 1000) {
    let pages = new rx.Subject();

    this.connect(start)
        .subscribe(page => pages.onNext(page));

    let records = pages
        .flatMap(page => {
            return this.read(page, limit)
                .doOnNext(result => {
                    let next = result.next;
                    if (next === undefined) {
                        pages.onCompleted();
                    } else {
                        pages.onNext(next);
                    }
                });
        })
        .pluck('data')
        .flatMap(data => data);

    return records;
}
That's a reasonable way to do it. It has a couple of potential flaws in it (that may or may not impact you depending upon your use case):
You provide no way to observe any errors that occur in this.connect(start)
Your observable is effectively hot. If the caller does not immediately subscribe to the observable (perhaps they store it and subscribe later), then they'll miss the completion of this.connect(start) and the observable will appear to never produce anything.
You provide no way to unsubscribe from the initial connect call if the caller changes its mind and unsubscribes early. Not a real big deal, but usually when one constructs an observable, one should try to chain the disposables together so it all cleans up properly if the caller unsubscribes.
Here's a modified version:
It passes errors from this.connect to the observer.
It uses Observable.create to create a cold observable that only starts its business when the caller actually subscribes, so there is no chance of missing the initial page value and stalling the stream.
It combines the this.connect subscription disposable with the overall subscription disposable.
Code:
function records(start = 'LATEST', limit = 1000) {
    return Rx.Observable.create(observer => {
        let pages = new Rx.Subject();
        let connectSub = new Rx.SingleAssignmentDisposable();
        let resultsSub = new Rx.SingleAssignmentDisposable();
        let sub = new Rx.CompositeDisposable(connectSub, resultsSub);

        // Make sure we subscribe to pages before we issue this.connect()
        // just in case this.connect() finishes synchronously (possible if it caches values or something?)
        let results = pages
            .flatMap(page => this.read(page, limit))
            .doOnNext(r => r.next !== undefined ? pages.onNext(r.next) : pages.onCompleted())
            .flatMap(r => r.data);

        resultsSub.setDisposable(results.subscribe(observer));

        // now query the first page
        connectSub.setDisposable(this.connect(start)
            .subscribe(p => pages.onNext(p), e => observer.onError(e)));

        return sub;
    });
}
Note: I've not used the ES6 syntax before, so hopefully I didn't mess anything up here.

Combining local result with possible (timeout/error) async web result

I have two methods that both return an IObservable
IObservable<Something[]> QueryLocal();
and
IObservable<Something[]> QueryWeb();
QueryLocal is always successful. QueryWeb is susceptible to both a timeout and possible web errors.
I wish to implement a QueryLocalAndWeb() that calls both and combines their results.
So far I have:
IObservable<Something[]> QueryLocalAndWeb()
{
    var a = QueryLocal();
    var b = QueryWeb();
    var plan = a.And(b).Then((x, y) => x.Concat(y).ToArray());
    return Observable.When(plan).Timeout(TimeSpan.FromSeconds(10), a);
}
However, I'm not sure that it handles the case where QueryWeb yields an error.
In the future I might have a QueryWeb2() that also needs to be taken into account.
So, how do I combine the results from a number of IObservables ignoring the ones that throw errors (or time out)?
I guess OnErrorResumeNext should be able to handle this scenario:
From MSDN:
Continues an observable sequence that is terminated normally or by an exception with the next observable sequence.
IObservable<Something[]> QueryLocalAndWeb()
{
    var a = QueryLocal();
    var b = QueryWeb().Timeout(TimeSpan.FromSeconds(10));
    return Observable.OnErrorResumeNext(b, a);
}
You can concatenate the arrays by using aggregation on the returned observable.
I am assuming that both the local and web observables are cold, i.e. they start producing values only when someone subscribes to them.
How about:
var plan = a.And(b.Catch(Observable.Empty<Something[]>())).Then((x, y) => x.Concat(y).ToArray());