Are AppSync's subscriptions very limited to one specific use case? - aws-appsync

I spent the last day trying AWS AppSync and I'm a bit disappointed with what subscriptions can do.
It seems to me that the current state of AppSync subscriptions covers the use case where you have a list of items and you want it kept in sync across all clients.
That's pretty limited compared to what Apollo subscriptions can do.
So, if I understood the docs correctly:
- You can't filter which clients the data is pushed to. I have use cases where a mutation, such as a vote on a Post, should push data of a different type to the owner of that Post only.
- A subscription has to be linked to a specific mutation and has to return the same type. I have use cases where a mutation, or even a query, should push data to a specific client that is listening for the event.
- A subscription is not linked to a resolver.
Can you please correct me if I'm wrong?

As you already figured out, the subscription result must be of the same type as the mutation's result, and a subscription can't be linked to a resolver.
But concerning your first assumption:
It is possible to filter who receives the result of a mutation.
For example, if you have the following mutation:
type Mutation {
  addPost(input: PostAddInput!): Post!
}

input PostAddInput {
  text: String!
  author: ID!
}
You can publish the result of the mutation to the specific user with this subscription:
type Subscription {
  addedPost(author_id: ID!): Post!
    @aws_subscribe(mutations: ["addPost"])
}
Now you will only receive the results if the author_id of the mutation matches the subscribed author_id.
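On the client side, a subscriber then passes the author_id as a subscription variable. Here is a minimal TypeScript sketch using the aws-appsync JavaScript client; the endpoint, auth settings, and the Post fields selected are illustrative, not from the original schema:

import AWSAppSyncClient, { AUTH_TYPE } from "aws-appsync";
import gql from "graphql-tag";

// Illustrative client configuration; substitute your own endpoint and auth.
const client = new AWSAppSyncClient({
  url: "https://example.appsync-api.us-east-1.amazonaws.com/graphql",
  region: "us-east-1",
  auth: { type: AUTH_TYPE.API_KEY, apiKey: "da2-example" },
  disableOffline: true,
});

const ADDED_POST = gql`
  subscription AddedPost($author_id: ID!) {
    addedPost(author_id: $author_id) {
      text
      author
    }
  }
`;

// Only mutation results whose author_id matches the variable below
// are delivered to this subscriber.
client
  .subscribe({ query: ADDED_POST, variables: { author_id: "some-user-id" } })
  .subscribe({
    next: ({ data }) => console.log("new post for this author:", data.addedPost),
    error: (err) => console.error(err),
  });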
I also created an AppSync RDS repository on GitHub if you want to try it out by yourself.

Swift Firebase - Combining estimated ServerTimestamp with Codable Custom Objects

I have a messaging app, where I have a Chats collection in my Firebase Firestore database. I use a custom object which is Codable to read and write changes to Firebase.
struct ChatFirebaseDO: Codable {
    @DocumentID var id: String?
    // ... 100 other fields ...
    var lastMessageDate: Date
}
When a user sends a new message, I update this lastMessageDate with FieldValue.serverTimestamp().
I also have a listener which is listening for changes, and it immediately returns any update to me (whether that is a new Chat or an update to an existing one). However, if it is my own user that has created this new chat, it is returned to me with a null timestamp.
From the docs I see this is intentional behaviour. It suggests that I replace the nulls with estimated timestamp values (perfect!), however I can't work out how to combine this with my custom objects.
To get the estimated timestamps, I need to do this:
diff.document.data(with: .estimate)
which returns a dictionary of fields.
But for my Codable custom objects to work, I have to use:
let messageDO = try diff.document.data(as: ChatFirebaseDO.self)
which uses a document (not a dictionary of data).
Is there a way I can (1) replace the nulls with estimated timestamps but (2) still have a document object I can use for my custom object transformation?
Perhaps it's a global setting I can make to use estimates, or something local to a single listener request. Or perhaps there is a way to use custom objects from a data dictionary and not just from the FIRDocument.
Thank you in advance!
If you're not encoding these chats for disk storage, then it's worth asking yourself why they're Codable at all. That particular method is made for that purpose, so I'd argue you're using the wrong tool for the job, and a tool that also doesn't work here because of the timestamp conflict, which I imagine will be addressed in a future update to Firestore.
That said, these timestamps (which are tokens) only return nil when they haven't reached the server, which means only in latency-compensated snapshots generated by the client (i.e. only when the signed-in user posts). Therefore, you can provide your own estimate when the value is nil (the current date and time), which would not only be accurate but would be overwritten by a subsequent snapshot anyway once the server has a real value. It's not a pleasant workaround, but it accomplishes exactly what the token does with its own estimate.
If you don't want to ditch Codable, then you can ditch Firestore's timestamp, which I've personally done. I'm not a fan of the token system, and I've replaced it with a basic Unix timestamp (an integer) that makes things much simpler. I don't have to worry about nil times, latency-compensated returns, or configuring snapshot data just to handle the value of a single field. If I had to guess, I'd imagine Firestore will eventually allow a global setting for timestamp behaviour, in addition to expanding the API so the Codable method can also account for timestamp behaviour. The TLDR is that what you want doesn't yet exist natively in the Firestore SDK, unfortunately, and I'd consider making it a feature request on the Firestore-iOS git repo.

How to persist aggregate/read model from "EventStore" in a database?

Trying to implement Event Sourcing and CQRS for the first time, but got stuck when it came to persisting the aggregates.
This is where I'm at now:
- I've set up "EventStore" and a stream, "foos"
- Connected to it from node-eventstore-client
- I subscribe to events with catchup
This is all working fine.
With the help of the eventAppeared event handler function, I can build the aggregate whenever events occur. This is great, but what do I do with it?
Let's say I build an aggregate that is a list of Foos:
[
  {
    id: 'some aggregate uuidv5 made from barId and bazId',
    barId: 'qwe',
    bazId: 'rty',
    isActive: true,
    history: [
      {
        id: 'some event uuid',
        data: {
          isActive: true,
        },
        timestamp: 123456788,
        eventType: 'IsActiveUpdated'
      },
      {
        id: 'some event uuid',
        data: {
          barId: 'qwe',
          bazId: 'rty',
        },
        timestamp: 123456789,
        eventType: 'FooCreated'
      }
    ]
  }
]
To follow CQRS I will build the above aggregate within a Read Model, right? But how do I store this aggregate in a database?
I guess just a NoSQL database should be fine for this, but I definitely need a db since I will put a gRPC API in front of this and other read models / aggregates.
But how do I actually go from having built the aggregate to persisting it in the db?
I once tried following this tutorial https://blog.insiderattack.net/implementing-event-sourcing-and-cqrs-pattern-with-mongodb-66991e7b72be which was super simple, since you use MongoDB both as the event store and just create a view for the aggregate, updating it when new events come in. It had its flaws and limitations (the aggregation pipeline), which is why I have now turned to "EventStore" for the event store part.
But how do I persist the aggregate, which is currently just built and stored in code/memory from events in "EventStore"?
I feel this may be a silly question, but do I have to loop over each item in the array and insert each item in the db table/collection, or is there a way to dump the whole array/aggregate there at once?
What happens after? Do you create a materialized view per aggregate and query against that?
I'm open to picking the best db for this, whether that is postgres/other rdbms, mongodb, cassandra, redis, table storage etc.
Last question. For now I'm just using a single stream, "foos", but at this level I expect new events to happen quite frequently (every couple of seconds or so). As I understand it, you'd still persist it and update it using materialized views, right?
So, given that barId and bazId in combination can be used for grouping events, instead of a single stream I'd think more specialized streams such as foos-barId-bazId would be the way to go, to try and reduce the frequency of incoming new events to a point where recreating materialized views makes sense.
Is there a general rule of thumb saying not to recreate/update/refresh materialized views once the update frequency rises above a certain limit? Then the only other alternative would be querying from a normal table/collection?
Edit:
In the end I'm trying to make a gRPC API that has just 2 rpcs - one for getting a single foo by id, and one for getting all foos (with an optional field for filtering by status - but that is not so important). The simplified proto would look something like this:
rpc GetFoo(FooRequest) returns (Foo)
rpc GetFoos(FoosRequest) returns (FoosResponse)

message FooRequest {
  string id = 1; // uuid
}

// If the optional status field is not specified, return all foos
message FoosRequest {
  // If this field is specified, only return the Foos whose isActive is true or false
  FooStatus status = 1;
  enum FooStatus {
    UNKNOWN = 0;
    ACTIVE = 1;
    INACTIVE = 2;
  }
}

message FoosResponse {
  repeated Foo foos = 1;
}

message Foo {
  string id = 1; // uuid
  string bar_id = 2; // uuid
  string baz_id = 3; // uuid
  bool is_active = 4;
  repeated Event history = 5;
  google.protobuf.Timestamp last_updated = 6;
}

message Event {
  string id = 1; // uuid
  google.protobuf.Any data = 2;
  google.protobuf.Timestamp timestamp = 3;
  string eventType = 4;
}
The incoming events would look something like this:
{
  id: 'some event uuid',
  barId: 'qwe',
  bazId: 'rty',
  timestamp: 123456789,
  eventType: 'FooCreated'
}

{
  id: 'some event uuid',
  isActive: true,
  timestamp: 123456788,
  eventType: 'IsActiveUpdated'
}
As you can see, there is no uuid in the events that would make it possible to GetFoo(uuid) in the gRPC API, which is why I generate a uuidv5 from the barId and bazId, which combined form a valid uuid. I'm creating that in the projection / aggregate you see above.
Also, the GetFoos rpc will either return all foos (if the status field is left undefined), or alternatively return the foos whose isActive matches the status field (if specified).
Yet I can't figure out how to continue from the catchup subscription handler.
I have the events stored in "EventStore" (https://eventstore.com/). Using a subscription with catchup, I have built an aggregate/projection with an array of Foos in the form that I want them. But to be able to get a single Foo by id from my gRPC API, I guess I'll need to store this entire aggregate/projection in a database of some sort, so the API can connect and fetch the data? And every time a new event comes in, I'll need to add that event to the database as well, or how does this work?
I think I've read every resource I can possibly find on the internet, but I'm still missing some key pieces of information to figure this out.
The gRPC part is not so important; it could be REST I guess. My big question is how to make the aggregated/projected data available to the API service (possibly more APIs will need it as well)? I guess I will need to store the aggregated/projected data, with the generated uuid and history fields, in a database so it can be fetched by uuid from the API service. But what database, and how does this storing process work from the catchup event handler where I build the aggregate?
I know exactly how you feel! This is basically what happened to me when I first tried to do CQRS and ES.
I think you have a couple of gaps in your knowledge which I'm sure you will rapidly plug. You hydrate an aggregate from the event stream, as you are doing. That IS your aggregate, persisted. The read model is something different. Let me explain...
Your read model is the thing you run queries against and that provides data for display in a UI, for example. Your aggregates are not (directly) involved in that. In fact, they should be encapsulated, meaning that you can't 'see' their state from the outside, i.e. no getters and setters, with the exception of the aggregate ID, which would have a getter.
This article gives you a helpful overview of how it all fits together: CQRS + Event Sourcing – Step by Step
The idea is that when an aggregate changes state it can only do so via an event it generates. You store that event in the event store. That event is also published so that read models can be updated.
Also, looking at your aggregate, it looks more like a typical read model object or DTO. An aggregate is interested in functionality, not properties. So you would expect to see public void functions for issuing commands to the aggregate, but not public properties like isActive or history.
I hope that makes sense.
EDIT:
Here are some more practical suggestions.
"To follow CQRS I will build the above aggregate within a Read Model, right? "
You do not build aggregates in the read model. They are separate things on separate sides of the CQRS equation. Aggregates are on the command side. Queries are done against read models, which are different from aggregates.
Aggregates have public void functions and no getters or setters (with the exception of the aggregate id). They are encapsulated. They generate events when their state changes as a result of a command being issued. These events are stored in an event store and are used to recover the state of an aggregate. In other words, that is how an aggregate is stored.
The events go on to be published so that event handlers and other processes can react to them and update the read model and/or trigger new cascading commands.
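To illustrate, here is a minimal TypeScript sketch of an aggregate in that style. All names are illustrative; the point is the shape: private state, public void command methods, and events as the only way state changes:

class FooAggregate {
  // State is private; the only public state is the id.
  private isActive = false;
  private uncommittedEvents: Array<{ eventType: string; data: any }> = [];

  constructor(public readonly id: string) {}

  // Rehydrate by folding past events from the event store onto the aggregate.
  static fromHistory(id: string, events: Array<{ eventType: string; data: any }>): FooAggregate {
    const agg = new FooAggregate(id);
    for (const e of events) agg.apply(e);
    return agg;
  }

  // Commands are public void functions: they validate, then emit events.
  deactivate(): void {
    if (!this.isActive) {
      throw new Error("Foo is already inactive");
    }
    this.raise({ eventType: "IsActiveUpdated", data: { isActive: false } });
  }

  // Events are the only thing that changes state.
  private apply(event: { eventType: string; data: any }): void {
    if (event.eventType === "IsActiveUpdated") {
      this.isActive = event.data.isActive;
    }
  }

  private raise(event: { eventType: string; data: any }): void {
    this.apply(event);
    this.uncommittedEvents.push(event); // to be appended to the event store
  }

  // The persistence layer reads these to append to the event store.
  pullUncommittedEvents(): Array<{ eventType: string; data: any }> {
    const out = this.uncommittedEvents;
    this.uncommittedEvents = [];
    return out;
  }
}

The command side loads the aggregate with fromHistory, calls a command such as deactivate, and appends the events from pullUncommittedEvents to the event store; nothing ever queries the aggregate's state directly.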
"Last question. For now I'm just using a single stream "foos", but at this level I expect new events to happen quite frequently (every couple of seconds or so) but as I understand it you'd still persist it and update it using materialized views right?"
Every couple of seconds is very likely to be fine. I'm more concerned about "persist and update using materialised views". I don't know exactly what you mean by that, but it doesn't sound like you have the right idea. Read models should be very simple, with no need for the complex relations you find in an RDBMS, and they are therefore highly optimised for fast reading.
There can be a lot of confusion around all the terminology and jargon used in DDD, CQRS and ES. I think in this case the confusion lies in what you think an aggregate is. You mention that you would like to persist your aggregate as a read model. As @Codescribler mentioned, at the sink end of your event stream there isn't a concept of an aggregate. Concretely, in ES, commands are applied to aggregates in your domain by loading previous events pertaining to that aggregate, rehydrating the aggregate by folding each previous event onto it, and then applying the command, which generates more events to be persisted in the event store.
Downstream, a subscribing process receives all the events in order and builds a read model based on the events and the data contained within them. The confusion here is that this read model, at this end, is not an aggregate per se. It might very well look exactly like your aggregate at the domain end, or it might be a read model that doesn't use all the events and/or all the event data.
For example, you may choose to use every bit of information and build a read model that looks exactly like the aggregate hydrated up to the newest event (likely your source of confusion). You may instead have another process that builds a read model that only tallies a specific type of event. You might even subscribe to multiple streams and "join" them into one big read model.
As for how to store it, this is really up to you. It seems to me like you are taking the events and rebuilding your aggregate plus a history of events in an in-memory structure. This, of course, doesn't scale, which is why you want to store it at rest in a database. I wouldn't use the in-memory structure, since you would need to do a lot of state diffing when you flush to the database. You should modify the database directly in response to each individual event. Ideally, you would also transactionally store the stream position with that modification, so you don't process the same event again in the case of a failure.
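For example, here is a minimal TypeScript sketch of that idea, using the node-eventstore-client from the question and MongoDB for the read model. The collection names and connection strings are illustrative, the checkpoint write is shown as a separate operation rather than a true transaction, and it assumes every event carries barId/bazId (the question's IsActiveUpdated event would need them added, or the foo id resolved some other way):

import { MongoClient } from "mongodb";
import { createConnection } from "node-eventstore-client";

async function startProjection() {
  const mongo = await MongoClient.connect("mongodb://localhost:27017");
  const db = mongo.db("readmodels");
  const foos = db.collection("foos");
  const checkpoints = db.collection("checkpoints");

  // Resume from the last processed event number, or from the start.
  const saved = await checkpoints.findOne({ _id: "foos" });
  const lastCheckpoint = saved ? saved.position : null;

  const conn = createConnection({}, "tcp://localhost:1113");
  await conn.connect();

  conn.subscribeToStreamFrom("foos", lastCheckpoint, false, async (_sub, resolved) => {
    const ev = resolved.event;
    if (!ev || !ev.data) return;
    const data = JSON.parse(ev.data.toString());
    const fooId = data.barId + data.bazId; // stand-in for the uuidv5 from the question

    // Update the read model directly, event by event; no in-memory aggregate.
    if (ev.eventType === "FooCreated") {
      await foos.updateOne(
        { _id: fooId },
        { $set: { barId: data.barId, bazId: data.bazId }, $push: { history: data } },
        { upsert: true }
      );
    } else if (ev.eventType === "IsActiveUpdated") {
      await foos.updateOne(
        { _id: fooId },
        { $set: { isActive: data.isActive }, $push: { history: data } }
      );
    }

    // Record how far we got, so a restart doesn't reprocess old events.
    await checkpoints.updateOne(
      { _id: "foos" },
      { $set: { position: resolved.originalEventNumber.toNumber() } },
      { upsert: true }
    );
  });
}

With the read model stored this way, GetFoo becomes a findOne by id and GetFoos a find with an optional isActive filter; there is no materialized view to rebuild, because the read model is updated incrementally on every event.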
Hope this helps a bit.

How to Subscribe to All GraphQL Mutations in AWS-Amplify Vue Components?

I am trying to subscribe to changes on delete, create and update mutations.
In my GraphQL schema, I created a subscription field that listens to all of those mutations:
type Subscription {
  onAll: Task
    @aws_subscribe(mutations: ["createTask","updateTask","deleteTask"])
}
Now, when trying the amplify-vue components and getting back a response via :onSubscriptionMsg="SomeFunction(response)", I receive the old list of tasks from response.data.listOfTasks.
So how should I know which mutation was invoked, and thus how to update data.listOfTasks?
Thanks a heap in advance for answering this question :)
A suggestion would be to break the subscription apart into multiple subscriptions (i.e. CreateTaskSubscription, UpdateTaskSubscription, etc.). That way you can implement your Vue logic separately based on which mutation was invoked, as there is then a 1-to-1 mapping between subscription and mutation, as opposed to the 1-to-many mapping your onAll subscription currently has.
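For example, the schema side of the split might look like this (the subscription field names are illustrative):

type Subscription {
  onCreateTask: Task @aws_subscribe(mutations: ["createTask"])
  onUpdateTask: Task @aws_subscribe(mutations: ["updateTask"])
  onDeleteTask: Task @aws_subscribe(mutations: ["deleteTask"])
}

Each subscription then arrives in its own handler, so a delete can remove the task from the local list while a create appends to it.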
Some reference docs:
Splitting up subscriptions: https://docs.aws.amazon.com/appsync/latest/devguide/real-time-data.html
Vue handling for each subscription type (go to the API connect section): https://aws-amplify.github.io/docs/js/vue

Update embedded data on referenced data update

I am building a Meteor application and am currently creating the publications. I am coming up against what seems like a common design quandary around related vs embedded documents. My data model (simplified) has Bookings, each of which has a related Client and a related Service. In order to optimise the speed of retrieving a collection, I am embedding the key fields of the Client and Service in the Booking, as well as linking to their IDs - my Booking model has the following structure:
export interface Booking extends CollectionObject {
  client_name: string;
  service_name: string;
  client_id: string;
  service_id: string;
  bookingDate: Date;
  duration: number;
  price: number;
}
In this model, client_id and service_id are references to the linked documents and client_name / service_name are embedded as they are used when displaying a list of bookings.
This all seems fine to me; the missing part of the puzzle, however, is keeping this embedded data up to date. If a user in a separate part of the system updates a Service (which would be a reactive collection), then I need this to trigger an update of the service_name on any Bookings with the corresponding service ID. Is there an event I can subscribe to for this? Client side, I have a form which allows the user to add / edit a Service, which simply uses the insert or update method on the MongoObservable collection. The OOP part of me feels like this needs to be overridden in the server code to then also update the related data, or am I going about this completely the wrong way?
Or is this all irrelevant: should I actually just use https://atmospherejs.com/reywood/publish-composite and return collections of related documents? (It just feels like it would harm performance in a production environment when returning several hundred bookings at once.)
I use the "foreign key" concept a lot, as you're describing, and denormalize data across collections as you're doing with the service name. I do this explicitly to avoid extra lookups / publishes.
I use 2 strategies to keep things up to date. The first is done when the source data is saved, say in a Meteor method call: I update the denormalized data on the spot, touching the other collection(s). I would do all this in a "high read, low write" scenario.
The other strategy is to use collection hooks to fire when the source collection is updated. I use this package: matb33:collection-hooks
Conceptually it's similar to the first; the hook into knowing when to do it is just different.
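For your service_name case, a minimal sketch with collection-hooks might look like this (the collection and field names are assumptions based on your model):

import { Mongo } from "meteor/mongo";

// The app's existing collections, declared here only to make the sketch
// self-contained.
const Services = new Mongo.Collection("services");
const Bookings = new Mongo.Collection("bookings");

// Server-side hook (added by matb33:collection-hooks): after any service
// update, push the new name into every booking that embeds this service.
Services.after.update(function (userId, doc) {
  Bookings.update(
    { service_id: doc._id },
    { $set: { service_name: doc.name } },
    { multi: true }
  );
});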
An example we're using in the current app I'm working on: we have a news feed with comments. News items and comments are in separate collections, and each record in the comments collection has the id of the associated news item.
We keep a running comment count associated with the news item itself. Whenever a comment is added or removed, we increment/decrement the count and update the news item right away.
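That pattern is just another pair of hooks; a minimal sketch, again with illustrative names:

const NewsItems = new Mongo.Collection("newsItems");
const Comments = new Mongo.Collection("comments");

// Keep the denormalized count in step with the comments collection.
Comments.after.insert(function (userId, doc) {
  NewsItems.update({ _id: doc.newsItemId }, { $inc: { commentCount: 1 } });
});
Comments.after.remove(function (userId, doc) {
  NewsItems.update({ _id: doc.newsItemId }, { $inc: { commentCount: -1 } });
});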

Querying Azure Mobile App TableController

I'm using Azure Mobile Apps and TableControllers in my project. Development has been going quite smoothly, until now. One of my tables relies on quite a bit of business logic in order to return the appropriate entities back to the client. To perform this business logic I need to get some parameters from the client (specifically, a date range).
I know I could use an APIController to return the data, but won't that break the entity syncing that's provided by the SyncTables in Xamarin?
My current logic in my GetAll is:
public IQueryable<WorkItemDTO> GetAllWorkItem()
{
    // Return all the work items that the user owns or has been assigned to as a resource.
    var query = MappedDomainManager.QueryEntity()
        .Where(x => x.OwnerId == UserProfileId
            || x.Resources.Where(r => r.AssignedResourceId == UserProfileId).Count() > 0);
    return query.Project().To<WorkItemDTO>();
}
What I would like is to be able to somehow pass through a start and end date that I can then use to build up my list of WorkItemDTO objects. The main problem is that a single WorkItem entity can actually spawn multiple WorkItemDTO objects, because a WorkItem can be set to recur. So, for example, if a WorkItem recurs once a week and the user wants to see a calendar for one month, that single WorkItem will spawn 4 separate concrete WorkItemDTO objects.
Then when a user modifies one of those WorkItemDTO objects on the client side, I want it to be sent back as a patch that creates its own WorkItem entity.
Does anyone know how I can get a TableController to receive parameters? Or how to get an APIController to work so that client syncing isn't affected?
Any help would be appreciated.
Thanks
Jacob
On the server, you can easily add a query parameter to the table controller's Get method by adding a parameter with the right name and type.
For instance, you could add a dateFilter query parameter as follows:
public IQueryable<WorkItemDTO> GetAllWorkItem(string dateFilter)
This would be called by passing a dateFilter=value query parameter. You can use any data type that ASP.NET Web API supports in serialization. (Note that if you don't also have a GetAll that takes no query parameters, you will get an HTTP 405 Method Not Allowed if you do a GET without this query parameter.)
On the client, as noted by @JacobJoz, you just use the method IMobileServiceTableQuery.WithParameters to construct the query that is passed to PullAsync. If you have multiple incremental sync queries against the same table and they use different values for the parameters, you should make sure to include those values in the queryId for the pull.
That is, if you have one query with parameters foo=bar and another with foo=baz against the same sync table, make sure you use two different query IDs, one that includes "bar" and one that includes "baz". Otherwise, the 2 incremental syncs can interfere with one another, as the queryId is used as a key to save the last updated timestamp for that sync table. See How offline synchronization works.
The part that is unfortunately hard is passing the query parameter as part of the offline sync pull. Offline sync only works with table controllers, FYI.
There is an overloaded extension method for PullAsync that takes a dictionary of parameters, but unfortunately it requires a string query rather than IMobileServiceTableQuery:
PullAsync(this IMobileServiceSyncTable table, string queryId, string query, IDictionary<string, string> parameters, CancellationToken cancellationToken)
(I've filed a bug to fix this: Add a generic PullAsync overload that accepts query parameters).
The problem is that there's no easy way to convert from IMobileServiceTableQuery to an OData query string, since you'd need to access internal SDK methods. (I filed another issue: Add extension method ToODataString for IMobileServiceTableQuery.)
I've looked through the source code for MobileServiceTableQuery on GitHub. It looks like it exposes a method called WithParameters. I have chained that method call onto CreateQuery in order to generate the query to the server, and it seems to do what I want.
Here is the client code:
var parameters = new Dictionary<string, string>();
parameters.Add("v1", "hello");

// WithParameters attaches the extra key/value pairs to the sync query;
// they are sent as query-string parameters with the pull request.
var query = WorkItemTable.CreateQuery().WithParameters(parameters);

// The queryId ("RetrieveWorkItems") is the key used for incremental sync.
await WorkItemTable.PullAsync("RetrieveWorkItems", query);
On the server I have a GetAll implementation that looks like this:
public IQueryable<WorkItem> GetAllWorkItem(string v1)
{
    // return IQueryable after processing business logic based on parameter
}
The parameterized version of the method gets called successfully. I'm just not entirely sure what the impacts are from an incremental pull perspective.