Azure Mobile Services for Xamarin Forms - Conflict Resolution - azure-mobile-services

I'm supporting a production Xamarin Forms app with an offline sync feature implemented using Azure Mobile Services.
We have a lot of production issues related to users losing data, or general instability that goes away if they reinstall the app. After having a look through the code, I think the issues are around how conflict resolution is handled in the app.
For every entity that tries to sync, we handle MobileServicePushFailedException, then traverse the errors returned and take action.
catch (MobileServicePushFailedException ex)
{
    foreach (var error in ex.PushResult.Errors) // These are MobileServiceTableOperationError instances
    {
        var status = error.Status; // HTTP status code returned
        // Take action based on this status.
        // If it's 409 or 412, we go into conflict resolution and try to decide whether the client or server version wins.
    }
}
The conflict resolving seems too custom to me and I'm checking to see whether there are general guidelines.
For example, we seem to be getting empty values for 'CreatedAt' & 'UpdatedAt' timestamps for local and server versions of the entities returned, which is weird.
var serverItem = error.Result;
var clientItem = error.Item;
// sometimes serverItem.UpdatedAt or clientItem.UpdatedAt is NULL. Since we use these 2 fields to determine who wins, we are stumped here
If anyone can point me to some guidelines or sample code on how these conflicts should generally be handled using information from the MobileServiceTableOperationError, that would be highly appreciated.

I came across the following code snippet from the following doc.
// Simple error/conflict handling.
if (syncErrors != null)
{
    foreach (var error in syncErrors)
    {
        if (error.OperationKind == MobileServiceTableOperationKind.Update && error.Result != null)
        {
            // Update failed, reverting to server's copy.
            await error.CancelAndUpdateItemAsync(error.Result);
        }
        else
        {
            // Discard local change.
            await error.CancelAndDiscardItemAsync();
        }
        Debug.WriteLine(@"Error executing sync operation. Item: {0} ({1}). Operation discarded.",
            error.TableName, error.Item["id"]);
    }
}
For surfacing conflicts to the UI, I found this in the same docs:
private async Task ResolveConflict(TodoItem localItem, TodoItem serverItem)
{
    // Ask the user to choose the resolution between versions
    MessageDialog msgDialog = new MessageDialog(
        String.Format("Server Text: \"{0}\" \nLocal Text: \"{1}\"\n",
        serverItem.Text, localItem.Text),
        "CONFLICT DETECTED - Select a resolution:");
    UICommand localBtn = new UICommand("Commit Local Text");
    UICommand ServerBtn = new UICommand("Leave Server Text");
    msgDialog.Commands.Add(localBtn);
    msgDialog.Commands.Add(ServerBtn);
    localBtn.Invoked = async (IUICommand command) =>
    {
        // To resolve the conflict, update the version of the item being committed. Otherwise, you will keep
        // catching a MobileServicePreconditionFailedException.
        localItem.Version = serverItem.Version;
        // Updating recursively here just in case another change happened while the user was making a decision
        UpdateToDoItem(localItem);
    };
    ServerBtn.Invoked = async (IUICommand command) =>
    {
        RefreshTodoItems();
    };
    await msgDialog.ShowAsync();
}
I hope this helps provide some direction. Although the Azure Mobile docs have been deprecated, the SDK hasn't changed and should still be relevant. If this doesn't help, let me know what you're using for a backend store.
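The "who wins" step described in the question (compare the two UpdatedAt values) could be isolated into a small pure function that also covers the case where a timestamp comes back empty. The following is only a sketch of that decision logic, written in TypeScript as language-neutral pseudologic rather than against the Mobile Services SDK; the `SyncRecord` shape, the `Resolution` names and the "fall back to the server copy" default are all assumptions made for illustration.

```typescript
// Hypothetical shape: only the field the decision needs.
interface SyncRecord {
  // ISO-8601 timestamp, or null when the field was not populated.
  updatedAt: string | null;
}

type Resolution = "keepServer" | "keepClient";

// Decide which copy wins. Assumption: when a timestamp is missing or
// unparsable, a comparison would be a guess, so the function returns a
// fixed fallback instead.
function resolveConflict(
  clientItem: SyncRecord,
  serverItem: SyncRecord | null,
  fallback: Resolution = "keepServer"
): Resolution {
  // No server copy was returned with the error: nothing to compare against.
  if (serverItem === null) return fallback;
  if (clientItem.updatedAt === null || serverItem.updatedAt === null) return fallback;

  const clientTime = Date.parse(clientItem.updatedAt);
  const serverTime = Date.parse(serverItem.updatedAt);
  if (isNaN(clientTime) || isNaN(serverTime)) return fallback;

  // Last writer wins; a tie goes to the server.
  return clientTime > serverTime ? "keepClient" : "keepServer";
}
```

In terms of the snippets quoted above, "keepServer" would map to CancelAndUpdateItemAsync(error.Result) (or CancelAndDiscardItemAsync when there is no server copy), and "keepClient" to copying the server Version onto the local item and pushing again. Keeping the decision separate from the SDK calls makes the null-timestamp path explicit and easy to unit test.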

Flutter Future timeouts not always working correctly

Hey, I need some help with how to use timeouts in Flutter correctly. First, the main goal:
I want to receive data from my Firebase Realtime Database, but I need to guard this request with a timeout of 15 seconds. After 15 seconds the timeout should throw an exception, so that the frontend can show the user an alert saying that the request timed out.
So I used the simple way of calling timeout on a Future.
This function should only check whether an ID exists on some Firebase node.
The class in which I declared this function also has an instance called timeoutControl. This is a class that contains a duration and some reasons for the exceptions.
Future<bool> isUserCheckedIn(String oid, String maybeCheckedInUserIdentifier, String onGateId) async {
  try {
    databaseReference = _firebaseDatabase.ref("Boarding").child(oid).child(onGateId);
    final snapshot = await databaseReference.get().timeout(
      Duration(seconds: timeoutControl.durationForTimeOutInSec),
      onTimeout: () => timeoutControl.onEppTimeoutForTask(),
    );
    if (snapshot.hasChild(maybeCheckedInUserIdentifier)) {
      return true;
    } else {
      return false;
    }
  } catch (exception) {
    return false;
  }
}
The timeout class that the timeoutControl instance comes from:
class CustomTimeouts {
  int durationForTimeOutInSec = 15; // How many seconds to wait before we throw a timeout exception
  CustomTimeouts();
  // TODO: Implement the exception reasons here later ...
  onEppTimeoutForUpload() {
    throw Exception("Some reason ...");
  }
  onEppTimeoutForTask() {
    throw Exception("Some reason ...");
  }
  onEppTimeoutForDownload() {
    throw Exception("Some reason ...");
  }
}
As you can see, this is the implementation I tried. It works fine most of the time, but sometimes I have to fight with things I can't explain. Let me describe what the problem is in some cases.
Inside the frontend class I make this call:
bool isUserCheckedIn = await service.isUserCheckedIn(placeIdentifier, userId, gateId);
Map<String, dynamic> data = {"gateIdActive": isUserCheckedIn};
/*
  The response here is a custom transaction handler which contains an error or a returned param
  etc., so it isn't relevant here ...
*/
_gateService.updateGate(placeIdentifier, gateId, data).then((response) {
  if (response.hasError()) {
    setState(() {
      EppDialog.showErrorToast(response.getErrorMessage()); // Shows an error message
      isSendButtonDiabled = false; /* Reset the button's state */
    });
  } else {
    // Create a gate process here ...
    createGateEntrys(); // <-- If the closure's update was successful we also handle some
                        // other data inside the RTDB for other reasons here ...
  }
});
It is important to know that I use the boolean returned from this call to update other data, which is uploaded to another node in the RTDB for other reasons. If that was also successful, the application goes on to update some entries inside the RTDB via createGateEntrys(). This function is called last; it is also marked async and is called from inside the closure without an await.
The data inside my Firebase RTDB:
"GateCheckIns" / "4mrithabdaofgnL39238nH" (the place identifier) / "NFdxcfadaies45a" (the gate identifier) / "nHz2mhagadzadzgadHjoeua334" : 1 (the key is the ID of a user who is checked in)
On real devices this always works without any problems, but I don't think the difference between a real device and a simulator can be the actual cause of the problem I'm facing. Sometimes, inside the simulator, this function always returns false, no matter whether the current user's identifier is inside these child nodes or not. I then realized that the timeout fires almost immediately, after 1-2 seconds: the exception was always one of those thrown from my CustomTimeouts class, by the function passed to .timeout(duration, onTimeout: () => ...). I couldn't figure it out because, as I said, I don't see this problem on real devices.
I hope I was able to explain the problem. It's a little bit complicated, I know, but it's important for me that someone can explain what I should pay attention to when I use timeouts in this style.
(This is my first question here on Stack Overflow :) )
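One thing that can be read directly from the code in the question: the exception thrown by onTimeout is raised inside the try block, so the surrounding catch turns it into `false`. To the caller, "the lookup timed out" and "the user is not checked in" are therefore indistinguishable, which matches the observed "always returns false". A minimal sketch of keeping the two outcomes apart follows. It is written in TypeScript rather than Dart, and `withTimeout`, `checkUser`, `TIMED_OUT` and the `lookup` parameter are hypothetical names introduced only for this illustration.

```typescript
type CheckResult = "checkedIn" | "notCheckedIn" | "timedOut";

// Unique marker object; compared by identity so it cannot collide with real data.
const TIMED_OUT = { timedOut: true };

// Resolve with the task's value, or with TIMED_OUT if `ms` elapses first.
// Errors from the task itself are passed through unchanged.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T | typeof TIMED_OUT> {
  return new Promise<T | typeof TIMED_OUT>((resolve, reject) => {
    const timer = setTimeout(() => resolve(TIMED_OUT), ms);
    task.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// `lookup` stands in for the database read (a hypothetical parameter).
async function checkUser(
  lookup: () => Promise<boolean>,
  timeoutMs: number
): Promise<CheckResult> {
  const outcome = await withTimeout(lookup(), timeoutMs);
  if (outcome === TIMED_OUT) return "timedOut";
  return outcome === true ? "checkedIn" : "notCheckedIn";
}
```

With a three-state result, the frontend can show the timeout alert for "timedOut" and only write `gateIdActive: false` when the lookup really completed. The same idea carries over to Dart by letting the timeout exception propagate (or by returning a distinct value) instead of catching everything and returning false.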

How can I handle high traffic increase on my google cloud feature?

My situation is the following:
I developed a game where people can save their progress via Google Cloud. I'm releasing big content updates, which results in many people returning to my game at the same time and trying to fetch their savegame.
This overload causes my customers to get stuck in the savegame-loading process, unable to start playing with their progress.
Here's an updated 4-day screenshot of the cloud API dashboard
(And here's the old "12 hours" screenshot of the cloud API dashboard)
More information about the project:
The game keeps using the "save in cloud" function in the background at some stages of the game, so that players can play on two different devices.
I'm using Unity 2019.3.9f1 and the Asset "Easy Mobile Pro 2.17.3" for the Game-Service-Feature.
The "Google Play Games Plugin" has the version "0.10.12" and can be found on github
More information about the cloud situation:
The OAuth "user type" is "External" (and can't be changed)
The OAuth user cap display shows "0/100" for the user cap
The OAuth rate limits page shows this for the token grant rate (the highest "Average Token Grant Rate" is 3,33 against a limit of 10.000)
All used quotas are within the limit. The project reaches
1/10 of "queries per day" (1.000.000.000 max) and
1/2 of "queries per 100 sec" (20.000 max).
More information about the error trace in the cloud API:
While searching for a better error log I tried to find “Cloud Logging” tools in the “Google Cloud Platform” console. But none of the sections I tried displays anything:
“Logging” (Operations tool) is empty
“Cloud Logging API” says: “no data available for the selected time frame.”
“Cloud Debugger API” says: “no data available for the selected time frame.”
I can't find a more detailed view of the errors than this (the "Metrics" section of the "Google Drive API"):
Is there anything I'm missing that would give better insight?
More information about the core code:
As I mentioned, I’m using “EasyMobilePro”, so I have one “SaveGame” variable and 8 calls, used for Google and Apple alike. I already contacted their support: they assured me that those calls are unchangeable and rock solid (so the problem can’t be caused by their code) and that I should contact Google if the problem is not on my side.
The 5 calls from EasyMobile for cloudsave are:
bool “GameServices.IsInitialized()”
void “GameServices.OpenWithAutomaticConflictResolution”
void “GameServices.WriteSavedGameData”
void “GameServices.ReadSavedGameData”
void “GameServices.DeleteSavedGame”
The 3 calls from EasyMobile for cloud-login are:
void “GameServices.Init()”
delegate “GameServices.UserLoginSucceeded”
delegate “GameServices.UserLoginFailed”
The process that causes the issue:
I call “GameServices.Init()” and the user logs in (no problem).
In the “LoginSuccess” callback I call my function “HandleFirstCloudOpening”:
// This method is called after the player pressed "Save/Load" on the start screen.
// The button is disabled immediately (and is re-enabled if an error/fail happens).
public void TryCallUserLogin() {
    if (!IsLoginInit) {
        EasyMobile.GameServices.UserLoginFailed += HandleLoginFail;
        EasyMobile.GameServices.UserLoginSucceeded += HandleFirstCloudOpening;
        IsLoginInit = true;
    }
    if (!IsGameServiceInitialized) {
        EasyMobile.GameServices.Init();
    } else { // This "else" is only reached if "Init" was successful but the player doesn't have a connected savegame
        HandleFirstCloudOpening();
    }
}
private void HandleLoginFail() {
    // (...) Show error popup, let the player try to log in again
}
private void HandleFirstCloudOpening() {
    if (currentSaveState != CloudSaveState.NONE) {
        CloudStateConflictDebug(CloudSaveState.OPENING);
        return;
    }
    currentSaveState = CloudSaveState.OPENING;
    EasyMobile.GameServices.SavedGames.OpenWithAutomaticConflictResolution(cloudSaveNameReference, UseFirstTimeOpenedSavedGame);
}
private void UseFirstTimeOpenedSavedGame(EasyMobile.SavedGame _savedGame, string _error) {
    currentSaveState = CloudSaveState.NONE;
    if (string.IsNullOrEmpty(_error)) {
        cloudSaveGame = _savedGame;
        ReadDataFromCloud(cloudSaveGame);
    } else {
        ErrorPopupWithCloseButton("cloud_open", "failed with error: " + _error);
    }
}
private void ReadDataFromCloud(EasyMobile.SavedGame _savedGame) {
    if (_savedGame.IsOpen) {
        currentSaveState = CloudSaveState.LOADING;
        EasyMobile.GameServices.SavedGames.ReadSavedGameData(_savedGame, UseSucessfullLoadedCloudSaveGame);
    } else { // Backup path if the freshly opened savegame is "closed" for some reason (can happen later while saving in-game)
        HandleFirstCloudOpening();
    }
}
private void UseSucessfullLoadedCloudSaveGame(EasyMobile.SavedGame _game, byte[] _cloudData, string error) {
    if (!string.IsNullOrEmpty(error)) {
        ErrorPopupWithCloseButton("cloud_read", "Reading saved game data failed: " + error);
        return;
    }
    if (_cloudData.Length > 0) {
        // A function that converts the saved bytes to my usable savegame data.
        // It has a try/catch: if it fails, it invokes the callback with the param "null".
        SaveGameToByteConverter.LoadFromBytes<CoreSaveData>(_cloudData, UseSucessfullConvertedSavegameData);
    } else {
        // This will "fail", causing the callback to be invoked with the param "null".
        SaveGameToByteConverter.LoadFromBytes<CoreSaveData>(null, UseSucessfullConvertedSavegameData);
    }
}
private void UseSucessfullConvertedSavegameData(CoreSaveData _convertedSaveGame) {
    // The user has a loaded, normal savegame in their cloud
    if (_convertedSaveGame != null) {
        // The loaded save matches the verification conditions
        if (CheckLoadedSaveIsVerified(_convertedSaveGame)) {
            OverrideGameSaveDatawithLoaded(_convertedSaveGame);
            ReloadCurrentScene();
            return;
        } else { // This happens if the cloud save doesn't pass my verification process
            ErrorPopupWithCloseButton("cloud_loadedSave", "Couldn't find a compatible Savegame!");
            return;
        }
    } else { // The user uses cloud save for the first time, or has an unusable savegame and gets a "new" one (losing the old data)
        TrySaveGameToCloud((bool _saved) => {
            SaveAllGameFilesLocally();
        });
    }
}
I shrank the code by removing most of my “if an error happens, do XY” branches, since there are many and they would bloat the reprex. If necessary I can provide more detailed (but more complicated) code.
Current conclusion:
I can't find any issue on my side that wouldn't have been fixed by restarting the game or wouldn't have been covered by an error popup for the user. It looks as if the requests are queued because of the number of users and have to wait far too long for a response. Some users told us they had to wait and retry for "x hours" (it varies from 2h to 36h) before they got through and could play with their progress (so it worked). But some players mentioned they couldn't play again the next day (same problem). As if their "access token" were only valid for a day?
Edit-History:
(1) updated the first dash-board-picture to match the ongoing situation
(1) added "more information about the cloud situation"
(1) can't find a more detailed error-log
(2) removed most pictures as displayables (kept the links)
(2) added "more information about the error trace in the cloud API"
(2) added "more information about the core code" and a reprex
(2) added "current conclusion"

Vertx CompositeFuture

I am working on a solution where I am using Vert.x 3.8.4 and vertx-mysql-client 3.9.0 for asynchronous database calls.
Here is the scenario that I have been trying to solve in a properly reactive manner.
I have some master table records which are in an inactive state.
I run a query and get the list of records from the database.
I did this as follows:
Future<List<Master>> locationMasters = getInactiveMasterTableRecords();
locationMasters.onSuccess(locationMasterList -> {
    if (locationMasterList.size() > 0) {
        uploadTargetingDataForAllInactiveLocations(vertx, amazonS3Utility,
            locationMasterList);
    }
});
Now, in the uploadTargetingDataForAllInactiveLocations method, I have a list of items.
I need to iterate over this list and, for each item, download a file from AWS, parse the file and insert the data into the database.
I understand that the way to do this is with CompositeFuture.
Can someone from the Vert.x dev community help me with this, or point me to some documentation?
I did not find good content on this by googling.
I'm answering this because I was searching for something similar and spent some time before finding an answer; hopefully this will be useful to someone else in the future.
I believe you want to use CompositeFuture in Vert.x only if you need to synchronize multiple actions. That is, you want an action to run either when all of the actions the composite future is built on succeed, or when at least one of them succeeds.
In the first case I would use CompositeFuture.all(List<Future> futures) and in the second case I would use CompositeFuture.any(List<Future> futures).
As per your question, below is sample code where, for each item in a list, we run an asynchronous operation (namely downloadAndProcessFile()) that returns a Future, and we want to execute an action doAction() once all the async operations have succeeded:
List<Future> futures = new ArrayList<>();
locationMasterList.forEach(elem -> {
    Promise<Void> promise = Promise.promise();
    futures.add(promise.future());
    Future<Boolean> processStatus = downloadAndProcessFile(); // doesn't need to be boolean
    processStatus.onComplete(asyncProcessStatus -> {
        if (asyncProcessStatus.succeeded()) {
            // eventually do stuff with the result
            promise.complete();
        } else {
            promise.fail("Error while processing file whatever");
        }
    });
});
CompositeFuture.all(futures).onComplete(compositeAsync -> {
    if (compositeAsync.succeeded()) {
        doAction(); // <-- here do what you want to do when all futures complete
    } else {
        // at least 1 future failed
    }
});
This solution is probably not perfect and I suppose it can be improved, but this is what I found works for me. Hopefully it will work for someone else.

What impact does changing a IReliableQueue to a IReliableConcurrentQueue have in an existing deployment?

I am working on a Service Fabric application that uses IReliableQueue. For the use cases of this system, IReliableConcurrentQueue makes sense, and some local testing (i.e. basically just changing the code to use IReliableConcurrentQueue instead of IReliableQueue, with the queue name unchanged) shows great performance improvements. However, I am worried about the impact of changing this in a production system (i.e. upgrading). I can't find any docs or online questions (unless I just missed them) about these considerations. For example, in this system, the existing IReliableQueue will almost always have items. So what happens to that data when I upgrade the SF application? Will it be available to dequeue in the IReliableConcurrentQueue? Or would data be lost? I know I can "just try it" but wanted to see if someone out there had done the same or could offer pointers to existing resources. Thanks!
Sorry for the late answer (which you probably don't need anymore, but still).
When we call the GetOrAddAsync method on IReliableStateManager, we aren't retrieving an interface to store values; we are actually creating an instance of a reliable collection. This basically means that the type of the interface we specify is very important.
Taking this into account, if we do this:
Service v. 1.0
// Somewhere in RunAsync for example
await this.StateManager.GetOrAddAsync<IReliableQueue<long>>("MyCollection");
and then do this in the next version:
Service v. 1.1
// Somewhere in RunAsync for example
await this.StateManager.GetOrAddAsync<IReliableConcurrentQueue<long>>("MyCollection");
it will throw an exception:
Returned reliable object of type Microsoft.ServiceFabric.Data.Collections.DistributedQueue`1[System.Int64] cannot be casted to requested type Microsoft.ServiceFabric.Data.Collections.IReliableConcurrentQueue`1[System.Int64]
and then:
System.ExecutionEngineException: 'Exception of type 'System.ExecutionEngineException' was thrown.'
The above exception looks like a bug, so I have filed one.
UPDATE 2019.06.28
It turned out that the appearance of System.ExecutionEngineException isn't a bug but rather undocumented behavior of the Environment.FailFast method in combination with the Visual Studio debugger.
Please see my comment on the above issue.
This is what would happen.
There are plenty of ways to overcome this.
Here is the most obvious one:
Example
var migrate = false; // Set to true when the migration still has to be performed.
var migrateValues = new List<long>();
var applicationFlags = await this.StateManager
    .GetOrAddAsync<IReliableDictionary<string, bool>>("application-flags");
using (var transaction = this.StateManager.CreateTransaction())
{
    var flag = await applicationFlags
        .TryGetValueAsync(transaction, "queue-to-concurrent-queue-migration");
    if (!flag.HasValue || !flag.Value)
    {
        var queue = await this.StateManager
            .GetOrAddAsync<IReliableQueue<long>>("value-collection");
        for (;;)
        {
            var c = await queue.TryDequeueAsync(transaction);
            if (!c.HasValue)
            {
                break;
            }
            migrateValues.Add(c.Value);
        }
        migrate = true;
    }
}
if (migrate)
{
    await this.StateManager.RemoveAsync("value-collection");
    using (var transaction = this.StateManager.CreateTransaction())
    {
        var concurrentQueue = await this.StateManager
            .GetOrAddAsync<IReliableConcurrentQueue<long>>("value-collection");
        foreach (var i in migrateValues)
        {
            await concurrentQueue.EnqueueAsync(transaction, i);
        }
        await applicationFlags.AddOrUpdateAsync(
            transaction,
            "queue-to-concurrent-queue-migration",
            true,
            (s, b) => true);
        // Commit inside the using block, while the transaction is still in scope.
        await transaction.CommitAsync();
    }
}
Please note that this code is just an illustrative example and should be properly tested before applying it to a real-life application.

SignalR Core - Error: Websocket closed with status code: 1006

I use SignalR in an Angular app. When I destroy a component in Angular, I also want to stop the connection to the hub. I use the command:
this.hubConnection.stop();
But I get an error in the Chrome console:
Websocket closed with status code: 1006
In Edge: ERROR Error: Uncaught (in promise): Error: Invocation canceled due to connection being closed. Error: Invocation canceled due to connection being closed.
It actually works and the connection is stopped, but I would like to know why I get the error.
This is how I start the hub:
this.hubConnection = new HubConnectionBuilder()
  .withUrl("/matchHub")
  .build();
this.hubConnection.on("MatchUpdate", (match: Match) => {
  // some magic
});
this.hubConnection
  .start()
  .then(() => {
    this.hubConnection.invoke("SendUpdates");
  });
EDIT
I finally found the issue. It's caused by the change streams from Mongo. If I remove the code from the SendUpdates() method, then OnDisconnected is triggered.
public class MatchHub : Hub
{
    private readonly IMatchManager matchManager;
    public MatchHub(IMatchManager matchManager)
    {
        this.matchManager = matchManager;
    }
    public async Task SendUpdates() {
        using (var changeStream = matchManager.GetChangeStream()) {
            while (changeStream.MoveNext()) {
                var changeStreamDocument = changeStream.Current.FullDocument;
                if (changeStreamDocument == null) {
                    changeStreamDocument = BsonSerializer.Deserialize<Match>(changeStream.Current.DocumentKey);
                }
                await Clients.Caller.SendAsync("MatchUpdate", changeStreamDocument);
            }
        }
    }
    public override async Task OnDisconnectedAsync(Exception exception)
    {
        await base.OnDisconnectedAsync(exception);
    }
}
The GetChangeStream method from the manager:
ChangeStreamOptions options = new ChangeStreamOptions() { FullDocument = ChangeStreamFullDocumentOption.UpdateLookup };
var watch = mongoDb.Matches.Watch(options).ToEnumerable().GetEnumerator();
return watch;
But I don't know how to fix it.
This can happen for many reasons, but I think it is most likely this one:
I think this is because of how the server is handling the connected / disconnected events. I can't say for sure, but the closing of the connection also needs to be handled correctly in code on the server. Try overriding the built-in OnConnected / OnDisconnected methods on the server and see. My assumption is only that you're closing it but the server isn't closing properly and therefore isn't relaying the proper closed response.
Found as a comment at: getting the reason why websockets closed with close code 1006
You don't need to change the connect/disconnect handling if everything works fine, but as an answer this is the most likely cause.
It throws the error because the callback doesn't get cleared properly.
This is caused by the data returned from the websocket.
Normally it should return something like
However, for some reason it might return something like
with the very last response broken into 2 pieces.
That causes the issue.
I don't think there is a way to bypass this without changing the source code.
I also reported this on the GitHub repo here
It turns out that I can just use the invocation response to notify the client to stop the hub, so it doesn't trigger the race condition.
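Separately from the server-side cause, the Edge message in the question ("Uncaught (in promise) ... Invocation canceled") is consistent with the promise returned by invoke("SendUpdates") having no rejection handler: when the connection is stopped while the call is still running, that promise is rejected and nothing catches it. A minimal sketch of keeping the promise handled and treating a rejection during shutdown as expected is shown below. `HubLike`, `subscribeToUpdates` and `createFakeHub` are hypothetical names for this illustration; the fake hub only imitates the "reject pending invocation on stop" behavior so the sketch runs without the SignalR package.

```typescript
// Minimal structural view of the connection. The real HubConnection from
// @microsoft/signalr exposes `invoke` and `stop` with compatible shapes.
interface HubLike {
  invoke(methodName: string): Promise<unknown>;
  stop(): Promise<void>;
}

// Starts the long-running server call and keeps its promise handled.
function subscribeToUpdates(hub: HubLike) {
  let stopping = false;
  const unexpectedErrors: unknown[] = [];

  const pending = hub.invoke("SendUpdates").then(
    () => undefined,
    (error: unknown) => {
      // A rejection caused by our own shutdown is expected; record anything else.
      if (!stopping) unexpectedErrors.push(error);
    }
  );

  return {
    unexpectedErrors,
    // Intended to be called from the component's destroy hook.
    async end(): Promise<void> {
      stopping = true;
      await hub.stop();
      await pending;
    },
  };
}

// Stand-in for a real connection (assumption: stop() rejects the pending call).
function createFakeHub(): HubLike {
  let cancel: (reason: Error) => void = () => {};
  return {
    invoke: () =>
      new Promise<unknown>((_resolve, reject) => {
        cancel = reject;
      }),
    stop: async () => {
      cancel(new Error("Invocation canceled due to connection being closed."));
    },
  };
}
```

In an Angular component this would mean calling `subscribeToUpdates(this.hubConnection)` after `start()` resolves and calling `end()` from ngOnDestroy. It does not address the 1006 close code itself, which the thread attributes to the blocking change-stream loop on the server.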