How to build a Distributed Network with mutual consensus

What are the technologies needed to build a system which is distributed and maintains proof of work between parties and transactions?
I know about blockchain and have worked with IBM Fabric, but they have their own restrictions.
What are the other possible methods and technologies for building such a strong and robust system?
Thanks in advance :)

Well, if you want to stay blockchain-specific, Tendermint stepped around the issues of BFT by using gossip for each round. The thing is, requiring proof of work hinders scalability, and strong consensus is not very scalable either.
Now, if we're talking about concepts that can help you build your own solution rather than framework X: look at event sourcing for shareable, immutable transactions. If you need consistency, look into Chord; it's the easiest way to implement eventual consistency. Or use gossip plus CRDTs; hell, Dynamo and Cassandra use all three.
For maintaining security, look into the Station-to-Station protocol plus mutual TLS authentication.
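To make the gossip + CRDTs suggestion concrete, here is a minimal Python sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The class and names are illustrative, not taken from any particular framework:

```python
# Minimal sketch of a grow-only counter (G-Counter) CRDT.
# Each node increments only its own slot; replicas converge by
# taking the element-wise maximum when they gossip state to each other.

class GCounter:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}          # node_id -> count

    def increment(self, amount=1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Gossip step: element-wise max is commutative, associative,
        and idempotent, so replicas converge regardless of delivery order."""
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

# Two replicas diverge, then converge after exchanging state.
a, b = GCounter("a"), GCounter("b")
a.increment(); a.increment()
b.increment()
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3
```

The merge being idempotent is what makes this safe to run over an unreliable gossip channel: re-delivering the same state twice changes nothing.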

Related

Is there a specific standard to follow in order to call something a "Microservices Architecture"?

In the past I worked with what I believe were "microservices". We had service discovery, and the applications communicated with each other through synchronous REST calls, which I believe is called request/response.
Now we are working on an application that is simply broken down into multiple small applications which work together through publish/subscribe using Kafka; there is no direct communication between them, nor service discovery, per se.
Is it safe to say that these are "microservices" too, or what should I call them?
As one commenter pointed out, an answer will always be opinionated (and Stack Overflow is generally not the best place for opinion-based questions), but I am not afraid to give mine in an attempt to help.
My view goes along the lines of what Martin Fowler and others have taught us in the last decade, as well as the insight from the pioneering work of Werner Vogels at Amazon. Their definition is that microservices are not a single well-defined standard, but an architectural style that can encompass a range of different approaches to distributed computing. What these approaches have in common is the goal of providing scalability along many different dimensions, mostly system complexity, organizational productivity, throughput, and resilience, all while being affordable. To achieve that, the designs should be centered around:
Service boundaries cut according to business goals (a form of Domain-Driven Design).
Sizing/scope of services kept limited (the famous two-pizza team idea).
Clear separation of organizational responsibilities and maintenance. This includes independent teams and release/maintenance cycles for each service.
Embrace of automation on all layers (this is how cloud virtualization, DevOps, CI/CD, infrastructure as code, etc. became popular tools).
Design focus on failure-resilient operation, e.g. avoiding cascading failures, defined failure states, fallbacks, etc. (see the sketch below).
A consequence of those goals is also the embrace of distributed system architectures and design for horizontal scalability (statelessness, separation of compute and storage, etc.).
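To make the failure-resilience point concrete, here is a minimal circuit-breaker sketch in Python; the thresholds and names are illustrative, not taken from any particular library:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after max_failures consecutive errors
    the circuit opens and calls fail fast until reset_timeout elapses."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0           # success closes the circuit again
        return result
```

Failing fast like this is one way a downstream outage is kept from cascading into the callers, which is the point of the resilience bullet above.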
So if you feel your designs are along those lines, then in my opinion you have applied the microservices architectural style successfully.
If you want to learn more about this school of thought, I recommend this read.

Any established practices for implementing coordinated operations across multiple APIs?

I am writing a service that has to talk to multiple APIs and also perform some transactions against a DB. Obviously any step in the whole sequence of async calls might fail, and then I guess I would have to roll back the steps that have completed so far. In the case of the DB, the obvious solution is to use transactions: if any step fails, the transaction simply gets rolled back. But what about the same thing happening in a hybrid distributed environment?
What are best practices here? Are there any at all?
Considering that you have a hybrid distributed environment and Two-Phase Commit transactions are not a valid option, you could apply the Saga Pattern.
The basic idea is to make your services/modules execute local transactions and coordinate the distributed transaction among them, providing compensating operations to undo changes when necessary.
It is a complicated pattern and adds a lot of complexity to your system. However, it provides a well-designed solution to that problem.
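To illustrate, here is a minimal orchestration-style Saga sketch in Python; the step names are hypothetical, and a real implementation would also persist saga state and handle retries:

```python
# Minimal orchestration-style Saga sketch. Each step pairs a local
# transaction with a compensating action; on failure, completed steps
# are compensated in reverse order.

def run_saga(steps):
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            try:
                compensate()    # best-effort undo; log failures in practice
            except Exception:
                pass            # failed compensations need manual follow-up
        raise

steps = [
    (lambda: print("reserve inventory"), lambda: print("release inventory")),
    (lambda: print("charge payment"),    lambda: print("refund payment")),
    (lambda: print("create shipment"),   lambda: print("cancel shipment")),
]
run_saga(steps)
```

Real systems typically choose between this orchestration style (a central coordinator) and choreography (each service reacts to events), a trade-off the references below discuss.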
References:
https://microservices.io/patterns/data/saga.html
https://blog.couchbase.com/saga-pattern-implement-business-transactions-using-microservices-part/
Regardless of whether you use only REST APIs for communication, or other protocols or methods such as WebSockets as well, you can apply the Saga Pattern.
The implementation would require an abstraction over the different types of calls, but it would still be considered a Saga pattern.
The orchestration of the Saga happens at the application level, so you can control it regardless of whether the transport is REST, WebSockets, WSDL, or something else.
In addition to Leonardo's sources, please have a look at this answer.

Connection between programs over the network

I want to dive into the whole diversity of tools which provide connections between programs over a network.
To clarify the question, I divide it into subquestions:
Why were certain groups of programs (or specific tools/frameworks/approaches, together with the programming languages where they can be used) popular in each period of time? (I expect a description of the problems that were solved, a description of the tools, why those tools were considered the best solutions to those problems at the time, and why some tools lost popularity.)
What is the entire history of software communication over the network (tool/approach popularity, decade by decade)?
What are the modern solutions to this problem?
I can distinguish only two significant approaches.
RPC, RMI, and their implementations. (I saw this, but it is about a concrete problem and the specific tools to solve it; I want to see the place of this problem in the whole picture of interconnecting programs over the network. I have heard about these implementations: ONC RPC, XML-RPC, CORBA, DCOM, gRPC, but which are still active? Which are reasonable to use? Which are preferable, and why? I want answers not to be opinion-based, so I accept answers like "technology A is better than technology B for problem X because ..." only if there is reliable research, statistics, or facts behind them.) I heard that RPC and RMI were popular 10 years ago. Are they still?
Web services: REST, SOAP.
Am I missing something? Maybe there are technologies which solve the problem in a completely new way? Maybe there are technologies which can be treated as replacements for RPC (RMI) and Web Services? Can we replace RPC (RMI) with REST for any task? Can we replace RPC (RMI) with REST only for modern tasks? Should I separate the technologies not as RPC and Web Services, but in some other manner?
As a partial answer, I can give you my feedback on the use of RabbitMQ.
As explained here, it provides a lot of different ways to use it:
RPC, by implementing a "callback" queue
One-to-one and one-to-many routing strategies to propagate your events through your whole infrastructure and target the right destination
It comes with the ability to persist messages to avoid losing data when a crash occurs, but also with plugins to extend its possibilities (e.g. the x-delayed plugin)
This technology, written in Erlang, is powerful and a must-try in terms of communication between programs.
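As a sketch of the "callback queue" RPC style, here is roughly what the client side looks like with the pika library, assuming a broker on localhost and a server consuming from a queue named rpc_queue (both are assumptions of this example):

```python
# RPC-over-RabbitMQ "callback queue" sketch (client side) using pika.
import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Exclusive, broker-named queue where the server will send its reply.
result = channel.queue_declare(queue="", exclusive=True)
callback_queue = result.method.queue

response, corr_id = None, str(uuid.uuid4())

def on_response(ch, method, props, body):
    global response
    if props.correlation_id == corr_id:   # match the reply to our request
        response = body

channel.basic_consume(queue=callback_queue,
                      on_message_callback=on_response,
                      auto_ack=True)

channel.basic_publish(
    exchange="",
    routing_key="rpc_queue",              # server-side queue name (assumed)
    properties=pika.BasicProperties(
        reply_to=callback_queue,          # tell the server where to answer
        correlation_id=corr_id,
    ),
    body=b"30",
)
while response is None:                   # drive I/O until the reply arrives
    connection.process_data_events(time_limit=1)
print("Got:", response)
```

The reply_to and correlation_id properties are what turn a fire-and-forget queue into a request/response channel.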
To your question "Am I missing something": yes.
Very popular communication patterns are the so-called event-driven or message-driven protocols. These protocols are often used in distributed systems such as web applications, microservices, and IoT environments. The communication is completely asynchronous and allows building scalable and loosely coupled systems.
There are many different frameworks and methods for event-driven systems, like WebSockets, WebHooks, pub-sub, and messaging libraries such as ActiveMQ, OpenMQ, RabbitMQ, ZeroMQ, and MQTT.
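As a small taste of the pub-sub style, here is a minimal ZeroMQ (pyzmq) example; the endpoint address and topic name are arbitrary choices for the sketch:

```python
# Minimal ZeroMQ publish/subscribe sketch. One socket publishes events
# on a topic; any number of subscribers receive them without the
# publisher knowing who they are, which is the loose coupling described above.
import time
import zmq

ctx = zmq.Context.instance()

pub = ctx.socket(zmq.PUB)
pub.bind("tcp://127.0.0.1:5556")

sub = ctx.socket(zmq.SUB)
sub.connect("tcp://127.0.0.1:5556")
sub.setsockopt_string(zmq.SUBSCRIBE, "orders")   # prefix-based topic filter

time.sleep(0.2)                 # give the subscription time to propagate
pub.send_string("orders order-42 created")
print(sub.recv_string())        # -> "orders order-42 created"
```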
Hope this info helps with your research.

How to implement XEP-0289 FMUC plugin on a XMPP server?

I need to implement a distributed XMPP MUC application along the lines of XEP-0289, minus some of the features; in essence I want to have a bare-bones implementation of the plugin. My concern is to address fault tolerance, and as of now I do not want to worry about the performance considerations specified in XEP-0289.
I have looked into SleekXMPP as a tool to develop server-side plugins, but I don't know how comfortable it would be to use for such an implementation. Other options I have looked at are Openfire and Tigase. I am comfortable with Python/Java, and other key features to consider would be good documentation, ease of use, etc. Keeping that in mind, I would like to know the preferred path to take for this development.
Any guidance will be appreciated.
You should be able to write a MUC component that includes FMUC (or similar). The general way to do this would be to use a library that supports XEP-0114 components (e.g. SleekXMPP (Python) or Swiften (C++)) and implement MUC+FMUC through that. You haven't said what your concerns with SleekXMPP are, but it's a fairly well-respected library in the XMPP community, so it seems a fair choice (I'd pick Swiften, but I'm biased as one of the authors).
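For orientation, here is the bare skeleton such a SleekXMPP component might start from. It only wires up the XEP-0114 component plumbing; the FMUC logic itself is stubbed, and the JID, secret, and port are placeholders that must match your server's external-component configuration:

```python
# Skeleton of an external XEP-0114 component in SleekXMPP, the kind of
# shell a MUC+FMUC implementation would hang its room logic off.
from sleekxmpp.componentxmpp import ComponentXMPP

class FMUCComponent(ComponentXMPP):
    def __init__(self, jid, secret, server, port):
        super(FMUCComponent, self).__init__(jid, secret, server, port)
        # Room state and federation links to remote room nodes live here.
        self.add_event_handler("message", self.on_message)
        self.add_event_handler("presence_available", self.on_join)

    def on_join(self, presence):
        # XEP-0289 logic: decide whether the occupant is local or must
        # be mirrored to a federated remote room node.
        pass

    def on_message(self, msg):
        if msg["type"] == "groupchat":
            pass  # fan out to local occupants and federated remote nodes

if __name__ == "__main__":
    xmpp = FMUCComponent("muc.example.com", "secret", "localhost", 5347)
    xmpp.connect()
    xmpp.process(block=True)
```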
Your second option (patching the server directly) isn't generally the XMPPish way of adding customisations (as it's vendor-specific), but should also work if you can find someone sufficiently familiar with the server code, or if you're willing to become so.
To achieve fault tolerance (assuming you mean resilience to server failures) you'd need to run your XMPP server clustered, and also cluster your FMUC implementation. With that done, the usual XMPP fail-over using SRV records in DNS should ensure other servers retry connections to another host.
On a side note, the next version of FMUC (XEP-0289) will have some of the features of the current revision stripped out, and a number of improvements made based on deployment experience, so if your work is not time-critical, it might be of benefit to you to read that when it's released. I also note that there exists at least one implementation of FMUC already (Isode's M-Link, on which I work), and there is interest from other vendors, so using the standard protocol might benefit you in terms of not re-inventing the wheel.

Choosing noSQL - availability prioritized

We have thought a bit about running a noSQL database for our next project. However, we're not sure which platform will give us the best possible availability and has the best built-in replication features/functions to provide it, with the least headache.
Right now, Cassandra appears to be the best candidate, but we would like to hear more about this from someone who has more experience in this area than we do.
Thanks a lot!
High availability will most likely be achieved with a Dynamo clone.
Cassandra is a good option, although it has been bashed recently by several early adopters.
Project Voldemort is also Dynamo-based and therefore easily optimized for high availability; it's what LinkedIn is using.
Another interesting noSQL option might be Membase. I haven't used it myself, but their notion of virtual buckets for rebalancing, as opposed to just consistent hashing, makes a lot of sense and would appear to provide more robust high availability.
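For context on that comparison, here is a toy Python sketch of a consistent-hash ring with virtual nodes, the placement scheme that Membase's virtual buckets are an alternative to; it is illustrative only:

```python
# Toy consistent-hash ring with virtual nodes, as used (in far more
# refined form) by Dynamo-style stores to place keys on nodes.
import bisect
import hashlib

def h(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=100):
        # Each physical node owns many points on the ring, which
        # smooths out the key distribution.
        self.ring = sorted((h("%s#%d" % (n, i)), n)
                           for n in nodes for i in range(vnodes))
        self.keys = [k for k, _ in self.ring]

    def node_for(self, key):
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect.bisect(self.keys, h(key)) % len(self.ring)
        return self.ring[idx][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.node_for("user:1234"))   # deterministic placement
# Adding or removing a node moves only ~1/N of the keys, unlike modulo
# hashing, which is why rebalancing schemes are compared against it.
```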