What would be a good/correct term to designate a group of microservices? [closed]

Closed 3 years ago: this question is opinion-based and is not currently accepting answers.
I am opening this topic looking for advice on the following problem:
I am currently working with other people on a project with a microservices architecture, using cloud computing.
There are 6 different microservices, and some pairs of microservices are incompatible and therefore cannot be instantiated on the same machine.
Each microservice has a version number.
In order to launch one or more new instances of any microservice, we have to define which microservices will run on this new machine, via a static configuration.
This static configuration, which we have so far called a "deploy", contains the microservices being deployed and the version of each. (ex: (XY,[(X,v1),(Y,v2)]) - X and Y are microservices, and the XY deploy instantiates version 1 of X and version 2 of Y)
Those "deploys" also have their own version number. Altering the version number of a microservice within a deploy requires altering the version of any "deploy" containing the microservice. (ex: (XY,v1,[(X,v1),(Y,v2)]) and (XY,v2,[(X,v1),(Y,v3)]))
The question is: what would be a correct, or at least, a good term to refer to this entity that I have previously called a "deploy"?
Many developers are writing programs around our architecture and using different names for this entity, which causes syntactic and semantic incompatibility inside our team.
Of those different names, all have pros and cons:
deploy: makes sense because you are deploying all the microservices in the list. However, the term "deploy" already designates another part of our process, and the same term would be overloaded. (Deploying the XY deploy will deploy microservices X and Y on a machine.)
cluster: a good name for a group of things, but you can deploy multiple machines from one configuration, and the term "cluster" already applies to that group of machines.
service: a service would be a group of microservices. Makes sense, but many pieces of code already refer to a microservice as a 'service', which could lead to confusion. (def get_version(service) - is that a service or a microservice?)
Could any of you give us an opinion or some enlightenment on the question?
Thanks!

You might take a hint from the 12-factor App, and call them releases (http://12factor.net/build-release-run)
You then deploy a versioned release.
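For what it's worth, a minimal sketch of how such a versioned release could be modelled in Python (all names here are illustrative, not part of any standard):

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class Release:
        """A versioned, static configuration of microservices for one machine."""
        name: str                              # e.g. "XY"
        version: str                           # e.g. "v2"
        services: Tuple[Tuple[str, str], ...]  # e.g. (("X", "v1"), ("Y", "v3"))

    # The (XY, v2, [(X,v1),(Y,v3)]) example from the question, as a Release:
    xy_v2 = Release(name="XY", version="v2", services=(("X", "v1"), ("Y", "v3")))

Bumping a microservice's version then naturally produces a new Release value, matching the versioning rule described in the question.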

It sounds like you want a suitable collective noun. I suggest you Google "collective nouns" to find numerous lists. Read some of the lists and pick a noun that you think is appropriate.
Alternatively, the term cooperative (or co-op for short) might be suitable if one of the defining characteristics of such a collection of microservices is that they complement, or cooperate with, each other.

I have used the term "complex" (as in the "mortgage risk" complex vs the "compliance" complex). It seemed unambiguous.
People also used the term within a project for deployed sets of microservices (e.g. the production complex vs the test complex).

Related

Microservice Adapter: one adapter for many countries, or one per country? Architectural/deployment decision

Say I have System1 that connects to System2 through an adapter microservice between them.
System1 -> REST calls -> Adapter (converts request/response, plus some extra logic like validation) -> System2
System1 is more like a monolith and exists for many countries (but that may change).
The question is: from the perspective of microservice architecture and deployment, should the Adapter be one per country (say Adapter-UK, Adapter-AU, etc.), or should it be a single Adapter that can handle many countries at the same time?
I mean:
To have a single system/adapter service:
Advantage: one code base in one place; the adapter logic is roughly 90% the same across countries. Easy to introduce new changes.
Disadvantage: once we deploy the system and there is a bug, it could affect many countries at the same time. Not safe.
To have a separate system per country:
Disadvantage: once some generic change has been introduced to one system, it has to be "copy-pasted" into all the other countries/services. Repetitive, not-so-smart work from a developer's point of view.
Advantage: safer to change/deploy.
Q: which is the preferable way from the point of view of microservice architecture?
I would suggest the following:
the adapter covers all countries, in order to maintain a single codebase and improve code reusability
unit and/or integration tests to cope with the bugs
spawn multiple identical instances of the adapter with a load balancer in front
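As a rough illustration of keeping one codebase while isolating the country-specific 10%, here is a hypothetical Python sketch (none of these names come from the question; it is one possible shape, not the only one):

    from typing import Callable, Dict

    # Registry of country-specific tweaks; the shared 90% lives in adapt().
    _country_rules: Dict[str, Callable[[dict], dict]] = {}

    def country_rule(code: str):
        """Decorator that registers a country-specific transformation."""
        def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
            _country_rules[code] = fn
            return fn
        return register

    @country_rule("UK")
    def _uk(request: dict) -> dict:
        request["vat_scheme"] = "UK"  # illustrative country-specific logic
        return request

    def adapt(country: str, request: dict) -> dict:
        request = dict(request)       # shared validation/mapping would go here
        rule = _country_rules.get(country, lambda r: r)
        return rule(request)

New countries then only add a registered rule; the shared pipeline is untouched.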
Update, since Prabhat Mishra asked in a comment.
After two years... (it took some time to understand what I had asked).
Back then it was quite critical for me to have a resilient system, i.e. if I changed the code in one adapter I did not want all my countries to go down (it is an SAP enterprise system, with millions of clients). I wanted only one country to go down (still millions of clients, but fewer millions :)).
So, for that case: I would create many adapters, one per country, BUT I would use some code-generated common solution to create them, like a common jar, so I would not repeat my infrastructure or communication layers. Like a scaffolding tool:
country-adapter new "country-1"
add some country-specific code (not changing any generated code: less code to repeat, less to support)
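In Python terms, the "common jar plus generated scaffold" idea might look like this sketch (hypothetical names; the real thing would be a shared library plus a code generator):

    # Shared, generated layer -- never edited by hand (the "common jar" part).
    class BaseCountryAdapter:
        def handle(self, request: dict) -> dict:
            return self.transform(self.validate(request))

        def validate(self, request: dict) -> dict:
            if "payload" not in request:
                raise ValueError("missing payload")
            return request

        def transform(self, request: dict) -> dict:
            return request  # shared default behaviour

    # Hand-written, country-specific layer -- the only code a developer touches.
    class UKAdapter(BaseCountryAdapter):
        def transform(self, request: dict) -> dict:
            request = super().transform(request)
            request["region"] = "UK"  # country-specific logic only
            return request

Each country's deployable is then tiny, and a fix in the shared layer is released per country, so one bad deploy cannot take every country down.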
Otherwise, if you feel safe (e.g. code-reviewing your change and making sure you do not touch other countries' code), then one adapter is OK.
Another solution is to start with one adapter and split it up later if that becomes critical (i.e. you no longer feel safe about the potential damage/cost of a failure).
In general it all seems to boil down to the same problem: WHEN to split a "monolith" into pieces. The answer is always: when being as big as it is causes problems. But if you know your system well, you know in advance whether that WHEN is now (or not).

When should I use message libraries like ZeroMQ/RabbitMQ/ActiveMQ [closed]

Closed 6 years ago: this question is opinion-based and is not currently accepting answers.
I'm making a small project that contains some clients and a server (which manages the clients through a reactor).
I'm wondering whether I need some sort of messaging library to get them to interact, or whether plain sockets will be enough.
So in what scenarios should I use these libraries?
a)
When some more abstract (possibly even composite) formal-behaviour communication pattern is to be implemented on top of the raw point-to-point transport means, and some easy scaling & composition is needed or expected later: { XREQ/XREP | PUSH/PULL | PAIR/PAIR | PUB/SUB }
b)
When a mix of multiple transport classes is beneficial for your performance goals, using { tcp:// | ipc:// | inproc:// | epgm:// }
c)
When one does not want to get one's hands dirty scaling I/O-processing performance, and has the option to let those issues be operated and scaled by dedicated I/O threads beyond one's own work. Those concerns are left to a well-oiled, performance-optimised, fine-tuned set of threads inside the central Context(), while the programming team simply enjoys the comfort of the published messaging methods, without re-spending time on low-level details, and can concentrate on the domain-specific knowledge needed for the application under development.
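As a concrete taste of (a), here is a minimal PUB/SUB sketch using pyzmq (the port, topic, and single-process layout are just for illustration; normally publisher and subscriber are separate processes):

    import time
    import zmq  # pyzmq binding; assumes ZeroMQ is installed

    ctx = zmq.Context()

    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://*:5556")                        # arbitrary port for this sketch

    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://localhost:5556")
    sub.setsockopt_string(zmq.SUBSCRIBE, "status")  # topic filter by message prefix

    time.sleep(0.1)  # let the subscription propagate (the classic slow-joiner issue)
    pub.send_string("status server-1 up")
    print(sub.recv_string())                        # -> "status server-1 up"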
All the time!
A couple of things to bear in mind:
1) RabbitMQ needs a central server; ZeroMQ does not.
2) RabbitMQ is a proactor, ZeroMQ is a reactor. The latter is IMHO far easier to code for.
Also consider nanomsg. It's similar to ZeroMQ, but has a cleaner design that makes adding additional patterns easier (and so they have).
Things like RabbitMQ, ZeroMQ and nanomsg are merely byte shifters. To make them useful it's a good idea to serialise messages. If your system is a single language (e.g. C#), then there's nothing wrong with using whatever serialisation facilities are built into that language. However, you may also wish to consider something like Google Protocol Buffers, XSD/XML, ASN.1, or Avro. These are all language-independent serialisers, allowing you to develop a distributed heterogeneous system. JSON is attractive in some senses (everyone's using it), though from what I've seen of tools that turn JSON schemas into code, they're far from mature.
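A minimal example of the serialise-then-ship step, using Python's built-in json (a schema-based serialiser like Protocol Buffers would slot into exactly the same place):

    import json

    # Serialise before sending: the message library only moves bytes.
    message = {"type": "order", "id": 42, "qty": 3}
    wire_bytes = json.dumps(message).encode("utf-8")

    # ... wire_bytes travel through ZeroMQ / RabbitMQ / nanomsg ...

    # Deserialise on the receiving side.
    received = json.loads(wire_bytes.decode("utf-8"))
    assert received["id"] == 42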

What distance metric for project source code makes sense? [closed]

Closed 8 years ago: this question is opinion-based and is not currently accepting answers.
I want to compare the changes of the source code of two project stages, e.g. the web application source code before it was scalable, and the scalable one.
For me it is interesting to show how many lines needed to be changed, removed or added to get from one stage to the other. I'm searching for a good distance metric that rewards less code and few code changes; the one I imagine would output a relative value:
0% = "Both projects are the same"
50% = "Half of the source code has been changed"
100% = "Both projects have nothing in common"
So far I have come up with a few candidate solutions:
diff: maybe concatenate all files into a single source file and run a diff. The problem is that less code is better, but with this solution a removal is counted as a plain change, which punishes code removal.
Levenshtein distance: calculates the changes needed to transform source code a into source code b; the result is a number of character changes. The problem again is that code removal is punished rather than rewarded.
Unified Code Count: sets up rules for counting lines of code consistently, but is not a descriptive distance metric between projects.
So I'm searching for a metric that is descriptive, rewards code removal, and only counts code changes or additions. It doesn't have to be source-code specific; both projects use the same language. My personal feeling points in the diff direction, but I have not come up with a satisfactory descriptive metric.
What would you propose?
If you want, you can make this into a really difficult research problem:
http://gate.ac.uk/sale/dd/related-work/tao-related/2007+Kagdi+Survey+for+mining+software+repositories.pdf
This approach: http://www.cs.kent.edu/~jmaletic/papers/icsm04.pdf looks like it's under active development here: http://www.srcml.org/.
There are other, more general, code-metrics tools listed here: http://www.aniche.com.br/wp-content/uploads/2013/04/scam2013.pdf (though it looks like the tool advertised by the paper is down now). Apparently Sonar has the ability to look at metrics over time: https://en.wikipedia.org/wiki/SonarQube
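If the asker's diff instinct appeals, one rough way to patch its removal-punishing flaw is to count insertions and replacements but not pure deletions. A sketch with Python's difflib (the weighting is a judgment call, not an established metric):

    import difflib

    def change_distance(old: str, new: str) -> float:
        """Fraction of the old code affected by additions/changes; deletions are free."""
        old_lines, new_lines = old.splitlines(), new.splitlines()
        sm = difflib.SequenceMatcher(None, old_lines, new_lines)
        changed = 0
        for tag, i1, i2, j1, j2 in sm.get_opcodes():
            if tag == "insert":
                changed += j2 - j1
            elif tag == "replace":
                changed += max(i2 - i1, j2 - j1)
            # "delete" is deliberately not counted, so removal is not punished
        return changed / max(len(old_lines), 1)

    print(change_distance("a\nb\nc", "a\nc"))        # pure deletion -> 0.0
    print(change_distance("a\nb\nc", "a\nx\nc\nd"))  # one replace + one insert -> ~0.67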

UML Deployment Diagram for IaaS and PaaS Cloud Systems

I would like to model the following situation using a UML deployment diagram.
A small command-and-control machine instance is spawned on an Infrastructure as a Service cloud platform such as Amazon EC2. This instance is in turn responsible for spawning additional instances and providing them with a control script, NumberCruncher.py, either via something like S3 or directly as a start-up script parameter if the program is small enough to fit into that field. My attempt to model the situation using UML deployment diagrams, under the working assumption that a machine instance is a Node, is unsatisfying for the following reasons.
The diagram seems to suggest that there will be exactly three number-cruncher nodes. Is it possible to illustrate a multiplicity of Nodes in a deployment diagram, the way one would illustrate a multiplicity of object instances using a multi-object? If this is not possible for Nodes, then this seems to be a Long Standing Issue.
Is there any way to show the equivalent of deployment regions / data-centres in the deployment diagram?
Lastly:
What about Platform as a Service? The whole "machine instance is a Node" idea completely breaks down at that point. What on earth do you do in that case? Treat the entire PaaS provider as a single node and forget about the details?
Regarding your question about deployment regions:
Is there any way to show the equivalent of deployment regions / data-centres in the deployment diagram?
I generally use Notes for that.
And your question about PaaS:
What about Platform as a Service? The whole "machine instance is a Node" idea completely breaks down at that point. What on earth do you do in that case? Treat the entire PaaS provider as a single node and forget about the details?
I would say yes to that last question. And I suppose you could take more details from the definition of the deployment model and its elements, especially the end of this paragraph:
They [Nodes] can be nested and can be connected into systems of arbitrary complexity using communication paths. Typically, Nodes represent either hardware devices or software execution environments.
and
ExecutionEnvironments represent standard software systems that application components may require at execution time.
source: http://www.omg.org/spec/UML/2.5/Beta1/

Looking for examples where knowledge of discrete mathematics is helpful [closed]

Closed 8 years ago: this question needs to be more focused and is not currently accepting answers.
Inspired after watching Michael Feathers' SCNA talk "Self-Education and the Craftsman", I am interested to hear about practical examples in software development where discrete mathematics has proved helpful.
Discrete math has touched every aspect of software development, as software development is based on computer science at its core.
http://en.wikipedia.org/wiki/Discrete_math
Read that link. You will see that there are numerous practical applications, although this wikipedia entry speaks mainly in theoretical terms.
Techniques I learned in my discrete math course from university helped me quite a bit with the Professor Layton games.
That counts as helpful... right?
There are a lot of real-life examples where map coloring algorithms are helpful, besides just coloring maps. The question on my final exam had to do with traffic-light programming for a six-way intersection.
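That exam problem is essentially greedy graph coloring: movements are vertices, conflicts are edges, and a color is a signal phase. A toy sketch in Python (the conflict data is invented):

    # Vertices are traffic movements; an edge joins two movements that conflict.
    conflicts = {
        "north-south": {"east-west", "east-left"},
        "east-west":   {"north-south", "north-left"},
        "north-left":  {"east-west"},
        "east-left":   {"north-south"},
    }

    def greedy_coloring(graph: dict) -> dict:
        color = {}
        for node in graph:  # ordering by degree often uses fewer colors
            taken = {color[nb] for nb in graph[node] if nb in color}
            color[node] = next(c for c in range(len(graph)) if c not in taken)
        return color

    # Two phases suffice here: {'north-south': 0, 'east-west': 1,
    #                           'north-left': 0, 'east-left': 1}
    print(greedy_coloring(conflicts))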
As San Jacinto indicates, the fundamentals of programming are very much bound up in discrete mathematics. Moreover, 'discrete mathematics' is a very broad term. These things perhaps make it harder to pick out particular examples. I can come up with a handful, but there are many, many others.
Compiler implementation is a good source of examples: obviously there's automata / formal language theory in there; register allocation can be expressed in terms of graph colouring; the classic data flow analyses used in optimizing compilers can be expressed in terms of functions on lattice-like algebraic structures.
A simple example of the use of directed graphs is a build system that orders the individual tasks by performing a topological sort on their dependency graph. I suspect that if you tried to solve this problem without the concept of a directed graph, you'd probably end up trying to track the dependencies all the way through the build with fiddly book-keeping code (and then finding that your handling of cyclic dependencies was less than elegant).
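A minimal Kahn's-algorithm sketch of that idea (the build graph is invented for illustration):

    from collections import deque

    def topo_sort(deps: dict) -> list:
        """Kahn's algorithm: deps maps each task to the tasks it depends on."""
        indegree = {task: len(d) for task, d in deps.items()}
        dependents = {task: [] for task in deps}
        for task, d in deps.items():
            for dep in d:
                dependents[dep].append(task)
        ready = deque(t for t, n in indegree.items() if n == 0)
        order = []
        while ready:
            task = ready.popleft()
            order.append(task)
            for nxt in dependents[task]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(order) != len(deps):
            raise ValueError("cyclic dependency detected")  # caught cleanly, no fiddly book-keeping
        return order

    # Toy build graph: linking depends on both compile steps.
    print(topo_sort({"compile_a": [], "compile_b": [], "link": ["compile_a", "compile_b"]}))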
Clearly most programmers don't write their own optimizing compilers or build systems, so I'll pick an example from my own experience. There is a company that provides road data for satnav systems. They wanted automatic integrity checks on their data, one of which was that the network should all be connected up, i.e. it should be possible to get to anywhere from any starting point. Checking the data by trying to find routes between all pairs of positions would be impractical. However, it is possible to derive a directed graph from the road network data (in such a way as it encodes stuff like turning restrictions, etc) such that the problem is reduced to finding the strongly connected components of the graph - a standard graph-theoretic concept which is solved by an efficient algorithm.
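With a graph library the whole integrity check collapses to a few lines; a sketch with networkx (assumed installed; the road graph is a toy, not the real data):

    import networkx as nx  # assumes the networkx package is available

    # Toy directed road graph: an edge means you can legally drive that way.
    roads = nx.DiGraph([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])

    # The network is fully navigable iff there is exactly one strongly
    # connected component containing every node.
    print(list(nx.strongly_connected_components(roads)))  # {'A','B','C'} and {'D'}
    print(nx.is_strongly_connected(roads))                # False: 'D' has no way back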
I've been taking a course on software testing, and 3 of the lectures were dedicated to reviewing discrete mathematics, in relation to testing. Thinking about test plans in those terms seems to really help make testing more effective.
Understanding of set theory in particular is especially important for database development.
I'm sure there are numerous other applications, but those are two that come to mind here.
Just one example of many, many...
In build systems it's popular to use topological sorting of the jobs to be done.
By "build system" I mean any system where we have to manage jobs with dependency relations.
That can be compiling a program, generating a document, constructing a building, or organizing a conference - so there are applications in task management tools, collaboration tools, etc.
I believe testing itself properly proceeds from modus tollens, a concept of propositional logic (and hence discrete math), modus tollens being:
P=>Q. !Q, therefore !P.
If you plug in "If the feature is working properly, the test will pass" for P=>Q, and then take !Q as given ("the test did not pass"), then, if all these statements are factually correct, you have a valid, sound basis for returning the feature for a fix. By contrast, many, maybe most testers operate by the principle:
"If the program is working properly, the test will pass. The test passed, therefore the program is working properly."
This can be written as: P=>Q. Q, therefore P.
But this is the fallacy of "affirming the consequent", and it does not show what the tester believes it shows. That is, they mistakenly believe that the feature has been "validated" and can be shipped. Given P=>Q, when Q is true, P may in fact be either true or false, and this can be shown with a truth table.
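The truth table makes the blind spot explicit: the only row that P=>Q rules out is P true with Q false, so observing that Q is true still leaves both values of P on the table:

    P   Q   P=>Q
    T   T    T    <- Q holds and P is true
    F   T    T    <- Q holds and P is false: the fallacy's blind spot
    T   F    F    (the only row excluded by P=>Q)
    F   F    T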
Modus tollens is core to Karl Popper's notion of science as falsification, and testing should proceed in much the same way. We're attempting to falsify the claim that the feature always works under every explicit and implicit circumstance, rather than attempting to verify that it works in the narrow sense that it can work in some prescribed way.