Message format/specification for distributed REST services?

I have a growing number of REST services that talk to each other with JSON. Right now the communication is direct, but it's possible that a broker will process and distribute messages later on.
This is the only message-format specification I've found so far:
https://github.com/cjus/umf/blob/master/umf.md
Are there others that would be better suited? Thanks.

You can use JSON Schema to define and validate the structure of your messages.
There is more explanatory material available at json-schema.org as well.
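For illustration, here is a minimal sketch of validating a UMF-style envelope with the everit-org/json-schema library in Java. The envelope fields and the library choice are assumptions, not part of the UMF spec:

```java
import org.everit.json.schema.Schema;
import org.everit.json.schema.SchemaLoader;
import org.everit.json.schema.ValidationException;
import org.json.JSONObject;

public class EnvelopeValidator {

    // A minimal UMF-like envelope: "to", "from", and "body" are required.
    // This schema is an illustration, not the UMF specification verbatim.
    private static final Schema SCHEMA = SchemaLoader.load(new JSONObject("""
        {
          "type": "object",
          "required": ["to", "from", "body"],
          "properties": {
            "to":   { "type": "string" },
            "from": { "type": "string" },
            "body": { "type": "object" }
          }
        }"""));

    public static void main(String[] args) {
        JSONObject msg = new JSONObject()
            .put("to", "order-service")
            .put("from", "product-service")
            .put("body", new JSONObject().put("productId", 42));
        try {
            SCHEMA.validate(msg); // throws ValidationException if the envelope is malformed
            System.out.println("message is valid");
        } catch (ValidationException e) {
            System.out.println("invalid message: " + e.getAllMessages());
        }
    }
}
```

Every service (or the future broker) can validate incoming messages against the same shared schema, so the contract lives in one place.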

Related

How can I get two platforms to exchange data by leveraging their REST APIs

I use two independent eCommerce platforms that don't officially support communication with one another; however, both expose REST APIs for managing products, orders, etc.
My question is: how can I leverage their APIs to allow the two platforms to exchange data?
Neither platform offers a workspace to code API calls and execute them on retrieved records, so I don't understand how this scenario fits into the client-server model.
Would this require a third node to be set up to act as the client with each of the eCommerce platforms acting as servers?
If so, then I surmise that this client would need to, for example, perform a GET request from Platform A, store the retrieved records in a database, then perform a POST to create the records in Platform B.
If my above understanding is correct, can anyone please advise on a platform/service with which this functionality can be built and managed? Is Postman what I'm looking for?
I'm still new to web API development, so please excuse my ignorance.
Any help would be much appreciated.
Thanks
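For what it's worth, here is a minimal sketch of the "third node" idea described above, using Java's built-in HttpClient. The endpoints and tokens are hypothetical placeholders for whatever the two platforms actually expose:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ProductSync {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // 1. GET products from Platform A (hypothetical endpoint and auth header).
        HttpRequest get = HttpRequest.newBuilder(
                URI.create("https://platform-a.example.com/api/products"))
            .header("Authorization", "Bearer <platform-a-token>")
            .GET()
            .build();
        String products = client.send(get, HttpResponse.BodyHandlers.ofString()).body();

        // 2. POST the retrieved records to Platform B (hypothetical endpoint).
        // In practice you would map Platform A's JSON shape to Platform B's here.
        HttpRequest post = HttpRequest.newBuilder(
                URI.create("https://platform-b.example.com/api/products"))
            .header("Authorization", "Bearer <platform-b-token>")
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(products))
            .build();
        System.out.println("Platform B responded: "
            + client.send(post, HttpResponse.BodyHandlers.ofString()).statusCode());
    }
}
```

This third node is indeed a client to both platforms, exactly as the question surmises; it can run anywhere (a small VM, a scheduled job) and does not need Postman, which is a testing tool rather than an integration runtime.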

Data synchronization between primary and redundant servers

I want to synchronize data among a set of REST API servers (a Spring Boot based API cluster) periodically. Any instance in the cluster should be able to broadcast new information to all others.
I don't want to use a DB here. I am trying to find a lightweight library that can be used inside the API for this purpose. Is it possible to use Atomix/Hazelcast/ZooKeeper for this? If so, it would be really helpful if someone could post some sample code.
My thanks in advance.
In Hazelcast you can do it through WAN replication.
However, that is an enterprise feature, so you have to buy a license.
Hazelcast can be used for this use case. Each of the REST instances will create an embedded Hazelcast member within its JVM. The Hazelcast members then discover each other and form the cluster. Your REST apps will use the IMap or ReplicatedMap service, a distributed key-value store (IMap can store more data, ReplicatedMap is faster). Once you write data to the IMap, all other instances see it right away.
See the code sample here: https://docs.hazelcast.com/hazelcast/latest/getting-started/get-started-java.html#complete-code-samples
This feature and the Spring integration are open-source.
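As a rough sketch of the embedded approach (assuming default multicast discovery on one network; the map name and keys are made up):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import com.hazelcast.map.listener.EntryAddedListener;

public class SharedState {
    public static void main(String[] args) {
        // Each REST instance starts an embedded member; members on the same
        // network discover each other (multicast by default) and form a cluster.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        IMap<String, String> shared = hz.getMap("shared-state"); // map name is an example

        // React to entries added by any other instance in the cluster.
        shared.addEntryListener((EntryAddedListener<String, String>) e ->
                System.out.println("new entry from the cluster: " + e.getKey()), true);

        // Writes are visible to every member right away.
        shared.put("feature-flags/search", "enabled");
    }
}
```

Run the same code on each instance: the second JVM that starts will join the first, and a put on either side fires the listener on both.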

Possible to search all topics data in Kafka?

I need a solution, preferably something built in (rather than creating my own application), that would help management search through multiple/all topics in Kafka. We are using Confluent Platform. Basically, a user should be able to search for a keyword in a UI, and it should search the current log of multiple/all Kafka topics and return the data. All the topics in our environment use JSON to communicate.
This search would enable us to track a flow; for example, multiple microservices send data from one system to another, and this flow can be tracked via a correlation ID which is present in all the JSON messages. So if someone searches for this correlation ID, he should be able to see the messages involved in the flow. This search would have more use cases later on.
We need a solution which would have minimal coding involved. We would prefer to use a UI like Kibana.
From an initial reading I suspect the solutions below, but I'm not really sure, as I am new to Confluent (I used open-source Apache Kafka earlier):
Sol 1: Use ksqlDB (I need more help on how to use it).
Sol 2: Stream all topic data to Elasticsearch with Kafka Connect's built-in plugin and use Kibana on top of Elasticsearch.
Kindly help me find the best alternative.
You could use Elastic, sure.
You could also use Splunk, though.
There is also the pdk tool offered by Pilosa that creates a distributed index over Kafka events. (no affiliation)
Another option would be distributed tracing using interceptors between the clients, rather than searching "all topics"; that sounds like what you actually need.
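For Sol 2, creating the connector is a single POST to the Kafka Connect REST API. Here is a rough sketch in Java; the hostnames, topic list, and connector settings are assumptions to adapt:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateElasticSink {
    public static void main(String[] args) throws Exception {
        // Connector definition for Confluent's Elasticsearch sink.
        // key.ignore/schema.ignore suit plain JSON messages without schemas.
        String connector = """
            {
              "name": "elastic-sink",
              "config": {
                "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
                "connection.url": "http://elasticsearch:9200",
                "topics": "orders,payments,shipments",
                "key.ignore": "true",
                "schema.ignore": "true"
              }
            }""";

        // Submit the definition to the Connect worker's REST endpoint.
        HttpRequest req = HttpRequest.newBuilder(URI.create("http://connect:8083/connectors"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(connector))
            .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
            .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode() + " " + resp.body());
    }
}
```

Once the topics are indexed, searching for a correlation ID across all of them is a single query box in Kibana, which matches the "minimal coding" requirement.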

OrientDB in Azure

We would like to use OrientDB Graph in an Azure environment. Does anybody have experience using it? We would also like to know whether OrientDB's own high-availability setup is required in the Azure cloud. Azure already offers high availability for Azure Storage, Azure Drive, and SQL; I understand that these have replication and load balancing built in.
This is super important because we prefer not to get into the business of replications and infrastructure management.
Thanks
You can spin up two or more machines, install OrientDB on them, and then configure them together as a distributed cluster. However, I haven't been able to find any simpler or easier way to do it. I am interested in this topic too.
Azure does have features such as geo-replication, which protects your data against a major data-center incident, but it doesn't provide any performance benefit and will not make your deployment highly available.
Although Azure is pretty reliable, Microsoft will occasionally reboot servers for updates, so to protect against downtime you can use affinity groups so that, of your two or more servers, one will always be online. This, however, needs to be used in conjunction with database replication and, ideally, load balancing.
It's also worth noting that OrientDB recommends clusters have an odd number of servers as this can prevent conflicts when synchronising data after a communication issue between the servers.
I am using it on Amazon, and I had to create a Java project to monitor HTTP requests, inserts, and queries. The queries are very fast, but inserting data takes longer.
I recommend this type of graph database model to reduce query times. OrientDB also handles empty fields very well compared to other databases.
If you need help with the Java project, you can reply to this post and I'll help you.
I hope it helps. Good luck.
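Not Azure-specific, but here is a rough sketch of the kind of insert/query timing check described above, using the OrientDB 3.x Java client against a remote server; the host, database name, and credentials are placeholders:

```java
import com.orientechnologies.orient.core.db.ODatabaseSession;
import com.orientechnologies.orient.core.db.OrientDB;
import com.orientechnologies.orient.core.db.OrientDBConfig;
import com.orientechnologies.orient.core.sql.executor.OResultSet;

public class OrientTiming {
    public static void main(String[] args) {
        // Placeholder host and credentials; point these at your own cluster.
        try (OrientDB orient = new OrientDB("remote:orientdb.example.com",
                                            OrientDBConfig.defaultConfig());
             ODatabaseSession db = orient.open("mydb", "admin", "admin")) {

            long t0 = System.nanoTime();
            db.command("INSERT INTO V SET name = ?", "probe").close();
            System.out.printf("insert took %d us%n", (System.nanoTime() - t0) / 1_000);

            t0 = System.nanoTime();
            try (OResultSet rs = db.query("SELECT FROM V WHERE name = ?", "probe")) {
                while (rs.hasNext()) {
                    rs.next(); // drain the result set so the query actually executes
                }
            }
            System.out.printf("query took %d us%n", (System.nanoTime() - t0) / 1_000);
        }
    }
}
```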

SOA: Joining data across multiple services

Imagine we have 2 services: Product and Order. Based on my understanding of SOA, I know that each service can have its own data store (a separate database, or a group of tables in the same database). But no Service is allowed to touch the data store of another Service directly.
Now, imagine we have stored the product and order data independently inside Product and Order Services. In the Order Service, we can identify products by their ID.
My question is: With this architecture, how can I display the list of orders and product details on the "same" page?
My understanding is that I should get the list of OrderItems from OrderService. Each OrderItem has a ProductID. Now, if I make a separate call to ProductService to retrieve the details about each Product, that would be very inefficient.
How would you approach this problem?
Cheers,
Mosh
I did some research and found 2 different solutions for this.
1- Services can cache data from other Services locally. But this requires a pub/sub mechanism: any change to the source data must be published so that the subscribing Services can update their local caches. This is costly to implement, but it is the fastest solution because the Service has the required data locally. It also increases the availability of a Service by keeping it from depending on other Services' data; in other words, if the other Service is not available, it can still do its job with its cached data.
2- Alternatively, a Service can query a "list" of objects from another Service by supplying a list of identifiers. This avoids a separate call to the target Service for each object's details (see the sketch below). This is easier to implement, but performance-wise it is not as fast as solution 1. Also, if the target Service is not available, the source Service cannot do its job.
Hope this helps others who have come across this issue.
Mosh
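For solution 2, the batch lookup is just an endpoint that accepts a list of identifiers. Here is a minimal Spring sketch; ProductRepository and Product are assumed to already exist in your Product Service:

```java
import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ProductController {

    private final ProductRepository repository; // assumed Spring Data repository for Product

    public ProductController(ProductRepository repository) {
        this.repository = repository;
    }

    // One round trip for the whole order page:
    // GET /products?ids=1,2,3 returns the details of all requested
    // products at once, instead of one GET per ProductID.
    @GetMapping("/products")
    public List<Product> byIds(@RequestParam List<Long> ids) {
        return repository.findAllById(ids);
    }
}
```

The Order page handler then makes exactly two calls: one to OrderService for the OrderItems, and one batch call here with all the collected ProductIDs.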
DB integration (which is really what you are talking about when two services share a table in a DB) is wrong at so many levels!
It completely breaks some of the major principles of software engineering:
loose coupling
encapsulation
separation of concerns
A service should be (to earn that name) completely independent, namely:
it must not rely on others to ensure the consistency and coherence of its data
it must not rely on others to guarantee the security of its data
it must not depend on external implementations (only interfaces)
Two services that share data at the DB level are unable to guarantee any of the former.
The fact that you "control" both services is completely irrelevant. Today you control... tomorrow you might want to outsource or replace one of the services. That should be as simple as ensuring the proper interfaces are in place.
Imagine two services that share a table with some varchar field in it. Now one service needs to change that field to numeric... bang, the other service stops functioning; loose coupling goes down the drain.
Most of the time the trick lies in properly defining the service scope and clearly stipulating what a service does and what it doesn't do. You should also avoid turning everything into a service: make your service granularity too fine and services will start popping up everywhere, and integration headaches will escalate.
That being said, there are some situations where data integration between services poses challenges. The main premise, though, should always be: data can belong to only one service. Data is intrinsically tied to the business logic that affects its consistency and coherence, and as such there should never be more than one service controlling any given piece of data.
Another approach would be to have some sort of data source that lives outside of the SOA services. This data source could be considered your cache of the data, your operational data store, or even a data warehouse. Extraction packages (and/or some sort of real-time mechanism) can export the data from the services. You can query this data source however you want.
The advantage of this approach is that the SOA black box is maintained, and you can swap out a service because you know exactly how it is coupled.
Disadvantage is the added complexity and maintenance overhead.
SOA is just a buzz-phrase for deploying components behind web services. How many data stores you have is entirely up to you. In some cases it makes sense to have partitioned data behind individual components; in other cases all the data lives behind one service; and in other cases many components that expose service interfaces connect to the same database via the database's connection protocol. Approach the problem by approaching the problem, not by imposing artificial constraints.
I don't think there is any principle in SOA that says services should have separate data stores; in general that is actually impractical. Yes, you can have a Product and an Order service, and the client can do the join using web service calls as you said, and this may be acceptable in some scenarios. But that doesn't mean you cannot have a specific service for a client if you already know the client's behaviour and performance requirements.
What I mean is that you can have a search service that returns orders and products with the join done in the database. This is practical and would solve your business problem.
It is unfortunate to see this whole discussion deteriorate into a "can I use a shared database in SOA or not" debate, which is totally irrelevant and does not help answer the original question at all.
More often than not, in a real-world situation the data is already stored in different systems to start with. Customer data, for example, comes from the CRM, product data from SAP, and contract data from yet another source.
The quest is not to technically bring this data together, but rather to understand that there is only one source of the data. To put it differently: there is only one owner of the data within your enterprise, who is solely responsible for maintaining it and ensuring the correct data quality.
Storing data locally for performance reasons means replicating data, which is more often than not a dangerous venture unless you have a solid caching strategy in place. I think Mosh has given some sensible answers for when you are faced with an existing application landscape.