I have been developing monolithic applications for many years and now want to try microservices and containers.
To learn microservices and containers, I am planning a small calendar application (a web application) where you can create dates and invite others to them. I have identified three services that I have to implement:
Authentication service -> handles login, gives back a JWT
UserData service -> handles registration, shows user data, handles edits of user data (profile picture, name, short description)
Calendar service -> for creating, editing and deleting dates, inviting others to dates, viewing dates and so on.
This is the part I feel confident about, but if I have already done something wrong, please correct me.
Where I am not sure what to do is the database and frontend parts.
Database
I have never used Neo4j or any graph-based database before, but they seem to fit this use case well and I want to learn something new. My database choice may not be relevant here, but my question is:
Should every microservice have its own database or should they access the same database?
Some data sets are connected, like every date has a creator and multiple participants.
And should the database run in a container, or should it be hosted directly on the server? (I want to self-host the database.)
For storing profile pictures I decided to use a shared container volume, so multiple instances of a service can access the same files. But I have never used containers before, so if you have a better idea, I am open to hearing it.
Frontend
Should I build a single (monolithic) frontend application, or is it useful to build some kind of micro frontend, which contains only a navigation and loads other parts, like the calendar view or user data view, from the services defined above? And if yes, should I pack the UserData frontend and the UserData service into one container?
MVP
I want the whole application to be usable by others, so they can install it on their own servers/cloud. Is it possible to pack the whole application (all services and the frontend) into one package and make it installable in Kubernetes in a single step, but in a way that each service still has its own container? In my head this sounds necessary, because the calendar service won't work without the UserData service, since every date needs a user who creates it.
Additional optional services
Imagine I want to add some additional but optional features, like real-time chat. What is the best way to check whether these features are installed? Should I add a management service that checks which services exist and tells the frontend which navigation links to show, or should the frontend ping all possible services to check whether they are installed?
I know these are many questions, but I think they are tied together, because choices in one part can influence the others.
I am looking forward to your answers.
I'm further along than you, but far from a "microservice expert". Still, I'll try my best:
Should every microservice have its own database or should they access the same database?
The rule of thumb is that a microservice should have its own database. This decouples services, so that the contract between them is just the API. It also keeps the logic within a service simpler, in that there are fewer types of data your service is responsible for handling.
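To make this concrete, here's a minimal sketch of the idea, assuming the calendar service stores only a creator ID in its own database and resolves profile details over the UserData service's API; the URL, endpoint, and field names are invented for illustration:

```python
# Hypothetical sketch: the calendar service keeps just creator_id in its own
# DB and asks the UserData service for profile details over HTTP, instead of
# joining across databases. Service URL and response shape are assumptions.
import requests

USERDATA_API = "http://userdata-service:8080"  # assumed internal URL

def get_date_with_creator(date_row: dict) -> dict:
    # date_row comes from the calendar service's own database
    resp = requests.get(f"{USERDATA_API}/users/{date_row['creator_id']}", timeout=2)
    resp.raise_for_status()
    user = resp.json()
    return {**date_row, "creator_name": user.get("name")}
```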
And should the database run in a container, or should it be hosted directly on the server?
I'm not a fan of running a database in a container. In general, a database:
can consume a lot of resources (depending on the query)
sustains long-lived connections
is something you vertically scale, rather than horizontally
These qualities make a poor case for containerization, imho.
For storing profile pictures I decided to use a shared container volume
Nothing wrong with this, but using cloud storage like Amazon S3 is a popular move here, just so you don't have to manage the state of the volume. i.e. if the volume goes away, do you still want your pictures around?
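For illustration, pushing an upload to S3 with boto3 is only a few lines; the bucket name and key layout here are placeholder assumptions:

```python
# Hedged sketch: store a profile picture in S3 instead of a shared volume.
# Bucket name and key scheme are made-up placeholders.
import boto3

s3 = boto3.client("s3")

def save_profile_picture(user_id: str, fileobj) -> str:
    key = f"profile-pictures/{user_id}.png"
    s3.upload_fileobj(fileobj, "my-calendar-media", key)  # assumed bucket
    return key
```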
Should I build a single (monolithic) frontend application?
I would say yes: this would be the simplest approach. The other big motivation for microservices is to allow separate development teams to work independently. If that's not the case here, I think you should spare yourself the headache of micro-frontends.
And if yes, should I pack the UserData frontend and the UserData service into one container?
I'm not sure I perfectly understand here. I think it's fine to have the frontend and backend service as part of the same codebase, which can get "deployed" together. How you want to serve the frontend is completely up to you and depends on how it's implemented. Some like to serve it through a CDN to minimize latency, but I've never seen the need.
Is it possible to pack the whole application (all services and frontend) into one package and make it installable in Kubernetes with a single step?
This is the use case for Helm charts: a package manager for Kubernetes. A chart bundles the Kubernetes manifests for all of your services, so a single helm install can bring up the whole stack in one step while each service keeps its own container.
Imagine I want to add some additional but optional features, like real-time chat. What is the best way to check whether these features are installed? Should I add a management service that checks which services exist and tells the frontend which navigation links to show, or should the frontend ping all possible services to check whether they are installed?
There's a thin line between what you're describing and monitoring. If it were me, I'd pick the service that radiates this information and just set some booleans in a config file or something. That's because this state isn't going to change until a reinstall, so there's no need to check it on an interval.
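As a rough sketch of what I mean (the file name and flag names are invented), the service could read the flags once at startup and expose them, so the frontend can decide which navigation links to show:

```python
# Minimal sketch, assuming a features.json written at install time,
# e.g. {"chat": true, "calendar": true}. All names are hypothetical.
import json
from flask import Flask, jsonify

app = Flask(__name__)

with open("features.json") as f:  # set once per install, never polled
    FEATURES = json.load(f)

@app.get("/features")
def features():
    # The frontend fetches this once and renders nav links accordingly.
    return jsonify(FEATURES)
```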
--
Those were my thoughts but, as with all things architecture, there's no universal truth. Sometimes exceptions are warranted, depending on your actual circumstances.
Related
I need some help with deciding on the architecture of my project (a web app for unlocking discounts). I am planning to create the website first (React for the front-end, Django for the back-end, PostgreSQL database). In the future, I may create a mobile app too for Android & iOS (unsure which front-end framework yet).
So I have decided I want the front-end and back-end to be completely separated, with the back-end as a REST API. This will let me avoid creating multiple back-ends for the mobile apps.
But after researching, I have found that this could be quite expensive in terms of server costs. This is a new business and I am the only developer, so funding isn't high. So I was thinking that I could deploy the front-end & back-end on the same server, but as separate apps that talk via nginx?
I have 4 questions about this:
If I do this, would it still be possible to reuse the back-end as a REST API for the mobile apps, or is that a no because it's linked to the web front-end?
If it is possible, would I be able to host the mobile front-end on the same server (so have everything hosted on one server)?
Is this a stupid idea - would I just be better off deploying everything to separate servers in the long run (to reduce load)?
Should I just worry about this in the future, and for now just deploy the separated web front-end & back-end to the same server?
I have never really deployed anything into a real-life production environment, so I'm sorry if my questions seem silly. I haven't started development yet, but I want to think about scalability & future extensibility before I start. Thank you.
Nowadays I'd go with a serverless approach. Instead of having servers to maintain, you can focus on your app's functionality.
There are a lot of options. You can check, for example, AWS Amplify (https://aws.amazon.com/amplify/) or Netlify (https://www.netlify.com/) for a more "full-stack" approach.
In AWS, you can also keep the projects separate, with your backend in Lambdas and your frontend served through S3 + CloudFront. Again, you don't have servers to care about.
These are only examples of how you can solve your problem without servers, but to answer your questions:
You can reuse your APIs regardless of the way your app is deployed. It will be more related to how you designed them;
Yes, you can host everything in a single server if you want, but I really don't recommend that;
If you don't want to pay for 24/7 servers, you can go for a serverless approach;
As I told you before, you can do what you want without worrying about servers.
Your main point of focus is to keep the cost low while still implementing a good solution. My suggestion would be to look at AWS Lightsail. Lightsail offers fixed-price VMs which you can configure yourself, starting from $3.50/month at the time of writing this answer.
My answers to your questions
If I do this, would it still be possible to reuse the back-end as a REST API for the mobile apps, or is that a no because it's linked to the web front-end?
Yes, it's possible. Keep the frontend and backend in different repos, and you can deploy them as Docker instances on the same server. You will have one frontend Docker container and one backend Docker container, and they can communicate with each other.
If it is possible, would I be able to host the mobile front-end on the same server (so have everything hosted on one server)?
For mobile, you will develop a mobile application which you can publish to the Play Store or deploy to a smartphone. Your app can then call the backend service and get JSON in response. So you have to design your backend in such a way that it can serve data to both kinds of clients.
Is this a stupid idea - would I just be better off deploying everything to separate servers in the long run (to reduce load)?
From a long-term design perspective, you need to consider factors like scalability, maintainability, security, etc., so it's always better to have multiple servers to avoid a single point of failure.
Should I just worry about this in the future, and for now just deploy the separated web front-end & back-end to the same server?
My advice is to think carefully now so you don't get nightmares in the future. Invest your time now and design a stable solution that will help you in the long term. You mentioned that it's a small business, but your solution should be able to handle growth easily.
My suggestion
As suggested by Paulo, S3 + CloudFront looks good for the frontend. You can get one year of free CDN using Lightsail.
For the backend, you should have at least 2 (I would suggest a minimum of 3) servers and deploy the backend Docker containers on them. You can use Docker Compose to automate the deployment. If you want orchestration, Docker Swarm mode is best. With this you will avoid a single point of failure. You can get very affordable servers from Amazon Lightsail.
For the database, you need to make it scalable. To ensure scalability and high availability, you should have a replicated DB; a minimum of 3 DB instances is a good starting point. MongoDB is a good choice: with simple configuration you can enable DB replication, with one primary and two secondary instances.
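Connecting to such a replica set from application code is straightforward; here's a hedged pymongo sketch in which the hostnames and the replica set name "rs0" are placeholders:

```python
# Sketch only: connect to a 3-member MongoDB replica set.
from pymongo import MongoClient

client = MongoClient("mongodb://db1:27017,db2:27017,db3:27017/?replicaSet=rs0")

# Writes go to the primary; the driver fails over automatically if it dies.
db = client["shop"]
db.orders.insert_one({"status": "new"})
```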
Put one load balancer in front of your servers to distribute the load. To save cost you can configure the load balancer yourself, but this adds a learning curve and you will have to spend time understanding the details. The better solution is to use a managed load balancer; Lightsail offers one for $18/month at the time of writing this answer.
The solution above is cost-effective, will give you long-term benefits, and lets you estimate the cost based on your setup.
Obviously, this can still be improved but I tried to cover the necessary aspects of the question asked.
Background:
I'm building an API service app. The app is just like any other, you send an HTTP request and receive a response. This seems simple up until I start thinking about user registration, payments, authentication, logging and so on.
Application:
tl;dr simple app diagram
Endpoints listening for HTTP requests and doing all the request-related work. This is the core of the service, what the service user would use this app for. Not directly accessible to the end user (unless they somehow know the URL). Python Flask server, deployed on Google Cloud Run.
An API gateway acting as a proxy and a single access point, forwarding requests to the endpoints. This is the service access point for end users. This part will also be responsible for authentication, limits, logging, and tracking use of the API endpoints. Python Flask server, deployed on Google Cloud Run.
A website including documentation, a demo, and a showcase of API calls through the API gateway, plus registration, payment (thinking of Stripe), etc. VueJS app on a NodeJS server on a Google Cloud Compute VM.
Database storing credentials of registered users, payment information and auth keys. Not implemented yet.
Problems:
Is this architecture proper? What could be done differently or improved? How could I further simplify the interactions between the separate parts of the app? Am I missing any essential parts?
I haven't yet implemented the database part and I'm not sure what I should use. There are plenty of options on Google Cloud. I could also go with something simple and just install a DB with an HTTP/JSON interface on a Google Cloud Compute VM. How do I choose the DB? Given such an app, what would be the best choice?
Please recommend literature/blogs/other sources of info on similar app architecture for new developers who are not familiar with it.
This is pretty open ended, but here are some general comments:
Think about how your UI will work. Are you setting up a static app served directly from cloud storage, or do you need something rendered on the server? Personally I prefer separating the UI from the API when I can, but you need to be aware of things like search engine optimization. Even if you need to render some content dynamically, your site can still be static. Take a look at static site generators like Gatsby. I haven't had to implement a server-rendered UI in years, and that makes me happy.
An API gateway might be fine, but you don't really need it for anything. It might be simpler to start without it and concentrate on what actually matters. If your APIs are being called by an external client, you can't trust the calls anyway, and any API key you might be using will be exposed. I'd say don't worry about it for a single app. That being said, if you definitely want to use a GW then use one; just be aware that it is mostly a glorified proxy and not some core part of your architecture.
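To underline the "glorified proxy" point, the core of a gateway can be sketched in a few lines of Flask; the routing table and backend URL below are invented for the example:

```python
# Hedged sketch of a "glorified proxy": forward /SERVICE/PATH to an internal
# backend. The routing table is a made-up example.
import requests
from flask import Flask, Response, request

app = Flask(__name__)
BACKENDS = {"api": "http://endpoints:8080"}  # assumed internal URL

@app.route("/<service>/<path:path>", methods=["GET", "POST", "PUT", "DELETE"])
def proxy(service, path):
    upstream = BACKENDS.get(service)
    if upstream is None:
        return Response("unknown service", status=404)
    # This is also the single place to hang auth, rate limits, and logging.
    resp = requests.request(
        method=request.method,
        url=f"{upstream}/{path}",
        params=request.args,
        data=request.get_data(),
        timeout=10,
    )
    return Response(resp.content, status=resp.status_code)
```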
Make sure your API implementations don't store any local state so you can rely on Cloud Run scaling your services up and down. Definitely don't ever store state directly inside your containers. If you need state on the server it needs to be in some external data store.
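Concretely, that means no module-level variables holding business state; a quick sketch of the difference, with the Redis host as a placeholder assumption:

```python
# Sketch: keep state out of the container so Cloud Run can scale freely.
import redis
from flask import Flask

app = Flask(__name__)

# BAD: a dict like this lives in one container and vanishes when it's recycled:
# visit_counts = {}

# OK: state lives in an external store shared by all instances.
store = redis.Redis(host="redis.internal", port=6379)  # assumed host

@app.get("/visits/<user_id>")
def visits(user_id):
    count = store.incr(f"visits:{user_id}")
    return {"user": user_id, "visits": int(count)}
```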
Use JWTs or an external IDM (that will generate JWTs) for authentication. Keep session data on the client side as much as possible and pass the JWT in every API call to authenticate the caller. If you are implementing login on your own the only APIs you need to expose without tokens are for auth and password recovery, which you can separate into their own service.
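Here's a minimal PyJWT sketch of that pattern; the secret and claim names are simplified placeholders (keep the real secret in a secrets manager):

```python
# Hedged sketch with PyJWT: the auth service issues a token at login and
# every other service only verifies the signature; no shared session store.
import datetime
import jwt  # PyJWT

SECRET = "change-me"  # placeholder; load from a secrets manager in practice

def issue_token(user_id: str) -> str:
    payload = {
        "sub": user_id,
        "exp": datetime.datetime.utcnow() + datetime.timedelta(hours=1),
    }
    return jwt.encode(payload, SECRET, algorithm="HS256")

def verify_token(token: str) -> str:
    # Raises jwt.InvalidTokenError (e.g. on expiry or a bad signature).
    claims = jwt.decode(token, SECRET, algorithms=["HS256"])
    return claims["sub"]
```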
Database selection depends on how well you understand your processes, how transactional your services are, and your existing skillset. Overall I would use what you are comfortable with; you can probably succeed with a lot of things. Certain NoSQL flavors can seem simple on the surface, but if you don't have a clear understanding of the types of queries you need to run, they can get tedious to work with. Generally you should stick to relational databases for OLAP-style implementations and consider NoSQL for OLTP. Personally I like MongoDB, and it is very popular, probably because it sort of sits in the middle of the pack, which makes it fit a lot of applications. Using MongoDB also keeps you cloud-agnostic, since it is available on every platform, whereas platform-specific database flavors can lock you down to a specific vendor.
Whatever you do, don't start installing things on VMs. You can be almost 100% sure you are doing it wrong if this comes up. Remember, the services you consume don't all have to be managed by Google or even run on GCP. You can get MongoDB capacity directly from MongoDB who manage it on your behalf on all of the Big3 cloud vendors.
At least think about the long term, even if you don't necessarily need to have it impact your architecture right now. If you expect your app to be up for years, try to make it more platform-agnostic rather than less. This might mean staying away from some really platform-specific serverless features that would force you to jump through a couple of extra hoops. If you are using Cloud Run you are using containers, which already makes your app pretty portable; don't lock it to one platform by using a lot of platform-specific features. That being said, don't avoid them entirely either. You should always go for the low-hanging fruit, so don't try to avoid using things like a secrets manager, etc. If your app has a short lifespan and you need really fast time to market, then don't worry about it.
Just my 2c, what you are doing is very generic and can be done in a lot of different ways.
We are currently using a direct DB connection to query MongoDB from our scripts and retrieve the required data.
Is it advisable / best practice to turn data retrieval from the DB into a microservice?
It does until it doesn't :)
A service needs to get its data from somewhere, and a database is a good start. If you have high loads, you may find that you need to add a cache in the middle; see this post from Instagram engineering: https://instagram-engineering.com/thundering-herds-promises-82191c8af57d
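The thundering-herd trick from that post boils down to deduplicating concurrent cache misses so that only one caller per key hits the database. A rough, simplified Python sketch of the idea:

```python
# Rough sketch of the "promise" idea from the linked post: when many callers
# miss the cache for the same key at once, one leader loads from the DB and
# the rest wait on that single in-flight load. Simplified, not production code.
import threading

_cache = {}
_inflight = {}
_lock = threading.Lock()

def get(key, load_from_db):
    if key in _cache:
        return _cache[key]
    with _lock:
        if key in _cache:              # a leader may have just filled it
            return _cache[key]
        event = _inflight.get(key)
        leader = event is None
        if leader:
            event = threading.Event()
            _inflight[key] = event
    if leader:
        try:
            _cache[key] = load_from_db(key)   # only one DB hit per key
        finally:
            with _lock:
                del _inflight[key]
            event.set()
        return _cache[key]
    event.wait()                        # followers wait for the leader
    return _cache.get(key)
```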
edit (after comment)
Generally speaking, a service should own its database, and other services shouldn't access another service's database directly, only via its API. The idea is to keep services autonomous and enable them to evolve independently.
Depending on the size of the microservice, that's not always practical, since it can make the overhead of having the service outweigh the utility it provides (I call these nanoservices). Also, if you have a lot of services, you don't want to allow each one to talk to any other (even if not via the DB), since you'd just get a huge mess. The way I see it, there should be clear logical boundaries (services or microservices), and within each such logical service you may find that it makes sense to have more than one "part" (which I call aspects), e.g. because they have different scaling needs or different suitable technologies. When you set things up this way, aspects can access the same database while services shouldn't (and you can still tame the chaos :) )
One last thing to think about: who said an API has to be a REST API? You can add views on top of the data that belongs to another service, and as long as you treat those views like an API (security, versioning, etc.), you can have other services access them as well.
I'm in the process of designing a cloud-deployed website for a new solution my company is looking to provide. I have been attempting to answer a few questions and haven't had any luck, so, when in Rome.
First, I don't want the website to be stuck with any one particular framework. I know there is no way to completely future-proof a website, but I would rather not put all of our eggs in one basket.
Secondly, I want to have a complete separation between the front end and back end. I have a list of reasons why I'm looking to do this, and I don't necessarily want to get into what they are. Server-side rendering is, for the most part, out of the question.
So where does that leave me?
My initial thought for the design is to have a REST API that can be accessed for any API calls (this may be turned into GraphQL in the future).
The design decisions I'm mostly wrestling with are for the front end. The website will be a dashboard-type system, where tenants can log in and see their screens.
I was thinking that I would have a sort of shell that hooks onto the index.html. This shell would have its own routing and would render micro-applications that are completely separate from the shell logic.
So, for example, if I load index.html with the path "/", the shell has some routes that it's responsible for, let's say:
"/todos"
"/account"
If I accessed the /todos route, my shell application would then render that micro app. This application would be completely separate from the shell, apart from some data that might be loaded via the window once the application is rendered by the shell.
So my todos route, for example, could be an independent Redux application. It could have its own routing, etc.
Is this a common architecture? Are there any examples of it? Is there a better way of going about this?
Thanks for any insight!
Sounds like you're well and truly over-engineering this beast.
You might take on such an architecture for a HUGE build with many dev teams all working separately. For a small agile team, the above would create a lot of overhead in boilerplate and brain ache from context switching between each "app".
Microservice architecture is seriously great. Just don't break it up too small; read your use case well and break your services up accordingly.
For example: we are a team of 3. We have a pretty large-ish app divided into:
PHP API
Backend management interface (Redux)
Frontend website (HTML, React, PHP)
Search service (Elasticsearch)
Cache (Redis)
Data store (MySQL)
All running in multiple Docker containers across multiple hosts. Pull down the backend... fine, the frontend website is still up and running!
Imagine we have 2 services: Product and Order. Based on my understanding of SOA, I know that each service can have its own data store (a separate database, or a group of tables in the same database). But no Service is allowed to touch the data store of another Service directly.
Now, imagine we have stored the product and order data independently inside Product and Order Services. In the Order Service, we can identify products by their ID.
My question is: With this architecture, how can I display the list of orders and product details on the "same" page?
My understanding is that I should get the list of OrderItems from OrderService. Each OrderItem has a ProductID. Now, if I make a separate call to ProductService to retrieve the details about each Product, that would be very inefficient.
How would you approach this problem?
Cheers,
Mosh
I did some research and found 2 different solutions for this.
1- Services can cache data from other Services locally. But this requires a pub/sub mechanism, so any change in the source data must be published so that subscribing Services can update their local caches. This is costly to implement, but it is the fastest solution, because the Service has the required data locally. It also increases a Service's availability by preventing it from being dependent on the data of other Services. In other words, if the other Service is not available, it can still do its job from its cached data.
2- Alternatively, a Service can query a "list" of objects from another Service by supplying a list of identifiers (see the sketch below). This avoids making a separate call to the target Service for the details of each object. This is easier to implement, but performance-wise it is not as fast as solution 1. Also, if the target Service is not available, the source Service cannot do its job.
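To illustrate solution 2, here's a hedged sketch of a batch endpoint on the Product side; the endpoint shape, field names, and the in-memory "database" are invented for the example:

```python
# Sketch of solution 2: a batch endpoint on the Product Service, so the Order
# Service can resolve many ProductIDs in one round trip instead of N calls.
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for the Product Service's own database.
PRODUCTS = {1: {"id": 1, "name": "Keyboard"}, 2: {"id": 2, "name": "Mouse"}}

@app.get("/products")
def products_by_ids():
    # e.g. GET /products?ids=1,2 returns both products in a single response
    ids = [int(i) for i in request.args.get("ids", "").split(",") if i]
    return jsonify([PRODUCTS[i] for i in ids if i in PRODUCTS])
```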
Hope this helps others who have come across this issue.
Mosh
DB integration (which is really what you are talking about when two services share a table in a DB) is wrong on so many levels!
It completely breaks some of the major principles of software engineering:
loose coupling,
encapsulation
separation of concerns
A service should be (to earn that name) completely independent, namely:
it must not rely on others to ensure the consistency and coherence of its data
it must not rely on others to guarantee the security of its data
it must not depend on external implementations (only interfaces)
Two services that share data at the DB level are unable to guarantee any of the above.
The fact that you "control" both services is completely irrelevant. Today you control them... tomorrow you might want to outsource or replace one of the services. That should be as simple as ensuring the proper interfaces are in place.
Imagine both services share a table with some field (varchar) in it. Now one service needs to change that field to numeric... bang, the other service stops functioning; loose coupling goes down the drain.
Most of the time the trick lies in properly defining the service scope and clearly stipulating what a service does and what it doesn't do. You should also avoid turning everything into a service. Set the granularity too fine and services will start popping up everywhere, and the integration headaches will escalate.
That being said, there are some situations where data integration between services poses challenges. The main premise, though, should always be: data can belong to only one service. Data is intrinsically tied to the business logic that affects its consistency and coherence, and as such there should never be more than one service controlling any given data.
Another approach would be to have some sort of data source that lives outside the SOA services. This data source could be considered a cache of the data, an operational data store, or even a data warehouse. Extraction packages (and/or some sort of real-time mechanism) can export the data from the services, and you can query this data source however you want.
The advantage of this approach is that the SOA black box is maintained, and you can swap out a service knowing how you have coupled to it.
The disadvantage is the added complexity and maintenance overhead.
SOA is just a buzz-phrase for deploying components behind web services. How many data stores you have is entirely up to you. In some cases it makes sense to have partitioned data behind individual components; in other cases all the data lives behind one service; and in yet other cases many components that expose service interfaces connect to the same database via the database's connection protocol. Approach the problem by approaching the problem, not by imposing artificial constraints.
I don't think there is any principle in SOA that says services should have separate data stores. In general it is actually impractical. Yes, you can have Product and Order services, and the client can do the join using web service calls as you said, and this may be acceptable in some scenarios. But that doesn't mean you cannot have a specific service for a client if you already know the client's behaviour and performance requirements.
What I mean is that you could have a search service that returns orders and products with the join done in the database. This is practical and would solve your business problem.
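For example (tables and columns invented, sqlite3 used only to keep the sketch self-contained), such a search service could own a query like this and return the joined result in one shot:

```python
# Illustrative sketch of a read-side search service that owns the join.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, product_id INTEGER);
    INSERT INTO products VALUES (1, 'Keyboard');
    INSERT INTO orders VALUES (10, 1);
""")

def search_orders():
    # One query answers "orders with product details" for the client.
    return conn.execute("""
        SELECT o.id, p.name
        FROM orders o JOIN products p ON p.id = o.product_id
    """).fetchall()

print(search_orders())  # [(10, 'Keyboard')]
```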
It is unfortunate to see this whole discussion deteriorating into a "can I use a shared database or not in SOA" debate, which is totally irrelevant and does not help answer the original question at all.
More often than not, in a real-world situation the data is already stored in different systems to start with. Customer data, for example, comes from the CRM, product data from SAP, contract data from yet another source.
It is not a quest to technically bring this data together, but rather an understanding that there is only one source of the data. To put it differently, there is only one owner of the data within your enterprise, who is solely responsible for maintaining it and ensuring the correct data quality.
Storing data locally for performance reasons means replicating data, which is more often than not a dangerous venture unless you have a solid caching strategy in place. I think Mosh has given some sensible answers for when you are faced with an existing application landscape.