As many organizations move to deploying SaaS (Software as a Service) applications in the cloud, they face the question of how to approach the move and which services and tools to use. For microservice development in particular, Amazon has a service that is a very good fit: EKS (Elastic Kubernetes Service). EKS supports several models for deploying applications that serve multiple organizations. One of them is the most popular and is briefly explained in the following sections.
The most common, but not very efficient or cost-effective, way of designing the architecture is the single-tenant (siloed) model, where every organization has its own environment and infrastructure. If you want to serve 5 different organizations, you need 5 standalone environments with fully independent infrastructure. The consequences are higher costs and more difficult updates and maintenance across all these isolated environments.
The solution to this issue is a multi-tenant architecture, in which a single infrastructure (environment) serves multiple tenants while still providing a highly available, resilient, and scalable system.
Sharing the underlying hardware implies that we need to logically isolate the services of different tenants. By choosing EKS for the move to a SaaS delivery model, organizations gain cost efficiency, security, scalability, and smoother deployments.
There are various ways to build multi-tenant SaaS solutions using AWS EKS. It is important to mention that there is no single best option when designing this kind of solution. Instead, there are several choices, each of which affects operational complexity, cost efficiency, implementation effort, and maintenance.
Some of the models are listed below:
As the name suggests, cluster-per-tenant means isolating each tenant in its own EKS cluster. The isolation is simple to reason about but not very cost-effective: for every organization, we would provision a new EKS cluster that is fully isolated from the others.
When using the namespace-per-tenant model, all tenants are deployed into the same cluster but are logically separated using namespaces and other native Kubernetes constructs. This model is referred to as a “silo” since resources are not shared between tenants, and it provides a good combination of isolation and cost efficiency. An advantage of this model is that each tenant has its own resources, such as ConfigMaps, Secrets, and Pods, and that security policies are easier to manage per namespace. On the other hand, it requires more configuration than the cluster-per-tenant model, where some of these settings are not needed. Also, some Kubernetes resources, such as StorageClasses and CustomResourceDefinitions, are cluster-scoped and cannot be namespaced. Overall, though, this model offers the best ratio between isolation and cost.
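To make the namespace-per-tenant idea a bit more concrete, here is a minimal sketch in Python that builds two manifests for a tenant: a Namespace and a NetworkPolicy that only admits traffic from Pods in the same namespace. The function names, the `tenant` label, and the policy name are illustrative assumptions rather than anything prescribed by EKS, and the policy is only enforced if the cluster runs a network policy engine.

```python
import json


def namespace_manifest(tenant_id: str) -> dict:
    """Build a Namespace manifest for one tenant (label key is an assumption)."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": tenant_id, "labels": {"tenant": tenant_id}},
    }


def same_namespace_only_policy(tenant_id: str) -> dict:
    """Build a NetworkPolicy that admits ingress only from Pods in the same namespace.

    Note: this would also block an ingress controller running in another
    namespace, so a real setup needs an additional rule for it.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-same-namespace-only", "namespace": tenant_id},
        "spec": {
            "podSelector": {},  # empty selector: applies to every Pod in the namespace
            "policyTypes": ["Ingress"],
            # a bare podSelector in "from" matches Pods in this namespace only
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    bundle = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [namespace_manifest("tenant1"), same_namespace_only_policy("tenant1")],
    }
    # kubectl accepts JSON as well as YAML, so this output could be piped to
    # `kubectl apply -f -`.
    print(json.dumps(bundle, indent=2))
```

Generating the manifests from a function rather than copying files by hand keeps every tenant namespace structurally identical, which is most of the point of this model.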
When using the nodes-per-tenant model, each tenant runs on its own EC2 nodes, so node-level resources (memory, network interface bandwidth) are fully isolated. However, resources are wasted in the form of spare capacity on under-utilized EC2 nodes. Also, if you have a lot of tenants, you end up with a large number of node groups, which can lead to operational overhead.
The last option is to handle isolation at the application level. This means that all tenants are deployed within the same cluster and namespace. Of course, this is the most cost-effective model, but it provides the least isolation.
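One common way to pin a tenant's Pods to that tenant's nodes is a node selector combined with a toleration for a taint placed on the tenant's node group. The sketch below assumes, purely for illustration, that the nodes carry a label and a `NoSchedule` taint with the hypothetical key `tenant`. The helper name is also made up.

```python
import copy
import json


def pin_to_tenant_nodes(pod_spec: dict, tenant_id: str, key: str = "tenant") -> dict:
    """Return a copy of a Pod spec that only schedules onto one tenant's nodes.

    Assumes the tenant's nodes are labeled `<key>=<tenant_id>` and tainted
    `<key>=<tenant_id>:NoSchedule` (both hypothetical conventions).
    """
    spec = copy.deepcopy(pod_spec)  # leave the caller's spec untouched
    # nodeSelector keeps the Pod on this tenant's nodes
    spec.setdefault("nodeSelector", {})[key] = tenant_id
    # the toleration lets the Pod onto nodes that repel everyone else
    spec.setdefault("tolerations", []).append(
        {"key": key, "operator": "Equal", "value": tenant_id, "effect": "NoSchedule"}
    )
    return spec


if __name__ == "__main__":
    base = {"containers": [{"name": "app", "image": "example/app:1.0"}]}
    print(json.dumps(pin_to_tenant_nodes(base, "tenant1"), indent=2))
```

The selector and the taint do different jobs: the selector keeps the tenant's Pods on its own nodes, while the taint keeps other tenants' Pods off them.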
The image below shows what the namespace-per-tenant isolation model looks like. Each organization in a multi-tenant SaaS application has a separate namespace, which logically divides resources within a single cluster. This reduces the cost of computing resources and enforces data privacy without the need to create a separate cluster for each organization. The image also shows a shared EKS cluster with computing resources, where each tenant belongs to a specific namespace.
Now let's focus for a bit on common services that are typically part of a SaaS environment:
*From AWS Blog
We can see three AWS services at the bottom of the diagram. Amazon Cognito stores the identities managed by the user management part of the SaaS architecture. Tenant management uses Amazon DynamoDB to store data about tenant states and attributes. In this example, AWS CodePipeline is included as the tool for provisioning each new tenant that is onboarded to the system. The tenant registration process can vary, both in the infrastructure involved and in the tools used to support it.
Now that we understand the baseline infrastructure of the SaaS environment, it is time to put tenants into this system. As previously mentioned we will have a namespace for each new tenant to create a resilient and robust isolation model.
This diagram illustrates what a multi-tenant EKS infrastructure could look like. When a user requests tenant1.example.com, the request is routed to namespace 1. The same happens for tenant2.example.com, where traffic is routed to namespace 2. Let’s step back and think about tenant namespaces. Do they host the same microservices or not? Actually, you can do both. In most cases, every time a new tenant is onboarded, a new namespace and other Kubernetes constructs are created with the same containers. Alternatively, you can host different applications for different organizations while keeping a set of shared resources.
For provisioning new tenants we’ve used the registration service mentioned in the baseline diagram. New tenant registration can be implemented in various ways using different tools. Whether we use AWS CodePipeline, CircleCI, or any other CD (Continuous Delivery) tool doesn't matter. What is important is that we have a process for creating a namespace, deploying services, and configuring routing. As you might guess, the question is how to route traffic to the appropriate tenant namespace, or rather to its Pods, whenever a new tenant is registered. As soon as a new tenant namespace is created, a separate Ingress resource is also configured for the services within that specific namespace.
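As a rough illustration of the routing step, the sketch below builds an Ingress manifest that maps a tenant's hostname to a service inside that tenant's namespace. The service name `web`, port 80, and the ingress class `nginx` are placeholder assumptions. A real onboarding pipeline would substitute whatever the platform actually deploys.

```python
import json


def tenant_ingress(
    namespace: str,
    host: str,
    service: str = "web",          # hypothetical service name
    port: int = 80,                # hypothetical service port
    ingress_class: str = "nginx",  # depends on the installed ingress controller
) -> dict:
    """Build an Ingress that routes all paths on `host` to one tenant service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": f"{service}-ingress", "namespace": namespace},
        "spec": {
            "ingressClassName": ingress_class,
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {"name": service, "port": {"number": port}}
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # One Ingress per onboarded tenant, keyed by the tenant's hostname.
    for tenant in ("tenant1", "tenant2"):
        print(json.dumps(tenant_ingress(tenant, f"{tenant}.example.com"), indent=2))
```

Because the hostname is the only thing that differs between tenants, the same function can be called from any CD tool as the last step of onboarding.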
Using an external load balancer such as an NLB (Network Load Balancer), you can route external traffic to specific services within the EKS cluster through an Ingress controller. Every incoming request hits the NLB first, and then the Ingress controller performs service-specific routing. Ingress simply adds another layer of routing and control behind the NLB. There are different options for routing external traffic to EKS services, and this is just one of them. You can learn more here.
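For a sense of how the NLB gets attached, one option is a Service of type `LoadBalancer` in front of the ingress controller, annotated to request an NLB. The sketch below uses the `service.beta.kubernetes.io/aws-load-balancer-type: nlb` annotation. Whether that annotation or a different one applies depends on how load balancers are provisioned in your cluster, and the namespace, service name, and selector labels here are assumptions.

```python
import json
from typing import Dict, Optional


def ingress_controller_service(
    namespace: str = "ingress-nginx",            # hypothetical controller namespace
    selector: Optional[Dict[str, str]] = None,   # labels on the controller Pods
) -> dict:
    """Build a LoadBalancer Service that exposes an ingress controller via an NLB."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": "ingress-controller",
            "namespace": namespace,
            "annotations": {
                # asks AWS for a Network Load Balancer instead of the default type
                "service.beta.kubernetes.io/aws-load-balancer-type": "nlb",
            },
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": selector or {"app": "ingress-controller"},
            "ports": [
                {"name": "http", "port": 80, "targetPort": 80},
                {"name": "https", "port": 443, "targetPort": 443},
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(ingress_controller_service(), indent=2))
```

With this in place there is a single NLB for the whole cluster, and per-tenant routing stays entirely in the Ingress resources.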
Since the namespace-per-tenant model runs all tenants on one underlying infrastructure, it is important to use ResourceQuota to prevent a single tenant from using a disproportionate share of memory and CPU. High usage of these resources by one tenant can starve the others. With ResourceQuota, resource usage limits can be set for each namespace.
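A minimal sketch of such a quota is shown here as a Python function that builds a ResourceQuota manifest for a tenant namespace. The numbers are arbitrary example values, not recommendations, and the function and quota names are hypothetical.

```python
import json


def tenant_quota(
    namespace: str,
    cpu_requests: str = "2",
    memory_requests: str = "4Gi",
    cpu_limits: str = "4",
    memory_limits: str = "8Gi",
    max_pods: int = 20,
) -> dict:
    """Build a ResourceQuota capping CPU, memory, and Pod count for one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu_requests,
                "requests.memory": memory_requests,
                "limits.cpu": cpu_limits,
                "limits.memory": memory_limits,
                "pods": str(max_pods),
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(tenant_quota("tenant1"), indent=2))
```

Keep in mind that once a quota covers CPU or memory, Pods in that namespace must declare the corresponding requests and limits (or inherit them from a LimitRange), otherwise they are rejected.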
Having persistent storage is not a problem either. Amazon provides services like EBS (Elastic Block Store) and EFS (Elastic File System) that can back PersistentVolumes (PVs), so storage can be assigned to and managed for each tenant seamlessly.
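From the tenant's side, storage is usually requested through a PersistentVolumeClaim inside the tenant namespace. The sketch below assumes a storage class named `gp3` already exists in the cluster. That name, the claim name, and the size are illustrative only.

```python
import json


def tenant_volume_claim(
    namespace: str,
    name: str = "tenant-data",     # hypothetical claim name
    size: str = "10Gi",            # example size
    storage_class: str = "gp3",    # assumes a StorageClass with this name exists
    access_mode: str = "ReadWriteOnce",
) -> dict:
    """Build a PersistentVolumeClaim scoped to one tenant namespace."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(tenant_volume_claim("tenant1"), indent=2))
```

`ReadWriteOnce` suits EBS-backed volumes, which attach to a single node at a time. For EFS-backed storage shared across Pods on several nodes, `ReadWriteMany` would be the more typical access mode.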
This blog post introduced the process of building multi-tenant SaaS solutions using Amazon EKS and the namespace-per-tenant model of isolation. As we have seen, this model provides a lot of flexibility and power when designing multi-tenant SaaS solutions. Since this is an overview post, implementation details are out of scope. There are a lot of topics where we could go into more detail, such as cross-tenant access, onboarding new tenants, tenant routing within EKS and the ingress controller, and many more. But hey, stay tuned for the next one ;)