Load Balancing Umbraco using Kubernetes

Something that’s been on my mind for quite some time now is how Umbraco fits into the world of Kubernetes and containerization. See, there’s this notion that Kubernetes and stateful applications such as a CMS do not go hand in hand, but what if they could?

Disclaimer:
This is not a tutorial of any kind, just me thinking out loud. My apologies for the clickbait title.

How we do it today:
We have several sites set up at work where Umbraco is used as a headless CMS on an Azure Web App, sharing content through an API with numerous containerized stateless applications running in Kubernetes clusters. We’ve even implemented our own ContentService as a middleware that stores the content in MongoDB. That way Umbraco can perform updates, or even be unresponsive, and our frontend applications keep running without having to wait for an API response from Umbraco. While this setup works perfectly fine, I’m interested to see if this is the only way of using Umbraco with Kubernetes.

Wouldn’t it be possible to set up Umbraco in a Kubernetes cluster and configure it as we would in a standard load balanced setup?

When Umbraco is load balanced, the frontend servers are as close to stateless as we can possibly get, so I wonder if this should be considered a good or bad practice as far as Kubernetes applications go.

While our current setup is both robust and scalable, it’s not exactly a cheap solution, so maybe this could be an alternative for clients with a lower budget but high availability requirements?

Inspiration:
One of my all-time favourite articles in Skrift Magazine is Callum Whyte's article on Umbraco, containers and Kubernetes, and the big question “Should we care?”. The article is a few years old and mentions Kubernetes as a “Hipster fad”, but a few years later I think we can all agree that Kubernetes is here to stay and will probably be a big part of how most of us deploy applications in the near future.

The challenge:
I want to try to set up a load balanced Umbraco site running in a Kubernetes cluster where I can scale out my frontend servers to as many replicas as I wish, spread over multiple nodes, and apply auto-scaling, self-healing and rolling updates with zero downtime.

The plan:
So I’ve set aside some time in March at work to come up with a proof of concept for this kind of setup, and I thought I would share my findings and progress in a series of posts on this blog. If it turns out to be useful I might package it as a presentation for a meetup of some sort, or maybe in the end I’ll come to the conclusion that this is not a good setup, but at least I’ll have fun investigating it and hopefully I’ll learn a lot about Kubernetes along the way.

This is not a microservice setup:
I’ve mentioned in previous blog posts that most of our applications at work are moving towards a microservice architecture, where a bunch of containerized micro applications, maintained by Kubernetes, come together to form a complete website.

While the setup described in this blog post does involve both containers and Kubernetes, I would still argue that this is not a microservice architecture. This is not a bunch of micro applications working together but rather one big application deployed to several locations. What I want to see is whether I can replace the traditional load balancer with the greatness that is the Kubernetes Service. Not all applications need to be microservices, but all applications need high availability in 2021.

Windows containers:
I will probably do this in Umbraco 8, which requires Windows containers since it runs on .NET Framework. My initial thought was to do it in version 9 (.NET Core) so I could use the much smaller Linux containers, but since it’s still in beta and there is not much documentation yet, I probably won’t go down that road. Maybe I will rebuild it later this year when Umbraco on .NET Core is released.

Load Balancing:
The plan is to set up Umbraco as if it were a load balanced environment, using the Azure Kubernetes Service, where the master is deployed as a single instance on one node in the cluster. This is because Umbraco’s back-office does not support load balancing. My frontends, however, would be replicated across several nodes to take full advantage of the greatness of Kubernetes, such as zero downtime, self-healing and auto-scaling.
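
To make that split a bit more concrete, here is a rough, untested sketch of what the manifests could look like. It is a small Python script that prints Kubernetes manifests as JSON (kubectl accepts JSON as well as YAML). The names, labels, image, replica count and probe path are all placeholders made up for illustration. Using the Recreate strategy for the CMS is also an assumption on my part: it keeps two back-office instances from running side by side during a deploy, at the cost of a short back-office outage.

```python
import json

# Placeholder image name; swap in whatever registry and tag you actually use.
IMAGE = "example.azurecr.io/umbraco-site:1.0.0"

# Frontends: never take a pod down before its replacement is ready.
ROLLING = {"type": "RollingUpdate", "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1}}
# CMS: stop the old pod first so only one back-office instance ever runs (assumption).
RECREATE = {"type": "Recreate"}


def deployment(name, role, replicas, strategy):
    """Build an apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": "umbraco", "role": role}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": strategy,
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Umbraco 8 needs Windows containers, so pin pods to Windows nodes.
                    "nodeSelector": {"kubernetes.io/os": "windows"},
                    "containers": [{
                        "name": "umbraco",
                        "image": IMAGE,
                        "ports": [{"containerPort": 80}],
                        # Only send traffic to a pod once it answers HTTP requests.
                        "readinessProbe": {"httpGet": {"path": "/", "port": 80}},
                    }],
                },
            },
        },
    }


def service(name, role):
    """Build a Service that spreads traffic over all pods with the given role."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": "umbraco", "role": role},
            "ports": [{"port": 80, "targetPort": 80}],
        },
    }


manifests = [
    deployment("umbraco-cms", "cms", 1, RECREATE),
    deployment("umbraco-frontend", "frontend", 3, ROLLING),
    service("umbraco-frontend", "frontend"),
]

if __name__ == "__main__":
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The output could be piped to `kubectl apply -f -`. Auto-scaling, the CMS Service and all the Umbraco configuration are left out on purpose.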

Database:
The database cannot be stored in the container itself, since anything written inside a container is lost when the container is replaced, and running the database in its own pod is something we tried and didn't like. So for this exercise I figure I will set up an Azure database that only my CMS node can write to, while all the frontends have read-only access.

Media:
The media files cannot live in the file system on each replica, so Azure Blob Storage is probably the way to go.

Cache:
Just as in a load balanced setup, I need to set up cache busting using the cache instructions in the Umbraco database. This way, when an editor saves and publishes content, each frontend replica will flush its content/output cache and then fetch the latest content from the Azure database.
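
The general idea can be shown with a toy model. To be clear, this is not Umbraco’s actual implementation, just a sketch of the pattern: the CMS appends an instruction to a shared log, and each replica remembers the last instruction it has processed and evicts whatever the newer ones point at. The class names and the in-memory list standing in for the database table are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InstructionLog:
    """Stand-in for the cache instruction table in the shared database."""
    rows: list = field(default_factory=list)

    def append(self, content_id):
        """Record that a content item changed; return the new instruction id."""
        self.rows.append(content_id)
        return len(self.rows)  # ids are 1-based and strictly increasing

    def since(self, last_id):
        """Return (instruction_id, content_id) pairs newer than last_id."""
        return list(enumerate(self.rows[last_id:], start=last_id + 1))


@dataclass
class FrontendReplica:
    name: str
    last_seen_id: int = 0
    cache: dict = field(default_factory=dict)

    def sync(self, log):
        """Apply unseen instructions; return how many were processed."""
        pending = log.since(self.last_seen_id)
        for instruction_id, content_id in pending:
            # Evict the stale entry; the next request reloads it from the database.
            self.cache.pop(content_id, None)
            self.last_seen_id = instruction_id
        return len(pending)


if __name__ == "__main__":
    log = InstructionLog()
    replica = FrontendReplica("frontend-1", cache={1063: "<old html>"})
    log.append(1063)  # an editor publishes content item 1063 on the CMS node
    print(replica.sync(log), replica.cache)
```

Because every replica keeps its own position in the log, a pod that was just started or restarted can catch up on its own, which is what makes this pattern a decent fit for replicas that come and go.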

Exclude umbraco:
Someone mentioned on Twitter that to reduce the size of each container I could exclude the /umbraco folder on my frontend, which sounds interesting and something I will investigate.

I also need to set up redirects so that if anyone requests /umbraco on a frontend, that traffic is redirected to the CMS node.
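
In practice this would most likely end up as a rewrite rule or an ingress rule rather than application code, but the logic itself is simple enough to sketch in a few lines of Python. The CMS host name below is hypothetical, and the function is only an illustration of the matching rule, not something taken from Umbraco.

```python
from urllib.parse import urlsplit, urlunsplit

CMS_HOST = "cms.example.com"  # hypothetical host name of the CMS node


def backoffice_redirect(url):
    """Return the CMS URL to redirect to, or None if the frontend should serve it."""
    parts = urlsplit(url)
    path = parts.path.lower()
    # Match /umbraco itself and anything below it, but not e.g. /umbraco-news.
    if path == "/umbraco" or path.startswith("/umbraco/"):
        return urlunsplit((parts.scheme, CMS_HOST, parts.path, parts.query, parts.fragment))
    return None


if __name__ == "__main__":
    print(backoffice_redirect("https://www.example.com/umbraco/login?x=1"))
    print(backoffice_redirect("https://www.example.com/umbraco-news"))
```

One caveat: by default Umbraco also routes surface and API controllers under /umbraco/ (for example /umbraco/surface/ and /umbraco/api/), so a real rule would probably need exceptions for the paths the frontend itself has to serve.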

Examine:
I’m not sure yet how Examine would need to be configured: whether it needs any special configuration to work in this containerized setup, or whether I can just set it up the same way I would in a load balanced setup in Azure. Someone mentioned ExamineX, which will be interesting to check out.

Sessions:
As in any load balanced setup, I will need to handle sessions with a session state provider, and I will go with Redis since it’s what I know.

Logging:
By default Umbraco writes to log files on the file system (App_Data/Logs), and these will be complicated to read on each pod, so the logs need to be offloaded to an external logging tool of some sort, probably Papertrail or Logentries.

Machine Keys:
I need to set a common machine key for all my replicas as in any load balanced setup. (This will be handled for me during the installation process of Umbraco.)

Forms:
It’s hard to build a website today where forms of some sort are not required. I’ve actually never done a load balanced site that uses the Umbraco Forms package, but in this case I will include it just to experiment with it in a load balanced Kubernetes setup and see if I need to tweak anything or if it works out of the box.

Summary:
As you can see, this is still very much in the planning stage, so if you’ve done this kind of thing before and failed or succeeded, please let me know. So far, when I’ve raised the question on Twitter, I’ve only heard “it should work” or “it shouldn’t be a problem”, but no one has actually said “I’ve done this on a production site and can confirm that it works (or not)”, which is why I’m curious.

Am I heading for a dead end, or do you see any obvious parts that I’ve missed? Let me know and I’ll take it into account in my investigation.

Wish me luck my friends!

Cheers! ❤️