Building a PaaS from Scratch #2: Deployment Strategies and Setting Up the Development Environment
This is Part 2 of me trying to build a simple clone of a PaaS inspired by Google App Engine.
Researching Deployment Strategies
I started by looking into how Google deploys apps to App Engine Standard. Unfortunately, I couldn't find much information. Their documentation says:
The App Engine standard environment is based on container instances running on Google's infrastructure. Containers are preconfigured with one of several available runtimes.... Applications run in a secure, sandboxed environment,...
This told me that applications run in containers inside a sandboxed environment. I have experience with Docker and containers, but I was less familiar with the specifics of sandboxing. I knew that containerization is different from, and less complex than, virtualization, and that sandboxing means running applications in a restricted environment. However, I was unsure how to achieve this for my clone.
I kept on looking. I checked out fly.io, which I first became aware of back when I was learning Elixir. I knew they had technical articles on their blog, and from them I learned about two technologies, specifically on running containers without Dockerfiles and on sandboxing: Firecracker VM and gVisor. Firecracker VM was made by Amazon to run workloads on AWS Lambda, and gVisor was made by Google to sandbox containers on its own cloud services. I was getting close.
Later, I looked into how Cloud Run works. I had ignored it at first since I knew it did container-based deployments (CaaS) with Dockerfiles, which is different from the App Engine Standard (PaaS) approach I wanted. Reading about it now, however, I found out that you can deploy apps to Cloud Run without a Dockerfile. Cloud Run uses something called buildpacks to build a container image from your application's source code. So the next question was: what are buildpacks?
Buildpacks are a technology originally created by Heroku, now standardized as the CNCF project Cloud Native Buildpacks, that builds container images from source code without a Dockerfile. I found out it is even used by App Engine Flex when you use the env: flex option in app.yaml.
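To get a feel for how a platform can pick a runtime without a Dockerfile, here is a minimal sketch in Python of the "detect" idea: look at the files in the app directory and guess the runtime. This is only an illustration. The `MARKERS` table and the `detect_runtime` function are hypothetical names I made up for this sketch, and real buildpacks run their own, more involved detection logic.

```python
from pathlib import Path
import tempfile

# Hypothetical marker files for this sketch only. Real buildpacks each ship
# their own detect logic and are not limited to checking for a single file.
MARKERS = {
    "python": ("requirements.txt", "pyproject.toml"),
    "node": ("package.json",),
    "ruby": ("Gemfile",),
    "go": ("go.mod",),
}


def detect_runtime(app_dir):
    """Return the first runtime whose marker file exists in app_dir, else None."""
    root = Path(app_dir)
    for runtime, files in MARKERS.items():
        if any((root / name).is_file() for name in files):
            return runtime
    return None


if __name__ == "__main__":
    # Create a throwaway "app" containing only a requirements.txt file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "requirements.txt").write_text("flask\n")
        print(detect_runtime(tmp))
```

Once a runtime is detected, the build step would install dependencies and assemble the image, which is the part a Dockerfile would otherwise spell out by hand.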
I was fascinated. I started to look more into this since I hadn't really found out how apps were deployed on App Engine Standard. I asked ChatGPT about what tools App Engine Standard and AWS Elastic Beanstalk (another PaaS) use after including the fly.io links from above. It found me a GitHub Issues page that said Google uses gVisor to deploy Python apps to App Engine Standard. Later, it found me a page from Google's blog that confirmed this.
This was it. This was all I needed. I could have stopped here.
But I didn't.
I remembered learning about V8 Isolates from Cloudflare and Ryan Dahl when I was learning about Node.js and the V8 engine. The idea of quickly deploying JS/TS apps without spinning up a container and running npm install fascinated me back then. I thought I could add this to my clone.
Roadmap
I started to wonder if I could incrementally build a project that would go from simple to complex, more abstract to more bare-metal strategies.
I ended up with this roadmap:
1) Deploy static sites
2) Deploy Python apps in containers with buildpacks
3) Deploy Ruby apps in containers using Dockerfiles
4) Deploy Go apps in a sandboxed environment using gVisor
5) Deploy TS/JS apps using V8 Isolates
6) Deploy Elixir apps in microVMs using Firecracker VM
7) Deploy TS/JS, Rust, and Ruby hosted functions using Firecracker VM
8) Deploy WASM apps in V8 Isolates
9) Deploy PHP apps with LXC
10) Deploy Java apps with Kata containers
11) Deploy Rust apps with KVM+QEMU
12) Deploy C/ C++ apps with Xen virtualization
13) Deploy Linux OS containers with LXD
14) Introduce Kubernetes to handle orchestration wherever it can, moving this functionality off the control plane
For the first real app runtime, I decided to deploy Python apps in containers with buildpacks. This is not identical to App Engine Standard, but it makes sense for two reasons: 1) it is in line with my goal of deploying Python apps without a Dockerfile, and 2) gVisor is more complex than buildpacks and Dockerfiles, so it comes later in the roadmap. Interestingly, this is the strategy DigitalOcean uses for its PaaS, App Platform.
The very first step, though, is to deploy static sites, since I want to understand how IP addresses and routing work in a multi-tenant environment with isolated apps and sites. This is essential for the project, so I'm starting small.
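One common way to route requests in a multi-tenant setup is to look at the Host header and map it to a tenant. Here is a minimal sketch of that idea in Python, assuming each static site lives in its own directory. The hostnames, the directory layout, and the `resolve` function are all hypothetical and exist only for illustration; this is not necessarily how the clone will end up doing it.

```python
from pathlib import Path

# Hypothetical tenant table: hostname -> directory holding that site's files.
SITES = {
    "alice.example.test": Path("/srv/sites/alice"),
    "bob.example.test": Path("/srv/sites/bob"),
}


def resolve(host_header, url_path, sites=SITES):
    """Map a Host header and URL path to a file path inside one tenant's root.

    Returns None for unknown hosts and for paths that try to leave the root.
    """
    # Drop an optional ":port" suffix and normalize case.
    host = host_header.split(":", 1)[0].strip().lower()
    root = sites.get(host)
    if root is None:
        return None

    # Ignore the query string, then split into non-empty path segments.
    clean = url_path.split("?", 1)[0]
    segments = [s for s in clean.split("/") if s not in ("", ".")]

    # Refuse any ".." so one tenant cannot read another tenant's files.
    if ".." in segments:
        return None

    # Directory-style URLs are served from their index.html.
    if not segments or clean.endswith("/"):
        segments.append("index.html")
    return root.joinpath(*segments)


if __name__ == "__main__":
    print(resolve("alice.example.test", "/"))
    print(resolve("bob.example.test:8080", "/css/site.css?v=1"))
    print(resolve("alice.example.test", "/../bob/index.html"))
```

The lookup table keeps tenants apart at the routing layer only. Real isolation between apps still has to come from the containers and sandboxes later in the roadmap.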
I included support for hosted functions (step 7) because I've been intrigued by AWS Lambda for some time and wanted to learn how it works by building something like it. Since I'm already adding Firecracker VM to this clone, it made sense to implement this functionality too.
I also became aware of low-level virtualization techniques like LXC, Kata Containers, KVM+QEMU, Xen, and LXD through my research. I decided to incorporate these in my roadmap as well since these would be a great learning opportunity and my development environment allows me to experiment with them. I also decided to add QEMU since I am curious about adding basic GPU virtualization. I know this will teach me a lot about how GPU cloud computing works.
Development Environment
Initially, I thought I could run the Elixir control plane with Docker and Docker Compose on my Mac. However, running nested containers and VMs inside that setup could turn out to be difficult or even impossible.
I have an old Raspberry Pi which I thought I could use as a server instead. I looked into how I could deploy apps on it and stumbled upon an article that showed how to run a microcloud cluster on it. This was similar to what I wanted to do with my Pi. The article introduced me to Ubuntu Appliances, which in turn introduced me to Ubuntu Multipass. Multipass is a tool that runs on Linux, Mac, and Windows. According to its documentation, you can use it to run your own local mini-cloud, which means it can spawn Ubuntu VM instances that I can then run containers inside. I chose to use it to run an Ubuntu server, which will run the control plane.
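As a rough sketch of what launching that server could look like, here is a small Python helper that builds a `multipass launch` command. The VM name and the CPU, memory, and disk sizes are placeholder values I picked for illustration. I'm also assuming the `--name`, `--cpus`, `--memory`, and `--disk` flags here; flag names have changed between Multipass releases, so check `multipass launch --help` before relying on them.

```python
import shlex
import subprocess


def multipass_launch_cmd(name, cpus=2, memory="2G", disk="10G"):
    """Build the argument list for launching one Ubuntu VM with Multipass.

    The flag names are assumptions; verify them with `multipass launch --help`.
    """
    return [
        "multipass", "launch",
        "--name", name,
        "--cpus", str(cpus),
        "--memory", memory,
        "--disk", disk,
    ]


def launch(name, dry_run=True, **kwargs):
    """Print the command (dry run) or actually execute it."""
    cmd = multipass_launch_cmd(name, **kwargs)
    if dry_run:
        # Show the command exactly as it could be pasted into a shell.
        print(shlex.join(cmd))
        return cmd
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    # "control-plane" is a hypothetical VM name used only for this sketch.
    launch("control-plane")
```

Keeping the dry run as the default makes it easy to inspect the command first, and only pass `dry_run=False` once Multipass is installed and the flags are confirmed.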
Conclusion
So this is the plan.
I know this project will teach me a lot about containerization, virtualization, cloud computing, orchestration, networking, and operating systems.