Side Project Log 7/12/2023

This side project log covers work done from 4/29/2023 to 7/12/2023.

This side project log is more than a bit late due to working a busy summer internship at FUTO!

NWS Container Deployment Service

I have finally added SSL to NWS CDS. This was challenging, as it required handling ACME challenges and certificate distribution across a set of geo-distributed Kubernetes clusters. I detail the complexities of this in a previous blog post I wrote. To implement auto-created, auto-renewing SSL certificates, I implemented the solution below:

Diagram of NWS SSL Architecture

First, a user requests SSL for their NWS CDS service through the web UI, which calls the NWS API (not pictured).

Then, the NWS API calls SSLiaison (software written in-house), which adds the domain to Caddy's list of domains. Caddy will then attempt to obtain an SSL certificate from an ACME server (not pictured).

The ACME server will query NWS for the challenge response by requesting a file under /.well-known/acme-challenge/ on the domain to be verified (these are the green arrows).
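For reference, RFC 8555 defines what that file must contain for an HTTP-01 challenge: the challenge token, a dot, and the base64url-encoded SHA-256 thumbprint (RFC 7638) of the ACME account key. In the setup described here Caddy produces this response, so the following is only a minimal sketch of the format. The function names are hypothetical and not part of Caddy or SSLiaison.

```python
import base64
import hashlib
import json

# Path prefix the ACME server requests during an HTTP-01 challenge (RFC 8555).
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def b64url(data: bytes) -> str:
    """Base64url without padding, as used by ACME and JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 thumbprint: SHA-256 over the required members only,
    serialized in sorted key order with no whitespace."""
    required = {"RSA": ("e", "kty", "n"), "EC": ("crv", "kty", "x", "y")}[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in required},
        sort_keys=True,
        separators=(",", ":"),
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    """Body that must be served for the given challenge token."""
    return f"{token}.{jwk_thumbprint(account_jwk)}"


def challenge_path(token: str) -> str:
    """URL path at which the ACME server expects the key authorization."""
    return CHALLENGE_PREFIX + token
```

The response depends only on the token and the account key, which is why any server holding both can answer the challenge.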

HAProxy will re-route these requests to NWS Hill Country, which is where the NWS Management Engine (NWSME) lives (these are the orange arrows). NWSME controls what's deployed on each k8s cluster on NWS.

HAProxy in NWS Hill Country will then route this request to Caddy, which will solve the HTTP-01 challenge and then get the certificate from the ACME server. Once it does this, it will write the certificate to a directory that is bind-mounted into both Caddy and SSLiaison.
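One simple way a process could notice certificates appearing in a shared directory is to poll it and compare modification times. The sketch below assumes each certificate is stored as a `<domain>.crt` file next to a `<domain>.key` file. That layout, and the function name, are assumptions for illustration rather than a description of how SSLiaison actually watches the directory.

```python
from pathlib import Path


def find_changed_cert_pairs(root: Path, seen: dict) -> list:
    """Return (domain, cert_path, key_path) for every cert/key pair that is
    new or whose certificate changed since the previous call.

    `seen` maps certificate paths to the last modification time observed,
    so a renewed certificate is reported again."""
    changed = []
    for cert in sorted(root.rglob("*.crt")):
        key = cert.with_suffix(".key")
        if not key.exists():
            continue  # Pair is incomplete, so wait for the next poll.
        mtime = cert.stat().st_mtime_ns
        if seen.get(cert) != mtime:
            seen[cert] = mtime
            changed.append((cert.stem, cert, key))
    return changed
```

Calling this in a loop with a short sleep is enough for a low-volume workload. Filesystem notification APIs would avoid the polling delay at the cost of more moving parts.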

SSLiaison will detect this new file, convert it into a k8s manifest file, and then add it to our GitOps repo, which is hosted on GitHub.
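As a rough illustration of the conversion step, a certificate and its private key can be wrapped in a Kubernetes Secret of type `kubernetes.io/tls`, which stores both base64-encoded under the keys `tls.crt` and `tls.key`. This is a sketch under assumptions: the secret naming scheme, the default namespace, and the function names are hypothetical, and JSON is used instead of YAML only to stay within the standard library.

```python
import base64
import json


def build_tls_secret(domain: str, cert_pem: bytes, key_pem: bytes,
                     namespace: str = "default") -> dict:
    """Build a kubernetes.io/tls Secret manifest as a plain dict."""
    # Hypothetical naming scheme; it does not handle wildcard domains.
    name = "tls-" + domain.replace(".", "-").lower()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/tls",
        "metadata": {"name": name, "namespace": namespace},
        "data": {
            # Secret data values must be base64-encoded.
            "tls.crt": base64.b64encode(cert_pem).decode("ascii"),
            "tls.key": base64.b64encode(key_pem).decode("ascii"),
        },
    }


def render_manifest(secret: dict) -> str:
    """Serialize deterministically so unchanged certificates do not
    produce spurious diffs in the Git repository."""
    return json.dumps(secret, indent=2, sort_keys=True) + "\n"
```

For example, `render_manifest(build_tls_secret("example.com", cert_pem, key_pem))` yields text that could be written to a file and committed.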

From here, the certificate will be added to all the k8s clusters via Rancher Fleet.


For next steps, I'd like to revise this solution so that it doesn't have a single point of failure. Currently, if NWS Hill Country is down (which it is about 0.025% of the time), SSL certificates can't be created or renewed.
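To put the 0.025% figure in concrete terms, the short calculation below converts it into expected downtime. It assumes the downtime is spread evenly over a 365-day year, which is a simplification.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours(unavailable_fraction: float, period_hours: float) -> float:
    """Expected downtime over a period, given the unavailable fraction."""
    return unavailable_fraction * period_hours


unavailable = 0.025 / 100  # 0.025% expressed as a fraction

per_year_hours = downtime_hours(unavailable, HOURS_PER_YEAR)
per_30_days_minutes = downtime_hours(unavailable, 30 * 24) * 60

print(f"{per_year_hours:.2f} hours per year")              # 2.19 hours per year
print(f"{per_30_days_minutes:.1f} minutes per 30 days")    # 10.8 minutes per 30 days
```

That works out to a bit over two hours per year during which certificates cannot be issued or renewed.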

To do this, I will have SSLiaison implement the ACME client specification so that it can create and respond to ACME HTTP challenges. SSLiaison will run on NWS CDS (so that it's running on all of our k8s clusters and is HA) instead of running as a standalone Docker container. I'll have SSLiaison use some distributed database (probably CockroachDB) to store the HTTP challenges so that it doesn't matter which k8s cluster the challenge request from the ACME server is routed to.
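The idea of a shared challenge store can be sketched as follows: whichever replica starts the order writes the token and its response to the store, and whichever replica receives the ACME server's request reads it back. This is a minimal sketch, not the planned implementation. `ChallengeStore`, `InMemoryStore`, and `handle_challenge_request` are hypothetical names, and the in-memory dict stands in for a table in the distributed database.

```python
from typing import Optional, Protocol

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


class ChallengeStore(Protocol):
    """Interface a database-backed store would implement."""

    def put(self, token: str, key_authorization: str) -> None: ...

    def get(self, token: str) -> Optional[str]: ...


class InMemoryStore:
    """Stand-in for the distributed database. A single shared instance
    plays the role of the table every cluster can read and write."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, token: str, key_authorization: str) -> None:
        self._rows[token] = key_authorization

    def get(self, token: str) -> Optional[str]:
        return self._rows.get(token)


def handle_challenge_request(store: ChallengeStore, path: str) -> tuple:
    """Return (status_code, body) for a request arriving at any replica."""
    if not path.startswith(CHALLENGE_PREFIX):
        return 404, ""
    token = path[len(CHALLENGE_PREFIX):]
    body = store.get(token)
    if body is None:
        return 404, ""
    return 200, body


# The replica in one cluster records the challenge...
shared = InMemoryStore()
shared.put("tok123", "tok123.thumbprint")
# ...and a replica in another cluster can answer the request for it.
status, body = handle_challenge_request(shared, CHALLENGE_PREFIX + "tok123")
```

Keeping the handler behind a small interface means the database choice can change without touching the request path.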

Next Steps for NWS

Olney

Rust, ActixWeb, PostgreSQL

Olney is a new project I am starting with my friend Sridhar Nandigam. It aims to make tracking your job applications easier. Most of my friends use either spreadsheets or Trello to track their job applications, and I think that we can make something that's a bit better for the job. Some features I'd like to have are: the resume version attached to each application, job posting notifications from job boards such as pittcsc, and watching your email for emails from recruiters. Currently, I have part of the backend set up with basic CRUD operations. Now that I'm done with the latest batch of NWS work, this is next on my list to work on.


These projects had minimal/no work done on them: RingGold, SQUIRREL