The Server Stack

Apr 28, 2021

It's been a while since I made a post about something that wasn't bad, so here's something that I find good in life: my server stack. It's expanded since the posts I made on Refgd, and hardware has been chopped and changed in the year since I posted server builds there, so I'm gonna start from scratch rather than revise the existing layouts.

Overall Layout

So, not much of that mess of a diagram makes sense at first glance. It's worse in real life, because each box in the image actually represents 5-10 different machines that happen to be in the same location or network. Let's work through this demonic map!

Homeworld Cluster

The homeworld cluster is the set of servers which live with me, at home (hence homeworld). I have physical access to these machines whenever I'm at home, and the entire setup can be taken offline by yanking out an SFP+ cable. As a result, this is the set of servers which does most of the command and control work for the network.

Homeworld Cluster consists of the following servers:

  • Earth - the super, ultimate, "if you get into this you can control everything" server. It's isolated from the outside internet by 2 layers of networking though, so good luck.
  • Moon - Essentially just a DVR box, it's running a variant of FreeBSD and has no purpose other than monitoring the rest of homeworld for breaches, spikes in activity, etc.
  • Mars - The 4U NAS with 80TB of used space on it. Basically, every file I've ever come across is on there, including a 0.4 terapixel (roughly 33,000 iPhone 12 photos' worth) RAW photo of Boston (no, you can't have a copy, unless you have a 2TB disk drive you want to sacrifice).
  • Jupiter - the Raspberry Pi management server. It's an old Optiplex which does nothing but manage my 40 Raspberry Pis (hence Jupiter, because all the moons) for web projects, controlling random stuff, some renders, just assorted crap to be honest (there's a sketch of this kind of fan-out just after this list).
  • Mercury - the speed machine. This thing has 4 ex-datacenter Nvidia Tesla V100s in it. It's exclusively used for data processing and rendering, and has nothing on the internal storage except the OS and applications, because there aren't enough PCIe slots for more drives (and no SATA either: the board only has M.2 NVMe, which the boot drive occupies, and U.2, which I don't want to pay for).
  • Neptune - The outbound server. Any traffic from the rest of Homeworld gets routed through Neptune before going to any other cluster (except Orion cluster). Outbound traffic (i.e. to the public internet) also goes through Neptune. It's got faster networking than I'll ever need, but hey, it was on sale.
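
For a taste of what Jupiter's job looks like, here's a rough sketch of an SSH fan-out. The hostnames, naming pattern, and command are placeholders, not the real tooling:

```python
#!/usr/bin/env python3
"""Sketch of Jupiter-style fleet management: run one command on every Pi
over SSH. Hostnames and the command are placeholders for illustration."""
import subprocess

# 40 Pis, assuming a pi-NN naming pattern under the homeworld cluster.
PIS = [f"pi-{n:02d}.homeworld.y.com" for n in range(1, 41)]

def run_everywhere(command: str) -> None:
    """Run a shell command on each Pi and report pass/fail per host."""
    for host in PIS:
        result = subprocess.run(
            ["ssh", "-o", "ConnectTimeout=5", host, command],
            capture_output=True,
            text=True,
        )
        status = "ok" if result.returncode == 0 else f"failed ({result.returncode})"
        print(f"{host}: {status}")

if __name__ == "__main__":
    run_everywhere("uptime")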

Orion Cluster

Orion is the offsite backup cluster. Thrice a week, the entire delta of the Mars server is cloned to a server in Orion, so backups are never more than 3 days behind the latest versions.

  • Orion consists of 2 servers: Orion 1 and Orion 2, which are in different physical locations, keeping in line with the 3-2-1 backup rule. Orion 1 pulls from master twice a week, and Orion 2 pulls from master once a week (a sketch of what one of these pulls might look like follows this list).
  • Orion 1 is located somewhere in Switzerland, while Orion 2 is located in Greenland. Why? Both are secure, and both are cheap.
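
A pull like this could be as simple as rsync with hardlinked snapshots. Here's a sketch; the paths, hostnames, and snapshot scheme are placeholders, not the actual mechanism:

```python
#!/usr/bin/env python3
"""Sketch of an Orion-style pull from Mars. Paths, hostnames, and the
snapshot scheme are placeholders -- the real mechanism isn't described here."""
import subprocess
from datetime import date

SOURCE = "mars.homeworld.y.com:/tank/"               # hypothetical export
DEST_ROOT = "/backups/mars"
LATEST = f"{DEST_ROOT}/latest"                       # symlink to newest snapshot
SNAPSHOT = f"{DEST_ROOT}/{date.today().isoformat()}"

# --link-dest hardlinks unchanged files against the previous snapshot,
# so each snapshot directory looks complete but only the delta costs disk.
subprocess.run(
    ["rsync", "-a", "--delete", f"--link-dest={LATEST}", SOURCE, SNAPSHOT],
    check=True,
)

# Repoint "latest" at the snapshot we just took.
subprocess.run(["ln", "-sfn", SNAPSHOT, LATEST], check=True)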

Andromeda Cluster - Outbound 1

If you've used any of the numerous web services I've made over the years (including this blog, and Schoolnotes Portal, shameless plug), your computer has probably talked to Andromeda and the Magellanic Clouds (get it, because they're "in the cloud"? I'm funny). All public web services are either hosted on or routed through Andromeda cluster, and some private ones are too.

Andromeda cluster consists of:

  • Andromeda 1, 2, 3 - All duplicates of one another, all hosted by Linode in Sydney. These are the servers that static sites are pushed from, such as Schoolnotes Portal and CSEC Cache.
  • Triangulum - Hosted by Linode in Sydney, but on a different subnet so it gets a different name: this is where my blog (the one you're reading right now) and bounce portal are hosted.
  • Triangulum 2 - A backup server that takes over when Triangulum gets overloaded.
  • Yunohost - a personal server which proxies all of my authentication stuff, just so I can have SSL when filling passwords for my own services.

The Magellanic Clouds are hosted by:

  • Cloudflare
  • Fastly
  • BunnyCDN
  • Amazon AWS
  • Google Cloud Platform

All of these servers are externally managed. They take care of caching, delivery, DDoS protection, etc.

Apollo Cluster

Apollo is a mirror cluster for Mars.Homeworld.block. Essentially, all the servers here do is make duplicates of specific file types, such as images, audio, and video, and then pass them on to Calliope, Terpsichore, and Clio, depending on what the file is being used for and where it's needed (there's a sketch of this fan-out at the end of this section).

Apollo Cluster consists of:

  • Hyacinthus - Caches video files, which are then passed on by Clio
  • Narcissus - Caches images, which are then passed on by Calliope
  • Psyche - Caches audio files, which are then passed on by Terpsichore
  • Zephyr - Deletes old unused files from Hyacinthus, because it's cheaper than upgrading Hyacinthus

I am aware that the muses are being attributed to the wrong things. One day, I'll get around to fixing it.

Clio also peers files with the Magellanic Clouds, because videos take up a lot of space and bandwidth, and maintenance is HARD.
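
And here's the promised sketch of the fan-out: pick a destination cache by file type and ship the file over. The extensions, hostnames, paths, and use of rsync are placeholders, not the real implementation:

```python
#!/usr/bin/env python3
"""Sketch of Apollo's type-based fan-out. Extensions, hostnames, and the
use of rsync are assumptions for illustration, not the real setup."""
import subprocess
from pathlib import Path

# Which cache box each file type lands on (hypothetical names/paths).
DESTINATIONS = {
    (".mp4", ".mkv", ".webm"): "hyacinthus.apollo.y.com:/cache/video/",
    (".jpg", ".png", ".webp"): "narcissus.apollo.y.com:/cache/images/",
    (".mp3", ".flac", ".ogg"): "psyche.apollo.y.com:/cache/audio/",
}

def mirror(path: Path) -> None:
    """Copy one file from the Mars mirror to the matching cache server."""
    for extensions, dest in DESTINATIONS.items():
        if path.suffix.lower() in extensions:
            subprocess.run(["rsync", "-a", str(path), dest], check=True)
            return
    # Anything else stays on Mars only.

for f in Path("/mnt/mars-mirror").rglob("*"):
    if f.is_file():
        mirror(f)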

Perseus Cluster

Perseus cluster is where all of my external management happens. TLS certificates are provisioned through Perseus, DNS records and email are both handled through Perseus, and all sorts of other weird stuff, like the mesh which holds the whole thing together, is primarily managed through Perseus (all mission-critical stuff does have fallbacks in other clusters).

Perseus cluster consists of the following:

  • ShiningLock - Provisions, manages, and deploys TLS certificates via Let's Encrypt (ShiningLock is the internal codename for Let's Encrypt). There's a sketch of this after the list.
  • DrowningLock - Proxies the Docker Swarm commands from Homeworld, so that weak containers don't lead straight back home.
  • OpenLock - Manages authentication via LDAP and a dozen other mechanisms, currently being migrated to a distributed version so there's no single point of failure.
  • Key - fallback DNS server (if Cloudflare is ever down, which it isn't), email server, and secondary healthcheck/OTX for homeworld.
  • Guard - Primary OTX, Threat Management, and Security frontend for the whole mesh. Peers with VirusTotal.
  • BYETPEER - Peer server with Byet, a 10Gb/s file-sharing server.
  • MAGNUSPEER - 1Gb/s peer to Magnus' servers in France, in exchange for use of his outbound bandwidth to accelerate Andromeda sites in the EU.
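
And the promised ShiningLock sketch: at its core, provisioning and deploying a cert is roughly this. The domains, email, and deploy step are placeholders; the only real detail from above is that Let's Encrypt does the issuing:

```python
#!/usr/bin/env python3
"""Sketch of a ShiningLock-style provisioning run. Assumes certbot in
standalone mode; domains, email, and the deploy step are hypothetical."""
import subprocess

DOMAINS = ["blog.y.com", "portal.y.com"]   # hypothetical public names
EMAIL = "admin@y.com"                      # hypothetical contact address

for domain in DOMAINS:
    # Ask Let's Encrypt for a cert via certbot's standalone HTTP-01
    # challenge; certbot keeps the existing cert if it isn't due yet.
    subprocess.run(
        [
            "certbot", "certonly", "--standalone",
            "--non-interactive", "--agree-tos",
            "-m", EMAIL, "-d", domain,
        ],
        check=True,
    )
    # "Deploys" could be as simple as shipping the live cert directory
    # to whichever server terminates TLS for it -- again, an assumption.
    subprocess.run(
        ["rsync", "-a", f"/etc/letsencrypt/live/{domain}/",
         f"triangulum.andromeda.y.com:/etc/tls/{domain}/"],
        check=True,
    )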

Naming

Internally, all servers are named under an unspecified domain, which I own but which sits behind WhoisGuard and still has its parked DNS records. For now, we'll pretend it's y.com.

Servers are named as follows:

serverName.clusterName.y.com (e.g. mars.homeworld.y.com)
serverNumber.clusterName.y.com (e.g. 1.andromeda.y.com)

Services are named as follows:

serviceName.serverName.clusterName.y.com (e.g. jellyfin.mars.homeworld.y.com)

While this can be cumbersome, public DNS has no records for these names; they only resolve once you're already inside one of the clusters, and even then, only Homeworld can see all the other clusters (most can only see Andromeda).
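
If you ever wanted to generate these names in code, it's a two-liner. Purely illustrative; nothing in the real setup actually builds names this way:

```python
"""Tiny helpers for composing internal names under the y.com scheme."""

DOMAIN = "y.com"

def server_name(server: str, cluster: str) -> str:
    """e.g. ("mars", "homeworld") -> "mars.homeworld.y.com" """
    return f"{server}.{cluster}.{DOMAIN}".lower()

def service_name(service: str, server: str, cluster: str) -> str:
    """e.g. ("jellyfin", "mars", "homeworld") -> "jellyfin.mars.homeworld.y.com" """
    return f"{service}.{server_name(server, cluster)}".lower()

assert server_name("Mars", "Homeworld") == "mars.homeworld.y.com"
assert service_name("jellyfin", "mars", "homeworld") == "jellyfin.mars.homeworld.y.com"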

And that's the hardest bits done! I might make a post later on about software, hardware, operating systems, networking, etc., because that's a whole other rat's nest, and it's nearly 3am, so I'm just gonna hit post and hope I got everything right.

Pranav Sharma

I’m a year 12 student at St Marks Catholic College. I specialise in science and mathematics, as well as full-stack software/hardware development. I am currently employed as a Network Administrator.
