Hardening and operating a public edge

Level: Advanced

This is the final part of Expose your homelab to the internet. By now the edge works: a Hetzner VPS owns your domain and your certificates, a WireGuard¹ tunnel reaches back into your homelab, and Caddy routes every request into your cluster. The build is done. This part covers what nobody shows you: keeping a public host alive and safe for years, and knowing exactly what happens when something breaks.

🔗 Learn more — ¹ What is a VPN (and how does WireGuard do it)?

The honest framing: you have put a machine on the public internet. That is a real, scanned, attacked surface, not a thought experiment. The whole point of the architecture was to make that surface small and recoverable — one host, almost no state, rebuildable in twenty minutes. This part hardens that one host, then walks the day-two operations: monitoring it, backing up the handful of things that matter, and the failure modes you will actually hit.

Recap the security model — what is actually exposed

Before adding any locks, it is worth being precise about what they are protecting, because the architecture is most of your security. The only thing reachable from the internet is the VPS. Its firewall accepts exactly four ports — 22/tcp (SSH), 80/tcp and 443/tcp (Caddy), and 51820/udp (WireGuard) — and nothing else. Your home network has no port forwards on the router, and your home IP address appears in no DNS² record. The only path from the VPS into your home is the WireGuard tunnel, which is encrypted and peer-locked: the VPS only talks to the one homelab peer whose public key it knows, and the homelab only accepts the VPS.

🔗 Learn more — ² What is DNS?

flowchart LR
    NET["Open internet<br/><i>scanners, bots, visitors</i>"] -->|"only 22, 80, 443, 51820"| VPS["Hetzner VPS 203.0.113.10<br/><i>Caddy · TLS · fail2ban</i>"]
    VPS -->|"WireGuard 51820/udp<br/>peer-locked, encrypted"| HOME["Homelab 10.8.0.2<br/><i>k3s / Docker · behind NAT</i>"]
    HOME -.->|"no inbound from internet"| X["Home router<br/><i>no port forwards · IP not in DNS</i>"]
    style HOME fill:#1b4332,stroke:#2d6a4f,color:#fff
    style VPS fill:#1d3557,stroke:#457b9d,color:#fff

The trust boundary: the VPS is the only host the internet can reach. Compromising it does not hand over the home network — only the tunnel to it, and that tunnel can be cut and re-keyed.

Why this is a strong posture: an attacker on the internet cannot scan, fingerprint, or directly reach anything at home; from the outside, the home network is simply not visible. They can only attack the VPS. And the VPS, by design, holds almost nothing worth stealing: no databases, no user data, just a reverse proxy and a tunnel. The valuable state lives at home, behind NAT³, reachable only through an encrypted peer-locked link. Defence in depth here means hardening the one exposed host thoroughly, and keeping the blast radius of losing it small.

🔗 Learn more — ³ What is NAT (and why CGNAT blocks you)?

ℹ️ Note — Hardening is layers, not a switch. Each section below removes one class of risk. None of them alone makes you "secure", and you do not need all of them on day one — but the SSH and WireGuard sections are non-negotiable for a public host.

Lock down SSH further

You already hardened SSH in Provision a Hetzner VPS: it is key-only and root login is disabled. That stops password guessing entirely, which is the single most important thing. What it does not stop is the constant noise: within minutes of a fresh VPS appearing, bots will hammer 22/tcp with thousands of login attempts. They will all fail, but the log spam is real and a misconfiguration could one day expose you. The standard answer is to ban the repeat offenders automatically with fail2ban: it watches the authentication log and adds a firewall rule that blocks any IP that fails too many times.

Install it first, then configure it. The package ships a default config in /etc/fail2ban/jail.conf, but you never edit that file — it gets overwritten on upgrade. Instead you create jail.local, which overrides it.

Install fail2ban (on the VPS):

sudo apt update
sudo apt install -y fail2ban

Now create a minimal override that enables the SSH jail and sets sane ban behaviour. The [DEFAULT] block applies to every jail; the [sshd] block turns on the SSH watcher specifically. bantime is how long a banned IP stays blocked, findtime is the window in which failures are counted, and maxretry is how many failures inside that window trigger a ban.

/etc/fail2ban/jail.local:

[DEFAULT]

# Ban for an hour after 5 failures within 10 minutes.
bantime  = 1h
findtime = 10m
maxretry = 5

# Never ban yourself out — add the WireGuard subnet and your own IPs.
ignoreip = 127.0.0.1/8 ::1 10.8.0.0/24

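[sshd]
enabled = true
port    = ssh
# Debian 12 sends sshd logs to the journal rather than /var/log/auth.log,
# so read from systemd (needs the python3-systemd package).
backend = systemd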

Enable the service, restart it so it picks up jail.local, and confirm the jail is active. The status command should show the sshd jail and, after a while on a public host, a non-zero count of banned IPs.

Enable and check fail2ban:

sudo systemctl enable --now fail2ban
sudo systemctl restart fail2ban      # apt may have started it before jail.local existed
sudo fail2ban-client status sshd
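To track the ban count over time rather than eyeballing it, the status output can be reduced to a single number. This is a minimal sketch that assumes the usual "Currently banned:" line in the status output; the function name banned_count is made up for illustration.

```shell
#!/usr/bin/env bash
# banned_count (hypothetical helper): read `fail2ban-client status <jail>`
# output on stdin and print the number after "Currently banned:".
banned_count() {
  awk -F: '/Currently banned/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# Typical use on the VPS (needs root, so it is left commented out here):
#   sudo fail2ban-client status sshd | banned_count
```

If you ever do ban an address you care about, sudo fail2ban-client set sshd unbanip followed by the address lifts that one ban without touching the others.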

💡 Tip — Put the WireGuard subnet 10.8.0.0/24 in ignoreip. If your homelab ever connects back through the tunnel and fumbles a key, you do not want fail2ban banning your own home.

A more modern alternative worth knowing about is CrowdSec. It does the same local log-and-ban job, but also participates in a shared reputation network — IPs that attacked other CrowdSec users get blocked on yours pre-emptively. It is heavier and adds an external dependency, so for a single edge host fail2ban is perfectly adequate; reach for CrowdSec when you want crowd-sourced blocklists.

WireGuard hardening

The tunnel is already in good shape from WireGuard tunnel: VPS to homelab, but it is worth confirming three things, because the tunnel is the one path into your home. First, the VPS firewall should accept WireGuard traffic on 51820/udp and on no other port, which it already does, since the firewall only opens 22, 80, 443, and 51820. Second, the private keys must be readable by root only. WireGuard config files contain the private key in plaintext, so file permissions are the only thing protecting them.

Lock down the WireGuard config (on the VPS and the homelab):

sudo chmod 600 /etc/wireguard/wg0.conf
sudo ls -l /etc/wireguard/wg0.conf   # expect -rw------- root root

Third, the tunnel is already peer-locked by design: each side lists exactly one [Peer] with the other's public key, so the VPS will not accept handshakes from anyone else even if they reach the port. You can tighten this further in the VPS config by setting AllowedIPs in the homelab's [Peer] entry to just 10.8.0.2/32 (only the homelab address), so the VPS accepts tunnel packets only from that address and routes only that address into the tunnel.
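A quick way to confirm that no peer entry is wider than a single host is to list any AllowedIPs value that is not a /32 (or /128 for IPv6). The sketch below is one way to do that; wide_allowed_ips is a hypothetical helper name, and entries written without an explicit prefix are reported too, so you can make them explicit.

```shell
#!/usr/bin/env bash
# wide_allowed_ips (hypothetical helper): print every AllowedIPs entry in a
# WireGuard config file that is not an explicit single-host prefix.
wide_allowed_ips() {
  local conf="$1"
  # Take the right-hand side of each AllowedIPs line, split it on commas,
  # trim whitespace, then drop blanks, IPv4 /32 and IPv6 /128 entries.
  grep -iE '^[[:space:]]*AllowedIPs[[:space:]]*=' "$conf" \
    | cut -d= -f2- \
    | tr ',' '\n' \
    | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' \
    | grep -vE '^$|^[0-9.]+/32$|/128$' \
    || true
}

# Typical use on the VPS (the file is root-only, so run it as root):
#   wide_allowed_ips /etc/wireguard/wg0.conf
```

No output means every peer is pinned to one address. Anything printed is a range worth a second look.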

⚠️ Warning — If a machine that holds a WireGuard private key is ever lost, stolen, or compromised, treat the key as burned. Generate a new keypair on the affected side, update the [Peer] public key on the other side, and bring the tunnel back up. Rotating keys is cheap; assuming a leaked key is still private is not.

Caddy security: headers, rate limiting, and auth

Caddy already gives you HTTPS for free. Two cheap additions make the front door noticeably tougher. The first is security headers: HTTP response headers that tell browsers to enforce HTTPS, refuse content-type sniffing, and limit how your pages can be framed. These cost nothing and apply to every response. You add them in a header block; placing it in the wildcard site block, or in a snippet that each site imports, applies them everywhere.

Read this before applying it: Strict-Transport-Security tells browsers to only ever reach this host over HTTPS (the max-age is two years in seconds), X-Content-Type-Options stops MIME sniffing, X-Frame-Options blocks clickjacking via framing, and Referrer-Policy trims what you leak in the Referer header. The -Server line removes the header that advertises what is serving the site.

/etc/caddy/Caddyfile (security headers, applied site-wide):

*.example.dev {
	# Keep the wildcard-cert TLS block from Part 6 — do not drop it.
	tls {
		dns cloudflare {env.CF_API_TOKEN}
	}

	header {
		# Force HTTPS for two years, including subdomains.
		Strict-Transport-Security "max-age=63072000; includeSubDomains"
		# Stop browsers guessing content types.
		X-Content-Type-Options "nosniff"
		# Disallow being framed by other sites.
		X-Frame-Options "DENY"
		# Send only the origin on cross-origin requests.
		Referrer-Policy "strict-origin-when-cross-origin"
		# Do not advertise the server software.
		-Server
	}

	reverse_proxy 10.8.0.2:80
}

Reload Caddy after editing — caddy validate first so a typo does not take the edge down.

Validate and reload Caddy:

sudo caddy validate --config /etc/caddy/Caddyfile
sudo systemctl reload caddy
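After the reload it is worth checking from the outside that the headers actually arrive, since an upstream app or a misplaced block can quietly undo them. A minimal sketch, assuming curl is available; missing_security_headers is a hypothetical helper that reads raw response headers and names the ones that are absent.

```shell
#!/usr/bin/env bash
# missing_security_headers (hypothetical helper): read raw HTTP response
# headers on stdin and print each expected security header that is absent.
missing_security_headers() {
  local headers name
  # Header names are case-insensitive, so normalise before matching.
  headers="$(tr -d '\r' | tr '[:upper:]' '[:lower:]')"
  for name in strict-transport-security x-content-type-options \
              x-frame-options referrer-policy; do
    printf '%s\n' "$headers" | grep -q "^${name}:" || printf '%s\n' "$name"
  done
}

# Typical use from any machine:
#   curl -sI https://app.example.dev | missing_security_headers
```

Empty output means all four headers from the block above are present on that response.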

The second addition is rate limiting, to blunt brute-force and scraping. Be honest about the state of this: rate limiting is not in stock Caddy. It lives in the caddy-ratelimit plugin, which means you either build a custom Caddy binary with xcaddy (xcaddy build --with github.com/mholt/caddy-ratelimit) or build a custom Docker image that adds the module to the official one. A custom build adds only the modules you name to the standard set, so also pass --with for anything you already rely on, such as the Cloudflare DNS module behind the tls block above. Once the plugin is present, a rate_limit directive caps requests per client. If you would rather not maintain another module, you can get coarse protection from fail2ban watching the Caddy access log, or simply rely on the fact that nothing valuable is reachable without authenticating to the service behind it.
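If you do go the custom-binary route, check that the new binary contains every module the Caddyfile needs before swapping it in. The sketch below filters the output of caddy list-modules. The module IDs in the usage comment (http.handlers.rate_limit and dns.providers.cloudflare) and the caddy-dns/cloudflare import path are assumptions about what this setup uses, and missing_caddy_modules is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# missing_caddy_modules (hypothetical helper): read a module listing on stdin
# and report each module ID given as an argument that is not in it.
missing_caddy_modules() {
  local listing module
  listing="$(cat)"
  for module in "$@"; do
    # Compare against the first field so trailing version info does not matter.
    printf '%s\n' "$listing" \
      | awk -v m="$module" '$1 == m { found = 1 } END { exit !found }' \
      || printf 'missing: %s\n' "$module"
  done
}

# Typical use after a custom build (module names are assumptions):
#   xcaddy build --with github.com/mholt/caddy-ratelimit \
#                --with github.com/caddy-dns/cloudflare
#   ./caddy list-modules | missing_caddy_modules \
#       http.handlers.rate_limit dns.providers.cloudflare
```

No output means the binary has both modules. A "missing:" line means the tls block or the rate_limit directive would fail to load.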

For admin-only services, the cleanest layer is to put authentication in front of the service at the edge, so an attacker never even reaches the app. Caddy has built-in basic_auth for a quick username/password gate, and supports forward_auth to delegate to a real identity provider (Authelia, Authentik, and similar). A dashboard or a database UI that has no business being public is a good candidate for an edge-level auth gate.

💡 Tip — Put basic_auth in front of anything administrative — a Grafana you forgot to secure, a Longhorn UI, a router page. It is one block in the Caddyfile and it means a leaked internal service is not instantly an open door.

TLS operations: mostly leave it alone

Caddy obtains and renews certificates automatically. In normal operation you do nothing: it renews each certificate well before expiry and swaps in the new one without a restart. The operational task is not renewing certs, it is noticing when renewal fails. Renewal failures are almost always one of two things: the ACME HTTP-01 challenge cannot reach port 80 (firewall or DNS drift), or, for a wildcard, the DNS provider API token has expired or lost permissions. Caddy logs both clearly.

Check Caddy's logs for renewal activity or errors:

sudo journalctl -u caddy --since "7 days ago" | grep -i -E "certificate|renew|acme|error"

ℹ️ Note — A wildcard certificate for *.example.dev is issued via the DNS-01 challenge, which needs Caddy to create a TXT record through your DNS provider's API. That means the API token Caddy holds must stay valid. If you rotate or revoke that token, wildcard renewal starts failing, and the only sign is errors in Caddy's log until the old certificate expires. So when you rotate DNS tokens, update the token Caddy uses in the same change.
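Because a failed renewal shows nothing outwardly until the certificate lapses, it helps to look at the remaining lifetime directly. A minimal sketch, assuming GNU date (as on Debian) and openssl; days_until_expiry is a hypothetical helper that converts the notAfter line printed by openssl x509 -enddate into whole days.

```shell
#!/usr/bin/env bash
# days_until_expiry (hypothetical helper): turn an openssl "notAfter=..." line
# into whole days remaining. Relies on GNU date, as shipped with Debian.
# Optional second argument: "now" as epoch seconds (defaults to current time).
days_until_expiry() {
  local not_after="${1#notAfter=}"
  local now="${2:-$(date -u +%s)}"
  local expiry
  expiry="$(date -u -d "$not_after" +%s)" || return 1
  echo $(( (expiry - now) / 86400 ))
}

# Typical use from any machine that can reach the site:
#   line="$(openssl s_client -servername app.example.dev \
#             -connect app.example.dev:443 </dev/null 2>/dev/null \
#           | openssl x509 -noout -enddate)"
#   days_until_expiry "$line"
```

Caddy normally renews with a comfortable share of the lifetime still left, so a number that keeps shrinking towards zero is the cue to read the renewal logs.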

Monitoring the edge

You cannot operate what you cannot see, and the edge is the one host where you genuinely want eyes. There are four things worth watching: is the VPS up at all, who is trying to log in (the SSH/auth logs), what fail2ban is banning, and what Caddy is serving (the access logs). The first three you can read directly; the access logs you enable in the Caddyfile with a log directive writing JSON, which is both human-readable and machine-parseable.

Quick manual edge check (on the VPS):

uptime                                   # load and how long it has been up
sudo journalctl -u ssh --since today     # who has tried to log in
sudo fail2ban-client status sshd         # current bans
sudo journalctl -u caddy --since "1 hour ago"

Doing this by hand is fine for a hobby edge, but the better move is to fold the VPS into the monitoring you already built. If you followed Monitoring your homelab, you have a Prometheus and Grafana stack — point it at the VPS too. Run node_exporter on the VPS and have Prometheus scrape it over the WireGuard tunnel (target 10.8.0.1:9100), so the metrics endpoint never touches the public internet. Now the VPS appears on the same dashboards as everything else: CPU, memory, disk, and uptime, all in one place.

Finally, the one check that catches the failure that matters most, the edge being down from the outside, is an external uptime probe. A free service like UptimeRobot requesting https://app.example.dev every few minutes from outside your network tells you the thing your users actually experience: can they reach it. Internal monitoring cannot tell you that, because internal monitoring is inside the tunnel. A ping-based service such as Healthchecks is a complement rather than a substitute: it alerts when your host stops checking in, which catches a dead VPS but not an unreachable site.
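If you have any machine outside your home network, the same probe is a few lines of shell on a timer. This is a sketch under stated assumptions: curl is installed, and classify_status and probe are hypothetical helper names. It treats any 2xx or 3xx response as up, so a site behind an edge auth gate that answers 401 would count as down unless you adjust the pattern.

```shell
#!/usr/bin/env bash
# classify_status (hypothetical helper): map a curl HTTP status code to up/down.
# curl reports 000 when it could not connect or the TLS handshake failed.
classify_status() {
  case "$1" in
    2??|3??) echo up ;;
    *)       echo down ;;
  esac
}

# probe (hypothetical helper): fetch a URL with a 10-second limit and classify it.
probe() {
  local code
  code="$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1")" || true
  classify_status "$code"
}

# Typical use, from a host outside the home network:
#   probe https://app.example.dev
```

Because curl verifies the certificate by default, an expired certificate also shows up as down, which is exactly the failure you want an outside check to catch.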

Backups and disaster recovery

The reason this architecture is comfortable to run is that the VPS holds almost no state. If it burns down, you lose a reverse proxy and a tunnel endpoint — not data. The data lives at home and keeps running. So the disaster-recovery story is not "restore a big backup", it is "re-provision a tiny one". The few files that actually matter, and that you must back up off the VPS:

/etc/caddy/Caddyfile — your routing and TLS⁴ config. Losing it means re-deriving every site block.
/etc/wireguard/wg0.conf — the tunnel config, including the VPS private key and the homelab peer's public key.
Any API tokens — the DNS provider token Caddy uses for wildcard certificates, kept somewhere safe (a password manager or secrets store), never only on the VPS.

🔗 Learn more — ⁴ What is TLS (and how does Let's Encrypt fit)?

Back these up the same way you back up anything else valuable — and crucially, not on the VPS itself, because the whole point is surviving the VPS dying. The homelab storage and backups practices apply: keep a copy at home and a copy off-site.
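The two config files are small enough to bundle into one archive and copy off the host. A minimal sketch, assuming GNU tar; backup_edge_config is a hypothetical helper and the archive path in the usage comment is only an example.

```shell
#!/usr/bin/env bash
# backup_edge_config (hypothetical helper): pack the given files into a
# gzip-compressed tarball that only its owner can read.
backup_edge_config() {
  local dest="$1"
  shift
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077 && tar -czf "$dest" "$@" )
}

# Typical use on the VPS, as root because wg0.conf is root-only:
#   backup_edge_config /root/edge-config.tar.gz \
#       /etc/caddy/Caddyfile /etc/wireguard/wg0.conf
# Then copy the archive off the VPS, for example with scp run from the homelab.
```

The archive contains the WireGuard private key, so store it with the same care as the key itself.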

Rebuilding the edge from scratch is then a short, repeatable runbook:

Re-provision a fresh Hetzner Debian 12 VPS and re-run the hardening from Part 3 (key-only SSH, firewall opening 22, 80, 443, 51820).
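Point your DNS A record for the domain at the new public IP. (It will not be 203.0.113.10 any more. That is fine as long as everything else refers to the VPS by name.)
Install WireGuard, restore wg0.conf, and bring the tunnel up so that it also survives reboots: sudo systemctl enable --now wg-quick@wg0. Restart the tunnel on the homelab too (sudo systemctl restart wg-quick@wg0): WireGuard resolves an Endpoint hostname only when the interface comes up, and if the Endpoint is a literal IP it needs editing to the new address first. Confirm wg show lists the homelab peer with a recent handshake.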
Install Caddy, restore the Caddyfile, restore the DNS API token, and start Caddy. It re-issues certificates on first run.

💡 Tip — Keep the runbook alongside the three things you backed up: the Caddyfile, wg0.conf, and the API token. A disaster-recovery procedure you have to reconstruct under pressure is not a procedure. The whole rebuild should be twenty minutes of typing, because the homelab, the part with the data, never went anywhere.

Failure modes, honestly

Things will break. The architecture's redeeming feature is that most failures are contained and recoverable. The ones you will actually meet:

The VPS goes down (Hetzner maintenance, a crash, you fat-finger the firewall). External access stops — https://app.example.dev is unreachable. But your homelab is completely unaffected: it is still running, still serving on the LAN, and still totally invisible and safe from the internet. You reach your services locally as if nothing happened, and you rebuild or reboot the edge at your leisure.

The tunnel drops. This is the most common transient failure, and it is largely self-healing. PersistentKeepalive on the homelab peer keeps the NAT mapping open and re-establishes the handshake automatically after a network blip, and systemctl enable wg-quick@wg0 on both sides brings the tunnel back after a reboot. For belt-and-braces, add a systemd watchdog or a small timer that restarts the tunnel (sudo systemctl restart wg-quick@wg0) if wg show reports no recent handshake. A plain wg-quick up wg0 will not do here, because it refuses to run while the interface already exists.
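The "no recent handshake" test can be scripted against wg show wg0 latest-handshakes, which prints each peer's public key followed by the time of its last handshake in epoch seconds (0 if there has never been one). The sketch below is one possible check; handshake_is_stale is a hypothetical helper, and the 300-second threshold in the usage comment is an assumption you should tune.

```shell
#!/usr/bin/env bash
# handshake_is_stale (hypothetical helper): read `wg show <iface> latest-handshakes`
# output on stdin (one "<public key> <epoch seconds>" pair per line) and succeed
# when the newest handshake is older than max_age seconds or has never happened.
# Arguments: max_age in seconds, then optionally "now" as epoch seconds.
handshake_is_stale() {
  local max_age="$1"
  local now="${2:-$(date +%s)}"
  awk -v max="$max_age" -v now="$now" '
    BEGIN { newest = 0 }
    $2 + 0 > newest { newest = $2 + 0 }
    END {
      if (newest > 0 && now - newest <= max) exit 1   # fresh: not stale
      exit 0                                          # stale or never connected
    }
  '
}

# Body of a root-run timer or cron job; 300 seconds is an assumed threshold:
#   if wg show wg0 latest-handshakes | handshake_is_stale 300; then
#     systemctl restart wg-quick@wg0
#   fi
```

If the interface is missing altogether, wg show prints nothing, the check reports stale, and the restart brings the tunnel back up.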

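Certificate renewal fails. Browsers start warning about an expired or invalid certificate. This is visible and not silent: your external uptime check and your own browser will both flag it. Fix the cause: confirm 80/tcp is open and DNS still points at the VPS for HTTP-01, or refresh the DNS API token for a wildcard, then sudo systemctl reload caddy. If the token reaches Caddy through an environment variable, as {env.CF_API_TOKEN} does, use sudo systemctl restart caddy instead, because a reload does not give the running process a new environment.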

⚠️ Warning — With this design the VPS is a genuine single point of failure for external access. That is an acceptable trade — it keeps home private — but it means the VPS deserves production discipline: keep it patched (unattended-upgrades), keep it monitored, and do not treat it as a toy. If it falls over unnoticed for a week, your services are gone from the outside world for a week.

⛔ Danger — Never disable the firewall "just to debug something", never open extra ports without a precise reason, and never paste the WireGuard private key or DNS token into a chat, an issue, or a screenshot. The exposed host is exactly where a careless five-minute shortcut becomes a public incident.

Full-circle close

Step back and look at what you have built across this series. You started with services trapped behind a home router and you finished with production-shaped self-hosting: services running in your homelab and your k3s cluster, answering on your own domain example.dev, served over real HTTPS that renews itself, reachable from anywhere in the world — and your home network still completely private, with no open ports and no IP in DNS. One small, hardened, recoverable VPS is the only thing the internet ever sees. That is the same shape a real company would use, built from parts you understand end to end.

It also closes the loop on the tracks that fed into it. The Build a homelab on Debian series got you a server and services; Kubernetes from scratch with k3s turned that into a real cluster; Monitoring your homelab gave you eyes on all of it — including, now, the edge; and Hardening a fresh Debian server is the discipline you just applied to the most exposed host you own. This series was the front door for all of them.

That is the whole journey: a private homelab, made publicly reachable, safely, and operated like it matters. Head back to the Expose your homelab to the internet hub for the full reading order, or revisit the source tracks — homelab on Debian and Kubernetes with k3s — to add the next service. The pattern is yours now: one DNS record and a few lines of Caddy, and anything you build is live.