Christian Lehnert — Linux, Hacking & Faith

Your Homelab Should Be a Git Repo - Bootstrapping Config Management With Ansible

Christian Lehnert · 2016-03-02 · ~7 min read

Here is the honest state of most homelabs, mine included until recently. You have five or six Debian boxes. Each one was set up by hand, over SSH, late at night, by running a sequence of apt-get install and vim /etc/something commands that felt obvious at the time. Six months later one box has fail2ban and the others do not, two of them still permit root login, and you could not tell me which without logging in to check. That is not infrastructure. It is folklore: a set of half-remembered rituals that work until the machine they live on dies.

Configuration management fixes this, and as of this year the choice for a homelab is not really in doubt. Red Hat bought Ansible last October, version 2.0 shipped in January with a cleaner parser and proper block-level error handling, and the thing that made it worth picking in the first place is still its best feature: there is no agent.

Why agentless is the whole argument

Puppet, Chef, and Salt are all capable tools, and they all want something running on every managed node. An agent that phones home to a master, a certificate to sign, a daemon to keep alive and patched. In a datacenter with a dedicated ops team that overhead amortises. In a homelab it is pure tax. You install the agent, then you need a second mechanism to manage the agent, and now you have a chicken-and-egg problem on every box before you have configured a single line of anything useful.

Ansible takes the opposite position. The only things it needs on a target are an SSH daemon and a Python interpreter, both of which a standard Debian install already has (a bare netinst may lack Python, which the bootstrap section deals with). The control node, your laptop or a small management VM, opens an SSH connection, copies over a small Python module, runs it, collects the result as JSON, and deletes it. There is no persistent footprint. A machine you have never touched before is manageable the moment you can SSH into it. That is the entire pitch, and for self-hosting it is decisive.

The other half of the appeal is the language. Playbooks are YAML, not a Ruby DSL you have to learn before you can describe "this package should be installed". And the modules are idempotent: you declare the desired state, and Ansible makes the box match it. (The exceptions are command, shell, and raw, which run whatever you hand them every time.) Run the playbook against an already-correct host and it changes nothing and reports ok. Run it against a drifted host and it fixes only what drifted. You stop writing scripts that do things and start writing descriptions of how things should be. That distinction is the whole point of config management, and it is the thing a pile of shell scripts never gives you.
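
To make the difference concrete, here is a minimal sketch of the same intent written both ways. It is an illustration rather than part of the baseline built later, and the task names are my own. The command task runs apt-get on every pass and reports changed each time. The apt task inspects the host first and only acts if htop is missing.

```yaml
---
# Illustrative sketch only: two ways to get htop onto a host.
- hosts: homelab
  become: true
  tasks:
    # Imperative: a script that does something. It runs on every pass
    # and reports "changed" every time, whether anything changed or not.
    - name: Install htop the shell-script way
      command: apt-get install -y htop

    # Declarative: a description of how things should be. It reports
    # "ok" when htop is already installed and "changed" only when it is not.
    - name: Ensure htop is present
      apt:
        name: htop
        state: present
```

Run it twice and the second pass still shows the first task as changed, while the second task settles to ok.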

The bootstrap

Install Ansible on the control node, not the targets. On Jessie the cleanest path is pip, because the distro package is still on the 1.7 series:

apt-get install python-pip
pip install ansible
ansible --version    # 2.0.x

Now an inventory. INI format, one group for the homelab:

# inventory/hosts
[homelab]
git01  ansible_host=10.0.10.12
web01  ansible_host=10.0.10.13
vault01 ansible_host=10.0.10.11
 
[homelab:vars]
ansible_user=cl
ansible_become=true

Prove the connection works with an ad-hoc command. No playbook, no setup, just a ping module that checks SSH and Python are both answering:

ansible -i inventory/hosts homelab -m ping

One gotcha that will bite you on a minimal Jessie install: the ping module is not ICMP, it runs Python on the far side, and a netinst box may not have Python at all. Ansible needs python-minimal present before it can do anything module-based. The fix is the raw module, which runs a literal command over SSH with no Python required, so you bootstrap the interpreter first:

ansible -i inventory/hosts homelab -m raw -a "apt-get update && apt-get install -y python-minimal" -b

After that, ping answers and everything else is open to you.
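
If you would rather keep that step in the repo than in your shell history, the same bootstrap can be written as a tiny play. This is a sketch under a few assumptions: the filename bootstrap.yml is my own, and the guard assumes the interpreter lives at /usr/bin/python, which is where Debian's package puts it. The important line is gather_facts: false, because fact gathering needs Python on the target too.

```yaml
---
# bootstrap.yml (hypothetical filename): install Python using only SSH.
- hosts: homelab
  become: true
  gather_facts: false   # fact gathering needs Python, which is not there yet
  tasks:
    - name: Install python-minimal if no interpreter is present
      # raw sends the command straight over SSH; no module is copied over.
      # The test guard makes the task safe to run against bootstrapped hosts.
      raw: test -e /usr/bin/python || (apt-get update && apt-get install -y python-minimal)
```

Run it once with ansible-playbook -i inventory/hosts bootstrap.yml before the main playbook.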

The first real playbook

This is the playbook every self-hoster writes first, because it encodes the baseline you keep forgetting to apply consistently: common tools, an admin user with a key, a hardened SSH config, automatic security updates, and fail2ban.

---
- hosts: homelab
  become: true
  vars:
    admin_user: cl
  tasks:
    - name: Install baseline packages
      apt:
        name: "{{ item }}"
        state: present
        update_cache: yes
      with_items:
        - htop
        - tmux
        - fail2ban
        - unattended-upgrades
 
    - name: Create the admin user
      user:
        name: "{{ admin_user }}"
        groups: sudo
        # append adds sudo without removing the user's other groups
        append: yes
        shell: /bin/bash
        state: present
 
    - name: Install my SSH public key
      authorized_key:
        user: "{{ admin_user }}"
        key: "{{ lookup('file', 'files/' + admin_user + '.pub') }}"
        state: present
 
    - name: Deploy a hardened sshd_config
      template:
        src: templates/sshd_config.j2
        dest: /etc/ssh/sshd_config
        owner: root
        group: root
        mode: "0644"
        validate: "/usr/sbin/sshd -t -f %s"
      notify: reload sshd
 
  handlers:
    - name: reload sshd
      service:
        name: ssh
        state: reloaded

A few things in there are worth pointing at, because they are the difference between a playbook and a shell script with extra steps.

The package install loops with with_items. You will also see examples that hand the apt module a list under name: instead. I stick with the loop: for package modules Ansible folds the items into a single apt call, so it costs nothing.

The template task ships an SSH config from a Jinja2 template and, crucially, has a validate line. That runs sshd -t against the candidate file before it is moved into place. Lock yourself out of a remote box by deploying a broken sshd_config exactly once and you will never again skip validation.

The notify and the handler are the idempotent way to do "restart only if something changed". The handler runs at the end of the play, and only if the template task actually modified the file. A correct box reloads nothing.
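
The same pattern covers the rest of the baseline. What follows is a sketch, not something from my repo. It assumes a hypothetical files/jail.local that you would write yourself. It also assumes that installing unattended-upgrades does not by itself switch on periodic runs, which is worth checking on your release. The first task needs a handler, because fail2ban only reads its config at start. The second does not, because apt's daily job reads the file on its next run.

```yaml
---
# Sketch only: files/jail.local is a hypothetical file, not shown in this post.
- hosts: homelab
  become: true
  tasks:
    - name: Deploy local fail2ban overrides
      copy:
        src: files/jail.local
        dest: /etc/fail2ban/jail.local
        owner: root
        group: root
        mode: "0644"
      notify: restart fail2ban

    - name: Switch on periodic unattended upgrades
      copy:
        dest: /etc/apt/apt.conf.d/20auto-upgrades
        content: |
          APT::Periodic::Update-Package-Lists "1";
          APT::Periodic::Unattended-Upgrade "1";
        owner: root
        group: root
        mode: "0644"
      # No notify: nothing needs restarting for this file to take effect.

  handlers:
    # Runs once at the end of the play, and only if the jail file changed.
    - name: restart fail2ban
      service:
        name: fail2ban
        state: restarted
```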

Run it:

ansible-playbook -i inventory/hosts site.yml

The first run is mostly yellow changed lines. The second run, against the now-correct boxes, is almost entirely green ok. That green is the payoff. It means the state of every machine is now described in a file you can read, diff, and commit, instead of living in your head.

From one file to something maintainable

A single playbook is fine for a weekend. The moment you have more than one concern, move to roles, which is just Ansible's name for a standard directory layout that keeps tasks, handlers, templates, and files together:

ansible-galaxy init roles/common

Trimmed to the parts that matter, the repo then looks like this:

site.yml
inventory/hosts
group_vars/
  homelab/
    vars.yml
    vault.yml
roles/
  common/
    tasks/main.yml
    handlers/main.yml
    templates/sshd_config.j2
    files/cl.pub
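
With the tasks moved into the role, site.yml shrinks to a few lines. This is a sketch of how the two files might look, shown here as two YAML documents with the filename in a comment at the top of each. Inside a role, template and file sources resolve against the role's own templates/ and files/ directories, so the templates/ prefix on src goes away.

```yaml
---
# site.yml: the play now only says which hosts get which roles.
- hosts: homelab
  become: true
  roles:
    - common

---
# roles/common/tasks/main.yml (excerpt): a bare list of tasks, no "hosts:" line.
- name: Deploy a hardened sshd_config
  template:
    src: sshd_config.j2   # looked up in roles/common/templates/
    dest: /etc/ssh/sshd_config
    owner: root
    group: root
    mode: "0644"
    validate: "/usr/sbin/sshd -t -f %s"
  notify: reload sshd     # handler lives in roles/common/handlers/main.yml
```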

And the rule that matters more than any of the above: secrets do not go in the repo in plaintext. Ansible ships ansible-vault, which encrypts a file in place so you can commit it safely and decrypt it at runtime:

ansible-vault create group_vars/homelab/vault.yml
ansible-playbook -i inventory/hosts site.yml --ask-vault-pass

WiFi passwords, API tokens, the password hash you pass to the user module: all of it lives encrypted, in git, next to the playbook that uses it.
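
One layout that keeps the encrypted file from becoming a black box is a plaintext vars file that only points into the vault. The sketch below assumes a group_vars/homelab/vars.yml sitting next to vault.yml. The variable names and values are invented for illustration. The two files are shown as two YAML documents.

```yaml
---
# group_vars/homelab/vars.yml: plaintext and greppable.
# It names the secrets but does not contain them.
wifi_psk: "{{ vault_wifi_psk }}"
api_token: "{{ vault_api_token }}"

---
# group_vars/homelab/vault.yml, as it looks inside `ansible-vault edit`.
# On disk and in git this file is ciphertext. The values are placeholders.
vault_wifi_psk: "not-a-real-passphrase"
vault_api_token: "not-a-real-token"
```

Playbooks and templates then use wifi_psk and api_token, and a grep through the repo still shows where each secret is consumed without decrypting anything.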

The actual win

The point of all this is not that Ansible is clever. It is that your infrastructure stops being a set of pets you nursed by hand and becomes a repository you can read. A new box is apt-get install python-minimal, add a line to the inventory, run the playbook. A box that dies is not a disaster, it is a checkout and a fifteen-minute rebuild. Six months from now, when you cannot remember whether web01 has fail2ban, you do not log in to check. You read the role, because the role is the truth and the box is just the current rendering of it.

That inversion, repo as source of truth and the machine as a disposable rendering of it, is the whole game. Everything else I build on top of this homelab from here will assume it. Start with one playbook that hardens SSH on every box, and grow it from there. The version you write tonight will be embarrassing in a year. Write it anyway. Folklore does not survive a failed disk. A git repo does.

Tagged:
#ansible #homelab #linux #debian