Bastion Host with nsscache and Consul

When I started my new job last year, my first task was to redesign the infrastructure and move from EC2 Classic to EC2 VPC. I spent the first few weeks setting up a new VPC with different subnets for each concern, a bastion server to access the servers located within the network, and a Consul cluster to keep track of the running instances.

In the past, I used several ways to manage the accounts and keys to access my servers:

  • a central LDAP server, but no jump host;
  • a jump host which was using SSH Agent forwarding to access the other servers, but only a handful of accounts/keys managed manually;
  • a jump host, and all the accounts created on each server by Ansible, but using the SSH public keys exposed by GitHub to authenticate the user.

I didn’t want to have to create the accounts on all the servers, and I wanted to avoid LDAP, because it would introduce a single point of failure and its management is a bit of a pain.

Distributed locking with Consul

A few months ago, I had to set up a daily job to back up a database running in a cluster. The job had to run on one of the nodes of the cluster. But configuring a cron job on only one server would mean that if that node were unavailable, the backup would not be performed.

I decided to configure the cron job on all the servers, and to set up some kind of distributed locking to ensure that only one of the nodes will actually perform the backup every day.

Because I already had Consul running in my infrastructure, I first considered Consul’s semaphore system. Since it was more complex than what I wanted to achieve, I used the leader election algorithm instead. The main difference is that with leader election only one session can hold the lock at a time, while the semaphore algorithm allows several sessions to hold it.
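To make the idea more concrete, here is a minimal sketch of the leader election pattern using Consul's HTTP API (session creation plus a KV write with `acquire`). It is not the script from the post. The agent address, the `locks/backup` key, the session name, the TTL value, and the `run_if_leader` helper are all assumptions chosen for illustration.

```python
import json
import urllib.request

# Assumption: a Consul agent is listening locally on the default HTTP port.
CONSUL = "http://127.0.0.1:8500"


def consul_put(path, body=None):
    """Send a PUT request to the Consul HTTP API and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else b""
    req = urllib.request.Request(CONSUL + path, data=data, method="PUT")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode())


def run_if_leader(key, job, put=consul_put):
    """Run job() only if this node manages to acquire the lock stored at key.

    Returns True if the job ran, False if another session holds the lock.
    The put parameter exists only so the HTTP layer can be swapped out.
    """
    # A session represents this node's claim. The TTL is an assumed value:
    # it should exceed the job duration, or the session should be renewed.
    session = put("/v1/session/create", {"Name": "backup", "TTL": "3600s"})["ID"]
    try:
        # Consul answers true only to the first session that acquires the key.
        acquired = put(f"/v1/kv/{key}?acquire={session}", {"holder": session})
        if not acquired:
            return False
        job()
        return True
    finally:
        # Destroying the session also releases any lock it holds.
        put(f"/v1/session/destroy/{session}")
```

Each node's cron entry would then call something like `run_if_leader("locks/backup", do_backup)`, where `do_backup` is a placeholder for the real backup command. This sketch assumes the cron jobs start at roughly the same time on every node and that the backup runs longer than the clock skew between them. A node that starts after the lock has already been released would otherwise run a second backup.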