A Helpful Bash Shell Prompt With Chef on AWS

We use Opscode Chef to provision new servers (or instances, as they're called on AWS). Sometimes we need to ssh into these instances to look at local log files, troubleshoot a problem, or try out a configuration change on a throwaway machine.

Our standard .bashrc file puts the local hostname into the shell prompt, like so:

william@ip-10-242-150-199:~$

This is surprisingly dangerous.

We're accustomed to thinking of instances by their node names in Chef. (In Chef parlance, an individual server is a node. So for our purposes, an AWS instance maps one-to-one to a Chef node.) For example, to ssh into the machine above, I ran the ssh task from our custom ops Rakefile:

$ rake chef:ssh[aws-staging-app-precise-c1m-01]

The node name here, aws-staging-app-precise-c1m-01, is semantic, containing information about the instance such as the environment, the role of the instance, which version of Ubuntu it's running, and more. It's easy to be sure I got the right machine.

Once I'm logged in, on the other hand, I see something like ip-10-242-150-199: a reference to the IP address of the instance on Amazon's internal network. Not only is it not very useful, it's actually dangerous. Why? It makes it difficult to keep track of which instance you're working on. What the heck is ip-10-242-150-199 again? Which application is it running? Is it production or staging?

If my task is deploying a new app version or restarting a service, I need to ensure I do that on the right machine, and the right machine, in my mind, is defined by its Chef node name. So I need to maintain a map of IP address to Chef node name, in my head or in a scratch file.

In the best case, tracking this map as we create and terminate instances is an annoyance, a waste of time and effort. When I'm troubleshooting a problem, having to consult it slows me down. Worst of all, if I look at the wrong line in the map, or if the map is wrong or out of date, I might run a command on the wrong machine. Imagine testing an nginx configuration change on a production instance instead of one that isn't serving real traffic. Not a good idea!

The solution is clear: get the Chef node name into the prompt. This eliminates the indirection of the map and puts the important information right there where I can't miss it.

Knife will happily output a whole bunch of info about any given node, including the node name. But it runs on our dev machines, not on the instances. Where is the info on the node itself? (It could be available somewhere on the node object in Chef's template context, but if so, I couldn't find such a property in the docs.)

A little googling didn't turn up any quick answers, but after poking around a bit, I found it, in a line from /etc/chef/client.rb:

node_name "aws-staging-app-precise-c1m-01"

So I added this to our standard .bashrc:

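# It's nice to know what Chef node we're on.
CHEF_CLIENT_FILE=/etc/chef/client.rb
nodename () {
  if [ -r "$CHEF_CLIENT_FILE" ]; then
    # Take the first line whose first word is node_name (so commented-out
    # lines are skipped), strip the double quotes, and print the value.
    awk '$1 == "node_name" { gsub(/"/, "", $2); print $2; exit }' "$CHEF_CLIENT_FILE"
  fi
}
CHEF_NODENAME=$(nodename)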
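
To check the extraction without logging in to a real instance, the same idea can be pointed at a throwaway file. This is only a sketch: the function name chef_node_name_from and the temp file are made up for illustration, and it assumes the node_name line looks like the one shown above (a double-quoted value on a single line).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the node_name value from a client.rb-style file.
# Prints nothing if the file is missing or unreadable.
chef_node_name_from () {
  local file=$1
  [ -r "$file" ] || return 0
  awk '$1 == "node_name" { gsub(/"/, "", $2); print $2; exit }' "$file"
}

# Build a throwaway config that mimics the relevant line of client.rb.
tmp=$(mktemp)
printf '%s\n' 'log_level :info' 'node_name "aws-staging-app-precise-c1m-01"' > "$tmp"

chef_node_name_from "$tmp"                    # prints aws-staging-app-precise-c1m-01
chef_node_name_from /nonexistent/client.rb    # prints nothing
rm -f "$tmp"
```

Taking the file as an argument, rather than reading a global, makes the function easy to try against a scratch copy before it goes into everyone's .bashrc.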

Then I inserted CHEF_NODENAME into our PS1 definition to set the prompt. The ${CHEF_NODENAME:-\h} form falls back to the original value, \h (the hostname), whenever CHEF_NODENAME is unset or empty, as it will be on a machine with no readable client.rb:

PS1='\u@${CHEF_NODENAME:-\h}:\w\$ '
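
The fallback is easy to try outside of a prompt. Here is a minimal sketch of the same ${VAR:-default} expansion, with the hostname passed in as a plain argument instead of coming from \h; the function name prompt_host is hypothetical and exists only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mimic the fallback logic used in the PS1 above.
# $1 is the Chef node name (possibly empty), $2 is the hostname.
prompt_host () {
  local node_name=$1 hostname=$2
  # ":-" substitutes the default when node_name is unset OR empty.
  printf '%s\n' "${node_name:-$hostname}"
}

prompt_host "aws-staging-app-precise-c1m-01" "ip-10-242-150-199"  # prints the node name
prompt_host "" "ip-10-242-150-199"                                # prints ip-10-242-150-199
```

Because the PS1 string is single-quoted, bash stores the ${...} expression literally and expands it each time it draws the prompt, so the same .bashrc works on machines with and without a Chef client file.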

Now our prompts look like:

william@aws-staging-app-precise-c1m-01:~$

This frees up more attention to focus on the task at hand, without worrying about dangerous prompts or tracking which shell goes where.