If you’ve ever used an auto scaling group (ASG) on AWS, you’ve probably had an EC2 instance fail and get removed from the ASG. While great for redundancy (the ASG launches a new instance to start handling requests), it makes debugging the failure difficult: the ASG terminates the bad instance, erasing any evidence of what went wrong. Below, I present a script that uploads relevant files to S3 after an instance is triggered to shut down but before it terminates.
To achieve this, we make use of Linux’s runlevel scripts. The instructions below are for Ubuntu, but it should be straightforward to migrate them to a different distro.
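On a SysV-style init system, the links in /etc/rc0.d/ whose names begin with K are run in lexical order as the machine enters runlevel 0 (halt), which is why the K01 prefix used below runs early in the shutdown sequence. A minimal sketch of that ordering, using a throwaway directory instead of the real /etc/rc0.d:

```shell
#!/bin/sh
# Sketch: init runs K* scripts in lexical (glob) order at shutdown.
# We fake the directory so this is safe to run anywhere.
demo=$(mktemp -d)
touch "$demo/K01upload-logs" "$demo/K10some-service"

# A shell glob expands in lexical order, so K01 comes before K10.
for script in "$demo"/K*; do
  echo "would run: $script"
done

rm -rf "$demo"
```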
First, we make a script at /etc/rc0.d/K01upload-logs. This script will run when the system is shutting down. You should change LOG_FILE and BUCKET to match your needs.
#!/bin/bash
source /etc/environment

# get strict after sourcing environment since we don't trust it...
set -euo pipefail
IFS=$'\n\t'

# what logs should I upload and where to?
LOG_FILE="/var/log/tomcat7/catalina.out"
BUCKET="my-logs-bucket"

# below we include the instance id in the key so the logs are easily findable.
# note: the variable is S3_PATH, not PATH, so we don't clobber the shell's
# command search path.
HOST=$(/usr/bin/curl -s http://169.254.169.254/latest/meta-data/instance-id)
S3_PATH="services/logs/$HOST/"

# upload the logs
/bin/echo "Uploading logs to s3://$BUCKET/$S3_PATH" | /usr/bin/wall
/usr/local/bin/aws s3 cp "$LOG_FILE" "s3://$BUCKET/$S3_PATH"
After installing the script, you need to set the permissions:
chown root:root /etc/rc0.d/K01upload-logs
chmod +x /etc/rc0.d/K01upload-logs
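If you want to sanity-check that the execute bit took effect, here is a tiny sketch using a temporary file as a stand-in for the real script:

```shell
#!/bin/sh
# Sketch: verify chmod +x makes a file executable, on a throwaway temp file.
f=$(mktemp)
chmod +x "$f"

# -x tests the execute permission for the current user
if [ -x "$f" ]; then
  echo "execute bit set"
fi

rm -f "$f"
```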
And that’s it! The script will upload the logs to your S3 bucket when the ASG terminates an instance. We’ve found this extremely helpful for our deep learning infrastructure, where failures often originate in C++ code (and thus are never seen by the JVM or sent to our logging services).