Amazon AWS Tips and Gotchas - Part 7 - AWS EMR, Spot Instances & PGs

Continuing in this series of blog posts taking a bit of a “warts and all” view of a few Amazon AWS features, below are a handful more tips and gotchas when designing and implementing solutions on Amazon AWS, including EMR, Spot Instances and Placement Groups.

For the first post in this series, with a bit of background on where it all originated, see here:
Amazon #AWS Tips and Gotchas – Part 1

For more posts in this series, see here:
Index of AWS Tips and Gotchas

AWS Tips and Gotchas – Part 7

As detailed in the EMR FAQ, EMR does not support a multi-master configuration: each EMR cluster has only one master node (plus, of course, multiple slave nodes). If that master node goes offline, you lose your cluster and all data being processed at the time. The AWS-recommended workaround is to checkpoint your EMR cluster regularly, which allows you to resume the cluster from the last checkpoint in the event of a failure.
Spot instances and sticky sessions do not play well together! If you use spot instances as a method for providing cheap burst resources, make sure your application is not dependent on sticky sessions.
If it is, you risk losing user sessions when the spot instances are terminated with only two minutes' notice.
There are a couple of ways to mitigate this. The best is simply not to use sticky sessions, and to store your session data in another system such as ElastiCache or DynamoDB (or both!).
Alternatively, you could set up a script within the EC2 guest OS to monitor the Spot Instance Termination Notifications (http://169.254.169.254/latest/meta-data/spot/termination-time) and devise a method to cleanly migrate any remaining sessions off your instance and remove it from the load balancer.
NOTE: It is best to avoid terminating your spot instances yourself. AWS will not charge you for the hour in which it terminates your instance, so you can save some budget compared with shutting your own instances down.
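As a rough illustration of the monitoring script mentioned above, here is a minimal Python sketch. It assumes the metadata endpoint returns HTTP 404 until a termination notice has been issued, and that the instance allows unauthenticated (IMDSv1-style) metadata requests. The function names, the polling interval and the `on_notice` callback are hypothetical. What you do inside the callback (draining sessions, deregistering from the load balancer) depends entirely on your application.

```python
import time
import urllib.error
import urllib.request

# URL taken from the post; everything else in this sketch is illustrative.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def fetch_termination_time(url=TERMINATION_URL, timeout=2.0):
    """Return the termination timestamp as a string, or None if no notice is pending."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8").strip()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Assumption: 404 means no termination has been scheduled yet.
            return None
        raise
    return body or None


def wait_for_termination(on_notice, fetch=fetch_termination_time,
                         interval=5.0, sleep=time.sleep, max_polls=None):
    """Poll until a termination notice appears, then call on_notice(timestamp).

    Returns the timestamp, or None if max_polls is reached without a notice.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        stamp = fetch()
        if stamp is not None:
            on_notice(stamp)
            return stamp
        polls += 1
        sleep(interval)
    return None


# Example use on a spot instance (drain_and_deregister is a hypothetical
# function you would write for your own application):
#   wait_for_termination(drain_and_deregister)
```

With only two minutes between the notice and termination, the polling interval needs to be short. Five seconds is just a starting point, not a recommendation from AWS.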
Placement groups were designed specifically for high-bandwidth applications that require low-latency, 10Gbps connectivity between instances.
If you do not start all instances in a placement group at the same time, you cannot guarantee that they will end up optimally close to each other later. Indeed, as stated in the placement groups KB: “If you try to add more instances to the placement group later, or if you try to launch more than one instance type in the placement group, you increase your chances of getting an insufficient capacity error”.
If you do want to add more instances to your placement group later, the best thing to do is stop all of your instances and then start them again concurrently.
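To make the "launch everything at once" advice concrete, here is a small Python sketch using boto3's `run_instances` call. Setting `MinCount` equal to `MaxCount` asks EC2 to launch either the full set or nothing, which avoids ending up with a partially filled placement group. The helper names, the group name, AMI ID and instance type are all hypothetical, and the sketch assumes the placement group already exists and that boto3 is installed with credentials configured.

```python
def build_launch_request(group_name, image_id, instance_type, count):
    """Build run_instances parameters for an all-or-nothing launch into one placement group."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,  # a single instance type for the whole group
        "MinCount": count,              # MinCount == MaxCount: launch all or none
        "MaxCount": count,
        "Placement": {"GroupName": group_name},
    }


def launch_together(request, client=None):
    """Send the request to EC2 and return the new instance IDs."""
    if client is None:
        import boto3  # imported lazily so the request builder works without boto3
        client = boto3.client("ec2")
    response = client.run_instances(**request)
    return [instance["InstanceId"] for instance in response["Instances"]]


# Example (all values are placeholders):
#   request = build_launch_request("my-placement-group", "ami-12345678", "c4.8xlarge", 4)
#   instance_ids = launch_together(request)
```

If EC2 cannot satisfy the whole request, the call fails rather than launching a subset, so you can retry later or in another Availability Zone.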
