AMAZON EC2 SPOT INSTANCE INTEGRATION
Cloud Computers – Eric Caron, Adam Irr, Cory Santiago, Brian Wallace
Coach – Ryan Schneider
Sponsor – Amazon Web Services – Stephen Elliott, Dhanvi Kapila
Senior Project – 2013
AWS EC2 and Spot
Project Description
The purpose of this project is to bring support for Amazon Elastic Compute Cloud (EC2) Spot instances to the popular continuous integration server, Jenkins. The team is extending the functionality of the existing EC2 plugin for Jenkins, which supports On-Demand instances, to also support the use of Spot instances.
Jenkins
An EC2 instance is a virtual computer that users rent on a per-hour basis for computations “in the cloud”. Spot instances are a form of EC2 instance for which users pay a significantly reduced price. Users specify the maximum amount they are willing to pay. If that bid exceeds the going rate for the instance, they are able to use the instance. If the going rate exceeds the user's maximum bid, the instance is terminated without warning.
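The bid rule described above can be illustrated with a minimal sketch. This is only an illustration of the rule as stated here, not AWS's pricing logic: the function names and the price series are hypothetical, and the handling of a bid exactly equal to the going rate is an assumption, since the text does not cover that case.

```python
def spot_instance_runs(max_bid: float, market_price: float) -> bool:
    """Return True while the user's maximum bid covers the going rate.

    Assumption: a bid equal to the going rate keeps the instance running;
    the text only describes the strictly-above and strictly-below cases.
    """
    return max_bid >= market_price


def hours_before_termination(max_bid: float, hourly_prices: list[float]) -> int:
    """Count the consecutive hours an instance survives a (hypothetical) price series."""
    hours = 0
    for price in hourly_prices:
        if not spot_instance_runs(max_bid, price):
            break  # going rate exceeded the bid: instance is terminated
        hours += 1
    return hours


if __name__ == "__main__":
    # Hypothetical hourly going rates, in dollars.
    prices = [0.03, 0.05, 0.12, 0.04]
    print(hours_before_termination(0.10, prices))
```

With a maximum bid of 0.10, the instance in this made-up series runs for the first two hours and is terminated in the third, when the going rate rises above the bid.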
How it Works
Sponsor Goals
Jenkins is a tool that gives developers a means to run automated jobs on a schedule or after certain events. A job might fetch the latest code from a repository, build the project, and run the unit tests to ensure that the newly submitted code did not create any issues. Jenkins can be configured to send out notifications in the event that something does break.
Deployment View
Enable distributed computing frameworks to work with Spot instances.
Alleviate some of the difficulties associated with Spot instances.
Demonstrate to potential users the cost-saving benefits of using Spot, and how those benefits outweigh any risk involved.
Ensure customer use and continued development of the project by merging our solution with the framework's mainline.
Cost and Time Difference
Once the plugin is installed, users must go through a series of steps to ensure proper configuration of their environment:
For each instance size tested, Spot instances took about 1 minute longer than On-Demand instances to complete the task, due to the delay in the fulfillment of the Spot requests.
A. Configure an Amazon Machine Image (AMI) with the scripts necessary for registering with Jenkins.
B. Configure an Amazon Cloud on Jenkins.
C. Configure a node that uses the previous AMI, mark it as a Spot instance, and provide a max bid price.
After the configuration of the Jenkins environment is complete, you may provision a Spot node from the user interface (1). If the max bid price you specified is higher than the current market price, your Spot instance will be fulfilled (2). Once the Spot instance is fulfilled, it will use the scripts provided previously to register itself with the master Jenkins server (3). After the registration process is complete, the Jenkins server is able to distribute work to the Spot instance as needed (4).
For every instance size tested, Spot instances were 3.5x to 6.5x cheaper than the equivalent On-Demand instance.
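To put the reported range in terms of percentage savings, the short calculation below may help. It assumes that "N times cheaper" means the On-Demand cost divided by the Spot cost equals N; that reading, and the function name, are assumptions rather than something stated in the measurements.

```python
def savings_percent(times_cheaper: float) -> float:
    """Convert an 'N times cheaper' ratio into percent saved versus On-Demand.

    Assumption: on_demand_cost / spot_cost == times_cheaper, so the Spot cost
    is 1/N of the On-Demand cost and the saving is 1 - 1/N.
    """
    if times_cheaper <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 - 1.0 / times_cheaper) * 100.0


if __name__ == "__main__":
    for ratio in (3.5, 6.5):
        print(f"{ratio}x cheaper -> {savings_percent(ratio):.1f}% saved")
```

Under that reading, the 3.5x to 6.5x range corresponds to savings of roughly 71% to 85% relative to On-Demand.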
If the Spot instance goes down due to an increase in market price, any job running on the node will be lost. To help counter this situation, the team built a separate plugin that allows jobs to be configured to requeue if the node they are running on goes down.
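The requeue idea can be sketched in a few lines. This is not the team's plugin and does not use the Jenkins API; it is a standalone simulation in which `NodeLost`, `run_with_requeue`, and the `max_attempts` limit are all hypothetical names and choices made for illustration.

```python
from collections import deque
from typing import Callable, Iterable


class NodeLost(Exception):
    """Raised by the (simulated) runner when the Spot node is terminated mid-job."""


def run_with_requeue(
    jobs: Iterable[str],
    run_job: Callable[[str], None],
    max_attempts: int = 3,  # hypothetical limit; the text does not mention one
) -> tuple[list[str], list[str]]:
    """Run jobs in order, putting a job back on the queue if its node goes down."""
    queue = deque((job, 1) for job in jobs)
    completed: list[str] = []
    abandoned: list[str] = []
    while queue:
        job, attempt = queue.popleft()
        try:
            run_job(job)
            completed.append(job)
        except NodeLost:
            if attempt < max_attempts:
                queue.append((job, attempt + 1))  # requeue at the back
            else:
                abandoned.append(job)  # give up after too many lost nodes
    return completed, abandoned


if __name__ == "__main__":
    remaining_failures = {"build": 1}  # "build" loses its node once

    def flaky_runner(job: str) -> None:
        if remaining_failures.get(job, 0) > 0:
            remaining_failures[job] -= 1
            raise NodeLost(job)

    print(run_with_requeue(["build", "test"], flaky_runner))
```

In this simulation the job that loses its node goes to the back of the queue and is retried after the other pending work, so a lost node delays the job rather than dropping it.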
Technologies
Process
The agile Scrum methodology was chosen for completing the project because of its high adaptability to changing requirements. Scrum provided a lot of visibility to our sponsor through demos and deliverables, which ensured that the project was meeting their goals. To facilitate our use of the Scrum methodology, the team used the project management tool Redmine. This tool assisted in the organization of Scrum artifacts and metrics for the project.
Process Metrics
Scrum Schedule
0. AWS and Framework Research: 12/17/2012 - 01/06/2013
1. Provision Spot Instances from Jenkins: 01/08/2013 - 01/23/2013
2. Callback Mechanism: 01/24/2013 - 02/06/2013
3. Beta Release and Documentation: 02/07/2013 - 02/27/2013
4. Prepare for Open Source Contribution: 03/05/2013 - 03/20/2013
5. Job Requeue Plugin: 03/21/2013 - 04/11/2013
6. Contribute to Open Source: 04/12/2013 - 04/25/2013
Estimation accuracy improved over time; however, there were sprints in which it was heavily impacted by unexpected risks.
When determining the amount of work to be done in a sprint, the team considered external obligations as well as RIT's recess schedule.