This repository contains spot instance availability and preemption traces collected from cloud providers, along with the scripts used to collect and process them.
The data was used in the paper "Can't Be Late: Optimizing Spot Instance Savings under Deadlines" (NSDI '24). The policy in the paper is prototyped on SkyPilot.
We collected the spot instance availability and preemption traces by directly pinging the cloud providers. Each trace contains the following information:
{
  "metadata": {
    // The interval, in seconds, between two consecutive data points
    "gap_seconds": 300
  },
  // A trace of spot instance availability or preemption, where each data point
  // is the number of available instances at a specific tick.
  "data": []
}
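As a minimal sketch of how a trace in this format could be read, the Python below loads one file and reports its duration and the fraction of ticks with at least one available instance. It assumes the files on disk are plain JSON (without the explanatory `//` comments shown above); the function names `load_trace` and `summarize` are illustrative and not part of the repository's scripts.

```python
import json
from pathlib import Path


def load_trace(path):
    """Load one trace file and return (gap_seconds, data).

    Assumption: the file is plain JSON with the "metadata" and "data"
    keys described above, and no comments.
    """
    with Path(path).open() as f:
        trace = json.load(f)
    return trace["metadata"]["gap_seconds"], trace["data"]


def summarize(gap_seconds, data):
    """Return (duration in hours, fraction of ticks with >= 1 instance)."""
    if not data:
        return 0.0, 0.0
    duration_hours = len(data) * gap_seconds / 3600
    available_fraction = sum(1 for x in data if x > 0) / len(data)
    return duration_hours, available_fraction
```

With `gap_seconds` of 300, a full day corresponds to 288 data points.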
The 2-week (13 days) availability trace started on 10/26/2022 for K80 and V100 GPUs.
- Start date: 10/26/2022
- Instance types: p3.2xlarge/p3.16xlarge (1/8 V100), p2.2xlarge/p2.16xlarge (1/8 K80)
- Availability zones: us-west-2a and us-west-2b
- Data path: availability/1-node/aws-10-26-2022.
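Because consecutive data points are `gap_seconds` apart, a tick index can be mapped to an approximate wall-clock time from a trace's start date. The sketch below assumes the trace begins at midnight UTC on the listed start date; the time of day and time zone are not stated here, so that value is a placeholder, and `tick_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def tick_time(start, gap_seconds, tick):
    """Return the approximate timestamp of the given tick index."""
    return start + timedelta(seconds=gap_seconds * tick)


# Assumption: the 10/26/2022 trace starts at 00:00 UTC (not stated in the README).
start = datetime(2022, 10, 26, tzinfo=timezone.utc)
one_hour_in = tick_time(start, 300, 12)  # 12 ticks * 300 s = 1 hour
```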
The 2-month (45 days) availability trace started on 02/15/2023 for V100 GPUs.
- Start date: 02/15/2023
- Instance types: p3.2xlarge (1 V100)
- Availability zones: all 9 zones in the us-east-1, us-east-2, and us-west-2 regions
- Data path: availability/1-node/aws-02-15-2023.
The 2-week (11/16 days) availability trace started on 08/27/2023 for V100 GPUs in a multi-node setting.
- Start date: 08/27/2023
- Instance types: p3.2xlarge (1 V100)
- Availability zones: us-east-2b, us-west-2a, and us-west-2c
- Data path: availability/16-node/aws-08-27-2023.
The 1-week (7 days) preemption trace started on 04/22/2023 for V100 GPUs on AWS. This trace is used for ML workloads.
- Start date: 04/22/2023
- Instance types: p3.2xlarge (1 V100)
- Availability zones: us-east-1c, us-east-1f, us-west-2b, and us-west-2c
- Data path: preemption/1-node/aws-04-22-2023.
The 2-day preemption trace started on 04/30/2023 for Intel CPUs on GCP.
This trace is used for Bioinformatics workloads.
- Start date: 04/30/2023
- Instance types: c3-highcpu-88 (88 vCPUs)
- Availability zones: us-central1-a, us-central1-b, us-central1-c, and us-east1-b
- Data path: preemption/1-node/gcp-04-30-2023.
The 1-week (7 days) preemption trace started on 05/01/2023 for Intel CPUs on AWS.
This trace is used for Data Analytics workloads.
- Start date: 05/01/2023
- Instance types: r5.16xlarge (64 vCPUs)
- Availability zones: us-east-1b, us-east-1c, and us-east-1d
- Data path: preemption/1-node/aws-04-19-2023.
The 2-week (13 days) preemption trace started on 08/03/2023 for V100 GPUs on AWS.
- Start date: 08/03/2023
- Instance types: p3.2xlarge (1 V100)
- Availability zones: us-east-1f, us-east-2a, and us-west-2c
- Data path: preemption/4-node/aws-08-03-2023.
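For multi-node traces such as this one, a rough way to look at instance loss is to count the ticks where the number of available instances falls relative to the previous tick. The sketch below relies only on the trace format described earlier (each data point is an available-instance count); treating every drop as a preemption is an assumption, and `drop_stats` is a hypothetical helper rather than one of the repository's scripts.

```python
def drop_stats(data):
    """Return (ticks with a drop, total instances lost across those drops).

    Assumption: a decrease in the available-instance count between two
    consecutive ticks is interpreted as instances being lost.
    """
    drop_ticks = 0
    instances_lost = 0
    for prev, cur in zip(data, data[1:]):
        if cur < prev:
            drop_ticks += 1
            instances_lost += prev - cur
    return drop_ticks, instances_lost
```

Losses and recoveries that both happen within a single `gap_seconds` interval are invisible at this sampling resolution, so the counts are a lower bound.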
@inproceedings{wu2024skyspot,
  title={Can't Be Late: Optimizing Spot Instance Savings under Deadlines},
  author={Wu, Zhanghao and Chiang, Wei-Lin and Mao, Ziming and Yang, Zongheng and Friedman, Eric and Shenker, Scott and Stoica, Ion},
  booktitle={21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)},
  year={2024}
}