OpenPAI Job Examples

Table of Contents

Quick start: how to write and submit a CIFAR-10 job

(1) Prepare a job JSON file

This section uses a CIFAR-10 training job as an example to explain how to write and submit a job in OpenPAI.

CIFAR-10 is an established computer-vision dataset used for image classification.

  • Full example of a TensorFlow CIFAR-10 image classification training job on OpenPAI:
{
  // Name for the job, must be unique
  "jobName": "tensorflow-cifar10",
  // URL pointing to the Docker image for all tasks in the job
  "image": "openpai/pai.example.tensorflow",
  // Data directory existing on HDFS
  "dataDir": "/tmp/data",
  // Output directory on HDFS
  "outputDir": "/tmp/output",
  // List of task roles, at least one task role is required
  "taskRoles": [
    {
      // Name for the task role
      "name": "cifar_train",
      // Number of tasks for the task role, no less than 1
      "taskNumber": 1,
      // CPU number for one task in the task role, no less than 1
      "cpuNumber": 8,
      // Memory in MB for one task in the task role, no less than 100
      "memoryMB": 32768,
      // GPU number for one task in the task role, no less than 0
      "gpuNumber": 1,
      // Executable command for tasks in the task role, cannot be empty
      "command": "git clone https://github.com/tensorflow/models && cd models/research/slim && python download_and_convert_data.py --dataset_name=cifar10 --dataset_dir=$PAI_DATA_DIR && python train_image_classifier.py --batch_size=64 --model_name=inception_v3 --dataset_name=cifar10 --dataset_split_name=train --dataset_dir=$PAI_DATA_DIR --train_dir=$PAI_OUTPUT_DIR"
    }
  ]
}
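
The comments in the example state a few constraints on each field (at least one task role, `taskNumber` and `cpuNumber` no less than 1, `memoryMB` no less than 100, `gpuNumber` no less than 0, a non-empty `command`). A minimal sketch of checking them locally before submission could look like the following. The function `validate_job` is a hypothetical helper, not part of OpenPAI, and it cannot check that `jobName` is unique on the cluster.

```python
def validate_job(job):
    """Return a list of problems found in a job config dict (empty if none)."""
    problems = []
    if not job.get("jobName"):
        problems.append("jobName must be a non-empty string")

    roles = job.get("taskRoles") or []
    if len(roles) < 1:
        problems.append("taskRoles must contain at least one task role")

    # (field, minimum) pairs taken from the comments in the example job file.
    minimums = [("taskNumber", 1), ("cpuNumber", 1), ("memoryMB", 100), ("gpuNumber", 0)]
    for i, role in enumerate(roles):
        label = role.get("name") or f"taskRoles[{i}]"
        for field, minimum in minimums:
            value = role.get(field)
            # bool is a subclass of int in Python, so reject it explicitly.
            if not isinstance(value, int) or isinstance(value, bool) or value < minimum:
                problems.append(f"{label}: {field} must be an integer no less than {minimum}")
        if not str(role.get("command") or "").strip():
            problems.append(f"{label}: command cannot be empty")
    return problems


example_job = {
    "jobName": "tensorflow-cifar10",
    "image": "openpai/pai.example.tensorflow",
    "dataDir": "/tmp/data",
    "outputDir": "/tmp/output",
    "taskRoles": [
        {
            "name": "cifar_train",
            "taskNumber": 1,
            "cpuNumber": 8,
            "memoryMB": 32768,
            "gpuNumber": 1,
            "command": "python train_image_classifier.py",  # shortened for this sketch
        }
    ],
}

print(validate_job(example_job))
```

An empty list means none of the stated constraints were violated; it does not guarantee that the cluster will accept the job.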

(2) Submit the job JSON file from the OpenPAI web portal

To submit the job from the OpenPAI web portal, follow the tutorial "submit a job in web portal".
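
The example job file is annotated with `//` comments, which strict JSON parsers reject. Whether the web portal accepts them is not stated here, so it may be safer to strip them before submitting or processing the file. The sketch below shows one way to do that; `strip_line_comments` is a hypothetical helper, not an OpenPAI tool. It skips `//` inside string literals, which matters because the `command` field contains a URL.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Keep the escaped character so \" does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


annotated = """{
  // Name for the job, must be unique
  "jobName": "tensorflow-cifar10",
  "taskRoles": [
    {
      // Executable command, shortened for this sketch
      "command": "git clone https://github.com/tensorflow/models"
    }
  ]
}"""

job = json.loads(strip_line_comments(annotated))
print(job["taskRoles"][0]["command"])
```

A naive approach that cuts every line at the first `//` would truncate the URL in `command`, which is why the sketch tracks whether it is inside a string.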

List of off-the-shelf examples

Examples that can be run by submitting the JSON file directly, without any modification.

List of customized job templates

Users can customize these job templates and run them on OpenPAI.

Contributing

If you want to contribute a job example that can be run on PAI, please open a new pull request.

  • Create a folder under pai/examples, for example pai/examples/caffe2/

  • Prepare example files:

    In the example folder (here, the Caffe2 example folder), prepare the following files for the contribution pull request:

PAI_caffe2_dir

  1. README.md: an introduction to the example
  2. Dockerfile: the example's dependencies
  3. PAI job JSON file: the example's OpenPAI job JSON template
  4. [Optional] Code file: the example's code
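
Before opening a pull request, the file list above can be checked with a short script. This is only a sketch: `missing_example_files` is a hypothetical helper, and it assumes the job file uses a `.json` extension, which the list does not state explicitly. The optional code file is not checked.

```python
import tempfile
from pathlib import Path


def missing_example_files(example_dir):
    """Return the names of required files missing from an example folder."""
    folder = Path(example_dir)
    missing = [name for name in ("README.md", "Dockerfile") if not (folder / name).is_file()]
    # Assumption: the job template is any file ending in .json.
    if not any(folder.glob("*.json")):
        missing.append("job json file (*.json)")
    return missing


# Demonstrate on a temporary folder that contains only a README.md.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "caffe2"
    example.mkdir()
    (example / "README.md").write_text("# Caffe2 example\n")
    demo_missing = missing_example_files(example)

print(demo_missing)
```

In practice the function would be pointed at the real folder, for example `missing_example_files("pai/examples/caffe2")`.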