Add events not in scope and roadmap as suggested on KubeEdge Meeting
Signed-off-by: Zimu Zheng <[email protected]>
MooreZheng committed Jun 11, 2022
1 parent c381931 commit 772b99e
Showing 4 changed files with 66 additions and 37 deletions.
103 changes: 66 additions & 37 deletions sig-ai/Distributed Synergy AI Benchmarking.md
# Distributed Synergy AI Benchmarking
Edge computing emerges as a promising technical framework to overcome the challenges in cloud computing. In this machine-learning era, AI applications have become one of the most critical types of applications on the edge. Driven by the increasing computation power of edge devices and the increasing amount of data generated from the edge, edge-cloud synergy AI and distributed synergy AI techniques have received more and more attention for the sake of device, edge, and cloud intelligence enhancement.

Nevertheless, distributed synergy AI is at its initial stage. For the time being, a comprehensive evaluation standard is not yet available for scenarios with various AI paradigms on all three layers of edge computing systems. According to the 2022 landing challenge survey, developers suffer from a lack of support for related datasets and algorithms, while end users are lost in a sea of mismatched solutions. This limits the wide application of related techniques and hinders a prosperous ecosystem of distributed synergy AI. A comprehensive end-to-end distributed synergy AI benchmark suite is thus needed to measure and optimize the systems and applications.

This proposal provides a basic benchmark suite for distributed synergy AI, so that AI developers and end users can benefit from efficient development support and best practice discovery.

For developers or end users of distributed synergy AI solutions, the goals of the framework include:
    - test cases including dataset and corresponding tools
    - benchmarking tools including simulation and hyper-parameter searching
- Revealing best practices for developers and end users
    - presentation tools including leaderboards and test reports


## Proposal
The distributed synergy AI benchmarking tool, ianvs, aims to test the performance of distributed synergy AI solutions following recognized standards, in order to facilitate more efficient and effective development.

The scope of ianvs includes:
- Providing end-to-end benchmark toolkits across devices, edge nodes, and cloud nodes based on typical distributed-synergy AI paradigms and applications.
- Tools to manage the test environment. For example, it would be necessary to support the CRUD (Create, Read, Update and Delete) actions on test environments; a hypothetical sketch of such actions follows this list. Elements of such test environments include algorithm-wise and system-wise configuration.
- Tools to control test cases. Typical examples include paradigm templates, simulation tools, and hyper-parameter-based assistant tools.
- Tools to manage benchmark presentation, e.g., leaderboard and test report generation.
- Cooperation with other organizations or communities, e.g., in KubeEdge SIG AI, to establish comprehensive benchmarks and develop related applications, which can include but are not limited to
- Dataset collection, re-organization, and publication
- Formalized specifications, e.g., standards
    - Holding competitions or coding events, e.g., the Open Source Promotion Plan
- Maintaining solution leaderboards or certifications for commercial usage
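
For illustration only, the CRUD actions on test environments mentioned above could look like the minimal sketch below. The `TestEnvManager` class, its method names, and the example configuration values are assumptions made for this sketch, not the actual ianvs API.

```python
# A minimal, hypothetical sketch of CRUD-style test-environment management.
# Class and method names are illustrative assumptions, not the real ianvs API.
from copy import deepcopy


class TestEnvManager:
    """Keeps named test environments (algorithm-wise and system-wise configuration)."""

    def __init__(self):
        self._envs = {}

    def create(self, name, config):
        if name in self._envs:
            raise ValueError(f"test environment '{name}' already exists")
        self._envs[name] = deepcopy(config)

    def read(self, name):
        return deepcopy(self._envs[name])

    def update(self, name, **changes):
        self._envs[name].update(changes)

    def delete(self, name):
        del self._envs[name]


if __name__ == "__main__":
    manager = TestEnvManager()
    # The dataset and metric names below are illustrative placeholders.
    manager.create("pcb-aoi", {"dataset": "pcb-aoi.zip", "metric": "f1_score"})
    manager.update("pcb-aoi", metric="samples_transfer_ratio")
    print(manager.read("pcb-aoi"))
    manager.delete("pcb-aoi")
```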

Target users:
- Developers: Build and publish edge-cloud collaborative AI solutions efficiently from scratch
- End users: View and compare distributed synergy AI capabilities of solutions

The scope of ianvs does NOT include:
- Re-inventing existing edge platforms, e.g., KubeEdge
- Re-inventing existing AI frameworks, e.g., TensorFlow, PyTorch, MindSpore
- Re-inventing existing distributed synergy AI frameworks, e.g., KubeEdge Sedna
- Re-inventing existing UI or GUI toolkits, e.g., Prometheus, Grafana, Matplotlib

## Design Details
### User flow
The user flow for an algorithm developer is as follows.
1. ianvs Preparation:
    1. Install required packages specified in the requirement files
    1. Prepare executable files for ianvs. An algorithm developer can
        - either interpret or compile the code of ianvs on his/her machine to adapt to the local environment, e.g., with TPU,
        - or directly download the pre-compiled executable files/wheels if the local environment follows the default settings.
1. Test Case Preparation
    - Prepare the dataset according to the targeted scenario.
        - Datasets can be large. To avoid over-sized projects, the ianvs executable files and code base do not include original datasets; developers can download datasets from the source links (e.g., from Kaggle) given by ianvs.
    - Leverage the ianvs algorithm interface for the targeted algorithm.
        - The tested algorithm should follow the ianvs interface to ensure functional benchmarking; a hypothetical interface sketch is given after the user-flow figure.
1. Algorithm Development: Develop the targeted algorithm
1. ianvs Configuration: Fill configuration files for ianvs
1. ianvs Execution: Run the executable file of ianvs for benchmarking
1. ianvs Presentation: View the benchmarking result of the targeted algorithms
1. Repeat Steps 3-6 until the targeted algorithm is satisfactory

![](images/user_flow.png)
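
The algorithm interface mentioned in the test-case preparation step could, purely for illustration, resemble the sketch below. The `BaseAlgorithm` base class, its method signatures, and the toy model are assumptions made for this example, not the actual ianvs interface.

```python
# A hypothetical sketch of an algorithm following a benchmark-friendly interface.
# The base-class name and method signatures are assumptions, not the real ianvs API.
from abc import ABC, abstractmethod


class BaseAlgorithm(ABC):
    """Contract a tested algorithm is assumed to follow so it can be benchmarked uniformly."""

    @abstractmethod
    def train(self, train_data, **hyperparameters):
        """Fit the model on the training split provided by the test case."""

    @abstractmethod
    def predict(self, inference_data):
        """Return predictions so that metrics can be computed by the benchmark."""


class MyEdgeCloudModel(BaseAlgorithm):
    """Toy example: predicts the majority label seen during training."""

    def __init__(self):
        self.majority_label = None

    def train(self, train_data, **hyperparameters):
        labels = [label for _, label in train_data]
        self.majority_label = max(set(labels), key=labels.count)

    def predict(self, inference_data):
        return [self.majority_label for _ in inference_data]


if __name__ == "__main__":
    model = MyEdgeCloudModel()
    model.train([("x1", "cat"), ("x2", "dog"), ("x3", "cat")])
    print(model.predict(["x4", "x5"]))  # ['cat', 'cat']
```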

### Architecture and Modules
The architecture and related concepts are shown in the figure below. ianvs is designed to run within a single node. Critical components include the following; a hypothetical sketch of how they could cooperate is given after the architecture figure.
- Test Environment Manager: the CRUD of test environments serving for global usage
- Test Case Controller: control the runtime behavior of test cases, e.g., instance generation and destruction
    - Generation Assistant: assist users in generating test cases based on certain rules or constraints, e.g., the range of parameters
    - Simulation Controller: control the simulation process of edge-cloud synergy AI, including the instance generation and destruction of simulation containers
- Story Manager: the output management and presentation of the test case, e.g., leaderboards


![](images/ianvs_arch.png)
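
To make the division of labor concrete, the single-node flow could be sketched as follows. The class and method names mirror the concepts above but are assumptions for illustration, not the actual ianvs implementation; the scoring logic is a placeholder.

```python
# A hypothetical, single-node sketch of how the components could cooperate.
# All names and signatures are illustrative assumptions, not the real ianvs code.

class GenerationAssistant:
    def generate(self, test_env, algorithm):
        # One test case per hyper-parameter setting declared in the test environment.
        return [{"env": test_env, "algorithm": algorithm, "params": params}
                for params in test_env.get("hyperparameters", [{}])]


class SimulationController:
    def run(self, test_case):
        # Placeholder for launching and tearing down simulation instances;
        # a real run would train/evaluate the algorithm and report metrics.
        return {"params": test_case["params"], "score": 0.0}


class StoryManager:
    def leaderboard(self, results):
        return sorted(results, key=lambda result: result["score"], reverse=True)


class BenchmarkingJob:
    def __init__(self):
        self.assistant = GenerationAssistant()
        self.simulation = SimulationController()
        self.story = StoryManager()

    def run(self, test_env, algorithm):
        cases = self.assistant.generate(test_env, algorithm)
        results = [self.simulation.run(case) for case in cases]
        return self.story.leaderboard(results)


if __name__ == "__main__":
    job = BenchmarkingJob()
    env = {"dataset": "example.csv", "hyperparameters": [{"lr": 0.1}, {"lr": 0.01}]}
    print(job.run(env, algorithm="my-algorithm"))
```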

Quite a few terms exist in ianvs, which include the detailed modules and instances.
![](images/ianvs_concept.png)

The concept definitions of modules have been shown in the Architecture and Modules section. In the following, we introduce the concepts of instances for easier understanding; a hypothetical data-model sketch follows the list.
- ``Benchmark``: standardized evaluation process recognized by academia or industry.
- ``Benchmarking Job``: the serving instance for an individual benchmarking with ianvs, which takes charge of the lifetime management of all possible ianvs components.
    - Besides components, a benchmarking job includes instances of a test environment, one or more test cases, a leaderboard, or a test report.
    - Different test environments lead to different benchmarking jobs and leaderboards. A benchmarking job can include multiple test cases.
- ``Test Object``: the targeted instance under benchmark testing. A typical example would be a particular algorithm or system.
- ``Test Environment (Test Env.)``: setups or configurations for benchmarking, typically excluding the test object.
    - It can include algorithm-wise and system-wise configurations.
    - It serves as the unique descriptor of a benchmarking job. Different test environments thus lead to different benchmarking jobs.
- ``Test Case``: the executable instance to evaluate the performance of the test object under a particular test environment. Thus, the test case is usually generated with a particular test environment and outputs testing results if executed.
    - It is the atomic unit of a benchmark. That is, a benchmarking job can include quite a few test cases.
- ``Attribute (Attr.) of Test Case``: attributes or descriptors of a test case, e.g., id, name, and time stamp.
- ``Algorithm Paradigm``: acknowledged AI process which usually includes quite a few modules that can be implemented with replaceable algorithms, e.g., federated learning, which includes the modules of local training and global aggregation.
- ``Algorithm Module``: the component of the algorithm paradigm, e.g., the global aggregation module of the federated learning paradigm.
- ``Leaderboard (LDB)/ Test Report (TR)``: the ranking of the test object under a specific test environment.
    - The local node holds the local leaderboard for private usage.
    - The local leaderboard can be uploaded to a shared space (e.g., GitHub) as the global leaderboard.
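
The relations between these instances could be modeled roughly as in the sketch below. The dataclass and field names (e.g., `timestamp`, `test_object`) are illustrative assumptions rather than ianvs data structures.

```python
# A hypothetical data model for the instance concepts above; all names are
# illustrative assumptions, not fields of the actual ianvs implementation.
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass
class TestEnv:
    name: str                    # unique descriptor of a benchmarking job
    config: Dict[str, Any]       # algorithm-wise and system-wise setups


@dataclass
class TestCase:
    case_id: int                 # attribute (Attr.) of the test case
    name: str
    timestamp: datetime
    env: TestEnv                 # a test case is generated with one test environment
    test_object: str             # e.g., the algorithm under test


@dataclass
class BenchmarkingJob:
    env: TestEnv
    cases: List[TestCase] = field(default_factory=list)        # atomic units of the benchmark
    leaderboard: List[Dict[str, Any]] = field(default_factory=list)


if __name__ == "__main__":
    env = TestEnv(name="single-task-learning", config={"dataset": "example.csv"})
    case = TestCase(case_id=1, name="case-1", timestamp=datetime.now(),
                    env=env, test_object="example-model")
    print(BenchmarkingJob(env=env, cases=[case]))
```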


### Details of Modules

The proposal includes the Test-Environment Manager, Test-case Controller, and Story Manager in the Distributed Synergy AI benchmarking toolkits, where
1. Test-Environment Manager supports the CRUD of test environments, which basically includes
    - Algorithm-wise configuration
        - Public datasets
        - Pre-processing algorithms
    - System constraints or budgets
        - End-to-end cross-node
        - Per node
1. Test-case Controller, which includes but is not limited to the following components
    - Templates of common distributed-synergy-AI paradigms, which can help developers prepare their test cases without too much effort. Such paradigms include edge-cloud synergy joint inference, incremental learning, federated learning, and lifelong learning.
    - Simulation tools. Develop simulated test environments for test cases.
    - Other tools to assist test-case generation. For instance, prepare test cases based on a given range of hyper-parameters; a hypothetical sketch follows this list.
1. Story Manager, which includes but is not limited to the following components
    - Leaderboard generation
    - Test report generation
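
As a minimal sketch of the hyper-parameter-based assistance mentioned under the Test-case Controller, given ranges could be expanded into one test-case description per combination. The function name and the example parameters are assumptions for illustration only.

```python
# A hypothetical sketch of hyper-parameter-based test-case generation:
# expand given ranges into one candidate configuration per combination.
from itertools import product


def expand_hyperparameters(ranges):
    """ranges: dict mapping a hyper-parameter name to the values to try."""
    names = sorted(ranges)
    combos = product(*(ranges[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    grid = expand_hyperparameters({"learning_rate": [0.1, 0.01], "batch_size": [16, 32]})
    # Four candidate test cases: every combination of learning_rate and batch_size.
    print(grid)
```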


### Roadmap


Upon the release of ianvs, the roadmap would be as follows
- AUG 2022: Release Another Use Case and Advanced Algorithm Paradigm - Non-structured lifelong learning paradigm in ianvs
- SEP 2022: Release Another Use Case, Dataset, and Algorithm Paradigm - Another structured dataset and lifelong learning paradigm in ianvs
- OCT 2022: Release Advanced Benchmark Presentation - shared space for story manager to present your work in public
- NOV 2022: Release Advanced Algorithm Paradigm - Re-ID with Multi-edge Synergy Inference in ianvs
- DEC 2022: Release Simulation Tools
- JUN 2023: More datasets, algorithms, and test cases with ianvs
- DEC 2023: Standards, coding events, and competitions with ianvs
Binary file modified sig-ai/images/ianvs_arch.png
Binary file modified sig-ai/images/ianvs_concept.png
Binary file modified sig-ai/images/user_flow.png
