Attempt to make the usage of the Android emulator in CIs more robust #17903

Merged: 9 commits merged into main on Oct 14, 2023

skottmckay
Contributor

Description

Android emulator usage updates:

  • Change the approach to detecting that boot has completed
    • use -delay-adb and a simple command (ls) with wait-for-device as the first step
      • this ensures enough startup has occurred for adb to be responsive
    • use a secondary loop on the Python side to check for sys.boot_completed to be set
      • doing the check on the Python side provides more feedback and seems to work well
  • make the 'stop' logic more precise by using psutil
  • add an internal timeout of 20 mins for emulator startup
    • waiting for the CI job's overall timeout is way too long
    • the value is hardcoded for now (most CIs start up in under 10 mins) but could be made configurable if needed
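
The two-stage boot check and the internal timeout described above could look roughly like the sketch below. It is not the code from this PR: the function names (wait_for_adb, read_boot_completed, wait_for_boot), the 10 second poll interval, and the 30 second per-call adb timeout are all assumptions made for illustration. Only the adb wait-for-device step, the sys.boot_completed property, and the 20 minute limit come from the description.

```python
import subprocess
import time
from typing import Callable


def wait_for_adb(adb_path: str = "adb") -> None:
    """First step: block until adb can run a trivial command (ls) on the device."""
    subprocess.run(
        [adb_path, "wait-for-device", "shell", "ls"],
        check=True,
        stdout=subprocess.DEVNULL,
    )


def read_boot_completed(adb_path: str = "adb") -> str:
    """Return the sys.boot_completed property, or "" if adb fails or hangs."""
    try:
        result = subprocess.run(
            [adb_path, "shell", "getprop", "sys.boot_completed"],
            capture_output=True,
            text=True,
            timeout=30,  # hypothetical per-call limit, not from the PR
            check=False,
        )
    except subprocess.TimeoutExpired:
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def wait_for_boot(
    get_prop: Callable[[], str] = read_boot_completed,
    timeout_secs: float = 20 * 60,  # the 20 minute limit from the description
    poll_secs: float = 10.0,  # hypothetical poll interval
) -> None:
    """Second step: poll until sys.boot_completed is "1" or the timeout expires."""
    deadline = time.monotonic() + timeout_secs
    while time.monotonic() < deadline:
        if get_prop() == "1":
            print("Boot completed.")
            return
        remaining = max(0.0, deadline - time.monotonic())
        print(f"Waiting for sys.boot_completed ({remaining:.0f}s left)...")
        time.sleep(poll_secs)
    raise TimeoutError(f"Emulator did not boot within {timeout_secs} seconds")
```

Passing the property reader in as get_prop keeps the loop testable without a device, and printing on every poll is one way to get the extra feedback in the CI log that the description mentions.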

CI updates:

  • add a template for using the Android emulator
    • update CIs to use the template
  • reorder the React Native CI
    • minimize the time the Android emulator or iOS simulator is running by moving some build steps around
    • don't run both at the same time
      • doing so is unnecessary and potentially adds significant memory pressure to the machine
  • fix the QNN Android emulator CI as much as possible
    • everything now works apart from running onnx_test_runner with the QNN EP

Motivation and Context

Fix inconsistent detection of the emulator boot completing.

@skottmckay skottmckay requested a review from a team as a code owner October 12, 2023 08:49
@skottmckay
Contributor Author

I've run the new setup many times and don't see the hang behavior. One CI from the PR did return an error from the emulator relatively quickly that I haven't seen before. I can't find any documentation explaining what -6 means, though. If this happens more often we could add a retry to the startup to see if that helps.

https://dev.azure.com/onnxruntime/2a773b67-e88b-4c7f-9fc0-87d31fea8ef2/_apis/build/builds/1166434/logs/65

2023-10-12T11:41:19.0440240Z File "/Users/runner/work/1/s/tools/python/util/android/android.py", line 173, in start_emulator
2023-10-12T11:41:19.0477390Z raise RuntimeError(f"Emulator exited early with return code: {emulator_ret}")
2023-10-12T11:41:19.0505930Z RuntimeError: Emulator exited early with return code: -6
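
One possible reading of the -6, assuming start_emulator obtains emulator_ret from Python's subprocess module (the traceback does not show this, so treat it as an assumption): on POSIX, subprocess reports a return code of -N when the child process was terminated by signal N, and signal 6 is SIGABRT on Linux and macOS. The helper below is a hypothetical illustration of decoding such a code, not something taken from android.py.

```python
import signal


def describe_return_code(return_code: int) -> str:
    """Describe a subprocess-style return code in human-readable form."""
    if return_code >= 0:
        return f"exited with code {return_code}"
    # Negative values follow the POSIX subprocess convention: -N means signal N.
    signal_number = -return_code
    try:
        name = signal.Signals(signal_number).name
    except ValueError:
        # The number is not a signal known on this platform.
        name = "unknown signal"
    return f"terminated by signal {signal_number} ({name})"


print(describe_return_code(-6))
```

If that reading holds, the emulator aborted itself rather than exiting with an error status, which would fit a crash during startup and could make the proposed retry worth trying.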

@skottmckay skottmckay merged commit ae21199 into main Oct 14, 2023
91 checks passed
@skottmckay skottmckay deleted the skottmckay/MakeAndroidEmulatorGreatAgain.Maybe branch October 14, 2023 22:42
jchen351 pushed a commit that referenced this pull request Oct 18, 2023
Co-authored-by: Edward Chen <[email protected]>
skottmckay added a commit to microsoft/onnxruntime-extensions that referenced this pull request Oct 30, 2023
skottmckay added a commit to microsoft/onnxruntime-extensions that referenced this pull request Oct 31, 2023
* Update JDK version to 17 in ci.yml

* Update com.diffplug.spotless to 6.22.0.

* Copy updated scripts to start/stop the emulator from ORT from microsoft/onnxruntime#17903.
Minimize the time the emulator is running as well.

* Fix includes

* Update to JDK 17 in packaging pipelines.

* Fix pool name.

---------

Co-authored-by: Edward Chen <[email protected]>
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024