Merge pull request #1357 from BmanClark/main

new Profiling ML on Arm Learning Path
ArmDeveloperEcosystem · Nov 14, 2024 · ad9dba5
2 parents ab47648 + fc1a47d
commit ad9dba5

Showing 14 changed files with 541 additions and 0 deletions.
diff --git a/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/Streamline.png b/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/Streamline.png
diff --git a/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/_index.md b/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/_index.md
@@ -0,0 +1,38 @@
+---
+title: Profile the performance of ML models on Arm
+
+minutes_to_complete: 60
+
+who_is_this_for: This is an introductory topic for software developers who want to learn how to profile the performance of their ML models running on Arm devices.
+
+learning_objectives: 
+    - Profile the execution times of ML models on Arm devices.
+    - Profile ML application performance on Arm devices.
+
+prerequisites:
+    - An Arm-powered Android smartphone, and USB cable to connect with it.
+
+author_primary: Ben Clark
+
+### Tags
+skilllevels: Introductory
+subjects: ML
+armips:
+    - Cortex-X
+    - Cortex-A
+    - Mali
+    - Immortalis
+tools_software_languages:
+    - Android Studio
+    - tflite
+operatingsystems:
+    - Android
+    - Linux
+
+
+### FIXED, DO NOT MODIFY
+# ================================================================================
+weight: 1                       # _index.md always has weight of 1 to order correctly
+layout: "learningpathall"       # All files under learning paths have this same wrapper
+learning_path_main_page: "yes"  # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
+---
diff --git a/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/_next-steps.md b/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/_next-steps.md
@@ -0,0 +1,20 @@
+---
+next_step_guidance: You might be interested in learning how to profile your Unity apps on Android.
+
+recommended_path: /learning-paths/smartphones-and-mobile/profiling-unity-apps-on-android/
+
+further_reading:
+    - resource:
+        title: Arm Streamline User Guide  
+        link: https://developer.arm.com/documentation/101816/latest/
+        type: documentation
+
+
+
+# ================================================================================
+#       FIXED, DO NOT MODIFY
+# ================================================================================
+weight: 21                  # set to always be larger than the content in this path, and one more than 'review'
+title: "Next Steps"         # Always the same
+layout: "learningpathall"   # All files under learning paths have this same wrapper
+---
diff --git a/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/_review.md b/content/learning-paths/smartphones-and-mobile/profiling-ml-on-arm/_review.md
@@ -0,0 +1,45 @@
+---
+review:
+    - questions:
+        question: >
+            Streamline Profiling lets you profile:
+        answers:
+            - Arm CPU activity
+            - Arm GPU activity
+            - when your Neural Network is running
+            - All of the above
+        correct_answer: 4                    
+        explanation: >
+            Streamline will show you CPU and GPU activity (and a lot more counters!), and if Custom Activity Maps are used, you can see when your Neural Network and other parts of your application are running.
+
+    - questions:
+        question: >
+            Does Android Studio have a profiler?
+        answers:
+            - "Yes"
+            - "No"
+        correct_answer: 1                   
+        explanation: >
+            Yes, Android Studio has a built-in profiler that can be used to monitor the memory usage of your app, among other things.
+               
+    - questions:
+        question: >
+            Is there a way to profile what is happening inside your Neural Network?
+        answers:
+            - Yes, Streamline just shows you out of the box
+            - No.
+            - Yes, ArmNN's ExecuteNetwork can do this
+            - Yes, Android Studio Profiler can do this
+        correct_answer: 3          
+        explanation: >
+            Standard profilers don't offer an easy way to see what is happening inside an ML framework while a model runs in it. ArmNN's ExecuteNetwork can do this for TensorFlow Lite models, and ExecuTorch has tools that can do this for PyTorch models.
+
+
+
+# ================================================================================
+#       FIXED, DO NOT MODIFY
+# ================================================================================
+title: "Review"                 # Always the same title
+weight: 20                      # Set to always be larger than the content in this path
+layout: "learningpathall"       # All files under learning paths have this same wrapper
+---
diff --git a/...-paths/smartphones-and-mobile/profiling-ml-on-arm/android-profiling-version.png b/...-paths/smartphones-and-mobile/profiling-ml-on-arm/android-profiling-version.png
diff --git a/...aths/smartphones-and-mobile/profiling-ml-on-arm/app-profiling-android-studio.md b/...aths/smartphones-and-mobile/profiling-ml-on-arm/app-profiling-android-studio.md
@@ -0,0 +1,45 @@
+---
+title: Memory Profiling with Android Studio
+weight: 4
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+
+## Android Memory Profiling
+Memory is often a constraint in ML, as models and data keep growing. To profile an Android app's memory, use the profiler built into Android Studio. It can monitor your app's memory usage and help you find memory leaks.
+
+To find the Profiler, open your project in Android Studio and click on the *View* menu, then *Tool Windows*, and then *Profiler*. This opens the Profiler window. Connect your device, with Developer Mode enabled, using a USB cable, and you should then be able to select your app's process. Several profiling tasks are available from here.
+
+With an Android ML app, you will most likely need to look at memory from both the Java/Kotlin side and the native side. The Java/Kotlin side is where the app runs, and may be where buffers are allocated for input and output if, for example, you're using LiteRT (formerly known as TensorFlow Lite). The native side is where the ML framework runs. Looking at the memory consumption for Java/Kotlin and for native are two separate tasks in the Profiler: *Track Memory Consumption (Java/Kotlin Allocations)* and *Track Memory Consumption (Native Allocations)*.
+
+Before you start either task, you have to build your app for profiling. Instructions for this, and for general profiling setup, are in the [Android Studio profiling documentation](https://developer.android.com/studio/profile). Start the profiling version of the app that matches the task.
+
+![Android Studio profiling run types alt-text#center](android-profiling-version.png "Figure 1. Profiling run versions")
+
+For the Java/Kotlin side, you want the **debuggable** "Profile 'app' with complete data", which is based on the debug variant. For the native side, you want the **profileable** "Profile 'app' with low overhead", which is based on the release variant.
+
+### Java/Kotlin
+
+To look at the [Java/Kotlin side](https://developer.android.com/studio/profile/record-java-kotlin-allocations), choose *Profiler: Run 'app' as debuggable*, and then select the *Track Memory Consumption (Java/Kotlin Allocations)* task. Navigate to the part of the app you wish to profile, at which point you can start profiling. The bottom of the Profiling window should look like Figure 2. Click *Start Profiler Task*.
+
+![Android Studio Start Profile alt-text#center](start-profile-dropdown.png "Figure 2. Start Profile")
+
+When you have captured enough data, click *Stop* to end the profiling. A timeline graph of memory usage is then shown. Although Android Studio has a nicer interface for the Java/Kotlin side than for the native side, the key to the timeline graph may be missing. The key is shown in Figure 3, so you can use it to interpret the colors.
+
+![Android Studio memory key alt-text#center](profiler-jk-allocations-legend.png "Figure 3. Memory key for the Java/Kotlin Memory Timeline")
+
+The default height of the Profiling view, and of the timeline graph within it, is usually too small, so increase both to get a readable graph. Click at different points on the graph to see the memory allocations at that time. Using the key, you can see how much memory is allocated to Java, Native, Graphics, Code, and so on.
+
+Further down, the *Table* shows the Java/Kotlin allocations for your selected time on the timeline. With ML, many of your allocations are likely to be `byte[]` for byte buffers, or possibly `int[]` for image data. Clicking on a data type opens up the individual allocations, showing their size and when they were allocated. This helps you quickly narrow down what they are used for and whether they are all needed.
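To relate the sizes in the *Table* to your model, it can help to work out how large you expect the input buffers to be. The following is a minimal sketch in plain Java, not code from this Learning Path. The 224x224x3 float32 input shape and the class and method names are assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class TensorBufferSizes {
    // Bytes needed for a float32 tensor of shape [1, height, width, channels].
    public static long float32InputBytes(int height, int width, int channels) {
        return (long) height * width * channels * Float.BYTES;
    }

    // Bytes held by an int[] that stores one packed ARGB pixel per element.
    public static long argbPixelBytes(int height, int width) {
        return (long) height * width * Integer.BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical input shape; substitute the shape your model expects.
        int height = 224, width = 224, channels = 3;

        long inputBytes = float32InputBytes(height, width, channels);
        long pixelBytes = argbPixelBytes(height, width);

        // A heap ByteBuffer is backed by a byte[] of the same capacity.
        ByteBuffer input = ByteBuffer.allocate((int) inputBytes)
                                     .order(ByteOrder.nativeOrder());

        System.out.println("float32 input buffer: " + inputBytes + " bytes");
        System.out.println("ARGB pixel array:     " + pixelBytes + " bytes");
        System.out.println("allocated capacity:   " + input.capacity() + " bytes");
    }
}
```

If the allocations you see in the profiler are much larger or more numerous than this kind of estimate suggests, that may indicate buffers being re-created per frame rather than reused.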
+
+### Native
+
+For the [native side](https://developer.android.com/studio/profile/record-native-allocations), the process is similar but with different options. Choose *Profiler: Run 'app' as profileable*, and then select the *Track Memory Consumption (Native Allocations)* task. Here you have to *Start profiler task from: Process Start*. Choose *Stop* once you've captured enough data.
+
+The Native view doesn't have the same timeline graph as the Java/Kotlin side, but it does have the *Table* and *Visualization* tabs. The *Table* tab no longer lists individual allocations. Instead, it offers the options *Arrange by allocation method* and *Arrange by callstack*. Choose *Arrange by callstack* to trace which functions allocated significant memory. Potentially more useful, you can also see Remaining Size.
+
+In the *Visualization* tab you can see the callstack as a graph, and once again you can look at total Allocations Size or Remaining Size. Remaining Size shows what is still allocated at the end of the profiling. By looking a few steps up the stack for functions that belong to the framework you are using, you can probably see which allocations are related to the ML model. Much of the memory may be allocated by that framework rather than by your code, and you may not have much control over it, but it is still useful to know where the memory is going.
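The difference between Allocations Size and Remaining Size can be illustrated with a small model. This is only a sketch of the idea, assuming each allocation is recorded with a call site, a size, and whether it was freed before the capture ended. The class, field, and call-site names are hypothetical and do not reflect the profiler's own data format.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RemainingSizeSketch {
    // One allocation event. The fields are hypothetical, not the profiler's format.
    public static final class Allocation {
        final String callSite; // function that requested the memory
        final long bytes;      // size of the allocation
        final boolean freed;   // true if it was released before the capture ended

        public Allocation(String callSite, long bytes, boolean freed) {
            this.callSite = callSite;
            this.bytes = bytes;
            this.freed = freed;
        }
    }

    // Total bytes ever allocated per call site, whether or not they were freed.
    public static Map<String, Long> allocationsSize(List<Allocation> events) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Allocation a : events) {
            totals.merge(a.callSite, a.bytes, Long::sum);
        }
        return totals;
    }

    // Bytes still allocated per call site when the capture ended.
    public static Map<String, Long> remainingSize(List<Allocation> events) {
        Map<String, Long> remaining = new LinkedHashMap<>();
        for (Allocation a : events) {
            remaining.merge(a.callSite, a.freed ? 0L : a.bytes, Long::sum);
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<Allocation> events = Arrays.asList(
            new Allocation("loadModel", 40_000_000L, false),
            new Allocation("preprocessFrame", 600_000L, true),
            new Allocation("preprocessFrame", 600_000L, true),
            new Allocation("preprocessFrame", 600_000L, false));

        System.out.println("Allocations Size: " + allocationsSize(events));
        System.out.println("Remaining Size:   " + remainingSize(events));
    }
}
```

In this model, a call site with a large Allocations Size but a small Remaining Size is churning through temporary memory, while a large Remaining Size points at memory that is held for the lifetime of the capture, such as model weights, or at a leak.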
+
+## Other platforms
+
+On other platforms, you will need a different memory profiler. The objective is the same: work out where the memory is being used, and whether the problem is leaks or simply too much memory in use. There are often trade-offs between memory and speed, and they can be weighed more sensibly when the numbers involved are known.
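Where no profiler is to hand, a rough number is still better than none. As a minimal sketch, assuming a JVM-based platform, the standard `Runtime` API can report heap usage before and after a step of interest. The class name and the 16 MiB stand-in buffer are assumptions for illustration, and the readings are approximate because the garbage collector can run at any time.

```java
public class HeapUsageSketch {
    // Heap currently in use by this JVM process, in bytes.
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Stand-in for loading a model or input data: 16 MiB kept alive on the heap.
        byte[] buffer = new byte[16 * 1024 * 1024];
        buffer[0] = 1;

        long after = usedHeapBytes();

        System.out.println("used before: " + before + " bytes");
        System.out.println("used after:  " + after + " bytes");
        System.out.println("buffer size: " + buffer.length + " bytes");
    }
}
```

This only covers the managed heap, so memory allocated natively by an ML framework will not appear in these figures.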