-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathmem2mem.txt
36 lines (29 loc) · 1.81 KB
/
mem2mem.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
memory-to-memory DMA vs library memcpy()
DUE 84MHz maple 72Mhz teensy 96MHz mbed LPC1768 96mhz
ZERO 48MHz CC3200 80MHz STM32L4 80MHz mbed MK64F 120 mhz
For the DUE DMA memory-to-memory transfer, we use the 32-byte FIFO channel 5
and ASAP_CFG. We compare the DMA times on the DUE,teensy, and maple with the
library memcpy() and memset() and with a for-loop (dst[i] = src[i]) using
1000 32-bit words. memcpy32() is the DMA version. The library memcpy()
uses an unrolled (64) loop of word load/store's. The teensy 3.* at 96MHz
has a max eDMA SRAM-to-SRAM speed of 192MBs (1536mbs).
Times are in microseconds
maple DUE teensy 3.1 mbed ZERO CC3200 STM32L4 MK64F
loop 269 132 95 94 9 217 461 26
memcpy32 93 31 23 27 9 43 31 69 35
memcpy 62 52 96 50 9 194 15 29 26
memset32 94 34 23 43 19
memset 38 41 51 24 6 12 16 17
DUEling memory copies:
In mem2mem2 we used an asynchronous DMA memory transfer of 1024 32-bit words,
starting the DMA version then starting the library memcpy. For the maple,
the library memcpy() completes first (64us), then the DMA finishes (122us)
with total elapsed time of 122us.
For the DUE, the DMA copy finishes first (34us), then the memcpy (80us),
with total elapsed time of 82us. Both src/dst pairs were in SRAM0.
With the memcpy() src/dst pairs in SRAM1, DMA took 33us, and the memcpy
took 60us, with elapsed time of 70us.
For the teensy 3.0, the DMA finishes first (27us), and then the memcpy (105us),
with total elapsed time of 107us. Teensy test used two different destination
buffers, but same source buffer (only 16KB RAM on 3.0). For teensy 3.1, DMA
finishes first (48us), then memcpy (65us), for total 66us.