[glibc] Optimize memcopy to Improve EEMBC Network 2.0 ip_reassembly / nat #11

Open
vineetgarc opened this issue Nov 6, 2020 · 0 comments

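glibc memcpy shows up as the top hotspot when profiling EEMBC Network 2.0, specifically in two sub-tests: ip_reassembly and nat. The profile below is for ip_reassembly; the memcpy time appears under _wordcopy_fwd_aligned, the word-copy helper of the generic implementation.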

# perf stat   gcc/bin/ip_reassembly.exe -autogo >/tmp/x

 Performance counter stats for 'gcc/bin/ip_reassembly.exe -autogo':

           1137.96 msec task-clock                #    0.965 CPUs utilized          
               229      context-switches          #    0.201 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
               733      page-faults               #    0.644 K/sec                  
     1,137,637,340      cycles                    #    1.000 GHz                    
       347,444,844      instructions              #    0.31  insn per cycle         
        30,389,001      branches                  #   26.705 M/sec                  
         5,042,494      branch-misses             #   16.59% of all branches        

       1.179703860 seconds time elapsed
# perf record -c 10000 gcc/bin/ip_reassembly.exe -autogo >/tmp/x
#
# Samples: 117K of event 'cycles'
# Event count (approx.): 1176170000
#
# Overhead       Samples  Command          Shared Object      Symbol            
# ........  ............  ...............  .................  .................. ..................

  61.29%         72084  ip_reassembly.e  libc-2.32.so       [.] _wordcopy_fwd_aligned
  15.95%         18762  ip_reassembly.e  ip_reassembly.exe  [.] ip_input
   9.82%         11551  ip_reassembly.e  ip_reassembly.exe  [.] ip_reass
   2.47%          2905  ip_reassembly.e  ip_reassembly.exe  [.] m_cat
   1.86%          2185  ip_reassembly.e  libc-2.32.so       [.] memmove

The ARC glibc port uses the generic implementations of memcpy/memset, which are already decent but can be optimized for ARC with:

  • unaligned access
  • double load/store
  • any other arch-specific helpers, such as clz
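To make the first two ideas concrete, here is a rough C sketch of a forward copy that moves 8 bytes per iteration and skips the alignment fix-up entirely. This is not the glibc code and not a proposed patch: the names `copy_fwd_dword` and `u64_unaligned` are made up for illustration, the typedef relies on GCC/Clang attributes, and it assumes the target handles unaligned 64-bit accesses (or that the compiler splits them correctly when it does not).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type (not from glibc): a 64-bit word that may alias
   any object and carries no alignment guarantee. GCC/Clang extension. */
typedef uint64_t __attribute__((may_alias, aligned(1))) u64_unaligned;

/* Sketch only: forward copy of n bytes, 8 bytes per iteration, without
   aligning src or dst first. The regions must not overlap. */
void *copy_fwd_dword(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Main loop: one 64-bit load and one 64-bit store per iteration. */
    while (n >= sizeof(uint64_t)) {
        *(u64_unaligned *)d = *(const u64_unaligned *)s;
        d += sizeof(uint64_t);
        s += sizeof(uint64_t);
        n -= sizeof(uint64_t);
    }

    /* Tail: the remaining 0..7 bytes, one byte at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }

    return dst;
}
```

Whether the main loop actually becomes double-word load/store instructions depends on the compiler and target options, so the generated assembly needs to be checked. Note also that GCC can recognize a loop like this as a copy idiom and replace it with a call to memcpy, which would have to be inhibited if such code were used inside memcpy itself.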
@vineetgarc vineetgarc self-assigned this Nov 6, 2020
@vineetgarc vineetgarc transferred this issue from foss-for-synopsys-dwc-arc-processors/toolchain Mar 10, 2021