Pops init on demand #282

Draft. Wants to merge 4 commits into base: master.

@alhom
Contributor

alhom commented Sep 25, 2024

... for fewer reads when creating VlsvReaders.
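
For readers unfamiliar with the pattern, here is a minimal sketch of on-demand initialization. It is not the VlsvReader implementation: the class `LazyReader`, the attribute `mesh_extents`, the helper `_read`, and the file name are hypothetical, and the "read" is simulated with a counter rather than real file I/O.

```python
import functools


class LazyReader:
    """Hypothetical stand-in for a file reader; NOT the VlsvReader API."""

    def __init__(self, path):
        # The constructor only stores the path; it performs no reads.
        self.path = path
        self.read_count = 0  # counts simulated reads

    def _read(self, name):
        # Simulated read; a real reader would touch the file here.
        self.read_count += 1
        return f"{name} from {self.path}"

    @functools.cached_property
    def mesh_extents(self):
        # Evaluated on first access only, then cached on the instance.
        return self._read("mesh_extents")


reader = LazyReader("example.vlsv")
print(reader.read_count)  # 0: nothing was read at construction
reader.mesh_extents
reader.mesh_extents
print(reader.read_count)  # 1: the second access used the cached value
```

With this shape, opening many readers costs no reads for metadata that is never requested, at the price of a small delay on first access.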

@markusbattarbee
Contributor

Is it really worth optimizing out the simple reading of vmesh extents etc? That's quite limited reads, isn't it?

@alhom
Contributor Author

alhom commented Sep 26, 2024

When extracting a time series of a bulk value over a thousand files? I'd hope it'd help a bit.

@alhom
Contributor Author

alhom commented Sep 26, 2024

Test: open a full set of bulk files as readers.

import pytools as pt
import glob
import cProfile

# Collect every bulk file of the run
fns = glob.glob('/wrk-vakka/group/spacephysics/vlasiator/3D/FHA/bulk1/bulk1.*.vlsv')

print("Opening readers on ",len(fns),"files")
# Profile only the construction of the readers; no variables are read afterwards
cProfile.run('readers = [pt.vlsvfile.VlsvReader(f) for f in fns]', sort='cumulative')

I ran the same script on both the unoptimized and optimized branches, twice in succession, to try to keep the Vakka filesystem pre-heated. The first runs took 61.075 seconds and 14.501 seconds; the pre-heated runs took 19.727 seconds and 2.359 seconds, respectively. The full outputs of the pre-heated runs are below. Notably (unoptimized vs optimized):

2118622 vs 458154 function calls
19516 vs 5060 read calls
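
As a rough way to put these figures in proportion, the sketch below computes the speedup and the average number of read calls per file. The numbers are copied from the pre-heated runs reported here (1112 files); the dictionary layout and variable names are only illustrative.

```python
# Figures copied from the pre-heated cProfile runs in this comment.
n_files = 1112
runs = {
    "unoptimized": {"seconds": 19.727, "read_calls": 19516},
    "optimized": {"seconds": 2.359, "read_calls": 5060},
}

# Wall-time ratio between the two branches.
speedup = runs["unoptimized"]["seconds"] / runs["optimized"]["seconds"]

# Average number of read() calls issued per opened file.
reads_per_file = {name: r["read_calls"] / n_files for name, r in runs.items()}

print(f"speedup: {speedup:.2f}x")
for name, value in reads_per_file.items():
    print(f"{name}: {value:.2f} reads per file")
```

This works out to roughly an 8.4x speedup on a warm filesystem, and about 17.6 versus 4.6 read calls per file.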

Unoptimized:

(python3.10.4) ~/proj/analysator$ python test_timeevol.py 
INFO:Using LaTeX formatting
INFO:generated new fontManager
INFO:Using matplotlib version 3.8.3
Opening readers on  1112 files
         2118622 function calls (2118618 primitive calls) in 19.727 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   19.727   19.727 {built-in method builtins.exec}
        1    0.000    0.000   19.727   19.727 <string>:1(<module>)
        1    0.002    0.002   19.727   19.727 <string>:1(<listcomp>)
     1112    0.075    0.000   19.724    0.018 vlsvreader.py:159(__init__)
    19516    0.400    0.000   17.687    0.001 vlsvreader.py:860(read)
    19516   15.852    0.001   15.867    0.001 {built-in method numpy.fromfile}
     1112    0.023    0.000    1.573    0.001 vlsvreader.py:369(__read_xml_footer)
    10620    0.019    0.000    1.488    0.000 vlsvreader.py:3034(read_parameter)
     2224    0.019    0.000    0.986    0.000 ElementTree.py:1336(XML)
     2224    0.967    0.000    0.967    0.000 {method 'feed' of 'xml.etree.ElementTree.XMLParser' objects}
    20628    0.800    0.000    0.800    0.000 {built-in method io.open}
     8896    0.552    0.000    0.552    0.000 {method 'read' of '_io.BufferedReader' objects}
    78064    0.176    0.000    0.505    0.000 ast.py:54(literal_eval)
    78064    0.042    0.000    0.279    0.000 ast.py:33(parse)
    12232    0.196    0.000    0.256    0.000 vlsvreader.py:587(check_parameter)
    78064    0.225    0.000    0.225    0.000 {built-in method builtins.compile}
     7784    0.007    0.000    0.111    0.000 vlsvreader.py:387(<lambda>)
   908452    0.089    0.000    0.089    0.000 {method 'lower' of 'str' objects}
    21740    0.069    0.000    0.069    0.000 {method 'close' of '_io.BufferedReader' objects}
   335108    0.042    0.000    0.055    0.000 {built-in method builtins.isinstance}
    39032    0.011    0.000    0.029    0.000 abc.py:117(__instancecheck__)
    21740    0.027    0.000    0.027    0.000 {method 'seek' of '_io.BufferedReader' objects}
    78064    0.020    0.000    0.026    0.000 ast.py:82(_convert)
    98692    0.022    0.000    0.022    0.000 {built-in method builtins.len}
    33972    0.012    0.000    0.018    0.000 vlsvreader.py:55(__getattr__)
    39032    0.017    0.000    0.017    0.000 {built-in method _abc._abc_instancecheck}
     1112    0.002    0.000    0.014    0.000 posixpath.py:376(abspath)
    21128    0.013    0.000    0.013    0.000 {method 'format' of 'str' objects}
     1112    0.007    0.000    0.010    0.000 posixpath.py:337(normpath)
    78064    0.009    0.000    0.009    0.000 {method 'lstrip' of 'str' objects}
    33972    0.006    0.000    0.006    0.000 {built-in method builtins.getattr}
     1112    0.001    0.000    0.006    0.000 os.py:771(getenv)
     1112    0.001    0.000    0.005    0.000 _collections_abc.py:816(get)
     1112    0.002    0.000    0.004    0.000 os.py:674(__getitem__)
     2224    0.003    0.000    0.003    0.000 {built-in method _struct.unpack}
    19516    0.003    0.000    0.003    0.000 reduction.py:43(pass_op)
     1112    0.003    0.000    0.003    0.000 {built-in method numpy.asarray}
    20016    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
     1112    0.001    0.000    0.002    0.000 posixpath.py:60(isabs)
     1112    0.002    0.000    0.002    0.000 {built-in method numpy.array}
     1112    0.001    0.000    0.002    0.000 os.py:754(encode)
     1112    0.001    0.000    0.001    0.000 {method 'split' of 'str' objects}
     3336    0.001    0.000    0.001    0.000 {method 'startswith' of 'str' objects}
     1112    0.001    0.000    0.001    0.000 os.py:758(decode)
     1112    0.000    0.000    0.001    0.000 posixpath.py:41(_get_sep)
     2224    0.000    0.000    0.000    0.000 {method 'close' of 'xml.etree.ElementTree.XMLParser' objects}
     3336    0.000    0.000    0.000    0.000 {built-in method posix.fspath}
     1112    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
     1112    0.000    0.000    0.000    0.000 {method 'encode' of 'str' objects}
     1112    0.000    0.000    0.000    0.000 vlsvreader.py:52(__init__)
     1112    0.000    0.000    0.000    0.000 {method 'decode' of 'bytes' objects}
     1112    0.000    0.000    0.000    0.000 {built-in method builtins.iter}
      4/2    0.000    0.000    0.000    0.000 abc.py:121(__subclasscheck__)
      4/2    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
        1    0.000    0.000    0.000    0.000 os.py:1079(__subclasshook__)
        1    0.000    0.000    0.000    0.000 _collections_abc.py:78(_check_methods)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Optimized:

(python3.10.4) ~/proj/analysator$ python test_timeevol.py 
INFO:Using LaTeX formatting
INFO:generated new fontManager
INFO:Using matplotlib version 3.8.3
Opening readers on  1112 files
         458154 function calls (458150 primitive calls) in 2.359 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.359    2.359 {built-in method builtins.exec}
        1    0.000    0.000    2.359    2.359 <string>:1(<module>)
        1    0.002    0.002    2.359    2.359 <string>:1(<listcomp>)
     1112    0.035    0.000    2.358    0.002 vlsvreader.py:159(__init__)
     1112    0.021    0.000    1.392    0.001 vlsvreader.py:487(__read_xml_footer)
     5060    0.080    0.000    0.773    0.000 vlsvreader.py:978(read)
     2224    0.062    0.000    0.739    0.000 ElementTree.py:1336(XML)
     2224    0.676    0.000    0.676    0.000 {method 'feed' of 'xml.etree.ElementTree.XMLParser' objects}
     8896    0.666    0.000    0.666    0.000 {method 'read' of '_io.BufferedReader' objects}
    20240    0.215    0.000    0.299    0.000 ast.py:54(literal_eval)
     6172    0.222    0.000    0.222    0.000 {built-in method io.open}
     5060    0.184    0.000    0.187    0.000 {built-in method numpy.fromfile}
      612    0.001    0.000    0.143    0.000 vlsvreader.py:3152(read_parameter)
     7784    0.007    0.000    0.105    0.000 vlsvreader.py:505(<lambda>)
    20240    0.011    0.000    0.072    0.000 ast.py:33(parse)
    20240    0.058    0.000    0.058    0.000 {built-in method builtins.compile}
     1112    0.022    0.000    0.028    0.000 vlsvreader.py:705(check_parameter)
     7284    0.021    0.000    0.021    0.000 {method 'close' of '_io.BufferedReader' objects}
    88244    0.011    0.000    0.014    0.000 {built-in method builtins.isinstance}
     1112    0.001    0.000    0.013    0.000 posixpath.py:376(abspath)
    19516    0.006    0.000    0.010    0.000 vlsvreader.py:55(__getattr__)
     1112    0.006    0.000    0.009    0.000 posixpath.py:337(normpath)
    90880    0.009    0.000    0.009    0.000 {method 'lower' of 'str' objects}
     7284    0.009    0.000    0.009    0.000 {method 'seek' of '_io.BufferedReader' objects}
    10120    0.003    0.000    0.007    0.000 abc.py:117(__instancecheck__)
    20240    0.005    0.000    0.006    0.000 ast.py:82(_convert)
    26412    0.005    0.000    0.005    0.000 {built-in method builtins.len}
    10120    0.004    0.000    0.004    0.000 {built-in method _abc._abc_instancecheck}
    19516    0.004    0.000    0.004    0.000 {built-in method builtins.getattr}
     2224    0.003    0.000    0.003    0.000 {built-in method _struct.unpack}
    20240    0.002    0.000    0.002    0.000 {method 'lstrip' of 'str' objects}
     1112    0.001    0.000    0.002    0.000 posixpath.py:60(isabs)
     1112    0.002    0.000    0.002    0.000 {built-in method numpy.array}
    10008    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}
     1112    0.001    0.000    0.001    0.000 {method 'split' of 'str' objects}
     5060    0.001    0.000    0.001    0.000 reduction.py:43(pass_op)
     3336    0.001    0.000    0.001    0.000 {method 'startswith' of 'str' objects}
     1112    0.000    0.000    0.001    0.000 posixpath.py:41(_get_sep)
     1112    0.001    0.000    0.001    0.000 {method 'join' of 'str' objects}
     2224    0.000    0.000    0.000    0.000 {method 'close' of 'xml.etree.ElementTree.XMLParser' objects}
     3336    0.000    0.000    0.000    0.000 {built-in method posix.fspath}
     1112    0.000    0.000    0.000    0.000 vlsvreader.py:52(__init__)
     1112    0.000    0.000    0.000    0.000 {built-in method builtins.iter}
      4/2    0.000    0.000    0.000    0.000 abc.py:121(__subclasscheck__)
      4/2    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
        1    0.000    0.000    0.000    0.000 os.py:1079(__subclasshook__)
        1    0.000    0.000    0.000    0.000 _collections_abc.py:78(_check_methods)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
