[Latency] DefaultLookupDict too slow #1300
Comments
@hannw Would you provide the profiling scripts that arrive at these numbers? In the new version, we use the default dictionary for storing the mapping (gluon-nlp/src/gluonnlp/data/vocab.py, line 114 at commit 32e87d4).
@sxjscience For profiling, we used the default Python cProfile while running the training script, and snakeviz to visualize the breakdown. We are currently using 0.8.x of gluon-nlp, so maybe that's why the numbers differ.
@hannw Thanks for the interest. Note that the master branch is now used for the numpy-compatible version of gluonnlp (#1298), which relies on MXNet 2.0 nightly builds (available to developers at https://dist.mxnet.io/python).
@hannw I think the above comparison is not apples-to-apples for our supported use cases. The Vocab class is designed to handle both cases, depending on whether the unknown token is set. If the unknown token is not set, it should throw an error for out-of-vocabulary keys, according to the definition of the class. If you know beforehand that you always have an unknown token, a good option may be to use the built-in dictionary directly instead of Vocab. I compared the Vocab class on 0.x, 0.8.3, and 1.x:
# Tests done on python3.7, OSX 10.15.6
# GluonNLP 0.x
# w/ unknown token
v = Vocab({k:1 for k in ['a', 'b', 'c']})
%timeit v['a']
530 ns ± 17.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit v['abc']
565 ns ± 24 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b']
%timeit v[keys]
3.32 µs ± 149 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b'] * 1000
%timeit v[keys]
2.67 ms ± 144 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# w/o unknown token
v = Vocab({k:1 for k in ['a', 'b', 'c']}, unknown_token=None)
%timeit v['a']
362 ns ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b']
%timeit v[keys]
1.17 µs ± 28.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b'] * 1000
%timeit v[keys]
582 µs ± 18.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# GluonNLP 0.8.3
# w/ unknown token
v = Vocab({k:1 for k in ['a', 'b', 'c']})
%timeit v['a']
530 ns ± 17.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit v['abc']
550 ns ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b']
%timeit v[keys]
3.2 µs ± 166 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b'] * 1000
%timeit v[keys]
2.65 ms ± 121 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# w/o unknown token
v = Vocab({k:1 for k in ['a', 'b', 'c']}, unknown_token=None)
%timeit v['a']
362 ns ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b']
%timeit v[keys]
1.13 µs ± 14.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b'] * 1000
%timeit v[keys]
618 µs ± 51.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# GluonNLP master
# w/ unknown token
v = Vocab({k:1 for k in ['a', 'b', 'c']})
%timeit v['a']
598 ns ± 14.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit v['abc']
646 ns ± 34.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b']
%timeit v[keys]
2.34 µs ± 200 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b'] * 1000
%timeit v[keys]
1.34 ms ± 24.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# w/o unknown token
v = Vocab({k:1 for k in ['a', 'b', 'c']}, unknown_token=None)
%timeit v['a']
641 ns ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b']
%timeit v[keys]
2.37 µs ± 89.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
keys=['a', 'c', 'c', 'b', 'c', 'c', 'c', 'c', 'a', 'b'] * 1000
%timeit v[keys]
1.34 ms ± 27.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
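The gap reported in the issue can be reproduced with a simplified stand-in for `DefaultLookupDict`; this class is a sketch mimicking the 0.8.x behavior, not the actual gluon-nlp implementation:

```python
import timeit

class DefaultLookupDict(dict):
    """Simplified stand-in: returns a default value for missing keys
    via a Python-level __getitem__, mimicking 0.8.x behavior."""
    def __init__(self, default, d=None):
        super().__init__(d or {})
        self._default = default

    def __getitem__(self, key):
        # dict.get does not recurse into __getitem__, so this is safe.
        return self.get(key, self._default)

d = DefaultLookupDict(-1, {"a": 0, "b": 1, "c": 2})
plain = {"a": 0, "b": 1, "c": 2}

t_custom = timeit.timeit(lambda: d["a"], number=1_000_000)
t_plain = timeit.timeit(lambda: plain.get("a", -1), number=1_000_000)
print(f"custom __getitem__: {t_custom:.3f}s, plain dict.get: {t_plain:.3f}s")
```

Most of the slowdown comes from dispatching each lookup through a Python-level `__getitem__` call instead of the C-implemented `dict.get`.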
Description
The __getitem__ method of DefaultLookupDict is too slow. Profiling on a p3.16xlarge machine on AWS shows that each __getitem__ call costs about 2.4 microseconds, whereas a regular dictionary .get(key, default_value) call on the same machine takes around 120 nanoseconds, so the current implementation is roughly 20 times slower than the regular dict get operation. This operation alone is taking about 50% of all processing time in our data pipeline. Is there a way to speed this up?
Error Message
N/A
To Reproduce
Use Vocab to numericalize strings as usual.
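As suggested in the discussion above, one workaround when the unknown token is known to be set is to numericalize with a plain dict and an explicit default, avoiding a Python-level __getitem__ call per token. The names below (`token_to_idx`, `unk_idx`) are illustrative, not the actual gluon-nlp internals:

```python
# Sketch: map tokens with a built-in dict and an explicit default index.
token_to_idx = {"a": 0, "b": 1, "c": 2}
unk_idx = -1  # illustrative index for unknown tokens

tokens = ["a", "c", "x", "b"]
ids = [token_to_idx.get(tok, unk_idx) for tok in tokens]
print(ids)  # [0, 2, -1, 1]
```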