Speed of division with remainder for cpp_int #537

afabri · 2023-02-24T13:18:54Z

In a geometric code I observe a slowdown of a factor 4, when I switch from the gmp to the cpp_int backend. In vtune I see that more than 35% of the time are spent here. The comment at the signature leads me to the question if I am in the scenario where this function is not optimized. Is there a trivial fix? The comment seems to indicate that it is optimized for small numbers of limbs.

afabri · 2023-02-24T14:36:12Z

In my case the numerator has 78000 digits and the denominator 30000.

ckormanyos · 2023-02-24T18:10:34Z

It might be possible to low-word the shift on the carry. But I am really not sure if that exact sequence is using full multiples of the number of bits in a limb for the shift.

As a general comment, it's really hard to beat or match the raw speed of GMP when using portable code. They squeeze the assembler for their well-deserved, legendary reputation for bare-metal speed.

jzmaddock · 2023-02-24T18:12:46Z

I'm not surprised by this. Division is hard, so many corner cases to handle correctly. There are better algorithms (in the big-O sense), but it's a significant piece of work to implement them correctly.

ckormanyos · 2023-02-24T18:47:27Z

Division is hard, so many corner cases to handle correctly

I actually have a Knuth long division in an unrelated project. I've run hundreds of millions of tests on it and squeezed all the bugs out over the course of, my gosh, years. I use an opposite limb-ordering.

I feel that Knuth long division is the winner for limb-counts below 2,000 or so. But at that range of your test case, you're in the quadratic domain of classic algorithms. You'd probably be better off inverting-and-multiplying with conversion to bin-float, then Newton-Raphson-ing it. But to assemble the result, you'd be desperately driven to make sure you rorund correctly. Oh my gosh, I get exahusted just thingking about verifying and testing that code for hard-world-open use.

I wouldn't take on either of these tasks at the moment. But maybe note this as a potential place for optimization in the future...
As these are really daunting tasks.

ckormanyos · 2023-02-24T18:53:38Z

actually have a Knuth long division

Since I've never spoken to, really anyone about this one and it's a gem, ...

Here is that Knuth long division - in it's full-on, loving, 300 lines of glory (albeit with the wrong limb-ordering) for future reference if anyone ever cares for it.

afabri mentioned this issue Feb 24, 2023

CORE: Use boost::multiprecision::number API CGAL/cgal#6035

Merged

2 tasks

afabri changed the title ~~Speed of division with reminder for cpp_int~~ Speed of division with remainder for cpp_int Feb 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed of division with remainder for cpp_int #537

Speed of division with remainder for cpp_int #537

afabri commented Feb 24, 2023

afabri commented Feb 24, 2023

ckormanyos commented Feb 24, 2023 •

edited

Loading

jzmaddock commented Feb 24, 2023

ckormanyos commented Feb 24, 2023 •

edited

Loading

ckormanyos commented Feb 24, 2023 •

edited

Loading

Speed of division with remainder for cpp_int #537

Speed of division with remainder for cpp_int #537

Comments

afabri commented Feb 24, 2023

afabri commented Feb 24, 2023

ckormanyos commented Feb 24, 2023 • edited Loading

jzmaddock commented Feb 24, 2023

ckormanyos commented Feb 24, 2023 • edited Loading

ckormanyos commented Feb 24, 2023 • edited Loading

ckormanyos commented Feb 24, 2023 •

edited

Loading

ckormanyos commented Feb 24, 2023 •

edited

Loading

ckormanyos commented Feb 24, 2023 •

edited

Loading