Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle reading of floating point numbers better #68

Open
scymtym opened this issue May 15, 2020 · 13 comments
Open

Handle reading of floating point numbers better #68

scymtym opened this issue May 15, 2020 · 13 comments
Labels
enhancement New feature or request

Comments

@scymtym
Copy link
Member

scymtym commented May 15, 2020

There are at least two things to improve here:

  • Allow the client to make the floating point number itself by calling a new generic function with the individual pieces (desired type, sign, exponent, …).

  • Provide a default implementation that uses a more efficient and/or more numerically stable algorithm.

    Potentially relevant: https://dl.acm.org/doi/pdf/10.1145/93548.93557

@scymtym scymtym added the enhancement New feature or request label May 15, 2020
@Bike
Copy link
Member

Bike commented May 15, 2020

Aubrey Jaffer wrote a brief article on this: https://arxiv.org/abs/1310.8121v1

The code in "Reading" is in Scheme, but should be easily adaptable. It's only a few more lines than what Eclector has now. The round-quotient function described is probably doable as cl:round plus some extra adaptation.

If I understand correctly, this algorithm is less efficient than Clinger's, as it uses more and bigger intermediate bignums. But it might be an improvement over the current procedure.

@scymtym
Copy link
Member Author

scymtym commented May 16, 2020

Thank you.

@Bike
Copy link
Member

Bike commented Mar 5, 2021

upon further examination, no adaptation of round is actually needed. So you can pretty much use the code directly, substituting round for round-quotient, and float for exact->inexact and such. Plus you can use ash directly. Overall, it's

(defconstant +double-mantissa-digits+ 53)

(defun mantexp->dbl (mantissa exponent &key (base 10))
  (declare (type integer mantissa exponent))
  (if (minusp exponent)
      (let* ((scale (expt base (- exponent)))
             (bex (- (integer-length mantissa)
                     (integer-length scale)
                     +double-mantissa-digits+))
             (numerator (ash mantissa (- bex)))
             (quotient (round numerator scale)))
        (when (> (integer-length quotient) +double-mantissa-digits+)
          ;; too many bits of quotient
          (setf bex (1+ bex)
                quotient (round numerator (* scale 2))))
        (scale-float (float quotient 1d0) bex))
      (float (* mantissa (expt base exponent)) 1d0)))

Based on a quick test where I generate a million random pairs of mantissa and exponent, this results in exactly the same numbers that SBCL reads, so I think it works?

@scymtym
Copy link
Member Author

scymtym commented Mar 7, 2021

Looking into the Jaffer article. Quite the version history there. Conversion from HTML to PDF, then rewrite in Java.

@scymtym
Copy link
Member Author

scymtym commented Mar 7, 2021

With extensions for

  • both float types
  • negative zeros
  • "exploded" mantissa representation provided by the token lexer
    That is, -12.34 is represented as sign = -1 mantissa = 1234 decimal-exponent = 2

I ended up with the following:

;;; Based on "v1" version of Aubrey Jaffer (2018): Easy Accurate
;;; Reading and Writing of Floating-Point Numbers. Initial conversion
;;; to Common Lisp by Alex Wood.
(defun mantissa-and-exponent-to-float
    (sign mantissa decimal-exponent exponent-sign exponent float-type)
  (declare (type (member -1 1) sign)
           (type integer mantissa)
           (type (integer 0) decimal-exponent)
           (type (or null integer) exponent)
           (type (member -1 1) exponent-sign)
           (type (member single-float double-float float-type)))
  (macrolet
      ((convert (type)
         (let* ((prototype (ecase type
                             (single-float 0f0)
                             (double-float 0d0)))
                (precision (float-precision (ecase type
                                              (single-float 1f0)
                                              (double-float 1d0)))))
           `(labels ((negative-exponent (mantissa exponent)
                       (let* ((scale (expt 10 exponent))
                              (bex (- (integer-length mantissa)
                                      (integer-length scale)
                                      ,precision))
                              (numerator (ash mantissa (- bex)))
                              (quotient (round numerator scale)))
                         (when (> (integer-length quotient) ,precision)
                           (setf bex (1+ bex)
                                 quotient (round numerator (* scale 2))))
                         (scale-float (float quotient ,prototype) bex)))
                     (positive-exponent (mantissa exponent)
                       (float (* mantissa (expt 10 exponent)) ,prototype))
                     (uncertain-exponent (mantissa exponent)
                       (if (minusp exponent)
                           (negative-exponent mantissa (- exponent))
                           (positive-exponent mantissa exponent))))
              (cond ((eql mantissa 0) ; (negative) zero
                     (if (eql sign -1)
                         ,(- prototype)
                         ,prototype))
                    ((and (null exponent) (zerop decimal-exponent))
                     (break "cannot happen"))
                    ((null exponent)
                     (negative-exponent (* sign mantissa) decimal-exponent))
                    ((zerop decimal-exponent)
                     (if (eql exponent-sign 1)
                         (positive-exponent (* sign mantissa) exponent)
                         (negative-exponent (* sign mantissa) exponent)))
                    (t ; both, "decimal exponent" and "scientific exponent"
                     (uncertain-exponent
                      (* sign mantissa) (- (* exponent-sign exponent)
                                           decimal-exponent))))))))
    (ecase float-type
      (single-float (convert single-float))
      (double-float (convert double-float)))))

@scymtym
Copy link
Member Author

scymtym commented Jun 16, 2024

I think this is where left off:

;;; Taken from SBCL code
;;; Truncate EXPONENT if it's too large for a float.
(defun truncate-exponent (exponent mantissa decimal-exponent)
  ;; Work with base-2 logarithms to avoid conversions to floats, and
  ;; convert to base-10 conservatively at the end.  Use the least
  ;; positive float, because denormalized exponent can be larger than
  ;; normalized.
  (let* ((max-exponent          (+ sb-vm:double-float-digits
                                   sb-vm:double-float-bias))
         (mantissa-bits         (integer-length mantissa))
         (decimal-exponent-bits (1- (integer-length decimal-exponent)))
         (magnitude             (- mantissa-bits decimal-exponent-bits)))
    (if (minusp exponent)
        (max exponent (ceiling (- (+ max-exponent magnitude))
                               #.(cl:floor (cl:log 10 2))))
        (min exponent (floor (- max-exponent magnitude)
                             #.(cl:floor (cl:log 10 2)))))))

;;; Based on "v1" version of Aubrey Jaffer (2018): Easy Accurate
;;; Reading and Writing of Floating-Point Numbers. Initial conversion
;;; to Common Lisp by Alex Wood.
(defun mantissa-and-exponent-to-float
    (sign mantissa decimal-exponent exponent-sign exponent float-type)
  (declare (type (member -1 1) sign)
           (type integer mantissa)
           (type (integer 0) decimal-exponent)
           (type (or null integer) exponent)
           (type (member -1 1) exponent-sign)
           (type (member single-float double-float) float-type)
           (optimize speed))
  (macrolet
      ((convert (type)
         (let* ((prototype (ecase type
                             (single-float 0f0)
                             (double-float 0d0)))
                (precision (float-precision (ecase type
                                              (single-float 1f0)
                                              (double-float 1d0)))))
           `(labels ((negative-exponent (mantissa exponent)
                       (let* ((exponent (truncate-exponent
                                         exponent mantissa decimal-exponent))
                              (scale (expt 10 exponent))
                              (bex (- (integer-length mantissa)
                                      (integer-length scale)
                                      ,precision))
                              (numerator (ash mantissa (- bex)))
                              (quotient (round numerator scale)))
                         (when (> (integer-length quotient) ,precision)
                           (setf bex (1+ bex)
                                 quotient (round numerator (* scale 2))))
                         (scale-float (float quotient ,prototype) bex)))
                     (positive-exponent (mantissa exponent)
                       (let ((exponent (truncate-exponent
                                        exponent mantissa decimal-exponent)))
                         (float (* mantissa (expt 10 exponent)) ,prototype)))
                     (uncertain-exponent (mantissa exponent)
                       (if (minusp exponent)
                           (negative-exponent mantissa (- exponent))
                           (positive-exponent mantissa exponent))))
              (cond ((eql mantissa 0)   ; (negative) zero
                     (if (eql sign -1)
                         ,(- prototype)
                         ,prototype))
                    ((null exponent)
                     (negative-exponent (* sign mantissa) decimal-exponent))
                    ((zerop decimal-exponent)
                     (if (eql exponent-sign 1)
                         (positive-exponent (* sign mantissa) exponent)
                         (negative-exponent (* sign mantissa) exponent)))
                    (t ; both, "decimal exponent" and "scientific exponent"
                     (uncertain-exponent
                      (* sign mantissa) (- (* exponent-sign exponent)
                                           decimal-exponent))))))))
    (ecase float-type
      (single-float (convert single-float))
      (double-float (convert double-float)))))

@yitzchak
Copy link
Member

yitzchak commented Jun 24, 2024

Is the decimal-exponent referenced from the start of the mantissa or the end? Its ambiguous in your example for 12.34. For 12.345 is it 2 or 3?

From a conversion perspective 3 would probably be better because that just represents 10^(-3), whereas if you reference it from the start as 2, then the converter will probably need to know how many digits were in the string representation.

@yitzchak
Copy link
Member

Provided that the decimal-exponent is referenced from the end of the mantissa, then mantissa-and-exponent-to-float could be implemented via Quaviver:

(defun mantissa-and-exponent-to-float
    (sign mantissa decimal-exponent exponent-sign exponent float-type)
  (quaviver:integer-float (make-instance 'quaviver/liebler:client)
                          float-type
                          10
                          mantissa
                          (- (* exponent-sign (or exponent 0))
                             decimal-exponent)
                          sign))

Right now we have only one base 10 to base 2 client, quaviver/liebler. There is no state in the client instance so you can just reuse the instance across multiple mantissa-and-exponent-to-float calls.

@scymtym
Copy link
Member Author

scymtym commented Jun 25, 2024

It is from the end like you suspected so everything should work out.

Reusing a single client instance would be a prerequisite since otherwise the consing and object initialization would possibly dominate.

I will probably compare the quaviver version against the Jaffer/SBCL code above at some point. Compared to the completely naive implementation that Eclector currently uses, the Jaffer-based code achieved around 20 - 30 % improvements in both run time and consing.

@yitzchak
Copy link
Member

You could also use the Eclector client and add the Quaviver as a mixin.

@scymtym
Copy link
Member Author

scymtym commented Jun 25, 2024

You could also use the Eclector client and add the Quaviver as a mixin.

That's a good idea in principle but Eclector does not require client classes to have anything at all in their superclass list. So existing clients would have to be adjusted to add the appropriate mixin (or newly exposed superclass).

One idea could be to have something like (floating-point-behavior eclector-client) in order to retrieve an appropriate quaviver client object. This style was used in the Lisp Interface Library: interfaces could contain "sub-interfaces" for parametrizing individual aspects. I will have to think about possible solutions some more.

@yitzchak
Copy link
Member

Makes sense.

@yitzchak
Copy link
Member

I've added an implementation of Jaffer v7 to Quaviver. It hasn't been optimized or tested yet. It is doing bignum expt-5 right now which could be replaced with some table lookups for an easy speedup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants