[numbers] Fraction() and Knuth 4.5.1 -- overflow, BigInteger, long, and rounding

Discussion:

Eric Barnhill

2018-11-07 17:52:30 UTC

I read Kunth's "Art of Computer Programming 4.5.1" that is referenced many
times in the doc as the guidance for the commons-math/commons-numbers
Fraction class. It is an interesting read. Also, for all the times it is
cited in the doc, it is interesting that Fraction doesn't really use it as
implemented. Here is one example.

Knuth is concerned about overflow in multiplication and division, because
numerator of f1 is multiplied by denominator of f2 and so forth, so he
suggests a technique called "mediant rounding" that allows for intermediate
quantities in fraction multiplication to be rounded.

It is a clever technique and probably works well, however the current
Fraction class cites this chapter, then implements multiplication with
BigInteger instead, ignoring this suggestion.

First of all, the doc should be clear that the code is NOT following 4.5.1,
while it gives the opposite impression. And that's ok but the use of
BigInteger creates additional inconsistency: Multiply and divide are
accomplished using ArithmeticUtils.addAndCheck and
ArithmeticUtils.mulAndCheck . These convert the relevant ints to longs,
then perform the operation, then if the resulting long is greater than the
range of an int, throw an OverflowException. So some parts of Fraction
check for overflow using longs and others use BigInteger.

It seems to me that BigInteger is overkill here for the vast majority of
practical uses of Fraction in a way that could be damaging for performance.
And furthermore, we already have a BigFraction class to handle cases that
require BigInteger.

So, I propose rewriting the doc to say the opposite of what it currently
says when appropriate, and get usages of BigInteger out of Fraction, use
them only in BigFraction, and use the long-based ArithmeticUtils methods to
check for overflow and underflow in fraction addition and subtraction.

Eric

Gilles

2018-11-07 23:46:07 UTC

Permalink

Hi.

Post by Eric Barnhill
I read Kunth's "Art of Computer Programming 4.5.1" that is referenced many
times in the doc as the guidance for the commons-math/commons-numbers
Fraction class. It is an interesting read. Also, for all the times it is
cited in the doc, it is interesting that Fraction doesn't really use it as
implemented. Here is one example.
Knuth is concerned about overflow in multiplication and division, because
numerator of f1 is multiplied by denominator of f2 and so forth, so he
suggests a technique called "mediant rounding" that allows for
intermediate
quantities in fraction multiplication to be rounded.
It is a clever technique and probably works well, however the current
Fraction class cites this chapter, then implements multiplication with
BigInteger instead, ignoring this suggestion.
First of all, the doc should be clear that the code is NOT following 4.5.1,
while it gives the opposite impression. And that's ok but the use of
BigInteger creates additional inconsistency: Multiply and divide are
accomplished using ArithmeticUtils.addAndCheck and
ArithmeticUtils.mulAndCheck . These convert the relevant ints to longs,
then perform the operation, then if the resulting long is greater than the
range of an int, throw an OverflowException. So some parts of
Fraction
check for overflow using longs and others use BigInteger.
It seems to me that BigInteger is overkill here for the vast majority of
practical uses of Fraction in a way that could be damaging for
performance.
And furthermore, we already have a BigFraction class to handle cases that
require BigInteger.
So, I propose rewriting the doc to say the opposite of what it
currently
says when appropriate, and get usages of BigInteger out of Fraction, use
them only in BigFraction, and use the long-based ArithmeticUtils methods to
check for overflow and underflow in fraction addition and
subtraction.
Eric

Thanks a lot for the fact-checking.

Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-***@commons.apache.org
For additional commands, e-mail: dev-***@commons.apache.org

Gary Gregory

2018-11-08 18:34:21 UTC

Permalink

I'm all for the Javadoc made to reflect the reality of the code. It is fine
to have an additional section that points out Knuth and how we may want to
change things as a hint or request to contributors.

Gary

Post by Eric Barnhill
I read Kunth's "Art of Computer Programming 4.5.1" that is referenced many
times in the doc as the guidance for the commons-math/commons-numbers
Fraction class. It is an interesting read. Also, for all the times it is
cited in the doc, it is interesting that Fraction doesn't really use it as
implemented. Here is one example.
Knuth is concerned about overflow in multiplication and division, because
numerator of f1 is multiplied by denominator of f2 and so forth, so he
suggests a technique called "mediant rounding" that allows for intermediate
quantities in fraction multiplication to be rounded.
It is a clever technique and probably works well, however the current
Fraction class cites this chapter, then implements multiplication with
BigInteger instead, ignoring this suggestion.
First of all, the doc should be clear that the code is NOT following 4.5.1,
while it gives the opposite impression. And that's ok but the use of
BigInteger creates additional inconsistency: Multiply and divide are
accomplished using ArithmeticUtils.addAndCheck and
ArithmeticUtils.mulAndCheck . These convert the relevant ints to longs,
then perform the operation, then if the resulting long is greater than the
range of an int, throw an OverflowException. So some parts of Fraction
check for overflow using longs and others use BigInteger.
It seems to me that BigInteger is overkill here for the vast majority of
practical uses of Fraction in a way that could be damaging for performance.
And furthermore, we already have a BigFraction class to handle cases that
require BigInteger.
So, I propose rewriting the doc to say the opposite of what it currently
says when appropriate, and get usages of BigInteger out of Fraction, use
them only in BigFraction, and use the long-based ArithmeticUtils methods to
check for overflow and underflow in fraction addition and subtraction.
Eric

Eric Barnhill

2018-11-10 00:33:32 UTC

Permalink

Addendum to the above. In an exercise in the Knuth book Knuth does indeed
state that "If the inputs are n-bit binary numbers, 2N+1 bits may be
necessary to represent t." where t is a derived quantity that would take
some time to explain.

So that means in extreme cases, the needed precision to represent a
fraction operation with 32 bits ints is 65 bits, one more than a long has.

The present code solves this by using BigInteger briefly in the code, which
strikes me as an awfully big performance hit for what must surely be very
occasional and very extreme cases.

I think the most sensible strategy would be to restrict the precision of
Fraction to longs, with user guidance to use BigFraction if there is
concern of overflow.

Eric

Post by Gary Gregory
I'm all for the Javadoc made to reflect the reality of the code. It is fine
to have an additional section that points out Knuth and how we may want to
change things as a hint or request to contributors.
Gary

Post by Eric Barnhill
I read Kunth's "Art of Computer Programming 4.5.1" that is referenced

many

Post by Eric Barnhill
times in the doc as the guidance for the commons-math/commons-numbers
Fraction class. It is an interesting read. Also, for all the times it is
cited in the doc, it is interesting that Fraction doesn't really use it

Post by Eric Barnhill
implemented. Here is one example.
Knuth is concerned about overflow in multiplication and division, because
numerator of f1 is multiplied by denominator of f2 and so forth, so he
suggests a technique called "mediant rounding" that allows for

intermediate

Post by Eric Barnhill
quantities in fraction multiplication to be rounded.
It is a clever technique and probably works well, however the current
Fraction class cites this chapter, then implements multiplication with
BigInteger instead, ignoring this suggestion.
First of all, the doc should be clear that the code is NOT following

4.5.1,

Post by Eric Barnhill
while it gives the opposite impression. And that's ok but the use of
BigInteger creates additional inconsistency: Multiply and divide are
accomplished using ArithmeticUtils.addAndCheck and
ArithmeticUtils.mulAndCheck . These convert the relevant ints to longs,
then perform the operation, then if the resulting long is greater than

the

Post by Eric Barnhill
range of an int, throw an OverflowException. So some parts of Fraction
check for overflow using longs and others use BigInteger.
It seems to me that BigInteger is overkill here for the vast majority of
practical uses of Fraction in a way that could be damaging for

performance.

Post by Eric Barnhill
And furthermore, we already have a BigFraction class to handle cases that
require BigInteger.
So, I propose rewriting the doc to say the opposite of what it currently
says when appropriate, and get usages of BigInteger out of Fraction, use
them only in BigFraction, and use the long-based ArithmeticUtils methods

Post by Eric Barnhill
check for overflow and underflow in fraction addition and subtraction.
Eric

Eric Barnhill

2018-11-30 23:56:54 UTC

Permalink

Here is what I propose for the Fraction doc text regarding this issue:

* Implement add and subtract. This algorithm is similar to that
* described in Knuth 4.5.1. while making some concessions to
* performance. Note Knuth 4.5.1 Exercise 7, which observes that
* adding two fractions with 32-bit numerators and denominators
* requires 65 bits in extreme cases. Here calculations are performed
* with 64-bit longs and the BigFraction class is recommended for
numbers
* that may grow large enough to be in danger of overflow.

Post by Eric Barnhill
Addendum to the above. In an exercise in the Knuth book Knuth does indeed
state that "If the inputs are n-bit binary numbers, 2N+1 bits may be
necessary to represent t." where t is a derived quantity that would take
some time to explain.
So that means in extreme cases, the needed precision to represent a
fraction operation with 32 bits ints is 65 bits, one more than a long has.
The present code solves this by using BigInteger briefly in the code,
which strikes me as an awfully big performance hit for what must surely be
very occasional and very extreme cases.
I think the most sensible strategy would be to restrict the precision of
Fraction to longs, with user guidance to use BigFraction if there is
concern of overflow.
Eric

Post by Eric Barnhill
I read Kunth's "Art of Computer Programming 4.5.1" that is referenced

many

Post by Eric Barnhill
implemented. Here is one example.
Knuth is concerned about overflow in multiplication and division,

because

Post by Eric Barnhill
numerator of f1 is multiplied by denominator of f2 and so forth, so he
suggests a technique called "mediant rounding" that allows for

intermediate

4.5.1,

the

performance.

Post by Eric Barnhill
And furthermore, we already have a BigFraction class to handle cases

that

Post by Eric Barnhill
require BigInteger.
So, I propose rewriting the doc to say the opposite of what it currently
says when appropriate, and get usages of BigInteger out of Fraction, use
them only in BigFraction, and use the long-based ArithmeticUtils

methods to

Post by Eric Barnhill
check for overflow and underflow in fraction addition and subtraction.
Eric

Gilles

2018-12-01 01:23:31 UTC

Permalink

Post by Eric Barnhill
* Implement add and subtract. This algorithm is similar to that
* described in Knuth 4.5.1. while making some concessions to
* performance. Note Knuth 4.5.1 Exercise 7, which observes that
* adding two fractions with 32-bit numerators and denominators
* requires 65 bits in extreme cases. Here calculations are performed
* with 64-bit longs and the BigFraction class is recommended for
numbers
* that may grow large enough to be in danger of overflow.

Does this mean that computations can "unpredictably" overflow
(or throw an exception)?
Is it acceptable, or should we enclose the problematic code in
a "try" block and redo the computation with "BigInteger" when
necessary?

What is the performance hit of using "BigFraction" rather than
"Fraction"?
Are there use-cases that would need the ultimate performance from
"Fraction" while not worry about overflow?

Regards,
Gilles

Post by Eric Barnhill

Post by Eric Barnhill
Addendum to the above. In an exercise in the Knuth book Knuth does indeed
state that "If the inputs are n-bit binary numbers, 2N+1 bits may be
necessary to represent t." where t is a derived quantity that would take
some time to explain.
So that means in extreme cases, the needed precision to represent a
fraction operation with 32 bits ints is 65 bits, one more than a long has.
The present code solves this by using BigInteger briefly in the code,
which strikes me as an awfully big performance hit for what must surely be
very occasional and very extreme cases.
I think the most sensible strategy would be to restrict the
precision of
Fraction to longs, with user guidance to use BigFraction if there is
concern of overflow.
Eric
On Thu, Nov 8, 2018 at 11:11 AM Gary Gregory

Post by Gary Gregory
I'm all for the Javadoc made to reflect the reality of the code. It
is
fine
to have an additional section that points out Knuth and how we may want to
change things as a hint or request to contributors.
Gary
On Wed, Nov 7, 2018 at 10:52 AM Eric Barnhill

Post by Eric Barnhill
I read Kunth's "Art of Computer Programming 4.5.1" that is

referenced
many

Post by Eric Barnhill
times in the doc as the guidance for the

commons-math/commons-numbers

Post by Eric Barnhill
Fraction class. It is an interesting read. Also, for all the

times it is

Post by Eric Barnhill
cited in the doc, it is interesting that Fraction doesn't really

use it
as

Post by Eric Barnhill
implemented. Here is one example.
Knuth is concerned about overflow in multiplication and division,

because

Post by Eric Barnhill
numerator of f1 is multiplied by denominator of f2 and so forth,

so he

Post by Eric Barnhill
suggests a technique called "mediant rounding" that allows for

intermediate

Post by Eric Barnhill
quantities in fraction multiplication to be rounded.
It is a clever technique and probably works well, however the

current

Post by Eric Barnhill
Fraction class cites this chapter, then implements multiplication

with

Post by Eric Barnhill
BigInteger instead, ignoring this suggestion.
First of all, the doc should be clear that the code is NOT

following
4.5.1,

Post by Eric Barnhill
while it gives the opposite impression. And that's ok but the use

Post by Eric Barnhill
BigInteger creates additional inconsistency: Multiply and divide

are

Post by Eric Barnhill
accomplished using ArithmeticUtils.addAndCheck and
ArithmeticUtils.mulAndCheck . These convert the relevant ints to

longs,

Post by Eric Barnhill
then perform the operation, then if the resulting long is greater

than
the

Post by Eric Barnhill
range of an int, throw an OverflowException. So some parts of

Fraction

Post by Eric Barnhill
check for overflow using longs and others use BigInteger.
It seems to me that BigInteger is overkill here for the vast

majority of

Post by Eric Barnhill
practical uses of Fraction in a way that could be damaging for

performance.

Post by Eric Barnhill
And furthermore, we already have a BigFraction class to handle

cases
that

Post by Eric Barnhill
require BigInteger.
So, I propose rewriting the doc to say the opposite of what it

currently

Post by Eric Barnhill
says when appropriate, and get usages of BigInteger out of

Fraction, use

Post by Eric Barnhill
them only in BigFraction, and use the long-based ArithmeticUtils

methods to

Post by Eric Barnhill
check for overflow and underflow in fraction addition and

subtraction.

Post by Eric Barnhill
Eric

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-***@commons.apache.org
For additional commands, e-mail: dev-***@commons.apache.org

Eric Barnhill

2018-12-03 18:01:39 UTC

Permalink

Post by Gilles
Does this mean that computations can "unpredictably" overflow
(or throw an exception)?

The ArithmeticUtils() methods mulAndCheck and addAndCheck throw exceptions
if there is overflow during primitive operations. That is the "check" part
of the method name.

Post by Gilles
Is it acceptable, or should we enclose the problematic code in
a "try" block and redo the computation with "BigInteger" when
necessary?
What is the performance hit of using "BigFraction" rather than
"Fraction"?

I once used BigDecimal for a project, it is great code but the performance
is nothing close to using primitives.

Post by Gilles
Are there use-cases that would need the ultimate performance from
"Fraction" while not worry about overflow?

You would need a greatest common factor between the two fractions that was
larger than 64 bits.

Again, BigFraction is there for anyone worried about such a case and there
is no significant performance hit to switching over to BigFraction compared
to a Fraction class that was using BigInteger under the hood. But I
suspect there would be a substantial performance gain if longs were being
used under the hood for the Fraction class for the more common use case of
smaller fractions. If it would be best practice, a bit of microbenchmarking
could be done to check.

A FractionOverflowException could be specifically tailored to this use case
and the error message can suggest using BigFraction. Or as you suggest, the
catch block could silently or with warning return a BigFraction. If we have
class inheritance straight, and both Fraction and BigFraction have the
exact same interface, this could be an elegant solution.

Eric

Gilles

2018-12-03 23:33:25 UTC

Permalink

Hi Eric.

Post by Eric Barnhill

Post by Gilles
Does this mean that computations can "unpredictably" overflow
(or throw an exception)?

The ArithmeticUtils() methods mulAndCheck and addAndCheck throw exceptions
if there is overflow during primitive operations. That is the "check" part
of the method name.

I once used BigDecimal for a project, it is great code but the
performance
is nothing close to using primitives.

Post by Gilles
Are there use-cases that would need the ultimate performance from
"Fraction" while not worry about overflow?

You would need a greatest common factor between the two fractions that was
larger than 64 bits.
Again, BigFraction is there for anyone worried about such a case and there
is no significant performance hit to switching over to BigFraction compared
to a Fraction class that was using BigInteger under the hood. But I
suspect there would be a substantial performance gain if longs were being
used under the hood for the Fraction class for the more common use case of
smaller fractions. If it would be best practice, a bit of
microbenchmarking
could be done to check.
A FractionOverflowException could be specifically tailored to this use case
and the error message can suggest using BigFraction. Or as you
suggest, the
catch block could silently or with warning return a BigFraction. If we have
class inheritance straight, and both Fraction and BigFraction have the
exact same interface, this could be an elegant solution.

Maybe I misunderstood, but I thought that only intermediate results
would require "BigInteger"; if the result needs to be converted to
"BigFraction", than I'd favour raising an exception (with the advice
which you mention).
If later on someone comes up with a use-case for the alternate
solution,
we can revisit.

Regards,
Gilles

Post by Eric Barnhill
Eric

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-***@commons.apache.org
For additional commands, e-mail: dev-***@commons.apache.org