Added better guess for IRR #61

Open
wants to merge 2 commits into
base: main

user799595

Following up on my comment here: #60 (comment)

This PR implements a better guess for IRR.

I tested it on some of my data and it's faster than using guess = 0. Did not benchmark it against the previous heuristic.

Two tests now fail: test_gh_39 and test_gh_44.

In the former, -0.0180967864739624 is found instead of 0.12. In the latter, -0.9997912604283283 is found instead of 1.00426.

@jlopezpena
Contributor

I am all for introducing better guesses (having a good guess is critical for Newton-Raphson to converge!) but it seems the method in this PR is producing the wrong answer in several cases. Do you have any examples in which the simple-interest heuristic (the one introduced in the PR you quote, which in turn was replacing the old "always use 0.1 as the guess") produces a wrong result? If not, I'd say that correctness trumps efficiency for the default choice (users can still pass their own guess if they want to!)

@user799595
Author

Thanks for the quick reply!

> it seems the method in this PR is producing the wrong answer in several cases

I'm not sure I understand. The test cases are failing, but I would argue that -1.8% is no worse an IRR than 12% (in test_gh_39). Similarly for test_gh_44.

@jlopezpena
Contributor

jlopezpena commented Mar 8, 2023

In the first test, the positive cashflows are larger than the negative ones, so a negative IRR doesn't really make sense. Similarly, in the second one, how can an IRR of essentially -1 be correct? That would mean nearly all of the investment is lost.

@user799595
Author

user799595 commented Mar 8, 2023

There was a mistake in my code, please see the latest commit. I'm thankful for test_gh_44 (now passing) and for your comments. If we decide to go forward with this new heuristic, I have an idea for another test.

Regarding test_gh_39: I think you're arguing that sum(values) > 0 therefore irr(values) > 0. But this isn't necessarily the case. In fact, irr(-values) == irr(values) but sum(-values) and sum(values) will have opposite signs.
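The sign-flip observation can be checked with a minimal sketch. The `npv` helper and the sample cashflows below are illustrative assumptions, not code or data from this PR. Negating every cashflow negates the NPV at every rate, so both series share the same roots even though their sums have opposite signs.

```python
def npv(rate, cashflows):
    """Net present value at a per-period rate (requires rate > -1)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


# Illustrative cashflows (an assumption, not taken from the test suite).
values = [-100, 39, 59, 55, 20]
negated = [-v for v in values]

# npv(r, -values) == -npv(r, values) at every rate, so the zeros coincide.
for rate in (-0.5, 0.0, 0.1, 0.28, 1.0):
    assert abs(npv(rate, negated) + npv(rate, values)) < 1e-9

# The sums, however, have opposite signs.
print(sum(values), sum(negated))
```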

@jlopezpena
Contributor

jlopezpena commented Mar 8, 2023

There are two different considerations here. One is the logical meaning of IRR as a return on an investment. The other is the definition as a solution of a certain polynomial equation. The polynomial has degree n (length of cashflows minus 1) so there will potentially be many solutions. This does NOT mean that EVERY solution should be thought of as a "valid" IRR.
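To make the "many solutions" point concrete, here is a minimal sketch using a textbook-style cashflow series with two sign changes. The series and the `npv` helper are assumptions chosen for illustration. Both 10% and 20% make the NPV vanish, so the polynomial definition alone does not say which one is "the" IRR.

```python
def npv(rate, cashflows):
    """Net present value at a per-period rate (requires rate > -1)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


# Hypothetical cashflows: in gain terms, 100 g^2 - 230 g + 132 = 0
# has the two roots g = 1.1 and g = 1.2.
cashflows = [-100, 230, -132]

for candidate in (0.10, 0.20):
    # Both candidates are roots of the NPV equation (up to rounding).
    assert abs(npv(candidate, cashflows)) < 1e-9
```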

In the case of the first example, the fact that the sum of the values is positive, together with the fact that not all cashflows have the same sign, guarantees that there exists a POSITIVE real solution. Because an investment in which the inflows are larger than the outflows is intuitively understood as being "profitable", textbook implementations of IRR will always pick the positive solution instead of the negative one. Similarly, if the sum of the values is negative, a NEGATIVE solution is guaranteed to exist (which is consistent with the logical interpretation as a financial loss). The heuristic I implemented just makes sure the initial guess is closer to a positive solution than to a negative one when the cashflows sum to a positive value, and closer to a negative one when they sum to a negative value, nothing more. It is not perfect and it tends to "overestimate" the initial guess (either positive or negative), but this is to prevent it from accidentally changing signs, which would be very unintuitive.

In your example, when you change all the signs in the cashflows, you have a positive solution but also a negative one, and using the logical meaning, the reasonable choice is the negative one (because outflows are larger than inflows, the investment should be thought of as a loss). We need to be careful here, as we are talking about picking a DEFAULT guess, and this could change the values that people have been getting in their code.

@user799595
Author

user799595 commented Mar 10, 2023

> There are two different considerations here. One is the logical meaning of IRR as a return on an investment. The other is the definition as a solution of a certain polynomial equation. The polynomial has degree n (length of cashflows minus 1) so there will potentially be many solutions. This does NOT mean that EVERY solution should be thought of as a "valid" IRR.

This is a good point, and it made me realize that I need to think about IRR more deeply. I found this interesting & relevant paper: https://www.tandfonline.com/doi/abs/10.1080/00137910490453419.

Based on this paper, I think it would be helpful if the user supplied the following additional information to irr:

  1. A reference rate of interest $r_0$ for the project [default=0] - the paper calls this MARR=minimum acceptable rate of return
  2. Whether the cashflows are for an investment project or a loan project [default=investment]. For an investment project, higher IRR is better (=higher NPV) and for a loan project, lower IRR is preferred (=higher NPV).

The input 2. can be inferred for certain cashflows. If $\partial \mathrm{npv}(r, \mathrm{cf}) / \partial \mathrm{r} < 0$ for all $r$ then higher $r$ is better and this is a "pure investment project". Similarly, when $\partial \mathrm{npv}(r, \mathrm{cf}) / \partial \mathrm{r} > 0$ everywhere, then lower $r$ is always better and this is a "pure loan project". Ambiguity and multiple roots for IRR arise when the derivative changes sign, i.e. there are local optima for NPV in r.
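A minimal sketch of this derivative test follows. The helper names, the rate grid, and the sample cashflows are all assumptions for illustration. Sampling a finite grid cannot prove a statement "for all $r$", so this is only a heuristic check.

```python
def npv_derivative(rate, cashflows):
    """d npv / d r for npv(r) = sum_t cf_t / (1 + r)^t (requires rate > -1)."""
    return sum(-t * cf / (1.0 + rate) ** (t + 1) for t, cf in enumerate(cashflows))


def classify_on_grid(cashflows, rates):
    """Classify by the sign of the NPV derivative at the sampled rates only."""
    derivs = [npv_derivative(r, cashflows) for r in rates]
    if all(d < 0 for d in derivs):
        return "investment"  # NPV decreasing in r at every sampled rate
    if all(d > 0 for d in derivs):
        return "loan"        # NPV increasing in r at every sampled rate
    return "mixed"           # derivative changes sign: multiple roots possible


# Hypothetical grid: rates from -0.9 up to about 4.05 in steps of 0.05.
grid = [-0.9 + 0.05 * k for k in range(100)]

print(classify_on_grid([-100, 39, 59, 55, 20], grid))
print(classify_on_grid([1, -2], grid))
print(classify_on_grid([-100, 230, -132], grid))
```

For a series with one outflow followed only by inflows, every term of the derivative is negative, so the grid result agrees with the analytic one. The third series has a derivative that changes sign, which is the ambiguous case described above.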

More generally, we could infer it based on some heuristics:

  • if the sign of the first cashflow is negative, it's an investment project
  • if the sign of the last cashflow is positive, it's an investment project
  • if the derivative $\partial \mathrm{npv}(r, \mathrm{cf}) / \partial \mathrm{r}$ is negative at $r_0$, it's an investment project

The existing heuristic for the IRR is very much in the spirit of investment projects. In fact, for loan projects it guesses the wrong sign. Consider the pure loan project [1, -2] with IRR of 100%. The heuristic gives -50%.
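The numbers for [1, -2] can be reproduced with a small sketch. It assumes the existing simple-interest heuristic amounts to total inflow divided by total outflow, minus one. That formula is my reading and is consistent with the -50% quoted above, but it is not copied from the library source. `simple_interest_guess` is a hypothetical name.

```python
def npv(rate, cashflows):
    """Net present value at a per-period rate (requires rate > -1)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


def simple_interest_guess(cashflows):
    """Assumed form of the heuristic: inflow / outflow - 1."""
    inflow = sum(cf for cf in cashflows if cf > 0)
    outflow = -sum(cf for cf in cashflows if cf < 0)
    return inflow / outflow - 1.0


loan = [1, -2]  # receive 1 now, repay 2 next period

guess = simple_interest_guess(loan)  # 1 / 2 - 1
print(guess)

# The actual root is r = 1 (100%): 1 - 2 / (1 + 1) == 0.
print(npv(1.0, loan))
```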

It would be good to make sure that a correct root is found, i.e. if the user sends in an investment project, we don't return a root with $\partial \mathrm{npv} / \partial \mathrm{r} > 0.$

The input 1. could be used to resolve the remaining ambiguity over which IRR to return.

> In the case of the first example, the fact that the sum of the values is positive, together with the fact that not all cashflows have the same sign, guarantees that there exists a POSITIVE real solution. Because an investment in which the inflows are larger than the outflows is intuitively understood as being "profitable", textbook implementations of IRR will always pick the positive solution instead of the negative one. Similarly, if the sum of the values is negative, a NEGATIVE solution is guaranteed to exist (which is consistent with the logical interpretation as a financial loss). The heuristic I implemented just makes sure the initial guess is closer to a positive solution than to a negative one when the cashflows sum to a positive value, and closer to a negative one when they sum to a negative value, nothing more. It is not perfect and it tends to "overestimate" the initial guess (either positive or negative), but this is to prevent it from accidentally changing signs, which would be very unintuitive.

I'm not familiar with this, do you have a reference for the existence of positive & negative solutions? I think the way it's written is not entirely accurate, e.g. [4, -6, 4, -1] has a single real IRR of -50%, but the sum of cashflows is positive and the cashflows have different signs. Am I missing something here? I think it's true if we require the first cashflow to be negative (and we can relax the assumption of different signs).
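The counterexample can be verified by hand or with a short sketch. `gain_polynomial` is a hypothetical helper name. Writing g = 1 + r and multiplying the NPV equation by g^3 gives 4g^3 - 6g^2 + 4g - 1, which factors as (2g - 1)(2g^2 - 2g + 1). The quadratic factor has a negative discriminant, so g = 0.5 (r = -50%) is the only real root.

```python
def gain_polynomial(g, cashflows):
    """Evaluate C_0 g^N + C_1 g^(N-1) + ... + C_N with Horner's rule."""
    result = 0.0
    for c in cashflows:
        result = result * g + c
    return result


cashflows = [4, -6, 4, -1]

# g = 0.5, i.e. r = -50%, is a root.
print(gain_polynomial(0.5, cashflows))

# The claimed factorisation agrees with the polynomial at several points.
for g in (0.1, 0.5, 1.0, 2.0, 3.5):
    factored = (2 * g - 1) * (2 * g * g - 2 * g + 1)
    assert abs(gain_polynomial(g, cashflows) - factored) < 1e-9

# Discriminant of 2g^2 - 2g + 1: negative, so no further real roots.
discriminant = (-2) ** 2 - 4 * 2 * 1
print(discriminant)
```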

@jlopezpena
Contributor

jlopezpena commented Mar 12, 2023

> I'm not familiar with this, do you have a reference for the existence of positive & negative solutions? I think the way it's written is not entirely accurate, e.g. [4, -6, 4, -1] has a single real IRR of -50%, but the sum of cashflows is positive and the cashflows have different signs. Am I missing something here? I think it's true if we require the first cashflow to be negative (and we can relax the assumption of different signs).

You are right, I was assuming the first cashflow is negative. The requirement that "not all cashflows have the same sign" is necessary for the existence of a real solution. If we rewrite the IRR expression using the gain g = 1 + r and multiply the equation for the net present value by g^N (as done in the PR I submitted), then the polynomial becomes

C_0 g^N + C_1 g^(N-1) + ... + C_N

For r = 0 we have g = 1, and the polynomial above evaluates to the sum of the coefficients, which we are assuming to be positive. For really large r, g also becomes really large, and the value is dominated by the leading term C_0 g^N, whose coefficient C_0 I was implicitly assuming to be negative. By continuity, this means there must exist a root with g > 1, i.e. with positive r. Further work can be done to estimate the range where that root lives, and a first approximation to that bound is given by the heuristic we merged in the previous PR. The Newton-Raphson method does not allow specifying bounds for the root it finds, so one way to make sure a positive root is found is to overshoot the actual root, so that the positive root is closer to the initial guess than any potential negative root.
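The continuity argument can be turned into a small sketch. Note that this uses bisection on the sign change rather than Newton-Raphson, purely to illustrate that a root with g > 1 is bracketed. It is not the method used in the library, and the function names and sample cashflows are assumptions.

```python
def gain_polynomial(g, cashflows):
    """Evaluate C_0 g^N + C_1 g^(N-1) + ... + C_N with Horner's rule."""
    result = 0.0
    for c in cashflows:
        result = result * g + c
    return result


def positive_irr_by_bisection(cashflows, tol=1e-12, max_iter=200):
    """Find a root with g > 1, assuming C_0 < 0 and sum(cashflows) > 0."""
    if not (cashflows[0] < 0 and sum(cashflows) > 0):
        raise ValueError("sketch assumes C_0 < 0 and a positive cashflow sum")
    lo, hi = 1.0, 2.0  # p(1) = sum(cashflows) > 0 by assumption
    # Expand the upper end until the negative leading term dominates.
    while gain_polynomial(hi, cashflows) > 0:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if gain_polynomial(mid, cashflows) > 0:
            lo = mid  # still on the positive side of the root
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi) - 1.0  # convert gain back to a rate


cashflows = [-100, 39, 59, 55, 20]  # illustrative series
rate = positive_irr_by_bisection(cashflows)
print(rate)
```

Because the bracket starts at g = 1, the returned rate is positive by construction, which is the property the overshooting guess is trying to obtain for Newton-Raphson.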

@jlopezpena
Contributor

There are many ways the computation of the IRR can be sped up, such as choosing a different root-finding algorithm like Brent's method (brentq in scipy). But the essence of numpy-financial is being simple and mostly self-contained (apart from the dependency on numpy), which prevents it from using the fancy optimisation methods in scipy. If you need the best possible performance, I'd recommend using pyxirr instead of numpy-financial.

@Kai-Striega
Member

@user799595 thanks for taking the time to contribute. I'm all for supporting better guesses. My apologies for taking so long to respond. I've been really busy with work + life outside FOSS. There has been quite a bit of discussion here, I don't have the time to give it the proper depth it deserves, so I'll make some time this week.

> There are two different considerations here. One is the logical meaning of IRR as a return on an investment. The other is the definition as a solution of a certain polynomial equation. The polynomial has degree n (length of cashflows minus 1) so there will potentially be many solutions. This does NOT mean that EVERY solution should be thought of as a "valid" IRR.

I agree with this. It has given numpy-financial a lot of trouble to maintain and was the reason we originally rewrote it using an iterative solver. To me, the most important factor is choosing the financially correct solution.

3 participants