
[Feature Request] Improve Calculation of Retrievability #703

Open
brishtibheja opened this issue Oct 26, 2024 · 17 comments
Labels
enhancement New feature or request

Comments

@brishtibheja

Using difficulty_asc as the sort order, I noticed that more of my lapses occur in the later portions of the review session despite retrievability being the same. In Anki's stats screen, searches like prop:d>0.9 and prop:d<0.9 prop:d>0.8 and looking at the graphs make it more evident.

I find that FSRS consistently underestimates R for easier cards, such that average retention is higher than the set DR for cards low in difficulty. This is in contrast with cards in prop:d>0.9, where retention is almost 10 percentage points below DR. It's erring in both directions.

Previously, @richard-lacasse has also reported a similar experience.

One initial idea to solve this might be to incorporate D into the formula for the forgetting curve. Has this been tried before in some way?

@Expertium
Collaborator

One initial idea to solve this might be to incorporate D into the formula for the forgetting curve. Has this been tried before in some way?

Nope. It would complicate things a lot. It's not compatible with how the initial S is estimated, it would screw up the "A 100 day interval will become x days" thingy, and it would make stability less interpretable.

@user1823
Collaborator

R should only depend on S and time elapsed. D affects R indirectly by affecting S.

So, the actual issue here is that S is not calculated accurately for high and low D values.
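
For reference, here is a minimal sketch of the power forgetting curve FSRS currently uses (constants as in FSRS-4.5; other versions use slightly different ones), which makes this point concrete: D never appears in R(t, S) directly.

```python
DECAY = -0.5
FACTOR = 19 / 81  # chosen so that R(t = S) = 0.9

def retrievability(t: float, s: float) -> float:
    """Probability of recall t days after the last review, given stability s.

    Note that D does not appear here: it influences R only through the
    stability updates that produce s.
    """
    return (1 + FACTOR * t / s) ** DECAY

# By construction, R equals 0.9 when the elapsed time equals the stability:
assert abs(retrievability(7, 7) - 0.9) < 1e-9
```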

@richard-lacasse

I'm not sure if this is actually a problem, or just the algorithm tempering the speed at which it adapts to cards that are subjectively very easy or very hard. You don't want to overfit the model. It might end up being better that it doesn't "jump to conclusions" as it adjusts the difficulty. The variance of subjective difficulty is so broad that there are always going to be cards that are scheduled incorrectly, but those are the cards FSRS will learn the most information from each time.

That being said, if it is a problem I agree with @user1823.

So, the actual issue here is that S is not calculated accurately for high and low D values.

I'm guessing you'd want to change how D is handled in the Stability equation.

@Expertium
Collaborator

Expertium commented Oct 27, 2024

I'm guessing you'd want to change how D is handled in the Stability equation.

Tried that too. I couldn't find anything that improved the results.

@brishtibheja
Author

I was just thinking that because I often have backlogs, maybe that affected the stats for high-difficulty cards. It's possible. But still, that wouldn't explain why cards in the prop:d<0.9 prop:d>0.8 range look like this:

[Screenshot: Anki true retention stats for cards in this difficulty range]

I am going through a backlog of 2k cards this month, which I got from rescheduling. My DR is set to 0.85.


I think we should make RMSE (bins) create the bins according to difficulty too. I took a quick look at how it's done and didn't find anything like this.

@Expertium
Collaborator

I think we should make RMSE (bins) create the bins according to difficulty too. I took a quick look at how it's done and didn't find anything like this.

We can't make the binning method depend on D, S or R (well, IIRC we can, it's just painfully slow). The binning depends on:

  1. Total number of reviews
  2. Interval length
  3. Number of lapses (Agains)

I'd say that the number of lapses is a good proxy for D.
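
As a rough illustration of how a key over those three dimensions could look (hypothetical code, not the optimizer's actual implementation; the log-scale rounding is an assumption):

```python
import math

def bin_key(n_reviews: int, interval_days: float, n_lapses: int) -> tuple:
    """Hypothetical key for RMSE (bins): group reviews by review count,
    interval length, and lapse count, each rounded on a log scale so the
    bins stay coarse for large values. This illustrates the idea only."""
    return (
        round(math.log1p(n_reviews)),
        round(math.log1p(interval_days)),
        round(math.log1p(n_lapses)),
    )

# Reviews sharing a key land in one bin; RMSE is then computed between the
# average predicted R and the observed pass rate within each bin.
```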

@brishtibheja
Author

I'd say that the number of lapses is a good proxy for D.

Yes, you might be right. So we should see the metrics improve if R prediction is improved for low/high-D cards.

Actually, why is it not something like the pass/fail ratio, though? That sounds better to me.

@brishtibheja
Author

R should only depend on S and time elapsed. D affects R indirectly by affecting S.

@user1823 That cannot achieve all the effects we would possibly want.

Consider cards with a stability of 7 days. Assuming Anki has correctly assigned the stability, you remember around 90% of the material a week later. But a month and a half later, do we expect to remember the highly difficult cards at all? In some disciplines, it's possible you forget almost all the difficult cards a month later but you still retain some of the relatively easier cards. (I think this happens naturally, but also for reasons like getting more real-life encounters with easier material, or inherently reviewing easier material while doing other, harder Anki cards.)

I think the way you'd take that into account is by varying the formula for R based on the value D has taken. As D rises, say, the curve for R gets steeper and steeper.


Re: making S meaningful

That can be done if all the forgetting curves for the same S intersect at some point. Then you can still say that S equals the time it takes for R to reach 0.90, etc.
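
Something along these lines, as a sketch (purely illustrative: the mapping from D to the decay exponent below is made up, not from FSRS). Every curve in the family passes through (t = S, R = 0.9) and gets steeper as D grows:

```python
def retrievability(t: float, s: float, d: float) -> float:
    decay = -(0.2 + 0.08 * d)        # made up: steeper decay as D (1..10) grows
    factor = 0.9 ** (1 / decay) - 1  # forces R(t = S) = 0.9 for every D
    return (1 + factor * t / s) ** decay

for d in (1, 5, 10):
    # All curves agree at t = S = 7 (R = 0.9) but diverge at t = 45.
    print(d, round(retrievability(7, 7, d), 3), round(retrievability(45, 7, d), 3))
```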

@user1823
Collaborator

In some disciplines, it's possible you forget almost all the difficult cards a month later but you still retain some of the relatively easier cards.

This is just another way of saying that the stability of some cards is lower than that of others.

@brishtibheja
Author

I don't get it. In my example, the stability was the same because R really was 90% after a week for both the easy and the harder cards.

@user1823
Collaborator

user1823 commented Oct 30, 2024

How is it possible that R decreases at the same rate for both cards in the first week but later decreases faster for one of the cards?

If one card is harder, its R should decrease faster in the first week too, which means that its R after one week can't be equal to that of the other card.

@Expertium
Collaborator

It's kind of possible.
https://www.desmos.com/calculator/9pbylwb5yu
[Desmos plot comparing exponential and power forgetting curves]
See how R falls much faster if the function is exponential? You may be surprised what happens if we zoom in to the beginning, when S is less than or equal to 1.
[Zoomed-in plot near t = 0]
All three curves are practically indistinguishable.
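
The same comparison can be run numerically (a quick sketch under the plot's assumption that every curve passes through R(t = S) = 0.9, with S = 1):

```python
S = 1.0

def r_exp(t: float) -> float:
    return 0.9 ** (t / S)

def r_pow(t: float) -> float:
    return (1 + (19 / 81) * t / S) ** -0.5  # FSRS-4.5-style constants

for t in (0.1, 0.5, 1.0, 5.0, 30.0):
    print(f"t={t:>4}: exp={r_exp(t):.3f}  pow={r_pow(t):.3f}")
# Near t = 0 the two are almost identical; at t = 30 the exponential has
# dropped to ~0.04 while the power curve is still ~0.35.
```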

So yeah, theoretically we could change the shape of the curve based on D, but as I said earlier

It's not compatible with how the initial S is estimated, it would screw up the "A 100 day interval will become x days" thingy, and it would make stability less interpretable.

@brishtibheja
Author

How is it possible that R decreases at the same rate for both cards in the first week but later decreases faster for one of the cards?

I'm sure we'll need evidence for that, but for now only experience guides me. E.g. I remember a lot of the easy stuff I learned in Spanish but have forgotten almost all the hard stuff, though I learned them around the same time.

If one card is harder, the R should decrease faster in the first week too

You failed the easy card and its S became 1 week. I don't see why the stability can't be the same. You'd just need to have started learning the cards at different times.

@brishtibheja
Author

it would make stability less interpretable.

I think I answered this, but I don't have a solution for the other two. Maybe in the pop-up case, we can show an estimate like "intervals will increase by 13%", which would be the average.

@Expertium
Collaborator

E.g. I remember a lot of the easy stuff I learned in Spanish but have forgotten almost all the hard stuff, though I learned them around the same time.

That just means that stability was different.

@user1823
Collaborator

You failed the easy card and its S became 1 week.

I am not talking about the stability given by FSRS. I am saying that the actual S of those cards is different, and this is the problem: FSRS is not able to calculate S with 100% accuracy.

theoretically we could change the shape of the curve based on D

Well, long ago (when we were developing FSRS 4), I said that we introduced the power forgetting curve in FSRS only because we are unable to accurately calculate S. But forgetting is exponential in nature. So, if we somehow become able to calculate S very accurately, we will start using the exponential forgetting curve again.
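
One way to see why imprecise S pushes the fit toward a power curve: averaging exponential curves over a spread of true stabilities yields a heavier-tailed aggregate. A quick sketch, with an arbitrary spread of true stabilities around an estimate of 7 days:

```python
# If each card truly forgets exponentially but the true S varies around the
# estimate, the *average* observed curve is heavier-tailed than any single
# exponential -- which is exactly what a power curve fits better.

true_stabilities = [2, 4, 7, 14, 30]  # arbitrary spread around S ~ 7

def mixture(t: float) -> float:
    return sum(0.9 ** (t / s) for s in true_stabilities) / len(true_stabilities)

def single_exp(t: float, s: float = 7.0) -> float:
    return 0.9 ** (t / s)

for t in (7, 30, 90):
    print(f"t={t}: mixture={mixture(t):.3f}  single exponential={single_exp(t):.3f}")
# At t = 90 the mixture retains ~0.32 versus ~0.26 for the single
# exponential: the aggregate tail is heavier, mimicking a power law.
```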

@L-M-Sherlock
Member

If we change the shape of the curve based on D, given the same S, we'll obtain a family of forgetting curves intersecting at the point (S, 90%). However, they are non-overlapping.

Khajah, M. M., Lindsey, R. V., & Mozer, M. C. (2014). Maximizing Students’ Retention via Spaced Review: Practical Guidance From Computational Models of Memory. Topics in Cognitive Science, 6(1), 157–169. https://doi.org/10.1111/tops.12077
