binseg returns incorrect segment means #47

Open
tdhock opened this issue Oct 9, 2020 · 5 comments

tdhock commented Oct 9, 2020

Hi @rkillick, I'm trying to get the segment means computed by binary segmentation, but they appear to be incorrect in the examples below.

> changepoint::cpt.mean(c(1,2,4), penalty="Manual", method="BinSeg", pen.value=0, Q=1)@param.est
$mean
[1] 1.000000 2.333333
# I expected 1.5, 4
> changepoint::cpt.mean(c(1,2,4), penalty="Manual", method="BinSeg", pen.value=0, Q=2)@param.est
$mean
[1] 1.000000 1.000000 2.333333
# I expected 1,2,4
> changepoint::cpt.mean(c(1,2,4), penalty="Manual", method="BinSeg", pen.value=0, Q=3)@param.est
$mean
[1] 1.000000 1.000000 1.000000 2.333333
# I expected an error because there cannot be Q=3 changepoints in 3 data points.
> 
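
For reference, the expected values in the comments above can be reproduced with a small base R sketch that computes segment means from hand-picked changepoint positions. `segment_means` is a hypothetical helper, not part of the changepoint package, and it assumes that a changepoint at position k means a segment ends at observation k.

```r
# Hypothetical helper (not part of the changepoint package):
# mean of each segment, given the last index of every segment except the final one.
segment_means <- function(x, cpts = integer(0)) {
  ends <- c(sort(cpts), length(x))     # last index of each segment
  starts <- c(1, head(ends, -1) + 1)   # first index of each segment
  mapply(function(s, e) mean(x[s:e]), starts, ends)
}

x <- c(1, 2, 4)
segment_means(x)            # no changepoint: one segment, the overall mean
segment_means(x, 2)         # one changepoint after the 2nd observation
segment_means(x, c(1, 2))   # two changepoints: every point is its own segment
```

With a changepoint after the second observation this gives 1.5 and 4, and with changepoints after the first and second observations it gives 1, 2 and 4, which are the values expected in the report.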
@rkillick rkillick added the bug label Nov 2, 2020
@rkillick rkillick self-assigned this Nov 2, 2020
rkillick (Owner) commented Mar 8, 2022

Due to other fixes, the last statement errors now.

For the first two, there was a bug that added an extra 0 to the start of the changepoint list, which is why all of the results start with 1. For the first example, the result should just be a mean of 2.33333: there is 1 segment, so mean(c(1,2,4)) is 2.33333.
The second example does have a bug: cpts.full shows an NA value in the table for 1 changepoint. Thus the best segmentation is still no changepoints, but it reports c(0,0,3) as the changepoint locations, which is clearly incorrect.

I will need to look further into this bug after the current release (2.2.3), as it only seems to happen with small datasets.
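
One plausible way a stray leading 0 could turn into a reported mean of 1: in R, `1:0` counts down to `c(1, 0)`, and a 0 index is silently dropped, so `x[1:0]` is just `x[1]`. The sketch below assumes segment means are computed by looping over consecutive changepoint pairs; `naive_segment_means` is a hypothetical stand-in written for illustration, not the package's actual code.

```r
# Hypothetical stand-in for a segment-mean loop; NOT the package's actual code.
naive_segment_means <- function(x, cpts) {
  bounds <- c(0, cpts)                       # prepend the start boundary
  sapply(seq_along(cpts), function(j) {
    mean(x[(bounds[j] + 1):bounds[j + 1]])   # when both bounds are 0, this is x[1:0], i.e. x[1]
  })
}

x <- c(1, 2, 4)
naive_segment_means(x, 3)            # single segment ending at 3: the overall mean
naive_segment_means(x, c(0, 3))      # spurious 0: the first "segment" collapses to x[1]
naive_segment_means(x, c(0, 0, 3))   # two spurious 0s: two leading copies of x[1]
```

Under these assumptions the second and third calls return a leading 1 (or two) followed by 2.333333, the same pattern as the output in the report.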

tdhock (Author) commented Mar 8, 2022

Hi @rkillick, thanks for the update. I can confirm that the third call now errors, but the first two issues persist.

> changepoint::cpt.mean(c(1,2,4), penalty="Manual", method="BinSeg", pen.value=0, Q=1)@param.est
$mean
[1] 1.000000 2.333333

Warning message:
In BINSEG(sumstat, pen = pen.value, cost_func = costfunc, minseglen = minseglen,  :
  The number of changepoints identified is Q, it is advised to increase Q to make sure changepoints have not been missed.
> changepoint::cpt.mean(c(1,2,4), penalty="Manual", method="BinSeg", pen.value=0, Q=2)@param.est
$mean
[1] 1.000000 1.000000 2.333333

Warning message:
In BINSEG(sumstat, pen = pen.value, cost_func = costfunc, minseglen = minseglen,  :
  The number of changepoints identified is Q, it is advised to increase Q to make sure changepoints have not been missed.
> changepoint::cpt.mean(c(1,2,4), penalty="Manual", method="BinSeg", pen.value=0, Q=3)@param.est
Error in BINSEG(sumstat, pen = pen.value, cost_func = costfunc, minseglen = minseglen,  : 
  Q is larger than the maximum number of segments 2.5 
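
A check of this kind could compare Q against an integer upper bound. As a rough sketch, assuming every segment must contain at least `minseglen` observations, n data points allow at most `floor(n / minseglen) - 1` changepoints. `max_changepoints` and `check_Q` are hypothetical names used for illustration, not functions from the package, and this is not how the package derives its own limit.

```r
# Hypothetical helpers (not part of the changepoint package).
# Largest number of changepoints that fits in n points when each
# segment needs at least `minseglen` observations.
max_changepoints <- function(n, minseglen = 1) {
  floor(n / minseglen) - 1
}

# Stop with an informative message when Q cannot be satisfied.
check_Q <- function(Q, n, minseglen = 1) {
  limit <- max_changepoints(n, minseglen)
  if (Q > limit) {
    stop("Q = ", Q, " exceeds the maximum of ", limit,
         " changepoints for n = ", n)
  }
  invisible(TRUE)
}

check_Q(2, n = 3)                                         # passes: 1, 2, 4 can be three segments
try(check_Q(3, n = 3))                                    # errors: 3 changepoints need 4 points
```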

rkillick (Owner) commented Mar 8, 2022

I haven't committed the version to github yet!

@rkillick rkillick closed this as completed Mar 8, 2022
@rkillick rkillick reopened this Mar 8, 2022

sanjmeh commented Oct 13, 2022

Is this R package under active development and maintenance? Can the package author please update old issues and bring them to a logical end? Alternatively, if the package author recommends another package that has superseded this one, it would be nice to know that. Thanks.

rkillick (Owner) commented

Yes, this package is under active development and maintenance. As you can imagine, when someone is an academic and COVID hits, other tasks need to take priority. This is unfortunately what happens when packages are maintained by volunteers and not paid staff.
We have been working on a new major release and have incorporated these fixes into it. The new term has just started, so we are distracted by that, but we are hoping to have it out in early 2023. As stated above, this only happens for very small datasets (toy examples rather than things we encounter in reality, if you will) and so isn't a priority for a patch fix.

If you would like to submit a patch fix, we would be happy to accept the pull request so that others can benefit.


3 participants