Fit of continuous data to discrete distributions should return an error #120
Comments
Some details about the tests I ran (script here):
fitls and fitpower
The density functions for these sads models correctly output zero for non-integer values, making the log-likelihood -Inf:
> x1 <- c(rls(100, N=1000, 10), 1.18)
> fitls(x1) ## fit with LogLik=-Inf and issues warnings about non-integer values
Maximum likelihood estimation
Type: discrete species abundance distribution
Species: 101 individuals: 2368.18
Call:
mle2(minuslogl = function (N, alpha)
-sum(dls(x, N, alpha, log = TRUE)), start = list(alpha = 21.4238153469672),
method = "Brent", fixed = list(N = 2368.18), data = list(
x = list(1, 75, 3, 4, 1, "etc")), lower = 0, upper = 101L)
Coefficients:
N alpha
2368.18 101.00
Log-likelihood: -Inf
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()[1]
Warning message:
In dls(x, N, alpha, log = TRUE) : non integer values in x
fitmzsm
The density function incorrectly returns non-zero values for non-integer inputs (#119), so fitting data that contain continuous values returns a finite log-likelihood:
> x1 <- c(rmzsm(999, 1000, 20), 1.18)
> fitmzsm(x1) ## fit with LogLik!=-Inf and issues warnings about non-integer values.
Maximum likelihood estimation
Type: discrete species abundance distribution
Species: 1000 individuals: 14280.18
Call:
mle2(minuslogl = function (J, theta)
-sum(dmzsm(x, J = J, theta = theta, log = TRUE)), start = list(
theta = 1000L), method = "Brent", fixed = list(J = 14280.18),
data = list(x = list(24, 2, 10, 2, 35, "etc")), lower = 0.001,
upper = 1000L)
Coefficients:
J theta
14280.1800 242.8042
Log-likelihood: -3380.22
> warnings()[1]
Warning message:
In dls(x, N, alpha, log = TRUE) : non integer values in x
fitgeom, fitpowbend, fitnbinom, fitvolkov
At least in my tests, these did not fit because of convergence problems.
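The -Inf log-likelihood in the fitls and fitpower case follows from a single zero-density observation: log(0) is -Inf, and so is any sum that includes it. A minimal base-R sketch of that mechanism, using dpois as a stand-in for dls (an assumption made only so the example runs without the sads package; dpois likewise returns zero density, with a warning, for non-integer x):

```r
# Stand-in example: dpois() plays the role of a discrete density such as dls().
x_ok  <- c(1, 3, 4, 1, 7)       # valid integer abundances
x_bad <- c(x_ok, 1.18)          # the same data with one non-integer value

# Log-likelihood of the integer data is finite.
ll_ok <- sum(dpois(x_ok, lambda = 3, log = TRUE))

# The non-integer value has density 0, so its log-density is -Inf
# and the summed log-likelihood is -Inf regardless of the other points.
ll_bad <- suppressWarnings(sum(dpois(x_bad, lambda = 3, log = TRUE)))

is.finite(ll_ok)
ll_bad
```

Because every parameter value gives the same -Inf, the optimizer has nothing to compare, which is consistent with alpha ending at the upper bound in the fitls output above.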
OK, what we need to decide is how to deal with this in a coherent fashion. I believe that all fitting procedures should return an error if invalid data are entered, but the problem is: what is invalid data? Non-integer numbers are invalid for discrete fits, that's fine, but negative numbers are also invalid for all distributions, and they are still fitted (with ll=-Inf).
This is particularly troubling for rad fits: because the data are converted to ranks, no checking is done at all and the fit seems valid.
We can add a check to all fitting functions to make sure x is positive, and also integer if the distribution is discrete. Are we overlooking some other case of invalid data?
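One way the proposed check could look is sketched below. The helper name check_abundances, its arguments, and the tolerance are hypothetical and not part of sads; the tests for missing and non-finite values are an extra assumption beyond the two cases named above.

```r
# Hypothetical validation helper (not part of the sads package).
# Stops with an error on invalid abundance data; returns x invisibly otherwise.
check_abundances <- function(x, discrete = TRUE,
                             tol = sqrt(.Machine$double.eps)) {
  if (!is.numeric(x) || length(x) == 0L)
    stop("x must be a non-empty numeric vector")
  # Assumption: missing or infinite abundances are also treated as invalid.
  if (anyNA(x) || any(!is.finite(x)))
    stop("x contains missing or non-finite values")
  # Abundances must be strictly positive for every distribution.
  if (any(x <= 0))
    stop("x contains non-positive values")
  # Discrete distributions additionally require whole numbers.
  if (discrete && any(abs(x - round(x)) > tol))
    stop("x contains non-integer values, but the distribution is discrete")
  invisible(x)
}

check_abundances(c(1, 75, 3, 4))                   # passes silently
check_abundances(c(0.5, 1.18), discrete = FALSE)   # passes for continuous fits
try(check_abundances(c(1, 75, 3, 1.18)))           # error: non-integer values
try(check_abundances(c(-1, 2), discrete = FALSE))  # error: non-positive values
```

Calling such a helper at the top of each fitting function, before any conversion to ranks, would make the rad fits fail on the same inputs as the sad fits.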
Most of the functions that fit discrete sads proceed to fitting even when there are non-integer values in the data. I got fits to data with continuous values from fitls, fitpower and fitmzsm. Fits from the other discrete sads did not converge in the tests I've done, but they return an error from mle2, showing that they proceed to fitting. Only fitpoilog stops when there is any continuous value in the data, which is an error check from poilog::fitpoilog. Which makes sense to me.