math - R optim(){fExtremes} gets 0 hessian matrix -
i using r {fextremes} find best parameters of gev distribution data (a vector). following error message
error in solve.default(fit$hessian) : lapack routine dgesv: system singular: u[1,1] = 0
i traced fit$hessian, found hessian matrix sigular matrix, of elements 0s. source code (https://github.com/cran/fextremes/blob/master/r/gevfit.r) of gevfit() shows fit$hessian calculated optim(). output parameters same value initial parameters. wondering problems of data cause problem? copied code here
> min(sample); [1] 5.240909 > max(sample) [1] 175.8677 > length(sample) [1] 6789 > mean(sample) [1] 78.04107 >para<-gevfit(sample, type = "mle") error in solve.default(fit$hessian) : lapack routine dgesv: system singular: u[1,1] = 0 fit = optim(theta, .gumllh, hessian = true, ..., tmp = data) > fit $par xi -0.3129225 mu 72.5542497 beta 16.4450897 $value [1] 1e+06 $counts function gradient 4 na $convergence [1] 0 $message null $hessian xi mu beta xi 0 0 0 mu 0 0 0 beta 0 0 0
i updated dataset on google docs: https://docs.google.com/spreadsheets/d/1irrpjmdrrjphnmfilism_p0efv_ot4hlesa6kwmnljc/edit?usp=sharing
this going long story, possibly more suited https://stats.stackexchange.com/.
====== part 1 -- problem ======
this sequence generating error:
library(fextremes) samp <- read.csv("optimdata.csv")[ ,2] ## not converge para <- gevfit(samp, type = "mle")
we facing typical cause of lack-of-convergence when using optim()
, friends: inadequate starting values optimisation.
to see goes wrong, let use pwm estimator (http://arxiv.org/abs/1310.3222); consists of analytical formula, hence not incur convergence problems, since makes no use of optim()
:
para <- gevfit(samp, type = "pwm") fitpwm<- attr(para, "fit") fitpwm$par.ests
the estimated tail parameter xi
negative, corresponding bounded upper tail; in fact fitted distribution displays more "upper tail boundedness" sample data, can see "leveling off" of quantile-quantile graph @ right:
qqgevplot <- function(samp, params){ probs <- seq(0.1,0.99,by=0.01) qqempir <- quantile(samp, probs) qqtheor <- qgev(probs, xi=params["xi"], mu=params["mu"], beta=params["beta"]) rang <- range(qqempir,qqtheor) plot(qqempir, qqtheor, xlim=rang, ylim=rang, xlab="empirical", ylab="theoretical", main="quantile-quantile plot") abline(a=0,b=1, col=2) } qqgevplot(samp, fitpwm$par.ests)
for xi<0.5
mle estimator not regular (http://arxiv.org/abs/1301.5611): value of -0.46 estimated pwm xi
close that. pwm estimates used internally gevfit()
starting values optim()
: can see if print out code function gevfit()
:
print(gevfit) print(.gevfit) print(.gevmlefit)
the starting value optim theta
, obtained pwm. specific data @ hand, starting value not adequate, in leads non-convergence of optim()
.
====== part 2 -- solutions? ======
solution 1 use para <- gevfit(samp, type = "pwm")
above. if you'd use ml, have specify starting values optim()
. unfortunately, fextremes
package not make easy so. can re-define own version of .gevmlefit
include those, e.g.
.gevmlefit <- function (data, block = na, start.param, ...) { data = as.numeric(data) n = length(data) if(missing(start.param)){ theta = .gevpwmfit(data)$par.ests }else{ theta = start.param } fit = optim(theta, .gevllh, hessian = true, ..., tmp = data) if (fit$convergence) warning("optimization may not have succeeded") par.ests = fit$par varcov = solve(fit$hessian) par.ses = sqrt(diag(varcov)) ans = list(n = n, data = data, par.ests = par.ests, par.ses = par.ses, varcov = varcov, converged = fit$convergence, nllh.final = fit$value) class(ans) = "gev" ans } ## diverges, above .gevmlefit(samp) ## diverges, above startp <- fitpwm$par.ests .gevmlefit(samp, start.param=startp) ## converges startp <- structure(c(-0.1, 1, 1), names=names(fitpwm$par.ests)) .gevmlefit(samp, start.param=startp)$par.ests
now check out: beta
estimated pwm 0.1245; changing tiny amount, mle made converge:
startp <- fitpwm$par.ests startp["beta"] startp["beta"] <- 0.13 .gevmlefit(samp, start.param=startp)$par.ests
this illustrates blindly optim()
ise works until doesn't , might turn quite delicate endeavour indeed. reason, might useful leave reply here, rather migrate crossvalidated.
Comments
Post a Comment