Robust mean

Synopsis


use oakleaf

subroutine rmean(data,mean,stderr,stdsig,scale,reltol,robfun,rankinit,flag,verbose,report)

   real, dimension(:), intent(in) :: data
   real, intent(in out) :: mean
   real, intent(out), optional :: stderr, stdsig
   real, intent(in out), optional :: scale
   real, intent(in) :: reltol
   character(len=*), intent(in) :: robfun
   logical, intent(in), optional :: rankinit
   integer, intent(out), optional :: flag
   logical, intent(in), optional :: verbose
   character(len=*), intent(in), optional :: report

end subroutine rmean
  

The subroutine is generic, supporting all REAL32, REAL64 and REAL128 datatypes. The reccomended, if the nature of the problem allows it, is REAL64. REAL32 has low relative accuracy, whilst REAL128 is slow.

Description

The robust mean subroutine rmean() estimates the mean, and related statistical quantities. The data are supposed to be sample from the Normal distribution with possible contamination of a unknown origin.

The parameters for both the location and the dispersion of the Normal distribution N(mean,stdsig) are estimated. The true value X of the sample falls, with 68% probability, into the interval mean - stderr < X < mean + stderr.

Parameters

On input:

data
array of data,
mean
(optional) an initial estimate of the mean,
scale
(optional) an initial estimate of the scale (i.e. stdsig),
reltol
(optional) the relative accuracy of determination of the optimum (default: 2.4%),
robfun
(optional) select the robust function among: `hampel' (default), `tukey', `huber' or `square' (non-robust least square),
rankinit
(optional) estimate initial values by the order estimates (default: .true.)
verbose
(optional) print a detailed information about the calulations.
report
(optional) a filename of a report

The mandatory input parameter is the data. The procedure initialisation is done internally if rankinit == .true. (default); otherwise, both mean,scale must be set by the calling procedure. The initial estimates are taken by the median and MAD described in qmean.

On output:

mean
the mean
stderr
(optional) the standard error
stdsig
(optional) the standard deviation
scale
(optional) the scale
flag
(optional) a flag, see the status codes.

The mandatory output parameter is the robust mean. The check of the status is highly recommended.

Example

Save the program to the file example.f08:

program example

   use oakleaf

   real, dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
   real :: mean,stderr,stdsig

   call rmean(data,mean,stderr,stdsig)
   write(*,*) 'rmean:',mean,' stderr:',stderr,'stdsig:',stdsig

end program example
  

Than compile and run it:

$ gfortran -I/usr/local/include example.f08 -L/usr/local/lib -loakleaf -lminpack
$ ./a.out
rmean:   3.00016236      stderr:  0.688852131     stdsig:   1.54032016
  

Reports

The report file keeps the detailed track of the algorithm for visualisation or debugging purposes. It is not recommended on a common use as it slow-down a run.

The format is (see src/sqpfmean.f08) self-descibing and includes: initial estimates, the optimisation steps (current values, trust-region radius, the step, the gradient, the quasi-Newton hessian and the trust-step flag) and final number of iterations, function evaluations, restores, steps out of the trust-region, indefinite and hard steps and the final hessian computed analytically.

References

Hroch,F.: A contribution to the estimate of the robust mean value, in preparation