62E15 Exact distribution theory
Higher Order Asymptotics for the MSE of Robust M-Estimators of Location on Shrinking Total Variation Neighborhoods
- In the setup of neighbourhoods shrinking with sample size n about an ideal (“smooth”) central model, Rieder (1994) determines the optimal asymptotically linear estimator with respect to the asymptotic MSE evaluated uniformly on these neighbourhoods. We investigate to what degree this asymptotic optimality carries over to finite sample sizes in the context of total variation neighbourhoods. In contrast to the usual higher order asymptotics, we do not approximate distribution functions (or densities) but expand the risk directly, applying Edgeworth and Taylor expansions. When determining the exact finite-sample risk for sample sizes n>2, Kohl (2005) showed that convergence towards the asymptotic risk is faster by one order for total variation neighbourhoods than for convex contamination neighbourhoods, and conjectured that this is caused by the higher symmetry of total variation neighbourhoods. Looking at the MSE in the asymptotically optimal setup, we obtain the same results as Kohl for the finite-sample risk. Furthermore, by direct expansion of the MSE we show that symmetry of the ideal distribution F is essential for the higher speed of convergence. We confirm our result by a cross-check in the ideal model and illustrate our theoretical investigations for F = N(0;1). We also address the question of how a least favourable deviation from the ideal model can actually be realized in a finite-sample context. We restrict ourselves to the symmetric case and to monotone odd influence curves of Hampel-type form, and show that a certain kind of manipulation mechanism attains our theoretical results up to the desired order. It turns out that these results are accessible in the finite-sample context only if we require the finite sample to attain the minimum and maximum of the given influence curve with a certain probability. Depending on this probability, we derive a lower bound on the sample size n.
We make the algorithm concrete, to sufficient accuracy, by determining the number K of observations to be manipulated as well as the bound c beyond which observations have maximum influence on the MSE according to the value of the influence curve. We give a restrictive condition on the distribution of K under which the probability that the (K:n)-quantile exceeds the now concrete bound c is exponentially negligible. The bound c is calculated explicitly for F = N(0;1), and suitable four-point distributions for K are given that satisfy all the previously stated conditions. We thus obtain an algorithm that generates observations from a least favourable distribution within a finite total variation neighbourhood w.r.t. an asymptotically optimal risk.
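The cross-check in the ideal model and a manipulation mechanism of the kind described above can be illustrated numerically. The Python sketch below is only a rough illustration under assumptions that are not taken from the source: a Huber-type clipped score with clipping height k = 1.345 stands in for the Hampel-type influence curve, the mechanism simply moves the K smallest observations to the value +c, and K is drawn from a Binomial(n, r/sqrt(n)) distribution instead of the four-point distributions mentioned in the abstract. All function names and default values are hypothetical.

```python
import math
import random


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def asymptotic_variance(k):
    """Asymptotic variance E[psi^2] / (E[psi'])^2 of the Huber-type
    location M-estimator with clipping height k at F = N(0,1)."""
    p = 2.0 * Phi(k) - 1.0  # E[psi'] = P(|X| < k)
    e_psi2 = p - 2.0 * k * phi(k) + 2.0 * k * k * (1.0 - Phi(k))
    return e_psi2 / (p * p)


def huber_location(xs, k, iters=60):
    """Solve sum(clip(x_i - t, -k, k)) = 0 for t by bisection.
    The left-hand side is non-increasing in t, so bisection on
    [min(xs), max(xs)] is safe. Scale is taken as known (= 1)."""
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = sum(max(-k, min(k, x - mid)) for x in xs)
        if g > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def manipulate(xs, K, c):
    """ASSUMED mechanism (not the source's): replace the K smallest
    observations by the value +c."""
    ys = sorted(xs)
    return [c] * K + ys[K:]


def mc_scaled_mse(n, reps, k=1.345, r=0.0, c=5.0, seed=0):
    """Monte Carlo estimate of n * MSE of the estimator at N(0,1).
    ASSUMPTION: K ~ Binomial(n, r / sqrt(n)); r = 0 is the ideal model."""
    rng = random.Random(seed)
    eps = min(1.0, r / math.sqrt(n))
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        K = sum(rng.random() < eps for _ in range(n))
        t = huber_location(manipulate(xs, K, c), k)
        total += t * t
    return n * total / reps


if __name__ == "__main__":
    print("asymptotic variance:", asymptotic_variance(1.345))
    print("n*MSE, ideal model :", mc_scaled_mse(50, 2000, r=0.0))
    print("n*MSE, manipulated :", mc_scaled_mse(50, 2000, r=0.5))
```

In the ideal model (r = 0) the simulated n * MSE should lie close to the asymptotic variance, while for r > 0 the manipulated samples produce a visibly larger value. How quickly the gap to the asymptotic risk closes as n grows is the question the expansions above address; this simulation does not reproduce those expansions.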