On the Evaluation of Sample Size Required for a Good Approximation by the Normal Curve for Some Statistics

Authors

  • Janusz L. Wywiał, University of Economics in Katowice, Department of Statistics, Econometrics and Mathematics

DOI:

https://doi.org/10.15678/ZNUEK.2017.0965.0502

Keywords:

sample size, central limit theorem, sampling scheme, computer simulation, chi-square test of goodness of fit

Abstract

Testing hypotheses or evaluating confidence intervals requires knowledge of the distributions of the statistics involved. It is convenient if the probability distribution of a statistic converges to the normal distribution as the sample size grows. This paper examines how to determine the sample size needed to ensure that a statistic's distribution departs from the normal distribution by no more than an assumed amount. Two procedures for evaluating the necessary sample size are proposed. The first is based on the Berry-Esseen inequality, while the second is based on a simulation procedure. In the simulation procedure, the distribution of the sample mean is generated by replicating samples of a fixed size. Next, the normality of the distribution of the generated sample means is tested. The size of the generated samples is gradually increased until the hypothesis that the sample mean is normally distributed is no longer rejected. This procedure is also applied to statistics other than the sample mean.
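The two procedures described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the paper's implementation. The Berry-Esseen part uses the standard form of the inequality for independent, identically distributed observations, sup|F_n(x) - Φ(x)| ≤ C·ρ/(σ³·√n) with ρ = E|X - μ|³, and the constant C = 0.4748 is an assumed value of the universal constant (the paper may use a different one). In the simulation part, the function names, the ten equiprobable classes of the chi-square goodness-of-fit test, the standardization with known μ and σ, and the Wilson-Hilferty approximation of the chi-square critical value are all choices made for this example.

```python
import math
import random
from statistics import NormalDist

BERRY_ESSEEN_C = 0.4748  # assumed value of the universal constant


def berry_esseen_sample_size(rho, sigma, eps, c=BERRY_ESSEEN_C):
    """Smallest n for which the bound c*rho/(sigma^3*sqrt(n)) is at most eps."""
    return math.ceil((c * rho / (sigma ** 3 * eps)) ** 2)


def chi2_critical(df, alpha):
    """Approximate chi-square critical value (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(1 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1 - a + z * math.sqrt(a)) ** 3


def normality_rejected(z_values, k=10, alpha=0.05):
    """Chi-square goodness-of-fit test of N(0, 1) with k equiprobable classes."""
    std_normal = NormalDist()
    counts = [0] * k
    for z in z_values:
        # The N(0, 1) CDF maps z to (0, 1); equal-width cells there are
        # equiprobable classes under the null hypothesis.
        idx = min(int(std_normal.cdf(z) * k), k - 1)
        counts[idx] += 1
    expected = len(z_values) / k
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat > chi2_critical(k - 1, alpha)


def simulated_sample_size(draw, mu, sigma, start=5, step=5,
                          max_n=500, reps=2000, seed=1):
    """Increase n until normality of the standardized sample mean is not rejected.

    draw(rng) returns one observation from the population with known
    mean mu and standard deviation sigma. Returns None if max_n is exceeded.
    """
    rng = random.Random(seed)
    n = start
    while n <= max_n:
        z_values = []
        for _ in range(reps):
            mean = sum(draw(rng) for _ in range(n)) / n
            z_values.append((mean - mu) * math.sqrt(n) / sigma)
        if not normality_rejected(z_values):
            return n
        n += step
    return None


if __name__ == "__main__":
    # Exponential population with rate 1: mu = sigma = 1, rho = 12/e - 2.
    rho = 12 / math.e - 2
    print(berry_esseen_sample_size(rho, 1.0, 0.05))
    print(simulated_sample_size(lambda rng: rng.expovariate(1.0), 1.0, 1.0))
```

The Berry-Esseen bound holds for every distribution with a finite third absolute moment, so the sample size it yields tends to be conservative. The simulated value depends on the number of replications, the significance level and the random seed, and should be read as indicative only.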

Published

2017-11-30

Section

Articles