Electronic Proceedings of the Eighteenth Annual International Conference on Technology in Collegiate Mathematics

Orlando, Florida, March 16-19, 2006

Paper C131

This is an electronic reprint, reproduced by permission of Pearson Education Inc. Originally appeared in the Proceedings of the Eighteenth Annual International Conference on Technology in Collegiate Mathematics, Edited by Joanne Foster, ISBN 0-321-49160-2, Copyright (C) 2007 by Pearson Education, Inc.

Creating Realistic Data Sets with Specified Properties Via Simulation

Robert Goldman

Simmons College
Boston, MA 02115

John D. McKenzie, Jr.

Babson College
Babson Park, MA 02457


There are many situations in which an instructor is confronted with a summarized data set. For example, he or she may discover an interesting data set for which only the mean, standard deviation, and sample size are given. Such a summary may appear in a newspaper or journal article or in a textbook example or exercise. Without the raw data, however, the instructor may decide not to use the data set, because it is impossible to present a visual display such as a graph or a table, something that the statistical community considers essential for a complete data analysis. Nor can he or she illustrate the far more common situation of analyzing raw data when only summarized data are given.

This paper describes how to use simulation to generate a raw data set with specified characteristics. Some introductory applied statistics courses introduce simulation by showing how to generate a set of random data from a normal distribution. However, the creation of realistic data sets is almost never covered in such courses, because most instructors are unaware of the ease with which such data can be generated using statistical software such as Minitab.
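As a rough illustration of the idea, the sketch below generates a raw data set whose sample mean and sample standard deviation match given summary values. It is written in Python rather than Minitab, and it uses one common approach (draw normal data, standardize, then rescale) that is assumed here for illustration and is not necessarily the procedure described in the paper. The function name `simulate_with_summary`, the seed, and the summary values (n = 25, mean 72.5, standard deviation 8.3) are hypothetical.

```python
import random
import statistics


def simulate_with_summary(n, target_mean, target_sd, seed=None):
    """Return n simulated values whose sample mean and sample SD equal the targets.

    Hypothetical helper for illustration only; not the paper's own procedure.
    """
    if n < 2:
        raise ValueError("need at least two observations to define a sample SD")

    rng = random.Random(seed)

    # Step 1: draw a random sample from a standard normal distribution.
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Step 2: standardize so the sample has mean 0 and sample SD 1
    # (up to floating-point rounding).
    raw_mean = statistics.fmean(raw)
    raw_sd = statistics.stdev(raw)  # sample SD, divisor n - 1
    z = [(x - raw_mean) / raw_sd for x in raw]

    # Step 3: rescale and shift to the published summary statistics.
    return [target_mean + target_sd * value for value in z]


# Illustrative summary values (assumed, not taken from any real source).
data = simulate_with_summary(n=25, target_mean=72.5, target_sd=8.3, seed=2006)
print(len(data), round(statistics.fmean(data), 4), round(statistics.stdev(data), 4))
```

The resulting values can then be graphed or tabulated like any raw data set. Because the values come from a normal distribution, they may need rounding or truncation to look realistic for a particular variable, and rounding will shift the summary statistics slightly.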

Keywords: statistics, simulation, Minitab