StatLib
Perhaps the most widely known statistical archive, Carnegie Mellon's
StatLib offers extensive datasets as well as other statistical resources.
Data and Story Library
DASL (pronounced "dazzle") is a project sponsored by Cornell University
containing an online library of datafiles and "stories" illustrating basic
statistical concepts. Stories are classified according to statistical methods
and major topics of interest.
for Teaching Statistics
Features Dr. B's Wide World of Web Data (links to hundreds of
online datasets all over the world, organized by subject heading) and
Dr. B's Data Gallery (a collection of variables with GIFs of histograms,
boxplots, stem-and-leaf plots, normal-probability plots, descriptive statistics,
and raw data, used to illustrate what real data look like and for examples).
Time Series Data Library
Robert Hyndman of Monash University has identified over 800 datasets
and organized them by topic.
UCI Machine Learning Repository
Acts as a repository of over 100 databases, domain theories and data
generators used by the machine learning community for the empirical analysis
of machine learning algorithms.
Statistical Reference Datasets (StRD)
The purpose of this project is to improve the accuracy of statistical
software by providing reference datasets with certified computational results
that enable the objective evaluation of statistical software.
FedStats
Maintained by the Federal Interagency Council on Statistical Policy
to facilitate access to the statistics and information for over 70 agencies
of the Federal Government.
Drop us an
e-mail if you have a comment, suggestion,
or online resource you would like to share.
Multivariate Data Analysis
Hair, Black, Babin and Anderson