Multivariate Data Analysis  Sixth Edition by Hair, Black, Babin, Anderson and Tatham


A number of datasets are available to enable students and faculty to perform the multivariate analyses described in the textbook.  While some techniques require specialized datasets (e.g., multidimensional scaling, conjoint analysis and structural equation modeling), many of the techniques are performed using conventional survey data. 


We have collected all of the datasets needed for each edition, along with some supplemental datasets and documentation.  Each dataset is in SPSS format (.SAV), which is easily read by most statistical packages.  Moreover, the basic datafiles are also provided in Excel format for ease of use in other statistical packages.


Select the edition below for a compressed (zipped) file of all datasets.  Descriptions of the datasets are provided in documentation within each file as well as in the section below.

Download the Complete Set of Datasets

(Right click, then "Save Link As")


   Eighth Edition  
Descriptions of the Individual Datasets

HBAT:  Actually a series of datasets used with many of the techniques.






HBAT: the primary database, with multiple metric and nonmetric variables, allowing for use in most of the multivariate techniques.
HBAT_200: an expanded dataset, comparable to HBAT except with 200 rather than 100 respondents, used in MANOVA.
HBAT_MISSING: a reduced dataset with 70 respondents and missing data in the variables.  Used with techniques for diagnosis and remedy of missing data (Chapter 2).
HBAT_SPLITS: contains two variables that split the HBAT dataset into 50/50 and 60/40 subsamples.  This dataset can be merged with the original HBAT dataset if desired.
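The merge of HBAT_SPLITS with HBAT mentioned above can be sketched as follows. This is only an illustration under stated assumptions: it presumes both files have been exported to CSV and share a respondent identifier, and the column names `id`, `x1`, `split50` and `split60` are hypothetical, as are the toy rows, which are not taken from the actual datasets.

```python
import csv
import io

# Toy stand-ins for CSV exports of HBAT and HBAT_SPLITS (NOT real data).
# All column names here ("id", "x1", "split50", "split60") are hypothetical.
HBAT_CSV = """id,x1
1,8.5
2,8.2
3,9.2
4,6.4
"""
SPLITS_CSV = """id,split50,split60
1,0,1
2,1,1
3,0,0
4,1,0
"""


def merge_on_id(main_rows, split_rows, key="id"):
    """Attach the split indicators to each main row that shares the key."""
    splits_by_id = {row[key]: row for row in split_rows}
    merged = []
    for row in main_rows:
        extra = splits_by_id.get(row[key], {})
        # Copy every split column except the key itself onto the main row.
        merged.append({**row, **{k: v for k, v in extra.items() if k != key}})
    return merged


main_rows = list(csv.DictReader(io.StringIO(HBAT_CSV)))
split_rows = list(csv.DictReader(io.StringIO(SPLITS_CSV)))
merged = merge_on_id(main_rows, split_rows)

# Select one subsample using the (hypothetical) 50/50 indicator.
analysis_sample = [r for r in merged if r["split50"] == "0"]
print(len(merged), len(analysis_sample))
```

In a statistical package the same step is usually a one-line "merge files" or "join" command keyed on the respondent identifier; the actual variable names should be taken from the documentation inside the download.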
Structural Equation Modeling (SEM)


Download the set of five datasets or individual datasets.

HBAT_SEM: the original data responses from 400 individuals used to derive the input matrices for SEM programs (e.g., LISREL, EQS or AMOS).
HBAT_SEM_NOMISSING: the original dataset of 400 responses has two individuals with missing data.  This dataset replaces the missing values so that the resulting sample is 400 complete responses.
NEW -- HBAT400_6CON: the original data responses from 400 individuals with the addition of indicators for a sixth construct -- Supervisor Support
NEW -- HBAT PLS-SEM_No Missing Data: variant of HBAT_SEM_NOMISSING used for SmartPLS estimation in Chapter 13 (Excel version only)
NEW -- HBAT_SEM_FT_NOMISS and HBAT_SEM_PT_NOMISS: these two sub-samples of the HBAT_SEM_NOMISSING dataset are defined by employees' full-time or part-time status (variable C2).  They are used in the multi-group analysis presented in Chapter 12.
HBAT.COV, HBATF.COV and HBATM.COV: these three covariance matrices represent the overall sample, female respondents and male respondents, respectively.
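The full-time and part-time sub-samples listed above are defined by a single grouping variable, and that kind of split can be sketched as follows. Only the variable name C2 comes from the description; the coding (1 = full-time, 0 = part-time), the `id` and `JS1` columns, and the toy rows are assumptions made purely for illustration.

```python
from collections import defaultdict

# Toy records standing in for HBAT_SEM_NOMISSING rows (NOT real data).
# Assumed coding for C2: 1 = full-time, 0 = part-time. "id" and "JS1"
# are hypothetical column names.
rows = [
    {"id": 1, "C2": 1, "JS1": 5},
    {"id": 2, "C2": 0, "JS1": 3},
    {"id": 3, "C2": 1, "JS1": 4},
    {"id": 4, "C2": 1, "JS1": 2},
    {"id": 5, "C2": 0, "JS1": 5},
]

GROUP_LABELS = {1: "FT", 0: "PT"}  # assumed mapping, see note above


def split_by_group(records, var="C2"):
    """Partition records into sub-samples keyed by the grouping variable."""
    groups = defaultdict(list)
    for rec in records:
        groups[GROUP_LABELS[rec[var]]].append(rec)
    return dict(groups)


subsamples = split_by_group(rows)
for label, members in sorted(subsamples.items()):
    print(label, len(members))
```

Each resulting group would then be analyzed separately, which is the setup a multi-group analysis requires. Check the dataset documentation for the real coding of C2 before relying on any such mapping.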
Supplemental Chapter Datasets: Several chapters from past editions (Canonical Correlation, Conjoint Analysis, Multidimensional Scaling and Correspondence Analysis) have been shifted to online supplements to allow for additional material in the current edition.
Conjoint Analysis
HBAT_CPLAN: details the "full-profile" stimulus descriptions
HBAT_CONJOINT: contains the actual responses to the stimulus profiles


HBAT_MDS: used in MDS (multidimensional scaling)
HBAT_CORRESP: used for correspondence analysis
Other Datasets:
Two additional datasets are provided to allow students access to data other than the HBAT data files described in the textbook.
HATCO: this dataset has been used in past versions of the textbook and provides a simplified set of variables amenable to all of the basic multivariate techniques.
SALES: this dataset concerns sales training and comprises 80 respondents, representing a portion of the data collected by an academic researcher.

Drop us an e-mail if you have a comment, suggestion or online resource you would like to share.


Multivariate Data Analysis
Hair, Black, Babin and Anderson