Supplement Article  |   May 01, 2012
Model Choice and Sample Size in Item Response Theory Analysis of Aphasia Tests
 
Author Affiliations & Notes
  • William D. Hula
    VA Pittsburgh Healthcare System and University of Pittsburgh, PA
  • Gerasimos Fergadiotis
    Arizona State University, Tempe, AZ
  • Nadine Martin
    Temple University, Philadelphia, PA
  • Correspondence to William D. Hula: william.hula@va.gov
  • Editor: Swathi Kiran
  • Associate Editor: Diane Kendall
Article Information
Research Issues, Methods & Evidence-Based Practice / Language Disorders / Aphasia / Supplement: Select Papers From the 41st Clinical Aphasiology Conference
American Journal of Speech-Language Pathology, May 2012, Vol. 21, S38-S50. doi:10.1044/1058-0360(2011/11-0090)
History: Received August 12, 2011; Revised December 1, 2011; Accepted December 20, 2011

Purpose The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models.

Method Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from individuals with aphasia were analyzed, and the resulting item and person estimates were used to develop simulated test data for 3 sample size conditions. The simulated data were analyzed using a standard 1-parameter logistic (1-PL) model and 3 models that accounted for the influence of guessing: augmented 1-PL and 2-PL models and a 3-PL model. The model estimates obtained from the simulated data were compared to their known true values.
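The logic of the models compared above can be illustrated with a small sketch. The 3-PL item response function gives the probability of a correct response as a function of person ability (theta), item discrimination (a), item difficulty (b), and a lower asymptote (c) that captures guessing; fixing a = 1 and c = 0 recovers the 1-PL (Rasch) model, and fixing c at the chance rate of a 2-choice test (0.5) corresponds to the "augmented" models described here. The function and parameter names below are illustrative, not taken from the study's actual analysis code.

```python
import math
import random

def irt_prob(theta, a=1.0, b=0.0, c=0.0):
    """3-PL item response function: P(correct | theta).

    theta: person ability; a: discrimination; b: item difficulty;
    c: lower asymptote (pseudo-guessing parameter).
    a=1, c=0 gives the 1-PL model; c=0.5 fixes guessing at the
    chance rate of a 2-choice item.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def simulate_responses(thetas, items, c=0.5, seed=0):
    """Simulate dichotomous (0/1) responses for each person-item pair,
    as in the simulation design: draw a uniform number and score a
    correct response when it falls below the model probability."""
    rng = random.Random(seed)
    return [[int(rng.random() < irt_prob(theta, a, b, c))
             for (a, b) in items]
            for theta in thetas]
```

Note how a high lower asymptote compresses the usable probability range (from 0.5 to 1.0 rather than 0.0 to 1.0), which is one intuition for why guessing degrades the information a 2-choice item carries about ability.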

Results With small and medium sample sizes, an augmented 1-PL model was the most accurate at recovering the known item and person parameters; however, no model performed well at any sample size. Follow-up simulations confirmed that the large influence of guessing and the extreme easiness of the items contributed substantially to the poor estimation of item difficulty and person ability.

Conclusion Incorporating the assumption of guessing into IRT models improves parameter estimation accuracy, even for small samples. However, caution should be exercised in interpreting scores obtained from easy 2-choice tests, regardless of whether IRT modeling or percentage correct scoring is used.

Acknowledgments
This research was supported by Veterans Affairs Rehabilitation Research & Development Career Development Award C7476W, the VA Pittsburgh Healthcare System Geriatric Research Education and Clinical Center, and Grant DC01924-15 from the National Institutes of Health (National Institute on Deafness and Other Communication Disorders) to Temple University (PI N. Martin). The authors also gratefully acknowledge the support and assistance of Heather Harris Wright and Michelene Kalinyak-Fliszar. The contents of this paper do not represent the views of the Department of Veterans Affairs or the U.S. Government.