Item response theory (IRT) is most often concerned with the design and analysis of psychological and educational assessments (e.g., achievement tests, rating scales) that measure mental traits. Most educational tests are multidimensional, with each dimension measuring a different content domain or area. The objective of this study was to determine the number of dimensions and to identify the model that best represents students' proficiency in answering a question correctly and adequately characterizes the items in a test taken by statistics students.

The mixed-format test consisted of 16 dichotomous items and one polychotomous item. Multidimensional item response theory (MIRT) models were applied. A three-dimensional model combining a multidimensional four-parameter logistic model with a multidimensional graded response model (M4PL+MGR) fit the data better than all other models considered. Finally, a confirmatory M4PL+MGR model was applied. The item response curves of the confirmatory M4PL+MGR model are contrasted with those of a unidimensional three-parameter logistic model with a graded response model (3PL+GR), because unidimensional models are the most commonly used in current practice. Under the MIRT model, the item characteristic curves (ICCs) showed greater discriminating power than those of the unidimensional model and indicated different difficulty levels for some items.
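
To make the two model components concrete, the following is a minimal sketch of the response functions they imply. It assumes the common compensatory slope-intercept parameterization, in which a dichotomous item has slopes `a`, intercept `d`, lower asymptote `c`, and upper asymptote `u`, and a graded item has one slope vector and a set of decreasing category intercepts. The function names and all numeric values are hypothetical illustrations, not estimates from this study.

```python
import math


def linear_predictor(theta, a, d):
    """Compensatory combination of abilities: a . theta + d."""
    if len(theta) != len(a):
        raise ValueError("theta and a must have the same number of dimensions")
    return sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d


def m4pl_probability(theta, a, d, c, u):
    """Probability of a correct response under a multidimensional 4PL item.

    c is the lower asymptote (guessing) and u the upper asymptote (slipping);
    setting u = 1 gives the 3PL form, and c = 0 with u = 1 gives the 2PL form.
    """
    z = linear_predictor(theta, a, d)
    return c + (u - c) / (1.0 + math.exp(-z))


def graded_category_probabilities(theta, a, intercepts):
    """Category probabilities under a multidimensional graded response item.

    intercepts must be in decreasing order so that the cumulative
    probabilities P(X >= k) decrease with k.
    """
    cumulative = [1.0]
    for d_k in intercepts:
        z = linear_predictor(theta, a, d_k)
        cumulative.append(1.0 / (1.0 + math.exp(-z)))
    cumulative.append(0.0)
    # P(X = k) is the difference between adjacent cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Hypothetical three-dimensional examinee and item parameters.
    theta = [0.5, -0.2, 1.0]
    print(m4pl_probability(theta, a=[1.2, 0.0, 0.8], d=-0.3, c=0.15, u=0.95))
    print(graded_category_probabilities(theta, a=[0.9, 0.4, 0.0],
                                        intercepts=[1.5, 0.0, -1.5]))
```

In this form the unidimensional 3PL+GR comparison model corresponds to a single-element `theta` with `u` fixed at 1, which is why the multidimensional curves can separate items that look similar on a single ability scale.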