96-1 Context effects on examinations: The effects of time, item order, and
item difficulty. by Arthur H. Perlini, David L. Lind, & Bruno D. Zumbo
Abstract. Four studies were conducted to test the effects of contextual
factors -- time, item, chapter and difficulty arrangement -- on test performance.
In Experiment 1, we found an advantage for students writing a quiz in the
early and middle testing periods for an afternoon group; however, early
testing, compared to middle and late testing, was a disadvantage for an
evening group writing the same quiz. In a second study, we found that neither
chapter retrieval-cues nor sequenced chapter content-order affected exam
performance; that is, exam items forward sequenced with text coverage yielded
no advantage over those items that were reverse or random sequenced. In
a third study, we manipulated both chapter order (as above) and within-chapter
item order so that exam items in any chapter were not sequenced in the
same order as text coverage. There was no advantage to receiving exam questions
in one sequence over another. In a final study, item arrangement was manipulated
according to difficulty level (as described in the text test-bank) so that
students were presented with the same set of questions in one of three
arrangements: (a) easy-to-hard (EH), (b) hard-to-easy (HE), and (c) random
order (R). The results here indicate no effect for order of item difficulty
on performance. Taken together, the findings indicate that contextual factors
like time have an effect on overall performance; however, other factors
such as item, chapter and difficulty arrangements have little or no effect
on overall performance.
96-2 Robustness of validity and efficiency of the one-sample t-test in
the presence of normal contamination. by Bruno D. Zumbo & Martha J.
Jennings
Abstract. The performance of parametric tests given data which are
essentially normal but contain outliers is largely unknown. In this Monte
Carlo study the robustness of validity and efficiency for the one-sample
location problem are investigated. The Type I error rate and power of the
one-sample t test given a normal underlying population are compared with
the performance of this test given a systematic range of outlier contamination
in the underlying population. Sample sizes of 8, 16, 32, 64, and 128 are
included in the design. The robustness of validity results are explored
using three sets of regression models. The first set of models is constructed
using the parameters of the contamination model and is intended to inform
the scientific methodologist. The second set of models is constructed using
skewness and kurtosis values. A third set of models is developed using
a novel index of contamination. This set of models has practical relevance
to the data analyst confronted with outlier contaminated data. Robustness
of efficiency results are expressed using both power curves and a proposed
fairly stringent criterion for power. In general, the results indicate
that the one-sample t-test demonstrates fairly stringent robustness of
validity for all of the symmetric contamination explored. When contamination
is asymmetric the Type I error rate becomes inflated as the proportion
of contamination increases. If robustness of validity is intact, power
is not greatly affected when medium or large effect sizes are examined.
This is not necessarily true for small effect sizes and the problems are
further exacerbated when sample sizes are also small.
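As a rough illustration of the kind of simulation described above, the sketch below estimates the empirical Type I error rate of the one-sample t test when samples come from a contaminated normal population. It is not the authors' design: the mixture form (a standard normal with a small proportion of observations drawn from a wider normal), the parameter values, the replication count, and all function names are assumptions made for illustration. Only the sample size of 16 comes from the abstract.

```python
import math
import random
import statistics

N = 16               # one of the sample sizes named in the abstract
T_CRIT_DF15 = 2.131  # two-sided 5% critical value of Student's t, 15 df


def contaminated_sample(rng, prop, shift, scale):
    """Draw N values: N(0, 1) with probability 1 - prop, else N(shift, scale^2).

    This mixture is an assumed contamination model, not the paper's.
    """
    return [
        rng.gauss(shift, scale) if rng.random() < prop else rng.gauss(0.0, 1.0)
        for _ in range(N)
    ]


def t_statistic(sample, mu0=0.0):
    """One-sample t statistic for H0: population mean equals mu0."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - mu0) / standard_error


def rejection_rate(prop=0.1, shift=0.0, scale=3.0, reps=4000, seed=1):
    """Proportion of replications in which |t| exceeds the 5% critical value."""
    rng = random.Random(seed)
    rejections = sum(
        abs(t_statistic(contaminated_sample(rng, prop, shift, scale))) > T_CRIT_DF15
        for _ in range(reps)
    )
    return rejections / reps


if __name__ == "__main__":
    print("no contamination:       ", rejection_rate(prop=0.0))
    print("symmetric contamination:", rejection_rate(prop=0.1, shift=0.0))
```

With shift set to 0 the contamination is symmetric and the null hypothesis remains true, so the rejection rate can be compared directly with the nominal 5% level. A nonzero shift gives an asymmetric mixture whose mean is no longer 0, so in that case the result depends on which location the null hypothesis is taken to refer to.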
96-3 Variable importance in regression and related analyses. by D. Roland
Thomas, Edward Hughes, & Bruno D. Zumbo
Abstract. A measure of variable importance that was axiomatically
derived by Pratt (1987) is considered, and it is shown that a sample estimate
of this measure can be derived by means of a geometric argument. Pratt's
measure can be negative, a disincentive for prospective users of the method.
In the paper, the geometric approach is used to show that negative importance
is associated with multicollinearity of the explanatory variables, which
puts the issue of negative importance into perspective. A multicollinear
example is provided to illustrate the case of negative importance. The
example demonstrates that removal of the offending variable yields interpretable
positive values of Pratt's measure.
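To make the idea of negative importance concrete, the sketch below computes the usual sample form of Pratt's measure, d_j = (standardized coefficient of x_j) x (correlation of x_j with y) / R-squared, for a regression with two predictors. The correlation values are invented for illustration and are not taken from the paper; the function name is likewise hypothetical.

```python
def pratt_two_predictors(r_y1, r_y2, r_12):
    """Pratt importance for two predictors, from correlations alone.

    r_y1, r_y2: correlations of each predictor with the response.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    # Standardized regression coefficients for the two-predictor case.
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    # R-squared equals the sum of beta_j * r_yj, so the measures sum to 1.
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1 * r_y1 / r_squared, beta2 * r_y2 / r_squared


if __name__ == "__main__":
    # Hypothetical, highly collinear predictors (r_12 = 0.8).
    d1, d2 = pratt_two_predictors(r_y1=0.6, r_y2=0.3, r_12=0.8)
    print(d1, d2)
```

With these made-up correlations the second predictor has a negative standardized coefficient but a positive correlation with the response, so its importance comes out near -0.33 while the first predictor's is near 1.33. The two values still sum to 1.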
96-4 Effects of employing ridge regression in structural equations models.
by Shaun McQuitty
Abstract. LISREL 8 invokes a ridge option when estimating a
structural equation model with a nonpositive definite covariance or correlation
matrix. The implications of the ridge option for model fit statistics,
parameter estimates, and standard errors are explored through the use of
two simulations. The results indicate that maximum likelihood parameter
estimates are quite stable with the ridge option, though fit statistics
and standard errors vary significantly and therefore cannot be trusted.
As a result of these findings, the application of the ridge method to structural
equation models is not recommended.
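The general idea of a ridge adjustment can be sketched as follows: a constant multiple of the diagonal is added to the matrix, and the constant is increased until the matrix becomes positive definite. The starting constant of 0.001, the factor of 10, the step limit, and the function names are all assumptions made for illustration. This is not LISREL's code and may not match its exact procedure.

```python
import math


def is_positive_definite(matrix):
    """Return True if a Cholesky factorization of the matrix succeeds."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


def add_ridge(matrix, constant):
    """Add constant times each diagonal element to that diagonal element."""
    return [
        [value * (1.0 + constant) if i == j else value
         for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


def ridge_until_pd(matrix, start=0.001, factor=10.0, max_steps=10):
    """Increase the ridge constant until the adjusted matrix is positive definite."""
    if is_positive_definite(matrix):
        return matrix, 0.0
    constant = start
    for _ in range(max_steps):
        candidate = add_ridge(matrix, constant)
        if is_positive_definite(candidate):
            return candidate, constant
        constant *= factor
    raise ValueError("matrix not positive definite after max_steps adjustments")


if __name__ == "__main__":
    singular = [[1.0, 1.0], [1.0, 1.0]]  # perfectly correlated pair
    adjusted, used = ridge_until_pd(singular)
    print(used, adjusted)
```

Because the adjustment inflates the diagonal relative to the off-diagonal elements, the analyzed matrix is no longer the observed one. That is consistent with the abstract's caution about fit statistics and standard errors obtained under the ridge option.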
96-5 An empirical test of Roskam's conjecture about the interpretation
of an ICC parameter in personality inventories. by Bruno D. Zumbo, Gregory
A. Pope, Jackie E. Watson, & Anita M. Hubley
Abstract. Roskam (1985) conjectured that a general interpretation
can be made for the a-parameter (slope) in an item response theory analysis
of personality inventories. That is, Roskam suggests that steeper slopes
(and hence higher item-total correlations in classical test theory) will
be found with more concretely-worded items whereas lower slopes will be
found with more abstractly-worded items. We tested this conjecture using
item-total correlations and both 2- and 3-parameter IRT models with the
Eysenck Personality Questionnaire and were unable to support Roskam's proposed
interpretation.
96-6 Specialized tests for detecting treatment effects in the two-sample
problem. by Harvey J. Keselman, Robert Cribbie, & Bruno D. Zumbo.
Abstract. Nonparametric and robust statistics (those employing
trimmed means and Winsorized variances) were compared for their ability
to detect treatment effects in the two-sample case. In particular, two
specialized tests, tests designed to be sensitive to treatment effects
when the distributions of the data are skewed to the right, were compared
to two nonspecialized nonparametric (Wilcoxon (1949)-Mann-Whitney (1947))
and trimmed tests (Yuen, 1974) for six non-normal distributions which varied
according to their measures of skewness and kurtosis. As expected, the
specialized tests did provide more power to detect treatment effects, particularly
for the nonparametric comparison. However, when distributions were symmetric
the nonspecialized tests were more powerful, and therefore, for all the
distributions investigated, power differences did not favor the specialized
tests. Consequently, the authors are reluctant to recommend the specialized
tests; researchers would have to know the shapes of the distributions that
they work with in order to benefit from specialized tests.
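For readers unfamiliar with the robust quantities named above, the sketch below computes a trimmed mean and a Winsorized variance for a single sample. The 20% trimming proportion, the sample values, and the function names are assumptions for illustration; the abstract does not state the trimming used.

```python
import math
import statistics


def trim_count(n, proportion=0.2):
    """Number of observations trimmed from each tail."""
    return math.floor(proportion * n)


def trimmed_mean(values, proportion=0.2):
    """Mean of the observations left after trimming both tails."""
    ordered = sorted(values)
    g = trim_count(len(ordered), proportion)
    return statistics.fmean(ordered[g:len(ordered) - g])


def winsorized_variance(values, proportion=0.2):
    """Sample variance after pulling each tail in to the nearest retained value."""
    ordered = sorted(values)
    n = len(ordered)
    g = trim_count(n, proportion)
    low, high = ordered[g], ordered[n - g - 1]
    winsorized = [min(max(value, low), high) for value in ordered]
    return statistics.variance(winsorized)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # made-up data, one outlier
    print(statistics.fmean(sample), trimmed_mean(sample))
    print(statistics.variance(sample), winsorized_variance(sample))
```

In this made-up sample the single extreme value moves the ordinary mean to 14.5, while the 20% trimmed mean is 5.5. This insensitivity to extreme values is the property that trimmed tests such as Yuen's are built on.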
96-7 Modification of the Taylor Complex Figure:
A Comparable Figure to the Rey-Osterrieth Figure? by Anita M. Hubley
Abstract. One of the most commonly used neuropsychological measures
of visuo-spatial abilities is the Rey-Osterrieth Complex Figure Test
(ROCF). Previous research has reliably shown
that its companion figure, the Taylor Complex Figure, is not a comparable
measure of visuo-spatial memory. The purpose of this working paper
is to present a modified version of the Taylor Complex Figure, describe
the steps taken to achieve this new figure (called the Modified Taylor
Complex Figure - MTCF), and discuss some issues that need to be considered
when developing comparable measures or when assessing the comparability
of measures. It is hoped that the comparability of the MTCF to the
ROCF will be examined in future studies.
Contact Dr. Anita Hubley at anita.hubley@ubc.ca
for a copy of the paper.