TASK-BASED LANGUAGE ASSESSMENT:
Difficulty-based task items sequencing and Task-based Test
Performance
Shadab Jabbarpoor1
Islamic Azad University, Garmsar Branch, Iran
Abstract. Task-based assessment can be defined as an approach that attempts to assess as directly as
possible whether test takers are able to perform specific target language tasks in particular communicative
settings. When it comes to task-based testing, a crucial factor to consider is task item
sequencing (which item comes first). The language testing literature assumes that items should be
sequenced from easiest to most difficult. In this study the researcher intended to find out whether
sequencing tasks on the basis of task difficulty (determined by the scope of the outcome) from easiest (close
ended tasks) to the most difficult (open-ended tasks) improves testees’ performance on those tasks. To this
end she randomly selected two homogeneous groups of subjects and assigned them to a control and an
experimental group. In the control group the task items were sequenced from easiest to most difficult, and in the
experimental group this sequence was reversed. Data analysis revealed that sequencing the tasks from easy to
difficult does not significantly improve testees’ performance. This study has implications for test designers as
well as language materials developers who intend to make task-based language teaching and testing most
practical.
Keywords: task, language assessment
1. Introduction
In the last two decades, L2 instruction has become more communicative with greater emphasis placed on
students’ ability to use the L2 in real-life situations. Task-based instruction is one increasingly popular
approach to communicative language learning. Tasks have gained support in the L2 teaching community
because they have often been seen principally as devices that allow learners to practice using the language as a
tool of communication rather than as devices for focusing learners' attention on grammatical features of the
language.
A very important assumption in task-based learning, as stated by Skehan (1998), is that this focus on
meaning will engage naturalistic acquisitional mechanisms, cause the underlying interlanguage system to be
stretched, and drive development forward. In short, task-based pedagogy moves away from the traditional
focus on form to an approach that promotes, in addition to grammatical skills, the ability to interact to
achieve communicative goals in the real world.
While the L2 literature includes numerous investigations of task-based instruction and learning,
a cursory examination of testing publications shows that task-based assessment work is scarce.
In fact, at first glance it may appear that the term ‘task’, except for denoting an activity or exercise such as
performance assessments, is relatively new in the L2-testing field. Where the term ‘task’ is used in testing, it
has been closely connected with the notion of test method (Bachman, 1990).
1 + Corresponding author. Tel.: + (989123273992); fax: +(9877635782).
E-mail address: (sjabbarpour@iau-garmsar.ac.ir).
2011 International Conference on Languages, Literature and Linguistics
IPEDR vol.26 (2011) © (2011) IACSIT Press, Singapore
2. Task-based assessment
Norris et al. (1998 cited in Brown, 2004) defined tasks as real world activities “that people do in
everyday life and which require language for their accomplishment” (p. 33). In this definition, a test task is a
real-world activity. On the other hand, Bachman and Palmer (1996) consider tasks as “an activity that
involves individuals in using language for the purpose of achieving a particular goal or objective in a
particular situation” (p. 44). Their definition is broader as it encompasses tasks specifically designed for
assessment and instruction as well as real-world activities. Task-based testing is part of a broader approach to
assessment called performance assessment. There are three essential characteristics of performance
assessment. Firstly, it must be based on tasks; secondly, the tasks should be as authentic as possible; and
finally, ‘success or failure in the outcome of the task, because they are performances, must usually be rated
by qualified judges.
Task-based tests are defined as any assessments that “require students to engage in some sort of behavior
which simulates, with as much fidelity as possible, goal-oriented target language use outside the language
test situation. Performances on these tasks are then evaluated according to pre-determined, real-world
criterion elements (i.e., task processes and outcomes) and criterion levels (i.e., authentic standards related to
task success)” (Brown, 2004).
Task-based testing fits within the definition of performance testing. Moreover, any discussion of
performance testing will necessarily include some discussion of task-based testing, but the reverse will not
necessarily be true.
Task-based assessment can be defined as an approach that attempts to assess as directly as possible
whether test takers are able to perform specific target language tasks in particular communicative settings.
Task-based assessment does not merely utilize the real-world tasks as a means for eliciting the production of
particular components of the language system, which are then measured or evaluated. Instead, the construct
of interest in task-based assessment is the performance of the task itself.
If language tasks are defined as being real-life activities that require meaningful language for their
performance, assessment tasks ideally should be motivating and authentic tasks that relate to what learners
are expected to be able to do with the target language in real life.
2.1. The objectives of task-based assessment
Task-based test developers aim to devise tests that provide direct information on test takers’ target
language performance in specific language use situations, but they will never reach a stage of perfection. In
fact, tests can, at best, be semi-direct (Colpin & Gysen, 2006).
An underlying premise of virtually all discussions of task-based language assessment is that the
inferences we want to make are about underlying ‘language ability’ or ‘capacity for language use’ or ‘ability
for use’. Thus, Brindley (1994, cited in Bachman, 2002) explicitly includes, in his definition of task-based
language assessment, the view of language proficiency as encompassing both knowledge and ability for use.
Task-based tests are ‘new’ to many of the teachers involved: in more than one way, they deviate from
the kind of tests typically included in available teaching methods, or from the tests traditionally developed by
teachers themselves, especially those who have been working in a more ‘linguistic’, forms-focused tradition.
The task-based tests were intended as models for the teachers, showing them how reading, listening,
writing and speaking proficiency can be assessed in a functional way. Since some of these tests could be
directly linked to the task-based, functional attainment goals, the tests also showed great potential for
heightening teachers’ sensitivity to functional goals. In fact, the introduction of task-based tests may have
great potential in ‘pushing’ teachers to make one of the main paradigm shifts involved in replacing or
supplementing traditional teaching and testing practices with task-based ones.
Some teachers may actually be more sensitive to changes with regard to their testing practices than to
innovations directly trying to affect their teaching practices, because of the high importance they attach to the
former. In this case, the washback effect of task-based tests on teachers’ pedagogical approaches may not be
direct, but mediated by their heightened awareness of the essential attainment goals they have to pursue.
Over the years, measurement theorists (Bachman, 1990; Bachman and Palmer, 1996) have discussed a
large number of sources of variation, or factors that may affect test performance.
One example of such a formulation is that provided by Bachman (1990), who identifies several distinct
sets of factors (language ability of the test-taker, test-task characteristics, personal characteristics of the test-
taker, and random/unpredictable factors) which are hypothesized to affect test performance. This formulation
recognizes that these factors may well be correlated with each other to some degree (except for the random
factors, which are by definition uncorrelated with anything else). In this formulation there is no factor
identified as ‘difficulty’.
A different conceptualization has been proposed in which task difficulty is conceptualized as the
classification of tasks according to an integration of ability requirements and task characteristics.
Skehan (1998) proposes three sets of features that he hypothesizes affect performance on tasks:
• code complexity: the language required to accomplish the task;
• cognitive complexity: the thinking required to accomplish the task; and
• communicative stress: the performance conditions for accomplishing a task (Skehan, 1998, p. 88).
It is hypothesized that these task difficulty features can affect the difficulty of a given task (ostensibly for
all learners, regardless of individual differences), and can be manipulated to increase or decrease task
difficulty.
One of the first attempts at sequencing tasks from simple to complex was advanced by Brown et al.
(1984, cited in Mislevy, Steinberg & Almond, 2002). They distinguished among three different types of tasks
which they presented as ranging from easy to difficult. The first type, static tasks, was proposed as the easiest
type. In this kind of task, all the information to be exchanged is presented to the speaker in the materials for
carrying out the task (e.g. a map task in which the speaker has to give directions to the listener).
The second type, dynamic tasks, also presents the speaker with all the information in the stimulus materials,
but these tasks can present problems. In such tasks, characters, events, and activities change, and this change
forces the speaker to fully describe the stimulus material, and be explicit, discriminating, and consistent in
his or her use of language (e.g. a story in a comic strip in which characters appear and disappear or change
places and behaviors).
The last type, abstract tasks, is the most difficult one since the stimulus material does not contain the
content to be communicated. It involves making reference to abstract concepts, establishing connections
between ideas, and providing reasons for certain statements or behaviors (e.g. an opinion task in which
learners must choose the most suitable candidate for a scholarship out of a closed list of candidate
descriptions).
3. The present study
The present study investigates the influence of sequencing the tasks on the basis of the scope of the
outcome on task item performance. The tasks used in this study are assumed to vary in the degree of
linguistic and cognitive demands imposed on the learners. To make the tasks vary in difficulty, both closed
and open outcome tasks were utilized.
Open-ended tasks are tasks to which there is not a single absolutely correct answer or where a variety of
answers are possible. They can be distinguished from 'closed tasks', where students have to answer in a
particular way. An example of an open-ended task might be where the students are asked to imagine a person
standing in a pair of shoes which they are shown and then to write a description of that person. A closed task
using the same type of language might be one where they are given a description with certain words missing,
which they have to supply. Both closed tasks and open-ended tasks are useful in language teaching. Where
students are working in groups, for example, closed tasks can force the students to discuss more in order to
find the correct answer. Open-ended tasks, however, are also very valuable for a number of reasons. Since
there is no single correct answer, the students can often answer at the level of their ability.
From this perspective, the present investigation provides empirical evidence on the effect of sequencing
tasks on the basis of the scope of the outcome on the learners’ task-based test performance. Additionally, it
attempts to shed light on the way task-based assessment benefits task-based teaching and syllabus design.
Thus the study aims to answer the following research question:
• Do testees’ performances improve when the test items are sequenced on the basis of increasing item difficulty?
Considering the above research question, the following null hypothesis is formulated:
• Learners’ performance on a task-based test will not improve when the items are sequenced from easiest to
the most difficult.
4. Research design
The participants of this study were 30 students (26 females, 4 males) studying English translation at
Islamic Azad University, Garmsar Branch, Iran. All students had Persian as their native language, except for
one student whose native language was Armenian. All students were assumed to be at the same proficiency
level. Students’ mean age was 27 years.
The participants were randomly assigned to two groups, with 15 students in each group. Two sets of tasks
were utilized in this study. In the first set the tasks were sequenced from easiest to most difficult. Task
difficulty was determined by the scope of the outcome: intuitively, tasks with closed outcomes should be
easier in that the participants know there is a ‘right’ answer and can thus direct their efforts more
purposefully and more economically.
In the second set the same tasks were sequenced in the reverse order. Each set was given to one group.
In order to minimize threats to reliability, the researcher administered both tests simultaneously. Time
and rater facets were identical in both administrations.
Two sets of scores were obtained from two independent groups. The researcher used Statistical Package
for Social Sciences (SPSS 16.0) for data analysis.
In order to account for statistical differences in test performance, the researcher chose t-test for
independent samples as she was interested in determining the difference in the mean scores of two groups
under investigation.
Table 1: Group Statistics

  item sequence   N    Mean     Std. Deviation   Std. Error Mean
  control         15   3.8167   1.29720          .33494
  experimental    15   3.6333   1.35576          .35006
Table 2: Independent sample t-test
Levene’s Test
for Equality of
Variances
t-test for Equality of Means
F Sig. t df
Sig.
(2-
tailed)
Mean
Difference
Std. Error
Differenc
e
95% Confidence Interval
of the Difference
Lower Upper
item
sequenc
e
Equal variances
assumed .097 .757 .378 28 .708 .18333 .48448 .80908 1.17575
Equal variances
not assumed .378
27.94
6 .708 .18333 .48448 -.80917 1.17584
As shown in Table 2, the observed significance level is .708, which is higher than the alpha level
(p = .05); therefore the researcher can claim that there is no statistically significant difference between the
two groups under investigation, and that the observed difference is due to sampling error.
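The reported result can be reproduced directly from the summary statistics in Tables 1 and 2, without access to the raw scores. The following is a minimal sketch, assuming SciPy is available; the group means, standard deviations and sample sizes are taken from Table 1:

```python
# Independent-samples t-test recomputed from the summary statistics
# reported in Tables 1 and 2 (equal variances assumed, since Levene's
# test was non-significant: F = .097, Sig. = .757).
from scipy.stats import ttest_ind_from_stats

t_stat, p_value = ttest_ind_from_stats(
    mean1=3.8167, std1=1.29720, nobs1=15,   # control group
    mean2=3.6333, std2=1.35576, nobs2=15,   # experimental group
    equal_var=True,
)

print(f"t({15 + 15 - 2}) = {t_stat:.3f}, p = {p_value:.3f}")
# p far exceeds the .05 alpha level, so the null hypothesis
# (no effect of item sequencing) cannot be rejected.
```

This recovers the values in Table 2 (t = .378, df = 28, two-tailed p = .708) up to rounding of the reported means and standard deviations.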
5. General discussion and conclusion
From a cognitive perspective our study is based on one of the assumptions of language testing that
claims that sequencing items from easiest to the most difficult helps learners’ performance on a test.
This study aimed to examine the effect of task sequencing on task performance in task-based testing. The
criterion for task sequencing was the scope of the outcome (Ellis, 2003). The researcher therefore assumed
that sequencing the tasks from closed outcomes to open outcomes would help task performance.
Although the results of the study show that reversing the sequence of the task items makes very little or no
difference to testees’ performance, the findings can shed light on the debate regarding optimal task sequencing in
task-based teaching and syllabus design.
Further research needs to be conducted to investigate some limitations of the present study. Among these
is the number of subjects. Further research can be done with more participants at different proficiency levels.
Moreover, in this study the criterion for task difficulty was the scope of the outcome. Other criteria such as
input medium, code complexity, cognitive complexity, context dependency, familiarity of information, task
demands, discourse mode, reasoning needed, medium of the outcome, etc. can also be investigated in future
studies.
In summary, it has also been argued that task-based assessment can be very useful for meeting actual
inferential demands in language classrooms as it employs complex, integrative, and open-ended tasks.
Furthermore, its high degree of authenticity may be beneficial in achieving the intended consequences of
assessment by bridging the gap between what the students face in the world and the way they are tested.
6. Acknowledgements
I am deeply indebted to Dr. Parviz Maftoon for all his help in broadening my view of task-based
learning, teaching and assessment. His criticisms, guidance and support have always motivated me to
become a better researcher.
Thanks are also due to Dr. Parviz Birjandi for his guidance throughout my doctoral work. I am also
indebted to him for teaching testing and assessment courses.
My appreciation also goes to my dear friend, Dr. Parva Panahi of Islamic Azad University, Tabriz Branch,
for her helpful comments.
7. References
[1]. L. F. Bachman. Fundamental considerations in language testing. Oxford University Press, 1990.
[2]. L. F. Bachman. Some reflections on task-based language performance assessment. Language Testing, 2002, 19 (4):
453-476.
[3]. L. F. Bachman and A. S. Palmer. Language testing in practice. Oxford University Press, 1996.
[4]. J. D. Brown. Performance assessment: Existing literature and directions for research. Second Language Studies,
2004, 22 (2): 91-139.
[5]. M. Colpin and S. Gysen. Developing and introducing task-based language tests. In: K. V. Branden (ed.). Task-based
language education: From theory to practice. Cambridge: CUP, 2006, pp. 151-175.
[6]. R. Ellis. Task-based language learning and teaching. Oxford University Press, 2003.
[7]. R. J. Mislevy, L. S. Steinberg, and R. G. Almond. Design and analysis in task-based language assessment. Language
Testing, 2002, 19 (4): 477-496.
[8]. P. Skehan. A cognitive approach to language learning. Oxford University Press, 1998.