Rater Training with the Aid of Eye-Tracking Systems: A Case Study of a Novice Rater

doi:10.22132/tel.2025.472698.1668

Article type: Research article

Abstract

This study examines rater training supported by eye-tracking systems. A novice rater took part in a training program designed around tracking of the rater's own eye movements. In each session, immediately after rating a sample essay, the rater received eye-tracking feedback in the form of a heat map generated from those eye movements. The heat map was then discussed to help the rater understand their own behavior while rating and to identify which rubric descriptors and which parts of the essay had received the most attention. The findings showed that in the early sessions the rater was subject to a primacy effect, focusing mainly on the first two criteria (content and organization). In addition, the rater initially had difficulty deciding on a score band and attended heavily to the score ranges rather than to the descriptors. After several training sessions, however, the rater adjusted this behavior and tried to attend to all criteria and their corresponding descriptors. These findings can help rater trainers organize assessment programs more effectively by using eye-tracking systems to examine raters' behavior.

Keywords

Rater training, essay scoring, eye tracking, cognitive process, novice rater

Volume 19, Issue 2
Tir 1404
Pages 511-539

Received: 20 Mordad 1403
Revised: 28 Mordad 1404
Accepted: 31 Mordad 1404

Teaching English Language
