Similarity of the cut score in test sets with different item amounts using the modified Angoff, modified Ebel, and Hofstee standard-setting methods for the Korean Medical Licensing Examination

Park, Janghee; Yim, Mi Kyoung; Kim, Na Jin; Ahn, Duck Sun; Kim, Young-Min

doi:10.3352/jeehp.2020.17.28

Cited 7 times in Web of Science

Cited 7 times in Scopus

Access: open access

Authors: Park, Janghee; Yim, Mi Kyoung; Kim, Na Jin; Ahn, Duck Sun; Kim, Young-Min

Issue Date: Oct-2020

Publisher: Korea Health Personnel Licensing Examination Institute

Keywords: Educational measurement; Medical education; Medical licensure; Republic of Korea; Reproducibility of results

Citation: Journal of Educational Evaluation for Health Professions, v.17

Indexed: SCOPUS; ESCI; KCI

Journal Title: Journal of Educational Evaluation for Health Professions

Volume: 17

URI: https://scholarworks.korea.ac.kr/kumedicine/handle/2020.sw.kumedicine/33546

DOI: 10.3352/jeehp.2020.17.28

ISSN: 1975-5937

Abstract:
Purpose: The Korean Medical Licensing Examination (KMLE) typically contains a large number of items. This study investigated whether the cut score differs when standard-setting panelists evaluate all items of the exam versus only some of them.
Methods: We divided the item sets of the KMLEs from the past 3 years into 4 subsets per year, each containing 25% of the items, based on item content category, discrimination index, and difficulty index. The entire panel of 15 members assessed all the items (360 items, 100%) of the 2017 exam. Using the same method, each item set in split-half set 1 contained 184 items (51%) from the 2018 exam, and each item set in split-half set 2 contained 182 items (51%) from the 2019 exam. We used the modified Angoff, modified Ebel, and Hofstee methods in the standard-setting process.
Results: When the same method was used to stratify the items, cut scores differed by less than 1% across item subsets containing 25%, 51%, or 100% of the entire set. Rater reliability was higher when fewer items were rated.
Conclusion: When the entire item set was divided into equivalent subsets, assessing the exam using a portion of the item set (90 out of 360 items) yielded cut scores similar to those derived using the entire item set. There was a higher correlation between panelists' individual assessments and the overall assessments.
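As a rough illustration of the comparison described in the abstract, the sketch below splits a 360-item set into 4 stratified subsets of 90 items and compares an Angoff-style cut score on each subset with the cut score on the full set. It is a minimal sketch under stated assumptions, not the authors' procedure: the item attributes and panelist ratings are synthetic, the round-robin stratification is only one plausible way to form equivalent subsets, and the aggregation shown is the basic Angoff average (the specific modifications used in the study are not given in the abstract). The function names `angoff_cut_score` and `stratified_subsets` are hypothetical.

```python
import random
from statistics import mean


def angoff_cut_score(ratings):
    """Basic Angoff aggregation (assumed form).

    ratings: one list per panelist, each holding that panelist's estimated
    probability (0..1) that a minimally competent examinee answers each
    item correctly. Returns the cut score as a percentage.
    """
    return 100 * mean(mean(panelist) for panelist in ratings)


def stratified_subsets(items, n_subsets=4):
    """Deal items round-robin after sorting by category, difficulty, and
    discrimination, so each subset gets a similar mix (an assumption about
    how 'equivalent subsets' might be formed)."""
    ordered = sorted(
        items,
        key=lambda it: (it["category"], it["difficulty"], it["discrimination"]),
    )
    subsets = [[] for _ in range(n_subsets)]
    for position, item in enumerate(ordered):
        subsets[position % n_subsets].append(item["id"])
    return subsets


# Synthetic data only: 360 items, 15 panelists (counts taken from the abstract).
rng = random.Random(0)
items = [
    {
        "id": i,
        "category": rng.randrange(8),      # hypothetical content categories
        "difficulty": rng.random(),        # hypothetical difficulty index
        "discrimination": rng.random(),    # hypothetical discrimination index
    }
    for i in range(360)
]
ratings = [
    [min(1.0, max(0.0, it["difficulty"] + rng.gauss(0, 0.1))) for it in items]
    for _ in range(15)
]

full_cut = angoff_cut_score(ratings)
subsets = stratified_subsets(items)
print(f"full set (360 items): {full_cut:.2f}%")
for k, ids in enumerate(subsets, start=1):
    subset_cut = angoff_cut_score([[panelist[i] for i in ids] for panelist in ratings])
    print(f"subset {k} ({len(ids)} items): {subset_cut:.2f}% "
          f"(difference {subset_cut - full_cut:+.2f})")
```

The printed differences come from synthetic ratings and say nothing about the KMLE results; the sketch only shows the shape of the subset-versus-full-set comparison.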

Appears in Collections: 1. Basic Science > Department of Medical Humanities > 1. Journal Articles

Ahn, Duck Sun: College of Medicine (Department of Medical Humanities)