Improving Students' Academic Performance with AI and Semantic Technologies

Paper Title

Improving Students' Academic Performance with AI and Semantic Technologies

Author

Cheng, Yixin

Abstract

Artificial intelligence and semantic technologies are evolving and have been applied in various research areas, including education. Higher education institutions strive to improve students' academic performance. Early intervention for at-risk students and a reasonable curriculum are vital for students' success. Prior research opted for traditional machine learning models to predict students' performance. In terms of curriculum semantic analysis, a comprehensive systematic review of the use of semantic technologies in the Computer Science curriculum found that the technologies used to measure similarity have limitations in accuracy and suffer from ambiguity in the representation of concepts, courses, etc. To fill these gaps, this study developed three implementations: predicting students' performance using marks from the previous semester; modeling a course representation in a semantic way and computing similarity; and identifying the prerequisite relationship between two similar courses. For performance prediction, we used a combination of a Genetic Algorithm and Long Short-Term Memory (LSTM) on a dataset from a Brazilian university containing 248,730 records. For similarity measurement, we deployed BERT to encode the sentences and used cosine similarity to obtain the distance between courses. For prerequisite identification, TextRazor was applied to extract concepts from course descriptions, and SemRefD was then employed to measure the degree of prerequisite between two concepts. The outcomes of this study can be summarized as follows: (i) a breakthrough result that improves on Manrique's work by 2.5% in accuracy for dropout prediction; (ii) uncovering the similarity between courses based on course descriptions; (iii) identifying the prerequisites over three compulsory courses of the School of Computing at ANU.
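
The similarity step described in the abstract (encode course descriptions, then compare them with cosine similarity) can be illustrated with a minimal sketch. This is not the author's implementation: the three-dimensional vectors below are hypothetical stand-ins for BERT sentence encodings, and the course names and the `most_similar` helper are invented for illustration. Only the cosine similarity formula itself is standard.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Assumption: toy vectors standing in for BERT encodings of course
# descriptions. Real encodings would have hundreds of dimensions.
course_embeddings = {
    "course_a": [0.9, 0.1, 0.3],
    "course_b": [0.8, 0.2, 0.4],
    "course_c": [0.1, 0.9, 0.2],
}


def most_similar(name, embeddings):
    """Hypothetical helper: return the other course with the highest cosine similarity."""
    target = embeddings[name]
    candidates = {k: v for k, v in embeddings.items() if k != name}
    return max(candidates, key=lambda k: cosine_similarity(target, candidates[k]))


if __name__ == "__main__":
    for other in ("course_b", "course_c"):
        score = cosine_similarity(course_embeddings["course_a"], course_embeddings[other])
        print(f"course_a vs {other}: {score:.3f}")
    print("closest to course_a:", most_similar("course_a", course_embeddings))
```

A value near 1 means the two vectors point in nearly the same direction, and a value near 0 means they are close to orthogonal. If a distance is needed, as the abstract's wording suggests, one common convention is to take 1 minus the similarity; whether the study uses that convention is not stated here.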
