Journal Article

A novel cluster-based approach for keyphrase extraction from MOOC video lectures

TLDR
This paper proposes SemKeyphrase, an unsupervised cluster-based approach for keyphrase extraction from MOOC video lectures that incorporates a new semantic relatedness metric and a ranking algorithm that ranks candidates in two phases.
Abstract
Massive open online courses (MOOCs) have emerged as a great resource for learners. Numerous challenges remain to be addressed in order to make MOOCs more useful and convenient for learners. One such challenge is how to automatically extract a set of keyphrases from MOOC video lectures that can help students quickly identify the knowledge they want to learn and thus expedite their learning process. In this paper, we propose SemKeyphrase, an unsupervised cluster-based approach for keyphrase extraction from MOOC video lectures. SemKeyphrase incorporates a new semantic relatedness metric and a ranking algorithm, called PhraseRank, that ranks candidates in two phases. We conducted experiments on a real-world dataset of MOOC video lectures, and the results show that our proposed approach outperforms state-of-the-art keyphrase extraction methods.


Citations
Journal Article

Knowledge discovery for course choice decision in Massive Open Online Courses using machine learning approaches

TL;DR: In this article, the authors propose a machine learning framework that recommends courses in MOOCs according to users' preferences and behavior. The framework uses Latent Dirichlet Allocation (LDA) for text mining, decision trees for decision rule generation, a Self-Organizing Map (SOM) for users' reviews of courses, and a fuzzy rule-based system for predicting users' preferences.
Journal Article

LVTIA: A new method for keyphrase extraction from scientific video lectures

TL;DR: In this paper, the authors propose an approach for assigning appropriate keyphrases to scientific video lectures: the textual content of the video frames is merged with the text extracted from the audio signal, and a new keyphrase extraction method is applied to the result.
Journal Article

Extract Concept using Subtitles in MOOC

TL;DR: In this article, two keyword extraction methods, BERT and LDA, are evaluated on different Coursera courses, and the experimental results show that BERT outperforms LDA in terms of coherence.
Journal Article

A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness

TL;DR: In this article, the authors present a comprehensive review of 257 articles on video-based learning, drawn from computer science databases for the period from 2016 to 2021 and selected using the PRISMA guidelines. They suggest a taxonomy that organizes video characteristics and contextual aspects into eight categories: audio features, visual features, instructor behavior, learners' activities, interactive features (quizzes, etc.), production style, and instructional design.
Journal Article

A novel data quality framework for assessment of scientific lecture video indexing

TL;DR: In this paper, the authors define new data quality dimensions for lecture video indexing, including accuracy, value-added, relevancy, completeness, appropriate amount of data, concise, consistency, interpretability and accessibility.
References
Journal Article

Scikit-learn: Machine Learning in Python

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes negative sampling, a simple alternative to the hierarchical softmax.
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared with the previously best-performing techniques based on different types of neural networks.

Classification and Regression by randomForest

TL;DR: Random forests add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.