Proceedings Article

Rough Set based keyword selection and weighing for textual answer evaluation

TL;DR: The proposed rough set based strategy augments automated free text evaluation systems with a keyword and associated-expression based technique; it outperforms manual keyword selection and weight association and also shows higher correlation with human evaluators.
Abstract: Automatic assessment of learners' responses has gained wider acceptance and popularity in recent times. Due to the complexities associated with free text evaluation, the trend has gradually shifted towards close-ended questions, which have their limitations. The current work proposes a rough set based strategy to augment automated free text evaluation systems using a keyword and associated-expression based technique. The proposed method uses human-evaluated answers as training data and, using Rough Set Theory, extracts information from them for the shortlisting and weighing of the keywords used in assessment. The proposed technique outperforms manual keyword selection and weight association and also shows higher correlation with human evaluators.
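To make the pipeline described in the abstract concrete, below is a minimal sketch (not the authors' implementation) of rough-set-style keyword weighting: human-evaluated training answers form a decision table whose condition attributes are keyword occurrences and whose decision attribute is the awarded grade band, and each keyword is weighted by its rough-set significance, i.e. the drop in the dependency degree when that keyword is removed. Function names and the toy data are illustrative assumptions.

```python
from collections import defaultdict

def positive_region_size(rows, cond_idx, decision):
    """|POS_C(D)|: number of answers in equivalence classes (w.r.t. the
    condition attributes cond_idx) that map to a single decision value."""
    blocks = defaultdict(list)
    for row, d in zip(rows, decision):
        key = tuple(row[i] for i in cond_idx)
        blocks[key].append(d)
    return sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)

def keyword_weights(rows, decision):
    """rows[i][j] = 1 if keyword j occurs in training answer i, else 0;
    decision[i]  = human grade band for answer i."""
    n, m = len(rows), len(rows[0])
    full = list(range(m))
    gamma_full = positive_region_size(rows, full, decision) / n
    sig = []
    for j in range(m):
        reduced = [k for k in full if k != j]
        gamma_j = positive_region_size(rows, reduced, decision) / n
        sig.append(gamma_full - gamma_j)          # significance of keyword j
    total = sum(sig) or 1.0
    return [s / total for s in sig]               # normalised weights

# toy example: 4 graded answers, 3 candidate keywords, grade bands in {0, 1, 2}
rows = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 0, 0]]
grades = [2, 1, 1, 0]
print(keyword_weights(rows, grades))              # [0.5, 0.0, 0.5]
```

Keywords whose removal never merges answers with different grades receive zero significance and can be dropped from the shortlist; the remaining weights can then be used when scoring new answers.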
References
Journal Article
TL;DR: This approach seems to be of fundamental importance to artificial intelligence (AI) and cognitive sciences, especially in the areas of machine learning, knowledge acquisition, decision analysis, knowledge discovery from databases, expert systems, decision support systems, inductive reasoning, and pattern recognition.
Abstract: Rough set theory, introduced by Zdzislaw Pawlak in the early 1980s [11, 12], is a new mathematical tool to deal with vagueness and uncertainty. This approach seems to be of fundamental importance to artificial intelligence (AI) and cognitive sciences, especially in the areas of machine learning, knowledge acquisition, decision analysis, knowledge discovery from databases, expert systems, decision support systems, inductive reasoning, and pattern recognition.
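For readers unfamiliar with the machinery behind this reference, here is a small illustrative sketch (names and toy data are assumptions) of the core rough-set construct: the lower and upper approximations of a concept under an indiscernibility relation.

```python
from collections import defaultdict

def approximations(universe, equiv_key, target):
    """Lower and upper approximation of `target` under the
    indiscernibility relation induced by `equiv_key`."""
    blocks = defaultdict(set)
    for x in universe:
        blocks[equiv_key(x)].add(x)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:        # wholly contained: certainly in the concept
            lower |= block
        if block & target:         # overlaps: possibly in the concept
            upper |= block
    return lower, upper

# objects described only by the attribute "colour"; the concept of interest is {1, 2}
universe = {1, 2, 3, 4}
colour = {1: "red", 2: "red", 3: "red", 4: "blue"}
lower, upper = approximations(universe, colour.get, {1, 2})
print(lower, upper)   # set() {1, 2, 3}  -> the concept is rough (vague)
```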

7,185 citations

Journal Article
TL;DR: The adequacy of LSA's reflection of human knowledge has been established in a variety of ways, for example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word‐word and passage‐word lexical priming data.
Abstract: Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual‐usage meaning of words by statistical computations applied to a large corpus of text (Landauer & Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA's reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word‐word and passage‐word lexical priming data; and, as reported in 3 following articles in this issue, it accurately estimates passage coherence, learnability of passages by individual students, and the quality and quantity of knowledge contained in an essay.
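A hedged sketch of the statistical computation LSA rests on, assuming a plain term-count matrix and NumPy's SVD rather than any particular LSA toolkit:

```python
import numpy as np

# toy term-document matrix: rows = terms, columns = short passages
docs = ["human machine interface", "user interface system",
        "graph of trees", "trees and graph minors"]
vocab = sorted({w for d in docs for w in d.split()})
A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# truncated SVD: keep k latent dimensions
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T      # documents in the latent space

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(doc_vecs[0], doc_vecs[1]))  # passages sharing "interface": relatively similar
print(cos(doc_vecs[0], doc_vecs[2]))  # unrelated topics: noticeably lower similarity
```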

4,391 citations

Book Chapter
TL;DR: The computational theory of perceptions (CTP) as mentioned in this paper is a methodology for reasoning and computing with perceptions rather than measurements, where words play the role of labels of perceptions and, more generally, perceptions are expressed as propositions in a natural language.
Abstract: Discusses a methodology for reasoning and computing with perceptions rather than measurements. An outline of such a methodology, referred to as a computational theory of perceptions (CTP), is presented in this paper. The computational theory of perceptions is based on the methodology of CW. In CTP, words play the role of labels of perceptions and, more generally, perceptions are expressed as propositions in a natural language. CW-based techniques are employed to translate propositions expressed in a natural language into what is called the Generalized Constraint Language (GCL). In this language, the meaning of a proposition is expressed as a generalized constraint, X isr R, where X is the constrained variable, R is the constraining relation, and isr is a variable copula in which r is a variable whose value defines the way in which R constrains X. Among the basic types of constraints are: possibilistic, veristic, probabilistic, random set, Pawlak set, fuzzy graph and usuality. The wide variety of constraints in GCL makes GCL a much more expressive language than the language of predicate logic. In CW, the initial and terminal data sets, IDS and TDS, are assumed to consist of propositions expressed in a natural language. These propositions are translated, respectively, into antecedent and consequent constraints. Consequent constraints are derived from antecedent constraints through the use of rules of constraint propagation. The principal constraint propagation rule is the generalized extension principle. The derived constraints are retranslated into a natural language, yielding the terminal data set (TDS). The rules of constraint propagation in CW coincide with the rules of inference in fuzzy logic. A basic problem in CW is that of explicitation of X, R, and r in a generalized constraint, X isr R, which represents the meaning of a proposition, p, in a natural language.
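Purely as an illustration, a generalized constraint "X isr R" can be modelled as a tiny data structure; the class name and the sample "near" relation below are assumptions, not part of Zadeh's formalism.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class GeneralizedConstraint:
    """Toy rendering of 'X isr R': the constrained variable, the copula r
    (possibilistic, veristic, probabilistic, ...), and R given here as a
    membership/weight function over candidate values."""
    variable: str
    copula: str
    relation: Dict[object, float]

    def degree(self, value) -> float:
        """Degree to which `value` satisfies the constraint."""
        return self.relation.get(value, 0.0)

# "Distance(Berkeley, San Francisco) is near", read as a possibilistic constraint
near = GeneralizedConstraint("Distance", "possibilistic",
                             {5: 1.0, 15: 0.8, 30: 0.3, 60: 0.0})
print(near.degree(15))   # 0.8
```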

1,453 citations

Journal Article
TL;DR: A survey of current approaches to the automated assessment of free text answers is presented, and the following systems are discussed: Project Essay Grade, Intelligent Essay Assessor (IEA), Educational Testing Service I, Electronic Essay Rater, C-Rater, BETSY, Intelligent Essay Marking System, SEAR, Paperless School free text Marking Engine and Automark.
Abstract: Introduction Assessment is considered to play a central role in the educational process. Interest in the development and use of Computer-based Assessment Systems (CbAS) has grown exponentially in the last few years, due both to the increase in the number of students attending universities and to the possibilities provided by e-learning approaches to asynchronous and ubiquitous education. According to our findings (Valenti, Cucchiarelli, & Panti, 2002), more than forty commercial CbAS are currently available on the market. Most of these tools are based on so-called objective-type questions: i.e. multiple choice, multiple answer, short answer, selection/association, hot spot and visual identification (Valenti et al., 2000). Most researchers in this field agree on the thesis that some aspects of complex achievement are difficult to measure using objective-type questions. Learning outcomes implying the ability to recall, organize and integrate ideas, the ability to express oneself in writing, and the ability to supply rather than merely identify interpretation and application of data require less structuring of response than that imposed by objective test items (Gronlund, 1985). It is in the measurement of such outcomes, corresponding to the higher levels of Bloom's (1956) taxonomy (namely evaluation and synthesis), that the essay question serves its most useful purpose. One of the difficulties of grading essays is the subjectivity, or at least the perceived subjectivity, of the grading process. Many researchers claim that the subjective nature of essay assessment leads to variation in grades awarded by different human assessors, which is perceived by students as a great source of unfairness. Furthermore, essay grading is a time-consuming activity. According to Mason (2002), about 30% of teachers' time in Great Britain is devoted to marking. "So, if we want to free up that 30% (worth 3 billion UK Pounds/year to the taxpayer by the way) then we must find an effective way, that teachers will trust, to mark essays and short text responses." This issue may be faced through the adoption of automated assessment tools for essays. A system for automated assessment would at least be consistent in the way it scores essays, and enormous cost and time savings could be achieved if the system can be shown to grade essays within the range of those awarded by human assessors. Furthermore, according to Hearst (2000), using computers to increase our understanding of the textual features and cognitive skills involved in the creation and comprehension of written texts will provide a number of benefits to the educational community. In fact "it will help us develop more effective instructional materials for improving reading, writing and other communication abilities. It will also help us develop more effective technologies such as search engines and question answering systems for providing universal access to electronic information." The purpose of this paper is to present a survey of current approaches to the automated assessment of free text answers. Thus, in the next section, the following systems will be discussed: Project Essay Grade (PEG), Intelligent Essay Assessor (IEA), Educational Testing Service I, Electronic Essay Rater (E-Rater), C-Rater, BETSY, Intelligent Essay Marking System, SEAR, Paperless School free text Marking Engine and Automark. All these systems are currently available either as commercial systems or as the result of research in this field.
For each system, the general structure and the performance claimed by the authors are presented. In the last section, we will try to compare these systems and to identify issues that may foster the research in the field. Current Tools for Automated Essay Grading Project Essay Grade (PEG) PEG is one of the earliest and longest-lived implementations of automated essay grading. It was developed by Page and others (Hearst, 2000; Page, 1994, 1996) and primarily relies on style analysis of surface linguistic features of a block of text. …
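To illustrate the style-analysis idea behind PEG-like systems, the sketch below extracts a few surface linguistic features of a block of text; the feature set is illustrative only and is not the one PEG actually uses.

```python
import re

def surface_features(essay: str):
    """Surface proxies of the kind style-based graders regress against
    human scores (an illustrative, not faithful, feature set)."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return {
        "n_words": len(words),
        "avg_word_len": sum(map(len, words)) / max(len(words), 1),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "n_commas": essay.count(","),
    }

print(surface_features("Assessment matters. It shapes, and is shaped by, teaching."))
```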

300 citations

Journal Article
TL;DR: An outline of a methodology for reasoning and computing with perceptions rather than measurements is presented, a theory which may have an important bearing on how humans make--and machines might make--perception-based rational decisions in an environment of imprecision, uncertainty, and partial truth.
Abstract: Interest in issues relating to consciousness has grown markedly during the last several years. And yet, nobody can claim that consciousness is a well-understood concept that lends itself to precise analysis. It may be argued that, as a concept, consciousness is much too complex to fit into the conceptual structure of existing theories based on Aristotelian logic and probability theory. An approach suggested in this paper links consciousness to perceptions and perceptions to their descriptors in a natural language. In this way, those aspects of consciousness which relate to reasoning and concept formation are linked to what is referred to as the methodology of computing with words (CW). Computing, in its usual sense, is centered on manipulation of numbers and symbols. In contrast, computing with words, or CW for short, is a methodology in which the objects of computation are words and propositions drawn from a natural language (e.g., small, large, far, heavy, not very likely, the price of gas is low and declining, Berkeley is near San Francisco, it is very unlikely that there will be a significant increase in the price of oil in the near future, etc.). Computing with words is inspired by the remarkable human capability to perform a wide variety of physical and mental tasks without any measurements and any computations. Familiar examples of such tasks are parking a car, driving in heavy traffic, playing golf, riding a bicycle, understanding speech, and summarizing a story. Underlying this remarkable capability is the brain's crucial ability to manipulate perceptions--perceptions of distance, size, weight, color, speed, time, direction, force, number, truth, likelihood, and other characteristics of physical and mental objects. Manipulation of perceptions plays a key role in human recognition, decision and execution processes. As a methodology, computing with words provides a foundation for a computational theory of perceptions: a theory which may have an important bearing on how humans make--and machines might make--perception-based rational decisions in an environment of imprecision, uncertainty, and partial truth. A basic difference between perceptions and measurements is that, in general, measurements are crisp, whereas perceptions are fuzzy. One of the fundamental aims of science has been and continues to be that of progressing from perceptions to measurements. Pursuit of this aim has led to brilliant successes. We have sent men to the moon; we can build computers that are capable of performing billions of computations per second; we have constructed telescopes that can explore the far reaches of the universe; and we can date the age of rocks that are millions of years old. But alongside the brilliant successes stand conspicuous underachievements and outright failures. We cannot build robots that can move with the agility of animals or humans; we cannot automate driving in heavy traffic; we cannot translate from one language to another at the level of a human interpreter; we cannot create programs that can summarize non-trivial stories; our ability to model the behavior of economic systems leaves much to be desired; and we cannot build machines that can compete with children in the performance of a wide variety of physical and cognitive tasks. It may be argued that underlying the underachievements and failures is the unavailability of a methodology for reasoning and computing with perceptions rather than measurements. 
An outline of such a methodology--referred to as a computational theory of perceptions--is presented in this paper. The computational theory of perceptions (CTP) is based on the methodology of CW. In CTP, words play the role of labels of perceptions, and, more generally, perceptions are expressed as propositions in a natural language. CW-based techniques are employed to translate propositions expressed in a natural language into what is called the Generalized Constraint Language (GCL). In this language, the meaning of a proposition is expressed as a generalized constraint, X isr R, where X is the constrained variable, R is the constraining relation, and isr is a variable copula in which r is an indexing variable whose value defines the way in which R constrains X. Among the basic types of constraints are possibilistic, veristic, probabilistic, random set, Pawlak set, fuzzy graph, and usuality. The wide variety of constraints in GCL makes GCL a much more expressive language than the language of predicate logic. In CW, the initial and terminal data sets, IDS and TDS, are assumed to consist of propositions expressed in a natural language. These propositions are translated, respectively, into antecedent and consequent constraints. Consequent constraints are derived from antecedent constraints through the use of rules of constraint propagation. The principal constraint propagation rule is the generalized extension principle. (ABSTRACT TRUNCATED)
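The "generalized extension principle" mentioned above can be sketched in a few lines for the possibilistic case; the fuzzy set "small" and the function used here are toy assumptions.

```python
# "X is small": a possibilistic (fuzzy) constraint over integer values of X
small = {0: 1.0, 1: 1.0, 2: 0.8, 3: 0.4, 4: 0.1}

def extension_principle(mu_x, f):
    """Constraint induced on Y = f(X): mu_Y(y) = max{ mu_X(x) : f(x) = y }."""
    mu_y = {}
    for x, m in mu_x.items():
        y = f(x)
        mu_y[y] = max(mu_y.get(y, 0.0), m)
    return mu_y

# propagate the antecedent constraint "X is small" through Y = 2 * X
print(extension_principle(small, lambda x: 2 * x))
# {0: 1.0, 2: 1.0, 4: 0.8, 6: 0.4, 8: 0.1}
```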

227 citations


Additional excerpts

  • ...theory has been presented by Zadeh [14], where the author proposed the introduction of special classes for meaning representation....


  • ...The Answer as a relation. theory has been presented by Zadeh [14], where the author proposed the introduction of special classes for meaning representation....
