Showing papers by "Lingjia Tang" published in 2021
•
23 Feb 2021
TL;DR: A method is proposed that automatically detects annotation discrepancies (slot span defects and slot label defects) in the annotated training data samples of a machine learning-based automated dialogue system and repairs the affected annotations.
Abstract: Systems and methods for automatically detecting annotation discrepancies in annotated training data samples and repairing the annotated training data samples for a machine learning-based automated dialogue system include evaluating a corpus of a plurality of distinct training data samples; identifying one or more of a slot span defect and a slot label defect of a target annotated slot span of a target training data sample of the corpus based on the evaluation; and automatically correcting one or more annotations of the target annotated slot span based on the identified one or more of the slot span defect and the slot label defect.
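The detect-and-repair idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how defects are identified, so everything here is an assumption made for illustration: the sample layout (`text`, `slots`, `start`, `end`, `label`), the whitespace-trimming rule for span defects, and the majority-vote rule for label defects are hypothetical stand-ins, not the method described in the work.

```python
from collections import Counter, defaultdict


def trim_span(text, start, end):
    """Shrink the half-open span [start, end) so it has no leading or trailing whitespace."""
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end


def repair_corpus(samples):
    """Repair annotations in place and return the number of labels changed.

    Assumed (hypothetical) heuristics:
      * span defect: an annotated span that starts or ends on whitespace
      * label defect: a label that disagrees with the strict-majority label
        used for the same surface text elsewhere in the corpus
    """
    label_counts = defaultdict(Counter)

    # Pass 1: fix span boundaries and count the labels seen per surface form.
    for sample in samples:
        for slot in sample["slots"]:
            slot["start"], slot["end"] = trim_span(
                sample["text"], slot["start"], slot["end"]
            )
            surface = sample["text"][slot["start"]:slot["end"]].lower()
            label_counts[surface][slot["label"]] += 1

    # Pass 2: relabel slots that disagree with a strict majority.
    relabeled = 0
    for sample in samples:
        for slot in sample["slots"]:
            surface = sample["text"][slot["start"]:slot["end"]].lower()
            counts = label_counts[surface]
            majority, count = counts.most_common(1)[0]
            if slot["label"] != majority and count * 2 > sum(counts.values()):
                slot["label"] = majority
                relabeled += 1
    return relabeled


# Toy corpus (invented for illustration only).
samples = [
    {"text": "fly to boston", "slots": [{"start": 7, "end": 13, "label": "city"}]},
    # Span defect: the span includes the space before "boston".
    {"text": "tickets for boston please", "slots": [{"start": 11, "end": 18, "label": "city"}]},
    # Label defect: "boston" is labeled "date" here but "city" elsewhere.
    {"text": "is boston cold", "slots": [{"start": 3, "end": 9, "label": "date"}]},
]
relabeled = repair_corpus(samples)
```

A production system would likely rely on model predictions rather than surface-form voting; the sketch only shows the evaluate, identify, correct flow over a corpus.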
•
02 Mar 2021
TL;DR: Machine learning classifiers predict slot segments of input utterance data and a classification label for each segment; a semantic vector value is then computed for each slot segment and compared against a vector space of structured dialogue categories to select a category and produce a response.
Abstract: Systems and methods for building a response for a machine learning-based dialogue agent include implementing machine learning classifiers that predict slot segments of the utterance data based on an input of the utterance data and predict a slot classification label for each of the slot segments of the utterance data; computing a semantic vector value for each of the slot segments of the utterance data; assessing the semantic vector value of the slot segments of the utterance data against a multi-dimensional vector space of structured categories of dialogue, wherein the assessment includes, for each of the distinct structured categories of dialogue, computing a similarity metric value; selecting one structured category of dialogue from the distinct structured categories of dialogue based on the computed similarity metric value for each of the distinct structured categories; and producing a response to the utterance data.
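The category-selection step in the abstract (compute a similarity metric value per structured category, then select one category) can be sketched as follows. The abstract does not name the similarity metric, so cosine similarity is an assumption, and the category names and vectors are hypothetical toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_category(segment_vector, category_vectors):
    """Score the segment against every category and return (best_name, all_scores)."""
    scores = {
        name: cosine_similarity(segment_vector, vector)
        for name, vector in category_vectors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical category vectors and slot-segment vector (illustration only).
categories = {
    "balance_inquiry": [1.0, 0.0, 0.0],
    "transfer_funds": [0.0, 1.0, 0.0],
    "card_support": [0.0, 0.0, 1.0],
}
segment = [0.1, 0.9, 0.2]
best, scores = select_category(segment, categories)
```

In practice the segment vector would come from an embedding model, and the selected category would drive response generation; both steps are outside this sketch.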