Showing papers by "Lingjia Tang" published in 2021


Patent
23 Feb 2021
TL;DR: This patent describes systems and methods that automatically detect annotation discrepancies (slot span defects and slot label defects) in the annotated training data samples of a machine learning-based automated dialogue system and repair the affected annotations.
Abstract: Systems and methods for automatically detecting annotation discrepancies in annotated training data samples and repairing the annotated training data samples for a machine learning-based automated dialogue system include evaluating a corpus of a plurality of distinct training data samples; identifying one or more of a slot span defect and a slot label defect of a target annotated slot span of a target training data sample of the corpus based on the evaluation; and automatically correcting one or more annotations of the target annotated slot span based on the identified one or more of the slot span defect and the slot label defect.
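
The abstract does not say how defects are identified, so the following is only a minimal sketch of one possible slot label check, assuming a simple majority-vote heuristic: if the same span text carries one label in most of the corpus, a disagreeing label is treated as a defect and replaced. The function name `repair_slot_labels`, the sample layout, and the `min_agreement` threshold are all hypothetical, and slot span defects are not handled here.

```python
from collections import Counter, defaultdict


def repair_slot_labels(corpus, min_agreement=0.75):
    """Flag and repair slot labels that disagree with the corpus majority.

    Assumed (hypothetical) layout: each sample is a dict with a "slots" list,
    and each slot is a dict with "span" (the annotated text) and "label".
    Returns a list of (sample_index, span, old_label, new_label) repairs.
    """
    # Pass 1: evaluate the corpus, counting the labels seen for each span text.
    label_counts = defaultdict(Counter)
    for sample in corpus:
        for slot in sample["slots"]:
            label_counts[slot["span"].lower()][slot["label"]] += 1

    # Pass 2: identify labels that contradict a strong majority and correct them.
    repairs = []
    for index, sample in enumerate(corpus):
        for slot in sample["slots"]:
            counts = label_counts[slot["span"].lower()]
            majority_label, majority_count = counts.most_common(1)[0]
            agreement = majority_count / sum(counts.values())
            if slot["label"] != majority_label and agreement >= min_agreement:
                repairs.append((index, slot["span"], slot["label"], majority_label))
                slot["label"] = majority_label
    return repairs


# Toy corpus (invented for illustration): "checking" is mislabeled once.
corpus = [
    {"text": "move money to checking", "slots": [{"span": "checking", "label": "account_type"}]},
    {"text": "balance of checking", "slots": [{"span": "checking", "label": "account_type"}]},
    {"text": "open my checking", "slots": [{"span": "checking", "label": "amount"}]},
    {"text": "close checking", "slots": [{"span": "checking", "label": "account_type"}]},
]
repairs = repair_slot_labels(corpus)
```

A frequency heuristic like this only catches inconsistencies between samples; it cannot tell which label is right when the corpus is evenly split, which is why the threshold leaves such cases untouched.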

Patent
02 Mar 2021
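TL;DR: This patent describes machine learning classifiers that predict slot segments of utterance data and a slot classification label for each segment; a semantic vector value computed for each slot segment is compared against a multi-dimensional vector space of structured dialogue categories to select a category and produce a response.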
Abstract: Systems and methods for building a response for a machine learning-based dialogue agent include implementing machine learning classifiers that predict slot segments of the utterance data based on an input of the utterance data and predict a slot classification label for each of the slot segments of the utterance data; computing a semantic vector value for each of the slot segments of the utterance data; assessing the semantic vector value of the slot segments of the utterance data against a multi-dimensional vector space of structured categories of dialogue, wherein the assessment includes, for each of the distinct structured categories of dialogue, computing a similarity metric value; selecting one structured category of dialogue from the distinct structured categories of dialogue based on the computed similarity metric value for each of the distinct structured categories; and producing a response to the utterance data.
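
The category selection step described in this abstract can be sketched as follows. This is an illustration only: the abstract does not name the similarity metric, so cosine similarity is assumed here, and the function names, category names, and toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_category(slot_vector, category_vectors):
    """Score the slot vector against every category and pick the best match.

    category_vectors maps a category name to its vector in the shared space.
    Returns (best_category_name, scores_by_category).
    """
    scores = {
        name: cosine_similarity(slot_vector, vector)
        for name, vector in category_vectors.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy 3-dimensional space (invented values, not taken from the patent).
CATEGORY_VECTORS = {
    "account_balance": [1.0, 0.0, 0.0],
    "transfer_funds": [0.0, 1.0, 0.0],
    "card_services": [0.0, 0.0, 1.0],
}

best, scores = select_category([0.9, 0.1, 0.0], CATEGORY_VECTORS)
```

In practice the slot vector would come from an embedding model and the selected category would drive response generation; both steps are outside the scope of this sketch.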