
Showing papers on "Phrase" published in 2021


Proceedings ArticleDOI
TL;DR: In this article, the authors present an in-depth analysis of the impact of multi-word suggestion choices from a neural language model on user behaviour regarding input and text composition in email writing.
Abstract: We present an in-depth analysis of the impact of multi-word suggestion choices from a neural language model on user behaviour regarding input and text composition in email writing. Our study for the first time compares different numbers of parallel suggestions, and use by native and non-native English writers, to explore a trade-off of "efficiency vs ideation", emerging from recent literature. We built a text editor prototype with a neural language model (GPT-2), refined in a prestudy with 30 people. In an online study (N=156), people composed emails in four conditions (0/1/3/6 parallel suggestions). Our results reveal (1) benefits for ideation, and costs for efficiency, when suggesting multiple phrases; (2) that non-native speakers benefit more from more suggestions; and (3) further insights into behaviour patterns. We discuss implications for research, the design of interactive suggestion systems, and the vision of supporting writers with AI instead of replacing them.

33 citations
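
A minimal sketch of the mechanism described above: sampling several short parallel continuations from an off-the-shelf GPT-2 via the Hugging Face transformers library. The prompt, decoding settings, and suggestion length are illustrative assumptions, not the authors' prototype configuration.

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
import torch

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Thank you for your email. I will"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=5,          # short multi-word phrase suggestions
        do_sample=True,
        top_p=0.9,
        num_return_sequences=3,    # three parallel suggestions, as in the study's "3" condition
        pad_token_id=tokenizer.eos_token_id,
    )

for seq in outputs:
    continuation = tokenizer.decode(seq[inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    print(repr(continuation.strip()))
```

Setting `num_return_sequences` to 1 or 6 would correspond to the study's other suggestion conditions.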


Proceedings Article
18 May 2021
TL;DR: In this article, a video caption generating network is proposed to group video frames with discriminating word phrases of the partially decoded caption and then decode those semantically aligned groups in predicting the next word.
Abstract: This paper considers a video caption generating network referred to as Semantic Grouping Network (SGN) that attempts (1) to group video frames with discriminating word phrases of partially decoded caption and then (2) to decode those semantically aligned groups in predicting the next word. As consecutive frames are not likely to provide unique information, prior methods have focused on discarding or merging repetitive information based only on the input video. The SGN learns an algorithm to capture the most discriminating word phrases of the partially decoded caption and a mapping that associates each phrase to the relevant video frames - establishing this mapping allows semantically related frames to be clustered, which reduces redundancy. In contrast to the prior methods, the continuous feedback from decoded words enables the SGN to dynamically update the video representation that adapts to the partially decoded caption. Furthermore, a contrastive attention loss is proposed to facilitate accurate alignment between a word phrase and video frames without manual annotations. The SGN achieves state-of-the-art performances by outperforming runner-up methods by a margin of 2.1%p and 2.4%p in a CIDEr-D score on MSVD and MSR-VTT datasets, respectively. Extensive experiments demonstrate the effectiveness and interpretability of the SGN.

32 citations
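
A generic sketch of the grouping step the abstract describes: attending from phrase embeddings over frame features so that each phrase aggregates the frames most related to it. Shapes and tensors are illustrative placeholders, not the SGN architecture.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: 4 phrase queries from the partially decoded caption,
# 20 video frames, shared 512-d embedding space.
phrases = torch.randn(4, 512)   # phrase embeddings (placeholder)
frames = torch.randn(20, 512)   # per-frame visual features (placeholder)

# Similarity between every phrase and every frame, softmax-normalised over frames.
attn = F.softmax(phrases @ frames.T / 512 ** 0.5, dim=-1)   # (4, 20)

# Each phrase gets a "semantically aligned group": an attention-weighted
# summary of the frames it is most associated with.
grouped = attn @ frames                                      # (4, 512)
print(grouped.shape)
```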


Posted ContentDOI
04 May 2021-bioRxiv
TL;DR: It is found that syntactic structure-based features are better than effort-based metrics at predicting brain activity in various parts of the language system, and the results call for a shift in the approach used for studying syntactic processing.
Abstract: We are far from having a complete mechanistic understanding of the brain computations involved in language processing and of the role that syntax plays in those computations. Most language studies do not computationally model syntactic structure, and most studies that do model syntactic processing use effort-based metrics. These metrics capture the effort needed to process the syntactic information given by every word (Brennan et al., 2012; Hale et al., 2018; Brennan et al.,2016). They can reveal where in the brain syntactic processing occurs, but not what features of syntax are processed by different brain regions. Here, we move beyond effort-based metrics and propose explicit features capturing the syntactic structure that is incrementally built while a sentence is being read. Using these features and functional Magnetic Resonance Imaging (fMRI) recordings of participants reading a natural text, we study the brain representation of syntax. We find that our syntactic structure-based features are better than effort-based metrics at predicting brain activity in various parts of the language system. We show evidence of the brain representation of complex syntactic information such as phrase and clause structures. We see that regions well-predicted by syntactic features are distributed in the language system and are not distinguishable from those processing semantics. Our results call for a shift in the approach used for studying syntactic processing.

28 citations
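
The comparison described above is an encoding analysis: regress voxel activity on word-by-word features and measure held-out predictive accuracy. A generic sketch with scikit-learn, using synthetic arrays in place of the real syntactic features and fMRI recordings:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder data: 1000 time points, 10 word-level syntactic features,
# 500 voxels of fMRI activity.
X = rng.standard_normal((1000, 10))      # e.g., phrase/clause structure features per word
Y = rng.standard_normal((1000, 500))     # BOLD responses (synthetic here)

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X_tr, Y_tr)
Y_hat = model.predict(X_te)

# Per-voxel prediction accuracy: correlation between predicted and held-out activity.
r = [np.corrcoef(Y_hat[:, v], Y_te[:, v])[0, 1] for v in range(Y.shape[1])]
print(f"mean voxel-wise r = {np.mean(r):.3f}")
```

Comparing the mean voxel-wise correlation obtained with structure-based features against that obtained with effort-based metrics is the kind of contrast the abstract reports.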


Journal ArticleDOI
TL;DR: This article proposed a phrase dependency graph attention network (PD-RGAT) for aspect-based sentiment analysis, which aggregates directed dependency edges and phrase information to extract aspect-sentiment polarity pairs.
Abstract: Aspect-based Sentiment Analysis (ABSA) is a subclass of sentiment analysis, which aims to identify the sentiment polarity such as positive, negative, or neutral for specific aspects or attributes that appear in a sentence. Previous studies have focused on extracting aspect-sentiment polarity pairs based on dependency trees, ignoring edge labels and phrase information. In this paper, we instead propose a phrase dependency graph attention network (PD-RGAT) on the ABSA task, which is a relational graph attention network constructed based on the phrase dependency graph, aggregating directed dependency edges and phrase information. We perform experiments with two pre-training models, GloVe and BERT. Experimental results on three benchmark datasets (i.e., Twitter, Restaurant, and Laptop) demonstrate that our proposed PD-RGAT has comparable effectiveness to a range of state-of-the-art models and further illustrate that the graph convolutional structure based on the phrase dependency graph can capture both syntactic information and short- and long-range word dependencies. It also shows that incorporating directed edge labels and phrase information can enhance the analysis of aspect-sentiment polarities on the ABSA task.

26 citations


Journal ArticleDOI
TL;DR: The results have validated that the image captions generated by the proposed method contain more accurate visual information and comply better with language habits and grammar rules.
Abstract: To generate an image caption, the content of the image must first be fully understood; then the semantic information contained in the image must be described using a phrase or statement that conforms to certain grammatical rules. Thus, it requires techniques from both computer vision and natural language processing to connect the two different media forms together, which is highly challenging. To adaptively adjust the effect of visual information and language information on the captioning process, in this paper we propose to integrate part-of-speech information into image captioning models based on the encoder-decoder framework. First, a part-of-speech prediction network is proposed to analyze and model the part-of-speech sequences for the words in natural language sentences; then, different mechanisms are proposed to integrate the part-of-speech guidance information with merge-based and inject-based image captioning models, respectively; finally, according to the integrated frameworks, a multi-task learning paradigm is proposed to facilitate model training. Experiments are conducted on two widely used image captioning datasets, Flickr30k and COCO, and the results have validated that the image captions generated by the proposed method contain more accurate visual information and comply better with language habits and grammar rules.

26 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: This work creates augmented parallel translation corpora by generating (path-specific) counterfactual aligned phrases, interpreting language models and phrasal alignment causally while taking both context and alignment into account.
Abstract: We propose a data augmentation method for neural machine translation. It works by interpreting language models and phrasal alignment causally. Specifically, it creates augmented parallel translation corpora by generating (path-specific) counterfactual aligned phrases. We generate these by sampling new source phrases from a masked language model, then sampling an aligned counterfactual target phrase by noting that a translation language model can be interpreted as a Gumbel-Max Structural Causal Model (Oberst and Sontag, 2019). Compared to previous work, our method takes both context and alignment into account to maintain the symmetry between source and target sequences. Experiments on IWSLT’15 English → Vietnamese, WMT’17 English → German, WMT’18 English → Turkish, and WMT’19 robust English → French show that the method can improve the performance of translation, backtranslation and translation robustness.

25 citations
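
A minimal sketch of only the first step mentioned in the abstract, proposing replacement source phrases by sampling from a masked language model; the model name and example sentence are assumptions, and the Gumbel-Max counterfactual resampling of the aligned target phrase is not shown.

```python
from transformers import pipeline

# Off-the-shelf masked LM used to propose alternative source phrases.
fill = pipeline("fill-mask", model="bert-base-uncased")

sentence = "the committee approved the [MASK] proposal"
for cand in fill(sentence, top_k=5):
    # Each candidate is a possible counterfactual source word/phrase;
    # in the paper, an aligned target phrase would then be resampled
    # under a Gumbel-Max structural causal model.
    print(f"{cand['token_str']:>12}  p={cand['score']:.3f}")
```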


Journal ArticleDOI
TL;DR: Both objective and subjective evaluations validate the effectiveness of the proposed phrase break prediction framework, which consistently improves voice quality in a Mongolian text-to-speech synthesis system.
Abstract: Prosodic phrasing is an important factor that affects naturalness and intelligibility in text-to-speech synthesis. Studies show that deep learning techniques improve prosodic phrasing when large text and speech corpora are available. However, for low-resource languages, such as Mongolian, prosodic phrasing remains a challenge for various reasons. First, the database suitable for system training is limited. Second, word composition knowledge that is prosody-informing has not been used in prosodic phrase modeling. To address these problems, in this article, we propose a feature augmentation method in conjunction with a self-attention neural classifier. We augment input text with morphological and phonological decompositions of words to enhance the text encoder. We study the use of a self-attention classifier, which makes use of the global context of a sentence, as the decoder for phrase break prediction. Both objective and subjective evaluations validate the effectiveness of the proposed phrase break prediction framework, which consistently improves voice quality in a Mongolian text-to-speech synthesis system.

22 citations
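
A generic sketch of the decoder idea: a small Transformer-encoder tagger that predicts a break/no-break label at every token of an (augmented) input sequence. All dimensions and the feature pipeline are placeholders, not the authors' Mongolian system.

```python
import torch
import torch.nn as nn

class PhraseBreakTagger(nn.Module):
    """Token-level break/no-break classifier with self-attention over the sentence."""

    def __init__(self, vocab_size=5000, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.out = nn.Linear(d_model, 2)    # break vs. no break after each token

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))
        return self.out(h)                   # (batch, seq_len, 2)

# Toy usage: batch of 3 sentences, 12 tokens each. In a real system the augmented
# morphological/phonological features would feed into the embeddings.
logits = PhraseBreakTagger()(torch.randint(0, 5000, (3, 12)))
print(logits.shape)   # torch.Size([3, 12, 2])
```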


Journal ArticleDOI
TL;DR: This paper presents the first unsupervised text simplification system based on a phrase-based machine translation system, which leverages a careful initialization of phrase tables and language models.
Abstract: Most recent approaches for Text Simplification (TS) have drawn on insights from machine translation to learn simplification rewrites from a monolingual parallel corpus of complex and simple sentences, yet their effectiveness strongly relies on large amounts of parallel sentences. However, a serious problem has haunted TS for decades: the availability of parallel TS corpora is scarce, or such corpora are not fit for the learning task. In this paper, we focus on the especially useful and challenging problem of unsupervised TS without a single parallel sentence. To the best of our knowledge, we present the first unsupervised text simplification system based on a phrase-based machine translation system, which leverages a careful initialization of phrase tables and language models. On the widely used WikiLarge and WikiSmall benchmarks, our system obtains 39.08 and 25.12 SARI points respectively, even outperforming some supervised baselines.

22 citations


Proceedings ArticleDOI
14 Aug 2021
TL;DR: The authors induce high-quality phrase spans as silver labels from consistently co-occurring word sequences within each document, which can be applied to new input to recognize (unseen) quality phrases regardless of their surface names or frequency.
Abstract: Identifying and understanding quality phrases from context is a fundamental task in text mining. The most challenging part of this task arguably lies in uncommon, emerging, and domain-specific phrases. The infrequent nature of these phrases significantly hurts the performance of phrase mining methods that rely on sufficient phrase occurrences in the input corpus. Context-aware tagging models, though not restricted by frequency, heavily rely on domain experts for either massive sentence-level gold labels or handcrafted gazetteers. In this work, we propose UCPhrase, a novel unsupervised context-aware quality phrase tagger. Specifically, we induce high-quality phrase spans as silver labels from consistently co-occurring word sequences within each document. Compared with typical context-agnostic distant supervision based on existing knowledge bases (KBs), our silver labels root deeply in the input domain and context, thus having unique advantages in preserving contextual completeness and capturing emerging, out-of-KB phrases. Training a conventional neural tagger based on silver labels usually faces the risk of overfitting phrase surface names. Alternatively, we observe that the contextualized attention maps generated from a transformer-based neural language model effectively reveal the connections between words in a surface-agnostic way. Therefore, we pair such attention maps with the silver labels to train a lightweight span prediction model, which can be applied to new input to recognize (unseen) quality phrases regardless of their surface names or frequency. Thorough experiments on various tasks and datasets, including corpus-level phrase ranking, document-level keyphrase extraction, and sentence-level phrase tagging, demonstrate the superiority of our design over state-of-the-art pre-trained, unsupervised, and distantly supervised methods.

21 citations
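
A rough sketch of the silver-label idea: within a single document, contiguous word sequences that recur often enough are taken as candidate quality-phrase spans. The thresholds and whitespace tokenization are arbitrary choices for illustration, not UCPhrase's actual mining procedure.

```python
from collections import Counter

def mine_silver_phrases(sentences, max_len=4, min_count=3):
    """Return multi-word n-grams that repeat within a single document."""
    counts = Counter()
    for sent in sentences:
        tokens = sent.lower().split()
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {" ".join(gram): c for gram, c in counts.items() if c >= min_count}

doc = [
    "deep reinforcement learning for robotic control",
    "we apply deep reinforcement learning to manipulation",
    "deep reinforcement learning remains sample inefficient",
]
print(mine_silver_phrases(doc, min_count=3))
# {'deep reinforcement': 3, 'reinforcement learning': 3, 'deep reinforcement learning': 3}
```

In the paper these silver spans are then paired with attention maps from a pre-trained language model to train a surface-agnostic span predictor.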


Proceedings ArticleDOI
10 Oct 2021
TL;DR: In this article, the authors present a prototype system that generates and recommends utterances for visual analysis based on a combination of data interestingness metrics and language pragmatics to support conversational visual analysis by guiding the participants' analytic workflows and making them aware of the system's language interpretation capabilities.
Abstract: Natural language interfaces (NLIs) have become a prevalent medium for conducting visual data analysis, enabling people with varying levels of analytic experience to ask questions of and interact with their data. While there have been notable improvements with respect to language understanding capabilities in these systems, fundamental user experience and interaction challenges including the lack of analytic guidance (i.e., knowing what aspects of the data to consider) and discoverability of natural language input (i.e., knowing how to phrase input utterances) persist. To address these challenges, we investigate utterance recommendations that contextually provide analytic guidance by suggesting data features (e.g., attributes, values, trends) while implicitly making users aware of the types of phrasings that an NLI supports. We present Snowy, a prototype system that generates and recommends utterances for visual analysis based on a combination of data interestingness metrics and language pragmatics. Through a preliminary user study, we found that utterance recommendations in Snowy support conversational visual analysis by guiding the participants’ analytic workflows and making them aware of the system’s language interpretation capabilities. Based on the feedback and observations from the study, we discuss potential implications and considerations for incorporating recommendations in future NLIs for visual analysis.

20 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: This paper proposes a navigation agent that utilizes syntax information derived from a dependency tree to enhance alignment between the instruction and the current visual scenes, and achieves a new state of the art on the Room-Across-Room dataset.
Abstract: Vision language navigation is the task that requires an agent to navigate through a 3D environment based on natural language instructions. One key challenge in this task is to ground instructions with the current visual information that the agent perceives. Most of the existing work employs soft attention over individual words to locate the instruction required for the next action. However, different words have different functions in a sentence (e.g., modifiers convey attributes, verbs convey actions). Syntax information like dependencies and phrase structures can aid the agent to locate important parts of the instruction. Hence, in this paper, we propose a navigation agent that utilizes syntax information derived from a dependency tree to enhance alignment between the instruction and the current visual scenes. Empirically, our agent outperforms the baseline model that does not use syntax information on the Room-to-Room dataset, especially in the unseen environment. Besides, our agent achieves a new state of the art on the Room-Across-Room dataset, which contains instructions in 3 languages (English, Hindi, and Telugu). We also show that our agent is better at aligning instructions with the current visual information via qualitative visualizations.

Proceedings ArticleDOI
01 Aug 2021
TL;DR: This paper proposed to learn phrase representations from the supervision of reading comprehension tasks, coupled with novel negative sampling methods, which achieved state-of-the-art performance in open-domain question answering.
Abstract: Open-domain question answering can be reformulated as a phrase retrieval problem, without the need for processing documents on-demand during inference (Seo et al., 2019). However, current phrase retrieval models heavily depend on sparse representations and still underperform retriever-reader approaches. In this work, we show for the first time that we can learn dense representations of phrases alone that achieve much stronger performance in open-domain QA. We present an effective method to learn phrase representations from the supervision of reading comprehension tasks, coupled with novel negative sampling methods. We also propose a query-side fine-tuning strategy, which can support transfer learning and reduce the discrepancy between training and inference. On five popular open-domain QA datasets, our model DensePhrases improves over previous phrase retrieval models by 15%-25% absolute accuracy and matches the performance of state-of-the-art retriever-reader models. Our model is easy to parallelize due to pure dense representations and processes more than 10 questions per second on CPUs. Finally, we directly use our pre-indexed dense phrase representations for two slot filling tasks, showing the promise of utilizing DensePhrases as a dense knowledge base for downstream tasks.
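
At inference time the approach above reduces open-domain QA to nearest-neighbour search over pre-indexed dense phrase vectors. A toy sketch with random NumPy embeddings; a real system would use learned encoders and an approximate index such as FAISS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-indexed corpus: 100k candidate phrase embeddings (placeholders).
phrase_vecs = rng.standard_normal((100_000, 768)).astype(np.float32)
phrase_vecs /= np.linalg.norm(phrase_vecs, axis=1, keepdims=True)

# Encoded question (placeholder for a learned query encoder).
query = rng.standard_normal(768).astype(np.float32)
query /= np.linalg.norm(query)

# Maximum inner product equals cosine similarity after normalisation.
scores = phrase_vecs @ query
top_k = np.argsort(-scores)[:5]
print(top_k, scores[top_k])
```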

Journal ArticleDOI
TL;DR: This work defines a phrase information network for answer selection, provides a novel idea for heterogeneous information network fusion, and proposes an innovative answering method that automatically matches the most relevant answers for a new issue.
Abstract: Community Question Answering (CQA) allows users to ask or answer questions in a social way, so it is becoming a primary means for people to acquire knowledge. However, the asker must wait until a satisfactory answer appears, which reduces user activity. In this paper, we propose an innovative answering method that matches the most relevant answers for a new issue automatically. Firstly, we utilize phrases to represent the semantics of the posts (answers/questions) and construct a Phrase Fusion Heterogeneous Information Network, called PFHIN, to represent complex entity relationships in CQA; answer selection is thus regarded as a related-entity retrieval task. Then, we define the distance between entities in PFHIN, which is independent of the meta path. Finally, the Type-constrained Top-k Similarity Entity Finding Algorithm (TTSEF) is proposed for finding the nearest entities according to a known start entity and end-entity type, which can match the most relevant answers automatically. To the best of our knowledge, this is the first work to define a phrase information network for answer selection and to provide a novel idea for heterogeneous information network fusion. Experimental results on three large-scale datasets (Stack Overflow, Super User, and Mathematics) from Stack Exchange demonstrate that our proposed approaches significantly outperform state-of-the-art answer retrieval methods. Moreover, we conduct an in-depth analysis of the meta path to the optimal answer and reveal the critical role of phrases in community answer matching.

Journal ArticleDOI
01 Jan 2021
TL;DR: The hypothesis that different brain regions could be sensitive to different kinds of syntactic computations is investigated, and the fit of phrase-structure and dependency structure descriptors to activity in brain areas using fMRI is compared.
Abstract: Finding the structure of a sentence — the way its words hold together to convey meaning — is a fundamental step in language comprehension. Several brain regions, including the left inferior frontal gyrus, the left posterior superior temporal gyrus, and the left anterior temporal pole, are supposed to support this operation. The exact role of these areas is nonetheless still debated. In this paper we investigate the hypothesis that different brain regions could be sensitive to different kinds of syntactic computations. We compare the fit of phrase-structure and dependency structure descriptors to activity in brain areas using fMRI. Our results show a division between areas with regard to the type of structure computed, with the left ATP and left IFG favouring dependency structures and left pSTG favouring phrase structures.

Proceedings ArticleDOI
17 Oct 2021
TL;DR: In this paper, the authors propose a Conceptual and Syntactical Cross-modal Alignment with Cross-level Consistency (CSCC) for image-text matching that simultaneously explores multiple-level cross-modal alignments across the conceptual and syntactical levels with a consistency constraint.
Abstract: Image-Text Matching (ITM) is a fundamental and emerging task, which plays a key role in cross-modal understanding. It remains a challenge because prior works mainly focus on learning fine-grained (i.e. coarse and/or phrase) correspondence, without considering the syntactical correspondence. In theory, a sentence is not only a set of words or phrases but also a syntactic structure, consisting of a set of basic syntactic tuples (i.e., (attribute) object - predicate - (attribute) subject). Inspired by this, we propose a Conceptual and Syntactical Cross-modal Alignment with Cross-level Consistency (CSCC) for Image-Text Matching by simultaneously exploring multiple-level cross-modal alignments across the conceptual and syntactical levels with a consistency constraint. Specifically, a conceptual-level cross-modal alignment is introduced for exploring the fine-grained correspondence, while a syntactical-level cross-modal alignment is proposed to explicitly learn a high-level syntactic similarity function. Moreover, an empirical cross-level consistent attention loss is introduced to maintain the consistency between cross-modal attentions obtained from the above two cross-modal alignments. To justify our method, comprehensive experiments are conducted on two public benchmark datasets, i.e., MS-COCO (1K and 5K) and Flickr30K, which show that our CSCC outperforms state-of-the-art methods with fairly competitive improvements.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: This paper propose a contrastive learning framework that accounts for both region-phrase and image-sentence matching, which removes the need of object detection at test time, thereby significantly reducing the inference cost.
Abstract: Weakly supervised phrase grounding aims at learning region-phrase correspondences using only image-sentence pairs. A major challenge thus lies in the missing links between image regions and sentence phrases during training. To address this challenge, we leverage a generic object detector at training time, and propose a contrastive learning framework that accounts for both region-phrase and image-sentence matching. Our core innovation is the learning of a region-phrase score function, based on which an image-sentence score function is further constructed. Importantly, our region-phrase score function is learned by distilling from soft matching scores between the detected object names and candidate phrases within an image-sentence pair, while the image-sentence score function is supervised by ground-truth image-sentence pairs. The design of such score functions removes the need of object detection at test time, thereby significantly reducing the inference cost. Without bells and whistles, our approach achieves state-of-the-art results on visual phrase grounding, surpassing previous methods that require expensive object detectors at test time.
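
A generic sketch of the image-sentence side of such a contrastive objective: an InfoNCE-style loss that pulls matched image and sentence embeddings together within a batch. Random tensors stand in for the learned encoders, and this is not the paper's exact score function or distillation setup.

```python
import torch
import torch.nn.functional as F

def info_nce(image_emb, sent_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image-sentence pairs."""
    image_emb = F.normalize(image_emb, dim=-1)
    sent_emb = F.normalize(sent_emb, dim=-1)
    logits = image_emb @ sent_emb.T / temperature       # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))              # diagonal entries are true pairs
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

# Toy batch of 8 matched pairs in a 256-d joint space.
loss = info_nce(torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())
```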

Journal ArticleDOI
01 Jan 2021
TL;DR: A Sentiment Classification model of Chinese Micro-blog Comments based on Key Sentences (SC-CMC-KS) is proposed, in which the key sentences and emoticons of the complete micro-blog are weighted to compute the final sentiment value and comments are classified according to a set threshold.
Abstract: With the advancements of communication technology, the growth in the number of blogs has been remarkable. Information sharing has been a matter of interest for researchers, as people contribute to information shared through the Internet. Such sharing is common in the Chinese language, as approximately 15% of the world’s population are native speakers of Mandarin. Because the comments shared in such micro-blogs may contain complex sentences, the result of sentiment classification may be affected, reducing its accuracy. To address this, a Sentiment Classification model of Chinese Micro-blog Comments based on Key Sentences (SC-CMC-KS) is proposed. A key sentence extraction algorithm for Chinese micro-blog comments is presented that considers three factors to recognize the key sentences of a given comment: sentiment attributes, location attributes, and critical feature word attributes. Besides, a computing algorithm for sentence sentiment value that integrates both dependency relationships and multi-rules (i.e., a sentence-type rule and an inter-sentence rule) is designed, and a modification distance describing the relationship between modification words and core words is defined, in which the sentiment value at the phrase level is computed according to the calculation rules so that the sentiment value at the sentence level is obtained based on the multi-rules. Furthermore, a sentiment classification algorithm for micro-blog comments is presented, in which the key sentences and emoticons of the complete micro-blog are weighted to compute the final sentiment value, and comments are classified according to a set threshold. Experimental results show that the model is effective and promising.

Journal ArticleDOI
TL;DR: This article examined the extent to which hierarchical structure plays a role in language processing using EEG from human participants as they listened to isochronous streams of monosyllabic words.
Abstract: The interlocking roles of lexical, syntactic and semantic processing in language comprehension have been the subject of longstanding debate. Recently, the cortical response to a frequency-tagged linguistic stimulus has been shown to track the rate of phrase and sentence, as well as syllable, presentation. This could be interpreted as evidence for the hierarchical processing of speech, or as a response to the repetition of grammatical category. To examine the extent to which hierarchical structure plays a role in language processing, we recorded EEG from human participants as they listened to isochronous streams of monosyllabic words. Comparing responses to sequences in which grammatical category is strictly alternating and chosen such that two-word phrases can be grammatically constructed—cold food loud room—or is absent—rough give ill tell—showed that cortical entrainment at the two-word phrase rate was only present in the grammatical condition. Thus, grammatical category repetition alone does not yield entrainment at a level higher than the word. On the other hand, cortical entrainment was reduced for the mixed-phrase condition that contained two-word phrases but no grammatical category repetition—that word send less—which is not what would be expected if the measured entrainment reflected purely abstract hierarchical syntactic units. Our results support a model in which word-level grammatical category information is required to build larger units.

Journal ArticleDOI
TL;DR: The tenets of critical and social realism are well supported in the literature, as discussed by the authors; however, researchers following a realist paradigm have concerns about the lack of methodical guidance for qualitative analysis.
Abstract: The tenets of critical and social realism are well supported in the literature. However, researchers following a realist paradigm have concerns about the lack of methodical guidance for qualitative...

Journal ArticleDOI
Kun Xu, Han Wu, Linfeng Song, Haisong Zhang, Linqi Song, Dong Yu
TL;DR: Experiments show that while traditional SRL systems perform poorly for analyzing dialogues, modeling dialogue histories and participants greatly helps the performance, indicating that adapting SRL to conversations is very promising for universal dialogue understanding.
Abstract: Semantic role labeling (SRL) aims to extract the arguments for each predicate in an input sentence. Traditional SRL can fail to analyze dialogues because it only works on every single sentence, while ellipsis and anaphora frequently occur in dialogues. To address this problem, we propose the conversational SRL task, where an argument can be the dialogue participants, a phrase in the dialogue history or the current sentence. As the existing SRL datasets are in the sentence level, we manually annotate semantic roles for 3000 chit-chat dialogues (27198 sentences) to boost the research in this direction. Experiments show that while traditional SRL systems (even with the help of coreference resolution or rewriting) perform poorly for analyzing dialogues, modeling dialogue histories and participants greatly helps the performance, indicating that adapting SRL to conversations is very promising for universal dialogue understanding. Our initial study by applying CSRL to two mainstream conversational tasks, dialogue response generation and dialogue context rewriting, also confirms the usefulness of CSRL.

Posted Content
TL;DR: This paper proposed a contrastive fine-tuning objective that enables BERT to produce more powerful phrase embeddings, which can be integrated with a simple autoencoder to build a phrase-based neural topic model that interprets topics as mixtures of words and phrases.
Abstract: Phrase representations derived from BERT often do not exhibit complex phrasal compositionality, as the model relies instead on lexical similarity to determine semantic relatedness. In this paper, we propose a contrastive fine-tuning objective that enables BERT to produce more powerful phrase embeddings. Our approach (Phrase-BERT) relies on a dataset of diverse phrasal paraphrases, which is automatically generated using a paraphrase generation model, as well as a large-scale dataset of phrases in context mined from the Books3 corpus. Phrase-BERT outperforms baselines across a variety of phrase-level similarity tasks, while also demonstrating increased lexical diversity between nearest neighbors in the vector space. Finally, as a case study, we show that Phrase-BERT embeddings can be easily integrated with a simple autoencoder to build a phrase-based neural topic model that interprets topics as mixtures of words and phrases by performing a nearest neighbor search in the embedding space. Crowdsourced evaluations demonstrate that this phrase-based topic model produces more coherent and meaningful topics than baseline word and phrase-level topic models, further validating the utility of Phrase-BERT.
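
A rough sketch of the kind of contrastive signal described: pull a phrase toward a paraphrase and away from a lexically similar non-paraphrase, here with mean-pooled BERT embeddings and a triplet margin loss. The model name, pooling, and example phrases are assumptions, not the Phrase-BERT training recipe.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def embed(phrases):
    """Mean-pool the last hidden states into one vector per phrase."""
    batch = tok(phrases, padding=True, return_tensors="pt")
    hidden = bert(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)

anchor = embed(["heavy rainfall"])
positive = embed(["torrential rain"])       # paraphrase
negative = embed(["heavy metal"])           # lexically similar, different meaning

loss = torch.nn.TripletMarginLoss(margin=1.0)(anchor, positive, negative)
loss.backward()   # in fine-tuning, this gradient would update the encoder
print(loss.item())
```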

Journal ArticleDOI
TL;DR: A syntax-driven method that first parses a given information type phrase into its constituents using a context-free grammar and second infers semantic relationships between constituents using semantic rules can be generalized reliably in inferring relations and reducing the ambiguity and abstraction in privacy policies.
Abstract: Context: Several government laws and app markets, such as Google Play, require the disclosure of app data practices to users. These data practices constitute critical privacy requirements statements, since they underpin the app’s functionality while describing how various personal information types are collected, used, and with whom they are shared. Objective: Abstract and ambiguous terminology in requirements statements concerning information types (e.g., “we collect your device information”) can reduce shared understanding among app developers, policy writers, and users. Method: To address this challenge, we propose a syntax-driven method that first parses a given information type phrase (e.g., mobile device identifier) into its constituents using a context-free grammar and second infers semantic relationships between constituents using semantic rules. The inferred semantic relationships between a given phrase and its constituents generate a hierarchy that models the generality and ambiguity of phrases. Through this method, we infer relations from a lexicon consisting of a set of information type phrases to populate a partial ontology. The resulting ontology is a knowledge graph that can be used to guide requirements authors in the selection of the most appropriate information type terms. Results: We evaluate the method’s performance using two criteria: (1) expert assessment of relations between information types; and (2) non-expert preferences for relations between information types. The results suggest performance improvement when compared to a previously proposed method. We also evaluate the reliability of the method considering the information types extracted from different data practices (e.g., collection, usage, sharing, etc.) in privacy policies for mobile or web-based apps in various app domains. Contributions: The method achieves an average of 89% precision and 87% recall considering information types from various app domains and data practices. Due to these results, we conclude that the method can be generalized reliably in inferring relations and reducing the ambiguity and abstraction in privacy policies.
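
A small illustration of the parsing step with NLTK: a toy context-free grammar decomposes an information-type phrase such as "mobile device identifier" into constituents. The grammar and the commented semantic rules are invented for illustration and are not the paper's grammar.

```python
import nltk

# Toy grammar: an information type is an optional modifier chain plus a head noun.
grammar = nltk.CFG.fromstring("""
  INFO -> MOD INFO | HEAD
  MOD  -> 'mobile' | 'device' | 'unique'
  HEAD -> 'identifier' | 'information' | 'device'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("mobile device identifier".split()):
    tree.pretty_print()
    # Constituents such as 'device identifier' and 'identifier' would then be
    # related to the full phrase via semantic rules (e.g., is-a / part-of)
    # to populate the ontology described in the abstract.
    break
```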

Journal ArticleDOI
TL;DR: It is concluded that Amahuaca provides evidence that maximal projections can be probes; analyzing agreeing adjunct C as a maximal projection that probes its c-command domain in second cycle Agree derives C’s simultaneous sensitivity to DPs within its own clause and in the clause to which it adjoins.
Abstract: When we couple the cyclic expansion of a probe’s domain assumed in Cyclic Agree (Rezac 2003, 2004, Béjar and Rezac 2009) with the lack of formal distinction between heads, intermediate projections, and phrases emphasized in Bare Phrase Structure (Chomsky 1995a,b), an interesting prediction arises. Maximal projections should be able to probe through the same mechanisms that allow intermediate projections to probe in familiar cases of Cyclic Agree. I argue that this prediction is borne out. I analyze agreeing adjunct C in Amahuaca (Panoan; Peru) as a maximal projection that probes its c-command domain in second cycle Agree. This account derives C’s simultaneous sensitivity to DPs within its own clause and in the clause to which it adjoins. Therefore, I conclude that Amahuaca provides evidence that maximal projections can be probes. The account also yields insight into the syntax of switch-reference in Panoan and beyond.

Posted ContentDOI
01 May 2021-bioRxiv
TL;DR: In this article, the authors evaluate a minimal compositional scheme using intracranial recordings to map the process of semantic composition in phrase structure comprehension, and find that significantly greater broadband gamma activity (70-150Hz) occurred in the temporo-occipital junction (TOJ) and posterior middle temporal gyrus (pMTG) for pseudowords than for words (300-700ms post-onset) in both first and second word positions.
Abstract: The ability to comprehend meaningful phrases is an essential component of language. Here we evaluate a minimal compositional scheme - the 'red-boat' paradigm - using intracranial recordings to map the process of semantic composition in phrase structure comprehension. 18 human participants, implanted with penetrating depth or surface subdural intracranial electrodes for the evaluation of medically refractory epilepsy, were presented with auditory recordings of adjective-noun, pseudoword-noun and adjective-pseudoword phrases before being presented with a colored drawing, and were asked to judge whether the phrase matched the object presented. Significantly greater broadband gamma activity (70-150Hz) occurred in the temporo-occipital junction (TOJ) and posterior middle temporal gyrus (pMTG) for pseudowords over words (300-700ms post-onset) in both first- and second-word positions. Greater inter-trial phase coherence (8-12Hz) was found for words than for pseudowords in the posterior superior temporal gyrus (pSTG). Isolating phrase structure sensitivity, we identified a portion of TOJ and posterior superior temporal sulcus (pSTS) that showed greater gamma activity for phrase composition than for non-composition, while the left anterior temporal lobe (ATL) showed greater low frequency (2-15Hz) activity for phrase composition, likely coordinating distributed semantic representations. Greater functional connectivity between pSTS-TOJ and pars triangularis, and between pSTS-TOJ and ATL, was also found for phrase composition. STG, ATL and pars triangularis were found to encode anticipation of composition in the beta band (15-30Hz), and alpha (8-12Hz) power increases in ATL were also linked to anticipation. These results indicate that pSTS-TOJ appears to be a crucial hub in the network responsible for the retrieval and computation of minimal phrases, and that anticipation of such composition is encoded in fronto-temporal regions.

Journal ArticleDOI
TL;DR: A novel Multi-Channel Self-Attention Network (MCSAN) is proposed that incorporates both inter-channel and inter-positional interactions to extract n-grams of characters, words, parts of speech (POS), phrase structures, dependency relationships, and topics from multiple dimensions (style, content, syntactic and semantic features) to distinguish different authors.

Journal ArticleDOI
TL;DR: Two experiments further test the mechanisms found in NST—priming, activation, and satiation—as an account of the speech to song illusion and demonstrate that once lexical nodes are satiated the higher level semantic information associated with the word cannot differentially influence song-like ratings to lists of words varying in emotional arousal.
Abstract: In the speech to song illusion, a spoken phrase begins to sound as if it is being sung after several repetitions. Castro et al. (2018) used Node Structure Theory (NST; MacKay, 1987), a model of speech perception and production, to explain how the illusion occurs. Two experiments further test the mechanisms found in NST-priming, activation, and satiation-as an account of the speech to song illusion. In Experiment 1, words varying in the phonological clustering coefficient influenced how quickly a lexical node could recover from satiation, thereby influencing the song-like ratings to lists of words that were high versus low in phonological clustering coefficient. In Experiment 2, we used equivalence testing (i.e., the TOST procedure) to demonstrate that once lexical nodes are satiated the higher level semantic information associated with the word cannot differentially influence song-like ratings to lists of words varying in emotional arousal. The results of these two experiments further support the NST account of the speech to song illusion.

Journal ArticleDOI
TL;DR: An unsupervised Noun Phrase-based Open RE system for the Chinese language (NPORE) is presented, which employs a three-layer data-driven architecture and contains three components, i.e., a Modifier-sensitive Phrase Segmenter, a Candidate Relation Generator and a Missing Relation Predicate Detector.
Abstract: Relation Extraction (RE) aims at harvesting relational facts from texts. A majority of existing research targets knowledge acquisition from sentences, where subject-verb-object structures are usually treated as signals of the existence of relations. In contrast, relational facts expressed within noun phrases are highly implicit. Previous work mostly relies on human-compiled assertions and textual patterns in English to address noun phrase-based RE. For Chinese, the corresponding task is non-trivial because Chinese is a highly analytic language with flexible expressions. Additionally, noun phrases tend to be incomplete in grammatical structures, where clear mentions of predicates are often missing. In this article, we present an unsupervised Noun Phrase-based Open RE system for the Chinese language (NPORE), which employs a three-layer data-driven architecture. The system contains three components, i.e., a Modifier-sensitive Phrase Segmenter, a Candidate Relation Generator and a Missing Relation Predicate Detector. It integrates with a graph clique mining algorithm to chunk Chinese noun phrases, considering how relations are expressed. We further propose a probabilistic method with knowledge priors and a hypergraph-based random walk process to detect missing relation predicates. Experiments over Chinese Wikipedia show NPORE outperforms the state of the art, capable of extracting 55.2 percent more relations than the most competitive baseline, with a comparable precision of 95.4 percent.

Journal ArticleDOI
TL;DR: This is the first attempt at group incremental adaptive clustering of crime reports integrating neural networks and rough set theory; the approach is compared with some state-of-the-art clustering algorithms to demonstrate its effectiveness and statistical significance in the domain of crime corpora.

Proceedings Article
18 May 2021
TL;DR: The authors propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching, which employs BERT to encode the input sentences from a global perspective and a CNN-based encoder to capture keyword and phrase information from a local perspective.
Abstract: Sentence semantic matching is one of the fundamental tasks in natural language processing, which requires an agent to determine the semantic relation among input sentences. Recently, deep neural networks have achieved impressive performance in this area, especially BERT. Despite the effectiveness of these models, most of them treat output labels as meaningless one-hot vectors, underestimating the semantic information and guidance of relations that these labels reveal, especially for tasks with a small number of labels. To address this problem, we propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching. Specifically, we first employ BERT to encode the input sentences from a global perspective. Then a CNN-based encoder is designed to capture keyword and phrase information from a local perspective. To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task for guiding R2-Net to consider more about labels. Meanwhile, a triplet loss is employed to distinguish the intra-class and inter-class relations in a finer granularity. Empirical experiments on two sentence semantic matching tasks demonstrate the superiority of our proposed model. As a byproduct, we have released the code to facilitate further research.
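
A rough sketch of the global-plus-local encoding idea in the abstract: a pooled sentence-level view concatenated with a CNN view over token embeddings that captures local keyword and phrase n-gram features. Shapes and the toy classifier are illustrative and are not R2-Net.

```python
import torch
import torch.nn as nn

class LocalGlobalMatcher(nn.Module):
    """Toy sentence-pair classifier: global pooled view plus a CNN n-gram view."""

    def __init__(self, d=768, n_filters=64, n_labels=3):
        super().__init__()
        self.conv = nn.Conv1d(d, n_filters, kernel_size=3, padding=1)  # local phrase-level n-grams
        self.cls = nn.Linear(2 * (d + n_filters), n_labels)

    def encode(self, token_embs):                 # (B, T, d), e.g. contextual token embeddings
        global_view = token_embs.mean(dim=1)                                          # (B, d)
        local_view = self.conv(token_embs.transpose(1, 2)).relu().max(dim=2).values   # (B, n_filters)
        return torch.cat([global_view, local_view], dim=-1)

    def forward(self, sent_a, sent_b):
        return self.cls(torch.cat([self.encode(sent_a), self.encode(sent_b)], dim=-1))

# Toy usage with random "token embeddings" for a batch of 4 sentence pairs.
logits = LocalGlobalMatcher()(torch.randn(4, 20, 768), torch.randn(4, 20, 768))
print(logits.shape)   # torch.Size([4, 3])
```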