Showing papers on "Context (language use) published in 2018"


Journal ArticleDOI
TL;DR: This paper reviews significant deep learning related models and methods that have been employed for numerous NLP tasks and provides a walk-through of their evolution.
Abstract: Deep learning methods employ multiple processing layers to learn hierarchical representations of data, and have produced state-of-the-art results in many domains. Recently, a variety of model designs and methods have blossomed in the context of natural language processing (NLP). In this paper, we review significant deep learning related models and methods that have been employed for numerous NLP tasks and provide a walk-through of their evolution. We also summarize, compare and contrast the various models and put forward a detailed understanding of the past, present and future of deep learning in NLP.

2,466 citations


BookDOI
Mark H. Davis1
20 Feb 2018
TL;DR: In this article, a multidimensional approach brings together cognitive, sociobiological and behavioural perspectives providing students with a thorough, balanced and well-synthesised presentation of contemporary empathy research.
Abstract: This multidimensional approach brings together cognitive, sociobiological and behavioural perspectives, providing students with a thorough, yet balanced and well-synthesised presentation of contemporary empathy research. The author approaches the topic in two ways: 1. through empirical work which is examined in a variety of empathy-related areas, clearly recognising the theoretical context; 2. through an organisational model which puts the smaller pieces into one, more coherent whole.

2,171 citations


Proceedings ArticleDOI
11 Jun 2018
TL;DR: SQuADRUn as discussed by the authors is a new dataset that combines the existing Stanford Question Answering Dataset with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.
Abstract: Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuADRUn, a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuADRUn, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuADRUn is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD achieves only 66% F1 on SQuADRUn. We release SQuADRUn to the community as the successor to SQuAD.
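
Operationally, the abstain behavior SQuADRUn asks for can be pictured as thresholding the gap between a no-answer score and the best span score. Below is a minimal illustrative sketch in Python; `span_score`, `no_answer_score`, `best_span`, and the threshold are hypothetical stand-ins for a trained reader's outputs, not anything specified in the paper.

```python
# Minimal sketch of the answer-or-abstain decision (hypothetical scores).
def predict(span_score: float, no_answer_score: float,
            best_span: str, threshold: float = 0.0) -> str:
    """Return the best span, or the empty string to abstain."""
    # Abstain when the no-answer evidence beats the best span by the margin.
    if no_answer_score - span_score > threshold:
        return ""  # the empty string counts as predicting "no answer"
    return best_span

# Hypothetical usage: scores produced by some trained reader for one question.
print(predict(span_score=3.2, no_answer_score=5.1, best_span="1867"))
```

The threshold would typically be tuned on development data to trade off answer recall against false answers.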

1,398 citations


Proceedings ArticleDOI
14 Dec 2018
TL;DR: PSMNet as discussed by the authors is a pyramid stereo matching network consisting of two main modules: spatial pyramid pooling, and a 3D CNN that regularizes the cost volume using stacked hourglass networks in conjunction with intermediate supervision.
Abstract: Recent work has shown that depth estimation from a stereo pair of images can be formulated as a supervised learning task to be resolved with convolutional neural networks (CNNs). However, current architectures rely on patch-based Siamese networks, lacking the means to exploit context information for finding correspondence in ill-posed regions. To tackle this problem, we propose PSMNet, a pyramid stereo matching network consisting of two main modules: spatial pyramid pooling and 3D CNN. The spatial pyramid pooling module takes advantage of the capacity of global context information by aggregating context in different scales and locations to form a cost volume. The 3D CNN learns to regularize the cost volume using stacked multiple hourglass networks in conjunction with intermediate supervision. The proposed approach was evaluated on several benchmark datasets. Our method ranked first in the KITTI 2012 and 2015 leaderboards before March 18, 2018. The code for PSMNet is available at: https://github.com/JiaRenChang/PSMNet.
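
To make the two modules concrete, the PyTorch sketch below shows the concatenation-style cost volume that PSMNet-like stereo networks hand to their 3D CNN. Tensor shapes and `max_disp` are illustrative assumptions; the authors' actual implementation is at the URL above.

```python
import torch

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-based cost volume for stereo matching (a sketch).

    left_feat, right_feat: (B, C, H, W) feature maps from a shared extractor.
    Returns a (B, 2C, max_disp, H, W) volume for a 3D CNN to regularize.
    """
    b, c, h, w = left_feat.shape
    volume = left_feat.new_zeros(b, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        if d > 0:
            # Left pixel x corresponds to right pixel x - d at disparity d.
            volume[:, :c, d, :, d:] = left_feat[:, :, :, d:]
            volume[:, c:, d, :, d:] = right_feat[:, :, :, :-d]
        else:
            volume[:, :c, d] = left_feat
            volume[:, c:, d] = right_feat
    return volume

vol = build_cost_volume(torch.randn(1, 32, 64, 128),
                        torch.randn(1, 32, 64, 128), max_disp=48)
print(vol.shape)  # torch.Size([1, 64, 48, 64, 128])
```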

1,172 citations


Journal ArticleDOI
27 Jul 2018-Science
TL;DR: Methods for achieving inverse design, which aims to discover tailored materials from the starting point of a particular desired functionality, are reviewed.
Abstract: The discovery of new materials can bring enormous societal and technological progress. In this context, completely exploring the large space of potential materials is computationally intractable. Here, we review methods for achieving inverse design, which aims to discover tailored materials from the starting point of a particular desired functionality. Recent advances from the rapidly growing field of artificial intelligence, mostly from the subfield of machine learning, have resulted in a fertile exchange of ideas, where approaches to inverse molecular design are being proposed and employed at a rapid pace. Among these, deep generative models have been applied to numerous classes of materials: rational design of prospective drugs, synthetic routes to organic compounds, and optimization of photovoltaics and redox flow batteries, as well as a variety of other solid-state materials.

1,090 citations


Posted Content
TL;DR: SQuAD 2.0 is a new dataset that combines the existing Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.
Abstract: Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines existing SQuAD data with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD 2.0 is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD 1.1 achieves only 66% F1 on SQuAD 2.0.

793 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide an overview of variable selection methods that are based on significance or information criteria, penalized likelihood, change-in-estimate criterion, background knowledge, or combinations thereof.
Abstract: Statistical models support medical research by facilitating individualized outcome prognostication conditional on independent variables or by estimating effects of risk factors adjusted for covariates. Theory of statistical models is well-established if the set of independent variables to consider is fixed and small. Hence, we can assume that effect estimates are unbiased and the usual methods for confidence interval estimation are valid. In routine work, however, it is not known a priori which covariates should be included in a model, and often we are confronted with 10 to 30 candidate variables. This number is often too large to be considered in a statistical model. We provide an overview of available variable selection methods that are based on significance or information criteria, penalized likelihood, the change-in-estimate criterion, background knowledge, or combinations thereof. These methods were usually developed in the context of a linear regression model and then transferred to more generalized linear models or models for censored survival data. Variable selection, in particular if used in explanatory modeling where effect estimates are of central interest, can compromise stability of a final model, unbiasedness of regression coefficients, and validity of p-values or confidence intervals. Therefore, we give pragmatic recommendations for the practicing statistician on application of variable selection methods in general (low-dimensional) modeling problems and on performing stability investigations and inference. We also propose some quantities based on resampling the entire variable selection process to be routinely reported by software packages offering automated variable selection algorithms.
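
As one concrete instance of significance-based selection, here is a minimal backward-elimination sketch using statsmodels on synthetic data. The helper and its defaults are ours, not the paper's; alpha = 0.157 is the significance level that roughly mimics AIC-based selection for single-parameter terms.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def backward_eliminate(X: pd.DataFrame, y, alpha: float = 0.157):
    """Drop the least significant candidate variable until all p-values < alpha."""
    selected = list(X.columns)
    while selected:
        model = sm.OLS(y, sm.add_constant(X[selected])).fit()
        pvals = model.pvalues.drop("const")  # ignore the intercept
        worst = pvals.idxmax()
        if pvals[worst] < alpha:
            return model, selected
        selected.remove(worst)
    return None, []

# Synthetic example with 10 candidate variables, 2 of them truly active.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 10)), columns=[f"x{i}" for i in range(10)])
y = 2 * X["x0"] - X["x3"] + rng.normal(size=200)
_, kept = backward_eliminate(X, y)
print(kept)
```

As the abstract warns, inference on the final model should account for the selection process itself, for example via resampling.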

783 citations


Proceedings ArticleDOI
TL;DR: This work proposes model cards, a framework that can be used to document any trained machine learning model in the application fields of computer vision and natural language processing, and provides cards for two supervised models: one trained to detect smiling faces in images, and one trained to detect toxic comments in text.
Abstract: Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: One trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.
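
Since a model card is ultimately a structured document, a small data-structure sketch conveys the idea. The field names and example values below are illustrative paraphrases of the paper's proposed sections, not a schema the authors define.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ModelCard:
    """Illustrative skeleton of model card reporting fields."""
    model_details: str
    intended_use: str
    out_of_scope_use: str
    evaluation_procedure: str
    # Disaggregated metrics, e.g. per demographic or phenotypic group.
    metrics_by_group: Dict[str, float] = field(default_factory=dict)
    caveats: List[str] = field(default_factory=list)

# Hypothetical card for a smile detector; all numbers are made up.
card = ModelCard(
    model_details="CNN smile detector, v1",
    intended_use="Research on facial attribute classification",
    out_of_scope_use="Emotion recognition or employment decisions",
    evaluation_procedure="F1 on a held-out set, disaggregated by group",
    metrics_by_group={"overall": 0.91, "group A": 0.93, "group B": 0.86},
    caveats=["Evaluation set may underrepresent some groups"],
)
print(card.intended_use)
```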

744 citations


Proceedings ArticleDOI
21 Aug 2018
TL;DR: QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as a detailed qualitative evaluation shows.
Abstract: We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). The dialogs involve two crowd workers: (1) a student who poses a sequence of freeform questions to learn as much as possible about a hidden Wikipedia text, and (2) a teacher who answers the questions by providing short excerpts from the text. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as we show in a detailed qualitative evaluation. We also report results for a number of reference models, including a recent state-of-the-art reading comprehension architecture extended to model dialog context. Our best model underperforms humans by 20 F1, suggesting that there is significant room for future work on this data. Dataset, baseline, and leaderboard available at http://quac.ai.

690 citations


01 Jan 2018
Abstract: Contributing Authors: Katherine Calvin (USA), Joana Correia de Oliveira de Portugal Pereira (UK/Portugal), Oreane Edelenbosch (Netherlands/Italy), Johannes Emmerling (Italy/Germany), Sabine Fuss (Germany), Thomas Gasser (Austria/France), Nathan Gillett (Canada), Chenmin He (China), Edgar Hertwich (USA/Austria), Lena Höglund-Isaksson (Austria/Sweden), Daniel Huppmann (Austria), Gunnar Luderer (Germany), Anil Markandya (Spain/UK), David L. McCollum (USA/Austria), Malte Meinshausen (Australia/Germany), Richard Millar (UK), Alexander Popp (Germany), Pallav Purohit (Austria/India), Keywan Riahi (Austria), Aurélien Ribes (France), Harry Saunders (Canada/USA), Christina Schädel (USA/Switzerland), Chris Smith (UK), Pete Smith (UK), Evelina Trutnevyte (Switzerland/Lithuania), Yang Xiu (China), Wenji Zhou (Austria/China), Kirsten Zickfeld (Canada/Germany)

671 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: PiCANet, as discussed by the authors, is a pixel-wise contextual attention network that learns to selectively attend to informative context locations for each pixel, generating an attention map in which each attention weight corresponds to the contextual relevance at each context location.
Abstract: Contexts play an important role in the saliency detection task. However, given a context region, not all contextual information is helpful for the final task. In this paper, we propose a novel pixel-wise contextual attention network, i.e., the PiCANet, to learn to selectively attend to informative context locations for each pixel. Specifically, for each pixel, it can generate an attention map in which each attention weight corresponds to the contextual relevance at each context location. An attended contextual feature can then be constructed by selectively aggregating the contextual information. We formulate the proposed PiCANet in both global and local forms to attend to global and local contexts, respectively. Both models are fully differentiable and can be embedded into CNNs for joint training. We also incorporate the proposed models with the U-Net architecture to detect salient objects. Extensive experiments show that the proposed PiCANets can consistently improve saliency detection performance. The global and local PiCANets facilitate learning global contrast and homogeneousness, respectively. As a result, our saliency model can detect salient objects more accurately and uniformly, thus performing favorably against the state-of-the-art methods.
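
The aggregation at the heart of pixel-wise contextual attention reduces to a per-pixel softmax over context locations followed by a weighted sum. The PyTorch sketch below shows only that step, with illustrative shapes; how the attention logits are produced (the paper uses dedicated global and local network modules) is left abstract.

```python
import torch
import torch.nn.functional as F

def pixelwise_global_attention(features, attention_logits):
    """Aggregate the whole feature map separately for every pixel.

    features:         (B, C, H, W)
    attention_logits: (B, H*W, H, W), one logit per context location per pixel.
    Returns attended contextual features of shape (B, C, H, W).
    """
    b, c, h, w = features.shape
    # Normalize each pixel's attention over all H*W context locations.
    attn = F.softmax(attention_logits.view(b, h * w, h * w), dim=1)
    flat = features.view(b, c, h * w)        # (B, C, HW)
    attended = torch.bmm(flat, attn)         # weighted sum per output pixel
    return attended.view(b, c, h, w)

out = pixelwise_global_attention(torch.randn(2, 16, 8, 8),
                                 torch.randn(2, 64, 8, 8))
print(out.shape)  # torch.Size([2, 16, 8, 8])
```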

Posted Content
TL;DR: This paper addresses the semantic segmentation task with a new context aggregation scheme named "object context", which focuses on enhancing the role of object information by using a dense relation matrix to serve as a surrogate for the binary relation matrix.
Abstract: In this paper, we address the semantic segmentation task with a new context aggregation scheme named "object context", which focuses on enhancing the role of object information. Motivated by the fact that the category of each pixel is inherited from the object it belongs to, we define the object context for each pixel as the set of pixels that belong to the same category as the given pixel in the image. We use a binary relation matrix to represent the relationship between all pixels, where the value one indicates the two selected pixels belong to the same category and zero otherwise. We propose to use a dense relation matrix to serve as a surrogate for the binary relation matrix. The dense relation matrix is capable of emphasizing the contribution of object information, as the relation scores tend to be larger on the object pixels than on the other pixels. Considering that the dense relation matrix estimation requires quadratic computation overhead and memory consumption w.r.t. the input size, we propose an efficient interlaced sparse self-attention scheme to model the dense relations between any two of all pixels via the combination of two sparse relation matrices. To capture richer context information, we further combine our interlaced sparse self-attention scheme with the conventional multi-scale context schemes, including pyramid pooling (Zhao et al., 2017) and atrous spatial pyramid pooling (Chen et al., 2018). We empirically show the advantages of our approach with competitive performances on five challenging benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context and COCO-Stuff.
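
The dense relation matrix amounts to pixel-to-pixel self-attention, sketched below in PyTorch. This shows the dense (quadratic) form only, without the learned query/key projections or the interlaced sparse factorization the paper introduces to cut the cost; shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def object_context(pixel_feats):
    """Soft surrogate for the binary same-object relation matrix.

    pixel_feats: (B, C, N) with N = H*W flattened pixels.
    Each pixel's context is a relation-weighted sum over all pixels.
    """
    sim = torch.bmm(pixel_feats.transpose(1, 2), pixel_feats)   # (B, N, N)
    relation = F.softmax(sim, dim=2)                            # rows sum to 1
    context = torch.bmm(relation, pixel_feats.transpose(1, 2))  # (B, N, C)
    return context.transpose(1, 2)                              # (B, C, N)

ctx = object_context(torch.randn(2, 64, 256))
print(ctx.shape)  # torch.Size([2, 64, 256])
```

The quadratic (B, N, N) matrix is exactly the overhead that motivates the interlaced sparse scheme, which factorizes it into two sparser relation matrices.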

Proceedings ArticleDOI
19 Jul 2018
TL;DR: A novel deep neural network with a co-attention mechanism for leveraging rich meta-path based context for top-N recommendation is proposed; it performs well in the cold-start scenario and has potentially good interpretability for the recommendation results.
Abstract: Heterogeneous information network (HIN) has been widely adopted in recommender systems due to its excellence in modeling complex context information. Although existing HIN based recommendation methods have achieved performance improvement to some extent, they have two major shortcomings. First, these models seldom learn an explicit representation for path or meta-path in the recommendation task. Second, they do not consider the mutual effect between the meta-path and the involved user-item pair in an interaction. To address these issues, we develop a novel deep neural network with the co-attention mechanism for leveraging rich meta-path based context for top-N recommendation. We elaborately design a three-way neural interaction model by explicitly incorporating meta-path based context. To construct the meta-path based context, we propose to use a priority based sampling technique to select high-quality path instances. Our model is able to learn effective representations for users, items and meta-path based context for implementing a powerful interaction function. The co-attention mechanism improves the representations for meta-path based context, users and items in a mutually enhancing way. Extensive experiments on three real-world datasets have demonstrated the effectiveness of the proposed model. In particular, the proposed model performs well in the cold-start scenario and has potentially good interpretability for the recommendation results.

Journal ArticleDOI
TL;DR: This work describes field note content for contextualizing an entire study as well as individual interviews and focus groups, and provides two "sketch note" guides (one for study context, one for individual interviews or focus groups) for use in the field.
Abstract: Field notes are widely recommended in qualitative research as a means of documenting needed contextual information. With growing use of data sharing, secondary analysis, and metasynthesis, field notes ensure rich context persists beyond the original research team. However, while widely regarded as essential, the literature offers no guide to field note collection for researchers. Using the qualitative literature and previous research experience, we provide a concise guide to collection, incorporation, and dissemination of field notes. We provide a description of field note content for contextualization of an entire study as well as individual interviews and focus groups. In addition, we provide two "sketch note" guides, one for study context and one for individual interviews or focus groups, for use in the field. Our guides are congruent with many qualitative and mixed methodologies and ensure contextual information is collected, stored, and disseminated as an essential component of ethical, rigorous qualitative research.

Journal ArticleDOI
TL;DR: The limitations of current statistical paradigms in mental health research are critiqued, and an introduction is provided to critical machine learning methods used in clinical studies, reinforcing the usefulness of these methods and providing evidence of their potential.
Abstract: Machine learning approaches for clinical psychology and psychiatry explicitly focus on learning statistical functions from multidimensional data sets to make generalizable predictions about individuals. The goal of this review is to provide an accessible understanding of why this approach is important for future practice given its potential to augment decisions associated with the diagnosis, prognosis, and treatment of people suffering from mental illness using clinical and biological data. To this end, the limitations of current statistical paradigms in mental health research are critiqued, and an introduction is provided to critical machine learning methods used in clinical studies. A selective literature review is then presented aiming to reinforce the usefulness of machine learning methods and provide evidence of their potential. In the context of promising initial results, the current limitations of machine learning approaches are addressed, and considerations for future clinical translation are outlined.

Journal ArticleDOI
26 Feb 2018
TL;DR: In this paper, the challenges and opportunities of blockchain for business process management (BPM) are outlined and a summary of seven research directions for investigating the application of blockchain technology in the context of BPM are presented.
Abstract: Blockchain technology offers a sizable promise to rethink the way interorganizational business processes are managed because of its potential to realize execution without a central party serving as a single point of trust (and failure). To stimulate research on this promise and the limits thereof, in this article, we outline the challenges and opportunities of blockchain for business process management (BPM). We first reflect how blockchains could be used in the context of the established BPM lifecycle and second how they might become relevant beyond. We conclude our discourse with a summary of seven research directions for investigating the application of blockchain technology in the context of BPM.

Journal ArticleDOI
TL;DR: Potential applications of lean and seru principles for Industry 4.0 are presented, and comparisons of seru with TPS and cell are given.
Abstract: This paper discusses production systems with a focus on the relationships between product supply and customer demand in the context of Industry 2.0–4.0. One driver of production evolution is change...

Proceedings ArticleDOI
02 May 2018
TL;DR: This article proposes a hypothesis-only baseline for diagnosing NLI that is able to significantly outperform a majority-class baseline across a number of NLI datasets, and shows that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
Abstract: We propose a hypothesis-only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on 10 distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority-class baseline across a number of NLI datasets. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
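
The baseline itself is almost trivial to express, which is part of the point: train a classifier that never sees the premise. A minimal sklearn sketch with made-up toy examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy NLI examples: (premise, hypothesis, label). The premise is deliberately
# ignored, which is the whole point of the hypothesis-only diagnostic.
train = [
    ("A man inspects a uniform.", "The man is sleeping.", "contradiction"),
    ("Two dogs run in a field.", "Animals are outdoors.", "entailment"),
    ("A woman reads a book.", "A person is reading.", "entailment"),
    ("Kids play soccer.", "Nobody is playing.", "contradiction"),
]
hypotheses = [h for _, h, _ in train]
labels = [y for _, _, y in train]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(hypotheses, labels)
print(clf.predict(["The man is sleeping."]))  # predicted without any premise
```

If such a model beats the majority class on a dataset, the hypotheses alone leak label information.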

Book ChapterDOI
08 Sep 2018
TL;DR: The authors proposed a new Equalizer model that encourages equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidences is present, which can be added to any description model in order to mitigate impacts of unwanted bias in a description dataset.
Abstract: Most machine learning methods are known to capture and exploit biases of the training data. While some biases are beneficial for learning, others are harmful. Specifically, image captioning models tend to exaggerate biases present in training data (e.g., if a word is present in 60% of training sentences, it might be predicted in 70% of sentences at test time). This can lead to incorrect captions in domains where unbiased captions are desired, or required, due to over-reliance on the learned prior and image context. In this work we investigate generation of gender-specific caption words (e.g. man, woman) based on the person’s appearance or the image context. We introduce a new Equalizer model that encourages equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidence is present. The resulting model is forced to look at a person rather than use contextual cues to make a gender-specific prediction. The losses that comprise our model, the Appearance Confusion Loss and the Confident Loss, are general, and can be added to any description model in order to mitigate impacts of unwanted bias in a description dataset. Our proposed model has lower error than prior work when describing images with people and mentioning their gender and more closely matches the ground truth ratio of sentences including women to sentences including men. Finally, we show that our model more often looks at people when predicting their gender (https://people.eecs.berkeley.edu/~lisa_anne/snowboard.html).
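
As a rough paraphrase rather than the paper's exact formulation, the Appearance Confusion Loss can be sketched as pushing the predicted probabilities of the gendered words toward equality when the person is occluded. The function, `gender_ids`, and all shapes below are hypothetical.

```python
import torch
import torch.nn.functional as F

def appearance_confusion_loss(logits_masked, gender_ids):
    """Encourage equal gendered-word probability on person-occluded inputs.

    logits_masked: (B, V) caption-word logits when the person is masked out.
    gender_ids: vocabulary indices of the gendered word pair, e.g. man/woman.
    A paraphrase of the paper's idea, not its exact loss.
    """
    probs = F.softmax(logits_masked, dim=-1)[:, gender_ids]  # (B, 2)
    probs = probs / probs.sum(dim=-1, keepdim=True)          # renormalize pair
    uniform = torch.full_like(probs, 0.5)
    return F.mse_loss(probs, uniform)

loss = appearance_confusion_loss(torch.randn(4, 1000), gender_ids=[10, 11])
print(loss.item())
```

The complementary Confident Loss would, conversely, reward peaked gendered-word predictions when the evidence is visible.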

Journal ArticleDOI
TL;DR: In this paper, the authors present results from a qualitative investigation among 80 stakeholders in 13 European cities to identify seven overtourism myths that may inhibit a well-rounded understanding of the concept and call for researchers from other disciplines to engage with the topic to come to new insights.
Abstract: In less than two years, the concept of overtourism has come to prominence as one of the most discussed issues with regards to tourism in popular media and, increasingly, academia. In spite of its popularity, the term is still not clearly delineated and remains open to multiple interpretations. The current paper aims to provide more clarity with regard to what overtourism entails by placing the concept in a historical context and presenting results from a qualitative investigation among 80 stakeholders in 13 European cities. Results highlight that overtourism describes an issue that is multidimensional and complex. Not only are the issues caused by tourism and nontourism stakeholders, but they should also be viewed in the context of wider societal and city developments. The article concludes by arguing that while the debate on overtourism has drawn attention again to the old problem of managing negative tourism impacts, it is not well conceptualized. Seven overtourism myths are identified that may inhibit a well-rounded understanding of the concept. To further a contextualized understanding of overtourism, the paper calls for researchers from other disciplines to engage with the topic to come to new insights.

Posted Content
Trieu H. Trinh1, Quoc V. Le1
TL;DR: Key to this method is the use of language models, trained on a massive amount of unlabeled data, to score multiple choice questions posed by commonsense reasoning tests; the resulting models outperform previous state-of-the-art methods by a large margin.
Abstract: Commonsense reasoning is a long-standing challenge for deep learning. For example, it is difficult to use neural networks to tackle the Winograd Schema dataset (Levesque et al., 2011). In this paper, we present a simple method for commonsense reasoning with neural networks, using unsupervised learning. Key to our method is the use of language models, trained on a massive amount of unlabeled data, to score multiple choice questions posed by commonsense reasoning tests. On both Pronoun Disambiguation and Winograd Schema challenges, our models outperform previous state-of-the-art methods by a large margin, without using expensive annotated knowledge bases or hand-engineered features. We train an array of large RNN language models that operate at word or character level on LM-1-Billion, CommonCrawl, SQuAD, Gutenberg Books, and a customized corpus for this task and show that diversity of training data plays an important role in test performance. Further analysis also shows that our system successfully discovers important features of the context that decide the correct answer, indicating a good grasp of commonsense knowledge.
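
The scoring recipe is simple to reproduce in miniature: substitute each candidate for the pronoun and keep the sentence the language model assigns lower loss. The sketch below uses off-the-shelf GPT-2 from the transformers library purely as a stand-in; the paper trains its own large RNN language models on the corpora it lists.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def lm_loss(sentence: str) -> float:
    """Mean per-token negative log-likelihood under the language model."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        return lm(ids, labels=ids).loss.item()

schema = "The trophy doesn't fit in the suitcase because {} is too big."
candidates = ["the trophy", "the suitcase"]
scores = {c: lm_loss(schema.format(c)) for c in candidates}
print(min(scores, key=scores.get))  # the lower-loss substitution wins
```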

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A context-aware synthesis approach that warps not only the input frames but also their pixel-wise contextual information and uses them to interpolate a high-quality intermediate frame, outperforming representative state-of-the-art approaches.
Abstract: Video frame interpolation algorithms typically estimate optical flow or its variations and then use it to guide the synthesis of an intermediate frame between two consecutive original frames. To handle challenges like occlusion, bidirectional flow between the two input frames is often estimated and used to warp and blend the input frames. However, how to effectively blend the two warped frames still remains a challenging problem. This paper presents a context-aware synthesis approach that warps not only the input frames but also their pixel-wise contextual information and uses them to interpolate a high-quality intermediate frame. Specifically, we first use a pre-trained neural network to extract per-pixel contextual information for input frames. We then employ a state-of-the-art optical flow algorithm to estimate bidirectional flow between them and pre-warp both input frames and their context maps. Finally, unlike common approaches that blend the pre-warped frames, our method feeds them and their context maps to a video frame synthesis neural network to produce the interpolated frame in a context-aware fashion. Our neural network is fully convolutional and is trained end to end. Our experiments show that our method can handle challenging scenarios such as occlusion and large motion and outperforms representative state-of-the-art approaches.
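
The pre-warping step is standard backward warping with optical flow, which in PyTorch reduces to building a sampling grid for grid_sample; the same operation is applied to the per-pixel context maps. A minimal sketch, with shapes and the zero test flow purely illustrative:

```python
import torch
import torch.nn.functional as F

def backward_warp(frame, flow):
    """Warp `frame` with per-pixel flow (in pixels) via bilinear sampling.

    frame: (B, C, H, W); flow: (B, 2, H, W) holding (x, y) displacements.
    """
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0).to(frame)
    coords = grid + flow                       # where to sample each pixel
    # Normalize coordinates to [-1, 1] as grid_sample expects.
    cx = 2 * coords[:, 0] / (w - 1) - 1
    cy = 2 * coords[:, 1] / (h - 1) - 1
    sample_grid = torch.stack((cx, cy), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(frame, sample_grid, align_corners=True)

warped = backward_warp(torch.randn(1, 3, 32, 32), torch.zeros(1, 2, 32, 32))
print(warped.shape)  # zero flow returns the frame unchanged
```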

Journal ArticleDOI
TL;DR: This research explores the current state of chronic diseases in the United States, using data from the Centers for Disease Control and Prevention and applying visualization and descriptive analytics techniques to discover possible correlations between variables in several categories.
Abstract: In this research we explore the current state of chronic diseases in the United States, using data from the Centers for Disease Control and Prevention and applying visualization and descriptive analytics techniques. Five main categories of variables are studied, namely chronic disease conditions, behavioral health, mental health, demographics, and overarching conditions. These are analyzed in the context of regions and states within the U.S. to discover possible correlations between variables in several categories. There are widespread variations in the prevalence of diverse chronic diseases, the number of hospitalizations for specific diseases, and the diagnosis and mortality rates for different states. Identifying such correlations is fundamental to developing insights that will help in the creation of targeted management, mitigation, and preventive policies, ultimately minimizing the risks and costs of chronic diseases. As the population ages and individuals suffer from multiple conditions, or comorbidity, it is imperative that the various stakeholders, including the government, non-governmental organizations (NGOs), policy makers, health providers, and society as a whole, address these adverse effects in a timely and efficient manner.

Journal ArticleDOI
TL;DR: A systematic review provides the scholarly community with a current synthesis of mobile learning research across 2010–2016 in higher education settings regarding the purposes, outcomes, methodologies, subject matter domains, educational level, educational context, device types and geographical distribution of studies.
Abstract: Mobile device ownership has exploded with the majority of adults owning more than one mobile device. The largest demographic of mobile device users is 18–29 year olds, which is also the typical age of college attendees. This systematic review provides the scholarly community with a current synthesis of mobile learning research across 2010–2016 in higher education settings regarding the purposes, outcomes, methodologies, subject matter domains, educational level, educational context, device types and geographical distribution of studies. Major findings include that the majority of the studies focused on the impact of mobile learning on student achievement. Language instruction was the most often researched subject matter domain. The findings reveal that 74% involved undergraduate students and 54% took place in a formal educational context. Higher education faculty are encouraged to consider the opportunity to expand their learning possibilities beyond the classroom with mobile learning.

Journal ArticleDOI
TL;DR: In this article, a post-structuralist paradigm is proposed for translingualism, which embeds communication in space and time, considering all resources as working together as an assemblage in shaping meaning.
Abstract: The expanding orientations to translingualism are motivated by a gradual shift from the structuralist paradigm that has been treated as foundational in modern linguistics. Structuralism encouraged scholars to consider language, like other social constructs, as organized as a self-defining and closed structure, set apart from spatiotemporal ‘context’ (which included diverse considerations such as history, geography, politics, and society). Translingualism calls for a shift from these structuralist assumptions to consider more mobile, expansive, situated, and holistic practices. In this article, I articulate how a poststructuralist paradigm might help us theorize and practice translingualism according to a spatial orientation that embeds communication in space and time, considering all resources as working together as an assemblage in shaping meaning. I illustrate from my ongoing research with international STEM scholars in a Midwestern American university to theorize how translingualism will redefine the role of constructs such as language, non-verbal artifacts, and context in communicative proficiency.

Posted Content
TL;DR: This work introduces an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context and shows qualitative and quantitative evidence that this produces more realistic conditional and unconditional text samples compared to a maximum likelihood trained model.
Abstract: Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous word, and are state-of-the-art for several machine translation and summarization benchmarks. These benchmarks are often defined by validation perplexity even though this is not a direct measure of the quality of the generated text. Additionally, these models are typically trained via maximum likelihood and teacher forcing. These methods are well-suited to optimizing perplexity but can result in poor sample quality since generating text requires conditioning on sequences of words that may have never been observed at training time. We propose to improve sample quality using Generative Adversarial Networks (GANs), which explicitly train the generator to produce high quality samples and have shown a lot of success in image generation. GANs were originally designed to output differentiable values, so discrete language generation is challenging for them. We claim that validation perplexity alone is not indicative of the quality of text generated by a model. We introduce an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context. We show, qualitatively and quantitatively, evidence that this produces more realistic conditional and unconditional text samples compared to a maximum likelihood trained model.

Proceedings ArticleDOI
16 May 2018
TL;DR: This paper proposes a data augmentation method for labeled sentences called contextual augmentation, which stochastically replaces words in a sentence with other paradigmatically related words predicted by a bi-directional language model at each word position.
Abstract: We propose a novel data augmentation method for labeled sentences called contextual augmentation. We assume an invariance that sentences are natural even if the words in the sentences are replaced with other words with paradigmatic relations. We stochastically replace words with other words that are predicted by a bi-directional language model at the word positions. Words predicted according to a context are numerous but appropriate for the augmentation of the original words. Furthermore, we retrofit a language model with a label-conditional architecture, which allows the model to augment sentences without breaking the label-compatibility. Through experiments on six different text classification tasks, we demonstrate that the proposed method improves classifiers based on the convolutional or recurrent neural networks.
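
With today's off-the-shelf masked language models, the core replacement step can be approximated in a few lines. The sketch below uses BERT via the transformers fill-mask pipeline as a stand-in; it omits the paper's label-conditional architecture, so it corresponds to an unconditioned variant that does not guard label-compatibility.

```python
from transformers import pipeline

# Predict context-appropriate substitutes for one masked word position.
fill = pipeline("fill-mask", model="bert-base-uncased")

sentence = "the actors are [MASK] ."
for cand in fill(sentence, top_k=5):
    print(cand["token_str"], round(cand["score"], 3))
# Each predicted word would yield one augmented copy of the labeled sentence.
```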

Journal ArticleDOI
TL;DR: The findings confirm the crucial importance of variable selection and the inability of current evaluation metrics to assess the biological significance of distribution models and recommend that researchers carefully select variables according to the species’ ecology and evaluate models only according to their capacity to be transferred to distant areas.
Abstract: Aim: Species distribution modelling, a family of statistical methods that predicts species distributions from a set of occurrences and environmental predictors, is now routinely applied in many macroecological studies. However, the reliability of evaluation metrics usually employed to validate these models remains questioned. Moreover, the emergence of online databases of environmental variables with global coverage, especially climatic, has favoured the use of the same set of standard predictors. Unfortunately, the selection of variables is too rarely based on a careful examination of the species’ ecology. In this context, our aim was to highlight the importance of selecting ad hoc variables in species distribution models, and to assess the ability of classical evaluation statistics to identify models with no biological realism. Innovation: First, we reviewed the current practices in the field of species distribution modelling in terms of variable selection and model evaluation. Then, we computed distribution models of 509 European species using pseudo-predictors derived from paintings or using a real set of climatic and topographic predictors. We calculated model performance based on the area under the receiver operating curve (AUC) and true skill statistics (TSS), partitioning occurrences into training and test data with different levels of spatial independence. Most models computed from pseudo-predictors were classified as good and sometimes were even better evaluated than models computed using real environmental variables. However, on average they were better discriminated when the partitioning of occurrences allowed testing for model transferability. Main conclusions: These findings confirm the crucial importance of variable selection and the inability of current evaluation metrics to assess the biological significance of distribution models. We recommend that researchers carefully select variables according to the species’ ecology and evaluate models only according to their capacity to be transferred to distant areas. Nevertheless, statistics of model evaluations must still be interpreted with great caution.
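
For reference, the two evaluation statistics the study scrutinizes are straightforward to compute from a fitted model's predictions. A sketch on synthetic presence/absence data, using TSS = sensitivity + specificity - 1:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def auc_and_tss(y_true, y_prob, threshold=0.5):
    """AUC plus the true skill statistic at a chosen threshold."""
    auc = roc_auc_score(y_true, y_prob)
    y_pred = (y_prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    tss = tp / (tp + fn) + tn / (tn + fp) - 1
    return auc, tss

# Synthetic occurrences and suitability scores, for illustration only.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
p = np.clip(0.6 * y + 0.5 * rng.random(200), 0, 1)
print(auc_and_tss(y, p))
```

High values of either statistic say nothing about whether the predictors were ecologically meaningful, which is the study's central warning.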

Journal ArticleDOI
TL;DR: In this paper, a comprehensive and detailed study of dynamical systems applications to cosmological models is presented, focusing on the late-time behaviour of our Universe and in particular on its accelerated expansion.

Proceedings Article
15 Feb 2018
TL;DR: This article proposes a simple and efficient framework for learning sentence representations from unlabeled data, which is based on the distributional hypothesis and recent work on learning sentence representations, and reformulates the problem of predicting the context in which a sentence appears as a classification problem.
Abstract: In this work we propose a simple and efficient framework for learning sentence representations from unlabelled data. Drawing inspiration from the distributional hypothesis and recent work on learning sentence representations, we reformulate the problem of predicting the context in which a sentence appears as a classification problem. Given a sentence and its context, a classifier distinguishes context sentences from other contrastive sentences based on their vector representations. This allows us to efficiently learn different types of encoding functions, and we show that the model learns high-quality sentence representations. We demonstrate that our sentence representations outperform state-of-the-art unsupervised and supervised representation learning methods on several downstream NLP tasks that involve understanding sentence semantics while achieving an order of magnitude speedup in training time.
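
The reformulation is compact: encode each sentence and its true context sentence, then let the other in-batch context vectors serve as contrastive candidates in a classification loss. A PyTorch sketch over random embeddings (the sentence encoders themselves are omitted and all shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def context_classification_loss(sent_emb, ctx_emb):
    """Classify the true context among in-batch contrastive candidates.

    sent_emb, ctx_emb: (B, D) encodings of sentences and their context
    sentences; row i of ctx_emb is the true context for row i of sent_emb.
    """
    logits = sent_emb @ ctx_emb.t()           # (B, B) pairwise scores
    targets = torch.arange(sent_emb.size(0))  # true pairs on the diagonal
    return F.cross_entropy(logits, targets)

loss = context_classification_loss(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```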