Open Access Proceedings Article

Sparse Additive Generative Models of Text

TL;DR
This approach has two key advantages: it can enforce sparsity to prevent overfitting, and it can combine generative facets through simple addition in log space, avoiding the need for latent switching variables.
Abstract
Generative models of text typically associate a multinomial with every class label or topic. Even in simple models this requires the estimation of thousands of parameters; in multi-faceted latent variable models, standard approaches require additional latent "switching" variables for every token, complicating inference. In this paper, we propose an alternative generative model for text. The central idea is that each class label or latent topic is endowed with a model of the deviation in log-frequency from a constant background distribution. This approach has two key advantages: we can enforce sparsity to prevent overfitting, and we can combine generative facets through simple addition in log space, avoiding the need for latent switching variables. We demonstrate the applicability of this idea to a range of scenarios: classification, topic modeling, and more complex multifaceted generative models.
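To make the central idea concrete, here is a minimal numerical sketch (the function name and toy numbers are invented for illustration; this is not the paper's implementation): each class label or topic contributes a sparse deviation vector that is simply added to the background log-frequencies, and the result is exponentiated and normalized.

```python
import numpy as np

def sage_word_distribution(log_background, *deviations):
    """Combine background log-frequencies with sparse per-facet
    deviations by simple addition in log space, then normalize."""
    logits = log_background + sum(deviations)
    logits -= logits.max()            # for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Toy vocabulary of five words. A topic facet boosts word 2 and a
# label facet suppresses word 4; all other deviations are exactly
# zero, which is the sparsity the model encourages.
m = np.log(np.array([0.4, 0.3, 0.1, 0.1, 0.1]))   # background
eta_topic = np.array([0.0, 0.0, 1.5, 0.0, 0.0])   # topic deviation
eta_label = np.array([0.0, 0.0, 0.0, 0.0, -2.0])  # label deviation
print(sage_word_distribution(m, eta_topic, eta_label))
```

Because facets combine by addition in log space, no per-token switching variable is needed to decide which facet generated each word.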



Citations
Journal Article

Structural topic models for open-ended survey responses

TL;DR: The structural topic model makes analyzing open-ended responses easier and more revealing, and allows treatment effects to be estimated; the approach is illustrated with analyses of text from surveys and experiments.
Journal Article

stm: An R Package for Structural Topic Models

TL;DR: This paper demonstrates how to use the R package stm for structural topic modeling, which allows researchers to flexibly estimate a topic model that includes document-level metadata.
Journal Article

A model of text for experimentation in the social sciences

TL;DR: A hierarchical mixed membership model for analyzing the topical content of documents, in which mixing weights are parameterized by observed covariates, is posited; this enables researchers to introduce elements of the experimental design that informed document collection into the model, within a generally applicable framework.
Proceedings Article

Discovering geographical topics in the twitter stream

TL;DR: An algorithm is presented that models diversity in tweets along topical and geographical dimensions together with each user's interest distribution; sparse factorial coding of the attributes allows it to handle a large and diverse set of covariates efficiently.
Proceedings Article

RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models

TL;DR: Pretrained LMs are found to degenerate into toxic text even from seemingly innocuous prompts; an empirical assessment of several controllable generation methods finds that, while data- or compute-intensive methods are more effective at steering away from toxicity than simpler solutions, no current method is failsafe against neural toxic degeneration.
References
Journal Article

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
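For readers who want to try this, the penalized form of the lasso (equivalent to the constrained form described above) is available in scikit-learn; a minimal sketch on synthetic data (all data here is invented):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
true_coef = np.zeros(20)
true_coef[:3] = [2.0, -1.5, 1.0]      # only 3 informative features
y = X @ true_coef + 0.1 * rng.normal(size=100)

# alpha scales the L1 penalty; larger values zero out more coefficients
model = Lasso(alpha=0.1).fit(X, y)
print(np.flatnonzero(model.coef_))    # indices of surviving features
```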
Journal Article

Latent Dirichlet allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
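As a usage illustration (a sketch, not the paper's code; it relies on scikit-learn's variational implementation and a made-up four-document corpus):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat",
    "dogs and cats are friendly pets",
    "stocks fell as the markets closed",
    "investors sold shares amid market fears",
]
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
# components_ holds per-topic word weights; transform gives per-document
# topic proportions, which should separate pet talk from market talk.
print(lda.transform(counts).round(2))
```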
Journal Article

Sparse Bayesian learning and the relevance vector machine

TL;DR: It is demonstrated that by exploiting a probabilistic Bayesian learning framework, the 'relevance vector machine' (RVM) can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages.
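scikit-learn does not include an RVM, but its ARDRegression implements the closely related automatic relevance determination prior, which likewise prunes irrelevant coefficients; a minimal sketch on synthetic data (invented for illustration):

```python
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 10))
w = np.zeros(10)
w[[0, 4]] = [1.0, -2.0]               # sparse ground truth
y = X @ w + 0.05 * rng.normal(size=80)

# Per-coefficient precision hyperpriors drive irrelevant weights
# toward zero, mirroring the RVM's pruning of basis functions.
ard = ARDRegression().fit(X, y)
print(np.round(ard.coef_, 2))
```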
Posted Content

Supervised Topic Models

TL;DR: This article proposes supervised latent Dirichlet allocation (sLDA), a statistical model of labeled documents that accommodates a variety of response types, and derives an approximate maximum-likelihood procedure for parameter estimation that relies on variational methods to handle intractable posterior expectations.