Open Access

Frequency tables: tests, effect sizes, and explorations

TLDR
Very often, the data the authors study as linguists are discrete in nature: the linguistic elements they study come in different categories and, trivially, two elements that are labeled the same belong to the same category, while two elements that are labeled differently belong to different categories.
Abstract
Very often, the data we study as linguists are discrete in nature. That is, the linguistic elements we study come in different categories and, trivially, if two elements are labeled the same, they belong to the same category, and if they are labeled differently, they belong to different categories. In statistical approaches, this kind of scenario is usually described with the terminology of variables (or factors) and their levels. For example, when direct objects are studied, it may be interesting to describe them in terms of the part of speech of the direct object's head. In other terminology, each direct object studied is then described with regard to the variable PART OF SPEECH by assigning a particular variable level to it; depending on what the direct objects look like, the following levels are conceivable: PART OF SPEECH: LEXICAL NOUN, PART OF SPEECH: PRONOUN, PART OF SPEECH: SEMIPRONOUN (such as matters or things), etc. Trivially, if direct objects are categorized this way, then a direct object whose head is categorized as PART OF SPEECH: PRONOUN is, for the purposes of this analysis, identical to another one whose head is categorized as PART OF SPEECH: PRONOUN and different from one whose head is categorized as PART OF SPEECH: LEXICAL NOUN. On other occasions, the observed variables are actually continuous rather than discrete, but for the purposes of an analysis they may be grouped into two or more categories such as
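The variable-and-level terminology in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six observations are invented, and the helper name `frequency_table` is hypothetical, not something taken from the paper. Each direct object receives exactly one level of the variable PART OF SPEECH, and the frequency table counts how often each level occurs.

```python
from collections import Counter

# Hypothetical annotations (invented for illustration): one level of the
# variable PART OF SPEECH per direct object studied.
observations = [
    "LEXICAL NOUN", "PRONOUN", "LEXICAL NOUN",
    "SEMIPRONOUN", "PRONOUN", "LEXICAL NOUN",
]


def frequency_table(levels):
    """Return (level, count, proportion) rows, most frequent level first."""
    counts = Counter(levels)
    total = sum(counts.values())
    return [(level, n, n / total) for level, n in counts.most_common()]


table = frequency_table(observations)
for level, n, share in table:
    print(f"PART OF SPEECH: {level:<13} {n:>2}  {share:.2f}")
```

Two observations with the same label fall into the same row of the table, which is the sense in which they count as identical for the purposes of the analysis.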


Citations
Journal Article

Analyzing linguistic data: a practical introduction to statistics using R

TL;DR: In about 350 pages, the author guides the reader from descriptive and basic statistical methods, via classification and clustering, to (generalised) linear and mixed models, enabling researchers and students alike to reproduce the analyses and learn by doing.
Journal Article

Corpus linguistics and naive discriminative learning

TL;DR: Three classifiers from machine learning are compared with a naive discriminative learning classifier, derived from basic principles of error-driven learning characterizing animal and human learning, which emerges with state-of-the-art predictive accuracy.
Journal Article

Predicting is not explaining: targeted learning of the dative alternation

TL;DR: The study aims to explain how native speakers of English choose one pattern over another in a given context, and it derives estimates and confidence regions for well-defined parameters that can be interpreted as the influence of each contextual variable on the outcome of the alternation.
Posted Content

Extending the Linear Model With R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Julian J. Faraway

Dissertation

Les grammaires de constructions à l'épreuve de l'empirie
