Wordbank: an open repository for developmental vocabulary data.

doi:10.1017/S0305000916000209

Journal Article•DOI•

Michael C. Frank¹, Mika Braginsky¹, Daniel Yurovsky¹, Virginia A. Marchman¹•Institutions (1)

Stanford University¹

01 May 2017-Journal of Child Language (J Child Lang)-Vol. 44, Iss: 3, pp 677-694

TL;DR: Wordbank is a structured database of parent-report (CDI) data combined with a browsable web interface for exploring patterns of vocabulary growth at the level of both individual children and particular words.

Abstract: The MacArthur-Bates Communicative Development Inventories (CDIs) are a widely used family of parent-report instruments for easy and inexpensive data-gathering about early language acquisition. CDI data have been used to explore a variety of theoretically important topics, but, with few exceptions, researchers have had to rely on data collected in their own lab. In this paper, we remedy this issue by presenting Wordbank, a structured database of CDI data combined with a browsable web interface. Wordbank archives CDI data across languages and labs, providing a resource for researchers interested in early language, as well as a platform for novel analyses. The site allows interactive exploration of patterns of vocabulary growth at the level of both individual children and particular words. We also introduce wordbankr, a software package for connecting to the database directly. Together, these tools extend the abilities of students and researchers to explore quantitative trends in vocabulary development.

Citations

Journal Article•DOI•

Individual Differences in Language Acquisition and Processing

Evan Kidd¹,², Seamus Donnelly², Morten H. Christiansen•Institutions (2)

Max Planck Society¹, Australian National University²

01 Feb 2018-Trends in Cognitive Sciences

TL;DR: It is argued that a focus on individual differences (IDs) provides a crucial source of evidence that bears strongly upon core issues in theories of the acquisition and processing of language; specifically, the role of experience in language acquisition, processing, and attainment, and the architecture of the language system.

230 citations

Cites background from "Wordbank: an open repository for de..."

...For instance, vocabulary production norms from Wordbank shows that a child in the 90th percentile at 16 months knows the same number of words as a child in 10th percentile at 26 months [37]....
...(A) Cross-sectional MacArthur-Bates Communicative Development Inventory vocabulary production data from 4687 English-speaking children aged 16–30 months [37]....

Journal Article•DOI•

A Collaborative Approach to Infant Research: Promoting Reproducibility, Best Practices, and Theory-Building

Michael C. Frank¹, Elika Bergelson², Christina Bergmann³, Alejandrina Cristia³, Caroline Floccia⁴, Judit Gervain⁵, J. Kiley Hamlin⁶, Erin E. Hannon⁷, Melissa Kline⁸, Claartje Levelt⁹, Casey Lew-Williams¹⁰, Thierry Nazzi⁵, Robin Panneton¹¹, Hugh Rabagliati¹², Melanie Soderstrom¹³, Jessica Sullivan¹⁴, Sandra R. Waxman¹⁵, Daniel Yurovsky¹⁶•Institutions (16)

Stanford University¹, Duke University², École Normale Supérieure³, University of Plymouth⁴, Paris Descartes University⁵, University of British Columbia⁶, University of Nevada, Las Vegas⁷, Harvard University⁸, Leiden University⁹, Princeton University¹⁰, Virginia Tech¹¹, University of Edinburgh¹², University of Manitoba¹³, Skidmore College¹⁴, Northwestern University¹⁵, University of Chicago¹⁶

01 Jul 2017-Infancy

TL;DR: The ManyBabies project, the instantiation of this proposal, will not only help to estimate how robust and replicable these phenomena are, but also yield new theoretical insights into how they vary across ages, linguistic communities, and measurement methods.

Abstract: The ideal of scientific progress is that we accumulate measurements and integrate these into theory, but recent discussion of replicability issues has cast doubt on whether psychological research conforms to this model. Developmental research—especially with infant participants—also has discipline-specific replicability challenges, including small samples and limited measurement methods. Inspired by collaborative replication efforts in cognitive and social psychology, we describe a proposal for assessing and promoting replicability in infancy research: large-scale, multi-laboratory replication efforts aiming for a more precise understanding of key developmental phenomena. The ManyBabies project, our instantiation of this proposal, will not only help us estimate how robust and replicable these phenomena are, but also gain new theoretical insights into how they vary across ages, linguistic communities, and measurement methods. This project has the potential for a variety of positive outcomes, including less-biased estimates of theoretically important effects, estimates of variability that can be used for later study planning, and a series of best-practices blueprints for future infancy research.

198 citations

Cites background from "Wordbank: an open repository for de..."

...1Although both the recent Infancy registered reports submission route and efforts for sharing observational data are notable exceptions; cf. Adolph, Gilmore, Freeman, Sanderson, & Millman, 2012; Frank et al., 2016; MacWhinney, 2000; Rose & MacWhinney, 2014; VanDam et al., 2016)....

Journal Article•

Early Semantic Networks: Preferential Attachment or Preferential Acquisition?

Josita Maouene

01 Jan 2009-Psychological Science

TL;DR: Two alternative growth principles are introduced and tested: preferential acquisition, in which words enter the lexicon not because they are related to well-connected words but because they connect well to other words in the learning environment, and the lure of the associates, in which new words are favored in proportion to their connections with known words.

Abstract: Analyses of adult semantic networks suggest a learning mechanism involving preferential attachment: A word is more likely to enter the lexicon the more connected the known words to which it is related. We introduce and test two alternative growth principles: preferential acquisition—words enter the lexicon not because they are related to well-connected words, but because they connect well to other words in the learning environment—and the lure of the associates—new words are favored in proportion to their connections with known words. We tested these alternative principles using longitudinal analyses of developing networks of 130 nouns children learn prior to the age of 30 months. We tested both networks with links between words represented by features and networks with links represented by associations. The feature networks did not predict age of acquisition using any growth model. The associative networks grew by preferential acquisition, with the best model incorporating word frequency, number of phonological neighbors, and connectedness of the new word to words in the learning environment, as operationalized by connectedness to words typically acquired by the age of 30 months.
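
The three growth principles named in this abstract can be compared on a toy network. The sketch below is only an illustrative reading of the abstract, not the authors' implementation: the word list, the edges, the set of known words, and all function names are hypothetical, and counting a known word's connections only among other known words is an assumption.

```python
from statistics import mean

# Hypothetical toy network of undirected links between words.
EDGES = [
    ("dog", "cat"), ("dog", "ball"), ("dog", "bone"),
    ("cat", "milk"), ("milk", "cup"), ("cup", "spoon"),
    ("ball", "car"),
]


def build_graph(edges):
    """Return an adjacency map {word: set of linked words}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def preferential_attachment(graph, known, word):
    """Mean connectedness of the known words that the candidate links to.

    Assumption: a known word's connectedness is counted among known words.
    """
    known_neighbours = graph[word] & known
    if not known_neighbours:
        return 0.0
    return mean(len(graph[n] & known) for n in known_neighbours)


def lure_of_associates(graph, known, word):
    """Number of links from the candidate word to already-known words."""
    return len(graph[word] & known)


def preferential_acquisition(graph, known, word):
    """Connectedness of the candidate word in the whole learning environment."""
    return len(graph[word])


GRAPH = build_graph(EDGES)
KNOWN = {"dog", "cat", "ball"}  # hypothetical current lexicon

if __name__ == "__main__":
    for candidate in sorted(set(GRAPH) - KNOWN):
        print(
            candidate,
            preferential_attachment(GRAPH, KNOWN, candidate),
            lure_of_associates(GRAPH, KNOWN, candidate),
            preferential_acquisition(GRAPH, KNOWN, candidate),
        )
```

Each function scores a not-yet-known word; under the corresponding principle, higher-scoring words would be expected to enter the lexicon earlier. The three scores can rank the same candidates differently, which is what makes the principles empirically distinguishable.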

198 citations

Journal Article•DOI•

Quantifying the contribution of recessive coding variation to developmental disorders.

Hilary C. Martin¹, Wendy D Jones¹,², Rebecca E. McIntyre¹, Gabriela Sánchez-Andrade¹, Mark Sanderson¹, James Stephenson¹,³, Carla P. Jones¹, Juliet Handsaker¹, Giuseppe Gallone¹, Michaela Bruntraeger¹, Jeremy F. McRae¹, Elena Prigmore¹, Patrick J. Short¹, Mari Niemi¹, Joanna Kaplanis¹, Elizabeth J. Radford¹,⁴, Nadia Akawi⁵, Meena Balasubramanian⁶, John Dean⁷, Rachel Horton⁸, Alice Hulbert, Diana S. Johnson⁶, Katie Johnson⁹, Dhavendra Kumar¹⁰, Sally Ann Lynch¹¹, Sarju G. Mehta⁴, Jenny Morton, Michael J. Parker¹¹, Miranda Splitt¹², Peter D. Turnpenny, Pradeep C. Vasudevan¹³, Michael Wright¹², Andrew R. Bassett¹, Sebastian S. Gerety¹, Caroline F. Wright¹⁴, David R. FitzPatrick¹⁵, Helen V. Firth¹,⁴, Matthew E. Hurles¹, Jeffrey C. Barrett¹•Institutions (15)

Wellcome Trust Sanger Institute¹, Great Ormond Street Hospital², European Bioinformatics Institute³, Cambridge University Hospitals NHS Foundation Trust⁴, University of Oxford⁵, Northern General Hospital⁶, Aberdeen Royal Infirmary⁷, Princess Anne Hospital⁸, University of Nottingham⁹, University Hospital of Wales¹⁰, Boston Children's Hospital¹¹, Newcastle upon Tyne Hospitals NHS Foundation Trust¹², University Hospitals of Leicester NHS Trust¹³, Royal Devon and Exeter Hospital¹⁴, Medical Research Council¹⁵

07 Dec 2018-Science

TL;DR: The results suggest that recessive coding variants account for a small fraction of currently undiagnosed nonconsanguineous individuals, and that the role of noncoding variants, incomplete penetrance, and polygenic mechanisms need further exploration.

Abstract: We estimated the genome-wide contribution of recessive coding variation in 6040 families from the Deciphering Developmental Disorders study. The proportion of cases attributable to recessive coding variants was 3.6% in patients of European ancestry, compared with 50% explained by de novo coding mutations. It was higher (31%) in patients with Pakistani ancestry, owing to elevated autozygosity. Half of this recessive burden is attributable to known genes. We identified two genes not previously associated with recessive developmental disorders, KDM5B and EIF3F, and functionally validated them with mouse and cellular models. Our results suggest that recessive coding variants account for a small fraction of currently undiagnosed nonconsanguineous individuals, and that the role of noncoding variants, incomplete penetrance, and polygenic mechanisms need further exploration.

146 citations

Journal Article•DOI•

The Developing Infant Creates a Curriculum for Statistical Learning

Linda B. Smith¹, Swapnaa Jayaraman¹, Elizabeth M. Clerkin¹, Chen Yu¹•Institutions (1)

Indiana University¹

01 Apr 2018-Trends in Cognitive Sciences

TL;DR: Future advances in computational models will be necessary to connect the developmentally changing content and statistics of infant experience to the internal machinery that does the learning.

142 citations

References

Journal Article•

R: A language and environment for statistical computing.

R Core Team

01 Jan 2014-MSOR connections

TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.

Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Book•

ggplot2: Elegant Graphics for Data Analysis

Hadley Wickham

13 Aug 2009

TL;DR: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics.

Abstract: This book describes ggplot2, a new data visualization package for R that uses the insights from Leland Wilkinson's Grammar of Graphics to create a powerful and flexible system for creating data graphics. With ggplot2, it's easy to: produce handsome, publication-quality plots, with automatic legends created from the plot specification; superpose multiple layers (points, lines, maps, tiles, box plots, to name a few) from different data sources, with automatically adjusted common scales; add customisable smoothers that use the powerful modelling capabilities of R, such as loess, linear models, generalised additive models and robust regression; save any ggplot2 plot (or part thereof) for later modification or reuse; create custom themes that capture in-house or journal style requirements, and that can easily be applied to multiple plots; and approach your graph from a visual perspective, thinking about how each component of the data is represented on the final plot. This book will be useful to everyone who has struggled with displaying their data in an informative and attractive way. You will need some basic knowledge of R (i.e. you should be able to get your data into R), but ggplot2 is a mini-language specifically tailored for producing graphics, and you'll learn everything you need in the book. After reading this book you'll be able to produce graphics customized precisely for your problems, and you'll find it easy to get graphics out of your head and on to the screen or page.

29,504 citations

Book•

A First Language: The Early Stages

Roger Brown¹•Institutions (1)

Harvard University¹

01 Jan 1973

TL;DR: This book studies the early stages of grammatical constructions and the meanings they convey in pre-school children, and shows that the order in which fourteen grammatical morphemes are acquired is almost identical across children and is predicted by the morphemes' relative semantic and grammatical complexity.

Abstract: For many years, Roger Brown and his colleagues have studied the developing language of pre-school children--the language that ultimately will permit them to understand themselves and the world around them. This longitudinal research project records the conversational performances of three children, studying both semantic and grammatical aspects of their language development. These core findings are related to recent work in psychology and linguistics--and especially to studies of the acquisition of languages other than English, including Finnish, German, Korean, and Samoan. Roger Brown has written the most exhaustive and searching analysis yet undertaken of the early stages of grammatical constructions and the meanings they convey. The five stages of linguistic development Brown establishes are measured not by chronological age--since children vary greatly in the speed at which their speech develops--but by mean length of utterance. This volume treats the first two stages. Stage I is the threshold of syntax, when children begin to combine words to make sentences. These sentences, Brown shows, are always limited to the same small set of semantic relations: nomination, recurrence, disappearance, attribution, possession, agency, and a few others. Stage II is concerned with the modulations of basic structural meanings--modulations for number, time, aspect, specificity--through the gradual acquisition of grammatical morphemes such as inflections, prepositions, articles, and case markers. Fourteen morphemes are studied in depth and it is shown that the order of their acquisition is almost identical across children and is predicted by their relative semantic and grammatical complexity.
It is, ultimately, the intent of this work to focus on the nature and development of knowledge: knowledge concerning grammar and the meanings coded by grammar; knowledge inferred from performance, from sentences and the settings in which they are spoken, and from signs of comprehension or incomprehension of sentences.

4,302 citations

Reference Entry•DOI•

Peabody Picture Vocabulary Test

Jonathan M. Campbell¹, Aila K. Dommestrup¹•Institutions (1)

University of Georgia¹

30 Jan 2010

TL;DR: The Peabody Picture Vocabulary Test (PPVT) is an individually administered, norm-referenced test of single-word receptive (or hearing) vocabulary.

Abstract: The Peabody Picture Vocabulary Test (PPVT) is an individually administered, norm-referenced test of single-word receptive (or hearing) vocabulary. Originally published in 1959, the PPVT has been revised several times and currently exists in its fourth edition (PPVT-4; Dunn & Dunn, 2007). In addition to assessing receptive vocabulary, test authors report that the PPVT-4 may be used as a means of estimating verbal development (Dunn & Dunn, 2007). Normed with a sample of 3,540 individuals ages 2½ to 90 representative of March 2004 U.S. census data, the PPVT-4 features two parallel forms (Form A and Form B), each consisting of 228 test items. Items consist of two stimuli, a word spoken by the examiner and four pictures on a single card; the examinee selects the picture that best represents the examiner's spoken word. Raw scores may be translated into age-based standard scores (i.e., M = 100; SD = 15), percentile ranks, stanines, age equivalents, and grade equivalents. Keywords: receptive language; language screener
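
The relationship between the standard-score scale (M = 100; SD = 15) and percentile ranks mentioned in this abstract can be illustrated with a short calculation. This is a minimal sketch that assumes standard scores are normally distributed; the actual test derives its scores from published norm tables, and the function name `percentile_rank` is hypothetical.

```python
from statistics import NormalDist

# Score scale stated in the abstract: mean 100, standard deviation 15.
STANDARD_SCORES = NormalDist(mu=100, sigma=15)


def percentile_rank(standard_score: float) -> float:
    """Approximate percentile rank, assuming normally distributed scores."""
    return 100 * STANDARD_SCORES.cdf(standard_score)


if __name__ == "__main__":
    for score in (70, 85, 100, 115, 130):
        rank = percentile_rank(score)
        print(f"standard score {score}: about percentile {rank:.1f}")
```

Under this assumption a standard score of 100 sits at the 50th percentile, and each 15-point step corresponds to one standard deviation of the reference population.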

4,281 citations

Journal Article•DOI•

ggplot2: Elegant Graphics for Data Analysis

Pedro M. Valero-Mora

30 Jul 2010-Journal of Statistical Software

TL;DR: ggplot2 is an implementation in R of The Grammar of Graphics, a systematic approach to the specification of statistical graphics that was introduced in a book previously reviewed in the Journal of Statistical Software by Cox (2007).

Abstract: ggplot2: Elegant Graphics for Data Analysis is a new addition to the UseR! series by Springer, probably the fastest expanding source of resources for computational statistics at the current moment. The books in this series are all linked with R, either presenting a new package developed by the book's own authors or describing how to apply statistical techniques with the different packages available in R. ggplot2 is an implementation in R of The Grammar of Graphics (Wilkinson 2005), a systematic approach to the specification of statistical graphics that was introduced in a book previously reviewed in the Journal of Statistical Software by Cox (2007). This implementation has been developed by Hadley Wickham, who is also the author of the book reviewed here.

4,089 citations