Showing papers in "Behavior Research Methods in 2012"


Journal ArticleDOI
Winter Mason1, Siddharth Suri1
TL;DR: It is shown that when taken as a whole Mechanical Turk can be a useful tool for many researchers, and how the behavior of workers compares with that of experts and laboratory subjects is discussed.
Abstract: Amazon’s Mechanical Turk is an online labor market where requesters post jobs and workers choose which jobs to do for pay. The central purpose of this article is to demonstrate how to use this Web site for conducting behavioral research and to lower the barrier to entry for researchers who could benefit from this platform. We describe general techniques that apply to a variety of types of research and experiments across disciplines. We begin by discussing some of the advantages of doing experiments on Mechanical Turk, such as easy access to a large, stable, and diverse subject pool, the low cost of doing experiments, and faster iteration between developing theory and executing experiments. While other methods of conducting behavioral research may be comparable to or even better than Mechanical Turk on one or more of the axes outlined above, we will show that when taken as a whole Mechanical Turk can be a useful tool for many researchers. We will discuss how the behavior of workers compares with that of experts and laboratory subjects. Then we will illustrate the mechanics of putting a task on Mechanical Turk, including recruiting subjects, executing the task, and reviewing the work that was submitted. We also provide solutions to common problems that a researcher might face when executing their research on this platform, including techniques for conducting synchronous experiments, methods for ensuring high-quality work, how to keep data private, and how to maintain code security.

2,521 citations


Journal ArticleDOI
TL;DR: OpenSesame is a graphical experiment builder for the social sciences that features a comprehensive and intuitive graphical user interface and supports Python scripting for complex tasks.
Abstract: In the present article, we introduce OpenSesame, a graphical experiment builder for the social sciences. OpenSesame is free, open-source, and cross-platform. It features a comprehensive and intuitive graphical user interface and supports Python scripting for complex tasks. Additional functionality, such as support for eyetrackers, input devices, and video playback, is available through plug-ins. OpenSesame can be used in combination with existing software for creating experiments.

1,666 citations


Journal ArticleDOI
TL;DR: This megastudy presents age-of-acquisition ratings for 30,121 English content words (nouns, verbs, and adjectives) using the Web-based crowdsourcing technology offered by the Amazon Mechanical Turk to indicate that the ratings collected are as valid and reliable as those collected in laboratory conditions.
Abstract: We present age-of-acquisition (AoA) ratings for 30,121 English content words (nouns, verbs, and adjectives). For data collection, this megastudy used the Web-based crowdsourcing technology offered by the Amazon Mechanical Turk. Our data indicate that the ratings collected in this way are as valid and reliable as those collected in laboratory conditions (the correlation between our ratings and those collected in the lab from U.S. students reached .93 for a subsample of 2,500 monosyllabic words). We also show that our AoA ratings explain a substantial percentage of the variance in the lexical-decision data of the English Lexicon Project, over and above the effects of log frequency, word length, and similarity to other words. This is true not only for the lemmas used in our rating study, but also for their inflected forms. We further discuss the relationships of AoA with other predictors of word recognition and illustrate the utility of AoA ratings for research on vocabulary growth.

873 citations


Journal ArticleDOI
TL;DR: A large-scale study with Dutch and Korean speakers of L2 English tested whether LexTALE, a 5-min vocabulary test, is a valid predictor of English vocabulary knowledge and, possibly, even of general English proficiency and showed that it was generally superior to self-ratings in its predictions.
Abstract: The increasing number of experimental studies on second language (L2) processing, frequently with English as the L2, calls for a practical and valid measure of English vocabulary knowledge and proficiency. In a large-scale study with Dutch and Korean speakers of L2 English, we tested whether LexTALE, a 5-min vocabulary test, is a valid predictor of English vocabulary knowledge and, possibly, even of general English proficiency. Furthermore, the validity of LexTALE was compared with that of self-ratings of proficiency, a measure frequently used by L2 researchers. The results showed the following in both speaker groups: (1) LexTALE was a good predictor of English vocabulary knowledge; (2) it also correlated substantially with a measure of general English proficiency; and (3) LexTALE was generally superior to self-ratings in its predictions. LexTALE, but not self-ratings, also correlated highly with previous experimental data on two word recognition paradigms. The test can be carried out on or downloaded from www.lextale.com.

745 citations


Journal ArticleDOI
TL;DR: Accuracy tests on a low-cost, open-source I/O board (Arduino family) show that it may be an inexpensive tool for many psychological and neurophysiological labs.
Abstract: Typical experiments in psychological and neurophysiological settings often require the accurate control of multiple input and output signals. These signals are often generated or recorded via computer software and/or external dedicated hardware. Dedicated hardware is usually very expensive and requires additional software to control its behavior. In the present article, I present some accuracy tests on a low-cost and open-source I/O board (Arduino family) that may be useful in many lab environments. One of the strengths of Arduinos is the possibility they afford to load the experimental script on the board’s memory and let it run without interfacing with computers or external software, thus granting complete independence, portability, and accuracy. Furthermore, a large community has arisen around the Arduino idea and offers many hardware add-ons and hundreds of free scripts for different projects. Accuracy tests show that Arduino boards may be an inexpensive tool for many psychological and neurophysiological labs.

329 citations


Journal ArticleDOI
TL;DR: This article investigates the use of three further factors—namely, the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD)—that have been used to provide improved performance elsewhere and introduces an additional semantic task and explores the advantages of using a much larger corpus.
Abstract: In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating the use of three further factors--namely, the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD)--that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.

283 citations
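
As a rough illustration of the pipeline the abstract above describes (co-occurrence counts from a very small window, pointwise mutual information weighting, and a cosine comparison), here is a minimal Python sketch. The function names, the toy sentence used to try it out, and the choice to clip negative PMI values to zero are assumptions of this sketch rather than details taken from the article, and the SVD step is left out entirely.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, window=1):
    """Count context words within `window` positions of each target word."""
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[target][tokens[j]] += 1
    return counts


def pmi_vectors(counts):
    """Turn raw counts into sparse PMI vectors (negative values clipped to 0).

    Clipping is an assumption of this sketch; it keeps the vectors sparse.
    """
    total = sum(sum(ctx.values()) for ctx in counts.values())
    target_totals = {t: sum(ctx.values()) for t, ctx in counts.items()}
    context_totals = Counter()
    for ctx in counts.values():
        context_totals.update(ctx)

    vectors = {}
    for target, ctx in counts.items():
        vec = {}
        for context, n in ctx.items():
            # PMI = log2( p(target, context) / (p(target) * p(context)) )
            pmi = math.log2(n * total / (target_totals[target] * context_totals[context]))
            if pmi > 0:
                vec[context] = pmi
        vectors[target] = vec
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

With a real corpus, `tokens` would be the tokenized text, and words that occur in similar contexts would receive cosine values closer to 1.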


Journal ArticleDOI
TL;DR: The high correlation between the BLP and ELP data indicates that a high percentage of variance in lexical decision data sets is systematic variance, rather than noise, and that the results of megastudies are rather robust with respect to the selection and presentation of the stimuli.
Abstract: We present a new database of lexical decision times for English words and nonwords, for which two groups of British participants each responded to 14,365 monosyllabic and disyllabic words and the same number of nonwords for a total duration of 16 h (divided over multiple sessions). This database, called the British Lexicon Project (BLP), fills an important gap between the Dutch Lexicon Project (DLP; Keuleers, Diependaele, & Brysbaert, Frontiers in Language Sciences. Psychology, 1, 174, 2010) and the English Lexicon Project (ELP; Balota et al., 2007), because it applies the repeated measures design of the DLP to the English language. The high correlation between the BLP and ELP data indicates that a high percentage of variance in lexical decision data sets is systematic variance, rather than noise, and that the results of megastudies are rather robust with respect to the selection and presentation of the stimuli. Because of its design, the BLP makes the same analyses possible as the DLP, offering researchers an interesting new data set of word-processing times for mixed effects analyses and mathematical modeling. The BLP data are available at http://crr.ugent.be/blp and as Electronic Supplementary Materials.

251 citations


Journal ArticleDOI
TL;DR: MorePower 6.0 is a flexible freeware statistical calculator that computes sample size, effect size, and power statistics for factorial ANOVA designs and calculates Bayesian posterior probabilities for the null and alternative hypotheses based on formulas in Masson.
Abstract: MorePower 6.0 is a flexible freeware statistical calculator that computes sample size, effect size, and power statistics for factorial ANOVA designs. It also calculates relational confidence intervals for ANOVA effects based on formulas from Jarmasz and Hollands (Canadian Journal of Experimental Psychology 63:124-138, 2009), as well as Bayesian posterior probabilities for the null and alternative hypotheses based on formulas in Masson (Behavior Research Methods 43:679-690, 2011). The program is unique in affording direct comparison of these three approaches to the interpretation of ANOVA tests. Its high numerical precision and ability to work with complex ANOVA designs could facilitate researchers' attention to issues of statistical power, Bayesian analysis, and the use of confidence intervals for data interpretation. MorePower 6.0 is available at https://wiki.usask.ca/pages/viewpageattachments.action?pageId=420413544.

234 citations


Journal ArticleDOI
TL;DR: It is argued that confidence intervals for within-subjects ANOVA designs are best accomplished by adapting intervals proposed by Cousineau and Morey so that nonoverlapping CIs for individual means correspond to a confidence interval for their difference that does not include zero.
Abstract: The psychological and statistical literature contains several proposals for calculating and plotting confidence intervals (CIs) for within-subjects (repeated measures) ANOVA designs. A key distinction is between intervals supporting inference about patterns of means (and differences between pairs of means, in particular) and those supporting inferences about individual means. In this report, it is argued that CIs for the former are best accomplished by adapting intervals proposed by Cousineau (Tutorials in Quantitative Methods for Psychology, 1, 42–45, 2005) and Morey (Tutorials in Quantitative Methods for Psychology, 4, 61–64, 2008) so that nonoverlapping CIs for individual means correspond to a confidence interval for their difference that does not include zero. CIs for the latter can be accomplished by fitting a multilevel model. In situations in which both types of inference are of interest, the use of a two-tiered CI is recommended. Free, open-source, cross-platform software for such interval estimates and plots (and for some common alternatives) is provided in the form of R functions for one-way within-subjects and two-way mixed ANOVA designs. These functions provide an easy-to-use solution to the difficult problem of calculating and displaying within-subjects CIs.

179 citations
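
To make the starting point of the abstract above more concrete, the following Python sketch computes the participant-normalized intervals usually attributed to Cousineau (2005) with the correction factor from Morey (2008). It is only an illustration under stated assumptions: the function name and data layout are hypothetical, the critical t value must be supplied by the caller, and the further adaptation that the article itself argues for is not implemented here. The article's own software is a set of R functions, which this sketch does not reproduce.

```python
import math
from statistics import mean, stdev


def cousineau_morey_intervals(data, t_crit):
    """Return (condition mean, CI half-width) pairs for a one-way within-subjects design.

    data   : list of rows, one row per participant, one score per condition
    t_crit : critical t value for the desired confidence level and n - 1 df
             (supplied by the caller; e.g. about 4.303 for 95% with 3 participants)
    """
    n = len(data)
    n_cond = len(data[0])
    grand_mean = mean(x for row in data for x in row)

    # Cousineau normalization: remove each participant's own mean, add the grand mean.
    normalized = [[x - mean(row) + grand_mean for x in row] for row in data]

    # Morey correction for the variance shrinkage caused by the normalization.
    correction = math.sqrt(n_cond / (n_cond - 1))

    results = []
    for j in range(n_cond):
        column = [row[j] for row in normalized]
        standard_error = stdev(column) / math.sqrt(n)
        results.append((mean(column), t_crit * standard_error * correction))
    return results
```

If participants differ only by a constant offset, the normalization removes all variability and the half-widths collapse to zero, which is the sense in which these intervals reflect the pattern of condition means and not between-participant differences.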


Journal ArticleDOI
TL;DR: The Cambridge Car Memory Test (CCMT) was matched in format to the established Cambridge Face Memory Test, requiring recognition of exemplars across view and lighting change, and results showed high reliability (Cronbach's alpha = .84) and a range of scores suitable both for normal-range individual-difference studies and, potentially, for diagnosis of impairment.
Abstract: Many research questions require a within-class object recognition task matched for general cognitive requirements with a face recognition task. If the object task also has high internal reliability, it can improve accuracy and power in group analyses (e.g., mean inversion effects for faces vs. objects), individual-difference studies (e.g., correlations between certain perceptual abilities and face/object recognition), and case studies in neuropsychology (e.g., whether a prosopagnosic shows a face-specific or object-general deficit). Here, we present such a task. Our Cambridge Car Memory Test (CCMT) was matched in format to the established Cambridge Face Memory Test, requiring recognition of exemplars across view and lighting change. We tested 153 young adults (93 female). Results showed high reliability (Cronbach's alpha = .84) and a range of scores suitable both for normal-range individual-difference studies and, potentially, for diagnosis of impairment. The mean for males was much higher than the mean for females. We demonstrate independence between face memory and car memory (dissociation based on sex, plus a modest correlation between the two), including where participants have high relative expertise with cars. We also show that expertise with real car makes and models of the era used in the test significantly predicts CCMT performance. Surprisingly, however, regression analyses imply that there is an effect of sex per se on the CCMT that is not attributable to a stereotypical male advantage in car expertise.

177 citations
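
The reliability figure quoted in the abstract above is Cronbach's alpha. For readers unfamiliar with the statistic, here is a minimal Python sketch of the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The function name and the data layout (one row per participant, one column per item) are assumptions of this sketch, and no data from the article are used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores : list of rows, one row per participant, one column per test item
    """
    k = len(scores[0])
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Items that rank participants identically give an alpha of 1, and alpha drops as the items disagree with one another.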


Journal ArticleDOI
TL;DR: The EP adaptation of the Affective Norms for English Words for European Portuguese is shown to be a valid and useful tool that will allow researchers to control and/or manipulate the affective properties of stimuli, as well as to develop cross-linguistic studies.
Abstract: This study presents the adaptation of the Affective Norms for English Words (ANEW; Bradley & Lang, 1999a) for European Portuguese (EP). The EP adaptation of the ANEW was based on the affective ratings made by 958 college students who were EP native speakers. Subjects assessed about 60 words by considering the affective dimensions of valence, arousal, and dominance, using the Self-Assessment Manikin (SAM) in either a paper-and-pencil or a Web survey procedure. Results of the adaptation of the ANEW for EP are presented. Furthermore, the differences between EP, American (Bradley & Lang, 1999a), and Spanish (Redondo, Fraga, Padron, & Comesana, Behavior Research Methods, 39, 600–605, 2007) standardizations were explored. Results showed that the ANEW words were understood in a similar way by EP, American, and Spanish subjects, although some sex and cross-cultural differences were observed. The EP adaptation of the ANEW is shown to be a valid and useful tool that will allow researchers to control and/or manipulate the affective properties of stimuli, as well as to develop cross-linguistic studies. The normative values of EP adaptation of the ANEW can be downloaded at http://brm.psychonomic-journals.org/content/supplemental.

Journal ArticleDOI
TL;DR: The SUBTLEX-US corpus has been parsed with the CLAWS tagger, so that researchers have information about the possible word classes (parts‐of‐speech, or PoSs) of the entries, and five new columns have been added to the word frequency list.
Abstract: The SUBTLEX-US corpus has been parsed with the CLAWS tagger, so that researchers have information about the possible word classes (parts‐of‐speech, or PoSs) of the entries. Five new columns have been added to the SUBTLEX-US word frequency list: the dominant (most frequent) PoS for the entry, the frequency of the dominant PoS, the frequency of the dominant PoS relative to the entry’s total frequency, all PoSs observed for the entry, and the respective frequencies of these PoSs. Because the current definition of lemma frequency does not seem to provide word recognition researchers with useful information (as illustrated by a comparison of the lemma frequencies and the word form frequencies from the Corpus of Contemporary American English), we have not provided a column with this variable. Instead, we hope that the full list of PoS frequencies will help researchers to collectively determine which combination of frequencies is the most informative.

Journal ArticleDOI
TL;DR: A new method for scanpath comparison based on geometric vectors is validated, which compares scanpaths over multiple dimensions while retaining positional and sequential information, and is particularly relevant for “eye movements to nothing” in mental imagery and embodiment-of-cognition research, where satisfactory scan path comparison algorithms are lacking.
Abstract: Eye movement sequences, or scanpaths, vary depending on the stimulus characteristics and the task (Foulsham & Underwood, Journal of Vision, 8(2), 6:1-17, 2008; Land, Mennie, & Rusted, Perception, 28, 1311-1328, 1999). Common methods for comparing scanpaths, however, are limited in their ability to capture both the spatial and temporal properties of which a scanpath consists. Here, we validated a new method for scanpath comparison based on geometric vectors, which compares scanpaths over multiple dimensions while retaining positional and sequential information (Jarodzka, Holmqvist, & Nystrom, Symposium on Eye-Tracking Research and Applications (pp. 211-218), 2010). "MultiMatch" was tested in two experiments and pitted against ScanMatch (Cristino, Mathot, Theeuwes, & Gilchrist, Behavior Research Methods, 42, 692-700, 2010), the most comprehensive adaptation of the popular Levenshtein method. In Experiment 1, we used synthetic data, demonstrating the greater sensitivity of MultiMatch to variations in spatial position. In Experiment 2, real eye movement recordings were taken from participants viewing sequences of dots, designed to elicit scanpath pairs with commonalities known to be problematic for algorithms (e.g., when one scanpath is shifted in locus or when fixations fall on either side of an AOI boundary). The results illustrate the advantages of a multidimensional approach, revealing how two scanpaths differ. For instance, if one scanpath is the reverse copy of another, the difference is in the direction but not the positions of fixations; or if a scanpath is scaled down, the difference is in the length of the saccadic vectors but not in the overall shape. As well as having enormous potential for any task in which consistency in eye movements is important (e.g., learning), MultiMatch is particularly relevant for "eye movements to nothing" in mental imagery and embodiment-of-cognition research, where satisfactory scanpath comparison algorithms are lacking.
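
For context on the baseline mentioned in the abstract above, the Levenshtein approach treats a scanpath as a string of area-of-interest (AOI) labels and counts the insertions, deletions, and substitutions needed to turn one string into another. The Python sketch below shows only that plain edit distance; it is neither ScanMatch nor MultiMatch, and the single-letter AOI labels are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (e.g. strings of AOI labels)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]
```

Because the distance only sees AOI labels, two fixations that fall just on either side of an AOI boundary count as a full substitution, which is one of the limitations the abstract points to.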

Journal ArticleDOI
TL;DR: TripleR is presented, an R package for the calculation of social relations analyses (Kenny, 1994) based on round-robin designs that requires only minimal knowledge of R, and results can be exported for subsequent analyses to other software packages.
Abstract: In this article, we present TripleR, an R package for the calculation of social relations analyses (Kenny, 1994) based on round-robin designs. The scope of existing software solutions is ported to R and enhanced with previously unimplemented methods of significance testing in single groups (Lashley & Bond, 1997) and handling of missing values. The package requires only minimal knowledge of R, and results can be exported for subsequent analyses to other software packages. We demonstrate the use of TripleR with several didactic examples.

Journal ArticleDOI
TL;DR: The instructions not to count constitute the simplest and most efficient method of preventing counting in timing tasks, and further studies must now concentrate on the role of explicit instructions in people's experience of perception.
Abstract: The aim of the present study was to determine the best and easiest method of suppressing spontaneous counting in a temporal judgment task. Three classic methods used to avoid counting (instructions not to count, articulatory suppression, and administration of an interference task) were tested in temporal generalization, bisection, and reproduction tasks with two duration ranges (1–4 and 2–8 s). All three no-counting conditions prevented participants from counting; when it occurred, counting led to estimates that were more accurate and less variable and to violations of the fundamental scalar property of timing. With regard to the differences between the no-counting conditions, the interference task distorted time perception more strongly and increased variability in temporal estimates to a greater extent than did articulatory suppression, as well as the no-counting instructions condition. In addition, articulatory suppression produced more noise in behavioral outcome than did the no-counting instruction condition. In sum, although all methods have disadvantages, the instructions not to count actually constitute the simplest and most efficient method of preventing counting in timing tasks. However, further studies must now concentrate on the role of explicit instructions in our experience of perception.

Journal ArticleDOI
TL;DR: Age-related effects were found over all four tests, especially as age increased from young childhood through adulthood, indicating that the PEBL tests provide valid and versatile new research tools for measuring executive functions.
Abstract: The measurement of executive function has a long history in clinical and experimental neuropsychology. The goal of the present report was to determine the profile of behavior across the lifespan on four computerized measures of executive function contained in the recently developed Psychology Experiment Building Language (PEBL) test battery (http://pebl.sourceforge.net/) and to evaluate whether this pattern is comparable to data previously obtained with the non-PEBL versions of these tests. Participants (N = 1,223; ages, 5–89 years) completed the PEBL Trail Making Test (pTMT), the Wisconsin Card Sort Test (pWCST; Berg, Journal of General Psychology, 39, 15–22, 1948; Grant & Berg, Journal of Experimental Psychology, 38, 404–411, 1948), the Tower of London (pToL), or a time estimation task (Time-Wall). Age-related effects were found over all four tests, especially as age increased from young childhood through adulthood. For several tests and measures (including pToL and pTMT), age-related slowing was found as age increased in adulthood. Together, these findings indicate that the PEBL tests provide valid and versatile new research tools for measuring executive functions.

Journal ArticleDOI
Pan Liu1, Marc D. Pell1
TL;DR: To establish a valid database of vocal emotional stimuli in Mandarin Chinese, a set of Chinese pseudosentences were produced by four native Mandarin speakers to express seven emotional meanings: anger, disgust, fear, sadness, happiness, pleasant surprise, and neutrality.
Abstract: To establish a valid database of vocal emotional stimuli in Mandarin Chinese, a set of Chinese pseudosentences (i.e., semantically meaningless sentences that resembled real Chinese) were produced by four native Mandarin speakers to express seven emotional meanings: anger, disgust, fear, sadness, happiness, pleasant surprise, and neutrality. These expressions were identified by a group of native Mandarin listeners in a seven-alternative forced choice task, and items reaching a recognition rate of at least three times chance performance in the seven-choice task were selected as a valid database and then subjected to acoustic analysis. The results demonstrated expected variations in both perceptual and acoustic patterns of the seven vocal emotions in Mandarin. For instance, fear, anger, sadness, and neutrality were associated with relatively high recognition, whereas happiness, disgust, and pleasant surprise were recognized less accurately. Acoustically, anger and pleasant surprise exhibited relatively high mean f0 values and large variation in f0 and amplitude; in contrast, sadness, disgust, fear, and neutrality exhibited relatively low mean f0 values and small amplitude variations, and happiness exhibited a moderate mean f0 value and f0 variation. Emotional expressions varied systematically in speech rate and harmonics-to-noise ratio values as well. This validated database is available to the research community and will contribute to future studies of emotional prosody for a number of purposes. To access the database, please contact pan.liu@mail.mcgill.ca.

Journal ArticleDOI
TL;DR: In this paper, the authors present affective ratings for 380 Spanish words belonging to three semantic categories: animals, people, and objects, based on the assessments made by 504 participants, who rated about 47 words either in valence and arousal, by using the Self-Assessment Manikin (Bradley & Lang, Journal of Behavioral Therapy and Experimental Psychiatry, 25, 49-59, 1994), or in concreteness and familiarity.
Abstract: Emotional words are increasingly used in the study of word processing. To elucidate whether the experimental effects obtained with these words are due either to their affective content or to other semantic characteristics, it is necessary to conduct experiments with affectively valenced words obtained from different semantic categories. In the present article, we present affective ratings for 380 Spanish words belonging to three semantic categories: animals, people, and objects. The norms are based on the assessments made by 504 participants, who rated about 47 words either in valence and arousal, by using the Self-Assessment Manikin (Bradley & Lang, Journal of Behavioral Therapy and Experimental Psychiatry, 25, 49-59, 1994), or in concreteness and familiarity. These ratings will help researchers select stimuli for experiments in which both the affective properties of words and their membership to a given semantic category have to be taken into account. The database is available as an online supplement for this article.

Journal ArticleDOI
TL;DR: These imageability estimates for disyllabic words expand the number of words available for investigations of word processing, which should be useful for researchers interested in the influences of imageability both as an input and as an outcome variable.
Abstract: We provide imageability estimates for 3,000 disyllabic words (as supplementary materials that may be downloaded with the article from www.springerlink.com). Imageability is a widely studied lexical variable believed to influence semantic and memory processes (see, e.g., Paivio, 1971). In addition, imageability influences basic word recognition processes (Plaut, McClelland, Seidenberg, & Patterson, 1996). In fact, neuroimaging studies have suggested that reading high- and low-imageable words elicits distinct neural activation patterns for the two types (e.g., Bedny & Thompson-Schill, Brain and Language 98:127–139, 2006; Graves, Binder, Desai, Conant, & Seidenberg, NeuroImage 53:638–646, 2010). Despite the usefulness of this variable, imageability estimates have not been available for large sets of words. Furthermore, recent megastudies of word processing (e.g., Balota et al., Behavior Research Methods 39:445–459, 2007) have expanded the number of words that interested researchers can select according to other lexical characteristics (e.g., average naming latencies, lexical decision times, etc.). However, the dearth of imageability estimates (as well as those of other lexical characteristics) limits the items that researchers can include in their experiments. Thus, these imageability estimates for disyllabic words expand the number of words available for investigations of word processing, which should be useful for researchers interested in the influences of imageability both as an input and as an outcome variable.

Journal ArticleDOI
TL;DR: A technique for estimating lexical norms based on the latent semantic analysis of a corpus that can be used to check human ratings to identify words for which the rating is very different from the corpus-based estimate.
Abstract: In psychology, lexical norms related to the semantic properties of words, such as concreteness and valence, are important research resources. Collecting such norms by asking judges to rate the words is very time consuming, which strongly limits the number of words that compose them. In the present article, we present a technique for estimating lexical norms based on the latent semantic analysis of a corpus. The analyses conducted emphasize the technique’s effectiveness for several semantic dimensions. In addition to the extension of norms, this technique can be used to check human ratings to identify words for which the rating is very different from the corpus-based estimate.

Journal ArticleDOI
TL;DR: A step-by-step tutorial for coding forced-choice responses, specifying a Thurstonian item response theory model that is appropriate for the design used, assessing the model’s fit, and scoring individuals on psychological attributes is provided.
Abstract: To counter response distortions associated with the use of rating scales (a.k.a. Likert scales), items can be presented in a comparative fashion, so that respondents are asked to rank the items within blocks (forced-choice format). However, classical scoring procedures for these forced-choice designs lead to ipsative data, which presents psychometric challenges that are well described in the literature. Recently, Brown and Maydeu-Olivares (Educational and Psychological Measurement 71: 460–502, 2011a) introduced a model based on Thurstone’s law of comparative judgment, which overcomes the problems of ipsative data. Here, we provide a step-by-step tutorial for coding forced-choice responses, specifying a Thurstonian item response theory model that is appropriate for the design used, assessing the model’s fit, and scoring individuals on psychological attributes. Estimation and scoring is performed using Mplus, and a very straightforward Excel macro is provided that writes full Mplus input files for any forced-choice design. Armed with these tools, using a forced-choice design is now as easy as using ratings.

Journal ArticleDOI
TL;DR: The present study introduces the first substantial German database with norms for semantic typicality, age of acquisition, and concept familiarity for 824 exemplars of 11 semantic categories, including four natural and five man-made categories, as well as professions and sports.
Abstract: The present study introduces the first substantial German database with norms for semantic typicality, age of acquisition, and concept familiarity for 824 exemplars of 11 semantic categories, including four natural (animals, birds, fruits, and vegetables) and five man-made (clothing, furniture, vehicles, tools, and musical instruments) categories, as well as professions and sports. Each category exemplar in the database was collected empirically in an exemplar generation study. For each category exemplar, norms for semantic typicality, estimated age of acquisition, and concept familiarity were gathered in three different rating studies. Reliability data and additional analyses on effects of semantic category and intercorrelations between age of acquisition, semantic typicality, concept familiarity, word length, and word frequency are provided. Overall, the data show high inter- and intrastudy reliabilities, providing a new resource tool for designing experiments with German word materials. The full database is available in the supplementary material of this file and also at www.psychonomic.org/archive.

Journal ArticleDOI
TL;DR: The normative data provided here will enable clinicians to determine different kinds and specific levels of communicative impairments more precisely, and is a valuable tool in clinical practice.
Abstract: The Assessment Battery for Communication (ABaCo) was introduced to evaluate pragmatic abilities in patients with cerebral lesions. The battery is organized into five evaluation scales focusing on separate components of pragmatic competence. In the present study, we present normative data for individuals 15–75 years of age (N = 300). The sample was stratified by age, sex, and years of education, according to Italian National Institute of Statistics indications in order to be representative of the general national population. Since performance on the ABaCo decreases with age and fewer years of education, the norms were stratified for both age and education. The ABaCo is a valuable tool in clinical practice; the normative data provided here will enable clinicians to determine different kinds and specific levels of communicative impairments more precisely.

Journal ArticleDOI
TL;DR: This article presents the MultiBlock Component Analysis program, which also includes procedures for missing data imputation and model selection, and the recently proposed clusterwise simultaneous component analysis, which is a generic and flexible approach that has no counterpart in the factor analysis tradition.
Abstract: To explore structural differences and similarities in multivariate multiblock data (e.g., a number of variables have been measured for different groups of subjects, where the data for each group constitute a different data block), researchers have a variety of multiblock component analysis and factor analysis strategies at their disposal. In this article, we focus on three types of multiblock component methods—namely, principal component analysis on each data block separately, simultaneous component analysis, and the recently proposed clusterwise simultaneous component analysis, which is a generic and flexible approach that has no counterpart in the factor analysis tradition. We describe the steps to take when applying those methods in practice. Whereas plenty of software is available for fitting factor analysis solutions, up to now no easy-to-use software has existed for fitting these multiblock component analysis methods. Therefore, this article presents the MultiBlock Component Analysis program, which also includes procedures for missing data imputation and model selection.

Journal ArticleDOI
TL;DR: This study validated the performance of an LPS in an indoor venue and compared it with the performance observed in an outdoor venue using static and dynamic measurements, showing that the absolute positioning errors obtained from the static measurements of the LPS were comparable for both indoor and outdoor venues.
Abstract: Radio-frequency local positioning systems (LPS) have the potential to provide accurate location information about sport players for performance analysis, making available for study the emergent nature of interpersonal coordination and collective decision-making behaviour among players in both indoor and outdoor sports. However, no available publications have validated the performance of LPS for indoor sports. The objective of this study was to validate the performance of an LPS in an indoor venue and to compare it to performance observed in an outdoor venue using static and dynamic measurements. Our results showed that the absolute positioning errors obtained from the static measurements of the LPS were comparable for both indoor and outdoor venues, with mean errors of 12.1 cm outdoors and 11.9 cm indoors. From the dynamic measurements, we analysed the relative position error and the distance estimation performance of the system. The 90th-percentile relative position errors were 28 cm for the indoor venue versus 18 cm for the outdoor venue. On average, the LPS overestimated the distance travelled, and the performance was similar for both indoor and outdoor venues. On a linear course, the mean errors of the distance estimates were 2.2% of the total distance indoors and 1.3% outdoors, and on a nonlinear course, they were 2.7% indoors and 3.2% outdoors.
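
The distance-estimation errors reported above are percentages of the total distance. A minimal sketch of that arithmetic follows, assuming the tracked path is a list of 2D positions in metres and that the true course length is known. The function names and the sample coordinates are hypothetical and do not come from the study.

```python
import math


def path_length(points):
    """Sum of straight-line distances between consecutive (x, y) positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def percent_distance_error(estimated, true_distance):
    """Signed error of a distance estimate as a percentage of the true distance.

    Positive values mean the system overestimated the distance travelled.
    """
    return 100.0 * (estimated - true_distance) / true_distance


# Hypothetical tracked positions along a nominally 10 m straight course.
tracked = [(0.0, 0.0), (5.0, 0.3), (10.0, 0.0)]
estimate = path_length(tracked)
print(round(percent_distance_error(estimate, 10.0), 2))
```

Small lateral jitter in the tracked positions lengthens the summed path, which is one reason a system of this kind may tend to overestimate distance.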

Journal ArticleDOI
TL;DR: An SPSS program is provided that implements descriptive and inferential procedures for estimating tetrachoric correlations and constructing a correlation matrix to be used as input for factor analysis (in particular, the SPSS FACTOR procedure).
Abstract: We provide an SPSS program that implements descriptive and inferential procedures for estimating tetrachoric correlations. These procedures have two main purposes: (1) bivariate estimation in contingency tables and (2) constructing a correlation matrix to be used as input for factor analysis (in particular, the SPSS FACTOR procedure). In both cases, the program computes accurate point estimates, as well as standard errors and confidence intervals that are correct for any population value. For purpose (1), the program computes the contingency table together with five other measures of association. For purpose (2), the program checks the positive definiteness of the matrix, and if it is found not to be Gramian, performs a nonlinear smoothing procedure at the user's request. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from brm.psychonomic-journals.org/content/supplemental.
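
For readers unfamiliar with tetrachoric correlations, the sketch below shows the classical cosine-pi approximation for a 2x2 contingency table. This is a rough textbook shortcut, swapped in here for illustration only. It is not the estimation procedure implemented by the SPSS program described above, and the function name is hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    The 2x2 table is laid out as [[a, b], [c, d]], so a and d are the
    concordant cells and b and c are the discordant cells.
    """
    if b * c == 0:
        if a * d == 0:
            raise ValueError("approximation undefined when ad and bc are both zero")
        return 1.0  # no discordant mass: the approximation reaches its upper bound
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


print(round(tetrachoric_cos_pi(40, 10, 10, 40), 3))
```

The approximation returns 0 when the odds ratio is 1 and approaches 1 or -1 as the table becomes dominated by concordant or discordant cells.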

Journal ArticleDOI
TL;DR: The results indicate that a multilevel meta-analysis of unstandardized effect sizes results in good estimates of the effect, whereas a multilevel meta-analysis of standardized effect sizes is suitable only when the number of measurement occasions for each subject is 20 or more.
Abstract: One way to combine data from single-subject experimental design studies is by performing a multilevel meta-analysis, with unstandardized or standardized regression coefficients as the effect size metrics. This study evaluates the performance of this approach. The results indicate that a multilevel meta-analysis of unstandardized effect sizes results in good estimates of the effect. The multilevel meta-analysis of standardized effect sizes, on the other hand, is suitable only when the number of measurement occasions for each subject is 20 or more. The effect of the treatment on the intercept is estimated with enough power when the studies are homogeneous or when the number of studies is large; the effect on the slope is estimated with enough power only when the number of studies and the number of measurement occasions are large.

Journal ArticleDOI
TL;DR: The aim of this article is to increase the use of mixed models by giving a concise practical introduction and by giving clear directions for undertaking the analysis in the most popular statistical packages.
Abstract: Psychologists, psycholinguists, and other researchers using language stimuli have been struggling for more than 30 years with the problem of how to analyze experimental data that contain two crossed random effects (items and participants). The classical analysis of variance does not apply; alternatives have been proposed but have failed to catch on, and a statistically unsatisfactory procedure of using two approximations (known as F1 and F2) has become the standard. A simple and elegant solution using mixed model analysis has been available for 15 years, and recent improvements in statistical software have made mixed models analysis widely available. The aim of this article is to increase the use of mixed models by giving a concise practical introduction and by giving clear directions for undertaking the analysis in the most popular statistical packages. The article also introduces the djmixed add-on package for SPSS, which makes entering the models and reporting their results as straightforward as possible.

Journal ArticleDOI
TL;DR: This work proposes an extension of Mangat’s (Journal of the Royal Statistical Society: Series B, 56, 93–95, 1994) variant of the RRT that allows for determining whether participants respond truthfully, and shows how to implement the method using both closed-form equations and easily accessible free software for multinomial processing tree models.
Abstract: Surveys on sensitive issues provide distorted prevalence estimates when participants fail to respond truthfully. The randomized-response technique (RRT) encourages more honest responding by adding random noise to responses, thereby removing any direct link between a participant’s response and his or her true status with regard to a sensitive attribute. However, in spite of the increased confidentiality, some respondents still refuse to disclose sensitive attitudes or behaviors. To remedy this problem, we propose an extension of Mangat’s (Journal of the Royal Statistical Society: Series B, 56, 93–95, 1994) variant of the RRT that allows for determining whether participants respond truthfully. This method offers the genuine advantage of providing undistorted prevalence estimates for sensitive attributes even if respondents fail to respond truthfully. We show how to implement the method using both closed-form equations and easily accessible free software for multinomial processing tree models. Moreover, we report the results of two survey experiments that provide evidence for the validity of our extension of Mangat’s RRT approach.
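
As background for the method being extended, the sketch below gives the prevalence estimator for the basic Mangat (1994) design as it is commonly described: carriers of the sensitive attribute always answer "yes", while non-carriers use a Warner-type randomization device that presents the sensitive statement with probability p. Under that assumption the expected proportion of "yes" answers is λ = π + (1 − π)(1 − p). The sketch covers only this base design, not the extension proposed in the article, and the function name is hypothetical.

```python
def mangat_prevalence(yes_proportion, p):
    """Moment estimator of the prevalence pi under the basic Mangat design.

    yes_proportion: observed share of "yes" responses (lambda-hat).
    p: probability that the randomization device selects the sensitive
       statement (design constant chosen by the researcher, 0 < p <= 1).
    Solves lambda = pi + (1 - pi) * (1 - p) for pi.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return (yes_proportion - (1.0 - p)) / p


# With a true prevalence of 0.2 and p = 0.7, the expected "yes" share is 0.44.
print(round(mangat_prevalence(0.44, 0.7), 3))
```

In small samples the estimate can fall outside the interval from 0 to 1, so applied work typically truncates it or uses a constrained estimator.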

Journal ArticleDOI
TL;DR: The Vienna comparative cognition technology combines modern computer, stimulus presentation, and reinforcement technology with flexibility and user-friendliness, which allows for efficient, widely automatized across-species experimentation, and thus makes the system appropriate for use in a broad range of learning tasks.
Abstract: This article describes a laboratory system for running learning experiments in operant chambers with various species. It is based on a modern version of a classical learning chamber for operant conditioning, the so-called “Skinner box”. Rather than constituting a stand-alone unit, as is usually the case, it is an integrated part of a comprehensive technical solution, thereby eliminating a number of practical problems that are frequently encountered in research on animal learning and behavior. The Vienna comparative cognition technology combines modern computer, stimulus presentation, and reinforcement technology with flexibility and user-friendliness, which allows for efficient, widely automatized across-species experimentation, and thus makes the system appropriate for use in a broad range of learning tasks.