
Showing papers in "International Journal of Lexicography in 2016"

Journal ArticleDOI
TL;DR: Phraseological components of valency dictionaries for two West Slavic languages are presented, the expressive power of the phraseological subformalisms of these dictionaries is compared, and recommendations for their possible extensions are made.
Abstract: Phraseological components of valency dictionaries for two West Slavic languages are presented, namely, of the PDT-Vallex dictionary for Czech and of the Walenty dictionary for Polish. Both dictionaries are corpus-based, albeit in different ways. Both are machine-readable and employable by syntactic parsers and generators. The paper compares the expressive power of the phraseological subformalisms of these dictionaries, discusses their limitations and makes recommendations for their possible extensions, which could also be applied to other valency dictionaries with rich phraseological components.

18 citations



Journal ArticleDOI
TL;DR: In this article, a distributional semantics-based model is proposed that classifies collocations with respect to broad semantic categories as encountered in dictionaries.
Abstract: The presented work has been partially funded by the Spanish Ministry of Economy and Competitiveness (MINECO) under the contract number FFI2011-30219-C02-02.

15 citations


Journal ArticleDOI
TL;DR: An inventory of Japanese bilingual dictionaries (printed or electronic) with their historical evolution is conducted and a resource to build a good quality and broad coverage dictionary available on the Web is described.
Abstract: This research project is located in the field of natural language processing (NLP), at the intersection of computer science and linguistics, specifically multilingual lexicography and lexicology. On the Web, although French and Japanese are each well-resourced languages (Berment, 2004), this is not the case for the French-Japanese pair: electronic French-Japanese bilingual dictionaries (denshi jisho) cannot be copied to a computer or reused, and a French-Japanese dictionary exists on the Web (http://www.dictionnaire-japonais.com), but it contains only 40,000 entries, has no examples, and is not available for download. There are collaborative Web dictionaries, such as the Japanese-English JMdict project led by Jim Breen (2004), which contains over 173,000 items. These resources are freely downloadable, so it is possible to carry out such projects. During a first stay in Japan from November 2001 to March 2004, we had already noticed the lack of French-Japanese bilingual resources on the Web, which gave rise to the Papillon project on the construction of a multilingual lexical database with a pivot structure (Serasset et al., 2001). Since then, progress has been made in several areas (technical, theoretical, social) (Mangeot, 2006), but the actual production of data has advanced very little. On the other hand, there is a new trend of reusing existing lexical resources (word sense disambiguation, use of open-source resources such as Wiktionary and DBpedia, merging with ontologies, etc.). Although they make it possible to consolidate and expand the coverage of existing resources, these experiments still use data created by hand by professional lexicographers. There are printed French-Japanese dictionaries of good quality that are old enough to be royalty-free. It should be possible to reuse these resources as part of our project to build a good-quality, broad-coverage dictionary available on the Web.
Based on this observation, we defined the following project to build a rich multilingual lexical system, with priority given to the French-Japanese pair. The construction will proceed first by reusing existing resources (printed Japanese-French dictionaries, Japanese-other language dictionaries, Wikipedia) and automatic operations (scanning and correction, computing translation links), and then through volunteer contributors working as a community on the Web. They will contribute to dictionary articles according to their level of expertise and knowledge in lexicography or bilingual translation. The resulting resources will be royalty-free and intended for use both by humans, via conventional bilingual dictionaries, and by machines, in automatic language processing tools (analysis, machine translation, etc.).

14 citations

Journal ArticleDOI
TL;DR: Online collocation tools (LDOCE and WPI) contributed more than a book collocation dictionary (MCD) to accurate collocation production in L2 writers’ essays.
Abstract: L2 writers of English (N=45) in an intensive English program in the southwestern part of the USA were divided into three groups. Each group was provided with collocation training for a different collocation tool: Longman Dictionary of Contemporary English (LDOCE), Macmillan Collocation Dictionary (MCD), and www.wordandphrase.info (WPI). After training, each group used the collocation tool to correct 16 miscollocations embedded in an essay-format collocation test. After each test, the participants completed a quality review checklist. The procedure was repeated three times so that each group used each tool but in a different order. The results indicated that online collocation tools (LDOCE and WPI) contributed more than a book collocation dictionary (MCD) to accurate collocation production in L2 writers’ essays. In particular, L2 writers favored WPI because it was easier to navigate and it helped them locate the correct collocations.

Journal ArticleDOI
TL;DR: The paper summarizes the features supporting bilingual lexicography and the creation of bilingual learner’s dictionaries in Sketch Engine.
Abstract: Sketch Engine is a leading corpus query and corpus management tool that has been used for many large dictionary projects. The paper summarizes its features supporting bilingual lexicography and the creation of bilingual learner’s dictionaries. Some of these features have been added recently; some of them have been part of the software for a rather long time, but they have been recently improved.



Journal ArticleDOI
TL;DR: A model for analysing relations between lexical items across languages in terms of lexical domains is developed that overcomes the limitations of previous methods which rely heavily on introspection, and focus on single words.
Abstract: The purpose of the current study is to develop a model for analysing relations between lexical items across languages in terms of lexical domains. The model combines a corpus-informed distributional approach with the language-in-use theory of meaning to identify sets of semantically similar linguistic items across languages. It also applies the principle of differentiation to establish differences between individual items. The model overcomes the limitations of previous methods which rely heavily on introspection, and focus on single words. It will be proposed that the results of the study can be used as resources in the compilation of a new type of onomasiological bilingual dictionary. Such a dictionary would provide users with direct access to multi-word units across languages, and help them distinguish and choose between available options.

Journal ArticleDOI
TL;DR: The Bilingual Thesaurus project, based at Birmingham City University and the University of Westminster, aims to create a bilingual thesaurus of everyday life in medieval England.
Abstract: This paper reports on issues at the interface between semantics and lexicography that arose out of the data collection and classification of vocabulary in Anglo-Norman and Middle English in order to create a bilingual thesaurus of everyday life in medieval England. The Bilingual Thesaurus project is based at Birmingham City University and the University of Westminster. Issues to be resolved included the definition of an occupational domain; the creation of a methodology of data collection; the delimitation of domain-specific vocabulary; making distinctions between sense and usage; and the categorisation of the lexical items. Some of these issues are general to thesaurus-making, some are specific to the making of historical thesauruses, while some are unique to the production of a thesaurus of two languages whose use overlapped for several centuries in the late medieval period in England.

Journal ArticleDOI
TL;DR: This article is primarily a description of the Oxford Learner’s Dictionary of Academic English (2014), a reference work aimed at higher education students who are studying through the medium of English but who are not native speakers of English.
Abstract: This article is primarily a description of the Oxford Learner’s Dictionary of Academic English (2014), a reference work aimed at higher education students who are studying through the medium of English but who are not native speakers of English. Description includes comparisons with the more general Oxford Advanced Learner’s Dictionary, and is preceded by an introduction to the field of ‘English for Academic Purposes’.
1. The Oxford Learner’s Dictionary of Academic English: Introduction
The focal point of this article is the Oxford Learner’s Dictionary of Academic English (OLDAE) published in 2014 (see References). The main purpose of the article is to describe the dictionary in some detail, especially since it represents a new type of learner’s dictionary. OLDAE was designed for a specific user group, described by the dictionary’s chief editor as ‘non-native-English-speaking students who are studying academic subjects at tertiary level through the medium of English’ (Lea 2014: 181). At the time of its publication, it was, as far as I am aware, the only widely available MLD (monolingual learner’s dictionary) to focus exclusively on this set of potential users. OLDAE comes in the form of a print dictionary and a CD-ROM; the main A-Z part of the dictionary is also downloadable as an app. In addition, there are ‘teaching resources’ on the publisher’s website. This article is divided into seven sections. The first four deal with, respectively: the field of English for Academic Purposes; the compilation and coverage of OLDAE; lexical entries in the print dictionary; the CD-ROM. Thereafter, shorter sections deal with guidance on how to use the dictionary, and ‘non-lexical’ data in the dictionary. Lastly, there is an evaluative summary, followed by a series of more general points.
2. English for Academic Purposes
From the perspective of the language teaching profession, OLDAE constitutes a new resource within the field of English for Academic Purposes (EAP).
Overviews of the nature and development of EAP can be found in Hyland and Hamp-Lyons (2002) and Jordan (2002), both of which form part of the first issue of the Journal of English for Academic Purposes. The first of the two articles, written by the journal’s editors, includes the following summarizing statement of the purpose of EAP: ‘the growth of English as the leading language for the dissemination of academic knowledge has transformed the educational experiences of countless students, who must now gain fluency in the conventions of English language academic discourses to understand their disciplines and to successfully navigate their learning. The response of the language teaching profession to these demands has been the development over the past 25 years of a new field in the teaching of English as a Second/Foreign Language in universities and other academic settings: the field of English for Academic Purposes.’ (Hyland and Hamp-Lyons, 2002: 1). EAP is of relevance both to English-speaking and non-English-speaking countries. As it says in the Introduction to OLDAE itself, ‘Greater numbers of international students are choosing to pursue their higher education in English-speaking countries. Additionally, universities and colleges around the world are offering courses in a whole range of academic subjects taught through the medium of English.’ (p. v). In terms of linguistic description and language learning/teaching, EAP is concerned with identifying the particular characteristics which allow us to talk in terms of a general (i.e. cross-disciplinary) ‘academic English’, and with designing appropriate materials, methodology, and syllabuses to help teach this particular area of English usage. Many published materials are now available which deal with academically focussed aspects of each of the traditional ‘four skills’, as well as with academic vocabulary and areas of grammatical preference.
English language exams should also be mentioned in the context of EAP, especially the ‘academic reading’ and ‘academic writing’ components of the IELTS exam.
2.1 Academic vocabulary
Within EAP, vocabulary studies is the area of most obvious relevance to dictionary writing. There have been various ELT books published which focus on ‘academic vocabulary’. An early example is Sim and Laufer-Dvorkin (1984) and a more recent publication is McCarthy and Dell (2008). The Introduction to the latter gives a clear indication of what is normally meant by ‘academic vocabulary’: ‘This book presents and practises the kind of vocabulary that is used in academic speech and writing regardless of which discipline you are concerned with [....] It does not deal with the specialist vocabulary of any particular subject such as medicine or physics.’ (McCarthy and Dell 2008: 6). However, whereas the general concept of ‘academic vocabulary’, as described above, is easy to comprehend, it is much more difficult to draw up a list of lexical items which indisputably constitute this particular sub-lexicon of English. Indeed, some researchers have questioned the feasibility of creating a general, cross-discipline, academic word list (see Hyland and Tse, 2007, and, a reader’s response to this, Eldridge 2008). Certainly, different lists can be drawn up, and the differences between one list and another will depend on various factors, for example: (i) the nature of the corpus/corpora used to ‘discover’ the vocabulary in question (I take it as axiomatic that corpus data should be used); (ii) the numerical thresholds used for inclusion/exclusion; (iii) which types of linguistic unit are to be included (just words, or phrases as well; and if phrases, whether collocations are to be included); (iv) the exact purpose of drawing up the list. Various lists of academic vocabulary have been drawn up in the past. Gardner and Davies (p. 306) cite four lists which were compiled in the 1970s, and which were later combined to become what was known as the University Word List (UWL – see Xue and Nation 1984). In relation to the UWL, Nesi (2002: 352) says that, ‘Until fairly recently the most widely discussed wordlist for tertiary level students was the University Word List ..’. She also says that ‘[a]lthough none of the major learners’ dictionaries refer to it, the University Word List has been influential as a tool in English for Academic Purposes, serving as a syllabus component, as a yardstick by which to measure students’ knowledge of the words they will need for academic study, and as a teaching tool ..’ (ibid., p. 352). The successor to the UWL was Coxhead’s Academic Word List (AWL – see Coxhead 2000). The AWL is a corpus-derived list of word families (570 in all), which comprise a total of 3,110 individual word forms. A word family is defined as ‘a stem plus all closely related affixed forms’ (ibid., p. 218); affixation includes ‘all inflections and the most frequent, productive, and regular prefixes and suffixes’ (ibid., p. 218). The purpose of the AWL ‘was to help teachers of EAP classes to set goals for their students’ vocabulary learning’ (Coxhead 2011: 357). Since the compilation of the AWL, a number of other research projects have addressed the question of identifying a core list of vocabulary items relevant to a broad range of academic disciplines. These include: a ‘spoken academic wordlist’ (Nesi 2002), an ‘academic keyword list’ (Paquot 2010: 29-63), an ‘academic formulas list’ (Simpson-Vlach and Ellis 2010); an ‘academic collocation list’ (Ackermann and Chen 2013); and an ‘academic vocabulary list’ (Gardner and Davies 2014).
2.2 EAP and dictionaries
Non-native speakers of English studying in higher education have always had at their disposal the resources of ‘general’ MLDs, initially in their pre-corpus versions, and, since the publication of the first COBUILD dictionary (see References), as corpus-based dictionaries. Such learners’ dictionaries include not only the standard, alphabetically arranged works, but also those in which entries are arranged in accordance with meaning, and dictionaries devoted to phraseological phenomena. In recent editions of some MLDs, AWL vocabulary has been specifically labelled (see Coxhead 2011: 359). Also, the Academic Word List was one of the various sources used in the compilation of the Macmillan Collocations Dictionary, even though AWL words themselves are not indicated as such. There have also been subject-specific dictionaries of potential use to non-native speaker students in higher education, especially those which include simple explanations and example sentences. Such dictionaries are usually designed and marketed for both second-language users and native speakers; examples are Peter Collin Publishing’s Dictionary of Government and Politics (1997 [1988]) and the Cambridge Business English Dictionary (2011). The need for an additional type of dictionary, which focuses specifically on non-subject-specific ‘academic English’, was voiced by Kosem at the 2008 Euralex conference (Kosem 2008). Twenty years before that, at a previous Euralex conference, Hollósy had also argued in favour of such a dictionary, though relating it more to the needs of non-native-speaker academics and other professionals than students in higher education (see Hollósy 1990).
3. OLDAE: Purpose, compilation, coverage
I have focussed above on the notion of academic vocabulary, and this is only natural, since lexical items constitute the primary organizational level of a non-meaning-based dictionary, and the starting point for dictionary consultation.
However, the presentation of lexis in OLDAE is, above all, a means to an end, that of writing academic English (OLDAE, p. vi; Lea 2014: 183). Writing is probably the language skill with which most students in higher education need the most help, both because it is difficult to master and because of the role it has in many study environments. As Ian Bruce writes (2011: 118): ‘Central to the language needs of EAP students is competence


Journal ArticleDOI
TL;DR: This is the final draft, after peer review, of a manuscript published in International Journal of Lexicography; the published version is available online at https://doi.org/10.1093/ijl/ecv046
Abstract: This is the final draft, after peer-review, of a manuscript published in International Journal of Lexicography. The published version is available online at https://doi.org/10.1093/ijl/ecv046