Open Access, Posted Content
European Language Grid: An Overview.
Georg Rehm, Maria Berger, Ela Elsholz, Stefanie Hegele, Florian Kintzel, Katrin Marheinecke, Stelios Piperidis, Miltos Deligiannis, Dimitris Galanis, Katerina Gkirtzou, Penny Labropoulou, Kalina Bontcheva, David L. Jones, Ian Roberts, Jan Hajič, Jana Hamrlová, Lukáš Kačena, Khalid Choukri, Victoria Arranz, Andrejs Vasiļjevs, Orians Anvari, Andis Lagzdiņš, Jūlija Meļņika, Gerhard Backfried, Erinç Dikici, Miroslav Janosik, Katja Prinz, Christoph Prinz, Severin Stampler, Dorothea Thomas-Aniola, José Manuel Gómez Pérez, Andres Garcia Silva, Christian Berrío, Ulrich Germann, Steve Renals, Ondrej Klejch +35 more
TL;DR: The European Language Grid (ELG), as discussed by the authors, is a scalable cloud platform that provides easy-to-integrate access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources.
Abstract:
With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs). European LT business is dominated by hundreds of SMEs and a few large players. Many are world-class, with technologies that outperform the global players. However, European LT business is also fragmented, by nation states, languages, verticals and sectors, significantly holding back its impact. The European Language Grid (ELG) project addresses this fragmentation by establishing the ELG as the primary platform for LT in Europe. The ELG is a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources. Once fully operational, it will enable the commercial and non-commercial European LT community to deposit and upload their technologies and data sets into the ELG, to deploy them through the grid, and to connect with other resources. The ELG will boost the Multilingual Digital Single Market towards a thriving European LT community, creating new jobs and opportunities. Furthermore, the ELG project organises two open calls for up to 20 pilot projects. It also sets up 32 National Competence Centres (NCCs) and the European LT Council (LTC) for outreach and coordination purposes.
Citations
Posted Content
Making Metadata Fit for Next Generation Language Technology Platforms: The Metadata Schema of the European Language Grid
Penny Labropoulou, Katerina Gkirtzou, Maria Gavriilidou, Miltos Deligiannis, Dimitrios Galanis, Stelios Piperidis, Georg Rehm, Maria Berger, Valérie Mapelli, Mickaël Rigault, Victoria Arranz, Khalid Choukri, Gerhard Backfried, José Manuel Gómez Pérez, Andres Garcia Silva +14 more
TL;DR: ELG-SHARE is presented, a rich metadata schema catering for the description of Language Resources and Technologies (processing and generation services and tools, models, corpora, term lists, etc.), as well as related entities (e.g., organizations, projects, supporting documents, etc.).
Proceedings Article
Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling
Dmitrii Aksenov, Julián Moreno Schneider, Peter Bourgonje, Robert Schwarzenberg, Leonhard Hennig, Georg Rehm +5 more
TL;DR: A new method of BERT-windowing is presented, which allows chunk-wise processing of texts longer than the BERT window size, along with an analysis of how locality modeling, i.e., the explicit restriction of calculations to the local context, can affect the summarization ability of the Transformer.
Posted Content
A Dataset of German Legal Documents for Named Entity Recognition.
TL;DR: In this article, the authors describe a dataset developed for Named Entity Recognition in German federal court decisions, which consists of approx. 67,000 sentences with over 2 million tokens and is available under a CC-BY 4.0 license in the CoNLL-2002 format.
Terme-à-LLOD: Simplifying the Conversion and Hosting of Terminological Resources as Linked Data
TL;DR: Terme-à-LLOD (TAL) is presented, a new paradigm for transforming and publishing terminologies as linked data which relies on a virtualization approach and can be applied to any other resource format as well.
A Workflow Manager for Complex NLP and Content Curation Workflows
TL;DR: The first version of the workflow manager, based on the four key principles of generality, flexibility, scalability and efficiency, is presented, with details on its custom definition language, its communication components, and the general system architecture and setup.
References
Proceedings ArticleDOI
The Stanford CoreNLP Natural Language Processing Toolkit
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, David McClosky +5 more
TL;DR: The design and use of the Stanford CoreNLP toolkit is described, an extensible pipeline that provides core natural language analysis, and it is suggested that the toolkit's wide use follows from a simple, approachable design, straightforward interfaces, the inclusion of robust and good quality analysis components, and not requiring use of a large amount of associated baggage.
Journal ArticleDOI
Digital language death
TL;DR: It is argued that the consensus figure of some 2,500 endangered languages, out of approximately 7,000 spoken today, vastly underestimates the danger of digital language death, in that less than 5% of all languages can still ascend to the digital realm.
Proceedings Article
The META-SHARE Language Resources Sharing Infrastructure: Principles, Challenges, Solutions
TL;DR: META-SHARE, an open resource exchange infrastructure, which aims to boost visibility, documentation, identification, openness and sharing, collaboration, preservation and interoperability of language data and basic language processing tools, is presented.
Journal ArticleDOI
TextFlows: A visual programming platform for text mining and natural language processing
TL;DR: The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud, making TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications.
Journal ArticleDOI
Evolution of the IBM cloud: enabling an enterprise cloud services ecosystem
Andrzej Kochut, Yu Deng, Michael R. Head, Jonathan P. Munson, Anca Sailer, Hidayatullah Habeebullah Shaikh, C. Tang, Alexander Phillip Amies, Murray J. Beaton, D. Geiss, D. Herman, H. Macho, Stefan Pappe, S. Peddle, R. Rendahl, A. E. Tomala Reyes, H. Sluiman, Brian J. Snitzer, T. Volin, H. Wagner +19 more
TL;DR: The evolution of the Common Cloud Management Platform (CCMP), a management system providing business and operations support for cloud services, is described, including novel approaches for scalable virtual machine provisioning and adaptive workload placement optimization.