scispace - formally typeset
Search or ask a question
Author

Barbara Plank

Bio: Barbara Plank is an academic researcher from IT University of Copenhagen. The author has contributed to research in topics: Parsing & Syntax (programming languages). The author has an hindex of 36, co-authored 159 publications receiving 4230 citations. Previous affiliations of Barbara Plank include University of Trento & University of Groningen.


Papers
More filters
Proceedings ArticleDOI
19 Apr 2016
TL;DR: The authors compared bi-LSTMs with word, character, and unicode byte embeddings for POS tagging and showed that biLSTM is less sensitive to training data size and label corruptions than previously assumed.
Abstract: Bidirectional long short-term memory (biLSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance to input representations, target languages, data set size, and label noise. We address these issues and evaluate bi-LSTMs with word, character, and unicode byte embeddings for POS tagging. We compare bi-LSTMs to traditional POS taggers across languages and data sizes. We also present a novel biLSTM model, which combines the POS tagging loss function with an auxiliary loss function that accounts for rare words. The model obtains state-of-the-art performance across 22 languages, and works especially well for morphologically complex languages. Our analysis suggests that biLSTMs are less sensitive to training data size and label corruptions (at small noise levels) than previously assumed.

448 citations

Journal ArticleDOI
TL;DR: In this article, the authors provide a detailed review of existing models, highlighting their advantages and disadvantages, and extrapolate future directions in the area of automatic image description generation by providing an overview of the benchmark image datasets and the evaluation measures that have been developed to assess the quality of machine-generated image descriptions.
Abstract: Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities. In this survey, we classify the existing approaches based on how they conceptualize this problem, viz., models that cast description as either generation problem or as a retrieval problem over a visual or multimodal representational space. We provide a detailed review of existing models, highlighting their advantages and disadvantages. Moreover, we give an overview of the benchmark image datasets and the evaluation measures that have been developed to assess the quality of machine-generated image descriptions. Finally we extrapolate future directions in the area of automatic image description generation.

251 citations

Posted Content
TL;DR: This survey classify the existing approaches based on how they conceptualize this problem, viz., models that cast description as either generation problem or as a retrieval problem over a visual or multimodal representational space.
Abstract: Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities. In this survey, we classify the existing approaches based on how they conceptualize this problem, viz., models that cast description as either generation problem or as a retrieval problem over a visual or multimodal representational space. We provide a detailed review of existing models, highlighting their advantages and disadvantages. Moreover, we give an overview of the benchmark image datasets and the evaluation measures that have been developed to assess the quality of machine-generated image descriptions. Finally we extrapolate future directions in the area of automatic image description generation.

218 citations

Proceedings ArticleDOI
01 Dec 2020
TL;DR: This survey reviews neural unsupervised domain adaptation techniques which do not require labeled target domain data, and revisits the notion of domain, and uncovers a bias in the type of Natural Language Processing tasks which received most attention.
Abstract: Deep neural networks excel at learning from labeled data and achieve state-of-the-art results on a wide array of Natural Language Processing tasks. In contrast, learning from unlabeled data, especially under domain shift, remains a challenge. Motivated by the latest advances, in this survey we review neural unsupervised domain adaptation techniques which do not require labeled target domain data. This is a more challenging yet a more widely applicable setup. We outline methods, from early traditional non-neural methods to pre-trained model transfer. We also revisit the notion of domain, and we uncover a bias in the type of Natural Language Processing tasks which received most attention. Lastly, we outline future directions, particularly the broader need for out-of-distribution generalization of future NLP.

180 citations


Cited by
More filters
Posted Content
TL;DR: This article seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks, particularly in deep neural networks.
Abstract: Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery. This article aims to give a general overview of MTL, particularly in deep neural networks. It introduces the two most common methods for MTL in Deep Learning, gives an overview of the literature, and discusses recent advances. In particular, it seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks.

2,202 citations

01 Jan 2014
TL;DR: Using Language部分的�’学模式既不落俗套,又能真正体现新课程标准所倡导的�'学理念,正是年努力探索的问题.
Abstract: 人教版高中英语新课程教材中,语言运用(Using Language)是每个单元必不可少的部分,提供了围绕单元中心话题的听、说、读、写的综合性练习,是单元中心话题的延续和升华.如何设计Using Language部分的教学,使自己的教学模式既不落俗套,又能真正体现新课程标准所倡导的教学理念,正是广大一线英语教师一直努力探索的问题.

2,071 citations

Journal ArticleDOI
TL;DR: This paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy to enable researchers to better understand the state of the field and identify directions for future research.
Abstract: Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together Multimodal machine learning aims to build models that can process and relate information from multiple modalities It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research

1,945 citations

Book
01 Nov 2002
TL;DR: Drive development with automated tests, a style of development called “Test-Driven Development” (TDD for short), which aims to dramatically reduce the defect density of code and make the subject of work crystal clear to all involved.
Abstract: From the Book: “Clean code that works” is Ron Jeffries’ pithy phrase. The goal is clean code that works, and for a whole bunch of reasons: Clean code that works is a predictable way to develop. You know when you are finished, without having to worry about a long bug trail.Clean code that works gives you a chance to learn all the lessons that the code has to teach you. If you only ever slap together the first thing you think of, you never have time to think of a second, better, thing. Clean code that works improves the lives of users of our software.Clean code that works lets your teammates count on you, and you on them.Writing clean code that works feels good.But how do you get to clean code that works? Many forces drive you away from clean code, and even code that works. Without taking too much counsel of our fears, here’s what we do—drive development with automated tests, a style of development called “Test-Driven Development” (TDD for short). In Test-Driven Development, you: Write new code only if you first have a failing automated test.Eliminate duplication. Two simple rules, but they generate complex individual and group behavior. Some of the technical implications are:You must design organically, with running code providing feedback between decisionsYou must write your own tests, since you can’t wait twenty times a day for someone else to write a testYour development environment must provide rapid response to small changesYour designs must consist of many highly cohesive, loosely coupled components, just to make testing easy The two rules imply an order to the tasks ofprogramming: 1. Red—write a little test that doesn’t work, perhaps doesn’t even compile at first 2. Green—make the test work quickly, committing whatever sins necessary in the process 3. Refactor—eliminate all the duplication created in just getting the test to work Red/green/refactor. The TDD’s mantra. Assuming for the moment that such a style is possible, it might be possible to dramatically reduce the defect density of code and make the subject of work crystal clear to all involved. If so, writing only code demanded by failing tests also has social implications: If the defect density can be reduced enough, QA can shift from reactive to pro-active workIf the number of nasty surprises can be reduced enough, project managers can estimate accurately enough to involve real customers in daily developmentIf the topics of technical conversations can be made clear enough, programmers can work in minute-by-minute collaboration instead of daily or weekly collaborationAgain, if the defect density can be reduced enough, we can have shippable software with new functionality every day, leading to new business relationships with customers So, the concept is simple, but what’s my motivation? Why would a programmer take on the additional work of writing automated tests? Why would a programmer work in tiny little steps when their mind is capable of great soaring swoops of design? Courage. Courage Test-driven development is a way of managing fear during programming. I don’t mean fear in a bad way, pow widdle prwogwammew needs a pacifiew, but fear in the legitimate, this-is-a-hard-problem-and-I-can’t-see-the-end-from-the-beginning sense. If pain is nature’s way of saying “Stop!”, fear is nature’s way of saying “Be careful.” Being careful is good, but fear has a host of other effects: Makes you tentativeMakes you want to communicate lessMakes you shy from feedbackMakes you grumpy None of these effects are helpful when programming, especially when programming something hard. So, how can you face a difficult situation and: Instead of being tentative, begin learning concretely as quickly as possible.Instead of clamming up, communicate more clearly.Instead of avoiding feedback, search out helpful, concrete feedback.(You’ll have to work on grumpiness on your own.) Imagine programming as turning a crank to pull a bucket of water from a well. When the bucket is small, a free-spinning crank is fine. When the bucket is big and full of water, you’re going to get tired before the bucket is all the way up. You need a ratchet mechanism to enable you to rest between bouts of cranking. The heavier the bucket, the closer the teeth need to be on the ratchet. The tests in test-driven development are the teeth of the ratchet. Once you get one test working, you know it is working, now and forever. You are one step closer to having everything working than you were when the test was broken. Now get the next one working, and the next, and the next. By analogy, the tougher the programming problem, the less ground should be covered by each test. Readers of Extreme Programming Explained will notice a difference in tone between XP and TDD. TDD isn’t an absolute like Extreme Programming. XP says, “Here are things you must be able to do to be prepared to evolve further.” TDD is a little fuzzier. TDD is an awareness of the gap between decision and feedback during programming, and techniques to control that gap. “What if I do a paper design for a week, then test-drive the code? Is that TDD?” Sure, it’s TDD. You were aware of the gap between decision and feedback and you controlled the gap deliberately. That said, most people who learn TDD find their programming practice changed for good. “Test Infected” is the phrase Erich Gamma coined to describe this shift. You might find yourself writing more tests earlier, and working in smaller steps than you ever dreamed would be sensible. On the other hand, some programmers learn TDD and go back to their earlier practices, reserving TDD for special occasions when ordinary programming isn’t making progress. There are certainly programming tasks that can’t be driven solely by tests (or at least, not yet). Security software and concurrency, for example, are two topics where TDD is not sufficient to mechanically demonstrate that the goals of the software have been met. Security relies on essentially defect-free code, true, but also on human judgement about the methods used to secure the software. Subtle concurrency problems can’t be reliably duplicated by running the code. Once you are finished reading this book, you should be ready to: Start simplyWrite automated testsRefactor to add design decisions one at a time This book is organized into three sections. An example of writing typical model code using TDD. The example is one I got from Ward Cunningham years ago, and have used many times since, multi-currency arithmetic. In it you will learn to write tests before code and grow a design organically.An example of testing more complicated logic, including reflection and exceptions, by developing a framework for automated testing. This example also serves to introduce you to the xUnit architecture that is at the heart of many programmer-oriented testing tools. In the second example you will learn to work in even smaller steps than in the first example, including the kind of self-referential hooha beloved of computer scientists.Patterns for TDD. Included are patterns for the deciding what tests to write, how to write tests using xUnit, and a greatest hits selection of the design patterns and refactorings used in the examples. I wrote the examples imagining a pair programming session. If you like looking at the map before wandering around, you may want to go straight to the patterns in Section 3 and use the examples as illustrations. If you prefer just wandering around and then looking at the map to see where you’ve been, try reading the examples through and refering to the patterns when you want more detail about a technique, then using the patterns as a reference. Several reviewers have commented they got the most out of the examples when they started up a programming environment and entered the code and ran the tests as they read. A note about the examples. Both examples, multi-currency calculation and a testing framework, appear simple. There are (and I have seen) complicated, ugly, messy ways of solving the same problems. I could have chosen one of those complicated, ugly, messy solutions to give the book an air of “reality.” However, my goal, and I hope your goal, is to write clean code that works. Before teeing off on the examples as being too simple, spend 15 seconds imagining a programming world in which all code was this clear and direct, where there were no complicated solutions, only apparently complicated problems begging for careful thought. TDD is a practice that can help you lead yourself to exactly that careful thought.

1,864 citations

Proceedings ArticleDOI
23 Apr 2020
TL;DR: It is consistently found that multi-phase adaptive pretraining offers large gains in task performance, and it is shown that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable.
Abstract: Language models pretrained on text from a wide variety of sources form the foundation of today’s NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. Moreover, adapting to the task’s unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining. Finally, we show that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable. Overall, we consistently find that multi-phase adaptive pretraining offers large gains in task performance.

1,532 citations