
Overview of ImageCLEF 2018:
Challenges, Datasets and Evaluation

Bogdan Ionescu¹(✉), Henning Müller², Mauricio Villegas³, Alba García Seco de Herrera⁴, Carsten Eickhoff⁵, Vincent Andrearczyk², Yashin Dicente Cid², Vitali Liauchuk⁶, Vassili Kovalev⁶, Sadid A. Hasan⁷, Yuan Ling⁷, Oladimeji Farri⁷, Joey Liu⁷, Matthew Lungren⁸, Duc-Tien Dang-Nguyen⁹, Luca Piras¹⁰, Michael Riegler¹¹,¹², Liting Zhou⁹, Mathias Lux¹³, and Cathal Gurrin⁹

¹ University Politehnica of Bucharest, Bucharest, Romania
  bionescu@alpha.imag.pub.ro
² University of Applied Sciences Western Switzerland (HES-SO), Sierre, Switzerland
³ omni:us, Berlin, Germany
⁴ University of Essex, Colchester, UK
⁵ Brown University, Providence, RI, USA
⁶ United Institute of Informatics Problems, Minsk, Belarus
⁷ Artificial Intelligence Lab, Philips Research North America, Cambridge, MA, USA
⁸ Department of Radiology, Stanford University, Stanford, CA, USA
⁹ Dublin City University, Dublin, Ireland
¹⁰ University of Cagliari and Pluribus One, Cagliari, Italy
¹¹ University of Oslo, Oslo, Norway
¹² Simula Metropolitan Center for Digital Engineering, Oslo, Norway
¹³ Klagenfurt University, Klagenfurt, Austria
Abstract. This paper presents an overview of the ImageCLEF 2018
evaluation campaign, an event that was organized as part of the CLEF
(Conference and Labs of the Evaluation Forum) Labs 2018. ImageCLEF
is an ongoing initiative (it started in 2003) that promotes the evalua-
tion of technologies for annotation, indexing and retrieval with the aim
of providing information access to collections of images in various usage
scenarios and domains. In 2018, the 16th edition of ImageCLEF ran three
main tasks and a pilot task: (1) a caption prediction task that aims at
predicting the caption of a figure from the biomedical literature based
only on the figure image; (2) a tuberculosis task that aims at detecting
the tuberculosis type, severity and drug resistance from CT (Computed
Tomography) volumes of the lung; (3) a LifeLog task (videos, images
and other sources) about daily activities understanding and moment
retrieval, and (4) a pilot task on visual question answering where systems
are tasked with answering medical questions. The strong participation,
with over 100 research groups registering and 31 submitting results for
the tasks, shows an increasing interest in this benchmarking campaign.
© Springer Nature Switzerland AG 2018
P. Bellot et al. (Eds.): CLEF 2018, LNCS 11018, pp. 309–334, 2018.
https://doi.org/10.1007/978-3-319-98932-7_28

1 Introduction
One or two decades ago, getting access to large visual data sets for research was difficult, and open data collections on which researchers could compare their algorithms were rare. Nowadays it is easier to access data collections, but it remains hard to obtain annotated data with a clear evaluation scenario and strong baselines to compare against. Motivated by this, ImageCLEF has for 16 years been an initiative that aims at evaluating multilingual or language-independent annotation and retrieval of images [5,21,23,25,39]. The main goal of ImageCLEF is to support the advancement of the field of visual media analysis, classification, annotation, indexing and retrieval. It proposes novel challenges and develops the necessary infrastructure for evaluating visual systems operating in different contexts, providing reusable resources for benchmarking. It is also linked to initiatives such as Evaluation-as-a-Service (EaaS) [17,18].
Many research groups have participated in these evaluation campaigns over the years, and even more have acquired the datasets for experimentation. The scholarly impact of ImageCLEF is also significant, as indicated by the substantial number of its publications and the citations they have received [36].
There are other evaluation initiatives that have had a close relation with ImageCLEF. LifeCLEF [22] was formerly an ImageCLEF task; however, because assessing technologies for the automated identification and understanding of living organisms requires data that is not restricted to images but also includes videos and sound, it is now organised independently of ImageCLEF. Other CLEF labs linked to ImageCLEF, in particular to its medical tasks, are: CLEFeHealth [14], which deals with processing methods and resources to enrich difficult-to-understand eHealth text, and the BioASQ [4] tasks, which target biomedical semantic indexing and question answering and were formerly run as part of the Question Answering lab. Due to their medical orientation, their organisation is coordinated in close collaboration with the medical tasks in ImageCLEF. In 2017, ImageCLEF explored synergies with the MediaEval Benchmarking Initiative for Multimedia Evaluation [15], which focuses on exploring the "multi" in multimedia: speech, audio, visual content, tags, users, context. MediaEval was founded in 2008 as VideoCLEF, a track in the CLEF Campaign.
This paper presents a general overview of the ImageCLEF 2018 evaluation campaign (http://imageclef.org/2018/), which as usual was an event organised as part of the CLEF labs (http://clef2018.clef-initiative.eu/).
The remainder of the paper is organized as follows. Section 2 presents a general description of the 2018 edition of ImageCLEF, commenting on the overall organisation of and participation in the lab. It is followed by sections dedicated to the four tasks that were organised this year: Sect. 3 for the Caption Task,
Sect. 4 for the Tuberculosis Task, Sect. 5 for the Visual Question Answering
Task, and Sect. 6 for the Lifelog Task. For the full details and complete results
on the participating teams, the reader should refer to the corresponding task

overview papers [7,11,19,20]. The final section concludes the paper by giving an
overall discussion, and pointing towards the challenges ahead and possible new
directions for future research.
2 Overview of Tasks and Participation
ImageCLEF 2018 consisted of three main tasks and a pilot task that covered challenges in diverse fields and usage scenarios. In 2017 [21], the proposed challenges were almost all new in comparison to 2016 [40], the only exception being caption prediction, a subtask already attempted in 2016 for which no participant had submitted results. After such a big change, the objective for 2018 was to continue most of the tasks from 2017. The only change was that the 2017 Remote Sensing pilot task was replaced by a novel one on Visual Question Answering. The 2018 tasks are the following:
ImageCLEFcaption: Interpreting and summarizing the insights gained
from medical images such as radiology output is a time-consuming task that
involves highly trained experts and often represents a bottleneck in clinical
diagnosis pipelines. Consequently, there is a considerable need for automatic
methods that can approximate this mapping from visual information to con-
densed textual descriptions. The task addresses the problem of bio-medical
image concept detection and caption prediction from large amounts of train-
ing data.
ImageCLEFtuberculosis: The main objective of the task is to provide
a tuberculosis severity score based on the automatic analysis of lung CT
images of patients. Being able to extract this information from the image data alone would make it possible to limit lung washing and laboratory analyses for determining the tuberculosis type and drug resistances. This can lead to quicker decisions
on the best treatment strategy, reduced use of antibiotics and lower impact
on the patient.
ImageCLEFlifelog: An increasingly wide range of personal devices is becoming available, including smartphones, video cameras, and wearable devices that can capture pictures, videos, and audio clips of every moment of life. Considering the huge volume of data created, there is a need for systems that can automatically analyse the data in order to categorize and summarize it, and to retrieve the information that the user may require. Hence, this task addresses the problems of lifelog data understanding, summarization and retrieval.
ImageCLEF-VQA-Med (pilot task): Visual Question Answering is a new
and exciting problem that combines natural language processing and com-
puter vision techniques. With the ongoing drive for improved patient engage-
ment and access to the electronic medical records via patient portals, patients
can now review structured and unstructured data from labs and images to
text reports associated with their healthcare utilization. Such access can help
them better understand their conditions in line with the details received from
their healthcare provider. Given a medical image accompanied by a set of clinically relevant questions, participating systems are tasked with answering the questions based on the visual image content.
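To make the input/output contract of this pilot task concrete, below is a minimal late-fusion baseline sketch in PyTorch. It is purely illustrative and not any participant's system: the layer sizes, the fixed answer vocabulary, and the token-id question encoding are all assumptions made for the sketch.

```python
# Illustrative VQA baseline sketch (NOT any participant's system): encode
# the image with a small CNN, the question with a GRU, fuse by elementwise
# product, and classify over a fixed set of candidate answers.
import torch
import torch.nn as nn

class TinyVQABaseline(nn.Module):
    def __init__(self, vocab_size=5000, answer_count=500, embed_dim=128):
        super().__init__()
        # Image branch: a toy CNN standing in for a pretrained encoder.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )
        # Question branch: word embeddings followed by a GRU.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, embed_dim, batch_first=True)
        # Fusion and classification over a fixed answer set.
        self.classifier = nn.Linear(embed_dim, answer_count)

    def forward(self, image, question_ids):
        img_feat = self.cnn(image)                    # (batch, embed_dim)
        _, q_hidden = self.gru(self.embed(question_ids))
        q_feat = q_hidden.squeeze(0)                  # (batch, embed_dim)
        return self.classifier(img_feat * q_feat)     # answer logits

# Smoke test with one random image and an 8-token question.
model = TinyVQABaseline()
logits = model(torch.randn(1, 3, 224, 224), torch.randint(0, 5000, (1, 8)))
print(logits.shape)  # torch.Size([1, 500])
```

Real systems typically replace the toy CNN with a pretrained image encoder and the GRU with a stronger language model, but the encode-fuse-classify shape of the pipeline stays the same.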
In order to participate in the evaluation campaign, the research groups first
had to register by following the instructions on the ImageCLEF 2018 web page.
To ease the overall management of the campaign, this year the challenge was organized through the crowdAI platform (https://www.crowdai.org/). To get access to the datasets, the
participants were required to submit a signed End User Agreement (EUA) form.
Table 1 summarizes the participation in ImageCLEF 2018, including the number
of registrations (counting only the ones that downloaded the EUA) and the
number of signed EUAs, indicated both per task and for the overall Lab. The
table also shows the number of groups that submitted results (runs) and the
ones that submitted a working notes paper describing the techniques used.
The number of registrations could be interpreted as the initial interest of the community in the evaluation. However, it is a bit misleading, because several persons from the same institution might register even though in the end they count as a single participating group. The EUA explicitly requires all groups that get access to the data to participate, even though this is not enforced. Unfortunately, the percentage of groups that submit results is often limited. Nevertheless, as observed in studies of scholarly impact [36,37], the datasets and challenges provided by ImageCLEF often get used in subsequent years, in part by researchers who for some reason (e.g. a lack of time or other priorities) were unable to participate in the original event or did not complete the tasks by the deadlines.
After a decrease in participation in 2016, participation increased again in 2017 and grew further in 2018. The number of signed EUAs is considerably higher, mostly due to the fact that this time each task had an independent EUA. Also, due to the change to crowdAI, online registration became easier and attracted research groups beyond the usual audience, which made the registration-to-participation ratio lower than in previous years. On the other hand, crowdAI is a much more modern platform that offers new possibilities, for example continuously running the challenge even beyond the workshop dates. Nevertheless, in the end, 31 groups participated and 28 working notes papers were submitted, which is a slight increase with respect to 2017. The following four sections are dedicated to each of the tasks. Only a short overview is reported, including general objectives, descriptions of the tasks and datasets, and a short summary of the results.
3 The Caption Task
This task studies algorithmic approaches to medical image understanding. As
a testbed for doing so, teams were tasked with automatically “guessing” fitting
keywords or free-text captions that best describe an image from a collection of
images published in the biomedical literature.

Table 1. Key figures of participation in ImageCLEF 2018.

Task         | Registered & downloaded EUA | Signed EUA | Groups that subm. results | Submitted working notes
Caption      | 84                          | 46         | 8                         | 6
Tuberculosis | 85                          | 33         | 11                        | 11
VQA-Med      | 58                          | 28         | 5                         | 5
Lifelog      | 38                          | 25         | 7                         | 7
Overall      | 265*                        | 132*       | 31                        | 29

* Total for all tasks, not unique groups/emails.
3.1 Task Setup
Following the structure of the 2017 edition, two subtasks were proposed. The first task, concept detection, aims to extract the main biomedical concepts represented in an image based only on its visual content. These concepts are UMLS (Unified Medical Language System®) Concept Unique Identifiers (CUIs). The second task, caption prediction, aims to compose coherent free-text captions describing the image based only on the visual information. Participants were, of course, allowed to use the UMLS CUIs extracted in the first task to compose captions from individual concepts. Figure 1 shows an example of the information available in the training set. An image is accompanied by a set of UMLS CUIs and a free-text caption. Compared to 2017, the dataset was strongly modified to respond to some of the difficulties observed with the task in the past [13].
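The full evaluation protocol is described in the corresponding task overview paper; broadly, concept detection lends itself to a set-based F1 over the predicted CUIs of each image, while caption prediction is commonly scored with a string-similarity measure such as BLEU (the BLEU paper appears among this chapter's references). The following is a minimal sketch of both kinds of measure, assuming Python with NLTK installed; the CUIs and captions are invented for illustration.

```python
# Sketch of the two measure families used for this task pair: set-based F1
# over predicted UMLS CUIs, and BLEU over free-text captions. Official
# scoring details are in the task overview paper; this is illustrative only.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def cui_f1(predicted, gold):
    """Set-based F1 between predicted and gold CUI sets for one image."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical CUIs: one of the two predictions matches the gold set.
print(cui_f1(["C0040405", "C0817096"], ["C0040405", "C0225754"]))  # 0.5

# BLEU between a predicted and a reference caption; smoothing keeps short
# captions from scoring zero when higher-order n-grams are absent.
reference = "axial ct scan of the chest showing a cavitary lesion".split()
hypothesis = "axial ct of the chest with a cavitary lesion".split()
print(sentence_bleu([reference], hypothesis,
                    smoothing_function=SmoothingFunction().method1))
```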
3.2 Dataset
The dataset used in this task is derived from figures and their corresponding captions extracted from biomedical articles on PubMed Central® (PMC, https://www.ncbi.nlm.nih.gov/pmc/). This dataset was strongly changed compared to the 2017 edition of the task in order to reduce the diversity of the data and limit the number of compound figures. A subset of clinical figures was automatically obtained from the overall set of 5.8 million PMC figures using a deep multimodal fusion of Convolutional Neural Networks (CNNs), described in [2]. In total, the dataset comprises 232,305 image–caption pairs split into disjoint training (222,305 pairs) and test (10,000 pairs) sets. For the concept detection subtask, concepts present in the caption text were extracted using the QuickUMLS library [30]. After having observed a large breadth of concepts and image types in the 2017 edition of the task, this year's continuation focused on radiology artifacts, introducing a greater topical focus to the collection.
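As an illustration of the concept-extraction step just described, the sketch below runs QuickUMLS [30] over a single caption. It assumes a locally built QuickUMLS index derived from a UMLS installation; the data path and the caption are placeholders, and the parameters shown are the library's documented defaults.

```python
# Sketch of CUI extraction from a caption with QuickUMLS [30]. Requires a
# locally built QuickUMLS index (UMLS license needed); the path below is a
# placeholder. threshold/similarity_name are the library defaults.
from quickumls import QuickUMLS

matcher = QuickUMLS("/path/to/quickumls-index",
                    threshold=0.7, similarity_name="jaccard")

caption = "Axial CT scan of the chest showing a cavitary lesion."
for candidates in matcher.match(caption, best_match=True):
    # Each group holds overlapping candidate matches; keep the best one.
    best = max(candidates, key=lambda c: c["similarity"])
    print(best["cui"], best["ngram"], round(best["similarity"], 2))
```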
