Author

Gerald Schaefer

Bio: Gerald Schaefer is an academic researcher at Loughborough University. His research focuses on image retrieval and automatic image annotation. He has an h-index of 35 and has co-authored 465 publications receiving 6,835 citations. Previous affiliations of Gerald Schaefer include the University of Manitoba and the University of East Anglia.


Papers
Proceedings ArticleDOI
TL;DR: Introduces UCID (pronounced "use it"), an Uncompressed Colour Image Dataset that bridges the gap between standardised image databases and the objective evaluation of image retrieval algorithms operating in the compressed domain.
Abstract: Standardised image databases, or rather the lack of them, are one of the main weaknesses in the field of content-based image retrieval (CBIR). Authors often use their own images or do not specify the source of their datasets, which naturally makes comparison of results difficult. While a first step towards a common colour image set has been taken by the MPEG-7 committee, their database does not cater for all strands of research in the CBIR community. In particular, as the MPEG-7 images exist only in compressed form, they do not allow for an objective evaluation of image retrieval algorithms that operate in the compressed domain, nor for judging the influence image compression has on the performance of CBIR algorithms. In this paper we introduce a new dataset, UCID (pronounced "use it"), an Uncompressed Colour Image Dataset which tries to bridge this gap. The UCID dataset currently consists of 1338 uncompressed images together with a ground truth of a series of query images with corresponding models that an ideal CBIR algorithm would retrieve. While its initial intention was to provide a dataset for the evaluation of compressed-domain algorithms, the UCID database also represents a good benchmark set for the evaluation of any kind of CBIR method, as well as an image set that can be used to evaluate image compression and colour quantisation algorithms.
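As an illustration, retrieval performance against UCID-style ground truth (a query image paired with the model images an ideal system would return) could be scored as sketched below; the file names and the ground-truth format are simplified assumptions for illustration, not the official UCID layout.

```python
# Sketch of scoring a CBIR result against UCID-style ground truth.
# The identifiers and the query -> relevant-models mapping are
# hypothetical; the official dataset ships its own ground-truth files.

def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n retrieved images that are relevant."""
    return sum(1 for img in ranked[:n] if img in relevant) / n

# Hypothetical ranked retrieval result for one query image
ranked = ["ucid00042", "ucid00007", "ucid00391", "ucid01001"]
relevant = {"ucid00042", "ucid00391"}  # models an ideal system returns

print(precision_at_n(ranked, relevant, 4))  # 0.5
```

Averaging such per-query scores over all ground-truth queries gives a single comparable figure for a retrieval method.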

1,117 citations

Journal ArticleDOI
TL;DR: A systematic overview of the recent border detection methods in the literature paying particular attention to computational issues and evaluation aspects is presented.

425 citations

Journal ArticleDOI
01 Jan 2014
TL;DR: This paper introduces an effective ensemble of cost-sensitive decision trees for imbalanced classification, which is capable of outperforming other state-of-the-art algorithms and representing a useful and effective approach for dealing with imbalanced datasets.
Abstract: Real-life datasets are often imbalanced, that is, there are significantly more training samples available for some classes than for others, and consequently the conventional aim of maximising overall classification accuracy is not appropriate when dealing with such problems. Various approaches have been introduced in the literature to deal with imbalanced datasets, and are typically based on oversampling, undersampling or cost-sensitive classification. In this paper, we introduce an effective ensemble of cost-sensitive decision trees for imbalanced classification. Base classifiers are constructed according to a given cost matrix, but are trained on random feature subspaces to ensure sufficient diversity of the ensemble members. We employ an evolutionary algorithm for simultaneous classifier selection and assignment of committee member weights for the fusion process. Our proposed algorithm is evaluated on a variety of benchmark datasets, and is confirmed to lead to improved recognition of the minority class, to be capable of outperforming other state-of-the-art algorithms, and hence to represent a useful and effective approach for dealing with imbalanced datasets.
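A minimal sketch of the core idea, cost-sensitive decision trees trained on random feature subspaces and fused by weighted voting, might look as follows; the evolutionary selection and weighting step is replaced here by fixed uniform weights, and all parameter values are illustrative rather than taken from the paper.

```python
# Sketch: cost-sensitive trees on random feature subspaces with weighted
# vote fusion. Costs, subspace size, and tree count are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Imbalanced toy data: 90 majority-class and 10 minority-class samples
X = rng.normal(size=(100, 8))
y = np.array([0] * 90 + [1] * 10)
X[y == 1] += 1.5  # shift the minority class so it is learnable

def fit_ensemble(X, y, n_trees=11, subspace=4, cost=None):
    cost = cost or {0: 1, 1: 9}  # misclassifying the minority class costs more
    ensemble = []
    for _ in range(n_trees):
        feats = rng.choice(X.shape[1], size=subspace, replace=False)
        tree = DecisionTreeClassifier(class_weight=cost, random_state=0)
        tree.fit(X[:, feats], y)
        ensemble.append((tree, feats))
    return ensemble

def predict(ensemble, X, weights=None):
    weights = weights or [1.0] * len(ensemble)  # uniform in this sketch
    votes = sum(w * m.predict(X[:, f]) for (m, f), w in zip(ensemble, weights))
    return (votes >= sum(weights) / 2).astype(int)  # weighted majority vote

ens = fit_ensemble(X, y)
print(predict(ens, X[:5]))
```

In the paper, the per-member `weights` (and which members are kept at all) would be optimised by the evolutionary algorithm rather than fixed.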

282 citations

Journal ArticleDOI
Neeraj Kumar1, Ruchika Verma2, Deepak Anand3, Yanning Zhou4, Omer Fahri Onder, E. D. Tsougenis, Hao Chen, Pheng-Ann Heng4, Jiahui Li5, Zhiqiang Hu6, Yunzhi Wang7, Navid Alemi Koohbanani8, Mostafa Jahanifar8, Neda Zamani Tajeddin8, Ali Gooya8, Nasir M. Rajpoot8, Xuhua Ren9, Sihang Zhou10, Qian Wang9, Dinggang Shen10, Cheng-Kun Yang, Chi-Hung Weng, Wei-Hsiang Yu, Chao-Yuan Yeh, Shuang Yang11, Shuoyu Xu12, Pak-Hei Yeung13, Peng Sun12, Amirreza Mahbod14, Gerald Schaefer15, Isabella Ellinger14, Rupert Ecker, Örjan Smedby16, Chunliang Wang16, Benjamin Chidester17, That-Vinh Ton18, Minh-Triet Tran19, Jian Ma17, Minh N. Do18, Simon Graham8, Quoc Dang Vu20, Jin Tae Kwak20, Akshaykumar Gunda21, Raviteja Chunduri3, Corey Hu22, Xiaoyang Zhou23, Dariush Lotfi24, Reza Safdari24, Antanas Kascenas, Alison O'Neil, Dennis Eschweiler25, Johannes Stegmaier25, Yanping Cui26, Baocai Yin, Kailin Chen, Xinmei Tian26, Philipp Gruening27, Erhardt Barth27, Elad Arbel28, Itay Remer28, Amir Ben-Dor28, Ekaterina Sirazitdinova, Matthias Kohl, Stefan Braunewell, Yuexiang Li29, Xinpeng Xie29, Linlin Shen29, Jun Ma30, Krishanu Das Baksi31, Mohammad Azam Khan32, Jaegul Choo32, Adrián Colomer33, Valery Naranjo33, Linmin Pei34, Khan M. Iftekharuddin34, Kaushiki Roy35, Debotosh Bhattacharjee35, Anibal Pedraza36, Maria Gloria Bueno36, Sabarinathan Devanathan37, Saravanan Radhakrishnan37, Praveen Koduganty37, Zihan Wu38, Guanyu Cai39, Xiaojie Liu39, Yuqin Wang39, Amit Sethi3 
TL;DR: In the MoNuSeg 2018 challenge, several of the top techniques compared favorably to an individual human annotator and can be used with confidence for nuclear morphometrics; colour normalisation and heavy data augmentation were among the trends that contributed to increased accuracy.
Abstract: Generalized nucleus segmentation techniques can contribute greatly to reducing the time to develop and validate visual biomarkers for new digital pathology datasets. We summarize the results of MoNuSeg 2018 Challenge whose objective was to develop generalizable nuclei segmentation techniques in digital pathology. The challenge was an official satellite event of the MICCAI 2018 conference in which 32 teams with more than 80 participants from geographically diverse institutes participated. Contestants were given a training set with 30 images from seven organs with annotations of 21,623 individual nuclei. A test dataset with 14 images taken from seven organs, including two organs that did not appear in the training set was released without annotations. Entries were evaluated based on average aggregated Jaccard index (AJI) on the test set to prioritize accurate instance segmentation as opposed to mere semantic segmentation. More than half the teams that completed the challenge outperformed a previous baseline. Among the trends observed that contributed to increased accuracy were the use of color normalization as well as heavy data augmentation. Additionally, fully convolutional networks inspired by variants of U-Net, FCN, and Mask-RCNN were popularly used, typically based on ResNet or VGG base architectures. Watershed segmentation on predicted semantic segmentation maps was a popular post-processing strategy. Several of the top techniques compared favorably to an individual human annotator and can be used with confidence for nuclear morphometrics.
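The aggregated Jaccard index (AJI) used to rank entries can be sketched as follows; this is an illustrative re-implementation working on integer-labelled instance maps (0 = background), not the official challenge evaluation code.

```python
# Illustrative sketch of the Aggregated Jaccard Index (AJI):
# each ground-truth nucleus is matched to its best-overlapping predicted
# instance, and unmatched predicted pixels inflate the denominator,
# so spurious or merged detections are penalised.
import numpy as np

def aji(gt, pred):
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    used = set()
    inter_sum = union_sum = 0
    for i in gt_ids:
        g = gt == i
        best_iou, best_j = 0.0, None
        for j in pred_ids:
            p = pred == j
            inter = np.logical_and(g, p).sum()
            union = np.logical_or(g, p).sum()
            if union and inter / union > best_iou:
                best_iou, best_j = inter / union, j
        if best_j is None:
            union_sum += g.sum()  # missed nucleus counts fully against us
        else:
            p = pred == best_j
            inter_sum += np.logical_and(g, p).sum()
            union_sum += np.logical_or(g, p).sum()
            used.add(best_j)
    for j in pred_ids:  # penalise predictions matched to no nucleus
        if j not in used:
            union_sum += (pred == j).sum()
    return inter_sum / union_sum if union_sum else 1.0

gt = np.array([[1, 1, 0], [0, 2, 2]])
pred = np.array([[1, 0, 0], [0, 2, 2]])
print(aji(gt, pred))  # 0.75
```

Because intersections and unions are aggregated over the whole image before dividing, AJI rewards accurate instance segmentation rather than mere semantic (foreground/background) segmentation.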

251 citations

Journal ArticleDOI
TL;DR: This paper proposes a new colour invariant image representation based on an existing grey-scale image enhancement technique, histogram equalisation, and shows in an image indexing application that the method outperforms all previous invariant representations.
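The core idea, applying histogram equalisation independently to each colour channel, can be sketched as below; the implementation details are assumptions for illustration. The key property is that equalisation maps each pixel to its normalised rank, so any monotonic (e.g. lighting-induced) change of the channel values leaves the result unchanged.

```python
# Sketch: histogram equalisation of one colour channel as an invariant
# representation. Each value is replaced by its empirical CDF (rank),
# which is unchanged under any monotonic transformation of the channel.
import numpy as np

def equalise_channel(ch):
    """Map each pixel to the fraction of pixels with a value <= its own."""
    values, counts = np.unique(ch, return_counts=True)
    cdf = np.cumsum(counts) / ch.size
    return cdf[np.searchsorted(values, ch)]

# A monotonic brightness change (here: squaring positive values) leaves
# the equalised channel identical, which is the invariance exploited
# for colour-based image indexing.
ch = np.array([[10.0, 50.0], [50.0, 200.0]])
brightened = ch ** 2
print(np.allclose(equalise_channel(ch), equalise_channel(brightened)))  # True
```

Applying this to each of the R, G and B channels gives a representation from which illumination-dependent monotonic distortions have been factored out before indexing.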

216 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: A textbook covering probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, approximate inference, sampling methods, and the combination of models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Proceedings Article
01 Jan 1989
TL;DR: A scheme is developed for classifying the types of motion perceived by a humanlike robot and equations, theorems, concepts, clues, etc., relating the objects, their positions, and their motion to their images on the focal plane are presented.
Abstract: A scheme is developed for classifying the types of motion perceived by a humanlike robot. It is assumed that the robot receives visual images of the scene using a perspective system model. Equations, theorems, concepts, clues, etc., relating the objects, their positions, and their motion to their images on the focal plane are presented.

2,000 citations