Book

Elements of Information Theory, 2/E

About: This book record was published on 2021-11-16 and is currently open access. It has received 1,772 citations to date and focuses on the topic of information theory.
Citations
Journal ArticleDOI
TL;DR: This paper presents a tuning method for Maxent that uses presence-only data, introduces several concepts that improve the model's predictive accuracy and running time, and describes a new logistic output format that gives an estimate of probability of presence.
Abstract: Accurate modeling of geographic distributions of species is crucial to various applications in ecology and conservation. The best-performing techniques often require some parameter tuning, which may be prohibitively time-consuming to do separately for each species, or unreliable for small or biased datasets. Additionally, even with an abundance of good-quality data, users interested in applying species models may not have the statistical knowledge required for detailed tuning. In such cases, it is desirable to use "default settings", tuned and validated on diverse datasets. Maxent is a recently introduced modeling technique, achieving high predictive accuracy and enjoying several additional attractive properties. The performance of Maxent is influenced by a moderate number of parameters. The first contribution of this paper is the empirical tuning of these parameters. Since many datasets lack information about species absence, we present a tuning method that uses presence-only data. We evaluate our method on independently collected high-quality presence-absence data. In addition to tuning, we introduce several concepts that improve the predictive accuracy and running time of Maxent. We introduce "hinge features" that model more complex relationships in the training data; we describe a new logistic output format that gives an estimate of probability of presence; finally, we explore "background sampling" strategies that cope with sample selection bias and decrease model-building time. Our evaluation, based on a diverse dataset of 226 species from 6 regions, shows: 1) default settings tuned on presence-only data achieve performance which is almost as good as if they had been tuned on the evaluation data itself; 2) hinge features substantially improve model performance; 3) logistic output improves model calibration, so that large differences in output values correspond better to large differences in suitability; 4) "target-group" background sampling can give much better predictive performance than random background sampling; 5) random background sampling (rather than using all available map cells) results in a dramatic decrease in running time, with no decrease in model performance.
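The "hinge features" mentioned above lend themselves to a short illustration. The following Python sketch expands one environmental variable into hinge basis functions; the function name, quantile-based knot placement, and scaling are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def hinge_features(x, n_knots=5):
    """Expand one environmental variable into hinge basis functions.

    A hinge feature is zero below its knot t and rises linearly to 1 at
    the variable's maximum, so a linear (Maxent-style) model over these
    features can fit piecewise-linear species responses. The quantile
    knot placement used here is an illustrative assumption, not the
    paper's exact rule.
    """
    hi = x.max()
    knots = np.quantile(x, np.linspace(0.0, 0.8, n_knots))
    return np.stack([np.clip((x - t) / (hi - t), 0.0, 1.0) for t in knots], axis=1)

# Example: expand a temperature covariate observed at 100 sites
temps = np.random.default_rng(0).uniform(5, 30, size=100)
F = hinge_features(temps)   # shape (100, 5), each column one hinge
```

A linear model fit over such features can represent piecewise-linear responses, which is what lets hinge features "model more complex relationships in the training data" as the abstract puts it.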

5,314 citations


Cites background from "Elements of Information Theory, 2/E..."

  • ...Then the average of their log probabilities will be very close to $-H$, the negative entropy (Cover and Thomas 2006), because $-H$ is simply the mean log probability: $-H = \sum_x q_\lambda(x) \ln q_\lambda(x)$. Thus, for "typical" sites whose log probabilities are close to this mean, we obtain $q_\lambda(x) \approx e^{-H}$....

    [...]

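The identity in the quoted passage is easy to verify numerically. A minimal Python sketch, using a toy distribution q as a stand-in for Maxent's raw output $q_\lambda$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy site distribution q over 1000 map cells (stand-in for Maxent's q_lambda)
q = rng.dirichlet(np.ones(1000))
H = -np.sum(q * np.log(q))                 # Shannon entropy, in nats

# Sites drawn from q have mean log probability close to -H ...
sites = rng.choice(len(q), size=50_000, p=q)
print(np.log(q[sites]).mean(), -H)         # the two values nearly agree

# ... so a "typical" site has probability close to e^(-H)
print(np.exp(-H))
```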

Journal ArticleDOI
TL;DR: The information-theoretic (I-T) approaches to valid inference are outlined, including a review of some simple methods for making formal inference from all the hypotheses in the model set (multimodel inference).
Abstract: We briefly outline the information-theoretic (I-T) approaches to valid inference including a review of some simple methods for making formal inference from all the hypotheses in the model set (multimodel inference). The I-T approaches can replace the usual t tests and ANOVA tables that are so inferentially limited, but still commonly used. The I-T methods are easy to compute and understand and provide formal measures of the strength of evidence for both the null and alternative hypotheses, given the data. We give an example to highlight the importance of deriving alternative hypotheses and representing these as probability models. Fifteen technical issues are addressed to clarify various points that have appeared incorrectly in the recent literature. We offer several remarks regarding the future of empirical science and data analysis under an I-T framework.
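The abstract does not spell out the computations, but Burnham and Anderson's I-T framework rests on AIC differences and Akaike weights. The sketch below shows that core multimodel-inference calculation in Python; the log-likelihoods and parameter counts are made up for illustration.

```python
import numpy as np

def akaike_weights(log_liks, n_params):
    """Evidence weights for a set of candidate models (multimodel inference).

    AIC_i = 2*k_i - 2*ln(L_i); Delta_i = AIC_i - min(AIC);
    w_i = exp(-Delta_i / 2) / sum_j exp(-Delta_j / 2).
    """
    aic = 2 * np.asarray(n_params, float) - 2 * np.asarray(log_liks, float)
    delta = aic - aic.min()
    w = np.exp(-0.5 * delta)
    return aic, delta, w / w.sum()

# Three hypothetical models fit to the same data (made-up log-likelihoods)
aic, delta, w = akaike_weights(log_liks=[-112.3, -110.9, -110.7],
                               n_params=[2, 3, 5])
print(np.round(w, 3))   # [0.365 0.545 0.09]: evidence favors the second model
```

The weights quantify the strength of evidence for each hypothesis given the data, which is exactly the role the abstract assigns to the I-T measures in place of t tests and ANOVA tables.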

3,105 citations


Cites background from "Elements of Information Theory, 2/E..."

  • ...The concept of “information” was quantified and this provided a series of enormous breakthroughs affecting modern society (see Hobson and Cheng 1973, Guiasu 1977, Soofi 1994, Jessop 1995, and Cover and Thomas 2006 for background)....

    [...]

Proceedings ArticleDOI
01 Jun 2008
TL;DR: A novel filter bank common spatial pattern (FBCSP) algorithm is proposed to perform autonomous selection of key temporal-spatial discriminative EEG characteristics; results show that FBCSP, using a particular combination of feature selection and classification algorithms, yields relatively higher cross-validation accuracies compared to prevailing approaches.
Abstract: In motor imagery-based brain computer interfaces (BCI), discriminative patterns can be extracted from the electroencephalogram (EEG) using the common spatial pattern (CSP) algorithm. However, the performance of this spatial filter depends on the operational frequency band of the EEG. Thus, setting a broad frequency range or manually selecting a subject-specific frequency range is common practice with the CSP algorithm. To address this problem, this paper proposes a novel filter bank common spatial pattern (FBCSP) algorithm to perform autonomous selection of key temporal-spatial discriminative EEG characteristics. After the EEG measurements have been bandpass-filtered into multiple frequency bands, CSP features are extracted from each of these bands. A feature selection algorithm is then used to automatically select discriminative pairs of frequency bands and corresponding CSP features. A classification algorithm is subsequently used to classify the CSP features. A study is conducted to assess the performance of a selection of feature selection and classification algorithms for use with the FBCSP. Extensive experimental results are presented on a publicly available dataset as well as data collected from healthy subjects and unilaterally paralyzed stroke patients. The results show that FBCSP, using a particular combination of feature selection and classification algorithms, yields relatively higher cross-validation accuracies compared to prevailing approaches.
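The pipeline the abstract describes (band-pass filter bank, per-band CSP, log-variance features) can be sketched compactly. In the Python sketch below, the band edges, filter order, number of CSP filter pairs, and function names are illustrative assumptions; the paper's exact settings may differ.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.signal import butter, sosfiltfilt

def csp_filters(X1, X2, n_pairs=2):
    """Common spatial pattern filters for two classes of EEG trials.

    X1, X2: arrays of shape (trials, channels, samples). Solves the
    generalized eigenproblem C1 w = lambda (C1 + C2) w and keeps the
    eigenvectors at both extremes, i.e. the spatial filters whose output
    variance best discriminates the two classes.
    """
    C1 = np.mean([np.cov(t) for t in X1], axis=0)
    C2 = np.mean([np.cov(t) for t in X2], axis=0)
    evals, evecs = eigh(C1, C1 + C2)          # ascending eigenvalues
    return np.concatenate([evecs[:, :n_pairs], evecs[:, -n_pairs:]], axis=1).T

def fbcsp_features(X1, X2, fs=250, bands=((4, 8), (8, 12), (12, 16), (16, 20))):
    """Filter-bank CSP: band-pass into sub-bands, run CSP per band, and
    take log-variance of each spatially filtered signal as the feature."""
    feats1, feats2 = [], []
    for lo, hi in bands:
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        B1, B2 = sosfiltfilt(sos, X1, axis=-1), sosfiltfilt(sos, X2, axis=-1)
        W = csp_filters(B1, B2)
        for B, out in ((B1, feats1), (B2, feats2)):
            proj = np.einsum("fc,tcs->tfs", W, B)    # apply spatial filters
            out.append(np.log(proj.var(axis=-1)))    # log-variance features
    return np.hstack(feats1), np.hstack(feats2)

# Hypothetical trials: 40 per class, 16 channels, 2 s at 250 Hz
rng = np.random.default_rng(4)
X1, X2 = rng.standard_normal((2, 40, 16, 500))
f1, f2 = fbcsp_features(X1, X2)   # each (40, 16): 4 bands x 4 CSP filters
```

A feature selection algorithm (the quoted citation context below uses mutual information) and a classifier would then operate on these stacked per-band features.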

991 citations


Cites background from "Elements of Information Theory, 2/E..."

  • ...The MI between the two random variables is [19]...

    [...]
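The equation itself is truncated in the quotation; presumably it is the standard definition of mutual information from Cover and Thomas (reference [19] here):

$$I(X;Y) \;=\; \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y)\, \log \frac{p(x,y)}{p(x)\,p(y)} \;=\; H(X) - H(X \mid Y)$$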

Journal ArticleDOI
TL;DR: A new measure of dysphonia, pitch period entropy (PPE), is introduced, which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency, and is well suited to telemonitoring applications.
Abstract: In this paper, we present an assessment of the practical value of existing traditional and nonstandard measures for discriminating healthy people from people with Parkinson's disease (PD) by detecting dysphonia. We introduce a new measure of dysphonia, pitch period entropy (PPE), which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency. We collected sustained phonations from 31 people, 23 with PD. We then selected ten highly uncorrelated measures; an exhaustive search of all possible combinations of these measures found four that in combination led to an overall correct classification performance of 91.4%, using a kernel support vector machine. In conclusion, we find that nonstandard methods in combination with traditional harmonics-to-noise ratios are best able to separate healthy from PD subjects. The selected nonstandard methods are robust to many uncontrollable variations in acoustic environment and individual subjects, and are thus well suited to telemonitoring applications.
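The exhaustive search over measure combinations is straightforward to sketch. In the Python sketch below, the RBF kernel, 5-fold cross-validation, and the synthetic stand-in data are assumptions; only the subject counts (31 subjects, 23 with PD) and the ten-measure pool come from the abstract.

```python
import numpy as np
from itertools import combinations
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def best_measure_subset(X, y, max_size=4):
    """Exhaustively score every subset of dysphonia measures (columns of X)
    with a kernel SVM, keeping the best cross-validated accuracy.

    The RBF kernel and 5-fold CV are illustrative assumptions; the paper's
    exact kernel and validation protocol may differ.
    """
    best = (0.0, ())
    for k in range(1, max_size + 1):
        for cols in combinations(range(X.shape[1]), k):
            acc = cross_val_score(SVC(kernel="rbf"), X[:, cols], y, cv=5).mean()
            best = max(best, (acc, cols))
    return best

# Hypothetical stand-in data: 31 subjects (23 PD, 8 healthy) x 10 measures
rng = np.random.default_rng(3)
X = rng.normal(size=(31, 10))
y = np.r_[np.ones(23, int), np.zeros(8, int)]
print(best_measure_subset(X, y))   # (best accuracy, best column subset)
```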

816 citations


Cites methods from "Elements of Information Theory, 2/E..."

  • ...Finally, we calculate the entropy of this probability distribution [44] which then characterizes the extent of (non-Gaussian) fluctuations in the sequence of relative semitone pitch period variations....

    [...]
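The entropy computation in the quoted step is a standard discrete (histogram) estimate. A minimal Python sketch, assuming the earlier PPE stages have already produced a sequence of relative semitone pitch-period variations; the bin count and the synthetic test sequence are illustrative:

```python
import numpy as np

def discrete_entropy(samples, n_bins=30):
    """Entropy (in nats) of a histogram estimate of the sample distribution.

    In the PPE measure, a more spread-out (less Gaussian) distribution of
    relative semitone pitch-period variations yields a higher entropy.
    The bin count here is an illustrative choice.
    """
    counts, _ = np.histogram(samples, bins=n_bins)
    p = counts[counts > 0] / counts.sum()   # normalize; drop empty bins
    return -np.sum(p * np.log(p))

# Hypothetical residual pitch-variation sequence (heavy-tailed for illustration)
rng = np.random.default_rng(2)
print(discrete_entropy(rng.standard_t(df=3, size=2000)))
```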

Journal ArticleDOI
TL;DR: This paper discusses a selection of promising and interesting research areas in the design of protocols and systems for wireless industrial communications that have either emerged as hot topics in the industrial communications community in the last few years or could be worthwhile research topics in the next few years.
Abstract: In this paper we discuss a selection of promising and interesting research areas in the design of protocols and systems for wireless industrial communications. We have selected topics that have either emerged as hot topics in the industrial communications community in the last few years (like wireless sensor networks), or which could be worthwhile research topics in the next few years (for example cooperative diversity techniques for error control, cognitive radio/opportunistic spectrum access for mitigation of external interferences).

696 citations