Polyphonic sonification of electrocardiography signals for diagnosis of cardiac pathologies.

doi:10.1038/SREP44549


Scientific Reports | 7:44549 | DOI: 10.1038/srep44549

www.nature.com/scientificreports

Polyphonic sonification of electrocardiography signals for diagnosis of cardiac pathologies

Jakob Nikolas Kather^1,2, Thomas Hermann^3, Yannick Bukschat^1, Tilmann Kramer^4, Lothar R. Schad^1 & Frank Gerrit Zöllner^1

Electrocardiography (ECG) data are multidimensional temporal data with ubiquitous applications in the clinic. Conventionally, these data are presented visually. It is presently unclear to what degree data sonification (auditory display) can enable the detection of clinically relevant cardiac pathologies in ECG data. In this study, we introduce a method for polyphonic sonification of ECG data, whereby different ECG channels are simultaneously represented by sound of different pitch. We retrospectively applied this method to 12 samples from a publicly available ECG database. We and colleagues from our professional environment then analyzed these data in a blinded way. Based on these analyses, we found that the sonification technique can be intuitively understood after a short training session. On average, the correct classification rate for observers trained in cardiology was 78%, compared to 68% and 50% for observers not trained in cardiology or not trained in medicine at all, respectively. These values compare to an expected random guessing performance of 25%. Strikingly, 27% of all observers had a classification accuracy over 90%, indicating that sonification can be very successfully used by talented individuals. These findings can serve as a baseline for potential clinical applications of ECG sonification.

In medicine, technological advancements lead to a rapidly growing amount of data. Visualization techniques are usually applied to support data inspection and analysis [1,2]. However, visualization is just one way to make information intelligible to humans. An alternative approach is sonification, i.e. the systematic and reproducible representation of data by using sound [3]. Sonification has been applied to gene expression data [4], DNA methylation data [5], biological imaging data [6], electroencephalography (EEG) signals [7–9], electrocardiogram (ECG) signals [7,10] and combinations of biomedical signals [11].

Coming from a clinical background, we asked whether sonification techniques (used in so-called auditory displays) of complex data sets can aid clinicians in their diagnostic decision making. Specifically, we focused on ECG signals as complex multi-channel datasets with ubiquitous applications in the clinic. Although previous studies have proposed techniques for heart rate sonification [12] and ECG sonification [10,11], no study has evaluated to what extent these techniques are actually suited for clinical application in ECG analysis.

To investigate this question, we first designed a parameter-mapping sonification [13] method that applies time-variant oscillators to convert the multi-channel ECG datasets into a polyphonic sound. Secondly, we applied this method to samples from a publicly available database [14]. Thirdly, we evaluated the diagnostic accuracy for common cardiac pathologies based on sonified ECG signals.

^1 Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany. ^2 Department of Medical Oncology and Internal Medicine VI, National Center for Tumor Diseases, University Hospital Heidelberg, Heidelberg University, Heidelberg, Germany. ^3 Ambient Intelligence Group, Center of Excellence in Cognitive Interaction Technology (CITEC), Bielefeld University, Bielefeld, Germany. ^4 Klinik III für Innere Medizin, Herzzentrum der Universität zu Köln, Cologne, Germany. Correspondence and requests for materials should be addressed to J.N.K. (email: jakob.kather@nct-heidelberg.de)

Received: 01 November 2016

Accepted: 10 February 2017

Published: 20 March 2017

Material and Methods

Ethics statement. In this retrospective study, we used human ECG measurements that are openly accessible in a public database [14]. All patient data were fully anonymized and could not be traced back to any individual patient. Our institution’s medical ethics board II (Medical Faculty Mannheim, Heidelberg University, Germany) gave their consent to this data analysis (decision number 2016–856R-MA, granted to FGZ) and waived the need for informed consent by the respective patients. All analyses were carried out in accordance with the Declaration of Helsinki and in accordance with the ethics board approval.

Dataset. We used a selection of 12-channel ECG signals from the “St.-Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database” (incartdb) on www.physionet.org [14]. From the 12-channel datasets, we extracted the first six leads, corresponding to the electric vectors in the frontal plane (I, II, III, aVR, aVL, aVF). We selected the following four pathologies: ST-elevation myocardial infarction (STEMI), premature ventricular contraction/ventricular extrasystole (PVC), atrial fibrillation and bigeminy. The reasons for this selection were (a) that these patterns represent frequent pathological findings in ECGs and (b) that they were among the most frequent patterns in the database. For each category, we retrieved a single 10 s sample for training and three 10 s samples for testing. Furthermore, from the “PTB database” [15,16] on www.physionet.org [14], we retrieved a 12-lead ECG data set of a healthy control subject.
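The sample extraction described above can be sketched in a few lines. This is a minimal Python illustration rather than the authors' Matlab code: it assumes the recording is already in memory as a list of 12-value samples in the lead order I, II, III, aVR, aVL, aVF, V1 to V6, and the function name `extract_segment` is hypothetical. The 257 Hz sampling rate is taken from the figure captions.

```python
# Minimal sketch: cut a 10 s, six-lead excerpt out of a 12-lead recording.
# Assumption: `record` is a list of samples, each a list of 12 lead values in
# the order I, II, III, aVR, aVL, aVF, V1..V6 (hypothetical in-memory layout).

FS_HZ = 257          # ECG sampling rate stated in the figure captions
FRONTAL_LEADS = 6    # first six leads: I, II, III, aVR, aVL, aVF
SEGMENT_S = 10       # sample length used for training and testing


def extract_segment(record, start_s=0.0, fs=FS_HZ,
                    n_leads=FRONTAL_LEADS, duration_s=SEGMENT_S):
    """Return one list of samples per lead for the requested time window."""
    start = int(round(start_s * fs))
    stop = start + int(round(duration_s * fs))
    if start < 0 or stop > len(record):
        raise ValueError("requested window lies outside the recording")
    window = record[start:stop]
    # Transpose from sample-major to lead-major and drop leads V1..V6.
    return [[sample[lead] for sample in window] for lead in range(n_leads)]


# Synthetic stand-in for a 30 s recording (made-up values, not real ECG data).
demo_record = [[float(lead) for lead in range(12)] for _ in range(30 * FS_HZ)]
channels = extract_segment(demo_record, start_s=5.0)
print(len(channels), "leads,", len(channels[0]), "samples per lead")
```

At 257 Hz a 10 s window holds 2570 samples per lead, so each sonified sample is driven by six traces of that length.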

Data usage statement. All raw ECG data can be downloaded from www.physionet.org [14] as stated above. All other data (including all sound samples) are available as supplementary Files S1, S2 and S3. A detailed flowchart of the algorithm is available as supplementary File S4. All Matlab® source codes used for this study are available under the MIT license (http://opensource.org/licenses/MIT) and can be accessed via the following DOI: [10.4119/unibi/2908653]. We also provide an implementation for the open-source platform SuperCollider that can be accessed via the following DOI: [10.4119/unibi/2908653]. All performance data collected during the data analysis by all observers are available as supplementary File S5.

Computational implementation and hardware. The approaches described in the preceding sections were implemented in Matlab® (R2015b, Mathworks, Natick, MA, USA). All experiments were carried out on a standard computer workstation (2.2 GHz Intel Core i7, 16 GB RAM). All statistical calculations were carried out using Matlab®. Statistical error is given as mean ± standard deviation if not otherwise noted. To test for significance, we used a one-tailed Student’s t-test. Sound samples were played on “Bose SoundLink Mini II” loudspeakers (Bose, Framingham, MA, USA). The entire code required to reproduce the experiments is freely available to the public (see “Data usage statement” section).

Polyphonic ECG sonification. The aim of our study was to develop and test a method for polyphonic sonification of pathological ECG signals. We used 6-channel ECG signals (Fig. 1a) and assigned each channel a note on the standard western chromatic musical scale (visualized in Fig. 1b as a musical note). The voltage of each ECG channel was mapped to the amplitude of the corresponding sound signal (and thus perceptually controls its loudness in a nonlinear way). Similar to Hermann et al. [17], the voltage was furthermore continuously mapped to a frequency variation of 3% (i.e. half of a semitone) for each channel separately. In summary, higher (resp. lower) voltages manifest as louder and slightly up-pitched (resp. softer and slightly down-pitched) notes, and the overall sonification is a continuous stream of six notes playing simultaneously.

Figure 1. Principle of polyphonic sonification of multi-channel ECG data. (a) The Cabrera circle shows the direction of the ECG signal channels projected on a frontal plane through the human body. (b) In our technique, each of the six standard ECG channels is assigned a musical note so that the human auditory system can identify each channel even if multiple channels are played simultaneously. The sampling rate of the ECG signals was 257 Hz and the data shown in (b) correspond to 10 seconds.

For aesthetic reasons we selected the D minor scale (146.83 Hz, 174.61 Hz, 220.00 Hz, 293.67 Hz, 349.23 Hz, 440.00 Hz) over two octaves. In order to compensate for the unequal loudness at the different frequencies, we linearly reduced the amplitude of the channels’ notes from 100% (for the lowest pitch) to 30% (for the highest pitch). While this is not exactly an equal loudness contour as suggested in the Robinson-Dadson curves, it is subjectively balanced. We also added a fixed set of harmonics $f_k = k \cdot f_0$ to each channel, with $k = 3, 4, 5$ and amplitudes of 15%, 5% and 5% relative to the fundamental frequency $f_0$. This results in a more complex timbre for the channels’ sound streams. Note that the 2nd harmonic $f_2$ has been left out intentionally to diminish spectral confusion with the ECG channels 4–6, which are octave-shifted fundamentals of channels 1–3. We refer to this specific version of the parameter-mapping on time-variant oscillators as “polyphonic sonification”. A flowchart of the algorithm including relevant parameters is available in supplementary Figure S4.
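The timbre and loudness balancing described above amount to a small additive synthesis recipe. The Python sketch below is an illustration under stated assumptions, not the published code: the linear gain ramp is applied per note index (the text does not say whether it is linear in index or in frequency), the pitch is held constant for brevity, and the final peak normalisation is an added safeguard against clipping. All function names are hypothetical.

```python
import math

# Note frequencies and harmonic amplitudes as given in the text.
NOTE_HZ = [146.83, 174.61, 220.00, 293.67, 349.23, 440.00]
HARMONICS = {1: 1.00, 3: 0.15, 4: 0.05, 5: 0.05}   # k -> relative amplitude


def note_gains(n_notes=len(NOTE_HZ), low=1.0, high=0.3):
    """Gain per note, falling linearly from 100% (lowest) to 30% (highest)."""
    step = (high - low) / (n_notes - 1)
    return [low + i * step for i in range(n_notes)]


def voice_sample(phase):
    """One sample of the voice timbre: fundamental plus harmonics 3, 4 and 5."""
    return sum(a * math.sin(k * phase) for k, a in HARMONICS.items())


def mix_voices(envelopes, audio_rate=44100):
    """Sum six voices; envelopes[c][n] is the 0..1 amplitude of channel c at sample n."""
    n_samples = len(envelopes[0])
    out = [0.0] * n_samples
    for f0, gain, env in zip(NOTE_HZ, note_gains(), envelopes):
        step = 2.0 * math.pi * f0 / audio_rate   # constant pitch in this sketch
        for n in range(n_samples):
            out[n] += gain * env[n] * voice_sample(step * n)
    peak = max(abs(x) for x in out) or 1.0
    return [x / peak for x in out]   # peak-normalise (assumption, avoids clipping)


# Ten milliseconds of all six voices at full amplitude (made-up envelopes).
mixed = mix_voices([[1.0] * 441 for _ in NOTE_HZ])
print([round(g, 2) for g in note_gains()])
```

Omitting k = 2 from `HARMONICS` mirrors the design choice in the text: the second harmonic of a note in channels 1 to 3 would coincide with the fundamental of the corresponding channel in 4 to 6.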

The results of this sonification technique are available in the supplementary data: S1 (S1_normal_ECG.zip) contains a normal ECG of a healthy control sample. In S2 (S2_incremental_signal.zip), the channels from a pathological ECG are sonified incrementally, i.e. ECG lead III alone, then III and aVF, then III and aVF and II, etc. It can be heard that the individual channels can be identified even if they are played simultaneously.

Data analysis. After sonification, the data were analyzed by 22 blinded observers (one co-author of this paper [TK] and 21 other members of our departments and our professional environment) to test whether our technique can be used to distinguish clinically relevant cardiac pathologies. Each observer belonged to one of the following three groups. Group 1: N = 10 medical students with completed cardiology course or young physicians in their first to third year of clinical practice (“cardio course completed”). Group 2: N = 7 medical students before completion of their cardiology course (“before cardiology course”). Group 3: N = 5 science students (undergraduate and graduate) with no formal training in cardiology whatsoever (“other science students”). We theoretically explained the method to all observers, demonstrated the incremental buildup of six channels to a polyphonic sound sample and successively played four pathological 10 second sound samples (one sample per target category, each sample played twice). During the demonstration of the sound samples, observers were visually shown the underlying data as presented in Fig. 2. Observers were not allowed to go back to the training examples during the testing session. Examples used during the training session were not re-used in the testing session. Then, we played 12 short (10 s) sound samples and asked the participants to classify each sample into exactly one of four categories. Examples for the four types of pathologies are depicted in Fig. 2 and can be listened to in the supplementary data S3 (S3_pathological_samples.zip).

Figure 2. Pathological ECG samples used for auditory data analysis. (a–d) Sample ECG signals for clinically relevant cardiac pathologies. These samples were used for training of human observers. Subsequently, other samples were used to assess user performance in a blinded study. Channels mapped to lower frequencies are shown in blue/green hues while channels mapped to higher frequencies are shown in yellowish hues. The sampling rate of the ECG signals was 257 Hz and the data correspond to 10 seconds.


Results

ECG datasets can be polyphonically sonified. In this study, we developed a new method to convert digital ECG signals to sound (“sonification”). We found that it is possible to process samples from a publicly available database and that the resulting sound is subjectively rated as pleasant (see supplementary File S2_incremental_signal.zip).

Pathological ECG signals can be distinguished after sonification. To test human classification accuracy of sonified ECGs, sonified data were analyzed by N = 22 observers. After all observers had analyzed the data, we assessed whether sonified ECG signals could be used to distinguish clinically relevant cardiac pathologies. We found that in this analysis, there were marked differences between the three groups of observers (Fig. 3): Group 1 (medical students with completed cardiology training or resident physicians) scored highest (N = 10, average performance 78 ± 22%), followed by Group 2 (medical students before their cardiology course, N = 7, 68 ± 18%). Group 3 (undergraduate or graduate science students without any formal training in cardiology) had the lowest scores (N = 5, 50 ± 30%). These values were all well above the expected baseline performance of 25% that corresponds to random guessing in a four-category classification experiment. We performed a univariate analysis (one-tailed Student’s t-test) and found that Group 1 and Group 3 differed significantly (p = 0.028) in terms of classification performance. All other comparisons of groups were not significant (p > 0.05). Six of the 22 observers (27%) had a correct classification rate of over 90% (Fig. 4).
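To put the 25% baseline and the "over 90%" scores into perspective, the chance of reaching such a score by guessing can be worked out from the binomial distribution. The Python sketch below assumes that a guessing observer answers each of the 12 samples independently, with probability 1/4 of being correct. That assumption is ours and is not part of the study's analysis.

```python
from math import comb


def prob_at_least(k_min, n_trials, p_success):
    """Binomial tail: probability of k_min or more successes in n_trials."""
    return sum(
        comb(n_trials, k) * p_success**k * (1.0 - p_success) ** (n_trials - k)
        for k in range(k_min, n_trials + 1)
    )


N_SAMPLES = 12      # test samples per observer, from the text
P_CHANCE = 1 / 4    # four categories, uniform guessing (assumption)

# Expected number of correct answers when guessing: 12 * 1/4 = 3, i.e. 25%.
expected_correct = N_SAMPLES * P_CHANCE

# A rate above 90% on 12 samples requires at least 11 correct answers.
p_high_score = prob_at_least(11, N_SAMPLES, P_CHANCE)

print(f"expected correct by chance: {expected_correct:.0f} of {N_SAMPLES}")
print(f"P(at least 11 of 12 by chance) = {p_high_score:.2e}")
print(f"observers above 90%: {6 / 22:.0%}")
```

Under these assumptions, a score of 11 or 12 out of 12 is reached by guessing only about twice in a million attempts, so the six high-scoring observers are very unlikely to have been guessing.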

Figure 3. Group performance in blinded assessment of ECG signals. Average correct classification rate for each of the three observer groups.

Figure 4. Classification performance of blinded observers. Data analysis results for 12 sound samples and 22 observers are shown. White cells show correct classification, black cells show wrong classification. Observers are ordered by their group (G) with G1 = medical students with completed cardiology training or resident physicians; G2 = medical students before their cardiology course; G3 = science students without any formal training in cardiology.

We also asked all observers whether they had been actively playing an instrument for three or more years at any time during their lives. In a univariate analysis of this variable, the N = 13 observers who had musical training achieved higher scores than the other group (74 ± 21% vs. 59 ± 28%). However, this difference was not statistically significant (p = 0.09, one-sided t-test).
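The two one-tailed comparisons reported above (Group 1 vs. Group 3, and musically trained vs. untrained observers) can be checked for plausibility from the published summary statistics alone. The Python sketch below computes a pooled-variance Student's t statistic and integrates the t density numerically to obtain a one-tailed p-value. It works from rounded means and standard deviations, so it can only approximate the reported values, which were calculated from the raw data in supplementary File S5. The group size of 9 for untrained observers is derived as 22 minus 13. All function names are hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic with pooled variance, plus degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """P(T >= t): 0.5 minus the Simpson integral of the density on [0, t]."""
    h = t / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3.0


# Summary statistics as reported in the text (percent correct).
t_groups, df_groups = pooled_t(78, 22, 10, 50, 30, 5)    # Group 1 vs. Group 3
t_music, df_music = pooled_t(74, 21, 13, 59, 28, 9)      # musical training
print(f"groups: t = {t_groups:.2f}, df = {df_groups}, "
      f"p = {one_tailed_p(t_groups, df_groups):.3f}")
print(f"music:  t = {t_music:.2f}, df = {df_music}, "
      f"p = {one_tailed_p(t_music, df_music):.3f}")
```

Both recomputed p-values fall on the same side of the 0.05 threshold as the reported ones (p = 0.028 and p = 0.09), which is as much agreement as rounded inputs allow.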

Premature ventricular contractions are most easily detected in sonified ECGs. From the set of 264 data points in our experiments, we analyzed which type of ECG abnormality was most easily detected. We found that classification performance was best in the class “premature ventricular contraction” (PVC) with 89% correct classifications (see Fig. 5). This expected outcome can be attributed to the fact that PVC is the only one of the four conditions in which the rhythmical features deviate significantly. Generally, this underpins our assumption that rhythmical features and their deviation are a kind of structure that is easily perceived in an auditory display.
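A per-class analysis of this kind reduces to counting (true class, assigned class) pairs. The Python sketch below shows how a confusion matrix and per-class correct rates can be tabulated. The answers are made up for one hypothetical observer and are not data from the study. The class labels follow the caption of Fig. 5, with rows as true classes and columns as assigned classes.

```python
CLASSES = ["STEMI", "PVC", "A. Fib.", "Bigeminy"]


def confusion_matrix(true_labels, assigned_labels, classes=CLASSES):
    """Count answers: rows are true classes, columns are assigned classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, assigned_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_rate(matrix):
    """Fraction of correctly classified samples for each true class."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(matrix)]


# Made-up answers of one hypothetical observer for 12 test samples.
truth = CLASSES * 3
guess = ["STEMI", "PVC", "A. Fib.", "Bigeminy",
         "A. Fib.", "PVC", "A. Fib.", "Bigeminy",
         "STEMI", "PVC", "Bigeminy", "PVC"]

matrix = confusion_matrix(truth, guess)
rates = per_class_rate(matrix)
for name, row, rate in zip(CLASSES, matrix, rates):
    print(f"{name:9s} {row}  correct: {rate:.0%}")
```

Summing such matrices over all 22 observers would give the 264 classification tasks referred to in the text, with 66 tasks per true class.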

Discussion

In this study, we demonstrate for the first time that minimally trained observers can successfully analyze sonified ECG data and detect clinically relevant pathological patterns. Although the training period was only approximately ten minutes, most observers were able to intuitively grasp the sonification technique and to successfully apply it to unknown samples. Classification performance was significantly better in observers with formal training in cardiology than in science students without such training. This shows that users who are already trained to visually detect abnormalities in ECG signals can make use of this ability in classifying sonified ECGs as well. Consequently, their mental representation of pathological ECG patterns is not restricted to visual patterns. Another interesting finding during our study was that 27% of all observers achieved very high classification accuracies (over 90%). These participants can serve as a proof of principle, showing that it is possible for human observers to reliably classify sonified pathological ECG patterns. During data analysis, several observers reported that they found the classification task to be easier towards the end of the analysis, suggesting a yet unexploited capacity of auditory learning and classification improvement with more extensive training or even longitudinal use. Further analyses of our data set showed that among the four selected ECG patterns, premature ventricular contractions were most easily detected. We attribute this to the fact that, among our four conditions, only in those ECG samples is a regular rhythm disrupted by isolated events.

It should be noted that the present study has limited statistical power: 22 blinded observers analyzed the data and showed a good overall performance, in almost all cases well above the random guessing accuracy of 25%. Also, we detected differences between the groups, with the best performance among observers that were formally trained in cardiology. Still, more research is needed to clearly demonstrate in which circumstances the sonification method yields the best results and which group of observers might benefit most. A first step would be the validation of our findings in a larger study with a wider range of pathological ECG samples. We plan to optimize our method to render task-specific structures more salient, so that refined sonification types can then be evaluated against the present baseline. It will also be interesting to investigate to what extent time compression affects classification, assuming that a significant time reduction can be achieved for diagnosis.

Another interesting perspective is the combination of data sonification with data visualization. In our personal experience, simultaneous presentation of sonified and visualized ECG data allows a very efficient detection of abnormal signals. In the future, these synergies between visual and auditory data presentation should be further investigated.

Figure 5. Confusion matrix of the classification. Classification performance is shown for all 22 observers for 264 classification tasks. Units on the color bar represent the number of samples. The vertical bar represents the true class, the horizontal bar represents the class assigned by human observers. Correctly classified samples are on the diagonal, while off-diagonal samples are not correctly classified. It can be seen that the class “PVC” showed the highest number of correct classifications (STEMI = ST-elevation myocardial infarction, PVC = premature ventricular contraction, A. Fib. = atrial fibrillation).


References

PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals.

Nutzung der EKG-Signaldatenbank CARDIODAT der PTB über das Internet

Correction: Corrigendum: Novel anti-thrombotic agent for modulation of protein disulfide isomerase family member ERp57 for prophylactic therapy

Visualization of image data from cells to organisms

Taxonomy and definitions for sonification and auditory display
