An Introduction to Hidden Markov Models
The basic theory of Markov chains has been known to mathematicians and engineers for close to 80 years, but it is only in the past decade that it has been applied explicitly to problems in speech processing. One of the major reasons why speech models based on Markov chains had not been developed until recently was the lack of a method for optimizing the parameters of the Markov model to match observed signal patterns. Such a method was proposed in the late 1960s and was immediately applied to speech processing in several research institutions. Continued refinements in the theory and implementation of Markov modelling techniques have greatly enhanced the method, leading to a wide range of applications of these models. It is the purpose of this tutorial paper to give an introduction to the theory of Markov models, and to illustrate how they have been applied to problems in speech recognition.
INTRODUCTION
ASSUME YOU ARE GIVEN the following problem. A real world process produces a sequence of observable symbols. The symbols could be discrete (outcomes of coin tossing experiments, characters from a finite alphabet, quantized vectors from a codebook, etc.) or continuous (speech samples, autocorrelation vectors, vectors of linear prediction coefficients, etc.). Your job is to build a signal model that explains and characterizes the occurrence of the observed symbols. If such a signal model is obtainable, it can then be used later to identify or recognize other sequences of observations.
In attacking such a problem, some fundamental decisions, guided by signal and system theory, must be made. For example, one must decide on the form of the model: linear or non-linear, time-varying or time-invariant, deterministic or stochastic. Depending on these decisions, as well as on other signal processing considerations, several possible signal models can be constructed.
To fix ideas, consider modelling a pure sinewave. If we have reason to believe that the observed symbols are from a pure sinewave, then all that would need to be measured is the amplitude, frequency, and perhaps phase of the sinewave, and an exact model, which explains the observed symbols, would result.
L. R. Rabiner
B. H. Juang

IEEE ASSP Magazine, January 1986
Consider next a somewhat more complicated signal, namely a sinewave imbedded in noise. The noise components of the signal make the modelling problem more complicated because, in order to properly estimate the sinewave parameters (amplitude, frequency, phase), one has to take into account the characteristics of the noise component.
In the above examples, we have assumed the sinewave part of the signal was stationary, i.e., not time varying. This may not be a realistic assumption. If, for example, the unknown process produces a sinewave with varying amplitude, then clearly a non-linear model, e.g. amplitude modulation, may be more appropriate. Similarly, if we assume that the frequency, instead of the amplitude, of the sinewave is changing, a frequency-modulation model might be most appropriate.
Linear system models
The concepts behind the above examples have been well studied in classical communication theory. The variety and types of real world processes, however, do not stop here. Linear system models, which model the observed symbols as the output of a linear system excited by an appropriate source, form another important class of processes for signal modeling and have proven useful for a wide variety of applications. For example, "short time" segments of speech signals can be effectively modeled as the output of an all-pole filter excited by appropriate sources with essentially a flat spectral envelope. The signal modeling technique in this case thus involves determination of the linear filter coefficients and, in some cases, the excitation parameters. Obviously, spectral analyses of other kinds also fall within this category.
One can further incorporate temporal variations of the signal into the linear system model by allowing the filter coefficients, or the excitation parameters, to change with time. In fact, many real world processes cannot be meaningfully modeled without considering such temporal variation. Speech signals are one example of such processes.
There are several ways to address the problem of modeling temporal variation of a signal. As mentioned above, within a "short time" period, some physical signals, such as speech, can be effectively modeled by a simple linear time-invariant system with the
0740-7467/86/0100-0004$01.00 © 1986 IEEE
appropriate excitation. The easiest way then to address the time-varying nature of the process is to view it as a direct concatenation of these smaller "short time" segments, each such segment being individually represented by a linear system model. In other words, the overall model is a synchronous sequence of symbols, where each of the symbols is a linear system model representing a short segment of the process. In a sense, this type of approach models the observed signal using representative tokens of the signal itself (or some suitably averaged set of such signals if we have multiple observations).
Time-varying processes
Modeling time-varying processes with the above approach assumes that every such short-time segment of observation is a unit with a prechosen duration. In general, however, there doesn't exist a precise procedure to decide what the unit duration should be so that both the time-invariant assumption holds and the short-time linear system models (as well as concatenation of the models) are meaningful. In most physical systems, the duration of a short-time segment is determined empirically.
In many processes, of course, one would neither expect the properties of the process to change synchronously with every unit analysis duration, nor observe drastic changes from each unit to the next except at certain instances. Making no further assumptions about the relationship between adjacent short-time models, and treating temporal variations, small or large, as "typical" phenomena in the observed signal, are key features in the above direct concatenation technique.
This template approach to signal modeling has proven to be quite useful and has been the basis of a wide variety of speech recognition systems.
There are good reasons to suspect, at this point, that the above approach, while useful, may not be the most efficient (in terms of computation, storage, parameters, etc.) technique as far as representation is concerned. Many real world processes seem to manifest a rather sequentially changing behavior; the properties of the process are usually held pretty steadily, except for minor fluctuations, for a certain period of time (or a number of the above-mentioned duration units), and then, at certain instances, change (gradually or rapidly) to another set of properties.
The opportunity for more efficient modeling can be exploited if we can first identify these periods of rather steady behavior, and then are willing to assume that the temporal variations within each of these steady periods are, in a sense, statistical. A more efficient representation may then be obtained by using a common short time model for each of the steady, or well-behaved, parts of the signal, along with some characterization of how one such period evolves to the next. This is how hidden Markov models (HMM) come about. Clearly, three problems have to be addressed: 1) how these steadily or distinctively behaving periods can be identified, 2) how the "sequentially" evolving nature of these periods can be characterized, and 3) what typical or common short time model should be chosen for each of these periods. Hidden Markov models successfully treat these problems under a probabilistic or statistical framework.
It is thus the purpose of this paper to explain what a hidden Markov model is, why it is appropriate for certain types of problems, and how it can be used in practice. In the next section, we illustrate hidden Markov models via some simple coin toss examples and outline the three fundamental problems associated with the modeling technique. We then discuss how these problems can be solved in Section III. We will not direct our general discussion to any one particular problem, but at the end of this paper we illustrate how HMM's are used via a couple of examples in speech recognition.
DEFINITION OF A HIDDEN MARKOV MODEL
An HMM is a doubly stochastic process with an underlying stochastic process that is not observable (it is hidden), but can only be observed through another set of stochastic processes that produce the sequence of observed symbols. We illustrate HMM's with the following coin toss example.
Coin toss example
To understand the concept of the HMM, consider the following simplified example. You are in a room with a barrier (e.g., a curtain) through which you cannot see what is happening. On the other side of the barrier is another person who is performing a coin (or multiple coin) tossing experiment. The other person will not tell you anything about what he is doing exactly; he will only tell you the result of each coin flip. Thus a sequence of hidden coin tossing experiments is performed, and you only observe the results of the coin tosses, i.e.

O1 O2 O3 ... OT

where H stands for heads and T stands for tails.
Given the above experiment, the problem is how do we build an HMM to explain the observed sequence of heads and tails. One possible model is shown in Fig. 1a. We call this the "1-fair coin" model. There are two states in the model, but each state is uniquely associated with either heads (state 1) or tails (state 2). Hence this model is not hidden because the observation sequence uniquely defines the state. The model represents a "fair coin" because the probability of generating a head (or a tail) following a head (or a tail) is 0.5; hence there is no bias on the current observation. This is a degenerate example and shows how independent trials, like tossing of a fair coin, can be interpreted as a set of sequential events. Of course, if the person behind the barrier is, in fact, tossing a single fair coin, this model should explain the outcomes very well.
A second possible HMM for explaining the observed sequence of coin toss outcomes is given in Fig. 1b. We call this model the "2-fair coin" model. There are again 2 states in the model, but neither state is uniquely associated with
[Figure 1 graphic. Output probabilities: (a) 1-fair coin model: state 1, P(H) = 1.0, P(T) = 0.0; state 2, P(H) = 0.0, P(T) = 1.0. (b) 2-fair coins model: both states, P(H) = P(T) = 0.5. (c) 2-biased coins model: state 1, P(H) = 0.75, P(T) = 0.25; state 2, P(H) = 0.25, P(T) = 0.75. (d) 3-biased coins model: state 1, P(H) = 0.6, P(T) = 0.4; state 2, P(H) = 0.25, P(T) = 0.75; state 3, P(H) = 0.45, P(T) = 0.55. All state transition probabilities in models (a)-(c) are 0.5.]
Figure 1. Models which can be used to explain the results of hidden coin tossing experiments. The simplest model, shown in part (a), consists of a single fair coin with the outcome heads corresponding to one state and tails to the other state. The model of part (b) corresponds to tossing two fair (unbiased) coins, with the first coin being used in state 1 and the second coin being used in state 2. An independent "fair" coin is used to decide which of the other two fair coins is flipped at each trial. The model of part (c) corresponds to tossing two biased coins, with the first coin heavily biased towards heads and the second coin heavily biased towards tails. Again a "fair" coin is used to decide which biased coin is tossed at each trial. Finally, the model of part (d) corresponds to the case of 3 biased coins being used.
either heads or tails. The probability of heads (or tails) in either state is 0.5. Also the probability of leaving (or remaining in) either state is 0.5. Thus, in this case, we can associate each state with a fair (unbiased) coin. Although the probabilities associated with remaining in, or leaving, either of the two states are all 0.5, a little thought should convince the reader that the statistics of the observable output sequences of the 2-fair coins model are independent of the state transitions. The reason for this is that this model is hidden (i.e. we cannot know exactly which fair coin (state) led to the observed heads or tails at each observation), and is essentially indistinguishable (in a statistical sense) from the 1-fair coin model of Fig. 1a.
Figures 1c and 1d show two more possible HMM's which can account for the observed sequence of heads and tails. The model of Fig. 1c, which we call the 2-biased coins model, has two states (corresponding to two different coins). In state 1, the coin is biased strongly towards heads. In state 2, the coin is biased strongly towards tails. The state transition probabilities are all equal to 0.5. This 2-biased coins model is a hidden Markov model which is distinguishable from the two previously discussed models. Interestingly, the reader should be able to convince himself that the long time statistics (e.g. average number of heads or tails) of the observation sequences from the HMM of Fig. 1c are the same as those from the models of Figs. 1a and 1b. This model is very appropriate if what is happening behind the barrier is as follows. The person has three coins, one fair and the other two biased according to the description in Fig. 1c. The two biased coins are associated with the two faces of the fair coin respectively. To report the outcome of every mysterious coin flip, the person behind the barrier first flips the fair coin to decide which biased coin to use, and then flips the chosen biased coin to obtain the result. With this model, we thus are able to look into and explain the above subtle characteristic changes (i.e. switching the biased coins).
The model of Fig. 1d, which we call the 3-biased coins model, has three states (corresponding to three different coins). In state 1 the coin is biased slightly towards heads; in state 2 the coin is biased strongly towards tails; in state 3 the coin is biased slightly towards tails. We have not specified values of the state transition probabilities in Fig. 1d; clearly the behavior of the observation sequences produced by such a model is strongly dependent on these transition probabilities. (To convince himself of this, the reader should consider two extreme cases, namely when the probability of remaining in state 3 is large (>0.95) or small (<0.05). Very different sequence statistics will result from these two extremes because of the strong bias of the coin associated with state 3.) As with the 2-biased coins model, some real scenario behind the barrier corresponding to such a model can be composed; the reader should find no difficulty doing this himself.
[Figure 2 graphic: N urns, each with its own probabilities Pr(R), Pr(B), Pr(Y) of drawing a red, blue, or yellow ball.]

Figure 2. An urn and ball model which illustrates the general case of a discrete symbol hidden Markov model. Each of N urns (the N states of the model) contains a large number of colored balls. The proportion of each colored ball in each urn is different, and is governed by the probability density of colors for each urn. The observations from the urn and ball model consist of announcing the color of the ball drawn at random from a selected urn, replacing the ball, and then choosing a new urn from which to select a ball according to the state transition density associated with the originally selected urn.

There are several important points to be learned from this discussion of how to model the outputs of the coin tossing experiment via HMM's. First we note that one of the most difficult parts of the modeling procedure is to decide on the size (the number of states) of the model. Without some a priori information, this decision often is difficult to make and could involve trial and error before settling on the most appropriate model size. Although we stopped at a 3-coin model for the above illustration, even this might be too small. How do we decide how many coins (states) are really needed in the model? The answer to this question is related to an even larger question, namely how do we choose model parameters (state transition probabilities, probabilities of heads and tails in each state) to optimize the model so that it best explains the observed outcome sequence? We will try to answer this question in the section on Solutions to the Three HMM Problems, as this is the key to the successful use of HMM's for real world problems. A final point concerns the size of the observation sequence.
If we are restricted to a small finite observation sequence we may not be able to reliably estimate the optimal model parameters. (Think of the case of actually using 10 coins but being given a set of only 50-100 observations.) Hence, in a sense, depending on the amount of model training data we are given, certain HMM's may not be statistically reliably different.
Elements of an HMM
We now explain the elements and the mechanism of the type of HMM's that we discuss in this paper:

1. There are a finite number, say N, of states in the model; we shall not rigorously define what a state is, but simply say that within a state the signal possesses some measurable, distinctive properties.
2. At each clock time, t, a new state is entered based upon a transition probability distribution which depends on the previous state (the Markovian property). (Note that the transition may be such that the process remains in the previous state.)
3. After each transition is made, an observation output symbol is produced according to a probability distribution which depends on the current state. This probability distribution is held fixed for the state regardless of when and how the state is entered. There are thus N such observation probability distributions which, of course, represent random variables or stochastic processes.
To fix ideas, let us consider the "urn and ball" model of Fig. 2. There are N urns, each filled with a large number of colored balls. There are M possible colors for each ball. The observation sequence is generated by initially choosing one of the N urns (according to an initial probability distribution), selecting a ball from the initial urn, recording its color, replacing the ball, and then choosing a new urn according to a transition probability distribution associated with the current urn. Thus a typical observation sequence might be:

clock time:           1   2   3   4   ...  T
urn (hidden) state:   q3  q1  q1  q2  ...  qN-2
color (observation):  R   B   Y   Y   ...  R
We now formally define the following model notation for a discrete observation HMM:

T = length of the observation sequence (total number of clock times)
N = number of states (urns) in the model
M = number of observation symbols (colors)
Q = {q1, q2, ..., qN}, states (urns)
V = {v1, v2, ..., vM}, discrete set of possible symbol observations (colors)
A = {aij}, aij = Pr(qj at t+1 | qi at t), state transition probability distribution
B = {bj(k)}, bj(k) = Pr(vk at t | qj at t), observation symbol probability distribution in state j
π = {πi}, πi = Pr(qi at t=1), initial state distribution
Using the model, an observation sequence, O = O1, O2, ..., OT, is generated as follows:
1. Choose an initial state, i1, according to the initial state distribution, π;
2. Set t = 1;
3. Choose Ot according to b_it(k), the symbol probability distribution in state it;
4. Choose it+1 according to {a_it,it+1}, it+1 = 1, 2, ..., N, the state transition probability distribution for state it;
5. Set t = t + 1; return to step 3 if t < T; otherwise terminate the procedure.
We use the compact notation λ = (A, B, π) to represent an HMM. Specification of an HMM involves choice of the number of states, N, and the number of discrete symbols, M (we will briefly discuss continuous density HMM's at the end of this paper), and specification of the three probability densities A, B, and π.
If we try to specify the relative importance of the three densities, A, B, and π, then it should be clear that for most applications π is the least important (it represents only the initial conditions), and B is the most important (since it is directly related to the observed symbols). For some problems the distribution A is also quite important (recall the 3-biased coins model discussed earlier), whereas for other problems (e.g. isolated word recognition problems) it is of less importance.
The three problems for HMM's
HMM's
Given
the
form of
the
HMM
discussed
in
the
previous
section,
there
are
three
key problems of interest
that
must
be
solved for
the
model
to
be
useful
in
real world applica-
tions. These problems
are
the
following:
Problem 1 - Given the observation sequence O = O1, O2, ..., OT, and the model λ = (A, B, π), how do we compute Pr(O | λ), the probability of the observation sequence?

Problem 2 - Given the observation sequence O = O1, O2, ..., OT, how do we choose a state sequence I = i1, i2, ..., iT which is optimal in some meaningful sense?

Problem 3 - How do we adjust the model parameters λ = (A, B, π) to maximize Pr(O | λ)?
Problem 1 is the evaluation problem: given a model and a sequence of observations, how do we compute the probability that the observed sequence was produced by the model? We can also view the problem as: given a model and a sequence of observations, how do we "score" or evaluate the model? The latter viewpoint is very useful. If we think of the case in which we have several competing models (e.g. the four models of Fig. 1 for the coin tossing experiment), the solution to Problem 1 allows us to choose the model which best matches the observations.
Problem 2 is the one in which we attempt to uncover the hidden part of the model, i.e. the state sequence. This is a typical estimation problem. We usually use an optimality criterion to solve this problem as best as possible. Unfortunately, as we will see, there are several possible optimality criteria that can be imposed, and hence the choice of criterion is a strong function of the intended use
for the uncovered state sequence. A typical use of the recovered state sequence is to learn about the structure of the model, and to get average statistics, behavior, etc. within individual states.
Problem 3 is the one in which we attempt to optimize the model parameters so as to best describe how the observed sequence comes about. We call this a training sequence in this case, since it is used to train the model. The training problem is the crucial one for most applications of HMM's, since it allows us to optimally adapt model parameters to observed training data, i.e. to create the best models for real phenomena.
To fix ideas, consider the following speech recognition scheme. We want to design an N-state HMM for each word of a V-word vocabulary. Using vector quantization (VQ) techniques, we represent the speech signal by a sequence of VQ codebook symbols derived from an M-word codebook. Thus we start with a training sequence, for each vocabulary word, consisting of a number of repetitions of the spoken word (by one or more talkers). We use the solution to Problem 3 to optimally get model parameters for each word model. To develop an understanding of the physical meaning of the model states, we use the solution to Problem 2 to segment each of the word training sequences into states, and then study the observations occurring in each state. The result of this study may lead to further improvements on the model. We shall discuss this in later sections. Finally, to do recognition on an unknown word, we use the solution to Problem 1 to score each word model based upon the given test observation sequence, and select the word whose word model score is the highest.
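The recognition step of this scheme reduces to a maximization over word-model scores. The sketch below is our own; the `score` argument stands in for the solution to Problem 1, and the toy vocabulary and stand-in scorer are purely hypothetical:

```python
def recognize(observation_seq, word_models, score):
    """Return the vocabulary word whose HMM best explains the observations.

    word_models: dict mapping each word to its trained model (the output of
    the solution to Problem 3).
    score(O, model): returns Pr(O | model), i.e. the solution to Problem 1.
    """
    return max(word_models,
               key=lambda w: score(observation_seq, word_models[w]))

# Toy usage with a stand-in scorer (a real one would evaluate Pr(O | lambda)):
models = {"yes": "model_yes", "no": "model_no"}
fake_score = lambda O, m: 0.9 if m == "model_yes" else 0.1
print(recognize(["v3", "v1", "v7"], models, fake_score))   # prints: yes
```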
We now present the formal mathematical solutions to each of the three fundamental problems for HMM's. As we shall see, these three problems may be linked together under our probabilistic framework.
SOLUTIONS TO THE THREE HMM PROBLEMS
Problem 1
We wish to calculate the probability of the observation sequence, O, given the model λ. The most straightforward way of doing this is through enumerating every possible state sequence of length T (the number of observations). For every fixed state sequence I = i1 i2 ... iT, the probability of the observation sequence O is Pr(O | I, λ), where

Pr(O | I, λ) = b_i1(O1) b_i2(O2) ... b_iT(OT).

The probability of such a state sequence I, on the other hand, is

Pr(I | λ) = π_i1 a_i1,i2 a_i2,i3 ... a_iT-1,iT.
The joint probability of O and I, i.e., the probability that O and I occur simultaneously, is simply the product of the above two terms, Pr(O, I | λ) = Pr(O | I, λ) Pr(I | λ). The probability of O is then obtained by summing this joint probability over all possible state sequences:

Pr(O | λ) = sum over all state sequences I of Pr(O | I, λ) Pr(I | λ).