Learning from Labeled and Unlabeled Data on a Directed Graph
Dengyong Zhou dengyong.zhou@tuebingen.mpg.de
Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 Tübingen, Germany
Jiayuan Huang j9huang@cs.uwaterloo.ca
School of Computer Science, University of Waterloo, Waterloo ON, N2L 3G1, Canada
Bernhard Schölkopf bernhard.schoelkopf@tuebingen.mpg.de
Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 Tübingen, Germany
Abstract

We propose a general framework for learning from labeled and unlabeled data on a directed graph in which the structure of the graph, including the directionality of the edges, is considered. The time complexity of the algorithm derived from this framework is nearly linear due to recently developed numerical techniques. In the absence of labeled instances, this framework can be utilized as a spectral clustering method for directed graphs, which generalizes the spectral clustering approach for undirected graphs. We have applied our framework to real-world web classification problems and obtained encouraging results.

1. Introduction

Given a directed graph, the vertices in a subset of the graph are labeled. Our problem is to classify the remaining unlabeled vertices. Typical examples of this kind are web page categorization based on hyperlink structure and document classification based on citation graphs (Fig. 1). The main issue to be resolved is to determine how to effectively exploit the structure of directed graphs.

One may assign a label to an unclassified vertex on the basis of the most common label present among the classified neighbors of the vertex. However we want to exploit the structure of the graph globally rather than locally such that the classification or clustering is consistent over the whole graph.
Such a point of view has been considered previously in the method of Zhou et al. (2005). It is motivated by the framework of hubs and authorities (Kleinberg, 1999), which separates web pages into two categories and uses the following recursive notion: a hub is a web page with links to many good authorities, while an authority is a web page that receives links from many good hubs. In contrast, the approach that we will present is inspired by the ranking algorithm PageRank used by the Google search engine (Page et al., 1998). Different from the framework of hubs and authorities, PageRank is based on a direct recursion as follows: an authoritative web page is one that receives many links from other authoritative web pages. When the underlying graph is undirected, the approach that we will present reduces to the method of Zhou et al. (2004).

There has been a large amount of activity on how to exploit the link structure of the web for ranking web pages, detecting web communities, finding web pages similar to a given web page or web pages of interest to a given geographical region, and other applications. One may refer to (Henzinger, 2001) for a comprehensive survey. Unlike those works, the present work is on how to classify the unclassified vertices of a directed graph in which some vertices have been classified, by globally exploiting the structure of the graph. Classifying a finite set of objects in which some are labeled is called transductive inference (Vapnik, 1998). In the absence of labeled instances, our approach reduces to a spectral clustering method for directed graphs, which generalizes the work of Shi and Malik (2000), which may be the most popular spectral clustering scheme for undirected graphs. We would like to mention that understanding how eigenvectors partition a directed graph has been proposed as one of six algorithmic challenges in web search engines by Henzinger (2003). The framework of probabilistic relational models may also be used to deal with structured data like the web (e.g. Getoor et al. (2002)). In contrast to the spirit of the present work, however, it focuses on modeling the probabilistic distribution over the attributes of the related entities in the model.

Figure 1. The World Wide Web can be thought of as a directed graph, in which the vertices represent web pages, and the directed edges hyperlinks.

The structure of this paper is as follows. We first introduce some basic notions from graph theory and Markov chains in Section 2. The framework for learning from directed graphs is presented in Section 3. In the absence of labeled instances, as shown in Section 4, this framework can be utilized as a spectral clustering approach for directed graphs. In Section 5, we develop discrete analysis for directed graphs, and characterize this framework in terms of discrete analysis. Experimental results on web classification problems are described in Section 6.

2. Preliminaries

A directed graph G = (V, E) consists of a finite set V, together with a subset E ⊆ V × V. The elements of V are the vertices of the graph, and the elements of E are the edges of the graph. An edge of a directed graph is an ordered pair [u, v] where u and v are vertices of the graph. We say that the vertex v is adjacent from the vertex u, and the vertex u is adjacent to the vertex v. Moreover, we say that the edge is incident from the vertex u and incident to the vertex v. When u = v the edge is called a loop. A graph is simple if it has no loop.

A path in a directed graph is a tuple of vertices (v_1, v_2, ..., v_p) with the property that [v_i, v_{i+1}] ∈ E for 1 ≤ i ≤ p − 1. We say that a directed graph is strongly connected when for every pair of vertices u and v there is a path in which v_1 = u and v_p = v. For a strongly connected graph, there is an integer k ≥ 1 and a unique partition V = V_0 ∪ V_1 ∪ ··· ∪ V_{k−1} such that for all 0 ≤ r ≤ k − 1 each edge [u, v] ∈ E with u ∈ V_r has v ∈ V_{r+1}, where V_k = V_0, and k is maximal, that is, there is no other such partition V = V'_0 ∪ ··· ∪ V'_{k'−1} with k' > k. When k = 1, we say that the graph is aperiodic; otherwise we say that the graph is periodic.

A graph is weighted when there is a function w : E → R^+ which associates a positive value w([u, v]) with each edge [u, v] ∈ E. The function w is called a weight function. Typically, we can equip a graph with a canonical weight function defined by w([u, v]) := 1 at each edge [u, v] ∈ E. Given a weighted directed graph and a vertex v of this graph, the in-degree function d^− : V → R^+ and out-degree function d^+ : V → R^+ are respectively defined by d^−(v) := Σ_{u→v} w([u, v]) and d^+(v) := Σ_{u←v} w([v, u]), where u → v denotes the set of vertices adjacent to the vertex v, and u ← v the set of vertices adjacent from the vertex v.

Let H(V) denote the space of functions in which each function f : V → R assigns a real value f(v) to each vertex v. A function in H(V) can be thought of as a column vector in R^{|V|}, where |V| denotes the number of vertices in V. The function space H(V) can then be endowed with the standard inner product in R^{|V|} as ⟨f, g⟩_{H(V)} = Σ_{v∈V} f(v)g(v) for all f, g ∈ H(V). Similarly, define the function space H(E) consisting of the real-valued functions on edges. When the function space of the inner product is clear from the context, we omit the subscript H(V) or H(E).

For a given weighted directed graph, there is a natural random walk on the graph with the transition probability function p : V × V → R^+ defined by p(u, v) = w([u, v])/d^+(u) for all [u, v] ∈ E, and 0 otherwise. The random walk on a strongly connected and aperiodic directed graph has a unique stationary distribution π, i.e. a unique probability distribution satisfying the balance equations π(v) = Σ_{u→v} π(u)p(u, v) for all v ∈ V. Moreover, π(v) > 0 for all v ∈ V.

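As a concrete illustration of these preliminaries (not part of the original paper), the following minimal Python sketch builds the natural random walk on a small strongly connected, aperiodic directed graph and approximates its stationary distribution by iterating the balance equations; the example graph and all variable names are illustrative assumptions.

```python
import numpy as np

# Illustrative weighted adjacency matrix W of a strongly connected, aperiodic
# directed graph: W[u, v] = w([u, v]) > 0 iff [u, v] is an edge.
W = np.array([[0., 1., 1., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 1.],
              [1., 1., 0., 0.]])

d_out = W.sum(axis=1)               # out-degrees d^+(u)
P = W / d_out[:, None]              # natural random walk: p(u, v) = w([u, v]) / d^+(u)

pi = np.full(len(W), 1.0 / len(W))  # start from the uniform distribution
for _ in range(10000):              # iterate pi <- pi P until the balance equations hold
    pi_next = pi @ P
    if np.allclose(pi_next, pi, atol=1e-12):
        break
    pi = pi_next

print(pi)                           # unique stationary distribution, pi(v) > 0 for all v
```
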
3. Regularization Framework

Given a directed graph G = (V, E) and a label set Y = {1, −1}, the vertices in a subset S ⊂ V are labeled. The problem is to classify the vertices in the complement of S. The graph G is assumed to be strongly connected and aperiodic. Later we will discuss how to dispose of this assumption.

Assume a classification function f ∈ H(V), which assigns a label sign f(v) to each vertex v ∈ V. On the one hand, the classification function should be as smooth as possible. Specifically, a pair of vertices linked by an edge are likely to have the same label; moreover, vertices lying on a densely linked subgraph are likely to have the same label.
Thus we define a functional

\[
\Omega(f) := \frac{1}{2} \sum_{[u,v] \in E} \pi(u)\, p(u,v) \left( \frac{f(u)}{\sqrt{\pi(u)}} - \frac{f(v)}{\sqrt{\pi(v)}} \right)^2, \qquad (1)
\]
which sums the weighted variation of a function on each edge of the directed graph. On the other hand, the initial label assignment should be changed as little as possible. Let y denote the function in H(V) defined by y(v) = 1 or −1 if vertex v has been labeled as positive or negative respectively, and 0 if it is unlabeled. Then we may consider the optimization problem

\[
\operatorname{argmin}_{f \in H(V)} \left\{ \Omega(f) + \mu \| f - y \|^2 \right\}, \qquad (2)
\]

where µ > 0 is the parameter specifying the tradeoff between the two competitive terms.

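As a concrete illustration (not part of the original paper), the functional in (1) can be evaluated directly once the transition probabilities are arranged in a matrix P and the stationary distribution in a vector pi; the function name and array layout below are assumptions.

```python
import numpy as np

def omega(P, pi, f):
    """Eq. (1): 0.5 * sum over edges [u,v] of pi(u) p(u,v) (f(u)/sqrt(pi(u)) - f(v)/sqrt(pi(v)))^2."""
    g = f / np.sqrt(pi)             # normalized values f(v) / sqrt(pi(v))
    diff = g[:, None] - g[None, :]  # diff[u, v] = g(u) - g(v)
    # Non-edges contribute nothing since p(u, v) = 0 there.
    return 0.5 * float((pi[:, None] * P * diff**2).sum())
```
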
We will provide the motivations for the functional defined by (1). At the end of this section, this functional will be compared with another choice which may seem more natural. The comparison may give us more insight into this functional. In Section 4, it will be shown that this functional may be naturally derived from a combinatorial optimization problem. In Section 5, we will further characterize this functional in terms of discrete analysis on directed graphs.

For an undirected graph G = (V, E), it is well known that the stationary distribution of the natural random walk has the closed-form expression π(v) = d(v)/vol V, where d(v) denotes the degree of the vertex v, and vol V = Σ_{u∈V} d(u). Substituting the closed-form expression into (1), we have

\[
\Omega(f) = \frac{1}{2} \sum_{[u,v] \in E} w([u,v]) \left( \frac{f(u)}{\sqrt{d(u)}} - \frac{f(v)}{\sqrt{d(v)}} \right)^2,
\]

which is the regularizer of the transductive inference algorithm of Zhou et al. (2004) operating on undirected graphs.

For solving the optimization problem (2), we introduce an operator Θ : H(V) → H(V) defined by

\[
(\Theta f)(v) = \frac{1}{2} \left( \sum_{u \to v} \frac{\pi(u)\, p(u,v) f(u)}{\sqrt{\pi(u)\pi(v)}} + \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u) f(u)}{\sqrt{\pi(v)\pi(u)}} \right). \qquad (3)
\]

Let Π denote the diagonal matrix with Π(v, v) = π(v) for all v ∈ V. Let P denote the transition probability matrix and P^T the transpose of P. Then

\[
\Theta = \frac{\Pi^{1/2} P \Pi^{-1/2} + \Pi^{-1/2} P^T \Pi^{1/2}}{2}. \qquad (4)
\]

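In matrix terms, equation (4) can be assembled directly from P and π. The short sketch below is an illustration rather than the authors' code; it uses dense numpy arrays and assumes π is the (positive) stationary distribution of the random walk, e.g. as computed in the Section 2 sketch.

```python
import numpy as np

def theta_matrix(P, pi):
    """Theta = (Pi^{1/2} P Pi^{-1/2} + Pi^{-1/2} P^T Pi^{1/2}) / 2, cf. eq. (4)."""
    s = np.sqrt(pi)                       # diagonal of Pi^{1/2}
    A = (P * s[:, None]) / s[None, :]     # Pi^{1/2} P Pi^{-1/2}
    return (A + A.T) / 2.0                # symmetric by construction
```
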
Lemma 3.1. Let ∆ = I − Θ, where I denotes the identity operator. Then Ω(f) = ⟨f, ∆f⟩.

Proof. The idea is to use summation by parts, a discrete analogue of the more common integration by parts.

\[
\begin{aligned}
& \sum_{[u,v] \in E} \pi(u)\, p(u,v) \left( \frac{f(u)}{\sqrt{\pi(u)}} - \frac{f(v)}{\sqrt{\pi(v)}} \right)^2 \\
&\quad = \frac{1}{2} \sum_{v \in V} \Bigg\{ \sum_{u \to v} \pi(u)\, p(u,v) \left( \frac{f(u)}{\sqrt{\pi(u)}} - \frac{f(v)}{\sqrt{\pi(v)}} \right)^2 + \sum_{u \leftarrow v} \pi(v)\, p(v,u) \left( \frac{f(v)}{\sqrt{\pi(v)}} - \frac{f(u)}{\sqrt{\pi(u)}} \right)^2 \Bigg\} \\
&\quad = \frac{1}{2} \sum_{v \in V} \Bigg\{ \sum_{u \to v} p(u,v) f^2(u) + \sum_{u \to v} \frac{\pi(u)\, p(u,v)}{\pi(v)} f^2(v) - 2 \sum_{u \to v} \frac{\pi(u)\, p(u,v) f(u) f(v)}{\sqrt{\pi(u)\pi(v)}} \Bigg\} \\
&\qquad + \frac{1}{2} \sum_{v \in V} \Bigg\{ \sum_{u \leftarrow v} p(v,u) f^2(v) + \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u)}{\pi(u)} f^2(u) - 2 \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u) f(v) f(u)}{\sqrt{\pi(v)\pi(u)}} \Bigg\}.
\end{aligned}
\]

The first term on the right-hand side may be written

\[
\sum_{[u,v] \in E} p(u,v) f^2(u) = \sum_{u \in V} \sum_{v \leftarrow u} p(u,v) f^2(u) = \sum_{u \in V} \Bigg( \sum_{v \leftarrow u} p(u,v) \Bigg) f^2(u) = \sum_{u \in V} f^2(u) = \sum_{v \in V} f^2(v),
\]

and the second term

\[
\sum_{v \in V} \Bigg( \sum_{u \to v} \frac{\pi(u)\, p(u,v)}{\pi(v)} \Bigg) f^2(v) = \sum_{v \in V} f^2(v).
\]

Similarly, for the fourth and fifth terms, we can show that

\[
\sum_{v \in V} \sum_{u \leftarrow v} p(v,u) f^2(v) = \sum_{v \in V} f^2(v),
\]

and

\[
\sum_{v \in V} \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u)}{\pi(u)} f^2(u) = \sum_{v \in V} f^2(v),
\]

respectively. Therefore,

\[
\Omega(f) = \sum_{v \in V} \Bigg\{ f^2(v) - \frac{1}{2} \Bigg( \sum_{u \to v} \frac{\pi(u)\, p(u,v) f(u) f(v)}{\sqrt{\pi(u)\pi(v)}} + \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u) f(v) f(u)}{\sqrt{\pi(v)\pi(u)}} \Bigg) \Bigg\},
\]

which completes the proof.

Lemma 3.2. The eigenvalues of the operator Θ are in [−1, 1], and the eigenvector with the eigenvalue equal to 1 is √π.

Proof. It is easy to see that Θ is similar to the operator Ψ : H(V) → H(V) defined by Ψ = (P + Π^{−1} P^T Π)/2. Hence Θ and Ψ have the same set of eigenvalues. Assume that f is the eigenvector of Ψ with eigenvalue λ. Choose a vertex v such that |f(v)| = max_{u∈V} |f(u)|. Then we can show that |λ| ≤ 1 by

\[
|\lambda|\, |f(v)| = \Bigg| \sum_{u \in V} \Psi(v,u) f(u) \Bigg| \le \sum_{u \in V} \Psi(v,u)\, |f(v)| = \frac{|f(v)|}{2} \Bigg( \sum_{u \leftarrow v} p(v,u) + \sum_{u \to v} \frac{\pi(u)\, p(u,v)}{\pi(v)} \Bigg) = |f(v)|.
\]

In addition, we can show that Θ√π = √π by

\[
\begin{aligned}
& \frac{1}{2} \Bigg( \sum_{u \to v} \frac{\pi(u)\, p(u,v) \sqrt{\pi(u)}}{\sqrt{\pi(u)\pi(v)}} + \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u) \sqrt{\pi(u)}}{\sqrt{\pi(v)\pi(u)}} \Bigg)
 = \frac{1}{2} \Bigg( \sum_{u \to v} \frac{\pi(u)\, p(u,v)}{\sqrt{\pi(v)}} + \sum_{u \leftarrow v} \frac{\pi(v)\, p(v,u)}{\sqrt{\pi(v)}} \Bigg) \\
&\quad = \frac{1}{2} \Bigg( \frac{1}{\sqrt{\pi(v)}} \sum_{u \to v} \pi(u)\, p(u,v) + \sqrt{\pi(v)} \sum_{u \leftarrow v} p(v,u) \Bigg) = \sqrt{\pi(v)}.
\end{aligned}
\]

Theorem 3.3. The solution of (2) is f* = (1 − α)(I − αΘ)^{−1} y, where α = 1/(1 + µ).

Proof. From Lemma 3.1, differentiating (2) with respect to the function f, we have (I − Θ)f* + µ(f* − y) = 0. Define α = 1/(1 + µ). This system may be written as (I − αΘ)f* = (1 − α)y. From Lemma 3.2, we easily know that (I − αΘ) is positive definite and thus invertible. This completes the proof.

At the beginning of this section, we assumed the graph to be strongly connected and aperiodic such that the natural random walk over the graph converges to a unique and positive stationary distribution. Obviously this assumption cannot be guaranteed for a general directed graph. To remedy this problem, we may introduce the so-called teleporting random walk (Page et al., 1998) as the replacement of the natural one. Given that we are currently at vertex u with d^+(u) > 0, the next step of this random walk proceeds as follows: (1) with probability 1 − η jump to a vertex chosen uniformly at random over the whole vertex set except u; and (2) with probability η w([u, v])/d^+(u) jump to a vertex v adjacent from u. If we are at vertex u with d^+(u) = 0, just jump to a vertex chosen uniformly at random over the whole vertex set except u.

Algorithm. Given a directed graph G = (V, E) and a label set Y = {1, −1}, the vertices in a subset S ⊂ V are labeled. Then the remaining unlabeled vertices may be classified as follows:

1. Define a random walk over G with a transition probability matrix P such that it has a unique stationary distribution, such as the teleporting random walk.

2. Let Π denote the diagonal matrix with its diagonal elements being the stationary distribution of the random walk. Compute the matrix Θ = (Π^{1/2} P Π^{−1/2} + Π^{−1/2} P^T Π^{1/2})/2.

3. Define a function y on V with y(v) = 1 or −1 if vertex v is labeled as 1 or −1, and 0 if v is unlabeled. Compute the function f = (I − αΘ)^{−1} y, where α is a parameter in ]0, 1[, and classify each unlabeled vertex v as sign f(v).

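The following is a minimal end-to-end sketch of the three steps above in Python (an illustration under stated assumptions, not the authors' code): it uses the teleporting random walk with an assumed η = 0.99, obtains the stationary distribution by power iteration, and solves the dense linear system directly; W, y, the parameter values, and all function names are hypothetical.

```python
import numpy as np

def teleporting_random_walk(W, eta=0.99):
    """Step 1: transition matrix of the teleporting random walk (assumes a simple graph)."""
    n = len(W)
    P = np.zeros((n, n))
    d_out = W.sum(axis=1)
    for u in range(n):
        if d_out[u] > 0:
            P[u] = eta * W[u] / d_out[u]        # follow an out-edge with probability eta
            P[u] += (1.0 - eta) / (n - 1)       # otherwise teleport uniformly ...
            P[u, u] -= (1.0 - eta) / (n - 1)    # ... to any vertex except u itself
        else:
            P[u] = 1.0 / (n - 1)                # no out-edges: always teleport
            P[u, u] = 0.0
    return P

def stationary_distribution(P, iters=10000, tol=1e-12):
    pi = np.full(len(P), 1.0 / len(P))
    for _ in range(iters):
        nxt = pi @ P
        if np.allclose(nxt, pi, atol=tol):
            break
        pi = nxt
    return pi

def classify(W, y, eta=0.99, alpha=0.9):
    """Steps 2-3: build Theta and classify by the sign of f = (I - alpha*Theta)^{-1} y."""
    P = teleporting_random_walk(W, eta)
    pi = stationary_distribution(P)
    s = np.sqrt(pi)
    A = (P * s[:, None]) / s[None, :]           # Pi^{1/2} P Pi^{-1/2}
    Theta = (A + A.T) / 2.0
    # The positive factor (1 - alpha) in Theorem 3.3 does not change the sign of f.
    f = np.linalg.solve(np.eye(len(W)) - alpha * Theta, y)
    return np.sign(f)
```

For instance, with the example graph W from the Section 2 sketch and y = np.array([1., 0., 0., -1.]), classify(W, y) assigns sign f(v) to every vertex.
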
It is worth mentioning that the approach of Zhou et al. (2005) can also be derived from this algorithmic framework by defining a two-step random walk. Assume a directed graph G = (V, E) with d^+(v) > 0 and d^−(v) > 0 for all v ∈ V. Given that we are currently at vertex u, the next step of this random walk proceeds as follows: first jump backward to a vertex h adjacent to u with probability p^−(u, h) = w([h, u])/d^−(u); then jump forward to a vertex v adjacent from h with probability p^+(h, v) = w([h, v])/d^+(h). Thus the transition probability from u to v is p(u, v) = Σ_{h∈V} p^−(u, h) p^+(h, v). It is easy to show that the stationary distribution of this random walk is π(v) = d^−(v)/Σ_{u∈V} d^−(u) for all v ∈ V. Substituting the quantities of p(u, v) and π(v) into (1), we then recover one of the two regularizers proposed by Zhou et al. (2005). The other one can also be recovered simply by reversing this two-step random walk.

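The two-step walk just described is also easy to realize in matrix form. The sketch below is illustrative (not from the paper) and assumes d^+(v) > 0 and d^−(v) > 0 for all vertices, as in the text; the stationary distribution is then proportional to the in-degrees, as stated above.

```python
import numpy as np

def two_step_transition(W):
    """p(u, v) = sum_h p^-(u, h) p^+(h, v), with p^-(u, h) = w([h, u])/d^-(u) and p^+(h, v) = w([h, v])/d^+(h)."""
    d_in = W.sum(axis=0)              # in-degrees d^-(v)
    d_out = W.sum(axis=1)             # out-degrees d^+(h)
    P_minus = (W / d_in[None, :]).T   # P_minus[u, h] = w([h, u]) / d^-(u), the backward step
    P_plus = W / d_out[:, None]       # P_plus[h, v]  = w([h, v]) / d^+(h), the forward step
    return P_minus @ P_plus

# pi = d_in / d_in.sum() satisfies pi @ two_step_transition(W) == pi up to rounding.
```
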
Now we discuss implementation issues. The closed form solution shown in Theorem 3.3 involves a matrix inverse. Given an n × n invertible matrix A, the time required to compute the inverse A^{−1} is generally O(n³), and the representation of the inverse requires Ω(n²) space. Recent progress in numerical analysis (Spielman & Teng, 2003), however, shows that, for an n × n symmetric positive semi-definite, diagonally dominant matrix A with m non-zero entries and an n-vector b, we can obtain a vector x̃ within relative distance ε of the solution to Ax = b in time O(m^{1.31} log(n κ_f(A)/ε)^{O(1)}), where κ_f(A) is the log of the ratio of the largest to smallest non-zero eigenvalue of A. It can be shown that our approach can benefit from this numerical technique. From Theorem 3.3,

\[
\left( I - \alpha\, \frac{\Pi^{1/2} P \Pi^{-1/2} + \Pi^{-1/2} P^T \Pi^{1/2}}{2} \right) f^* = (1 - \alpha)\, y,
\]

which may be transformed into

\[
\left( \Pi - \alpha\, \frac{\Pi P + P^T \Pi}{2} \right) \left( \Pi^{-1/2} f^* \right) = (1 - \alpha)\, \Pi^{1/2} y.
\]

Let A = Π − α(ΠP + P^TΠ)/2. It is easy to verify that A is diagonally dominant.

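As a sketch of this inverse-free route (an illustration, not the solver cited above): since A is symmetric and diagonally dominant with positive diagonal, the transformed system can be handed to an off-the-shelf conjugate-gradient solver; the Spielman–Teng algorithm referenced in the text has stronger guarantees, and CG is used here only for concreteness. P, pi, y, and the function name are assumptions.

```python
import numpy as np
from scipy.sparse.linalg import cg

def classify_via_linear_system(P, pi, y, alpha=0.9):
    """Solve (Pi - alpha*(Pi P + P^T Pi)/2) g = (1 - alpha) Pi^{1/2} y, then f = Pi^{1/2} g."""
    Pi = np.diag(pi)
    A = Pi - alpha * (Pi @ P + P.T @ Pi) / 2.0   # symmetric, diagonally dominant
    b = (1.0 - alpha) * np.sqrt(pi) * y
    g, info = cg(A, b)                           # conjugate gradients; info == 0 on convergence
    f = np.sqrt(pi) * g                          # undo the change of variables g = Pi^{-1/2} f
    return np.sign(f)
```
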
To better understand this regularization framework, we may compare it with an alternative approach in which the regularizer is defined by

\[
\Omega(f) = \sum_{[u,v] \in E} w([u,v]) \left( \frac{f(u)}{\sqrt{d^+(u)}} - \frac{f(v)}{\sqrt{d^-(v)}} \right)^2. \qquad (5)
\]

A similar closed form solution can be obtained from the corresponding optimization problem. Clearly, for undirected graphs, this functional also reduces to that in (Zhou et al., 2004). At first glance, this functional may look natural, but in the later experiments we will show that the algorithm based on this functional does not work as well as the previous one. This is because the directionality is only slightly taken into account by this functional via the degree normalization, such that much valuable information for classification conveyed by the directionality is ignored by the corresponding algorithm. Once we remove the degree normalization from this functional, the resulting functional is totally insensitive to the directionality.

4. Directed Spectral Clustering

In the absence of labeled instances, this framework can be utilized in an unsupervised setting as a spectral clustering method for directed graphs. We first define a combinatorial partition criterion, which generalizes the normalized cut criterion for undirected graphs (Shi & Malik, 2000). Then relaxing the combinatorial optimization problem into a real-valued one leads to the functional defined in Section 3.

Given a subset S of the vertices of a directed graph G, define the volume of S by vol S := Σ_{v∈S} π(v). Clearly, vol S is the probability with which the random walk occupies some vertex in S, and consequently vol V = 1. Let S^c denote the complement of S (Fig. 2). The out-boundary ∂S of S is defined by ∂S := {[u, v] | u ∈ S, v ∈ S^c}. The value vol ∂S := Σ_{[u,v]∈∂S} π(u)p(u, v) is called the volume of ∂S. Note that vol ∂S is the probability with which one sees a jump from S to S^c.

Figure 2. A subset S and its complement S^c. Note that there is only one edge in the out-boundary of S.

Generalizing the normalized cut criterion for undirected graphs is based on a key observation stated by

Proposition 4.1. vol ∂S = vol ∂S^c.

Proof. It immediately follows from the fact that the probability with which the random walk leaves a vertex equals the probability with which the random walk arrives at this vertex. Formally, for each vertex v in V, it is easy to see that

\[
\sum_{u \to v} \pi(u)\, p(u,v) - \sum_{u \leftarrow v} \pi(v)\, p(v,u) = 0.
\]

Summing the above equation over the vertices of S (see also Fig. 2), we then have

\[
\sum_{v \in S} \Bigg( \sum_{u \to v} \pi(u)\, p(u,v) - \sum_{u \leftarrow v} \pi(v)\, p(v,u) \Bigg) = \sum_{[u,v] \in \partial S^c} \pi(u)\, p(u,v) - \sum_{[u,v] \in \partial S} \pi(u)\, p(u,v) = 0,
\]

which completes the proof.
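Proposition 4.1 is also easy to check numerically. The snippet below is illustrative (reusing P and pi from the earlier sketches, with an assumed vertex subset S given as a boolean mask): it computes vol ∂S by summing π(u)p(u, v) over pairs leaving S, and the same value is obtained when S is replaced by its complement.

```python
import numpy as np

def boundary_volume(P, pi, S):
    """vol dS: sum of pi(u) * p(u, v) over pairs with u in S and v in the complement of S."""
    S = np.asarray(S, dtype=bool)
    return float((pi[S, None] * P[np.ix_(S, ~S)]).sum())

# With the 4-vertex example graph and S = {0, 1}, the two values agree up to rounding:
# boundary_volume(P, pi, [True, True, False, False])
# boundary_volume(P, pi, [False, False, True, True])
```
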
From Proposition 4.1, we may partition the vertex set of a directed graph into two nonempty parts S and S^c by minimizing

\[
\mathrm{Ncut}(S) = \mathrm{vol}\,\partial S \left( \frac{1}{\mathrm{vol}\, S} + \frac{1}{\mathrm{vol}\, S^c} \right), \qquad (6)
\]
which is a directed generalization of the normalized cut criterion for undirected graphs. Clearly, the ratio of vol ∂S to vol S is the probability with which the random walk leaves S in the next step, given that it is currently in S.

References

Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5), 604-632.

Page, L., Brin, S., Motwani, R., & Winograd, T. (1998). The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project.

Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888-905.

Vapnik, V. (1998). Statistical learning theory. New York: Wiley.