
A Fast Distributed Algorithm for Mining Association Rules*

David W. Cheung†    Jiawei Han‡    Vincent T. Ng††    Ada W. Fu‡‡    Yongjian Fu‡

† Department of Computer Science, The University of Hong Kong, Hong Kong. Email: dcheung@cs.hku.hk.
‡ School of Computing Science, Simon Fraser University, Canada. Email: han@cs.sfu.ca.
†† Department of Computing, Hong Kong Polytechnic University, Hong Kong. Email: cstyng@comp.polyu.edu.hk.
‡‡ Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong. Email: adafu@cs.cuhk.hk.
Abstract

With the existence of many large transaction databases, the huge amounts of data, the high scalability of distributed systems, and the easy partition and distribution of a centralized database, it is important to investigate efficient methods for distributed mining of association rules. This study discloses some interesting relationships between locally large and globally large itemsets and proposes an interesting distributed association rule mining algorithm, FDM (Fast Distributed Mining of association rules), which generates a small number of candidate sets and substantially reduces the number of messages to be passed at mining association rules. Our performance study shows that FDM has a superior performance over the direct application of a typical sequential algorithm. Further performance enhancement leads to a few variations of the algorithm.
1 Introduction

An association rule is a rule which implies certain association relationships among a set of objects (such as "occur together" or "one implies the other") in a database. Since finding interesting association rules in databases may disclose some useful patterns for decision support, selective marketing, financial forecast, medical diagnosis, and many other applications, it has attracted a lot of attention in recent data mining research [5]. Mining association rules may require iterative scanning of large transaction or relational databases, which is quite costly in processing. Therefore, efficient mining of association rules in transaction and/or relational databases has been studied substantially [1, 2, 4, 8, 10, 11, 12, 14, 15].
*The research of the first author was supported in part by RGC (the Hong Kong Research Grants Council) grant 338/065/0026. The research of the second author was supported in part by the research grant NSERC-A3723 from the Natural Sciences and Engineering Research Council of Canada, the research grant NCE:IRIS/PRECARN-HMI5 from the Networks of Centres of Excellence of Canada, and a research grant from Hughes Research Laboratories.
Previous studies examined efficient mining of association rules from many different angles. An influential association rule mining algorithm, Apriori [2], has been developed for rule mining in large transaction databases. The DHP algorithm [10] is an extension of Apriori using a hashing technique. The scope of the study has also been extended to efficient mining of sequential patterns [3], generalized association rules [14], multiple-level association rules [8], quantitative association rules [15], etc. Maintenance of discovered association rules by incremental updating has been studied in [4]. Although these studies are on sequential data mining techniques, algorithms for parallel mining of association rules have been proposed recently [11, 1].
We feel that the development of distributed algorithms for efficient mining of association rules has its unique importance, based on the following reasoning. (1) Databases or data warehouses [13] may store a huge amount of data. Mining association rules in such databases may require substantial processing power, and a distributed system is a possible solution. (2) Many large databases are distributed in nature. For example, the huge number of transaction records of hundreds of Sears department stores are likely to be stored at different sites. This observation motivates us to study efficient distributed algorithms for mining association rules in databases. This study may also shed new light on parallel data mining. Furthermore, a distributed mining algorithm can also be used to mine association rules in a single large database by partitioning the database among a set of sites and processing the task in a distributed manner. The high flexibility, scalability, low cost-performance ratio, and easy connectivity of a distributed system make it an ideal platform for mining association rules.

In this study, we assume that the database to be studied is a transaction database, although the method can be easily extended to relational databases as well. The database consists of a huge number of transaction records, each with a transaction identifier (TID) and a set of data items. Further, we assume that the

database is "horizontally" partitioned (i.e., grouped by transactions) and allocated to the sites in a distributed system which communicate by message passing. Based on these assumptions, we examine distributed mining of association rules. It has been well known that the major cost of mining association rules is the computation of the set of large itemsets (i.e., frequently occurring sets of items, see Section 2.1) in the database [2]. Distributed computing of large itemsets encounters some new problems. One may compute locally large itemsets easily, but a locally large itemset may not be globally large. Since it is very expensive to broadcast the whole data set to other sites, one option is to broadcast all the counts of all the itemsets, no matter locally large or small, to other sites. However, a database may contain enormous combinations of itemsets, and it will involve passing a huge number of messages.
Based on our observation, there exist some interesting properties between locally large and globally large itemsets. One should maximally take advantage of such properties to reduce the number of messages to be passed and confine the substantial amount of processing to local sites. As mentioned before, two algorithms for parallel mining of association rules have been proposed. The two proposed algorithms, PDM and Count Distribution (CD), are designed for shared-nothing parallel systems [11, 1]. However, they can also be adapted to a distributed environment. We have proposed an efficient distributed data mining algorithm FDM (Fast Distributed Mining of association rules), which has the following distinct features in comparison with these two proposed parallel mining algorithms.
1. The generation of candidate sets is in the same spirit as Apriori. However, some interesting relationships between locally large sets and globally large ones are explored to generate a smaller set of candidate sets at each iteration and thus reduce the number of messages to be passed.

2. After the candidate sets have been generated, two pruning techniques, local pruning and global pruning, are developed to prune away some candidate sets at each individual site.

3. In order to determine whether a candidate set is large, our algorithm requires only O(n) messages for support count exchange, where n is the number of sites in the network. This is much less than a straight adaptation of Apriori, which requires O(n^2) messages.
Notice that several different combinations of the local and global prunings can be adopted in FDM. We studied three versions of FDM: FDM-LP, FDM-LUP, and FDM-LPP (see Section 4), with a similar framework but different combinations of pruning techniques. FDM-LP only explores the local pruning; FDM-LUP does both local pruning and the upper-bound-pruning; and FDM-LPP does both local pruning and the polling-site-pruning.

Extensive experiments have been conducted to study the performance of FDM and compare it against the Count Distribution algorithm. The study demonstrates the efficiency of the distributed mining algorithm.
The remainder of the paper is organized as follows. The tasks of mining association rules in sequential as well as distributed environments are defined in Section 2. In Section 3, techniques for distributed mining of association rules and some important results are discussed. The algorithms for different versions of FDM are presented in Section 4. A performance study is reported in Section 5. Our discussions and conclusions are presented respectively in Sections 6 and 7.
2 Problem Definition

2.1 Sequential Algorithm for Mining Association Rules
Let I = {i1, i2, ..., im} be a set of items. Let DB be a database of transactions, where each transaction T consists of a set of items such that T ⊆ I. Given an itemset X ⊆ I, a transaction T contains X if and only if X ⊆ T. An association rule is an implication of the form X ⇒ Y, where X ⊆ I, Y ⊆ I and X ∩ Y = ∅. The association rule X ⇒ Y holds in DB with confidence c if the probability that a transaction in DB which contains X also contains Y is c. The association rule X ⇒ Y has support s in DB if the probability that a transaction in DB contains both X and Y is s. The task of mining association rules is to find all the association rules whose support is larger than a minimum support threshold and whose confidence is larger than a minimum confidence threshold.

For an itemset X, its support is the percentage of transactions in DB which contain X, and its support count, denoted by X.sup, is the number of transactions in DB containing X. An itemset X is large (or more precisely, frequently occurring) if its support is no less than the minimum support threshold. An itemset of size k is called a k-itemset.
It has been shown that the problem of mining association rules can be reduced to two subproblems [2]: (1) find all large itemsets for a given minimum support threshold, and (2) generate the association rules from the large itemsets found. Since (1) dominates the overall cost of mining association rules, the research has been focused on how to develop efficient methods to solve the first subproblem [2].
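To make the first subproblem concrete, here is a minimal sketch (illustrative code, not from the paper) that computes support counts over a toy transaction database and keeps the size-2 itemsets whose support reaches an assumed minimum support threshold:

```python
from itertools import combinations

# Toy transaction database and an assumed minimum support threshold.
DB = [{"A", "B", "C"}, {"B", "C"}, {"A", "C"}, {"B", "D"}]
s = 0.5

def support_count(X, transactions):
    """X.sup: the number of transactions that contain every item of X."""
    return sum(1 for T in transactions if X <= T)

items = sorted(set().union(*DB))
large_2 = [set(c) for c in combinations(items, 2)
           if support_count(set(c), DB) >= s * len(DB)]
print(large_2)   # the large 2-itemsets, here AC and BC
```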
An interesting algorithm, Apriori [2], has been proposed for computing large itemsets at mining association rules in a transaction database. There have been many studies on mining association rules using sequential algorithms in centralized databases (e.g., [10, 14, 8, 12, 4, 15]), which can be viewed as variations or extensions to Apriori. For example, as an extension to Apriori, the DHP algorithm [10] uses a direct hashing technique to eliminate some size-2 candidate sets in the Apriori algorithm.
2.2 Distributed Algorithm for Mining Association Rules
We examine the mining of association rules in a distributed environment. Let DB be a database with D transactions. Assume that there are n sites S1, S2, ..., Sn in a distributed system and the database DB is partitioned over the n sites into {DB1, DB2, ..., DBn}, respectively. Let the size of the partition DBi be Di, for i = 1, ..., n. Let X.sup and X.supi be the support counts of an itemset X in DB and DBi, respectively. X.sup is called the global support count, and X.supi the local support count of X at site Si. For a given minimum support threshold s, X is globally large if X.sup ≥ s × D; correspondingly, X is locally large at site Si if X.supi ≥ s × Di. In the following, L denotes the globally large itemsets in DB, and L(k) the globally large k-itemsets in L. The essential task of a distributed association rule mining algorithm is to find the globally large itemsets L.
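As a small numeric illustration of these definitions (the values are assumed for illustration, not taken from the paper), the global support count is the sum of the local support counts over the partitions, and the two largeness conditions are simple threshold checks:

```python
# Hypothetical local support counts of an itemset X at n = 3 sites,
# and the sizes Di of the corresponding partitions DBi.
X_sup_i = [10, 10, 2]      # X.supi for i = 1..3 (illustrative values)
D_i = [50, 50, 50]         # |DBi| for i = 1..3
s = 0.10                   # minimum support threshold

X_sup = sum(X_sup_i)       # global support count: X.sup = sum of the X.supi
D = sum(D_i)               # |DB| = sum of the |DBi|

globally_large = X_sup >= s * D
locally_large = [sup >= s * d for sup, d in zip(X_sup_i, D_i)]
print(globally_large, locally_large)   # True, [True, True, False]
```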
For comparison, we outline the Count Distribution (CD) algorithm as follows [1]. The algorithm is an adaptation of the Apriori algorithm to the distributed case. At each iteration, CD generates the candidate sets at every site by applying the Apriori_gen function on the set of large itemsets found at the previous iteration. Every site then computes the local support counts of all these candidate sets and broadcasts them to all the other sites. Subsequently, all the sites can find the globally large itemsets for that iteration, and then proceed to the next iteration.
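The per-iteration flow of CD can be summarized by the following single-process sketch (the broadcast of counts is simulated by summing over the partitions directly; `partitions`, `CA_k`, and all names are illustrative assumptions, not the algorithm's actual implementation):

```python
def cd_iteration(partitions, CA_k, s, D):
    """One Count Distribution iteration: CA_k is the candidate set
    Apriori_gen(L(k-1)), computed identically at every site; each site counts
    every candidate over its own partition DBi, the counts are exchanged,
    and the globally large k-itemsets are those with total count >= s * D."""
    local_counts = [
        {X: sum(1 for T in DBi if X <= T) for X in CA_k} for DBi in partitions
    ]
    # Count exchange: after the broadcast, every site holds the global counts.
    global_counts = {X: sum(counts[X] for counts in local_counts) for X in CA_k}
    return {X for X, total in global_counts.items() if total >= s * D}
```

Note that every site counts and exchanges all of CA(k), which is what FDM's pruning techniques below aim to avoid.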
3 Techniques for Distributed Data Mining

3.1 Generation of Candidate Sets
It is important to observe some interesting properties related to large itemsets in distributed environments, since such properties may substantially reduce the number of messages to be passed across the network at mining association rules.

There is an important relationship between large itemsets and the sites in a distributed database: every globally large itemset must be locally large at some site(s). If an itemset X is both globally large and locally large at a site Si, X is called gl-large at site Si. The set of gl-large itemsets at a site will form a basis for the site to generate its own candidate sets.
Two monotonic properties can be easily observed from the locally large and gl-large itemsets. First, if an itemset X is locally large at a site Si, then all of its subsets are also locally large at site Si. Secondly, if an itemset X is gl-large at a site Si, then all of its subsets are also gl-large at site Si. Notice that a similar relationship exists among the large itemsets in the centralized case. Following is an important result based on which an effective technique for candidate set generation in the distributed case is developed.
Lemma 1 If an itemset X is globally large, there exists a site Si (1 ≤ i ≤ n) such that X and all its subsets are gl-large at site Si.
Proof. If X is not locally large at any site, then X.supi < s × Di for all i = 1, ..., n. Therefore, X.sup < s × D, and X cannot be globally large. By contradiction, X must be locally large at some site Si, and hence X is gl-large at Si. Consequently, all the subsets of X must also be gl-large at Si. □
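Spelled out, the counting step of the proof uses only the additivity of the partition sizes and support counts over the partitions (D = D1 + ... + Dn and X.sup = X.sup1 + ... + X.supn), so the local inequalities add up to the global one:

\[
X.\mathrm{sup} \;=\; \sum_{i=1}^{n} X.\mathrm{sup}_i \;<\; \sum_{i=1}^{n} s \times D_i \;=\; s \times \sum_{i=1}^{n} D_i \;=\; s \times D .
\]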
We use GLi to denote the set of gl-large itemsets at site Si, and GLi(k) to denote the set of gl-large k-itemsets at site Si. It follows from Lemma 1 that if X ∈ L(k), then there exists a site Si such that all its size-(k-1) subsets are gl-large at site Si, i.e., they belong to GLi(k-1).
In a straightforward adaptation of Apriori, the set of candidate sets at the k-th iteration, denoted by CA(k) (which stands for the size-k candidate sets from Apriori), would be generated by applying the Apriori_gen function on L(k-1). That is,

CA(k) = Apriori_gen(L(k-1)).

At each site Si, let CGi(k) be the set of candidate sets generated by applying Apriori_gen on GLi(k-1), i.e.,

CGi(k) = Apriori_gen(GLi(k-1)),

where CG stands for candidate sets generated from gl-large itemsets. Hence CGi(k) is generated from GLi(k-1). Since GLi(k-1) ⊆ L(k-1), CGi(k) is a subset of CA(k). In the following, we use CG(k) to denote the set CG1(k) ∪ ··· ∪ CGn(k).
Theorem 1 For every k > 1, the set of all large k-itemsets L(k) is a subset of CG(k) = CG1(k) ∪ ··· ∪ CGn(k), where CGi(k) = Apriori_gen(GLi(k-1)).
Proof. Let X ∈ L(k). It follows from Lemma 1 that there exists a site Si (1 ≤ i ≤ n) such that all the size-(k-1) subsets of X are gl-large at site Si. Hence X ∈ CGi(k). Therefore,

L(k) ⊆ CG(k) = CG1(k) ∪ ··· ∪ CGn(k) = Apriori_gen(GL1(k-1)) ∪ ··· ∪ Apriori_gen(GLn(k-1)). □
Theorem 1 indicates that CG(k), which is a subset of CA(k) and could be much smaller than CA(k), can be taken as the set of candidate sets for the size-k large itemsets. The difference between the two sets, CA(k) and CG(k), depends on the distribution of the itemsets. This theorem forms a basis for the generation of the set of candidate sets in the algorithm FDM. First the set of candidate sets CGi(k) can be generated locally at each site Si at the k-th iteration. After the exchange of support counts, the gl-large itemsets GLi(k) in CGi(k) can be found at the end of that iteration. Based on GLi(k), the candidate sets at Si for the (k+1)-st iteration can then be generated. According to the performance study in Section 5, by using this approach, the number of candidate sets generated can be substantially reduced to about 10-25% of that generated in CD.
Example 1 illustrates the effectiveness of the reduction of candidate sets using Theorem 1.
Example 1 Assume there are 3 sites in a system which partitions the DB into DB1, DB2 and DB3. Suppose the set of large 1-itemsets (computed at the first iteration) is L(1) = {A, B, C, D, E, F, G, H}, in which A, B, and C are locally large at site S1; B, C, and D are locally large at site S2; and E, F, G, and H are locally large at site S3. Therefore, GL1(1) = {A, B, C}, GL2(1) = {B, C, D}, and GL3(1) = {E, F, G, H}. Based on Theorem 1, the set of size-2 candidate sets at site S1 is CG1(2), where CG1(2) = Apriori_gen(GL1(1)) = {AB, BC, AC}. Similarly, CG2(2) = {BC, CD, BD}, and CG3(2) = {EF, EG, EH, FG, FH, GH}. Hence, the set of candidate sets for large 2-itemsets is CG(2) = CG1(2) ∪ CG2(2) ∪ CG3(2), a total of 11 candidates. However, if Apriori_gen is applied to L(1), the set of candidate sets CA(2) = Apriori_gen(L(1)) would have 28 candidates. This shows that it is very effective to apply Theorem 1 to reduce the candidate sets. □
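As a quick check of Example 1, the following sketch (illustrative code, not from the paper) implements Apriori_gen as "every size-(k-1) subset must be large" and reproduces the 11-versus-28 candidate counts:

```python
from itertools import combinations

def apriori_gen(prev_large, k):
    """Candidates of size k whose every (k-1)-subset is in prev_large
    (a set of frozensets); equivalent to Apriori's join-and-prune step."""
    items = sorted(set().union(*prev_large))
    return {frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in prev_large
                   for sub in combinations(c, k - 1))}

L1 = {frozenset(x) for x in "ABCDEFGH"}
GL1 = {frozenset(x) for x in "ABC"}
GL2 = {frozenset(x) for x in "BCD"}
GL3 = {frozenset(x) for x in "EFGH"}

CG2 = apriori_gen(GL1, 2) | apriori_gen(GL2, 2) | apriori_gen(GL3, 2)
CA2 = apriori_gen(L1, 2)
print(len(CG2), len(CA2))   # 11 28
```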
3.2 Local Pruning of Candidate Sets
The previous subsection shows that, based on Theorem 1, one can usually generate in a distributed environment a much smaller set of candidate sets than the direct application of the Apriori algorithm.

When the set of candidate sets CG(k) is generated, to find the globally large itemsets, the support counts of the candidate sets must be exchanged among all the sites. Notice that some candidate sets in CG(k) can be pruned by a local pruning technique before count exchange starts. The general idea is that at each site Si, if a candidate set X ∈ CGi(k) is not locally large at site Si, there is no need for Si to find out its global support count to determine whether it is globally large. This is because in this case, either X is small (not globally large), or it will be locally large at some other site, and hence only the site(s) at which X is locally large need to be responsible for finding the global support count of X. Therefore, in order to compute all the large k-itemsets, at each site Si, the candidate sets can be confined to only the sets X ∈ CGi(k) which are locally large at site Si. For convenience, we use LLi(k) to denote those candidate sets in CGi(k) which are locally large at site Si. Based on the above discussion, at every iteration (the k-th iteration), the gl-large k-itemsets can be computed at each site Si according to the following procedure.
1. Candidate sets generation: Generate the candidate sets CGi(k) based on the gl-large itemsets found at site Si at the (k-1)-st iteration using the formula CGi(k) = Apriori_gen(GLi(k-1)).

2. Local pruning: For each X ∈ CGi(k), scan the partition DBi to compute the local support count X.supi. If X is not locally large at site Si, it is excluded from the candidate sets LLi(k). (Note: This pruning only removes X from the candidate set at site Si. X could still be a candidate set at some other site.)

3. Support count exchange: Broadcast the candidate sets in LLi(k) to other sites to collect support counts. Compute their global support counts and find all the gl-large k-itemsets in site Si.

4. Broadcast mining results: Broadcast the computed gl-large k-itemsets to all the other sites.
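A compact, single-process sketch of this per-site iteration is given below (communication is abstracted away: `collect_counts` and `broadcast` are hypothetical stand-ins for the message-passing layer, and the Apriori_gen sketch is the same one shown after Example 1; none of this is the paper's code):

```python
from itertools import combinations

def apriori_gen(prev_large, k):
    """Candidates of size k whose every (k-1)-subset is in prev_large."""
    items = sorted(set().union(*prev_large))
    return {frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in prev_large
                   for sub in combinations(c, k - 1))}

def fdm_iteration_at_site(DBi, GLi_prev, k, s, Di, D, collect_counts, broadcast):
    """One FDM iteration at site Si with local pruning.
    collect_counts(X) -> local support counts of X at the other sites;
    broadcast(sets)   -> send the gl-large k-itemsets found at Si."""
    # 1. Candidate set generation from the gl-large (k-1)-itemsets at Si.
    CGi_k = apriori_gen(GLi_prev, k)
    # 2. Local pruning: keep only the candidates locally large at Si.
    local_count = {X: sum(1 for T in DBi if X <= T) for X in CGi_k}
    LLi_k = {X for X in CGi_k if local_count[X] >= s * Di}
    # 3. Support count exchange, restricted to the surviving candidates.
    global_count = {X: local_count[X] + sum(collect_counts(X)) for X in LLi_k}
    GLi_k = {X for X in LLi_k if global_count[X] >= s * D}
    # 4. Broadcast the gl-large k-itemsets found at this site.
    broadcast(GLi_k)
    return GLi_k
```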
For clarity, the notations used so far are listed in Table 1.

D         number of transactions in DB
s         support threshold (minsup)
L(k)      globally large k-itemsets
CA(k)     candidate sets generated from L(k-1)
X.sup     global support count of X
Di        number of transactions in DBi
GLi(k)    gl-large k-itemsets at Si
CGi(k)    candidate sets generated from GLi(k-1)
LLi(k)    locally large k-itemsets in CGi(k)
X.supi    local support count of X at Si

Table 1: Notation Table.
To illustrate the above procedure, we continue working on Example 1 as follows.
Example 2 Assume the database in Example 1 contains 150 transactions and each one of the 3 partitions has 50 transactions. Also assume that the support threshold s = 10%. Moreover, according to Example 1, at the second iteration, the candidate sets generated at site S1 are CG1(2) = {AB, BC, AC}; at site S2, CG2(2) = {BC, BD, CD}; and at site S3, CG3(2) = {EF, EG, EH, FG, FH, GH}.
In order to compute the large 2-itemsets, the local support counts at each site are computed first. The result is recorded in Table 2.

Table 2: Locally Large Itemsets.
From Table 2, it can be seen that AC.sup1 = 2 < s × D1 = 5, so AC is not locally large. Hence, the candidate set AC is pruned away at site S1. On the other hand, both AB and BC have enough local support counts and they survive the local pruning. Hence LL1(2) = {AB, BC}. Similarly, LL2(2) = {BC, CD}, and LL3(2) = {EF, GH}. After the local pruning, the number of size-2 candidate sets has been reduced to five, which is less than half of the original size. Once the local pruning is completed, each site broadcasts messages containing all the remaining candidate sets to the other sites to collect their support counts. The result of this support count exchange is recorded in Table 3.
large candidates    request from
AB                  S1
BC                  S1, S2
CD                  S2
EF                  S3
GH                  S3

Table 3: Globally Large Itemsets.
The request for the support count of AB is broadcasted from S1 to sites S2 and S3, and the counts sent back are recorded at site S1 as in the second row of Table 3. The other rows record similar count exchange activities at the other sites. At the end of the iteration, site S1 finds out that only BC is gl-large, because BC.sup = 22 > s × D = 15, and AB.sup = 13 < s × D = 15. Hence the gl-large 2-itemset at site S1 is GL1(2) = {BC}. Similarly, GL2(2) = {BC, CD} and GL3(2) = {EF}. After the broadcast of the gl-large itemsets, all sites return the large 2-itemsets L(2) = {BC, CD, EF}.
Notice that some candidate sets, such as BC in this example, could be locally large at more than one site. In this case, the messages are broadcasted from all the sites at which BC is found to be locally large. This is unnecessary because, for each candidate itemset, only one broadcast is needed. In Section 3.4, an optimization technique to eliminate such redundancy will be discussed. □
There is a subtlety in the implementation of the four steps outlined above for finding globally large itemsets. In order to support both step 2, "local pruning", and step 3, "support count exchange", each site Si must have two sets of support counts. For local pruning, Si has to find the local support counts of its candidate sets CGi(k). For support count exchange, Si has to find the local support counts of some possibly different candidate sets from other sites in order to answer the count requests from these sites. A simple approach would be to scan DBi twice, once for collection of the counts for the local CGi(k), and once for responding to the count requests from other sites. However, this would substantially degrade the performance.

In fact, there is no need for two scans. At Si, not only is CGi(k) available at the beginning of the k-th iteration, but so are the other sets CGj(k) (j = 1, ..., n, j ≠ i), because all the GLi(k-1) (i = 1, ..., n) are broadcasted to every site at the end of the (k-1)-st iteration, and the sets of candidate sets CGi(k) (i = 1, ..., n) are computed from the corresponding GLi(k-1). That is, at the beginning of each iteration, since all the gl-large itemsets found at the previous iteration have been broadcasted to all the sites, every site can compute the candidate sets of every other site. Therefore, the local support counts of all these candidate sets can be found in one scan and stored in a data structure like the hash-tree used in Apriori [2]. Using this technique, the data structure can be built in one scan, and the two different sets of support counts required in the local pruning and support count exchange can be retrieved from this data structure.
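A sketch of this one-scan idea (illustrative only; a plain dictionary stands in for the hash-tree, and the CGj(k) of all sites are assumed to have been computed locally from the broadcasted GLj(k-1)):

```python
def count_all_candidates_in_one_scan(DBi, CG_all_sites):
    """Count, in a single scan of DBi, the local support of every candidate
    generated at any site, i.e. the union of CGj(k) for j = 1..n.
    CG_all_sites is a list of sets of frozensets, one per site."""
    union_CG = set().union(*CG_all_sites)
    counts = {X: 0 for X in union_CG}          # stands in for the hash-tree
    for T in DBi:                              # one scan of the local partition
        for X in union_CG:
            if X <= T:
                counts[X] += 1
    return counts   # serves both local pruning and remote count requests
```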
3.3 Global Pruning of Candidate Sets
The local pruning at a site Si uses only the local support counts found in DBi to prune a candidate set. In fact, the local support counts from other sites can also be used for pruning. A global pruning technique is developed to facilitate such pruning and is outlined as follows. At the end of each iteration, all the local support counts and the global support count of a candidate set X are available. These local support counts can be broadcasted together with the global support counts after a candidate set is found to be globally large. Using this information, some global pruning can be performed on the candidate sets at the subsequent iteration.

Assume that the local support count of every candidate itemset is broadcasted to all the sites after it is found to be globally large at the end of an iteration.

References

Fast algorithms for mining association rules.
Mining sequential patterns.
PVM: Parallel Virtual Machine: A Users' Guide and Tutorial for Networked Parallel Computing.
Efficient and Effective Clustering Methods for Spatial Data Mining.
An Efficient Algorithm for Mining Association Rules in Large Databases.