What have the authors contributed in "A framework for protecting worker location privacy in spatial crowdsourcing" ?

In this paper, the authors introduce a framework for protecting location privacy of workers participating in SC tasks. The authors argue that existing location privacy techniques are not sufficient for SC, and they propose a mechanism based on differential privacy and geocasting that achieves effective SC services while offering privacy guarantees to workers. The authors investigate analytical models and task assignment strategies that balance multiple crucial aspects of SC functionality, such as task completion rate, worker travel distance and system overhead.

What are the future works mentioned in the paper "A framework for protecting worker location privacy in spatial crowdsourcing" ?

As future work, the authors will extend their framework to also protect privacy of task locations.

What is the effect of a higher acceptance rate on the network?

As expected, a higher acceptance rate yields lower overhead and shorter travel distance, as workers are more willing to accept tasks.

What is the effect of increasing EU on the GR construction algorithm?

To obtain a higher probability of task acceptance, the GR construction algorithm will generate a larger geocast region, leading to increased overhead, as measured by ANW , HOP and WTD .

What is the way to find the smallest enclosing circle?

An efficient solution to find the smallest enclosing circle is a randomized algorithm [26] that runs in linear time to the number of data points in the region.
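The idea can be sketched in Python as a randomized incremental construction in the style of Welzl's algorithm. Whether this matches the algorithm of [26] step for step is an assumption; the function names are hypothetical, and points are plain (x, y) tuples.

```python
import math
import random

def circle_from_two(a, b):
    # Circle with segment ab as diameter: (center_x, center_y, radius).
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)

def circle_from_three(a, b, c):
    # Circumcircle of a, b, c; falls back to the widest pair if collinear.
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        p, q = max([(a, b), (a, c), (b, c)], key=lambda pr: math.dist(pr[0], pr[1]))
        return circle_from_two(p, q)
    sa, sb, sc = (a[0] ** 2 + a[1] ** 2), (b[0] ** 2 + b[1] ** 2), (c[0] ** 2 + c[1] ** 2)
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy, math.dist((ux, uy), a))

def contains(circle, p, tol=1e-9):
    return math.dist((circle[0], circle[1]), p) <= circle[2] + tol

def smallest_enclosing_circle(points, seed=0):
    pts = list(points)
    random.Random(seed).shuffle(pts)  # random order gives expected linear time
    circle = None
    for i, p in enumerate(pts):
        if circle is not None and contains(circle, p):
            continue
        circle = (p[0], p[1], 0.0)  # p must lie on the boundary
        for j, q in enumerate(pts[:i]):
            if contains(circle, q):
                continue
            circle = circle_from_two(p, q)  # p and q on the boundary
            for r in pts[:j]:
                if not contains(circle, r):
                    circle = circle_from_three(p, q, r)
    return circle
```

For the four corners of a 2 x 2 square plus its center, the sketch returns the circle centered at (1, 1) with radius sqrt(2).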

What is the widely accepted measure of compactness?

One widely accepted measure proposed in [17] is the Digital Compactness Measurement (DCM), which measures region compactness as the ratio between the area of the region and the area of its smallest circumscribing circle.
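As a small worked example of this ratio, the sketch below evaluates it for axis-aligned rectangles, whose smallest circumscribing circle has the rectangle's diagonal as its diameter. The function name is hypothetical, and restricting the example to rectangles is a simplification made here for illustration.

```python
import math

def rectangle_compactness(width, height):
    # Area of the region divided by the area of its smallest enclosing circle.
    # For a rectangle, that circle has the diagonal as its diameter.
    radius = math.hypot(width, height) / 2
    return (width * height) / (math.pi * radius ** 2)

square = rectangle_compactness(1.0, 1.0)      # 2 / pi, about 0.64
strip = rectangle_compactness(10.0, 0.1)      # long, line-shaped region
```

The square scores 2/pi, while the long strip scores close to zero, which matches the intuition that skewed regions are less compact.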

What is the way to evaluate the effectiveness of using compactness in the GR search strategy?

To evaluate the effectiveness of using compactness in the GR search strategy, the authors use as metric an estimation of the hop count required to disseminate the task request to all workers, given the communication range of the wireless network (e.g., 50-100 meters for WiFi).

What is the main reason why workers are not notified of tasks?

Protecting worker locations significantly complicates task assignment, and may reduce the effectiveness and efficiency of worker-task matching.

How do the authors create sanitized data releases at the CSP?

To create sanitized data releases at the CSP, the authors adopt the Private Spatial Decomposition (PSD) approach, first introduced in [3].

What is the difference between the two types of geocasting?

It is cheaper to geocast within a shape with less skew, such as a circle or a square, than within skewed regions such as line-shaped areas, which have a large network diameter.

(Open Access) A framework for protecting worker location privacy in spatial crowdsourcing (2014) | Hien To

A Framework for Protecting Worker Location Privacy in

Spatial Crowdsourcing

Hien To

Computer Science Dept.

Univ. of Southern California

hto@usc.edu

Gabriel Ghinita

Dept. of Computer Science

UMass Boston

Gabriel.Ghinita@umb.edu

Cyrus Shahabi

Computer Science Dept.

Univ. of Southern California

shahabi@usc.edu

ABSTRACT

Spatial Crowdsourcing (SC) is a transformative platform

that engages individuals, groups and communities in the act

of collecting, analyzing, and disseminating environmental,

social and other spatio-temporal information. The objective

of SC is to outsource a set of spatio-temporal tasks to a set

of workers, i.e., individuals with mobile devices that perform

the tasks by physically traveling to speciﬁed locations of in-

terest. However, current solutions require the workers, who

in many cases are simply volunteering for a cause, to dis-

close their locations to untrustworthy entities. In this paper,

we introduce a framework for protecting location privacy of

workers participating in SC tasks. We argue that existing

location privacy techniques are not suﬃcient for SC, and

we propose a mechanism based on diﬀerential privacy and

geocasting that achieves eﬀective SC services while oﬀering

privacy guarantees to workers. We investigate analytical

models and task assignment strategies that balance multiple

crucial aspects of SC functionality, such as task completion

rate, worker travel distance and system overhead. Exten-

sive experimental results on real-world datasets show that

the proposed technique protects workers’ location privacy

without incurring significant performance penalties.

1. INTRODUCTION

Recent years have witnessed a signiﬁcant growth in the

number of mobile smart phone users, as well as fast develop-

ment in phone hardware performance, software functional-

ity and communication features. Today’s mobile phones are

powerful devices that can act as multi-modal sensors collect-

ing and sharing various types of data, e.g., picture, video, lo-

cation, movement speed, direction and acceleration. In this

context, Spatial Crowdsourcing (SC) [14] is emerging as a

novel and transformative platform that engages individuals,

groups and communities in the act of collecting, analyzing,

and disseminating environmental, social and other informa-

tion for which spatio-temporal features are relevant. With

SC, task requesters outsource their spatio-temporal tasks to a set of workers, i.e., individuals with mobile devices that perform the tasks by physically traveling to specified locations of interest. The nature of tasks may vary from environmental sensing to capturing images at social or entertainment events. Typically, requesters and workers register with a centralized spatial crowdsourcing server (SC-server) that acts as a broker between parties, and often also plays a role in how tasks are assigned to workers (i.e., scheduling according to some performance criteria). SC has numerous applications in domains such as environmental sensing, journalism, crisis response and urban planning.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/. Obtain permission prior to any use beyond those covered by the license. Contact [...] were invited to present their results at the 40th International Conference on Very Large Data Bases, September 1st - 5th 2014, Hangzhou, China. Proceedings of the VLDB Endowment, Vol. 7, No. 10

Consider an emergency response scenario where the Red

Cross (i.e., requester) is interested in collecting pictures and

videos of disaster areas from various locations in a country

(e.g., typhoon Haiyan in the Philippines in 2013). The re-

quester issues a query to an SC-server, and the request is

forwarded to workers situated in proximity to the zones of

interest. The workers record photos and videos using their

mobile phones, and send the results back to the requester.

Participatory sensing is another domain where SC is very

suitable. Mobile users can leverage their sensor-equipped

mobile devices to collect environmental or traﬃc data.

SC is feasible only if workers and tasks are matched ef-

fectively, i.e., tasks are completed in a timely fashion, and

workers do not need to travel across very long distances.

To that end, matching at the SC-server must take into

account the locations of workers. However, the SC-server

may not be trusted, and disclosing individual locations has

serious privacy implications [9, 20, 7, 3]. Knowing worker lo-

cations, an adversary can stage a broad spectrum of attacks

such as physical surveillance and stalking, identity theft, and

breach of sensitive information (e.g., an individual’s health

status, alternative lifestyles, political and religious views).

Thus, ensuring location privacy is an essential aspect of SC,

because mobile users will not agree to engage in spatial

tasks if their privacy is violated.

Several solutions [9, 20, 7] have been proposed to protect

location-based queries, i.e., given an individual’s location,

ﬁnd points of interest in the proximity without disclosing

the actual coordinates. However, in SC, a worker’s location

is no longer part of the query, but rather the result of a

spatial query around the task. In addition, while some work

considers queries on private locations in the context of out-

sourced databases [28, 27], it is assumed that the data owner

entity and the querying entity trust each other, with protec-

tion being oﬀered only against intermediate service provider

entities. This scenario does not apply in SC, as there is no

inherent trust relationship between requesters and workers.

We propose a framework for protecting privacy of worker

locations, whereby the SC-server only has access to data san-

itized according to diﬀerential privacy (DP) [5]. In practice,

there may be many SC-servers run by diverse organizations

that do not have an established trust relationship with the

workers. On the other hand, every worker subscribes to a

cellular service provider (CSP) that already has access to

the worker locations (e.g., through cell tower triangulation).

The CSP signs a contract with its subscribers, which stipu-

lates the terms and conditions of location disclosure. Thus,

the CSP can release worker locations to third party SC-

servers in noisy form, according to DP. However, using DP

introduces two diﬃcult challenges, as discussed next.

First, the SC-server must match workers to tasks using

noisy data, which requires complex strategies to ensure ef-

fective task assignment. To create sanitized data releases

at the CSP, we adopt the Private Spatial Decomposition

(PSD) approach, ﬁrst introduced in [3]. A PSD is a san-

itized spatial index, where each index node contains a noisy

count of the workers rooted at that node. Speciﬁcally, we

devise a mechanism to create a Worker PSD by extending

the Adaptive Grid (AG) technique [23]. To ensure that task

assignment has a high success rate, we introduce an ana-

lytical model that determines with high probability a PSD

partition around the task location that includes suﬃcient

workers to complete the task.

Second, by the nature of the DP protection model, fake en-

tries may need to be created in the PSD. Thus, the SC-server

cannot directly contact workers, not even if pseudonyms are

used, as merely establishing a network connection to an en-

tity would allow the SC-server to learn whether an entry is

real or not, and breach privacy. To address this challenge,

we propose the use of geocasting [22] as means to deliver

task requests to workers. Once a PSD partition is identiﬁed

by the analytical model outlined above, the task request is

geocast to all the workers within the partition. Geocast introduces overhead that must be carefully accounted for in the framework design.

Our speciﬁc contributions are:

(i). We identify the speciﬁc challenges of location privacy

in the context of SC, and we propose a framework

that achieves diﬀerentially-private protection guaran-

tees. To the best of our knowledge, this is the ﬁrst

work to study location privacy for SC.

(ii). We propose an analytical model that measures the

probability of task completion with uncertain worker

locations, and we devise a search strategy that ﬁnds

appropriate PSD partitions to ensure high success rate

of task assignment.

(iii). We introduce a geocast mechanism for task request

dissemination that is necessary to overcome the re-

strictions imposed by DP, and we factor the geocast

system overhead in the PSD partition search strategy.

(iv). We conduct an extensive set of experiments on real-

world datasets which shows that the proposed frame-

work is able to protect workers’ location privacy with-

out signiﬁcantly aﬀecting the eﬀectiveness and eﬃ-

ciency of the SC system.

The remainder of this paper is organized as follows: Sec-

tion 2 presents necessary background. Section 3 introduces

the proposed privacy framework, whereas Sections 4 and 5

detail the proposed solution. Experimental results are pre-

sented in Section 6, followed by a survey of related work in

Section 7, and conclusions in Section 8.

2. BACKGROUND

2.1 Spatial Crowdsourcing

Spatial Crowdsourcing (SC) [14] is a type of online crowdsourcing where performing a task requires the worker to

travel to the location of the task (termed spatial task). Ac-

cording to the taxonomy in [14], there are two categories of

SC, based on how workers are matched to tasks. In Worker

Selected Tasks (WST) mode, the SC-server publishes the

spatial tasks online, and workers can autonomously choose

any tasks in their vicinity without the need to coordinate

with the SC-server. In Server Assigned Tasks (SAT) mode,

online workers send their location to the SC-server, and the

SC-server assigns tasks to nearby workers.

WST is the simpler protocol, and it does not require work-

ers to share their locations with the SC-server. However, the

assignment is often sub-optimal, as workers do not have a

global system view. Workers typically choose the closest

task to them, which may cause multiple workers to travel to

the same task, while many other tasks remain unassigned.

The SAT mode incurs the overhead of running complex

matching algorithms at the SC-server, but the best-suited

worker is selected for a task. This requires the SC-server to

know the workers’ locations, which poses a privacy threat.

In our work, we consider the SAT mode, but we also pro-

vide location privacy protection for the workers. Instead of

directly disclosing their coordinates to the SC-server, worker

locations are ﬁrst pooled together by a CSP and sanitized

according to diﬀerential privacy. This introduces signiﬁcant

challenges, as the SC-server has to employ far more complex

task assignment strategies that must take into account the

uncertain nature of the received location data.

2.2 Differential Privacy

Diﬀerential Privacy (DP) [5] has emerged as the de-facto

standard in data privacy, thanks to its strong protection

guarantees rooted in statistical analysis. DP is a seman-

tic model which provides protection against realistic adver-

saries with access to background information. DP ensures

that an adversary is not able to learn from the sanitized

data whether a particular individual is present or not in the

original data, regardless of the adversary’s prior knowledge.

DP allows interaction with a database only by means of

aggregate (e.g., count, sum) queries. Random noise is added

to each query result to preserve privacy, such that an adver-

sary that attempts to attack the privacy of some individual

worker w will not be able to distinguish from the set of query

results (called a transcript) whether a record representing w

is present or not in the database.

Definition 1 (ε-indistinguishability). Consider that a database produces transcript U on the set of queries $QS = \{Q_1, Q_2, \ldots, Q_q\}$, and let ε > 0 be an arbitrarily-small real constant. Then, transcript U satisfies ε-indistinguishability if for every pair of sibling datasets $D_1$, $D_2$ such that $|D_1| = |D_2|$ and $D_1$, $D_2$ differ in only one record, it holds that

$$\left| \ln \frac{\Pr[QS^{D_1} = U]}{\Pr[QS^{D_2} = U]} \right| \le \varepsilon$$

In other words, an attacker cannot learn whether the transcript was obtained by answering the query set QS on dataset $D_1$ or $D_2$. Parameter ε is called privacy budget, and specifies the amount of protection required, with smaller values corresponding to stricter privacy protection. To achieve ε-indistinguishability, DP injects noise into each query result, and the amount of noise required is proportional to the sensitivity of the query set QS, formally defined as:

Definition 2 ($L_1$-Sensitivity). Given any arbitrary sibling datasets $D_1$ and $D_2$, the sensitivity of query set QS is the maximum change in the query results of $D_1$ and $D_2$:

$$\sigma(QS) = \max_{D_1, D_2} \sum_{i=1}^{q} \left| QS_i^{D_1} - QS_i^{D_2} \right|$$

A sufficient condition to achieve differential privacy with parameter ε is to add to each query result randomly distributed Laplace noise with mean zero and scale λ = σ(QS)/ε [6].
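A minimal Python sketch of this noise-addition step follows, assuming a single count query and zero-mean Laplace noise with scale λ = σ(QS)/ε. The function names are hypothetical; the sampler uses the fact that the difference of two independent exponential variables with the same mean is Laplace distributed.

```python
import random

def laplace_noise(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale`
    # is Laplace distributed with mean 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, sensitivity, epsilon, rng):
    # Laplace mechanism: perturb the exact answer with scale sensitivity / epsilon.
    return true_count + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(42)
release = noisy_count(true_count=100, sensitivity=1.0, epsilon=0.5, rng=rng)
```

A smaller ε gives a larger scale and therefore a noisier release, which is the privacy and accuracy trade-off described above.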

Typically, the interaction with a dataset consists of a series of analyses (i.e., transcripts) $A_i$, each required to satisfy $\varepsilon_i$-differential privacy. Then, the privacy level of the resulting analysis can be computed as follows:

Theorem 1 (Sequential Composition [19]). Let $A_i$ be a set of analyses such that each provides $\varepsilon_i$-DP. Then, running in sequence all analyses $A_i$ provides $(\sum_i \varepsilon_i)$-DP.

Theorem 2 (Parallel Composition [19]). If $D_i$ are disjoint subsets of the original database, and $A_i$ is a set of analyses each providing $\varepsilon_i$-DP, then applying each analysis $A_i$ on partition $D_i$ provides $\max(\varepsilon_i)$-DP.
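The two composition rules reduce to a sum and a maximum over the individual budgets. The following Python sketch, with hypothetical function names, shows the bookkeeping for a two-level release in which cells within a level are disjoint.

```python
def sequential_composition(budgets):
    # Analyses run one after another on the same data: budgets add up.
    return sum(budgets)

def parallel_composition(budgets):
    # Analyses run on disjoint partitions: the largest budget dominates.
    return max(budgets)

# Four disjoint cells per level, each counted with the level's budget.
level1 = parallel_composition([0.25, 0.25, 0.25, 0.25])
level2 = parallel_composition([0.25, 0.25, 0.25, 0.25])
total = sequential_composition([level1, level2])  # 0.5
```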

2.3 Private Spatial Decompositions (PSD)

The work in [3] introduced the concept of Private Spatial

Decompositions (PSD) to release spatial datasets in a DP-

compliant manner. A PSD is a spatial index transformed

according to DP, where each index node is obtained by re-

leasing a noisy count of the data points enclosed by that

node’s extent. Various index types such as grids, quad-trees

or k-d trees [24] can be used as a basis for PSD.

Accuracy of PSD is heavily inﬂuenced by the type of

PSD structure and its parameters (e.g., height, fan-out).

With space-based partitioning PSD, the split position for a

node does not depend on worker locations. This category

includes ﬂat structures such as grids, or hierarchical ones

such as BSP-trees (Binary Space Partitioning) and quad-trees [24]. The privacy budget ε needs to be consumed only

when counting the workers in each index node. Typically,

all nodes at same index level have non-overlapping extents,

which yields a constant and low sensitivity of 2 per level

(i.e., changing a single location in the data may aﬀect at

most two partitions in a level). The budget ε is best distributed across levels according to the geometric allocation

[3], where leaf nodes receive more budget than higher levels.

The sequential composition theorem applies across nodes on

the same root-to-leaf path, whereas parallel composition ap-

plies to disjoint paths in the hierarchy. Space-based PSD are

simple to construct, but can become unbalanced.
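One way to picture a geometric allocation is to let the per-level budget grow by a constant ratio from the root toward the leaves, normalized so that the budgets along a root-to-leaf path sum to ε. The Python sketch below does this; the ratio 2^(1/3) and the function name are assumptions made for illustration, since the text only states that leaves receive more budget than higher levels.

```python
def geometric_allocation(epsilon, height, ratio=2 ** (1 / 3)):
    # Index 0 is the root, index `height` is the leaf level.
    weights = [ratio ** depth for depth in range(height + 1)]
    total = sum(weights)
    # Normalize so a root-to-leaf path consumes exactly `epsilon`
    # under sequential composition.
    return [epsilon * w / total for w in weights]

budgets = geometric_allocation(epsilon=1.0, height=3)
```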

Object-based structures such as k-d trees and R-trees [3]

perform splits of nodes based on the placement of data

points. To ensure privacy, split decisions must also be done

according to DP, and signiﬁcant budget may be used in the

process. Typically, the exponential mechanism [3] is used to

assign a merit score to each candidate split point according

to some cost function (e.g., distance from median in case of

k-d trees), and one value is randomly picked based on its

noisy score. The budget must be split between protecting

node counts and building the index structure. Object-based

PSD are more balanced in theory, but they are not very ro-

bust, in the sense that accuracy can decrease abruptly with

only slight changes of the PSD parameters, or for certain

input dataset distributions.
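The split selection described above can be sketched in Python as follows, assuming one-dimensional coordinates, a score equal to the negative distance between a candidate's rank and the median rank, and a score sensitivity of 1 for sibling datasets of equal size. The function name and the candidate list are hypothetical.

```python
import math
import random

def private_split(values, candidates, epsilon, rng):
    n = len(values)
    scores = []
    for c in candidates:
        rank = sum(1 for v in values if v < c)  # points left of the split
        scores.append(-abs(rank - n / 2))       # closer to the median is better
    best = max(scores)
    # Exponential mechanism with sensitivity 1 (assumed):
    # P(candidate) is proportional to exp(epsilon * score / 2).
    # Subtracting `best` keeps the exponent numerically safe.
    weights = [math.exp(epsilon * (s - best) / 2.0) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(1)
split = private_split(list(range(100)), [10, 50, 90], epsilon=5.0, rng=rng)
```

With a generous budget the median candidate is chosen almost surely; as ε shrinks, the choice approaches a uniform draw, which is the budget cost of building the index.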

The recent work in [23] compares tree-based methods with

multi-level grids, and shows that two-level grids tend to per-

form better than recursive partitioning counterparts. The

paper also proposes an Adaptive Grid (AG) approach, where

the granularity of the second-level grid is chosen based on

the noisy counts obtained in the ﬁrst-level (sequential com-

position is applied). AG is a hybrid which inherits the sim-

plicity and robustness of space-based PSD, but still uses a

small amount of data-dependent information in choosing the

granularity for the second level. In our work, we adapt the

AG method to address SC-speciﬁc requirements.

3. PRIVACY FRAMEWORK

Section 3.1 presents the system model and the workﬂow

for privacy-preserving SC. Section 3.2 outlines the privacy

model and assumptions. Section 3.3 discusses design chal-

lenges and associated performance metrics.

3.1 System Model

We consider the problem of privacy-preserving SC task

assignment in the SAT mode. Figure 1 shows the proposed

system architecture. Workers send their locations (Step 0)

to a trusted cellular service provider (CSP) which collects

updates and releases a PSD according to privacy budget ε

mutually agreed upon with the workers. The PSD is ac-

cessed by the SC-server (Step 1), which also receives tasks

from a number of requesters (Step 2). For simplicity, we fo-

cus on the single-SC-server case, but our system model can

support multiple SC-servers.

When the SC-server receives a task t, it queries the PSD

to determine a geocast region (GR) that encloses with high

probability workers in relative proximity to t. Due to the

uncertain nature of the PSD, this is a challenging process

which will be detailed later in Section 5. Next, the SC-server

initiates a geocast communication [22] process (Step 3) to

disseminate t to all workers within GR. According to DP,

sanitizing a dataset requires creation of fake locations in the

PSD. If the SC-server is allowed to directly contact work-

ers, then failure to establish a communication channel would

breach privacy, as the SC-server is able to distinguish fake

workers from real ones. Using geocast is a unique feature

of our framework which is necessary to achieve protection.

Geocast can be performed either with the help of the CSP

infrastructure, or through a mobile ad-hoc network where

the CSP contacts a single worker in the GR, and then the

message is disseminated on a hop-by-hop basis to the entire

GR. The latter approach keeps CSP overhead low, and can

reduce operation costs for workers.

Upon receiving request t, a worker w decides whether to

perform the task or not. If yes (Step 4), she sends a consent

message to the SC-server conﬁrming w’s availability (alter-

natively, the consent can be directly sent to the requester).

If w is not willing to participate in the task, then no consent

is sent, and no information about the worker is disclosed.

Figure 1: Privacy framework for spatial crowdsourcing. Entities: Requesters, SC-Server, Cell Service Provider (with Worker Database), Workers. Steps: 0. Report Locations; 1. Sanitized Release (PSD); 2. Task Request t; 3. Geocast {t, GR}; 4. Consent.

3.2 Privacy Model and Assumptions

Our speciﬁc objective is to protect both the location and

the identity of workers during task assignment. Once a

worker consents to a task, the worker herself may directly

disclose information to the task requester (e.g., to enable

a communication channel between worker and requester).

However, such additional disclosure is outside our scope, as

each worker has the right to disclose his or her individual

information. Our focus is on what happens prior to consent,

when worker location and identity must be protected from

both task requesters and the SC server.

Focusing on the SC assignment step is important, given

the fact that SC workers have to travel to the task loca-

tion. Mere completion of a task discloses the fact that some

worker must have been at that location, and this sort of

disclosure is unavoidable in SC. To protect her location af-

ter consent, a worker can still enjoy some form of identity

protection (e.g., using pseudonyms and anonymous routing),

for which solutions are already available (e.g., TOR). On the

other hand, no solution exists to date for the more challeng-

ing problem of privacy-preserving task assignment, hence we

direct our eﬀorts in this direction.

Furthermore, focusing on task assignment also makes sense

from a disclosure volume standpoint. During assignment, all

workers are candidates for participation, therefore locations

of all workers would be exposed, absent a privacy-preserving

mechanism. On the other hand, after task request dissemi-

nation, only few workers will participate in task completion,

and only if they give their explicit consent.

Workers cannot trust the SC-server, especially as there

may be many such entities with diverse backgrounds, e.g.,

private companies, non-proﬁts, government organizations,

academic institutions. On the other hand, the CSP already

has a signed agreement with workers through the service

contract, so there is already a trust relationship established,

as well as mutually-agreed upon rules for data disclosure.

Furthermore, the CSP already knows where subscribers are,

e.g., using cell tower triangulation, so worker location re-

porting does not introduce additional disclosure.

However, the CSP has no expertise, and perhaps no ﬁnan-

cial interest, to host an SC service, which needs to deal with

a diverse set of issues such as interacting with various task

requester categories, managing proﬁles (e.g., some workers

may only volunteer for environmental tasks), etc. The role

of the CSP is to aggregate locations from subscribed work-

ers, transform them according to DP, and release the data

in sanitized form to one or more SC-servers for assignment.

As multiple SC-servers can use the same PSD, it is practical

for the CSP to provide PSDs for a small fee, e.g., a percent-

age of the workers’ payment, or a tax incentive in the case

of public-interest SC applications.

3.3 Design Goals and Performance Metrics

Protecting worker locations signiﬁcantly complicates task

assignment, and may reduce the eﬀectiveness and eﬃciency

of worker-task matching. Due to the nature of DP, it is

possible for a region to contain no workers, even if the PSD

shows a positive count. Therefore, no workers (or an insufficient number thereof) may be notified of the task request, and the task may not be completed. Alternatively, a worker

may be notiﬁed of the task even though she is at a long dis-

tance away from the task location, whereas a nearer worker

does not receive the request. Finally, in the non-private SAT

case, only one selected worker, whose location and identity

are known, is notiﬁed of the task request. With location

protection, many redundant messages may need to be sent,

increasing system overhead.

Therefore, we focus on the following performance metrics:

• Assignment Success Rate (ASR). Due to PSD

data uncertainty, the SC-server may fail to assign work-

ers to tasks (e.g., no worker is reached, or task is too

far and workers do not accept it). ASR measures the

ratio of tasks accepted by a worker to the total number of task requests. The challenge is to keep ASR

close to 100%.

• Worker Travel Distance (WTD). The SC-server

is no longer able to accurately evaluate worker-task

distance, hence workers may have to travel long dis-

tances to tasks. The challenge is to keep the worker

travel distance low, even when exact worker locations

are not known.

• System Overhead. Dealing with imprecise locations

increases the complexity of assignment algorithms, which

poses scalability problems. A signiﬁcant metric to

measure overhead is the average number of notified workers (ANW). This number affects both the communication overhead required to geocast task requests,

as well as the computational overhead of the matching

algorithm, which depends on how many workers need

to be notiﬁed of a task request.
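The three metrics above can be computed from a log of task outcomes, as in the Python sketch below. The record layout and names are hypothetical, and averaging WTD over accepted tasks only is an assumption made here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class TaskOutcome:
    notified_workers: int             # workers reached by the geocast
    accepted: bool                    # whether some worker accepted the task
    travel_distance: Optional[float]  # distance of the accepting worker, if any

def summarize(outcomes: List[TaskOutcome]) -> Tuple[float, float, float]:
    accepted = [o for o in outcomes if o.accepted]
    asr = len(accepted) / len(outcomes)
    # WTD is averaged over accepted tasks only (assumption).
    wtd = (sum(o.travel_distance for o in accepted) / len(accepted)
           if accepted else float("nan"))
    anw = sum(o.notified_workers for o in outcomes) / len(outcomes)
    return asr, wtd, anw

log = [
    TaskOutcome(10, True, 1.0),
    TaskOutcome(20, False, None),
    TaskOutcome(30, True, 3.0),
]
asr, wtd, anw = summarize(log)
```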

4. BUILDING THE WORKER PSD

The first step consists of building a PSD (at the CSP side) to be later used for task assignment at the SC-server. Building the PSD is an essential step, because it determines how accurate the released data is, which in turn affects ASR, WTD and ANW. In this section, we modify the state-of-the-art Adaptive Grid (AG) method proposed in [23] to address the specific requirements of the SC framework. Table 1 summarizes the notations used in our paper.

Table 1: Summary of Notations

Symbol                    Definition
ε, $\varepsilon_i$        Total privacy budget and level-i budget
α                         AG budget split, α = 0.5 means $\varepsilon_1 = \varepsilon_2$
N                         Total number of workers
$N'$                      Noisy worker count of level-1 cells
$m_i \times m_i$          Level-i grid granularity
$\bar{n}$                 Expected noisy worker count of a level-2 cell
t                         A task or its location, used interchangeably
$c_i$                     A level-2 cell
$n_{c_i}$                 Noisy worker count of $c_i$
                          Acceptance rate of workers within $c_i$
                          Sub-cell of cell $c_i$

Footnote: ASR does not capture worker reliability; tasks may still fail to complete after being accepted. Our focus is on assignment success; reliability is outside our scope.

PSDs based on uniform grids treat all regions in the dataset identically, despite large variances in location density. As a result, they over-partition the space in sparse regions, and under-partition in dense regions. AG avoids these drawbacks by using a two-level grid and variable cell granularity. At the first level, AG creates a coarse-grained, fixed-size $m_1 \times m_1$ grid over the data domain. AG uses a data-independent heuristic to choose level-1 granularity as

$$m_1 = \max\left(10, \left\lceil \frac{1}{4} \sqrt{\frac{N \times \varepsilon}{k_1}} \right\rceil\right)$$

where N is the total number of locations and $k_1 = 10$ [23]. Next, AG issues $m_1^2$ count queries, one for each level-1 cell, using a fraction of the total privacy budget: $\varepsilon_1 = \varepsilon \times \alpha$, where 0 < α < 1. AG then partitions each level-1 cell into $m_2 \times m_2$ level-2 cells, where $m_2$ is adaptively chosen based on the noisy count $N'$ of the level-1 cell:

$$m_2 = \left\lceil \sqrt{\frac{N' \times \varepsilon_2}{k_2}} \right\rceil \qquad (1)$$

where $\varepsilon_2 = \varepsilon - \varepsilon_1$ is the remaining budget, and the constant $k_2$ is set empirically to $k_2 = 5$. Parameter α determines how privacy budget is divided between the two levels.
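The two granularity heuristics can be sketched in Python as below. The 1/4 factor and the ceiling in the level-1 rule, and the rounding rule at level 2, are assumptions here: the worked example in the text (noisy count 200 giving a 3x3 grid at ε = 0.5, α = 0.5) corresponds to rounding to the nearest integer, so the rounding function is left as a parameter. Function names are hypothetical.

```python
import math

def level1_granularity(n_workers, epsilon, k1=10.0):
    # Data-independent heuristic; never coarser than a 10 x 10 grid.
    return max(10, math.ceil(0.25 * math.sqrt(n_workers * epsilon / k1)))

def level2_granularity(noisy_count, eps2, k2=5.0, rounding=math.ceil):
    # Data-dependent heuristic based on the noisy level-1 count.
    if noisy_count <= 0:
        return 1  # a non-positive noisy count leaves the cell unsplit
    return max(1, int(rounding(math.sqrt(noisy_count * eps2 / k2))))

eps, alpha = 0.5, 0.5
eps2 = eps - eps * alpha                            # remaining budget, 0.25
dense = level2_granularity(200, eps2, rounding=round)   # 3, i.e. a 3x3 grid
sparse = level2_granularity(100, eps2, rounding=round)  # 2, i.e. a 2x2 grid
```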

Figure 2 shows a snapshot of an adaptive grid, with four level-1 cells A, B, C, D. Constructing a differentially private AG requires two steps. First, the noisy counts $N'$ of A, B, C, D are computed by adding random Laplace noise with scale $\lambda_1 = 2/\varepsilon_1$ to the actual counts of these cells. Second, based on the noisy counts, level-1 cells are further split into level-2 cells. According to Eq. (1), cell D, which has noisy count 200, is partitioned according to a 3x3 grid, while the granularity for the other cells is 2x2. Thereafter, AG adds to the count of each level-2 cell ($c_i$, i = 1..21) random Laplace noise with scale $\lambda_2 = 2/\varepsilon_2$. Finally, the corresponding noisy counts $n_{c_i}$, together with the structure of the AG, are published. According to Theorem 2, the sanitized release of AG provides ε-DP.
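The two construction steps can be put together in a compact Python sketch. It assumes worker locations in the unit square, a caller-supplied level-1 granularity, Laplace noise of scale 2/ε at each level as in the text, and a ceiling when choosing the level-2 granularity. All names are hypothetical, and this is an illustration rather than the authors' implementation.

```python
import math
import random

def laplace(scale, rng):
    # Zero-mean Laplace sample as a difference of two exponentials.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def build_adaptive_grid(points, epsilon, alpha, m1, k2, rng):
    eps1 = epsilon * alpha
    eps2 = epsilon - eps1
    # Bucket the exact locations by level-1 cell.
    buckets = {}
    for x, y in points:
        cell = (min(int(x * m1), m1 - 1), min(int(y * m1), m1 - 1))
        buckets.setdefault(cell, []).append((x, y))
    grid = {}
    for i in range(m1):
        for j in range(m1):
            members = buckets.get((i, j), [])
            # Step 1: noisy level-1 count.
            noisy_n1 = len(members) + laplace(2.0 / eps1, rng)
            # Step 2: level-2 granularity from the noisy count (ceiling assumed).
            m2 = max(1, math.ceil(math.sqrt(max(noisy_n1, 0.0) * eps2 / k2)))
            counts = [[0] * m2 for _ in range(m2)]
            for x, y in members:
                u = min(int((x * m1 - i) * m2), m2 - 1)
                v = min(int((y * m1 - j) * m2), m2 - 1)
                counts[u][v] += 1
            noisy = [[c + laplace(2.0 / eps2, rng) for c in row] for row in counts]
            grid[(i, j)] = {"noisy_level1": noisy_n1, "m2": m2, "noisy_level2": noisy}
    return grid
```

Only the noisy counts and the grid structure would be released; the exact buckets stay at the CSP.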

Figure 2: A snapshot of adaptive grid (ε = 0.5, α = 0.5). Level-1 cells A, B and C have noisy count 100; cell D has noisy count 200.

Although AG was shown to yield good results for general-purpose spatial queries [23], it is not directly applicable to SC, due to its rigidity in choosing its parameters. Specifically, the granularity $m_2$ of the level-2 grid is too coarse, leading to large geocast areas and high communication overhead, as we show next. According to Eq. (1), the expected number of workers (i.e., noisy count) in a level-2 cell is:

$$\bar{n} = \frac{N'}{m_2^2} \approx \frac{k_2}{\varepsilon_2}$$

Table 2a presents different values of $m_2$ and $\bar{n}$ when varying total budget ε with α = 0.5. Note that the values of $\bar{n}$ are rather large, especially for more restrictive privacy settings (i.e., lower ε). For ε = 0.1, $\bar{n}$ is 100. In practice, a geocast region is likely to include multiple PSD cells, hence 100 is a lower bound on the ANW, while its typical values can grow much higher, leading to prohibitive communication cost.

Table 2: Granularity $m_2$ and average count per cell $\bar{n}$ ($N' = 100$)

(a) Original AG ($k_2 = 5$)

ε      $\varepsilon_2$   $m_2$   $\bar{n}$
1      0.5               3       11
0.5    0.25              2       25
0.1    0.05              1       100

(b) Modified AG ($k_2 = \sqrt{2}$)

ε      $\varepsilon_2$   $m_2$   $\bar{n}$
1      0.5               6       2.8
0.5    0.25              5       5.6
0.1    0.05              2       28.2

We propose a more suitable heuristic for choosing $k_2$. Recall that the primary requirement of SC task assignment is to achieve high ASR. To that end, we want to ensure that the task request is geocast in a non-empty region, i.e., the real worker count is strictly positive. According to the Laplace mechanism of DP, each PSD count is the sum of the real count and the added noise. Given the level-2 privacy budget $\varepsilon_2$, we can also quantify the distribution of added noise, which has standard deviation $\mu = \sqrt{2}/\varepsilon_2$. Therefore, if the PSD count is larger than µ, then with high probability there will be at least one worker in the level-2 cell.

We increase granularity $m_2$ in order to decrease overhead, but only to the point where there is at least one worker in a cell. Denote by $count_{PSD}$ the value reported by PSD for a certain level-2 cell. Given a $Lap(1/\varepsilon_2)$ distribution, the probability that the noisy count is larger than zero is:

$$p_h = 1 - \frac{1}{2} \exp\left(-\frac{count_{PSD}}{1/\varepsilon_2}\right)$$

Furthermore, we want to have the PSD count larger than the noise, i.e., $\bar{n} = k_2/\varepsilon_2 \ge \sqrt{2}/\varepsilon_2$, so at the limit we set $k_2 = \sqrt{2}$. The resulting probability of having non-empty cells is $p_h = 1 - \frac{1}{2}\exp(-\sqrt{2}) = 0.88$. According to Eq. (1), the corresponding granularity is

$$m_2 = \left\lceil \sqrt{\frac{N' \times \varepsilon_2}{\sqrt{2}}} \right\rceil$$
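The 0.88 figure can be checked numerically. The Python sketch below evaluates the probability for a PSD count equal to the noise standard deviation sqrt(2)/ε_2, assuming Lap(1/ε_2) noise and the factor 1/2 that comes from the Laplace tail; the function name is hypothetical.

```python
import math

def prob_nonempty(psd_count, eps2):
    # 1 - (1/2) * exp(-count / (1 / eps2)) for Laplace noise of scale 1 / eps2.
    return 1.0 - 0.5 * math.exp(-psd_count * eps2)

for eps2 in (0.05, 0.25, 0.5):
    threshold = math.sqrt(2) / eps2  # PSD count equal to the noise std deviation
    print(eps2, round(prob_nonempty(threshold, eps2), 2))
```

The probability does not depend on ε_2 once the count is expressed as a multiple of the noise scale, which is why a single constant $k_2 = \sqrt{2}$ suffices.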

In summary, we modify AG by carefully reducing the

granularity threshold at level-2 such that ANW is reduced,

while the probability for each level-2 cell to contain a real

worker is at least 88%. Table 2b shows that this new set-

ting signiﬁcantly reduces ¯n, and as a result ANW . Next, we

present a search strategy which groups cells together such

that the achieved ASR is above a given threshold.

5. TASK ASSIGNMENT

When a request for a task t is posted, the SC-server

queries the PSD and determines a geocast region GR where

the task is disseminated. The goal of the SC-server is to

obtain a high success rate for task assignment, while at the

same time reducing the worker travel distance WTD and

request dissemination overhead ANW .
