
(Open Access) Dimensionality of information disclosure behavior (2013) | Bart P. Knijnenburg


Dimensionality of information disclosure behavior

Bart P. Knijnenburg, Alfred Kobsa, Hongxia Jin

Bart P. Knijnenburg (a,b) – bart.k@uci.edu (corresponding author)

Alfred Kobsa – kobsa@uci.edu

Hongxia Jin – hongxia.jin@sisa.samsung.com

Donald Bren School of Information and Computer Sciences, University of California, Irvine, 6210 Donald Bren Hall, Irvine, CA 92697, USA

Samsung Information Systems of America, 75 West Plumiera Drive, San Jose, CA 95134, USA

Abstract

In studies of people’s privacy behavior, the extent of disclosure of personal information is typically measured as a summed total or a ratio of disclosure. In this paper, we evaluate three information disclosure datasets using a six-step statistical analysis, and show that people’s disclosure behaviors are rather multidimensional: participants’ disclosure of personal information breaks down into a number of distinct factors. Moreover, people can be classified along these dimensions into groups with different “disclosure styles”. This difference is not merely in degree, but rather also in kind: one group may for instance disclose location-related but not interest-related items, whereas another group may behave exactly the other way around. We also found other significant differences between these groups, in terms of privacy attitudes, behaviors, and demographic characteristics. These might for instance allow an online system to classify its users into their respective privacy group, and to adapt its privacy practices to the disclosure style of this group. We discuss how our results provide relevant insights for a more user-centric approach to privacy and, more generally, advance our understanding of online privacy behavior.

Keywords

Privacy behavior, privacy attitude, information disclosure, measurement, factor analysis, latent class analysis, structural equation modeling.

1 Introduction

Privacy is a very active research topic: every year, over 1200 new books and journal articles are published with this word in the title (Patil and Kobsa, 2009). Even in the sub-realm of online privacy, these publications cover a wide range of disciplines, such as human-computer interaction (Iachello and Hong, 2007), information systems (Bélanger and Crossler, 2011), personalization (Kobsa, 2007), behavioral economics (Acquisti and Grossklags, 2008), marketing (Caudill and Murphy, 2000), and social psychology (Joinson and Paine, 2007; Joinson et al., 2010). Although attempts to integrate the contributions of these different fields exist (Knijnenburg and Kobsa, 2013a; Smith et al., 2011), this has proven to be a difficult task, as each discipline has its own conceptualization of the notion of “privacy” (Smith et al., 2011).

International Journal of Human-Computer Studies 71(12), 1144-1162 (Special Issue on Privacy Methodologies in HCI)

Two notions that do recur across these disciplines are the concepts of privacy attitudes and privacy behaviors. Moreover, researchers seem to agree that attitudes have a significant impact on behavior. In fact, the fundamental theory behind integrative models of privacy research (Knijnenburg and Kobsa, 2013b; Li, 2011; Smith et al., 2011; Xu et al., 2008) is Ajzen and Fishbein’s Theory of Reasoned Action, which describes the link between attitudes and behaviors (Ajzen and Fishbein, 1977). However, Ajzen and Fishbein noted an “attitude-behavior gap” in their seminal work: the link between attitudes and behavior is not very strong. There is strong evidence in privacy research that such a gap exists for privacy as well (Acquisti and Grossklags, 2005; Acquisti, 2004; Metzger, 2006; Norberg et al., 2007; Spiekermann et al., 2001; van de Garde-Perik et al., 2008). Norberg et al. (2007) use the term “privacy paradox” to refer to this discrepancy between stated privacy attitudes and actual privacy behavior. Ajzen and Fishbein remedy this problem by introducing the mediating concept of behavioral intentions, but they note that even these intentions are not always perfectly correlated with actual behavior (Ajzen, 1991). Privacy researchers suggest that the paradox can be overcome by studying people’s behavior in realistic situations instead of lab experiments (Bennett, 1995; Smith et al., 2011).

For the measurement of attitudes, several scales have been developed, often through a reasonably rigorous scale-development process (Dinev and Hart, 2004; Malhotra et al., 2004; Smith et al., 1996; Stewart and Segars, 2002). In each of these efforts, privacy concerns turned out to be multidimensional: privacy attitude is not a single construct, but an interplay of correlated but conceptually distinct aspects, such as “control”, “collection”, “identification”, “improper access”, “unauthorized use” and “awareness”.

In contrast to attitudes, comparatively few studies have been conducted on privacy-related behavior, and specifically on information disclosure behavior. The research in this area can be broadly divided into two approaches (see section 2 for a detailed literature overview). The first approach regards the disclosure of each item of personal information (e.g. location, gender, income) as a separate decision, and makes no assumptions about correlations between these decisions. In the absence of a theory of how different disclosure behaviors are related, this work does not define an overall measure of a person’s rate of disclosure (or disclosure tendency). This work typically also does not try to explain how disclosure behaviors come about, or how they can be influenced.

The other approach treats the aggregate of individual disclosure behaviors as a single scale (see section 2). By summing individual disclosures into an overall measurement of disclosure tendency, these researchers make an implicit assumption of unidimensionality of the information disclosures (i.e. they assume that all items belong to the same scale), and even exchangeability of the disclosed items (i.e. they assume that each item contributes the same amount of “evidence” to the scale). The construction of a disclosure tendency scale allows these researchers to, e.g., find antecedents in terms of covariates and manipulations of disclosure behavior. In doing so, they might however oversimplify the actual structure of the disclosure behavior (i.e. some behaviors may be more strongly related than others), thereby violating one of the preconditions of unidimensional measurement.
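To make the exchangeability assumption concrete, the following minimal sketch contrasts a summed disclosure score with a per-category profile. The item names, the two categories, and the two participants are hypothetical and chosen purely for illustration; they are not taken from any of the datasets discussed in this paper.

```python
# Hypothetical disclosure records: 1 = disclosed, 0 = withheld.
# Item names and category groupings are illustrative assumptions only.
LOCATION_ITEMS = ["home_city", "current_gps", "workplace"]
INTEREST_ITEMS = ["music_taste", "political_views", "hobbies"]

alice = {"home_city": 1, "current_gps": 1, "workplace": 1,
         "music_taste": 0, "political_views": 0, "hobbies": 0}
bob = {"home_city": 0, "current_gps": 0, "workplace": 0,
       "music_taste": 1, "political_views": 1, "hobbies": 1}


def summed_score(person):
    """Unidimensional measure: every item counts equally toward one total."""
    return sum(person.values())


def profile(person):
    """Per-category disclosure rates, which keep differences in kind visible."""
    return {
        "location": sum(person[i] for i in LOCATION_ITEMS) / len(LOCATION_ITEMS),
        "interest": sum(person[i] for i in INTEREST_ITEMS) / len(INTEREST_ITEMS),
    }


# Both participants receive the same summed score ...
print(summed_score(alice), summed_score(bob))
# ... although their per-category profiles are exact opposites.
print(profile(alice))
print(profile(bob))
```

Under a summed score the two hypothetical participants are indistinguishable, whereas the per-category profile separates them completely.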

In this paper we consider both behavioral intentions and actual behaviors, and we provide results in section 7.5 that suggest the two are sufficiently related. We therefore refer to both of them as “behavior”, unless our argument calls for a distinction.

In this paper, we argue that information disclosure behaviors are in fact multidimensional, i.e. that different people have different tendencies to disclose different types of information. Our statistical results suggest that regardless of people’s overall tendency to disclose information, different categories of information can be distinguished and people can be classified into distinct groups that behave differently with regard to the disclosure of information belonging to these categories.

Classifying people according to their privacy concerns is not a new idea; in fact, one of the most cited results in privacy research is that people can be divided into three broad categories: privacy fundamentalists, pragmatists, and unconcerned (Harris et al., 2003a; Harris, 2000; Westin and Maurici, 1998; Westin et al., 1981). Our classification differs in two ways, though. First, we classify on behavior rather than attitudes, which, given the mentioned attitude-behavior gap, is arguably a more accurate classification. Second, we argue that privacy categorization should not just consider a difference in degree, but also a difference in kind: for example, one group may be less likely to disclose their location, while another group may be less likely to disclose their opinions.

Although the notion of multidimensional information disclosure behaviors and the idea of classifying people along these dimensions seem fairly straightforward, this approach has to date hardly ever been considered in the privacy literature. The multidimensional analyses by, e.g., Phelps, Nowak and Ferrell (2000), Spiekermann et al. (2001), Olson et al. (2005), Koshimizu et al. (2006) and Lusoli et al. (2012) are notable but limited exceptions.

The next section describes existing research that measures information disclosure behaviors, and shows that (with the mentioned exceptions) these measurements either do not make any assumptions about dimensionality, or are unidimensional in nature. Section 3 explains in more detail what it means for disclosure behaviors to be multidimensional, and discusses why it is important for researchers to conceptualize disclosure behaviors in this way. Section 4 details the 6-step analysis that we performed to describe the dimensionality of disclosure behavior, and clarifies in what way this approach is a step beyond the aforementioned multidimensional analyses. Sections 5 to 7 describe the results of our analyses of the dimensionality of disclosure behavior in three datasets and also discuss how different groups of users behave differently along the discovered dimensions. Two of the discussed datasets were previously collected (one of them by other researchers), and we try to uncover the dimensionality in these datasets ex post. The third dataset was specifically collected for this paper, and we formulate ex ante hypotheses about its underlying dimensional structure. Finally, section 8 draws conclusions and makes suggestions for future work.

2 Related work

Studies that focus on the disclosure of small amounts of personal information typically treat users’ disclosure of each requested item as independent:

• Acquisti, John and Loewenstein (2011, study 1) investigate the effect of social information on participants’ tendency to admit having engaged in six sensitive behaviors. They treat these behaviors independently, and show an effect of social information on all of them.

• Joinson et al. (2008, study 2) test the effect of priming participants with a privacy policy on their subsequent disclosure behavior. They allow participants to opt out of disclosure by either choosing “prefer not to say”, or by “blurring” their answer (providing a less concrete value). They use three items (income, religion and ethnicity), and show that when treated separately, the items on which their manipulation has an effect are different for men (income and religion) and women (ethnicity). It thus remains unclear whether the separated or rather the summated results should be considered as the best representation of participants’ behavior.

Treating the disclosure of different items as independent behaviors is certainly not wrong, but it typically comes at the cost of reduced statistical power. One also has to consider the family-wise error of performing a large number of statistical tests: when comparing users’ disclosure of 20 different items of personal information, one is likely to find one item that tests significant at the p < .05 level by pure chance. Studies that consider the disclosure of a larger number of items therefore typically use a summated composite score to represent disclosure behavior:

• In a series of studies, Metzger (2007, 2006, 2004) examines the effects of privacy policies, trust and previous experience on information disclosure to an online retailer. She creates two composite scores of disclosure: a simple sum, and a sum weighted by the relative sensitivity of the items.

• Similarly, Joinson et al. (2010) measure the effect of privacy and trust on a summated score of information disclosure. In study 1, they show that the effect of perceived privacy on a composite score of disclosure is mediated by trust. In study 2, they manipulate trust and privacy through interface cues, and measure the effect on another composite score of disclosure (comprising the four most sensitive items from study 1). They find that disclosure is substantially lower only when the system employs both a weak privacy policy and cues designed to reduce trust.

• Similarly, John, Acquisti and Loewenstein (2011) investigate the effect of contextual cues (a frivolous survey versus a serious survey) on the admission of sensitive behaviors. They sum up participants’ answers into a single score (the “affirmative admission rate”), and show that these rates are higher for frivolous surveys than for serious surveys.

It would be interesting to revisit the data from each of these experiments and analyze the dimensionality of the disclosure behaviors. For example, a closer inspection of John et al.’s (2011) results reveals that the effect of contextual cues differs per behavior. For instance, in study 1B, the effect seems to be strongest for financial behaviors, whereas in study 2A the effect is most pronounced for legal sexual acts. A factor analysis of the specific behaviors could categorize the behaviors in interesting ways, possibly leading to new insights. The same is true for the studies of Metzger and Joinson et al., because the disclosure items requested in these studies span a wide range of domains.
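The family-wise error concern raised earlier in this section (20 separate tests at the p < .05 level) can be quantified with a short calculation. The sketch below assumes that the tests are independent and that no true effect exists for any item; both are simplifying assumptions made for illustration, and the function names are our own.

```python
def family_wise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests,
    assuming the null hypothesis is true for every test."""
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests, alpha=0.05):
    """Expected number of tests that come out significant by chance alone."""
    return num_tests * alpha


# Chance of at least one spurious "significant" item among 20 tested items.
print(round(family_wise_error_rate(20), 3))
# Expected number of spurious significant items among 20 tested items.
print(expected_false_positives(20))
```

Under these assumptions, testing 20 items separately yields on average one spurious finding, and the chance of at least one is roughly 64%.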

There are other studies that group items into a number of distinct scales, but most often these groups are based on the sensitivity of the item instead of a tested underlying dimensionality:

• In study 1, Joinson et al. (2008) distinguish between sensitive and non-sensitive items. Their results show that priming had an effect on both types of information.

• In study 2, Acquisti, John and Loewenstein (2011) study the effect of request order on “tame”, “moderate” and “intrusive” items. Request order only seems to influence the disclosure of intrusive items.

• Knapp and Kirk (2003) test the effect of different survey administration methods (touch-tone phone, Internet and paper) on the disclosure of 60 items ranging from innocuous (“Do you own a pet?”) to very sensitive (“Have you ever been in jail?”). They test the effect of administration method on each of these items separately, and do not find any effect. Subsequently, they sum items for three different sensitivity levels, but still find no effect.

Although these studies make an effort to test the effect of their manipulation on different groups of items, we contend that a factor analysis of the reported behavior could categorize the disclosure of the behaviors differently, potentially leading to interesting new results. For instance, the Acquisti et al. items seem to fall along the dimensions of sexual, financial, larcenous, work-related, and impression management behaviors. It would be interesting to see how their manipulations have different effects on each of these different types of data. Although it is not certain that a domain-related grouping is more insightful than a sensitivity-related grouping, a data-driven dimensionality approach would arguably have resulted in more robust dimensions. For example, Knapp and Kirk acknowledge that the Cronbach’s alphas of their composite scores are low (between 0.37 and 0.61) because items within each group are from non-related domains. Domain-specific composite scores would likely have resulted in higher Cronbach’s alphas.
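The expectation that domain-specific composites yield higher Cronbach’s alphas can be illustrated with simulated data. The sketch below is not a reanalysis of Knapp and Kirk’s data: the two domains (“finance” and “health”), the continuous item scores, and the assumption that the two domain tendencies are uncorrelated are all hypothetical choices made only to show the mechanism.

```python
import random
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(rows[0])
    item_variances = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def simulate(n=2000, seed=1):
    """Simulate four items per domain; each item = domain tendency + noise."""
    rng = random.Random(seed)
    domain_rows, mixed_rows = [], []
    for _ in range(n):
        finance = rng.gauss(0, 1)  # hypothetical finance disclosure tendency
        health = rng.gauss(0, 1)   # hypothetical health disclosure tendency
        finance_items = [finance + rng.gauss(0, 1) for _ in range(4)]
        health_items = [health + rng.gauss(0, 1) for _ in range(4)]
        # Composite built from one domain only.
        domain_rows.append(finance_items)
        # Composite of equal length that mixes items from both domains.
        mixed_rows.append(finance_items[:2] + health_items[:2])
    return domain_rows, mixed_rows


domain_rows, mixed_rows = simulate()
alpha_domain = cronbach_alpha(domain_rows)
alpha_mixed = cronbach_alpha(mixed_rows)
print(f"domain-specific alpha: {alpha_domain:.2f}")
print(f"mixed-domain alpha:    {alpha_mixed:.2f}")
```

Both composites contain four items, so any difference in alpha stems from how strongly the items within each composite are related, not from scale length.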

Studying a music recommender, Van de Garde-Perik et al. (2008) make a distinction between music preference items and personality trait items. Their use of two different information types is in line with our approach, but the authors unfortunately do not request users to make a separate decision per individual item (only per information type), and they do not report the correlation between participants’ disclosures of the two information types.

Finally, Norberg, Horne and Horne (2007) test the effect of trust on the intentions to disclose, and on the actual disclosure of 17 items of personal information. They show that participants’ intentions to disclose information are lower than their actual levels of disclosure, regardless of whether the receiving party is a trustworthy bank or a less trustworthy pharmaceutical company. Moreover, they show that perceptions of risk, but not of trust, are related to participants’ intention to disclose. Interestingly, the authors use different items for the pharmaceutical company (more health-related items) than for the bank (more finance-related items). This invalidates their additional finding that both intention to disclose and actual disclosure are higher for the bank than for the pharmaceutical company, because the metrics taken in both cases are incomparable. The authors should have taken a multidimensional approach instead, and should have considered how separate measures of general, financial and health-related disclosures differ between the two scenarios.

3 Multidimensionality: why bother?

The previous section demonstrates that most existing research on disclosure behavior either does not make any assumptions about its dimensionality, or regards behavior as unidimensional. We argue instead that disclosure behaviors are in fact multidimensional. What does that mean? Let’s take a hypothetical website that asks users to (optionally) disclose ten items of personal information, $I_{1 \ldots 10}$. Most researchers would agree that people’s tendencies to disclose these individual items are correlated (i.e. people have an overall disclosure tendency that holds for any type of information). However, it may be the case that the correlations among the disclosure tendencies for items $I_{1 \ldots 5}$ are stronger than the correlations of these items with the other items, and the same may hold true for $I_{6 \ldots 10}$. In that case, there are potentially two (correlated) factors of disclosure behavior underlying the disclosure of these ten items. In other words, there are two disclosure tendencies: the tendency to disclose $I_{1 \ldots 5}$ and the tendency to disclose $I_{6 \ldots 10}$. This essentially means that although there may be some people who have no problems disclosing all ten items and some people who do not disclose any of them, there also exists a sizable group of
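The hypothetical ten-item scenario of this section can be sketched as a small simulation. This is only an illustration under stated assumptions, not the analysis performed in this paper: disclosure tendencies are modeled as continuous scores, each of the two factors is an overall tendency plus a factor-specific component, and all names and parameter values are our own.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_items(n=3000, seed=7):
    """Return ten columns of simulated disclosure tendencies (items 1 to 10)."""
    rng = random.Random(seed)
    columns = [[] for _ in range(10)]
    for _ in range(n):
        overall = rng.gauss(0, 1)              # overall disclosure tendency
        factor_a = overall + rng.gauss(0, 1)   # tendency for items 1 to 5
        factor_b = overall + rng.gauss(0, 1)   # tendency for items 6 to 10
        for j in range(10):
            tendency = factor_a if j < 5 else factor_b
            columns[j].append(tendency + rng.gauss(0, 1))  # item-specific noise
    return columns


def mean_correlation(columns, pairs):
    """Average Pearson correlation over the given pairs of item indices."""
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


columns = simulate_items()
all_pairs = list(combinations(range(10), 2))
within = [(i, j) for i, j in all_pairs if (i < 5) == (j < 5)]
between = [(i, j) for i, j in all_pairs if (i < 5) != (j < 5)]
within_r = mean_correlation(columns, within)
between_r = mean_correlation(columns, between)
print(f"mean correlation within item blocks:  {within_r:.2f}")
print(f"mean correlation between item blocks: {between_r:.2f}")
```

All items correlate positively because of the shared overall tendency, but items within the same block correlate more strongly with each other than with items from the other block, which is the pattern that points to two correlated factors.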


References

Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives

The theory of planned behavior

Significance tests and goodness of fit in the analysis of covariance structures

Attitude-behavior relations: A theoretical analysis and review of empirical research.

Internet Users' Information Privacy Concerns (IUIPC): The Construct, the Scale, and a Causal Model
