Dimensionality of information disclosure behavior

Bart P. Knijnenburg (a, b), bart.k@uci.edu (corresponding author)
Alfred Kobsa (a), kobsa@uci.edu
Hongxia Jin (b), hongxia.jin@sisa.samsung.com
(a) Donald Bren School of Information and Computer Sciences, University of California, Irvine,
6210 Donald Bren Hall, Irvine, CA 92697, USA
(b) Samsung Information Systems of America, 75 West Plumeria Drive, San Jose, CA 95134, USA
Abstract
In studies of people’s privacy behavior, the extent of disclosure of personal information is
typically measured as a summed total or a ratio of disclosure. In this paper, we evaluate three
information disclosure datasets using a six-step statistical analysis, and show that people’s
disclosure behaviors are rather multidimensional: participants’ disclosure of personal information
breaks down into a number of distinct factors. Moreover, people can be classified along these
dimensions into groups with different “disclosure styles”. This difference is not merely in degree,
but rather also in kind: one group may for instance disclose location-related but not interest-
related items, whereas another group may behave exactly the other way around. We also found
other significant differences between these groups, in terms of privacy attitudes, behaviors, and
demographic characteristics. These might for instance allow an online system to classify its users
into their respective privacy group, and to adapt its privacy practices to the disclosure style of this
group. We discuss how our results provide relevant insights for a more user-centric approach to
privacy and, more generally, advance our understanding of online privacy behavior.
Keywords
Privacy behavior, privacy attitude, information disclosure, measurement, factor analysis, latent
class analysis, structural equation modeling.
1 Introduction
Privacy is a very active research topic: every year, over 1200 new books and journal articles are
published with this word in the title (Patil and Kobsa, 2009). Even in the sub-realm of online
privacy, these publications cover a wide range of disciplines, such as human-computer interaction
(Iachello and Hong, 2007), information systems (Bélanger and Crossler, 2011), personalization
(Kobsa, 2007), behavioral economics (Acquisti and Grossklags, 2008), marketing (Caudill and
Murphy, 2000), and social psychology (Joinson and Paine, 2007; Joinson et al., 2010). Although
attempts to integrate the contributions of these different fields exist (Knijnenburg and Kobsa,
2013a; Smith et al., 2011), this has proven to be a difficult task, as each discipline has its own
conceptualization of the notion of “privacy” (Smith et al., 2011).
International Journal of Human-Computer Studies 71(12), 1144-1162 (Special Issue on Privacy Methodologies in HCI)

Two notions that do recur across these disciplines are the concepts of privacy attitudes and
privacy behaviors. Moreover, researchers seem to agree that attitudes have a significant impact
on behavior. In fact, the fundamental theory behind integrative models of privacy research
(Knijnenburg and Kobsa, 2013b; Li, 2011; Smith et al., 2011; Xu et al., 2008) is Ajzen and
Fishbein’s Theory of Reasoned Action, which describes the link between attitudes and behaviors
(Ajzen and Fishbein, 1977). However, Ajzen and Fishbein noted an “attitude-behavior gap” in
their seminal work: the link between attitudes and behavior is not very strong. There is strong
evidence in privacy research that such a gap exists for privacy as well (Acquisti and Grossklags,
2005; Acquisti, 2004; Metzger, 2006; Norberg et al., 2007; Spiekermann et al., 2001; van de
Garde-Perik et al., 2008). Norberg et al. (2007) use the term “privacy paradox” to refer to this
discrepancy between stated privacy attitudes and actual privacy behavior. Ajzen and Fishbein
remedy this problem by introducing the mediating concept of behavioral intentions, but they note
that even these intentions are not always perfectly correlated with actual behavior (Ajzen, 1991).
Privacy researchers suggest that the paradox can be overcome by studying people’s behavior in
realistic situations instead of lab experiments (Bennett, 1995; Smith et al., 2011).
For the measurement of attitudes, several scales have been developed, often through a reasonably
rigorous scale-development process (Dinev and Hart, 2004; Malhotra et al., 2004; Smith et al.,
1996; Stewart and Segars, 2002). In each of these efforts, privacy concerns turned out to be
multidimensional: privacy attitude is not a single construct, but an interplay of correlated but
conceptually distinct aspects, such as “control”, “collection”, “identification”, “improper access”,
“unauthorized use” and “awareness”.
In contrast to attitudes, comparatively few studies have been conducted on privacy-related
behavior, and specifically on information disclosure behavior¹. The research in this area can be
broadly divided into two approaches (see section 2 for a detailed literature overview). The first
approach regards the disclosure of each item of personal information (e.g. location, gender,
income) as a separate decision, and makes no assumptions about correlations between these
decisions. In the absence of a theory of how different disclosure behaviors are related, this work
does not define an overall measure of a person’s rate of disclosure (or disclosure tendency). This
work typically also does not try to explain how disclosure behaviors come about, or how they can
be influenced.
The other approach treats the aggregate of individual disclosure behaviors as a single scale (see
section 2). By summing individual disclosures into an overall measurement of disclosure
tendency, these researchers make an implicit assumption of unidimensionality of the information
disclosures (i.e. they assume that all items belong to the same scale), and even exchangeability of
the disclosed items (i.e. they assume that each item contributes the same amount of “evidence” to
the scale). The construction of a disclosure tendency scale allows these researchers to, e.g., find
antecedents in terms of covariates and manipulations of disclosure behavior. In doing so, they
might however oversimplify the actual structure of the disclosure behavior (i.e. some behaviors
may be more strongly related than others), thereby violating one of the preconditions of
unidimensional measurement.
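
The exchangeability assumption can be made concrete with a minimal Python sketch. The two participants, the six item names, and the split into a "location" and an "interests" category are hypothetical and chosen purely for illustration; they are not taken from any of the datasets analyzed in this paper. Both participants disclose three of the six items, so a summed score treats them as identical even though they disclose entirely different types of information.

```python
# Hypothetical items and disclosures (1 = disclosed, 0 = withheld); illustrative only.
ITEMS = ["city", "gps_position", "home_address", "music_taste", "hobbies", "favorite_books"]

disclosures = {
    "participant_a": [1, 1, 1, 0, 0, 0],  # discloses the location-related items only
    "participant_b": [0, 0, 0, 1, 1, 1],  # discloses the interest-related items only
}


def summed_score(row):
    """Unidimensional disclosure score: every item counts equally."""
    return sum(row)


def domain_scores(row):
    """Two-dimensional alternative: one score per (assumed) item category."""
    return {"location": sum(row[:3]), "interests": sum(row[3:])}


for name, row in disclosures.items():
    print(name, "sum =", summed_score(row), "per domain =", domain_scores(row))
```

The summed scores coincide while the per-domain scores are mirror images, which is the kind of structure a single disclosure scale cannot represent.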
¹ In this paper we consider both behavioral intentions and actual behaviors, and we provide
results in section 7.5 that suggest the two are sufficiently related. We therefore refer to both of
them as “behavior”, unless our argument calls for a distinction.

In this paper, we argue that information disclosure behaviors are in fact multidimensional, i.e. that
different people have different tendencies to disclose different types of information. Our
statistical results suggest that regardless of people’s overall tendency to disclose information,
different categories of information can be distinguished and people can be classified into distinct
groups that behave differently with regard to the disclosure of information belonging to these
categories.
Classifying people according to their privacy concerns is not a new idea; in fact, one of the most
cited results in privacy research is that people can be divided into three broad categories: privacy
fundamentalists, pragmatists, and unconcerned (Harris et al., 2003a; Harris, 2000; Westin and
Maurici, 1998; Westin et al., 1981). Our classification is different though in two ways: First, we
classify on behavior rather than attitudes, which, given the mentioned attitude-behavior gap, is
arguably a more accurate classification. Second, we argue that privacy categorization should not
just consider a difference in degree, but also a difference in kind: for example, one group may be
less likely to disclose their location, while another group may be less likely to disclose their
opinions.
Although the notion of multidimensional information disclosure behaviors and the idea of
classifying people along these dimensions seem fairly straightforward, this approach has to date
hardly ever been considered in the privacy literature. The multidimensional analyses by, e.g.,
Phelps, Nowak and Ferrell (2000), Spiekermann et al. (2001), Olson et al. (2005), Koshimizu et
al. (2006) and Lusoli et al. (2012) are notable but limited exceptions.
The next section describes existing research that measures information disclosure behaviors, and
shows that (with the mentioned exceptions) these measurements either do not make any
assumptions about dimensionality, or are unidimensional in nature. Section 3 explains in more
detail what it means for disclosure behaviors to be multidimensional, and discusses why it is
important for researchers to conceptualize disclosure behaviors in this way. Section 4 details the
6-step analysis that we performed to describe the dimensionality of disclosure behavior, and
clarifies in what way this approach is a step beyond the aforementioned multidimensional
analyses. Sections 5 to 7 describe the results of our analyses of the dimensionality of disclosure
behavior in three datasets and also discuss how different groups of users behave differently along
the discovered dimensions. Two of the discussed datasets were previously collected (one of them
by other researchers), and we try to uncover the dimensionality in these datasets ex post. The third
dataset was specifically collected for this paper, and we formulate ex ante hypotheses about its
underlying dimensional structure. Finally, section 8 draws conclusions and makes suggestions for
future work.
2 Related work
Studies that focus on the disclosure of small amounts of personal information typically treat
users’ disclosure of each requested item as independent:
• Acquisti, John and Loewenstein (2011, study 1) investigate the effect of social
  information on participants’ tendency to admit having engaged in six sensitive behaviors.
  They treat these behaviors independently, and show an effect of social information on all
  of them.
• Joinson et al. (2008, study 2) test the effect of priming participants with a privacy policy
  on their subsequent disclosure behavior. They allow participants to opt out of disclosure
  by either choosing “prefer not to say”, or by “blurring” their answer (providing a less
  concrete value). They use three items (income, religion and ethnicity), and show that a
  privacy policy has an effect when these three behaviors are summed together. However,
  when treated separately, the items on which their manipulation has an effect are different
  for men (income and religion) and women (ethnicity). It thus remains unclear whether the
  separated or rather the summated results should be considered as the best representation
  of participants’ behavior.
Treating the disclosure of different items as independent behaviors is certainly not wrong, but it
typically comes at the cost of reduced statistical power. One also has to consider the family-wise
error of performing a large number of statistical tests: when comparing users’ disclosure of 20
different items of personal information, one is likely to find one item that tests significant at the p
< .05 level by pure chance. Studies that consider the disclosure of a larger number of items
therefore typically use a summated composite score to represent disclosure behavior:
• In a series of studies, Metzger (2007, 2006, 2004) examines the effects of privacy
  policies, trust and previous experience on information disclosure to an online retailer. She
  creates two composite scores of disclosure: a simple sum, and a sum weighted by the
  relative sensitivity of the items.
• Similarly, Joinson et al. (2010) measure the effect of privacy and trust on a summated
  score of information disclosure. In study 1, they show that the effect of perceived privacy
  on a composite score of disclosure is mediated by trust. In study 2, they manipulate trust
  and privacy through interface cues, and measure the effect on another composite score of
  disclosure (comprising the four most sensitive items from study 1). They find that
  disclosure is substantially lower only when the system employs both a weak privacy
  policy and cues designed to reduce trust.
• Similarly, John, Acquisti and Loewenstein (2011) investigate the effect of contextual
  cues (a frivolous survey versus a serious survey) on the admission of sensitive behaviors.
  They sum up participants’ answers into a single score (the “affirmative admission rate”),
  and show that these rates are higher for frivolous surveys than for serious surveys.
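
The family-wise error concern that motivates these composite scores can be quantified with a short calculation. The sketch below assumes that the 20 item-level tests are independent and that no true effect exists for any item; both are simplifying assumptions made for illustration. The Bonferroni threshold is shown only as one standard correction and is not a procedure used in the studies above.

```python
def family_wise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


alpha, n_items = 0.05, 20

fwer = family_wise_error_rate(alpha, n_items)
expected_false_positives = alpha * n_items
bonferroni_alpha = alpha / n_items  # per-test threshold that keeps the FWER at or below alpha

print(f"family-wise error rate: {fwer:.2f}")  # about 0.64
print(f"expected number of false positives: {expected_false_positives:.1f}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.4f}")
```

Under these assumptions one spurious significant item is the expected outcome, and the chance of at least one is roughly two in three.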
It would be interesting to revisit the data from each of these experiments and analyze the
dimensionality of the disclosure behaviors. For example, a closer inspection of John et al.’s
(2011) results reveals that the effect of contextual cues differs per behavior. For instance, in study
1B, the effect seems to be strongest for financial behaviors, whereas in study 2A the effect is
most pronounced for legal sexual acts. A factor analysis of the specific behaviors could
categorize the behaviors in interesting ways, possibly leading to new insights. The same is true
for the studies of Metzger and Joinson et al., because the disclosure items requested in these
studies span a wide range of domains.
There are other studies that group items into a number of distinct scales, but most often these
groups are based on the sensitivity of the item instead of a tested underlying dimensionality:
• In study 1, Joinson et al. (2008) distinguish between sensitive and non-sensitive
  items. Their results show that priming had an effect on both types of information.
• In study 2, Acquisti, John and Loewenstein (2011) study the effect of request
  order on “tame”, “moderate” and “intrusive” items. Request order only seems to
  influence the disclosure of intrusive items.
• Knapp and Kirk (2003) test the effect of different survey administration methods (touch-
  tone phone, Internet and paper) on the disclosure of 60 items ranging from innocuous
  (“Do you own a pet?”) to very sensitive (“Have you ever been in jail?”). They test the
  effect of administration method on each of these items separately, and do not find any
  effect. Subsequently, they sum items for three different sensitivity levels, but still find no
  effect.
Although these studies make an effort to test the effect of their manipulation on different groups
of items, we contend that a factor analysis of the reported behavior could categorize the
disclosure of the behaviors differently, potentially leading to interesting new results. For instance,
the Acquisti et al. items seem to fall along the dimensions of sexual, financial, larcenous, work-
related, and impression management behaviors. It would be interesting to see how their
manipulations have different effects on each of these different types of data. Although it is not
certain that a domain-related grouping is more insightful than a sensitivity-related grouping, a
data-driven dimensionality approach would arguably have resulted in more robust dimensions.
For example, Knapp and Kirk acknowledge that the Cronbach’s alphas of their composite scores
are low (between 0.37 and 0.61) because items within each group are from non-related domains.
Domain-specific composite scores would likely have resulted in higher Cronbach’s alphas.
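
The argument about Cronbach's alpha can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of Knapp and Kirk's data: the two latent disclosure tendencies, their independence, the four binary items per domain, and the sample size of 2000 are all hypothetical choices. The function implements the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score).

```python
import random
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    n_items = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def simulate_participant(rng):
    """Four binary disclosures per domain, each driven by that domain's latent tendency."""
    tendency_a = rng.gauss(0, 1)
    tendency_b = rng.gauss(0, 1)  # independent of tendency_a by assumption
    domain_a = [int(tendency_a + rng.gauss(0, 1) > 0) for _ in range(4)]
    domain_b = [int(tendency_b + rng.gauss(0, 1) > 0) for _ in range(4)]
    return domain_a, domain_b


rng = random.Random(0)
participants = [simulate_participant(rng) for _ in range(2000)]

# Composite built from one domain versus a composite mixing two unrelated domains.
domain_rows = [a for a, _ in participants]
mixed_rows = [a[:2] + b[:2] for a, b in participants]

alpha_domain = cronbach_alpha(domain_rows)
alpha_mixed = cronbach_alpha(mixed_rows)
print(f"domain-specific alpha: {alpha_domain:.2f}")
print(f"mixed-domain alpha:    {alpha_mixed:.2f}")
```

With the same number of items in each composite, the mixed-domain score should come out noticeably less reliable, because half of its item pairs share no underlying tendency.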
Studying a music recommender, Van de Garde-Perik et al. (2008) make a distinction between
music preference items and personality trait items. Their use of two different information types is
in line with our approach, but the authors unfortunately do not request users to make a separate
decision per individual item (only per information type), and they do not report the correlation
between participants’ disclosures of the two information types.
Finally, Norberg, Horne and Horne (2007) test the effect of trust on the intentions to disclose, and
on the actual disclosure of 17 items of personal information. They show that participants’
intentions to disclose information are lower than their actual levels of disclosure, regardless of
whether the receiving party is a trustworthy bank or a less trustworthy pharmaceutical company.
Moreover, they show that perceptions of risk, but not of trust, are related to participants’ intention
to disclose. Interestingly, the authors use different items for the pharmaceutical company (more
health-related items) than for the bank (more finance-related items). This invalidates their
additional finding that both intention to disclose and actual disclosure are higher for the bank than
for the pharmaceutical company, because the metrics taken in both cases are incomparable. The
authors should have taken a multi-dimensional approach instead, and should have considered how
separate measures of general, financial and health-related disclosures differ between the two
scenarios.
3 Multidimensionality: why bother?
The previous section demonstrates that most existing research on disclosure behavior either does
not make any assumptions about its dimensionality, or regards behavior as unidimensional. We
argue instead that disclosure behaviors are in fact multidimensional. What does that mean? Let’s
take a hypothetical website that asks users to (optionally) disclose ten items of personal
information, I₁…I₁₀. Most researchers would agree that people’s tendencies to disclose these
individual items are correlated (i.e. people have an overall disclosure tendency that holds for any
type of information). However, it may be the case that the correlations among the disclosure
tendencies for items I₁…I₅ are stronger than the correlations for these items with the other items,
and the same may hold true for I₆…I₁₀. In that case, there are potentially two (correlated) factors of
disclosure behavior underlying the disclosure of these ten items. In other words: there are two
disclosure tendencies: the tendency to disclose I₁…I₅ and the tendency to disclose I₆…I₁₀. This
essentially means that although there may be some people who have no problems disclosing all
ten items and some people who do not disclose any of them, there also exists a sizable group of
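
The hypothetical ten-item example can be simulated to show the correlation pattern that two correlated factors produce. The sketch below is illustrative only: the shared general tendency, the two factor-specific components, the unit variances, the continuous (rather than binary) item scores, and the sample size of 1000 are all assumptions introduced here. Items 1 to 5 depend on the first factor and items 6 to 10 on the second.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (spread_x * spread_y) ** 0.5


rng = random.Random(1)
columns = [[] for _ in range(10)]  # one list of scores per item

for _ in range(1000):
    general = rng.gauss(0, 1)             # overall disclosure tendency
    factor_1 = general + rng.gauss(0, 1)  # tendency to disclose items 1 to 5
    factor_2 = general + rng.gauss(0, 1)  # tendency to disclose items 6 to 10
    for i in range(10):
        factor = factor_1 if i < 5 else factor_2
        columns[i].append(factor + rng.gauss(0, 1))  # item-specific noise

within, between = [], []
for i, j in combinations(range(10), 2):
    same_block = (i < 5) == (j < 5)
    (within if same_block else between).append(pearson(columns[i], columns[j]))

print(f"mean correlation within a block:  {mean(within):.2f}")
print(f"mean correlation between blocks: {mean(between):.2f}")
```

All item pairs correlate positively because of the shared general tendency, but pairs within the same block correlate more strongly, which is the pattern a factor analysis would pick up as two correlated factors.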

References
Hu, L., Bentler, P.M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives.
Ajzen, I. (1991). The theory of planned behavior.
Bentler, P.M., Bonett, D.G. (1980). Significance tests and goodness of fit in the analysis of covariance structures.
Ajzen, I., Fishbein, M. (1977). Attitude-behavior relations: A theoretical analysis and review of empirical research.
Malhotra, N.K., Kim, S.S., Agarwal, J. (2004). Internet Users' Information Privacy Concerns (IUIPC): The construct, the scale, and a causal model.