Showing papers by "Larry Korba" published in 2008


Book ChapterDOI
21 Sep 2008
TL;DR: This paper focuses on PII discovery, i.e., automatically identifying private data existing in semi-structured and unstructured (free-text) documents, and develops technology that would automatically discover workflow across organizational collaborators that would include private data.
Abstract: With the growing use of computers and the Internet, it has become difficult for organizations to locate and effectively manage sensitive personally identifiable information (PII). This problem becomes even more evident in collaborative computing environments. PII may be hidden anywhere within the file system of a computer. In addition, in the course of different activities, whether collaborative or not, PII may migrate from computer to computer. This makes meeting organizational privacy requirements all the more complex. Our particular interest is to develop technology that would automatically discover workflow across organizational collaborators that would include private data. Because it is important in this context to understand where and when the private data is discovered, this paper focuses on PII discovery, i.e., automatically identifying private data existing in semi-structured and unstructured (free-text) documents. The first part of the process involves identifying PII via named entity recognition. The second part determines relationships between those entities using a supervised machine learning method. We present test results of our methods using publicly available data generated from different collaborative activities to provide an assessment of scalability in cooperative computing environments.

59 citations


Journal ArticleDOI
TL;DR: The content of an Internet or Web service security policy is derived and a flexible security personalization approach is proposed that will allow an Internet or Web service provider and customer to negotiate to an agreed-upon personalized security policy.
Abstract: The growth of the Internet has been accompanied by the growth of Internet services (e.g., e-commerce, e-health). This proliferation of services and the increasing attacks on them by malicious individuals have highlighted the need for service security. The security requirements of an Internet or Web service may be specified in a security policy. The provider of the service is then responsible for implementing the security measures contained in the policy. However, a service customer or consumer may have security preferences that are not reflected in the provider’s security policy. In order for service providers to attract and retain customers, as well as reach a wider market, a way of personalizing a security policy to a particular customer is needed. We derive the content of an Internet or Web service security policy and propose a flexible security personalization approach that will allow an Internet or Web service provider and customer to negotiate to an agreed-upon personalized security policy. In addition, we present two application examples of security policy personalization and give an overview of the design of our security personalization prototype.

22 citations


Journal ArticleDOI
TL;DR: A new group key management protocol is proposed and it is demonstrated that it has better scalability when compared with other important centralized protocols.
Abstract: Group key management poses scalability challenges for multicast security. In this paper, we propose a new group key management protocol and demonstrate that it has better scalability when compared with other important centralized protocols.

15 citations


Book ChapterDOI
08 Oct 2008
TL;DR: This paper addresses the problem of predicting the presence of private information in email using data mining and text mining methods and proposes two prediction models based on association rules and classification models that predict private information according to the content of the emails.
Abstract: Private information management and compliance are important issues for most organizations today. As a major communication tool for organizations, email is one of the many potential sources of privacy leaks. Information extraction methods have been applied to detect private information in text files. However, since email messages usually consist of low-quality text, information extraction methods for private information detection may not achieve good performance. In this paper, we address the problem of predicting the presence of private information in email using data mining and text mining methods. Two prediction models are proposed. The first model is based on association rules that predict one type of private information based on other types of private information identified in emails. The second model is based on classification models that predict private information according to the content of the emails. Experiments on the Enron email dataset show promising results.

11 citations
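
The first model in the entry above predicts one type of private information from other types already identified in an email, using association rules. The following is a minimal Python sketch of that general idea, not the authors' implementation: the toy data, the type labels ("name", "phone", "address"), the function names, and the thresholds are all assumptions made for illustration. It computes the standard support and confidence measures for rules of the form {a} -> {b}.

```python
from itertools import permutations

# Hypothetical toy data: each email is represented by the set of
# private-information types that were identified in it.
emails = [
    {"name", "phone"},
    {"name", "phone", "address"},
    {"name", "address"},
    {"phone"},
    {"name", "phone"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


def single_item_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Enumerate rules {a} -> {b} that pass both (illustrative) thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        s = support({a, b}, transactions)
        c = confidence({a}, {b}, transactions)
        if s >= min_support and c >= min_confidence:
            rules.append((a, b, s, c))
    return rules


if __name__ == "__main__":
    for a, b, s, c in single_item_rules(emails):
        print(f"{a} -> {b}: support={s:.2f}, confidence={c:.2f}")
```

A rule such as "address -> name" with high confidence would suggest that an email found to contain an address is also likely to contain a name. A practical system would mine multi-item rules with an algorithm such as Apriori rather than enumerating pairs.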


Book ChapterDOI
28 May 2008
TL;DR: A method to identify topics in email messages that uses fuzzy membership functions to rank concepts based on features of the emails, such as the senders, recipients, time span, and frequency of emails in the concepts.
Abstract: In this paper, we present a method to identify topics in email messages. Formal concept analysis is adopted as a semantic analysis method to group emails containing the same keywords into concepts. Fuzzy membership functions are used to rank the concepts based on features of the emails, such as the senders, recipients, time span, and frequency of emails in the concepts. The highly ranked concepts are then identified as email topics. Experimental results on the Enron email dataset illustrate the effectiveness of the method.

7 citations
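
The grouping step in the entry above relies on formal concept analysis, where a concept pairs a set of emails (the extent) with the set of keywords they all share (the intent). The sketch below illustrates only that grouping step in Python, under stated assumptions: the email ids, keywords, and function names are hypothetical, the enumeration is a brute-force closure over all subsets (suitable only for tiny examples), and the fuzzy ranking described in the abstract is not reproduced because the source does not specify it.

```python
from itertools import combinations

# Hypothetical toy context: email id -> keywords appearing in that email.
context = {
    "e1": {"budget", "meeting"},
    "e2": {"budget", "meeting", "report"},
    "e3": {"report"},
}


def common_keywords(email_ids, context):
    """Keywords shared by all given emails (every keyword if no emails are given)."""
    shared = set().union(*context.values())
    for email_id in email_ids:
        shared &= context[email_id]
    return shared


def emails_with(keywords, context):
    """Emails that contain every keyword in `keywords`."""
    return {e for e, kws in context.items() if keywords <= kws}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of emails.

    Exponential in the number of emails; intended for illustration only.
    """
    concepts = set()
    ids = sorted(context)
    for size in range(len(ids) + 1):
        for subset in combinations(ids, size):
            intent = common_keywords(subset, context)
            extent = emails_with(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(context), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each resulting pair is a candidate topic: for example, the emails e1 and e2 form a concept because both contain "budget" and "meeting" and no other email does.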


Book ChapterDOI
01 Jan 2008

4 citations


Book ChapterDOI
02 Sep 2008
TL;DR: An adaptation of the Latent Dirichlet Allocation model is proposed for analyzing email corpora, and it is shown that this method obtains better performance than the Author-Topic model.
Abstract: Analyzing the author and topic relations in an email corpus is an important issue in both social network analysis and text mining. The Author-Topic model is a statistical model that identifies the author-topic relations. However, in its inference process, it ignores the information at the document level, i.e., the co-occurrence of words within documents is not taken into account in deriving topics. This may not be suitable for email analysis. We propose to adapt the Latent Dirichlet Allocation model for analyzing email corpora. This method takes into account both the author-document relations and the document-topic relations. We use the Author-Topic model as the baseline method and propose measures to compare our method against it. We conducted an empirical analysis based on experimental results on both simulated data sets and the real Enron email data set to show that our method obtains better performance than the Author-Topic model.

3 citations


Book ChapterDOI
21 Sep 2008
TL;DR: This paper proposes a cooperative visualization technique that can be employed by service providers to understand how private information flows within their organizations, as a way of identifying privacy risks or vulnerabilities that can lead to violations of privacy legislation.
Abstract: The growth of the Internet has been accompanied by the growth of e-services (e.g., e-commerce, e-health). This proliferation of e-services has put large quantities of customer private information in the hands of service providers, who in many cases have mishandled the information to the detriment of customer privacy. As a result, government bodies have put in place privacy legislation that spells out the privacy rights of customers and how their private information is to be handled. Service providers are required to comply with this privacy legislation. This paper proposes a cooperative visualization technique that can be employed by service providers to understand how private information flows within their organizations, as a way of identifying privacy risks or vulnerabilities that can lead to violations of privacy legislation. The description of the technique includes a model of how an e-service uses private information, a graphical notation for the visualization, and an application example.

2 citations


01 Jan 2008
TL;DR: The method and apparatus disclosed are particularly applicable to supporting a telecommunications cable while cooling extruded jacketing material on the outer surface of the cable.
Abstract: Fluid is directed against an article of indefinite length to support the article as it is moved along a passline. The fluid may be used to modify the temperature of the article. The method and apparatus disclosed are particularly applicable to supporting a telecommunications cable while cooling extruded jacketing material on the outer surface of the cable.

1 citation


Book ChapterDOI
07 Sep 2008
TL;DR: An approach for quantitatively assessing the likelihood that an organization will comply with privacy policy is described.
Abstract: Individuals interact with organizations in many different capacities (e.g. as clients, as employees). Many of these interactions require the individual to submit her personal information to the organization, which may claim compliance with privacy policy. It is important to assess this compliance quantitatively. This paper describes an approach for quantitatively assessing the likelihood that an organization will comply with privacy policy.

1 citation