Proceedings Article, DOI: 10.1109/CSCI.2017.10

A Highly-Reliable Virtual Primary Key Scheme for Relational Database Watermarking Techniques

TL;DR: This paper proposes a scheme that removes the dependency on the primary key, avoids non-unique virtual primary key values, and is more resilient to attribute deletion attacks than previous schemes.
Abstract: Watermarking techniques for relational data have been proposed to allow copyright protection and data authenticity, among other things. Almost all proposals depend on the primary key of the database relations for deciding where and how to place the marks. The primary key can easily be updated or deleted if the attacker does not need to place the watermarked data back in the database. The few techniques that try to avoid this dependency create a virtual primary key through schemes that frequently compute non-unique values, which may cause watermark synchronization problems. Also, the deletion of attributes can prevent the scheme from recomputing the virtual primary key values that were used for mark embedding. In this paper, we propose a solution that removes the dependency on the primary key, avoids the problem of non-unique values, and shows more resilience against attribute deletion attacks than previous schemes.
Citations
Journal Article
TL;DR: Metrics are introduced that measure the quality of the VPKs generated by any scheme without having to perform the watermark embedding, so that no time is wasted on embeddings that would lead to low-quality detection.
Abstract: Most watermarking techniques designed to protect relational data use the Primary Key (PK) of relations to perform the watermark synchronization. Despite offering high confidence in watermark detection, these approaches become useless if the PK can be erased or updated. A typical example is when an attacker wishes to use a stolen relation, unlinked from the rest of the database. In that case, the original values of the PK lose relevance, since they are not used to check referential integrity. It is then possible to erase or replace the PK, compromising watermark detection without the slightest modification to the rest of the data. To avoid the problems caused by PK dependency, some schemes have been proposed that generate Virtual Primary Keys (VPKs) to be used instead. Nevertheless, the quality of a watermark synchronized using VPKs is compromised by duplicate values in the set of VPKs and by the fragility of the VPK schemes against the elimination of attributes. In this paper, we introduce metrics that measure the quality of the VPKs generated by any scheme without having to perform the watermark embedding. This way, no time is wasted on embeddings that would lead to low-quality detection. We also analyze the main aspects of designing the ideal VPK scheme, seeking to generate high-quality VPK sets while adding robustness to the process. Finally, a new scheme is presented along with the experiments carried out to validate it and compare its results with those of the other schemes proposed in the literature.

12 citations

Journal Article
TL;DR: A semantic-driven watermarking approach for relational textual databases is proposed, which marks multi-word textual attributes, exploiting the synonym substitution technique for text watermarking together with notions from semantic similarity analysis, and dealing with the semantic perturbations provoked by the watermark embedding.
Abstract: In relational database watermarking, the semantic consistency between the original database and the distorted one is a challenging issue which is disregarded by most watermarking proposals, due to the well-known assumption for which a small amount of errors in the watermarked database is tolerable. We propose a semantic-driven watermarking approach of relational textual databases, which marks multi-word textual attributes, exploiting the synonym substitution technique for text watermarking together with notions in semantic similarity analysis, and dealing with the semantic perturbations provoked by the watermark embedding. We show the effectiveness of our approach through an experimental evaluation, highlighting the resulting capacity, robustness and imperceptibility watermarking requirements. We also prove the resilience of our approach with respect to the random synonym substitution attack.

11 citations

Journal Article
TL;DR: An exhaustive empirical study and thorough comparative analysis of various relational database watermarking techniques in the literature along with a rigorous experimental analysis demonstrating a detailed comparison on robustness, data usability, and computational cost with considerable empirical evidence is provided.
Abstract: Digital watermarking is considered one of the most promising techniques to verify the authenticity and integrity of digital data. It is used for a wide range of applications, e.g., copyright protection, tamper detection, traitor tracing, maintaining the integrity of data, etc. In the past two decades, a wide range of algorithms for relational database watermarking has been proposed. Even though a number of surveys exist in the literature, they are unable to provide insightful guidance to choose the right watermarking technique for a given application. In this paper, we provide an exhaustive empirical study and thorough comparative analysis of various relational database watermarking techniques in the literature. Our work is different from the existing survey papers as we consider both distortion-based and distortion-free techniques along with a rigorous experimental analysis demonstrating a detailed comparison on robustness, data usability, and computational cost with considerable empirical evidence.

8 citations

Journal Article
TL;DR: This paper proposes double fragmentation of the watermark by using the existing redundancy in the set of virtual primary keys to guarantee the right identification of the watermark despite the deletion of any of the attributes of the relation.
Abstract: Relational data watermarking techniques using virtual primary key schemes try to avoid compromising watermark detection due to the deletion or replacement of the relation’s primary key. Nevertheless, these techniques are limited by the high redundancy of the generated set of virtual primary keys, which often compromises the quality of the embedded watermark. As a solution to this problem, this paper proposes double fragmentation of the watermark by using the existing redundancy in the set of virtual primary keys. This way, we guarantee the right identification of the watermark despite the deletion of any of the attributes of the relation. The experiments carried out to validate our proposal show an increment between 81.04% and 99.05% of detected marks with respect to previous solutions found in the literature. Furthermore, we found that our approach takes advantage of the redundancy present in the set of virtual primary keys. Concerning the computational complexity of the solution, we performed a set of scalability tests that show the linear behavior of our approach with respect to process runtime and the number of tuples involved, making it feasible to use regardless of the amount of data to be protected.

6 citations


Cites background or methods from "A Highly-Reliable Virtual Primary K..."

  • ...[24] with the goal of varying the elements involved in the VPK generation....

    [...]

  • ...Considering that choosing a process featuring a high entropy in the generation of the seeds is an important factor for our approach to succeed, we proposed a version of the ExtScheme [24] to perform this task despite having been initially conceived for VPK generation....

    [...]

  • ...WM embedded using each VPK scheme [10], [24]....

    [...]

References
Book Chapter
Rakesh Agrawal, Jerry Kiernan
20 Aug 2002
TL;DR: The paper enunciates the need for watermarking database relations to deter their piracy, identifies the unique characteristics of relational data that pose new challenges for watermarking, and provides desirable properties of a watermarking system for relational data.
Abstract: We enunciate the need for watermarking database relations to deter their piracy, identify the unique characteristics of relational data which pose new challenges for watermarking, and provide desirable properties of a watermarking system for relational data. A watermark can be applied to any database relation having attributes which are such that changes in a few of their values do not affect the applications. We then present an effective watermarking technique geared for relational data. This technique ensures that some bit positions of some of the attributes of some of the tuples contain specific values. The tuples, attributes within a tuple, bit positions in an attribute, and specific bit values are all algorithmically determined under the control of a private key known only to the owner of the data. This bit pattern constitutes the watermark. Only if one has access to the private key can the watermark be detected with high probability. Detecting the watermark neither requires access to the original data nor the watermark. The watermark can be detected even in a small subset of a watermarked relation as long as the sample contains some of the marks. Our extensive analysis shows that the proposed technique is robust against various forms of malicious attacks and updates to the data. Using an implementation running on DB2, we also show that the performance of the algorithms allows for their use in real world applications.

382 citations
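
The embedding idea in the abstract above (a private key decides which tuples, which attribute, which bit position, and which bit value) can be illustrated with a small sketch. This is not the authors' algorithm: the function names `keyed_hash`, `plan`, `embed`, and `detect`, the parameters `gamma` and `num_lsb`, the use of SHA-256, and the way slices of one hash drive every choice are all assumptions made for illustration, and attributes are assumed to be non-negative integers.

```python
import hashlib


def keyed_hash(secret_key: bytes, primary_key: bytes) -> int:
    """Nested keyed hash H(K || H(K || PK)); SHA-256 is an assumed choice of H."""
    inner = hashlib.sha256(secret_key + primary_key).digest()
    outer = hashlib.sha256(secret_key + inner).digest()
    return int.from_bytes(outer, "big")


def plan(pk, n_attrs, secret_key, gamma, num_lsb):
    """Decide whether a tuple is marked and, if so, where and with which bit.

    Hypothetical simplification: different slices of one hash value pick
    the attribute, the bit position, and the mark bit.
    """
    h = keyed_hash(secret_key, str(pk).encode())
    if h % gamma != 0:  # about 1 in gamma tuples is selected
        return None
    attr = (h >> 16) % n_attrs
    bit = (h >> 32) % num_lsb
    mark = (h >> 48) & 1
    return attr, bit, mark


def embed(rows, secret_key, gamma=2, num_lsb=2):
    """Return a marked copy of rows; each row is (pk, [non-negative ints])."""
    marked = []
    for pk, values in rows:
        values = list(values)
        choice = plan(pk, len(values), secret_key, gamma, num_lsb)
        if choice is not None:
            attr, bit, mark = choice
            # Force the chosen low-order bit to the key-derived value.
            values[attr] = (values[attr] & ~(1 << bit)) | (mark << bit)
        marked.append((pk, values))
    return marked


def detect(rows, secret_key, gamma=2, num_lsb=2):
    """Count how many key-selected bits carry the expected value."""
    matches = total = 0
    for pk, values in rows:
        choice = plan(pk, len(values), secret_key, gamma, num_lsb)
        if choice is not None:
            attr, bit, mark = choice
            total += 1
            matches += ((values[attr] >> bit) & 1) == mark
    return matches, total
```

Detection here needs only the secret key and the suspect rows, not the original data. Every choice is derived from the primary key, so replacing the `pk` values would desynchronize `detect`; this is the dependency that the virtual primary key schemes on this page try to remove.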


"A Highly-Reliable Virtual Primary K..." refers background in this paper

  • ...[1] oriented to mark numeric attributes....

    [...]

  • ...PK)) where the operator ° represents concatenation and H represents a one-way hash function [1] such as SHA-2 or SHA-3....

    [...]

Journal Article
TL;DR: The current state-of-the-art watermarking techniques are surveyed and classified according to their intent, the way they express the watermark, the cover type, the granularity level, and their verifiability.
Abstract: Digital watermarking for relational databases emerged as a candidate solution to provide copyright protection, tamper detection, traitor tracing, and integrity maintenance for relational data. Many watermarking techniques have been proposed in the literature to address these purposes. In this paper, we survey the current state-of-the-art techniques and classify them according to their intent, the way they express the watermark, the cover type, the granularity level, and their verifiability.

86 citations

Proceedings Article
27 Oct 2003
TL;DR: A new fingerprinting scheme is proposed that does not depend on a primary key attribute; it constructs virtual primary keys from the most significant bits of some of each tuple's attributes.
Abstract: Agrawal and Kiernan's watermarking technique for database relations [1] and Li et al.'s fingerprinting extension [6] both depend critically on primary key attributes. Hence, those techniques cannot embed marks in database relations without primary key attributes. Further, the techniques are vulnerable to simple attacks that alter or delete the primary key attribute. This paper proposes a new fingerprinting scheme that does not depend on a primary key attribute. The scheme constructs virtual primary keys from the most significant bits of some of each tuple's attributes. The actual attributes that are used to construct the virtual primary key differ from tuple to tuple. Attribute selection is based on a secret key that is known to the merchant only. Further, the selection does not depend on an a priori ordering over the attributes, or on knowledge of the original relation or fingerprint codeword. The virtual primary keys are then used in fingerprinting as in previous work [6]. Rigorous analysis shows that, with high probability, only embedded fingerprints can be detected and embedded fingerprints cannot be modified or erased by a variety of attacks. Attacks include adding, deleting, shuffling, or modifying tuples or attributes (including a primary key attribute if one exists), guessing secret keys, and colluding with other recipients of a relation.

65 citations
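
A minimal sketch of the idea described in this abstract: the virtual primary key is built from the most significant bits of a key-dependent subset of each tuple's attributes. The names `msb` and `virtual_primary_key`, the parameters `picks` and `drop_bits`, the rule of keeping the smallest keyed hashes, and the use of SHA-256 are assumptions for illustration, not details confirmed by the abstract. Attributes are assumed to be non-negative integers.

```python
import hashlib


def msb(value: int, drop_bits: int = 4) -> int:
    """Keep the most significant bits by discarding the lowest drop_bits bits."""
    return value >> drop_bits


def virtual_primary_key(values, secret_key: bytes, picks: int = 2,
                        drop_bits: int = 4) -> str:
    """Build a virtual primary key from a key-dependent subset of attributes.

    Assumed selection rule: hash the MSBs of every attribute with the secret
    key and keep the `picks` smallest hashes. Because the hashes are sorted,
    the result does not depend on the order of the attributes.
    """
    hashes = sorted(
        hashlib.sha256(secret_key + str(msb(v, drop_bits)).encode()).digest()
        for v in values
    )
    chosen = hashes[:picks]
    return hashlib.sha256(secret_key + b"".join(chosen)).hexdigest()
```

Under these assumptions, `virtual_primary_key([1000, 52, 731], key)` equals `virtual_primary_key([731, 1000, 52], key)`, since shuffling attributes does not change the sorted hashes. The sketch also exposes the weakness noted in the abstracts above: two different tuples whose attributes share the same most significant bits, such as `[1000, 52, 731]` and `[1005, 60, 733]`, receive the same virtual primary key, and deleting an attribute whose hash was selected changes the key.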


"A Highly-Reliable Virtual Primary K..." refers background or methods in this paper

  • ...Years later, the PK dependent scheme for tuple selection on AHK algorithms, was named T-Scheme [10]....

    [...]

  • ...[10]) this will result in many identical marks, seriously compromising the synchronization of WM extraction....

    [...]

  • ...independently; (ii) every mark is embedded multiple times, and (iii) a majority vote is used in WM detection [10]....

    [...]

  • ...This problem, defined as deletion problem [10], also compromises the WM detection....

    [...]

  • ...T-Scheme has proven to be robust against common set attacks (elimination or modification of tuples or attributes) due to: (i) every tuple is marked independently; (ii) every mark is embedded multiple times, and (iii) a majority vote is used in WM detection [10]....

    [...]

Journal Article
TL;DR: The robust reversible watermarking modulation originally proposed by Vleeschouwer for images is adapted to the protection of relational databases and compared with two recent and efficient schemes to demonstrate its benefits.
Abstract: In this paper, we adapt the robust reversible watermarking modulation originally proposed by Vleeschouwer for images to the protection of relational databases. The resulting scheme modulates the relative angular position of the circular histogram center of mass of one numerical attribute for message embedding. It can be used for verifying database authentication as well as for traceability when identifying database origin after it has been modified. Beyond the application framework, we theoretically evaluate the performance of our scheme in terms of capacity, distortion, and robustness against two common database modifications: 1) addition and 2) removal of tuples. To that end, we model the impact of the embedding process and of database modifications on the probability distribution of the center of mass position. We further verify these theoretical limits experimentally within the framework of a medical database of more than one million inpatient hospital stay records. We show that under the assumptions imposed by the central limit theorem, experimental results fit the theory. We also compare our approach with two recent and efficient schemes so as to demonstrate its benefits.

45 citations


"A Highly-Reliable Virtual Primary K..." refers background in this paper

  • ...That is why most of the following techniques, proposed since 2002, are also PK-dependent [4, 8, 11]....

    [...]

Journal Article
TL;DR: Results demonstrate that the proposed technique is resilient against tuple insertion, tuple deletion, and attribute value modification attacks, and comparison with recent related work shows that the scheme performs better in detecting multifaceted attacks.
Abstract: Nowadays, the Internet is becoming a convenient way of accessing databases. Such data are exposed to various types of attack that aim to undermine proof of ownership or content protection. In this paper, we propose a new approach based on fragile zero watermarking for the authentication of numeric relational data. Contrary to some previous database watermarking techniques, which cause distortions in the original database and may not preserve the data usability constraints, our approach simply generates the watermark from the original database. First, the adopted method partitions the database relation into independent square matrix groups. Then, group-based watermarks are securely generated and registered with a trusted third party. The integrity verification is performed by computing the determinant and the diagonal’s minor for each group. As a result, tampering can be localized up to attribute group level. Theoretical and experimental results demonstrate that the proposed technique is resilient against tuple insertion, tuple deletion, and attribute value modification attacks. Furthermore, comparison with recent related work shows that our scheme performs better in detecting multifaceted attacks.

33 citations


"A Highly-Reliable Virtual Primary K..." refers background in this paper

  • ...Also, WMs can be used for controlling the integrity of data and protect them against tampering and fraud [2, 14]....

    [...]