
Showing papers in "International Journal of Digital Crime and Forensics in 2012"


Journal ArticleDOI
TL;DR: This article summarises the key aspects of cloud computing, analyses how established digital forensic procedures will be invalidated in this new environment, and identifies and discusses several new research challenges arising from this changing context.
Abstract: Cloud computing is a rapidly evolving information technology (IT) phenomenon. Rather than procure, deploy and manage a physical IT infrastructure to host their software applications, organizations are increasingly deploying their infrastructure into remote, virtualized environments, often hosted and managed by third parties. This development has significant implications for digital forensic investigators, equipment vendors, law enforcement, as well as corporate compliance and audit departments (among others). Much of digital forensic practice assumes careful control and management of IT assets (particularly data storage) during the conduct of an investigation. This paper summarises the key aspects of cloud computing and analyses how established digital forensic procedures will be invalidated in this new environment. Several new research challenges addressing this changing context are also identified and discussed.

100 citations


Journal ArticleDOI
TL;DR: A novel method based on hybrid features is proposed for the identification of natural images and computer generated graphics; it achieves better identification accuracy than existing methods while using fewer feature dimensions.
Abstract: For the identification of natural images (NI) and computer generated graphics (CG), a novel method based on hybrid features is proposed. Since the image acquisition pipelines are different, differences exist in the statistical, visual, and noise characteristics of natural images and computer generated graphics. Firstly, the mean, variance, kurtosis, skewness, and median of the histograms of the grayscale image in the spatial and wavelet domains are selected as statistical features. Secondly, the fractal dimensions of the grayscale image and its wavelet sub-bands are extracted as visual features. Thirdly, considering the shortcomings of the photo response non-uniformity noise (PRNU) acquired from a wavelet-based de-noising filter, a Gaussian high-pass filter is applied to the image as pre-processing before the extraction of the PRNU, and physical features are calculated from the enhanced PRNU. For identification, a support vector machine (SVM) classifier is used in the experiments, and an average classification accuracy of 94.29% is achieved, where the classification accuracy for computer generated graphics is 97.3% and for natural images is 91.28%. Analysis and discussion show that the method is suitable for the identification of natural images and computer generated graphics and can achieve better identification accuracy than existing methods with fewer dimensions of features. If images are tampered with by counterfeiters, serious implications for the authenticity of news and scientific research, and for the stability of a country's politics and society, will be triggered. Hence, the study of digital image forensics technology is becoming a research hotspot. Generally, digital image forensics can be divided into two categories: active forensics and passive forensics. Active forensics mainly includes digital signatures (Swaminathan, 2006) and digital watermarking (Chandra, 2010). In active forensics, additional information needs to be inserted into the host in advance, which requires that the acquisition device have the corresponding functionality; however, most existing devices do not. At the same time, the embedded additional information reduces the quality of the images. These problems have limited the practical application of active forensics. Passive forensics has emerged as a newer alternative. It is a blind forensic method that can identify the authenticity or source of an image based only on the characteristics of the image itself, without embedded additional information, which makes it more practical. According to its applications in different research fields, passive forensics can be classified into tampering detection (Popescu, 2005), steganalysis (Lyu & Farid, 2006) and source identification (Lukas & Fridrich, 2005a, 2005b, 2006a, 2006b; Ng, 2005; Chen & Li, 2009; Lyu & Farid, 2005; Dehnie, 2006). The identification of natural images and computer generated graphics belongs to the field of source identification. Natural images and computer generated graphics are acquired through two different pipelines: natural images are captured by cameras and reflect the real world, while computer generated graphics are created by computer software and rendered from geometric models.
To the best of our knowledge, research progress on the identification of natural images and computer generated graphics has been relatively slow in recent years. Existing methods are mainly based on the statistical properties of images (Lyu & Farid, 2005) or on physical models of image processing (Lukas & Fridrich, 2005a, 2005b, 2006a, 2006b; Ng, 2005; Chen & Li, 2009; Dehnie, 2006). Although statistical properties can reflect the inherent differences between the images to some extent, the identification accuracy remains limited even when the feature dimension exceeds several hundred. Algorithms based on physical models generally discriminate the images by using imperfections introduced by physical equipment such as the lens, the color filter array (CFA) and the charge-coupled device (CCD) sensor. Compared with methods based on statistical characteristics, they generally require far fewer features. However, since the key features of the different image acquisition pipelines are still uncertain, the detection accuracy of existing methods still needs to be improved. In this paper, the intrinsic properties of, and the essential differences between, the two image acquisition pipelines are studied, and an identification algorithm is proposed accordingly.
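As an illustration of the statistical-feature stage described above, the following sketch computes the five histogram statistics in the spatial domain and on one level of wavelet sub-bands and feeds them to an SVM. It assumes NumPy, SciPy, PyWavelets and scikit-learn; the 'db4' wavelet, the single decomposition level and the RBF-kernel SVM are illustrative choices, not the authors' exact configuration (which also uses fractal-dimension and PRNU features).

```python
# Sketch of the statistical-feature part of an NI-vs-CG classifier.
import numpy as np
import pywt
from scipy.stats import kurtosis, skew
from sklearn.svm import SVC

def histogram_stats(values, bins=256):
    """Mean, variance, kurtosis, skewness and median of a histogram."""
    hist, _ = np.histogram(values.ravel(), bins=bins, density=True)
    return [hist.mean(), hist.var(), kurtosis(hist), skew(hist), np.median(hist)]

def statistical_features(gray):
    """Spatial-domain histogram stats plus stats of one-level wavelet sub-bands."""
    feats = histogram_stats(gray)
    cA, (cH, cV, cD) = pywt.dwt2(gray.astype(float), 'db4')  # illustrative wavelet
    for band in (cA, cH, cV, cD):
        feats.extend(histogram_stats(band))
    return np.array(feats)

# Usage: X = np.vstack([statistical_features(img) for img in images])
#        clf = SVC(kernel='rbf').fit(X, labels)   # labels: 0 = CG, 1 = NI
```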

21 citations


Journal ArticleDOI
TL;DR: A robust method for local image region feature description based on step sector statistics is proposed; experimental results show that it works well for the detection of copy-rotate/flip-move forgery, with block matching used to keep the computational cost manageable.
Abstract: A robust method for local image region feature description based on step sector statistics is proposed in this paper. The means and the standard deviations along the radial direction of the circular image region are extracted through sector masks, and the rearrangement of these statistics makes the description rotation-robust. The proposed description method is applied to the detection of copy-rotate-move forgery, and it can detect the exact rotation angle between the duplicated regions. With a minor extension, the method can also be applied to the detection of copy-flip-move forgery. The experimental results show that the proposed description method works well for the detection of copy-rotate/flip-move forgery. Active approaches such as digital signatures and Exchangeable Image File format (EXIF) metadata-based image watermarking have been used for the copyright protection of digital images. However, digital signatures and watermarking are restricted to only a few fields of application because they have to be embedded in the images beforehand (Kundur & Hatzinakos, 1999; Lin, Podilchuk, & Delp, 2000; Fridrich, Goljan, & Du, 2001). In contrast, passive digital image forgery detection techniques can be used in many fields without such limitations, which makes them much more popular in digital image forensics (Ng, Chang, & Sun, 2004). Among digital image forgeries, copy-move forgery is a common one. It is a manipulation in which a part of the image is copied and then pasted onto another part of the same image in order to insert or cover some objects (Bayram, Sencar, & Memon, 2009). Figure 1 and Figure 2 show an example of copy-move forgery in a news photo ("Iran test-fired long-range missiles," 2008): Figure 1 is the original image, while Figure 2 is a fake counterpart in which the third missile from the left is a duplication of another missile. During the copy-move process, the duplicated regions may undergo geometrical modifications such as rotation, scaling and/or illumination adjustment for a better visual effect, and the tampered images may also be blurred, noised, or compressed in order to hide the traces of forgery. A good forgery detection algorithm should therefore take these operations into account. The simplest approach to detecting copy-move forgery is exhaustive search (Fridrich, Soukal, & Lukas, 2003), in which the image is compared with all its cyclic-shifted versions to look for the closest matching regions. Although this method works, its high computational burden prevents practical application. Block matching was proposed to reduce the computational time: an image is first divided into overlapping blocks, and then all duplicated block pairs are marked. The key lies in finding robust representations for each image block, so that duplicated blocks can be identified even if they are not exactly the same. Fridrich et al. (2003) proposed representing the image blocks with quantized Discrete Cosine Transform (DCT) coefficients, with lexicographical sorting used to detect the copy-move blocks. An improved DCT-based method was proposed by Huang et al. (2011) to detect duplicated regions.
Figure 1. Copy-move forgery example: the original image.
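To make the sector-statistics idea concrete, the sketch below computes per-sector means and standard deviations over a circular patch and circularly shifts them so that the ordering no longer depends on the patch's orientation. The number of sectors, the purely angular sectors (the paper uses stepped sector masks along the radial direction) and the shift-to-maximum normalisation are illustrative assumptions rather than the authors' exact descriptor.

```python
# Minimal sketch of a sector-statistics descriptor for a circular image patch.
import numpy as np

def sector_descriptor(patch, n_sectors=8):
    """Mean and std of pixel values in angular sectors of a circular region."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    y, x = np.mgrid[0:h, 0:w]
    r = np.hypot(y - cy, x - cx)
    theta = np.mod(np.arctan2(y - cy, x - cx), 2 * np.pi)
    radius = min(cy, cx)
    means, stds = [], []
    for k in range(n_sectors):
        mask = (r <= radius) & (theta >= 2 * np.pi * k / n_sectors) \
                             & (theta < 2 * np.pi * (k + 1) / n_sectors)
        means.append(patch[mask].mean())
        stds.append(patch[mask].std())
    # Circular shift so the largest mean comes first: a simple way to make the
    # descriptor insensitive to rotations by multiples of the sector angle.
    shift = int(np.argmax(means))
    return np.concatenate([np.roll(means, -shift), np.roll(stds, -shift)])
```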

16 citations


Journal ArticleDOI
TL;DR: In this article, a model for investigating crime scenes with hybrid evidence is proposed, which unifies the procedures related to digital and physical evidence collection and examination, taking into consideration the unique characteristics of each form of evidence.
Abstract: With the advent of Information and Communication Technologies, the means of committing a crime, and crime itself, are constantly evolving. In addition, the boundaries between traditional crime and cybercrime are vague: a crime may not have a purely traditional or digital form, since digital and physical evidence may coexist in a crime scene. Furthermore, various items found in a crime scene may be worth examining as both physical and digital evidence, which the authors consider hybrid evidence. In this paper, a model for investigating such crime scenes with hybrid evidence is proposed. The model unifies the procedures related to digital and physical evidence collection and examination, taking into consideration the unique characteristics of each form of evidence. The authors' model can also be applied in cases where only digital or only physical evidence exists in a crime scene.

14 citations


Journal ArticleDOI
TL;DR: One of the first likelihood-ratio-based forensic text comparison studies in forensic authorship analysis: a forensic comparison of SMS messages within the likelihood-ratio framework, focusing on lexical features.
Abstract: This study is one of the first likelihood-ratio-based forensic text comparison studies in forensic authorship analysis. The likelihood-ratio-based evaluation of scientific evidence has been adopted in many disciplines of forensic evidence comparison, such as DNA, handwriting, fingerprints, footwear and voice recordings, and it is largely accepted that this is the way to ensure maximum accountability and transparency of the process. Due to its convenience and low cost, the short message service (SMS) has been a very popular medium of communication for quite some time. Unfortunately, however, SMS messages are sometimes used for reprehensible purposes, e.g., communication between drug dealers and buyers, or in illicit acts such as extortion, fraud, scams, hoaxes, and false reports of terrorist threats. In this study, the author performs a likelihood-ratio-based forensic text comparison of SMS messages focusing on lexical features. The likelihood ratios (LRs) are calculated with Aitken and Lucy's (2004) multivariate kernel density procedure and are calibrated. The validity of the system is assessed based on the magnitude of the LRs using the log-likelihood-ratio cost (Cllr). The strength of the derived LRs is graphically presented in Tippett plots, and the results of the current study are compared with those of previous studies. That being said, there is a large amount of research on forensic authorship analysis of other electronically generated texts, such as emails (De Vel, Anderson, Corney, & Mohay, 2001; Iqbal, Hadjidj, Fung, & Debbabi, 2008), whereas forensic authorship analysis studies specifically focusing on SMS messages are conspicuously sparse (cf. Ishihara, 2011; Mohan, Baggili, & Rogers, 2010). The forensic sciences are experiencing a paradigm shift in the evaluation and presentation of evidence (Saks & Koehler, 2005). This shift has already happened in forensic DNA comparison, and Saks and Koehler (2005) fervently suggest that other forensic comparison sciences should follow forensic DNA comparison, which adopts the likelihood-ratio framework for the evaluation of evidence. The use of the likelihood-ratio framework has been advocated in the main textbooks on the evaluation of forensic evidence (e.g., Robertson & Vignaux, 1995) and by forensic statisticians (e.g., Aitken & Stoney, 1991; Aitken & Taroni, 2004). However, despite the fact that the framework has started making inroads into other fields of forensic comparison, such as fingerprints (Choi, Nagar, & Jain, 2011; Neumann et al., 2007), handwriting (Bozza, Taroni, Marquis, & Schmittbuhl, 2008; Marquis, Bozza, Schmittbuhl, & Taroni, 2011) and voice (Morrison, 2009), forensic authorship analysis is somewhat behind in this trend. Thus, emulating forensic DNA comparison, the current study is a forensic comparison of SMS messages using the likelihood-ratio framework. Focusing on the lexical features of SMS messages, we test a forensic text comparison system.
The validity of the system is assessed using the log-likelihood-ratio cost function (Cllr), which was originally developed for automatic speaker recognition systems (Brümmer & du Preez, 2006) and subsequently adopted in forensic voice comparison (Morrison, 2011). The strength of the likelihood ratios (i.e., the strength of evidence) obtained from the SMS messages is graphically presented using Tippett plots.
FORENSIC AUTHORSHIP ANALYSIS: Profiling, Identification, and Verification
Forensic authorship analysis can be broadly classified into the subfields of authorship profiling, authorship identification and authorship verification. In order to clarify where the current study stands, commonly held descriptions of the tasks of these subfields are given concisely:
1. Authorship profiling summarises the sociolinguistic characteristics, such as gender, age, occupation, and educational and cultural background, of the unknown author (offender) of the (illicit) document in question (Stamatatos, 2009, p. 539).
2. The task of (forensic) authorship identification is to identify the most likely author (suspect) of a given (incriminating) document from a group of candidate authors (suspects) (Iqbal, Binsalleeh, Fung, & Debbabi, in press, p. 3).
3. The task of (forensic) authorship verification is to determine or verify whether a target author (suspect) did or did not write a specific (incriminating) document (Halteren, 2007, p. 3).
In the conventional terminology, the current study is one of forensic authorship verification. That is, when a suspect is prosecuted, a forensic expert can perform authorship verification for it "to be determined if a suspect did or did not write a specific, probably incriminating, text" (Halteren, 2007, p. 2).
Role of the Forensic Expert
Commonly held views about forensic authorship analysis have been summarised above. However, it is important to state explicitly that the forensic scientist, as a witness, is NOT in a position, either legally or logically, to identify, confirm, decide or even verify whether two samples were produced by the same author.
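Since the abstract leans on the log-likelihood-ratio cost, a minimal sketch of Cllr is given below, following the Brümmer and du Preez (2006) definition; the input is simply two lists of likelihood ratios, one from same-author and one from different-author comparisons, which is an assumption about how the scores are organised rather than part of the paper.

```python
# Sketch of the log-likelihood-ratio cost (Cllr) used to validate the system.
import numpy as np

def cllr(lr_same, lr_diff):
    """Cllr: lower is better; 1.0 is the cost of an uninformative system."""
    lr_same = np.asarray(lr_same, dtype=float)
    lr_diff = np.asarray(lr_diff, dtype=float)
    penalty_same = np.mean(np.log2(1.0 + 1.0 / lr_same))  # LRs should be large here
    penalty_diff = np.mean(np.log2(1.0 + lr_diff))        # LRs should be small here
    return 0.5 * (penalty_same + penalty_diff)

# Example: cllr([8.0, 20.0, 3.0], [0.1, 0.5, 0.02]) gives a value well below 1.0.
```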

12 citations


Journal ArticleDOI
TL;DR: The authors propose and motivate the use of a novel kind of feature exploiting characteristics noticed in the reproduction of fake fingers, which they name fake-based features, and propose a possible implementation of this type of feature based on the power spectrum of the fingerprint image.
Abstract: The vitality detection of fingerprints is currently acknowledged as a serious issue for personal identity verification systems. This problem, raised some years ago, is related to the fact that the 3D shape pattern of a fingerprint can be reproduced using artificial materials; an image quite similar to that of a true, live fingerprint is obtained if such "fake fingers" are submitted to an electronic scanner. Since introducing hardware dedicated to liveness detection into scanners is expensive, software-based solutions relying on image processing algorithms have been proposed as an alternative. So far, proposed approaches are based on features exploiting characteristics of a live finger (e.g., finger perspiration); such features can be called live-based, or vitality-based, features. In this paper, the authors propose and motivate the use of a novel kind of feature exploiting characteristics noticed in the reproduction of fake fingers, which they name fake-based features. The authors then propose a possible implementation of this kind of feature based on the power spectrum of the fingerprint image. The proposal is compared with, and integrated with, several state-of-the-art live-based features, and shows very good liveness detection performance. Experiments are carried out on a data set much larger than commonly adopted ones, containing images from three different optical sensors. Software-based methods are also considerably cheaper than hardware-based ones. Recently, two integrated systems (combining hardware and software methods) have been proposed (Russo, 2007; Diaz-Santana & Parziale, 2006). According to Coli et al. (2007a), the majority of approaches aim to characterize physiological features of the skin by image processing algorithms, e.g., perspiration (Derakshani et al., 2003; Parthasaradhi et al., 2005; Marcialis et al., 2010) and the elastic deformation of fingerprints (Antonelli et al., 2008; Chen & Jain, 2005). Morphological algorithms and wavelet transformations are employed to this end (Coli et al., 2008; Moon et al., 2005; Tan & Schuckers, 2006, 2008; Abhyankar & Schuckers, 2006; Drahansky & Lodrova, 2008; Yau et al., 2007; Choi et al., 2007). These features are related to characteristics of live fingers which can be detected by image processing; in the following, we refer to them as vitality-based features (or live-based features). It is worth noting that no work so far has paid attention to features based on fake finger characteristics (in the following, fake-based features). However, a visual analysis of two example images (Figure 1) can help in detecting some differences between images obtained from live and fake fingers. Figure 1 shows a live fingerprint image on the left and the corresponding fake fingerprint image on the right; at the centre, the rectangle marked in both images is zoomed. Several peculiarities can be observed: an absence or reduction of pores (recently exploited in Marcialis et al., 2010) and alteration of pore width, a general smoothing of several details, and the presence of artifacts (the scratch at the centre is present only in the fake fingerprint).
Such differences are mainly due to the stamp fabrication process of fake fingers, which alters the frequency details between ridges and valleys. However, state-of-the-art vitality-based features have not been conceived to detect and exploit such differences for liveness detection purposes. On the basis of these considerations, we believe it is useful to investigate whether fingerprint replicas have peculiarities which allow them to be discriminated from live fingerprints. Moreover, we hypothesise that fake-based and vitality-based features may have a certain degree of complementarity for liveness detection purposes. With the term "complementarity," we mean that certain samples can be better classified by one kind of feature than by the other.
Figure 1. An example of live and fake fingerprint images (left and right side, respectively) corresponding to the same subject. At the centre, the rectangle marked in both images is zoomed in order to point out differences mainly due to the fabrication process of the fake fingerprint.
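A minimal sketch of a power-spectrum ("fake-based") feature of the kind motivated above is shown below: it accumulates the energy of the centred 2-D Fourier power spectrum in concentric frequency rings, where alterations of the ridge/valley frequency details introduced by the fabrication process would show up. The number of rings, the log scaling and the ring layout are illustrative assumptions, not the authors' implementation.

```python
# Sketch of ring-energy features from the power spectrum of a fingerprint image.
import numpy as np

def ring_energy_features(fingerprint, n_rings=16):
    """Log energy of the centred power spectrum accumulated in radial rings."""
    f = np.fft.fftshift(np.fft.fft2(fingerprint.astype(float)))
    power = np.abs(f) ** 2
    h, w = power.shape
    y, x = np.mgrid[0:h, 0:w]
    r = np.hypot(y - h / 2.0, x - w / 2.0)
    r_max = r.max()
    feats = []
    for k in range(n_rings):
        ring = (r >= r_max * k / n_rings) & (r < r_max * (k + 1) / n_rings)
        feats.append(np.log1p(power[ring].sum()))
    return np.array(feats)

# These features could then be fed, possibly alongside vitality-based features,
# to a standard classifier (e.g., an SVM) trained on live vs. fake images.
```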

11 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed algorithm overall outperforms three related duplication forgery detection algorithms in terms of computational efficiency, detection accuracy and robustness against common video operations like compression and brightness change.
Abstract: Frame duplication is a common form of digital video forgery. State-of-the-art approaches to duplication detection usually suffer from a heavy computational load. In this paper, the authors propose a new algorithm to detect duplicated frames based on video sub-sequence fingerprints. The fingerprints employed are extracted from the DCT coefficients of the temporally informative representative images (TIRIs) of the sub-sequences. Compared with other similar algorithms, this study focuses on improving the fingerprints representing video sub-sequences and on introducing a simple metric for the matching of video sub-sequences. Experimental results show that the proposed algorithm overall outperforms three related duplication forgery detection algorithms in terms of computational efficiency, detection accuracy and robustness against common video operations like compression and brightness change.
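The following sketch illustrates the TIRI-plus-DCT idea: a weighted average of the frames in a sub-sequence is computed, its low-frequency DCT coefficients are binarised into a fingerprint, and fingerprints are compared with a normalised Hamming distance. The exponential weights, the 8x8 coefficient block and the median-based binarisation are illustrative assumptions, not the authors' exact algorithm.

```python
# Sketch of a TIRI-based sub-sequence fingerprint for grayscale frames (2-D arrays).
import numpy as np
from scipy.fft import dctn

def tiri(frames, gamma=0.65):
    """Temporally informative representative image: weighted sum of the frames."""
    weights = np.array([gamma ** k for k in range(len(frames))], dtype=float)
    stack = np.stack([f.astype(float) for f in frames])
    return np.tensordot(weights, stack, axes=1) / weights.sum()

def fingerprint(frames, size=8):
    """Binary fingerprint from low-frequency DCT coefficients of the TIRI."""
    coeffs = dctn(tiri(frames), norm='ortho')[:size, :size].ravel()
    return (coeffs > np.median(coeffs)).astype(np.uint8)

def hamming(fp1, fp2):
    """Matching metric: fraction of differing fingerprint bits."""
    return np.mean(fp1 != fp2)
```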

9 citations


Journal ArticleDOI
TL;DR: A comprehensive survey of state-of-the-art detection-based, prevention-based and early-detection-based Spam 2.0 filtering methods is provided, along with directions for future research.
Abstract: Spam 2.0 is defined as the propagation of unsolicited, anonymous, mass content to infiltrate legitimate Web 2.0 applications. A fake eye-catching profile on a social networking website, a promotional review, a response to a thread in an online forum with unsolicited content, or a manipulated Wiki page are all examples of Spam 2.0. In this paper, the authors provide a comprehensive survey of state-of-the-art detection-based, prevention-based and early-detection-based Spam 2.0 filtering methods. It has been reported that more than 75% of the pings sent from blogs to search engines for information updates are spam (Kolari, Java, & Finin, 2006), and that the amount of comment spam in 2009 was double that of 2008 (Akismet, 2011). Such an overwhelming amount of spam is seriously degrading the quality of information on the Internet (Chai et al., 2009). Spammers no longer rely only on email; instead, they post spam on legitimate websites. If such information persists, the trust in such pages is diminished, spam is effectively promoted by trusted sources, and many users can be misled or drawn into scams and computer malware. Because of its success and impact rates, Spam 2.0 is far more popular amongst spammers and has a far greater negative socio-economic impact. The main problems associated with Spam 2.0 are:
• Spam 2.0 cannot be detected and treated using existing anti-spam techniques designed for email spam.
• Web 2.0 offers an avenue whereby spammers can easily promote targeted spam content with a far greater impact factor and lower maintenance costs compared to email spam.
• Spam 2.0 gets an undeservedly high ranking in search engine results (further expanding the spam operation).
• Spam 2.0 diminishes the trustworthiness and quality of legitimate websites and places them in danger of being blacklisted.
(Figure 1 illustrates the evolution and adaptability of spam as new media emerge.) This also gives rise to significant socio-economic issues such as the direct and indirect costs associated with the management of Spam 2.0, reduction of Internet quality of service (Chai et al., 2009), and the heightened proliferation of Internet scams, viruses, trojans and malware. Given the massive impact that Spam 2.0 has on the current web environment, there have been significant efforts in assessing the problem and developing cutting-edge solutions. In this paper we describe and critically analyse the relevant Spam 2.0 literature under three broad categories classified by spam filtering method: (1) detection-based filtering, (2) prevention-based filtering, and (3) early-detection-based filtering. The overall objective is to provide the reader with a comprehensive understanding of the current state of the art of Spam 2.0, as well as directions for future research. The rest of the paper is organized as follows: we first describe detection-based Spam 2.0 filtering methods, followed by prevention-based and then early-detection-based methods. We then provide an overall evaluation and, finally, conclude the paper with remarks on future work. (Figures 2, 3 and 4 show examples of a comment spam, an online community spam and a forum spam, respectively.)
DETECTION-BASED SPAM 2.0 FILTERING
A detection-based Spam 2.0 filtering method relies on filtering spam by analysing and evaluating content posted on Web 2.0 platforms. Methods in this category deal with the content of spam in order to discover spam patterns (Paul, Georgia, & Hector, 2007). These detection methods search for spam keywords, templates, attachments, etc., within the content. Table 1 lists examples of where this information can be found in blog and forum posts:
Table 1. Examples of spam features in Web 2.0 platforms
• Blog (comment): number of links per comment; similarity of the comment content to the blog post.
• Forum (post): content of the post; number of images per post; number of links per post.
As shown in Figure 5 (the detection-based spam filtering approach), detection methods follow three steps. The first step is to retrieve content from different systems (e.g., email, SMS, or Web 2.0). The second step is to mine features, such as keywords, from the content and meta-content. The last step is to perform classification, which labels content as spam or ham. The literature on detection-based Spam 2.0 filtering can be classified according to the platform for which the methods are designed. Accordingly, we classify Spam 2.0 detection methods for:
• Blogs and forums
• Twitter
• Social networking websites (MySpace, Facebook, etc.)
• Social video-sharing systems (YouTube, etc.)
• Opinion-gathering websites (Amazon, etc.)
• Social bookmarking websites (Delicious, BibSonomy, etc.)
Blogs and Forums
In this section, we outline three different methods proposed for blogs and forums; Table 2 compares them by method, platform, classifier, dataset and result. The first method, proposed by Narisawa, Bannai, Hatano, and Takeda (2007), detects spam messages in forums. The authors observe that spam messages are designed to distribute advertisements in forums and have discriminative characteristics which can be exploited for classification. To do this, they create equivalence classes for the content of the message, where an equivalence class groups substrings that have the same occurrences within the message. By employing three features – length (the representative of a spam equivalence class tends to be longer), size (the size of a spam equivalence class tends to be larger) and Maximin (the length of the representative together with the length of the longest minimal element of the equivalence class) – the authors could distinguish spam messages from legitimate ones. They evaluated the proposed method on four Japanese forums (24,591 messages) and used the F1-measure to assess performance; results varied from 68% to 80% across the four datasets. Another method was proposed by Uemura, Ikeda, and Arimura (2008).
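To illustrate the three-step detection pipeline (retrieve content, mine features, classify) with two of the blog-comment features from Table 1, a small sketch is given below; the regular expression, the TF-IDF similarity and the naive Bayes classifier are illustrative stand-ins, not a method from the surveyed literature.

```python
# Sketch of a detection-based Spam 2.0 filter for blog comments.
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.naive_bayes import GaussianNB

LINK_RE = re.compile(r'https?://\S+')

def comment_features(comment, blog_post):
    """Number of links in the comment and its cosine similarity to the blog post."""
    n_links = len(LINK_RE.findall(comment))
    tfidf = TfidfVectorizer().fit_transform([comment, blog_post])
    similarity = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
    return [n_links, similarity]

# Usage (labels: 1 = spam, 0 = ham):
#   X = np.array([comment_features(c, post) for c, post in pairs])
#   clf = GaussianNB().fit(X, labels)
```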

8 citations


Journal ArticleDOI
TL;DR: This paper presents an inter-disciplinary approach for the quantitative analysis of user engagement to identify relational and temporal dimensions of evidence relevant to an investigation and applies it to a case study of actors posting to a social media Web site.
Abstract: The increasing use of social media, applications or platforms that allow users to interact online, ensures that this environment will provide a useful source of evidence for the forensics examiner. Current tools for the examination of digital evidence find this data problematic as they are not designed for the collection and analysis of online data. Therefore, this paper presents a framework for the forensic analysis of user interaction with social media. In particular, it presents an inter-disciplinary approach for the quantitative analysis of user engagement to identify relational and temporal dimensions of evidence relevant to an investigation. This framework enables the analysis of large data sets from which a much smaller group of individuals of interest can be identified. In this way, it may be used to support the identification of individuals who might be 'instigators' of a criminal event orchestrated via social media, or a means of potentially identifying those who might be involved in the 'peaks' of activity. In order to demonstrate the applicability of the framework, this paper applies it to a case study of actors posting to a social media Web site.
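As a small illustration of the temporal dimension of such an analysis, the sketch below buckets post timestamps into hourly counts, flags unusually busy hours, and lists the authors active during them; the (author, timestamp) input format and the z-score threshold are illustrative assumptions, not part of the authors' framework.

```python
# Sketch of peak-of-activity detection over social media posts.
from collections import Counter
import numpy as np

def activity_peaks(posts, z_threshold=2.0):
    """Return the hours whose posting volume is unusually high."""
    hours = [ts.replace(minute=0, second=0, microsecond=0) for _, ts in posts]
    counts = Counter(hours)
    values = np.array(list(counts.values()), dtype=float)
    mean, std = values.mean(), values.std()
    if std == 0:
        return []
    return [h for h, c in counts.items() if (c - mean) / std > z_threshold]

def peak_authors(posts, peaks):
    """Authors who posted during the peak hours - candidate individuals of interest."""
    peak_set = set(peaks)
    return {author for author, ts in posts
            if ts.replace(minute=0, second=0, microsecond=0) in peak_set}
```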

7 citations


Journal ArticleDOI
TL;DR: This article addresses the need to create and maintain internationally accepted standards to control the use and application of digital forensic processes and touches on the motivation for such internationally recognised standards on digital evidence.
Abstract: Continuous developments in forensic processes and tools have aided in elevating the position of digital forensics within the legal system. The equally continuous developments in technology and electronic advances, however, are making it more difficult to match forensic processes and tools with the advanced technology. Therefore, it is necessary to create and maintain internationally accepted standards to control the use and application of digital forensic processes. This article addresses this need and touches on the motivation for such internationally recognised standards on digital evidence. It also looks at current work on, and progress towards, the establishment of digital-evidence-related documents addressing all phases of the digital forensic process. This reinforces the need for regulated and standardised digital forensics. Continuous developments in forensic processes and tools have assisted in the promotion and positioning of digital forensics within the legal system; however, at the same time, continuous developments in technology are making it more difficult to match forensic processes and tools with the advanced technology. "In a time of growing scrutiny of the digital forensic profession, and of forensic sciences in general, practitioners need to unite to develop accepted practice and ethics standards across all sectors of the industry" (Kroll Ontrack Onpoint, 2011). There is still a general lack of widely accepted models and frameworks (Taylor et al., 2007). Although much work has been done towards standardising this process, it has not yet been finalised. It is therefore very important that internationally developed and accepted standards are put in place to ensure the consistent application of digital forensics across the globe.
THE SCOPE OF DIGITAL EVIDENCE
According to Garfinkel (2010), the Golden Age of digital forensics was the period from 1999 to 2007. Digital forensics was deemed a mystical mechanism that could enable specialists to recover lost and deleted files and emails, find hidden information and give law enforcers insight into criminals' minds at the push of a button. It was during this period that the so-called CSI (Crime Scene Investigation) effect became widespread, mystifying many people with fancy gadgets and technical abilities that allowed digital evidence to be extracted in the process of fighting digital crime. Traditionally, best practices for digital forensics were geared towards investigations on machines running Microsoft Windows, searching for file formats such as Microsoft Office documents, JPEG, AVI, and WMV. Investigations were mostly restricted to a single, non-virtual computer system, and storage devices came with standard interfaces and were generally small enough to image during a single working day. These best practices were generally accepted as the norm for digital forensics during the Golden Age. However, technological advances, changes in general business processes, and the modern tendency towards over-reliance on the Internet have changed the digital forensics playing field. Nowadays, the world is increasingly dependent on computers and technology for communications, transactions, commerce and entertainment.
During a typical business day, employees email documentation, access information on organisational servers and store data in the cloud. Many people own smartphones, tablets and iPads that enable the sending of emails, SMSes and instant messages from a single device. The playing field is no longer a single computer system, but a virtualised environment with non-standard file types, a variety of customised storage devices and non-standard interfaces, as well as terabytes of storage space. Yarrow (2011) suggested that in 2010 alone, 1.9 billion email users sent 107 trillion emails, an average of 294 billion emails per day; 255 million websites existed at the end of 2010, 25 billion tweets were sent, and 36 billion photos were uploaded to Facebook during the year. In addition, it is suggested that by the end of 2010 mobile phone penetration in South Africa was close to 98% (Rao, 2011). All these developments, and the increased digital inclination of many people, make the dispersion of digital evidence across the Internet common. As a result, it is inevitable that sensitive business information is more exposed and vulnerable to misuse by technology-adept individuals, both on a local and an international scale. This necessitates the formalisation and international agreement of digital forensics and evidence management (Grobler & Dlamini, 2010).
INTERNATIONAL INITIATIVES
Multi-jurisdictional investigations involving digital evidence were already taking place as early as 1986.

7 citations


Journal ArticleDOI
TL;DR: The authors present a novel framework for content-based audio retrieval built on an audio fingerprinting scheme that, unlike existing wavelet-based schemes, is robust against large linear speed changes.
Abstract: Audio fingerprinting is the process of obtaining a compact content-based signature that summarizes the essence of an audio clip. In general, existing audio fingerprinting schemes based on wavelet transforms are not robust against large linear speed changes. The authors present a novel framework for content-based audio retrieval based on an audio fingerprinting scheme that is robust against large linear speed changes. In the proposed scheme, an 8-level Daubechies wavelet decomposition is adopted for extracting time-frequency features, and two fingerprint extraction algorithms are designed. The experimental results are discussed further in the article. Among related work, the local energy centroid (LEC) was proposed to represent the energy conglomeration degree of relatively small regions of the spectrum (Pan et al., 2011), while a robust audio fingerprinting algorithm in the MP3 compressed domain with high robustness to time-scale modification was also proposed (Zhou & Zhu, 2011). Among existing transform-based audio fingerprinting schemes, those based on the wavelet transform are very popular, since the wavelet transform, or more particularly the discrete wavelet transform, is a relatively recent and computationally efficient technique for extracting information from non-stationary signals such as audio. The wavelet transform is a local transformation of a signal in the time and frequency domains; it can effectively extract information from the signal and perform multi-scale, detailed analysis through scaling and translation, thereby solving many problems that the Fourier transform cannot. Our paper therefore focuses on wavelet-transform-based schemes. Existing works based on wavelet transforms can be classified into two categories. The first type of fingerprinting scheme performs the wavelet transform on each audio frame directly to extract time-frequency features for audio fingerprinting. In Lu (2002), the one-dimensional continuous Morlet wavelet transform is adopted to extract two fingerprints, for authentication and recognition purposes respectively. In Ghouti and Bouridane (2006), a robust perceptual audio hashing scheme using balanced multiwavelets (BMW) is proposed: a 5-level wavelet decomposition is performed on each audio frame, and the coefficients of the five decomposition sub-bands are divided into 32 frequency bands. Estimation quantization (EQ) with a window of 5 audio samples is then adopted, and finally a 32-bit sub-fingerprint is extracted for each audio frame according to the relationship between the log variances of each sub-band's coefficients and the mean of all the log variances. Several experiments demonstrate that this scheme is robust to signal processing attacks and manipulations except for linear speed change. The other type of fingerprinting scheme introduces computer vision techniques, converting the audio clip into a 2-D spectrogram and then applying the wavelet transform. In Ke et al. (2005), the spectrogram of each audio snippet is viewed as a 2-D image and the wavelet transform is used to extract 860 descriptors for a 10-second audio clip.
Pairwise boosting is then applied to learn compact, discriminative local descriptors that are efficient for audio retrieval. The algorithm retrieves quickly and accurately even in practical systems with poor recording quality or significant ambient noise. In Baluja and Covell (2006, 2007), the so-called Waveprint, combining computer vision and data-stream processing, was proposed. The Haar wavelet is used to extract the t top-magnitude wavelet coefficients for each spectral image, and the selected features are modeled by the Min-Hash technique; in the retrieval step, the locality sensitive hashing (LSH) technique is introduced. This algorithm exhibits an excellent identification rate against content-preserving degradations, except for linear speed changes. Furthermore, the trade-offs between performance, memory usage and computation are analyzed through extensive experiments, and as an extension the parameters of the system are analyzed and verified in Baluja and Covell (2008). This system shows superiority in terms of memory usage and computation, while being more accurate than Ke et al. (2005). From the above, it is clear that existing wavelet-transform-based works are generally not robust against large linear speed changes. On this basis, a novel audio fingerprinting framework that is robust against large linear speed changes is proposed in this paper. We adopt the Daubechies wavelet transform Daub8 in our framework. Compared with the Haar wavelet transform, the scaling signals and wavelets of the Daubechies wavelet transforms have slightly longer supports, i.e., averages and differences are produced using just a few more values from the signal.
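A minimal sketch of the frame-level feature extraction is given below, assuming PyWavelets is available. Mapping the paper's 'Daub8' onto PyWavelets' 'db8' name is itself an assumption, and the energy-difference binarisation is an illustrative stand-in for the paper's two fingerprint extraction algorithms.

```python
# Sketch of wavelet-based sub-fingerprint extraction for one audio frame.
import numpy as np
import pywt

def subfingerprint(frame, wavelet='db8', level=8):
    """Binary sub-fingerprint from the energies of the wavelet sub-bands."""
    coeffs = pywt.wavedec(frame.astype(float), wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])  # level + 1 sub-bands
    # One bit per adjacent pair of sub-bands: 1 if the energy increases, else 0.
    return (np.diff(energies) > 0).astype(np.uint8)

# Usage: split the audio into overlapping frames and concatenate the
# sub-fingerprints to obtain the fingerprint of a clip.
```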

Journal ArticleDOI
TL;DR: Through mathematical analysis of embedding efficiency, it is proved that ternary matrix embedding outperforms r-ary (r ≠ 3) matrix embedding in terms of embedding efficiency.
Abstract: In this paper, the authors examine embedding efficiency, which directly influences the most important property of steganography: security. Embedding efficiency is defined as the number of random message bits embedded per embedding change. Recently, matrix embedding has gained extensive attention because of its outstanding performance in boosting the embedding efficiency of steganographic schemes. Firstly, the authors evaluate embedding change not only by the number of changed coefficients but also by the magnitude of the changes. Secondly, the embedding efficiency of matrix embedding with different radixes is formularized and investigated. The conclusion is drawn that ternary matrix embedding achieves the highest embedding efficiency. Steganographic techniques fall broadly into two categories: techniques of the first category seek to minimize the modifications made to the cover (Westfeld, 2001; Kim et al., 2006; Goljan et al., 2006; Sachnev & Kim, 2010), while techniques of the other category embed data in such a way that it distorts the steganalyst's estimate of the cover image statistics (Solanki et al., 2007; Sarkar, Solanki et al., 2008; Sarkar, Nataraj et al., 2008; Yu et al., 2010). Nowadays, the most popular approach, belonging to the first category, is to seek the lowest possible rate of modification to the cover signal or the highest possible embedding capacity at a given distortion level. Embedding efficiency (Westfeld, 2001) is a quantity that measures embedding capacity at a given distortion level; it is defined as the number of random message bits embedded per embedding change. Matrix embedding is an effective technique for improving embedding efficiency. The idea of introducing matrix embedding into steganography was proposed by Crandall (1998). Westfeld (2001) first implemented binary matrix embedding in F5, which uses Hamming codes to reduce the modification of the quantized block discrete cosine transform (BDCT) coefficients of a cover JPEG image. Binary matrix embedding is also used, with side information, in modified matrix encoding (MME) (Kim et al., 2006), where it has demonstrated distinguished performance. Ternary matrix embedding was first proposed by Goljan et al. (2006) in the spatial domain and shows clearly superior embedding efficiency compared with binary matrix embedding; it was later used by Sachnev et al. (2010) in the JPEG domain, where it outperforms binary-matrix-embedding-based MME. The embedding efficiency of matrix embedding has been studied by Fridrich et al. (2006), where ternary matrix embedding is shown to outperform binary matrix embedding by comparing the upper bounds of their embedding efficiencies. However, it is not clear whether ternary matrix embedding outperforms binary matrix embedding in practice, nor whether it outperforms r-ary (r > 3) matrix embedding. In this paper, we evaluate embedding change not only by the number of changed coefficients but also by the magnitude of the changes. Through mathematical analysis of embedding efficiency, we prove that ternary matrix embedding outperforms r-ary (r ≠ 3) matrix embedding in terms of embedding efficiency. This paper is organized as follows. In the next section, we define change and derive the formula for embedding efficiency based on this definition; the terms and symbols used in the paper are also introduced there.
Following this, we mathematically analyze the properties of embedding efficiency and embedding rate. The embedding efficiencies of matrix embedding with different radixes are then compared, and ternary matrix embedding is shown to achieve the highest embedding efficiency. Finally, we draw our conclusions.
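A small worked illustration of the classical, count-of-changes notion of embedding efficiency for q-ary Hamming-code-based matrix embedding is given below; the paper's refinement additionally weights each change by its magnitude, which is what makes ternary come out on top, whereas the unweighted count below keeps growing with the radix.

```python
# q-ary Hamming-code matrix embedding: p * log2(q) message bits are carried by
# (q**p - 1)/(q - 1) cover symbols with at most one symbol changed, and the
# probability that a change is needed at all is 1 - q**(-p).
import math

def embedding_efficiency(q, p):
    """Bits embedded per (unit-weight) embedding change for a q-ary Hamming code."""
    bits = p * math.log2(q)
    expected_changes = 1.0 - q ** (-p)
    return bits / expected_changes

for q in (2, 3, 4, 5):
    print(q, [round(embedding_efficiency(q, p), 2) for p in (2, 3, 4)])
# Binary (q=2): 2.67, 3.43, 4.27   Ternary (q=3): 3.57, 4.94, 6.42
```

Under this unweighted count, larger radixes appear ever more efficient; once each change is weighted by the magnitude of the coefficient modification it requires, as the paper proposes, radixes above three are penalised and ternary embedding achieves the highest efficiency.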

Journal ArticleDOI
TL;DR: The purpose of this survey is to gather knowledge for the authors' own event knowledge database, which will record how unusual events work and how they relate to other events; the algorithms reviewed in this paper inform its future development.
Abstract: Analytics has emerged as an important area of study because it helps avoid further incidents or risks after events have occurred; this is done by analysing computer events and deriving statistics from them. The purpose of this survey is to gather knowledge for the authors' own event knowledge database, which will record how unusual events work and how they are related to other events. The algorithms mentioned in this paper have informed the authors' future development, resulting in a knowledge database designed to work like an Internet search engine in which events and their relationships can be searched. The research and algorithms have helped the authors decide on the technology they will use for the knowledge database. Section 2 covers the state of the art, i.e., the highest level of development or technique at this time, and introduces a range of systems and techniques. Section 3 covers existing systems and algorithms. Section 4 is the conclusion, which encapsulates our insights on what was useful and what we have learnt from this research.
2. THE STATE OF THE ART
This section covers the highest level of development of computer analytics; it includes topics such as computer surveillance systems, computer forensic events, monitoring events and network event security related methodologies which are currently being employed. It contains up-to-date ideas and knowledge of computer analytics, which can help to advance already existing methodologies.
2.1. Event-Based Surveillance and Monitoring
In surveillance, events are retrieved from video, audio and image sensors (Bolderheij, Absil, & Genderen, 2005; Gonzalez, 2007; Guennoun, Khattak, Kapralos, & El-Khatib, 2008; Bouhats, Marebati, & Mokhatr, 2007). The purpose is not mainly event detection; the focus is rather on the event itself. In order to improve unusual event detection, all events must be analysed individually and categorised by type. The events are stored in a database so that their relationships can be compared and they can be retrieved when needed. If all recorded and examined events are logged into a database, it can be used as an event library. By collecting these events, we can examine the reasons they occur and compare which events are normal and which are not. The application is very similar to YouTube and is made to be usable by anyone, so using it for the first time should be straightforward (Hameed & Abdullah, 2008; Hannemann, Donohue, & Dietz, 2007; Kieran & Yan, 2010). A multi-agent event monitoring system is a hybrid, artificial-intelligence-based event monitoring system; it aims to assist network administrators in keeping track of computer intrusion detection. Event monitoring is one of the key parts of a systematic defence (Biblin, Muller, & Pfitzmann, 2006) and is inseparable from intrusion detection. There are many event monitoring approaches in use today, and our research suggests that some are ineffective (Zeng, Lei, & Chang, 2007).
Figure 1. Structure of event analytics.

Journal ArticleDOI
TL;DR: The authors present a new architecture which facilitates the move to automation of the investigative process; this new architecture draws together several important components of the literature on question and answer methodologies including the concept of 'pivot' word and sentence ranking.
Abstract: The need for an automated approach to forensic digital investigation has been recognized for some years, and several authors have developed frameworks in this direction. The aim of this paper is to assist the forensic investigator with the generation and testing of hypotheses in the analysis phase. In doing so, the authors present a new architecture which facilitates the move to automation of the investigative process; this new architecture draws together several important components of the literature on question and answer methodologies including the concept of 'pivot' word and sentence ranking. Their architecture is supported by a detailed case study demonstrating its practicality.
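As a rough illustration of how 'pivot' words and sentence ranking might interact in the analysis phase, the sketch below scores candidate sentences from case documents by their overlap with the content words of the investigator's question; the tokenisation, stop-word list and scoring are illustrative assumptions, not the authors' architecture.

```python
# Sketch of pivot-word based sentence ranking for hypothesis testing.
import re

STOPWORDS = {'the', 'a', 'an', 'of', 'to', 'in', 'on', 'and', 'or', 'was', 'is'}

def pivot_words(question):
    """Content words of the investigator's question, treated as pivots."""
    tokens = re.findall(r'[a-z0-9]+', question.lower())
    return {t for t in tokens if t not in STOPWORDS}

def rank_sentences(sentences, question):
    """Rank candidate sentences by pivot-word overlap with the question."""
    pivots = pivot_words(question)
    scored = [(len(pivots & set(re.findall(r'[a-z0-9]+', s.lower()))), s)
              for s in sentences]
    return [s for score, s in sorted(scored, key=lambda x: -x[0]) if score > 0]
```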

Journal ArticleDOI
TL;DR: Pypette is presented, a novel platform enabling the automated, repeatable analysis of live digital forensic acquisition techniques, and the effects of these approaches, and their improvements over other techniques, can be evaluated and quantified.
Abstract: Live digital forensics presents unique challenges with respect to maintaining forensic soundness, but also offers the ability to examine information that is unavailable to quiescent analysis. Any perturbation of a live operating system by a forensic examiner will have far-reaching effects on the state of the system being analysed. Numerous approaches to live digital forensic evidence acquisition have been proposed in the literature, but relatively little attention has been paid to the problem of identifying how the effects of these approaches, and their improvements over other techniques, can be evaluated and quantified. In this paper, the authors present Pypette, a novel platform enabling the automated, repeatable analysis of live digital forensic acquisition techniques.

Journal ArticleDOI
TL;DR: The authors devise a new distortion profile exploring both the block complexity and the distortion effect due to flipping and rounding errors, and incorporate it in the framework of syndrome trellis coding (STC) to propose a new JPEG steganographic scheme that greatly increases the secure embedding capacity against steganalysis.
Abstract: Minimizing the embedding impact is a practically feasible philosophy for designing steganographic systems. The development of a steganographic system can be formulated as the construction of a distortion profile reflecting the embedding impact and the design of syndrome coding based on a certain code. The authors devise a new distortion profile exploiting both the block complexity and the distortion effect due to flipping and rounding errors, and incorporate it into the framework of syndrome trellis coding (STC) to propose a new JPEG steganographic scheme. The STC provides multiple candidate solutions for embedding messages into a block of coefficients, while the constructed content-adaptive distortion profile guides the selection of the best solution with minimal distortion effect. The total embedding distortion or impact is thereby significantly reduced, leading to lower detectability by steganalysis. Extensive experimental results demonstrate that the proposed JPEG steganographic scheme greatly increases the secure embedding capacity against steganalysis and shows significant superiority over some existing JPEG steganographic approaches.
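The sketch below shows one way a content-adaptive distortion profile combining rounding error and block complexity could be assigned to a block of unquantised DCT coefficients; the specific weighting is an illustrative assumption, not the authors' formula, and an STC encoder would then choose, among its candidate embeddings, the flip pattern with the smallest total cost.

```python
# Sketch of a content-adaptive distortion profile for one 8x8 block of
# real-valued (not yet rounded) quantised DCT coefficients: flipping a
# coefficient costs more when it introduces a large extra rounding error and
# less when the block is complex (textured), so busy blocks are preferred.
import numpy as np

def distortion_profile(raw_coeffs, eps=1e-6):
    """Per-coefficient embedding cost for an 8x8 block of unrounded DCT values."""
    rounded = np.round(raw_coeffs)
    rounding_err = np.abs(raw_coeffs - rounded)                   # in [0, 0.5]
    flip_err = 1.0 - rounding_err                                 # cost of flipping
    complexity = np.sum(np.abs(rounded)) - np.abs(rounded[0, 0])  # AC energy proxy
    return flip_err / (complexity + eps)
```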