Steganalysis in high dimensions: fusing classifiers built on random subspaces
References
Random Forests
LIBSVM: A library for support vector machines
Bagging predictors
LIBLINEAR: A Library for Large Linear Classification
Frequently Asked Questions (12)
Q2. What are the main factors that can negatively influence the performance of machine learning tools?
The authors conclude that there are three main factors that can negatively influence the performance of machine learning tools: (1) small number of training samples, (2) low class distinguishability, (3) high dimensionality.
Q3. What is the standard way to implement classifiers today?
A standard way to implement classifiers today is to train an SVM with a Gaussian kernel on a large database of cover and stego images.
Q4. What are the two pressing issues that have a strong effect on the classifier’s performance?
The two most pressing issues that have a strong effect on the classifier’s performance are the formation of prefeatures and the random selection of subspaces.
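The random selection of subspaces can be illustrated with a short sketch. This is not the authors' code; the function name and the seed are assumptions for illustration. Each of the L base learners receives its own random subset of dred feature indices drawn from the full d-dimensional prefeature space:

```python
import numpy as np

# Illustrative sketch (not the authors' implementation): each base learner
# in the ensemble is trained on a random subspace of d_red feature indices
# chosen without replacement from the full d-dimensional prefeature space.
def random_subspaces(d, d_red, L, seed=0):
    """Draw L random subspaces of dimension d_red from d features."""
    rng = np.random.default_rng(seed)
    return [rng.choice(d, size=d_red, replace=False) for _ in range(L)]

# Parameters matching the example quoted in the text (d = 50,000 prefeature,
# L = 99 base learners, d_red = 2000).
subspaces = random_subspaces(d=50_000, d_red=2_000, L=99)
print(len(subspaces), subspaces[0].shape)  # 99 subspaces, each of 2000 indices
```

Because each subspace is drawn without replacement, no base learner sees a duplicated feature, while different learners may still overlap.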
Q5. How long does it take to train a G-SVM?
Training the ensemble classifier with L = 99 and dred = 2000 on a 50,000-dimensional prefeature takes approximately 20 minutes, while training a G-SVM (with the optimal hyperparameters already found) takes about 7.5 hours.
Q6. Why can embedding invariants be useful for steganalysis?
Embedding invariants may be very useful for steganalysis, provided they are correlated with some other cover statistic that is disturbed by embedding.
Q7. How long can it take to train a linear SVM?
Although very efficient implementations of linear SVMs exist (e.g., the LIBLINEAR package10), the training can still take a substantial amount of time.
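As a rough illustration of the linear-SVM route, here is a minimal sketch using scikit-learn's `LinearSVC`, which wraps LIBLINEAR. The data are hypothetical stand-ins for cover/stego feature vectors (the 686-dimensional size mirrors the SPAM features mentioned below); the shift applied to the stego class is an assumption for illustration only:

```python
import numpy as np
from sklearn.svm import LinearSVC  # scikit-learn's wrapper around LIBLINEAR

# Hypothetical toy data standing in for cover/stego feature vectors;
# 686 dimensions mirrors the SPAM feature set size.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 686))
y = np.repeat([0, 1], 200)      # 0 = cover, 1 = stego
X[y == 1] += 0.3                # small embedding-like shift in the stego class

clf = LinearSVC(C=1.0).fit(X, y)
print(round(clf.score(X, y), 2))
```

Even such an efficient linear solver still scales with the number of training samples and features, which is the cost the ensemble approach is designed to avoid.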
Q8. What does the optimal prefeature selection process depend on?
In general, the optimal selection process will likely depend on mutual dependencies among prefeatures and their classification strength.
Q9. Why are weak steganographic methods easily detectable?
Weak steganographic methods are easily detectable because they disturb some elementary cover properties that can be captured by a low-dimensional feature vector with high distinguishability.
Q10. What is the most accurate spatial domain steganalysis of ±1 embedding?
The most accurate spatial domain steganalysis of ±1 embedding (LSB matching) uses the 686-dimensional SPAM features,22 while a 1,234-dimensional Cross-Domain Feature (CDF) set was employed in18 to attack YASS.25 Moreover, the recent results of the steganalysis competition BOSS11 indicate that there is little hope that a human-designed low-dimensional feature space effective against HUGO23 exists.
Q11. How long did it take to train a G-SVM?
To give an idea about the time savings, the ensemble classifier was trained with CC-PEV features with L = 31 and dred = 400 in about 70 seconds, while training a G-SVM on the same features (with optimal values of the hyperparameters C and γ already found) took approximately 3.5 times longer.
Q12. How does the performance of the proposed method compare to a G-SVM?
So far, the authors have shown that the proposed method is capable of working with different prefeatures and that its performance can achieve results similar to a G-SVM.
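Putting the pieces together, the overall scheme can be sketched as a random-subspace ensemble of Fisher linear discriminants fused by majority voting. This is a toy sketch, not the authors' implementation: it uses scikit-learn's `LinearDiscriminantAnalysis` as the base learner, synthetic data, and arbitrarily chosen L and dred:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy sketch in the spirit of the ensemble classifier: L Fisher linear
# discriminants, each trained on d_red randomly chosen features, fused
# by majority voting. Synthetic data; not the authors' code.
rng = np.random.default_rng(2)
X = rng.normal(size=(600, 500))
y = np.repeat([0, 1], 300)
X[y == 1] += 0.25               # weak "stego" signal (assumed for illustration)

L, d_red = 31, 50               # odd L avoids ties in the vote
subspaces = [rng.choice(500, size=d_red, replace=False) for _ in range(L)]
learners = [LinearDiscriminantAnalysis().fit(X[:, s], y) for s in subspaces]

# Fuse the L individual decisions by simple majority voting.
votes = np.array([clf.predict(X[:, s]) for clf, s in zip(learners, subspaces)])
majority = (votes.sum(axis=0) > L / 2).astype(int)
print(round((majority == y).mean(), 2))
```

The appeal of this construction is that each base learner only ever solves a cheap dred-dimensional problem, so the cost grows gently with the full feature dimensionality while the fused decision recovers much of the lost accuracy.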