GOTCHA! Network-Based Fraud Detection for Social Security Fraud
Citations
Fraud Analytics Using Descriptive, Predictive And Social Network Techniques: A Guide To Data Science For Fraud Detection
The value of big data for credit scoring: Enhancing financial inclusion using mobile phone data and social network analytics
Generative adversarial network based telecom fraud detection at the receiving bank
Social network analytics for churn prediction in telco
Auto loan fraud detection using dominance-based rough set approach versus machine learning methods
References
Random Forests
SMOTE: synthetic minority over-sampling technique
The PageRank Citation Ranking: Bringing Order to the Web
Frequently Asked Questions (9)
Q2. What future work have the authors mentioned in the paper "GOTCHA! Network-Based Fraud Detection for Social Security Fraud"?
Their future work will elaborate more on active learning, by updating the model using both correctly and incorrectly classified instances. Another topic for future research is community detection, which may find groups of suspicious companies. Although the authors applied their approach to social security fraud detection, their results suggest that the proposed framework can be employed for the detection of other fraud types where the network can be represented as a higher order graph (n-partite graph).
Q3. How does GOTCHA improve the intrinsic baseline?
GOTCHA! improves the intrinsic baseline by detecting 31%, 33% and 33% more fraudulent and high-risk cases for the respective timestamps, resulting in higher precision and recall.
Q4. What is the iterative propagation procedure for bipartite graphs?
The iterative propagation procedure for bipartite graphs can then be written as

$$\vec{\xi} = \alpha \cdot Q_{norm} \cdot \vec{\xi} + (1 - \alpha) \cdot \vec{v} \qquad (5)$$

Note that $Q_{norm}$ is a dynamic matrix, representing both present and past relationships.
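As a rough illustration of the update in equation (5), the following sketch iterates it on a toy three-node graph. It is not the authors' implementation: the function name `propagate`, the toy matrix, and the reading of the vector v as a PageRank-style restart vector concentrated on one known fraudulent node are all assumptions made for the example. The values α = 0.85 and 100 iterations are the ones reported for the paper elsewhere on this page.

```python
def propagate(q_norm, v, alpha=0.85, iterations=100):
    """Iterate xi <- alpha * Q_norm * xi + (1 - alpha) * v.

    q_norm: square matrix as a list of rows, assumed column-normalized.
    v:      restart vector (assumption: mass sits on known fraud nodes).
    """
    xi = list(v)  # assumption: start the iteration from v
    for _ in range(iterations):
        xi = [
            alpha * sum(q * x for q, x in zip(row, xi)) + (1 - alpha) * v_i
            for row, v_i in zip(q_norm, v)
        ]
    return xi


# Hypothetical toy graph: a path 0 - 1 - 2, each column divided by its degree.
q_norm = [
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.5, 0.0],
]
v = [1.0, 0.0, 0.0]  # hypothetical: node 0 is the only known fraud case

xi = propagate(q_norm, v)
print(xi)
```

Because the toy matrix is column-normalized and v sums to one, the scores keep summing to one, and node 0 ends up with a higher score than node 2 even though both sit at the same distance from node 1.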
Q5. What is the adjacency matrix of a bipartite graph?
The adjacency matrix of an undirected bipartite graph is formally written as $A_{n \times m} = (a_{i,j})$, with $a_{i,j} = 1$ if a link between node $i \in V_1$ and node $j \in V_2$ exists, and $a_{i,j} = 0$ otherwise.
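The definition above can be sketched in a few lines of Python. The helper name `bipartite_adjacency` and the toy link list are hypothetical, chosen only to show how an n × m matrix is filled from pairs (i, j) with i in V1 and j in V2.

```python
def bipartite_adjacency(n, m, links):
    """Return the n x m adjacency matrix as a list of rows.

    links holds (i, j) pairs: i indexes a node in V1, j a node in V2.
    """
    a = [[0] * m for _ in range(n)]
    for i, j in links:
        a[i][j] = 1  # a link between node i in V1 and node j in V2
    return a


# Hypothetical toy data: 3 nodes in V1, 2 nodes in V2.
A = bipartite_adjacency(3, 2, [(0, 0), (1, 0), (1, 1), (2, 1)])
for row in A:
    print(row)
```

Rows correspond to nodes in V1 and columns to nodes in V2, so the matrix is generally not square, unlike the adjacency matrix of an ordinary graph.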
Q6. How will the future work elaborate on active learning?
Their future work will elaborate more on active learning, by updating the model using both correctly and incorrectly classified instances.
Q7. What is the corresponding matrix representation of size n × n of a graph?
The adjacency matrix is the corresponding matrix representation of size $n \times n$ of a graph, with $n$ being the total number of vertices and $a_{i,j} = 1$ if a link between node $i$ and $j$ exists, and $a_{i,j} = 0$ otherwise.
Q8. How many iterations of the process are needed to make sure that potential changes in the final exposure score are only marginal?
The authors repeat the process for 100 iterations in order to make sure that potential changes in the final exposure score are only marginal. Based on Page et al. (1998), the authors choose $\alpha = 0.85$.
Q9. What are the types of variables that can be classified as direct and indirect?
Those variables include the degree, triangles and propagated exposure score (see Section 3.4 for details), and can be classified as direct and indirect network variables depending on whether they are derived from the direct neighborhood or take into account the full network structure.
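To make the two direct network variables concrete, here is a minimal sketch that computes degree and triangle counts from a square, undirected adjacency matrix. The function names and the toy graph are hypothetical, and the paper's own definitions in its Section 3.4 may differ in detail (for example in which graph they are computed on).

```python
def degree(a, i):
    """Number of direct neighbors of node i."""
    return sum(a[i])


def triangles(a, i):
    """Number of triangles through node i: neighbor pairs that are linked."""
    nbrs = [j for j, linked in enumerate(a[i]) if linked]
    return sum(
        1
        for p in range(len(nbrs))
        for q in range(p + 1, len(nbrs))
        if a[nbrs[p]][nbrs[q]]
    )


# Hypothetical toy graph with links 0-1, 0-2, 1-2 and 2-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

degrees = [degree(A, i) for i in range(len(A))]
triangle_counts = [triangles(A, i) for i in range(len(A))]
print(degrees, triangle_counts)
```

Both quantities look only at a node's immediate neighborhood, which is what makes them direct variables. A propagated exposure score, by contrast, depends on the full network structure.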