Classification in a Skewed Online Trade Fraud Complaint Corpus

Kos, William; Schraagen, M.P.; Brinkhuis, M.J.S.; Bex, F.J.

Files

Kos_bnaic_preproceedings.pdf (276.18 KB)

Publication date

2017-11

Authors

Kos, William

Schraagen, Marijn

Brinkhuis, Matthieu

Bex, Floris

Editors

Verheij, Bart

Wiering, Marco

Document Type

Part of book

Collections

Utrecht University Repository

Abstract

This paper explores how machine learning techniques can support the handling of online trade fraud complaints in a skewed corpus, by predicting whether or not a complaint will be withdrawn. To optimize the performance of each classifier, the influence of resampling, word weighting, and word normalization on classification performance is assessed. The results show that machine learning can be used for this purpose: Logistic Regression improves on the baseline set by the skewness ratio by up to 13 percentage points (pp). Furthermore, data alteration techniques improve classifier performance on the skewed dataset by up to 13.5 pp.
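
To make the terms "skewness ratio" and "resampling" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the skewness-ratio baseline corresponds to always predicting the most frequent class, and that resampling means random oversampling of the minority class; the abstract confirms neither. The function names, the class labels, and the toy data are hypothetical.

```python
import random
from collections import Counter


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority items until all classes are equally frequent.

    Assumption: random oversampling stands in for the unspecified
    resampling step mentioned in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement to fill the gap up to the largest class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy corpus: 8 complaints kept, 2 withdrawn (not data from the paper).
texts = [f"complaint {i}" for i in range(10)]
labels = ["kept"] * 8 + ["withdrawn"] * 2

baseline = majority_baseline(labels)
balanced_texts, balanced_labels = random_oversample(texts, labels)
print(baseline)
print(Counter(balanced_labels))
```

In practice, resampling of this kind would be applied to the training split only, so that the evaluation data keeps the original class distribution and remains comparable to the baseline.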

Keywords

Classification, Law Enforcement, Skewed Data

Citation

Kos, W, Schraagen, M P, Brinkhuis, M J S & Bex, F J 2017, Classification in a Skewed Online Trade Fraud Complaint Corpus. in B Verheij & M Wiering (eds), Preproceedings of the 29th Benelux Conference on Artificial Intelligence November 8–9, 2017 in Groningen, The Netherlands: BNAIC 2017. pp. 172-183, The 29th Benelux Conference on Artificial Intelligence, Groningen, Netherlands, 8/11/17.

URI

https://dspace.library.uu.nl/handle/1874/356746
