Machine Learning for Fraud Detection in Streaming Services | by Netflix Technology Blog

Figure 1. Schematic of a streaming service platform: (a) illustrates device types that can be used for streaming, (b) designates the set of authentication and authorization systems, such as license and manifest servers, that provide encrypted content as well as decryption keys and manifests, and (c) shows the streaming service provider, as a surrogate entity for digital content providers, that interacts with the other two components.

Data Labeling

Table 1. The list of streaming-related features, with the suffixes pct and cnt referring to percentage and count, respectively.
Figure 2. Number of anomalous samples as a function of (a) fraud categories and (b) number of tagged categories.
Figure 3. Correlation matrix of the features presented in Table 1 for (a) clean and (b) anomalous data samples.
Figure 4. Synthetic Minority Over-sampling Technique
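The over-sampling idea named in Figure 4 can be sketched in a few lines. This is a minimal illustration of the general SMOTE interpolation step, not the pipeline used in the article: the function name `smote`, the choice of `k`, the Euclidean distance, and the toy two-feature data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority-class neighbours.

    Hypothetical helper for illustration; distance is Euclidean.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two made-up features (assumed values).
minority_samples = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_samples = smote(minority_samples, n_synthetic=6, k=2)
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating existing rows.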
Figure 5. Model-based anomaly detection approaches: (a) semi-supervised and (b) supervised.
Table 2. The values of the evaluation metrics for a set of semi-supervised anomaly detection models.
Figure 6. For the deep auto-encoder model: (a) distribution of the Mean Squared Error (MSE) values for anomalous and benign samples at the inference stage, (b) confusion matrix across benign and anomalous samples, and (c) Mean Squared Error (MSE) values averaged across the anomalous and benign samples for each of the 23 features.
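The quantities plotted in Figure 6 can be illustrated with a small sketch of reconstruction-error scoring. It assumes an auto-encoder has already produced a reconstruction for each input; the hard-coded `reconstructions`, the helper names, and the fixed `threshold` are hypothetical stand-ins, and the article does not state how its decision threshold was chosen.

```python
def per_sample_mse(inputs, reconstructions):
    """MSE between each input row and its reconstruction (one value per sample)."""
    return [
        sum((x - r) ** 2 for x, r in zip(row, rec)) / len(row)
        for row, rec in zip(inputs, reconstructions)
    ]


def per_feature_mse(inputs, reconstructions):
    """Squared error averaged over samples (one value per feature)."""
    n = len(inputs)
    n_features = len(inputs[0])
    return [
        sum((inputs[i][f] - reconstructions[i][f]) ** 2 for i in range(n)) / n
        for f in range(n_features)
    ]


def flag_anomalies(errors, threshold):
    """Mark a sample as anomalous when its error exceeds the threshold."""
    return [e > threshold for e in errors]


# Made-up data: the first row is reconstructed exactly, the second is not.
inputs = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.5]]
reconstructions = [[0.0, 1.0, 0.5], [0.0, 1.0, 0.5]]

sample_errors = per_sample_mse(inputs, reconstructions)
feature_errors = per_feature_mse(inputs, reconstructions)
flags = flag_anomalies(sample_errors, threshold=0.1)  # threshold is an assumption
```

Per-sample errors correspond to the distribution in panel (a), and per-feature errors to the breakdown in panel (c). A model trained mostly on benign data tends to reconstruct benign samples well, so a large error is treated as a sign of an anomaly.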
Table 3. The values of the evaluation metrics for a set of supervised binary anomaly detection classifiers.
Table 4. The values of the evaluation metrics for a set of supervised multi-class multi-label anomaly detection approaches. The values in parentheses refer to the performance of the models trained on the original (not upsampled) dataset.
Figure 7. The normalized feature importance values (NFIV) for the multi-class multi-label anomaly detection task using the XGBoost approach in Table 4 across the three anomaly classes, i.e., (a) content fraud, (b) service fraud, and (c) account fraud.