Outlier Handling – Dynamic Yield Knowledge Base

Outliers are extreme observations that differ significantly from the rest of the data. For example, a single purchase of $1000 might stand out if the average order value is $50, or a user who made 30 purchases in one month might be considered anomalous if the average is just 1.5 purchases per customer.

Outliers are rare and can result from valid user behavior, but they can carry enough weight to distort the results of a test.

Note: Outlier handling applies only to the Personalization Impact and Experience reports.

Detecting outliers

Dynamic Yield detects and handles two types of outliers:

Extreme event values: Applied to every event or goal with a value.
Users with an extreme number of events: Applied to every event or goal.

Extreme event values

For each event or goal with a value (such as Purchase), the outlier threshold is calculated based on the average and standard deviation of all event values collected in the past 30 days. The threshold is defined as 3 standard deviations above the average. Individual event values that exceed the threshold are replaced with the average value of the events below the threshold. A new outlier threshold is computed daily in a rolling 30-day fashion and applied to the events from the current day. The threshold is applied only if there are at least 100 events in the past 30 days.
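The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dynamic Yield's implementation: the function names are hypothetical, the population standard deviation is used (the article does not say whether it is population or sample), and the replacement value is taken as the average of the 30-day values at or below the threshold.

```python
from statistics import mean, pstdev

MIN_EVENTS = 100   # the threshold is applied only with at least 100 events in the window
NUM_STDDEVS = 3    # the threshold sits 3 standard deviations above the average


def value_threshold(window_values):
    """Return the outlier threshold for one event type, or None if there is too little data."""
    if len(window_values) < MIN_EVENTS:
        return None
    # Assumption: population standard deviation; the source does not specify which one.
    return mean(window_values) + NUM_STDDEVS * pstdev(window_values)


def cap_values(todays_values, window_values):
    """Replace today's values that exceed the threshold; leave the rest untouched."""
    threshold = value_threshold(window_values)
    if threshold is None:
        return list(todays_values)
    # Assumption: the replacement is the average of window values at or below the threshold.
    replacement = mean([v for v in window_values if v <= threshold])
    return [replacement if v > threshold else v for v in todays_values]
```

For instance, with a 30-day window of 99 purchases of $50 and one purchase of $1000, a $1000 purchase today would be reported with the average of the non-outlier values, while a $40 purchase would be reported unchanged.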

Users with an extreme number of events

For each event or goal, we calculate an outlier threshold based on the count of events per DYID in the last 30 days. We consider only the DYIDs that have triggered the event multiple times, and apply a variation of the DoubleMAD outlier detection technique:

Calculate the median number of events per DYID.
Calculate the absolute deviation from the median for the DYIDs above the median.
Set the threshold as the median + 3 times the absolute deviation from the median.
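The three steps can be sketched as follows. This is a minimal illustration, not the exact variation Dynamic Yield applies: the function name is hypothetical, "multiple times" is assumed to mean at least two events, and the absolute deviations of the DYIDs above the median are summarized by their median, as in the standard DoubleMAD technique.

```python
from statistics import median

NUM_DEVIATIONS = 3  # the threshold is the median plus 3 times the upper deviation


def count_threshold(events_per_dyid):
    """Return the per-DYID event-count threshold, or None if it cannot be computed.

    events_per_dyid maps a DYID to its number of events in the last 30 days.
    """
    # Assumption: "multiple times" means a DYID triggered the event at least twice.
    counts = [c for c in events_per_dyid.values() if c >= 2]
    if not counts:
        return None
    med = median(counts)
    # Deviations from the median, for DYIDs above the median only.
    upper_deviations = [c - med for c in counts if c > med]
    if not upper_deviations:
        return None
    # Assumption: the deviations are summarized by their median (upper MAD).
    upper_mad = median(upper_deviations)
    return med + NUM_DEVIATIONS * upper_mad
```

Using only the deviations above the median keeps the many low-count users from shrinking the spread estimate, which matters because event counts per user are typically skewed to the right.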

All DYIDs that triggered more events than the threshold in the last 30 days are considered outliers for the current day. The events they fired on the current day, along with the values of those events, are excluded from reports. For example, assume that:

A new user starts visiting the site and makes a purchase every day.
The outlier threshold for the purchase event is 10.

The new user would be flagged as an outlier starting on the day of their 11th purchase, effectively capping their purchases considered for reporting purposes at 10.
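The example above can be replayed with a short simulation. It is a sketch under assumptions: the function name is hypothetical, the threshold is held constant at 10 rather than recomputed daily, the 30-day window is assumed to include the current day, and excluded events are assumed to still count toward the user's total.

```python
def reported_purchase_days(num_days, threshold=10, window=30):
    """Return the days whose purchase is kept in reports for a user buying once a day."""
    purchase_days = list(range(1, num_days + 1))  # one purchase per day, starting on day 1
    reported = []
    for today in purchase_days:
        # Count the user's purchases in the rolling window that ends today.
        recent = sum(1 for day in purchase_days if today - window < day <= today)
        # The user is an outlier for today once the count exceeds the threshold.
        if recent <= threshold:
            reported.append(today)
    return reported
```

Under these assumptions, calling `reported_purchase_days(15)` keeps the purchases from days 1 through 10 and drops those from day 11 onward, matching the cap of 10 described above.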

A new outlier threshold is computed daily in a rolling 30-day fashion and applied to the events from the current day. The threshold is applied only if there are at least 100 events in the last 30 days.

Outliers in experience reports

By default, A/B test reports exclude outliers, but you can include them by going to More Options and switching off the Exclude Outliers toggle.

Both types of outliers – extreme event values and users with an extreme number of events – are excluded using the same selector.

All numbers in A/B test reports are affected by this selector, with the exception of predictive targeting, which is always computed with results excluding outliers.

Exporting outliers: Revenue events log

To export a log of all revenue events that were fired by users in the A/B test, click Export at the top of the experience report page, and then select Revenue event log. The log includes all events with a value and indicates whether each event was flagged as an outlier by either outlier handling method. The revenue events log is available for all A/B tests (campaigns using A/B test allocation that have at least 2 variations).