Outliers are extreme observations that differ significantly from the rest of the data. For example, a single purchase of $1000 might be unusual if the average order value is $50, or a user who made 30 purchases in a single month might be considered anomalous if the average number of purchases per customer in the same period is 1.5.
Outliers can result from legitimate user behavior, but even when they are rare, they can carry enough weight to distort the results of a test.
Detecting outliers
Dynamic Yield detects and handles two types of outliers:
- Extreme event values: Applied to every event or goal with a value.
- Users with an extreme number of events: Applied to every event or goal (as of July 1st, 2023).
Extreme event values
For each event or goal with a value (such as Purchase), the outlier threshold is calculated based on the average and standard deviation of all event values collected in the past 30 days. The threshold is defined as 3 standard deviations above the average. Individual event values that exceed the threshold are replaced with the average value of the events below the threshold. A new outlier threshold is computed daily in a rolling 30-day fashion and applied to the events from the current day. The threshold is applied only if there are at least 100 events in the past 30 days.
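The rule above can be sketched in a few lines of Python. This is only an illustration, not Dynamic Yield's implementation: the function names are hypothetical, the population standard deviation is an assumption (the article does not say which estimator is used), and the replacement average is assumed to be taken over the same 30-day window.

```python
from statistics import mean, pstdev

MIN_EVENTS = 100  # the threshold is applied only with at least 100 events
N_SIGMA = 3       # threshold = average + 3 standard deviations


def value_threshold(history):
    """Return the outlier threshold for 30 days of event values, or None."""
    if len(history) < MIN_EVENTS:
        return None  # not enough data: no threshold is applied
    # Assumption: population standard deviation; the source does not specify.
    return mean(history) + N_SIGMA * pstdev(history)


def handle_extreme_values(history, todays_values):
    """Replace today's values above the threshold with the average of the
    historical values that do not exceed it (hypothetical helper)."""
    threshold = value_threshold(history)
    if threshold is None:
        return list(todays_values)
    normal = [v for v in history if v <= threshold]
    replacement = mean(normal)
    return [replacement if v > threshold else v for v in todays_values]


# 99 orders of $50 and one order of $1000 in the past 30 days
history = [50.0] * 99 + [1000.0]
cleaned = handle_extreme_values(history, [40.0, 1000.0])
```

In this toy data the $1000 order exceeds the threshold and is replaced by 50.0, the average of the remaining orders, while the $40 order is left unchanged.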
Users with an extreme number of events
For each event or goal, we calculate an outlier threshold based on the number of events per DYID in the last 30 days. We consider only the DYIDs that triggered the event more than once, and apply a variation of the DoubleMAD outlier detection technique:
- Calculate the median number of events per DYID.
- Calculate the absolute deviation from the median for the DYIDs above the median.
- Set the threshold to the median plus 3 times the absolute deviation from the median.
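The three steps above can be sketched as follows. This is a rough illustration under stated assumptions, not the product's actual code: `count_threshold` is a hypothetical name, "multiple times" is read as two or more events, and the per-DYID deviations are aggregated with a median, as in the standard DoubleMAD technique (the article does not spell out the aggregation).

```python
from statistics import median


def count_threshold(events_per_dyid, k=3):
    """Upper-side DoubleMAD-style threshold on events per DYID (sketch).

    events_per_dyid maps a DYID to its event count in the last 30 days.
    """
    # Assumption: "multiple times" means at least 2 events.
    counts = [c for c in events_per_dyid.values() if c >= 2]
    if not counts:
        return None
    med = median(counts)
    # Deviations from the median, only for DYIDs above the median.
    upper_devs = [c - med for c in counts if c > med]
    if not upper_devs:
        return None  # all counts are equal; this sketch sets no threshold
    # Assumption: deviations are aggregated with a median, as in DoubleMAD.
    upper_mad = median(upper_devs)
    return med + k * upper_mad


counts = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 4, "f": 6, "g": 40}
threshold = count_threshold(counts)
outliers = [dyid for dyid, c in counts.items() if c > threshold]
```

Here the median of the retained counts is 3.5 and the upper deviation is 2.5, so the threshold is 11.0 and only DYID "g" (40 events) is flagged. Using only the upper side keeps a long right tail from being masked by the many low-count users.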
All DYIDs that triggered more events than the threshold in the last 30 days are considered outliers for the current day. The events they fire on the current day, along with the values of those events, are excluded from reports. For example, assume that:
- A new user starts visiting the site and makes a purchase every day.
- The outlier threshold for the purchase event is 10.
The new user would be flagged as an outlier starting on the day of their 11th purchase, effectively capping their purchases considered for reporting purposes at 10.
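The example can be checked with a short simulation. It is a simplified sketch: the threshold is held fixed at 10 (in practice it is recomputed daily), the function name is hypothetical, and the 30-day count is assumed to include the current day, which is what the "11th purchase" wording implies.

```python
WINDOW_DAYS = 30


def reported_purchases(daily_purchases, threshold=10):
    """Count the purchases that remain in reports for one user (sketch).

    daily_purchases[i] is the number of purchases the user made on day i.
    """
    reported = 0
    for day, todays in enumerate(daily_purchases):
        # Events in the rolling 30-day window, including the current day.
        start = max(0, day - WINDOW_DAYS + 1)
        window_total = sum(daily_purchases[start:day + 1])
        if window_total > threshold:
            continue  # the user is an outlier today; today's events are excluded
        reported += todays
    return reported


# One purchase per day for 15 days, threshold of 10
kept = reported_purchases([1] * 15)
```

With one purchase per day, the first 10 purchases are reported and every purchase from the 11th onward is excluded, so `kept` is 10.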
A new outlier threshold is computed daily in a rolling 30-day fashion and applied to the events from the current day. The threshold is applied only if there are at least 100 events in the last 30 days.
Outliers in experience reports
By default, A/B test reports exclude outliers, but you can include them by going to More Options and switching off the Exclude Outliers toggle.
Both types of outliers – extreme event values and users with an extreme number of events – are excluded using the same selector.
All numbers in A/B test reports are affected by this selector, with the exception of predictive targeting, which is always computed with results excluding outliers.
Exporting outliers: Revenue events log
To export a log of all revenue events that were fired by users in the A/B test, click Export at the top of the experience report page, and then select Revenue event log. The log includes all events with a value and indicates whether each event was flagged as an outlier by either outlier handling method. The revenue events log is available for all A/B tests (campaigns using A/B test allocation that have at least 2 variations).