Cross-pollination of Multiple Experiments Running Concurrently
Hi Everyone - My past experience is with Optimizely, and I'm working to get up to speed on Dynamic Yield. One feature I really loved in Optimizely was test exclusion. For example, you typically wouldn't run two tests on the homepage concurrently as they could cross-pollinate with each other and skew the test results. Using test exclusions in Optimizely, the platform would ensure that a visitor to the homepage would only be let into one of the homepage tests and not both. This would allow you to run multiple tests on the same page or in different areas of the site and you could count on the results not being skewed.
What is the best practice in using DY? How do you ensure that running multiple tests can be done cleanly, with no skewing of the test results? For example, if I have product recommendation tests running on the homepage, the product detail page, and the cart, there are a lot of different permutations a visitor could see in their session, especially if each test has multiple variations. In this example, would you run all of these tests at the same time, or would you run them sequentially, one at a time, for more granular learning?
-
Hey,
This is a super great question (: I think there are a few answers to that:
1. Multi-touch campaigns.
A multi-touch campaign can put users into different buckets and show different campaigns to each bucket. In that case, group A will get campaign A and group B will get campaign B.
You can also create a few touchpoints, for example:
- group 1 gets Campaign A, variation A, and Campaign B, variation A
- group 2 gets Campaign A, variation A, and Campaign B, variation B
- group 3 gets Campaign A, variation B, and Campaign B, variation A
- group 4 gets Campaign A, variation B, and Campaign B, variation B
This allows you to test interactions in campaigns that are interconnected.
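To make the bucketing idea concrete, here's a rough sketch (my own illustration, not how DY implements it under the hood) of how a stable hash of the visitor ID can put each visitor into exactly one mutually exclusive group:

```python
# Minimal sketch of deterministic bucketing for mutually exclusive experiment groups.
# This is NOT Dynamic Yield's implementation -- just an illustration of the idea that
# a stable hash of the visitor ID assigns each visitor to exactly one group.
import hashlib

def assign_group(visitor_id: str, salt: str = "multi-touch-demo", n_groups: int = 2) -> int:
    """Return a stable group index (0..n_groups-1) for a visitor."""
    digest = hashlib.sha256(f"{salt}:{visitor_id}".encode()).hexdigest()
    return int(digest, 16) % n_groups

# Example: group 0 only ever sees Campaign A, group 1 only ever sees Campaign B.
for vid in ["visitor-123", "visitor-456", "visitor-789"]:
    group = assign_group(vid)
    campaign = "Campaign A" if group == 0 else "Campaign B"
    print(vid, "->", campaign)
```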
2. Just don't test two campaigns that you think can affect each other.
That said, you can create test sprints of 2-4 weeks, depending on your traffic volume.
3. Global measurement.
We don't live in a perfect test environment, and campaigns can interact with each other (say I tested A, got results, and chose a winner; then I added campaign B, which actually affects campaign A negatively, but I can't tell that is happening).
This is an imperfect answer, as it is really hard to make this measurement. The goal is to look at global KPIs, make sure you are progressing, and optimize campaigns separately.
To conclude:
- What are you trying to achieve?
- Why are you running these tests?
- What do you want to learn?
It's always good to have a plan for your future tests and to think about how they affect each other. Then you can start getting a better idea of what you want to learn and see.
Let me know if you have any follow-up questions.
-
I struggled with this concept at first too. We occasionally use multi-touch campaigns to solve for this if it's a larger test and we want to completely eliminate cross-pollination, or to truly understand which specific touchpoint is impacting performance on a specific KPI. Beyond that, I am comfortable running multiple smaller-scale tests at the same time, because I assume that traffic is being split evenly and randomly in all tests. In your example above, if you're running a product recs test on the homepage and the PDP, then as long as users are split evenly and randomly between variations, you can assume that if, say, 300 users are exposed to homepage variation A and then visit the PDP, they will be split equally between all variations on the PDP.
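If it helps, here's a quick simulation of that intuition (the visitor counts are made up, and this is just an illustration, not anything DY-specific): with independent random assignment, the visitors who saw one homepage variation end up split roughly 50/50 across the PDP variations.

```python
# Quick simulation: if each test assigns visitors independently and uniformly at
# random, the users who saw the homepage variation end up split roughly evenly
# across the PDP variations. Numbers are made up for illustration.
import random
from collections import Counter

random.seed(42)
visitors = range(10_000)

hp_variation  = {v: random.choice(["HP-Control", "HP-Variation"]) for v in visitors}
pdp_variation = {v: random.choice(["PDP-Control", "PDP-Variation"]) for v in visitors}

# Look only at visitors who saw the homepage variation, and count their PDP buckets.
saw_hp_variation = [v for v in visitors if hp_variation[v] == "HP-Variation"]
print(Counter(pdp_variation[v] for v in saw_hp_variation))
# Expect the two PDP buckets to be close to 50/50.
```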
-
Thanks for the replies!
I guess where I'm struggling is with how the permutations quickly start building up, and with DY not keeping track of each permutation. As an example, if I have a product rec test with just one variation each on the homepage, the PDP, and the cart, I have the following permutations:
Homepage-Control; PDP-Control; Cart-Control
Homepage-Variation; PDP-Control; Cart-Control
Homepage-Control; PDP-Variation; Cart-Control
Homepage-Control; PDP-Control; Cart-Variation
Homepage-Control; PDP-Variation; Cart-Variation
Homepage-Variation; PDP-Variation; Cart-Control
Homepage-Variation; PDP-Control; Cart-Variation
Homepage-Variation; PDP-Variation; Cart-Variation
If we add a second variation to each test above, the number of permutations jumps from 8 to 27 and keeps growing exponentially from there. I'm concerned about downstream product recs changing visitor intent on upstream product recs, especially since all of these tests have the same primary goal.
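Just to put numbers on it, here's a quick way to enumerate those permutations (a throwaway illustration, not anything DY provides):

```python
# Counting the permutations described above with itertools.product.
# With control + 1 variation per page across 3 pages: 2**3 = 8 combinations.
# With control + 2 variations per page: 3**3 = 27 combinations.
from itertools import product

pages = ["Homepage", "PDP", "Cart"]

one_variation  = ["Control", "Variation"]
two_variations = ["Control", "Variation 1", "Variation 2"]

combos_1 = list(product(one_variation, repeat=len(pages)))
combos_2 = list(product(two_variations, repeat=len(pages)))

print(len(combos_1))  # 8
print(len(combos_2))  # 27
for combo in combos_1:
    print("; ".join(f"{page}-{arm}" for page, arm in zip(pages, combo)))
```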
I'm thinking the best way to run these three tests is not separately and concurrently, but to create a single multi-touch campaign whereby a visitor either sees the product recs in all three areas of the site or doesn't see them at all. At least with this test set-up you are truly measuring whether recommendations throughout the site have a lift without all the above permutations coming into play.
If you were running these tests, and they all had the same primary goal of purchases, would you:
1. Run them independently and all at the same time.
2. Run them sequentially.
3. Run them all at the same time but cookie them together such that each visitor will either see all the product recs throughout the site or not see them at all.
-
Hey Corte,
First I would say thanks for the really interesting questions.
In this case I would just run the campaigns, make sure none of them is losing, and use the direct revenue report.
You can also set up a global control group that includes just these campaigns. I would do that at a later stage, after you have found your winners.
My main piece of advice to you:
Try to avoid complicated measurement unless it is super critical for you. I usually assume that measurement is not perfect, so I give tests a longer run time and make sure the confidence level is 95%+ and the impact is 5% or more.
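If it's useful, here is a rough back-of-envelope version of that rule of thumb as a two-proportion z-test (this is not how the DY dashboard computes significance, and the conversion numbers are made up):

```python
# Back-of-envelope check of the "95% confidence, 5%+ impact" rule of thumb using a
# two-proportion z-test. Not how the DY dashboard computes significance -- just a
# rough sanity check with invented conversion numbers.
from math import sqrt
from statistics import NormalDist

def check_result(conv_a, n_a, conv_b, n_b, min_lift=0.05, min_confidence=0.95):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    lift = (p_b - p_a) / p_a                       # relative impact of B over A
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    confidence = NormalDist().cdf(abs(z))          # one-sided confidence level
    return lift >= min_lift and confidence >= min_confidence, lift, confidence

ok, lift, conf = check_result(conv_a=400, n_a=10_000, conv_b=460, n_b=10_000)
print(f"lift={lift:.1%}, confidence={conf:.1%}, call a winner: {ok}")
```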
Yonatan
-
Hey Corte,
I would also challenge your assumption that the primary metric should be the same for each test. I usually try to set the primary metric to the closest CTA we are trying to impact. For example, on the HP I would not set purchases as the primary metric for a rec widget; there are so many experiences the user will go through before they purchase. I would probably set it to CTR to monitor engagement, since recs on the HP are mainly there to help the user move through the funnel and get to the PDPs faster and with less friction.
Of course, as a secondary metric I would still look at purchases. If the HP test is winning on CTR (a high level of engagement) but purchases are losing, I would then re-evaluate the experiences from PDP to purchase to see what in between throws the user off track.
We actually sent out a tips and tricks email recently that talks about the importance of events to analyze the impact of tests: https://support.dynamicyield.com/hc/en-us/community/posts/360012860498-Tips-Tricks-Employing-events-for-in-depth-reporting-and-analysis
-
Yeah, I agree with you 100% that having "Purchases" as the primary metric on the homepage isn't ideal. As you say, we are testing at the very front of the purchase funnel while using an end-of-funnel metric as the primary metric - not ideal. Thanks for the article link - I will definitely read it.