Interesting to see this paper here! Many years ago when working on a problem of offering online campaigns of some form, I had stumbled onto this paper, and it entirely changed my perspective. Eventually, I built an internal library based on the paper with a d3-based tree visualizer.
If memory serves, my primary takeaway was that it isn't a good idea to target "customer retention" offers at website visitors based on their probability of purchase, because a fraction of them would have purchased regardless of the discount. Of course, loyal customers should be rewarded in other ways, e.g., loyalty points; this discussion is strictly about customer retention campaigns. In model-building terms this translates to: features that predict the probability of purchase don't necessarily predict the difference in the probability of purchase for the same person with and without an offer. The second quantity is the one we want. It's challenging to get at because, in your data, any given person either received an offer or didn't, so you never observe both outcomes for them and can't directly model this difference, or at least not in a statistically significant way. This paper provides a way to do so.
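To make the distinction concrete, here is a toy simulation. It is not the paper's method, just a plain difference in conversion rates between randomly treated and untreated visitors within each segment. The segment names, the purchase probabilities, and the 50/50 random assignment are all made-up assumptions for illustration.

```python
import random


def simulate(n=20000, seed=0):
    """Simulate a randomized offer experiment (all numbers are hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(["loyal", "wavering"])
        treated = rng.random() < 0.5  # offer assigned at random
        if segment == "loyal":
            p = 0.80  # assumed: buys anyway, the offer changes nothing
        else:
            p = 0.40 if treated else 0.20  # assumed: the offer doubles purchase odds
        bought = rng.random() < p
        rows.append((segment, treated, bought))
    return rows


def rate(rows, segment, treated):
    """Observed purchase rate for one segment in one experiment arm."""
    hits = [b for s, t, b in rows if s == segment and t == treated]
    return sum(hits) / len(hits)


def uplift(rows, segment):
    """Treated rate minus untreated rate: the quantity we actually care about."""
    return rate(rows, segment, True) - rate(rows, segment, False)


rows = simulate()
for seg in ("loyal", "wavering"):
    print(seg, "baseline:", round(rate(rows, seg, False), 3),
          "uplift:", round(uplift(rows, seg), 3))
```

In this setup the "loyal" segment has the highest purchase probability but roughly zero uplift, so ranking visitors by probability of purchase would spend the discount exactly where it does the least. Note that the difference is only estimable per group here, never per person, which is the gap the paper addresses.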
Great read. A little verbose for my taste. But lots of good ideas.