We have a platform for selling cars. Sellers find more potential buyers by purchasing paid services (PS) to promote their ad. After using these paid services, ads are in the top and buyers see them more often.
The monetization team often makes changes to the mechanics of work and the control interface. Each such change goes through an AB test. The metric the platform most often track is ARPU = total revenue/total number of users.
We have the results of three unrelated AB tests.
experiment_num - experiment number (1, 2, 3)
experiment_group - the group the user is in (test, control)
user_id - user id
revenue - revenue generated by the user by purchasing a paid promotion service
Objective
Calculate ARPU and p-value for each experiment and identify which mechanics should be scaled.