shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
library(tidyverse);
library(lubridate);
library(scales);
library(magrittr);
library(dplyr);
})
Task: T413020
The Reader Experience team developed a Reading List feature for logged-in readers on desktop and mobile web (Hypothesis FY25-26 WE3.3.4). An A/B test was conducted on the XLab experiment platform to evaluate its impact on internal referrals and to measure feature usage.
The A/B test ran on logged-in users on desktop and mobile web. It was enabled in tiers, starting with 5 pilot wikis on November 12 and followed by English Wikipedia on November 19.
Logged-in users were eligible for the test if they met all of the following criteria: (1) active in the last 3 months, (2) zero edits, (3) zero watchlist items (except for their user page and user talk page), and (4) zero existing Reading List tables. A randomly selected half of the eligible users were assigned to the treatment group, where they saw the Reading List feature. The remaining half were assigned to the control group, where they experienced the current interface.
We tracked pageviews with referral information (internal or external) and reviewed the internal referral rate on a per-user basis for all tested users, as well as for the subgroup of users who engaged with the Reading List feature. We also tracked click events on the Reading List feature to understand overall feature usage.
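As a minimal sketch of the per-user metric described above, the snippet below assumes the internal referral rate is the share of a user's pageviews that arrived via an internal referrer. The `pageviews` data frame and its columns (`subject_id`, `variation`, `referrer_type`) are hypothetical stand-ins, not the actual instrument schema.

```r
# Toy pageview log; column names are illustrative assumptions, not the real schema.
pageviews <- data.frame(
  subject_id    = c("u1", "u1", "u1", "u1", "u2", "u2", "u3"),
  variation     = c("control", "control", "control", "control",
                    "treatment", "treatment", "treatment"),
  referrer_type = c("internal", "external", "internal", "internal",
                    "external", "internal", "external"),
  stringsAsFactors = FALSE
)

# Flag internally referred pageviews as 1 and all others as 0.
pageviews$is_internal <- as.numeric(pageviews$referrer_type == "internal")

# Averaging the flag per user gives that user's internal referral rate.
per_user <- aggregate(is_internal ~ subject_id + variation,
                      data = pageviews, FUN = mean)
names(per_user)[names(per_user) == "is_internal"] <- "internal_ref_rate"

print(per_user)
```

Each row of `per_user` is one user, so group-level means weight every user equally regardless of how many pages they viewed.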
We have published a report on Leading Indicators based on data collected on desktop from November 12 through December 9, 2025. This analysis focuses on data collected on both desktop and mobile web from December 19, 2025, through January 18, 2026.
User engagement
Internal referral rate (primary)
Retention rate (guardrail)
library(relax) # for xlab stats report
# For summary tables
library(gt)
library(gtsummary)
library(IRdisplay)
library(htmltools)
df_internal_ref_1 <-
read.csv(
file = 'Data_out/internal_referral_1st_tier.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
df_internal_ref_2 <-
read.csv(
file = 'Data_out/internal_referral_2nd_tier.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
df_internal_ref_all <- bind_rows(df_internal_ref_1, df_internal_ref_2)
df_stats_internal_ref_all <- df_internal_ref_all %>%
select(variation, internal_ref_rate) %>%
rename(outcome=internal_ref_rate) %>%
calculate_metric_stats(metric_type = "mean")
df <- df_stats_internal_ref_all %>%
as.data.frame() %>%
rownames_to_column("group")
display_html(
as_raw_html(
df %>%
gt()%>%
tab_header(
title = md("Internal Referral Rates of the Control and Treatment Groups<br>Statistical Summary")
) %>%
cols_label(
group = "Experiment group",
sample_size = "Sample Size",
sample_mean = "Sample Mean",
sample_variance = "Sample Variance"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(group, sample_size, sample_mean,sample_variance ))
) %>%
fmt_percent(
columns = c("sample_mean", "sample_variance"),
decimals = 2
) %>%
fmt_number(
columns = "sample_size",
decimals = 0
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(150)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Internal Referral Rates of the Control and Treatment Groups Statistical Summary |
|||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
df_lift_internal_ref_all <- df_internal_ref_all %>%
select(variation, internal_ref_rate) %>%
rename(outcome=internal_ref_rate) %>%
analyze_relative_lift(metric_type = "mean")
display_html(
as_raw_html(
df_lift_internal_ref_all %>%
gt()%>%
tab_header(
title = md("Internal Referral Rates of the Control and Treatment Groups<br>Impact Estimation")
) %>%
tab_spanner(
label = "Bayes",
columns = c(estimate_bayes, chance_to_win, cred_lower, cred_upper)
) %>%
tab_spanner(
label = "Frequentist",
id = "Frequency", # keep the original id so cells_column_spanners() below still matches
columns = c(estimate_freq, p_value, conf_lower, conf_upper)
) %>%
cols_label(
estimate_bayes = "Change (Bayes)",
chance_to_win = "Chance To Win",
cred_lower = "95% CI Lower (Bayes)",
cred_upper = "95% CI Upper (Bayes)",
estimate_freq = "Change (Freq)",
# p_value =
conf_lower = "95% CI Lower (Freq)",
conf_upper = "95% CI Upper (Freq)",
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_spanners(c("Bayes", "Frequency"))
) %>%
fmt_percent(
columns = everything(),
decimals = 2
) %>%
# opt_stylize(6) %>%
cols_width(everything() ~ px(150)) %>%
tab_source_note(
source_note = md(
"Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Internal Referral Rates of the Control and Treatment Groups Impact Estimation |
|||||||
| Change (Bayes) | Chance To Win | 95% CI Lower (Bayes) | 95% CI Upper (Bayes) | Change (Freq) | p_value | 95% CI Lower (Freq) | 95% CI Upper (Freq) |
|---|---|---|---|---|---|---|---|
| Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
|||||||
Internal referral rates were similar for both the control and treatment groups, around 29%. Overall, we did not observe a statistically significant increase in the treatment group. This may be due to the low engagement with the Reading List among treatment users.
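To make the "Change" columns above more concrete, here is a rough sketch of a relative lift between two group means, paired with a frequentist check (Welch's t-test). It uses simulated rates and base R only. It is not the `relax` implementation behind `analyze_relative_lift()`, whose internals are not shown in this notebook.

```r
set.seed(42)

# Simulated per-user internal referral rates (illustrative values only);
# both groups are drawn from the same distribution, i.e. no true effect.
control   <- rbeta(5000, shape1 = 2, shape2 = 5)
treatment <- rbeta(5000, shape1 = 2, shape2 = 5)

# Relative lift: difference in means expressed as a share of the control mean.
relative_lift <- (mean(treatment) - mean(control)) / mean(control)

# Welch's two-sample t-test on the difference in means.
welch <- t.test(treatment, control)

cat(sprintf("relative lift: %.2f%%, p-value: %.3f\n",
            100 * relative_lift, welch$p.value))
```

A confidence interval for the lift that spans zero is what "no statistically significant increase" corresponds to in this framing.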
df_internal_ref_feature_user_1 <-
read.csv(
file = 'Data_out/internal_referral_feature_user_1st_tier.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
df_internal_ref_feature_user_2 <-
read.csv(
file = 'Data_out/internal_referral_feature_user_2nd_tier.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
df_internal_ref_feature_user_all <- bind_rows(df_internal_ref_feature_user_1, df_internal_ref_feature_user_2)
df_stats_internal_ref_feature_user <- df_internal_ref_feature_user_all %>%
summarize(
sample_size = n(),
sample_mean = mean(internal_ref_rate),
sample_variance = var(internal_ref_rate)
)
display_html(
as_raw_html(
df_stats_internal_ref_feature_user %>%
gt()%>%
tab_header(
title = md("Internal Referral Rates of Users Who Used the Feature<br>Statistical Summary")
) %>%
cols_label(
sample_size = "Sample Size",
sample_mean = "Sample Mean",
sample_variance = "Sample Variance"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(sample_size, sample_mean,sample_variance ))
) %>%
fmt_percent(
columns = c("sample_mean", "sample_variance"),
decimals = 2
) %>%
fmt_number(
columns = "sample_size",
decimals = 0
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Internal Referral Rates of Users Who Used the Feature Statistical Summary |
||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
During the analysis timeframe, 148 users used the Reading List, representing 2.5% of users in the treatment group (148/5929).
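The share quoted above can be reproduced directly from the two counts given in the text:

```r
feature_users   <- 148   # users who used the Reading List
treatment_users <- 5929  # users in the treatment group

adoption_rate <- feature_users / treatment_users
cat(sprintf("%.1f%%\n", 100 * adoption_rate))  # prints "2.5%"
```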
df_clicks_per_user <-
read.csv(
file = 'Data_out/feature_clicks_per_user.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
display_html(
as_raw_html(
df_clicks_per_user %>%
filter(action_subtype=='save_article_to_reading_list')%>%
summarize(
total_saves=sum(clicks),
total_users=n_distinct(subject_id),
actions_per_users=total_saves/total_users
)%>%
gt()%>%
tab_header(
title = md("Save-to-Reading-List Actions")
) %>%
cols_label(
total_saves = "Number of save actions",
total_users = "Number of users",
actions_per_users = "Average saves per user"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(total_saves, total_users, actions_per_users ))
) %>%
fmt_number(
columns = "actions_per_users",
decimals = 2
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Save-to-Reading-List Actions | ||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
display_html(
as_raw_html(
df_clicks_per_user %>%
filter(action_subtype=='save_article_to_reading_list')%>%
group_by(action_source) %>%
summarize(
total_saves=sum(clicks),
) %>%
mutate(
pct_of_total_saves = total_saves / sum(total_saves)
) %>%
select( action_source, pct_of_total_saves) %>%
gt()%>%
tab_header(
title = md("Where Users Click to Save Articles to the Reading List")
) %>%
cols_label(
pct_of_total_saves = "Percentage of total save actions",
action_source="Action source"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(action_source, pct_of_total_saves ))
) %>%
fmt_percent(
columns = c("pct_of_total_saves"),
decimals = 2
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Where Users Click to Save Articles to the Reading List | |
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
The majority of save actions (89.17%) came from the toolbar. The remaining 10.83% came from the sticky header. We did not observe any save actions from the tool menu.
df_article_cnt_in_rl <-
read.csv(
file = 'Data_out/cnt_articles.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
percentiles <- data.frame(
percentile = c(0, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 1),
article_count = as.numeric(
quantile(
df_article_cnt_in_rl$article_count,
probs = c(0, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 1)
)
)
)
percentiles <- data.frame(
metric = c("min", "P25", "P50", "P75", "P90", "P95", "P99", "max", "mean"),
article_count = c(
as.numeric(
quantile(
df_article_cnt_in_rl$article_count,
probs = c(0, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 1),
na.rm = TRUE
)
),
mean(df_article_cnt_in_rl$article_count, na.rm = TRUE)
)
)
display_html(
as_raw_html(
percentiles %>%
gt()%>%
tab_header(
title = md("Number of Articles in the Reading List")
) %>%
cols_label(
metric = "Metric",
article_count = "# Articles"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(metric,article_count ))
) %>%
fmt_number(
columns = "article_count",
decimals = 0
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(180)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
User group: users who used the reading list feature <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Number of Articles in the Reading List | |
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 User group: users who used the reading list feature Platform: desktop and mobile web |
The majority of users had very small reading lists: the median (P50) was 1 article, and 75% of users had 2 or fewer articles.
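A pattern like this, with a low median and a low 75th percentile, typically comes with a long right tail that pulls the mean upward. The sketch below uses invented counts (not the experiment data) to show how `quantile()` and `mean()` diverge on such a distribution:

```r
# Invented reading-list sizes with one heavy user; not the experiment data.
article_count <- c(1, 1, 1, 1, 1, 1, 1, 2, 2, 5, 40)

q <- quantile(article_count, probs = c(0.5, 0.75))
avg <- mean(article_count)

cat(sprintf("P50: %g, P75: %g, mean: %.2f\n", q[["50%"]], q[["75%"]], avg))
```

This is why the percentile table is more informative than the mean alone when describing typical list size.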
df_ctr <-
read.csv(
file = 'Data_out/feature_ctr.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
display_html(
as_raw_html(
df_ctr %>%
group_by(variation) %>%
summarize(
users = n(),
non_zero_users = sum(outcome != 0, na.rm = TRUE),
ctr_mean = mean(outcome, na.rm = TRUE)
) %>%
gt()%>%
tab_header(
title = md("Click-through Rate")
) %>%
cols_label(
variation = "Experiment group",
users = "Impressed users",
non_zero_users= "Clicked users" ,
ctr_mean = "CTR"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(variation,users,non_zero_users,ctr_mean ))
) %>%
fmt_percent(
columns = c("ctr_mean"),
decimals = 2
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Click-through Rate | |||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
0.88% of users in the treatment group clicked the save_article icon during the analysis period.
df_ctr_events <-
read.csv(
file = 'Data_out/feature_ctr_events.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
display_html(
as_raw_html(
df_ctr_events %>%
summarize(
total_clicks=sum(clicks),
total_impressions=sum(impressions)
) %>%
mutate(
ctr=total_clicks/total_impressions
) %>%
gt()%>%
tab_header(
title = md("Click-through Rate")
) %>%
cols_label(
total_impressions = "Impressions",
total_clicks= "Clicks" ,
ctr = "CTR"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(total_clicks,total_impressions,ctr ))
) %>%
fmt_percent(
columns = c("ctr"),
decimals = 3
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Click-through Rate | ||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
0.042% of all impressions from users in the treatment group led to saving an article to the reading list.
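The two click-through figures above use different denominators: one is a share of impressed users, the other a share of impressions. A small sketch with made-up per-user counts (the `toy` data frame and its columns are hypothetical) shows how far the two can diverge:

```r
# Hypothetical per-user counts of icon impressions and save clicks.
toy <- data.frame(
  subject_id  = c("u1", "u2", "u3", "u4"),
  impressions = c(100, 50, 30, 20),
  clicks      = c(0, 0, 2, 0)
)

# User-level: share of impressed users with at least one click.
user_ctr <- mean(toy$clicks > 0)

# Event-level: total clicks divided by total impressions.
event_ctr <- sum(toy$clicks) / sum(toy$impressions)

cat(sprintf("user-level: %.1f%%, event-level: %.1f%%\n",
            100 * user_ctr, 100 * event_ctr))
# prints "user-level: 25.0%, event-level: 1.0%"
```

Because the save icon is shown on every pageview, the event-level rate is expected to be much lower than the user-level rate.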
display_html(
as_raw_html(
df_clicks_per_user %>%
filter(action_subtype=='remove_article_from_reading_list')%>%
summarize(
total_saves=sum(clicks),
total_users=n_distinct(subject_id)
) %>%
#sanitizing per data publication guidelines
mutate(
total_saves = ifelse(total_saves < 100, "<100", total_saves),
total_users = ifelse(total_users < 25, "<25", total_users)
) %>%
gt()%>%
tab_header(
title = md("Reading List Article Removals")
) %>%
cols_label(
total_saves = "Total removals", # this column counts remove actions despite its name
total_users = "Total users" ,
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(total_saves,total_users ))
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Reading List Article Removals | |
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
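The small-count masking used in the cell above recurs in several cells of this notebook. One way to factor it out is a helper like the hypothetical `mask_small_count()` below. The thresholds in the usage lines (100 for event counts, 25 for users) mirror the ones used here, but the function itself is an illustration and not part of the original analysis.

```r
# Replace counts below a threshold with a "<threshold" label.
# Returns a character vector so masked and unmasked values share one type.
mask_small_count <- function(x, threshold) {
  ifelse(x < threshold, paste0("<", threshold), as.character(x))
}

mask_small_count(c(12, 250), threshold = 100)  # "<100" "250"
mask_small_count(18, threshold = 25)           # "<25"
```

Inside a `mutate()` call this would read, for example, `total_users = mask_small_count(total_users, 25)`.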
display_html(
as_raw_html(
df_clicks_per_user %>%
filter(action_subtype=='view_reading_list')%>%
summarize(
total_clicks=sum(clicks),
total_users=n_distinct(subject_id)
) %>%
#sanitizing per data publication guidelines
mutate(
total_clicks = ifelse(total_clicks < 200, "~200", total_clicks),
total_users = ifelse(total_users < 25, "<25", total_users)
) %>%
gt()%>%
tab_header(
title = md("Reading List Views")
) %>%
cols_label(
total_clicks = "Total reading list views",
total_users = "Total users" ,
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(total_clicks,total_users ))
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Reading List Views | |
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
display_html(
as_raw_html(
df_clicks_per_user %>%
filter(action_subtype=='view_reading_list')%>%
group_by(action_source) %>%
summarize(
total_clicks=sum(clicks)
) %>%
mutate(
pct_of_total_clicks = total_clicks / sum( total_clicks)
) %>%
select( action_source,pct_of_total_clicks) %>%
gt()%>%
tab_header(
title = md("Reading List Views")
) %>%
cols_label(
action_source ="Action source" ,
pct_of_total_clicks = "Percentage of total reading list views"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(action_source,pct_of_total_clicks ))
) %>%
fmt_percent(
columns = c("pct_of_total_clicks"),
decimals = 2
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Reading List Views | |
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
Most Reading List views were initiated from the top-right entry point, accounting for 77.65% of total views.
display_html(
as_raw_html(
df_clicks_per_user %>%
filter(action_subtype=='view_article')%>%
summarize(
total_views=sum(clicks),
total_users=n_distinct(subject_id)
) %>%
#sanitizing per data publication guidelines
mutate(
total_views = ifelse(total_views < 100, "<100", total_views),
total_users = ifelse(total_users < 25, "<25", total_users)
) %>%
gt()%>%
tab_header(
title = md("Reading List Article Views")
) %>%
cols_label(
total_views = "Total article views",
total_users = "Total users" ,
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(total_views,total_users ))
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(200)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Reading List Article Views | |
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
df_retention_r1r1_d_m <-
read.csv(
file = 'Data_out/retention_r1r1_d_m.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
df_stats_retention_r1r1_d_m <- df_retention_r1r1_d_m %>%
select(variation, outcome) %>%
calculate_metric_stats(metric_type = "proportion")
display_html(
as_raw_html(
df_stats_retention_r1r1_d_m %>%
as.data.frame() %>%
rownames_to_column("group") %>%
gt()%>%
tab_header(
title = md("Second-Day Retention Rate<br>Statistical Summary")
) %>%
cols_label(
group = "Experiment group",
sample_size = "Sample Size",
sample_mean = "Sample Mean",
sample_variance = "Sample Variance"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(group, sample_size, sample_mean,sample_variance ))
) %>%
fmt_percent(
columns = c("sample_mean", "sample_variance"),
decimals = 2
) %>%
fmt_number(
columns = "sample_size",
decimals = 0
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(150)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Second-Day Retention Rate Statistical Summary |
|||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
display_html(
as_raw_html(
df_retention_r1r1_d_m %>%
select(variation, outcome) %>%
analyze_relative_lift(metric_type = "proportion") %>%
gt()%>%
tab_header(
title = md("Second-Day Retention Rate<br>Impact Estimation")
) %>%
tab_spanner(
label = "Bayes",
columns = c(estimate_bayes, chance_to_win, cred_lower, cred_upper)
) %>%
tab_spanner(
label = "Frequentist",
id = "Frequency", # keep the original id so cells_column_spanners() below still matches
columns = c(estimate_freq, p_value, conf_lower, conf_upper)
) %>%
cols_label(
estimate_bayes = "Change (Bayes)",
chance_to_win = "Chance To Win",
cred_lower = "95% CI Lower (Bayes)",
cred_upper = "95% CI Upper (Bayes)",
estimate_freq = "Change (Freq)",
# p_value =
conf_lower = "95% CI Lower (Freq)",
conf_upper = "95% CI Upper (Freq)",
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_spanners(c("Bayes", "Frequency"))
) %>%
fmt_percent(
columns = everything(),
decimals = 2
) %>%
# opt_stylize(6) %>%
cols_width(everything() ~ px(150)) %>%
tab_source_note(
source_note = md(
"Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Second-Day Retention Rate Impact Estimation |
|||||||
| Change (Bayes) | Chance To Win | 95% CI Lower (Bayes) | 95% CI Upper (Bayes) | Change (Freq) | p_value | 95% CI Lower (Freq) | 95% CI Upper (Freq) |
|---|---|---|---|---|---|---|---|
| Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
|||||||
df_retention_r7r7_d_m <-
read.csv(
file = 'Data_out/retention_r7r7_d_m.tsv',
header = TRUE,
sep = "\t",
stringsAsFactors = FALSE
)
df_stats_retention_r7r7_d_m <- df_retention_r7r7_d_m %>%
select(variation, outcome) %>%
calculate_metric_stats(metric_type = "proportion")
display_html(
as_raw_html(
df_stats_retention_r7r7_d_m %>%
as.data.frame() %>%
rownames_to_column("group") %>%
gt()%>%
tab_header(
title = md("Second-Week Retention Rate<br>Statistical Summary")
) %>%
cols_label(
group = "Experiment group",
sample_size = "Sample Size",
sample_mean = "Sample Mean",
sample_variance = "Sample Variance"
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_labels(columns = c(group, sample_size, sample_mean,sample_variance ))
) %>%
fmt_percent(
columns = c("sample_mean", "sample_variance"),
decimals = 2
) %>%
fmt_number(
columns = "sample_size",
decimals = 0
) %>%
opt_stylize(6) %>%
cols_width(everything() ~ px(150)) %>%
tab_source_note(
source_note = md(
"Data source: schema mediawiki_product_metrics_reading_list <br>
Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web ")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Second-Week Retention Rate Statistical Summary |
|||
| Data source: schema mediawiki_product_metrics_reading_list Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
display_html(
as_raw_html(
df_retention_r7r7_d_m %>%
select(variation, outcome) %>%
analyze_relative_lift(metric_type = "proportion") %>%
gt()%>%
tab_header(
title = md("Second-Week Retention Rate<br>Impact Estimation")
) %>%
tab_spanner(
label = "Bayes",
columns = c(estimate_bayes, chance_to_win, cred_lower, cred_upper)
) %>%
tab_spanner(
label = "Frequentist",
id = "Frequency", # keep the original id so cells_column_spanners() below still matches
columns = c(estimate_freq, p_value, conf_lower, conf_upper)
) %>%
cols_label(
estimate_bayes = "Change (Bayes)",
chance_to_win = "Chance To Win",
cred_lower = "95% CI Lower (Bayes)",
cred_upper = "95% CI Upper (Bayes)",
estimate_freq = "Change (Freq)",
# p_value =
conf_lower = "95% CI Lower (Freq)",
conf_upper = "95% CI Upper (Freq)",
) %>%
tab_style(
style = cell_text(weight = "bold"),
locations = cells_column_spanners(c("Bayes", "Frequency"))
) %>%
fmt_percent(
columns = everything(),
decimals = 2
) %>%
# opt_stylize(6) %>%
cols_width(everything() ~ px(150)) %>%
tab_source_note(
source_note = md(
"Timeframe: December 19, 2025 - January 18, 2026 <br>
Platform: desktop and mobile web")
)%>%
tab_style(
style = cell_text(align = "left"),
locations = cells_source_notes()
)
)
)
| Second-Week Retention Rate Impact Estimation |
|||||||
| Change (Bayes) | Chance To Win | 95% CI Lower (Bayes) | 95% CI Upper (Bayes) | Change (Freq) | p_value | 95% CI Lower (Freq) | 95% CI Upper (Freq) |
|---|---|---|---|---|---|---|---|
| Timeframe: December 19, 2025 - January 18, 2026 Platform: desktop and mobile web |
|||||||
Retention rates across desktop and mobile web were similar for the control and treatment groups: approximately 22% for second-day retention and 53% for second-week retention. Overall, we did not observe a statistically significant increase in the treatment group. This may be due to the low engagement with the Reading List among treatment users.
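As a rough illustration of the frequentist side of such a comparison, the sketch below runs a two-sample test of proportions on invented counts chosen to sit near a 22% retention rate. The counts are made up for illustration, and the test is base R's `prop.test()`, not the `relax` routine used in the cells above.

```r
# Invented counts: retained users out of all users per group.
retained <- c(control = 1300, treatment = 1320)
users    <- c(control = 5900, treatment = 5930)

# Two-sample test for equality of proportions (with continuity correction).
res <- prop.test(x = retained, n = users)

rates <- retained / users
relative_lift <- (rates[["treatment"]] - rates[["control"]]) / rates[["control"]]

cat(sprintf("control: %.1f%%, treatment: %.1f%%, lift: %.2f%%, p-value: %.2f\n",
            100 * rates[["control"]], 100 * rates[["treatment"]],
            100 * relative_lift, res$p.value))
```

For a guardrail metric, the quantity of interest is usually whether the lower bound of the interval rules out a meaningful decrease, rather than whether an increase is detected.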