Mediation analysis with duration dataΒΆ
This notebook demonstrates mediation analysis when the mediator and outcome are duration variables, modeled using proportional hazards regression. These examples are based on simulated data.
[1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.mediation import Mediation
Make the notebook reproducible.
[2]:
np.random.seed(3424)
Specify a sample size.
[3]:
n = 1000
Generate an exposure variable.
[4]:
exp = np.random.normal(size=n)
Generate a mediator variable.
[5]:
def gen_mediator():
mn = np.exp(exp)
mtime0 = -mn * np.log(np.random.uniform(size=n))
ctime = -2 * mn * np.log(np.random.uniform(size=n))
mstatus = (ctime >= mtime0).astype(np.int)
mtime = np.where(mtime0 <= ctime, mtime0, ctime)
return mtime0, mtime, mstatus
Generate an outcome variable.
[6]:
def gen_outcome(otype, mtime0):
if otype == "full":
lp = 0.5*mtime0
elif otype == "no":
lp = exp
else:
lp = exp + mtime0
mn = np.exp(-lp)
ytime0 = -mn * np.log(np.random.uniform(size=n))
ctime = -2 * mn * np.log(np.random.uniform(size=n))
ystatus = (ctime >= ytime0).astype(np.int)
ytime = np.where(ytime0 <= ctime, ytime0, ctime)
return ytime, ystatus
Build a dataframe containing all the relevant variables.
[7]:
def build_df(ytime, ystatus, mtime0, mtime, mstatus):
df = pd.DataFrame({"ytime": ytime, "ystatus": ystatus,
"mtime": mtime, "mstatus": mstatus,
"exp": exp})
return df
Run the full simulation and analysis, under a particular population structure of mediation.
[8]:
def run(otype):
mtime0, mtime, mstatus = gen_mediator()
ytime, ystatus = gen_outcome(otype, mtime0)
df = build_df(ytime, ystatus, mtime0, mtime, mstatus)
outcome_model = sm.PHReg.from_formula("ytime ~ exp + mtime", status="ystatus", data=df)
mediator_model = sm.PHReg.from_formula("mtime ~ exp", status="mstatus", data=df)
med = Mediation(outcome_model, mediator_model, "exp", "mtime",
outcome_predict_kwargs={"pred_only": True})
med_result = med.fit(n_rep=20)
print(med_result.summary())
Run the example with full mediation
[9]:
run("full")
<ipython-input-5-b2c8b114323e>:5: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
mstatus = (ctime >= mtime0).astype(np.int)
<ipython-input-6-214ee2853d67>:11: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ystatus = (ctime >= ytime0).astype(np.int)
Estimate Lower CI bound Upper CI bound P-value
ACME (control) 0.742427 0.643339 0.862745 0.0
ACME (treated) 0.742427 0.643339 0.862745 0.0
ADE (control) 0.073017 -0.016189 0.155321 0.1
ADE (treated) 0.073017 -0.016189 0.155321 0.1
Total effect 0.815444 0.675214 0.919580 0.0
Prop. mediated (control) 0.912695 0.814965 1.025747 0.0
Prop. mediated (treated) 0.912695 0.814965 1.025747 0.0
ACME (average) 0.742427 0.643339 0.862745 0.0
ADE (average) 0.073017 -0.016189 0.155321 0.1
Prop. mediated (average) 0.912695 0.814965 1.025747 0.0
Run the example with partial mediation
[10]:
run("partial")
<ipython-input-5-b2c8b114323e>:5: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
mstatus = (ctime >= mtime0).astype(np.int)
<ipython-input-6-214ee2853d67>:11: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ystatus = (ctime >= ytime0).astype(np.int)
Estimate Lower CI bound Upper CI bound P-value
ACME (control) 0.987067 0.801560 1.192019 0.0
ACME (treated) 0.987067 0.801560 1.192019 0.0
ADE (control) 1.071734 0.964214 1.150352 0.0
ADE (treated) 1.071734 0.964214 1.150352 0.0
Total effect 2.058801 1.862231 2.288170 0.0
Prop. mediated (control) 0.481807 0.417501 0.533773 0.0
Prop. mediated (treated) 0.481807 0.417501 0.533773 0.0
ACME (average) 0.987067 0.801560 1.192019 0.0
ADE (average) 1.071734 0.964214 1.150352 0.0
Prop. mediated (average) 0.481807 0.417501 0.533773 0.0
Run the example with no mediation
[11]:
run("no")
<ipython-input-5-b2c8b114323e>:5: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
mstatus = (ctime >= mtime0).astype(np.int)
<ipython-input-6-214ee2853d67>:11: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ystatus = (ctime >= ytime0).astype(np.int)
Estimate Lower CI bound Upper CI bound P-value
ACME (control) 0.010200 -0.039434 0.065176 1.0
ACME (treated) 0.010200 -0.039434 0.065176 1.0
ADE (control) 0.902295 0.824526 0.984934 0.0
ADE (treated) 0.902295 0.824526 0.984934 0.0
Total effect 0.912495 0.834728 1.009958 0.0
Prop. mediated (control) 0.003763 -0.044186 0.065520 1.0
Prop. mediated (treated) 0.003763 -0.044186 0.065520 1.0
ACME (average) 0.010200 -0.039434 0.065176 1.0
ADE (average) 0.902295 0.824526 0.984934 0.0
Prop. mediated (average) 0.003763 -0.044186 0.065520 1.0