Chapter 7: Cohort Studies

OBJECTIVES

After completing this module, you should be able to:

Describe two main types of cohort studies.

Describe how to select cohort study participants using population-based samples and exposure-based samples.

Discuss potential sources of bias in cohort studies.

Discuss methods to reduce potential biases in cohort studies.

 

7.1 Introduction

“Cohort” refers to a group of subjects or persons who share a defining characteristic. In epidemiology, cohort studies are longitudinal studies that compare the risk of the outcome of interest between the exposed and non-exposed groups. The “exposure” in this regard can come from outside the body, such as contact with chemicals or bacteria, or from inside the body, such as atherosclerosis, high blood pressure, certain genes, certain mental illnesses, etc. Cohort studies are observational studies (non-interventional studies), and the exposed vs. non-exposed status occurs “naturally” (i.e., not as the result of random allocation, as found in randomized controlled trials) (Gordis, 2014; Rothman, 2002).

 

7.2 Designs and Types of Cohort Studies

Cohort studies can be categorized by the study design as well as by the study population. The most basic type of a cohort study is a study where the investigators select participants from a “healthy” defined population who are vulnerable to (i.e., at risk of) the outcome (disease) of interest, measure the exposure status of the participants, follow up on the participants in the future, and count the number of participants with the outcome in the exposed vs. the non-exposed groups. The investigators then compare the risk (also known as “incidence”) of the outcome between the exposed vs. the non-exposed group and calculate the measure of association known as the risk ratio. For example, an investigator may define a group of young coal miners who have just been recruited to work in the mine and measure the exposures of interest for lung disease (such as exposure to PM2.5 dust particles) at the beginning of employment, then return to the mine at some time in the future (for example, a year from the beginning of employment) and measure the presence (or absence) of lung diseases among the same group of miners. Such studies are called “prospective cohort studies” because the study begins at an initial point and then follows up on the participants in the future (i.e., prospectively).
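As a worked illustration of the comparison described above, the following R sketch computes the risk in each group and the risk ratio. The counts are hypothetical and were chosen only to keep the arithmetic simple; they do not come from any study cited in this chapter.

```r
# Hypothetical counts for illustration only (not from a real cohort of miners)
exposed_cases   <- 30    # exposed miners who developed lung disease
exposed_total   <- 200   # all exposed miners followed up
unexposed_cases <- 15    # non-exposed miners who developed lung disease
unexposed_total <- 300   # all non-exposed miners followed up

# Risk = number of new cases / number of at-risk participants followed up
risk_exposed   <- exposed_cases / exposed_total
risk_unexposed <- unexposed_cases / unexposed_total

# Risk ratio = risk among the exposed / risk among the non-exposed
risk_ratio <- risk_exposed / risk_unexposed

print(c(risk_exposed = risk_exposed,
        risk_unexposed = risk_unexposed,
        risk_ratio = risk_ratio))
```

With these made-up numbers, the risk is 15% among the exposed and 5% among the non-exposed, so the exposed group has three times the risk of the non-exposed group.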

Alternatively, a cohort study can begin by following up on study subjects for their outcome status, checking their past records for the history of exposure, and then calculating the risk ratio in a similar manner. For example, an investigator may ascertain the status of lung disease among coal miners today, then look back at their employment and medical examination records from the time they began employment as a miner and measure their level of exposure of interest (such as job in the mine, educational background, level of exposure to PM2.5 dust, weight, height, smoking status, and other aspects of their medical history). Such studies are called “retrospective cohort studies” because the study begins at the end-point and then assesses the exposure status of the participants by reviewing their history of exposure from past records (i.e., retrospectively). The difference between retrospective cohort studies and case-control studies is that in retrospective cohort studies, measurement of exposure happened before the outcome (with data on record). On the other hand, in case-control studies, investigators did not have records of exposure measured in the past, and must rely on the reported history or other measures at the time of study (after the cases had developed the disease).

We can also classify cohort studies by how the exposed vs. non-exposed groups are defined. The examples of cohort studies among miners above have only one ascertainment of exposure status: at the beginning of employment (start of the follow-up), with no change in status afterward. These studies can be referred to as “fixed cohorts” because the exposure status is fixed. If every participant can be followed up (i.e., no loss to follow-up occurs), then this fixed cohort can be further defined as a closed population cohort (also known as a “closed cohort”).

However, a person’s exposure status may change over time. For example, a coal mine may recruit five new miners in 1972 and 10 new miners in 1973. Miners who work in a section or task with a high level of exposure to PM2.5 may become supervisors or coordinators and work mainly in an office with a low level of exposure. Some miners may work for just 2–3 years before the examination of their lung disease status, while others may work for more than 10 years. Cohorts that enroll miners with shifting exposure status and varying lengths of follow-up time are called “open population” or “dynamic population” cohorts (also called “open cohorts” or “dynamic cohorts”).
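When follow-up time varies between participants, a common approach is to count person-time and compare incidence rates rather than risks. The R sketch below shows the idea under simplifying assumptions: the data frame and its column names are hypothetical, and each miner is assigned to a single exposure category, whereas in a real open cohort a miner who changes jobs would contribute person-time to more than one category.

```r
# Hypothetical miners with different lengths of follow-up (illustration only)
miners <- data.frame(
  high_exposure  = c(1, 1, 1, 1, 0, 0, 0, 0),    # 1 = high PM2.5 exposure job
  years_followed = c(2, 3, 10, 5, 12, 10, 3, 15), # person-years contributed
  lung_disease   = c(1, 0, 1, 0, 0, 1, 0, 0)      # 1 = developed lung disease
)

# Total cases and total person-years within each exposure group
cases        <- tapply(miners$lung_disease, miners$high_exposure, sum)
person_years <- tapply(miners$years_followed, miners$high_exposure, sum)

# Incidence rate = cases / person-years; rate ratio compares the two groups
rate       <- cases / person_years
rate_ratio <- rate[["1"]] / rate[["0"]]

print(rate)
print(rate_ratio)
```

In this toy example, the high-exposure group has 2 cases over 20 person-years and the low-exposure group has 1 case over 40 person-years, giving a rate ratio of 4.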

7.2.1 Cohort Identification / Selection of Study Populations

Cohort studies can include participants who are samples of the general population who reside in a defined geographic area. In these studies, special resources for cohort identification and/or follow-up may be available. In addition, cohort studies are also appropriate for studying the effect of rare or distinctive exposures and may be the only appropriate study design to assess the extent to which a rare/distinctive exposure is associated with an outcome of interest.

7.2.2 Measurement of Exposure in a Cohort Study

Issues pertaining to the measurement of exposure in cohort studies are similar to those in cross-sectional studies. When a cohort study participant is enrolled, investigators make various measurements of outcomes and potential determinants. For a given outcome of interest, the participants whose exposure status is measured at baseline should include only those who are free from the outcome but susceptible to developing it. Exposure measurement methods include interviews, self-administered questionnaires, physical measurements, biospecimen collection, etc. The more detail the investigator obtains regarding the frequency, intensity, and duration of the exposure, the more complete and insightful the study findings will be. Disease status at baseline can also be considered an exposure, as individuals with diseases can develop complications, which are the outcomes of interest in studies focusing on secondary prevention.

7.2.3 Follow-Up of Cohort Study Members

In an ideal world, we would be able to measure the exposures and outcomes of all cohort study participants from baseline to the last follow-up visit, and no one in our study would die early from other causes or become unreachable (lost to follow-up). In the real world, however, most cohort studies are conducted for years and are subject to incomplete follow-up or incomplete ascertainment of the outcome status among the exposed and the non-exposed. Thus, there is the possibility of a type of selection bias called differential loss to follow-up. If the proportion of participants with incomplete follow-up is relatively small and is similar between the exposed group and the non-exposed group, then bias is likely to be absent or small (and in the direction toward the null), making the results conservative. However, if the proportion of participants with incomplete follow-up is moderate or large and differs between the exposed group and the non-exposed group, then bias is more likely to be present, affecting the validity of the study findings.
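The effect of loss to follow-up on the risk ratio can be illustrated with simple arithmetic. The R sketch below assumes a hypothetical complete cohort of 1,000 exposed participants (100 cases) and 1,000 non-exposed participants (50 cases), so the true risk ratio is 2. The function name and the loss proportions are invented for illustration.

```r
# Observed risk ratio after removing a given proportion of participants
# from each cell of a hypothetical cohort whose true risk ratio is 2.
observed_rr <- function(loss_exposed_cases, loss_exposed_noncases,
                        loss_unexposed_cases, loss_unexposed_noncases) {
  exp_cases   <- 100 * (1 - loss_exposed_cases)      # exposed, with outcome
  exp_non     <- 900 * (1 - loss_exposed_noncases)   # exposed, without outcome
  unexp_cases <- 50  * (1 - loss_unexposed_cases)    # non-exposed, with outcome
  unexp_non   <- 950 * (1 - loss_unexposed_noncases) # non-exposed, without outcome
  risk_exposed   <- exp_cases / (exp_cases + exp_non)
  risk_unexposed <- unexp_cases / (unexp_cases + unexp_non)
  risk_exposed / risk_unexposed
}

# Scenario 1: 10% of every subgroup is lost (loss unrelated to exposure or outcome)
rr_equal_loss <- observed_rr(0.10, 0.10, 0.10, 0.10)

# Scenario 2: exposed participants who develop the outcome are lost more often (40%)
rr_differential <- observed_rr(0.40, 0.10, 0.10, 0.10)

print(c(equal_loss = rr_equal_loss, differential_loss = rr_differential))
```

In the first scenario the observed risk ratio remains 2, whereas in the second it falls to about 1.4. Other patterns of differential loss could instead push the estimate away from the null.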

There are several ways to manage these issues. In many studies, it is necessary to monitor each participant’s location or whereabouts regularly. For example, investigators may send questionnaires to participants’ homes every 1 to 2 years and ask the participants to complete and return the questionnaires. The investigators will then have additional information on the outcome status (or lack thereof), learn whether the participants are still reachable to report the outcome(s) should it occur, and receive updated information on the exposure status.

In addition, some cohort studies may require that “everyone meeting the cohort definition must have survived for a specified period” (Rothman, 2008). The required specified period of survival is known as “immortal person-time”. For example, in our hypothetical cohort study on the exposure to PM2.5 and lung disease, if we are interested in the long-term effect of the exposure and we measure PM2.5 exposure on the first day on the job for everyone, the incidence rate of workers who are assigned to more highly exposed jobs, more susceptible to the effect of PM2.5, and quit early because of those effects may be mixed with the incidence rate of workers who work for a longer period, such as over 5 years, potentially introducing bias to our estimates. If we apply immortal person-time to our analyses and exclude anyone who has worked in the mine for less than five years from the baseline measurement and, instead, measure cumulative exposure to PM2.5 on the first day of the study, then we could exclude the effect of short-term workers and differential loss to follow-up from our study.

7.2.4 Outcome Measurement in a Cohort Study

Cohort studies are suitable for the study of rare exposures (such as the radiation emitted by the atomic bomb) but may not be suitable for studying rare outcomes. A cohort study of a rare outcome may yield no cases or very few cases of disease, so the statistical power is low and we cannot rule out chance as an explanation for the observed findings.

For example, the study of the association between in utero exposure to the estrogen medication diethylstilbestrol (DES) and the occurrence of clear cell adenocarcinoma of the vagina was conducted in 804 girls and women aged 11 to 30 whose mothers used DES during pregnancy. The study found no case of vaginal adenocarcinoma because of the rarity of the cancer. Subsequent studies found that the cumulative incidence of the disease was actually ~1 per 1,000 among women and girls exposed in utero to DES.

Similarly, a study examined the association between occupational exposure to the chemical trichloroethylene (TCE) and the occurrence of esophageal cancer among approximately 10,700 workers (7,000 workers exposed to TCE and 3,700 workers with no known chemical exposure) (Blair et al., 1998). At the end of the follow-up period, there were 10 cases in the exposed group, while there was only 1 case in the non-exposed group. The rate ratio of the association was RR = 5.6 (95% CI = 0.7, 44.5). The association was strong, but the 95% confidence interval was very wide because the number of cases in each group was very small. Thus, statistically, we could not rule out chance as an explanation for the observed findings.
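To see why so few cases produce such a wide interval, we can approximate the confidence interval from the case counts alone. The R sketch below uses the large-sample (Wald) approximation on the log scale, in which the standard error of the log rate ratio is the square root of the sum of the reciprocals of the case counts. This is an assumption made for illustration and is not necessarily the method used by Blair et al. (1998).

```r
# Values reported in the text
rate_ratio      <- 5.6
cases_exposed   <- 10
cases_unexposed <- 1

# Wald approximation: SE of log(rate ratio) depends only on the case counts
se_log_rr <- sqrt(1 / cases_exposed + 1 / cases_unexposed)

# 95% confidence limits, computed on the log scale and back-transformed
z  <- qnorm(0.975)
ci <- exp(log(rate_ratio) + c(-1, 1) * z * se_log_rr)

print(round(se_log_rr, 2))
print(round(ci, 1))
```

The single case in the non-exposed group contributes almost all of the standard error, and the approximate limits (roughly 0.7 to 44) are close to the reported interval.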

Sometimes, we may not be able to measure our outcome of interest and must rely on “surrogate” outcomes; thus, we must adjust our results and interpretation accordingly. For example, in a cohort study in Kenya on the association between male circumcision and new HIV infection, the number of circumcised men was relatively small in the study setting and the follow-up time was relatively short (approximately a year) (Agot et al., 2007). As such, it was difficult to document new HIV infections. In addition, HIV testing during the study period required approximately three months of waiting time after potential exposure to HIV (i.e., the “window period”) before valid test results could be obtained. Otherwise, the test may yield false negative results. The investigators solved the issue by measuring sexual behaviors as the surrogate for new HIV infections. Investigators interviewed participants regarding high-risk sexual behaviors (history of unprotected sexual intercourse, number of sexual partners, number of high-risk sex acts per week). Investigators found that the risky sexual behaviors among circumcised men were lower during the first month after circumcision, then increased to the same level as that in uncircumcised men for the remainder of the year. They stated that the result suggested, “any protective effect of male circumcision on HIV acquisition is unlikely to be offset by an adverse behavioral impact”.

7.2.5 Potential Biases in Cohort Studies

Selection bias: The most common form of selection bias in cohort studies is differential loss to follow-up. If loss to follow-up occurs at different levels in the exposed and non-exposed groups, and the occurrence of disease differs between those who remained in the study and those who were lost to follow-up in each group, then the losses could bias the RR either away from the null or toward the null. Good reports on epidemiological study findings should always discuss the potential biases and the extent to which they distort the observed findings from the truth. The epidemiologist can assess the extent to which selection bias due to differential loss to follow-up existed in the study by conducting additional data analyses to answer the following questions:

What is the extent to which loss to follow-up occurred in the exposed and the non-exposed groups?

Did those who were lost to follow-up differ from those who remained in the study with regard to the predictors of the outcome?

If so, had the loss to follow-up not occurred, would the true incidence in each group be higher or lower than the observed incidence?

How did the loss to follow-up affect the observed risk ratio (RR) or rate ratio (RR)?

Details regarding imputation methods to correct the bias can be found in more advanced textbooks in epidemiology and biostatistics.

Information bias: The most common form of information bias in cohort studies is observer bias. Observer bias occurs when any member of the study staff (the interview nurses, the examining physician, the epidemiologist, or the statistician who analyzes the data) is aware of the exposure status of the participants and has strong preconceptions about the nature of the association between the exposure and the outcome. In such a scenario, the study staff’s care and thoroughness in their examination of the disease status may differ between the exposed and the non-exposed groups, possibly biasing the study results. Ways to reduce potential observer bias include: 1) blinding (i.e., not revealing) to the study staff the participant’s exposure status; and 2) creating a detailed written protocol for data collection and analyses to which all staff are required to adhere.

7.2.6 Comparison between cohort, cross-sectional, and case-control studies

Cohort studies differ from cross-sectional and case-control studies in purpose, study design, advantages, and disadvantages. A summary of the comparison can be found in Table 7.2.6.1 below.

Table 7.2.6.1 Comparison of cohort, cross-sectional, and case-control studies

| Component | Cross-sectional studies | Case-control studies | Cohort studies |
| --- | --- | --- | --- |
| Purpose | Estimate the prevalence of exposures and outcomes of interest | Assess the extent to which an outcome is associated with an exposure | Compare the risk of the outcome between exposed and non-exposed participants |
| Design | Sample participants from members of the population of interest and measure their exposure and outcome status | Select participants with the outcome of interest (preferably those who just developed the outcome, also known as “incident cases”); then select, as the “controls”, members of the base population that gave rise to the cases who have not become cases | Define a group of participants without the outcome of interest who can develop the outcome (i.e., susceptible population), measure their exposure (“baseline” measurement), and measure the incidence (risk) of the outcome of interest in the future after a period of time has passed (“follow-up” measurement) |
| Measuring the exposure prevalence | Can measure the prevalence of multiple exposures simultaneously | Generally not appropriate for measuring the prevalence of exposures of interest in the general population | Can measure the prevalence of the exposure at baseline; can measure the incidence (risk) of the outcome among the susceptible participants |
| Measuring the outcome prevalence | Can measure the prevalence of multiple outcomes simultaneously | Cannot measure the prevalence of the outcome | Can measure the outcome prevalence at follow-up, as well as the incidence (risk) of the outcome |
| Use in the study of rare exposures | Not appropriate if conducted in the general population | Not appropriate | Appropriate |
| Use in the study of rare outcomes | Not appropriate | Appropriate | Appropriate but may incur high costs |
| Common measure(s) of association | Odds ratio (the ratio of the odds of the outcome among the exposed vs. non-exposed, or vice versa) | Odds ratio (the ratio of the odds of the exposure among the cases vs. the controls) | Risk ratio (the ratio of the incidence of the outcome among participants with and without the exposure of interest) |
| Common source(s) of selection bias | Non-response (refusal to participate) | Control selection, resulting in the exposure odds among the controls not reflecting that of the base population that gave rise to the cases | Non-response; loss to follow-up (biased toward the null value if non-differential, biased either toward or away from the null value if differential) |
| Common source(s) of information bias | Social desirability bias; self-serving bias; response acquiescence bias; observer bias | Recall bias; social desirability bias; self-serving bias; response acquiescence bias; observer bias | Social desirability bias; self-serving bias; response acquiescence bias; observer bias |
| Use of the study findings | Monitoring of trends in a population; generate hypotheses for further studies | Identify potential determinants of a rare disease; generate hypotheses for further studies | Measure the risk of disease; as evidence in support of policymaking |

7.3 Examples of Cohort Studies

7.3.1 Framingham Heart Study

An example of a cohort study that includes participants who are samples of the general population in a defined geographic area is the Framingham Heart Study, a long-term, ongoing cohort study of residents of Framingham, Massachusetts. In 1948, the U.S. Congress commissioned the U.S. National Heart, Lung, and Blood Institute, National Institutes of Health, to conduct the study. The study was scheduled to be conducted for 20 years, but in 1968, Congress voted to continue the study. The study population has been divided into five different cohorts (Tsao & Vasan, 2015):

The original cohort included 5,209 persons (including 1,644 husband-wife pairs) who were 30 to 62 years old in 1948, and who were deemed to be healthy (no history of heart attack or stroke).

The offspring cohort was recruited in 1971 and included 5,124 persons who were children of the husband-wife pairs in the original cohort, children of members of the original cohort with coronary disease, and the spouses of these groups of children.

The third generation cohort was recruited in 2002 and included 4,095 adults aged 20 years or older with at least one parent in the offspring cohort.

The OMNI cohort was recruited in 1994 and included 506 participants who were of African American, Hispanic, Asian, Indian, Pacific Islander, and Native American descent.

The second OMNI cohort was recruited in 2003 and included 410 “ethnically and racially diverse adults, some of whom were family members of the original OMNI Cohort” (Tsao & Vasan, 2015).

Over the years, the Framingham Heart Study has expanded its measurements from medical history, questionnaires, physical examination, and serum samples to include genetic, metabolomics, and proteomics data. It is estimated that over 3,000 peer-reviewed scientific papers have been published using data from the Framingham Heart Study. The data from the study are publicly available from the National Heart, Lung, and Blood Institute’s Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC).

7.3.2 Atomic Bomb Survivors Study

An example of a cohort study with rare or distinctive exposure is the cohort study of atomic bomb survivors in Hiroshima and Nagasaki (Kodama et al., 1996; Ozasa et al., 2018). After the end of World War II, U.S. President Harry S. Truman issued a presidential directive to the U.S. National Academy of Sciences-National Research Council in 1946 to conduct investigations into the long-term effects of radiation among the atomic bomb survivors. The directive led to the establishment of the Atomic Bomb Casualty Commission (ABCC), which subsequently arrived in Japan to begin a cohort study of atomic bomb survivors.

The initial ABCC cohort study yielded information on the causes of injury in the population of the bombed cities, as well as the variations in the survival time and long-term health problems according to their distance from the detonation’s hypocenter. The ABCC cohort study continued, albeit with reduced funding, for 30 years until its dissolution in 1975, after which the Radiation Effects Research Foundation (RERF) was established, and the work on the cohort continues today.

The RERF conducted the Life Span Study with a cohort of 120,000 subjects who have been followed since 1950, including 3,600 survivors who were exposed in utero and 77,000 children of the atomic bomb survivors. The Life Span Study estimated atomic bomb radiation exposure based on the location and shielding conditions at the time of the bombing, and outcomes included vital status, cause of death, cancer incidence, and other health effects. Outcome measurements also included the collection of biological specimens. The Life Span Study found that increased radiation exposure was associated with malignant diseases and some non-cancer diseases among those exposed in utero, but there was no increased risk among the children of survivors.

For prospective cohort studies, it is possible to obtain information on current and past exposures at the beginning of the follow-up (“baseline”) period according to our hypothesis. Indeed, this is the prospective study’s main advantage. In prospective cohort studies, exposures can be measured using direct physical measurements (such as the collection of blood samples and other specimens, physical examination, radiography, etc.), interviews, and self-administered questionnaires, among other methods. However, it must be noted that, regarding the measurement of biological specimens, some characteristics reflect very short-term exposure. For example, serum cholesterol changes within hours and urinary drug tests can only measure drug use within the past 2–4 days. Such measurements will not necessarily reflect the study participant’s long-term level of exposure, and repeated measurements may be required.

For retrospective cohort studies, it is not possible to go back in time to the past and personally interview or perform physical examinations on the study participants, but it is still possible to check the participants’ records (such as employment records from employers or medical and pharmacy records for the general population residing in a particular geographic area) and ascertain each participant’s exposure level. Vital records (such as birth certificates) can also be used to ascertain exposure. For example, many countries record a child’s birthweight on the birth certificate; thus, a cohort of low birthweight children vs. normal birthweight children can be identified from their birth certificates (Weiss & Koepsell, 2014).

7.4 Practical Exercise

Note: the data set and supporting file(s) are available at this webpage: https://www.kaggle.com/wichaiditwit/datasets

Assignment: Write code to fill in the results table according to the outlined plan (please use row percentages).

Source: Gabrysch, C., Fritsch, R., Priebe, S., & Mundt, A. P. (2019). Mental disorders and mental health symptoms during imprisonment: A three-year follow-up study. PLOS ONE, 14(3), e0213711. https://doi.org/10.1371/journal.pone.0213711

 

Analysis Plan: Association between depression and risk of suicidality over a three-year follow-up period among inmates in Chilean prisons

Introduction

To describe the course of mental disorders and symptoms in three Chilean prisons over a three-year follow-up period, a group of investigators conducted a cohort study among prisoners who remained or were consecutively incarcerated at the baseline and follow-up periods (2013 and 2016, respectively) (Gabrysch et al., 2019). The study found that the prisoners had a lower prevalence of depression, psychosis, and suicide risk at follow-up than at baseline, as well as better overall mental health. However, as suicide is known to be a complex phenomenon (Favril et al., 2023; Orsolini et al., 2020), it is also possible that prisoners who had depression at baseline were more likely to become a suicide risk at follow-up than those who did not. Thus, this practical exercise aims to describe the association between depression and suicide risk among prisoners in three Chilean prisons.

Objective: To describe the extent to which depression at baseline is associated with suicide risk at follow-up.

Methods

Please refer to the cited literature (Gabrysch et al., 2019).

Study design, study participants, study instrument, data collection, and data management

Please refer to the cited literature (Gabrysch et al., 2019).

Study variables

Exposure (Depression at baseline): Represented with the variable ‘depress_bl’ (1 for having depression, 0 for not having depression)

Outcome (Suicide risk at follow-up): Represented with the variable ‘suiciderisk_fl’ (1 for having a suicide risk, 0 for otherwise)

Confounders: In multivariable regression analyses, we will adjust for sex (variable ‘male’), age (variable ‘age’), suicidality at baseline (variable ‘suicidality_bin’), and depression at follow-up (variable ‘depress_fu’), as these are shown to be predictors of suicide according to the literature (Favril et al., 2023; Orsolini et al., 2020).

Data analysis

We decided to use logistic regression to calculate the odds ratio (OR) (crude and adjusted) with 95% CI instead of log-binomial regression to calculate the risk ratio (RR) (crude and adjusted) with 95% CI, as the latter technique did not yield output when adjusted for multiple covariables.
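Because the analysis reports the OR in place of the RR, it is useful to remember how the two measures relate. The R sketch below computes both from the same hypothetical 2x2 table. The counts are invented and are not taken from the exercise data set.

```r
# Hypothetical counts for illustration only (not from the exercise data)
dep_cases   <- 20    # depressed at baseline, suicide risk at follow-up
dep_total   <- 50    # all participants depressed at baseline
nodep_cases <- 10    # not depressed at baseline, suicide risk at follow-up
nodep_total <- 100   # all participants not depressed at baseline

# Risk ratio: ratio of the proportions with the outcome
risk_ratio <- (dep_cases / dep_total) / (nodep_cases / nodep_total)

# Odds ratio: ratio of the odds (with outcome / without outcome)
odds_dep   <- dep_cases / (dep_total - dep_cases)
odds_nodep <- nodep_cases / (nodep_total - nodep_cases)
odds_ratio <- odds_dep / odds_nodep

print(c(risk_ratio = risk_ratio, odds_ratio = odds_ratio))
```

Here the RR is 4 while the OR is 6. When the outcome is common, the OR lies further from the null than the RR, so an OR from a cohort study should not be read as if it were an RR.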

Draft Results Table

| Depression at baseline | Not suicide risk | Suicide risk | Unadjusted OR (95% CI) | Adjusted OR (95% CI)* |
| --- | --- | --- | --- | --- |
| Did not have depression | | | Reference | Reference |
| Have depression | | | | |

* Adjusted for sex, age, suicidality at baseline, and depression at follow-up.

Solutions to the practical exercise can be found at the end of the chapter after the references.

7.5 Conclusion

The term “cohort” refers to a group of subjects or persons who share a defining characteristic. In epidemiology, cohort studies are longitudinal studies that compare the risk of the outcome of interest between the exposed and non-exposed groups. They can be divided into two main types: prospective cohort studies and retrospective cohort studies.

Cohort study participants may be selected from samples of a general population who reside in a defined geographic area or from individuals with a rare or distinctive exposure, such as survivors of a natural or man-made disaster. Like other observational studies, cohort studies can experience bias. A major source of selection bias in cohort studies is differential loss to follow-up, which can be corrected by imputation of data. A major source of information bias in cohort studies is observer bias, which can be reduced by blinding the observer to the participants’ exposure status and creating a detailed protocol to which research staff strictly adhere.

References

Agot, K. E., Kiarie, J. N., Nguyen, H. Q., Odhiambo, J. O., Onyango, T. M., & Weiss, N. S. (2007). Male circumcision in Siaya and Bondo Districts, Kenya: Prospective cohort study to assess behavioral disinhibition following circumcision. Journal of Acquired Immune Deficiency Syndromes (1999), 44(1), 66–70. https://doi.org/10.1097/01.qai.0000242455.05274.20

Blair, A., Hartge, P., Stewart, P. A., McAdams, M., & Lubin, J. (1998). Mortality and cancer incidence of aircraft maintenance workers exposed to trichloroethylene and other organic solvents and chemicals: Extended follow up. Occupational and Environmental Medicine, 55(3), 161–171. https://doi.org/10.1136/oem.55.3.161

Favril, L., Yu, R., Geddes, J. R., & Fazel, S. (2023). Individual-level risk factors for suicide mortality in the general population: An umbrella review. The Lancet. Public Health, 8(11), e868–e877. https://doi.org/10.1016/S2468-2667(23)00207-4

Gabrysch, C., Fritsch, R., Priebe, S., & Mundt, A. P. (2019). Mental disorders and mental health symptoms during imprisonment: A three-year follow-up study. PLOS ONE, 14(3), e0213711. https://doi.org/10.1371/journal.pone.0213711

Gordis, L. (2014). Epidemiology (5th edition). Elsevier Saunders.

Kodama, K., Mabuchi, K., & Shigematsu, I. (1996). A long-term cohort study of the atomic-bomb survivors. Journal of Epidemiology, 6(3 Suppl), 95–105. https://doi.org/10.2188/jea.6.3sup_95

Orsolini, L., Latini, R., Pompili, M., Serafini, G., Volpe, U., Vellante, F., Fornaro, M., Valchera, A., Tomasetti, C., Fraticelli, S., Alessandrini, M., La Rovere, R., Trotta, S., Martinotti, G., Di Giannantonio, M., & De Berardis, D. (2020). Understanding the Complex of Suicide in Depression: From Research to Clinics. Psychiatry Investigation, 17(3), 207–221. https://doi.org/10.30773/pi.2019.0171

Ozasa, K., Grant, E. J., & Kodama, K. (2018). Japanese Legacy Cohorts: The Life Span Study Atomic Bomb Survivor Cohort and Survivors’ Offspring. Journal of Epidemiology, 28(4), 162–169. https://doi.org/10.2188/jea.JE20170321

Rothman, K. J. (2002). Epidemiology: An Introduction (1st edition). Oxford University Press.

Tsao, C. W., & Vasan, R. S. (2015). Cohort Profile: The Framingham Heart Study (FHS): Overview of milestones in cardiovascular epidemiology. International Journal of Epidemiology, 44(6), 1800–1813. https://doi.org/10.1093/ije/dyv337

Weiss, N. S., & Koepsell, T. D. (2014). Epidemiologic Methods: Studying the Occurrence of Illness (2nd edition). Oxford University Press.

 

Solutions: Practical Exercise for Chapter 7 (Cohort Studies)

library(epicalc)

setwd("redacted")

use("Ch7_exercise.csv")

Cross-tabulate the association between the exposure (depression at baseline, or ‘depress_bl’) and the outcome (suicide risk at follow-up, or ‘suiciderisk_fl’):

tabpct(depress_bl, suiciderisk_fl)

Calculation of the crude OR:

model1a <- glm(suiciderisk_fl ~ depress_bl, family=binomial)

logistic.display(model1a)

Calculation of the adjusted OR with adjustments for the mentioned covariables in the analysis plan:

model2a <- glm(suiciderisk_fl ~ depress_bl + male + age + suicidality_bin + depress_fu, family=binomial)

logistic.display(model2a)

Notes regarding risk ratio calculation

The crude risk ratio (RR) can be calculated using the following command:

model1b <- glm(suiciderisk_fl ~ depress_bl, family=binomial(link="log"))

logistic.display(model1b)

Please note that the RR is somewhat smaller than the OR and the 95% CI showed that the association was only borderline significant (unadjusted RR = 3.74; 95% CI = 0.99,14.11).

However, the adjusted risk ratio (RR) could not be calculated with the following code, as errors occurred:

model2b <- glm(suiciderisk_fl ~ depress_bl + male + age + suicidality_bin + depress_fu, family=binomial(link="log"), data=.data)

logistic.display(model2b)
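For readers who cannot load the epicalc package, the crude analysis can also be sketched with base R functions only. The function below is a hypothetical helper written for this illustration, not part of the original solution. It assumes a data frame containing the 0/1 variables ‘depress_bl’ and ‘suiciderisk_fl’ from the analysis plan and uses a Wald 95% CI, which may differ slightly from the interval shown by other functions. The toy data frame is made up purely to show that the code runs.

```r
# Base-R sketch: cross-tabulation with row percentages and the crude OR
crude_or_table <- function(dat) {
  counts  <- table(depression = dat$depress_bl,
                   suicide_risk = dat$suiciderisk_fl)
  row_pct <- round(100 * prop.table(counts, margin = 1), 1)  # row percentages
  fit     <- glm(suiciderisk_fl ~ depress_bl, family = binomial, data = dat)
  # Exponentiate the coefficients and their Wald limits to obtain the OR scale
  or_ci   <- exp(cbind(OR = coef(fit), confint.default(fit)))
  list(counts = counts, row_pct = row_pct, or_ci = or_ci["depress_bl", ])
}

# Made-up data for demonstration; for the exercise, one would instead use
# dat <- read.csv("Ch7_exercise.csv") and call crude_or_table(dat).
toy <- data.frame(
  depress_bl     = rep(c(0, 1), times = c(60, 40)),
  suiciderisk_fl = c(rep(c(0, 1), times = c(54, 6)),
                     rep(c(0, 1), times = c(28, 12)))
)

result <- crude_or_table(toy)
print(result$counts)
print(result$row_pct)
print(round(result$or_ci, 2))
```

The adjusted OR could be obtained in the same way by adding the covariables named in the analysis plan to the model formula.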