Chapter 5: Cross-sectional Studies and Health Surveillance
Objectives
After completing this module, you should be able to:
1. Describe the basic principles of cross-sectional studies.
2. Describe the sampling methodology for cross-sectional studies.
3. Describe the concepts of health surveillance and monitoring surveys.
5.1 Introduction
Cross-sectional studies, commonly referred to as “health surveys” and “prevalence studies,” are studies that measure the distribution of exposure(s), i.e., the prevalence of risk factor(s) of interest, and the distribution of outcome(s), i.e., the prevalence of the disease(s) or health behavior(s) of interest, at the same time. Cross-sectional studies provide a “snapshot” of disease and risk factors (LaMorte, 2020) and can be used to describe the distribution of a health outcome by person, place, and time. Such findings can provide basic information for stakeholders to “keep track” of ongoing situations and identify issues or areas where additional attention and resource allocation may be needed. In other words, cross-sectional studies assess the extent to which a population is affected by a disease and the extent to which a population engages in health-related behaviors (e.g., smoking, exercise, handwashing, seat belt use) at a point in time (LaMorte, 2020).
5.2 Overview of Cross-sectional Studies
Unlike ecological studies, which compare the aggregated occurrence of disease across entire cities or countries, cross-sectional studies use individual persons as the unit of analysis. Cross-sectional studies generally collect data either from all eligible individuals in a given population (i.e., a census) or from a sample of members of that population (i.e., a survey). In cross-sectional studies, the measurement is not repeated in the same individuals; there is no follow-up (LaMorte, 2020). If a follow-up visit is conducted among participants in a cross-sectional study, then the study is no longer cross-sectional; it becomes a prospective cohort study. A cross-sectional study can therefore serve as the starting point of a cohort study (see Chapter 7 for more details). In addition to describing prevalence, cross-sectional studies can also measure the extent to which an exposure is associated with an outcome and generate hypotheses for more in-depth investigations (Vetter & Chou, 2014). However, a key limitation of cross-sectional studies is the “snapshot” approach itself: because exposure and outcome are measured at the same time, the study lacks temporality, and we do not know the temporal ordering of cause and effect (Bovbjerg, 2020). In other words, we do not know whether the exposure actually occurred before the outcome. We simply know the extent to which the prevalence of a given outcome varies between those with an exposure of interest and those without the exposure.
5.2.1 Health Surveillance and Monitoring
In public health and epidemiology, the term “surveillance” refers to “the continuous, systematic collection, analysis, and interpretation of health-related data” for the planning, implementation, and evaluation of public health actions (Thacker & Berkelman, 1988; Wolitski et al., 2004). Each surveillance study is a cross-sectional study or “survey” that helps to inform public health professionals and scientists on changes in the occurrence of diseases as well as risk factors of interest. Although the most common use is for infectious disease prevention and control, surveillance programs have expanded to chronic and non-communicable diseases, as well as injuries and illnesses after disasters (Gordis, 2014).
Surveillance can be divided into two types: passive surveillance and active surveillance. Passive surveillance refers to surveillance conducted using available data on reportable diseases. In passive surveillance, healthcare providers or local public health officers are generally the ones to make the report. Under-reporting and incomplete information are known issues in passive surveillance (Gordis, 2014). However, the system is also relatively inexpensive and easier to establish. Furthermore, standardized, simple instruments implemented worldwide can allow for the international comparison of disease occurrence that can confirm new cases and identify areas in need of urgent assistance (Gordis, 2014).
Active surveillance refers to surveillance conducted by specially recruited project staff. In active surveillance, project staff make field visits to hospitals to interview healthcare providers and review medical records or visit rural villages and towns to detect cases. Active surveillance systems are generally more sensitive to the detection of local outbreaks but are generally more expensive to implement and maintain.
One point of consideration regarding surveillance is that in order for public health actions to be coordinated across regions and countries, the structure of the data and the definition of diseases or exposures of interest need to be consistent. The data structure of surveillance systems also needs to be standardized in order to ease the data pooling and integration processes (Gordis, 2014), which would, in turn, ease coordinated public health actions.
Similar to surveillance, another type of periodic health action called “monitoring” can be used to describe the state of health within a particular group at various points in time. Monitoring can be defined as the “routine observations on health” and health-related factors (Christensen, 2001) and generally has a broader focus than surveillance. Although the activities are similar, the main difference lies in the scope of the activity: surveillance focuses on a particular disease outcome or exposure within a group, whereas monitoring looks at the overall health experience of the group (Guidotti, 1985). Unlike surveillance, monitoring surveys can be more extensive and include measurements of emerging issues in behavioral health. For example, a monitoring survey on behavioral health among school-going Thai adolescents may use a questionnaire of as many as 30 pages, self-administered by students in Years 7, 9, and 11 of a 12-year education system. Data from such a study can provide basic information on health disparities among sexual and gender minorities (Wichaidit, Mattawanon, et al., 2023) as well as the characterization of electronic cigarette use among adolescents (Wichaidit, Chotipanvithayakul, et al., 2023). Monitoring surveys in nationally representative populations conducted with adequate care can yield findings of interest to stakeholders for each respective issue. For example, a phone-based monitoring survey during the COVID-19 pandemic on alcohol consumption may include questions on experience of economic distress (Wichaidit et al., 2022), and the findings could be of interest to stakeholders in behavioral health as well as economics. Monitoring surveys, when conducted repeatedly over an adequately long time, can yield information regarding trends for a particular exposure or outcome of interest, such as changes in the prevalence of alcohol consumption among adolescents (Assanangkornchai et al., 2020).
5.2.2 Sampling Methods in Cross-Sectional Studies
In cross-sectional studies, sampling can be divided into two main types: 1) probability sampling, where the probability of selection for each participant is known; and 2) non-probability sampling, where the probability of selection is unknown. Probability sampling methods include, but are not limited to: 1) simple random sampling; 2) systematic sampling; and 3) stratified sampling. The most common type of non-probability sampling in cross-sectional studies is convenience sampling.
In simple random sampling, investigators need to create a sampling frame, i.e., a list of the members of the population of interest, and then select the required number of participants from the list at random. Investigators may use a mobile phone application, a web-based application, or even the random number function in spreadsheet software to help with that process. The advantage of this method is that every member of the sampling frame has the same chance of being selected, whereas the disadvantage is that every member of the target population needs to be placed on the sampling frame before random selection can be made, which can be time-consuming.
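The random draw described above can also be done in R. The following is a minimal sketch, assuming a hypothetical sampling frame of 500 members identified only by an ID number; the frame size, the sample size, and the object names are illustrative and do not come from any study in this chapter.

```r
# Hypothetical sampling frame: 500 members, each identified by an ID number
set.seed(2024)  # fixing the seed makes the random draw reproducible
sampling_frame <- data.frame(id = 1:500)

n <- 50  # required sample size (hypothetical)

# sample() draws n row numbers without replacement;
# every row has the same chance of being selected
selected_rows <- sample(nrow(sampling_frame), size = n, replace = FALSE)
srs_sample <- sampling_frame[selected_rows, , drop = FALSE]

head(srs_sample)
```

Recording the seed alongside the sampling frame allows another investigator to reproduce exactly the same selection.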
In systematic sampling, the investigator calculates a sampling interval based on the size of the population and the required sample size using the following formula:
Sampling interval (k) = Population (N) / sample size (n)
k = N/n
The investigator then randomly chooses a starting number between 1 and k, selects the person at that position on the list, and then selects every k-th person thereafter.
The advantage is the relative ease of the process, whereas the disadvantage is the possibility that the sample might not be representative of the population if the list of the study population is already systematically organized. For example, suppose that male participants were placed on the list in groups, interspersed with groups of female participants. If the sampling interval made the investigator skip most of the male participants, then male participants would be under-represented in the study sample.
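The procedure and its main pitfall can be sketched in R. The example below assumes a hypothetical list of 1,000 people arranged in alternating blocks of five males and five females; the list, its ordering, and all object names are invented for illustration. Because the sampling interval (10) matches the length of one male-plus-female cycle, every selected position falls in the same kind of block.

```r
set.seed(2024)

N <- 1000     # population size (hypothetical)
n <- 100      # required sample size (hypothetical)
k <- N %/% n  # sampling interval k = N/n (10 here; N is a multiple of n)

# Random start between 1 and k, then every k-th position on the list
start <- sample(1:k, 1)
selected <- seq(from = start, by = k, length.out = n)

# Hypothetical list ordered in alternating blocks of 5 males and 5 females
sex <- rep(rep(c("male", "female"), each = 5), length.out = N)

# Because the interval matches the block pattern, only one sex is selected
table(sex[selected])
```

Shuffling the list before sampling, or choosing an interval that does not coincide with the pattern in the list, would avoid this problem.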
In stratified sampling, the sampling frame is divided (stratified) by the characteristic of interest (e.g., sex), and then simple random sampling or systematic sampling takes place within each stratum. The advantage of this method is that it ensures that the distribution of the characteristic of interest (e.g., sex) in the sample reflects that of the source population, provided that the number sampled from each stratum is proportional to the size of the stratum. The disadvantage is that the sampling procedures are more complicated.
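As an illustration, the sketch below performs stratified sampling with proportionate allocation in R. It assumes a hypothetical sampling frame of 600 females and 400 males and a total sample size of 100; these numbers and the object names are illustrative only.

```r
set.seed(2024)

# Hypothetical sampling frame: 600 females and 400 males
frame <- data.frame(
  id  = 1:1000,
  sex = rep(c("female", "male"), times = c(600, 400))
)
n <- 100  # total sample size (hypothetical)

# Split the frame into strata, then draw a simple random sample within each,
# with the number drawn proportional to the size of the stratum
strata <- split(frame, frame$sex)
sampled <- lapply(strata, function(s) {
  n_s <- round(n * nrow(s) / nrow(frame))
  s[sample(nrow(s), n_s), , drop = FALSE]
})
stratified_sample <- do.call(rbind, sampled)

# The sample has the same 60:40 sex distribution as the frame
table(stratified_sample$sex)
```

With a simple random sample of the same size, the sex distribution would vary from draw to draw; stratification fixes it by design.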
Non-probability sampling is used when a sampling frame cannot be made. For example, if an investigator were to study the behaviors of customers at a bar on a Friday night, and there were people walking in and out of the bar at all times, then a sampling frame would be nearly impossible to create. The investigator would then resort to selecting the participants that the investigator could readily access in lieu of random sampling from the population. The investigator would not know the extent to which the study participants, chosen based on convenience, represented the entire population. The advantages of this method are the convenience and speed at which the data could be collected. The main disadvantage is the potential that the participants would not be representative of the population of interest.
5.3 Examples of Cross-sectional Studies
In addition to the surveillance and monitoring of emerging and existing situations, cross-sectional studies can also be used to generate hypotheses for more in-depth investigations (Vetter & Chou, 2014). For example, one of the most common sources of anxiety is financial concerns (Richardson 2017; Frankham 2020; Santabarbara 2021). However, financial worries likely do not occur randomly; they come from a known source (e.g., loss or reduction of employment, not having money to pay for rent or utilities or food, having to borrow money from family or friends). These sources are termed experience of economic distress, and these are likely associated with anxiety. A group of investigators then tested this hypothesis using data from a nationally representative monitoring survey among the general population of adults in Thailand (Wichaidit et al., 2022). The findings are shown in Table 5.3.1.
Table 5.3.1 Association between the experience of economic distress and anxiety disorder (GAD-7 test score 10 points or higher) among the general population of adults in Thailand (weighted percentage with standard errors)
| Experience of Economic Distress | No anxiety | Anxiety | Unadjusted OR (95% CI) |
|---|---|---|---|
| Did not experience economic distress | 96.8% ± 0.7% | 3.2% ± 0.7% | 1 (Reference) |
| Experienced economic distress | 92.4% ± 0.9% | 7.6% ± 0.9% | 2.54 (1.53, 4.22) |
Adapted from Wichaidit et al., 2022
Those who experienced economic distress had approximately 2.5 times the odds of anxiety compared to those who did not experience economic distress, and this difference was statistically significant, as the 95% confidence interval (1.53, 4.22) does not include 1. Detailed information on how to repeat this analysis (albeit without population-level sampling weight adjustment to help simplify the analysis process) can be found in the practical exercise section of this chapter.
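The odds ratio can be checked by hand from the percentages in Table 5.3.1. The sketch below is only an arithmetic check: it uses the rounded, weighted percentages from the table, so it gives a value close to, but not exactly equal to, the published 2.54. The object names are illustrative.

```r
# Prevalence of anxiety in each exposure group (from Table 5.3.1)
p_exposed   <- 0.076  # experienced economic distress
p_unexposed <- 0.032  # did not experience economic distress

# Odds = p / (1 - p); the odds ratio is the ratio of the two odds
odds_exposed   <- p_exposed / (1 - p_exposed)
odds_unexposed <- p_unexposed / (1 - p_unexposed)
or_from_table  <- odds_exposed / odds_unexposed
round(or_from_table, 2)  # 2.49, close to the published 2.54

# A confidence interval for an OR is symmetric on the log scale, so the
# geometric mean of the published limits should recover the point estimate
round(sqrt(1.53 * 4.22), 2)  # 2.54
```

The small gap between 2.49 and 2.54 is what one would expect from rounding the percentages to one decimal place.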
5.4 Practical Exercise (Coding Required)
Note: the data set and supporting file(s) are available at this webpage: https://www.kaggle.com/wichaiditwit/datasets
Please note that this exercise is recommended but optional. I wish to invite the reader to read this section carefully, as it contains instructions on how to analyze the exercise data set and check the accuracy of the findings in Table 5.3.1. A repeat of the analyses without sampling weight adjustment should yield similar results.
Title: Association between the experience of economic distress and anxiety disorders: findings from a nationally representative survey in Thailand
Objective: To assess the extent to which experience of economic distress is associated with having an anxiety disorder.
Methodology:
The methodology of this study (Wichaidit et al., 2022) can be found at this link: https://peerj.com/articles/13307/
Abridged details are provided below.
Study Design and Setting:
We commissioned a survey research firm (Research Centre for Social and Business Development Co. Ltd. (SAB), Bangkok, Thailand) to conduct a phone-based cross-sectional study in late April 2021 as the third wave of the COVID-19 pandemic began in Thailand, an upper-middle-income country in Southeast Asia.
Study Population and Participants:
The study population included Thai people aged 18 and over in 15 provinces who had a cell phone number. SAB investigators performed the sample size calculation using a 95% confidence level, a 3% margin of error, a design effect of 1.2, and a response rate of 80%, and they obtained a final sample size of 1,537 respondents.
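The source does not state the formula or the anticipated prevalence used in this calculation. The sketch below is therefore only an illustration: it assumes the standard formula for estimating a proportion, an anticipated prevalence of 50% (the most conservative choice), and two common ways of allowing for non-response. Under these assumptions, inflating the sample by the expected non-response (20%) happens to give 1,537, whereas dividing by the response rate gives a larger number; the agreement may be coincidental.

```r
z    <- qnorm(0.975)  # critical value for a 95% confidence level
p    <- 0.5           # anticipated prevalence (an assumption; not stated in the source)
e    <- 0.03          # margin of error
deff <- 1.2           # design effect
resp <- 0.80          # expected response rate

# Sample size for estimating a proportion under simple random sampling
n_srs <- z^2 * p * (1 - p) / e^2  # about 1,067

# Inflate for the complex survey design
n_design <- n_srs * deff

# Two common ways to allow for non-response
n_divide  <- ceiling(n_design / resp)              # divide by the response rate
n_inflate <- ceiling(n_design * (1 + (1 - resp)))  # add the expected non-response

c(divide = n_divide, inflate = n_inflate)  # 1601 and 1537
```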
Participant Recruitment and Data Collection:
SAB conducted stratified two-stage sampling by dividing Thailand into five regions: the Bangkok Metropolitan Area, the Central Region, the North, the Northeast, and the South. For each region, SAB researchers selected the study provinces using systematic sampling. SAB possessed a list of over 100,000 mobile phone numbers of users from all of Thailand’s provinces registered with the three major telecommunication operators (AIS, True-Move, DTAC), and they sampled mobile phone numbers from the list of users in the selected provinces using cumulative systematic sampling.
Exposure Measurement: Experience of Economic Distress
To quote: “We drafted economic distress measurement questions based on those used in a study on the association between financial stress and smoking (Siahpush & Carlin, 2006) and a study on financial stress and mental health conducted prior to the COVID-19 pandemic (Richardson et al., 2017) and during the COVID-19 pandemic (Hertz-Palmor et al., 2020). We modified the questions to suit the context of the COVID-19 pandemic in Thailand, and created two groups of measurement (economic distress experienced since the declaration of the COVID-19 pandemic state of emergency, and economic distress experienced within the past 30 days prior to the survey) and translated the questions to Thai.” (Wichaidit et al., 2022)
For this analysis, we will consider participants who had experienced at least one type of economic distress within the past 30 days to be those exposed to economic distress; otherwise, they would be considered as having no experience of economic distress within the past 30 days.
Outcome Measurement: Anxiety Disorder
We used the Thai version of the GAD-7 to assess the prevalence of symptoms of anxiety. We used a cut-off point of ≥ 10 points (range: 0–21 points) for the GAD-7. This standardized screening tool was designed to measure anxiety symptoms at the time of the study based only on self-reported personal experience during the two weeks prior to the survey.
Study Variables
Exposure: Experience of economic distress within the past 30 days
Definition: Variable Q10 contains the measurement questions, divided into Q10a (long-term experience of economic distress) and Q10b (experience of economic distress within the past 30 days). We will use the latter. Participants who had experienced at least one type of economic distress within the past 30 days are considered exposed to economic distress; otherwise, they are considered as having no experience of economic distress within the past 30 days.
Coding:
# Economic distress in the past 30 days
# Responses coded 9 are set to missing (NA); then 1 = experienced, any other response = 0
distress_b1 <- ifelse(q10_1b==9, NA, q10_1b); distress_b1 <- ifelse(distress_b1==1,1,0)
distress_b2 <- ifelse(q10_2b==9, NA, q10_2b); distress_b2 <- ifelse(distress_b2==1,1,0)
distress_b3 <- ifelse(q10_3b==9, NA, q10_3b); distress_b3 <- ifelse(distress_b3==1,1,0)
distress_b4 <- ifelse(q10_4b==9, NA, q10_4b); distress_b4 <- ifelse(distress_b4==1,1,0)
distress_b5 <- ifelse(q10_5b==9, NA, q10_5b); distress_b5 <- ifelse(distress_b5==1,1,0)
distress_b6 <- ifelse(q10_6b==9, NA, q10_6b); distress_b6 <- ifelse(distress_b6==1,1,0)
distress_b7 <- ifelse(q10_7b==9, NA, q10_7b); distress_b7 <- ifelse(distress_b7==1,1,0)
distress_b8 <- ifelse(q10_8b==9, NA, q10_8b); distress_b8 <- ifelse(distress_b8==1,1,0)
# Distress score and experience of at least one type of distress in the past 30 days
distress_score <- distress_b1 + distress_b2 + distress_b3 + distress_b4 + distress_b5 + distress_b6 + distress_b7 + distress_b8
econ_distress_30d <- ifelse(distress_score==0,0,1)
Outcome: Prevalence of anxiety disorder
Definition: Variables q13gad_1 through q13gad_7 contain the measurement questions. The GAD-7 score is the sum of questions 1 through 7 in section 13 (the GAD-7 section). Each question is scored from 0 to 3, so the total ranges from 0 to 21.
Coding:
gad7_score <- q13gad_1 + q13gad_2 + q13gad_3 + q13gad_4 + q13gad_5 + q13gad_6 + q13gad_7
Definition: If the GAD-7 score is >= 10, the respondent is considered as having an anxiety disorder; otherwise, no anxiety disorder.
Coding:
anxiety <- ifelse(gad7_score>=10, 1, 0)
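The exposure coding repeats the same two steps for each of the eight items. As an optional alternative, the same logic can be written once and applied to all items. The sketch below uses a small toy data frame with three items and four respondents rather than the exercise data; the toy values, the assumption that 2 means "no", and the object names `toy`, `items`, and `recoded` are all hypothetical.

```r
# Toy responses: 1 = experienced, 2 = did not (assumed), 9 = missing
toy <- data.frame(
  q10_1b = c(1, 2, 2, 9),
  q10_2b = c(2, 2, 1, 2),
  q10_3b = c(2, 2, 2, 2)
)
items <- c("q10_1b", "q10_2b", "q10_3b")

# Apply the same two-step recoding to every item
recoded <- sapply(toy[items], function(x) {
  x <- ifelse(x == 9, NA, x)  # responses coded 9 become missing
  ifelse(x == 1, 1, 0)        # 1 = experienced, anything else = 0
})

# rowSums() returns NA when any item is missing, as "+" does
distress_score <- rowSums(recoded)
econ_distress_30d <- ifelse(distress_score == 0, 0, 1)
econ_distress_30d  # 1 0 1 NA
```

Note that under this rule, as in the original coding, a respondent with a missing response on any one item is classified as missing on the exposure, even if that respondent reported distress on another item.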
Study Instruments
The structured interview questionnaire can be found via this link in the Supplementary Information section: https://peerj.com/articles/13307/
Data Analysis
We will perform bivariate analysis using cross-tabulation, followed by unadjusted logistic regression to calculate the unadjusted OR and 95% confidence interval, with those who did not experience economic distress as the reference group.
Ethical Considerations
The investigators have received approval for the analysis of anonymized secondary data from this survey (Faculty of Medicine Human Research Ethics Committee, Prince of Songkla University; REC. 62-054-18-1).
Table 5.4.1 (Please fill in with the outputs of your analyses)
| Experience of Economic Distress | No anxiety | Anxiety | Unadjusted OR (95% CI) |
|---|---|---|---|
| Did not experience economic distress (n = …) | | | 1 (Reference) |
| Experienced economic distress (n = …) | | | |
5.5 Conclusion
Cross-sectional studies, commonly referred to as “health surveys” and “prevalence studies,” provide a “snapshot” of participants’ health status, i.e., the prevalence of health outcomes and exposures. The findings from cross-sectional studies can provide basic information for stakeholders to follow ongoing situations. In addition to describing prevalence, cross-sectional study data can also be used to describe the association between an exposure of interest and an outcome of interest at the same point in time and generate hypotheses — for instance, that experience of economic distress is associated with anxiety.
References
Assanangkornchai, S., Saingam, D., Jitpiboon, W., & Geater, A. F. (2020). Comparison of drinking prevalence among Thai youth before and after implementation of the Alcoholic Beverage Control Act. The American Journal of Drug and Alcohol Abuse, 46(3), 325–332. https://doi.org/10.1080/00952990.2019.1692213
Bovbjerg, M. L. (2020). Foundations of Epidemiology. Oregon State University. https://open.oregonstate.education/epidemiology/
Christensen, J. (2001). Epidemiological concepts regarding disease monitoring and surveillance. Acta Veterinaria Scandinavica. Supplementum, 94(Suppl 1), 11–16. https://doi.org/10.1186/1751-0147-42-s1-s11
Gordis, L. (2014). Epidemiology (5th Edition). Elsevier Saunders.
Guidotti, T. L. (1985). Occupational health monitoring and surveillance. American Family Physician, 31(2), 161–169.
LaMorte, W. W. (2020). Cross-Sectional Surveys. Boston University School of Public Health. https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module1B-DescriptiveStudies_and_Statistics/PH717-Module1B-DescriptiveStudies_and_Statistics5.html
Thacker, S. B., & Berkelman, R. L. (1988). Public health surveillance in the United States. Epidemiologic Reviews, 10, 164–190. https://doi.org/10.1093/oxfordjournals.epirev.a036021
Vetter, T. R., & Chou, R. (2014). 80—Clinical Trial Design Methodology for Pain Outcome Studies. In H. T. Benzon, J. P. Rathmell, C. L. Wu, D. C. Turk, C. E. Argoff, & R. W. Hurley (Eds.), Practical Management of Pain (Fifth Edition) (pp. 1057-1065.e3). Mosby. https://doi.org/10.1016/B978-0-323-08340-9.00080-3
Wichaidit, W., Chotipanvithayakul, R., & Assanangkornchai, S. (2023). Use of Electronic Cigarettes among Secondary School Students and their Association with Depressive Symptoms: Findings from a National Secondary School Survey in Thailand. Journal of Health Science and Medical Research, 1–10. https://doi.org/10.31584/jhsmr.2023984
Wichaidit, W., Mattawanon, N., Somboonmark, W., Prodtongsom, N., Chongsuvivatwong, V., & Assanangkornchai, S. (2023). Behavioral health and experience of violence among cisgender heterosexual and lesbian, gay, bisexual, transgender, queer and questioning, and asexual (LGBTQA+) adolescents in Thailand. PLOS ONE, 18(6), e0287130. https://doi.org/10.1371/journal.pone.0287130
Wichaidit, W., Prommanee, C., Choocham, S., Chotipanvithayakul, R., & Assanangkornchai, S. (2022). Modification of the association between experience of economic distress during the COVID-19 pandemic and behavioral health outcomes by availability of emergency cash reserves: Findings from a nationally-representative survey in Thailand. PeerJ, 10, e13307. https://doi.org/10.7717/peerj.13307
Wolitski, R. J., Janssen, R. S., Holtgrave, D. R., & Peterson, J. L. (2004). Chapter 40—The Public Health Response to the HIV Epidemic in the U.S. In G. P. Wormser (Ed.), AIDS and Other Manifestations of HIV Infection (Fourth Edition) (pp. 997–1012). Academic Press. https://doi.org/10.1016/B978-012764051-8/50042-1
Solutions: Practical Exercise for Chapter 5 (Cross-Sectional Studies)
Load the epicalc package and open the data file:
library(epicalc)
setwd("[redacted]")
use("Ch5_exercise.csv")
Exposure: Experience of economic distress within the past 30 days
distress_b1 <- ifelse(q10_1b==9, NA, q10_1b); distress_b1 <- ifelse(distress_b1==1,1,0)
distress_b2 <- ifelse(q10_2b==9, NA, q10_2b); distress_b2 <- ifelse(distress_b2==1,1,0)
distress_b3 <- ifelse(q10_3b==9, NA, q10_3b); distress_b3 <- ifelse(distress_b3==1,1,0)
distress_b4 <- ifelse(q10_4b==9, NA, q10_4b); distress_b4 <- ifelse(distress_b4==1,1,0)
distress_b5 <- ifelse(q10_5b==9, NA, q10_5b); distress_b5 <- ifelse(distress_b5==1,1,0)
distress_b6 <- ifelse(q10_6b==9, NA, q10_6b); distress_b6 <- ifelse(distress_b6==1,1,0)
distress_b7 <- ifelse(q10_7b==9, NA, q10_7b); distress_b7 <- ifelse(distress_b7==1,1,0)
distress_b8 <- ifelse(q10_8b==9, NA, q10_8b); distress_b8 <- ifelse(distress_b8==1,1,0)
Distress score and having at least one experience of economic distress within the past 30 days:
distress_score <- distress_b1 + distress_b2 + distress_b3 + distress_b4 + distress_b5 + distress_b6 + distress_b7 + distress_b8
econ_distress_30d <- ifelse(distress_score==0,0,1)
Outcome: Prevalence of anxiety disorder
Variables q13gad_1 through q13gad_7 contain the measurement questions. The GAD-7 score is the sum of questions 1 through 7 in section 13 (the GAD-7 section). Each question is scored from 0 to 3, so the total ranges from 0 to 21.
gad7_score <- q13gad_1 + q13gad_2 + q13gad_3 + q13gad_4 + q13gad_5 + q13gad_6 + q13gad_7
If the GAD-7 score is >= 10, the respondent is considered as having an anxiety disorder; otherwise, no anxiety disorder.
anxiety <- ifelse(gad7_score>=10, 1, 0)
Cross-tabulation and filling the table:
tabpct(econ_distress_30d, anxiety)
Logistic regression for the calculation of the unadjusted OR (95% CI):
model1 <- glm(anxiety ~ econ_distress_30d, family=binomial)
logistic.display(model1)
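For readers who prefer not to rely on the `tabpct()` and `logistic.display()` helper functions, the same two steps can be sketched with base R only. The example below runs on simulated data, not the exercise data set: the sample size, the prevalences used to simulate, and the resulting numbers are all hypothetical, and the confidence interval shown is a Wald interval computed by hand.

```r
set.seed(2024)

# Simulated data (hypothetical): exposure and a rarer binary outcome
n <- 2000
econ_distress_30d <- rbinom(n, 1, 0.5)
anxiety <- rbinom(n, 1, ifelse(econ_distress_30d == 1, 0.08, 0.03))

# Cross-tabulation with row percentages (exposure in rows, outcome in columns)
tab <- table(econ_distress_30d, anxiety)
round(prop.table(tab, margin = 1) * 100, 1)

# Unadjusted logistic regression; unexposed (0) is the reference group
model1 <- glm(anxiety ~ econ_distress_30d, family = binomial)
est <- coef(summary(model1))["econ_distress_30d", ]

# Exponentiate the coefficient and its Wald limits to get the OR and 95% CI
or <- unname(exp(est["Estimate"]))
ci <- exp(est["Estimate"] + c(-1, 1) * qnorm(0.975) * est["Std. Error"])
round(c(OR = or, lower = ci[1], upper = ci[2]), 2)

# Sanity check: with one binary exposure, the model OR equals the
# cross-product ratio of the 2 x 2 table
cross_product <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
all.equal(or, unname(cross_product), tolerance = 1e-4)
```

To apply this to the exercise, replace the two simulated vectors with the `econ_distress_30d` and `anxiety` variables created in the earlier steps.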