Public Health Data Analysis on Foodborne Disease Risk Factors Using R
This project analyzes self-reported survey data on foodborne illness, with a focus on identifying demographic and environmental risk factors. The analysis was conducted in R, with an emphasis on data cleaning, recoding of categorical variables, statistical summaries, and structured reporting aligned with public health research standards.
Dataset Source:
A structured dataset, ENVIRONMENTALANDSOCI_DATA_2025-06-05_1030.csv, containing responses from a health behavior survey. Variables include gender, religion, education level, water source, sanitation, and experience with foodborne illness.
Tools & Libraries Used:
tidyverse for data manipulation and visualization
gtsummary for statistical summaries and formatted tables
here for consistent file referencing
apaTables for APA-style reporting
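As a minimal sketch of how these packages might fit together at the start of a script, the snippet below builds a project-relative path with here() and reads the CSV only if it is present. The "data" folder and the object name survey_raw are assumptions for illustration; the original project layout is not documented here.

```r
library(tidyverse)
library(here)

# Assumption: the CSV sits in a "data" folder at the project root.
data_path <- here("data", "ENVIRONMENTALANDSOCI_DATA_2025-06-05_1030.csv")

if (file.exists(data_path)) {
  # Read the survey export and take a quick look at the column types.
  survey_raw <- read_csv(data_path, show_col_types = FALSE)
  glimpse(survey_raw)
} else {
  message("File not found at: ", data_path)
}
```

Because here() resolves paths from the project root, the same script should run unchanged regardless of the working directory it is launched from.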
Analysis Objectives:
Clean and recode demographic and environmental data for interpretability
Describe the population distribution by key factors such as gender, education, religion, and marital status
Evaluate associations between water/sanitation factors and reported foodborne illness
Present the findings in well-structured, Word-ready tables using gtsummary and APA-style formatting
Key Steps & Components:
Data Cleaning & Recoding
Used mutate() and dplyr::recode() to transform coded variables (e.g., gender: 1 → Male, 2 → Female)
Recoded categorical variables such as religion, marital status, and education level into readable labels
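The recoding step can be sketched roughly as follows. The gender mapping (1 → Male, 2 → Female) comes from the description above; the toy data, the column names, and the education codes are hypothetical stand-ins for the real survey.

```r
library(dplyr)

# Toy data standing in for the survey; education codes are hypothetical.
survey_raw <- tibble(
  gender    = c(1, 2, 2, 1),
  education = c(1, 3, 2, 3)
)

survey_clean <- survey_raw %>%
  mutate(
    # Map numeric codes to readable labels.
    gender    = recode(gender, `1` = "Male", `2` = "Female"),
    education = recode(education,
                       `1` = "Primary", `2` = "Secondary", `3` = "Tertiary"),
    # Store education as a factor so tables keep a sensible level order.
    education = factor(education,
                       levels = c("Primary", "Secondary", "Tertiary"))
  )

print(survey_clean)
```

Converting ordered categories to factors with explicit levels keeps later summary tables from sorting them alphabetically.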
Descriptive Statistics
Summarized demographic distributions (e.g., majority Christian, mostly educated up to secondary/tertiary level)
Generated frequency tables and cross-tabulations for environmental exposure variables
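One way to produce tables like these is shown below: a frequency table with percentages via count(), and a cross-tabulation via base R's table(). The data, category labels, and column names are invented for illustration and do not reproduce the survey's actual values.

```r
library(dplyr)

# Hypothetical cleaned data; labels are illustrative only.
survey_clean <- tibble(
  religion     = c("Christian", "Christian", "Muslim",
                   "Christian", "Other", "Muslim"),
  water_source = c("Piped", "Well", "Well", "Piped", "River", "Piped")
)

# Frequency table with percentages for a single variable.
religion_freq <- survey_clean %>%
  count(religion, name = "n") %>%
  mutate(percent = round(100 * n / sum(n), 1))

# Cross-tabulation of two categorical variables.
water_by_religion <- table(
  religion     = survey_clean$religion,
  water_source = survey_clean$water_source
)

print(religion_freq)
print(water_by_religion)
```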
Foodborne Illness Assessment
Analyzed the proportion of individuals reporting illness
Compared illness rates across groups (e.g., by water source, handwashing practices, and food storage)
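A group comparison of this kind might look like the sketch below, which computes the share of respondents reporting illness per water-source category and then runs Fisher's exact test on the resulting 2x2 table. The data are made up, the column names are hypothetical, and the choice of Fisher's test is an assumption; the test used in the original analysis is not stated.

```r
library(dplyr)

# Hypothetical data: water source category and self-reported illness.
survey_clean <- tibble(
  water_source = c("Safe", "Safe", "Safe", "Safe",
                   "Unsafe", "Unsafe", "Unsafe", "Unsafe"),
  ill          = c("No", "No", "No", "Yes",
                   "Yes", "Yes", "Yes", "No")
)

# Proportion reporting illness within each water-source group.
illness_by_water <- survey_clean %>%
  group_by(water_source) %>%
  summarise(
    total   = n(),
    n_ill   = sum(ill == "Yes"),
    pct_ill = 100 * n_ill / total,
    .groups = "drop"
  )

# Assumed test of association; suited to small cell counts.
test_result <- fisher.test(table(survey_clean$water_source, survey_clean$ill))

print(illness_by_water)
print(test_result$p.value)
```

With larger samples, chisq.test() on the same table would be a common alternative.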
Reporting
Produced publication-ready tables using gtsummary::tbl_summary()
Rendered the report as a clean Word document with interpretable tables for policymakers and public health officials
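The reporting step could be approximated as follows: tbl_summary() builds a stratified summary table, and the optional export converts it to a flextable and saves it as a .docx file. The data, variable names, labels, and output file name are hypothetical, and exporting through flextable is only one possible route to Word; the original report was rendered through R Markdown.

```r
library(dplyr)
library(gtsummary)

# Hypothetical cleaned data for illustration.
survey_clean <- tibble(
  gender       = c("Male", "Female", "Female", "Male", "Female", "Male"),
  water_source = c("Safe", "Unsafe", "Unsafe", "Safe", "Safe", "Unsafe"),
  ill          = c("No", "Yes", "Yes", "No", "No", "Yes")
)

# Summary table of selected variables, split by illness status.
summary_tbl <- survey_clean %>%
  tbl_summary(
    by      = ill,
    include = c(gender, water_source),
    label   = list(gender ~ "Gender", water_source ~ "Water source")
  ) %>%
  bold_labels()

# Optional Word export; runs only if flextable is installed.
if (requireNamespace("flextable", quietly = TRUE)) {
  out_path <- file.path(tempdir(), "table1.docx")
  summary_tbl %>%
    as_flex_table() %>%
    flextable::save_as_docx(path = out_path)
}
```

Inside an R Markdown document with Word output, printing summary_tbl in a chunk is usually enough, so the explicit export is only needed for standalone tables.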
Outcomes & Insights:
Found an association between poor sanitation indicators (e.g., unsafe water source) and self-reported foodborne disease
Identified key demographic segments for targeted interventions
Delivered a professional-quality report using reproducible R Markdown workflows
This project demonstrates the ability to:
Conduct public health data cleaning and statistical exploration using R
Translate coded survey data into readable, policy-relevant summaries
Apply structured R workflows for reproducible reporting

