Public Health Data Analysis on Foodborne Disease Risk Factors Using R

Project type

Public Health Data Analysis

Date

2025

Role

Biostatistician

This project analyzes self-reported data on foodborne illness among individuals, with a focus on identifying demographic and environmental risk factors. The analysis was conducted using R programming with an emphasis on data cleaning, recoding categorical variables, statistical summaries, and structured reporting aligned with public health research standards.

Dataset Source:
A structured dataset titled ENVIRONMENTALANDSOCI_DATA_2025-06-05_1030.csv, containing responses from a health behavior survey. Variables included gender, religion, education level, water sources, sanitation, and experience with foodborne illness.

Tools & Libraries Used:

tidyverse for data manipulation and visualization

gtsummary for statistical summaries and formatted tables

here for consistent file referencing

apaTables for APA-style reporting
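
As a rough illustration of the file-referencing step, the sketch below builds a project-relative path with here and reads the CSV with readr (part of the tidyverse). The "data" folder is an assumption made for this example; the write-up does not say where the file actually lives.

```r
library(here)
library(readr)

# Assumption: the CSV sits in a "data" folder at the project root.
data_path <- here("data", "ENVIRONMENTALANDSOCI_DATA_2025-06-05_1030.csv")

# Read the survey only if the file is present, so the sketch runs anywhere.
survey_raw <- if (file.exists(data_path)) {
  read_csv(data_path, show_col_types = FALSE)
} else {
  NULL
}
```

Building the path with here() keeps the script independent of the working directory it is launched from.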

Analysis Objectives:
Clean and recode demographic and environmental data for interpretability

Describe the population distribution by key factors such as gender, education, religion, and marital status

Evaluate associations between water/sanitation factors and reported foodborne illness

Present the findings in well-structured Word-ready tables using gtsummary and APA formats

Key Steps & Components:
Data Cleaning & Recoding

Used mutate() and dplyr::recode() to transform coded variables (e.g., gender: 1 → Male, 2 → Female)

Recoded categorical variables such as religion, marital status, and education level into labeled categories for clarity
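
A minimal sketch of this recoding step on toy data. The gender mapping (1 → Male, 2 → Female) follows the description above; the education codes and labels are hypothetical and may not match the survey's actual codebook.

```r
library(dplyr)

# Toy data standing in for the survey; values are made up for illustration.
survey <- data.frame(
  gender    = c(1, 2, 2, 1, 2),
  education = c(1, 3, 2, 3, 2)
)

survey_clean <- survey |>
  mutate(
    # Numeric codes become readable labels.
    gender = recode(gender, `1` = "Male", `2` = "Female"),
    # Hypothetical education codes, stored as a factor with a fixed level order.
    education = factor(
      recode(education, `1` = "Primary", `2` = "Secondary", `3` = "Tertiary"),
      levels = c("Primary", "Secondary", "Tertiary")
    )
  )
```

Storing education as a factor with explicit levels keeps the categories in a meaningful order in later tables instead of alphabetical order.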

Descriptive Statistics

Summarized demographic distributions (e.g., majority Christian, mostly educated up to secondary/tertiary level)

Generated frequency tables and cross-tabulations for environmental exposure variables
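
Frequency tables and cross-tabulations of this kind could look roughly like the sketch below. The column names and category values are illustrative only and are not taken from the real dataset.

```r
library(dplyr)

# Toy data; names and categories are assumptions made for this example.
survey <- data.frame(
  religion     = c("Christian", "Christian", "Muslim", "Christian", "Other", "Muslim"),
  water_source = c("Piped", "Well", "Well", "Piped", "River", "Piped")
)

# One-way frequency table with percentages.
religion_freq <- survey |>
  count(religion) |>
  mutate(percent = round(100 * n / sum(n), 1))

# Two-way cross-tabulation of counts (rows: religion, columns: water source).
water_by_religion <- table(survey$religion, survey$water_source)
```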

Foodborne Illness Assessment

Analyzed the proportion of individuals reporting illness

Compared illness rates across groups (e.g., by water source, handwashing practices, and food storage)
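
The group comparison can be sketched in base R as follows. The data are invented, and the choice of Fisher's exact test is an assumption for illustration; the write-up does not state which test the analysis used.

```r
# Toy data: 10 respondents per water-source group (made-up values).
survey <- data.frame(
  water_source = rep(c("Safe", "Unsafe"), each = 10),
  illness      = c(rep("Yes", 2), rep("No", 8), rep("Yes", 7), rep("No", 3))
)

# Cross-tabulate exposure (rows) against reported illness (columns).
illness_tab <- table(survey$water_source, survey$illness)

# Row-wise proportions: share of each group reporting illness.
illness_rate <- prop.table(illness_tab, margin = 1)[, "Yes"]

# Assumed test of association for a small 2 x 2 table.
assoc_test <- fisher.test(illness_tab)
```

Here illness_rate holds the proportion reporting illness in each water-source group, and assoc_test$p.value gives the p-value for the association.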

Reporting

Produced publication-ready tables using gtsummary::tbl_summary()

Rendered the report as a clean Word document with interpretable tables for policymakers and public health officials
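
A hedged sketch of the table-building step with gtsummary::tbl_summary(). The toy data, the choice of stratifying by illness, and the export through flextable are assumptions for this example, not a record of the project's actual code.

```r
library(gtsummary)

# Toy data; variable names and values are illustrative only.
survey <- data.frame(
  gender       = c("Male", "Female", "Female", "Male", "Female", "Male", "Female", "Male"),
  water_source = c("Safe", "Unsafe", "Unsafe", "Safe", "Safe", "Unsafe", "Unsafe", "Safe"),
  illness      = c("No", "Yes", "Yes", "No", "No", "Yes", "No", "No")
)

# Summarize gender and water source, split by reported illness, plus an overall column.
summary_tbl <- survey |>
  tbl_summary(by = illness, include = c(gender, water_source)) |>
  add_overall()

# Optional Word export, only attempted if flextable is installed.
if (requireNamespace("flextable", quietly = TRUE)) {
  out_file <- tempfile(fileext = ".docx")
  flextable::save_as_docx(as_flex_table(summary_tbl), path = out_file)
}
```

In an R Markdown report rendered to Word, printing summary_tbl in a chunk is usually enough; the explicit export is shown only for stand-alone scripts.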

Outcomes & Insights:
Demonstrated an association between poor water and sanitation indicators (e.g., an unsafe water source) and self-reported foodborne illness

Identified key demographic segments for targeted interventions

Delivered a professional-quality report using reproducible R Markdown workflows

This project showcases my ability to:

Conduct public health data cleaning and statistical exploration using R

Translate coded survey data into readable, policy-relevant summaries

Apply structured R workflows for reproducible reporting
