
Projects

// selected work in data science & machine learning

Utilizing Pandas to Analyze "Friends" for a Reboot

featured

Exploratory and statistical analysis of the sitcom Friends to identify patterns in viewership, dialogue distribution, and emotional range to inform a potential reboot strategy.

// key outcomes
  • Analyzed episode-level data including directors, writers, and viewership trends
  • Applied hypothesis testing and regression analysis to examine IMDb ratings and audience patterns
  • Used NLP techniques to analyze dialogue distribution and generate episode titles in the style of the original show
Python · Pandas · NLP · Statistical Analysis · Regression

Diabetes Risk Prediction Using Medical Records

featured

Built a logistic regression model to predict diabetes onset using medical record data, with a focus on data cleaning, visualization, and interpretability.

// key outcomes
  • Cleaned and preprocessed medical data by handling missing and invalid values
  • Visualized feature distributions and correlations using histograms and scatter plots
  • Achieved 75% accuracy on a held-out test set using logistic regression
Python · Pandas · Scikit-learn · Matplotlib · Logistic Regression

Image Classification with CNNs on CIFAR-10

Collaborated on developing a convolutional neural network to classify images from the CIFAR-10 dataset, focusing on model optimization and performance evaluation.

// key outcomes
  • Implemented and trained a CNN using TensorFlow
  • Experimented with hyperparameters to optimize validation performance
  • Achieved 79.74% test accuracy on CIFAR-10 image classification
Python · TensorFlow · CNNs · Machine Learning