01
💬
Sentiment Analysis

NLP - OpenAI - NLTK - Twitter Data

OpenAI GPT vs NLTK sentiment comparison.

Compared GPT-3.5-turbo and NLTK on 500k tweets from the first 65 days of the Russia-Ukraine war (1.6 GB dataset). Correlation analysis found no significant relationship between tweet sentiment and engagement metrics. A daily heatmap showed that although the two tools agreed on only 51% of results, both captured the same trend: a sharp rise in negative sentiment beginning February 24, 2022.

02
📩
Spam Predictor

Classification - KNN - SVM - Random Forest

Multi-model spam classification comparison.

Tested KNN, Logistic Regression, SVM, Decision Trees, and Random Forest on a spam dataset split 75/25 for training and testing. Used stratified k-fold cross-validation to keep class proportions consistent across folds and to avoid judging models on a single lucky split. Decision Trees emerged as the best performer based on cross-validation score, and the analysis identified feature engineering opportunities that could further improve accuracy.
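
A comparison of this kind could look roughly like the sketch below. It assumes scikit-learn, which the write-up does not name, and uses a synthetic imbalanced dataset as a stand-in for the spam data. The 5 folds, the scaling step, and every hyperparameter are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the spam dataset (assumption: roughly 15% positive class).
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.85, 0.15], random_state=0
)

# 75/25 split, stratified so both parts keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Distance- and margin-based models get a scaler; tree models do not need one.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Each fold preserves the class proportions of the training set.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=cv).mean()
    for name, model in models.items()
}

# Rank models by mean cross-validated accuracy, best first.
for name, score in sorted(cv_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The held-out 25% stays untouched during model selection, so it can give an unbiased final check of whichever model ranks first.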

03
🌺
Neural Network

Deep Learning - Keras - Flower Classification

Predictive neural network with Keras.

Built and compared two neural network architectures for flower type classification using Keras. Model 1, with 2 hidden layers and more neurons, outperformed Model 2 on both train and test accuracy, generalizing better with no signs of overfitting. Model 2 exhibited a wider gap between train and test scores, indicating that it struggled to generalize to unseen data.
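
As a rough sketch of how two such architectures might be defined and compared in Keras: the layer sizes, the 4 input features, and the 3 output classes below are assumptions (the write-up gives neither the exact architectures nor the dataset shape), and `build_model` and `accuracy_gap` are hypothetical helper names.

```python
import keras
from keras import layers


def build_model(hidden_units, n_features=4, n_classes=3):
    """Build a dense classifier with one hidden layer per entry in hidden_units."""
    stack = [keras.Input(shape=(n_features,))]
    stack += [layers.Dense(units, activation="relu") for units in hidden_units]
    stack.append(layers.Dense(n_classes, activation="softmax"))
    model = keras.Sequential(stack)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


def accuracy_gap(model, x_train, y_train, x_test, y_test):
    """Train accuracy minus test accuracy; a large positive gap suggests overfitting."""
    _, train_acc = model.evaluate(x_train, y_train, verbose=0)
    _, test_acc = model.evaluate(x_test, y_test, verbose=0)
    return train_acc - test_acc


# Hypothetical sizes: Model 1 has 2 hidden layers and more neurons than Model 2.
model_1 = build_model([16, 16])
model_2 = build_model([8])
print(model_1.count_params(), model_2.count_params())
```

After fitting each model on the same training split, calling `accuracy_gap` on both gives a direct way to compare how well they generalize.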