Portfolio
Machine learning projects spanning NLP, classification, and neural networks - built with Python, Keras, and NLTK. More advanced models are in development.
NLP - OpenAI - NLTK - Twitter Data
Compared sentiment analysis from GPT-3.5-turbo and NLTK on 500k tweets from the first 65 days of the Russia-Ukraine war (1.6GB dataset). Correlation analysis found no significant relationship between tweet sentiment and engagement metrics. A daily heatmap showed that although the two tools agree on only 51% of results, both capture the same trend: a sharp rise in negative sentiment beginning February 24, 2022.
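The two measurements described above, tool agreement and the sentiment-engagement correlation, can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the function names, the label strings, and the sample values are all made up, and Pearson correlation is an assumption, since the summary does not say which coefficient was used.

```python
import math

def agreement_rate(labels_a, labels_b):
    """Fraction of items on which two tools assign the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical labels for six tweets (illustrative only, not project data).
gpt_labels = ["neg", "neg", "pos", "neu", "neg", "pos"]
nltk_labels = ["neg", "neu", "pos", "neg", "neg", "neu"]
overlap = agreement_rate(gpt_labels, nltk_labels)

# Hypothetical sentiment scores and like counts for the same tweets.
sentiment = [-0.8, -0.2, 0.5, 0.0, -0.6, 0.3]
likes = [12, 40, 7, 55, 3, 21]
r = pearson(sentiment, likes)

print(f"agreement: {overlap:.0%}, sentiment-likes correlation: {r:.2f}")
```

A coefficient near zero on the real data would match the reported finding of no significant relationship, although a significance test would be needed to support that claim.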
Classification - KNN - SVM - Random Forest
Tested KNN, Logistic Regression, SVM, Decision Trees, and Random Forest on a spam dataset split 75/25 for training and testing. Used stratified k-fold cross-validation so that every fold preserves the class proportions of the imbalanced dataset and the scores do not depend on a single lucky split. Decision Trees achieved the best cross-validation score, and feature engineering opportunities were identified to improve accuracy further.
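To show what stratification does, here is a dependency-free sketch that assigns sample indices to folds while keeping the class ratio constant. It is an illustration under stated assumptions, not the project's pipeline: `stratified_folds` is a hypothetical helper, the 80/20 label mix is invented, and a real workflow would more likely use a library splitter that also shuffles the data.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds so that each fold keeps roughly
    the same class proportions as the full dataset."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Hypothetical imbalanced labels: 80 ham (0) and 20 spam (1).
labels = [0] * 80 + [1] * 20
folds = stratified_folds(labels, k=5)

# Every fold ends up with the same share of the minority class.
spam_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(spam_per_fold)
```

Each fold then serves once as the validation set while the other four are used for training, and the per-model scores are averaged across folds.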
Deep Learning - Keras - Flower Classification
Built and compared two neural network architectures for flower type classification using Keras. Model 1 - with 2 hidden layers and more neurons - outperformed Model 2 on both train and test accuracy, showing better generalization with no signs of overfitting. Model 2 exhibited a wider gap between train and test scores, indicating it struggled to generalize to unseen data.
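The train/test gap used above as an overfitting signal is simple to compute. The sketch below is illustrative only: `generalization_gap` is a hypothetical helper, and the accuracy values are placeholders chosen for the example, not the results of the two Keras models.

```python
def generalization_gap(train_acc, test_acc):
    """Difference between train and test accuracy; a larger positive
    value suggests the model fits the training data better than unseen data."""
    return train_acc - test_acc

# Placeholder (train, test) accuracies, NOT the project's measured results.
models = {
    "model_1": (0.90, 0.88),
    "model_2": (0.85, 0.70),
}

gaps = {name: generalization_gap(train, test)
        for name, (train, test) in models.items()}
widest = max(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.2f}")
print(f"widest gap: {widest}")
```

A small gap alone does not prove good generalization, so it is best read together with the absolute test accuracy, as the comparison above does.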