Projects that reflect my curiosity, technical growth, and drive to make data actionable. I enjoy tackling challenges in predictive modeling, NLP, computer vision, and data engineering.
Trojan Horse Hunt in Time Series Forecasting
- Top 13% -
Secure Your AI competition (European Space Agency): reconstructed trojan triggers injected into satellite telemetry forecasting models. Explored adversarial poisoning in multivariate time series using anomaly detection and robust model analysis.
- Reverse-engineered triggers from poisoned models using spectral and statistical methods.
- Benchmarked reconstruction accuracy against reference models and sample solutions.
- Documented findings and contributed to open-source reproducibility.
Time Series
AI Security
European Space Agency
Satellite Telemetry
FlightRank 2025: Aeroclub RecSys Cup
- Top 18% -
Kaggle competition to build personalized flight recommendations for business travelers. Developed a group-wise ranking model to predict which flight option a user would select from thousands of alternatives, balancing price, schedule, airline, and policy constraints.
- Engineered features from pricing, timing, route, and user attributes.
- Implemented ranking algorithms to surface relevant options for each search session.
- Optimized for ranking quality to place the chosen flight at the top of each session.
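One way to make "ranking quality" concrete is a per-session hit rate: the share of search sessions where the chosen flight lands in the top k ranked options. The sketch below is illustrative only. The metric choice, the `hit_rate_at_k` helper, the tuple layout, and the toy rows are assumptions, not the competition's official scorer or the model used here.

```python
from collections import defaultdict


def hit_rate_at_k(rows, k=3):
    """Fraction of sessions whose selected option is ranked in the top k.

    rows: iterable of (session_id, score, selected) tuples, where a higher
    score means the model ranks that option higher (hypothetical layout).
    """
    sessions = defaultdict(list)
    for session_id, score, selected in rows:
        sessions[session_id].append((score, selected))

    hits = 0
    for options in sessions.values():
        # Rank the options inside one session by descending model score.
        ranked = sorted(options, key=lambda option: option[0], reverse=True)
        if any(selected for _, selected in ranked[:k]):
            hits += 1
    return hits / len(sessions) if sessions else 0.0


# Toy data: two sessions with three flight options each.
rows = [
    ("s1", 0.9, False), ("s1", 0.7, True), ("s1", 0.1, False),
    ("s2", 0.8, False), ("s2", 0.6, False), ("s2", 0.2, True),
]
print(hit_rate_at_k(rows, k=2))
```

Grouping by session before scoring matters because options only compete with other options from the same search, never across sessions.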
Recommendation Systems
Classification
Business Analysis
CMI - Detect Behavior with Sensor Data
- Top 22% -
Child Mind Institute competition: predicted body-focused repetitive behaviors (BFRBs) from wrist-worn device sensor data. Built models to distinguish BFRB-like gestures from everyday movements using IMU, temperature, and proximity sensors.
- Engineered features from movement, temperature, and proximity sensor streams.
- Trained models to differentiate BFRB-like and non-BFRB-like activity across multiple body positions.
- Ranked in the top 22% on the ongoing leaderboard.
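Feature engineering on sensor streams often starts with sliding-window statistics. The following is a minimal sketch of that idea for accelerometer readings. The `window_features` helper, the window and step sizes, and the chosen statistics are assumptions for illustration, not the exact features used in this project.

```python
import math
import statistics


def window_features(samples, window=4, step=2):
    """Summarize acceleration magnitude over sliding windows.

    samples: list of (ax, ay, az) accelerometer readings (hypothetical layout).
    Returns one feature dict per window.
    """
    # Magnitude is orientation-independent, which helps across body positions.
    magnitudes = [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples]

    features = []
    for start in range(0, len(magnitudes) - window + 1, step):
        chunk = magnitudes[start:start + window]
        features.append({
            "mean": statistics.fmean(chunk),
            "std": statistics.pstdev(chunk),
            "range": max(chunk) - min(chunk),
        })
    return features


# Toy readings: a still wrist followed by a short burst of movement.
samples = [(0, 0, 1), (0, 0, 1), (0, 0, 1), (0, 0, 1), (3, 4, 0), (0, 0, 1)]
for row in window_features(samples):
    print(row)
```

The same windowing pattern could be applied to temperature and proximity channels, each with its own summary statistics.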
Sensor Data
Time Series
Classification
RSNA Intracranial Aneurysm Detection
Kaggle competition to identify and localize intracranial aneurysms using CTA, MRA, and MRI data. Built deep learning pipelines for 3D medical image analysis, leveraging spatial localization and vessel segmentation labels. Currently ranked in the [your percentile here] percentile on the ongoing leaderboard.
- Developed preprocessing and augmentation for multimodal series (CTA, MRA, T1/T2 MRI).
- Trained models for detection and localization across 13 anatomical sites.
- Integrated vessel segmentation for improved spatial accuracy.
Computer Vision
Medical Imaging
Deep Learning
Kaggle
Brain Tumor MRI Classification
Deep learning model for automatic classification of brain tumors from MRI scans. Fine-tuned ResNet-50 to distinguish glioma, meningioma, pituitary tumor, and no tumor. Achieved 99.2% test accuracy using transfer learning and data augmentation, evaluated with per-class metrics.
- Transfer learning with ResNet-50; custom head for 4-class classification.
- Data augmentation: flips, rotations, color jitter for generalization.
- Comprehensive metrics: precision, recall, F1-score, confusion matrix.
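To show what the per-class metrics measure, here is a small dependency-free sketch that builds a confusion matrix and derives precision, recall, and F1 from it. The helper names and the toy labels are assumptions for illustration. The numbers it prints come from made-up predictions, not from this project's model.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def per_class_metrics(matrix, labels):
    """Precision, recall, and F1 for each class, derived from the matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp  # predicted as this class, wrongly
        fn = sum(matrix[i]) - tp                 # this class, predicted as another
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy predictions over the four classes (illustrative only).
labels = ["glioma", "meningioma", "pituitary", "no_tumor"]
y_true = ["glioma", "glioma", "meningioma", "pituitary", "no_tumor", "no_tumor"]
y_pred = ["glioma", "meningioma", "meningioma", "pituitary", "no_tumor", "no_tumor"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, scores in per_class_metrics(matrix, labels).items():
    print(label, scores)
```

Reporting these alongside accuracy helps reveal whether a single class, such as glioma versus meningioma, accounts for most of the errors.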
Medical Imaging
Deep Learning
ResNet-50
PyTorch
Movie Data Analysis and Prediction
Analyzed a dataset of 16,000 movies and built predictive models for movie ratings and genres using Python (pandas, numpy, scikit-learn, matplotlib, seaborn, nltk). The project covers data cleaning, preprocessing, EDA, visualization, outlier removal, and machine learning for rating and genre prediction based on movie titles and descriptions.
- Explored trends: movies released per year, rating distributions, top genres, duration vs. rating.
- Visualized key insights with line plots, histograms, bar charts, and regression plots.
- Developed Ridge regression models to predict ratings and logistic regression models to predict genres from titles/descriptions.
- Wrote a custom inference function to predict ratings and genres for new movies.
Data Analysis
Visualization
Machine Learning
Python
Reddit Sentiment Analysis for Tech Stock Prediction
Analyzed Reddit comments about major tech companies (Apple, Google, Microsoft, Amazon) and correlated sentiment with their stock prices. Used Hugging Face transformers for sentiment analysis and Yahoo Finance for stock data. Explored how online sentiment relates to stock performance with visualizations and predictive modeling.
- Performed sentiment analysis on Reddit comments using transformer models.
- Fetched and analyzed stock data (returns, moving averages) from Yahoo Finance.
- Studied correlations between Reddit sentiment and stock returns.
- Visualized stock trends, sentiment indicators, heatmaps, and scatter plots.
- Developed linear regression models to quantify sentiment impact on stock returns.
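The correlation and regression steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions: the helper names, the one-sentiment-score-per-day alignment, and the toy numbers are all hypothetical, and the real analysis used transformer sentiment scores and Yahoo Finance data rather than hand-typed lists.

```python
import math


def daily_returns(prices):
    """Simple returns between consecutive closing prices."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Toy data: one average sentiment score per day, aligned with that day's return.
prices = [100.0, 102.0, 101.0, 104.0, 103.0]
sentiment = [0.3, -0.1, 0.4, -0.2]

returns = daily_returns(prices)
print("correlation:", pearson(sentiment, returns))
print("slope, intercept:", fit_line(sentiment, returns))
```

A correlation computed this way describes co-movement only. It does not by itself show that sentiment drives returns.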
NLP
Sentiment Analysis
Finance
Visualization
Python