Abdelwahab Amr: Aspiring Data Scientist & AI Engineer
I'm a 4th-year Software Engineering student passionate about Artificial Intelligence, with strong hands-on experience in Python, Machine Learning, and data analysis. I'm actively seeking opportunities to apply my skills and contribute to innovative projects.
CV
My Journey in Software Engineering & AI
1
Delta Technology University
Currently pursuing a Bachelor’s in Software Engineering (Year 4), with an expected graduation in May 2026. My studies provide a strong foundation in software development principles, complemented by specialized courses in AI and data science.
2
Rigorous Curriculum
Engaging with a comprehensive curriculum that covers advanced algorithms, data structures, and various programming paradigms. This academic rigor prepares me for complex problem-solving in the tech industry.
3
Hands-On Learning
Actively participating in practical labs and project-based learning, applying theoretical knowledge to real-world scenarios in AI and Machine Learning. This fosters a deep understanding of the software development lifecycle.
Professional Experience & Impact
Genius Technology Center
Machine Learning Intern (Apr 2024 - May 2024)
Completed intensive ML training covering preprocessing, EDA, and deep learning. Built a California House Price model (R²=0.846, MAE=29K) and a Diabetes Prediction System (94% accuracy, 85% recall). Developed a Hybrid Recommendation System.
Data Scientist Intern – Digital Egypt Pioneers Initiative
Sep 2024 – May 2025
  • Completed intensive training in Artificial Intelligence, Data Science, and Machine Learning, with a focus on building, evaluating, and deploying predictive models. Developed end‑to‑end projects, including a Sales Forecasting system (R²: 0.992) and a Diabetes Diagnosis Support System (94% accuracy, 85% recall). Gained hands‑on experience in data preprocessing, feature engineering, statistical analysis, SQL querying, and web scraping for real‑world datasets. Utilized Python, Scikit‑Learn, XGBoost, FastAPI, MLflow, and Power BI to track experiments, deploy solutions, and deliver actionable insights. Strengthened problem‑solving, analytical thinking, and collaboration through teamwork with data professionals.
Data Scientist Intern – Black Horse
Apr 2024 – Jun 2024
  • Completed practical training in data analysis and machine learning, focusing on regression models, feature engineering, and model evaluation. Developed Breast Cancer Prediction (96% accuracy) and House Price Prediction (91% accuracy) projects using Python, Scikit‑Learn, and Pandas. Processed and explored 100k+ records, applying data preprocessing, visualization, and web scraping to extract actionable insights from real‑world datasets. Strengthened problem-solving and hands-on coding through end‑to‑end predictive modeling and performance optimization.
Technical Skills & Expertise
Programming
C++, Java, Python, SQL with strong OOP principles and clean code practices
AI & ML
Pandas, NumPy, Scikit-Learn, TensorFlow, PyTorch, and advanced algorithms
Data Systems
SQLite, MySQL, PostgreSQL with complex queries and optimization
Analytics
Matplotlib, Seaborn, Power BI for data visualization and storytelling
Web Development
HTML, CSS, FastAPI
Tools & Platforms
GitHub, Jupyter Notebook, Kaggle, DataCamp, Google Colab
Featured Data Science & AI Projects
1
Sales Forecasting System (DEPI Final Project)
Built ML models (Prophet, XGBoost) for sales prediction, achieving an MAE of 7.19 and an R² of 0.992. Deployed with FastAPI and MLflow, integrating an interactive dashboard.
2
Hybrid Recommendation System (GTC Final Project)
Combined content-based and collaborative filtering for accurate movie recommendations. Used TF-IDF vectorization and cosine similarity to deliver personalized suggestions with higher accuracy than individual models.
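A minimal sketch of the hybrid idea, not the project's actual code: TF-IDF vectors and cosine similarity give a content score, which is blended with a collaborative score through a weight. The sample movie descriptions, the `collab` scores, and the `alpha` weight are hypothetical, and the TF-IDF weighting here is one common smoothed variant written in plain Python for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using smoothed TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_scores(liked_idx, vectors, collab_scores, alpha=0.5):
    """Blend content similarity to a liked movie with collaborative scores."""
    return [
        alpha * cosine(vectors[liked_idx], vec) + (1 - alpha) * collab
        for vec, collab in zip(vectors, collab_scores)
    ]


# Hypothetical movie descriptions; index 0 is the movie the user liked.
movies = [
    "space adventure with aliens",
    "aliens invade in a space war",
    "romantic comedy in paris",
]
# Hypothetical collaborative-filtering scores, already scaled to [0, 1].
collab = [0.0, 0.6, 0.7]

vectors = tfidf_vectors(movies)
scores = hybrid_scores(0, vectors, collab, alpha=0.5)
```

With these made-up inputs, the second movie outranks the third even though its collaborative score is lower, because it shares terms with the liked movie. A higher `alpha` leans further toward content similarity.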
3
Diabetes Diagnosis Support System
ML system leveraging demographic and lab data (HbA1c, glucose) to assist diagnosis. Achieved 94% accuracy and 85% recall using XGBoost tuning, SMOTE balancing, and Streamlit deployment.
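To illustrate the SMOTE balancing step, here is a simplified plain-Python sketch of the core idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. The feature rows, `k`, and `seed` are hypothetical, and this is not the project's code; libraries such as imbalanced-learn provide a full implementation.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical (HbA1c, glucose) rows for the minority class.
minority = [(6.8, 150.0), (7.2, 165.0), (8.1, 190.0), (7.5, 172.0)]
new_points = smote(minority, n_new=6)
```

Because every synthetic row lies on a line segment between two real minority rows, the new points stay inside the range of the observed minority data. Oversampling of this kind is normally applied to the training split only, so the test set keeps its original class balance.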
4
Customer Churn Prediction
This project helps telecom companies identify customers who are likely to leave (churn) so they can take proactive measures to retain them. I built and compared three models:
  • Logistic Regression → Accuracy: 73%
  • XGBRFClassifier → Accuracy: 77%
  • XGBoost Classifier (Tuned) → Accuracy: 79.6% / Recall: 78%
5
Movie Ratings Classification
Preprocessed movie metadata and engineered features for a Random Forest model, achieving ~70% accuracy in classifying movie ratings (High/Medium/Low).
6
Titanic Survival Prediction
Logistic regression model predicting passenger survival with engineered features (family size, age groups). Achieved 84% accuracy using strategic feature engineering and model optimization.
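The two engineered features mentioned above can be sketched as follows. This is an illustrative version, not the project's code: it assumes the standard Titanic column names `SibSp`, `Parch`, and `Age`, and the age-group cut-offs are hypothetical.

```python
def engineer_features(passenger):
    """Return a copy of a passenger record with family_size and age_group."""
    features = dict(passenger)
    # Family size = siblings/spouses + parents/children + the passenger.
    features["family_size"] = passenger["SibSp"] + passenger["Parch"] + 1

    age = passenger.get("Age")
    if age is None:
        features["age_group"] = "unknown"  # missing ages get their own bucket
    elif age < 13:
        features["age_group"] = "child"
    elif age < 20:
        features["age_group"] = "teen"
    elif age < 60:
        features["age_group"] = "adult"
    else:
        features["age_group"] = "senior"
    return features


# Hypothetical passenger record.
example = engineer_features({"SibSp": 1, "Parch": 2, "Age": 8})
```

Here `example["family_size"]` is 4 and `example["age_group"]` is `"child"`. Categorical groups like these would then be one-hot encoded before being passed to a logistic regression model.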
7
House Price Prediction (Black Horse)
Developed an XGBRegressor model with 84% prediction accuracy, focusing on robust feature engineering and Scikit-learn-based evaluation (MSE/R²).
Certifications & Continuous Learning
I am committed to continuous learning and professional development, constantly expanding my knowledge base through online courses and specialized certifications.
Python & Web Development
Udemy: Full-stack development using Python, Django, and HTML5 with multiple web-based applications (May 2024)
Data Science Fundamentals
Black Horse: Data analysis workflows and ML models with real-world classification and prediction projects (Oct 2024)
Data Visualization Mastery
Kaggle: Applied visualization techniques using Matplotlib and Seaborn to extract actionable insights (Sep 2024)
Mathematics for Data Science
365 Data Science: Probability, statistics, linear equations, and matrices essential for ML (Nov 2024)
SQL & Database Management
DataCamp: Querying, filtering, joins, grouping, and subqueries through interactive exercises (Jan 2025)
Digital Egypt Pioneers Program
Trained by AMIT on the IBM Data Science Professional Certificate, including Python, SQL, ML algorithms, model evaluation, and deployment.
Stanford ML Specialization
Coursera: Three-course specialization by Andrew Ng covering supervised learning, unsupervised learning, and neural networks (Oct 2025)
Freelance Project
This project showcases my ability to develop robust data solutions and interactive dashboards that provide actionable insights for strategic decision-making.
Interactive Dashboard for Academic/Committee Invitation Data Analysis
Developed a comprehensive dashboard to analyze invitation data for various academic and committee positions, including Chair, Member, Alternative roles, and Vice Chair.
  • Response Patterns: Captured detailed insights into response rates and engagement levels.
  • Demographic Analysis: Visualized gender distribution and geographic representation of invitees and acceptors.
  • Trend Monitoring: Tracked monthly acceptance trends to identify peak periods and inform future planning.
  • Goal: Empowered stakeholders to monitor engagement, identify critical patterns, and optimize resource allocation for recruitment and outreach efforts.
Deep Learning Research Freelance Project
Paper name:
Project story
This freelance research project focused on enhancing an existing pneumonia diagnosis research paper by improving the deep learning architecture and adding scientific contributions to boost performance and interpretability. The primary goal was to improve classification accuracy while maintaining high recall, which is critical for medical screening applications.
Dataset & Methodology
  • Dataset: CoronaHack-Chest X-Ray Dataset
  • Normal samples: 1,576 | Pneumonia samples: 4,334
  • Image Size: 224×224 pixels
  • Optimizer: Adam (lr=1e-4)
  • Loss Function: BCEWithLogitsLoss
  • Training Epochs: 25
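For readers unfamiliar with the loss listed above, this plain-Python sketch shows what a binary cross-entropy on raw logits computes for a single prediction, using the numerically stable softplus form. It is an illustration rather than the training code; the `pos_weight` argument mirrors the optional class-weighting idea, and whether any weighting was used in this project is not stated here.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def bce_with_logits(logit, target, pos_weight=1.0):
    """Binary cross-entropy computed directly from a raw logit.

    target is 0.0 or 1.0; pos_weight scales the positive-class term.
    """
    # -log(sigmoid(x)) == softplus(-x) and -log(1 - sigmoid(x)) == softplus(x)
    return (
        pos_weight * target * softplus(-logit)
        + (1.0 - target) * softplus(logit)
    )


# An undecided logit of 0 costs log(2) whichever class is correct.
loss_uncertain = bce_with_logits(0.0, 1.0)
# A confident wrong prediction is penalised roughly linearly in the logit.
loss_wrong = bce_with_logits(-1000.0, 1.0)
```

Working on logits instead of probabilities avoids taking the log of a value that has been rounded to 0 or 1, which is why extreme logits such as -1000 still produce a finite loss.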
Models Compared
  • Baseline: ResNet18 (pretrained on ImageNet)
  • Proposed: SE-ResNet18 (ResNet18 + SE Blocks + Grad-CAM)
Performance Metrics Comparison
  • Accuracy: 84.94% → 88.78% (+3.84 percentage points)
  • Precision: 0.810 → 0.852
  • Recall: 0.992 → 0.992
  • F1 Score: 0.892 → 0.917
  • AUC: 0.943 → 0.973
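As a quick consistency check on the figures above, the F1 scores can be recomputed from the reported precision and recall, and the accuracy gain from the two accuracy values. The short script below uses only the numbers listed in this section.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Metrics as reported above (accuracy in percent).
baseline = {"accuracy": 84.94, "precision": 0.810, "recall": 0.992}
proposed = {"accuracy": 88.78, "precision": 0.852, "recall": 0.992}

baseline_f1 = round(f1_score(baseline["precision"], baseline["recall"]), 3)
proposed_f1 = round(f1_score(proposed["precision"], proposed["recall"]), 3)
accuracy_gain = round(proposed["accuracy"] - baseline["accuracy"], 2)

print(baseline_f1, proposed_f1, accuracy_gain)  # 0.892 0.917 3.84
```

The recomputed values match the reported F1 scores. Since recall is unchanged, the F1 improvement comes entirely from the higher precision.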
Key Insights
  • SE blocks improved feature selection and attention mechanisms
  • Grad-CAM heatmaps confirmed focus on infected lung regions
  • Very high recall (0.992) crucial for medical screening
  • Grad-CAM turns the black-box CNN into a more interpretable diagnostic tool
Grad-CAM Visualization Example
Heatmap showing model focus on infected lung regions

Model Performance Comparison
Visual comparison of metrics between baseline and proposed model
Let's Connect
I'm passionate about discussing innovative AI solutions, exploring collaboration opportunities, and connecting with fellow data science enthusiasts. Whether you're interested in projects, opportunities, or simply want to exchange ideas—I'd love to hear from you.
📱 Phone
+20 01272710802
Professional Networks
GitHub
github.com/abdelwahabamr – Explore my code repositories and projects
LinkedIn
linkedin.com/in/abdelwahab798 – Connect professionally and follow updates