
Data Science Engineering Student

Anas Meftah

Machine Learning & AI: ML pipelines, LLM/RAG systems, and ML monitoring, with research interests in robust AI and neural network verification.

I'm a Data Science engineering student at the Faculty of Sciences of Tunis (expected 2028). I build end-to-end machine learning systems — from data pipelines and drift detection to RAG-based LLM applications — with a focus on rigor, reliability, and clean engineering.

About

Background

I'm currently pursuing a National Engineering Diploma in Data Science at the Faculty of Sciences of Tunis (Université de Tunis El Manar), after completing an integrated preparatory cycle in engineering and a Mathematics Baccalaureate with highest honors.

My work so far centers on practical machine learning engineering: I've built a feature drift detection system using Kolmogorov–Smirnov tests over production data streams, an MCP server that connects PDFs to LLMs through a RAG pipeline, and a from-scratch reproduction of the Vision Transformer paper in PyTorch. As an R&D member of the Securinets cybersecurity club, I've applied ML to network intrusion detection on the NSL-KDD dataset.

My engineering philosophy is that machine learning systems should be as trustworthy as they are capable. That conviction drives my research interests: robustness, formal verification of neural networks, and building AI systems whose behavior can be analyzed and guaranteed rather than merely observed.

Arabic: Native
English: Professional
French: Advanced

Research

Research Interests

Directions I'm actively studying and building toward — from robust ML systems to formal guarantees for neural networks.

Robust & Trustworthy AI · Neural Network Verification · Formal Verification · Computer Vision · Language Models & Retrieval · Optimization & Learning Theory

Projects

Selected Work

End-to-end ML systems: monitoring pipelines, retrieval-augmented LLM applications, and paper reproductions.

End-to-end ML monitoring pipeline that detects distribution drift between reference and production data streams using Kolmogorov–Smirnov tests.
Python · NumPy · Pandas · SciPy · Streamlit
2024
Model Context Protocol server that connects technical PDFs to LLMs through an automated RAG pipeline with vector-database retrieval.
Python · LangChain · Vector Databases · LLMs · MCP
2025
From-scratch PyTorch reproduction of “An Image is Worth 16×16 Words” with modular OOP design and custom training pipelines.
Python · PyTorch
2024
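
As an illustration of the idea behind the drift-detection project, here is a minimal sketch of a two-sample Kolmogorov–Smirnov check on a single feature. It is not the project's code: the function names `ks_statistic` and `drift_detected`, the pure-Python implementation, and the large-sample critical value (c ≈ 1.358 for a significance level of about 0.05) are all assumptions made for this example. In practice `scipy.stats.ks_2samp` would normally be used instead.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, production):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    prod = sorted(production)
    d = 0.0
    # The largest gap between two step functions occurs at an observed value,
    # so it is enough to evaluate both empirical CDFs at every data point.
    for x in ref + prod:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_prod = bisect_right(prod, x) / len(prod)
        d = max(d, abs(cdf_ref - cdf_prod))
    return d


def drift_detected(reference, production, c_alpha=1.358):
    """Flag drift when D exceeds the large-sample critical value.

    c_alpha=1.358 corresponds to a significance level of about 0.05
    (an assumed threshold for this sketch).
    """
    n, m = len(reference), len(production)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, production) > critical


# Hypothetical feature values: a reference window and a shifted production window.
reference = [float(x) for x in range(100)]
shifted = [x + 50.0 for x in reference]

print(drift_detected(reference, reference))  # False
print(drift_detected(reference, shifted))    # True
```

A monitoring pipeline along these lines would typically run the check once per feature and per time window, then report which features exceed the threshold.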

Skills

Technical Toolkit

The languages, frameworks, and tooling I work with day to day.

Languages
Python (Advanced) · SQL · Bash/Shell · Java (Basic)
Machine Learning & AI
PyTorch · Scikit-learn · LangChain · RAG & LLMs · Vision Transformers · Drift Detection · Anomaly Detection · Signal Processing
MLOps & Statistics
MLflow · Streamlit · SciPy (KS test) · skopt (Bayesian tuning) · Model Evaluation (F1, Confusion Matrix) · Vector Databases
DevOps & Tools
Git / GitLab · Docker · Linux CLI · Pandas · NumPy · Matplotlib / Seaborn

Open Source

On GitHub

A selection of public repositories — learning journeys, experiments, and coursework.

@anesmeftah
My personal learning journey in Deep Learning with projects and notes.
Jupyter Notebook
Kaggle Competitions
Jupyter Notebook

Let's work together

Open to internships, research collaborations, and interesting ML engineering problems. Based in Tunis, Tunisia.