Machine Learning · Industrial Advanced

✈️ AirTransport Predictor – Industrial MLOps

Predicting aviation KPIs with over 93% accuracy using real BTS datasets, end-to-end MLOps pipelines, and automated deployment to AWS.

93%+ Predictive Accuracy
3 KPIs Forecasted
AWS Cloud Deploy

Project Overview

The AirTransport Predictor is a production-grade machine learning system designed to forecast three critical aviation Key Performance Indicators (passenger volumes, freight tonnage, and mail loads) using real-world datasets from the U.S. Bureau of Transportation Statistics (BTS).

The system achieves over 93% predictive accuracy through rigorous feature engineering, model selection, and hyperparameter optimization. The entire pipeline is automated using MLOps best practices: DVC manages data versioning, MLflow tracks all experiments and model artifacts, and GitHub Actions orchestrates CI/CD deployment to AWS.

A Streamlit dashboard provides real-time KPI visualization and forecasting, while a REST API allows integration with external operational systems. The project demonstrates how industrial-scale ML can be built, validated, monitored, and updated through a fully automated pipeline.

What You'll Learn

  • Design end-to-end ML pipelines for real-world transportation data
  • Implement DVC for reproducible data versioning and pipeline management
  • Track experiments and compare models using MLflow
  • Build and expose ML predictions through a production REST API
  • Automate the full build → test → deploy cycle with GitHub Actions on AWS
  • Create interactive KPI dashboards using Streamlit

System Architecture

BTS Data (Source) → Ingestion (Pipeline) → Feature Eng. (Processing) → Scikit-learn (Model) → MLflow (Tracking) → REST API (Serving) → AWS + CI/CD (Deploy)

Project Breakdown

01
Data Collection

Downloading and validating BTS aviation datasets. Setting up DVC remote storage and pipeline stages.
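
As a rough illustration of the validation step, the sketch below checks a downloaded CSV export for required columns, numeric values, and plausible ranges. The column names (YEAR, MONTH, PASSENGERS, FREIGHT, MAIL) and the function name `validate_rows` are assumptions made for this example, not taken from the project; adjust them to the actual BTS file layout.

```python
import csv
import io

# Assumed column names for illustration; adjust to the real BTS export.
REQUIRED_COLUMNS = ("YEAR", "MONTH", "PASSENGERS", "FREIGHT", "MAIL")


def validate_rows(csv_text):
    """Return (valid_rows, errors) for a CSV export given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [], [f"missing columns: {', '.join(missing)}"]

    valid, errors = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            record = {
                "year": int(row["YEAR"]),
                "month": int(row["MONTH"]),
                "passengers": float(row["PASSENGERS"]),
                "freight": float(row["FREIGHT"]),
                "mail": float(row["MAIL"]),
            }
        except (TypeError, ValueError):
            errors.append(f"line {line_no}: non-numeric value")
            continue
        if not 1 <= record["month"] <= 12:
            errors.append(f"line {line_no}: month out of range")
        elif min(record["passengers"], record["freight"], record["mail"]) < 0:
            errors.append(f"line {line_no}: negative volume")
        else:
            valid.append(record)
    return valid, errors
```

Rejected rows are reported with their line numbers rather than silently dropped, so a failing pipeline stage can point at the offending input.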

02
Feature Engineering

Time-series feature extraction, lag variables, rolling statistics, and categorical encoding for aviation data.
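
Lag variables and rolling statistics can be sketched without any dependencies, as below. This is a minimal illustration rather than the project's implementation (which would more likely rely on a dataframe library); the function name, the default lags, and the window size are assumptions chosen for the example.

```python
from statistics import mean


def add_lag_and_rolling(values, lags=(1, 12), window=3):
    """Build one feature dict per time step from a chronologically ordered series.

    Rolling means use only values strictly before the current step, so no
    feature leaks the value being predicted.
    """
    rows = []
    for t, target in enumerate(values):
        row = {"target": target}
        for lag in lags:
            # None marks steps that do not have enough history yet
            row[f"lag_{lag}"] = values[t - lag] if t >= lag else None
        history = values[max(0, t - window):t]
        row[f"roll_mean_{window}"] = mean(history) if len(history) == window else None
        rows.append(row)
    return rows
```

Steps without enough history get `None` instead of a partial value, so the caller decides whether to drop or impute those rows before training.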

03
Model Development

Training and comparing Random Forest, Gradient Boosting, and XGBoost models. MLflow experiment logging.
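
The comparison step might look like the sketch below: every candidate is scored on the same chronological holdout, then ranked. Two caveats apply. The source does not say which metric underlies the accuracy figure, so mean absolute percentage error (MAPE) is an assumption here. The two naive forecasters are also placeholders standing in for the Random Forest, Gradient Boosting, and XGBoost models, used only to keep the example dependency-free.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no zero actuals."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def naive_last(train, horizon):
    """Placeholder model: repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon, season=12):
    """Placeholder model: repeat the value from one season earlier."""
    return [train[len(train) - season + (h % season)] for h in range(horizon)]


def compare_models(series, models, holdout):
    """Score each model on the last `holdout` points, fitting only on earlier ones.

    `models` maps a name to a callable (train, horizon) -> list of forecasts.
    Returns {name: MAPE}, ordered from best to worst.
    """
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mape(test, fit_predict(train, len(test)))
        for name, fit_predict in models.items()
    }
    return dict(sorted(scores.items(), key=lambda item: item[1]))
```

Splitting by time rather than at random keeps future observations out of the training set. In a tracked run, each score in the returned dict would typically be logged as an MLflow metric alongside the model's parameters.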

04
API Development

Building a Flask/FastAPI REST endpoint for real-time KPI prediction with input validation.
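
Input validation for the prediction endpoint could be sketched as follows, independent of the web framework. The request fields (`kpi`, `year`, `month`) and the names `PredictionRequest` and `parse_request` are hypothetical, chosen for illustration; only the three KPI names come from the project overview.

```python
from dataclasses import dataclass

ALLOWED_KPIS = {"passengers", "freight", "mail"}  # the three KPIs named in the overview


@dataclass(frozen=True)
class PredictionRequest:
    kpi: str
    year: int
    month: int


def parse_request(payload):
    """Validate a decoded JSON body; return (request, None) or (None, error message)."""
    if not isinstance(payload, dict):
        return None, "body must be a JSON object"
    kpi = payload.get("kpi")
    if not isinstance(kpi, str) or kpi not in ALLOWED_KPIS:
        return None, f"kpi must be one of {sorted(ALLOWED_KPIS)}"
    year, month = payload.get("year"), payload.get("month")
    # bool is a subclass of int, so reject it explicitly
    for name, value in (("year", year), ("month", month)):
        if isinstance(value, bool) or not isinstance(value, int):
            return None, f"{name} must be an integer"
    if not 1 <= month <= 12:
        return None, "month must be between 1 and 12"
    return PredictionRequest(kpi, year, month), None
```

A route handler would call `parse_request` on the decoded body, return an HTTP 400 with the error message when validation fails, and pass the request object to the model otherwise.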

05
Dashboard

Interactive Streamlit app for exploring forecasts, confidence intervals, and historical trends.

06
MLOps Deployment

Dockerizing the app, configuring GitHub Actions workflow, and deploying to AWS EC2/ECS with automated rollback.
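
The automated rollback can be reduced to a small piece of control logic, sketched below. The callables `activate` and `is_healthy` are hypothetical hooks that the caller would supply (for example, wrappers around the deployment tooling and a health endpoint); nothing here reflects the project's actual workflow file.

```python
def deploy_with_rollback(new_version, current_version, activate, is_healthy, checks=3):
    """Activate new_version; revert to current_version if any health check fails.

    `activate(version)` and `is_healthy()` are hypothetical callables supplied
    by the caller. Returns the version left running.
    """
    activate(new_version)
    for _ in range(checks):
        if not is_healthy():
            activate(current_version)  # automated rollback
            return current_version
    return new_version
```

Returning the version left running lets the calling CI job fail the build when the result differs from the version it tried to deploy.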