🛒 Retail Sales Analytics Pipeline

An end-to-end data analytics pipeline that processes 475,000+ synthetic retail transactions using Python, SQL, and an interactive web dashboard.
🌐 Live Demo
👉 View Interactive Dashboard
✨ Features
- 📊 Data Generation: Creates 500K+ realistic retail transactions
- 🔄 ETL Pipeline: Automated data cleaning and transformation
- 💾 Database: SQLite with optimized indexes
- 🧮 SQL Analytics: CTEs, window functions, complex joins
- 📈 Visualizations: 8 professional charts (Matplotlib/Seaborn)
- 🌐 Interactive Dashboard: Web-based dashboard with Plotly.js
- 💼 Power BI Ready: Pre-processed export files
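
To illustrate the kind of query the SQL Analytics feature refers to, here is a minimal sketch that combines an index, a CTE, and a window function using Python's built-in `sqlite3` module. The `transactions` table, its columns, and the sample rows are hypothetical and are not taken from the project's actual schema. Window functions need SQLite 3.25 or newer, which recent Python builds bundle.

```python
import sqlite3

# In-memory database with a hypothetical schema (not the project's real one).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        customer_id    INTEGER NOT NULL,
        order_date     TEXT    NOT NULL,  -- ISO date, e.g. 2024-01-15
        revenue        REAL    NOT NULL
    );
    -- Index the column used for date-based grouping and filtering.
    CREATE INDEX idx_transactions_order_date ON transactions (order_date);
    """
)

# Made-up sample rows, for illustration only.
sample_rows = [
    (1, 101, "2024-01-15", 120.0),
    (2, 102, "2024-01-20", 80.0),
    (3, 101, "2024-02-03", 200.0),
    (4, 103, "2024-02-18", 50.0),
    (5, 102, "2024-03-09", 300.0),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", sample_rows)

# The CTE aggregates revenue per month; the window function adds a running total.
query = """
WITH monthly AS (
    SELECT substr(order_date, 1, 7) AS month,
           SUM(revenue)             AS revenue
    FROM transactions
    GROUP BY substr(order_date, 1, 7)
)
SELECT month,
       revenue,
       SUM(revenue) OVER (ORDER BY month) AS running_revenue
FROM monthly
ORDER BY month;
"""
monthly_revenue = conn.execute(query).fetchall()

for month, revenue, running_revenue in monthly_revenue:
    print(month, revenue, running_revenue)
```

The same pattern extends to per-category or per-customer rankings by adding `PARTITION BY` to the window clause.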
🚀 Quick Start
# Clone the repository
git clone https://github.com/Jin02420/retail-sales-analytics.git
cd retail-sales-analytics
# Install dependencies
pip install -r requirements.txt
# Run the pipeline (run_pipeline.bat is a Windows batch script)
run_pipeline.bat
# Open dashboard.html in your browser
📊 Tech Stack
- Python 3.12 - Core language
- Pandas & NumPy - Data manipulation
- SQLite - Database management
- Matplotlib & Seaborn - Static visualizations
- Plotly.js - Interactive charts
- Faker - Synthetic data generation
📈 Project Highlights
- 💰 Total Revenue: $322.9M analyzed
- 📊 Total Profit: $112.8M calculated
- 📉 Profit Margin: 34.9% overall (total profit / total revenue)
- 🛒 Transactions: 475,352 processed
- 👥 Customers: 50,000 unique
- 📦 Products: 1,000 items
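
As a quick sanity check, the headline figures above can be related to each other with simple arithmetic. The sketch below uses only the rounded totals listed in this section, so the derived per-transaction and per-customer values are approximations and not figures reported by the pipeline itself.

```python
# Rounded totals copied from the highlights above.
total_revenue = 322.9e6   # USD
total_profit = 112.8e6    # USD
transactions = 475_352
customers = 50_000

# Profit margin is total profit as a share of total revenue.
profit_margin_pct = total_profit / total_revenue * 100

# Derived approximations (not reported by the project).
avg_revenue_per_transaction = total_revenue / transactions
avg_transactions_per_customer = transactions / customers

print(f"Profit margin: {profit_margin_pct:.1f}%")
print(f"Avg revenue per transaction: ${avg_revenue_per_transaction:,.2f}")
print(f"Avg transactions per customer: {avg_transactions_per_customer:.1f}")
```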
🎯 Skills Demonstrated
- Python programming & scripting
- SQL database design & optimization
- ETL pipeline development
- Data analysis & visualization
- Business intelligence
- Web development (HTML/CSS/JavaScript)
- Version control with Git
📁 Project Structure
src/
├── 01_data_generation.py # Generate synthetic data
├── 02_etl_pipeline.py # ETL processes
├── 03_sql_analytics.py # SQL analysis
├── 04_visualization.py # Create charts
└── 05_power_bi_export.py # Export for Power BI
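
Since `run_pipeline.bat` only runs on Windows, a small Python runner is one way to execute the same stages on any platform. This is a sketch under the assumption that the batch file simply runs the scripts above in numbered order; the contents of `run_pipeline.bat` are not shown here, and `build_commands` and `run_pipeline` are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path

# Script names taken from the project structure above, in numbered order.
PIPELINE_SCRIPTS = [
    "01_data_generation.py",
    "02_etl_pipeline.py",
    "03_sql_analytics.py",
    "04_visualization.py",
    "05_power_bi_export.py",
]


def build_commands(src_dir="src"):
    """Return one command per stage, using the current Python interpreter."""
    return [[sys.executable, str(Path(src_dir) / name)] for name in PIPELINE_SCRIPTS]


def run_pipeline(src_dir="src"):
    """Run each stage in order; stop at the first stage that fails."""
    for command in build_commands(src_dir):
        print(f"Running {command[1]} ...")
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repository root would then run all five stages in sequence.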
👤 Author
Jin02420 - GitHub Profile
📄 License
MIT License
⭐ If you found this project helpful, please give it a star!