Job Market Analytics Pipeline
A complete end-to-end data engineering pipeline: extract, process, and visualize job market data with Airflow, APIs, and Streamlit.
End-to-End Pipeline Overview
This project demonstrates how to build a fully automated data pipeline for job market analytics, from data extraction to interactive visualization. The pipeline is orchestrated with Airflow, retrieves data from the Adzuna API, stores it in a SQLite database, processes and transforms the data, and finally presents insights through a Streamlit dashboard.
The architecture is modular and production-ready: each step (API extraction, storage, transformation, analytics) is decoupled and can be extended or replaced. Airflow ensures reliability and automation, while Streamlit provides a modern, interactive UI for business users.
- Automated, scheduled data collection (Airflow DAGs)
- Robust data storage and transformation (SQLite, Python, SQL)
- Real-time analytics and visualization (Streamlit, Plotly)
- Easy to extend for new data sources or analytics
Tech Stack
Airflow, Adzuna API, SQLite, Python, SQL, Streamlit, Plotly
Key Features
Real-time Data Extraction
Automated retrieval of job data from the Adzuna API using Airflow.
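As a rough illustration of the extraction step, the sketch below builds a search request and fetches one page of results using only the standard library. The endpoint pattern, the query parameter names, and the `results` key are assumptions based on Adzuna's public job search API and should be checked against the official documentation; `build_search_url` and `fetch_jobs` are hypothetical helper names, not functions from this repository.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint pattern for Adzuna's job search API; verify against the official docs.
ADZUNA_BASE_URL = "https://api.adzuna.com/v1/api/jobs"


def build_search_url(app_id, app_key, country="gb", page=1,
                     what="data engineer", results_per_page=50):
    """Build the URL for one page of search results."""
    query = urllib.parse.urlencode({
        "app_id": app_id,
        "app_key": app_key,
        "what": what,
        "results_per_page": results_per_page,
    })
    return f"{ADZUNA_BASE_URL}/{country}/search/{page}?{query}"


def fetch_jobs(app_id, app_key, **search_params):
    """Fetch one page of job ads and return the list under the assumed 'results' key."""
    url = build_search_url(app_id, app_key, **search_params)
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload.get("results", [])
```

Keeping URL construction separate from the network call makes the request logic easy to check without hitting the API.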
Database Storage
Store raw and processed data in a local SQLite database for reliability and easy querying.
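One possible way to store raw records is sketched below with Python's built-in `sqlite3` module. The `raw_jobs` table, its columns, and the assumption that each record carries an `id` field are illustrative only; the repository's actual schema may differ.

```python
import json
import sqlite3


def init_db(conn):
    """Create a hypothetical table holding one JSON payload per job ad."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS raw_jobs (
            job_id     TEXT PRIMARY KEY,
            fetched_at TEXT NOT NULL DEFAULT (datetime('now')),
            payload    TEXT NOT NULL
        )
    """)
    conn.commit()


def store_raw_jobs(conn, jobs):
    """Insert raw job ads, replacing any existing row with the same id."""
    rows = [(str(job["id"]), json.dumps(job)) for job in jobs]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO raw_jobs (job_id, payload) VALUES (?, ?)",
            rows,
        )
    return len(rows)
```

Keying on the job id means re-running the extraction does not create duplicate rows; a later fetch of the same ad simply refreshes its payload.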
Data Transformation
Clean, enrich, and aggregate job data using Python and SQL workflows.
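The following sketch shows what a combined Python and SQL transformation could look like: records are normalized in Python, then aggregated in SQL. The flat record shape (`title`, `location`, `salary_min`, `salary_max`) and the `jobs` table are assumptions made for the example, not the project's actual data model.

```python
import sqlite3

# Average of the salary midpoints per role and location, ignoring ads without salary data.
SALARY_BY_ROLE_AND_LOCATION = """
    SELECT title,
           location,
           COUNT(*) AS postings,
           ROUND(AVG((salary_min + salary_max) / 2.0), 2) AS avg_salary
    FROM jobs
    WHERE salary_min IS NOT NULL AND salary_max IS NOT NULL
    GROUP BY title, location
    ORDER BY avg_salary DESC
"""


def clean_job(raw):
    """Normalize one raw record into (title, location, salary_min, salary_max)."""
    title = " ".join(str(raw.get("title", "")).split()).lower()  # collapse whitespace, lower-case
    location = str(raw.get("location", "")).strip() or "Unknown"
    return title, location, raw.get("salary_min"), raw.get("salary_max")


def load_clean_jobs(conn, raw_jobs):
    """Write cleaned records into a hypothetical 'jobs' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs "
        "(title TEXT, location TEXT, salary_min REAL, salary_max REAL)"
    )
    with conn:
        conn.executemany(
            "INSERT INTO jobs VALUES (?, ?, ?, ?)",
            [clean_job(job) for job in raw_jobs],
        )


def salary_summary(conn):
    """Return (title, location, postings, avg_salary) rows, highest average first."""
    return conn.execute(SALARY_BY_ROLE_AND_LOCATION).fetchall()
```

Normalizing titles before grouping keeps variants such as " data  engineer " and "Data Engineer" in the same bucket.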
Interactive Analytics
Visualize trends and insights with Streamlit and Plotly dashboards.
Orchestration with Airflow
Schedule and monitor the entire pipeline for full automation.
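A minimal sketch of how the three stages might be wired into a daily Airflow DAG is shown below, written against the Airflow 2.x API. The DAG id, the schedule, and the placeholder step functions are assumptions for illustration; the repository's real DAG is not shown in this document. Airflow is imported inside `build_dag` so the step functions can be exercised without a scheduler installed.

```python
from datetime import datetime


def extract():
    """Placeholder: fetch job ads from the API."""
    return "extract"


def load():
    """Placeholder: write raw records to the database."""
    return "load"


def transform():
    """Placeholder: clean and aggregate the stored records."""
    return "transform"


# Task order for the pipeline: extract, then load, then transform.
PIPELINE_STEPS = [("extract", extract), ("load", load), ("transform", transform)]


def build_dag():
    """Create the DAG, chaining the steps in order."""
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="job_market_pipeline",    # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule="@daily",               # Airflow 2.4+; older releases use schedule_interval
        catchup=False,                   # do not backfill runs before today
    ) as dag:
        previous = None
        for task_id, fn in PIPELINE_STEPS:
            task = PythonOperator(task_id=task_id, python_callable=fn)
            if previous is not None:
                previous >> task         # run this task after the previous one
            previous = task
    return dag
```

In a file inside the Airflow DAGs folder, a module-level `dag = build_dag()` would let the scheduler discover the DAG.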
Key Insights Delivered
- Salary trends by role and location
- Top hiring companies analysis
- Skills demand tracking
- Geographic job distribution
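As one example of how an insight from the list above could be computed, the sketch below tracks skills demand by counting how many job descriptions mention each skill. The skill list and the `count_skill_mentions` helper are hypothetical; the project may use a different method.

```python
import re
from collections import Counter

TRACKED_SKILLS = ["python", "sql", "airflow", "spark", "aws"]  # hypothetical skill list


def count_skill_mentions(descriptions, skills=TRACKED_SKILLS):
    """Count the descriptions that mention each skill at least once.

    Matching is case-insensitive and whole-word, so "NoSQL" does not count as "sql".
    """
    patterns = {
        skill: re.compile(rf"(?<!\w){re.escape(skill)}(?!\w)", re.IGNORECASE)
        for skill in skills
    }
    counts = Counter()
    for text in descriptions:
        for skill, pattern in patterns.items():
            if pattern.search(text):
                counts[skill] += 1
    return counts
```

Counting each description once per skill measures how many postings ask for a skill, rather than how often the word is repeated within a single ad.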
Quick Start Tutorial: Build Your Own Automated Pipeline
- Clone the repository:
git clone https://github.com/HumbledDS/job-market-pipeline
- Install dependencies:
pip install -r requirements.txt
- Configure your API keys and settings (see the config/ folder).
- Run the complete pipeline with Airflow:
python scripts/run_complete_pipeline.py
This script triggers the Airflow DAG to extract, load, and transform job data automatically.
- Launch the Streamlit dashboard:
streamlit run dashboard/job_market_dashboard.py
Explore salary trends, top companies, skills demand, and more in real time.
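For the configuration step in the tutorial above, one common pattern is to read API credentials from environment variables and fail early when they are missing. The variable names `ADZUNA_APP_ID` and `ADZUNA_APP_KEY` and the helper below are assumptions for illustration; the format actually used in the config/ folder is not described here.

```python
import os


def load_adzuna_credentials(env=os.environ):
    """Read API credentials from environment-style settings, failing early if any are missing."""
    required = ("ADZUNA_APP_ID", "ADZUNA_APP_KEY")  # hypothetical variable names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {"app_id": env["ADZUNA_APP_ID"], "app_key": env["ADZUNA_APP_KEY"]}
```

Passing a plain dictionary as `env` makes the function easy to test without touching the real environment.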