The Weather Automation ELT Pipeline is designed to create a real-time, comprehensive view of weather conditions across the United States. It continuously gathers, stores, and transforms weather data so that the resulting dataset stays current, functioning like a network of automated weather observers stationed in every major city.
The extraction process fetches real-time weather data using API calls to WeatherAPI, acting as a digital weather balloon that continuously collects raw atmospheric data. This process is fully automated, ensuring a constant stream of fresh weather information from various locations across the country.
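As a rough illustration of the extraction step, the sketch below requests current conditions for one city. It assumes WeatherAPI's `current.json` endpoint with `key` and `q` query parameters; the function names `build_request_url` and `fetch_current_weather` are hypothetical and not taken from this project.

```python
import json
import urllib.parse
import urllib.request

# Assumption: WeatherAPI's current-conditions endpoint. Verify against the
# WeatherAPI documentation before relying on it.
BASE_URL = "https://api.weatherapi.com/v1/current.json"


def build_request_url(api_key: str, city: str) -> str:
    """Build the request URL for one city's current conditions."""
    query = urllib.parse.urlencode({"key": api_key, "q": city})
    return f"{BASE_URL}?{query}"


def fetch_current_weather(api_key: str, city: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the raw JSON payload for one city (hypothetical helper)."""
    url = build_request_url(api_key, city)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

An extraction container could call `fetch_current_weather` for each city on a fixed interval and hand every payload to the storage step unchanged.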
Once extracted, the raw weather data is stored in Amazon S3, a scalable object storage service. The data is archived in a designated raw/ folder within the S3 bucket, preserving the original, unprocessed observations so they can be transformed or reprocessed later.
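One way the raw archive could be written is sketched below. The key layout (`raw/<city>/<timestamp>.json`) and the helper names are assumptions made for illustration; only the raw/ prefix comes from the description above. The `s3_client` argument is expected to behave like a boto3 S3 client, for example one created with `boto3.client("s3")`.

```python
import json
from datetime import datetime, timezone


def raw_key(city: str, observed_at: datetime) -> str:
    """Build an object key under raw/ (the layout is a hypothetical choice)."""
    slug = city.strip().lower().replace(" ", "_")
    stamp = observed_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"raw/{slug}/{stamp}.json"


def upload_raw(s3_client, bucket: str, city: str, payload: dict,
               observed_at: datetime) -> str:
    """Store the unmodified API payload as JSON and return the key used."""
    key = raw_key(city, observed_at)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )
    return key
```

Passing the client in as an argument keeps the function easy to exercise with a stub, and a timestamped key means earlier observations are never overwritten.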
The transformation phase processes the raw weather data into a structured and analyzable format. This step involves:
The transformed data is then stored in the transformed/ folder in S3, ensuring it is ready for downstream applications such as forecasting and analytics.
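The transformation can be pictured as flattening each nested API payload into one flat record. The sketch below is an assumption about what that might look like: the selected fields follow the shape of WeatherAPI's current-conditions response, but the project's actual transformation steps are not listed here, and `flatten_observation` and `transformed_key` are hypothetical names.

```python
def flatten_observation(raw: dict) -> dict:
    """Flatten one nested payload into a single flat record.

    The chosen fields are illustrative, not the project's confirmed schema.
    """
    location = raw.get("location", {})
    current = raw.get("current", {})
    return {
        "city": location.get("name"),
        "region": location.get("region"),
        "local_time": location.get("localtime"),
        "temp_c": current.get("temp_c"),
        "humidity": current.get("humidity"),
        "wind_kph": current.get("wind_kph"),
        "condition": current.get("condition", {}).get("text"),
    }


def transformed_key(raw_object_key: str) -> str:
    """Map a key under raw/ to the matching key under transformed/."""
    if not raw_object_key.startswith("raw/"):
        raise ValueError(f"expected a key under raw/, got {raw_object_key!r}")
    return "transformed/" + raw_object_key[len("raw/"):]
```

Missing fields come back as `None` rather than raising, so a partially populated payload still produces a record that downstream checks can inspect.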
To maintain efficiency, the entire pipeline is fully automated using Docker containers and Docker Compose. Each container performs a specific task—data extraction, storage, transformation—working in sync to keep the system running 24/7. This setup ensures:
To enhance usability, a Streamlit-based dashboard is integrated to provide real-time weather data visualization.
The Streamlit app reads the latest transformed weather data from the S3 bucket and visualizes it in real time, giving users an interactive view of current weather conditions.
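To show the latest data, the dashboard first has to decide which object under transformed/ is newest. A minimal sketch of that selection follows, assuming the object listing has the shape of the `Contents` entries returned by boto3's `list_objects_v2` (each with `Key` and `LastModified`); `latest_object_key` is a hypothetical helper, not part of the project.

```python
from typing import Optional


def latest_object_key(objects: list, prefix: str = "transformed/") -> Optional[str]:
    """Return the key of the most recently modified object under the prefix.

    `objects` is a list of dicts with "Key" and "LastModified" entries.
    Folder placeholder keys (ending in "/") are ignored.
    """
    candidates = [
        obj for obj in objects
        if obj["Key"].startswith(prefix) and not obj["Key"].endswith("/")
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda obj: obj["LastModified"])["Key"]
```

In the app, the listing could come from `s3.list_objects_v2(Bucket=..., Prefix="transformed/")`, keeping in mind that each call returns at most 1,000 keys, so larger buckets need pagination. The selected object can then be loaded and handed to Streamlit's display functions.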
Overview Metrics: Displays key weather statistics, including:
This Weather Automation ELT Pipeline transforms raw weather observations into a real-time, structured, and ready-to-use dataset, making it a powerful tool for weather analytics, forecasting, and decision-making.