Task 1: Data Collection
Task: Collect historical data from PJM Interconnection for the Hoboken area.
Goal: Produce a preliminary dataset that can be cleaned and preprocessed to train the model.
Specification: The data shall include relevant information such as load data, date and time, and external factors (e.g. weather conditions, whether the day was a holiday or other special day, etc.).
Task 2: Data Preprocessing
Task: Clean and preprocess the dataset by addressing missing values and outliers.
Goal: This dataset will be used to train the model, so it must be cleaned to allow the most accurate and effective analysis.
Specification: Normalize the data, ensure that the timestamps are in a suitable format for effective analysis, and remove any erroneous data values.
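Example: A minimal sketch of the preprocessing steps in Task 2, using only the Python standard library. The field names `timestamp` and `load_mw`, the 3-standard-deviation outlier rule, and min-max normalization are all assumptions made for illustration; the plan does not prescribe a particular method, and missing values are simply dropped here rather than imputed.

```python
from datetime import datetime
from statistics import mean, stdev


def preprocess(rows, z_max=3.0):
    """Clean raw load records and add a normalized load column.

    rows: list of dicts with the hypothetical keys "timestamp"
    (ISO 8601 string) and "load_mw" (number, or None when missing).
    """
    # Drop rows with a missing load and parse timestamps into datetime objects.
    parsed = [
        {"timestamp": datetime.fromisoformat(r["timestamp"]),
         "load_mw": float(r["load_mw"])}
        for r in rows
        if r.get("load_mw") is not None
    ]

    # Remove outliers: values more than z_max standard deviations from the mean.
    # The filter is skipped when there are too few points or no spread.
    kept = parsed
    if len(parsed) >= 2:
        loads = [r["load_mw"] for r in parsed]
        mu, sigma = mean(loads), stdev(loads)
        if sigma > 0:
            kept = [r for r in parsed if abs(r["load_mw"] - mu) <= z_max * sigma]
    if not kept:
        return []

    # Min-max normalize the remaining loads into [0, 1].
    lo = min(r["load_mw"] for r in kept)
    hi = max(r["load_mw"] for r in kept)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    for r in kept:
        r["load_norm"] = (r["load_mw"] - lo) / span
    return kept
```

A z-score rule is only one option; on short or strongly seasonal series a rule based on the interquartile range or on per-hour statistics may be more appropriate.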
Task 3: Exploratory Data Analysis
Task: Use exploratory data analysis to inform the selection of an appropriate model.
Goal: To determine any trends or patterns in the dataset that would be useful when selecting an appropriate model.
Specification: Implementation of data visualization and statistics, noting any correlations between load and external factors (e.g. weather conditions).
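Example: One way to quantify the correlation between load and an external factor is the Pearson correlation coefficient. The sketch below is illustrative; the function name and the choice of Pearson correlation (rather than, say, a rank correlation) are assumptions, and the sample values are synthetic placeholders, not PJM data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Synthetic placeholder values for illustration only.
temperature_c = [18.0, 21.0, 25.0, 29.0, 33.0]
load_mw = [410.0, 430.0, 480.0, 540.0, 610.0]
r = pearson(temperature_c, load_mw)
```

A value of `r` near +1 or -1 indicates a strong linear relationship; the relationship between temperature and load is often nonlinear, so a scatter plot is a useful companion to the coefficient.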
Task 4: Feature Engineering
Task: Create new features from the existing dataset.
Goal: To develop features that would improve the performance of the model.
Specification: The features shall account for external variables such as weather conditions, and for calendar categories such as the day of the week, holidays, and special events.
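Example: Calendar features such as the day of the week and a holiday flag can be derived directly from each timestamp. This is a sketch under stated assumptions: the function name, the feature names, and the caller-supplied `holidays` set are hypothetical, and weather features would be joined in separately.

```python
from datetime import date, datetime


def calendar_features(ts, holidays=frozenset()):
    """Derive calendar features from a datetime.

    holidays: set of datetime.date values supplied by the caller
    (hypothetical; the plan does not name a holiday source).
    """
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),      # Monday = 0 ... Sunday = 6
        "month": ts.month,
        "is_weekend": ts.weekday() >= 5,  # Saturday or Sunday
        "is_holiday": ts.date() in holidays,
    }


features = calendar_features(datetime(2024, 7, 4, 15), holidays={date(2024, 7, 4)})
```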
Task 5: Model Selection
Task: Choose a suitable machine learning algorithm based on the trends and patterns identified in the data analysis.
Goal: To generate the most accurate load prediction using an appropriate model.
Specification: Different models can be compared and evaluated for their robustness and stability.
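Example: Candidate models can be compared by scoring each one on the same held-out portion of the series. The sketch below uses two naive baselines (persistence and seasonal naive) as stand-ins for real candidates and mean absolute error as the metric; these choices, the 24-step season, and all function names are assumptions for illustration, not the models the plan will select.

```python
def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def persistence(history, season=24):
    """Predict the next value as the most recent observation."""
    return history[-1]


def seasonal_naive(history, season=24):
    """Predict the next value as the observation one season earlier."""
    return history[-season]


def evaluate(model, series, start, season=24):
    """One-step-ahead walk-forward evaluation from index `start` onward."""
    if not season <= start < len(series):
        raise ValueError("start must be at least one season in and before the end")
    # At each step the model sees only the observations before time t.
    preds = [model(series[:t], season) for t in range(start, len(series))]
    return mae(series[start:], preds)


# Synthetic series with an exact 24-step cycle, for illustration only.
series = [float(h % 24) for h in range(72)]
scores = {m.__name__: evaluate(m, series, start=24)
          for m in (persistence, seasonal_naive)}
```

Any trained model can be dropped into `evaluate` as long as it accepts the history and returns a one-step-ahead prediction, which gives every candidate the same test conditions.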
Task 6: Research Existing Models
Task: Research existing models and the features used to develop them. This includes referring to relevant case studies, academic papers, and industry reports.
Goal: To have an outline of previously used strategies in order to determine which aspects were successful and which could be improved.
Specification: Based on the information found, a document shall be created that contains all the models used and their respective architectures, strengths, and weaknesses. Any unique techniques or approaches shall also be noted.
Task 7: Develop Model Architecture
Task: Develop the different components of the model, that is, the individual modules and how they interact with one another within the overall model.
Goal: To have a framework or outline of the model and how each module contributes to the overall functioning of the algorithm.
Specification: Determining the type of architecture of the model, the individual modules, and using visual representations (e.g. flowcharts or block diagrams) to illustrate how the model works.
Task 8: Model Training and Testing
Task: Train the model on existing data, applying hyperparameter tuning and assessing model generalization.
Goal: Optimize the model's settings through hyperparameter tuning and confirm, by assessing generalization, that the model performs well on new data.
Specification: The existing dataset shall be used for both training and testing: 80% for training and the remaining 20% for testing.
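Example: A minimal sketch of the 80/20 split. It assumes the records are already sorted by timestamp and splits them chronologically, so the test set is strictly later than the training set; the plan does not state how the split is made, so the chronological ordering and the function name are assumptions.

```python
def chronological_split(records, train_fraction=0.8):
    """Split time-ordered records into (train, test) without shuffling."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(records) * train_fraction)
    # Everything before the cut is training data; the rest is held out.
    return records[:cut], records[cut:]


train, test = chronological_split(list(range(10)))
```

Splitting by time rather than at random avoids evaluating the model on periods that are interleaved with its training data, which would tend to overstate how well it generalizes to future load.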
Task 9: Implementation of Real-Time Predictions
Task: Develop a system for generating real-time electricity load predictions by integrating the trained model into an accessible interface or API.
Goal: Facilitate real-time load forecasting based on incoming data.
Specification: The model will need to be periodically updated with new data.
Task 10: Evaluation and Refinement
Task: Continually monitor and assess the model, making any necessary adjustments.
Goal: To ensure that the model is performing as efficiently and accurately as possible.
Specification: Document performance and areas for improvement. If the model is not able to adapt to new data, a new model or architecture may need to be developed.
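Example: Continual monitoring can be sketched as a rolling error tracker that flags when recent accuracy degrades. The class name, the 168-observation window (one week of hourly data), and the 10% threshold on mean absolute percentage error are hypothetical values chosen for illustration, not requirements of the plan.

```python
from collections import deque


class ErrorMonitor:
    """Track rolling mean absolute percentage error (MAPE) of predictions."""

    def __init__(self, window=168, mape_threshold=0.10):
        # deque with maxlen discards the oldest error once the window is full.
        self.errors = deque(maxlen=window)
        self.mape_threshold = mape_threshold

    def record(self, actual, predicted):
        """Store the absolute percentage error of one prediction."""
        if actual == 0:
            raise ValueError("percentage error is undefined for zero actual load")
        self.errors.append(abs(actual - predicted) / abs(actual))

    def rolling_mape(self):
        """Mean error over the window, or None before any data arrives."""
        if not self.errors:
            return None
        return sum(self.errors) / len(self.errors)

    def needs_refinement(self):
        """True when the rolling MAPE exceeds the configured threshold."""
        current = self.rolling_mape()
        return current is not None and current > self.mape_threshold
```

Each time an actual load value becomes available, it would be passed to `record` together with the earlier prediction; a sustained `needs_refinement()` result is a signal to retrain or revisit the architecture.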