Add GPR functions and unit tests #4
vtommasini wants to merge 2 commits into sxs-collaboration:main
Pull request overview
This PR adds a Gaussian Process Regression (GPR) machine learning library for predicting orbital parameters, consisting of a self-contained model class with normalization capabilities and associated training/prediction functions. The implementation includes comprehensive unit tests to validate functionality.
- GPR model implementation with mixed RBF and Matern kernels, linear mean function, and input/output normalization
- Training and prediction functions with stored normalization parameters
- Complete pipeline function for training, prediction, visualization, and metrics reporting
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| GPR_library.py | Implements GPRegressionModel class, training function with Adam optimizer and learning rate scheduling, prediction function with denormalization, and full pipeline with plotting capabilities |
| TestGPRLibrary.py | Provides comprehensive unit tests covering training, prediction, normalization/denormalization, and pipeline functionality using synthetic sine wave data |
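The normalization/denormalization round trip mentioned in the overview and the file table can be sketched as follows. This is a minimal illustration that assumes z-score normalization (subtract the mean, divide by the standard deviation); the PR text does not say which scheme GPR_library.py uses, and `fit_normalizer`, `normalize`, and `denormalize` are hypothetical names, not functions from the library.

```python
import math
from statistics import fmean, pstdev


def fit_normalizer(values):
    """Return (mean, std) for a sequence of floats; std falls back to 1.0 if zero."""
    mean = fmean(values)
    std = pstdev(values)
    return mean, (std if std > 0 else 1.0)


def normalize(values, mean, std):
    """Map values to zero mean and unit variance (assumed scheme)."""
    return [(v - mean) / std for v in values]


def denormalize(values, mean, std):
    """Undo normalize() using the stored parameters."""
    return [v * std + mean for v in values]


# Synthetic sine-wave targets, similar in spirit to the test data in the PR.
y = [math.sin(2 * math.pi * i / 20) + 3.0 for i in range(20)]
y_mean, y_std = fit_normalizer(y)
y_norm = normalize(y, y_mean, y_std)
y_back = denormalize(y_norm, y_mean, y_std)
```

Storing `y_mean` and `y_std` next to the trained model is what lets a prediction function return values in the original units.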
    likelihood.train()

    # Use Adam optimizer with learning rate
    optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
The learning rate value 0.05 is a magic number that should be extracted as a named constant or parameter for better maintainability and easier tuning.
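One possible way to act on this comment is sketched below. `DEFAULT_LEARNING_RATE`, `TrainingConfig`, and `adam_kwargs` are hypothetical names introduced for illustration; they are not part of GPR_library.py. The sketch avoids importing torch so that it runs on its own, and relies only on the `lr` keyword already used in the reviewed line.

```python
from dataclasses import dataclass

DEFAULT_LEARNING_RATE = 0.05  # value taken from the reviewed line


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for training hyperparameters."""

    learning_rate: float = DEFAULT_LEARNING_RATE

    def __post_init__(self):
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")


def adam_kwargs(config: TrainingConfig) -> dict:
    """Keyword arguments meant for torch.optim.Adam(model.parameters(), **kwargs)."""
    return {"lr": config.learning_rate}


default_cfg = TrainingConfig()
tuned_cfg = TrainingConfig(learning_rate=0.01)
```

A simpler alternative would be a `learning_rate=DEFAULT_LEARNING_RATE` keyword argument on the training function itself; the dataclass only pays off once several hyperparameters need to travel together.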
TestGPRLibrary.py (outdated)
    rng = np.random.default_rng(0)

    # Inputs with shape (n, 1)
    X = np.linspace(0, 2 * np.pi, n).reshape(-1,1)
Missing space after comma in reshape(-1,1). Should be reshape(-1, 1) per PEP 8 style guidelines.
Suggested change:
-    X = np.linspace(0, 2 * np.pi, n).reshape(-1,1)
+    X = np.linspace(0, 2 * np.pi, n).reshape(-1, 1)
TestGPRLibrary.py (outdated)
    X, Y = self.X, self.Y # fake dataset

    # Train the GPR model
    model, likelihood = train_gpr_model(X,Y)
Missing space after comma in train_gpr_model(X,Y). Should be train_gpr_model(X, Y) per PEP 8 style guidelines.
Suggested change:
-    model, likelihood = train_gpr_model(X,Y)
+    model, likelihood = train_gpr_model(X, Y)
    # contains all functions necessary to run the GPR Model
    # used to predict better low-eccentricity orbital parameter initial guesses
    # functions:
    # 1. normalize_data
    # 2. denormalize_predictions
    # 3. omega_and_adot
    # 4. omegaAndAdot
    # 5. polynomial_fit_with_confidence
    # 6. GPRegressionModel (class)
    # 7. train_gpr_model
    # 8. predict_with_gpr_model
    # 9. run_gpr_pipeline
    # 10. train_model_and_eigenvalue_analysis
    # 11. loo_predictions
    # 12. parse_test_runs
    # 13. apply_gpr_corrections
    # 14. save_gpr_corrected
    # 15. loo_crossval
    # 16. plot_loo_residuals
The function list in the header comment (lines 7-24) lists 16 functions, but only 4 are actually present in this file (GPRegressionModel class, train_gpr_model, predict_with_gpr_model, run_gpr_pipeline). This documentation is misleading and should be updated to reflect only the functions that are actually implemented.
Suggested change:
-    # contains all functions necessary to run the GPR Model
-    # used to predict better low-eccentricity orbital parameter initial guesses
-    # functions:
-    # 1. normalize_data
-    # 2. denormalize_predictions
-    # 3. omega_and_adot
-    # 4. omegaAndAdot
-    # 5. polynomial_fit_with_confidence
-    # 6. GPRegressionModel (class)
-    # 7. train_gpr_model
-    # 8. predict_with_gpr_model
-    # 9. run_gpr_pipeline
-    # 10. train_model_and_eigenvalue_analysis
-    # 11. loo_predictions
-    # 12. parse_test_runs
-    # 13. apply_gpr_corrections
-    # 14. save_gpr_corrected
-    # 15. loo_crossval
-    # 16. plot_loo_residuals
+    # Contains functions and classes necessary to run the GPR Model
+    # used to predict better low-eccentricity orbital parameter initial guesses
+    # Implemented functions/classes:
+    # - GPRegressionModel (class)
+    # - train_gpr_model
+    # - predict_with_gpr_model
+    # - run_gpr_pipeline
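A header list like this can drift again as functions are added or removed. The following is a rough sketch of a check that compares documented names against what a module actually defines, using the standard-library `ast` module. `top_level_names` and `missing_from_module` are hypothetical helpers, and `EXAMPLE_SOURCE` is a stand-in with placeholder signatures, not the real contents of GPR_library.py.

```python
import ast


def top_level_names(source: str) -> set:
    """Names of functions and classes defined at module level in the source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def missing_from_module(documented, source):
    """Documented names that the module does not define, in documented order."""
    defined = top_level_names(source)
    return [name for name in documented if name not in defined]


# Stand-in module body; the signatures are placeholders.
EXAMPLE_SOURCE = '''
class GPRegressionModel:
    pass

def train_gpr_model(train_x, train_y):
    pass

def predict_with_gpr_model(model, likelihood, test_x):
    pass

def run_gpr_pipeline(train_x, train_y, test_x):
    pass
'''

documented = ["normalize_data", "GPRegressionModel", "train_gpr_model", "loo_crossval"]
stale = missing_from_module(documented, EXAMPLE_SOURCE)
```

Here `stale` holds the documented names with no matching definition, which is the kind of mismatch the review comment points out.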
    plot: whether to produce correlation plots.
    silent: whether to suppress print statements entirely.
The plot parameter in the docstring is missing a type annotation. Should be plot (bool): Whether to produce correlation plots. for consistency with other parameters.
Suggested change:
-    plot: whether to produce correlation plots.
-    silent: whether to suppress print statements entirely.
+    plot (bool): Whether to produce correlation plots.
+    silent (bool): Whether to suppress print statements entirely.
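As a small illustration of keeping docstring types and signature annotations in agreement, consider the stub below. `run_pipeline_stub` is a hypothetical stand-in, and the `False` defaults are assumptions; the real signature of `run_gpr_pipeline` is not shown in this review.

```python
import inspect


def run_pipeline_stub(plot: bool = False, silent: bool = False) -> None:
    """Hypothetical stand-in for a pipeline entry point.

    Args:
        plot (bool): Whether to produce correlation plots.
        silent (bool): Whether to suppress print statements entirely.
    """
    if not silent:
        print("plotting enabled" if plot else "plotting disabled")


# Collect the annotations so they can be compared with the docstring types.
annotations = {
    name: param.annotation
    for name, param in inspect.signature(run_pipeline_stub).parameters.items()
}
```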
TestGPRLibrary.py (outdated)
    rng = np.random.default_rng(0)

    # Inputs with shape (n, 1)
    X = np.linspace(0, 2 * np.pi, n).reshape(-1,1)
The magic number 0 for the random seed should be extracted as a named constant (e.g., RANDOM_SEED = 0) at the class or module level for better maintainability and clarity.
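A sketch of what a module-level seed constant could look like in a test file follows. To stay self-contained it uses the standard-library `random` module as a stand-in for NumPy; in the actual test the constant would presumably be passed to `np.random.default_rng(RANDOM_SEED)`. `make_sine_data` and `TestSyntheticData` are hypothetical names, and the noise level is an arbitrary choice.

```python
import math
import random
import unittest

RANDOM_SEED = 0  # single place to change the seed for every test


def make_sine_data(n, seed=RANDOM_SEED, noise=0.05):
    """Noisy sine samples on [0, 2*pi]; stdlib stand-in for the NumPy setup."""
    rng = random.Random(seed)
    xs = [2 * math.pi * i / (n - 1) for i in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


class TestSyntheticData(unittest.TestCase):
    def test_same_seed_gives_same_data(self):
        self.assertEqual(make_sine_data(10), make_sine_data(10))

    def test_different_seed_changes_noise(self):
        _, y0 = make_sine_data(10, seed=RANDOM_SEED)
        _, y1 = make_sine_data(10, seed=RANDOM_SEED + 1)
        self.assertNotEqual(y0, y1)
```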
TestGPRLibrary.py (outdated)
    denormalize_output, ie that it is correctly undoing the normalization.
    """
    X, Y = self.X, self.Y
    model, likelihood = train_gpr_model(X,Y)
Missing space after comma in train_gpr_model(X,Y). Should be train_gpr_model(X, Y) per PEP 8 style guidelines.
GPR_library.py (outdated)
    input_dim = train_x.shape[1] if train_x.dim() > 1 else 1

    # Define base kernels - use a mixture of the RBF and Matern kernels
    self.rbf_kernel = gpytorch.kernels.RBFKernel(ard_num_dims = input_dim)
Inconsistent spacing around = in keyword arguments. Should be ard_num_dims=input_dim (no spaces) per PEP 8 style guidelines.
Suggested change:
-    self.rbf_kernel = gpytorch.kernels.RBFKernel(ard_num_dims = input_dim)
+    self.rbf_kernel = gpytorch.kernels.RBFKernel(ard_num_dims=input_dim)
    import torch
    import gpytorch
    import numpy as np
    import pandas as pd
Import of 'pd' is not used.
Suggested change:
-    import pandas as pd
GPR_library.py (outdated)
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import argparse
Import of 'argparse' is not used.
Suggested change:
-    import argparse
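Unused imports such as these are normally reported by a linter (flake8 and pyflakes report them as F401). As a rough illustration of the idea, the sketch below finds imported names that never appear as a plain name in the rest of the source. `unused_imports` is a hypothetical helper and deliberately simplified: it ignores `__all__`, re-exports, and names used only inside strings.

```python
import ast


def unused_imports(source: str) -> list:
    """Imported names that are never referenced as a Name node in the source."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                bound = alias.asname or alias.name.split(".")[0]
                imported[bound] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(name for name in imported if name not in used)


# Stand-in source that mirrors the import block under review; it is only
# parsed, never executed, so the third-party packages need not be installed.
EXAMPLE_SOURCE = '''
import torch
import numpy as np
import pandas as pd
import argparse

x = np.zeros(3)
t = torch.tensor(x)
'''

result = unused_imports(EXAMPLE_SOURCE)
```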
Partial merge of the GPR library: for now this adds only the self-contained ML class and functions (no eccentricity-control functions).
Contains the following:
A unit test is included for sanity checks.