Learning Image Compression On board a Satellite constellation
This is the code repository for the paper Tackling the Satellite Downlink Bottleneck with Federated Onboard Learning of Image Compression. It provides the means to train and test an image compression model based on the Python module CompressAI, combined with the PASEOS module to simulate physical constraints on the training. The paper demonstrates the onboard training of image compression for an Earth observation constellation on raw, unprocessed Sentinel-2 data; the approach also works for postprocessed data such as the AID dataset. To test on raw data, it uses the THRawS dataset, consisting of more than 100 samples (originally focusing on wildfires and volcanoes) from all over the world. We use mpi4py to distribute the training across multiple devices / GPUs.
In the following we describe setting up the repo and launching a training run. For more details on the training process, please refer to the paper.
We recommend using conda to set up your environment. This will also automatically set up CUDA and the cudatoolkit for you, enabling the use of GPUs for training, which is recommended.
First of all, clone the GitHub repository as follows (Git required):
git clone https://github.com/gomezzz/LICOS.git
To install LICOS you can use conda as follows:
cd LICOS
conda env create -f environment.yml
This will create a new conda environment called licos and install the required software packages.
To activate the new environment, you can use:
conda activate licos
To launch the training on, e.g., AID, it is necessary to download the corresponding dataset. Please place the data in two folders, your_dataset/train and your_dataset/test, and create a config file in the cfg folder for your run. Specify the path to your dataset in the config file with the dataset parameter. Have a look at the cfg/default_cfg.toml file for an example and additional explanations of the different parameters. By default, we rely on the dataloaders used by CompressAI. See here for more information.
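Before starting a run, it can help to confirm that the dataset folder has the train/test layout described above. The following is a minimal sketch, not part of the repository: the function name check_dataset_layout is hypothetical, and it only checks that both split folders exist and counts the files in each.

```python
from pathlib import Path


def check_dataset_layout(root):
    """Return the number of files in root/train and root/test.

    Hypothetical helper (not part of LICOS). Raises FileNotFoundError
    if either split folder is missing.
    """
    root = Path(root)
    counts = {}
    for split in ("train", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"Missing split folder: {split_dir}")
        # Count regular files only; nested folders are ignored in this sketch.
        counts[split] = sum(1 for p in split_dir.iterdir() if p.is_file())
    return counts
```

Calling check_dataset_layout("your_dataset") returns a dictionary with one file count per split, or raises an error naming the missing folder.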
If you want to use a custom dataset with non-RGB data, you may need to create a custom dataset class similar to the one in raw_image_folder.py.
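As an illustration of what such a class might look like, here is a minimal sketch. It does not reproduce the contents of raw_image_folder.py. The class name NpyImageFolder, the .npy file format, and the (height, width, bands) array layout are all assumptions made for this example. It follows the same root/train and root/test layout described above and exposes __len__ and __getitem__, which is what a PyTorch DataLoader needs from a map-style dataset.

```python
from pathlib import Path

import numpy as np


class NpyImageFolder:
    """Map-style dataset reading multi-band images stored as .npy arrays.

    Hypothetical example: expects root/<split>/*.npy, with each array
    shaped (height, width, bands).
    """

    def __init__(self, root, split="train", transform=None):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise RuntimeError(f"Invalid directory: {split_dir}")
        # Sort the file list so the sample order is reproducible.
        self.samples = sorted(split_dir.glob("*.npy"))
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        img = np.load(self.samples[index]).astype(np.float32)
        # Reorder (H, W, bands) to (bands, H, W), the channel-first layout
        # that PyTorch models usually expect.
        img = np.transpose(img, (2, 0, 1))
        if self.transform is not None:
            img = self.transform(img)
        return img
```

Any normalization or cropping for your data can be passed in through the transform argument.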
This work is based on Sentinel-2 raw data included in the THRawS dataset. To prepare your data, proceed as follows.
- Navigate to the data_preparation directory and clone PyRawS into it with git clone https://github.com/ESA-PhiLab/PyRawS.git. PyRawS provides APIs to process Sentinel-2 raw data.
- To install PyRawS, run the following from the PyRawS directory: conda env create -f environment.yml.
- Download THRawS. Please note that the entire dataset is 147.6 GB in size.
- Place all the downloaded ZIP files into data_preparation\data\THRAWS\raw. An empty file called put_THRAWS_here.txt marks the right location.
- Decompress all the ZIP files in data_preparation\data\THRAWS\raw.
- Update the PYRAWS_HOME_PATH and DATA_PATH variables in data_preparation\sys_cfg.py with the absolute paths to the PyRawS and data directories.
- Move data_preparation\sys_cfg.py to data_preparation\PyRawS\pyraws\sys_cfg.py.
- Activate the pyraws environment through:
conda activate pyraws
- From data_preparation, launch create_tif.py. To do so, you can use:
python create_tif.py --input_dir PATH_TO_RAW
where PATH_TO_RAW is the path to the data\THRAWS\raw directory.
PyRawS is no longer needed and can now be removed.
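The decompression step in the list above can also be scripted. The following is a minimal sketch using only the Python standard library. The function name extract_all_zips is hypothetical, and extracting each archive directly into the raw directory is an assumption; check the result against what create_tif.py expects before relying on it.

```python
import zipfile
from pathlib import Path


def extract_all_zips(raw_dir):
    """Extract every .zip archive in raw_dir into raw_dir.

    Hypothetical helper (not part of LICOS). Returns the names of the
    archives that were extracted, in sorted order.
    """
    raw_dir = Path(raw_dir)
    extracted = []
    for archive in sorted(raw_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Unpack in place, next to the archive itself.
            zf.extractall(raw_dir)
        extracted.append(archive.name)
    return extracted
```

The archives themselves are left in place, so remember that the extracted data needs disk space on top of the downloaded ZIP files.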
To launch a training run, you can use the main.py script as follows:
mpiexec -n 4 python main.py --cfg cfg/default_cfg.toml
The --cfg parameter specifies the config file to use. The -n parameter specifies the number of processes to use. The number of processes should equal the number of satellites you want to simulate with PASEOS and, when GPUs are used, the number of GPUs. Details on the constellation and satellite parameters can be found in the file licos/init_paseos.py.
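If you launch runs from a script, the command above can be assembled from the number of satellites so that the process count stays consistent with it. This is a minimal sketch: build_launch_command is a hypothetical helper, not part of the repository, and it only builds the same mpiexec invocation shown above.

```python
import shlex


def build_launch_command(n_satellites, cfg_path="cfg/default_cfg.toml"):
    """Build the mpiexec command for a run with one process per satellite.

    Hypothetical helper (not part of LICOS).
    """
    if n_satellites < 1:
        raise ValueError("n_satellites must be at least 1")
    return [
        "mpiexec", "-n", str(n_satellites),
        "python", "main.py", "--cfg", cfg_path,
    ]


cmd = build_launch_command(4)
print(shlex.join(cmd))  # mpiexec -n 4 python main.py --cfg cfg/default_cfg.toml
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to start the training.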
You can evaluate a trained model with the eval_script.py script. It may require adjustments, as it is hard-coded for use with the raw Sentinel-2 data. By default, it compares the compression against JPEG.
If you have any questions, feel free to reach out to @gomezzz (email: pablo.gomez at esa.int) or @GabrieleMeoni (email: G.Meoni@tudelft.nl). You can also open an issue on the GitHub repository.