PyNets: A Python Neural Network Training and Evaluation Framework

A PyTorch-based framework for training and evaluating neural networks, with a focus on scientific machine learning applications. PyNets provides standardized interfaces for dataset handling, model training, hyperparameter optimization, and experiment management.

Features

  • Modular Training Pipeline: Unified Trainer class supporting multiple model architectures
  • Flexible Dataset Handling: Custom DataFrameDataset for image-based regression/classification
  • Built-in Model Architectures: Convolutional networks (CNet), MLPs, and hybrid models
  • Experiment Management: Automatic configuration saving, logging, and checkpoint management
  • Hyperparameter Optimization: Integrated Optuna support for automated tuning
  • Scientific Applications: Specialized for time-series forecasting and image-based regression

Quick Start

Environment Setup

conda env create -f environment.yml
conda activate haydntorch

Basic Usage

import sys
sys.path.append('src/')
import trainers as tr

# Initialize trainer with configuration
trainer = tr.Trainer(
    work_dir='./models/experiment1/',
    dataset_kwargs={
        'data_folder': './data/images',
        'train_sample': './data/train.csv',
        'val_sample': './data/val.csv',
        'test_sample': './data/test.csv',
        'scaler_features': True,
        'scaler_targets': True
    },
    model_kwargs={
        'model': 'CNet',
        'optimizer_kwargs': {'optimizer': 'Adam', 'lr': 0.001, 'criterion': 'MSELoss'},
        'scheduler_kwargs': {'scheduler': 'StepLR', 'epochs': 50, 'early_stopping': 10}
    }
)

# Train the model
trainer.fit()

Repository Structure

PyNets/
├── src/                    # Core framework modules
│   ├── trainers.py        # Main Trainer class and experiment management
│   ├── models.py          # Neural network architectures (PyNet, CNet, MLP)
│   ├── datasets.py        # Dataset classes and data loading utilities
│   └── utilities.py       # Helper functions and utilities
├── examples/              # Jupyter notebooks with usage examples
│   └── dev33a.ipynb      # Solar flare forecasting case study
├── data/                  # Dataset storage (user-created)
├── dev/                   # Development and experimental code
├── models/                # Trained model checkpoints and artifacts
├── gallery/               # Example outputs and visualizations
├── runs/                  # Training run logs and results
└── environment.yml        # Conda environment specification

Core Components

Trainer Class

The Trainer class provides a unified interface for:

  • Dataset loading and preprocessing with automatic train/val/test splits
  • Model instantiation with configurable architectures
  • Training loop with early stopping and learning rate scheduling
  • Automatic experiment logging and checkpoint saving
  • Configuration management with JSON persistence
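
To make the early-stopping behavior concrete, here is a minimal sketch of the logic the Trainer's early_stopping setting is assumed to implement (stop after N epochs without validation improvement). The class and attribute names here are illustrative, not the framework's actual API; see src/trainers.py for the real implementation.

```python
class EarlyStopping:
    """Stop training after `patience` epochs without a new best val loss."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```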

Supported Models

  • PyNet: Base class for all neural network models with common training utilities
  • CNet: Convolutional neural network for image-based tasks
  • MLP: Multi-layer perceptron for tabular data
  • CNetPlusScalar: Hybrid model combining CNN features with scalar inputs
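
The hybrid idea behind CNetPlusScalar can be sketched as a CNN feature extractor whose pooled features are concatenated with the scalar inputs before a regression head. Layer sizes and names below are arbitrary assumptions for illustration; the actual architecture lives in src/models.py.

```python
import torch
import torch.nn as nn

class HybridNet(nn.Module):
    """Illustrative CNN + scalar hybrid: concatenate image features
    with scalar inputs before the final regression layer."""

    def __init__(self, input_channels=1, n_scalars=1):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(input_channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> (B, 8, 1, 1)
        )
        self.head = nn.Linear(8 + n_scalars, 1)

    def forward(self, image, scalars):
        feats = self.conv(image).flatten(1)  # (B, 8)
        return self.head(torch.cat([feats, scalars], dim=1))
```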

Dataset Integration

  • DataFrameDataset: CSV-driven dataset class supporting:
    • Image loading from file paths
    • Automatic feature/target scaling and normalization
    • Custom prescaling functions (e.g., log10 transforms)
    • Train/validation/test split management
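
As a sketch of what a CSV-driven dataset with a prescaling hook looks like, the snippet below builds a minimal PyTorch Dataset from a DataFrame and applies an optional prescale function (e.g. log10) to the targets. Column-name handling and the prescale hook are assumptions for illustration; the real DataFrameDataset in src/datasets.py also handles image loading and scaler fitting.

```python
import numpy as np
import pandas as pd
from torch.utils.data import Dataset

class CsvRegressionDataset(Dataset):
    """Minimal sketch: one sample per DataFrame row, with an
    optional target prescale (e.g. np.log10) applied up front."""

    def __init__(self, frame, feature_cols, target_col, prescale=None):
        self.features = frame[feature_cols].to_numpy(dtype=np.float32)
        targets = frame[target_col].to_numpy(dtype=np.float32)
        self.targets = prescale(targets) if prescale else targets

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        return self.features[idx], self.targets[idx]
```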

Examples and Use Cases

1. Basic Regression (dev2.ipynb)

California housing price prediction using MLPs with hyperparameter optimization via Optuna.

2. Computer Vision (dev7.ipynb)

Rubik's cube orientation regression using convolutional neural networks.

3. Scientific Forecasting (dev33a.ipynb)

Solar flare prediction using SDO satellite imagery and auxiliary time-series features. This example demonstrates:

  • Chronological data splitting to avoid look-ahead bias
  • Feature engineering with background/lag features
  • Hybrid CNN+scalar models for multimodal inputs
  • Custom loss functions for imbalanced regression
  • Time-series evaluation with proper renormalization
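
The chronological splitting in the first bullet can be sketched as: sort by time, then cut train/val/test in order rather than at random, so no training sample postdates a validation or test sample. The function below is a generic illustration, not the framework's splitting code.

```python
import pandas as pd

def chronological_split(frame, time_col, train_frac=0.7, val_frac=0.15):
    """Sort by `time_col` and split sequentially to avoid
    look-ahead bias; the remainder becomes the test set."""
    frame = frame.sort_values(time_col).reset_index(drop=True)
    n_train = int(len(frame) * train_frac)
    n_val = int(len(frame) * val_frac)
    train = frame.iloc[:n_train]
    val = frame.iloc[n_train:n_train + n_val]
    test = frame.iloc[n_train + n_val:]
    return train, val, test
```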
# Example from solar flare forecasting
trainer = tr.Trainer(
    work_dir='/models/solar_flare/',
    dataset_kwargs={
        'data_folder': '/data/sdo_images/',
        'target_columns': ['target'],
        'extra_features': ['background'],  # Lag-1 feature
        'prescaler_targets': ['log10'],    # Log transform
        'scaler_features': True,
        'scaler_targets': True
    },
    model_kwargs={
        'model': 'CNetPlusScalar',
        'input_channels': 1,
        'optimizer_kwargs': {
            'optimizer': 'Adam',
            'lr': 0.001,
            'criterion': {
                'name': 'MSEBinary',
                'kwargs': {'w1': 0.2, 'alpha': 1, 'mu': -5.84}
            }
        }
    }
)
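
The MSEBinary criterion referenced above (with its w1, alpha, and mu parameters) is defined in the framework's source. As a hedged illustration of the underlying idea only, the sketch below shows a generic weighted MSE for imbalanced regression: samples whose target exceeds a threshold (rare strong flares) receive extra weight in the loss.

```python
import torch

def weighted_mse(pred, target, threshold=-5.84, rare_weight=5.0):
    """Illustrative imbalanced-regression loss: up-weight the squared
    error of samples above `threshold`. Not the actual MSEBinary."""
    weights = torch.where(
        target > threshold,
        torch.full_like(target, rare_weight),
        torch.ones_like(target),
    )
    return (weights * (pred - target) ** 2).mean()
```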

Key Features

Configuration Management

# Configurations are automatically saved as JSON
trainer = tr.Trainer(work_dir='./experiment1/', ...)
# Creates: ./experiment1/config.json

# Load existing experiments
trainer.load_run('run_001')  # Loads model weights and configuration

Multiple Experiment Runs

import copy
config = copy.deepcopy(trainer.config)
config['run'] = 'sensitivity_test/'
config['model_kwargs']['optimizer_kwargs']['lr'] = 0.0001
trainer.fit(config=config)

Hyperparameter Optimization

import copy
import optuna

def objective(trial):
    config = copy.deepcopy(trainer.config)
    config['model_kwargs']['optimizer_kwargs']['lr'] = trial.suggest_float('lr', 1e-5, 1e-1, log=True)
    config['run'] = str(trial.number)
    return trainer.fit(config=config)

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)

Advanced Data Preprocessing

# Support for multiple scaling and preprocessing strategies
dataset_kwargs = {
    'scaler_features': True,        # StandardScaler for scalar features
    'scaler_targets': True,         # StandardScaler for targets
    'scaler_vectors': True,         # StandardScaler for vector features
    'prescaler_targets': ['log10'], # Log transform before scaling
    'prescaler_vectors': ['log10'], # Log transform for vectors
    'extra_features': ['background', 'temporal_lag']  # Additional engineered features
}
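
The two-stage target preprocessing implied by prescaler_targets=['log10'] combined with scaler_targets=True can be sketched with scikit-learn: a log10 prescale to compress dynamic range, then standardization, with both stages inverted at evaluation time to recover physical units.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Sketch of the assumed two-stage pipeline; the framework's internal
# wiring of prescalers and scalers may differ.
targets = np.array([[1e-6], [1e-5], [1e-4]])
logged = np.log10(targets)             # prescale: compress dynamic range
scaler = StandardScaler().fit(logged)  # then zero mean, unit variance
scaled = scaler.transform(logged)

# Invert both stages to renormalize predictions back to physical units.
recovered = 10 ** scaler.inverse_transform(scaled)
```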

Environment and Dependencies

The framework requires:

  • Python 3.11+ with scientific computing stack
  • PyTorch 2.5+ with CUDA 12.1 support for GPU acceleration
  • Core ML Libraries: NumPy, Pandas, Scikit-learn, Matplotlib
  • Hyperparameter Optimization: Optuna
  • Notebook Support: Jupyter, IPython, Widgets
  • Visualization: Matplotlib, BQPlot for interactive plots

Key environment specifications:

  • CUDA-enabled PyTorch for GPU training
  • MKL-optimized BLAS for numerical operations
  • Complete scientific Python ecosystem via conda-forge

See environment.yml for complete dependency list and version specifications.

Installation

  1. Clone the repository:
git clone https://github.com/georgemilosh/PyNets.git
cd PyNets
  2. Create conda environment:
conda env create -f environment.yml
conda activate haydntorch
  3. Verify installation:
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

License

MIT License - see LICENSE for details.

Contributing

This framework is designed for scientific machine learning research. Contributions are welcome, particularly:

  • Additional model architectures in src/models.py
  • Dataset connectors in src/datasets.py
  • Example applications in examples/
  • Documentation improvements

Citation

If you use PyNets in your research, please cite:

@Misc{pynets,
  author       = {George Miloshevich and Panagiotis Gonidakis and Francesco Carella},
  title        = {PyNets: Neural network utilities (GitHub repository)},
  howpublished = {\url{https://github.com/georgemilosh/PyNets}},
  year         = {2025},
  note         = {commit 92572f329ef5410b392aac4fb05646f9152f0ac9}
}

Support

For questions and issues:

  • Check the example notebooks in examples/
  • Review the source code documentation in src/
  • Open an issue on GitHub for bugs or feature requests
