Feature/element data classes #588

FBumann · 2026-01-23T17:16:43Z

Description

Major refactoring of the model building pipeline to use batched/vectorized operations instead of per-element loops. This brings significant performance improvements, especially for large models.

Key Changes

Batched Type-Level Models: New FlowsModel, StoragesModel, BusesModel classes that handle ALL elements of a type in single batched operations instead of individual FlowModel, StorageModel instances.
FlowsData/StoragesData Classes: Pre-compute and cache element data as xarray DataArrays with element dimensions, enabling vectorized constraint creation.
Mask-based Variable Creation: Variables use linopy's mask= parameter to handle heterogeneous elements (e.g., only some flows have status variables) while keeping consistent coordinates.
Fast NumPy Helpers: Replace slow xarray methods with numpy equivalents:
- fast_notnull() / fast_isnull() - ~55x faster than xarray's .notnull() / .isnull()
Unified Coordinate Handling: All variables use consistent coordinate order via .reindex() to prevent alignment errors.

Performance Results

Note: These benchmarks were run without the _populate_names call, which is still present in the current code for backwards compatibility. It will be removed once all tests are migrated to the new solutions API, which should yield additional speedup.

XL System (2000h, 300 converters, 50 storages)

Commit	Description	Build (ms)	Build speedup	Write LP (ms)	Write speedup
`42f593e7`	main branch (base)	113,360	1.00x	44,815	1.00x
`302413c4`	Summary of changes	7,718	14.69x	15,369	2.92x
`7dd56dde`	Summary of changes	9,572	11.84x	15,780	2.84x
`f38f828f`	sparse groupby in conversion	3,649	31.07x	10,370	4.32x
`2a94130f`	sparse groupby in piecewise_conversion	2,323	48.80x	9,584	4.68x
`805bcc56`	xr.concat → numpy pre-alloc	2,075	54.63x	10,825	4.14x
`82e69989`	fix build_effects_array signature	2,333	48.59x	10,331	4.34x
`9c2d3d3b`	Add sparse_weighted_sum	1,638	69.21x	9,427	4.75x
`8277d5d3`	Add sparse_weighted_sum (2)	2,785	40.70x	9,129	4.91x
`c67a6a7e`	Clean up, revert piecewise	2,616	43.33x	9,574	4.68x
`52a581fe`	Improve piecewise	1,743	65.04x	9,763	4.59x
`8c8eb5c9`	Pre-combine xarray coeffs in storage	1,676	67.64x	8,868	5.05x

Complex System (72h, piecewise)

Commit	Description	Build (ms)	Build speedup	Write LP (ms)	Write speedup
`42f593e7`	main branch (base)	1,003	1.00x	417	1.00x
`302413c4`	Summary of changes	533	1.88x	129	3.23x
`7dd56dde`	Summary of changes	430	2.33x	103	4.05x
`f38f828f`	sparse groupby in conversion	452	2.22x	136	3.07x
`2a94130f`	sparse groupby in piecewise_conversion	440	2.28x	112	3.72x
`805bcc56`	xr.concat → numpy pre-alloc	475	2.11x	132	3.16x
`82e69989`	fix build_effects_array signature	391	2.57x	99	4.21x
`9c2d3d3b`	Add sparse_weighted_sum	404	2.48x	96	4.34x
`8277d5d3`	Add sparse_weighted_sum (2)	416	2.41x	98	4.26x
`c67a6a7e`	Clean up, revert piecewise	453	2.21x	108	3.86x
`52a581fe`	Improve piecewise	426	2.35x	105	3.97x
`8c8eb5c9`	Pre-combine xarray coeffs in storage	383	2.62x	100	4.17x

LP file size: 528.28 MB (XL, branch) vs 503.88 MB (XL, main), 0.21 MB (Complex) — unchanged.

Key Takeaways

XL system: 67.6x build speedup — from 113.4s down to 1.7s. LP write improved 5.1x (44.8s → 8.9s). The bulk of the gain came from the initial refactoring (302413c4, 14.7x), with sparse groupby and weighted sum optimizations adding further large improvements.
Complex system: 2.62x build speedup — from 1,003ms down to 383ms. LP write improved 4.2x (417ms → 100ms). Gains are more modest since this system is small (72 timesteps, 14 flows) and dominated by per-operation linopy/xarray overhead.

Model Size Reduction

The batched approach creates fewer, larger variables instead of many small ones:

System	Old Vars	New Vars	Old Cons	New Cons
Medium (720h, all features)	370	21	428	30
Large (720h, 50 conv)	859	21	997	30
Full Year (8760h)	148	16	168	24
XL (2000h, 300 conv)	4,917	21	5,715	30

How to Run Benchmarks

# Single system
python benchmarks/benchmark_model_build.py --system complex
python benchmarks/benchmark_model_build.py --system synthetic --converters 300 --timesteps 2000

# All systems
python benchmarks/benchmark_model_build.py --all

# Across commits
for SHA in 302413c4 7dd56dde f38f828f 2a94130f 805bcc56 82e69989 9c2d3d3b 8277d5d3 c67a6a7e 52a581fe 8c8eb5c9; do
    echo "=== $SHA ==="
    git checkout "$SHA" --force 2>/dev/null
    python benchmarks/benchmark_model_build.py --system complex --iterations 3
done
git checkout feature/element-data-classes --force

Type of Change

Code refactoring
Performance improvement

Testing

All existing tests pass
Benchmarked with multiple system configurations

* Remove unnessesary log * The bug has been fixed. When expanding segmented clustered FlowSystems, the effect totals now match correctly. Root Cause Segment values are per-segment TOTALS that were repeated N times when expanded to hourly resolution (where N = segment duration in timesteps). Summing these repeated values inflated totals by ~4x. Fix Applied 1. Added build_expansion_divisor() to Clustering class (flixopt/clustering/base.py:920-1027) - For each original timestep, returns the segment duration (number of timesteps in that segment) - Handles multi-dimensional cases (periods/scenarios) by accessing each clustering result's segment info 2. Modified expand() method (flixopt/transform_accessor.py:1850-1875) - Added _is_segment_total_var() helper to identify which variables should be divided - For segmented systems, divides segment total variables by the expansion divisor to get correct hourly rates - Correctly excludes: - Share factors (stored as EffectA|(temporal)->EffectB(temporal)) - these are rates, not totals - Flow rates, on/off states, charge states - these are already rates Test Results - All 83 cluster/expand tests pass - All 27 effect tests pass - Debug script shows all ratios are 1.0000x for all effects (EffectA, EffectB, EffectC, EffectD) across all periods and scenarios * The fix is now more robust with clear separation between data and solution: Key Changes 1. build_expansion_divisor() in Clustering (base.py:920-1027) - Returns the segment duration for each original timestep - Handles per-period/scenario clustering differences 2. _is_segment_total_solution_var() in expand() (transform_accessor.py:1855-1880) - Only matches solution variables that represent segment totals: - {contributor}->{effect}(temporal) - effect contributions - *|per_timestep - per-timestep totals - Explicitly does NOT match rates/states: |flow_rate, |on, |charge_state 3. expand_da() with is_solution parameter (transform_accessor.py:1882-1915) - is_solution=False (default): Never applies segment correction (for FlowSystem data) - is_solution=True: Applies segment correction if pattern matches (for solution) Why This is Robust ┌───────────────────────────────────────┬─────────────────┬────────────────────┬───────────────────────────┐ │ Variable │ Location │ Pattern │ Divided? │ ├───────────────────────────────────────┼─────────────────┼────────────────────┼───────────────────────────┤ │ EffectA|(temporal)->EffectB(temporal) │ FlowSystem DATA │ share factor │ ❌ No (is_solution=False) │ ├───────────────────────────────────────┼─────────────────┼────────────────────┼───────────────────────────┤ │ Boiler(Q)->EffectA(temporal) │ SOLUTION │ contribution │ ✅ Yes │ ├───────────────────────────────────────┼─────────────────┼────────────────────┼───────────────────────────┤ │ EffectA(temporal)->EffectB(temporal) │ SOLUTION │ contribution │ ✅ Yes │ ├───────────────────────────────────────┼─────────────────┼────────────────────┼───────────────────────────┤ │ EffectA(temporal)|per_timestep │ SOLUTION │ per-timestep total │ ✅ Yes │ ├───────────────────────────────────────┼─────────────────┼────────────────────┼───────────────────────────┤ │ Boiler(Q)|flow_rate │ SOLUTION │ rate │ ❌ No (no pattern match) │ ├───────────────────────────────────────┼─────────────────┼────────────────────┼───────────────────────────┤ │ Storage|charge_state │ SOLUTION │ state │ ❌ No (no pattern match) │ └───────────────────────────────────────┴─────────────────┴────────────────────┴───────────────────────────┘ * The fix is now robust with variable names derived directly from FlowSystem structure: Key Implementation _build_segment_total_varnames() (transform_accessor.py:1776-1819) - Derives exact variable names from FlowSystem structure - No pattern matching on arbitrary strings - Covers all contributor types: a. {effect}(temporal)|per_timestep - from fs.effects b. {flow}->{effect}(temporal) - from fs.flows c. {component}->{effect}(temporal) - from fs.components d. {source}(temporal)->{target}(temporal) - from effect.share_from_temporal Why This is Robust 1. Derived from structure, not patterns: Variable names come from actual FlowSystem attributes 2. Clear separation: FlowSystem data is NEVER divided (only solution variables) 3. Explicit set lookup: var_name in segment_total_vars instead of pattern matching 4. Extensible: New contributor types just need to be added to _build_segment_total_varnames() 5. All tests pass: 83 cluster/expand tests + comprehensive debug script * Add interpolation of charge states to expand and add documentation * Summary: Variable Registry Implementation Changes Made 1. Added VariableCategory enum (structure.py:64-77) - STATE - For state variables like charge_state (interpolated within segments) - SEGMENT_TOTAL - For segment totals like effect contributions (divided by expansion divisor) - RATE - For rate variables like flow_rate (expanded as-is) - BINARY - For binary variables like status (expanded as-is) - OTHER - For uncategorized variables 2. Added variable_categories registry to FlowSystemModel (structure.py:214) - Dictionary mapping variable names to their categories 3. Modified add_variables() method (structure.py:388-396) - Added optional category parameter - Automatically registers variables with their category 4. Updated variable creation calls: - components.py: Storage variables (charge_state as STATE, netto_discharge as RATE) - elements.py: Flow variables (flow_rate as RATE, status as BINARY) - features.py: Effect contributions (per_timestep as SEGMENT_TOTAL, temporal shares as SEGMENT_TOTAL, startup/shutdown as BINARY) 5. Updated expand() method (transform_accessor.py:2074-2090) - Uses variable_categories registry to identify segment totals and state variables - Falls back to pattern matching for backwards compatibility with older FlowSystems Benefits - More robust categorization: Variables are categorized at creation time, not by pattern matching - Extensible: New variable types can easily be added with proper category - Backwards compatible: Old FlowSystems without categories still work via pattern matching fallback * Summary: Fine-Grained Variable Categories New Categories (structure.py:45-103) class VariableCategory(Enum): # State variables CHARGE_STATE, SOC_BOUNDARY # Rate/Power variables FLOW_RATE, NETTO_DISCHARGE, VIRTUAL_FLOW # Binary state STATUS, INACTIVE # Binary events STARTUP, SHUTDOWN # Effect variables PER_TIMESTEP, SHARE, TOTAL, TOTAL_OVER_PERIODS # Investment SIZE, INVESTED # Counting/Duration STARTUP_COUNT, DURATION # Piecewise linearization INSIDE_PIECE, LAMBDA0, LAMBDA1, ZERO_POINT # Other OTHER Logical Groupings for Expansion EXPAND_INTERPOLATE = {CHARGE_STATE} # Interpolate between boundaries EXPAND_DIVIDE = {PER_TIMESTEP, SHARE} # Divide by expansion factor # Default: repeat within segment Files Modified ┌───────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ File │ Variables Updated │ ├───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ components.py │ charge_state, netto_discharge, SOC_boundary │ ├───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ elements.py │ flow_rate, status, virtual_supply, virtual_demand │ ├───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ features.py │ size, invested, inactive, startup, shutdown, startup_count, inside_piece, lambda0, lambda1, zero_point, total, per_timestep, shares │ ├───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ effects.py │ total, total_over_periods │ ├───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ modeling.py │ duration │ ├───────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ transform_accessor.py │ Updated to use EXPAND_INTERPOLATE and EXPAND_DIVIDE groupings │ └───────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ Test Results - All 83 cluster/expand tests pass - Variable categories correctly populated and grouped * Add IO for variable categories * The refactoring is complete. Here's what was accomplished: Changes Made 1. Added combine_slices() utility to flixopt/clustering/base.py (lines 52-107) - Simple function that stacks dict of {(dim_values): np.ndarray} into a DataArray - Much cleaner than the previous reverse-concat pattern 2. Refactored 3 methods to use the new utility: - Clustering.expand_data() - reduced from ~25 to ~12 lines - Clustering.build_expansion_divisor() - reduced from ~35 to ~20 lines - TransformAccessor._interpolate_charge_state_segmented() - reduced from ~43 to ~27 lines 3. Added 4 unit tests for combine_slices() in tests/test_cluster_reduce_expand.py Results ┌───────────────────────────────────┬──────────┬────────────────────────┐ │ Metric │ Before │ After │ ├───────────────────────────────────┼──────────┼────────────────────────┤ │ Complex reverse-concat blocks │ 3 │ 0 │ ├───────────────────────────────────┼──────────┼────────────────────────┤ │ Lines of dimension iteration code │ ~100 │ ~60 │ ├───────────────────────────────────┼──────────┼────────────────────────┤ │ Test coverage │ 83 tests │ 87 tests (all passing) │ └───────────────────────────────────┴──────────┴────────────────────────┘ The Pattern Change Before (complex reverse-concat): result_arrays = slices for dim in reversed(extra_dims): grouped = {} for key, arr in result_arrays.items(): rest_key = key[:-1] if len(key) > 1 else () grouped.setdefault(rest_key, []).append(arr) result_arrays = {k: xr.concat(v, dim=...) for k, v in grouped.items()} result = list(result_arrays.values())[0].transpose('time', ...) After (simple combine): return combine_slices(slices, extra_dims, dim_coords, 'time', output_coord, attrs) * Here's what we accomplished: 1. Fully Vectorized expand_data() Before (~65 lines with loops): for combo in np.ndindex(*[len(v) for v in dim_coords.values()]): selector = {...} mapping = _select_dims(timestep_mapping, **selector).values data_slice = _select_dims(aggregated, **selector) slices[key] = _expand_slice(mapping, data_slice) return combine_slices(slices, ...) After (~25 lines, fully vectorized): timestep_mapping = self.timestep_mapping # Already multi-dimensional! cluster_indices = timestep_mapping // time_dim_size time_indices = timestep_mapping % time_dim_size expanded = aggregated.isel(cluster=cluster_indices, time=time_indices) # xarray handles broadcasting across period/scenario automatically 2. build_expansion_divisor() and _interpolate_charge_state_segmented() These still use combine_slices() because they need per-result segment data (segment_assignments, segment_durations) which isn't available as concatenated Clustering properties yet. Current State ┌───────────────────────────────────────┬─────────────────┬─────────────────────────────────┐ │ Method │ Vectorized? │ Uses Clustering Properties │ ├───────────────────────────────────────┼─────────────────┼─────────────────────────────────┤ │ expand_data() │ Yes │ timestep_mapping (fully) │ ├───────────────────────────────────────┼─────────────────┼─────────────────────────────────┤ │ build_expansion_divisor() │ No (small loop) │ cluster_assignments (partially) │ ├───────────────────────────────────────┼─────────────────┼─────────────────────────────────┤ │ _interpolate_charge_state_segmented() │ No (small loop) │ cluster_assignments (partially) │ └───────────────────────────────────────┴─────────────────┴─────────────────────────────────┘ * Completed: 1. _interpolate_charge_state_segmented() - Fully vectorized from ~110 lines to ~55 lines - Uses clustering.timestep_mapping for indexing - Uses clustering.results.segment_assignments, segment_durations, and position_within_segment - Single xarray expression instead of triple-nested loops Previously completed (from before context limit): - Added segment_assignments multi-dimensional property to ClusteringResults - Added segment_durations multi-dimensional property to ClusteringResults - Added position_within_segment property to ClusteringResults - Vectorized expand_data() - Vectorized build_expansion_divisor() Test results: All 130 tests pass (87 cluster/expand + 43 IO tests) The combine_slices utility function is still available in clustering/base.py if needed in the future, but all the main dimension-handling methods now use xarray's vectorized advanced indexing instead of the loop-based slice-and-combine pattern. * All simplifications complete! Here's a summary of what we cleaned up: Summary of Simplifications 1. expand_da() in transform_accessor.py - Extracted duplicate "append extra timestep" logic into _append_final_state() helper - Reduced from ~50 lines to ~25 lines - Eliminated code duplication 2. _build_multi_dim_array() → _build_property_array() in clustering/base.py - Replaced 6 conditional branches with unified np.ndindex() pattern - Now handles both simple and multi-dimensional cases in one method - Reduced from ~50 lines to ~25 lines - Preserves dtype (fixed integer indexing bug) 3. Property boilerplate in ClusteringResults - 5 properties (cluster_assignments, cluster_occurrences, cluster_centers, segment_assignments, segment_durations) now use the unified _build_property_array() - Each property reduced from ~25 lines to ~8 lines - Total: ~165 lines → ~85 lines 4. _build_timestep_mapping() in Clustering - Simplified to single call using _build_property_array() - Reduced from ~16 lines to ~9 lines Total lines removed: ~150+ lines of duplicated/complex code * Removed the unnecessary lookup and use segment_indices directl * The IO roundtrip fix is working correctly. Here's a summary of what was fixed: Summary The IO roundtrip bug was caused by representative_weights (a variable with only ('cluster',) dimension) being copied as-is during expansion, which caused the cluster dimension to incorrectly persist in the expanded dataset. Fix applied in transform_accessor.py:2063-2065: # Skip cluster-only vars (no time dim) - they don't make sense after expansion if da.dims == ('cluster',): continue This skips variables that have only a cluster dimension (and no time dimension) during expansion, as these variables don't make sense after the clustering structure is removed. Test results: - All 87 tests in test_cluster_reduce_expand.py pass ✓ - All 43 tests in test_clustering_io.py pass ✓ - Manual IO roundtrip test passes ✓ - Tests with different segment counts (3, 6) pass ✓ - Tests with 2-hour timesteps pass ✓ * Updated condition in transform_accessor.py:2063-2066: # Skip vars with cluster dim but no time dim - they don't make sense after expansion # (e.g., representative_weights with dims ('cluster',) or ('cluster', 'period')) if 'cluster' in da.dims and 'time' not in da.dims: continue This correctly handles: - ('cluster',) - simple cluster-only variables like cluster_weight - ('cluster', 'period') - cluster variables with period dimension - ('cluster', 'scenario') - cluster variables with scenario dimension - ('cluster', 'period', 'scenario') - cluster variables with both Variables with both cluster and time dimensions (like timestep_duration with dims ('cluster', 'time')) are correctly expanded since they contain time-series data that needs to be mapped back to original timesteps. * Summary of Fixes 1. clustering/base.py - combine_slices() hardening (lines 52-118) - Added validation for empty input: if not slices: raise ValueError("slices cannot be empty") - Capture first array and preserve dtype: first = next(iter(slices.values())) → np.empty(shape, dtype=first.dtype) - Clearer error on missing keys with try/except: raise KeyError(f"Missing slice for key {key} (extra_dims={extra_dims})") 2. flow_system.py - Variable categories cleanup and safe enum restoration - Added self._variable_categories.clear() in _invalidate_model() (line 1692) to prevent stale categories from being reused - Hardened VariableCategory restoration (lines 922-930) with try/except to handle unknown/renamed enum values gracefully with a warning instead of crashing 3. transform_accessor.py - Correct timestep_mapping decode for segmented systems (lines 1850-1857) - For segmented systems, now uses clustering.n_segments instead of clustering.timesteps_per_cluster as the divisor - This matches the encoding logic in expand_data() and build_expansion_divisor() * Added test_segmented_total_effects_match_solution to TestSegmentation class * Added all remaining tsam.aggregate() paramaters and missing type hint * Added all remaining tsam.aggregate() paramaters and missing type hint * Updated expression_tracking_variable modeling.py:200-242 - Added category: VariableCategory = None parameter and passed it to both add_variables calls. Updated Callers ┌─────────────┬──────┬─────────────────────────┬────────────────────┐ │ File │ Line │ Variable │ Category │ ├─────────────┼──────┼─────────────────────────┼────────────────────┤ │ features.py │ 208 │ active_hours │ TOTAL │ ├─────────────┼──────┼─────────────────────────┼────────────────────┤ │ elements.py │ 682 │ total_flow_hours │ TOTAL │ ├─────────────┼──────┼─────────────────────────┼────────────────────┤ │ elements.py │ 709 │ flow_hours_over_periods │ TOTAL_OVER_PERIODS │ └─────────────┴──────┴─────────────────────────┴────────────────────┘ All expression tracking variables now properly register their categories for segment expansion handling. The pattern is consistent: callers specify the appropriate category based on what the tracked expression represents. * Added to flow_system.py variable_categories property (line 1672): @Property def variable_categories(self) -> dict[str, VariableCategory]: """Variable categories for filtering and segment expansion.""" return self._variable_categories get_variables_by_category() method (line 1681): def get_variables_by_category( self, *categories: VariableCategory, from_solution: bool = True ) -> list[str]: """Get variable names matching any of the specified categories.""" Updated in statistics_accessor.py ┌───────────────┬──────────────────────────────────────────┬──────────────────────────────────────────────────┐ │ Property │ Before │ After │ ├───────────────┼──────────────────────────────────────────┼──────────────────────────────────────────────────┤ │ flow_rates │ endswith('|flow_rate') │ get_variables_by_category(FLOW_RATE) │ ├───────────────┼──────────────────────────────────────────┼──────────────────────────────────────────────────┤ │ flow_sizes │ endswith('|size') + flow_labels check │ get_variables_by_category(SIZE) + flow_labels │ ├───────────────┼──────────────────────────────────────────┼──────────────────────────────────────────────────┤ │ storage_sizes │ endswith('|size') + storage_labels check │ get_variables_by_category(SIZE) + storage_labels │ ├───────────────┼──────────────────────────────────────────┼──────────────────────────────────────────────────┤ │ charge_states │ endswith('|charge_state') │ get_variables_by_category(CHARGE_STATE) │ └───────────────┴──────────────────────────────────────────┴──────────────────────────────────────────────────┘ Benefits 1. Single source of truth - Categories defined once in VariableCategory enum 2. Easier maintenance - Adding new variable types only requires updating one place 3. Type safety - Using enum values instead of magic strings 4. Flexible filtering - Can filter by multiple categories: get_variables_by_category(SIZE, INVESTED) 5. Consistent naming - Uses rsplit('|', 1)[0] instead of replace('|suffix', '') for label extraction * Ensure backwards compatability * Summary of Changes 1. New SIZE Sub-Categories (structure.py) - Added FLOW_SIZE and STORAGE_SIZE to differentiate flow vs storage investments - Kept SIZE for backward compatibility 2. InvestmentModel Updated (features.py) - Added size_category parameter to InvestmentModel.__init__() - Callers now specify the appropriate category 3. Variable Registrations Updated - elements.py: FlowModel uses FLOW_SIZE - components.py: StorageModel uses STORAGE_SIZE (2 locations) 4. Statistics Accessor Simplified (statistics_accessor.py) - flow_sizes: Now uses get_variables_by_category(FLOW_SIZE) directly - storage_sizes: Now uses get_variables_by_category(STORAGE_SIZE) directly - No more filtering by element labels after getting SIZE variables 5. Backward-Compatible Fallback (flow_system.py) - get_variables_by_category() handles old files: - FLOW_SIZE → matches |size suffix + flow labels - STORAGE_SIZE → matches |size suffix + storage labels 6. SOC Boundary Pattern Matching Replaced (transform_accessor.py) - Changed from endswith('|SOC_boundary') to get_variables_by_category(SOC_BOUNDARY) 7. Effect Variables Verified - PER_TIMESTEP ✓ (features.py:659) - SHARE ✓ (features.py:700 for temporal shares) - TOTAL / TOTAL_OVER_PERIODS ✓ (multiple locations) 8. Documentation Updated - _build_segment_total_varnames() marked as backwards-compatibility fallback Benefits - Cleaner code: No more string manipulation to filter by element type - Type safety: Using enum values instead of magic strings - Single source of truth: Categories defined once, used everywhere - Backward compatible: Old files still work via fallback logic --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

…ase.py: - Before: 852 calls × 1.2ms = 1.01s - After: 1 call × 1.2ms = 0.001s (cached)

* from_dataset() - Fast null check (structure.py) ┌───────────────────┬──────────────────────┬────────────────────┐ │ Metric │ Before │ After │ ├───────────────────┼──────────────────────┼────────────────────┤ │ Time │ 61ms │ 38ms │ ├───────────────────┼──────────────────────┼────────────────────┤ │ Null check method │ array.isnull().any() │ np.any(np.isnan()) │ ├───────────────────┼──────────────────────┼────────────────────┤ │ Speedup │ - │ 38% │ └───────────────────┴──────────────────────┴────────────────────┘ # xarray isnull().any() was 200x slower than numpy has_nulls = ( np.issubdtype(array.dtype, np.floating) and np.any(np.isnan(array.values)) ) or ( array.dtype == object and pd.isna(array.values).any() ) * Summary of Performance Optimizations The following optimizations were implemented: 1. timestep_mapping caching (clustering/base.py) - Changed @Property to @functools.cached_property - 2.3x speedup for expand() 2. Numpy null check (structure.py:902-904) - Replaced xarray's slow isnull().any() with numpy np.isnan(array.values) - 26x faster null checking 3. Simplified from_dataset() (flow_system.py) - Removed _LazyArrayDict class as you suggested - all arrays are accessed anyway - Single iteration over dataset variables, reused for clustering restoration - Cleaner, more maintainable code Final Results for Large FlowSystem (2190 timesteps, 12 periods, 125 components with solution) ┌────────────────┬────────┬────────┬───────────────────┐ │ Operation │ Before │ After │ Speedup │ ├────────────────┼────────┼────────┼───────────────────┤ │ from_dataset() │ ~400ms │ ~120ms │ 3.3x │ ├────────────────┼────────┼────────┼───────────────────┤ │ expand() │ ~1.92s │ ~0.84s │ 2.3x │ ├────────────────┼────────┼────────┼───────────────────┤ │ to_dataset() │ ~119ms │ ~119ms │ (already optimal) │ └────────────────┴────────┴────────┴───────────────────┘ * Add IO performance benchmark script Benchmark for measuring to_dataset() and from_dataset() performance with large FlowSystems (2190 timesteps, 12 periods, 125 components). Usage: python benchmarks/benchmark_io_performance.py Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Fast DataArray construction in from_dataset() Use ds._variables directly instead of ds[name] to bypass the slow _construct_dataarray method. For large datasets (5771 vars): - Before: ~10s - After: ~1.5s - Speedup: 6.5x Also use dataset subsetting for solution restoration instead of building DataArrays one by one. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Cache coordinates for 40x total speedup Pre-cache coordinate DataArrays to avoid repeated _construct_dataarray calls when building config arrays. Real-world benchmark (5771 vars, 209 MB): - Before all optimizations: ~10s - After: ~250ms - Total speedup: 40x Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * refactoring is complete. Here's a summary of the changes: Changes Made flixopt/io.py (additions) - Added DatasetParser dataclass (lines 1439-1520) with: - Fields for holding parsed dataset state (ds, reference_structure, arrays_dict, etc.) - from_dataset() classmethod for parsing with fast DataArray construction - _fast_get_dataarray() static method for performance optimization - Added restoration helper functions: - restore_flow_system_from_dataset() - Main entry point (lines 1523-1553) - _create_flow_system() - Creates FlowSystem instance (lines 1556-1623) - _restore_elements() - Restores components, buses, effects (lines 1626-1664) - _restore_solution() - Restores solution dataset (lines 1667-1690) - _restore_clustering() - Restores clustering object (lines 1693-1742) - _restore_metadata() - Restores carriers and variable categories (lines 1745-1778) flixopt/flow_system.py (reduction) - Replaced ~192-line from_dataset() method with a 1-line delegation to fx_io.restore_flow_system_from_dataset(ds) (line 799) Verification - All 64 dataset/netcdf related tests passed - Benchmark shows excellent performance: from_dataset() at 26.4ms with 0.1ms standard deviation - Imports work correctly with no circular dependency issues * perf: Fast solution serialization in to_dataset() Use _variables directly instead of data_vars.items() to avoid slow _construct_dataarray calls when adding solution variables. Real-world benchmark (5772 vars, 209 MB): - Before: ~1374ms - After: ~186ms - Speedup: 7.4x Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * refactor: Move to_dataset serialization logic to io.py Extract FlowSystem-specific serialization into io.py module: - flow_system_to_dataset(): Main orchestration - _add_solution_to_dataset(): Fast solution serialization - _add_carriers_to_dataset(): Carrier definitions - _add_clustering_to_dataset(): Clustering arrays - _add_variable_categories_to_dataset(): Variable categories - _add_model_coords(): Model coordinates FlowSystem.to_dataset() now delegates to io.py, matching the pattern used by from_dataset(). Performance unchanged (~183ms for 5772 vars). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * I've refactored the IO code into a unified FlowSystemDatasetIO class. Here's the summary: Changes made to flixopt/io.py: 1. Created FlowSystemDatasetIO class (lines 1439-1854) that consolidates: - Shared constants: SOLUTION_PREFIX = 'solution|' and CLUSTERING_PREFIX = 'clustering|' - Deserialization methods (Dataset → FlowSystem): - from_dataset() - main entry point - _separate_variables(), _fast_get_dataarray(), _create_flow_system(), _restore_elements(), _restore_solution(), _restore_clustering(), _restore_metadata() - Serialization methods (FlowSystem → Dataset): - to_dataset() - main entry point - _add_solution_to_dataset(), _add_carriers_to_dataset(), _add_clustering_to_dataset(), _add_variable_categories_to_dataset(), _add_model_coords() 2. Simplified public API functions (lines 1857-1903) that delegate to the class: - restore_flow_system_from_dataset() → FlowSystemDatasetIO.from_dataset() - flow_system_to_dataset() → FlowSystemDatasetIO.to_dataset() Benefits: - Shared prefixes defined once as class constants - Clear organization: deserialization methods grouped together, serialization methods grouped together - Same public API preserved (no changes needed to flow_system.py) - Performance maintained: ~264ms from_dataset(), ~203ms to_dataset() * Updated to use the public ds.variables API instead of ds._variables * NetCDF I/O Performance Improvements ┌──────────────────────────┬───────────┬────────┬─────────┐ │ Operation │ Before │ After │ Speedup │ ├──────────────────────────┼───────────┼────────┼─────────┤ │ to_netcdf(compression=5) │ ~10,250ms │ ~896ms │ 11.4x │ ├──────────────────────────┼───────────┼────────┼─────────┤ │ from_netcdf() │ ~895ms │ ~532ms │ 1.7x │ └──────────────────────────┴───────────┴────────┴─────────┘ Key Optimizations _stack_equal_vars() (for to_netcdf): - Used ds.variables instead of ds[name] to avoid _construct_dataarray - Used np.stack() instead of xr.concat() for much faster array stacking - Created xr.Variable objects directly instead of DataArrays _unstack_vars() (for from_netcdf): - Used ds.variables for direct variable access - Used np.take() instead of var.sel() for fast numpy indexing - Created xr.Variable objects directly --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

…579) * perf: Use ds.variables to avoid _construct_dataarray overhead Optimize several functions by using ds.variables instead of iterating over data_vars.items() or accessing ds[name], which triggers slow _construct_dataarray calls. Changes: - io.py: save_dataset_to_netcdf, load_dataset_from_netcdf, _reduce_constant_arrays - structure.py: from_dataset (use coord_cache pattern) - core.py: drop_constant_arrays (use numpy operations) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Optimize clustering serialization with ds.variables Use ds.variables for faster access in clustering/base.py: - _create_reference_structure: original_data and metrics iteration - compare plot: duration_curve generation with direct numpy indexing Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Use batch assignment for clustering arrays (24x speedup) _add_clustering_to_dataset was slow due to 210 individual ds[name] = arr assignments. Each triggers xarray's expensive dataset_update_method. Changed to batch assignment with ds.assign(dict): - Before: ~2600ms for to_dataset with clustering - After: ~109ms for to_dataset with clustering Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Use ds.variables in _build_reduced_dataset (12% faster) Avoided _construct_dataarray overhead by: - Using ds.variables instead of ds.data_vars.items() - Using numpy slicing instead of .isel() - Passing attrs dict directly instead of DataArray cluster() benchmark: - Before: ~10.1s - After: ~8.9s Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Use numpy reshape in _build_typical_das (4.4x faster) Eliminated 451,856 slow pandas .loc calls by using numpy reshape for segmented clustering data instead of iterating per-cluster. cluster() with segments benchmark (50 clusters, 4 segments): - Before: ~93.7s - After: ~21.1s - Speedup: 4.4x Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: Multiple clustering and IO bug fixes - benchmark_io_performance.py: Add Gurobi → HiGHS solver fallback - components.py: Fix storage decay to use sum (not mean) for hours per cluster - flow_system.py: Add RangeIndex validation requiring explicit timestep_duration - io.py: Include auxiliary coordinates in _fast_get_dataarray - transform_accessor.py: Add empty dataset guard after drop_constant_arrays - transform_accessor.py: Fix timestep_mapping indexing for segmented clustering Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * perf: Use ds.variables pattern in expand() (2.2x faster) Replace data_vars.items() iteration with ds.variables pattern to avoid slow _construct_dataarray calls (5502 calls × ~1.5ms each). Before: 3.73s After: 1.72s Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

…tant_arrays() in clustering_data() that raises a clear ValueError if all variables are constant, preventing cryptic to_dataframe() indexing errors. 3. Lines 1978-1984 (fixed indexing): Simplified the interpolation logic to consistently use timesteps_per_cluster for both cluster index division and time index modulo. The segment_assignments and position_within_segment arrays are keyed by (cluster, timesteps_per_cluster), so the time index must be derived from timestep_mapping % timesteps_per_cluster, not n_segments.

…tion) pattern. Here's a summary: Created Files ┌───────────────────────────────┬────────────────────────────────────────────┐ │ File │ Description │ ├───────────────────────────────┼────────────────────────────────────────────┤ │ flixopt/flixopt/vectorized.py │ Core DCE infrastructure (production-ready) │ ├───────────────────────────────┼────────────────────────────────────────────┤ │ test_dce_pattern.py │ Standalone test demonstrating the pattern │ ├───────────────────────────────┼────────────────────────────────────────────┤ │ DESIGN_PROPOSAL.md │ Detailed design documentation │ └───────────────────────────────┴────────────────────────────────────────────┘ Benchmark Results Elements Timesteps Old (ms) DCE (ms) Speedup -------------------------------------------------------- 10 24 116.72 21.15 5.5x 50 168 600.97 22.55 26.6x 100 168 1212.95 22.72 53.4x 200 168 2420.73 23.58 102.6x 500 168 6108.10 24.75 246.8x The DCE pattern shows near-constant time regardless of element count, while the old pattern scales linearly. Key Components 1. VariableSpec - Immutable declaration of what an element needs: VariableSpec( category='flow_rate', # Groups similar vars for batching element_id='Boiler(Q_th)', # Becomes coordinate in batched var lower=0, upper=100, dims=('time', 'scenario'), ) 2. VariableRegistry - Collects specs and batch-creates: registry.register(spec) # Collect (no linopy calls) registry.create_all() # One linopy call per category handle = registry.get_handle('flow_rate', 'Boiler') 3. ConstraintSpec - Deferred constraint building: ConstraintSpec( category='flow_bounds', element_id='Boiler', build_fn=lambda model, handles: ConstraintResult( lhs=handles['flow_rate'].variable, rhs=100, sense='<=', ), ) Next Steps for Integration 1. Add declare_variables() / declare_constraints() to ElementModel - default returns empty list (backward compatible) 2. Modify FlowSystemModel.do_modeling() - add DCE phases alongside existing code 3. Migrate one element type (e.g., Flow) to test the pattern 4. Gradually migrate others - can be done incrementally The Interface classes remain unchanged - this only affects the internal modeling layer.

What Was Implemented 1. FlowModel DCE Interface (elements.py:672-875) declare_variables() returns specs for: - flow_rate (always) - main optimization variable - status (if with_status) - binary on/off variable - total_flow_hours (always) - aggregated per period - flow_hours_over_periods (if constrained) - weighted sum across periods declare_constraints() returns specs for: - flow_rate_ub/lb (if status without investment) - status-controlled bounds - total_flow_hours_eq - tracking constraint - flow_hours_over_periods_eq (if constrained) - load_factor_max/min (if set) 2. Base ElementModel DCE Interface (structure.py:2009-2039) - declare_variables() - returns empty list by default - declare_constraints() - returns empty list by default - on_variables_created() - no-op by default 3. FlowSystemModel DCE Support (structure.py:243-317) - do_modeling_dce() - alternative to do_modeling() using DCE pattern What's NOT Yet Implemented For a complete Flow DCE implementation, these are still needed: 1. Investment case (with_investment=True) - size variable declaration 2. StatusModel integration - startup/shutdown constraints 3. Effects system - needs to work with batched variables 4. Bus balance - system constraint using all flow_rates Usage Example # Traditional approach (still works) model = flow_system.create_model() model.do_modeling() # DCE approach (experimental) model = flow_system.create_model() model.do_modeling_dce() # Uses batch creation where supported Next Steps 1. Add Investment DCE - extend FlowModel for with_investment case 2. Migrate StatusModel - add DCE interface to StatusModel 3. Migrate other elements - Bus, Storage, LinearConverter, etc. 4. Add benchmarking - compare do_modeling() vs do_modeling_dce() Would you like me to continue with Investment support, or focus on a different element type?

…ounds issue in the old pattern simulation (not our code). The important thing is that the DCE demo runs successfully, showing that the pattern works. Let me give you a summary of what's been accomplished: Summary The DCE (Declaration-Collection-Execution) pattern is now working with the real FlowSystem. Here's what was implemented: Files Modified 1. flixopt/vectorized.py - Core DCE infrastructure: - VariableSpec, ConstraintSpec, ConstraintResult dataclasses - VariableRegistry, ConstraintRegistry for batch operations - VariableHandle for element access to batched variables 2. flixopt/structure.py: - Added _dce_mode flag to FlowSystemModel - Added do_modeling_dce() method for DCE workflow - Added base DCE methods to ElementModel 3. flixopt/elements.py: - Added DCE interface to FlowModel (declare_variables(), declare_constraints(), on_variables_created()) - Added _dce_mode check to FlowModel._do_modeling() - Added _dce_mode check to ComponentModel._do_modeling() - Added _dce_mode check to BusModel._do_modeling() 4. flixopt/components.py: - Added _dce_mode check to LinearConverterModel._do_modeling() - Added _dce_mode check to TransmissionModel._do_modeling() - Added _dce_mode check to StorageModel._do_modeling() - Added _dce_mode check to InterclusterStorageModel._do_modeling() Performance Results The benchmark shows significant speedups: - 10 elements: 5.6x faster - 50 elements: 27.2x faster - 100 elements: 55.7x faster - 200 elements: 103.8x faster - 500 elements: 251.4x faster Remaining Tasks The current implementation only batches flow variables. To complete the DCE pattern, the following still need to be done: 1. Add component constraints to DCE - LinearConverter conversion equations, Storage balance constraints 2. Add Bus balance constraints to DCE 3. Add Investment support to FlowModel DCE 4. Add StatusModel DCE support

* ⏺ Done. I've applied broadcasts to all four BoundingPatterns methods that take bound tuples: 1. basic_bounds - Added xr.broadcast(lower_bound, upper_bound) 2. bounds_with_state - Added xr.broadcast(lower_bound, upper_bound) 3. scaled_bounds - Added xr.broadcast(rel_lower, rel_upper) 4. scaled_bounds_with_state - Added broadcasts for both relative_bounds and scaling_bounds tuples The state_transition_bounds and continuous_transition_bounds methods don't take bound tuples, so they don't need this fix. Summary of changes: - flixopt/modeling.py: Added xr.broadcast() calls in all four bounding methods to ensure bound pairs always have compatible dimensions - flixopt/components.py: Added xr.broadcast() at the end of _relative_charge_state_bounds (kept as defensive measure) This should handle all cases where a scalar bound (e.g., relative_minimum=0) is paired with a time-varying bound that may have additional dimensions like cluster. * Changes made: 1. Added _xr_allclose() helper in modeling.py:79-95 - uses xarray operations that handle broadcasting natively: def _xr_allclose(a: xr.DataArray, b: xr.DataArray, atol: float = 1e-10) -> bool: diff = a - b # xarray broadcasts automatically is_close = (abs(diff) <= atol) | (a.isnull() & b.isnull()) return bool(is_close.all()) 2. Removed all xr.broadcast() calls from: - BoundingPatterns.basic_bounds - BoundingPatterns.bounds_with_state - BoundingPatterns.scaled_bounds - BoundingPatterns.scaled_bounds_with_state - StorageModel._relative_charge_state_bounds 3. Replaced np.allclose() with _xr_allclose() in bounds_with_state and scaled_bounds The key insight: xarray arithmetic (a - b) handles broadcasting automatically, while np.allclose() does not. By using xarray operations for the comparison, we avoid the shape mismatch entirely without needing explicit broadcasts everywhere. * ⏺ The inheritance chain handles it: - _relative_charge_state_bounds → broadcasts → used by _absolute_charge_state_bounds - relative_flow_rate_bounds → broadcasts → used by absolute_flow_rate_bounds So the downstream properties automatically get aligned data. Final architecture: 1. Interface layer (the *_bounds properties) broadcasts once when returning tuples 2. BoundingPatterns uses _xr_allclose which handles xarray operations gracefully (as safety net) 3. No redundant broadcasting in constraint creation The _xr_allclose helper is still valuable as it's cleaner than np.allclose for xarray data and handles NaN correctly. It just won't need to do any broadcasting work now since inputs are pre-aligned. * ⏺ With @functools.cached_property: - 230 → 60 calls (one per element instead of 3-4 per element) - 74% reduction in broadcast overhead - ~12ms instead of ~45ms for a typical model * Speedup _xr_allclose

We achieved 8.3x speedup (up from 1.9x) by implementing true constraint batching. Key Change In vectorized.py, added _batch_total_flow_hours_eq() that creates one constraint for all 203 flows instead of 203 individual calls: # Before: 203 calls × ~5ms each = 1059ms for spec in specs: model.add_constraints(...) # After: 1 call = 10ms flow_rate = var_registry.get_full_variable('flow_rate') # (203, 168) total_flow_hours = var_registry.get_full_variable('total_flow_hours') # (203,) model.add_constraints(total_flow_hours == sum_temporal(flow_rate))

Problem: When flows have effects_per_flow_hour, the speedup dropped from 8.3x to 1.5x because effect shares were being created one-at-a-time. Root Causes Fixed: 1. Factors are converted to DataArrays during transformation, even for constant values like 30. Fixed by detecting constant DataArrays and extracting the scalar. 2. Coordinate access was using .coords[dim] on an xr.Coordinates object, which should be just [dim]. Results with Effects: ┌────────────┬───────────┬─────────────┬───────┬─────────┐ │ Converters │ Timesteps │ Traditional │ DCE │ Speedup │ ├────────────┼───────────┼─────────────┼───────┼─────────┤ │ 20 │ 168 │ 1242ms │ 152ms │ 8.2x │ ├────────────┼───────────┼─────────────┼───────┼─────────┤ │ 50 │ 168 │ 2934ms │ 216ms │ 13.6x │ ├────────────┼───────────┼─────────────┼───────┼─────────┤ │ 100 │ 168 │ 5772ms │ 329ms │ 17.5x │ └────────────┴───────────┴─────────────┴───────┴─────────┘ The effect_shares phase now takes ~45ms for 304 effect shares (previously ~3900ms).

Before (40+ lines): - Built numpy array of scalars - Checked each factor type (int/float/DataArray) - Detected constant DataArrays by comparing all values - Had fallback path for time-varying factors After (10 lines): spec_map = {spec.element_id: spec.factor for spec in specs} factors_list = [spec_map.get(eid, 0) for eid in element_ids] factors_da = xr.concat( [xr.DataArray(f) if not isinstance(f, xr.DataArray) else f for f in factors_list], dim='element', ).assign_coords(element=element_ids) xarray handles all the broadcasting automatically - whether factors are scalars, constant DataArrays, or truly time-varying DataArrays.

Constraint Batching Progress ┌────────────────────────────┬────────────┬───────────────────────────────┐ │ Constraint Type │ Status │ Notes │ ├────────────────────────────┼────────────┼───────────────────────────────┤ │ total_flow_hours_eq │ ✅ Batched │ All flows │ ├────────────────────────────┼────────────┼───────────────────────────────┤ │ flow_hours_over_periods_eq │ ✅ Batched │ Flows with period constraints │ ├────────────────────────────┼────────────┼───────────────────────────────┤ │ flow_rate_ub │ ✅ Batched │ Flows with status │ ├────────────────────────────┼────────────┼───────────────────────────────┤ │ flow_rate_lb │ ✅ Batched │ Flows with status │ └────────────────────────────┴────────────┴───────────────────────────────┘ Benchmark Results (Status Flows) ┌────────────┬─────────────┬───────┬─────────┐ │ Converters │ Traditional │ DCE │ Speedup │ ├────────────┼─────────────┼───────┼─────────┤ │ 20 │ 916ms │ 146ms │ 6.3x │ ├────────────┼─────────────┼───────┼─────────┤ │ 50 │ 2207ms │ 220ms │ 10.0x │ ├────────────┼─────────────┼───────┼─────────┤ │ 100 │ 4377ms │ 340ms │ 12.9x │ └────────────┴─────────────┴───────┴─────────┘ Benchmark Results (Effects) ┌────────────┬─────────────┬───────┬─────────┐ │ Converters │ Traditional │ DCE │ Speedup │ ├────────────┼─────────────┼───────┼─────────┤ │ 20 │ 1261ms │ 157ms │ 8.0x │ ├────────────┼─────────────┼───────┼─────────┤ │ 50 │ 2965ms │ 223ms │ 13.3x │ ├────────────┼─────────────┼───────┼─────────┤ │ 100 │ 5808ms │ 341ms │ 17.0x │ └────────────┴─────────────┴───────┴─────────┘ Remaining Tasks 1. Add Investment support to FlowModel DCE - Investment variables/constraints aren't batched yet 2. Add StatusModel DCE support - StatusModel (active_hours, startup_count, etc.) isn't using DCE

…reation

What was implemented: 1. Added finalize_dce() method to FlowModel (elements.py:904-927) - Called after all DCE variables and constraints are created - Creates StatusModel submodel using the already-created status variable from DCE handles 2. Updated do_modeling_dce() in structure.py (lines 354-359) - Added finalization step that calls finalize_dce() on each element model - Added timing measurement for the finalization phase Performance Results: ┌───────────────────────────────────────┬─────────────┬────────┬─────────┐ │ Configuration │ Traditional │ DCE │ Speedup │ ├───────────────────────────────────────┼─────────────┼────────┼─────────┤ │ Investment only (100 converters) │ 4417ms │ 284ms │ 15.6x │ ├───────────────────────────────────────┼─────────────┼────────┼─────────┤ │ With StatusParameters (50 converters) │ 4161ms │ 2761ms │ 1.5x │ └───────────────────────────────────────┴─────────────┴────────┴─────────┘ Why StatusModel is slower: The finalize_dce phase takes 94.5% of DCE time when StatusParameters are used because: - StatusModel uses complex patterns (consecutive_duration_tracking, state_transition_bounds) - Each pattern creates multiple constraints individually via linopy - Full optimization would require batching these patterns across all StatusModels Verification: - Both traditional and DCE models solve to identical objectives - StatusModel is correctly created with all variables (active_hours, uptime, etc.) and constraints - All flow configurations work: simple, investment, status, and investment+status

Phase 1 Summary: Foundation Changes to flixopt/structure.py 1. Added new categorization enums (lines 150-231): - ElementType: Categorizes element types (FLOW, BUS, STORAGE, CONVERTER, EFFECT) - VariableType: Semantic variable types (FLOW_RATE, STATUS, CHARGE_STATE, etc.) - ConstraintType: Constraint categories (TRACKING, BOUNDS, BALANCE, LINKING, etc.) 2. Added ExpansionCategory alias (line 147): - ExpansionCategory = VariableCategory for backward compatibility - Clarifies that VariableCategory is specifically for segment expansion behavior 3. Added VARIABLE_TYPE_TO_EXPANSION mapping (lines 239-255): - Maps VariableType to ExpansionCategory for segment expansion logic - Connects the new enum system to existing expansion handling 4. Created TypeModel base class (lines 264-508): - Abstract base class for type-level models (one per element TYPE, not instance) - Key methods: - add_variables(): Creates batched variables with element dimension - add_constraints(): Creates batched constraints - _build_coords(): Builds coordinate dict with element + model dimensions - _stack_bounds(): Stacks per-element bounds into DataArrays - get_variable(): Gets variable, optionally sliced to specific element - Abstract methods: create_variables(), create_constraints() Verification - All imports work correctly - 172 tests pass (test_functional, test_component, test_flow, test_effect)

1. Created FlowsModel(TypeModel) class (elements.py:1404-1850): - Handles ALL flows in a single instance with batched variables - Categorizes flows by features: flows_with_status, flows_with_investment, etc. - Creates batched variables: flow_rate, total_flow_hours, status, size, invested, flow_hours_over_periods - Creates batched constraints: tracking, bounds (status, investment, both), investment linkage - Includes create_effect_shares() for batched effect contribution 2. Added do_modeling_type_level() method (structure.py:761-848): - Alternative to do_modeling() and do_modeling_dce() - Uses FlowsModel for all flows instead of individual FlowModel instances - Includes timing breakdown for performance analysis 3. Added element access pattern to Flow class (elements.py:648-685): - set_flows_model(): Sets reference to FlowsModel - flow_rate_from_type_model: Access slice of batched variable - total_flow_hours_from_type_model: Access slice - status_from_type_model: Access slice (if applicable) Verification - All 154 tests pass (test_functional, test_flow, test_component) - Element access pattern tested and working - Timing breakdown shows type-level modeling working correctly

- Created BusesModel(TypeModel) class that handles ALL buses in one instance - Creates batched virtual_supply and virtual_demand variables for buses with imbalance penalty - Creates bus balance constraints: sum(inputs) == sum(outputs) (with virtual supply/demand adjustment for imbalance) - Created BusModelProxy for lightweight proxy in type-level mode Effect Shares Refactoring The effect shares pattern was refactored for cleaner architecture: Before: TypeModels directly modified effect constraints After: TypeModels declare specs → Effects system applies them 1. FlowsModel now has: - collect_effect_share_specs() - returns dict of effect specs - create_effect_shares() - delegates to EffectCollectionModel 2. BusesModel now has: - collect_penalty_share_specs() - returns list of penalty expressions - create_effect_shares() - delegates to EffectCollectionModel 3. EffectCollectionModel now has: - apply_batched_flow_effect_shares() - applies flow effect specs in bulk - apply_batched_penalty_shares() - applies penalty specs in bulk

Architecture TypeModels declare specs → Effects applies them in bulk 1. FlowsModel.collect_effect_share_specs() - Returns dict of effect specs 2. BusesModel.collect_penalty_share_specs() - Returns list of penalty specs 3. EffectCollectionModel.apply_batched_flow_effect_shares() - Creates batched share variables 4. EffectCollectionModel.apply_batched_penalty_shares() - Creates penalty share variables Per-Element Contribution Visibility The share variables now preserve per-element information: flow_effects->costs(temporal) dims: ('element', 'time') element coords: ['Grid(elec)', 'HP(elec_in)'] You can query individual contributions: # Get Grid's contribution to costs grid_costs = results['flow_effects->costs(temporal)'].sel(element='Grid(elec)') # Get HP's contribution hp_costs = results['flow_effects->costs(temporal)'].sel(element='HP(elec_in)') Performance Still maintains 8.8-14.2x speedup because: - ONE batched variable per effect (not one per element) - ONE vectorized constraint per effect - Element dimension enables per-element queries without N separate variables

Architecture - StoragesModel - handles ALL basic (non-intercluster) storages in one instance - StorageModelProxy - lightweight proxy for individual storages in type-level mode - InterclusterStorageModel - still uses traditional approach (too complex to batch) Variables (batched with element dimension) - storage|charge_state: (element, time+1, ...) - with extra timestep for energy balance - storage|netto_discharge: (element, time, ...) Constraints (per-element due to varying parameters) - netto_discharge: discharge - charge - charge_state: Energy balance constraint - initial_charge_state: Initial SOC constraint - final_charge_max/min: Final SOC bounds - cluster_cyclic: For cyclic cluster mode Performance Type-level approach now has: - 8.9-12.3x speedup for 50-200 converters with 100 timesteps - 4.2x speedup for 100 converters with 500 timesteps (constraint creation becomes bottleneck) Implemented Type-Level Models 1. FlowsModel - all flows 2. BusesModel - all buses 3. StoragesModel - basic (non-intercluster) storages

I've added investment categorization to StoragesModel batched constraints: Changes Made 1. components.py - create_investment_constraints() method (lines 1946-1998) - Added a new method that creates scaled bounds constraints for storages with investment - Must be called AFTER component models are created (since it needs investment.size variables) - Uses per-element constraint creation because each storage has its own investment size variable - Handles both variable bounds (lb and ub) and fixed bounds (when rel_lower == rel_upper) 2. components.py - StorageModelProxy._do_modeling() (lines 2088-2104) - Removed the inline BoundingPatterns.scaled_bounds() call - Added comment explaining that scaled bounds are now created by StoragesModel.create_investment_constraints() 3. structure.py - do_modeling_type_level() (lines 873-877) - Added call to _storages_model.create_investment_constraints() after component models are created - Added timing tracking for storages_investment step Architecture Note The investment constraints are created per-element (not batched) because each storage has its own investment.size variable. True batching would require a InvestmentsModel with a shared size variable having an element dimension. This is documented in the method docstring and is a pragmatic choice that: - Works correctly - Maintains the benefit of batched variables (charge_state, netto_discharge) - Keeps the architecture simple

A type-level model that handles ALL elements with investment at once with batched variables: Variables created: - investment|size - Batched size variable with element dimension - investment|invested - Batched binary variable with element dimension (non-mandatory only) Constraints created: - investment|size|lb / investment|size|ub - State-controlled bounds for non-mandatory - Per-element linked_periods constraints when applicable Effect shares: - Fixed effects (effects_of_investment) - Per-size effects (effects_of_investment_per_size) - Retirement effects (effects_of_retirement) Updated: StoragesModel (components.py) - Added _investments_model attribute - New method create_investment_model() - Creates batched InvestmentsModel - Updated create_investment_constraints() - Uses batched size variable for truly vectorized scaled bounds Updated: StorageModelProxy (components.py) - Removed per-element InvestmentModel creation - investment property now returns _InvestmentProxy that accesses batched variables New Class: _InvestmentProxy (components.py:31-50) Proxy class providing access to batched investment variables for a specific element: storage.submodel.investment.size # Returns slice: investment|size[element_id] storage.submodel.investment.invested # Returns slice: investment|invested[element_id] Updated: do_modeling_type_level() (structure.py) Order of operations: 1. StoragesModel.create_variables() - charge_state, netto_discharge 2. StoragesModel.create_constraints() - energy balance 3. StoragesModel.create_investment_model() - batched size/invested 4. StoragesModel.create_investment_constraints() - batched scaled bounds 5. Component models (StorageModelProxy skips InvestmentModel) Benefits - Single investment|size variable with element dimension vs N per-element variables - Vectorized constraint creation for scaled bounds - Consistent architecture with FlowsModel/BusesModel

… a summary of the changes: Changes Made: 1. features.py - Added InvestmentProxy class (lines 157-176) - Provides same interface as InvestmentModel (.size, .invested) - Returns slices from batched InvestmentsModel variables - Shared between FlowModelProxy and StorageModelProxy 2. elements.py - Updated FlowModelProxy - Added import for InvestmentProxy (line 18) - Updated investment property (lines 788-800) to return InvestmentProxy instead of None 3. structure.py - Added call to FlowsModel.create_investment_model() (lines 825-828) - Creates batched investment variables, constraints, and effect shares for flows 4. components.py - Cleaned up - Removed local _InvestmentProxy class (moved to features.py) - Import InvestmentProxy from features.py Test Results: - All 88 flow tests pass (including all investment-related tests) - All 48 storage tests pass - All 26 functional tests pass The batched InvestmentsModel now handles both Storage and Flow investments with: - Batched size and invested variables with element dimension - Vectorized constraint creation - Batched effect shares for investment costs

New Classes Added (features.py): 1. StatusProxy (lines 529-563) - Provides per-element access to batched StatusesModel variables: - active_hours, startup, shutdown, inactive, startup_count properties 2. StatusesModel (lines 566-964) - Type-level model for batched status features: - Categorization by feature flags: - All status elements get active_hours - Elements with use_startup_tracking get startup, shutdown - Elements with use_downtime_tracking get inactive - Elements with startup_limit get startup_count - Batched variables with element dimension - Batched constraints: - active_hours tracking - inactive complementary (status + inactive == 1) - State transitions (startup/shutdown) - Startup count limits - Uptime/downtime tracking (consecutive duration) - Cluster cyclic constraints - Effect shares for effects_per_active_hour and effects_per_startup Updated Files: 1. elements.py: - Added _statuses_model = None to FlowsModel - Added create_status_model() method to FlowsModel - Updated FlowModelProxy to use StatusProxy instead of per-element StatusModel 2. structure.py: - Added call to self._flows_model.create_status_model() in type-level modeling The architecture now has one StatusesModel handling ALL flows with status, instead of creating individual StatusModel instances per element.

StatusesModel Implementation Created a batched StatusesModel class in features.py that handles ALL elements with status in a single instance: New Classes: - StatusProxy - Per-element access to batched StatusesModel variables (active_hours, startup, shutdown, inactive, startup_count) - StatusesModel - Type-level model with: - Categorization by feature flags (startup tracking, downtime tracking, uptime tracking, startup_limit) - Batched variables with element dimension - Batched constraints (active_hours tracking, state transitions, consecutive duration, etc.) - Batched effect shares Updates: - FlowsModel - Added _statuses_model attribute and create_status_model() method - FlowModelProxy - Updated status property to return StatusProxy - structure.py - Added call to create_status_model() in type-level modeling path Bug Fixes 1. _ensure_coords - Fixed to handle None values (bounds not specified) 2. FlowSystemModel.add_variables - Fixed to properly handle binary variables (cannot have bounds in linopy) 3. Removed unused stacked_status variable in StatusesModel Test Results - All 114 tests pass (88 flow tests + 26 functional tests) - Type-level modeling path working correctly

- features.py: Replaced concat_with_coords with stack_along_dim(values, dim, coords) — handles mixed scalar/DataArray inputs. Removed InvestmentBuilder.stack_bounds (now redundant). - structure.py: Removed TypeModel._stack_bounds (was only referenced in docstring). - elements.py: TransmissionsModel._stack_data now delegates to stack_along_dim. - components.py: StoragesModel._stack_parameter now delegates to stack_along_dim. Removed 5 dead InvestmentBuilder imports. All InvestmentBuilder.stack_bounds and concat_with_coords calls replaced. - batched.py: EffectsData._stack_bounds now uses stack_along_dim internally. All InvestmentBuilder.stack_bounds and concat_with_coords calls replaced. Removed unused InvestmentBuilder import.

…and build_effects_array Merge stack_and_broadcast into stack_along_dim (new target_coords param), rewrite build_effects_array to fill numpy arrays directly instead of nested xr.concat calls.

…, dt / eta) — saves 2-4 linopy operations per storage

Take main's version for statistics_accessor.py, transform_accessor.py, and comparison.py. These need adaptation to the batched model's variable naming in follow-up commits. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Removed CONSUME = 'consume' from ExpansionMode enum - Removed InterclusterStorageVarName.SOC_BOUNDARY: ExpansionMode.CONSUME from NAME_TO_EXPANSION flixopt/transform_accessor.py: - Added import functools - Added 4 cached properties (_original_period_indices, _positions_in_period, _original_period_da, _cluster_indices_per_timestep) to deduplicate period-to-cluster mapping computed in 3 methods - Added _get_mode() static method for suffix-based NAME_TO_EXPANSION lookup - Replaced __init__'s pre-built variable sets (_state_vars, _first_timestep_vars, _segment_total_vars + mode_to_set loop) with direct _consume_vars construction from InterclusterStorageVarName.SOC_BOUNDARY - Removed _is_state_variable() and _is_first_timestep_variable() methods - Rewrote expand_dataarray() using match/case dispatch on ExpansionMode - Replaced duplicated index computation in _interpolate_charge_state_segmented and _expand_first_timestep_only with cached property references

…agesModel.soc_boundary: extract_capacity_bounds was receiving boundary_dims that already included the storage dimension, then stack_along_dim added it again → ('intercluster_storage', 'cluster_boundary', 'intercluster_storage'). Fixed by passing the original dims (without storage dim) to extract_capacity_bounds. 2. tests/test_cluster_reduce_expand.py — Updated stale variable name references: 'storage|SOC_boundary' → 'intercluster_storage|SOC_boundary', 'storage|charge' → 'intercluster_storage|charge_state', and .sel(storage=...) → .sel(intercluster_storage=...) throughout the intercluster test classes.

…n that crashed when minimum_or_fixed_size/maximum_or_fixed_size returned multi-dimensional DataArrays (e.g., with period dimension). No other similar bugs found.

…ffect share constraints) (#595) * fix: memory issues due to dense large coeficients 1. flixopt/features.py — Added sparse_multiply_sum() function that takes a sparse dict of (group_id, sum_id) -> coefficient instead of a dense DataArray. This avoids ever allocating the massive dense array. 2. flixopt/elements.py — Replaced _coefficients (dense DataArray) and _flow_sign (dense DataArray) with a single _signed_coefficients cached property that returns dict[tuple[str, str], float | xr.DataArray] containing only non-zero signed coefficients. Updated create_linear_constraints to use sparse_multiply_sum instead of sparse_weighted_sum. The dense allocation at line 2385 (np.zeros(n_conv, max_eq, n_flows, *time) ~14.5 GB) is completely eliminated. Memory usage is now proportional to the number of non-zero entries (typically 2-3 flows per converter) rather than the full cartesian product. * fix(effects): avoid massive memory allocation in share variable creation Replace linopy.align(join='outer') with per-contributor accumulation and linopy.merge(dim='contributor'). The old approach reindexed ALL dimensions via xr.where(), allocating ~12.7 GB of dense arrays. Now contributions are split by contributor at registration time and accumulated via linopy addition (cheap for same-shape expressions), then merged along the disjoint contributor dimension. * Switch to per contributor constraints to solve memmory issues * fix(effects): avoid massive memory allocation in share variable creation Replace linopy.align(join='outer') with per-contributor accumulation and individual constraints. The old approach reindexed ALL dimensions via xr.where(), allocating ~12.7 GB of dense arrays. Now contributions are split by contributor at registration time and accumulated via linopy addition (cheap for same-shape expressions). Each contributor gets its own constraint, avoiding any cross-contributor alignment. Reduces effects expression memory from 1.2 GB to 5 MB. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Switch to per contributor constraints to solve memmory issues * perf: improve bus balance to be more memmory efficient * Switch to per effect shares * Firs succesfull drop to 10 GB * Make more readable * Go back to one variable for all shares * ⏺ Instead of adding zero-constraints for uncovered combos, we should just set lower=0, upper=0 on those entries (fix the bounds), or better yet — use a mask on the per-effect constraints and set the variable bounds to 0 for uncovered combos. The simplest fix: create the variable with lower=0, upper=0 by default, then only the covered entries need constraints. * Only create variables needed * _create_share_var went from 1,674ms → 116ms — a 14x speedup! The reindex + + approach is much faster than per-contributor sel + merge * Revert * Revert * 1. effects.py: add_temporal_contribution and add_periodic_contribution now raise ValueError if a DataArray has no effect dimension and no effect= argument is provided. 2. statistics_accessor.py: Early return with empty xr.Dataset() when no contributors are detected, preventing xr.concat from failing on an empty list. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

This reverts commit 9e3c164.

FBumann and others added 30 commits January 16, 2026 13:33

Merge branch 'dev' into feature/tsam-v3+rework

a5f0147

Added @functools.cached_property to timestep_mapping in clustering/b…

ebf2aab

…ase.py: - Before: 852 calls × 1.2ms = 1.01s - After: 1 call × 1.2ms = 0.001s (cached)

Ruff checks

1c03e02

Improve bounds stacking

b69fa5b

Merge branch 'feature/tsam-v3+rework' into feature/vectorized-model-c…

298895e

…reation

Add infrastructure to use the old OR the new modeling mode

8b21a02

Finish BusModel implementation

630ce1e

FBumann and others added 23 commits January 31, 2026 14:03

perf: use sparse groupby in conversion

f38f828

perf: use sparse groupby in piecewise_conversion

2a94130

perf: replace xr.concat with numpy pre-allocation in stack_along_dim …

805bcc5

…and build_effects_array Merge stack_and_broadcast into stack_along_dim (new target_coords param), rewrite build_effects_array to fill numpy arrays directly instead of nested xr.concat calls.

fix: improve method signature of build_effects_array

82e6998

Add sparse_weighted_sum method for faster constraint building

9c2d3d3

Add sparse_weighted_sum method for faster constraint building

8277d5d

Add benchmark results

c67a6a7

Improve piecewise

52a581f

Pre-combined xarray coefficients in storage energy balance (eta * dt…

8c8eb5c

…, dt / eta) — saves 2-4 linopy operations per storage

Fix test

3561a91

Update benchmark_results.md

502fbb7

Fix cross effect shares

b05f63e

perf: use sparse_weighted_sum in bus balance

0d47027

Ensure all share defs have canonical effect order before alignment

03eb4ad

Update tests

d2579b4

Merge main into feature/element-data-classes

3a16bc2

Take main's version for statistics_accessor.py, transform_accessor.py, and comparison.py. These need adaptation to the batched model's variable naming in follow-up commits. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Finalize merge

a69c129

Finalize merge

7c0e12e

Use ModelCoordinates._update_*

17095ab

Fixed: batched.py:656 and batched.py:669 — removed float() conversio…

701f7af

…n that crashed when minimum_or_fixed_size/maximum_or_fixed_size returned multi-dimensional DataArrays (e.g., with period dimension). No other similar bugs found.

FBumann mentioned this pull request Feb 1, 2026

perf: reduce memory usage in build_model (sparse coefficients + per-effect share constraints) #595

Merged

3 tasks

FBumann and others added 6 commits February 1, 2026 16:05

use linopy.merge()

9e3c164

Revert "use linopy.merge()"

5d231e6

This reverts commit 9e3c164.

fix statistics_accessor.py

fc42b97

fix: FlowsContainer

a425ec2

fix: FlowsContainer

1223a30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/element data classes #588

Feature/element data classes #588

FBumann commented Jan 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Feature/element data classes #588

Are you sure you want to change the base?

Feature/element data classes #588

Conversation

FBumann commented Jan 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Key Changes

Performance Results

XL System (2000h, 300 converters, 50 storages)

Complex System (72h, piecewise)

Key Takeaways

Model Size Reduction

How to Run Benchmarks

Type of Change

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

FBumann commented Jan 23, 2026 •

edited

Loading