
Conversation

@FBumann (Member) commented Feb 1, 2026

Summary

Fixes out-of-memory issues on large real-world models during build_model(). Peak RSS dropped from ~27.5 GB to ~10.6 GB on a model with 125 components, 352 flows, 22 effects, and 2190 timesteps.

Three main changes:

1. Sparse converter coefficients (elements.py, features.py)

Replaced the dense np.zeros((n_converters, max_equations, n_flows, *time)) coefficient array (~14.5 GB, 97-99% zeros) with a sparse dict[tuple[str, str], Numeric] mapping and a new sparse_multiply_sum() helper that builds the linear expression directly from the non-zero entries only.
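The real helper operates on linopy expressions; as a library-agnostic sketch of the idea (function signature, names, and shapes here are hypothetical, not the actual flixopt API), building the weighted sum from the non-zero dict entries alone looks roughly like this:

```python
import numpy as np

def sparse_multiply_sum(values, coefficients):
    """Build one weighted sum per group from sparse (group, item) -> coeff entries.

    values:       dict mapping item id -> time series (np.ndarray)
    coefficients: dict mapping (group_id, item_id) -> scalar coefficient
    Only the entries present in `coefficients` are ever touched, so memory
    scales with the number of non-zero entries, not the dense product.
    """
    result = {}
    for (group, item), coeff in coefficients.items():
        term = coeff * values[item]
        result[group] = result.get(group, 0.0) + term
    return result

# Hypothetical converter balance: 1.0 * flow_a - 0.5 * flow_b
values = {'flow_a': np.ones(4), 'flow_b': 2.0 * np.ones(4)}
coeffs = {('conv1', 'flow_a'): 1.0, ('conv1', 'flow_b'): -0.5}
out = sparse_multiply_sum(values, coeffs)  # {'conv1': array of zeros}
```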

2. Per-effect share constraints (effects.py, components.py)

The share variable (contributor, effect, time) uses linopy's mask parameter to skip invalid (contributor, effect) combos (~60% of entries). However, Variable.sum('contributor') doesn't respect the mask: the summed expression carries _term = n_contributors at every position, so ~60% of the terms are dead (vars=-1). On the full variable this caused OOM at 44+ GB.

Workaround: create one constraint per effect, summing only active contributors per effect. Each per-effect .sum('contributor') has no dead terms. See docs/linopy_memory_issues.md for a detailed write-up and reproducible example.
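Conceptually, the workaround groups contributions by effect before summing, so each per-effect sum only ever touches active contributors. A toy sketch with made-up contributor/effect names (plain dicts standing in for linopy expressions):

```python
from collections import defaultdict

# Made-up (contributor, effect) contributions; in the real model ~60% of all
# (contributor, effect) combinations simply don't exist.
contributions = {
    ('boiler', 'costs'): 10.0,
    ('boiler', 'CO2'): 2.0,
    ('chp', 'costs'): 5.0,
}

# Group by effect first, then sum only that effect's active contributors --
# each per-effect sum contains no dead (masked) terms.
per_effect = defaultdict(list)
for (contributor, effect), value in contributions.items():
    per_effect[effect].append(value)

totals = {effect: sum(values) for effect, values in per_effect.items()}
print(totals)  # {'costs': 15.0, 'CO2': 2.0}
```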

3. Benchmark memory profiling (benchmarks/benchmark_model_build.py)

Added --memory / -m flag that uses tracemalloc to measure peak Python heap allocation during build_model().
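Note that tracemalloc tracks Python-heap allocations only (native/solver memory is not counted). In outline, the flag wraps the build call roughly like this sketch (measure_peak is an illustrative helper, not the actual benchmark code):

```python
import tracemalloc

def measure_peak(func):
    """Run func() and return (result, peak Python-heap bytes during the call)."""
    tracemalloc.start()
    try:
        result = func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

# In the benchmark this would be e.g. measure_peak(lambda: build_model(...))
result, peak = measure_peak(lambda: bytearray(10_000_000))
print(f'peak heap: {peak / 1e6:.1f} MB')
```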

Benchmark (XL synthetic: 2000h, 300 converters, 50 storages)

Back-to-back A/B comparison on the same machine:

| Metric | PR #588 (base) | This PR | Change |
| --- | --- | --- | --- |
| Build (ms) | 1,648 | 1,691 | ~same |
| Write LP (ms) | 10,003 | 7,527 | -25% |
| LP Size (MB) | 469 | 384 | -18% |
| Build Peak Mem (MB) | 968 | 724 | -25% |

Build time is essentially unchanged. The wins are in LP writing speed (-25%), LP file size (-18%), and build memory (-25%) — all from eliminating dead/masked variable entries.

On the real-world model (277 contributors, 22 effects, 2190 timesteps), the base OOM'd at 44+ GB while this PR completes at ~10.6 GB peak RSS.

Test plan

  • pytest tests/ — all pre-existing tests pass (1024 passed, 9 pre-existing failures)
  • Real-world model builds successfully without OOM
  • Benchmark comparison against base branch

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor
    • Per-effect tracking overhauled so contributions and coefficients are handled per effect, improving accuracy and clarity of effect-level results.
  • New Features
    • Added a sparse-weighted-sum utility to efficiently compute grouped weighted sums.
  • Bug Fixes / Robustness
    • Improved solution reporting and data extraction to handle partial or single-mode results without errors.
  • Tests
    • Updated tests to accept multiple per-effect share constraints and reordered selectors for stability.


FBumann and others added 13 commits January 31, 2026 23:26
1. flixopt/features.py — Added sparse_multiply_sum() function that takes a sparse dict of (group_id, sum_id) -> coefficient instead of a dense DataArray. This avoids ever allocating the massive dense array.
2. flixopt/elements.py — Replaced _coefficients (dense DataArray) and _flow_sign (dense DataArray) with a single _signed_coefficients cached property that returns dict[tuple[str, str], float | xr.DataArray] containing only non-zero signed coefficients. Updated create_linear_constraints to use sparse_multiply_sum instead of sparse_weighted_sum.

The dense allocation at line 2385 (np.zeros((n_conv, max_eq, n_flows, *time)), ~14.5 GB) is completely eliminated. Memory usage is now proportional to the number of non-zero entries (typically 2-3 flows per converter) rather than the full cartesian product.
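To illustrate the scaling difference with small, made-up dimensions: the dense array stores a slot for every (converter, equation, flow, time) combination, while the sparse dict stores only the handful of non-zero signed coefficients per converter:

```python
import numpy as np

# Dense: one slot for every (converter, equation, flow, time) combination,
# although a typical converter equation references only 2-3 flows.
n_conv, max_eq, n_flows, n_time = 10, 3, 40, 100
dense = np.zeros((n_conv, max_eq, n_flows, n_time))

# Sparse: only the non-zero signed coefficients survive (here: 3 flows per
# converter; scalar coefficients broadcast over time, so no time axis at all).
sparse = {(f'conv{c}', f'flow{f}'): 1.0 for c in range(n_conv) for f in range(3)}

print(dense.size, len(sparse))  # 120000 30
```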
Replace linopy.align(join='outer') with per-contributor accumulation
and linopy.merge(dim='contributor'). The old approach reindexed ALL
dimensions via xr.where(), allocating ~12.7 GB of dense arrays.
Now contributions are split by contributor at registration time and
accumulated via linopy addition (cheap for same-shape expressions),
then merged along the disjoint contributor dimension.
Replace linopy.align(join='outer') with per-contributor accumulation
and individual constraints. The old approach reindexed ALL dimensions
via xr.where(), allocating ~12.7 GB of dense arrays.

Now contributions are split by contributor at registration time and
accumulated via linopy addition (cheap for same-shape expressions).
Each contributor gets its own constraint, avoiding any cross-contributor
alignment. Reduces effects expression memory from 1.2 GB to 5 MB.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…just set lower=0, upper=0 on those entries (fix the bounds), or better yet — use a mask on the per-effect

  constraints and set the variable bounds to 0 for uncovered combos. The simplest fix: create the variable with lower=0, upper=0 by default, then only the covered entries need
  constraints.
…dex + + approach is much faster than per-contributor sel + merge
@coderabbitai (Contributor) commented Feb 1, 2026
📝 Walkthrough

Refactors per-effect contribution and share handling across models: introduces per-effect accumulation structures, sparse coefficient mappings, and emits explicit per-effect contributions; updates downstream variable creation, statistics extraction, and tests to align with per-effect semantics.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Per-effect contributions — flixopt/components.py, flixopt/effects.py | Replace flat share lists with per-effect accumulators (effect_id -> contributions). add_temporal_contribution / add_periodic_contribution gain an optional effect param; accumulation, masking, and per-effect emission logic added. Components now push separate periodic/investment/retirement/mandatory contributions per effect when applicable. |
| Sparse coefficient model — flixopt/elements.py, flixopt/features.py | ConvertersModel switches from dense _coefficients to a sparse _signed_coefficients mapping (converter, flow) -> coeff. Added sparse_multiply_sum(var, coefficients, sum_dim, group_dim) and updated constraint construction to use sparse, per-(flow, effect) contributions. |
| Statistics & results handling — flixopt/statistics_accessor.py, flixopt/results.py | Dynamic per-mode lookup for `share |
| Transform accessor & tests — flixopt/transform_accessor.py, tests/test_flow.py, tests/test_functional.py | Changed expansion-mode lookup to exact-key behavior in _Expander._get_mode. Tests adjusted: temporal share constraint name check relaxed to prefix match; .sel(...) argument order updated for effect/contributor selection. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Converters as ConvertersModel
    participant Effects as EffectsModel
    participant Shares as ShareVariables
    participant Stats as StatisticsAccessor
    Converters->>Effects: emit per-(flow, effect) signed coefficients & contributions
    Effects->>Effects: accumulate into per-effect dicts (effect_id -> contributions)
    Effects->>Shares: create masked share variables per (contributor, effect, time/period)
    Shares->>Stats: solution contains share|<mode> per-effect variables
    Stats->>Stats: select per-effect coordinates -> build effects dataset
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

v3.0.0


🚥 Pre-merge checks: ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main changes: performance improvement through sparse coefficients and per-effect share constraints, directly matching the PR's core objectives. |
| Description check | ✅ Passed | The description covers all template sections with substantive content: clear summary of changes, type of change (refactoring), related improvements, testing results, and checklist completion. |
| Docstring coverage | ✅ Passed | Docstring coverage is 87.50%, above the required threshold of 80.00%. |


@FBumann changed the title from "Feature/element data classes+memmory issues" to "perf: reduce memory usage in build_model (sparse coefficients + per-effect share constraints)" Feb 1, 2026
@FBumann (Member, Author) commented Feb 1, 2026

@coderabbitai review

@coderabbitai (Contributor) commented Feb 1, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
flixopt/statistics_accessor.py (1)

1533-1534: ⚠️ Potential issue | 🟠 Major

Fix balance() for list‑based inputs/outputs (pipeline failure).
The failing test indicates element.inputs can be a list; .values() then raises. Normalize to handle both mappings and lists.

🐛 Suggested fix

```diff
-        input_labels = [f.label_full for f in element.inputs.values()]
-        output_labels = [f.label_full for f in element.outputs.values()]
+        from collections.abc import Mapping
+        inputs = element.inputs.values() if isinstance(element.inputs, Mapping) else element.inputs
+        outputs = element.outputs.values() if isinstance(element.outputs, Mapping) else element.outputs
+        input_labels = [f.label_full for f in inputs]
+        output_labels = [f.label_full for f in outputs]
```
🤖 Fix all issues with AI agents
In `@flixopt/effects.py`:
- Around line 364-406: The DataArray branches in add_temporal_contribution and
add_periodic_contribution must ensure an 'effect' dimension is present; if a
DataArray lacks the 'effect' dim and no effect argument is provided, raise a
clear ValueError rather than appending a dimensionless array (which breaks
finalize_shares' reindexing). Update add_temporal_contribution and
add_periodic_contribution to check for ('effect' in defining_expr.dims) when
isinstance(defining_expr, xr.DataArray) and if missing and effect is None raise
an error instructing the caller to either pass effect="..." or supply an
'effect' dimension; if effect is provided, continue to
expand_dims(effect=[effect]) and append to
_temporal_constant_defs/_periodic_constant_defs as before.

In `@flixopt/statistics_accessor.py`:
- Around line 808-820: The loop builds detected_contributors and then uses
contributors in downstream xr.concat which will fail when empty; guard by
checking after building detected_contributors (symbols: detected_contributors,
contributors, modes_to_process, share_name, solution) and if it's empty either
return/produce an empty xarray Dataset/DataArray with a zero-length
'contributor' dimension or set up a zero-length contributors list and a matching
empty structure before calling xr.concat so downstream code handles the empty
case safely; implement the short‑circuit immediately after contributors =
sorted(detected_contributors).
🧹 Nitpick comments (1)
flixopt/elements.py (1)

2366-2404: Keep the sparse dict truly sparse by skipping zero scalar coefficients.
This avoids storing explicit zeros for the common scalar-coefficient case and better matches the sparse contract.

♻️ Suggested tweak (scalar-only zero skip)

```diff
                 for flow_label, coeff in conv_factors.items():
                     flow_id = flow_map.get(flow_label)
                     sign = flow_signs.get(flow_id, 0.0) if flow_id else 0.0
                     if sign != 0.0:
-                        intermediate[(conv.label, flow_id)].append((eq_idx, coeff * sign))
+                        signed = coeff * sign
+                        if isinstance(signed, (int, float)) and signed == 0:
+                            continue
+                        intermediate[(conv.label, flow_id)].append((eq_idx, signed))
```

Comment on lines 364 to 406 (flixopt/effects.py):

```python
def add_temporal_contribution(
    self,
    defining_expr,
    contributor_dim: str = 'contributor',
    effect: str | None = None,
) -> None:
    """Register contributors for the share|temporal variable.

    Args:
        defining_expr: Expression with a contributor dimension (no effect dim if effect is given).
            Accepts linopy LinearExpression/Variable or plain xr.DataArray (constants).
        contributor_dim: Name of the element dimension to rename to 'contributor'.
        effect: If provided, the expression is for this specific effect (no effect dim needed).
    """
    if contributor_dim != 'contributor':
        defining_expr = defining_expr.rename({contributor_dim: 'contributor'})
    if isinstance(defining_expr, xr.DataArray):
        if effect is not None:
            defining_expr = defining_expr.expand_dims(effect=[effect])
        self._temporal_constant_defs.append(defining_expr)
    else:
        self._temporal_share_defs.append(defining_expr)
    self._accumulate_shares(self._temporal_shares, self._as_expression(defining_expr), effect)

def add_periodic_contribution(
    self,
    defining_expr,
    contributor_dim: str = 'contributor',
    effect: str | None = None,
) -> None:
    """Register contributors for the share|periodic variable.

    Args:
        defining_expr: Expression with a contributor dimension (no effect dim if effect is given).
            Accepts linopy LinearExpression/Variable or plain xr.DataArray (constants).
        contributor_dim: Name of the element dimension to rename to 'contributor'.
        effect: If provided, the expression is for this specific effect (no effect dim needed).
    """
    if contributor_dim != 'contributor':
        defining_expr = defining_expr.rename({contributor_dim: 'contributor'})
    if isinstance(defining_expr, xr.DataArray):
        if effect is not None:
            defining_expr = defining_expr.expand_dims(effect=[effect])
        self._periodic_constant_defs.append(defining_expr)
    else:
        self._periodic_share_defs.append(defining_expr)
    self._accumulate_shares(self._periodic_shares, self._as_expression(defining_expr), effect)
```

⚠️ Potential issue | 🟠 Major

DataArray contributions must carry an effect dimension.
If a DataArray comes in without effect and no effect=... is provided, finalize_shares() later calls .reindex({'effect': ...}), which will error. This can happen for single‑effect constants (e.g., mandatory/retirement contributions). Enforce the requirement or expand with the intended effect to avoid runtime failures.

🛠️ Enforce effect presence

```diff
-        if isinstance(defining_expr, xr.DataArray):
-            if effect is not None:
-                defining_expr = defining_expr.expand_dims(effect=[effect])
-            self._temporal_constant_defs.append(defining_expr)
+        if isinstance(defining_expr, xr.DataArray):
+            if effect is not None:
+                defining_expr = defining_expr.expand_dims(effect=[effect])
+            elif 'effect' not in defining_expr.dims:
+                raise ValueError("DataArray contributions must include an 'effect' dimension or pass effect=")
+            self._temporal_constant_defs.append(defining_expr)
-        if isinstance(defining_expr, xr.DataArray):
-            if effect is not None:
-                defining_expr = defining_expr.expand_dims(effect=[effect])
-            self._periodic_constant_defs.append(defining_expr)
+        if isinstance(defining_expr, xr.DataArray):
+            if effect is not None:
+                defining_expr = defining_expr.expand_dims(effect=[effect])
+            elif 'effect' not in defining_expr.dims:
+                raise ValueError("DataArray contributions must include an 'effect' dimension or pass effect=")
+            self._periodic_constant_defs.append(defining_expr)
```

Comment on lines +808 to 820 (flixopt/statistics_accessor.py):

```python
# Detect contributors from combined share variables (share|temporal, share|periodic)
detected_contributors: set[str] = set()
for current_mode in modes_to_process:
    share_name = f'share|{current_mode}'
    if share_name not in solution:
        continue
    share_da = solution[share_name]
    for c in share_da.coords['contributor'].values:
        # Exclude effect-to-effect shares
        base_name = str(c).split('(')[0] if '(' in str(c) else str(c)
        if base_name not in effect_labels:
            detected_contributors.add(str(c))

contributors = sorted(detected_contributors)
```

⚠️ Potential issue | 🟡 Minor

Guard against empty contributor sets when no share vars exist.
If neither share|temporal nor share|periodic is present, contributors stays empty and xr.concat(...) will raise. Consider short‑circuiting with an empty dataset (or a zero‑length contributor dim) to keep stats robust.

💡 Possible guard

```diff
         contributors = sorted(detected_contributors)
+        if not contributors:
+            logger.warning('No share variables found; returning empty effects dataset.')
+            return xr.Dataset()
```

…on now raise ValueError if a DataArray has no effect dimension and no effect= argument is provided.

  2. statistics_accessor.py: Early return with empty xr.Dataset() when no contributors are detected, preventing xr.concat from failing on an empty list.
@FBumann FBumann marked this pull request as ready for review February 1, 2026 14:28
@FBumann FBumann merged commit a6b7103 into feature/element-data-classes Feb 1, 2026
4 of 8 checks passed