Skip to content

Conversation

@PlethoraChutney
Copy link

Hello! Thank you for MDAnalysis! This PR changes how MRC files are checked before being loaded as a 3D grid. Rather than use the is_volume() method from mrcfile we instead use a simpler shape check.

The reason for this is that mrcfile returns False if the ISPG header field of a 3D MRC file is zero. While this is correct according to the spec, most cryoEM software packages save a 0 in this field for a variety of reasons. Because of this unwritten deviation from spec, most packages also read in files with a 0 space group as 3D Volumes anyway, but GridDataFormats does not.

Despite the fact that this is formally correct, I'm proposing that MDAnalysis should follow the behavior that users likely expect from their experience with other packages.

This PR fails three tests, but on my machine at least the current master branch also fails these same tests.

  • gridData/tests/test_grid.py::TestGrid::test_load_wrong_fileformat[ccp4]
  • gridData/tests/test_grid.py::TestGrid::test_load_wrong_fileformat[plt]
  • gridData/tests/test_grid.py::TestGrid::test_load_wrong_fileformat[dx]

@codecov
Copy link

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.76%. Comparing base (abd0feb) to head (a405729).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #149      +/-   ##
==========================================
+ Coverage   87.70%   87.76%   +0.06%     
==========================================
  Files           5        5              
  Lines         748      752       +4     
  Branches       97       98       +1     
==========================================
+ Hits          656      660       +4     
  Misses         56       56              
  Partials       36       36              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@orbeckst
Copy link
Member

@PlethoraChutney thank you for your contribution and the insights. Could you please raise an issue that describes the issue — basically your justification for the fix? Separating solution from issues is good practice when we're discussing changes that may go against specs. Issues are our permanent record for decisions.

We can then link the PR to the issue.

Once you have the issue up, we can look closer at your proposed solution. Thank you!

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocked until we have discussed the approach.

@BradyAJohnston
Copy link
Member

It's related to this issue (BradyAJohnston/MolecularNodes#993) raised in MN. Downloading directly from the EMDB it would load fine, but if we round-trip through ChimeraX then it would fail to load with Grid() because of the points specified above.

Worth some proper discussion though!

@orbeckst
Copy link
Member

Thanks for the context — ideally we collect all of this in an issue and then we have one place that documents the decision and the reasoning, as opposed to potentially mixing it with implementation details etc.

@PlethoraChutney
Copy link
Author

Apologies! I'm relatively new to actually contributing to others' repos. Created #150 for discussion

@orbeckst
Copy link
Member

Many thanks @PlethoraChutney , #150 is perfect.

Different projects have different "cultures" on how they do development and when coming in you just don't know.

@PlethoraChutney
Copy link
Author

Tried to capture the discussion in #150

  • Grid() now takes a **kwargs parameter which is passed to all load functions. They all ignore it except MRC
  • MRC() takes load_mrc: None | bool. By default (None), mrcfile.is_volume() is used. If a bool is provided, it forces the file to be a volume (True) or a 2D image stack (False)
  • Added tests to cover this behavior

def __init__(self, grid=None, edges=None, origin=None, delta=None,
metadata=None, interpolation_spline_order=3,
file_format=None):
file_format=None, **kwargs):
Copy link
Member

@BradyAJohnston BradyAJohnston Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer an actual argument rather than **kwargs as the latter can easily (and silently) absorb spelling mistakes fiel_format='MRC would be silently no longer passed on instead of raising an error otherwise. I defer to @orbeckst on this but I think just mentioning in the docs that it is only used for mrc files (and maybe a warning when used for non-compatible format?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that **kwargs can silently swallow typos. On the other hand, tracking individual modifier keywords for different parsers can get annoying, at least in principle. In MDAnalysis we've typically been using the approach of named keywords for arguments that universally apply and kwargs for modifiers.

GDF really does not see a flurry of development activity so the maintainability argument that I am raising is not particularly strong and I do see the value of being explicit and more robust for users. So then let's follow your idea and make is_volume a top-level kwarg of Grid, document it there and in MRC (with a proper versionadded), and propagate it explicitly.

Copy link
Member

@BradyAJohnston BradyAJohnston Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel strongly about it mostly because I am endlessly spending hours bug fixing when it was just me with simple spelling mistakes.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the changes. Please see comments.

Please also

  • add an entry to CHANGELOG under Enhancements for 1.1.0
  • add yourself to AUTHORS
  • check that the docs render correctly (follow the link under the docs deployment check)

def __init__(self, grid=None, edges=None, origin=None, delta=None,
metadata=None, interpolation_spline_order=3,
file_format=None):
file_format=None, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per conversation: explicit is_volume=None kwarg.

Document it in the Grid docs with something like

is_volume : bool
    Indicate that the data to be loaded from a file is a density,
    even if the file may indicate differently. Currently only 
    applies for MRC files (see :class:`gridData.MRC.MRC`)

    The default ``None`` indicates that the data is handled
    according to the file format specifications.

    .. versionadded:: 1.1.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use the docs that you already wrote :-) (+ the versionadded)


if filename is not None:
self.load(filename, file_format=file_format)
self.load(filename, file_format=file_format, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per conversation: explicit is_volume=is_volume kwarg.

def _load_mrc(self, filename, **kwargs):
"""Initializes Grid from a MRC/CCP4 file."""
mrcfile = mrc.MRC(filename)
mrcfile = mrc.MRC(filename, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly pull out kwargs

Suggested change
mrcfile = mrc.MRC(filename, **kwargs)
mrcfile = mrc.MRC(filename, is_volume=kwargs.get("is_volume", None))

use the MRC file header. If True, force the file to be read as a
volume. If False, the file is intepreted as a 2D image stack and
raise a ValueError.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. versionadded:: 1.1.0

Check the rendered docs!

self.filename = filename
if filename is not None:
self.read(filename)
self.read(filename, is_volume)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use kwargs explicitly, not implicitly.

Suggested change
self.read(filename, is_volume)
self.read(filename, is_volume=is_volume)

Grid(datafiles.ISPG_0)

def test_mrcfile_volume_force():
Grid(datafiles.ISPG_0, is_volume=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert something about the data

elif is_volume:
# non 3D volumes should always fail, regardless of is_volume value
is_volume = mrc.data is not None and len(mrc.data.shape) == 3
if not is_volume: #pragma: no cover
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question I had, never having used coverage.py before. Do we still need this pragma: no cover?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because you added an explicit test (which is great!).

Remove it. Good catch!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GridDataFormats refuses to load 3D MRC files with space group 0

3 participants