gordonwoodhull commented Dec 30, 2025

Apologies for filing a PR over the holidays. We can discuss next week when everyone is back, but I wanted to put in a draft.

Marimo's preferred syntax for executable cells is

```{python.marimo}
```

This already works fine as long as the engine is specified directly in the metadata:

engine: marimo

But the engine won't claim the language without this, because jupyter claims python.

I think it would be nice for us to support claiming by both language and first class, because it allows engine discovery to resolve in favor of an engine extension when the extension supports the same language as a built-in engine.

It can be done in a backward-compatible way:

  claimsLanguage: (language: string, firstClass?: string) => boolean | number;

Old engines still return true or false, which are treated as 1 or 0, and new engines can return 2 to outrank them.
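
A minimal sketch of why the widened signature stays backward compatible at the type level. The `Engine` interface and both engine objects are illustrative assumptions, not Quarto's actual types; only the `claimsLanguage` signature is taken from the proposal above. An old implementation that takes one argument and returns a boolean is still assignable, while a new engine can inspect the first class and return a number:

```typescript
// Hypothetical, simplified engine shape; only claimsLanguage mirrors the PR.
interface Engine {
  name: string;
  claimsLanguage: (language: string, firstClass?: string) => boolean | number;
}

// Old-style engine: ignores the class and returns a plain boolean.
const legacyPython: Engine = {
  name: "jupyter-like",
  claimsLanguage: (language: string) => language === "python",
};

// New-style engine: claims python only when the first class is "marimo",
// returning 2 so that it outranks a plain `true` (treated as 1).
const classAware: Engine = {
  name: "marimo-like",
  claimsLanguage: (language, firstClass) =>
    language === "python" && firstClass === "marimo" ? 2 : false,
};
```

Because TypeScript lets a function with fewer parameters stand in for one with more, existing engines would not need source changes to satisfy the new signature.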

But this is a fundamental change to engine discovery, so I want to discuss as a group before merging.

@dmadisetti

posit-snyk-bot commented Dec 30, 2025

Snyk checks have passed. No issues have been found so far.

| Scanner | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|
| Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| Licenses | 0 | 0 | 0 | 0 | 0 issues |

@gordonwoodhull gordonwoodhull changed the title from "feature: engine claim class" to "feature: engine claim by language and class" Dec 30, 2025
@gordonwoodhull gordonwoodhull force-pushed the feature/engine-claim-class branch from f6b0d92 to 022f9cd December 30, 2025 18:24

gordonwoodhull commented Jan 15, 2026

The syntax needs a space between the language and the class to be compatible with the way we are going to handle this going forward:

```{python .marimo}
```

The situation in Quarto 1.9 is that we split executable cell blocks with regexes, and everything after the language in an executable cell is handled as raw text by the execution engine.

In the near future, we will have a proper parser for Quarto markdown, so these will need to follow Pandoc's conventions, allowing #tag and .class and attr=val explicitly but requiring spaces between them and between the execution language and the extra parameters.

I also need to change false to mean "do not consider" rather than 0 score, because the current logic would cause every language to be claimed even if no engine claimed it. 😛

Hope to get these fixes in and update my fork of quarto-marimo in the next week.

gordonwoodhull and others added 5 commits January 30, 2026 16:54
Add support for extracting the first class from code block attributes
(e.g., {python .marimo}) and passing it to engine's claimsLanguage method.

Changes:
- Add languagesWithClasses() function returning Map<language, class|undefined>
- Update languagesInMarkdown() to use languagesWithClasses internally
- Update claimsLanguage signature to accept optional firstClass parameter
  and return boolean | number for priority-based selection
- Update markdownExecutionEngine to use scoring: highest score wins,
  first engine wins ties (backwards compatible)
- Expose getLanguagesWithClasses in the markdownRegex API namespace

Backwards compatibility:
- Old engines returning true -> score 1
- Old engines returning false -> score 0
- New engines can return numbers > 1 to override

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
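
To illustrate the language-to-class mapping this commit describes, here is a rough sketch. It assumes a single regex over opening fences rather than Quarto's actual cell splitter, and the name `languagesWithClassesSketch` is hypothetical. The policy for a language that appears with several different classes (keep the first class seen) is also an assumption:

```typescript
// Illustrative only: a regex stand-in for the real cell splitter.
// Matches an opening fence such as ```{python .marimo} and captures the
// language (dots allowed) plus any space-separated attributes after it.
const FENCE = /^\s*```+\s*\{([A-Za-z][\w.-]*)(?:\s+([^}]*))?\}\s*$/;

function languagesWithClassesSketch(
  markdown: string,
): Map<string, string | undefined> {
  const result = new Map<string, string | undefined>();
  for (const line of markdown.split(/\r?\n/)) {
    const match = FENCE.exec(line);
    if (!match) continue;
    const language = match[1];
    // Attributes follow Pandoc-style spacing: #id .class key=val
    const firstClass = (match[2] ?? "")
      .split(/\s+/)
      .find((token) => token.startsWith("."))
      ?.slice(1);
    // Assumption: keep the first class seen for a language and never
    // overwrite a known class with undefined.
    if (!result.has(language) || result.get(language) === undefined) {
      result.set(language, firstClass);
    }
  }
  return result;
}
```

Under this sketch, `{python .marimo}` yields language `python` with class `marimo`, `{python.marimo}` yields the language `python.marimo` with no class, and a legacy `python {.marimo}` fence is not matched at all.
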
Update the `quarto create extension engine` template to reflect the new
claimsLanguage signature that accepts an optional firstClass parameter
and returns boolean | number for priority-based selection.

No behavioral change - just updated signature and documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Use breakQuartoMd API for cell iteration instead of raw regex, making
the code more idiomatic and fixing Windows CI failures caused by line
ending differences. Also fix test syntax to use {python .foo}
(space-separated) which is recognized by breakQuartoMd.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Update regexes in break-quarto-md.ts and pandoc-partition.ts to allow
dots in language names. This is compatible with pampa executable cell
syntax and also with the preferred class syntax {python .marimo}.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- false means "don't claim" (skip engine entirely)
- true means claim with priority 1
- any number means claim with that priority (higher wins)

Changed bestScore from 0 to -Infinity so any number can win.
Added class-override-twice test verifying priority 3 beats priority 2.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
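
The selection rule in this last commit can be sketched as follows. This is a simplified illustration under assumptions: the `Engine` shape and `selectEngineSketch` are hypothetical names, and the real `markdownExecutionEngine` handles more than a single language lookup:

```typescript
type Claim = boolean | number;

// Hypothetical, simplified engine shape for illustration.
interface Engine {
  name: string;
  claimsLanguage: (language: string, firstClass?: string) => Claim;
}

// false is skipped, true counts as 1, a number is used as-is;
// the highest score wins and the earliest engine wins ties.
function selectEngineSketch(
  engines: Engine[],
  language: string,
  firstClass?: string,
): Engine | undefined {
  let best: Engine | undefined;
  let bestScore = -Infinity; // any numeric claim can beat the initial value
  for (const engine of engines) {
    const claim = engine.claimsLanguage(language, firstClass);
    if (claim === false) continue; // "don't claim": never a candidate
    const score = claim === true ? 1 : claim;
    if (score > bestScore) {
      // strict > keeps the first engine on ties
      best = engine;
      bestScore = score;
    }
  }
  return best;
}
```

Starting from `-Infinity` and skipping `false` outright is what separates "no engine claimed this language" (the function returns `undefined`) from "an engine claimed it with a low score".
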
@gordonwoodhull gordonwoodhull force-pushed the feature/engine-claim-class branch from 022f9cd to a71b54e January 30, 2026 21:54
@gordonwoodhull gordonwoodhull marked this pull request as ready for review January 30, 2026 23:59
@gordonwoodhull

Okay, it turns out that my comment above was wrong; our future parser, pampa, accepts

```{python.marimo}
```

It just doesn't parse .marimo as a class; it interprets the whole string as a language. So this will work along with the preferred syntax

```{python .marimo}
```

and the legacy syntax

```python {.marimo}
```

The difference is that the legacy syntax can't claim the language, so it won't work without

engine: marimo

in the metadata. The other two syntaxes do work without an explicit engine, because quarto-marimo claims both language python with class marimo and language python.marimo.
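
As a rough illustration of that dual claim, a claim function in the spirit of quarto-marimo might look like the sketch below. The function name and the specific return values are assumptions; the real extension's code and scores may differ:

```typescript
// Hypothetical claim function; not the actual quarto-marimo source.
function marimoClaimSketch(
  language: string,
  firstClass?: string,
): boolean | number {
  // {python.marimo}: the whole string is parsed as the language.
  if (language === "python.marimo") return true;
  // {python .marimo}: language python with first class marimo;
  // return 2 so the claim outranks a built-in engine returning true.
  if (language === "python" && firstClass === "marimo") return 2;
  // Plain python stays with whichever engine already claims it.
  return false;
}
```

The legacy `python {.marimo}` form never reaches a function like this through engine discovery, which is why it still needs `engine: marimo` in the metadata.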

I have the PR ready for quarto-marimo and it will work when this is released in Quarto 1.9.19.

3 participants