
@Prajwal-banakar
Contributor

@Prajwal-banakar Prajwal-banakar commented Jan 25, 2026

Purpose

Linked issue: close #469

The purpose of this change is to automate the generation of the "Configuration Reference" documentation. Previously, this was a manual process prone to "documentation drift." This new system ensures the website stays perfectly in sync with the ConfigOptions.java source file using a reflection-based generator.

Brief change log

New Module: Introduced fluss-docgen to house the documentation generation logic, keeping the production build lean.

Reflection Scanning: Implemented ConfigOptionsDocGenerator using Java Reflection to scan for ConfigOption fields and their metadata.
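A minimal sketch of the reflection scan described above. The `Option` and `SampleOptions` classes are hypothetical stand-ins for `ConfigOption` and `ConfigOptions`, and the real `ConfigOptionsDocGenerator` may differ in detail:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class ReflectionScanSketch {

    /** Hypothetical stand-in for the real ConfigOption type. */
    static final class Option {
        final String key;
        final Object defaultValue;
        final String description;

        Option(String key, Object defaultValue, String description) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.description = description;
        }
    }

    /** Hypothetical holder class playing the role of ConfigOptions. */
    static final class SampleOptions {
        public static final Option BIND =
                new Option("server.bind", "0.0.0.0", "Address to bind.");
        public static final Option RETRIES =
                new Option("client.retries", 3, "Retry count.");
        public static final String NOT_AN_OPTION = "ignored";
    }

    /** Collects the value of every static field whose type is Option. */
    static List<Option> scan(Class<?> holder) {
        List<Option> found = new ArrayList<>();
        for (Field field : holder.getDeclaredFields()) {
            boolean isStatic = Modifier.isStatic(field.getModifiers());
            if (isStatic && Option.class.isAssignableFrom(field.getType())) {
                try {
                    field.setAccessible(true);
                    found.add((Option) field.get(null)); // null target: static field
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot read " + field.getName(), e);
                }
            }
        }
        return found;
    }
}
```

`getDeclaredFields()` does not guarantee any ordering, so a generator built this way would need to sort the result (for example by key) to produce stable output.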

Custom Annotations: Introduced @ConfigSection and @ConfigOverrideDefault to provide granular control over how options are categorized and displayed.
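As an illustration of how such annotations could be declared and read, here is a sketch. The single `value()` element, the field-level target, and the sample field are assumptions; the actual annotation definitions in this PR may look different:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

final class AnnotationSketch {

    /** Assumed shape: names the doc section an option belongs to. */
    @Retention(RetentionPolicy.RUNTIME) // must be RUNTIME to be visible via reflection
    @Target(ElementType.FIELD)
    @interface ConfigSection {
        String value();
    }

    /** Assumed shape: replaces an environment-specific default in the docs. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ConfigOverrideDefault {
        String value();
    }

    /** Hypothetical option holder used only for this example. */
    static final class SampleOptions {
        @ConfigSection("Client")
        @ConfigOverrideDefault("(system temp directory)")
        static final String TMP_DIR = System.getProperty("java.io.tmpdir");

        static final String PLAIN = "plain";
    }

    /** Looks up a sample field, wrapping the checked exception. */
    static Field field(String name) {
        try {
            return SampleOptions.class.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + name, e);
        }
    }

    /** Returns the annotated section, or the fallback if none is declared. */
    static String sectionOf(Field f, String fallback) {
        ConfigSection section = f.getAnnotation(ConfigSection.class);
        return section != null ? section.value() : fallback;
    }

    /** Returns the override text if present, otherwise the actual default. */
    static String documentedDefault(Field f, String actualDefault) {
        ConfigOverrideDefault override = f.getAnnotation(ConfigOverrideDefault.class);
        return override != null ? override.value() : actualDefault;
    }
}
```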

Categorization: Implemented logic to group configurations into sections based on key prefixes (e.g., Acl, Client, Server) for better navigation.
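The prefix-based grouping could look roughly like the sketch below. The rule used here (first dot-separated segment, capitalized) and the sample keys are assumptions for illustration, not necessarily the logic in this PR:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class PrefixGroupingSketch {

    /** Derives a section title from the first segment of the key. */
    static String sectionFor(String key) {
        int dot = key.indexOf('.');
        String prefix = dot < 0 ? key : key.substring(0, dot);
        if (prefix.isEmpty()) {
            return "Other";
        }
        return prefix.substring(0, 1).toUpperCase(Locale.ROOT) + prefix.substring(1);
    }

    /** Groups keys by section; TreeMap keeps the sections in sorted order. */
    static Map<String, List<String>> group(List<String> keys) {
        Map<String, List<String>> sections = new TreeMap<>();
        for (String key : keys) {
            sections.computeIfAbsent(sectionFor(key), s -> new ArrayList<>()).add(key);
        }
        return sections;
    }
}
```

A sorted map keeps the generated file deterministic, which avoids spurious diffs when the generator is rerun.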

MDX Integration: Adopted an MDX-based partial import strategy. The generator outputs config_reference.mdx, which is imported by configuration.md. This resolves Docusaurus pathing limitations while keeping the documentation structure clean.

Human-Readable Formatting: Integrated utilities to format Duration and MemorySize default values into readable strings (e.g., "15 min", "64 mb").
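One way to produce strings such as "15 min" and "64 mb" is sketched below. It takes a plain byte count instead of the project's `MemorySize` type, and the unit labels and rounding rule (use the largest unit that divides evenly) are assumptions:

```java
import java.time.Duration;

final class DefaultFormatSketch {

    /** Formats a duration using the largest unit that divides it exactly. */
    static String formatDuration(Duration duration) {
        long ms = duration.toMillis();
        if (ms == 0) {
            return "0 ms";
        }
        if (ms % 86_400_000L == 0) {
            return ms / 86_400_000L + " d";
        }
        if (ms % 3_600_000L == 0) {
            return ms / 3_600_000L + " h";
        }
        if (ms % 60_000L == 0) {
            return ms / 60_000L + " min";
        }
        if (ms % 1_000L == 0) {
            return ms / 1_000L + " s";
        }
        return ms + " ms";
    }

    /** Formats a byte count using the largest binary unit that divides it exactly. */
    static String formatBytes(long bytes) {
        String[] units = {"bytes", "kb", "mb", "gb", "tb"};
        int unit = 0;
        long value = bytes;
        while (unit < units.length - 1 && value != 0 && value % 1024 == 0) {
            value /= 1024;
            unit++;
        }
        return value + " " + units[unit];
    }
}
```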

JSX/React Safety: Implemented character escaping for curly braces and angle brackets within descriptions to ensure stable rendering in the Docusaurus React engine.
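A sketch of the escaping step, assuming HTML character references are used as the replacement (the PR may escape differently). MDX treats `{` as the start of a JavaScript expression and `<` as the start of a JSX tag, so both must be neutralized in plain description text:

```java
final class MdxEscapeSketch {

    /** Replaces characters that MDX would interpret as JSX or expressions. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '{':
                    out.append("&#123;");
                    break;
                case '}':
                    out.append("&#125;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```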

Tests

Manual Verification: Verified by running the generator and confirming the config_reference.mdx file is successfully created with all categorized options.

Website Rendering: Confirmed via npm run start that the tables render correctly without React hydration errors or "Module not found" exceptions.

Build Stability: Verified that the new module integrates into the Maven reactor and passes mvn clean install -DskipTests.

Style: Passed spotless:check and checkstyle checks.

Screenshot 2026-01-27 113308

API and Format

This change does not affect the public API or storage format of Fluss. It only introduces a build-time utility module.

Documentation

This PR introduces a new automated system for documentation. The generated output is located at website/docs/maintenance/config_reference.mdx and displayed via website/docs/maintenance/configuration.md.

@wuchong
Member

wuchong commented Jan 25, 2026

Thanks, @Prajwal-banakar. It looks like this framework generates Markdown files for the configuration options. However, I’m not sure how these can be integrated or embedded into our existing documentation Markdown files, such as:

I initially assumed the framework would generate HTML snippets, which would be straightforward to embed directly into the documentation, like how Flink does. Could you clarify how the generated Markdown is intended to be incorporated into the current doc structure?

Besides, it would be great if you could add a README.md under the fluss-docgen module root. That would help others understand how to use this tool. Flink has such a README: https://github.com/apache/flink/blob/master/flink-docs/README.md

@Prajwal-banakar
Contributor Author

Hi @wuchong, thank you for the insightful feedback!

To address your points and clarify the integration strategy:

Markdown vs. HTML Snippets: You are absolutely right. Generating HTML snippets is a much cleaner approach for embedding. I will update the generator to produce HTML tables/lists instead of pure Markdown headers. HTML is more robust for Docusaurus and prevents the "styling" of configurations from clashing with the page hierarchy.

Integration Structure: My vision for incorporating this into the existing structure (like maintenance/configuration.md) is to use a marker-based injection system. I will add a pair of hidden HTML comments (a start marker and an end marker) to the existing documentation files. The fluss-docgen tool will then:

Scan for these markers in the target .md files.

Replace only the content between them with the latest generated HTML snippets.

This ensures that manual content (introductions, custom notes) remains untouched while the configuration list stays automated.
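The injection step could be sketched as follows. The marker strings `BEGIN` and `END` are hypothetical placeholders chosen for this example, not the markers actually proposed:

```java
final class MarkerInjectionSketch {

    // Hypothetical marker names, used only for illustration.
    static final String BEGIN = "<!-- BEGIN_GENERATED_CONFIG -->";
    static final String END = "<!-- END_GENERATED_CONFIG -->";

    /** Replaces everything between the two markers, keeping the markers themselves. */
    static String inject(String page, String generated) {
        int begin = page.indexOf(BEGIN);
        int end = begin < 0 ? -1 : page.indexOf(END, begin + BEGIN.length());
        if (begin < 0 || end < 0) {
            throw new IllegalArgumentException("Marker pair not found");
        }
        return page.substring(0, begin + BEGIN.length())
                + "\n" + generated + "\n"
                + page.substring(end);
    }
}
```

Because the markers survive each run, the operation is idempotent: running it twice with the same generated content yields the same page.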

README.md: I will definitely add a README.md under the fluss-docgen module root, following the Flink example, to document how to trigger the build and how the injection logic works.

Plan of Action: I will proceed with refactoring the generator to produce HTML snippets and set up the marker-based injection for the specific files you linked. I’ll also include the README in the next push.

Does this refined approach align with your expectations?

@wuchong
Member

wuchong commented Jan 25, 2026

Hi @Prajwal-banakar,

The goal of generating HTML was primarily to make it easy to embed configuration content into our documentation Markdown files. However, after reviewing the Docusaurus documentation, I believe it’s actually simpler and more maintainable to embed Markdown (or MDX) directly into Markdown files using Docusaurus’ built-in MDX support—see:
https://docusaurus.io/docs/3.8.1/markdown-features/react#importing-markdown

Here’s a refined proposal:

  1. Generate configs as an MDX partial:
    Output the auto-generated configuration reference into a file like _partial_config.mdx under website/docs/_configs/. This file can then be imported and rendered in any doc page using MDX.

  2. Support logical grouping via annotations:
    Introduce a @Documentation.Section annotation in the source code (e.g., on config fields or classes). Use the provided section name to generate corresponding headers in the output. Preserve the existing section structure from the current config doc at
    https://fluss.apache.org/docs/next/maintenance/configuration/
    as the baseline.

  3. Allow overriding system-dependent defaults:
    Add a @Documentation.OverrideDefault annotation to let developers specify a documentation-friendly default value (e.g., for client.scanner.io.tmpdir, which defaults to a system property). This avoids exposing environment-specific values like /tmp in the docs.

  4. Format defaults as inline code:
    Render all “Default” values in <code> style (e.g., `7d`) for better readability and consistency.

  5. Include a “Type” column:
    Clearly indicate the type of each configuration option (e.g., Duration, Boolean, MemorySize, String) in the generated table.

This approach keeps the docs accurate, maintainable, and aligned with Docusaurus best practices while enabling full automation of config documentation.
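To make points 4 and 5 concrete, a generator could emit table rows along these lines. This is a sketch only: the column order, the header labels, and the sample key are assumptions.

```java
final class TableRowSketch {

    /** Markdown table header including the proposed "Type" column. */
    static String header() {
        return "| Key | Default | Type | Description |\n"
                + "| --- | --- | --- | --- |\n";
    }

    /** Renders one option; key and default are wrapped in inline code. */
    static String row(String key, String defaultValue, String type, String description) {
        // A literal pipe would end the table cell early, so escape it.
        String safeDescription = description.replace("|", "\\|");
        return "| `" + key + "` | `" + defaultValue + "` | " + type
                + " | " + safeDescription + " |\n";
    }
}
```

For example, `row("example.retention", "7d", "Duration", "How long data is kept.")` yields a row whose default renders as inline code, matching the `7d` style suggested above.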

@Prajwal-banakar
Contributor Author

Hi @wuchong,

I have completed the task of automating the configuration documentation. I’ve implemented a reflection-based generator in a new fluss-docgen module that ensures our docs stay perfectly in sync with the ConfigOptions source code.

Note on File Path & Naming: While implementing the MDX integration, I encountered a technical limitation with Docusaurus's path resolution. When the generated file was located in a separate _configs folder and used an underscore prefix (e.g., _partial_config.mdx), the development server failed to resolve the relative import, resulting in 'Module not found' errors.

To resolve this and ensure a stable build, I made two adjustments:

I moved the generated config_reference.mdx into the same directory as configuration.md (website/docs/maintenance/).

I removed the underscore prefix to ensure Docusaurus treats it as a standard, resolvable MDX component.

This setup is currently working perfectly on the local site. Please let me know if you’d prefer a different organizational structure, and I’ll be happy to adjust!

@Prajwal-banakar
Contributor Author

Hi @wuchong, I’m a bit stuck on the CI. Everything passes locally on my machine (Build Success, Spotless, and Tests), and I've verified the MDX output is correct. However, the fluss-test-coverage check keeps failing in the CI.

I've added a unit test in fluss-docgen to cover the generator logic, but it seems the CI is still flagging coverage. Is there a specific configuration in the project's JaCoCo setup I might be missing for new modules? Thanks for the help!


Development

Successfully merging this pull request may close these issues.

[docs][build] Generate config options docs automatically from ConfigOptions
