@ShreyeshArangath ShreyeshArangath commented Jan 25, 2026

Which issue does this PR close?

Closes #1889

Rationale for this change

Adds support for a non-deterministic function, as part of #1833.

What changes are included in this PR?

Implements native support for Spark's `monotonically_increasing_id()` function in Auron.

Functionality TL;DR:
The monotonically_increasing_id() function generates unique, monotonically increasing 64-bit integers across all partitions. Each partition generates IDs using the formula:

id := (partition_id << 33) | row_number
  • Each ID is globally unique across all partitions
  • IDs increase monotonically within each partition
  • The upper 31 bits encode the partition ID, while the lower 33 bits encode the row number within that partition
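
The bit layout described above can be sketched as follows. This is a minimal illustration only, not the code in this PR: the names `compose_id`, `partition_of`, `row_of`, `ROW_BITS`, and `ROW_MASK` are hypothetical, and the sketch assumes the partition ID is non-negative and the row number fits in 33 bits.

```rust
/// Number of low bits reserved for the per-partition row number (assumed name).
const ROW_BITS: u32 = 33;
/// Mask selecting the low 33 bits of an ID.
const ROW_MASK: i64 = (1i64 << ROW_BITS) - 1;

/// Hypothetical helper: pack a partition ID and a row number into one 64-bit ID.
/// Assumes `partition_id >= 0` and `0 <= row_number <= ROW_MASK`.
fn compose_id(partition_id: i32, row_number: i64) -> i64 {
    debug_assert!(partition_id >= 0);
    debug_assert!((0..=ROW_MASK).contains(&row_number));
    ((partition_id as i64) << ROW_BITS) | row_number
}

/// Hypothetical helper: recover the partition ID from the upper bits.
fn partition_of(id: i64) -> i32 {
    (id >> ROW_BITS) as i32
}

/// Hypothetical helper: recover the row number from the lower 33 bits.
fn row_of(id: i64) -> i64 {
    id & ROW_MASK
}

fn main() {
    // The first three rows of partition 1 start at 1 << 33.
    let ids: Vec<i64> = (0..3).map(|row| compose_id(1, row)).collect();
    println!("{:?}", ids); // [8589934592, 8589934593, 8589934594]

    // The last possible ID of partition 0 is still below the first ID of partition 1.
    assert!(compose_id(0, ROW_MASK) < compose_id(1, 0));

    // Both fields can be recovered from the packed value.
    let id = compose_id(7, 42);
    println!("partition={} row={}", partition_of(id), row_of(id));
}
```

Because the row number never spills into the upper bits under these assumptions, IDs from different partitions cannot collide, and IDs within a partition grow with the row number.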

Are there any user-facing changes?

N/A

How was this patch tested?

Unit tests

@ShreyeshArangath ShreyeshArangath changed the title [AURON #1889]Implement monotonically_increasing_id() function [AURON #1889] Implement monotonically_increasing_id() function Jan 25, 2026
@ShreyeshArangath ShreyeshArangath marked this pull request as ready for review January 25, 2026 05:13
@cxzl25 cxzl25 requested a review from Copilot January 26, 2026 03:14
Copilot AI left a comment

Pull request overview

Implements native support for Spark’s monotonically_increasing_id() as a non-deterministic physical expression in Auron, wiring it through the Spark shims, protobuf plan representation, and the Rust planner and execution engine.

Changes:

  • Adds a MonotonicallyIncreasingIdExprNode to the protobuf PhysicalExprNode oneof and wires it through the Scala ShimsImpl.convertMoreExprWithFallback.
  • Introduces SparkMonotonicallyIncreasingIdExpr in datafusion-ext-exprs, including unit tests that validate type, nullability, monotonicity, partition offsets, and partition separation.
  • Extends the Rust PhysicalPlanner to build SparkMonotonicallyIncreasingIdExpr from the new protobuf expression type and exposes the module from datafusion-ext-exprs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `spark-extension-shims-spark/src/main/scala/org/apache/spark/sql/auron/ShimsImpl.scala` | Maps Spark’s MonotonicallyIncreasingID Catalyst expression to the new protobuf MonotonicIncreasingIdExprNode for native planning. |
| `native-engine/datafusion-ext-exprs/src/spark_monotonically_increasing_id.rs` | Implements the physical expression that generates 64-bit partition-scoped monotonically increasing IDs and adds unit tests for behavior. |
| `native-engine/datafusion-ext-exprs/src/lib.rs` | Exposes the new spark_monotonically_increasing_id module from the extension expressions crate. |
| `native-engine/auron-planner/src/planner.rs` | Deserializes the new protobuf MonotonicIncreasingIdExpr into SparkMonotonicallyIncreasingIdExpr during physical planning. |
| `native-engine/auron-planner/proto/auron.proto` | Extends the physical expression protobuf with MonotonicIncreasingIdExprNode and its field in PhysicalExprNode, and relocates SparkPartitionIdExprNode to avoid duplication. |
