@x-tong x-tong commented Jan 19, 2026

Summary

  • Implement Flink RowData to Arrow format conversion for Auron-Flink integration
  • Add FlinkArrowUtils, FlinkArrowWriter, FlinkArrowFieldWriter, and FlinkArrowFFIExporter
  • Support all common Flink types including primitives, temporal, and complex types
  • Add comprehensive unit tests

Test plan

  • Unit tests for FlinkArrowUtils type conversion
  • Unit tests for FlinkArrowWriter data writing
  • Unit tests for FlinkArrowFFIExporter (skipped when native libs unavailable)
  • Build passes with ./auron-build.sh --pre --sparkver 3.5 --scalaver 2.12 -DskipBuildNative
  • Code formatted with ./dev/reformat

Closes #1850

Implement Flink RowData to Arrow format conversion for Auron-Flink integration.

Key components:
- FlinkArrowUtils: Type conversion between Flink LogicalType and Arrow types
- FlinkArrowWriter: Converts Flink RowData to Arrow VectorSchemaRoot
- FlinkArrowFieldWriter: Field-level writers for all supported types
- FlinkArrowFFIExporter: Exports Arrow data via FFI for native consumption

Supported types:
- Primitive: Boolean, TinyInt, SmallInt, Int, BigInt, Float, Double
- String/Binary: VarChar, Char, VarBinary, Binary
- Temporal: Date, Time, Timestamp, LocalZonedTimestamp
- Complex: Array, Map, Row/Struct
- Decimal (128-bit)
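
For the temporal types listed above, the unit of the target Arrow vector decides how a Flink timestamp has to be scaled. The sketch below is not taken from this PR. It assumes a timestamp given as epoch milliseconds plus a nanosecond-of-millisecond remainder (the shape Flink's `TimestampData` exposes through `getMillisecond()` and `getNanoOfMillisecond()`), and shows the arithmetic for a microsecond and a nanosecond target. The class and method names are hypothetical, and the unit the PR actually uses is not stated in this conversation.

```java
/**
 * Illustrative only: scales a (millis, nanoOfMillisecond) timestamp to
 * microseconds or nanoseconds since the epoch. Names are hypothetical.
 */
final class TimestampUnits {
    private TimestampUnits() {}

    /** Microseconds since epoch; digits below one microsecond are dropped. */
    static long toEpochMicros(long epochMillis, int nanoOfMillisecond) {
        checkRemainder(nanoOfMillisecond);
        // 1 ms = 1_000 us; the remainder contributes whole microseconds only.
        return Math.addExact(
                Math.multiplyExact(epochMillis, 1_000L), nanoOfMillisecond / 1_000);
    }

    /** Nanoseconds since epoch; throws ArithmeticException on long overflow. */
    static long toEpochNanos(long epochMillis, int nanoOfMillisecond) {
        checkRemainder(nanoOfMillisecond);
        // 1 ms = 1_000_000 ns; the remainder is already in nanoseconds.
        return Math.addExact(
                Math.multiplyExact(epochMillis, 1_000_000L), nanoOfMillisecond);
    }

    private static void checkRemainder(int nanoOfMillisecond) {
        if (nanoOfMillisecond < 0 || nanoOfMillisecond > 999_999) {
            throw new IllegalArgumentException(
                    "nanoOfMillisecond out of range: " + nanoOfMillisecond);
        }
    }
}
```

The exact-arithmetic helpers matter mostly for the nanosecond case, where a `long` overflows for dates far from the epoch; a microsecond target has a much wider range.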
Contributor

Copilot AI left a comment

Pull request overview

This PR implements Flink RowData to Arrow format conversion for the Auron-Flink integration. It introduces new utilities and writers to convert Flink's table data structures to Apache Arrow format, enabling efficient data exchange between Flink and native code via the Arrow C Data Interface.

Changes:

  • Added FlinkArrowUtils for type conversion between Flink LogicalType and Arrow types
  • Implemented FlinkArrowWriter and FlinkArrowFieldWriter for converting RowData to Arrow vectors
  • Added FlinkArrowFFIExporter for asynchronous FFI-based data export with producer-consumer pattern
  • Included comprehensive unit tests for all conversion and export functionality
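
As a rough illustration of the producer-consumer export mentioned in the list above, the sketch below hands batches from a producer thread to a consumer through two blocking queues: one carries the ready batch, the other carries a "released" signal so the producer does not move on while the consumer is still reading. This is an assumption about what a two-queue handoff can look like, not the code in `FlinkArrowFFIExporter`. All names are hypothetical, and plain `long[]` arrays stand in for Arrow batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Illustrative two-queue handoff; long[] stands in for an Arrow batch. */
final class DoubleQueueHandoff {
    // Sentinel compared by identity, so a genuinely empty batch is still valid.
    private static final long[] END = new long[0];

    private final BlockingQueue<long[]> ready = new ArrayBlockingQueue<>(1);
    private final BlockingQueue<Boolean> released = new ArrayBlockingQueue<>(1);

    /** Producer side: publish one batch, then wait until it is released. */
    void produce(List<long[]> batches) throws InterruptedException {
        for (long[] batch : batches) {
            ready.put(batch);
            released.take(); // consumer is done; the batch may be reused
        }
        ready.put(END);
    }

    /** Consumer side: sums the next batch, or returns empty at end of input. */
    OptionalLong consumeNext() throws InterruptedException {
        long[] batch = ready.take();
        if (batch == END) {
            return OptionalLong.empty();
        }
        long sum = 0;
        for (long value : batch) {
            sum += value;
        }
        released.put(Boolean.TRUE); // hand ownership back to the producer
        return OptionalLong.of(sum);
    }

    /** Runs producer and consumer together and returns the per-batch sums. */
    static List<Long> runDemo(List<long[]> batches) {
        DoubleQueueHandoff handoff = new DoubleQueueHandoff();
        Thread producer = new Thread(() -> {
            try {
                handoff.produce(batches);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "producer");
        producer.setDaemon(true);
        producer.start();

        List<Long> sums = new ArrayList<>();
        try {
            OptionalLong next;
            while ((next = handoff.consumeNext()).isPresent()) {
                sums.add(next.getAsLong());
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return sums;
    }
}
```

The second queue is what keeps the handoff safe in this sketch: with a single queue the producer could start overwriting a reusable buffer while the consumer is still reading it.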

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| FlinkArrowUtils.java | Provides utilities for converting Flink types to Arrow types and creating Arrow schemas |
| FlinkArrowFieldWriter.java | Implements field writers for all supported Flink types with recursive handling for complex types |
| FlinkArrowWriter.java | Main writer class that orchestrates conversion of RowData to VectorSchemaRoot |
| FlinkArrowFFIExporter.java | Asynchronous exporter using double-queue pattern for safe FFI data export |
| FlinkArrowUtilsTest.java | Tests type conversion logic for all supported types |
| FlinkArrowWriterTest.java | Tests data writing for basic, complex, and edge cases |
| FlinkArrowFFIExporterTest.java | Tests FFI export functionality with native library availability checks |
| pom.xml | Adds required Arrow and Flink dependencies |
| .gitignore | Adds IDE/LSP configuration patterns |

- Add try-finally resource protection in producer thread
- Return false on InterruptedException to avoid deadlock
- Fix comments: clarify nanoseconds vs microseconds
- Add Javadoc for resource management
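
The first two items in this commit message describe a common pattern, sketched below under stated assumptions: the producer thread wraps its loop in try/finally so cleanup and the end-of-stream marker always happen, and the consumer-facing call returns false when interrupted instead of blocking forever. This is not the PR's code. The class, the `exportNext` method, and the `resourceClosed` flag (a stand-in for releasing Arrow buffers) are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Illustrative exporter: strings stand in for exported batches. */
final class InterruptAwareExporter implements AutoCloseable {
    private static final Object END = new Object();

    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final AtomicBoolean resourceClosed = new AtomicBoolean(false);
    private final Thread producer;

    InterruptAwareExporter(Iterable<String> rows) {
        producer = new Thread(() -> {
            try {
                for (String row : rows) {
                    queue.put(row);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                // Runs on normal completion, interrupt, or failure.
                resourceClosed.set(true); // stand-in for freeing buffers
                queue.offer(END);         // unbounded queue, so this never blocks
            }
        }, "export-producer");
        producer.setDaemon(true);
        producer.start();
    }

    /** Returns true if a row was delivered, false at end of input or on interrupt. */
    boolean exportNext(Consumer<String> sink) {
        try {
            Object item = queue.take();
            if (item == END) {
                queue.offer(END); // keep later calls from blocking
                return false;
            }
            sink.accept((String) item);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    boolean isResourceClosed() {
        return resourceClosed.get();
    }

    @Override
    public void close() {
        producer.interrupt();
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Returning false on interrupt lets a caller treat cancellation like end of input and proceed to `close()`, rather than waiting on a queue that may never be filled again.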
@ShreyeshArangath
Contributor

@x-tong is it possible to split this up into separate PRs to make it easier to review?

Author

x-tong commented Jan 21, 2026

> @x-tong is it possible to split this up into separate PRs to make it easier to review?

I will do this this week.

Author

x-tong commented Jan 26, 2026

@ShreyeshArangath I've split this PR into 3 smaller PRs for easier review:

  1. Part 1 - FlinkArrowUtils (type conversion): [AURON #1850] Add FlinkArrowUtils for Flink-Arrow type conversion #1959
  2. Part 2 - FlinkArrowFieldWriter + FlinkArrowWriter (data writing): will submit after Part 1 is merged
  3. Part 3 - FlinkArrowFFIExporter (FFI export): will submit after Part 2 is merged

I'll keep this PR open for reference until all parts are merged.

Development

Successfully merging this pull request may close these issues.

Introduce Flink RowData to Arrow
