(won't merge) Read spill files with stream.perf #1419

mattcuento · 2026-01-28T17:51:37Z

Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

milenkovicm · 2026-01-28T19:53:11Z

@sqlbenchmark run tpch -s 10 -i 3

sqlbenchmark · 2026-01-28T20:09:32Z

Ballista TPC-H Benchmark Results

PR: #1419 - prefer flight
PR Commit: f94b5e4
Base Commit: 7c7880e (main)
Scale Factor: SF10
Iterations: 3

Query Comparison

Query	Main (ms)	PR (ms)	Change
Q1	2543.90	2551.50	⚪ +0.3%
Q2	1916.60	2138.90	🔴 +11.6%
Q3	2079.50	2693.70	🔴 +29.5%
Q4	1262.20	1284.90	⚪ +1.8%
Q5	3978.40	6101.50	🔴 +53.4%
Q6	843.20	1018.60	🔴 +20.8%
Q7	5513.20	7496.30	🔴 +36.0%
Q8	5154.80	7212.70	🔴 +39.9%
Q9	6227.50	9145.00	🔴 +46.8%
Q10	2382.10	2715.60	🔴 +14.0%
Q11	1421.20	1628.90	🔴 +14.6%
Q12	1736.50	1765.40	⚪ +1.7%
Q13	2127.80	2116.10	⚪ -0.5%
Q14	1045.50	1087.10	⚪ +4.0%
Q15	1536.80	1576.60	⚪ +2.6%
Q16	1306.30	1306.90	⚪ +0.0%
Q17	4167.50	5795.30	🔴 +39.1%
Q18	6248.00	6553.30	⚪ +4.9%
Q19	1833.10	1848.80	⚪ +0.9%
Q20	1840.70	1828.10	⚪ -0.7%
Q21	5978.30	8029.80	🔴 +34.3%
Q22	1134.40	1034.20	🟢 -8.8%

Total: Main=62277.50ms, PR=76929.20ms (+23.5%)

Automated benchmark run by dfbench
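
For anyone checking the table by hand, the Change column is consistent with the relative difference (PR - Main) / Main. The sketch below assumes that formula and a ±5% cutoff for the coloured markers. Both are inferred from the numbers above, not taken from the bot's source, and `pct_change` and `marker` are hypothetical helper names.

```python
def pct_change(main_ms: float, pr_ms: float) -> float:
    """Relative change of the PR timing against main, in percent."""
    return (pr_ms - main_ms) / main_ms * 100.0


def marker(change: float, threshold: float = 5.0) -> str:
    """Classify a change; the 5% threshold is an assumption inferred from the table."""
    if change > threshold:
        return "slower"
    if change < -threshold:
        return "faster"
    return "neutral"


# Timings copied from the first benchmark run above.
q5 = pct_change(3978.40, 6101.50)
total = pct_change(62277.50, 76929.20)
print(f"Q5: {q5:+.1f}% ({marker(q5)})")
print(f"Total: {total:+.1f}%")
```

With the Q5 and Total rows as input, this reproduces the +53.4% and +23.5% figures reported by the bot.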

milenkovicm · 2026-01-28T22:22:03Z

I have no idea why it is so much slower than the other implementation. The only major difference is file IPC vs. stream IPC.
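
For context on that distinction: the Arrow IPC stream format is a sequence of messages consumed front to back, while the file format wraps the same messages and adds a footer recording where each record batch starts, which permits random access. The sketch below is a toy model of that structural difference only. It uses length-prefixed byte records, not real Arrow messages, and every name in it is hypothetical.

```python
import io
import struct


def write_stream(records: list[bytes]) -> bytes:
    """Stream-style layout: each record is prefixed with its length."""
    out = io.BytesIO()
    for rec in records:
        out.write(struct.pack("<I", len(rec)))
        out.write(rec)
    return out.getvalue()


def read_stream(buf: bytes):
    """Sequential read: record N is reachable only after records 0..N-1."""
    view = io.BytesIO(buf)
    while header := view.read(4):
        (size,) = struct.unpack("<I", header)
        yield view.read(size)


def write_file(records: list[bytes]) -> bytes:
    """File-style layout: the stream body, then a footer of record offsets."""
    body = io.BytesIO()
    offsets = []
    for rec in records:
        offsets.append(body.tell())
        body.write(struct.pack("<I", len(rec)))
        body.write(rec)
    footer = struct.pack(f"<{len(offsets)}I", *offsets)
    # Trailer: number of offsets, so a reader can locate the footer from the end.
    return body.getvalue() + footer + struct.pack("<I", len(offsets))


def read_file_record(buf: bytes, index: int) -> bytes:
    """Random access: consult the footer, then jump straight to one record."""
    (count,) = struct.unpack("<I", buf[-4:])
    footer_start = len(buf) - 4 - 4 * count
    offsets = struct.unpack(f"<{count}I", buf[footer_start:-4])
    start = offsets[index]
    (size,) = struct.unpack("<I", buf[start:start + 4])
    return buf[start + 4:start + 4 + size]


batches = [b"batch-0", b"batch-1", b"batch-2"]
assert list(read_stream(write_stream(batches))) == batches
assert read_file_record(write_file(batches), 2) == b"batch-2"
```

Whether this layout difference accounts for the regression measured above is not established in this thread.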

mattcuento · 2026-01-29T04:31:53Z

Yeah, agreed. I'm going to try some local profiling to see where this is coming from.

I did notice that sort-based shuffle (without the stream fix) generally does much worse than hash-based shuffle for TPC-H, judging by your last experiment in #1401 (comment).

Huy1Ng · 2026-01-29T10:24:05Z

> @sqlbenchmark run tpch -s 10 -i 3

It's off topic, but what is this? I can't find any documentation about this new bot.

milenkovicm · 2026-01-29T10:42:55Z

> @sqlbenchmark run tpch -s 10 -i 3
>
> It's off topic, but what is this? I can't find any documentation about this new bot.

It's @andygrove's effort to make perf tests easier; it's still a work in progress.

sqlbenchmark · 2026-01-29T11:03:50Z

Ballista TPC-H Benchmark Results

PR: #1419 - prefer flight
PR Commit: f94b5e4
Base Commit: 979d3af (main)
Scale Factor: SF10
Iterations: 3

Query Comparison

Query	Main (ms)	PR (ms)	Change
Q1	2542.60	2606.90	⚪ +2.5%
Q2	1895.10	2255.50	🔴 +19.0%
Q3	2037.40	2773.00	🔴 +36.1%
Q4	1248.70	1304.30	⚪ +4.5%
Q5	4003.30	6056.80	🔴 +51.3%
Q6	930.80	951.00	⚪ +2.2%
Q7	5501.80	7410.40	🔴 +34.7%
Q8	5186.90	7377.60	🔴 +42.2%
Q9	6234.40	9057.30	🔴 +45.3%
Q10	2322.10	2830.60	🔴 +21.9%
Q11	1475.00	1501.10	⚪ +1.8%
Q12	1762.80	1703.30	⚪ -3.4%
Q13	2116.70	2211.00	⚪ +4.5%
Q14	1101.00	1067.30	⚪ -3.1%
Q15	1590.90	1502.40	🟢 -5.6%
Q16	1306.90	1388.30	🔴 +6.2%
Q17	4147.10	5668.70	🔴 +36.7%
Q18	6261.70	6438.20	⚪ +2.8%
Q19	1791.90	1873.80	⚪ +4.6%
Q20	1805.00	1792.90	⚪ -0.7%
Q21	5774.30	8275.70	🔴 +43.3%
Q22	1081.80	1114.40	⚪ +3.0%

Total: Main=62118.20ms, PR=77160.50ms (+24.2%)

Automated benchmark run by dfbench

mattcuento added 3 commits January 26, 2026 21:09

feat: Read sort-based shuffle spill files via stream

66b442e

force sort shuffle on

33969a9

prefer flight

f94b5e4

mattcuento mentioned this pull request Jan 29, 2026

(wont merge) Baseline remote sort shuffle reads #1425

Closed
