Description
An unnest query from an mdtest has been observed to trigger intermittent panics in CI, such as in this job.
panic: context canceled
goroutine 56 [running]:
github.com/brimdata/super/runtime/sam/op/subquery.(*Subquery).Eval(0xc0005a0de0, {{0x242f200?, 0xc00059eb70?}, 0xc000a8a003?, 0x1ef7800?})
/home/runner/work/super/super/runtime/sam/op/subquery/subquery.go:121 +0x7e5
github.com/brimdata/super/runtime/sam/op/unnest.(*Unnest).unnest(0xc0000bac80, {{0x242f200?, 0xc00059eb70?}, 0xc000a8a003?, 0x0?})
/home/runner/work/super/super/runtime/sam/op/unnest/unnest.go:57 +0x3d
github.com/brimdata/super/runtime/sam/op/unnest.(*Unnest).Pull(0xc0000bac80, 0x0?)
/home/runner/work/super/super/runtime/sam/op/unnest/unnest.go:46 +0x159
github.com/brimdata/super/runtime/sam/op/aggregate.(*Op).run(0xc0000bacd0)
/home/runner/work/super/super/runtime/sam/op/aggregate/aggregate.go:195 +0xa2
created by github.com/brimdata/super/runtime/sam/op/aggregate.(*Op).Pull.func1 in goroutine 55
/home/runner/work/super/super/runtime/sam/op/aggregate/aggregate.go:163 +0x66
Details
Repro is with super commit 66dc4e5.
When running the query outside of CI, I can't seem to repro it on my MacBook. However, I can reliably repro it on an ubuntu-22.04 hosted GitHub Actions runner. To get a faster repro rate, I start four concurrent instances of:
cat /dev/urandom > /dev/null &
Tying up all the cores in this way seems to aggravate the likely timing issue that's causing the panic. I need to be in the book/src/tutorials directory of my checkout of the super repo to access the input data. Then I start to see panics within a minute:
$ super -version
Version: v0.1.0-11-g66dc4e5d8
$ while true; do super -S -c '
unnest {user:user.login,reviewer:requested_reviewers} into (
reviewers:=union(reviewer.login) by user
)
| groups:=union(reviewers) by user
| sort user,len(groups)
' prs.bsup; done > /dev/null
panic: context canceled
goroutine 45 [running]:
github.com/brimdata/super/runtime/sam/op/subquery.(*Subquery).Eval(0xc00044b260, {{0x2430220?, 0xc0003a5860?}, 0xc000892408?, 0x2430220?})
/home/runner/super/runtime/sam/op/subquery/subquery.go:121 +0x7e5
github.com/brimdata/super/runtime/sam/op/unnest.(*Unnest).unnest(0xc000540c80, {{0x2430220?, 0xc0003a5860?}, 0xc000892408?, 0xc00041e110?})
/home/runner/super/runtime/sam/op/unnest/unnest.go:57 +0x3d
github.com/brimdata/super/runtime/sam/op/unnest.(*Unnest).Pull(0xc000540c80, 0x0?)
/home/runner/super/runtime/sam/op/unnest/unnest.go:46 +0x159
github.com/brimdata/super/runtime/sam/op/aggregate.(*Op).run(0xc000540cd0)
/home/runner/super/runtime/sam/op/aggregate/aggregate.go:195 +0xa2
created by github.com/brimdata/super/runtime/sam/op/aggregate.(*Op).Pull.func1 in goroutine 44
/home/runner/super/runtime/sam/op/aggregate/aggregate.go:163 +0x66
panic: context canceled
goroutine 52 [running]:
github.com/brimdata/super/runtime/sam/op/subquery.(*Subquery).Eval(0xc000391560, {{0x2430220?, 0xc00054a120?}, 0xc0009074c7?, 0x0?})
/home/runner/super/runtime/sam/op/subquery/subquery.go:121 +0x7e5
github.com/brimdata/super/runtime/sam/op/unnest.(*Unnest).unnest(0xc000574d20, {{0x2430220?, 0xc00054a120?}, 0xc0009074c7?, 0xc000054070?})
/home/runner/super/runtime/sam/op/unnest/unnest.go:57 +0x3d
github.com/brimdata/super/runtime/sam/op/unnest.(*Unnest).Pull(0xc000574d20, 0x0?)
/home/runner/super/runtime/sam/op/unnest/unnest.go:46 +0x159
github.com/brimdata/super/runtime/sam/op/aggregate.(*Op).run(0xc000574d70)
/home/runner/super/runtime/sam/op/aggregate/aggregate.go:195 +0xa2
created by github.com/brimdata/super/runtime/sam/op/aggregate.(*Op).Pull.func1 in goroutine 51
/home/runner/super/runtime/sam/op/aggregate/aggregate.go:163 +0x66
...
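For convenience, the load-generation step can be scripted. This sketch covers only the CPU-loading part (the super loop above would run alongside it) and assumes a POSIX shell; it is a helper I'm suggesting, not part of the original repro:

```shell
#!/bin/sh
# Tie up four cores with busy readers, as described above, then clean up.
pids=""
for i in 1 2 3 4; do
  cat /dev/urandom > /dev/null &
  pids="$pids $!"
done
set -- $pids
echo "started $# load generators"
kill $pids
wait 2>/dev/null
echo "done"
```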
@mattnibs has looked at this one and said it started happening when we changed the compiler such that unnest ... into compiles into a nested unnest (unnest [unnest]):
$ super compile -dag -C 'unnest [1,2,3] into ( sum(this) )'
null
| unnest (
unnest [1,2,3]
| aggregate
sum:=sum(this)
| values sum
| aggregate
collect:=collect(this)
| values collect)
| output main
Regarding the cause of the panic, he observed:
The problem is we don't have a way of handling the error in the inner flowgraph of the subquery. Subquery panics when it encounters an error:
b, err := s.body.Pull(false)
if err != nil {
	panic(err)
}
Cached subquery returns the error as a value:
batch, err := c.body.Pull(false)
if err != nil {
	return c.rctx.Sctx.NewError(err)
}