
Testing platform plans for 2026 #91

@cfsmp3

Description


For now I'm just adding these here; they need some proper planning.

  • We need a way to bisect.
  • Make it AI friendly. This means we always compile with debug info, and for each test we make the full output (not just the result) available for download, as well as the sample (directly). Basically, we want to be able to give Claude one URL for a run, and from there it should be able to access everything: binaries, core dumps (if there's one), input (i.e. samples), arguments, and outputs (real and expected). See the run-manifest sketch below this list.
  • We can probably do a UI refresh very quickly with AI now and modernize it.
  • The estimated test duration can come out negative...
  • There's no real queue, but we still show one in which you can have several tests waiting to run, all of them with 0 tests ahead of them.
  • The current state of a test is either Pass or Fail. We don't have a "Never worked" state to reflect the situation where we have samples, but they have some problem and there has never been a version of CCExtractor that worked with them. A possible state model is sketched below this list.
  • We don't have a frontend (of any kind) to define the tests; it's all SQL queries run directly against the DB.
  • Building takes forever on Windows. Is there a way to cache artifacts, or at least the starting point of the build?
  • Our test suite is limited. Many samples that have been shared in issues have never been added, even though we used them to fix bugs. We need to recover as many of those as we can and add them.
  • Some videos are really long; we don't need a one-hour recording, or even 10 minutes. Usually, if there's a problem, you can see it in the first minute (see the trimming sketch below this list).
  • We're not validating the Docker build.
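
To make the "AI friendly" item above concrete, here's a rough sketch of what a per-run manifest could look like. Every field name, URL, and argument below is made up; the point is just that a single URL leads to every artifact of a run.

```python
# Hypothetical per-run manifest: one JSON document an AI agent (or a human)
# can fetch and follow to every artifact of a regression run.
# All field names, URLs and arguments here are illustrative, not the real API.
from dataclasses import dataclass, asdict
from typing import Optional
import json

@dataclass
class RunManifest:
    run_id: int
    commit: str
    binary_url: str               # debug build used for this run
    coredump_url: Optional[str]   # only present if the run crashed
    sample_url: str               # the input sample, directly downloadable
    arguments: list[str]          # exact command line used
    actual_output_url: str        # full output, not just the pass/fail verdict
    expected_output_url: str

manifest = RunManifest(
    run_id=12345,
    commit="abc1234",
    binary_url="https://platform.example/runs/12345/ccextractor",
    coredump_url=None,
    sample_url="https://platform.example/runs/12345/sample.ts",
    arguments=["--out=srt"],      # placeholder arguments
    actual_output_url="https://platform.example/runs/12345/actual.srt",
    expected_output_url="https://platform.example/runs/12345/expected.srt",
)
print(json.dumps(asdict(manifest), indent=2))
```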

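For the Pass/Fail/"Never worked" item, a minimal sketch of what a richer state model could look like; the names are only a suggestion, the real model lives in the platform's database.

```python
# Sketch of a richer test state; names are illustrative.
import enum

class TestState(enum.Enum):
    PASS = "pass"            # output matches the expected result
    FAIL = "fail"            # output used to match and no longer does (regression)
    NEVER_WORKED = "never"   # we have the sample, but no CCExtractor version
                             # has ever produced a correct result for it

def blocks_pr(state: TestState) -> bool:
    # Only a real regression should block a PR; NEVER_WORKED is a known gap.
    return state is TestState.FAIL
```
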
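For the long samples, trimming could be as simple as a stream copy of the first minute, assuming ffmpeg is available wherever the samples live. The expected outputs would then have to be regenerated for the trimmed files.

```python
# Sketch: trim a stored sample to its first minute with ffmpeg.
# "-c copy" copies the streams as-is (no re-encoding), so embedded captions
# are preserved. Paths are placeholders.
import subprocess

def trim_sample(src: str, dst: str, seconds: int = 60) -> None:
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-t", str(seconds), "-c", "copy", dst],
        check=True,
    )

trim_sample("long_recording.ts", "long_recording.60s.ts")
```
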
Separately:

We also need to sort the streams better, possibly using mediainfo, ffprobe, etc., to tag each of the samples we have.
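
As a starting point for tagging, ffprobe can dump stream metadata as JSON; which fields we keep as tags is an open question, the ones below are just examples.

```python
# Sketch: tag a sample with basic stream metadata using ffprobe's JSON output.
# Assumes ffprobe is installed; the fields kept as tags are examples.
import json
import subprocess

def probe_streams(path: str) -> list[dict]:
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)["streams"]

tags = [
    {"index": s.get("index"),
     "type": s.get("codec_type"),
     "codec": s.get("codec_name")}
    for s in probe_streams("sample.ts")
]
print(tags)
```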

We are not testing every possible argument. By testing I don't mean against a sample file, but each argument by itself, just checking that the parsing works. We broke some things in the Rust migration that apparently flew under the radar.
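
A parse-only smoke test could look roughly like this. The binary path, the flag list, and the way an "unknown parameter" error is detected are all assumptions that would need to match the real binary's behavior.

```python
# Sketch: run the binary with each documented flag (no sample involved) and
# fail if the flag is rejected as unknown/invalid. Everything below is a
# placeholder to be filled in from the real option list and output format.
import subprocess

CCEXTRACTOR = "./ccextractor"            # assumed path to the build under test
DOCUMENTED_FLAGS = ["--autoprogram"]     # placeholder; generate from the docs

def flag_is_recognized(flag: str) -> bool:
    proc = subprocess.run([CCEXTRACTOR, flag], capture_output=True, text=True)
    combined = (proc.stdout + proc.stderr).lower()
    # Assumed heuristic: an unrecognized flag mentions "unknown" or "invalid".
    return "unknown" not in combined and "invalid" not in combined

for flag in DOCUMENTED_FLAGS:
    assert flag_is_recognized(flag), f"{flag} is no longer accepted"
```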

Sometimes, for unknown reasons, Sample Platform runs are not triggered at all: the PR shows "All checks have passed" with no sign of SP. I don't know if it's a problem with GitHub, SP, or something else, but manual intervention is needed in order to trigger the run.

  • We need to add functionality to test Linux vs Windows results so we can see, for a given PR, whether the output is identical or not.

  • The comparison system is very dumb, comparing frame by frame, so if a (subtitle) frame is missing, everything from that point on is a mismatch.
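
For the comparison, an alignment-based diff would avoid the cascade: something like difflib's SequenceMatcher reports one missing frame as a single deletion, and everything after it still lines up. Representing a frame by its text only is a simplification; real frames would also carry timestamps.

```python
# Sketch: align expected vs. actual subtitle frames instead of comparing
# index by index, so a single missing frame doesn't turn the rest of the
# file into mismatches. Frames are simplified to plain strings here.
import difflib

expected = ["Hello", "world", "this", "is", "CCExtractor"]
actual   = ["Hello", "this", "is", "CCExtractor"]   # "world" is missing

matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
for op, a0, a1, b0, b1 in matcher.get_opcodes():
    if op != "equal":
        print(op, "expected:", expected[a0:a1], "actual:", actual[b0:b1])
# Output: a single "delete" for the missing frame; the frames after it are
# still reported as equal.
```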
