Add Rust benchmarks for json ser/de, protobufs, compression, html rewriting #301
benchmarks/rust.suite
# image-classification/image-classification-benchmark.wasm
pulldown-cmark/benchmark.wasm
regex/benchmark.wasm
rust-compression/benchmark.wasm
rust-html-rewriter/benchmark.wasm
rust-json/benchmark.wasm
rust-protobuf/benchmark.wasm
# tract-onnx-image-classification/benchmark.wasm
Why are there these two commented out entries?
This mirrors what all.suite does for the non-onnx case; I missed that the onnx entry isn't commented out there. I'm not sure why the entry is commented out in all.suite; it traces back to the original all.suite version from @abrown.
I'll see if these can run. My guess is that they are expensive or didn't work at some point in time, but I'll confirm that.
Looks like image-classification requires OpenVINO and doesn't have a .wasm checked into tree. The onnx test is pure guest execution using tract and is included. I removed the image-classification bench comment and uncommented tract-onnx here.
> Looks like image-classification requires OpenVINO and doesn't have a .wasm checked into tree.
Ah okay, I didn't realize that. We should either remove it from the suite or fix the build as necessary and check in the wasm binary.
Looks like CI started failing. Maybe related to the suite changes? Feel free to revert if so. Feel free to merge when CI is passing.
I think I've seen forms of this failure before on Windows, and there's some flakiness in the tests. I'll rerun and see if I can capture an issue to track down the source of the flakiness. It doesn't appear to be related to the suite changes, as it occurs while running the tests in the core crates, which are unrelated to the benchmark suite.
#303 has a CI fix for the Windows failures. I'm still not entirely confident of the root cause, but it doesn't appear to be related to these open PRs, as it was also happening in my CI runs on main.
Adds a benchmark testing JSON parsing and serialization performance using serde_json. The benchmark processes ~1.3MB of JSON data representing 100 user records with nested structures (profiles, settings, posts). Tests both deserialization (JSON → Rust structs) and serialization (Rust structs → JSON) in a single benchmark cycle.
Adds a benchmark testing Protocol Buffers encoding and decoding using prost. The benchmark processes ~1.0MB of protobuf binary data representing 100 user records with nested structures. Tests both deserialization (protobuf binary → Rust structs) and serialization (Rust structs → protobuf binary) in a single benchmark cycle. Includes a converter utility to generate protobuf binary data from JSON.
Adds a benchmark testing compression and decompression performance for two popular algorithms: Gzip (deflate) and Brotli. The benchmark processes ~1MB of mixed data (repeated patterns, structured data, natural text, and random bytes). Tests both compression and decompression in a single benchmark cycle, with verification that decompressed data matches the original.
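The input mix and the round-trip verification described above can be sketched with the standard library alone. This is a hedged illustration, not the benchmark's code: `mixed_data`, `round_trips`, the equal-quarter proportions, and the LCG constants are assumptions, and the toy run-length codec is a stand-in for Gzip and Brotli so that the example has no crate dependencies.

```rust
/// Builds `len` bytes of mixed input in four roughly equal parts:
/// a repeated pattern, structured records, natural text, pseudo-random bytes.
/// The proportions and contents are assumptions made for illustration.
fn mixed_data(len: usize) -> Vec<u8> {
    let quarter = len / 4;
    let mut data = Vec::with_capacity(len);

    // Repeated pattern: highly compressible.
    data.extend(b"ABCD".iter().cycle().take(quarter));

    // Structured records: moderately compressible.
    let mut id = 0u32;
    while data.len() < 2 * quarter {
        data.extend_from_slice(format!("{{\"id\":{},\"active\":true}}\n", id).as_bytes());
        id += 1;
    }
    data.truncate(2 * quarter);

    // Natural-language text.
    let text = b"the quick brown fox jumps over the lazy dog ";
    data.extend(text.iter().cycle().take(quarter));

    // Pseudo-random bytes from a fixed-seed LCG, so every run sees the same input.
    let mut state: u64 = 0x2545_F491_4F6C_DD1D;
    while data.len() < len {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        data.push((state >> 56) as u8);
    }
    data
}

/// Returns true when decompressing the compressed input reproduces the input.
fn round_trips<C, D>(input: &[u8], compress: C, decompress: D) -> bool
where
    C: Fn(&[u8]) -> Vec<u8>,
    D: Fn(&[u8]) -> Vec<u8>,
{
    decompress(&compress(input)).as_slice() == input
}

/// Toy run-length encoder (a stand-in codec): emits (run length, byte) pairs.
fn rle_encode(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Inverse of `rle_encode`: expands each (run length, byte) pair.
fn rle_decode(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in input.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}
```

Calling `round_trips(&mixed_data(1 << 20), rle_encode, rle_decode)` exercises the same check on about 1MB of input. A fixed seed keeps the input identical across runs, which matters when timings are compared between engines.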
Adds a benchmark testing HTML parsing and rewriting using lol_html, a streaming HTML rewriter developed by Cloudflare. The benchmark processes ~114KB of realistic HTML, performing multiple transformations: adding CSS classes, modifying links, injecting scripts, and adding security attributes. lol_html is used in production by Cloudflare Workers for edge HTML processing and represents real-world HTML transformation workloads.
Creates a new test suite that includes all benchmarks written in Rust, both existing and newly added. Also adds new tests to all.suite.
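As a rough sketch of how a suite file with commented-out entries might be read: the function below returns the listed benchmark paths, skipping blank lines and lines that start with `#`. The name `parse_suite` and the exact skipping rules are assumptions for illustration, not the runner's actual implementation.

```rust
/// Returns the benchmark paths listed in a suite file's contents.
/// Assumption: blank lines and lines whose first non-space character
/// is `#` are treated as comments and skipped.
fn parse_suite(contents: &str) -> Vec<&str> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .collect()
}
```

Under these assumptions, a suite containing `# image-classification/image-classification-benchmark.wasm` followed by `rust-json/benchmark.wasm` yields only the second path.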
The tract-onnx test is pure guest execution of a model using tract. The image-classification benchmark has host dependencies on OpenVINO and does not have a wasm binary checked into the tree at this time.
ba16cf2 to bc57641
These commits add several new benchmarks implemented in Rust: JSON serialization and deserialization, Protocol Buffers, compression, and HTML rewriting.
These are intended to provide highly consistent measurements of workloads that closely reflect some of the heaviest CPU usage in a typical serverless compute use case. Portions of the contents were generated with the help of an LLM, given the fairly mechanical nature of the task.