Conversation

@srnnkls (Contributor) commented Oct 26, 2025

No description provided.

This commit adds extensive benchmark tests using pytest-benchmark to measure
the performance overhead introduced by SerializationProxy across various
operations:

- Proxy creation overhead for BaseModel, dataclass, and nested structures
- Attribute access overhead (single and nested)
- Iteration overhead for collections
- Serialization operations via proxy's built-in serializer
- Custom field serializer overhead
- Memory access patterns (repeated vs different attributes)
- String representation (__str__ and __repr__)
- End-to-end workflow scenarios

Each benchmark includes a corresponding baseline test (direct operations
without proxy) to measure the actual overhead introduced by the proxy layer.
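For context, a minimal sketch of that pairing pattern, assuming the `SerializationProxy.build` entry point used elsewhere in this PR (its import path depends on the package layout) and a `simple_model` fixture like the ones in the test file; the `benchmark` fixture is provided by pytest-benchmark:

```python
import pytest
from pydantic import BaseModel

# from <package> import SerializationProxy  # import path depends on the repo layout


class SimpleModel(BaseModel):
    id: int
    name: str


@pytest.fixture
def simple_model() -> SimpleModel:
    return SimpleModel(id=1, name="example")


def test_benchmark_proxy_attribute_access(benchmark, simple_model):
    proxy = SerializationProxy.build(simple_model)
    benchmark(lambda: proxy.name)  # timed: access through the proxy layer


def test_benchmark_direct_attribute_access(benchmark, simple_model):
    benchmark(lambda: simple_model.name)  # baseline: direct attribute access
```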

The benchmarks cover:
- Simple and nested BaseModel instances
- Simple and nested dataclass instances
- Models with custom field serializers
- Dictionary and list comparisons

Dependencies:
- Added pytest-benchmark>=5.1.0 to dev dependencies

All 33 benchmark tests pass successfully.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@gemini-code-assist

Summary of Changes

Hello @srnnkls, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the project's testing infrastructure by adding a comprehensive set of benchmark tests for the SerializationProxy. These tests are designed to quantify the performance impact of using the proxy layer for operations like object creation, attribute access, iteration, and serialization, providing valuable insights into its efficiency. The changes also include necessary dependency updates and a slight adjustment to the supported Python versions.

Highlights

  • Comprehensive Benchmarking: Introduced a new suite of benchmark tests to thoroughly measure the performance overhead of SerializationProxy operations across various use cases.
  • New Dependency: pytest-benchmark: Added pytest-benchmark to the development dependencies to facilitate the new performance tests, along with its dependency py-cpuinfo.
  • Python Version Compatibility: Adjusted the minimum required Python version from >=3.12 to >=3.11 in the project configuration.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a comprehensive suite of benchmark tests for the SerializationProxy to measure its overhead. The overall structure is well-organized, with clear baseline comparisons for various operations. I've identified a couple of critical issues in the benchmark setup where TypeAdapter instantiation was included in the measurement, which would skew the results. I've provided suggestions to correct this for more accurate benchmarking. Additionally, I've pointed out a couple of minor code quality improvements.

Comment on lines +143 to +151
```python
def test_benchmark_build_vs_typeadapter_dump(self, benchmark, simple_model):
    """Compare proxy build time with direct TypeAdapter.dump_python."""

    def direct_serialize():
        adapter = TypeAdapter(type(simple_model))
        return adapter.dump_python(simple_model)

    # This benchmark measures the baseline serialization time
    benchmark(direct_serialize)
```


Severity: high

Creating the TypeAdapter inside the direct_serialize function includes its instantiation overhead in the benchmark, which can skew the results. To get a more accurate baseline for serialization, the adapter should be created once outside the benchmarked function. The test can also be simplified by passing the adapter.dump_python method directly to the benchmark.

Suggested change:

```diff
 def test_benchmark_build_vs_typeadapter_dump(self, benchmark, simple_model):
     """Compare proxy build time with direct TypeAdapter.dump_python."""
-
-    def direct_serialize():
-        adapter = TypeAdapter(type(simple_model))
-        return adapter.dump_python(simple_model)
-
+    adapter = TypeAdapter(type(simple_model))
     # This benchmark measures the baseline serialization time
-    benchmark(direct_serialize)
+    benchmark(adapter.dump_python, simple_model)
```
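An alternative to hoisting setup out of the measured callable is pytest-benchmark's pedantic mode, which keeps one-time setup outside the timed region while giving explicit control over rounds; a sketch under the same fixtures:

```python
from pydantic import TypeAdapter


def test_benchmark_typeadapter_dump_pedantic(benchmark, simple_model):
    # Built once here; only adapter.dump_python is timed below.
    adapter = TypeAdapter(type(simple_model))
    benchmark.pedantic(
        adapter.dump_python,
        args=(simple_model,),
        rounds=100,
        iterations=10,
    )
```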

Comment on lines +423 to +441
```python
def test_benchmark_direct_complete_workflow(self, benchmark, nested_model):
    """Benchmark complete workflow without proxy (baseline)."""

    def complete_workflow():
        # Direct access
        _ = nested_model.id
        _ = nested_model.data.name
        items_len = len(nested_model.items)
        # Iterate
        count = 0
        for item in nested_model.items:
            count += 1
            if count >= 5:  # Just sample a few to test iteration
                break
        # Serialize
        adapter = TypeAdapter(NestedModel)
        return adapter.dump_python(nested_model)

    benchmark(complete_workflow)
```


Severity: high

The TypeAdapter is created inside the complete_workflow function, which is benchmarked. This includes the adapter's creation overhead in the measurement, which is not ideal for a baseline comparison. The adapter should be created once, outside the function passed to benchmark.

Suggested change:

```diff
 def test_benchmark_direct_complete_workflow(self, benchmark, nested_model):
     """Benchmark complete workflow without proxy (baseline)."""
+    adapter = TypeAdapter(NestedModel)
 
     def complete_workflow():
         # Direct access
         _ = nested_model.id
         _ = nested_model.data.name
         items_len = len(nested_model.items)
         # Iterate
         count = 0
         for item in nested_model.items:
             count += 1
             if count >= 5:  # Just sample a few to test iteration
                 break
         # Serialize
-        adapter = TypeAdapter(NestedModel)
         return adapter.dump_python(nested_model)
 
     benchmark(complete_workflow)
```
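One way to make each proxy/baseline pair easy to read side by side in the report is to put the pair in the same pytest-benchmark group; a sketch, with the group name chosen for illustration:

```python
import pytest


# Benchmarks that share a group are reported together in one comparison table.
@pytest.mark.benchmark(group="complete-workflow")
def test_benchmark_proxy_complete_workflow(benchmark, nested_model):
    ...


@pytest.mark.benchmark(group="complete-workflow")
def test_benchmark_direct_complete_workflow(benchmark, nested_model):
    ...
```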

"""

from dataclasses import dataclass
from typing import Any


Severity: medium

The Any type is imported but not used in this file. It's good practice to remove unused imports to maintain code cleanliness.
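The fix is dropping the unused name from the import block; a sketch of the cleaned-up line:

```python
from dataclasses import dataclass  # `from typing import Any` removed: it was unused
```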

```python
proxy = SerializationProxy.build(model_with_serializer)

def access_and_serialize():
    _ = proxy.name  # Should apply the serializer
```


Severity: medium

The comment # Should apply the serializer is likely incorrect. Attribute access on the proxy does not trigger Pydantic's @field_serializer. The serializer is applied during the actual serialization process, which is correctly benchmarked by the proxy.__pydantic_serializer__.to_python(proxy) call. This line is unnecessary for the benchmark and the comment is misleading, so it can be removed.
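A standalone illustration of that behavior (model and field names invented for the example):

```python
from pydantic import BaseModel, field_serializer


class User(BaseModel):
    name: str

    # field_serializer runs during serialization, not on attribute access.
    @field_serializer("name")
    def serialize_name(self, value: str) -> str:
        return value.upper()


user = User(name="ada")
print(user.name)          # "ada"  -- attribute access bypasses the serializer
print(user.model_dump())  # {'name': 'ADA'}  -- serializer applied here
```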

@srnnkls srnnkls changed the title Add comprehensive benchmark tests for SerializationProxy overhead Add benches for SerializationProxy overhead Oct 26, 2025