Refactor transpiler into modular architecture with separated concerns#54

Merged
sudo-owen merged 9 commits into main from claude/refactor-sol2ts-transpiler-Yp1ez
Feb 2, 2026
Conversation

@sudo-owen
Collaborator

Summary

This PR refactors the Solidity-to-TypeScript transpiler from a single 6,065-line sol2ts.py into a modular architecture with separate packages for lexing, parsing, type handling, and code generation. Backward compatibility is maintained through re-exports from sol2ts.py.

Key Changes

  • New modular structure: Created dedicated subpackages for different concerns:

    • lexer/: Tokenization and lexical analysis
    • parser/: AST nodes and parsing logic
    • types/ (renamed to type_system/ later in this PR): Type registry and type conversion utilities
    • codegen/: Code generation with specialized generators
  • Code generation refactoring: Split monolithic code generation into focused modules:

    • base.py: Shared utilities and expression analysis
    • context.py: Centralized state management via CodeGenerationContext
    • abi.py: ABI type inference for encoding operations
    • type_converter.py: Type conversion logic
    • expression.py, statement.py, function.py: Domain-specific generators
    • definition.py, imports.py, contract.py: Higher-level constructs
    • generator.py: Main orchestrator
  • State consolidation: Moved scattered instance variables into the CodeGenerationContext dataclass, which now covers:

    • Indentation management
    • File and contract context tracking
    • Variable type tracking
    • Import and reference tracking
    • Type knowledge caching
  • Backward compatibility: Maintained sol2ts.py as a compatibility layer that re-exports all classes, ensuring existing code continues to work without changes

  • Enhanced documentation: Added comprehensive module docstrings explaining the purpose and structure of each component

Implementation Details

  • CodeGenerationContext uses @dataclass for clean state management with sensible defaults
  • Qualified name caching in context improves performance for repeated type lookups
  • BaseGenerator provides common utilities to all specialized generators, reducing code duplication
  • AbiTypeInferer handles complex type inference for abi.encode, abi.encodePacked, etc.
  • All generators inherit from BaseGenerator and access shared state through the context
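
To make the context and base-class arrangement above concrete, here is a minimal sketch. It is not the code from this PR: every field name, the `qualified_name` rule, and the `declare` helper are assumptions chosen for illustration. Only the class names CodeGenerationContext, BaseGenerator, and StatementGenerator come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class CodeGenerationContext:
    """Shared generator state. All field names here are hypothetical."""
    indent_level: int = 0
    indent_str: str = "  "
    current_contract: Optional[str] = None
    var_types: Dict[str, str] = field(default_factory=dict)
    imports: Set[str] = field(default_factory=set)
    # Cache keyed by (contract, short name) so entries stay valid
    # when the current contract changes.
    qualified_name_cache: Dict[Tuple[Optional[str], str], str] = field(
        default_factory=dict
    )

    def indent(self) -> str:
        return self.indent_str * self.indent_level

    def qualified_name(self, name: str) -> str:
        key = (self.current_contract, name)
        if key not in self.qualified_name_cache:
            # Assumed rule: prefix with the current contract when one is set.
            prefix = f"{self.current_contract}." if self.current_contract else ""
            self.qualified_name_cache[key] = prefix + name
        return self.qualified_name_cache[key]


class BaseGenerator:
    """Utilities shared by every specialized generator."""

    def __init__(self, ctx: CodeGenerationContext) -> None:
        self.ctx = ctx

    def line(self, text: str) -> str:
        return f"{self.ctx.indent()}{text}"


class StatementGenerator(BaseGenerator):
    def declare(self, name: str, ts_type: str, value: str) -> str:
        # Record the type in the shared context so other generators can see it.
        self.ctx.var_types[name] = ts_type
        return self.line(f"let {name}: {ts_type} = {value};")
```

Because each generator holds a reference to the same context object, a type recorded by one generator is visible to the others without any explicit passing.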

Benefits

  • Maintainability: Each module has a single, clear responsibility
  • Testability: Smaller, focused classes are easier to unit test
  • Extensibility: New generators can be added without modifying existing code
  • Performance: Centralized caching and context reduce redundant lookups
  • Clarity: Code organization reflects the logical flow of transpilation

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

Split the monolithic 6,065-line sol2ts.py into a modular structure:

- transpiler/lexer/ - Tokenization (tokens.py, lexer.py)
  - TokenType enum, Token dataclass
  - Keyword/operator mappings
  - Lexer class

- transpiler/parser/ - AST and parsing (ast_nodes.py, parser.py)
  - All AST node dataclasses (SourceUnit, ContractDefinition, etc.)
  - Recursive descent parser

- transpiler/types/ - Type system (registry.py, mappings.py)
  - TypeRegistry for cross-file type discovery
  - Solidity-to-TypeScript type mappings
  - Default value helpers

- transpiler/codegen/ - Code generation helpers (yul.py, abi.py, context.py)
  - YulTranspiler for inline assembly
  - AbiTypeInferer for ABI encoding
  - CodeGenerationContext for state management

The original sol2ts.py is preserved for backward compatibility.
New code can import from the modules directly:
  from transpiler.lexer import Lexer
  from transpiler.parser import Parser
  from transpiler.types import TypeRegistry

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

Decompose the monolithic TypeScriptCodeGenerator into focused generator classes:
- BaseGenerator: Shared utilities for all generators
- TypeConverter: Solidity to TypeScript type conversions
- ExpressionGenerator: Expression AST code generation
- StatementGenerator: Statement AST code generation
- FunctionGenerator: Function/constructor generation
- DefinitionGenerator: Struct/enum/constant generation
- ImportGenerator: Import statement generation
- ContractGenerator: Contract class generation
- generator.py: Main orchestrator coordinating all generators

Benefits:
- Single responsibility for each generator class
- Easier testing and maintenance of individual components
- Cleaner separation of concerns
- Total 116 Solidity files transpiled successfully in testing
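
One way an orchestrator can coordinate specialized generators is a dispatch table keyed on AST node type. The sketch below is an assumption about the shape of that coordination, not the contents of generator.py: the node classes are stand-ins, the `Orchestrator` name is hypothetical, and emitting a struct as a TypeScript interface is an illustrative choice.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EnumDefinition:  # stand-in AST node, not the project's class
    name: str
    members: List[str]


@dataclass
class StructDefinition:  # stand-in AST node, not the project's class
    name: str
    fields: List[Tuple[str, str]]  # (field name, TypeScript type)


class DefinitionGenerator:
    def generate_enum(self, node: EnumDefinition) -> str:
        body = ", ".join(node.members)
        return f"export enum {node.name} {{ {body} }}"

    def generate_struct(self, node: StructDefinition) -> str:
        body = " ".join(f"{name}: {ts_type};" for name, ts_type in node.fields)
        return f"export interface {node.name} {{ {body} }}"


class Orchestrator:
    """Hypothetical coordinator: routes each node to the generator that owns it."""

    def __init__(self) -> None:
        definitions = DefinitionGenerator()
        self._handlers = {
            EnumDefinition: definitions.generate_enum,
            StructDefinition: definitions.generate_struct,
        }

    def generate(self, nodes) -> str:
        out = []
        for node in nodes:
            handler = self._handlers.get(type(node))
            if handler is None:
                raise TypeError(f"no generator for {type(node).__name__}")
            out.append(handler(node))
        return "\n".join(out)
```

With this layout, supporting a new node type means adding a generator method and one table entry, which is the extensibility benefit the list above refers to.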

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

…ctoring

Changes:
- Add current_method_return_types tracking in CodeGenerationContext
- Populate method return types in ContractGenerator._setup_contract_context
- Fix ExpressionGenerator._infer_single_abi_type to lookup method return types
- Update test_transpiler.py to use modular imports
- Replace old 6,065-line sol2ts.py with new 264-line modular version
- Update transpiler/__init__.py exports for new module structure

All 8 unit tests pass and 111 Solidity files transpile successfully.
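
The return-type fix can be pictured roughly as follows. This is a simplified sketch under stated assumptions: expressions are plain strings rather than AST nodes, the `uint256` fallback is a guess, and the free functions only mirror the names `_setup_contract_context` and `_infer_single_abi_type` from the list above rather than reproducing them.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Context:
    # Field name taken from the commit message; the rest is illustrative.
    current_method_return_types: Dict[str, str] = field(default_factory=dict)
    var_types: Dict[str, str] = field(default_factory=dict)


def setup_contract_context(ctx: Context, methods: Dict[str, str]) -> None:
    """Record method name -> Solidity return type for the current contract."""
    ctx.current_method_return_types = dict(methods)


def infer_single_abi_type(ctx: Context, expr: str) -> str:
    """Infer the ABI type of a bare identifier or a call like 'name(...)'."""
    if "(" in expr and expr.endswith(")"):
        method = expr.split("(", 1)[0]
        # Assumed fallback when the method is unknown.
        return ctx.current_method_return_types.get(method, "uint256")
    return ctx.var_types.get(expr, "uint256")
```

Without the return-type table, a call such as `getId()` inside `abi.encode(...)` would fall through to the default type, which is the kind of mismatch this commit addresses.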

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

Generated TypeScript output should not be committed.

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

- Create MetadataExtractor to collect contract info from ASTs
- Create FactoryGenerator to produce factories.ts with container registrations
- Integrate metadata generation with --emit-metadata flag
- Auto-generate interface aliases (IEngine -> Engine, etc.)
- Auto-generate lazy singletons with constructor dependencies

All 38 TypeScript tests now pass (previously 31 passed, 1 suite failed).
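
The alias and dependency steps might look something like the sketch below. It assumes the `I` + implementation-name convention implied by "IEngine -> Engine" and represents constructor dependencies as plain lists of type names. Both function names are hypothetical, and the format of the generated factories.ts is deliberately left out because the source does not show it.

```python
import re
from typing import Dict, List


def interface_aliases(contracts: List[str], interfaces: List[str]) -> Dict[str, str]:
    """Map each interface to its implementation when one exists.

    Assumed convention: interface "IFoo" is implemented by contract "Foo".
    """
    known = set(contracts)
    aliases = {}
    for iface in interfaces:
        # Require "I" followed by an uppercase letter so that names like
        # "Iterator" are not treated as interfaces.
        if re.match(r"I[A-Z]", iface) and iface[1:] in known:
            aliases[iface] = iface[1:]
    return aliases


def resolve_dependencies(
    ctor_params: Dict[str, List[str]], aliases: Dict[str, str]
) -> Dict[str, List[str]]:
    """Replace interface-typed constructor parameters with their implementations."""
    return {
        name: [aliases.get(param, param) for param in params]
        for name, params in ctor_params.items()
    }
```

A factory generator could then emit one lazy singleton per contract, requesting the resolved dependency names from the container when the singleton is first used.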

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

…istry

Both modules now import successfully with the runtime replacements
for EIP712 and Ownable. All 40 tests pass.

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

- Document new module structure (lexer/, parser/, types/, codegen/)
- Document code generator architecture with specialized generators
- Update test counts (40 TypeScript tests, 8 Python tests)
- Document factories.ts generation with --emit-metadata
- Preserve clear, no-frills documentation style

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

Add utility methods:
- _parse_binary_op(): Generic left-associative binary operator parsing
- parse_comma_separated(): Parse comma-separated lists with end token
- parse_storage_location(): Parse storage/memory/calldata
- skip_balanced(): Skip balanced bracket pairs (parens, braces)

Apply to simplify:
- 10 binary operator methods now use _parse_binary_op helper
- Event/error/function parameters use parse_comma_separated
- State variable/function attributes use token->value dictionaries
- Function skip, try/catch, base contracts use skip_balanced

All 40 TypeScript tests and Python unit tests pass.
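
For readers unfamiliar with the pattern, the helpers named above can be sketched on a toy token list. The method names match the commit message, but the signatures, the `MiniParser` class, and the string-based tokens are assumptions. The real parser works on Token objects and may differ in detail.

```python
from typing import Callable, List, Optional, Sequence, Tuple, Union

Node = Union[str, Tuple[str, "Node", "Node"]]


class MiniParser:
    """Toy parser over string tokens, for illustration only."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> str:
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def _parse_binary_op(
        self, operand: Callable[[], Node], operators: Sequence[str]
    ) -> Node:
        # Left-associative: fold each new operand into the tree built so far.
        left = operand()
        while self.peek() in operators:
            op = self.advance()
            left = (op, left, operand())
        return left

    def parse_primary(self) -> Node:
        return self.advance()

    def parse_multiplicative(self) -> Node:
        return self._parse_binary_op(self.parse_primary, ("*", "/", "%"))

    def parse_additive(self) -> Node:
        return self._parse_binary_op(self.parse_multiplicative, ("+", "-"))

    def parse_comma_separated(self, parse_item: Callable[[], Node], end_token: str) -> list:
        items = []
        while self.peek() != end_token:
            items.append(parse_item())  # raises SyntaxError if input runs out
            if self.peek() == ",":
                self.advance()
        self.advance()  # consume the end token
        return items

    def skip_balanced(self, open_tok: str, close_tok: str) -> None:
        # Assumes the current token is open_tok; stops after its matching close.
        depth = 0
        while True:
            tok = self.advance()
            if tok == open_tok:
                depth += 1
            elif tok == close_tok:
                depth -= 1
                if depth == 0:
                    return
```

Each precedence level becomes a one-line method that passes the next-tighter level as the operand parser, which is how ten near-identical binary operator methods can share one helper.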

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw

- expression.py: Extract _resolve_abi_base_type() and _infer_expression_type()
  to share logic between ABI type inference methods (~80 lines reduced)
- statement.py: Extract _generate_body_statements() for loop body generation
- function.py: Extract _get_visibility_modifier() and _get_static_modifier()
- Rename types/ to type_system/ to avoid shadowing the types module in Python's standard library

All 40 TypeScript tests and 8 Python unit tests pass.

https://claude.ai/code/session_01GjrjyxhRAhckrhyJGogaQw
@sudo-owen sudo-owen merged commit d480314 into main Feb 2, 2026
1 check passed
@sudo-owen sudo-owen deleted the claude/refactor-sol2ts-transpiler-Yp1ez branch February 2, 2026 16:55
2 participants