Skip to content

Conversation

@julik
Copy link
Owner

@julik julik commented Dec 2, 2025

We have a number of things which may only occur on state transitions or in certain states - where the states are defined by the section of the ZIP file we have just output. It is better - for rollbacks, but also for general sanity - to control this via a state machine instead of flags.


Note

Introduce a state machine to track ZIP output state in ZipKit::Streamer, enforce valid call sequences (with new InvalidState errors), and refine rollback behavior; add comprehensive tests.

  • Core:
    • Add ZipKit::Streamer::StateMachine to track states (initial, local_header, entry_body, data_descriptors, central_directory, end_of_central_directory) and validate transitions.
    • Expose state_machine on ZipKit::Streamer; raise InvalidState on invalid operations (e.g., adding entries after close, re-closing).
    • Integrate state transitions in add_file_and_write_local_header, update_last_entry_and_write_data_descriptor, and close (begin central directory, finalize).
    • Add guards raise_if_cannot_begin_entry! and raise_if_closed! to enforce sequencing.
    • Rework rollback! to consult state machine, remove failed entry, rebuild @path_set, and add Filler based on bytes written.
  • Tests:
    • New spec/zip_kit/streamer/state_machine_spec.rb covering transitions, queries, rollback, and finalization.
    • Extend spec/zip_kit/streamer_spec.rb to assert sequencing errors, state tracking through heuristics, rollback behavior, and valid multi-entry flows.

Written by Cursor Bugbot for commit ebb1940. This will update automatically on new commits. Configure here.

We have a number of things which may only occur on state transitions or in certain states - where the states are defined by the section of the ZIP file we have just output. It is better - for rollbacks, but also for general sanity - to control this via a state machine instead of flags.
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the final PR Bugbot will review for you during this billing cycle

Your free Bugbot reviews will reset on January 13

Details

You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.

To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

@current_entry = nil

context
end
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Double rollback causes TypeError due to nil entry_offset

After a rollback, the state machine sets @state to ENTRY_BODY and @entry_offset to nil. If rollback! is called again before starting a new entry, the Streamer's state check (state == StateMachine::ENTRY_BODY) passes, triggering @state_machine.rollback. Inside the state machine's rollback method, the computation bytes_written: current_offset - @entry_offset attempts to subtract nil, causing a TypeError. The old code protected against this with an if @offset_before_last_local_file_header guard.

Additional Locations (1)

Fix in Cursor Fix in Web

@remove_last_file_at_rollback = false

# Heuristic buffers data, then calls write_stored_file or write_deflated_file
# State only changes when actual ZIP data is written to the stream
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Heuristic failure rolls back wrong entry

When write_file (using Heuristic) fails before deciding whether to use stored or deflated compression, the rollback! call incorrectly removes the previous entry from the archive. The old code reset @offset_before_last_local_file_header and @remove_last_file_at_rollback at the start of write_file to prevent stale values from affecting rollback. This protection was removed but the state machine still holds current_entry and entry_offset from a previous entry when Heuristic hasn't yet written any ZIP data. This causes data loss when a Heuristic failure occurs after another entry was started but before Heuristic called write_stored_file or write_deflated_file.

Additional Locations (1)

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants