fix/json unicode offsets #2659

Draft
jurgenvinju wants to merge 5 commits into main from fix/json-unicode-offsets

@jurgenvinju (Member) commented Feb 18, 2026

This is a work in progress and not urgent. I'm testing a first streaming solution to see whether it works and how fast it still is. We keep a shadow buffer of int[1024] offsets that maps each character offset (an index into the array) to its code point offset. From that we can derive the start-of-buffer offset and the position of every character in the current buffer.

Note that characters encoded as surrogate pairs shift the offsets wherever they occur: in comments just as much as in string constants and names.

  • start-of-buffer offset compensated for the presence of surrogate pairs
  • pos index into the buffer compensated for surrogate pairs
  • current experiment throws NullPointerExceptions; needs a fix
  • column position (same compensation)
  • line position (same compensation)

When the above is done, the origin locations and the error positions will be
correct in the presence of surrogate pairs.
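
A minimal sketch of the shadow-buffer idea described above, not the code in this branch. The class `ShadowOffsets` and its members `refill`, `offsetOf`, `startOfBuffer` and `splitPair` are hypothetical names chosen for illustration, and the handling of a surrogate pair split across two buffer refills is an assumption about how a streaming reader might deal with that case.

```java
/** Illustrative only: maps char indices in a streaming buffer to code point offsets. */
final class ShadowOffsets {
    // Same size as the int[1024] mentioned in the description.
    private final int[] offsets = new int[1024];
    // Number of code points that started in earlier buffers (hypothetical field).
    private int startOfBuffer = 0;
    // Number of code points that started in the current buffer.
    private int codePointsInBuffer = 0;
    // True if the previous buffer ended on a high surrogate whose low half is still to come.
    private boolean splitPair = false;

    /** Recomputes the shadow array for a freshly filled buffer; limit must be at most 1024. */
    void refill(char[] buffer, int limit) {
        startOfBuffer += codePointsInBuffer;
        int count = 0;
        for (int i = 0; i < limit; i++) {
            boolean lowHalf = Character.isLowSurrogate(buffer[i])
                && (i > 0 ? Character.isHighSurrogate(buffer[i - 1]) : splitPair);
            if (lowHalf) {
                // Second half of a pair: it shares the offset of the char before it.
                // For i == 0 this yields -1, which points back into the previous buffer.
                offsets[i] = count - 1;
            } else {
                offsets[i] = count++;
            }
        }
        codePointsInBuffer = count;
        if (limit > 0) {
            splitPair = Character.isHighSurrogate(buffer[limit - 1]);
        }
    }

    /** Absolute code point offset of the char at index pos in the current buffer. */
    int offsetOf(int pos) {
        return startOfBuffer + offsets[pos];
    }

    public static void main(String[] args) {
        ShadowOffsets shadow = new ShadowOffsets();
        // 'a', then U+1F600 as a surrogate pair, then 'b': four chars, three code points.
        char[] first = "a\uD83D\uDE00b".toCharArray();
        shadow.refill(first, first.length);
        for (int i = 0; i < first.length; i++) {
            System.out.println("char " + i + " -> code point " + shadow.offsetOf(i));
        }
    }
}
```

Both halves of a surrogate pair map to the same code point offset, so a position reported at either char index resolves to the same character. Column positions could be derived the same way by resetting a per-line counter at each newline, though that part is not shown here.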

codecov bot commented Feb 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46%. Comparing base (57d8bcd) to head (505e233).

Additional details and impacted files
@@           Coverage Diff           @@
##              main   #2659   +/-   ##
=======================================
- Coverage       46%     46%   -1%     
+ Complexity    6677    6670    -7     
=======================================
  Files          795     795           
  Lines        65899   65902    +3     
  Branches      9878    9880    +2     
=======================================
- Hits         30709   30706    -3     
- Misses       32806   32818   +12     
+ Partials      2384    2378    -6     

jurgenvinju self-assigned this on Feb 19, 2026