-
-
Notifications
You must be signed in to change notification settings - Fork 200
(POC) feat: Sentry native crash backend #1433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
mujacica
commented
Oct 29, 2025
- Sentry native crash backend
- Out-of-process daemon/handler
- MacOS/Android/Windows/Linux support
- Integration with external crash reporter
- IPC/SHM/Signaling multi-platform implementation
- Minidump writers for all platforms
- In-process option (process_crash function)
- Different options for Minidump sizes
- Full sentry-codebase integration
- Sentry logger integration
- Sentry debug-flags integration
|
@sentry review |
1 similar comment
|
@sentry review |
Apply black formatting to assert statements. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Linux E2E tests show varying thread counts depending on mode, with native mode capturing only 1 thread. Relax the assertion to ensure at least 1 thread is captured. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
On Linux, the signal handler cannot safely enumerate modules (unlike macOS which has signal-safe dyld APIs). Add daemon-side module capture that reads /proc/<pid>/maps to populate debug_meta with module images and Build IDs for symbolication. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Debug symbols are not required for basic E2E testing - the tests verify crash events are received, not symbolication quality. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add (void)thread_idx to suppress C4100 warning on Windows where the parameter is only used in macOS-specific code paths. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The debug_id for ELF modules must be converted to little-endian GUID format for Sentry symbolication. Apply the same byte swapping as sentry_modulefinder_linux.c does. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
On Linux, the signal handler only captures the crashing thread because opendir/readdir aren't signal-safe. Add daemon-side thread enumeration from /proc/<pid>/task to capture all thread IDs for the native event. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add enable-logs and capture-log arguments to E2E tests so structured logs are captured and sent with crash events. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix module capture from /proc/maps to correctly handle PIE binaries: - Capture first mapping for each file regardless of permissions - Calculate base address by subtracting file offset from mapping start - This ensures modules are found even when base mapping is r-- not r-x - Fix ARM64 address validation in is_valid_code_addr(): - x86_64 user space is below 0x00007FFFFFFFFFFF - ARM64 user space can use addresses like 0xAAAA_xxxx with ASLR - Only reject kernel space addresses (>= 0xFFFF_0000_0000_0000) - Add Windows module and thread capture for native mode: - EnumProcessModules for module enumeration - CreateToolhelp32Snapshot for thread enumeration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add explicit casts when passing stack_size (uint64_t) to sentry_malloc and ReadProcessMemory on 32-bit Windows where size_t is 32-bit. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The default 20 attempts with 5 second intervals (100s total) was too short for some events to appear in Sentry. Increased to 100 attempts with 6 second intervals (600s total) for more reliable E2E tests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extract PE TimeDateStamp from module files on disk and include
code_id (timestamp + size) in debug_meta images. This helps Sentry
identify Windows PE modules even without full PDB debug info.
The code_id format is "{TimeDateStamp:08X}{SizeOfImage:x}" which
matches the Windows symbol server convention.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Start stack capture from minimum of SP and FP (FP might be below SP) - Add debug logging for SP, FP, and stack start values - Log GetLastError() when ReadProcessMemory or OpenProcess fails Note: Frame pointer-based unwinding on Windows x64/ARM64 is unreliable since compilers often don't use frame pointers. A proper fix would use StackWalk64 from dbghelp.dll, but this improves diagnostics for now. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace frame pointer-based stack walking with StackWalk64 from dbghelp.dll for reliable out-of-process stack unwinding on Windows x64/ARM64. Key changes: - Add walk_stack_with_dbghelp() function for out-of-process unwinding - Use custom ReadMemoryRoutine callback for cross-process memory access - Initialize dbghelp with SymInitialize() for proper PE unwind info access - Works like the inproc backend but for the crashed process This fixes the issue where Windows crash events only had 1 frame because frame pointer-based unwinding doesn't work reliably on x64/ARM64 where compilers often optimize away frame pointers. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move dbghelp.h include and forward declaration to top of file so the function is declared before it's called in build_stacktrace_for_thread. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…hreads - Add enrich_frame_with_module_info() to set package and image_addr on frames - Update Linux/Windows thread loops to build stacktraces for all threads - Use thread-specific context for Windows StackWalk64 calls - Fix macOS stack file deletion timing to preserve files for native stacktrace Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Return null stacktrace when frames array is empty (Sentry rejects empty frames) - Only add stacktrace to thread if it contains frames - Use thread-specific ucontext_t for Linux register extraction Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove unused thread_id variable from register extraction section (the StackWalk64 section already has its own walk_thread_id). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The module capture from /proc/pid/maps was only using the size of the first memory mapping for each module. This caused frame-to-module matching to fail for addresses in later segments (code, data) since those addresses fell outside the incorrectly small module bounds. For example, a PIE binary might have: - First mapping: 0x5500-0x5510 (rodata, offset=0) → size=0x10 - Code mapping: 0x5510-0x5590 (code, offset=0x10) - Data mapping: 0x5590-0x55a0 (data, offset=0x90) Previously we'd set size=0x10, so code addresses like 0x5540 would not match the module. Now we track all mappings and extend the size to cover from base_address to the maximum end address of all segments. This fixes the issue where Linux native backend crashes showed <unknown> for module names in Sentry, while Windows and macOS worked correctly (they use platform APIs that return accurate sizes). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The mod->uuid field is uint8_t[] with 1-byte alignment, but GUID structure requires 4-byte alignment. Casting directly to GUID* could cause undefined behavior or incorrect reads on some platforms. Use memcpy to copy bytes into a properly aligned GUID struct before passing to sentry__uuid_from_native(). This fix ensures debug_id is correctly formatted, allowing Sentry to match frames to debug images properly. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add debug logging to crash daemon: - Log module count before adding debug_meta - Warn when no modules are captured - Log when frames can't be matched to any module Add diagnostic output to E2E tests: - Add get_debug_meta_from_event() helper function - Print debug_meta.images info (count, addresses, sizes) - Print frame info (instruction addresses, symbolication status) This helps diagnose why Windows shows "unknown problem" while Linux shows "missing debug files" - the difference indicates frames aren't being matched to debug images on Windows. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add debug logging to crash daemon: - Log module count before adding debug_meta - Warn when no modules are captured - Log when frames can't be matched to any module Add diagnostic output to E2E tests: - Add get_debug_meta_from_event() helper function - Print debug_meta.images info (count, addresses, sizes) - Print frame info (instruction addresses, symbolication status) This helps diagnose why Windows shows "unknown problem" while Linux shows "missing debug files" - the difference indicates frames aren't being matched to debug images on Windows. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
@sentry review |
The native backend on Windows was causing symbolicator to fail with "internal server error" because it was missing the debug_file field that minidump events include. This prevented proper symbolication. Changes: - Add pdb_name field to sentry_module_info_t struct - Extract PDB filename from PE CodeView debug directory - Set debug_file field on PE images in native crash events - Fix code_id format to use lowercase hex (matching minidump format) The debug_file field is essential for Sentry's symbolicator to know where to look for PDB files when symbolicating Windows crash events. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
d46d50a to
d38b5e1
Compare
The native backend on Windows was causing symbolicator to fail with "internal server error" because: 1. Missing debug_file field that tells symbolicator where to find PDBs 2. code_id format was wrong - timestamp needs 8-digit zero-padding Changes: - Add pdb_name field to sentry_module_info_t struct (Windows-only) - Extract PDB filename from PE CodeView debug directory - Set debug_file field on PE images in native crash events - Fix code_id format: use %08x%x (8-digit padded timestamp + size) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| uint64_t flags; | ||
| } PACKED_ATTR minidump_header_t; | ||
| PACKED_STRUCT_END | ||
| PACKED_STRUCT_END |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate structure packing macros cause incorrect MSVC behavior
Medium Severity
The PACKED_STRUCT_BEGIN and PACKED_STRUCT_END macros are duplicated for minidump_header_t, minidump_directory_t, and minidump_location_t. On MSVC, PACKED_STRUCT_BEGIN expands to __pragma(pack(push, 1)), so doubling it pushes the pack state twice onto the stack. While the corresponding doubled PACKED_STRUCT_END happens to balance this out, this is incorrect and error-prone macro usage that could cause subtle packing issues if the pattern becomes inconsistent.
Additional Locations (2)
| SENTRY_WARN("write_minidump: directory write failed"); | ||
| close(writer.fd); | ||
| return -1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing file cleanup on macOS fallback error paths
Medium Severity
When task_for_pid fails and the macOS minidump writer uses the fallback path (lines 1065-1162), all error paths call close(writer.fd) but never call unlink(output_path) to delete the partial/corrupt file. This leaves orphaned minidump files on disk. The Linux implementation and the macOS "full path" (via goto cleanup_error) correctly delete the file on error. This inconsistency causes disk space leakage and could lead to corrupt minidumps being mistakenly processed later.