Example 1 for GEMM (01_gemm_global_reg) crashes during kernel execution with the default problem size:
- M = 1024
- N = 1024
- K = 2048
- TileM = 256
- TileN = 128
- TileK = 64
System:
OS: Ubuntu 24.04.2
Device: RTX 3070 Mobile (SM86)
Driver: 560.35.03
CUDA Version: 12.6
Output:
kThreads: 128
RegA: RowMajor(16, 16)
RegB: ColMajor(16, 8)
RegC: RowMajor(16, 16)
IteratorA: numel = 524288, ChunkShape = (256, 64), stripe count = (1, 32)
IteratorB: numel = 262144, ChunkShape = (64, 128), stripe count = (32, 1)
blocks: [4, 8]
terminate called after throwing an instance of 'thrust::THRUST_200400_860_NS::system::system_error'
what(): trivial_device_copy D->H failed: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)
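As a sanity check on the logged values, here is a minimal host-side sketch of the tiling arithmetic. It assumes that `blocks` is `[M / TileM, N / TileN]`, that the stripe count is `K / TileK`, and that each iterator's `numel` covers the per-block slice of A (`TileM x K`) and B (`K x TileN`). That reading is inferred from the numbers in the output, not confirmed against the library source. `TilingInfo`, `ceil_div` and `derive_tiling` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of the library.
struct TilingInfo {
    std::int64_t blocks_m;   // thread blocks along M
    std::int64_t blocks_n;   // thread blocks along N
    std::int64_t k_stripes;  // K-dimension chunks each block iterates over
    std::int64_t numel_a;    // elements in one block's slice of A (TileM x K)
    std::int64_t numel_b;    // elements in one block's slice of B (K x TileN)
    bool evenly_divisible;   // true when no partial edge tiles exist
};

// Integer division rounded up.
constexpr std::int64_t ceil_div(std::int64_t a, std::int64_t b) {
    return (a + b - 1) / b;
}

// Assumed mapping from problem and tile sizes to the values printed in the log.
constexpr TilingInfo derive_tiling(std::int64_t M, std::int64_t N, std::int64_t K,
                                   std::int64_t TileM, std::int64_t TileN,
                                   std::int64_t TileK) {
    return TilingInfo{
        ceil_div(M, TileM),
        ceil_div(N, TileN),
        ceil_div(K, TileK),
        TileM * K,
        K * TileN,
        (M % TileM == 0) && (N % TileN == 0) && (K % TileK == 0)};
}
```

Calling `derive_tiling(1024, 1024, 2048, 256, 128, 64)` gives 4 x 8 blocks, 32 K-stripes, and slice sizes of 524288 and 262144 elements, which matches the output above. If this reading is right, every dimension divides evenly for the default sizes, so a partial edge tile on its own would not explain the illegal access.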