-
Notifications
You must be signed in to change notification settings - Fork 145
perf(radar): Reduce cost of W3DRadar::renderObjectList (by 80%), W3DRadar::buildTerrainTexture (by 25%), W3DRadar::clearShroud by locking radar surface only once #2138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…adar::buildTerrainTexture (by 25%), W3DRadar::clearShroud by locking radar surface only once
|
| Filename | Overview |
|---|---|
| Core/Libraries/Source/WWVegas/WW3D2/surfaceclass.h | Added Get_Bytes_Per_Pixel() method and updated Draw_Pixel, Draw_H_Line, and Get_Pixel signatures to accept pre-locked surface parameters for performance optimization |
| Core/Libraries/Source/WWVegas/WW3D2/surfaceclass.cpp | Refactored Draw_Pixel, Draw_H_Line, and Get_Pixel to work with pre-locked surfaces, replaced individual Lock/Unlock calls with direct memory access using memcpy, added Get_Bytes_Per_Pixel() implementation |
| Core/GameEngineDevice/Source/W3DDevice/Common/System/W3DRadar.cpp | Core optimization - locks radar surfaces once per function instead of per pixel in renderObjectList, buildTerrainTexture, clearShroud, setShroudLevel, and refreshObjects, achieving 80% and 25% performance improvements |
Sequence Diagram
sequenceDiagram
participant Caller as W3DRadar::renderObjectList
participant Surface as SurfaceClass
participant D3D as Direct3D Surface
Note over Caller,D3D: Before Optimization (Old Approach)
rect rgb(255, 200, 200)
loop For each radar object (N objects)
loop Draw 4 pixels per object
Caller->>Surface: DrawPixel(x, y, color)
Surface->>D3D: LockRect()
D3D-->>Surface: locked bits
Surface->>D3D: Write pixel
Surface->>D3D: UnlockRect()
end
end
Note over Caller,D3D: Total: 4N Lock/Unlock operations
end
Note over Caller,D3D: After Optimization (New Approach)
rect rgb(200, 255, 200)
Caller->>Surface: Lock(&pitch)
Surface->>D3D: LockRect()
D3D-->>Surface: pBits, pitch
Surface-->>Caller: pBits, pitch
Caller->>Surface: Get_Bytes_Per_Pixel()
Surface-->>Caller: bytesPerPixel
loop For each radar object (N objects)
loop Draw 4 pixels per object
Caller->>Surface: Draw_Pixel(x, y, color, bytesPerPixel, pBits, pitch)
Note right of Surface: Direct memory write<br/>using memcpy<br/>(no Lock/Unlock)
end
end
Caller->>Surface: Unlock()
Surface->>D3D: UnlockRect()
Note over Caller,D3D: Total: 1 Lock/Unlock operation
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 files reviewed, 2 comments
| for(x = 0; x< size.x; x++) | ||
| { | ||
| surface->DrawPixel( x, y, file.readInt() ); | ||
| surface->Draw_Pixel( x, y, file.readInt(), bytesPerPixel, pBits, pitch ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we planning on using this commented code? Keeping it for historical reference? Something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know. I just updated it for courtesy.
| Convert_Pixel(rgb,sd,(unsigned char *) lock_rect.pBits); | ||
| DX8_ErrorCode(D3DSurface->UnlockRect()); | ||
| unsigned int bytesPerPixel = ::Get_Bytes_Per_Pixel(sd.Format); | ||
| unsigned char* dst = static_cast<unsigned char *>(pBits) + y * pitch + x * bytesPerPixel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unused, so should be fine, but should we still clamp x and y?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because this function does not know size of pBits. It assumes caller makes no mistake. Also it would be bad for performance.
| { | ||
| surface->DrawPixel( x, y, file.readInt() ); | ||
| surface->Draw_Pixel( x, y, file.readInt(), bytesPerPixel, pBits, pitch ); | ||
| buffer[y + x] = file.readInt(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ok to call file.readInt() again here? (I know it's commented out, but in theory do we want to call it once and use the value twice?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also is [y + x] right? should it be buffer[y * size.x + x] ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually do we need the buffer at all? It gets filled but then setRawTextureData(buffer) is called without freeing it, which could leak, but it's not clear to me that we need it in the first place.
Int color = file.readInt();
surface->Draw_Pixel(x, y, color, ...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is commented, so do not care. Just touched for courtesy.
|
Code LGTM, good find! I left some nits about commented out code, but looks good to go |
bobtista
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This change reduces the cost of
W3DRadar::renderObjectListby around 80%, from 0.5 ms to 0.1 ms (tested with many radar objects)W3DRadar::buildTerrainTextureby around 25%, from 0.0127 ms to 0.0164 msW3DRadar::clearShroud- not measured because not importantThis is achieved by locking the radar surface only once instead of many times in loops.
W3DRadar::renderObjectListis called every 6 logic frames, and the jump from 0.5 ms to 0.1 ms is nice to improve overall performance (measured on modern machine).Functions
SurfaceClass::DrawPixelandSurfaceClass::DrawHLinewere renamed to better fit the WWVegas style.The
NULL's in W3DRadar were changed tonullptr, which was added by #2072.