Add Stats Level config value for use with stats counter collection.#207
Add Stats Level config value for use with stats counter collection.#207
Conversation
|
It's early I realize and @derobins is re-working our docs, but we probably should add something to our public facing docs on stats collection 'soon'. hipFile -> docs -> (something related to stats, perhaps) Also, other libraries have a standalone environment variables explainer page, we should add something too. Okay to capture that in another follow-on story/PR. Thanks. |
There was a problem hiding this comment.
Pull request overview
This PR adds a HIPFILE_STATS_LEVEL environment variable and a corresponding statsLevel() configuration method to allow controlling the granularity of stats collection (for ais-stats). The feature mirrors the existing HIPFILE_FORCE_COMPAT_MODE pattern.
Changes:
- Adds
Environment::get<int>specialization andstats_level()convenience function in the environment layer. - Adds
statsLevel()to theIConfigurationinterface and the concreteConfigurationclass. - Adds unit tests for the new
get<int>environment parsing and thestatsLevel()configuration getter.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/amd_detail/environment.h |
Declares STATS_LEVEL constant and stats_level() method |
src/amd_detail/environment.cpp |
Implements get<int> template specialization and stats_level() |
src/amd_detail/configuration.h |
Adds statsLevel() pure virtual to IConfiguration and declaration to Configuration |
src/amd_detail/configuration.cpp |
Initializes m_statsLevel from environment and implements statsLevel() |
test/amd_detail/environment.cpp |
Unit tests for the new get<int> specialization |
test/amd_detail/configuration.cpp |
Unit tests for statsLevel() via ConfigurationExpectationBuilder |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (end == nullptr) | ||
| return std::nullopt; |
There was a problem hiding this comment.
The check if (end == nullptr) on line 43 is dead code. According to the C standard, when strtol is called with a non-null second argument (&end), it always sets *endptr to a valid pointer — it will never be nullptr. Since end is initialized by strtol and is always set to a non-null pointer, this check can never be true and should be removed to avoid misleading future readers.
| return std::nullopt; | ||
| errno = 0; | ||
| char *end; | ||
| int x{static_cast<int>(std::strtol(value, &end, 10))}; |
There was a problem hiding this comment.
The strtol call converts the result with static_cast<int>(...). On platforms where long is larger than int (e.g., 64-bit Linux where long is 64 bits), a value like 2147483648 (INT_MAX + 1) would not trigger ERANGE (since it fits in long) but would silently overflow when cast to int. This means values outside [INT_MIN, INT_MAX] are silently accepted and produce an incorrect result. Consider using std::strtol and checking that the returned long value fits within [INT_MIN, INT_MAX] before the cast, returning nullopt otherwise. Alternatively, use std::strtoimax and clamp to int range.
| /// @brief Shows the level of detail for stats collection | ||
| /// @return 0 if stats collection disabled, higher levels of detail as value increases | ||
| virtual int statsLevel() const noexcept = 0; |
There was a problem hiding this comment.
Adding statsLevel() as a pure virtual method to IConfiguration makes MConfiguration in test/amd_detail/mconfiguration.h abstract and unusable, since that mock does not yet provide a MOCK_METHOD for statsLevel(). While MConfiguration is not currently used in any test .cpp file, it will fail to compile if it ever gets included. The MConfiguration mock should be updated to add MOCK_METHOD(int, statsLevel, (), (const, noexcept, override));.
| template <> | ||
| std::optional<int> | ||
| Environment::get<int>(const char *key) | ||
| { | ||
| const char *value{Context<Sys>::get()->getenv(key)}; | ||
| if (value == nullptr) | ||
| return std::nullopt; | ||
| errno = 0; | ||
| char *end; | ||
| int x{static_cast<int>(std::strtol(value, &end, 10))}; | ||
| if (errno != 0) | ||
| return std::nullopt; | ||
| if (end == nullptr) | ||
| return std::nullopt; | ||
| if (end == value) | ||
| return std::nullopt; | ||
| if (*end != '\0') | ||
| return std::nullopt; | ||
| return x; | ||
| } |
There was a problem hiding this comment.
The new Environment::get<int> specialization uses errno and std::strtol but the file only includes <strings.h> and project headers. Neither <cerrno> (for errno) nor <cstdlib> (for std::strtol) is explicitly included. While these may be pulled in transitively on some platforms via <strings.h> or other system headers, relying on transitive inclusion is fragile and non-portable. Other files in the codebase (e.g., src/amd_detail/sys.cpp and test/amd_detail/driver.cpp) explicitly include <cerrno> when they use errno. Both <cerrno> and <cstdlib> should be explicitly included.
Motivation
AIHIPFILE-53 It's possible as we add more counters to ais-stats that we may start to see a performance overhead to the data collection. This setting will be used to let us control exactly how much data we collect from a given process in the event very detailed stats gathering has a large performance impact.
Technical Details
Adds an environment variable(HIPFILE_STATS_LEVEL) and Configuration method(::statsLevel()) in the same vein as HIPFILE_FORCE_COMPAT_MODE.
Test Plan
Added Environment and Configuration unit tests for the new functions.
Test Result
All unit tests pass.
Submission Checklist