Skip to content

Add Stats Level config value for use with stats counter collection.#207

Open
zbyrne wants to merge 2 commits intodevelopfrom
zbyrne/stats-config
Open

Add Stats Level config value for use with stats counter collection.#207
zbyrne wants to merge 2 commits intodevelopfrom
zbyrne/stats-config

Conversation

@zbyrne
Copy link
Collaborator

@zbyrne zbyrne commented Mar 3, 2026

Motivation

AIHIPFILE-53 It's possible as we add more counters to ais-stats that we may start to see a performance overhead to the data collection. This setting will be used to let us control exactly how much data we collect from a given process in the event very detailed stats gathering has a large performance impact.

Technical Details

Adds an environment variable(HIPFILE_STATS_LEVEL) and Configuration method(::statsLevel()) in the same vein as HIPFILE_FORCE_COMPAT_MODE.

Test Plan

Added Environment and Configuration unit tests for the new functions.

Test Result

All unit tests pass.

Submission Checklist

@zbyrne zbyrne marked this pull request as ready for review March 3, 2026 21:43
@gaoikawa
Copy link
Collaborator

gaoikawa commented Mar 3, 2026

It's early I realize and @derobins is re-working our docs, but we probably should add something to our public facing docs on stats collection 'soon'.

hipFile -> docs -> (something related to stats, perhaps)

Also, other libraries have a standalone environment variables explainer page, we should add something too.

Okay to capture that in another follow-on story/PR.

Thanks.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a HIPFILE_STATS_LEVEL environment variable and a corresponding statsLevel() configuration method to allow controlling the granularity of stats collection (for ais-stats). The feature mirrors the existing HIPFILE_FORCE_COMPAT_MODE pattern.

Changes:

  • Adds Environment::get<int> specialization and stats_level() convenience function in the environment layer.
  • Adds statsLevel() to the IConfiguration interface and the concrete Configuration class.
  • Adds unit tests for the new get<int> environment parsing and the statsLevel() configuration getter.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/amd_detail/environment.h Declares STATS_LEVEL constant and stats_level() method
src/amd_detail/environment.cpp Implements get<int> template specialization and stats_level()
src/amd_detail/configuration.h Adds statsLevel() pure virtual to IConfiguration and declaration to Configuration
src/amd_detail/configuration.cpp Initializes m_statsLevel from environment and implements statsLevel()
test/amd_detail/environment.cpp Unit tests for the new get<int> specialization
test/amd_detail/configuration.cpp Unit tests for statsLevel() via ConfigurationExpectationBuilder

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +43 to +44
if (end == nullptr)
return std::nullopt;
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check if (end == nullptr) on line 43 is dead code. According to the C standard, when strtol is called with a non-null second argument (&end), it always sets *endptr to a valid pointer — it will never be nullptr. Since end is initialized by strtol and is always set to a non-null pointer, this check can never be true and should be removed to avoid misleading future readers.

Copilot uses AI. Check for mistakes.
return std::nullopt;
errno = 0;
char *end;
int x{static_cast<int>(std::strtol(value, &end, 10))};
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strtol call converts the result with static_cast<int>(...). On platforms where long is larger than int (e.g., 64-bit Linux where long is 64 bits), a value like 2147483648 (INT_MAX + 1) would not trigger ERANGE (since it fits in long) but would silently overflow when cast to int. This means values outside [INT_MIN, INT_MAX] are silently accepted and produce an incorrect result. Consider using std::strtol and checking that the returned long value fits within [INT_MIN, INT_MAX] before the cast, returning nullopt otherwise. Alternatively, use std::strtoimax and clamp to int range.

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +26
/// @brief Shows the level of detail for stats collection
/// @return 0 if stats collection disabled, higher levels of detail as value increases
virtual int statsLevel() const noexcept = 0;
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding statsLevel() as a pure virtual method to IConfiguration makes MConfiguration in test/amd_detail/mconfiguration.h abstract and unusable, since that mock does not yet provide a MOCK_METHOD for statsLevel(). While MConfiguration is not currently used in any test .cpp file, it will fail to compile if it ever gets included. The MConfiguration mock should be updated to add MOCK_METHOD(int, statsLevel, (), (const, noexcept, override));.

Copilot uses AI. Check for mistakes.
Comment on lines +31 to +50
template <>
std::optional<int>
Environment::get<int>(const char *key)
{
const char *value{Context<Sys>::get()->getenv(key)};
if (value == nullptr)
return std::nullopt;
errno = 0;
char *end;
int x{static_cast<int>(std::strtol(value, &end, 10))};
if (errno != 0)
return std::nullopt;
if (end == nullptr)
return std::nullopt;
if (end == value)
return std::nullopt;
if (*end != '\0')
return std::nullopt;
return x;
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new Environment::get<int> specialization uses errno and std::strtol but the file only includes <strings.h> and project headers. Neither <cerrno> (for errno) nor <cstdlib> (for std::strtol) is explicitly included. While these may be pulled in transitively on some platforms via <strings.h> or other system headers, relying on transitive inclusion is fragile and non-portable. Other files in the codebase (e.g., src/amd_detail/sys.cpp and test/amd_detail/driver.cpp) explicitly include <cerrno> when they use errno. Both <cerrno> and <cstdlib> should be explicitly included.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants