test: Added Testcases for testing moneo tool #53
Reviewer's Guide
Adds a bash-based Moneo installation/functional test script and wires it into the Azure HPC role, while updating Moneo install paths and maintaining backward compatibility via a legacy symlink.
Sequence diagram for running the new Moneo validation test script
sequenceDiagram
actor Admin
participant TestScript as test_moneo_sh
participant MoneoInstaller as configure_service_sh
participant Grafana as Grafana_service
participant Prometheus as Prometheus_service
participant GPUDriver as nvidia_smi
Admin->>TestScript: Invoke with CLI options
TestScript->>TestScript: Parse options (verbose, skip_functional, json_output)
TestScript->>TestScript: setenforce 0 (temporarily disable SELinux)
TestScript->>TestScript: Verify Moneo installation directory (__hpc_moneo_install_dir)
TestScript->>TestScript: Verify Moneo main script presence
TestScript->>TestScript: Verify bashrc alias configuration
alt Functional tests not skipped
TestScript->>MoneoInstaller: Deploy Moneo with hostfile
MoneoInstaller-->>TestScript: Deployment status
TestScript->>TestScript: Wait up to 60 seconds for services
TestScript->>Grafana: HTTP health check on port 3000
Grafana-->>TestScript: Health status
TestScript->>Prometheus: HTTP health check on port 9090
Prometheus-->>TestScript: Health status
TestScript->>GPUDriver: Run nvidia-smi for GPU detection
GPUDriver-->>TestScript: GPU info or error
TestScript->>MoneoInstaller: Shutdown Moneo deployment
MoneoInstaller-->>TestScript: Shutdown status
else Functional tests skipped
TestScript->>TestScript: Skip functional validation steps
end
TestScript->>TestScript: setenforce 1 (restore SELinux)
TestScript->>Admin: Output color-coded results and optional JSON summary
Hey - I've left some high level feedback:
- The test script hardcodes `MONEO_HOME=/opt/hpc/azure/tools/Moneo` instead of deriving it from the role variables (e.g. `__hpc_moneo_install_dir`), which may drift from the actual install path and break when the install prefix changes.
- In `test-moneo.sh` the SELinux mode is unconditionally changed with `setenforce 0`/`1`; consider detecting the initial SELinux state and restoring it, and guarding these calls for systems without enforcing SELinux to avoid unintended behavior.
- The test script installs packages and writes `/etc/containers/registries.conf.d/99-unqualified-search.conf`; making these side effects optional or merging with existing config would reduce the risk of impacting the host’s container setup during test runs.
# Disable SELinux
echo "[SETUP] Disabling SELinux..."
sudo setenforce 0 2>/dev/null || echo "[SETUP] Could not disable SELinux"
Why does this need to disable selinux?
moneo.py is unable to deploy the Prometheus container because of its SELinux context, so for testing purposes I have to disable SELinux at the start of the script and re-enable it at the end.
For my information: what is the SELinux issue, e.g. the output of ausearch? If it is something simple, we might use the selinux system role to add a policy for this. Every time someone disables SELinux, Dan Walsh feels a disturbance in the Force.
Please see the logs below:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e7996a2c3636 docker.io/prom/prometheus:latest --storage.tsdb.pa... 12 seconds ago Exited (2) 11 seconds ago 9090/tcp prometheus
The container exits because of the following error in its logs:
sudo docker logs prometheus
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
time=2026-01-30T05:12:09.929Z level=ERROR source=main.go:654 msg="Error loading config (--config.file=/etc/prometheus/prometheus.yml)" file=/etc/prometheus/prometheus.yml err="open /etc/prometheus/prometheus.yml: permission denied"
This is a permission/SELinux issue when mounting volumes in Podman. The container can't access the config file due to SELinux labeling.
Fix: Add :z or :Z to your volume mount, or disable SELinux labeling:
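As a rough illustration of the relabeling option, the sketch below builds a Podman volume specification with the `:z` suffix. The helper name and the host path are hypothetical; the container path is the one from the error message above. How moneo.py actually starts the container is not shown in this thread, so this is not a patch for it.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name and the host path used below are hypothetical.

# Build a read-only volume specification that asks Podman to relabel the
# host file so the container's SELinux context is allowed to read it.
# "z" applies a shared label, "Z" a private (single-container) label.
relabeled_volume() {
    local host_path="$1" container_path="$2" label="${3:-z}"
    printf '%s:%s:ro,%s\n' "$host_path" "$container_path" "$label"
}

# Example invocation (not executed here):
#   sudo podman run -d --name prometheus -p 9090:9090 \
#       -v "$(relabeled_volume /path/to/prometheus.yml /etc/prometheus/prometheus.yml)" \
#       docker.io/prom/prometheus:latest
```

Relabeling changes the SELinux label of the host file itself, so it is best pointed at a file owned by the deployment rather than at a shared system path. Podman's `--security-opt label=disable` turns off label separation for a single container instead, which is still narrower than running `setenforce 0` for the whole host.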
[aliases]
"prometheus" = "docker.io/prom/prometheus"
EOF
Why is this container setup being done here? Shouldn't it have been done during Moneo package installation?
The Moneo installation script pulls the Prometheus Docker image by its short name, so this change was made to support short-name container pulls.
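If the alias does stay in the test script, one option is to write the drop-in only when it is missing, so an existing host configuration is never overwritten. The sketch below assumes that approach. The function name and the optional directory argument are hypothetical; the file name and alias content are the ones quoted in this review.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the directory argument are hypothetical.

# Write the short-name alias drop-in unless a file with that name already exists.
# Writing to the default directory requires root privileges.
ensure_prometheus_alias() {
    local conf_dir="${1:-/etc/containers/registries.conf.d}"
    local conf_file="$conf_dir/99-unqualified-search.conf"

    if [ -e "$conf_file" ]; then
        echo "[SETUP] $conf_file already exists, leaving it untouched"
        return 0
    fi

    mkdir -p "$conf_dir" || return 1
    cat > "$conf_file" <<'EOF'
[aliases]
"prometheus" = "docker.io/prom/prometheus"
EOF
}
```

Passing a scratch directory as the first argument makes it possible to try the function without touching `/etc`.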
lgtm but I'll defer to @dgchinner I notice a few style things like using |
Enhancement:
Added a new bash shell script (tests/test_moneo.sh) for Moneo installation verification and functional testing. This script provides automated testing capabilities for validating Moneo GPU monitoring tool deployment on HPC systems.
Key features include:
Installation verification tests (directory existence, main script presence)
Functional tests:
Moneo full deployment with hostfile support
Grafana service health check (port 3000)
Prometheus service health check (port 9090)
Moneo shutdown functionality
SELinux enforcement handling (setenforce 0 at start, setenforce 1 at end)
Automatic hostfile detection and creation
60-second service startup wait after deployment
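The health checks and the startup wait listed above could be combined into a polling loop that returns as soon as a service answers, rather than always sleeping for the full 60 seconds. The following is a minimal sketch of that idea, not the code in the PR. The function name, the default timings, and the choice to accept any 2xx or 3xx status are assumptions (the sample output below reports HTTP 302 as a pass).

```shell
#!/usr/bin/env bash
# Sketch only: function name, defaults and accepted status codes are assumptions.

# Poll a URL until it answers with a 2xx or 3xx status, or the timeout expires.
# Usage: wait_for_http URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_http() {
    local url="$1" timeout="${2:-60}" interval="${3:-5}"
    local elapsed=0 code

    while :; do
        # curl prints 000 when no HTTP response was received.
        code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" || true)"
        case "$code" in
            2??|3??)
                echo "[PASS] $url responded (HTTP $code)"
                return 0
                ;;
        esac
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "[FAIL] $url not ready after ${timeout}s (last HTTP code: $code)"
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}
```

For the two services named above, this would be called as `wait_for_http http://localhost:3000` for Grafana and `wait_for_http http://localhost:9090` for Prometheus.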
Reason:
To provide a lightweight, dependency-free testing solution for validating Moneo installations in CI/CD pipelines and manual verification scenarios. The bash implementation eliminates Python dependencies and provides native shell integration for HPC environments where minimal tooling is preferred.
Result:
Automated verification of Moneo installation, configuration, and functionality.
Prerequisites to run the test script:
Sample output:
./test-moneo.sh
Moneo Test Script
[SETUP] Disabling SELinux...
[SETUP] Configuring container registries...
[SETUP] Creating temporary hostfile...
==============================================
Running Tests
Testing: Moneo directory exists
[PASS] Moneo directory exists at /opt/hpc/azure/tools/Moneo
Testing: Moneo script exists
[PASS] moneo.py exists
Testing: Moneo deployment
[PASS] Moneo deployed successfully
Testing: Grafana is running
Waiting 60 seconds for services to start...
[PASS] Grafana is running (HTTP 302)
Testing: Prometheus is running
[PASS] Prometheus is running (HTTP 302)
Testing: Moneo shutdown
[PASS] Moneo shutdown completed
Testing: Grafana is stopped
[PASS] Grafana is stopped (connection refused)
Testing: Prometheus is stopped
[PASS] Prometheus is stopped (connection refused)
==============================================
All tests passed: 8
[CLEANUP] Removing temporary hostfile...
[CLEANUP] Re-enabling SELinux...
Issue Tracker Tickets (Jira or BZ if any):
https://issues.redhat.com/browse/RHELHPC-125
Summary by Sourcery
Add automated bash-based verification and functional tests for Moneo and align the Moneo installation path with the Azure tools directory while preserving compatibility with existing service configuration scripts.