Conversation
I tried testing this but noticed that the Flavor doesn't reflect some values from the FLARE annotations (GPU count and GPU memory stay at 0). Are the annotations the same as in this repository https://github.com/clastix/flare? For testing, should I follow the quickstart from that repo, or are there specific steps/examples for this PR to make validation easier?
Yes, I implemented the required code following that documentation; just sharing the annotations and the resulting Flavor.

// kubectl get nodes fluidos-provider-2-worker -ojsonpath='{.metadata.annotations}'
{
"cost.fluidos.eu/currency": "EUR",
"cost.fluidos.eu/hourly-rate": "2.1",
"gpu.fluidos.eu/architecture": "ampere",
"gpu.fluidos.eu/clock-speed": "1.80G",
"gpu.fluidos.eu/compute-capability": "8.6",
"gpu.fluidos.eu/cores": "10752",
"gpu.fluidos.eu/count": "8",
"gpu.fluidos.eu/dedicated": "true",
"gpu.fluidos.eu/fp32-tflops": "38.7",
"gpu.fluidos.eu/interconnect": "nvlink",
"gpu.fluidos.eu/interconnect-bandwidth-gbps": "600",
"gpu.fluidos.eu/interruptible": "false",
"gpu.fluidos.eu/memory-per-gpu": "48Gi",
"gpu.fluidos.eu/model": "nvidia-a6000",
"gpu.fluidos.eu/multi-gpu-efficiency": "0.85",
"gpu.fluidos.eu/sharing-capable": "false",
"gpu.fluidos.eu/sharing-strategy": "none",
"gpu.fluidos.eu/tier": "standard",
"gpu.fluidos.eu/topology": "ring",
"gpu.fluidos.eu/vendor": "nvidia",
"kubeadm.alpha.kubernetes.io/cri-socket": "unix:///run/containerd/containerd.sock",
"location.fluidos.eu/region": "us-east-1",
"location.fluidos.eu/zone": "zone-a",
"network.fluidos.eu/bandwidth-gbps": "25",
"network.fluidos.eu/latency-ms": "5",
"network.fluidos.eu/tier": "standard",
"node.alpha.kubernetes.io/ttl": "0",
"provider.fluidos.eu/name": "cloud-provider-2",
"provider.fluidos.eu/preemptible": "false",
"volumes.kubernetes.io/controller-managed-attach-detach": "true",
"workload.fluidos.eu/graphics-score": "0.95",
"workload.fluidos.eu/hpc-score": "0.80",
"workload.fluidos.eu/inference-score": "0.90",
"workload.fluidos.eu/training-score": "0.85"
}

// kubectl get flavors.nodecore.fluidos.eu fluidos.eu-k8slice-89ad -ojsonpath='{.spec.flavorType.typeData.characteristics.gpu}' | jq
{
"architecture": "ampere",
"clock_speed": "1800M",
"compute_capability": "8.6",
"cores": "86016",
"count": 8,
"dedicated": true,
"fp32_tflops": 38.7,
"graphics_score": 0.95,
"hourly_rate": 2.1,
"hpc_score": 0.8,
"inference_score": 0.9,
"interconnect": "nvlink",
"interconnect_bandwidth": "600",
"memory": "384Gi",
"model": "nvidia-a6000",
"multi_gpu_efficiency": "0.85",
"network_bandwidth": "25",
"network_latency_ms": 5,
"network_tier": "standard",
"provider": "cloud-provider-2",
"region": "zone-a",
"sharing_strategy": "none",
"tier": "standard",
"topology": "ring",
"training_score": 0.85,
"vendor": "nvidia",
"zone": "us-east-1"
}
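If it helps with testing before the release: one way to reproduce the same setup is to put the FLARE annotations on a worker node with kubectl annotate. A minimal sketch below, with the node name and the GPU values taken verbatim from the output above; the cost, location, network, provider, and workload annotations follow the same key=value pattern.

# Sketch: apply the GPU-related FLARE annotations to a test node.
# Node name and values match the annotations listed above; adjust for your cluster.
kubectl annotate node fluidos-provider-2-worker \
  gpu.fluidos.eu/vendor=nvidia \
  gpu.fluidos.eu/model=nvidia-a6000 \
  gpu.fluidos.eu/count=8 \
  gpu.fluidos.eu/memory-per-gpu=48Gi \
  gpu.fluidos.eu/cores=10752 \
  gpu.fluidos.eu/clock-speed=1.80G \
  gpu.fluidos.eu/architecture=ampere \
  gpu.fluidos.eu/compute-capability=8.6 \
  gpu.fluidos.eu/fp32-tflops=38.7 \
  gpu.fluidos.eu/dedicated=true \
  gpu.fluidos.eu/tier=standard
# The remaining cost.*, location.*, network.*, provider.* and workload.*
# annotations shown above can be added the same way.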
Yes, we're going to release the code before the end of the week; testing will be easier then.