GPU Benchmarking API Reference
This page provides detailed API reference for the GPU benchmarking functionality in CatP2P.
Structures
GpuBenchmarkContext
A context for GPU benchmarks that manages GPU resources and allows reusing them across multiple benchmarks.
Field | Type | Description | Example Access |
---|---|---|---|
gpu_info | GpuInfo | Information about the GPU | context.gpu_info.name |
adapter | Adapter | GPU adapter | (internal use) |
device | Device | GPU device | (internal use) |
queue | Queue | GPU command queue | (internal use) |
GpuBenchmarkResult
Contains detailed information about the results of a GPU benchmark.
Field | Type | Description | Example Access |
---|---|---|---|
gpu_model | String | GPU model name | result.gpu_model |
gpu_vendor | String | GPU vendor | result.gpu_vendor |
vram_estimate | String | Estimated VRAM in GB | result.vram_estimate |
compute_score | f64 | Compute performance score (MFLOPS) | result.compute_score |
texture_score | f64 | Texture sampling performance score | result.texture_score |
geometry_score | f64 | Geometry processing performance score | result.geometry_score |
memory_score | f64 | Memory bandwidth performance score | result.memory_score |
overall_score | f64 | Overall benchmark score (higher is better) | result.overall_score |
average_fps | f64 | Average frames per second across all tests | result.average_fps |
test_results | Vec<GpuTestResult> | Detailed results for each test | result.test_results[0].score |
GpuTestResult
Contains detailed information about the results of a specific GPU test.
Field | Type | Description | Example Access |
---|---|---|---|
test_name | String | Name of the test | test_result.test_name |
average_fps | f64 | Average frames per second | test_result.average_fps |
min_fps | f64 | Minimum frames per second | test_result.min_fps |
max_fps | f64 | Maximum frames per second | test_result.max_fps |
score | f64 | Test score in MFLOPS (higher is better) | test_result.score |
GpuBenchmarkConfig
Configuration options for GPU benchmarks.
Field | Type | Description | Default Value | Example Access |
---|---|---|---|---|
test_duration_secs | u64 | Duration of each test in seconds | 5 | config.test_duration_secs |
include_compute_test | bool | Whether to include compute test | true | config.include_compute_test |
include_texture_test | bool | Whether to include texture test | true | config.include_texture_test |
include_geometry_test | bool | Whether to include geometry test | true | config.include_geometry_test |
include_memory_test | bool | Whether to include memory test | true | config.include_memory_test |
complexity | u32 | Test complexity (1-10) affecting matrix size | 5 | config.complexity |
window_width | u32 | Width of benchmark window | 800 | config.window_width |
window_height | u32 | Height of benchmark window | 600 | config.window_height |
show_window | bool | Whether to show the benchmark window | false | config.show_window |
GpuInfo
Information about a GPU.
Field | Type | Description | Example Access |
---|---|---|---|
name | String | GPU model name | gpu_info.name |
vendor | String | GPU vendor | gpu_info.vendor |
driver | String | GPU driver version | gpu_info.driver |
vram | String | Estimated VRAM | gpu_info.vram |
backend | String | Graphics API backend | gpu_info.backend |
is_integrated | bool | Whether the GPU is integrated | gpu_info.is_integrated |
Functions
Context Management
Function | Return Type | Description | Example Usage | Possible Errors |
---|---|---|---|---|
GpuBenchmarkContext::new() | Result<GpuBenchmarkContext, Error> | Creates a new GPU benchmark context | let context = GpuBenchmarkContext::new()?; | No compatible GPU found, device creation failed |
context.run_matrix_mult(duration, matrix_size) | Result<GpuTestResult, Error> | Runs a matrix multiplication benchmark | let result = context.run_matrix_mult(Duration::from_secs(5), 1024)?; | Device lost, out of memory |
context.run_activation_functions(duration, data_size) | Result<GpuTestResult, Error> | Runs a neural network activation functions benchmark | let result = context.run_activation_functions(Duration::from_secs(2), 1_000_000)?; | Device lost, out of memory |
Information Gathering
Function | Return Type | Description | Example Usage | Possible Errors |
---|---|---|---|---|
get_gpu_info() | Result<GpuInfo, Error> | Gets information about the GPU | let gpu_info = gpu::get_gpu_info()?; | No compatible GPU found |
is_gpu_available() | bool | Checks if a compatible GPU is available | if gpu::is_gpu_available() { ... } | None |
Performance Testing
Function | Return Type | Description | Example Usage | Performance Impact |
---|---|---|---|---|
run_gpu_benchmark() | Result<f64, Error> | Runs a benchmark with default settings | let score = gpu::run_gpu_benchmark()?; | High - runs matrix multiplication test with default duration |
run_gpu_benchmark_with_config(config) | Result<GpuBenchmarkResult, Error> | Runs a benchmark with custom configuration | let result = gpu::run_gpu_benchmark_with_config(&config)?; | Varies based on configuration |
run_matrix_mult_benchmark(adapter, duration, size) | Result<GpuTestResult, Error> | Runs only the matrix multiplication test | let result = gpu::run_matrix_mult_benchmark(&adapter, duration, 1024)?; | Medium - runs only matrix multiplication test |
run_activation_functions_benchmark(adapter, duration, data_size) | Result<GpuTestResult, Error> | Runs only the activation functions test | let result = gpu::run_activation_functions_benchmark(&adapter, duration, 1_000_000)?; | Medium - runs only activation functions test |
Understanding GPU Benchmark Results
The GPU benchmark in CatP2P includes two main tests:
- Matrix Multiplication: How efficiently the GPU can multiply large matrices
- Neural Network Activation Functions: How efficiently the GPU can compute common activation functions used in neural networks
Matrix Multiplication Benchmark
This benchmark measures the GPU's ability to perform matrix multiplication operations, which are fundamental to many GPU computing tasks:
- Score in MFLOPS: Millions of floating-point operations per second
- Higher scores indicate better GPU compute performance
- Matrix size: Calculated as 512 + (complexity * 128), where complexity ranges from 1 to 10
Activation Functions Benchmark
This benchmark measures the GPU's ability to compute common neural network activation functions:
- Operations tested: ReLU, Sigmoid, Tanh, and Leaky ReLU
- Score: Based on millions of operations per second
- Higher scores indicate better performance for AI and deep learning applications
Interpreting the Score
The GPU benchmark score represents:
- Higher scores indicate better GPU compute performance
- Scores are influenced by:
- GPU architecture and generation
- Number of compute units/cores
- Memory bandwidth and capacity
- Driver optimization
- System configuration
Typical Performance Ranges
GPU Type | Typical Matrix Multiplication Score (MFLOPS) | Expected Performance |
---|---|---|
High-end Desktop GPU | 5,000,000 - 15,000,000 | Excellent for complex parallel computing |
Mid-range Desktop GPU | 1,000,000 - 5,000,000 | Good for most parallel computing tasks |
Entry-level Desktop GPU | 200,000 - 1,000,000 | Suitable for basic parallel computing |
High-end Integrated GPU | 50,000 - 200,000 | Limited parallel computing capability |
Basic Integrated GPU | 5,000 - 50,000 | Minimal parallel computing capability |
Note: Actual performance can vary significantly based on specific hardware, system conditions, and benchmark parameters.
Factors Affecting Benchmark Results
Factor | Impact | Notes |
---|---|---|
Matrix Size | High | Larger matrices provide more accurate results but may hit memory limits |
Data Size | Medium | Larger data sizes for activation functions provide more accurate results |
System Activity | Medium | Other processes using the GPU can reduce benchmark scores |
Driver Version | Medium | Updated drivers can provide performance improvements |
Thermal Throttling | High | GPUs may slow down if they overheat during benchmarking |
Power Limits | Medium | Power-limited systems (like laptops) may show lower performance |
API Overhead | Low | Different graphics APIs have different overheads |
Implementation Details
Matrix Multiplication Benchmark
The matrix multiplication benchmark measures how quickly the GPU can multiply two large matrices:
- Two random matrices of size NxN are created (where N is determined by the complexity parameter)
- The matrices are uploaded to GPU memory
- A compute shader multiplies the matrices
- The process is repeated for the specified duration
- Performance is measured in MFLOPS (Millions of Floating Point Operations Per Second)
The matrix size is calculated as: 512 + (complexity * 128), where complexity ranges from 1 to 10.
Activation Functions Benchmark
The activation functions benchmark measures how quickly the GPU can compute common neural network activation functions:
- Random input data of the specified size is created
- The data is uploaded to GPU memory
- A compute shader applies four activation functions (ReLU, Sigmoid, Tanh, Leaky ReLU) to each element
- The process is repeated for the specified duration
- Performance is measured in millions of operations per second
Resource Management
The GPU benchmarking module uses a context-based approach to manage GPU resources:
- A
GpuBenchmarkContext
is created once - This context holds the GPU device and command queue
- Multiple benchmarks can be run using the same context
- This approach avoids device creation/destruction overhead and potential driver issues
Graphics API Selection
The GPU benchmarking functionality uses wgpu, which automatically selects the best available graphics API:
- Vulkan on supported systems (Linux, Windows, Android)
- Metal on macOS and iOS
- DirectX 12 on Windows
- DirectX 11 on older Windows systems
- OpenGL as a fallback
Error Handling
The GPU benchmarking functions use Rust's Result
type to handle errors gracefully. Common errors include:
Error::Benchmark("No suitable GPU adapter found")
: System lacks required GPU capabilitiesError::Benchmark("Failed to create device: {error}")
: GPU initialization failedError::Benchmark("No matrix multiplications were performed during the benchmark")
: Benchmark failed to run any iterationsError::Benchmark("No activation function operations were performed during the benchmark")
: Benchmark failed to run any iterations