diff --git a/README.md b/README.md
index 574cbe7..8615b21 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ directly on the CPU.
 You need a rust toolchain installation and the OpenCL headers.
-```sh
+```
 USAGE:
     rust-opencl-demo
@@ -17,8 +17,94 @@ FLAGS:
     -V, --version    Prints version information
 SUBCOMMANDS:
-    calculate-primes
-    help                Prints this message or the help of the given subcommand(s)
+    bench-global-size    Benchmarks the global size (number of tasks) value
+    bench-local-size     Benchmarks the local size value
+    calculate-primes     Calculates primes on the GPU
+    help                 Prints this message or the help of the given subcommand(s)
+    info                 Prints GPU information
+```
+
+### Bench Global Size
+
+```
+Benchmarks the global size (number of tasks) value
+
+USAGE:
+    rust-opencl-demo bench-global-size [OPTIONS]
+
+FLAGS:
+    -h, --help       Prints help information
+    -V, --version    Prints version information
+
+OPTIONS:
+    -o, --bench-output         The output file for timings [default: bench.csv]
+    -n, --calculation-steps
+            How many calculations steps should be done per GPU thread [default: 1000000]
+
+        --global-size-start    The start value for the used global size [default: 1024]
+        --global-size-step     The step value for the used global size [default: 128]
+        --global-size-stop     The stop value for the used global size [default: 1048576]
+        --local-size           The maximum number of tasks for the benchmark [default: 128]
+    -r, --repetitions
+            The average of n runs that is used instead of using one value only. By default the benchmark for each step
+            is only run once [default: 1]
+```
+
+### Bench Local Size
+
+```
+Benchmarks the local size value
+
+USAGE:
+    rust-opencl-demo bench-local-size [OPTIONS]
+
+FLAGS:
+    -h, --help       Prints help information
+    -V, --version    Prints version information
+
+OPTIONS:
+    -o, --bench-output        The output file for timings [default: bench.csv]
+    -n, --calculation-steps
+            How many calculations steps should be done per GPU thread [default: 1000000]
+
+        --global-size         The maximum number of tasks for the benchmark [default: 6144]
+        --local-size-start    The initial number for the local size [default: 4]
+        --local-size-step     The amount the local size increases by every step [default: 4]
+        --local-size-stop
+            The maximum amount of the local size Can't be greater than the maximum local size of the gpu that can be
+            retrieved with the info command [default: 1024]
+    -r, --repetitions
+            The average of n runs that is used instead of using one value only. By default the benchmark for each step
+            is only run once [default: 1]
+```
+
+### Calculate Primes
+
+```
+Calculates primes on the GPU
+
+USAGE:
+    rust-opencl-demo calculate-primes [FLAGS] [OPTIONS]
+
+FLAGS:
+        --cpu-validate    If the calculated prime numbers should be validated on the cpu by a simple prime algorithm
+    -h, --help            Prints help information
+        --no-cache        If the prime numbers should be used for the divisibility check instead of using an optimized
+                          auto-increment loop
+    -V, --version         Prints version information
+
+OPTIONS:
+        --local-size
+            The local size for the tasks. The value for numbers_per_step needs to be divisible by this number. The
+            maximum local size depends on the gpu capabilities. If no value is provided, OpenCL chooses it automatically
+        --end                The maximum number to calculate to [default: 9223372036854775807]
+    -p, --parallel           number of used threads [default: 2]
+        --numbers-per-step
+            The amount of numbers that are checked per step. Even numbers are ignored so the Range actually goes to
+            numbers_per_step * 2 [default: 33554432]
+    -o, --output             The output file for the calculated prime numbers [default: primes.txt]
+        --start              The number to start with [default: 0]
+        --timings-output     The output file for timings [default: timings.csv]
 ```
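
Review note: the `--numbers-per-step` and `--local-size` help texts describe two pieces of arithmetic that may be easier to follow as code. The sketch below is only an illustration of those two sentences, not code from this repository. The function names `step_range` and `local_size_is_valid` are hypothetical, and treating the covered range as half-open is an assumption.

```rust
/// Hypothetical helper: the half-open range of integers covered by one step.
/// Because even numbers are skipped, checking `numbers_per_step` candidates
/// spans `numbers_per_step * 2` integers (as the help text states).
fn step_range(start: u64, numbers_per_step: u64) -> (u64, u64) {
    let span = numbers_per_step.saturating_mul(2);
    (start, start.saturating_add(span))
}

/// Hypothetical check for the documented constraint that
/// `numbers_per_step` must be divisible by the local size.
fn local_size_is_valid(numbers_per_step: u64, local_size: u64) -> bool {
    local_size != 0 && numbers_per_step % local_size == 0
}

fn main() {
    // Documented default for --numbers-per-step.
    let numbers_per_step: u64 = 33_554_432;

    let (lo, hi) = step_range(0, numbers_per_step);
    println!("one step covers [{lo}, {hi})");

    // 128 is used here only as an example local size.
    println!(
        "local size 128 valid: {}",
        local_size_is_valid(numbers_per_step, 128)
    );
}
```

With the default of 33554432 numbers per step, a step starting at 0 would span the integers up to 67108864 under these assumptions, and any power-of-two local size up to the device maximum divides the default evenly.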