Benchmarking with scipy#
This benchmarking file compares the performance and accuracy of the custom distributions implemented in pyMultiFit against their counterparts in SciPy (scipy.stats).
The purpose of these benchmarks is to evaluate:
- Speed:
Measure the execution time of key operations (e.g., PDF and CDF).
Determine whether the custom implementations are comparable in performance to SciPy’s optimized implementations.
- Accuracy:
Compare the results of key operations (e.g., PDF, CDF) against SciPy’s outputs to ensure numerical correctness.
Each benchmark will output timing information and accuracy metrics, providing insights into the trade-offs between speed and precision.
Key Benchmarked Operations#
- Probability Density Function (PDF):
Tests the performance of evaluating the likelihood at a given point.
This is a critical operation for statistical modeling.
- Cumulative Distribution Function (CDF):
Benchmarks the computation of cumulative probabilities.
Used widely in hypothesis testing and statistical applications.
- Accuracy Metrics:
Absolute Error will be calculated between custom and SciPy distributions for PDF and CDF outputs.
Benchmark Workflow#
Each distribution (e.g., Gaussian, Laplace) will be benchmarked as follows:
- Generate test inputs:
Use a range of inputs (e.g.,
numpy.linspacefor PDF/CDF).Ensure the input range covers typical and edge-case scenarios.
- Compare Speed:
Measure execution times for PDF and CDF for both the custom and SciPy implementations.
- Compare Accuracy:
Compute differences (e.g., Absolute Error) between the outputs of custom and SciPy distributions for identical inputs.
- Output Results:
Print/plot or store results in an easily interpretable format (tables or plots).
Expected Results#
At the end of the benchmarks, the following insights will be available:
Accuracy comparisons using numerical error metrics.
Timing comparisons (in seconds) for each operation (PDF, CDF).
Identification of cases where custom distributions outperform or lag behind SciPy counterparts.