Optimization Options
Compute Bound
- Use tree_method=hist for faster and just as accurate models
- Use GPUs (tree_method=gpu_hist)
- Consider using XGBoost-Ray
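As a minimal sketch of the compute-bound settings above, the parameters can be assembled as shown here. The helper names, the objective, and the synthetic data are illustrative assumptions, not part of the original guidance. Note that XGBoost 2.0 and later deprecate gpu_hist in favor of tree_method=hist combined with device=cuda, so the GPU branch may need adjusting for your version.

```python
def compute_bound_params(use_gpu=False):
    """Build XGBoost parameters for a compute-bound training job."""
    return {
        # Illustrative objective (assumption, not from the guidance above).
        "objective": "binary:logistic",
        # Histogram-based tree construction; gpu_hist runs it on a GPU.
        "tree_method": "gpu_hist" if use_gpu else "hist",
    }


def train_on_synthetic_data(params, num_rows=1000, num_features=10):
    """Train a small model on random data to exercise the parameters."""
    # Imported lazily so the parameter helper works without these packages.
    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(num_rows, num_features))
    y = (X[:, 0] > 0).astype(int)  # hypothetical label for illustration
    dtrain = xgb.DMatrix(X, label=y)
    return xgb.train(params, dtrain, num_boost_round=10)
```

Calling train_on_synthetic_data(compute_bound_params()) trains on the CPU; passing use_gpu=True requires a CUDA-enabled XGBoost build.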
Memory Bound
- Use max_cached_hist_node to limit the size of the CPU histogram cache
- Use QuantileDMatrix to reduce intermediate memory use (only useful if the entire dataset fits in memory on the machine)
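The memory-bound options might be combined roughly as follows, using the names as spelled in the xgboost Python package (max_cached_hist_node and xgboost.QuantileDMatrix). This is a sketch under assumptions: the helper names and the default values are hypothetical, and max_cached_hist_node is only recognized by recent XGBoost releases.

```python
def memory_bound_params(max_cached_hist_node=1024):
    """Build parameters that cap the number of cached histogram nodes."""
    return {
        "tree_method": "hist",
        # Illustrative cap; tune it to the memory available on the machine.
        "max_cached_hist_node": max_cached_hist_node,
    }


def train_with_quantile_dmatrix(X, y, params, max_bin=256):
    """Train from a QuantileDMatrix, which stores pre-binned feature values."""
    import xgboost as xgb  # lazy import keeps the helper above dependency-free

    # max_bin must match between the matrix and the training parameters.
    params = {**params, "max_bin": max_bin}
    dtrain = xgb.QuantileDMatrix(X, label=y, max_bin=max_bin)
    return xgb.train(params, dtrain, num_boost_round=10)
```

QuantileDMatrix still needs the input data in memory while it builds the quantized representation, which is why it helps only when the dataset itself fits on the machine.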
Large Data Size
- Use memory mapping via the DataIter API; see Using XGBoost External Memory Version for more information
- Use grow_policy=depthwise to iterate over data as efficiently as possible
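For large data, a batch iterator built on xgboost.DataIter could look roughly like the sketch below. It slices in-memory arrays purely for illustration; a real iterator would load each batch from disk. The helper names, the cache prefix, and the batch size are assumptions, and the external-memory behavior varies between XGBoost versions, so check the external memory documentation for your release.

```python
def batch_slices(num_rows, batch_size):
    """Return (start, stop) row ranges that cover num_rows in order."""
    return [
        (start, min(start + batch_size, num_rows))
        for start in range(0, num_rows, batch_size)
    ]


def train_external_memory(X, y, batch_size, cache_prefix="xgb_cache"):
    """Train with a DataIter so XGBoost consumes the data batch by batch."""
    import xgboost as xgb  # lazy import keeps batch_slices dependency-free

    class ArrayBatchIter(xgb.DataIter):
        def __init__(self):
            self._slices = batch_slices(X.shape[0], batch_size)
            self._pos = 0
            # cache_prefix tells XGBoost where to write its on-disk cache.
            super().__init__(cache_prefix=cache_prefix)

        def next(self, input_data):
            if self._pos == len(self._slices):
                return 0  # no more batches
            start, stop = self._slices[self._pos]
            input_data(data=X[start:stop], label=y[start:stop])
            self._pos += 1
            return 1

        def reset(self):
            self._pos = 0  # rewind so XGBoost can iterate again

    dtrain = xgb.DMatrix(ArrayBatchIter())
    params = {"tree_method": "hist", "grow_policy": "depthwise"}
    return xgb.train(params, dtrain, num_boost_round=10)
```

With grow_policy=depthwise, all nodes at one depth are built together, so each tree level can be handled in a single pass over the batches.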