Avoid using too many features
Selecting important features
Selecting features that generalize to unseen data

Why you shouldn’t use too many features

Too many features:

increase the risk of data leakage
can cause overfitting
can increase memory requirements during training and inference
increase inference latency
increase technical debt
- an outdated feature can degrade model performance
- when a useless feature is deprecated from the model, any features that depend on it also need to be adjusted

Selecting important features

Use plotting tools to see which features contribute most to the model's predictions, such as:

SHAP plots
Feature importance plots (if provided by the model package)
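
When the model package does not provide importances, permutation importance is one model-agnostic alternative: shuffle one feature column at a time and measure how much the error increases. The sketch below is a minimal pure-Python illustration of that idea, not SHAP and not any particular library's implementation. The names `permutation_importance` and `toy_model`, and the toy data, are hypothetical and exist only for this example.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled.

    predict: callable taking one row (list of floats) and returning a float.
    rows:    list of rows, each a list of feature values.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving the other columns untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            score = mean_squared_error(targets, [predict(r) for r in shuffled])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy setup (assumption): the target depends only on feature 0.
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
targets = [3.0 * r[0] for r in rows]


def toy_model(row):
    # Hypothetical stand-in for a trained model; it ignores feature 1.
    return 3.0 * row[0]


importances = permutation_importance(toy_model, rows, targets)
for index, value in enumerate(importances):
    print(f"feature {index}: mean MSE increase = {value:.4f}")
```

A feature whose shuffling barely changes the error is a candidate for removal. Compute importances on held-out data where possible, since importances measured on training data can reward features the model has overfit to.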

Selecting features that generalize to unseen data

Aspect	Description
Coverage	The feature should be available for most of your data. The exception is a feature that has high predictive power when present and is confirmed not to be leaky.
Value distribution	The distribution of the feature's values should be similar across the training, validation, test, and inference sets.¹
Availability at inference time	The feature should be available at inference time.
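
The three checks in the table can be sketched as small helper functions. This is a minimal illustration under stated assumptions: missing values are represented as `None`, distribution shift is measured with the two-sample Kolmogorov-Smirnov statistic (one of several possible measures), and the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def coverage(values):
    """Fraction of values that are present (not None)."""
    return sum(v is not None for v in values) / len(values)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest gap between the two empirical CDFs:
    0.0 means identical distributions, 1.0 means no overlap.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def missing_at_inference(training_features, inference_features):
    """Names of features used in training but absent at inference time."""
    return sorted(set(training_features) - set(inference_features))


# Hypothetical values of one feature in the training and inference sets.
train_age = [25, 31, None, 47, 52, 38]
inference_age = [27, 33, 45, 50, 41, 36]

print("coverage:", coverage(train_age))
print(
    "KS statistic:",
    ks_statistic([v for v in train_age if v is not None], inference_age),
)
print(
    "missing at inference:",
    missing_at_inference(["age", "income", "days_since_churn"], ["age", "income"]),
)
```

What counts as acceptable coverage or an acceptable KS statistic depends on the dataset and the model, so any threshold should be chosen and validated per project rather than taken from this sketch.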

¹ See Data Drift.