Definition
The fundamental facility provided by Feature Toggles - being able to ship alternative codepaths within one deployable unit and choose between them at runtime
By Pete Hodgson
Concepts / Terminology
Terminology | Description |
---|---|
Toggle Router | Function that controls which code branch to take |
Toggle Point | The location in the code where logic branches based on a flag |
Toggle Configuration | The location where flags live. Examples include: - config files - in-memory store - DB - distributed system with a fancy UI |
Toggle Context | Additional information, such as the user making the request, used to decide which code path to take. This can take the form of cookies or HTTP headers that identify the user making the request. |
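
To show how the four terms relate, here is a minimal sketch. It assumes an in-memory flag store and a dict-based context; `ToggleRouter`, `add_rule`, `is_enabled` and `render_checkout` are hypothetical names chosen for illustration, not an API from the article.

```python
from typing import Callable, Dict, Optional

# Toggle Context: per-request information (here just a dict, e.g. {"user": "alice"}).
Context = Dict[str, str]


class ToggleRouter:
    """Decides which code path to take for a named feature (hypothetical class)."""

    def __init__(self, configuration: Dict[str, bool]) -> None:
        # Toggle Configuration: an in-memory store of flag name -> state.
        self._configuration = dict(configuration)
        # Optional per-feature rules that consult the Toggle Context.
        self._rules: Dict[str, Callable[[Context], bool]] = {}

    def add_rule(self, feature: str, rule: Callable[[Context], bool]) -> None:
        self._rules[feature] = rule

    def is_enabled(self, feature: str, context: Optional[Context] = None) -> bool:
        rule = self._rules.get(feature)
        if rule is not None and context is not None:
            return rule(context)
        # Without a rule or a context, fall back to the static state; unknown flags are off.
        return self._configuration.get(feature, False)


router = ToggleRouter({"next-gen-ecomm": False})
router.add_rule("next-gen-ecomm", lambda ctx: ctx.get("user") == "internal-tester")


def render_checkout(context: Context) -> str:
    # Toggle Point: the place where the code branches on the router's answer.
    if router.is_enabled("next-gen-ecomm", context):
        return "new checkout"
    return "old checkout"
```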
Types of Feature Flags
Type | Description | Owner | Context | Longevity | Location |
---|---|---|---|---|---|
Release flags | Decouple feature release from code deployment. Allow developers to merge code that is not yet production-ready, minimizing the risk of merge conflicts from long-running branches. Release flags should be a last resort; prefer breaking the feature into smaller releases first. | Developers | Build time / run time | One or two weeks | Core |
Ops flags | Used by system operators to control the application’s operational behaviour, for example disabling less important but memory-hungry features during periods of high load. Usually short-lived and retired once operational confidence is established, though a small number can remain as long-lived Kill Switches. | Operations | Run time | Months to years | Core |
Experiment flags | Allow for A/B testing. The routing decision is inherently dynamic (per user/request), but the toggle configuration itself is relatively static | Product | Run time | Weeks to months | Edge |
Permission flags | Allows for different users to experience the application differently. Users can be separated by internal/alpha/beta/premium. This mechanism allows for dogfooding. | Product | Run time | Months to years | Edge |
Dynamic routing vs dynamic configuration
NOTE
The longevity of the flag affects how it should be implemented
Implementation Techniques
Decoupling decision points from decision logic
Looking at the following example:
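
A minimal sketch of such a coupled decision point, assuming a module-level `features` dict as the flag store. Only `generate_invoice_email` and the `features['next-gen-ecomm']` check come from the text; the two helper functions are hypothetical.

```python
# Hypothetical flag store; in practice this would come from the toggle configuration.
features = {"next-gen-ecomm": True}


def build_email(invoice: dict) -> dict:
    # Hypothetical helper: the baseline invoice email.
    return {"to": invoice["customer"], "body": f"Invoice {invoice['id']}"}


def add_order_cancellation_content(email: dict) -> dict:
    # Hypothetical helper: content that only makes sense with the new e-commerce flow.
    return {**email, "body": email["body"] + "\nCancel your order: <link>"}


def generate_invoice_email(invoice: dict) -> dict:
    email = build_email(invoice)
    # The toggle point names the exact flag that gates this behaviour.
    if features["next-gen-ecomm"] == True:  # written this way to mirror the check in the text
        email = add_order_cancellation_content(email)
    return email
```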
This is undesirable as the decision point (`generate_invoice_email`) has to be aware of the decision logic (`features['next-gen-ecomm'] == True`).
A better solution looks like this:
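
One possible shape for this, sketched under the assumption that a small object wraps the raw flags and exposes named decisions. `FeatureDecisions`, `include_order_cancellation_in_email` and `feature_decisions` are hypothetical names.

```python
# Hypothetical flag store; in practice this would come from the toggle configuration.
features = {"next-gen-ecomm": True}


class FeatureDecisions:
    """Single place that maps raw flags to named decisions."""

    def __init__(self, flags: dict) -> None:
        self._flags = flags

    def include_order_cancellation_in_email(self) -> bool:
        # Decision logic lives here; it could later consult another flag or an experiment.
        return bool(self._flags.get("next-gen-ecomm", False))


feature_decisions = FeatureDecisions(features)


def generate_invoice_email(invoice: dict) -> dict:
    email = {"to": invoice["customer"], "body": f"Invoice {invoice['id']}"}
    # The decision point asks a question; it no longer names the flag.
    if feature_decisions.include_order_cancellation_in_email():
        email["body"] += "\nCancel your order: <link>"
    return email
```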
This way, the decision logic can evolve independently of the decision point. For example, either of the following can change without touching the call site:
- the feature flag controlling the decision
- the reason behind the decision (for example, moving from a static config file to A/B testing)
Inversion of control
Make the code more configurable and testable:
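
A minimal sketch of the idea, assuming the decision is injected at construction time rather than looked up globally. `InvoiceEmailConfig` and `InvoiceEmailer` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvoiceEmailConfig:
    # Hypothetical config object; the caller decides how it is populated.
    include_order_cancellation: bool = False


class InvoiceEmailer:
    def __init__(self, config: InvoiceEmailConfig) -> None:
        # The decision is handed in from outside, so nothing here reads a flag store.
        self._config = config

    def generate_invoice_email(self, invoice: dict) -> dict:
        body = f"Invoice {invoice['id']}"
        if self._config.include_order_cancellation:
            body += "\nCancel your order: <link>"
        return {"to": invoice["customer"], "body": body}
```

A test can then build an `InvoiceEmailer` with either configuration directly, without setting up any flag infrastructure.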
Use the Strategy Pattern
Removes `if` statements and simplifies the call site.
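
A minimal sketch using plain functions as strategies. Apart from `content_enhancer` and `generate_invoice_email`, the names are hypothetical; the single remaining conditional sits where the strategy is chosen, not at the call site.

```python
from typing import Callable

Email = dict
ContentEnhancer = Callable[[Email], Email]


def identity_enhancer(email: Email) -> Email:
    # Strategy used when the feature is off: leave the email untouched.
    return email


def add_order_cancellation_content(email: Email) -> Email:
    # Strategy used when the feature is on.
    return {**email, "body": email["body"] + "\nCancel your order: <link>"}


def choose_content_enhancer(feature_enabled: bool) -> ContentEnhancer:
    # The only conditional lives here, at wiring time.
    return add_order_cancellation_content if feature_enabled else identity_enhancer


def generate_invoice_email(invoice: dict, content_enhancer: ContentEnhancer) -> Email:
    email = {"to": invoice["customer"], "body": f"Invoice {invoice['id']}"}
    # No flag check here; the call site applies whichever strategy it was given.
    return content_enhancer(email)
```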
Here, `content_enhancer` is either a strategy function or a concrete class implementing one of several strategies.
Tips for Working with Feature Flags
How to minimize the complexity of feature-flagged systems?
- Prefer static configuration
  - Allows for source control, observability and reproducibility
- Expose the application’s current configuration state
  - via a file
  - or some metadata API endpoint
- Use a structured (schema-ful) configuration file, with metadata for the user, including:
  - Description and recommendation of when to use a flag
  - Creation date
  - Primary developer contact
  - Expiration date for short-lived flags
- Have a process and strategy for managing different types of flags
  - How to test feature flags?
  - How to prevent feature flag sprawl?
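
As an illustration of a schema-ful flag definition that can also be exposed for inspection, here is a minimal sketch. The `FlagDefinition` fields follow the metadata listed above; the flag values, the contact address, the dates and the `current_configuration` helper are made up for the example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class FlagDefinition:
    name: str
    enabled: bool
    description: str            # what the flag does and when to use it
    created: date               # creation date
    owner: str                  # primary developer contact
    expires: Optional[date] = None  # only set for short-lived flags


# Hypothetical example data.
FLAGS: Dict[str, FlagDefinition] = {
    "next-gen-ecomm": FlagDefinition(
        name="next-gen-ecomm",
        enabled=False,
        description="Route orders through the new e-commerce flow.",
        created=date(2024, 1, 15),
        owner="dev-team@example.com",
        expires=date(2024, 3, 1),
    ),
}


def current_configuration(flags: Dict[str, FlagDefinition]) -> str:
    """Return the flag state as JSON, e.g. to write to a file or serve from a metadata endpoint."""
    payload = {name: asdict(flag) for name, flag in flags.items()}
    # default=str turns the date fields into ISO strings.
    return json.dumps(payload, default=str, indent=2, sort_keys=True)
```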
How to test feature flags?
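- Rather than treating this as a combinatorial problem and testing every combination of flags, it’s usually sufficient to test two configurations:
  - The configuration expected to be live in the next release, including all flags that are about to be turned on
  - All flags on (assuming the default state is “off”)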
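
The two configurations can be derived mechanically. A minimal sketch, assuming flags are plain name-to-boolean mappings; `configurations_to_test` is a hypothetical helper whose output could feed a parametrized test run.

```python
from typing import Dict, Iterable, List, Tuple

FlagStates = Dict[str, bool]


def configurations_to_test(
    production: FlagStates, turning_on_next_release: Iterable[str]
) -> List[Tuple[str, FlagStates]]:
    """Build the two flag configurations worth running the test suite against."""
    # Start from what is live today and flip on the flags planned for the next release.
    next_release = dict(production)
    for name in turning_on_next_release:
        next_release[name] = True
    # Fallback configuration: every known flag switched on.
    all_on = {name: True for name in next_release}
    return [("next-release", next_release), ("all-on", all_on)]
```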
How to prevent feature flag sprawl?
- Put expiration dates on flags; after that date, the flag defaults to one state or the other
  - A more extreme option is a time bomb that fails a test or prevents the application from starting once a flag has expired
- Limit the number of flags. Once a limit is reached, some flags have to be deprecated before new ones are added
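
A minimal sketch of a time bomb and a flag budget, meant to be called from a test or at application startup. The limit of three, `FlagPolicyError`, `expired_flags` and `enforce_flag_policy` are assumptions made for illustration.

```python
from datetime import date
from typing import Dict, List, Optional

MAX_FLAGS = 3  # hypothetical budget; pick whatever limit the team agrees on


class FlagPolicyError(RuntimeError):
    """Raised when the flag set violates the expiry or budget rules."""


def expired_flags(expiry_dates: Dict[str, Optional[date]], today: date) -> List[str]:
    # Flags without an expiry date (None) are treated as long-lived and never expire.
    return sorted(
        name
        for name, expires in expiry_dates.items()
        if expires is not None and expires < today
    )


def enforce_flag_policy(
    expiry_dates: Dict[str, Optional[date]], today: date, max_flags: int = MAX_FLAGS
) -> None:
    """Time bomb: raise if any flag has expired or the flag budget is exceeded."""
    expired = expired_flags(expiry_dates, today)
    if expired:
        raise FlagPolicyError(f"expired flags must be removed: {', '.join(expired)}")
    if len(expiry_dates) > max_flags:
        raise FlagPolicyError(
            f"{len(expiry_dates)} flags defined, limit is {max_flags}; retire one first"
        )
```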
Flag Configuration Management
Approaches | Description | Pros | Cons |
---|---|---|---|
Hardcoded | Hardcode the flag or comment out parts of the code | Super simple | - Requires redeployment - Commented-out code introduces ambiguity and affects readability |
Parametrized configuration | Use environment variables or CLI flags on program startup | Allows reconfiguration without re-building an app | - Requires re-deployment or an application restart - Hard to scale with many applications |
Configuration files | Application reads configuration from a configuration file | - Configuration state of application is more easily visible - Is version-controlled | May require a re-deploy if the file is in version control |
App DB | Application reads configuration from DB | - Values configurable from DB - There will be some UI to toggle values | |
Distributed configuration | For distributed systems where there are multiple node replicas within a cluster | ||
Overriding configuration | Some default values are overridden based on some context like the environment | - Provides flexibility | - Overriding configs introduces debugging complexity - It also runs counter to Continuous Delivery ideals, as the overrides config may not have been tested in CI/CD |
Per-request overrides | Based on the Toggle Context | - Very dynamic, granular level of control - Allows for A/B testing and Canary testing | - More complex - Security risks as unreleased (and probably less tested) features are accessible by the public [1] |
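
To make the per-request row concrete, here is a minimal sketch of overrides carried in a request header and protected with an HMAC signature, so that only holders of the key can switch features on. The token format, `SECRET_KEY`, `sign_overrides` and `effective_flags` are assumptions for illustration; only the standard-library `hmac`, `hashlib` and `json` calls are real APIs.

```python
import hashlib
import hmac
import json
from typing import Dict, Optional

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from a secret store in practice


def _signature(payload: str, key: bytes) -> str:
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_overrides(overrides: Dict[str, bool], key: bytes = SECRET_KEY) -> str:
    """Produce a '<json>.<hex signature>' token to send in a header or cookie."""
    payload = json.dumps(overrides, sort_keys=True, separators=(",", ":"))
    return f"{payload}.{_signature(payload, key)}"


def effective_flags(
    defaults: Dict[str, bool], token: Optional[str], key: bytes = SECRET_KEY
) -> Dict[str, bool]:
    """Apply overrides from the token only if its signature checks out."""
    if not token:
        return dict(defaults)
    payload, _, signature = token.rpartition(".")
    expected = _signature(payload, key)
    # Constant-time comparison; unsigned or tampered tokens are ignored.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return dict(defaults)
    overrides = json.loads(payload)
    # Only flags that already exist in the defaults can be overridden.
    known = {name: bool(value) for name, value in overrides.items() if name in defaults}
    return {**defaults, **known}
```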
Footnotes
1. This can be mitigated by cryptographically signing the override configurations.