Feature Flagging

Definition

The fundamental facility provided by Feature Toggles - being able to ship alternative codepaths within one deployable unit and choose between them at runtime

By Pete Hodgson

Concepts / Terminology

Terminology	Description
Toggle Router	Function that controls which code branch to take
Toggle Point	The location in the code where logic branches based on a flag
Toggle Configuration	The location where flags live. Examples include config files, an in-memory store, a DB, or a distributed system with a fancy UI
Toggle Context	Additional information, such as the user making the request, used to decide which code path to take. This can take the form of cookies or HTTP headers that represent the user making the request.
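
As a rough sketch of how the four terms relate, the snippet below labels each one in a few lines of Python. Every name in it (TOGGLE_CONFIGURATION, INTERNAL_USERS, ToggleContext, is_enabled, checkout_page) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Toggle Configuration: where flag values live (here, an in-memory dict).
TOGGLE_CONFIGURATION: dict[str, bool] = {"next-gen-ecomm": False}

# Hypothetical set of internal users who always get the new code path.
INTERNAL_USERS = {"alice"}


@dataclass(frozen=True)
class ToggleContext:
    """Toggle Context: per-request information, e.g. parsed from a cookie or header."""
    user_id: str


def is_enabled(flag: str, context: ToggleContext) -> bool:
    """Toggle Router: decides which code branch to take."""
    if context.user_id in INTERNAL_USERS:
        return True
    return TOGGLE_CONFIGURATION.get(flag, False)


def checkout_page(context: ToggleContext) -> str:
    # Toggle Point: the place where the code branches on the flag.
    if is_enabled("next-gen-ecomm", context):
        return "new checkout"
    return "legacy checkout"
```

With this wiring, checkout_page(ToggleContext("alice")) takes the new branch while any other user falls back to the configured value.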

Types of Feature Flags

Type	Description	Owner	Context	Longevity	Location
Release flags	Decouple feature release from code deployment. Allow developers to merge non-production-ready code, minimizing the risk of merge conflicts from long-running branches. Release flags should be a last resort; prefer smaller releases first.	Developers	Build time / run time	One or two weeks	Core
Ops flags	Used by system operators to control the application’s operational behaviour, for example to stop less important but memory-hungry features from running during periods of high load. Usually short-lived and retired once operational confidence is developed. That said, a small number can remain as long-lived Kill Switches.	Operations	Run time	Months to years	Core
Experiment flags	Allows for A/B testing. This is inherently dynamic (by user/request), but the user/request configuration itself is relatively static	Product	Run time	Weeks to months	Edge
Permission flags	Allows for different users to experience the application differently. Users can be separated by internal/alpha/beta/premium. This mechanism allows for dogfooding.	Product	Run time	Months to years	Edge
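
To illustrate why an experiment flag is dynamic per user while its configuration stays static, here is a minimal sketch of deterministic bucketing. The function name experiment_bucket and the choice of SHA-256 are assumptions made for this example, not something the note prescribes.

```python
import hashlib


def experiment_bucket(experiment: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the user falls inside the experiment's treatment group."""
    # Hash the experiment name together with the user id so the same user
    # always lands in the same bucket for a given experiment.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # value in 0..99
    return bucket < rollout_percent
```

The only configuration is the static rollout_percent; the per-user variation comes entirely from the Toggle Context (the user id).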

Dynamic routing vs dynamic configuration

Transclude of Feature-Toggles#^105406

NOTE

The longevity of the flag affects how it should be implemented

Implementation Techniques

Decoupling decision points from decision logic

Consider the following example:

features: dict[str, bool] = fetch_feature_flags()

def generate_invoice_email():
    base_email = build_email_for_invoice()
    # Toggle point: the flag is looked up inline, by name
    if features['next-gen-ecomm']:
        return add_order_cancellation_content_to_email(base_email)
    return base_email

This is undesirable because the decision point (generate_invoice_email) has to be aware of the decision logic (the features['next-gen-ecomm'] lookup).

A better solution looks like this:

import pydantic

features: dict[str, bool] = fetch_feature_flags()
# The decision object hides which flag drives each decision
config: pydantic.BaseModel = create_feature_decisions(features)

def generate_invoice_email():
    base_email = build_email_for_invoice()
    if config.include_order_cancellation_email():
        return add_order_cancellation_content_to_email(base_email)
    return base_email

This way, the decision logic can evolve independently of the decision point.

For example, the following can change without touching generate_invoice_email:
- the feature flag controlling the decision
- the reason behind the decision (e.g. moving from a static config file to A/B testing)
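
The note does not show create_feature_decisions itself. One possible shape, sketched here with a plain dataclass rather than pydantic so that it runs with the standard library only, is below; the FeatureDecisions class name is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureDecisions:
    """Hypothetical decision object: the only place that knows flag names."""
    features: dict[str, bool]

    def include_order_cancellation_email(self) -> bool:
        # Today the decision is driven by a single flag. It could later depend
        # on several flags or on an A/B test without changing any caller.
        return self.features.get("next-gen-ecomm", False)


def create_feature_decisions(features: dict[str, bool]) -> FeatureDecisions:
    return FeatureDecisions(features)
```

Callers only ever ask include_order_cancellation_email(), so renaming or replacing the underlying flag stays a one-line change inside the decision object.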

Inversion of control

Pass the decision object into the decision point instead of reading it from a global. This makes the code more configurable and testable:

features: dict[str, bool] = fetch_feature_flags()
config: pydantic.BaseModel = create_feature_decisions(features)
 
def generate_invoice_email(config: pydantic.BaseModel):
    base_email = build_email_for_invoice()
    if config.include_order_cancellation_email():
        return add_order_cancellation_content_to_email(base_email)
    return base_email
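
One benefit of injecting the decision object is that a test can pass in a stub without touching any flag store. The sketch below assumes placeholder versions of build_email_for_invoice and add_order_cancellation_content_to_email, and a hypothetical StubDecisions class, purely so that it runs on its own.

```python
class StubDecisions:
    """Test double standing in for the real decision object."""

    def __init__(self, include_cancellation: bool) -> None:
        self._include_cancellation = include_cancellation

    def include_order_cancellation_email(self) -> bool:
        return self._include_cancellation


# Placeholder helpers so the sketch is self-contained.
def build_email_for_invoice() -> str:
    return "Your invoice"


def add_order_cancellation_content_to_email(email: str) -> str:
    return email + " | How to cancel your order"


def generate_invoice_email(config) -> str:
    base_email = build_email_for_invoice()
    if config.include_order_cancellation_email():
        return add_order_cancellation_content_to_email(base_email)
    return base_email


def test_invoice_email_without_cancellation() -> None:
    assert generate_invoice_email(StubDecisions(False)) == "Your invoice"


def test_invoice_email_with_cancellation() -> None:
    email = generate_invoice_email(StubDecisions(True))
    assert email.endswith("How to cancel your order")
```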

Use the Strategy Pattern

This removes the if statement from the decision point and simplifies the call site.

def generate_invoice_email(content_enhancer):
    base_email = build_email_for_invoice()
    return content_enhancer(base_email)

where content_enhancer is either a strategy function or a callable instance of a concrete strategy class.
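
A minimal sketch of how the strategy might be chosen and wired in follows. The two enhancer functions, choose_content_enhancer, and the placeholder build_email_for_invoice are all assumptions for illustration.

```python
from typing import Callable

ContentEnhancer = Callable[[str], str]


def identity_enhancer(email: str) -> str:
    """Strategy used when the feature is off: leave the email unchanged."""
    return email


def cancellation_enhancer(email: str) -> str:
    """Strategy used when the feature is on."""
    return email + "\nTo cancel your order, reply to this email."


def choose_content_enhancer(include_cancellation: bool) -> ContentEnhancer:
    # The flag is consulted once, at wiring time, not at every call site.
    return cancellation_enhancer if include_cancellation else identity_enhancer


def build_email_for_invoice() -> str:
    return "Invoice"  # placeholder so the sketch runs on its own


def generate_invoice_email(content_enhancer: ContentEnhancer) -> str:
    base_email = build_email_for_invoice()
    return content_enhancer(base_email)
```

The trade-off is that the branching moves to wherever the strategy is selected, so that selection point should stay small and easy to find.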

Tips for Working with Feature Flags

How to minimize the complexity of feature-flagged systems?

Prefer static configuration
- Allows for source control, observability, and reproducibility
Expose the application’s current configuration state
- via a file
- or some metadata API endpoint
Use a structured (schema-ful) configuration file, with metadata for the user, including:
- Description and recommendation of when to use a flag
- Creation date
- Primary developer contact
- Expiration date for short-lived flags
Have a process and strategy for managing different types of flags
- How to test feature flags?
- How to prevent feature flag sprawl?
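
The metadata fields listed above could be captured in a small schema, and the same structure can be serialised to expose the current configuration state. This is only a sketch: FlagDefinition, describe_flags, and the sample values (dates, contact address) are made up for the example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FlagDefinition:
    name: str
    enabled: bool
    description: str          # what the flag does and when to use it
    created: date             # creation date
    owner: str                # primary developer contact
    expires: Optional[date] = None  # only set for short-lived flags


FLAGS = [
    FlagDefinition(
        name="next-gen-ecomm",
        enabled=False,
        description="Adds order cancellation content to invoice emails.",
        created=date(2024, 1, 15),
        owner="dev-team@example.com",
        expires=date(2024, 3, 1),
    ),
]


def describe_flags(flags: list[FlagDefinition]) -> str:
    """Serialise the current flag state, e.g. for a file or a metadata endpoint."""
    # default=str turns the date fields into ISO-formatted strings.
    return json.dumps([asdict(f) for f in flags], default=str, indent=2)
```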

How to test feature flags?

Rather than testing every combination of flags, it’s usually sufficient to test two configurations:
- All flags that are expected to be on in the next release
- All flags on (assuming the default state is “off”)
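
The two configurations can be generated from the list of known flags and fed to the same test suite. The sketch below assumes hypothetical flag names and a run_tests callable that accepts a flag dictionary.

```python
def release_candidate_config(planned_on: set[str], all_flags: set[str]) -> dict[str, bool]:
    """Flags expected to be on in the next release are on; everything else is off."""
    return {flag: flag in planned_on for flag in all_flags}


def all_on_config(all_flags: set[str]) -> dict[str, bool]:
    """Every known flag switched on."""
    return {flag: True for flag in all_flags}


ALL_FLAGS = {"next-gen-ecomm", "new-search"}  # hypothetical flag names
PLANNED_ON = {"new-search"}                   # hypothetical release plan

CONFIGS_UNDER_TEST = [
    release_candidate_config(PLANNED_ON, ALL_FLAGS),
    all_on_config(ALL_FLAGS),
]


def run_suite(run_tests) -> list[bool]:
    """Run the same test callable once per configuration."""
    return [run_tests(config) for config in CONFIGS_UNDER_TEST]
```

This keeps the number of test runs at two regardless of how many flags exist, instead of growing exponentially with the flag count.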

How to prevent feature flag sprawl?

Put expiration dates on flags; after that date the flag defaults to one state or the other
A more extreme option is a time bomb that fails a test or prevents the application from starting once a flag has expired
Limit the number of flags. Once a limit is reached, some flags have to be deprecated before new ones are added
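
The time bomb and the flag limit can both be expressed as a check that runs in a test or at startup. The following is a sketch under assumptions: the MAX_FLAGS value and the function names are hypothetical.

```python
from datetime import date

MAX_FLAGS = 20  # hypothetical flag budget


def expired_flags(expiry_dates: dict[str, date], today: date) -> list[str]:
    """Return the names of flags whose expiration date has passed."""
    return sorted(name for name, expires in expiry_dates.items() if expires < today)


def check_flag_hygiene(expiry_dates: dict[str, date], today: date) -> None:
    """Time bomb: raise if any flag has expired or the flag budget is exceeded."""
    stale = expired_flags(expiry_dates, today)
    if stale:
        raise RuntimeError(f"Expired feature flags must be removed: {stale}")
    if len(expiry_dates) > MAX_FLAGS:
        raise RuntimeError(f"Too many feature flags: {len(expiry_dates)} > {MAX_FLAGS}")
```

Calling check_flag_hygiene from a unit test gives the milder variant (a failing build); calling it during application startup gives the stricter one.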

Flag Configuration Management

Approaches	Description	Pros	Cons
Hardcoded	Hardcode the flag or comment out parts of the code	Super simple	- Requires redeployment - Commented-out code introduces ambiguity and affects readability
Parametrized configuration	Use environment variables or CLI flags on program startup	Allows reconfiguration without re-building an app	- Requires re-deployment or an application restart - Hard to scale with many applications
Configuration files	Application reads configuration from a configuration file	- Configuration state of application is more easily visible - Is version-controlled	May require a re-deploy if the file is in version control
App DB	Application reads configuration from DB	- Values configurable from DB - There will be some UI to toggle values
Distributed configuration	For distributed systems where there are multiple node replicas within a cluster
Overriding configuration	Some default values are overridden based on some context like the environment	- Provides flexibility	- Overriding configs introduces debugging complexity - It also runs counter to Continuous Delivery ideals, as the overridden config may not have been tested in CI/CD
Per-request overrides	Based on the Toggle Context	- Very dynamic and granular level of control - Allows for A/B testing and canary testing	- More complex - Security risks as unreleased (and probably less tested) features are accessible by the public¹

¹ This can be mitigated by cryptographically signing the override configurations.
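
One way to sign per-request overrides is an HMAC over the serialised override payload, using Python's standard hmac module. This is a sketch rather than a hardened implementation: SECRET_KEY, sign_overrides, and verify_overrides are hypothetical names, and a real key would come from a secret store.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; never hardcode in practice


def sign_overrides(overrides: dict[str, bool]) -> tuple[str, str]:
    """Return (payload, signature) suitable for a cookie or header value."""
    payload = json.dumps(overrides, sort_keys=True)
    signature = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return payload, signature


def verify_overrides(payload: str, signature: str) -> dict[str, bool]:
    """Return the overrides if the signature is valid, otherwise no overrides."""
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, signature):
        return {}
    return json.loads(payload)
```

A request carrying a tampered payload or a wrong signature simply gets no overrides and falls back to the default configuration.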
