Hacker News

baq today at 10:03 AM

> it just reminds me of how feature flags can be misused as application configuration/customization. An antipattern I could observe at various organizations already.

Feature flags are perfect for configuration and customization. Why using them for this purpose is 'misuse' is beyond me, and I've heard this claim from multiple people. They're literally configuration: a feature with a flag to turn it on or off, or to give it a value. Where's the misuse? Is it a problem that I'm not running experiments when switching over from Redis to Valkey or whatever?


Replies

ZephyrBlu today at 10:25 AM

Feature flags need to be treated as short-lived and experimental; otherwise they end up getting abused for everything and make it very difficult to reason about your application.

If it's config/customization, it should be in code. If it's experimental it can be a flag until it solidifies, and then it needs to get moved to code.

When I was at Shopify a couple of years ago, they mandated that feature flags had to be short-lived (2-4 weeks tops, with some exceptions) because flags would otherwise get left in the code for months or never cleaned up at all. At that point it's hard to tell whether something is genuinely a "feature flag" or just a normal part of the system.
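To make the lifetime rule concrete, here is a minimal sketch of a staleness check that could run in CI. It is not Shopify's tooling: the `FeatureFlag` record, the `exempt` field, the flag names, and the 4-week limit (taken from the upper end of the "2-4 weeks" figure) are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed policy: the upper end of the "2-4 weeks" range mentioned above.
MAX_LIFETIME = timedelta(weeks=4)


@dataclass(frozen=True)
class FeatureFlag:
    """Hypothetical flag record; real flag systems store more than this."""
    name: str
    created: date
    exempt: bool = False  # e.g. an approved long-lived exception


def stale_flags(flags, today):
    """Return the names of non-exempt flags older than MAX_LIFETIME."""
    return [
        f.name
        for f in flags
        if not f.exempt and today - f.created > MAX_LIFETIME
    ]


# Hypothetical flags, purely for illustration.
flags = [
    FeatureFlag("new_checkout", date(2024, 1, 1)),
    FeatureFlag("valkey_migration", date(2024, 2, 20)),
    FeatureFlag("apple_pay_killswitch", date(2023, 6, 1), exempt=True),
]

print(stale_flags(flags, today=date(2024, 3, 1)))  # ['new_checkout']
```

A check like this only reports overdue flags; whether it should fail the build or just open a cleanup ticket is a policy choice.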

Feature flags being flipped in prod was also a major source of incidents, in part because people didn't treat them as experimental, with the risk profile that implies.

The only exception where long-lived flags were useful and required was operational killswitches (e.g. disabling Apple Pay because it's having issues), but that is explicitly not application config.
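A rough sketch of what such a killswitch might look like, kept separate from both experiment flags and application config. The `KillSwitches` class and the payment method names are hypothetical; a real system would back this with a shared store rather than an in-memory set.

```python
class KillSwitches:
    """Hypothetical in-memory registry of operational killswitches."""

    def __init__(self):
        self._disabled = set()

    def disable(self, name):
        """Turn a capability off, e.g. during a provider outage."""
        self._disabled.add(name)

    def enable(self, name):
        """Turn a capability back on; a no-op if it was never disabled."""
        self._disabled.discard(name)

    def is_enabled(self, name):
        return name not in self._disabled


def available_payment_methods(switches):
    """Filter an assumed list of payment methods through the killswitches."""
    methods = ["card", "apple_pay"]
    return [m for m in methods if switches.is_enabled(m)]


switches = KillSwitches()
switches.disable("apple_pay")  # the provider is having issues
print(available_payment_methods(switches))  # ['card']
```

Everything defaults to enabled, so the switch only ever removes a capability temporarily. That is what keeps it operational rather than a way to configure behavior.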

jeremyjh today at 10:42 AM

One well-known issue is that when you have a lot of separate feature flags that can interact, you explode the number of test cases you have to cover. For example, if you have three on/off feature flags that can interact in a module that has 100 test cases, there are 2^3 = 8 possible flag combinations, so you actually have 800 test cases if you are going to test every combination. Many teams don't test them all because they "already know" that doesn't apply here, and then find out in production which combination of feature flags is unworkable.
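The arithmetic can be checked with a short sketch. It assumes each flag is a plain on/off boolean and that every existing test is rerun under every combination; the flag names are placeholders.

```python
from itertools import product

# Placeholder flag names; each flag is assumed to be a simple boolean.
FLAGS = ["flag_a", "flag_b", "flag_c"]
BASE_TEST_CASES = 100

# Every on/off assignment across all flags: 2 ** len(FLAGS) of them.
combinations = [
    dict(zip(FLAGS, values))
    for values in product([False, True], repeat=len(FLAGS))
]

total_runs = len(combinations) * BASE_TEST_CASES
print(len(combinations), total_runs)  # 8 800
```

Each additional boolean flag doubles the total, so ten interacting flags would already mean 1,024 combinations per test case.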