As an internal tools PM, I am unfamiliar with the industry standard for feature flag management. I noticed that we had no log of when flags were used or by whom, no record of which stories had flags, and flags that hadn’t been removed after 12+ months. This led to numerous issues, such as developers claiming flags had been turned on three months earlier. To address these issues, we set up a workflow for tracking flags, enabling and disabling them, and creating tickets to delete them. This setup has been smooth so far. I am curious about other processes used to ensure proper flag management.
I’m seeking your expert advice as Product Managers. What processes do you use to ensure flags are being managed properly? TIA.
It’s included in our epic template, so to speak. The FF must be enabled for release. If no significant concerns are discovered after 2 weeks, the FF is deleted. The epic cannot be closed until all tasks have been finished. In other words, the FF serves as a crucial checkpoint to ensure the stability and readiness of the epic before its release. It acts as a safeguard to prevent any potential issues from arising once the epic is deployed to production.
The large companies where I’ve worked used the same technology for A/B testing and feature flags; a flag was just a test ramped to 0% or 100%. Then you only need to check your control panel to view your test flags. Additionally, the fact that these stupid 100% “tests” are sitting there serves as a wonderful reminder to clean them up. Tests could be ramped up or down independently by PMs. This approach allowed for easy monitoring and management of experiments, ensuring that any potential issues or bugs could be quickly identified and addressed. Moreover, the flexibility provided by the independent ramping up or down of tests by PMs allowed for seamless adjustments and optimisations based on real-time data and user feedback.
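To make the "a flag is just a test at 0% or 100%" idea concrete, here is a minimal sketch in Python. It is not how any particular vendor does it; the function names (`bucket`, `is_enabled`, `fully_ramped`) and the hash-based bucketing are my own assumptions for illustration.

```python
import hashlib


def bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag (hypothetical helper)."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, flag_name: str, rollout_percent: int) -> bool:
    """Treat the flag as a test whose treatment share is rollout_percent."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    # 0 means nobody gets the feature, 100 means everybody does.
    return bucket(user_id, flag_name) < rollout_percent


def fully_ramped(flags: dict[str, int]) -> list[str]:
    """Names of 'tests' sitting at 100%, i.e. candidates for cleanup."""
    return sorted(name for name, pct in flags.items() if pct == 100)
```

Because the bucket depends only on the flag name and user ID, a user stays in the same group as the percentage is ramped up, and `fully_ramped` is the "reminder to clean up" list from the control panel.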
They also act as circuit breakers in case something goes wrong: "There is a little service interruption. Turn off the feature until it starts working again." Degradation is preferable to total failure. In such cases, these circuit breakers help prevent any further damage by automatically disconnecting the faulty part from the rest of the system. This allows for easier troubleshooting and repairs, ensuring that the overall functionality of the system is maintained while addressing the specific issue causing the interruption.
I’ve witnessed both.
Eng is in charge of making sure new features are included in the code. PMs are the ones who enable the rollout of flagged features to users.
Engineering is involved in deployment to the production environment, but the product decides whether to make it accessible to users.
I’ve occasionally observed Eng go through and delete the feature hooks once the feature is stable and has been in production for some time without issues.
I’ve observed feature flags connected to monitoring and observability in other, more established, and larger organisations. Automation would then turn on a feature flag for features reliant on the service if the health check on that service fails (resulting in a service outage).
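For anyone who wants to picture that automation, here is a rough Python sketch. Everything in it is assumed for illustration: the in-memory `FlagStore`, the `dependents` mapping, and the choice to switch dependent features off when the health check fails (if your flag represents a "degraded mode" instead, the logic would be inverted).

```python
from typing import Callable, Dict, List


class FlagStore:
    """Hypothetical in-memory flag store; a real setup would call a flag service."""

    def __init__(self) -> None:
        self._flags: Dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)


def apply_health_check(
    store: FlagStore,
    service: str,
    is_healthy: Callable[[str], bool],
    dependents: Dict[str, List[str]],
) -> List[str]:
    """Switch off every flag that depends on `service` if its health check fails.

    Returns the names of the flags this call switched off.
    """
    if is_healthy(service):
        return []
    switched_off = []
    for flag in dependents.get(service, []):
        if store.is_enabled(flag):
            store.set(flag, False)
            switched_off.append(flag)
    return switched_off
```

Note that this sketch never turns a flag back on by itself; re-enabling is left as a deliberate human decision once the service has recovered.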
In this case, A/B experiment flags and feature flags were separated. PMs are always in charge of the finer points of A/B traffic distribution, and this is typically the case in smaller organisations too.
You, the PM, TPM, or PO, typically own these things:
- Feature flags: You don’t actually turn it on in the environment; you just tell the SE to do that for feature X. For example, you might be in a team meeting with other PMs and learn that an API you consume is not working right now.
- What is dark deployed: It might be owned by a governance team… sometimes. I have seen that at a fintech firm.
- White- or black-listed zip codes, addresses, etc. for any UAT: QA might own this too or just do the updates.
If it’s in the epic, then I don’t think it’s OPEX. Outside your question, though.
Managing feature flags effectively is crucial for ensuring smooth software development and release processes. Here are some best practices and processes to ensure proper feature flag management:
- Flag Lifecycle Management:
  - Creation: Ensure that new flags are created with clear descriptions and purpose. It’s important to document why a flag was created and what it controls.
  - Usage Tracking: As you mentioned, logging usage of flags is critical. Implement a system that records when a flag is enabled or disabled, who performed the action, and why it was changed. This can help in troubleshooting and accountability.
  - Review and Cleanup: Regularly review active flags and their usage. Flags that have been in use for a long time, especially if they were for temporary purposes, should be evaluated for removal. Implement a process to automatically create tickets or reminders for flag cleanup after a certain period, like 12 months.
  - Archiving: Flags that are no longer needed but should be preserved for historical purposes should be archived, not deleted. This ensures you have a record of past configurations.
- Change Management:
  - Use a change management process to control who can create, modify, or delete flags. This can include code review processes or approvals by senior team members.
- Documentation:
  - Maintain a central repository of flag documentation that includes the purpose, usage guidelines, and any relevant historical information. This helps new team members understand why flags exist and how they should be used.
- Naming Conventions:
  - Use a consistent and descriptive naming convention for flags. This makes it easier to understand their purpose and reduces confusion.
- Testing and Validation:
  - Before enabling a flag in production, ensure that it has been thoroughly tested in various environments. Use automated tests, staging environments, and QA processes to verify that the flag behaves as expected.
- Monitoring and Alerts:
  - Implement monitoring for flags in production. If a flag behaves unexpectedly or causes issues, you should be alerted immediately.
- Communication:
  - Keep the team informed about flag changes. Use tools like Slack channels or email notifications to communicate when flags are enabled, disabled, or about to be removed.
- Rollback Plans:
  - Always have a rollback plan in case a flag causes critical issues in production. This might involve being able to quickly disable a flag or revert to a previous state.
- Security Considerations:
  - Ensure that flags are not used to introduce security vulnerabilities. Review their impact on security regularly.
- Continuous Improvement:
  - Periodically review your flag management process and look for areas of improvement. Gather feedback from your development and operations teams to identify pain points and address them.
Remember that the specific processes and tools you use can vary depending on your organization’s size and needs. Implementing these best practices can help you maintain proper control and visibility over feature flags, leading to smoother development and release cycles.
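As a rough illustration of the usage tracking and cleanup points above, here is a small Python sketch of an audit log that records who changed a flag, when, and why, and that lists flags untouched for longer than a chosen period. The class and field names are hypothetical, and a real setup would persist this somewhere and open tickets rather than return a list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class FlagChange:
    flag: str
    enabled: bool
    changed_by: str
    reason: str
    changed_at: datetime


@dataclass
class FlagAuditLog:
    """Hypothetical in-memory audit log of flag changes."""

    changes: List[FlagChange] = field(default_factory=list)

    def record(self, flag: str, enabled: bool, changed_by: str,
               reason: str, changed_at: datetime) -> None:
        self.changes.append(FlagChange(flag, enabled, changed_by, reason, changed_at))

    def last_change(self, flag: str) -> Optional[FlagChange]:
        """Most recent change for a flag, or None if it was never recorded."""
        matching = [c for c in self.changes if c.flag == flag]
        return max(matching, key=lambda c: c.changed_at) if matching else None

    def stale_flags(self, now: datetime, max_age: timedelta) -> List[str]:
        """Flags whose most recent change is older than max_age (cleanup candidates)."""
        latest: Dict[str, datetime] = {}
        for c in self.changes:
            if c.flag not in latest or c.changed_at > latest[c.flag]:
                latest[c.flag] = c.changed_at
        return sorted(f for f, ts in latest.items() if now - ts > max_age)
```

With something like this in place, "that flag was turned on three months ago" becomes a lookup via `last_change`, and `stale_flags(now, timedelta(days=365))` gives the list to turn into cleanup tickets.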
Once the FF is no longer required, you have some technical debt to deal with to declutter the code. It is helpful to work with the devs to allocate an appropriate chunk of the team’s capacity (~20%) to keeping the code well maintained and documented, with up-to-date unit tests, etc.
@TinaGreist, and this underscores the fact that every feature comes with an overhead maintenance cost. It is important for businesses to carefully consider the value and necessity of each feature before implementing it, as the maintenance cost can impact the overall profitability. Additionally, regularly evaluating and optimizing existing features can help minimize the maintenance burden and ensure efficient resource allocation.
Totally agree @HeatherKurtz.
I’ve been educating myself and making an effort to embrace via negativa. You can enhance user experience while lowering maintenance needs by removing complexity. This approach involves eliminating unnecessary features or streamlining existing ones to focus on what truly adds value to the business and its customers. By simplifying the system, businesses can not only reduce maintenance costs but also improve user satisfaction and increase overall profitability.
Why would a PM do this?
Why would they not? Where I work, SE used to manage the flags before we had a tool for it, but even then the PM would simply tell them when to turn a flag on or off. Back then, SE had to release something to turn it on. Now that we have the necessary tool, the decision-maker is also in control.
PMs should concentrate on the functionality rather than the spaghetti monster, feature flag, or API that is powering it.
You need to take care of the tech issue.
Yes, the PM does indicate when to push something live and when to roll it back; however, the PM should not need to be aware of the feature flag behind it.
FF ought to be a blackbox to the product as a result.
@GerardKolan, Disagree. I believe focusing on functionality involves having the ability to turn things on and off. Give the business the power to control functionality so, in my opinion, the software engineers may concentrate on the challenging issues. For various organizations, I’m sure it varies.
Why would a PM engage in DevOps?
Well then start doing everything so each of your teams can focus on the hard things.
I don’t consider activating a feature or increasing the volume of consumer exposure to a product to be DevOps. For handling experiments and feature flags, we use Optimizely. Owning the feature flags made sense because the business wants to be in charge of the tests. By utilizing Optimizely, we can easily manage and control the experiments and feature flags, allowing us to make data-driven decisions. This level of ownership aligns with our business goals and ensures that we have full control over the testing process.
Each organization has its own unique methods, as you indicated. However, in all the companies where my friends and I have worked, we have never encountered a situation like that. This highlights the importance of adapting to different work environments and being open to new experiences. It also emphasizes the need for flexibility and the ability to navigate unfamiliar situations effectively.
So you just tell your developers to click the button instead of the PM?
A good question.
We provide them with guidelines for A/B experiments that cover:
a) the design of the experiment,
b) when to enable it, based on the app’s adoption,
c) when to conclude it, and
d) the analysis of offline data.
Teams from the tech and data departments work together, coordinate, and provide us with data. We then review it to decide whether to roll out the control or the test variant.