BRANCHING STRATEGIES

A sophisticated branching strategy can yield significant improvements

——————–

A branching strategy has several functional goals: create better software (fewer bugs); deliver software faster; allow for flexible (agile?) deployments, where an intelligent choice can be made as to which features are released, giving you the ability to do time-based releases with a variable feature list; and optimize programmer and tester usage by keeping everyone productively utilized over a project life cycle. There are several options for branching a code base, and various reasons have been given for each plan. Here I present my strategy and the reasoning behind it, and address common concerns.

The end goal is to have at least one branch of the code per environment that said code is deployed on.

In a typical web or client application, this implies that each developer has her own environment on a local machine, with a local database, or at least a dedicated copy of a schema on a shared machine. It is imperative that this level of isolation be achieved, either through interface definition (in the case of service-oriented architectures) or by copying schemas (in the case of classic n-tier architectures). Each branch will therefore represent the body of work of a single developer on a single feature deployed on a single environment, which in turn means that a developer may have multiple branches for different features. The advantage here is that a developer can focus on changing one part of an application without fear of other changes affecting her *during* development. Furthermore, unit testing is much more targeted and effective because of the guarantee that there are no “unknown” changes affecting the desired changes. Inter-related dependencies get resolved at a merge point.
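As a rough illustration of isolation through interface definition, the sketch below has feature code depend only on an agreed interface, with a stub supplying known good data on the developer's branch. The names CustomerDirectory, StubCustomerDirectory, and build_welcome_message are hypothetical and exist only for this example.

```python
from typing import Protocol


class CustomerDirectory(Protocol):
    """Hypothetical service interface agreed on before feature work starts."""

    def lookup_email(self, customer_id: int) -> str: ...


class StubCustomerDirectory:
    """Stub with known good data, used on a developer's isolated branch."""

    def __init__(self) -> None:
        # Made-up sample records standing in for the real service.
        self._emails = {1: "ada@example.com", 2: "grace@example.com"}

    def lookup_email(self, customer_id: int) -> str:
        return self._emails[customer_id]


def build_welcome_message(directory: CustomerDirectory, customer_id: int) -> str:
    """Feature code depends only on the interface, not on a live service."""
    email = directory.lookup_email(customer_id)
    return f"Welcome! A confirmation was sent to {email}."


if __name__ == "__main__":
    print(build_welcome_message(StubCustomerDirectory(), 1))
```

When the real implementation arrives at a merge point, it only has to satisfy the same interface; the feature code itself should not need to change.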

Once work is complete and unit tested in a developer's branch, it can be merged into a release candidate branch. In a typical organization this release candidate branch corresponds to a shared development environment. It is in this branch that initial integration testing among branches occurs. Changes merged into this branch are tagged, built, deployed, and tested as necessary. Any fixes that may be required can be made directly in this branch and deployed back into the shared environment, again isolating individual users. When this code branch has been accepted, it can be merged into the next branch, which typically corresponds to a test/QA/user acceptance environment. The same process of issue resolution and testing is followed there, until the final results are merged into the production release candidate branch. At this point you can make the argument that the production branch is a de facto trunk, and that any issues that arise from here should spawn bug fix releases. I agree with this argument, so in practice when code is ready for release it is merged into the trunk and the production release is built from the trunk.
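The promotion path described above can be sketched as a small piece of Python. This is only a minimal model under stated assumptions: the branch names are illustrative, and the rule that a merge may only go to the very next stage after testing is my reading of the process, not the behavior of any particular tool.

```python
from typing import Optional

# Promotion order from the text; the branch names themselves are assumptions.
PROMOTION_PATH = [
    "feature",               # one developer, one feature, one environment
    "release-candidate",     # shared development environment
    "qa",                    # test/QA/user acceptance environment
    "production-candidate",  # final stop before release
    "trunk",                 # production releases are built from here
]


def next_branch(current: str) -> Optional[str]:
    """Return the branch that `current` merges into, or None for the trunk."""
    index = PROMOTION_PATH.index(current)
    if index == len(PROMOTION_PATH) - 1:
        return None
    return PROMOTION_PATH[index + 1]


def can_promote(source: str, target: str, tests_passed: bool) -> bool:
    """Allow a merge only into the very next stage, and only after testing."""
    return tests_passed and next_branch(source) == target


if __name__ == "__main__":
    print(next_branch("feature"))
    print(can_promote("feature", "qa", tests_passed=True))
```

Skipping a stage, such as merging a feature branch straight into the QA branch, is rejected here because each stage corresponds to an environment in which the change must first be accepted.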

Several arguments are commonly made in favor of a simpler branching strategy. Most commonly heard is the argument that this strategy is simply too complex; in a real environment there will not be enough discipline to enforce it, causing a tremendous management overhead. The overhead introduced by the sophisticated method is mitigated in several ways. First, as time passes this becomes second nature; to put it another way, the main cost involved is the learning curve, not the day-to-day operation. Second, the cost of merging in a simple situation is trivial; literally the click of a button. The added overhead is again minimal, with the added benefit of a guaranteed clean release branch. Finally, any discussion of overhead must include the overhead of using development resources inefficiently. With simpler strategies, while the role of code librarian is eliminated or reduced in scope, the entire development team is essentially stopped while a build is tested or released. In summary, the reasons to pursue other branching strategies are inherently flawed and are fueled more by fear and ignorance than by sound planning.

A second common argument is that merging is time consuming and unreliable. This is usually stated in terms of multiple developers modifying the same file or set of files concurrently, such as when adding new modules to a project in Visual Studio (TM). The response to this is threefold. First, it is an error of technical direction to have more than one developer modifying the same working set on a regular basis (bug fixes aside), except in certain very constrained scenarios as mentioned previously. In other cases, development should be focused on coding to interfaces, not implementations, and on injecting stubs of known good data as early as possible to keep everyone moving forward. Since the scenarios where concurrent modification is acceptable are so few, the objection is largely invalid, and what remains can reasonably be managed by a single person handling code elevations and merges. This again shifts complexity and stoppage from the whole team to one person, which is a good thing. Second, current source control systems have very robust merging capabilities that take shared common ancestors into account to perform diff-of-diffs comparisons, resulting in much better reliability. Finally, following this strategy reveals a side benefit: by having one person (or role) handle the elevation and merge, you have introduced a rational checkpoint for validating changes and performing code review and documentation tracing. Since this is inherent in the strategy, it can easily be incorporated into a software lifecycle and, more importantly, can be planned for as a discrete task in a project plan.
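To make the role of the common ancestor concrete, here is a minimal sketch of the three-way decision rule that such merges rely on. It makes a loud simplifying assumption: all three versions have the same number of lines, so lines can be compared position by position. Real tools also handle inserted and deleted lines; this function is an illustration, not how any specific product implements merging.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited line lists using their common ancestor `base`.

    Simplifying assumption: all three versions have the same number of lines.
    Returns (merged_lines, conflict_line_numbers), line numbers starting at 1.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles versions of equal length")

    merged, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:
            # Both sides agree: unchanged, or changed identically.
            merged.append(o)
        elif o == b:
            # Only their side changed this line, so take their change.
            merged.append(t)
        elif t == b:
            # Only our side changed this line, so keep our change.
            merged.append(o)
        else:
            # Both sides changed the line differently: a person must decide.
            merged.append(o)
            conflicts.append(number)
    return merged, conflicts


if __name__ == "__main__":
    base = ["a", "b", "c"]
    print(three_way_merge(base, ["a", "B", "c"], ["a", "b", "C"]))
```

Without the ancestor, the two edited versions above would simply look different on two lines. With it, each change can be attributed to one side and applied automatically, leaving only genuine conflicts for the person handling the merge.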

Finally, an argument is usually made that all developers need all changes from all other developers at all times. This argument is the hardest to refute because it is based upon the assumption that work should not be isolated, an assumption usually born out of a history of poor architectural and design choices in which all components of a system are tightly coupled. The refutation is that by implementing this strategy, these tight couplings become obvious and become an opportunity to make the code base more robust; in essence, the strategy institutionalizes the ability to organically detect poor design or implementation choices. Unfortunately, this scenario can be a great impediment to change: if an existing system already suffers from tight coupling, it can be hard to move away from the existing code base.

In conclusion, a robust branching strategy gives you advantages in overall development time, code robustness, and process control. While the overall complexity of the process is not necessarily reduced, the complexity is shifted to a more efficient point, which allows all team members to be utilized efficiently and results in an overall reduction in development time. The process introduced by this branching strategy also ties stable branches to hardware environments, which leads to more robust deliveries. These advantages come with no additional aggregate overhead and, once the learning curve is overcome, can lead to substantial reductions in time spent on the bureaucratic portions of code control.

Multi Developer branching

Overview

Software Configuration Management (SCM) is the art of both controlling and tracking changes in a software project. If software engineering is fundamentally concerned with producing quality software in a known and repeatable fashion, then SCM is the fundamental tool that drives the operation of software engineering.

Why is Software Configuration Management important?

The formal goal of a software engineering project is to deliver a set of integrated software components. In practice, software engineering is often intertwined with systems engineering; that is, the portion of a project concerned with hardware, deployment, and other non-functional requirements. The practical goal for any project is then to deliver a stable product which includes not only software, but also the hardware specifications and configuration artifacts needed to realize the software system. Configuration management enables the delivery of stable software and further enhances an organization's ability to deliver revisions and new features by answering the fundamental question: “How do I reproduce a change that someone has made?” By tracking and controlling changes, SCM allows a program manager to answer that question and make intelligent decisions based on that information. For the programmer, configuration management is important in another way: research. Proper CM allows a programmer to explore changes and features that may lead to a dead end, without affecting the rest of the project. To summarize, CM provides reliability (artifacts are maintained in case of failure), flexibility (you can choose to go in a different direction), repeatability (you know what has changed) and isolation (research in one area won’t affect other development). Any one of these four is reason enough to employ intelligent SCM; taken together, they make it clear that SCM must be at the core of software development.

Who is responsible for configuration management?

In a word, everyone is responsible for configuration management. From a sales manager (on commercial projects) on down to the individual programmer, configuration management needs to be a part of everyday discipline. The programmer’s fundamental concern with configuration management is one of implementation: making sure that proper procedures are followed and that artifacts are revisioned often to prevent data loss and create a change trail. Architects and project management are responsible for defining the strategies to be employed by a group, in particular the way in which the implementers collaborate to achieve a stable release. Sales management or client-facing people are responsible for defining and communicating the features that drive that strategy, including needed release dates.

Roles within SCM

With an understanding that everyone is responsible for configuration management, the role of each individual needs to be determined. Roles can loosely be broken into two groups: planners and executors. The planner role is played collectively by the system architects, the sales team (or equivalent) and the IT management. The executor role is played by anyone who is involved in the day-to-day management of artifacts. One particular role that can be of high importance is that of the code librarian, who watches over the repository and communicates changes as necessary (more to be said on this later). As much as can be generalized, the code librarian is the arbiter of what is available for release, which can be further generalized as the role of release manager (the librarian may physically execute the release, while the release manager decides which features are released and when). Architects or project managers define what reasonable branches of work need to take place.

What artifacts are appropriate for revision control?

In software development it is pretty clear that source code needs revision control, along with associated development artifacts like project configuration files, but the scope of revision control is actually much larger. Any project documentation benefits from revision control, and even more from change management. Revision control, which maintains a history of changes, is only part of the issue. Of even greater import is controlling the changes. Any document that is used to communicate in the course of a project can benefit from change control. Requirements documents, design documents, scope documents, test cases and acceptance documents are all examples of artifacts that should be controlled and revisioned, even if they are ephemeral artifacts of a particular development phase. Many of these documents are already controlled in an ad hoc manner when using MS Word and its ability to track and accept changes! Taking the concept a step further, it is beneficial to control all aspects of a software development process, from the operating system to the compilers used. Unfortunately, this extreme view of revision control is seldom practiced, and its absence leads to many common problems. How many times have you been on a project where the versions of development tools or compilers have been different? Or where the development system version is different from that of the production system?
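One lightweight way to bring tool versions under control is to keep a pinned list of expected versions in the repository and compare each environment against it. The sketch below assumes the installed versions have already been collected into a dictionary by some other means; the function name find_version_drift and the tool names and version numbers are made up for illustration.

```python
def find_version_drift(pinned, installed):
    """Return {tool: (pinned_version, installed_version)} for every mismatch.

    A tool that is missing from `installed` is reported with None as its
    installed version.
    """
    drift = {}
    for tool, wanted in pinned.items():
        actual = installed.get(tool)
        if actual != wanted:
            drift[tool] = (wanted, actual)
    return drift


if __name__ == "__main__":
    # Hypothetical pinned manifest, kept under revision control with the code.
    pinned = {"compiler": "9.2", "build-tool": "3.1", "database": "12.4"}
    # Hypothetical versions found on one developer's machine.
    installed = {"compiler": "9.2", "build-tool": "3.0"}
    for tool, (wanted, actual) in find_version_drift(pinned, installed).items():
        print(f"{tool}: expected {wanted}, found {actual}")
```

Because the pinned manifest is itself a revisioned artifact, a change of compiler or database version becomes a tracked change like any other.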

So I was reading the latest issue of Dr Dobb’s and got to the piece by Scott Ambler called “Is Fixed-Price Software Development Unethical?” Mr. Ambler, as you may know, is the Agile Methodology guy at DDJ. It is really a fascinating article, and about something that I have been on about for some time now: you simply cannot do fixed cost software projects. Mr. Ambler took it a step further and questioned the ethics, which is really an interesting concept. As a quick side note, it is interesting that in an article on ethics, Mr. Ambler basically recycled an article he wrote more than a year ago. Journalistic ethics aside, the article does raise an interesting twist to the problem: can we ethically propose fixed cost software projects?

If you don’t want to RTFA, I can summarize a few of the more salient points. The key to understanding fixed cost software is that the intent is to mitigate risk for the stakeholder (the person wanting something done) by specifying boundaries of time and money or, as sometimes happens, to squeeze a large profit margin out of a project. Unfortunately, what ends up happening is that an inordinate amount of time and money is spent on Big Requirements Up Front (BRUF), which results in an untenable development model that assumes that all requirements are known and static. Following this to its logical conclusion, the fixed price project suffers at all subsequent phases: in the development process there is an inherent disincentive to allow for change management; the end product contains many portions that are unused and unnecessary; the end product fails to deliver on new (discovered) requirements; and finally, the project usually ends up late and over budget anyway. QED, the risk mitigation aspect is a feel-good fantasy at best for those unwilling or unable to understand the creative aspects of software development.

Now that we have established, or at least assumed, that Fixed Price Projects are A Bad Thing, I must examine the assertion that responding to an RFP that is fixed price is unethical. On the surface, doing something that one knows is wrong is pretty much the definition of unethical. I think that Mr. Ambler missed a point though. Responding to a fixed cost RFP is not the unethical part; putting said RFP out in the first place is the unethical part. There is nothing unethical in giving an organization what it wants, or at least thinks it wants. To be sure, it is better to propose a non-fixed price alternative, perhaps in addition to the RFP response, but it seems to me that that route is on the fast path to unemployment. As Agile Methodologies continue to be accepted, I predict that fewer and fewer organizations will want a fixed cost project anyway, but until then, keep responding to those RFPs and try to get the system changed from the inside.