In Favor of Architecture Design

An upfront architecture design phase can save a lot of time, and pain, before entering an Agile code development phase, particularly when complex requirements, high performance or new technology are involved.

Agile software development methodologies seem to dismiss architecture design in favor of incremental development and refactoring as needed. In my opinion, investing in upfront design not only accelerates projects, but can also avoid unpleasant surprises and painful delays. Architecture design and agile methodologies easily work hand in hand: the architecture design phase focuses explicitly on eliminating technical risk. Once the technical framework has been validated, implementation follows, applying the traditional agile methodologies.

The “Agile” argument goes as follows: identify a new user story / feature, write a test that fails (but that must pass to meet the user story), write the code to pass this test (as well as all preceding ones) – repeat. In the process, keep the code as simple as possible; since the code is simple and you have an extensive suite of tests that validate existing functionality, refactoring is easy and fast. While this methodology is indeed very powerful, it is not universally applicable. In addition, there are intrinsic advantages to upfront design.
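The test-first cycle described above can be sketched in a few lines of Python. This is only an illustration: the user stories (a cart total, then a volume discount), the function name `cart_total`, and the 10% / 100 threshold are hypothetical and do not come from any real project.

```python
# Hypothetical feature code, written only after the tests below
# existed and failed ("red"), and kept as simple as possible.
def cart_total(prices, discount_threshold=100.0, discount_rate=0.10):
    subtotal = sum(prices)
    if subtotal >= discount_threshold:
        subtotal *= 1 - discount_rate
    return round(subtotal, 2)


# Story 1: the cart shows the sum of its items.
def test_sums_prices():
    assert cart_total([10.0, 15.5]) == 25.5


# Story 2 (added in a later iteration): orders of 100 or more get 10% off.
# The story 1 test must keep passing once this code is added.
def test_discount_applies_at_threshold():
    assert cart_total([60.0, 40.0]) == 90.0
```

Each new story adds one test and the smallest change that makes it pass. The accumulated tests are what make later refactoring cheap.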

The main advantage of upfront design is “doing it right the first time”. By spending time upfront analyzing all the requirements, and technical challenges, and by evaluating competing approaches, one can avoid many dead-ends that one encounters when following an incremental approach. In the worst case of incremental design, one may run into a “killer” requirement near the end of the project which causes a complete refactoring of what’s been done before.

Further, even if one ends up with the right implementation in the end, one will simply save time by coming up with the “right design” the first time, and thus avoiding multiple refactoring efforts. While some issues only come up as one codes, spending sufficient time upfront will almost always eliminate unnecessary iterations.

In some cases, a phase solely dedicated to architecture design is clearly warranted. For example:

  • To partition a complex project into multiple components that can be handed off to a team of developers
  • To work through complex – and possibly conflicting – requirements
  • To ensure critical performance, resource utilization or scalability requirements
  • To validate the suitability of new technology that will be incorporated into the product: completeness of features, interfaces, or performance and scalability.
  • To validate with end users the usability of User Interfaces

In particular, features that impact different layers of the code (e.g. UI, business logic, database) need upfront design in order to avoid time-wasting back and forth between developers. Letting the whole team work it out is simply not efficient. A recent example for us was enabling an application for multi-tenancy. Similarly, I have found that any project that involves clustering, fault-tolerance or high performance requires a dedicated and focused design – and validation – effort. Finally, incorporating any new technology – like an open-source package – must go through a prototyping phase: you never quite get what you expect.

By the way, an architecture design phase should itself follow the principles of Agile software development: keep it simple, and use incremental milestones that demonstrate completion of a subset of requirements.

To be effective, the architecture / design phase must limit itself to what is strictly necessary, namely what motivated the design effort in the first place: e.g. functional partitioning or performance validation.  Anything that can be left to implementation must be.

Finally, the design phase must be concluded with a design review! More on this later.

Pair Programming – Does Anyone Do It?

Pair programming is not as efficient as individuals working on their own, but it provides valuable benefits: code reviews and joint ownership of the code.

I was surprised to read an article in the New York Times, “For Writing Software, a Buddy System”, that advocated 100% pair programming.

The New York Times article and Wikipedia give good definitions of pair programming, so I’ll only mention that the main idea behind this methodology is that “two heads are better than one”. While one engineer actually types in the code, the other reviews it, not only for typos, but also for all kinds of “gotchas” in the design or the implementation.

It is a no-brainer that a pair programming setup will lead to better code, written faster, than a single programmer would produce. The more difficult question, however, is whether this is more productive than two developers writing code on their own. In my estimate, no: two programmers working independently will produce more quality code than a programming pair.

I have to admit I have never actually tried pair programming with my teams – mainly because I don’t personally know anybody who has actually done it, and could have overridden my prejudice against it.

This being said, there are benefits that we can leverage from pair programming:

  • Code reviews: are definitely worthwhile. The time spent by a peer, or preferably by a technical lead, reviewing the code is a good investment against basic bugs and errors of interpretation in the spec or the design. Code reviews help ensure consistency across the product in various areas such as configuration, initialization, error handling, logging, resource management, etc. Consistency leads to a higher quality product.
  • More than 1 person knowing any given piece of code: is also great practice. Not only is it good insurance should the original programmer fall under the proverbial bus, but it also helps in debugging, and gives everyone a broader picture of the whole project. It also reinforces the XP principle that the whole team owns the whole code, rather than having a set of individuals who own certain pieces of the code. This shared knowledge is particularly helpful when doing troubleshooting.  Finally, this also builds team spirit, and one always learns from the work of others.

What are your thoughts on pair programming? I would love to hear comments from people who have implemented pair programming on a production project.

MVP – Minimum Viable Product

Defining the Minimum Viable Product requires selecting a segment of target customers and delivering the smallest critical mass of features – as early as possible – provided that you can charge a high enough price for it.

I have recently discovered, with great delight, Eric Ries’ “Startup Lessons Learned” blog, and in particular, his post about Minimum Viable Product (MVP). This is not surprising, since we are both fans of Steve Blank’s Customer Discovery Process.

Eric’s post reminded me how critical, yet how difficult in practice, the concept of Minimum Viable Product is.

Defining the minimum viable product correctly allows you to release products that are valuable to your customers with the minimal amount of energy and time invested – because as the name says, you have done the minimum, and yet you provide value. Said differently, if you only need to have 2 features in your product in order to sell it for $100, then you’d be crazy to spend the extra effort to add a 3rd or a 4th feature. Plus, by only delivering the minimum, you get to market fast – and hopefully beat the competition.

So why is this so difficult in practice … at least in my experience 🙂 ?

My first answer is that it is a lot easier to define the Maximum Product than it is to define the Minimum Viable Product.

Defining the Maximum Product  entails compiling a list of all the possible features that your product could possibly have: you only need to talk to a handful of customers and take good notes. Critical thinking is not required. It is easy to get consensus on the Maximum Product: More is always better. The only problem is that no company can afford the time it takes to deliver this “ideal” product. Hence this need for the MVP.

The first step in defining the MVP is the one that is most often overlooked: you first need to define the segment of your customers that you target with the new product. The segment has to be small enough to group customers with similar requirements, but large enough that your new product will generate enough revenue.

The second step is to define the theme of the product in terms of benefits (not features). One of the best ways to define this theme is to imagine that you are putting up a huge billboard on 101 (one of the main arteries of Silicon Valley) that will advertise the new product: what does the billboard say?

The third and final step is to define the critical mass of features in the release. In this step, ruthless time vs feature vs price trade-offs need to be made, because the question is not just “What features do our target customers absolutely need?” (this list will always be too long), but rather: “Will our customers be willing to buy the product with these features – available at this date – at this price?” Economically, this question may have multiple correct answers. However, in practice, presented with this question, customers will often select a date in the near term, which in turn defines the minimum viable product.

Who Owns Quality? Part 5 and end

By testing early, we improve the predictability of the release, and we shorten the time to release.

Let us now turn to how our early focus on quality impacts methodology and release management.

Account for testing time in the plan

The most visible impact is that each developer must account for the time to fully test the code in his/her task estimates. It also behooves the release lead (scrum master) to remind developers to include testing time in their estimates. So each task must include: design, coding and unit tests, testing brainstorm with QA, building test fixtures, generation of test data, executing the tests … and some buffer to address whatever problems will be discovered during testing. Also remember that testing includes performance as well as functional validation.

Involve QA from day 1

Similarly, account for QA’s time starting from day 1 (not necessarily full time) in your project plans (vs planning for QA’s work to start at the QA phase). As soon as a design takes shape, QA (and developers) must figure out how to test it – and build the tools to do so.

The more innovative the design, the earlier QA needs to be involved: a new architecture, or a radically new category of features, is likely to require a radically new set of testing tools.

Finally, having QA involved at the inception of a design allows developers and QA engineers to truly team up.

Show-and-tell as you release to QA

While XP and Agile advocate writing the test code before writing the actual code, I don’t personally care, and let each developer do as he/she chooses. What IS important is that each developer proves that the code works before claiming to be done!

To this effect, I usually request a show-and-tell as a “rite of passage” for releasing to QA. The show-and-tell goes like this:

  • QA provides a clean standard environment for the product
  • Developer installs his/her build including the new feature(s)
  • Developer demos core functionality and performance
  • QA engineer asks questions, and, if desired, requests additional tests to be run
  • When satisfied, QA formally accepts the feature(s)

I like to invite as many people to the show-and-tell as the feature(s) warrant: at minimum, the product owner and all the leads of the project, but there is no limit. For major accomplishments, don’t hesitate to bring in the CEO, VP of Sales, VP of Marketing, the receptionist (seriously), etc.

This show-and-tell is a great opportunity to recognize the developers and QA engineers who made it happen. It also kills the silly arguments between Development and QA that drive me crazy, where a developer denies a bug because “…it works on my system!”

The more you test, the faster you develop

It sounds like we added a whole bunch of work during the development phase, and thus that we just caused the release to take longer.  In practice, it’s actually quite the opposite.

Testing, and bug fixing, must take place at one point or another before the product is released. The choice is thus simple: “Pay now … or pay more later!” Either you test the code early, or you wait until the end of the release, at which point the cost, and personal pain, of fixing each bug will be that much greater.

While one would think that the development phase will take much longer with all this testing, it actually does not change much. You gain time because testing is now done in parallel and in real-time as the code is being developed.

You may “lose” some time because you are testing performance upfront.

On the other hand, you gain a significant amount of time at the end of a release, because your QA phase is now a true Quality Assurance phase rather than a bug-discovery-and-fixing phase. Having tested early, you have largely eliminated the unpredictability of this phase. You no longer have to fear the “show-stopper” bug that used to pop up in the last days of the release.

In summary, from inception of project to actual release to the customer, you will experience significant time savings. Equally important, you will increase the predictability of your release schedule by an order of magnitude. To quote the Agile Manifesto: “Working software is the primary measure of progress”. By testing concurrently with code development, you advance the time at which software actually works, and thus the predictability of the product release date!

Who Owns Quality? Part 4

We examine how we modulate our testing efforts throughout the various phases of a project, and how the roles of architects, developers and testing engineers evolve accordingly.

Let us examine the division of labor between QA and Developers/Architects, as we apply the “Developers own Quality” methodology. Do we need a QA team at all? 🙂 What’s a QA engineer to do?

… quite a bit, as it turns out.

“Developers own Quality” simply prescribes that developers own the results of testing, and that their task is only complete once the code passes enough tests to prove that it works. However, this does not imply that developers DO all the testing.

In order to provide more details, let us split a release milestone (or sprint) into 3 phases – for the purposes of this discussion, where we focus on quality:

(1) Design, development and TESTING

(2) QA: Quality ASSURANCE

(3) QC: Quality CONTROL

(1) During the first phase, architecture design and development, the focus, from a quality perspective, is on testing, with the goal of demonstrating that the product actually works and meets the stated requirements in all aspects of functionality and performance – and that it works with the rest of the new code created during the milestone.

The testing efforts are led by the architects or developers, with the QA team heavily involved: brainstorming on test cases, building and configuring test harnesses, executing manual tests – it is a team effort.

A key ingredient to this effort is: Architects, developers and QA engineers must ALL contribute test cases. There is joint ownership of test cases – each group brings its own perspective: the developer knows what’s inside, and thus what may be fragile, or what factors may limit performance. A QA engineer brings years of experience in testing, methodology, and his/her flair at identifying potential problem areas.

Cooperation is also critical in building the test fixtures, and generating the data sets that will exercise the full scope of the product. Architects often build the first barebones test-bed to validate their prototype. This prototype test-bed is then enhanced, or rewritten, during the development and testing phase, typically by developers, who then transition it upon release to QA, along with the product code.  The QA team subsequently takes ownership of the test fixtures and continues to refine them.

Typically, during the architecture, development and testing phase:

  • Product code is written by architects and developers
  • Everyone must generate test cases
  • Test fixtures and test data are created by architects/developers for the first generation, and subsequently enhanced and “productized” by the QA team
  • Tests are executed by the QA team

Ideally, as the code stabilizes, QA automates the tests, and adds them to the daily build and/or makes them conveniently available for developers to set up and run on their own (thus saving time for themselves).

(2) During the second phase, the Quality Assurance phase, the QA team rounds out the testing, and ensures that ALL test scenarios have been exercised, and pass.

What should be tested in the first vs the second phase is largely a matter of judgment: In the first phase, we do just enough to prove that the code works while in the second phase, we ensure that the code has no errors.

One way to better understand this is to consider the exit criteria of each phase:
The exit criterion of Phase 1 is that no Severity 1 or 2 bugs will be found in Phase 2.
The exit criterion of Phase 2 is that no Severity 3 bugs (or worse) will be found in QC or after release.
The ideal exit criterion of the QC phase is no Severity 4 bugs (or worse) will be found after release. As we have all experienced, in practice the product owner (product manager) decides when to ship the product, trading-off time, resources and the very last bits of quality.
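The exit criteria above can be expressed as a simple check against the bug tracker. The sketch below is hypothetical: the `Bug` record, the phase labels, and the convention that Severity 1 is the most serious are assumptions made for illustration, not a description of any particular tracking tool.

```python
from dataclasses import dataclass

# Assumed convention: Severity 1 is the most serious, 4 the mildest.
# For each phase, a bug found AFTER the phase exited with a severity at
# or below this number (i.e. as serious or worse) violates its exit criterion.
EXIT_THRESHOLD = {
    "phase1": 2,  # no Severity 1 or 2 bugs should surface in Phase 2
    "phase2": 3,  # no Severity 3 (or worse) in QC or after release
    "qc": 4,      # ideally, no Severity 4 (or worse) after release
}


@dataclass(frozen=True)
class Bug:
    title: str
    severity: int      # 1 (worst) .. 4 (mildest)
    escaped_from: str  # the phase that had already exited when the bug was found


def exit_violations(bugs, phase):
    """Return the bugs that show `phase` exited too early."""
    limit = EXIT_THRESHOLD[phase]
    return [b for b in bugs if b.escaped_from == phase and b.severity <= limit]


# Hypothetical sample data for illustration only.
bugs = [
    Bug("crash on login", 1, "phase1"),
    Bug("typo in label", 4, "phase1"),
    Bug("slow monthly report", 3, "phase2"),
]
```

With this sample data, `exit_violations(bugs, "phase1")` returns only the login crash: the Severity 4 typo found in Phase 2 is exactly what Phase 2 is there to catch.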

One may partition Phase 1 vs Phase 2 efforts based on the environments in which the product will run (e.g. versions of browsers, operating systems, databases). You select a representative sample of environments to test in Phase 1, and you round out the effort by testing the remaining environments in Phase 2.
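One possible way to pick the “representative sample” is sketched below. It is an assumption on my part, not the only reasonable rule: Phase 1 greedily selects environments until every browser, operating system and database has been exercised at least once, and Phase 2 receives the remaining combinations. The product names in the example are placeholders.

```python
from itertools import product


def split_environments(dimensions):
    """Split the full environment matrix into (phase1, phase2).

    `dimensions` maps a dimension name to its list of values.
    Phase 1 covers every individual value at least once;
    Phase 2 is every other combination.
    """
    names = list(dimensions)
    remaining = [dict(zip(names, combo)) for combo in product(*dimensions.values())]
    uncovered = {(name, value) for name in names for value in dimensions[name]}
    phase1 = []
    while uncovered and remaining:
        # Pick the environment that covers the most still-untested values.
        best = max(remaining, key=lambda env: len(uncovered & set(env.items())))
        phase1.append(best)
        remaining.remove(best)
        uncovered -= set(best.items())
    return phase1, remaining


# Placeholder matrix: 3 browsers x 2 operating systems x 2 databases = 12 environments.
matrix = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os": ["Windows", "macOS"],
    "db": ["PostgreSQL", "MySQL"],
}
phase1_envs, phase2_envs = split_environments(matrix)
```

For this matrix, Phase 1 ends up with a handful of environments and Phase 2 with the rest. A team may well prefer a different rule, for example weighting environments by customer usage.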

Another way of looking at work allocation is in terms of risk management: all risk should be eliminated in Phase 1. This translates into: all bugs found during Phase 2 should require a predictable – and small – amount of time to fix; plus there should only be a relatively small number of bugs found in Phase 2. This very important point goes against the ingrained habit of some organizations where developers test the basic case, and leave the worst-case scenarios to be tested by QA. On the contrary, ALL the WORST test cases must be exercised in Phase 1, and made to pass. Leaving them to Phase 2 is just delaying the inevitable.

(3) Phase 3 is Quality Control of the “release candidate” – and is typically run by the QA team only. During the QC phase, the complete product is tested from top to bottom – newly introduced features, as well as those from earlier releases.

The QC phase may be abbreviated in intermediate milestones, but it is a critical step before an official release.

Ideally, by the time you reach the final QC stage, all the tests have been automated (functional as well as performance), and the QC phase goes very fast 🙂

The above is, in my experience, a typical distribution of tasks, yet by no means is it a prescription. On the contrary, it is best for each team (architects, developers and QA jointly) to self-organize – as recommended by Agile.

A self-organizing team will review the tasks of each milestone, and adapt to the circumstances. For example, nothing prevents developers from helping the QA team run tests at the end of the release when it’s crunch time. And there is nothing wrong with a QA engineer writing the code of a test harness (it is even recommended).

I cannot emphasize enough the importance of taking the time upfront – as coding begins – to figure out the test cases and testing harnesses, as well as the test data. Unless you have sophisticated enough tests, you will never know how solid your product is. And the sooner you have this information (i.e. in the development phase), the faster you will deliver the product.

Finally, to further emphasize the importance of the testing environment, in my view the test code, as well as the test data, are part of the “product” on equal footing with the code that’s shipped to customers. Test programs are just as valuable to the company as the code that they test. Or said another way, source code, without the tools to validate its correctness, has little value to a company. As a consequence, equal attention needs to be placed on the creation, maintenance, update, and safekeeping of test code and test data, as is placed on customer-facing code.

Who Owns Quality? Part 3

“Test early, test often” applies to performance testing, which needs to be run continuously, starting at the architecture design phase and continuing all the way through the end of the project – ideally on a dedicated system.

Test Early

… does not only mean that tests must be run during development, but even more importantly, testing must start during the architecture design and prototyping phase.

Just as one does not wait until after the release to QA to start testing, by the same logic, one should not wait for the code to be complete to run tests – and make progress towards “proving that the code works”.

More specifically, performance must be validated during the design and prototyping phase. By the term “Performance”, I include individual server performance, scalability, fault-tolerance, longevity testing, error recovery, behavior under stress, etc. While it may not be possible to test everything with a prototype, one certainly has a duty to validate as much as can be tested. The sooner one tests, on the smallest code base possible, the easier it is to (a) identify performance bottlenecks, (b) fix any issues and (c) minimize the impact of such fixes on the project and other team members. As we all know, performance shortcomings are among the most difficult problems to fix, and their resolution time is among the hardest to predict.

In fact, one of the fundamental exit criteria of the architecture design and prototype phase must be that:  It validates Performance.

Another reason, in my experience, to test during the design phase is to start the dialog on Performance between the Engineering team and the Business Owners (Product Management). In the abstract, we all want faster performance with every release. Yet, one has to wait until a first round of performance tests to see how close (or how far) we are from a given target. Thus the cost / benefit analysis of improved (or decreased) performance cannot start until the first round of test results. Only then can the time and resources necessary to reach the desired level be evaluated with some degree of accuracy.

In some cases, “forcing” my team members to run performance tests is the only way to have them read the spec 🙂. As they go through their design, I often remind them: “If you don’t know how to test it, you don’t know how to design it.”

Test Often

… the other half of the “Test early, test often” mantra reminds us that performance needs to be tested continuously through the development process. We have all experienced performance being impacted by the strangest things. The worst “death marches” that I have experienced were the consequence of a serious performance issue found in the last days of the release. I strongly recommend running a minimum set of performance tests within each milestone (or Sprint, if you use scrum methodology).

My “best practice” is to run performance tests continuously – from the first day until the last day of the project — on a dedicated system that tests the last stable release 24×7 (e.g. from the last milestone/sprint). The architects, and developers, will have coded some automated tests that exercise the corner cases of performance, and even stress tests. Furthermore, running the tests over long periods of time — 2 weeks minimum with the same executables — also tests against memory leaks and other resource exhaustion bugs.
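As a miniature illustration of what such a long-running test looks for, the sketch below repeats an operation many times and measures how much memory is still allocated afterwards, using Python’s standard `tracemalloc` module. The helper name `memory_growth`, the iteration counts and the two sample operations are assumptions for illustration. A real soak test would run the actual product for days, not a loop for a few milliseconds.

```python
import tracemalloc


def memory_growth(operation, iterations=1000, warmup=100):
    """Run `operation` repeatedly; return the net bytes still allocated."""
    for _ in range(warmup):
        operation()  # let caches and lazy initialization settle first
    tracemalloc.start()
    baseline, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        operation()
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current - baseline


# Hypothetical operations: one keeps every buffer alive, one does not.
_retained = []


def leaky_request():
    _retained.append(bytearray(1024))  # never released: a leak


def clean_request():
    bytearray(1024)  # released as soon as the call returns


leak = memory_growth(leaky_request)
steady = memory_growth(clean_request)
```

Here `leak` grows roughly in proportion to the number of iterations while `steady` stays near zero. That steady upward trend is the signature a multi-day run makes visible.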

Who Owns Quality? Part 2

Developers must take ownership of testing their code for functionality, integration and performance.

Let us examine the consequences of “Developers Own Quality”.

Quality is already in the code at the time when it is delivered to the QA team

In other words, the code meets all functionality and performance objectives.  The obvious consequence – as suggested by Extreme Programming (XP), and Agile Software Development – is that, in addition to writing code, developers must also test it. More importantly, developers own the results of these tests.

Too often, I have heard developers claim that their task was complete once they had provided Unit Tests along with their code. Writing unit tests is a good thing, it is an important and necessary step, but it is far from sufficient. Rather, developers must take a results-oriented approach to testing, and ask themselves: do my tests PROVE that my code works?

Beyond a comprehensive suite of unit tests, which validate basic operation of the code, two main areas must be addressed: (a) integration and (b) performance.

Integration testing leads us to another XP and Agile best practice: frequent integration releases (or milestones) to ensure that all newly contributed code plays well together. For example, two developers will often have a different interpretation of an API. While each may have done the right thing in their own mind, and passed their individually created tests, the code, once integrated, will not work.
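Here is a small, entirely hypothetical illustration of such a mismatch: one developer treats a “timeout” setting as seconds, the other as milliseconds. The function names and the configuration key are invented for this sketch.

```python
# Developer A: loads the setting. In A's mind, "timeout" is in SECONDS.
def load_timeout(config):
    return config["timeout"]


# Developer B: computes a deadline. In B's mind, `timeout` is in MILLISECONDS.
def retry_deadline(start_ms, timeout):
    return start_ms + timeout


# Each developer's own unit test passes.
def test_developer_a():
    assert load_timeout({"timeout": 30}) == 30


def test_developer_b():
    assert retry_deadline(1_000, 500) == 1_500


# Only when the two pieces are wired together does the problem appear.
def integrated_deadline(config, start_ms):
    return retry_deadline(start_ms, load_timeout(config))


# A 30-second timeout starting at t = 1,000 ms should expire at 31,000 ms,
# but the integrated code answers 1,030 ms.
expected_ms = 1_000 + 30 * 1_000
actual_ms = integrated_deadline({"timeout": 30}, 1_000)
```

Neither unit test is wrong. The defect lives in the gap between them, which is why only an integration test (here, comparing `actual_ms` against `expected_ms`) can expose it.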

So, why ask developers, rather than QA, to test integration and performance? It is simply a matter of efficiency.

The process of releasing code to QA, having QA set up their test environments, find a bug, make sure it really is a bug, file a bug, assign the bug, re-run the test for the developer, wait for the fix, verify the fix, verify that the fix did not break anything else that worked before, and finally close the bug, is just too long a process. It should only occur in exceptional circumstances, or in controlled situations (more later).

To me it is also a matter of pride. As a developer, I need to be confident that I deliver solid work-product to my teammates. Finding a serious bug in my code (whether functional, or performance), once I have released it, should be a major embarrassment. I often tell my team – jokingly – “If QA finds a Severity 1 or 2 bug in your code, you owe me fifty bucks!”, as an illustration of the level of confidence and pride that one should have in one’s code.

In summary, comprehensive testing is part and parcel of development. A developer who is proud of his/her code, and proves that it meets all functional, integration and performance requirements, is not only an efficient developer, but someone who makes his/her whole team efficient.