Test-driven Standards Development?
Many in the development community are familiar with the concept of Test-Driven Development (TDD), which, briefly, is based on the practice of writing test cases that define the boundaries of your software component, and then writing the minimal component necessary to satisfy those test cases. It aims to:
- ensure a robust test suite for automated testing, allowing detection of unintended changes to behaviour due to code updates
- ensure that the task is well defined
- prevent “gold plating” (adding unnecessary, but marginally “cool” or useful, code or capabilities)
Under TDD, coding should stop when all the tests pass, and all code should contribute to passing those tests. This approach has much to admire, but I have never worked on a team that has embraced it. On the surface, TDD would seem most applicable to “this input gives that output” requirements, and few projects I’ve worked on expressed requirements that way. It could be that some problem spaces are better suited to TDD than others, or it could reflect how some teams write requirements. With the increasing use of cloud architectures and web APIs, perhaps more development problems can be cast in this fashion.
What got me thinking about TDD was a recent opportunity that I had to attend an AEGIS.net, Inc. Touchstone training session. Touchstone is a product that lets developers test their HL7 FHIR implementations. It has a range of features supporting both client and server implementations and can be built into continuous integration pipelines to help prevent product regressions. There are other FHIR testing products—some commercial, some open source—worth checking out, but I like the Touchstone product and the AEGIS.net team behind it. One thing that has enabled an ecosystem of FHIR test systems is that the FHIR standard has a built-in TestScript resource, so a product’s compliance to the standard or to an implementation guide can be evaluated against a growing body of available tests.
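To give a flavour of what a TestScript looks like, here is a minimal, hypothetical sketch—the example.org URL and the test content are invented for illustration. It defines a single test that reads a Patient resource and asserts a successful response:

```json
{
  "resourceType": "TestScript",
  "url": "http://example.org/fhir/TestScript/patient-read",
  "name": "PatientReadTest",
  "status": "draft",
  "test": [{
    "name": "Read Patient",
    "action": [
      { "operation": {
          "type": {
            "system": "http://terminology.hl7.org/CodeSystem/testscript-operation-codes",
            "code": "read" },
          "resource": "Patient",
          "params": "/example" } },
      { "assert": { "description": "Server returns a success status", "response": "okay" } },
      { "assert": { "description": "Response body is a Patient resource", "resource": "Patient" } }
    ]
  }]
}
```

A test engine such as Touchstone executes the operations in order and evaluates each assert against the most recent response, so the script itself is both the test definition and, in a sense, a statement of expected behaviour.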
Beyond TestScript, the FHIR standard is unusual (although not unique) in being partially self-describing and computable. The Patient resource’s content is defined with a StructureDefinition resource. The ways resources can be found or manipulated are defined with SearchParameter and OperationDefinition resources. Use of a resource for a particular purpose can also be profiled (constrained) with the StructureDefinition resource, and profiles collected into an ImplementationGuide resource, along with a CapabilityStatement resource outlining minimal expected system behaviour. An implementation guide could also include the TestScript resources necessary to assure correct system functioning.
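As a concrete illustration of profiling, here is a minimal, hypothetical StructureDefinition that constrains Patient to require at least one identifier—the example.org URL and the constraint itself are invented for this sketch:

```json
{
  "resourceType": "StructureDefinition",
  "url": "http://example.org/fhir/StructureDefinition/identified-patient",
  "name": "IdentifiedPatient",
  "status": "draft",
  "kind": "resource",
  "abstract": false,
  "type": "Patient",
  "baseDefinition": "http://hl7.org/fhir/StructureDefinition/Patient",
  "derivation": "constraint",
  "differential": {
    "element": [
      { "id": "Patient.identifier", "path": "Patient.identifier", "min": 1 }
    ]
  }
}
```

Because the constraint is expressed as data rather than narrative, a validator can enforce it mechanically against any Patient instance, with no human interpretation required.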
Depending on definition, some other standards could be called computable. The HL7 Version 3 specification was expressed in XML and distributed as a set of HTML pages. After a multi-year effort to move from PDF distribution, the DICOM standard was published in 2013 in a retargetable XML format that allows publication in PDF, MS Word, DocBook or multiple HTML renderings. This single source for DICOM provided many advantages but failed to live up to the hopes of some in the community. The DICOM source files describe the documents, not the standard; that is, for example, rather than the source describing the properties of the DICOM attributes, it describes the tables that list those properties. The IHE (Integrating the Healthcare Enterprise) initiative has for many years published their implementation guides (which IHE calls Integration Profiles) as PDFs, but has recently begun moving content to a more computable representation. (Interestingly, but not surprisingly given the content of many of the profiles, the new IHE publications are based on the FHIR implementation guide tooling.) Although processing the content has become easier, in both the DICOM and IHE cases the majority of requirements are still expressed in computationally opaque narrative.
What is the value of a computable specification, and how far could it go? The current FHIR test scripts, based on my review, focus on testing a narrow band of functionality, above the baseline data structures and operations, but below sophisticated multi-step coordinated behaviours. There is nothing wrong with that: the current tests focus on where the biggest benefit lies for the most users. Imagine, though, if we took a TDD approach not just to standards implementations, but to the standard itself. Each requirement in a standard would be accompanied by the tests necessary to prove conformance to that requirement. Going further, the tests could be the actual definition of the requirements; any text would be just informative, to help mere humans make sense of the tests.
Moving to a more test-focused specification authoring process requires skills that currently aren’t present in the SDO (standards development organization) committee rooms. Authors are often from healthcare organizations, or consultants representing them; or healthcare IT vendor representatives, usually product managers or developers. Product testers and conformance organizations are poorly represented. That isn’t to say that, from time to time, someone in committee doesn’t raise the question of “how would I test that?”; it’s just that it’s not a primary focus. Merely having testing and conformance representatives in the room would undoubtedly lead to more testable specifications without necessitating a complete switch to test-driven specifications.
In the 1990s there was interest in provably-correct software through techniques like annotating programs with Z-notation. Unfortunately, authoring the Z-notation annotations essentially meant writing the program twice, and it was just as likely for the annotation to have an error as the program. I haven’t heard discussion of Z-notation in 25 years; the return on the effort wasn’t there. I fear that despite the attractions of test-driven standards development, it would have the same fate.
There are benefits, though, to be realized from more rigorous testing of standards, and from machine-consumable specifications. I am aware of multiple standards profiling efforts that intend to deploy testing frameworks alongside their implementation guides, hoping to make it easier for implementors to adopt their works. The HL7 standards organization has recently announced a restructuring into two divisions, one supporting standards development, the other supporting standards implementation. Testing tools will assuredly be a key feature of the implementation division.
As healthcare systems (in both the computer and organization senses) become more interconnected, there is greater need to ensure they are interoperable, and that information flows with collective understanding. Testing tools such as Touchstone, computable specifications, and specification-support content such as the FHIR StructureDefinition and TestScript resources are key to achieving those goals. IHE’s Connectathon testing events have long been valued for the opportunity they provide to bring healthcare IT vendors together to ensure the interoperability of their solutions. Rigorous testing suites can provide many of the same benefits. In truth, the testing suites can often go further, allowing continuous use for regression testing, supporting significantly more complete and elaborate test cases, and eliminating the dependency on the availability of a willing test partner. We should be embracing fuller use of these tools throughout healthcare IT product development.
Like commercial libraries, open-source tooling, reference implementations, and advanced programming languages, testing tooling lets healthcare IT vendors focus their development resources on what differentiates them. By catching deviations from the standard early—or by enabling your library vendor or community to catch deviations before your development team is even aware of them—you can direct your organization’s effort to the features that make your product valuable, not to the plumbing that makes it work. Leveraging automated test tools, organizations across a wide range of industries have adopted continuous integration testing into their development process, moving their QC staff away from performing repetitive, simple test tasks to higher-value complex tests or to test authoring and orchestration. Healthcare IT development is beginning to see the benefits of using these tools. Public (and semi-public) repositories of FHIR TestScript resources exist, allowing organizations to benefit from each other’s experience and accelerating product development.
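As a minimal sketch of the kind of check a CI pipeline can run on every build, here is a hypothetical conformance test in Python; the rule—that a Patient must carry at least one identifier—stands in for a real profile constraint, and the function and sample resources are invented for illustration:

```python
# Hypothetical CI conformance check: verify a Patient resource instance
# against a simple, invented profile rule (at least one identifier).

def check_patient(resource: dict) -> list[str]:
    """Return a list of conformance problems; an empty list means the check passed."""
    problems = []
    if resource.get("resourceType") != "Patient":
        problems.append("resourceType must be 'Patient'")
    if not resource.get("identifier"):
        problems.append("at least one identifier is required")
    return problems

# Two sample instances: one conformant, one not.
good = {"resourceType": "Patient",
        "identifier": [{"system": "urn:example", "value": "123"}]}
bad = {"resourceType": "Patient"}

print(check_patient(good))  # []
print(check_patient(bad))   # ['at least one identifier is required']
```

In practice a pipeline would delegate such checks to a FHIR validator or a TestScript engine rather than hand-written code, but the principle is the same: the build fails the moment an implementation drifts from the specification.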
While test-driven specification development may seem like a utopia to some, it has significant costs, complexities, and challenges. On the other hand, widespread, consistent use of the current FHIR testing tools will help vendors, today, produce more interoperable, robust solutions, faster.