Reflections from DIA: CDISC or HL7?
During a session at the DIA Euro Meeting in Geneva last week, one of the speakers uttered the words “CDISC or HL7”. This is one of those statements that rather grates with me, as it demonstrates a lack of understanding of something that is very important when looking at standards and trying to build better standards for the future.
Now, I know the speaker was referring to standards but a) there are no such entities as the “CDISC Standard” or the “HL7 Standard” and b) CDISC and HL7 are organisations. So the statement makes little technical sense and also only serves to confuse the community at large.
So what is the issue? The problem is awareness, an awareness of what some technicians call the “standards stack”. This expression rather over-glorifies the state of play – something we technicians are inclined to do – but it does reflect the notion that there is more than one standard in the clinical research landscape. When I recently drew a whiteboard picture of that landscape, a comment was made that it was a “cake” diagram with a number of layers of standards, resembling a traditional English sponge cake.
We need to split our view of the world into three primary layers, see the diagram. I will start in the middle as this, to me, is the important one. This is where we need to define our content in one clear and unambiguous way. This is the primary goal of the CDISC SHARE project and a number of similar health care industry initiatives. The advantage that CDISC has is that it can work from a well-defined scope of SDTM and CDASH, but that is a different issue and maybe the subject of another blog.
Once we have one clear way of defining our content, one that is both well modelled and well understood, we can use this content to create operational structures for whatever business need we see fit: a CRF, a tabulation or a useful presentation of the data. The content can also be transported between machines by a variety of methods, and having sound content makes the job of designing transport structures much easier.
As the picture shows, there is a stack of three layers, which is why we hear the words “standards stack”. This stack can become increasingly complex, because the transport layer will be built upon W3C’s XML standard, and that XML might in turn be transmitted using internet standards such as HTTP, TCP and IP. Drawn out fully, the overall stack will have many layers.
By separating the layers, we can change one layer and insulate the others from change. We can add content and not impact the presentation layer, though we would probably want to use our new content in some way. We can change transport mechanisms without changing the content to the extent that most users of the content would not notice.
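As a rough illustration of this separation, here is a minimal Python sketch. Everything in it is hypothetical: the `VitalSign` class, its field names and the `to_xml`, `to_json` and `crf_label` functions are invented for this example and do not come from any CDISC or HL7 standard. The point is only that the content is defined once, while the transport and presentation functions can be swapped or added without touching it.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class VitalSign:
    """Content layer: one hypothetical, transport-neutral observation."""
    subject_id: str
    test_code: str
    value: float
    unit: str


def to_xml(obs: VitalSign) -> str:
    """Transport option 1: serialise the content as a small XML element."""
    elem = ET.Element("Observation")
    for name, val in asdict(obs).items():
        ET.SubElement(elem, name).text = str(val)
    return ET.tostring(elem, encoding="unicode")


def to_json(obs: VitalSign) -> str:
    """Transport option 2: the same content, serialised as JSON."""
    return json.dumps(asdict(obs))


def crf_label(obs: VitalSign) -> str:
    """Presentation layer: a display string built from the same content."""
    return f"{obs.test_code}: {obs.value} {obs.unit}"


# The content is defined once; each layer uses it independently.
obs = VitalSign("SUBJ-001", "SYSBP", 120.0, "mmHg")
print(to_xml(obs))
print(to_json(obs))
print(crf_label(obs))
```

Adding a third transport function, or changing `to_xml`, leaves `VitalSign` and `crf_label` untouched, which is the insulation between layers described above.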
With SDTM today, we have a world where these layers have been collapsed into one with the SAS transport mechanism seen as part of SDTM. It is this single layer view of the world that has led to the confusing statement of “CDISC or HL7”.
Returning to the original question, CDISC v HL7? Well, the various CDISC standards work in all three layers. HL7 tends to work in the content and transport areas, but more in transport (the XML) and less in content, where HL7 tends to defer to the CDISC content for the clinical research standards, for example in the emerging Clinical Trial Registry and Results Message. But different standards sit at different levels. For example, the CDISC ODM is a transport standard, while CDASH is a content standard that also contains elements of presentation and business-use grouping (the domains). So when comparing standards from the respective organisations we need to be comparing apples with apples. The statement “CDISC or HL7” is worse than comparing apples with oranges; it is rather like comparing apples with screwdrivers.
9 Responses to “Reflections from DIA: CDISC or HL7?”
Comments

I’ve always thought of HL7 as a EHR standard and the CDISC standards as Clinical Research standards, but I can now see the layers and how they are different. Thank you for the interesting view.
BTW, What’s an English Sponge Cake?
I fully agree with David’s statement that it is not “HL7 OR CDISC”.
I do also not believe this was the real discussion point at the DIA conference (at least not in the session I attended).
Of course the transport layer is fully exchangeable, as any good informatician knows, but the question is whether the FDA also understands this.
Until now, and very probably for at least five more years, the FDA will require us to use SAS Transport 5 for the transport layer, though there are already XML-based formats that are better suited to this. Such a modern format is already used by at least 3 vendors of SDTM mapping tools.
However, although “the transport layer is fully exchangeable”, I expect that the FDA will reject my SDTM submission if I use the modern XML-based format instead of SAS Transport 5.
By the way, I also think that HL7 will strongly disagree with the statement that their standards are more in transport than in content; HL7 always claims to be about “semantic interoperability”. HL7-v3 is more an XML wrapper for transporting the content in a structured way, unfortunately not a very successful one (personal opinion).
But back to SDTM submissions, which we still need to do in SAS Transport 5, but which we could already do in a simple XML, if only the FDA were wise enough to understand this “principle of exchangeability of the transport layer”. However, the FDA is putting its money on an HL7-v3 based format that still needs to be developed, and of which it is unsure whether it CAN be used to port SDTM data, as there are some major discrepancies between the principles of HL7-v3 and of SDTM. One simple example is the different formatting of dates and times.
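To make the date and time example concrete: SDTM uses ISO 8601 extended notation (for example `2011-03-28T14:30:00`), whereas HL7-v3 timestamps are written as a compact digit string (for example `20110328143000`). The Python sketch below converts between the two for complete, zone-less date-times only. The function names are mine, and it deliberately ignores partial dates and time zones, which is where I would expect the real mapping problems to sit.

```python
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"   # SDTM style: ISO 8601 extended format
HL7_FORMAT = "%Y%m%d%H%M%S"        # HL7-v3 style: compact digit string


def iso8601_to_hl7_ts(value: str) -> str:
    """Convert a complete ISO 8601 date-time to the compact HL7-v3 form."""
    return datetime.strptime(value, ISO_FORMAT).strftime(HL7_FORMAT)


def hl7_ts_to_iso8601(value: str) -> str:
    """Convert a complete compact HL7-v3 timestamp back to ISO 8601."""
    return datetime.strptime(value, HL7_FORMAT).strftime(ISO_FORMAT)


print(iso8601_to_hl7_ts("2011-03-28T14:30:00"))  # 20110328143000
print(hl7_ts_to_iso8601("20110328143000"))       # 2011-03-28T14:30:00
```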
Though the three layers should in principle be completely separated, it is a myth to think that SDTM (as a content standard) can/should remain unchanged when the transport format is changed to either HL7-XML or an XML that is based e.g. on the CDISC define.xml transport format. The reason is that SDTM has been developed with (all the limitations of) SAS Transport 5 in mind: the 8-character limit for variable names, the 200-character limit for variable values, and more particularly, the existence of the SUPPQUAL, RELREC and Comments domains. Some argue that these are a result of the database-like design of SDTM, but I cannot believe that a smart database designer would voluntarily set up a set of SUPPQUAL tables for non-standard variables.
The vendors that already use a modern XML format for SDTM do not use SUPPQUAL; the SUPPQUAL domains are only generated in the very last step, when “downgrading” to SAS Transport 5.
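A rough Python sketch of what such a “downgrade” step might look like is given below. It is an illustration only, not how any particular vendor tool works. The `split_suppqual` function, the `STANDARD_VARS` subset and the sample record are made up for the example, and the generated rows carry only some of the SUPPQUAL variables (QLABEL, QORIG and QEVAL are left out for brevity). The 8-character and 200-character checks are the SAS Transport 5 limits mentioned above.

```python
# Illustrative subset of standard AE variables (not the full SDTM list).
STANDARD_VARS = {"STUDYID", "DOMAIN", "USUBJID", "AESEQ", "AETERM"}


def split_suppqual(record: dict, standard_vars: set, id_var: str):
    """Split one rich record into a parent row plus SUPPQUAL-style rows.

    Variables not in `standard_vars` are moved out of the parent record,
    one supplemental row per non-standard variable.
    """
    parent = {k: v for k, v in record.items() if k in standard_vars}
    supp = []
    for name, value in record.items():
        if name in standard_vars:
            continue
        text = str(value)
        # Limits inherited from SAS Transport 5, as described above.
        if len(name) > 8:
            raise ValueError(f"variable name too long: {name}")
        if len(text) > 200:
            raise ValueError(f"value too long for {name}")
        supp.append({
            "STUDYID": record["STUDYID"],
            "RDOMAIN": record["DOMAIN"],       # domain of the parent record
            "USUBJID": record["USUBJID"],
            "IDVAR": id_var,                   # variable that identifies the parent row
            "IDVARVAL": str(record[id_var]),
            "QNAM": name,
            "QVAL": text,
        })
    return parent, supp


# Hypothetical record with one non-standard variable (AETRTEM).
record = {"STUDYID": "S1", "DOMAIN": "AE", "USUBJID": "S1-001",
          "AESEQ": 1, "AETERM": "HEADACHE", "AETRTEM": "Y"}
parent, supp = split_suppqual(record, STANDARD_VARS, "AESEQ")
print(parent)
print(supp)
```

In a format without these limits, the non-standard variable could simply stay in the parent record, which is the point being made about SUPPQUAL being a by-product of the transport format.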
The transport layer is fully exchangeable – but do we (or the FDA) need to force our industry into learning a transport format that is completely foreign to it, and that does not fit with the transport formats we are already using (ODM, define.xml)? And if so, is HL7-v3 really the best choice? Or should one have another look at ASTM (CCR) or OpenEHR, or even at what is used in VistA?
Or is the concept of messages outdated anyway and should there be an exchange API?
Just a quote from the recent PCAST report on health information technology: “standards (for EHRs) and infrastructure are lacking that would allow information to be easily shared across organizations” and “While useful as an initial step, the adopted standards for data vocabulary and messaging will not be sufficient to advance the state of the art either of clinical practice or of a robust health IT infrastructure” (I recommend a full read of the PCAST report for more …).
The recent “SMART Challenge” claims to implement the recommendations of the PCAST report, and does not use a message at all anymore for the transport. It simply supplies a REST-API that can be used to exchange information between applications. If this is indeed the future (instead of messaging), a better choice for the FDA would be to put their money on the development of such an API that allows sponsors to exchange SDTM information from their database directly to the systems of the FDA, e.g. the “Janus” data warehouse (now in the stage of being re-engineered and renamed), i.e. without the need for a transport format.
Yes, the transport layer is exchangeable, but would the FDA accept that? If the industry can choose, it will probably choose a transport format based on existing CDISC transport formats like ODM and define.xml, rather than a format that is foreign to the industry (HL7-v3). Given the state of the information technology the FDA has, could it handle more than one transport format for the same content? And if the API approach is the right one, is the FDA able to implement such an approach?
I doubt it.
Dave is right, it is not CDISC OR HL7, even when we are talking about transport standards. It might also be the API approach. But I am very pessimistic about whether the FDA understands this, is flexible enough, and can offer the industry a choice.
My (sadly enough, pessimistic) prediction is that the FDA will still accept submissions in SAS Transport 5 format in 2020, and I am willing to bet on this with Dave for a fine bottle of Scotch whisky.
Jozef Aerts, XML4Pharma
Suggested reading:
PCAST report: http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-health-it-report.pdf
“The rise and fall of HL7”: http://hl7-watch.blogspot.com/2011/03/rise-and-fall-of-hl7.html
Hi Jozef, good of you to point us to two great suggestions for reading. I also like the point you make about RESTful APIs. So, maybe we should on the technical level ask the question: “Exchange of data between trusted parties by exposing data using RESTful APIs, or via messaging?”
Related to this is also the move towards exposing data via SPARQL endpoints using the common model of RDF, and applying the so called linked data principles, in many domains such as for UK and US governments.
I also think there is another question waiting for us around the corner on the semantic level. It is indicated in the follow-up post on the HL7 Watch blog that you pointed us to. That is, the question of “The bridge between health care and clinical research: should it be RIM based, or Open Biomedical Ontology (OBO) Foundry based?”
References:
My presentation next week in Brussels, at the CDISC Interchange EU, about the potential of using semantic web standards and applying linked data principles for clinical data standards, with some live examples from the UK government
http://www.slideshare.net/kerfors/linking-clinical-data-standards
HL7 Watch blog
http://hl7-watch.blogspot.com/2011/04/fall-of-rim.html
Dave,
A good point made.
I believe the problem is an excessive requirement for non technical people to understand highly technical issues. Should your Pharma Exec need to understand the standards stack and the interdependencies between each layer… etc… No. I don’t think so. However, it is often the case that dialogues related to CDISC standards dive down to the nitty gritty details.
I don’t think this is peculiar to this industry segment. Technologists love to talk about the details, even at the cost of the loss of the greater proportion of the audience.
Returning to the potentially confusing statement. CDISC versus HL7. I fully understand your description of the interrelationship between the standards. However, I am not sure if that was the point of the statement. Whether or not the presenter really understood the facts, the question still holds.
A number of eClinical companies offer support for CDISC standards – typically ODM and SDTM. They rarely provide support for HL7. If you were to ask a company if they support CDISC or HL7, they would typically say CDISC. What they mean is that they offer support for one or more of the CDISC standards, and they have not yet extended these standards to include HL7 Transport or Content.
So – to be accurate – the question was more: do you support CDISC [ODM|SDTM|LAB|ADaM etc] or CDISC [ODM|SDTM|LAB|ADaM etc] + HL7 Content on the HL7 Transport layer? But maybe that was too much for the audience.
I would agree that the original question may have come from a different perspective, but the “HL7 or CDISC” statement is often heard and I just wanted to offer an explanation of why it is a somewhat confused question.
Doug brings up a point that I am faced with all the time. I am not a technical person. I am a QA auditor who needs to deal with technical employees during the course of audits. There needs to be a bridge between the two worlds. I have interviewed data management staff who seem to be speaking an entirely different language. That being said, I found Dave’s original post well-written and helpful.
Carl-
from the U.S.
Carl, Good clear communication is hard, especially between the multiple disciplines involved within clinical trials, the scientists to the data managers, the IT folks, the statisticians etc. It is a major challenge.
Corporate communications is very tricky and ensuring that there are diverse messages for each stakeholder is a challenge. It’s interesting because in order to execute company change and to bring about wider use of standards, there needs to be a greater understanding by everyone involved in clinical research. Communications then comes down to internal systems and how much people can absorb quickly and easily … there is no simple solution but it can be done. Changing the way we think about our workflow and engaging with different processes is not just about the technical infrastructure, it’s also about people, as ultimately people drive the technical solutions. And Carl, I sit with you on the non-technical side of the fence so I can fully appreciate your position.