The term “round data” conjures up many images. It was coined, I believe, by the FDA about three years ago, and it has no precise definition. If I asked five people what “round data” means, I would probably get six opinions and some abuse, and I would probably be reluctant to ask the question ever again.
For those reasons, this is a difficult blog post to write: the subject has caused quite a bit of confusion and has stirred up strong emotions in the past. So this post will result in either a flurry of comments or a deathly silence.
I cannot start off by giving a wonderfully simple definition. Wouldn’t it be wonderful to simply say “round data is …”? However, I can’t. What I can offer is the vision of round data, for that is what it really is: a vision. I believe round data is all about context. I have taken part in many conversations about the CDISC standards and context, and I have even understood some of them. In the current standards, you will find a lot of implicit context, or assumptions. It is these assumptions that get us into trouble, because two implementations of the same standard will undoubtedly make different assumptions, and thus we don’t have a standard; we have two similar implementations.
Consider the value 180. That one item of information tells you nothing. If I add the units “cm” to make 180 cm, you now know it is a dimension. Given the magnitude, you might assume it is a height. Now if I add HEIGHT, to build HEIGHT=180 cm, I am beginning, but only beginning, to add the context around the recording (or observation) of the height.
But more questions immediately emerge. When did I observe, why did I observe, what happened before and what happened after? So the observation needs to be placed into the context of the method of collection and the collection instrument, be it a case report form or a medical record, either of which may be paper or electronic. A case report form probably exists within the context of a study and thus a protocol. So, as you start to see from this very simple picture of our clinical research world, the context of a simple observation becomes increasingly complex.
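To make the layering concrete, here is a minimal Python sketch of the same progression: a bare value, then a value with units, then a named observation with optional context. Every class, field name, and sample value here (`Quantity`, `Observation`, `observed_on`, `instrument`, `study`, the date, and "STUDY-001") is hypothetical and invented for illustration; none of it comes from a CDISC standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Quantity:
    """A value with its units, e.g. 180 cm."""
    value: float
    unit: str


@dataclass(frozen=True)
class Observation:
    """A named result plus whatever context we managed to record.

    The context fields are hypothetical; a real model would need many more.
    """
    test: str                          # e.g. "HEIGHT"
    result: Quantity
    observed_on: Optional[date] = None
    instrument: Optional[str] = None   # e.g. a case report form or medical record
    study: Optional[str] = None        # the study (and so protocol) it belongs to

    def missing_context(self) -> List[str]:
        """List the context fields that were left unrecorded."""
        names = ("observed_on", "instrument", "study")
        return [name for name in names if getattr(self, name) is None]


# HEIGHT=180 cm on its own: the name and units are there, the context is not.
bare = Observation(test="HEIGHT", result=Quantity(180, "cm"))

# The same observation with made-up context values attached.
richer = Observation(
    test="HEIGHT",
    result=Quantity(180, "cm"),
    observed_on=date(2020, 1, 15),
    instrument="eCRF",
    study="STUDY-001",
)

print(bare.missing_context())
print(richer.missing_context())
```

Even this toy version shows the point: the value itself never changes, but what a machine can do with it depends entirely on how much of the surrounding context has been made explicit.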
What we strive for is to place the observation into the right context. A human does that very well. You or I can read the protocol, examine the annotated CRF and look at the resulting dataset. But if we want to get more from our data, to be easily able to aggregate across studies, we need the computer, the machine, to have access to that same context. Some of the things we might wish to see are:
- Relationships between a clinical observation and the planned observations: did we capture everything intended? Did we capture more? Which observation is which?
- Parent-child relationships, for example, that these multiple results came from a single sample
- Relating adverse events to interventions
- Terminology relationships
We have some of these relationships today but they are, as I have said, implicit. The value and units of our observation are related but only by an implicit positional relationship because the two columns are next to each other in the dataset and the data specification (human readable only) tells me what those two columns contain. We need to make those relationships tangible. SDTM has gone a long way to achieving this but suffers from only providing them in a human-readable form and not having the scope of relationships that we may wish to see in the future.
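One possible way to picture a tangible relationship is as an explicit (subject, predicate, object) statement that a machine can query, instead of a fact implied by column position. The sketch below is only an illustration under that assumption: the `Relationship` type, the predicate names (`has_unit`, `measures`, `derived_from_sample`), and the identifiers are all hypothetical, and this is not how SDTM represents relationships.

```python
from typing import List, NamedTuple


class Relationship(NamedTuple):
    """One explicit, machine-readable statement: subject -predicate-> object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts. In a flat dataset these would be implied by layout;
# here each one is stated outright.
relationships = [
    Relationship("result-1", "has_unit", "cm"),
    Relationship("result-1", "measures", "HEIGHT"),
    Relationship("result-2", "derived_from_sample", "sample-A"),
    Relationship("result-3", "derived_from_sample", "sample-A"),
]


def subjects_of(predicate: str, obj: str, facts: List[Relationship]) -> List[str]:
    """Return every subject linked to `obj` by `predicate`, in input order."""
    return [f.subject for f in facts if f.predicate == predicate and f.obj == obj]


# Parent-child question: which results came from the same sample?
siblings = subjects_of("derived_from_sample", "sample-A", relationships)
print(siblings)
```

Because each relationship is stated rather than assumed, the question "which results share a sample?" becomes a lookup rather than something a person has to work out from a specification document.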
We therefore need to build on SDTM and all the other good work to get to the round world. But the relationships extend beyond a single standard. Consider the issue of traceability of submitted data back to the case report form. The relationships in this scenario encompass not only SDTM but also CDASH and potentially ADaM.
So a definition for round data? I would suggest round data as being nothing more than a richness of relationships. The challenge is defining the relationships we want, how best to describe them and how to transmit them.