The FDA and a Round World

The term “round data” conjures up many images. It was coined, I believe, by the FDA about three years ago, and it has no precise definition. If I asked five people what “round data” means, I would probably get six opinions and some abuse, and I would be reluctant ever to ask the question again.

That makes this a difficult blog post to write: the subject has caused quite a bit of confusion and, in the past, some strong emotions. So this post will result in either a flurry of comments or a deathly silence.

I cannot start off by giving a wonderfully simple definition. Wouldn’t it be wonderful to simply say “round data is …”? However, I can’t. What I can offer is the vision of round data, for that is what it really is: a vision. I believe round data is all about context. I have taken part in many conversations about the CDISC standards and context, and I have even understood some of them. In the current standards, you will find a lot of implicit context, or assumptions. It is these assumptions that get us into trouble, because two implementations of the same standard will undoubtedly make different assumptions, and thus we don’t have a standard; we have two similar implementations.

Consider the value 180. On its own, that one item of information tells you nothing. If I add the units “cm” to make 180 cm, you now know it is a dimension. Given the magnitude, you might assume it is a height. If I now add HEIGHT to build HEIGHT=180 cm, I am beginning, but only beginning, to add the context around the recording, or observation, of the height.

But more questions immediately emerge. When did I observe, why did I observe, what happened before and what happened after? So the observation needs to be placed into the context of the method of collection and the collection instrument, be it a case report form or a medical record, either of which may be paper or electronic. A case report form, in turn, probably exists within the context of a study and thus a protocol. So, as you start to see from this very simple picture of our clinical research world, the context of a simple observation becomes increasingly complex.
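
To make the layering concrete, here is a minimal Python sketch of the same progression, from a bare 180 to a height recorded within a study. The class names, fields and sample values (`Context`, `Observation`, "STUDY-001", the date) are hypothetical illustrations of mine and do not come from any CDISC standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Context:
    """Where and how an observation was recorded (hypothetical fields)."""
    study: str
    instrument: str      # e.g. "CRF" or "medical record"
    collected_on: str    # ISO 8601 date


@dataclass(frozen=True)
class Observation:
    value: float
    unit: Optional[str] = None
    test: Optional[str] = None
    context: Optional[Context] = None

    def describe(self) -> str:
        """Render only what the recorded context actually supports."""
        text = f"{self.value:g}"
        if self.unit:
            text += f" {self.unit}"
        if self.test:
            text = f"{self.test}={text}"
        if self.context:
            c = self.context
            text += f" ({c.instrument}, study {c.study}, {c.collected_on})"
        return text


# Each step adds one layer of context to the same number.
bare = Observation(180)
with_unit = Observation(180, "cm")
named = Observation(180, "cm", "HEIGHT")
in_context = Observation(
    180, "cm", "HEIGHT", Context("STUDY-001", "CRF", "2012-03-01")
)

for obs in (bare, with_unit, named, in_context):
    print(obs.describe())
```

Even this toy version leaves the "why" and the "what happened before and after" unanswered, which is rather the point.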

What we strive for is to place the observation into the right context. A human does that very well. You or I can read the protocol, examine the annotated CRF and look at the resulting dataset. But if we want to get more from our data, and to be able to aggregate easily across studies, we need the computer, the machine, to have access to that same context. Some of the things we might wish to see are:

  • Relationships between a clinical observation and the planned observations: did we capture everything intended? Did we capture more? Which observation is which?
  • Parent-child relationships, for example, where multiple results came from a single sample
  • Relating adverse events to interventions
  • Terminology relationships
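
One possible way to give a machine access to relationships like these is to record each one as an explicit (subject, predicate, object) statement. The Python sketch below does so for a handful of invented records. Every identifier and predicate name in it (`obs-1`, `fulfils`, `derived_from` and so on) is hypothetical, and it is only one of many ways such relationships could be described.

```python
from collections import defaultdict

# (subject, predicate, object) statements; all identifiers are hypothetical.
triples = [
    ("obs-1", "fulfils", "planned-height"),
    ("obs-2", "fulfils", "planned-weight"),
    ("obs-3", "derived_from", "sample-A"),
    ("obs-4", "derived_from", "sample-A"),
    ("ae-1", "possibly_related_to", "dose-2"),
]

# What the protocol said we would observe (also hypothetical).
planned = {"planned-height", "planned-weight", "planned-pulse"}


def subjects_by_object(predicate):
    """For one predicate, map each object to the subjects pointing at it."""
    index = defaultdict(list)
    for subject, pred, obj in triples:
        if pred == predicate:
            index[obj].append(subject)
    return index


# Did we capture everything intended?
fulfilled = set(subjects_by_object("fulfils"))
missing = planned - fulfilled

# Which results came from the same sample?
children = subjects_by_object("derived_from")["sample-A"]

# Which adverse events are linked to which interventions?
ae_links = dict(subjects_by_object("possibly_related_to"))

print(missing, children, ae_links)
```

Once the relationships are stated explicitly, questions such as "did we capture everything intended?" become simple queries and no longer require reading the protocol.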

We have some of these relationships today but they are, as I have said, implicit. The value and units of our observation are related, but only by an implicit positional relationship: the two columns are next to each other in the dataset, and the data specification (human-readable only) tells me what those two columns contain. We need to make those relationships tangible. SDTM has gone a long way towards achieving this, but it suffers from providing the relationships only in human-readable form and from lacking the scope of relationships that we may wish to see in the future.
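
The difference between a positional and a tangible relationship can be shown with a small Python sketch. The row layout and field names below are assumptions made purely for illustration and are not taken from SDTM.

```python
# Positional: the link between result and unit lives only in a written
# specification that says "column 1 is the result, column 2 is its unit".
RESULT_COL, UNIT_COL = 1, 2
row = ("HEIGHT", "180", "cm")


def read_positional(record):
    """Pair result and unit by trusting the documented column order."""
    return record[RESULT_COL], record[UNIT_COL]


# Explicit: the unit is attached to the result inside the data itself.
explicit = {"test": "HEIGHT", "result": {"value": "180", "unit": "cm"}}


def read_explicit(record):
    """Pair result and unit by following the recorded relationship."""
    result = record["result"]
    return result["value"], result["unit"]


# The same data with two columns swapped: the positional reading silently
# returns the wrong pairing, because nothing in the data says otherwise.
reordered = ("HEIGHT", "cm", "180")

print(read_positional(row))        # matches the specification
print(read_positional(reordered))  # value and unit are swapped
print(read_explicit(explicit))     # unaffected by any column order
```

The positional version works only for as long as everyone reads and follows the same specification, which is exactly the kind of assumption that produces two similar implementations where one standard was intended.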

We therefore need to build on SDTM and all the other good work to get to the round world. But the relationships extend beyond a single standard. Consider the issue of traceability of submitted data back to the case report form. The relationships in this scenario encompass not only SDTM but also CDASH and potentially ADaM.

So a definition for round data? I would suggest round data as being nothing more than a richness of relationships. The challenge is defining the relationships we want, how best to describe them and how to transmit them.

One comment on “The FDA and a Round World”

  1. Dave, thanks for bringing this to the surface again, this is important. Ignoring the round world has contributed to the problems Pharma have always faced when aggregating and re-using data, and this has been a frustration of mine for my whole working life. Pharma and FDA have shared interests: to understand and interrogate data quickly and cheaply, whether for an individual study, an individual compound, a particular class of drug or broader.

    As you note, SDTM is not a complete solution to this problem. The solution to this is to utilise formal modelling of data from the point of study design. If a good model is employed, the context of the data can be made explicit in a way that computers can make use of the information.

    Here at GSK, and within the CDISC SHARE project, such a modelling approach is being employed. The model of clinical research used by both organisations is BRIDG, coupled with a datatype standard (ISO 21090). We have done enough work at GSK to feel comfortable that this approach will pay dividends. The BRIDG model, if implemented well, has enough capability to differentiate between activities conducted because they were planned up front and activities carried out because they were triggered by an “in-study” finding.

    Here is a link to two recent public presentations on SHARE:
