What I Want, What I Really Really Want
It is not often that you can use lyrics from a Spice Girls hit single for a title, but the words do sum up rather well the contents of this post. In a previous post, I mentioned the development of 55 therapeutic area standards being undertaken at the request of the FDA by CDISC and the Critical Path Institute, the so-called 55 in 5 or CFAST project. This initiative is designed to extend the coverage of the CDISC domains from the so-called safety domains into the therapeutic areas. More information on the areas being targeted can be found on the FDA website.
So this is an interesting initiative. It is important. In fact it is very important, but first the bad news. The 55 areas represent a massive piece of work. Over five years it equates to 11 areas per year, assuming that each takes a year. It suggests significant staff numbers: 10 or 15, maybe more, full-time staff. The big danger I perceive is that we develop 55 sets of standards that are not fit for purpose and that do not take into account the lessons we should have learnt from earlier initiatives. The good news is that this can be done well and to the benefit of all, if it is well thought out.
It is easy to criticize, so this post gives my perspective as someone who a) has been using standards within a pharmaceutical company and wants to benefit from emerging standards so as to streamline the processes within that company; and b) helps develop new standards and has spent time looking at the structure of content standards.
To begin, and just as an example of the need, I will outline a couple of issues encountered within my current work.
The first is the Tanner Scale. This is a scale (1 to 5) of physical development in children, adolescents and adults. Now where do I place this within SDTM in terms of domain? I would immediately think Questionnaire (QS). But I have seen it placed in Subject Characteristics (SC). Which is right? SDTM experts will justify one or the other, and that justification might be based on study-specific criteria. But I want it to be obvious to someone who has only read the SDTM Implementation Guide, the “average” user in industry. Now one question does arise: should a piece of content always be placed within the same domain within SDTM? From the FDA viewpoint this makes sense when trying to pool or integrate data. Does it make sense from the FDA perspective for other uses? From other scientific perspectives it may not. These issues seem rather important to me, but I have not heard any debate on these topics. The important issue is the fact that I and others can debate it. And that debate does not bode well for a standard. We need precision.
A second example is the recent release of CDISC questionnaire terminology. Within this package were Test Codes, Test Names and Category Codes for a number of instruments. One in particular that I have worked with is the Work Productivity and Activity Impairment (WPAI) scale. Absent from the terminology, however, were the results scale and its associated terminology. I know the results scale in this particular case is not complex, but there is still an opportunity for the “average” user to get it wrong. Therefore, while I and other companies can align our tests, we cannot align the results. Here there are gaps in the definitions for a given instrument.
I use these two simple examples to highlight the sort of information that needs to be defined and what we want from the therapeutic area (TA) development. So below is my specification of the TA needs. This is my set. Please feel free to comment, add or disagree. No one person has a monopoly on good thoughts and/or ideas.
I have split my thoughts into five parts: Scope, Methodology, Tools, Outputs and Training:
- Scope: What do we mean by therapeutic area? Is it a core of definitions, or some meaningful subset? These simple questions then lead to thoughts about versions and increments. It strikes me that we will never be able to deliver the complete scope of an area in one release. Also, with a large number of areas being worked upon in parallel, are we in danger of flooding the user community? Can the FDA cope with the flood? Scoping is therefore important, as are the release cycles and mechanisms.
- Methodology: We need a methodology that delivers consistent structuring of definitions. I also want to understand the methodology so that I can develop content using it. This allows me to develop consistent content that I could also feed back to the user community in an immediately usable structure. We simply cannot afford to have differences in the structure of the definitions.
Also, these definitions will, in the long run, need to be placed into a repository: the industry metadata repository. If definitions are not structured consistently then we will hit problems when we try to load them into such a repository. Any tools developed to process the data based on the definitions will also have problems with inconsistent structuring of definitions.
- Tools: This in turn leads to the need for tools, both to manage the consistent structuring of the definitions and to manage the sheer amount of work needed. The tools do not need to be complex and sophisticated. We can manage with MS Excel tools in the first instance to get us going quickly and ensure that definitions are structured according to a methodology that brings consistency across the therapeutic areas. What we need to do is decide quickly what the structures are and how best to represent them, and build some simple tools to help manage them and get us going. Tools can also validate the content, checking for consistency and completeness: does the content make sense and have we filled everything in?
- Outputs: We must have high-quality outputs:
  - Consistent – The outputs must be consistent both within each therapeutic area and across the areas. This is something that has not been the case in the past.
  - Complete – The outputs must be complete. We cannot have incomplete terminology or any other item for that matter. Having such gaps results in a non-standard.
  - Usable – The outputs must be usable. To be usable they must include not only human-readable forms (e.g. PDF) but also machine-readable forms that a user can read into, or transform into a form suitable for reading into, their systems so as to gain immediate benefit. These formats will also allow sharing. If we have machine-readable forms, the human-readable forms (PDFs) can be automatically generated.
- Training: We need to train people in the methodology and the tools so as to deliver the attributes discussed above. Without such training we will have great variation in the structure and quality of the deliverables.
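To make the point about simple validation tools a little more concrete, here is a minimal sketch in Python of the kind of completeness and consistency check I have in mind. It is only an illustration under stated assumptions: the class names, field names and example codes are all hypothetical and are not taken from any CDISC deliverable. It flags a test that has no result scale (the WPAI-style gap) and a piece of content that has been placed in more than one domain (the Tanner-style debate).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TestDefinition:
    """One test within an instrument. All field names are hypothetical."""
    test_code: str
    test_name: str
    # Permitted result values; an empty list means no result scale was defined.
    result_values: List[str] = field(default_factory=list)


@dataclass
class InstrumentDefinition:
    """A simplified, hypothetical definition of one instrument."""
    category_code: str
    domain: str
    tests: List[TestDefinition] = field(default_factory=list)


def check_completeness(instrument: InstrumentDefinition) -> List[str]:
    """Return a list of gaps found in a single instrument definition."""
    problems = []
    if not instrument.domain.strip():
        problems.append(f"{instrument.category_code}: no domain defined")
    if not instrument.tests:
        problems.append(f"{instrument.category_code}: no tests defined")
    for test in instrument.tests:
        if not test.test_name.strip():
            problems.append(f"{test.test_code}: no test name defined")
        if not test.result_values:
            problems.append(f"{test.test_code}: no result scale defined")
    return problems


def check_consistency(instruments: List[InstrumentDefinition]) -> List[str]:
    """Report category codes that appear under more than one domain."""
    domains_by_category: Dict[str, Set[str]] = {}
    for instrument in instruments:
        domains_by_category.setdefault(instrument.category_code, set()).add(
            instrument.domain
        )
    return [
        f"{category}: placed in multiple domains {sorted(domains)}"
        for category, domains in sorted(domains_by_category.items())
        if len(domains) > 1
    ]


# Placeholder content only; these codes are invented for illustration.
definitions = [
    InstrumentDefinition(
        "EXAMPLE_A", "QS",
        [TestDefinition("EXA01", "Example question 1", ["1", "2", "3", "4", "5"])],
    ),
    InstrumentDefinition(
        "EXAMPLE_B", "QS",
        [TestDefinition("EXB01", "Example question 2")],  # result scale missing
    ),
    InstrumentDefinition("EXAMPLE_A", "SC"),  # same content, different domain
]

for definition in definitions:
    for problem in check_completeness(definition):
        print(problem)
for problem in check_consistency(definitions):
    print(problem)
```

Nothing here is sophisticated, and the same checks could just as well live in an MS Excel workbook. The point is that once the structure of a definition is agreed, checks like these are cheap to write and can be run across every therapeutic area.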
None of this is easy. But clarity of thinking around the methodology, and the tools to support it, will pay great dividends. The tools, in the first instance, do not need to be sophisticated. What we do need is for the 10, 20 or however many people working on the project to understand what they are producing and why; they need to understand the methodology and be advocates for it.
The views expressed within this post are mine and do not represent the views of any organisation I volunteer for or client that Assero is employed by.