The aim is to develop the ideas in these blog posts by looking at aspects of the requirements while, at the same time, detailing the technology that I am working on to meet the need. We cannot divorce the two; there is no point having the metadata if there is no requirement for it, and it is always dangerous to develop solutions without knowing the problem! I feel I know a fair amount about the problem given my experience over the last 15 years or so, looking at it from a variety of perspectives: from a standards perspective of developing the standards themselves, from a user perspective of using tools and standards to implement studies, and from a technology perspective of developing solutions to support such needs.
I started looking at the requirements in the recent End-to-End post and, as I write this post, I am making my way to the US to attend the PhUSE CSC meeting and the CDISC Interchange that follows immediately after. At the Interchange, there will be a couple of sessions on the End-to-End issue, designed to look at what CDISC needs to do to make this a reality. It will be interesting to see what emerges from the meeting.
The End-to-End article discussed the requirements but this is the high-level list I keep in my head:
- Support generation of the business artefacts, for example:
  - Writing of protocols
  - Setup of studies
  - Creation of study items like SDTM, define.xml, aCRF
  - Creation of analysis items, Clinical Study Reports, etc.
- Metadata to support the above requirements
- Tools to build and maintain the metadata, the Metadata Repository
We can take these very high-level needs and subdivide them, looking at each from the perspective of the input, the desired value to be added and the resulting output. As an example, consider the table of assessments within a study protocol. Should we develop this within the context of a protocol document as an MS Word table, or might we develop it within a tool that permits a more structured approach, one that allows the table to be included within the protocol document but also to serve as an input to study build and all the other processes that require it (SDTM ‘T’ domains)? We can look at the inputs in each of the two scenarios, determine what value is being added, and determine the outputs and how beneficial each approach is to the overall process. It is these sorts of discussions I want to document in the coming months.
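To make the structured alternative a little more concrete, here is a minimal Python sketch of holding the table of assessments once and deriving two outputs from it. Everything in it is an assumption for illustration: the `Visit` and `Assessment` classes, the function names, the study identifier and the example visits are all invented, and the trial-visit records only borrow a few variable names (`VISITNUM`, `VISIT`, `VISITDY`) from the SDTM Trial Visits domain rather than forming a complete or validated dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    number: int
    name: str
    day: int  # planned study day


@dataclass(frozen=True)
class Assessment:
    name: str
    visit_numbers: frozenset  # visits at which the assessment is performed


def protocol_table(visits, assessments):
    """Render the schedule as a plain-text grid for a protocol document."""
    rows = [["Assessment"] + [v.name for v in visits]]
    for a in assessments:
        marks = ["X" if v.number in a.visit_numbers else "" for v in visits]
        rows.append([a.name] + marks)
    # Pad every column to the width of its longest cell.
    widths = [max(len(r[i]) for r in rows) for i in range(len(rows[0]))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in rows
    )


def trial_visits(study_id, visits):
    """Derive simplified trial-visit records from the same visit definitions."""
    return [
        {"STUDYID": study_id, "DOMAIN": "TV",
         "VISITNUM": v.number, "VISIT": v.name, "VISITDY": v.day}
        for v in visits
    ]


# Invented example data, defined once and reused by both outputs.
VISITS = [Visit(1, "Screening", -14), Visit(2, "Baseline", 1), Visit(3, "Week 4", 29)]
ASSESSMENTS = [
    Assessment("Vital Signs", frozenset({1, 2, 3})),
    Assessment("ECG", frozenset({1, 3})),
]

if __name__ == "__main__":
    print(protocol_table(VISITS, ASSESSMENTS))
    for record in trial_visits("STUDY01", VISITS):
        print(record)
```

The point of the sketch is only that a single structured definition can feed both the document and the downstream build, whereas a Word table would have to be re-keyed for the second use.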
From the technology perspective, I outlined some of the background to assessing approaches in the Right Tool For The Job post a few weeks back. One of the technologies I looked at during the summer of 2014 was the Neo4j graph database, one of the so-called NoSQL databases. I managed to load several versions of the CDISC terminology and used this to look at the version management issues associated with Code Lists and Code List Items. This tool is very nice to use, is getting a lot of attention within a wide range of industries and has relatively low costs. However, being a flexible tool, it also allows you to go mad and do some dangerous things; it lacks a little rigour. I will discuss some of the choices in the coming weeks.
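As a rough illustration of the version management question, the sketch below compares two versions of a single code list and reports which items were added, removed or changed. It is plain Python rather than Neo4j, it is not the model I used in the graph database, and the function name `diff_code_list`, the item identifiers and the submission values are all made up for the example.

```python
def diff_code_list(old, new):
    """Compare two versions of a code list.

    Each version is a dict mapping a code list item identifier to its
    submission value. Returns the identifiers added, removed or changed.
    """
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "changed": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }


# Invented identifiers and values, not real CDISC terminology content.
VERSION_1 = {"X001": "YES", "X002": "NO"}
VERSION_2 = {"X001": "YES", "X002": "N", "X003": "UNKNOWN"}

if __name__ == "__main__":
    print(diff_code_list(VERSION_1, VERSION_2))
    # {'added': ['X003'], 'removed': [], 'changed': ['X002']}
```

Even this toy version shows the questions that need answering: whether a changed item is a new version of the same item or a new item, and whether the code list itself gets a new version when only one of its items changes.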
I also spent some time looking at what others were doing, while trying to keep an eye on the target of building an essential MDR that can support the business need. Topics I looked at were the various comments made by the FDA in the past and their desires, what was happening within PhUSE (like the CDISC2RDF project) and within CDISC, use of the BRIDG model, ISO standards such as 11179 and a wealth of other material. During the investigation I was also keeping an eye on potential reuse (such as the CDISC2RDF project, the NCIt export of the terminology in semantic form etc.) and not reinventing the wheel.
In the end I came down to working with the semantic technologies. There is no documented justification, no report, just my gut feel that it is right.
Next I will detail the ‘model’ I chose to work with and the various technology standards I am using. Hopefully that post will emerge shortly after next week's PhUSE and CDISC meetings. A busy week ahead.