9 comments on “Searching for an eClinical Vision?”

  1. Nice article Dave. A principle I entirely support.

    The standards do need some support here, and in particular the tools.

    One of the concepts that is not well understood is the risk of a break in the bi-directional flow with most of the tools available today. If you give a SAS or SQL programmer a command line to perform data processes (ETL), the chances are that it will be difficult, if not impossible, to reliably create a reverse thread back to the source.

    A possible solution to this problem is the creation of a 3rd dimension to the SDTM & ADaM data models. In this 3rd dimension we could hold (amongst other things) a pointer (a UUID) to a datapoint in ODM. Currently the two-dimensional column/row model of these tabulations prevents this.

    We may see vendors extend the SDTM – a form of SDTM++ that adds this 3rd dimension to the data that they actually store.

    The beauty of this model is that it doesn’t break the existing specification of SDTM. It extends it in a compatible way. It is also easily manifested in XML, Object models (and HL7 v3 ?).
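
    As a rough illustration of this 3rd dimension (a hypothetical structure sketched in Python, not part of any CDISC specification), each SDTM datapoint could carry a UUID pointer back to its ODM source, so the reverse thread is never lost:

```python
# A minimal sketch (hypothetical structure, not any CDISC spec) of the
# "3rd dimension": each SDTM cell keeps a pointer back to its ODM source.
import uuid

class SDTMCell:
    """One SDTM datapoint plus a provenance pointer into ODM."""
    def __init__(self, value, odm_item_oid):
        self.value = value
        self.odm_pointer = str(uuid.uuid4())  # UUID assigned to the ODM datapoint
        self.odm_item_oid = odm_item_oid      # illustrative ODM ItemOID it came from

# A DM row becomes a mapping from SDTM variable to cells; the familiar
# 2-D view is just the .value slice, the 3rd dimension is the pointer.
dm_row = {
    "USUBJID": SDTMCell("001-001", "IT.SUBJID"),
    "SEX":     SDTMCell("F", "IT.SEX"),
}

print(dm_row["SEX"].value)         # the 2-D view: just the value
print(dm_row["SEX"].odm_item_oid)  # the 3rd dimension: where it came from
```

    Because the pointer lives alongside the value rather than replacing any SDTM column, existing consumers of the 2-D view are unaffected, which is what makes the extension compatible.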

    One other critical point that you have not raised above is the need for logic (rules) context for the resulting data sets (unless this was included in the scope of your metadata). If organizations hope to use SDTM or ADaM as the format of data for a Global Data Repository, then they must also store the context information. Cutting across millions of records of historical data will be largely pointless if the rules that were originally applied to screen and clean this data are missing. I have a proposed solution for that one, but I will save it for another time!

    1. Doug,
      Your response raises several issues – linking standards, metadata, expansion of standards, formats for warehouses – which deserve more than a quick response. I may roll some into future posts. Having said that, here are very quick answers, flawed as they are by their simplicity:
      1. I would not link ODM to SDTM. I would link CDASH to SDTM, and ODM to a new SDTM transport; we need the right linkages at the right level.
      2. However, that may be wrong, because what we want is metadata across the continuum, and splitting our data into CDASH/SDTM “pots” goes against that. The split reflects the history of development, not necessarily what we need in the future.
      3. For data cleaning, edit checks et al., we need to preserve the metadata for each study, i.e. a copy of the metadata – in all its glory – that we used for that study.
      4. Is SDTM the right “format” for a warehouse? I will stick my neck out here. No. Why? Because it is a presentation of the data, and that does not mean it is the best format for storage. There are other ways we may wish to see or aggregate the data.
      As I say, very quick answers which I may expand into future posts.
      Dave

  2. It is not my intention to advertise my SDTM-ETL product here, but what Doug suggests is exactly what we do in our tool: we map from ODM (metadata) to SDTM, and store the mapping script (which is a mixture of XPath expressions and a Perl-like instruction language) in the define.xml. We use (some may say abuse) the define.xml “ImputationMethod” element for this. If the latter is retained in the final define.xml (most of our users remove it just before submission), the reviewer can easily see how the SDTM datapoint was retrieved/derived from the ODM.
    For example (and hoping this blog entry tool does not ruin the XML):

    # Mapping using ODM element ItemData with ItemOID IT.SEX
    # Using SDTM CodeList CL.SEX
    # Using a CodeList mapping between ODM CodeList CL.SEXF and SDTM CodeList CL.SEX
    $DM.SEX = xpath(/StudyEventData[@StudyEventOID='SE.VISIT0']/FormData[@FormOID='FORM.DEMOG']/ItemGroupData[@ItemGroupOID='IG.DEMOG']/ItemData[@ItemOID='IT.SEX']/@Value);
    $NEWCODEDVALUE = '';
    if ($DM.SEX == '0') {
        $NEWCODEDVALUE = 'F';
    } elsif ($DM.SEX == '1') {
        $NEWCODEDVALUE = 'M';
    } else {
        $NEWCODEDVALUE = 'U';
    }
    $DM.SEX = $NEWCODEDVALUE;

    The advantage is not only complete traceability; it also allows a lot of reuse between studies, especially when sponsors use an ODM-based library for setting up studies (as is done by Eli Lilly). For CDASH forms, the reuse is even higher.

    I am also intending to work (if time permits) on an example XML file that combines a define.xml (in one MetaDataVersion) and the original study design (in another MetaDataVersion), and then to apply a stylesheet so that the result shows both the SDTM metadata and the original forms, the latter annotated with SDTM information.

    Best regards,

    Jozef Aerts
    XML4Pharma

  3. OK, the blog submission tool ruined the XML, here another try …

    <ItemRef ImputationMethodOID="IMP.MyStudy:DM.28.DM.SEX" ItemOID="DM.SEX" Mandatory="Yes"
    OrderNumber="13"/>

    <ImputationMethod OID="IMP.MyStudy:DM.28.DM.SEX">
    # Mapping using ODM element ItemData with ItemOID IT.SEX
    # Using SDTM CodeList CL.SEX
    # Using a CodeList mapping between ODM CodeList CL.SEXF and SDTM CodeList CL.SEX
    $DM.SEX = xpath(/StudyEventData[@StudyEventOID='SE.VISIT0']/FormData[@FormOID='FORM.DEMOG']/ItemGroupData[@ItemGroupOID='IG.DEMOG']/ItemData[@ItemOID='IT.SEX']/@Value);
    $NEWCODEDVALUE = '';
    if ($DM.SEX == '0') {
        $NEWCODEDVALUE = 'F';
    } elsif ($DM.SEX == '1') {
        $NEWCODEDVALUE = 'M';
    } else {
        $NEWCODEDVALUE = 'U';
    }
    $DM.SEX = $NEWCODEDVALUE;
    </ImputationMethod>

  4. Jozef
    A couple of comments:
    1. This is the sort of automation and re-use we are looking for; assuming good metadata, we should not really need to do anything, once the form has been created, to produce SDTM.
    2. A small and fine point, but I would stress that your “mapping” or “transformation” takes your CRF (say a company standard, given the use of 0 for Female and 1 for Male) held in an ODM structure and transforms it to SDTM held in a SAS XPT structure. And, as you say, that form and the automation/reuse get better if the mapping is from a CDASH-compliant form (in ODM) to SDTM (in SAS XPT). But the point I wish to make is the distinction between the metadata (the CRF design, the tabulation design) and the manner of representation/storage (ODM or SAS XPT).
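    That distinction might be sketched as follows (a hypothetical Python illustration, reusing the codelist values from the example above): the mapping metadata is one thing, and it drives the same transform regardless of which physical representation holds the data on either side.

```python
# A hypothetical sketch: the mapping metadata (CRF codelist 0=Female/1=Male
# and its SDTM target) is separate from the storage representation
# (ODM XML on one side, an XPT-style table on the other).
SEX_CODELIST_MAP = {"0": "F", "1": "M"}   # company CRF codes -> SDTM CL.SEX

def map_sex(crf_value: str) -> str:
    """Apply the codelist mapping; unknown codes become 'U' (unknown)."""
    return SEX_CODELIST_MAP.get(crf_value, "U")

# The same metadata drives the transform whatever the physical format:
collected = ["0", "1", "2"]                  # values as held in an ODM structure
tabulated = [map_sex(v) for v in collected]  # values destined for an XPT table
print(tabulated)  # ['F', 'M', 'U']
```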
    Dave

  5. Hi Dave, I fully agree.
    Please drop the word “SAS” from “SAS XPT” everywhere in your answer, just to make clear that there are ways other than SAS software to generate the SDTM datasets. Too many people still believe that it is not possible (or not wise) to generate SDTM datasets without SAS software. Our software does not use SAS at all, though it generates the SDTM datasets in XPT format.
    What we do is generate the SDTM first in an ODM-like structure, keeping the comments with their parent records, allowing non-standard variables in the domains, allowing more than 200 characters, etc., and then, in the final step, “downgrade” to XPT.
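    A much-simplified sketch of that final “downgrade” step (hypothetical function names, not the tool's actual code): the intermediate structure allows long values, but XPT caps character variables at 200 characters, so long text must be split off somewhere (in practice, e.g. into SUPP-- records).

```python
# A simplified, hypothetical sketch of "downgrading" a rich intermediate
# value to fit the XPT (SAS transport) 200-character limit.
XPT_MAX_CHARS = 200

def downgrade_value(value: str) -> tuple[str, str]:
    """Return (the part that fits in XPT, the overflow to store elsewhere)."""
    return value[:XPT_MAX_CHARS], value[XPT_MAX_CHARS:]

long_comment = "x" * 250
kept, overflow = downgrade_value(long_comment)
print(len(kept), len(overflow))  # 200 50
```

    The point of working in the richer structure first is that nothing is lost until this very last step, and the overflow is still available to be carried along rather than silently truncated.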

    Coming back to your answer to Doug’s post.
    I completely agree with your point 4 about SDTM being a presentation of data. I always tell my audience that SDTM is a categorization (and partially an interpretation) of the clinical data to make them presentable to FDA reviewers (i.e. putting snippets of data in drawers). Any categorization without keeping a link to the source means loss of information, and this is exactly what is happening now, as Doug states. I do understand that the FDA wants to compare studies and thus likes this “putting in drawers”, but I also have my doubts whether this is the ideal “format” for a warehouse. In too many cases I have seen “square peg, round hole” SDTM mappings, making the data unusable for comparison between studies.
    Therefore I was so charmed by your slide from back in 2006, where every datapoint can be traced back (together with its full audit trail) to its origin. We are now 5 years later, and this idea has still not been implemented (by the FDA, at least; I do not know about the status at pharma companies).

    So why wouldn’t we invert the whole thing and, for example, submit ODM files with metadata AND clinical data (audit trail and all) to the FDA, with the study metadata annotated with SDTM information? This would allow them to “replay” the whole study (as demonstrated in your slide), and still allow them to retrieve the categorized information (SDTM – the set of “drawers”) from it. It would also allow them to have/develop different views on the data, not only the “SDTM view”.
    All they need is some software that understands XML and ODM.

    But we also do need to come to “standardized buckets of (meta)data”, as you have always promoted, and this is where SHARE comes into play. Such standardized buckets would enormously help us do data collection (especially for safety domains) in a more standardized way. For example, we should have such a “bucket” for each VS test. This would also enormously help the FDA, as data points would then really become comparable between studies (or it would become immediately clear that some “alike” data points are not comparable at all). It would also enormously help those working on EHR integration, as it e.g. allows making good mappings to SNOMED codes.
    The buckets could e.g. be developed as mindmaps and then become available as RDF, ODM, HL7 v3, etc. The technical format is essentially unimportant, but of course we need to offer users a way of exchanging that information in standardized formats.
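    As a toy illustration (all names, values, and codes below are placeholders, not authoritative CDISC controlled terminology or SNOMED content), such a bucket for a single VS test might bundle the test definition, permitted units, and external code mappings into one reusable unit:

```python
# A hypothetical "standardized bucket" for one VS test: the test definition,
# its permitted units, and hooks for external codes live together, so the
# same datapoint is collected and validated identically across studies.
SYSBP_BUCKET = {
    "vstestcd": "SYSBP",                    # illustrative test code
    "vstest": "Systolic Blood Pressure",
    "permitted_units": ["mmHg"],
    "datatype": "integer",
    "external_codes": {"SNOMED": "..."},    # placeholder, not a real mapping
}

def validate(bucket: dict, value: int, unit: str) -> bool:
    """Check a collected datapoint against its bucket's constraints."""
    return unit in bucket["permitted_units"] and isinstance(value, int)

print(validate(SYSBP_BUCKET, 120, "mmHg"))  # True
print(validate(SYSBP_BUCKET, 120, "kPa"))   # False: unit outside the bucket
```

    The "format is unimportant" point shows up here too: the same bucket could be serialized to RDF, ODM, or anything else without changing what it asserts about the data.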
    This is nothing new; you have discussed this with many of us over the last months. But we need to keep on convincing people of this, and especially to find more people who volunteer to work on it.
