FHIR coding, and the codeableConcept datatype

Many elements in the FHIR resources have a coded value, of which one of the more common types is the codeableConcept. (The other two are the code and the coding datatypes, which we’ll meet shortly.) Coded data allows for precision of data collection and more accurate reporting, and lays the foundation for Decision Support, so it is very important in the recording of healthcare information.

(I’m going to preface this discussion by stating that coding clinical data is a complex topic, and this post does not go into all the details of coding data – it is intended to make sure that you have a basic understanding of how to use this datatype in FHIR. There are lots of links where you can go to get more information if you need it.)

In the words of the specification: “A CodeableConcept represents a value that is usually supplied by providing a reference to one or more terminologies or ontologies, but may also be defined by the provision of text”. It’s actually quite well described in the specification (as most things are, in truth), but I thought I’d write a short note about it – mostly because I said I would at the end of the last post – but also because it’s so commonly used in FHIR that it’s worth being familiar with it.

Examples of where you might use it include:

and there are lots of other situations as well – apart from a string it’s probably the most commonly used datatype in FHIR (I should write a short program to check that – perhaps I’ll leave it as ‘an exercise for the reader’!)

If you look at the definition of the codeableConcept, you will see that it is actually a combination datatype:

  • a text property of type string
  • 0 or more coding properties of type coding

Evidently the coding datatype does most of the work. This structure allows you to capture the clinical information that is being coded (ie the description in plain text) as well as multiple representations of that text in the coding system of choice.

Confused? Well, imagine that the clinician entered “atopic asthma” and you wanted to represent that in a problem list. You might want to code that as “Allergic Asthma” in SNOMED (389145006) and “Extrinsic asthma – unspecified” in ICD-9 (493.00). The codeableConcept allows you to have both of these codes, plus the original text entered by the clinician, upon which the encoding was performed.

The coding datatype is the actual representation of the code in the coding system of choice. It has the following properties – all of them optional.

  • System. This is a URI that identifies the terminology (or collection of codes) from which the actual code value is chosen. Although marked as optional, if it is not present, then the value of the code is severely diminished. There are a number of places you can get this from:
    • There is a set of named lists in FHIR that has the most commonly used terminologies
    • The HL7 OID registry
    • Any other option that uniquely identifies the set of codes.
  • Version. The version of the terminology that the code is drawn from. This applies particularly to SNOMED, LOINC & ICD.
  • Code. The actual code from the terminology. If this is absent but there is a system property, then the meaning is that no suitable code could be found in that terminology. In all other circumstances, it doesn’t make a lot of sense not to include it.
  • Display. This is the display of this code in the terminology (which may not be the same as the text being coded).
  • Primary. If this is true, then it means that this code was explicitly chosen by the user – e.g. it was in a drop-down or autocomplete list that the user selected from, rather than being generated ‘second hand’ – say by some coding algorithm, or by mapping from some other terminology. The significance is that it will be the most accurate coding, if there is more than one coding present.
  • Valueset. This is included for 2 reasons:
    1. In some cases, the valueSet itself defines the code system and there’s no other way to determine the meaning of the codes (e.g. the codes are “a”, “b” or “c” from a questionnaire)
    2. There are some use cases where the set of choices available when the code was chosen is relevant to interpreting the code, e.g. if the user only had 10 choices, they may have made a different choice than if there had been 100.

    However, even when using a value set, you still need to populate the system.
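
As a small illustration of the ‘system without a code’ case described for the Code property, a fragment along these lines would say that SNOMED was consulted but nothing suitable was found. This is a sketch only: the clinician’s text is invented for the example, and the element is assumed to be the code property of a condition resource.

```xml
<code>
    <!-- System present, code absent: no suitable code was found in SNOMED -->
    <coding>
        <system value="http://snomed.info/id"/>
    </coding>
    <!-- Hypothetical wording; the clinician's original text is still kept -->
    <text value="Wheezy at night, worse around cats"/>
</code>
```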

So, let’s create a rendering of the asthma example given above, assuming that we are placing this in a condition resource in the code property:


    <code>
        <!-- SNOMED code -->
        <coding>
            <system value="http://snomed.info/id"/>
            <version value="International Release – 20130731"/>
            <code value="389145006"/>
            <display value="Allergic Asthma"/>
            <primary value="true"/>
        </coding>
        <!-- ICD code -->
        <coding>
            <system value="urn:oid:2.16.840.1.113883.6.42"/>
            <version value="9"/>
            <code value="493.00"/>
            <display value="Extrinsic asthma - unspecified"/>
            <primary value="false"/>
        </coding>
        <text value="Atopic Asthma"/>
    </code>

From this we can tell that the clinician entered ‘atopic asthma’ but accepted the SNOMED code of allergic asthma.

When a terminology or codeset is associated with a particular resource property, this is called a binding. Each resource description in FHIR displays the bindings that are defined for that resource immediately below the description of the resource content. Each binding has the following properties:

  • Path. The resource property that is being bound.
  • Definition. Describes the binding.
  • Type. The ‘strength’ of the bindings. Options are:
    • Fixed means that there is a specific set of values defined in the spec that cannot be extended. Usually this is for ‘code’ datatypes (see below)
    • Incomplete means there is a recommended set of values, but it can be extended by implementers
    • Example – it is really up to each implementation to decide
    • Unknown – not yet decided
  • Reference. A link to the terminology or codeset.

Some other things to note:

  • In theory a resource could have a property with a datatype of coding – in practice this is not often done (if at all). The codeableConcept is far more flexible than coding – for example, it would support a property that *can* be coded, but for some reason is not in this specific instance – such as a patient problem/condition where the problem description has been captured, but not yet formally coded.
  • There is also a code datatype that many resources employ. Generally this is used for ‘workflow’ type purposes, and the codeset from which it is drawn is defined – and fixed – by the designers of the resource. The mode property of the List resource is an example of this – it is important that any user of a List resource can correctly interpret this property on any list, because getting it wrong raises clinical safety issues.
  • The equivalent to the codeableConcept in v3 (or CDA) is the CD – Concept Descriptor – datatype. CD is more complex than codeableConcept, though the extra functionality it offers (such as translations) is not often needed in practice (which is why it was simplified in FHIR). Should you find that you do need it, the FHIR extension mechanism can be applied to datatypes as well as resources.
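
To make the first two notes concrete, here is a rough sketch of a codeableConcept that carries only text (captured, but not yet coded), next to a property of type code. Both fragments are illustrative rather than lifted from the specification: the condition wording is made up, and ‘working’ is assumed to be one of the values that the spec fixes for the mode property of a List.

```xml
<!-- Condition code captured as free text only: no coding elements yet -->
<code>
    <text value="Wheezy at night, worse around cats"/>
</code>

<!-- List mode is a plain code: its value comes from a set fixed by the spec.
     "working" is assumed here to be one of the permitted values. -->
<mode value="working"/>
```

Codings can be added to the first fragment later without disturbing the text, which is the flexibility that a bare coding would not give you.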

So, enjoy the codeableConcept!

About David Hay
I'm a Product Strategist at Orion Health, Chair emeritus of HL7 New Zealand and co-Chair of the FHIR Management Group. I have a keen interest in health IT, especially health interoperability with HL7 and the new FHIR standard.

24 Responses to FHIR coding, and the codeableConcept datatype

  1. Since you asked:

    Resource (reference): 245
    string: 181
    CodeableConcept: 151
    code: 104
    Identifier: 43
    dateTime: 42
    boolean: 40
    uri: 32
    Period: 30
    integer: 26
    Quantity: 24
    Coding: 15
    Contact: 13
    Ratio: 11
    decimal: 11
    instant: 11
    Type: 10
    id: 9
    Attachment: 9
    Address: 7
    date: 7
    Extension: 6
    Schedule: 6
    HumanName: 5
    base64Binary: 5
    Age: 4
    Range: 4
    Structure: 4
    oid: 4
    SampledData: 3
    Duration: 2
    Money: 1
    idref: 1

    • David Hay says:

      so, apart from the resource reference, I was right! whew…

      (Now, do I update the original post so that it looks as if I knew what I was talking about…)

  2. btw, there is a difference between the system and the value set reference. The system is a reference to the logical definer of the code value. This allows a system to identify and know the code and its meaning.

    The value set reference refers to a formal description of the set of the codes that the user could choose from. This helps a human reviewer of the data to understand the set of codes that a user could pick from. See http://www.healthintersections.com.au/?p=567 and http://www.healthintersections.com.au/?p=1716.

  3. FHIR Starter says:

    It’s good to see SNOMED included in your example, but check how you’ve used the version property. RF1 actually refers to release format version 1, versus RF2 which is a newer and more or less equivalent data format.

  4. FHIR Starter says:

    No, that’s just the distribution file format. SNOMED is updated every six months and its version number will be something like “International Release – 2013-07-31”.

  5. David Hay says:

    Updated the meaning & use of the valueSet property based on subsequent discussions.

  6. FHIR Starter says:

    You have a sequence of two coding elements and one text element inside a code element. Do you have a schema that says the elements have to be sequenced this way? I’m thinking about ease of parsing. Perhaps you don’t use all the many and varied features of XML Schema.

    • David Hay says:

      Each resource has a specific XSD schema (and schematron) that is generated automatically from the design files (which are currently Excel files) at build time. I haven’t checked, but I understand that the order of elements is set by them…

  7. Victor Chai says:

    David,

    You said that “From this we can tell that the clinician entered ‘atopic asthma’ but accepted the SNOMED code of allergic asthma.”. How? By comparing the “text” xml element value against the “display” element value in the primary concept? What’s the logic you used to determine that the code is post-coded? FHIR only states that “…the text is the representation of the concept as entered or chosen by the user,…”.

    Secondly, you said that “CD is more complex than codeableConcept, though the extra functionality it offers (such as translations) are not often needed in practice (which is why it was simplified in FHIR).”. I think that is not entirely true: the design of CodeableConcept also uses translations, except that it does not wrap the translated concept in a “translation” xml element; instead it just lists the equivalent concepts at the same level as the primary concept.

    • David Hay says:

      Hi Victor,
      Because the SNOMED code is marked as primary, we assume that the user chose it – from the spec: “If this code was chosen directly by the user”

      wrt CD – you may well be correct, translations might not have been the best example. Nevertheless, v3 datatypes are not known for their simplicity…

      cheers…

  8. FHIR Starter says:

    I want more from a coded concept. I would like FHIR to have a subtype of codeableConcept that directly represents a SNOMED concept (snomedConcept/preferredTerm etc) and permits post-coordinated concepts with the compositional syntax, eg

    274457001 | reduction of dislocation of ankle | : 272741003 | laterality | = 7771000 | left |

    Other subtypes would be defined for other major vocabularies.

    I don’t want to have to use an OID to indicate a coding system.

    • David Hay says:

      Well, you certainly don’t need to use an OID – in fact the preference is not to. With regard to your desire for a more SNOMED-specific codeableConcept I may have to defer that to others, but I’ll see if I can get a comment for you…

    • You can already use the post-coordination syntax as defined by SNOMED. The syntax of the code is defined by the code system. For post-coordinating code systems like SNOMED and UCUM, any valid expression is a legal string for code.

      That said, when sending SNOMED codes, the display names should be excluded from the string. They make computation a lot more difficult, given that many systems will process SNOMED codes as strings with no parsing. If you have some need to send the SNOMED code with embedded human readable display names, that should be done using an extension.

      And it’s more than a preference not to use an OID. You *must* use the URI defined in the FHIR spec for SNOMED: http://snomed.info/id

  9. FHIR Starter says:

    Putting the post-coordinated SNOMED expression minus the text in the code element, my example becomes something like:

    I suppose that works, although I don't see why FHIR shouldn't have a special construct for SNOMED references. It would be handy, for example, to be able to bring SNOMED attributes like finding site and laterality into a FHIR resource.

    Sorry, I've been using CDA and I tend to see OIDs everywhere.

  10. FHIR Starter says:

    Putting the post-coordinated SNOMED expression minus the text in the code element, my example becomes something like (apologies for the pseudo XML; my angle bracketed attempt above disappeared):

    coding
        system
            @value="http://snomed.info/id"
        version
            @value="International Release – 20130731"
        code
            @value="274457001 : 272741003 = 7771000"
        display
            @value="Reduction of dislocation of ankle : laterality = left"
        primary
            @value="true"
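
    Written out as well-formed XML, that would presumably read roughly as follows. This is only a sketch: the element order is assumed to follow the asthma example in the post, and the display terms are left out of the code value as suggested in the reply above.

```xml
<coding>
    <system value="http://snomed.info/id"/>
    <version value="International Release – 20130731"/>
    <!-- Post-coordinated expression, with the display terms stripped out -->
    <code value="274457001 : 272741003 = 7771000"/>
    <display value="Reduction of dislocation of ankle : laterality = left"/>
    <primary value="true"/>
</coding>
```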

    I suppose that works, although I don’t see why FHIR shouldn’t have a special construct for SNOMED references. It would be handy, for example, to be able to bring SNOMED attributes like finding site and laterality into a FHIR resource.

    Sorry, I’ve been using CDA and I’m seeing OIDs real and imagined everywhere.

    • You certainly *can* create a special extension structure for SNOMED codes (you can create extensions for anything), but I’m not sure you necessarily should. Consumers of SNOMED codes are going to be of two types:
      1. Systems that look on it as “just another code” – a string to store and possibly to validate against a list of valid codes. They don’t see SNOMED as special, and don’t know how to parse it. Having a separate structure is useless to them. (You’d have to populate the “code” property anyhow.)
      2. Systems that understand how to parse and process SNOMED codes. These systems do understand what the distinct characteristics are, but seeing as they already know how to parse SNOMED strings, pre-parsing the code into separate fields actually creates more work for them.

      I guess the question is: What’s the benefit of having a special pre-parsed syntax for exchanging SNOMED? Should we do the same thing for UCUM (e.g. splitting “mg/l” into its constituent “m”, “g” and “l” components)? If not, why the difference in behavior?

      • FHIR Starter says:

        I accept that argument. Perhaps my issue is with SNOMED and the need for codes that have to be parsed to make sense of them.

      • Well, whether you need to parse a SNOMED code (or a UCUM code or any other post-coordinated expression) depends on what you’re trying to do. It’s perfectly legitimate for an application to treat “mg/l” or “fracture of the left arm” (too lazy to look up the formal SNOMED coding) as a code in its own right. The only time you need to parse is if you don’t recognize the post-coordinated expression *and* you need to perform some sort of computable analysis on it. And in those cases, you’ll likely just pass the expression off to specialized software specific to that code system along with any other parameters needed to answer the needed question “is this a type of chronic condition?” or “what’s the base SI equivalent of this unit?”, etc. Certainly expecting every EHR system to hand-code the logic to parse and process SNOMED or UCUM would not be reasonable.

  11. Victor Chai says:

    Hi, I think what you are looking for is the ability to represent SNOMED CT post-coordination structurally, instead of as an expression string in the value attribute, so that the requester does not need to parse the individual attributes out of the expression. This is similar to the discussion of HL7 RIM vs SNOMED CT, on which I gave my personal opinion on my blog:

    http://healthinterconnect.blogspot.sg/2011/10/use-hl7-v3-rim-or-snomed-post.html.

    If that’s the case, then what you are asking for is to design the FHIR CodeableConcept in a similar way to the HL7 v3 RIM data types version 1, which are used in CDA.

    • FHIR Starter says:

      That’s almost certainly where my question inevitably leads – I’m asking for nothing less than … the RIM of FHIR.
