Accessing lab data via FHIR – part 2

In the previous post we took a look at the overall organization of the resources involved in representing Laboratory tests and what an API to retrieve them might look like.

But the queries that we discussed were rather ‘blunt’ – just retrieving results for a person with minimal filtering. While that’s often needed, being able to make more targeted queries can be a whole lot more useful (especially when we want to chart results or feed data into Decision Support routines), and that brings us to the topic of coding the data – the subject of this post.

Let’s start by thinking about some representative use cases.

  • Create a chart of glucose or HbA1c results over time (perhaps including medication administration on the chart)
  • Find a person’s most recent CBC results
  • Get the most recent haemoglobin.

Before we start thinking of the specifics of coding in lab tests, a quick review of coded data in FHIR in general is probably a good idea.

The following image is found in the Terminology section of the spec, and shows the relationship between the key terminology resources:

We’ll focus (for now) on 2 of them – the CodeSystem and the ValueSet.

The CodeSystem resource represents the ‘system’ within which a coded value is unique, and where the concept that it represents is defined. It can represent an external terminology such as LOINC or SNOMED, or can contain the actual definition of the concepts directly. In a coded element of a resource instance (which we’ll come to in a moment) it is the ‘system’ element that indicates the CodeSystem, and is generally a url.

The ValueSet is a selector of possible codes for an element described in a profile (or the core spec) from one or more CodeSystems – it doesn’t define the codes themselves. It answers the question “To be conformant to the profile x (or to the main spec), what are the possible values for this element?”. The ValueSet is said to be ‘bound’ to the resource element with a particular ‘strength’, which determines whether an instance can contain a code that is not in the ValueSet and still be conformant to the profile or spec.

For example:

  • The Condition.code element in the spec is ‘bound’ to the ValueSet with the url of
  • This ValueSet in turn includes a number of concepts (codes) from the SNOMED CodeSystem (with the url of
  • The strength of the binding is ‘example’, which means that, actually, any code from any codesystem is allowed. The binding in the spec is intended to be an example of what type of codes are expected.
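
As a rough sketch of how binding strength affects conformance, the Python below checks a single (system, code) pair against a set of pairs standing in for an expanded ValueSet. The function name, the three-way result and the sample codes are assumptions made for illustration; real validation would normally be done by a terminology server or a FHIR validator.

```python
from enum import Enum

SNOMED = "http://snomed.info/sct"  # standard system URI for SNOMED CT


class BindingStrength(Enum):
    REQUIRED = "required"
    EXTENSIBLE = "extensible"
    PREFERRED = "preferred"
    EXAMPLE = "example"


def check_binding(coding, valueset_codes, strength):
    """Return 'ok', 'review' or 'error' for one (system, code) pair.

    valueset_codes is a set of (system, code) tuples that stands in for
    an expanded ValueSet (hypothetical helper, not part of any library).
    """
    if coding in valueset_codes:
        return "ok"
    if strength is BindingStrength.REQUIRED:
        # Only codes from the ValueSet are conformant.
        return "error"
    if strength is BindingStrength.EXTENSIBLE:
        # Allowed only when no concept in the ValueSet covers the meaning,
        # which a simple lookup cannot decide, so flag it for a human.
        return "review"
    # preferred / example: any code from any CodeSystem is conformant.
    return "ok"


# Illustrative ValueSet content: one SNOMED concept.
condition_codes = {(SNOMED, "73211009")}
local_code = ("http://example.org/local-codes", "dm-2")  # hypothetical system

result_example = check_binding(local_code, condition_codes, BindingStrength.EXAMPLE)
result_required = check_binding(local_code, condition_codes, BindingStrength.REQUIRED)
print(result_example, result_required)
```

With an example binding the local code passes, which matches the Condition.code case described above; the same code would fail under a required binding.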

So the important distinction here is that it is the CodeSystem where codes are defined – not the ValueSet, and in an instance, the coded element will refer to the CodeSystem (via the ‘system’ element) – not to the ValueSet.

There are 4 specific datatypes that can contain coded data. These are:

  • code. This dataType has only the value of the code (eg ‘active’ or ‘entered-in-error’) in the instance – ie there is no system child element. The ValueSet, CodeSystem and binding are all specified in the spec (or profile) and the binding strength is almost always “required” (ie the code must come from the bound ValueSet). It is unusual to use this dataType outside of the core spec. (Note that the datatype name – code – starts with a lowercase letter. This is because it is a ‘primitive’ datatype – it has a value, but no child elements. Datatypes that can have children start with an uppercase letter).
  • Coding. This datatype has a system element which identifies the CodeSystem from where the code element is defined (by the url). For example,  a url value of would indicate that this is a LOINC code. You can see a list of the common CodeSystem urls in the spec.
  • CodeableConcept. This is by far the most frequently used coded datatype and contains within it any number of Codings (including none) and a text element. The text element is a human readable description of the concept that the CodeableConcept is indicating. It thus allows a concept to be represented in different codesystems (but it must be the same concept) or just as text if there is no coded value available.
  • Quantity datatype. We regard this as a coded dataType as it has a system child element to indicate the unit being used (most commonly from the UCUM system).
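
To make the four datatypes concrete, here is a minimal sketch of how each might appear in the JSON form of an Observation, written as Python dictionaries. The numeric value and the text strings are invented for illustration; the system URIs are the standard ones for LOINC and UCUM.

```python
import json

LOINC = "http://loinc.org"
UCUM = "http://unitsofmeasure.org"

# code: just a string, the system is fixed by the spec
status = "final"

# Coding: a system plus a code defined in that system
glucose_coding = {"system": LOINC, "code": "14749-6"}

# CodeableConcept: any number of Codings plus human readable text
glucose_concept = {"coding": [glucose_coding], "text": "Glucose"}

# A CodeableConcept may also carry text only, when no code is available
text_only_concept = {"text": "Glucose (no code available)"}

# Quantity: the unit is itself coded, most commonly with UCUM
glucose_value = {"value": 6.3, "unit": "mmol/L", "system": UCUM, "code": "mmol/L"}

observation = {
    "resourceType": "Observation",
    "status": status,
    "code": glucose_concept,
    "valueQuantity": glucose_value,
}

print(json.dumps(observation, indent=2))
```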

Coding data gets a LOT more complicated than this – both in trying to define it and trying to use it, but this is enough to get started.

But it does raise the question – why bother? Why not just use a description of the concept? The answer is that by using a coded value, you are unambiguously stating what you mean. Anyone who looks at the code and system in your coded element can go look in the CodeSystem indicated by the system url (not the ValueSet) and find out exactly what you are referring to.

And by ‘anyone’ we also mean a computerized system – not just a human. If it were just text, then a simple typo could have disastrous implications.

We sometimes call this ‘semantic’ interoperability. The ‘meaning’ is explicit to both creator and viewer.
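
A small sketch of the difference: matching on the (system, code) pair finds both of the concepts below, while matching on the text misses the one with a typo. The helper name has_coding and the sample data are assumptions for illustration.

```python
LOINC = "http://loinc.org"


def has_coding(concept, system, code):
    """True if the CodeableConcept contains a Coding with this system and code."""
    return any(
        coding.get("system") == system and coding.get("code") == code
        for coding in concept.get("coding", [])
    )


clean = {"coding": [{"system": LOINC, "code": "14749-6"}], "text": "Glucose"}
typo = {"coding": [{"system": LOINC, "code": "14749-6"}], "text": "Glucsoe"}

# Matching on the code is unaffected by the spelling of the text.
coded_matches = [has_coding(c, LOINC, "14749-6") for c in (clean, typo)]

# Matching on the text silently drops the misspelled entry.
text_matches = [c["text"] == "Glucose" for c in (clean, typo)]

print(coded_matches, text_matches)
```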

Now let’s take a closer look at the Observation resource, as this is where the actual result is recorded (recall from the previous post that the DiagnosticReport is effectively a ‘grouper’ of results).

You’ll note that it has a required Observation.code element, with a definition of “Describes what was observed. Sometimes this is called the observation ‘name’.” So it’s obviously the element that we need to be filtering on.

But what code to use?

Looking back at the spec, we see that Observation.code is bound to a ValueSet of LOINC codes with an example binding. In other words, it’s not specific at all.

This is a common pattern in the FHIR core spec. As a global standard, it can’t be too prescriptive, or it simply won’t be usable in many scenarios and countries. It’s the profiling capability where more specificity can be added – for example, in New Zealand we have the Pathology Observation Code sets (NZPOCS) standard, so we can create a profile on Observation where we bind a ValueSet with those codes in it to the Observation.code element with an extensible binding.

So if all the data in the repository were conformant to that profile, then we’d be good to go, and the code for glucose is 14749-6 from NZPOCS. (As an aside, to do this we’d create a ValueSet resource containing the NZPOCS codes. Because all the NZPOCS codes are LOINC, we don’t need a separate CodeSystem for this. Then, we bind Observation.code in our profile to that ValueSet. The Observation.code values in a conformant resource instance will still contain the LOINC system).
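
The aside above might be sketched as follows: a ValueSet that includes concepts from the LOINC CodeSystem, and the binding on Observation.code in a profile that points at it. The ValueSet url is hypothetical, and only the single glucose code mentioned in the text is listed; a real NZPOCS ValueSet would contain many more.

```python
import json

# Hypothetical canonical url, used only for this illustration.
NZPOCS_VALUESET_URL = "http://example.org/fhir/ValueSet/nzpocs"

nzpocs_valueset = {
    "resourceType": "ValueSet",
    "url": NZPOCS_VALUESET_URL,
    "status": "draft",
    "compose": {
        "include": [
            {
                # The codes are LOINC codes, so no separate CodeSystem is needed.
                "system": "http://loinc.org",
                "concept": [{"code": "14749-6"}],
            }
        ]
    },
}

# The element definition a profile on Observation would carry for code.
observation_code_element = {
    "id": "Observation.code",
    "path": "Observation.code",
    "binding": {"strength": "extensible", "valueSet": NZPOCS_VALUESET_URL},
}

print(json.dumps(observation_code_element, indent=2))
```

Note that the binding refers to the ValueSet, while the system inside the ValueSet (and in any conformant instance) is still LOINC.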

Let’s pretend for the moment that this is the case – all data in the repository is conformant to a profile binding code to NZPOCS. (In the next post we’ll consider what to do if it is not).

And that assumption makes life much easier – we just need to filter on code, for example
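
As a minimal sketch of such a filter, the Python below builds the search URL using the token form system|code for the code parameter. The base URL, the patient identifier system and the identifier value are placeholders invented for illustration; the LOINC system URI and the glucose code come from the discussion above.

```python
from urllib.parse import urlencode


def build_search_url(base, resource_type, params):
    """Build a FHIR search URL from a list of (name, value) pairs.

    urlencode percent-encodes characters such as '|' and ':' so the URL
    is safe to send as-is.
    """
    return f"{base.rstrip('/')}/{resource_type}?{urlencode(params)}"


url = build_search_url(
    "http://example.org/fhir",  # hypothetical server base
    "Observation",
    [
        # hypothetical identifier system and value
        ("patient.identifier", "http://example.org/nhi|ABC1234"),
        # token search: system|code
        ("code", "http://loinc.org|14749-6"),
    ],
)
print(url)
```

Further search parameters can be appended to the list in the same way.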


We can add other parameters as needed (and as supported by the repository API), for example if we wanted to return the ServiceRequest as well we could use _include as follows:

[host]/Observation?patient.identifier={}&code=|14749-6&_include=Observation:based-on

To get the most recent glucose, we can use a combination of 2 search result parameters – _sort and _count. We use _sort to sort the Observations in date descending order, and _count to return just the first one (ie the most recent):


(This is an example where the search parameter name – date – doesn’t match the element name and, indeed, can refer to multiple elements.)
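
On the server side, the spec form of this is _sort=-date&_count=1 added to the Observation query. Where a server does not honour _sort, a client-side fallback is possible. The sketch below picks the most recent Observation from a list by looking at the elements the date parameter covers (effectiveDateTime, effectiveInstant, or the start of effectivePeriod). It assumes full dates or date-times (not partial values such as a year alone) and treats values without a timezone as UTC; both are simplifications.

```python
from datetime import datetime, timezone


def effective_time(observation):
    """Return the observation's effective time as an aware datetime, or None."""
    raw = (
        observation.get("effectiveDateTime")
        or observation.get("effectiveInstant")
        or observation.get("effectivePeriod", {}).get("start")
    )
    if raw is None:
        return None
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: values without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def most_recent(observations):
    """Return the observation with the latest effective time, or None."""
    dated = [(effective_time(o), o) for o in observations]
    dated = [(t, o) for t, o in dated if t is not None]
    if not dated:
        return None
    return max(dated, key=lambda pair: pair[0])[1]


observations = [
    {"id": "a", "effectiveDateTime": "2021-03-01"},
    {"id": "b", "effectiveDateTime": "2021-05-20T08:30:00+12:00"},
    {"id": "c", "effectivePeriod": {"start": "2021-04-11T09:00:00Z"}},
    {"id": "d"},  # no effective time, ignored
]
latest = most_recent(observations)
print(latest["id"])
```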

We can also return the DiagnosticReport that refers to the Observation by using the _revinclude parameter, which includes resources that have a reference to the one under consideration, as follows:


Which states: “Include all DiagnosticReport resources where the result element in that resource has a reference to the Observation”. Note that not all Observations will necessarily have an associated DiagnosticReport.
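
In search terms this is _revinclude=DiagnosticReport:result added to the Observation query. The response is a Bundle in which each entry is marked with a search mode, so the matched Observations can be separated from the DiagnosticReports that were pulled in. A minimal sketch, with invented sample data:

```python
def split_bundle(bundle):
    """Split a searchset Bundle into (matches, included) resource lists."""
    matches, included = [], []
    for entry in bundle.get("entry", []):
        mode = entry.get("search", {}).get("mode")
        if mode == "match":
            matches.append(entry["resource"])
        elif mode == "include":
            included.append(entry["resource"])
        # other modes (eg 'outcome') are ignored in this sketch
    return matches, included


# Invented example of what a server might return.
bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {
            "resource": {"resourceType": "Observation", "id": "obs1"},
            "search": {"mode": "match"},
        },
        {
            "resource": {
                "resourceType": "DiagnosticReport",
                "id": "dr1",
                "result": [{"reference": "Observation/obs1"}],
            },
            "search": {"mode": "include"},
        },
    ],
}

observations, reports = split_bundle(bundle)
print(len(observations), len(reports))
```

An empty list of reports is a valid outcome, since an Observation need not belong to any DiagnosticReport.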

And all the considerations above apply equally to the DiagnosticReport.code. For example, the LOINC code 58410-2 refers to the Complete Blood Count. So we can return all the CBC reports for a patient – including the individual observations – as follows:
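
A query of that shape would filter DiagnosticReport on code (using the same system|code token form as before) and add _include=DiagnosticReport:result to bring back the Observations. One way to pair each report with its observations on the client is sketched below. It assumes the references in the Bundle are relative (of the form Observation/id); servers may also return absolute references, which this sketch does not handle. The sample data is invented.

```python
def observations_by_report(bundle):
    """Map each DiagnosticReport id to the Observations its result refers to."""
    resources = [entry["resource"] for entry in bundle.get("entry", [])]
    # Index every resource by its relative reference, eg 'Observation/obs1'.
    by_ref = {f'{r["resourceType"]}/{r["id"]}': r for r in resources}

    grouped = {}
    for resource in resources:
        if resource["resourceType"] != "DiagnosticReport":
            continue
        refs = [item.get("reference", "") for item in resource.get("result", [])]
        # Keep only references that resolve inside this Bundle.
        grouped[resource["id"]] = [by_ref[ref] for ref in refs if ref in by_ref]
    return grouped


bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {
            "resource": {
                "resourceType": "DiagnosticReport",
                "id": "cbc1",
                "result": [
                    {"reference": "Observation/hb1"},
                    {"reference": "Observation/wbc1"},
                    {"reference": "Observation/missing"},
                ],
            }
        },
        {"resource": {"resourceType": "Observation", "id": "hb1"}},
        {"resource": {"resourceType": "Observation", "id": "wbc1"}},
    ],
}

grouped = observations_by_report(bundle)
print({report_id: [o["id"] for o in obs] for report_id, obs in grouped.items()})
```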


In the next post we’ll dive a bit further into what we can do if we can’t assume that all data in the repository is coded to our expectations.

If you want to experiment further with these queries, then the approach I took may be of interest. I used the clinFHIR GraphBuilder to create some sample data, which I then saved to a HAPI server (I have several that I maintain – but the public HAPI server is just fine). After that I was able to use Postman to check that the queries were correct. (Actually, I couldn’t get _sort to work – but it’s correct in the spec).

You can create sample data in whatever way works best for you of course – my approach isn’t that great for large amounts of data – but there are plenty of other options, such as Synthea for more realistic data sets. If you do do it yourself, then I highly recommend using FSH to do it – perhaps using GraphBuilder to create a base set, then copying as required.

About David Hay
I'm an independent contractor working with a number of Organizations in the health IT space. I'm an HL7 Fellow, Chair Emeritus of HL7 New Zealand and a co-chair of the FHIR Management Group. I have a keen interest in health IT, especially health interoperability with HL7 and the FHIR standard. I'm the author of a FHIR training and design tool - clinFHIR - which is sponsored by InterSystems Ltd.
