Definition based resource extraction
February 26, 2026
In this post we’re going to take a closer look at SDC definition based resource extraction from a completed Questionnaire.
SDC (Structured Data Capture) is an Implementation Guide that describes and expands the use of the FHIR Questionnaire resource in defining forms to display and capture structured information. As well as visual and behavioural enhancements, it defines mechanisms for pre-populating a form with data and for extracting the captured data to FHIR resources, which is what we’ll discuss here. (We’ll take a look at pre-population in another post.)
In terms of resource extraction, SDC actually defines four different extraction mechanisms.
- Observation based is the simplest way to extract an Observation resource, and it defines how specific parts of the Observation such as the patient, author, code and value are populated from the form data (specifically the QuestionnaireResponse resource). It is only used for Observation resources.
- Definition based is more comprehensive – it can extract to any resource, but does require that all mappings from the QuestionnaireResponse to an extracted resource are explicit in the Questionnaire.
- Template based extraction involves creating skeleton resources to extract to with placeholders for specific elements to populate.
- StructureMap based extraction is the most comprehensive – and the most complex. It uses the FHIR Mapping language and so requires familiarity with that part of the specification.
We’re going to use definition based extraction in this post as it meets all our requirements – it’s flexible enough for what we need to do, and we’ve also developed tooling as part of the CanShare project that can generate the Q from a model so it’s a logical fit for us.
Speaking of requirements, what we’re going to do here is to generate a request bundle for the histological examination of a specimen – such as a skin lesion. It’s going to be a simple request – we’ll only include space for a clinical note – but the pattern could easily be extended to include more detailed information if needed. There will be something similar for the report – but we won’t discuss that here.
We’re going to need a number of resources:
- A Patient
- A Practitioner
- An Observation for the clinical note
- A Specimen
- and a ServiceRequest that represents the actual ‘request’ for examination
What is a little tricky is that all these resources need to have references between them – as shown in the graph below:

Let’s see how we can do that. BTW there are other references we may want to create – such as from ServiceRequest to Specimen – but the principles are the same.
There are a number of key extensions and concepts to understand when doing this kind of design.
We use the definitionExtract extension to indicate the type of resource that we wish to extract to. Actually it’s a canonical url which means that we can specify a specific profile as well as a base FHIR type. Once defined, we can then use that url in any child item using the item.definition element in the Questionnaire to specify a mapping from that item to an element in the resource. There can be multiple profiles available at any point – this is important as we will see. This is also called the extraction context.
When the extraction engine is performing the extraction, it will create an id for that resource (strictly speaking it’s a bundle entry.fullUrl element, but we’ll think of it as an id). However, that isn’t always going to work for us: when we create a reference from one resource to another we need to specify the id of the target resource, so we need to know that id in advance. The way we achieve this is to use the extractAllocateId extension to create a FHIRPath variable that the extraction engine will populate with a UUID, and then use that variable both to set the resource id (in a FHIRPath expression on the definitionExtract extension) and to set the reference. We’ll see examples of this shortly.
And finally, there will be times when we want to set an element value directly without user involvement. For example, the Observation status needs to be set, but we don’t want the user to have to select something. We’ll use the definitionExtractValue extension for this. We will also use this extension for the references.
Just for the record, there are other ways we could fix a ‘hidden’ value, for example:
- We could set the initial value element on the item and hide it
- We could use the initial expression SDC extension on the item and hide it
but we’ll stick with definitionExtractValue.
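To make that concrete, here’s a minimal sketch in Python of what a fixed value for the Observation status might look like as a definitionExtractValue extension. The extension url and the ‘definition’ and ‘fixed-value’ sub-extension names are written from my reading of the SDC guide, and the ‘final’ status is just an assumption for illustration, so check both against the published IG and your own Questionnaire.

```python
# Assumed base url for SDC extensions; verify against the published IG.
SDC = "http://hl7.org/fhir/uv/sdc/StructureDefinition/"


def fixed_value(definition, value_key, value):
    """Build a definitionExtractValue extension that carries a fixed value.

    definition: '<canonical>#<element path>' of the element to populate
    value_key:  the FHIR value[x] key, e.g. 'valueCode'
    """
    return {
        "url": SDC + "sdc-questionnaire-definitionExtractValue",
        "extension": [
            {"url": "definition", "valueUri": definition},
            {"url": "fixed-value", value_key: value},
        ],
    }


# 'final' is an illustrative choice, not necessarily what the example form uses.
status_ext = fixed_value(
    "http://hl7.org/fhir/StructureDefinition/Observation#Observation.status",
    "valueCode",
    "final",
)
```

The resulting dictionary would sit in the extension array of the item that establishes (or inherits) the Observation context.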
So let’s start by thinking about how we extract from a questionnaire to a single resource. We’ll use the Patient resource as an example. If you want to follow along, you can load the clinFHIR Questionnaire viewer, and select the ‘skin request’ from the Library (It’s the one with a description). I can’t guarantee that it will always be there, but it is at the time of writing.
After selecting the Questionnaire, the first tab to be displayed is the Outline view, which shows the Questionnaire as a tree. Here’s a screenshot with the patient selected.

If you click on the LinkId tab to the lower right, you will get a modal dialog showing the Patient item, along with the ancestor items up to the Q root. Here’s what it looks like:

We’re going to use this dialog quite a bit, so a brief description is in order.
- At the top left is the hierarchy of Questionnaire items that is ‘above’ the selected item. You can select any of them to view that item
- Bottom left are a number of useful informational links.
- To the right is a tab set with the following tabs
- The raw json of the item with all child items removed (they clutter the display)
- An SDC view that displays key SDC extensions
- A list of any child nodes for this item
- A hierarchical display of all the key SDC extensions for this node and its ancestors. This is particularly useful when understanding the context for the item – and where in the ancestry it is defined as context ‘flows down’ the branch.
If you look at the json you can see that we’ve used the definitionExtract extension to set the context for extracting to a core Patient resource (using the ‘definition’ sub-extension). But take a look at the ‘fullUrl’ sub-extension. It refers to an SDC variable ‘patientID’ (the ‘%’ means ‘insert the value of the patientID variable here’) – where is that variable defined?
Well, it must be either on the same item (which it isn’t) or an ancestor. Click on the Extensions ancestry tab.

As mentioned above, this tab shows all the SDC extensions that have been defined ‘above’ the patient. Right at the top under ‘No text’ (it’s actually the Questionnaire root) is an extractAllocateId extension that defines the patientID variable. By defining it here, any expression ‘below’ the root (which is all of them) can access that variable. This is going to be important later on when we start creating references, but for now let’s continue with Patient resource elements.
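Here’s a rough sketch in Python of that arrangement: an extractAllocateId on the Questionnaire root, and a definitionExtract on the Patient item whose ‘fullUrl’ uses the variable. The linkIds are made up, the extension urls are from my reading of the SDC guide, and the helper function is purely illustrative (it’s not part of any library). It just walks down the tree collecting the variables that are visible to a given item, which is roughly what the ancestry tab is showing.

```python
# Assumed SDC extension urls; verify against the published IG.
SDC = "http://hl7.org/fhir/uv/sdc/StructureDefinition/"
ALLOCATE_ID = SDC + "sdc-questionnaire-extractAllocateId"
DEFINITION_EXTRACT = SDC + "sdc-questionnaire-definitionExtract"

questionnaire = {
    "resourceType": "Questionnaire",
    # Variable allocated on the root, so every item can see it.
    "extension": [{"url": ALLOCATE_ID, "valueString": "patientID"}],
    "item": [
        {
            "linkId": "patient",  # hypothetical linkId
            "type": "group",
            "extension": [
                {
                    "url": DEFINITION_EXTRACT,
                    "extension": [
                        {
                            "url": "definition",
                            "valueCanonical": "http://hl7.org/fhir/StructureDefinition/Patient",
                        },
                        # FHIRPath expression: use the allocated id as the entry fullUrl.
                        {"url": "fullUrl", "valueString": "%patientID"},
                    ],
                }
            ],
            "item": [{"linkId": "dob", "type": "date"}],  # hypothetical linkId
        }
    ],
}


def allocated_ids_in_scope(node, link_id, inherited=()):
    """Return variable names allocated on the item or any ancestor.

    Returns None if no item with that linkId exists under node.
    """
    here = list(inherited) + [
        ext["valueString"]
        for ext in node.get("extension", [])
        if ext.get("url") == ALLOCATE_ID
    ]
    if node.get("linkId") == link_id:
        return here
    for child in node.get("item", []):
        found = allocated_ids_in_scope(child, link_id, here)
        if found is not None:
            return found
    return None
```

Calling allocated_ids_in_scope(questionnaire, "dob") lists patientID, because the variable was allocated on the root and so flows down to every item.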
As an aside, we didn’t really need to define a separate variable for a patient id – we could have used the patient context that can be passed into the Questionnaire and use that to pre-populate the form. However, this does assume that the patient context is being passed in, which we can’t rely on, so for now we’ll do it the manual way. We’ll consider pre-population in another post.
Close the dialog and go back to the Outline, select the Date of Birth element in the tree and then click its LinkId to display the item details. Here’s what it looks like:

Note that the definition element has the value:
http://hl7.org/fhir/StructureDefinition/Patient#Patient.birthDate
There are 2 parts to this with a # separator between them
- The first part is the canonical url of the resource/profile that is to be populated. We’ve just seen how that is defined – and if you click the ancestry tab you can see it there (as Patient is an ancestor to Date of Birth).
- The second part is the path within the resource – Patient.birthDate.
In other words, if there is a value for date of birth in the form (and hence the QuestionnaireResponse resource which has the form data), it will be copied into the Patient.birthDate element in the extracted patient.
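If it helps to see the two parts pulled apart, here’s a tiny Python sketch. The function name is my own invention; it simply splits the definition value on the # separator.

```python
def split_definition(definition):
    """Split an item.definition value into (canonical url, element path)."""
    profile, sep, path = definition.partition("#")
    if not sep or not profile or not path:
        raise ValueError(f"expected '<canonical>#<path>', got {definition!r}")
    return profile, path


profile, path = split_definition(
    "http://hl7.org/fhir/StructureDefinition/Patient#Patient.birthDate"
)
```

Here profile is the canonical url that must match an extraction context set by definitionExtract on the item or an ancestor, and path is the element that receives the answer.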
Incidentally, if there hasn’t been a type/profile defined then this will silently fail. This is why the ancestry tab is so useful when debugging. By the way the initialExpression extension is used for pre-population – we’ll look at that in another post.
So that’s a simple example – what about a more complicated one like the first – or ‘given’ – name? A patient’s name is a dataType of HumanName so the first name is ‘inside’ the dataType. To see how that works, show the item details for given name (select in the outline and click the linkId).

You can see that the definition has the mapping to the given name element – just as the date of birth did. Note that the path includes the name element – ie Patient.name.given. And here’s a trap for the unwary – you also need to set the path for Patient.name on the parent item as well. Click on the ‘Name’ in the left pane and you can see it there. If you don’t do that the extraction will silently fail.
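Because this one fails silently, a small check can be handy. The Python sketch below is a hypothetical lint of my own (not something the viewer or the SDC spec provides), and it is deliberately simplistic: for any item mapped to a nested path such as Patient.name.given, it looks for an ancestor item mapped to the parent path (Patient.name) and reports the linkId if there isn’t one. The linkIds are made up.

```python
PATIENT = "http://hl7.org/fhir/StructureDefinition/Patient"

given_item = {
    "linkId": "given",  # hypothetical linkId
    "type": "string",
    "definition": PATIENT + "#Patient.name.given",
}

# Correct: the group item maps Patient.name, the child maps Patient.name.given.
name_group = {
    "linkId": "name",
    "type": "group",
    "definition": PATIENT + "#Patient.name",
    "item": [given_item],
}

# Broken: the group item has no definition, so Patient.name is never mapped.
broken_group = {"linkId": "name", "type": "group", "item": [given_item]}


def missing_parent_mappings(item, ancestor_paths=frozenset()):
    """Return linkIds of items whose parent element path no ancestor maps."""
    problems = []
    path = item.get("definition", "").partition("#")[2]
    if path:
        parent = path.rpartition(".")[0]
        # A parent with no dot (e.g. 'Patient') is the resource itself.
        if "." in parent and parent not in ancestor_paths:
            problems.append(item["linkId"])
        ancestor_paths = ancestor_paths | {path}
    for child in item.get("item", []):
        problems.extend(missing_parent_mappings(child, ancestor_paths))
    return problems
```

Running it over name_group reports nothing, while broken_group reports the ‘given’ item.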

By the way another gotcha (at least it was for me) is extracting to a CodeableConcept. Assuming the item type is ‘choice’ or ‘open-choice’ then you’re actually extracting to the ‘coding’ child of the CodeableConcept. Here’s what the definition for the Observation category looks like:
http://hl7.org/fhir/StructureDefinition/Observation#Observation.category.coding
What about extensions? Take a look at the ethnicity item.

The value for definition is
http://hl7.org/fhir/StructureDefinition/Patient#Patient.extension.value
which will set the value, and there’s a definitionExtractValue to set the extension url using a fixed-value.
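As a sketch only, the item might look something like this when written as a Python dictionary. The extension url below is a placeholder (use the real url of your ethnicity extension), and the item type, the linkId and the use of valueUri for the fixed value are my assumptions rather than something taken from the example form.

```python
# Assumed SDC extension base url; verify against the published IG.
SDC = "http://hl7.org/fhir/uv/sdc/StructureDefinition/"
PATIENT = "http://hl7.org/fhir/StructureDefinition/Patient"

# Placeholder url, purely for illustration.
ETHNICITY_EXTENSION_URL = "http://example.org/fhir/StructureDefinition/ethnicity"

ethnicity_item = {
    "linkId": "ethnicity",  # hypothetical linkId
    "text": "Ethnicity",
    "type": "choice",  # assumption: the answer is coded
    # The user's answer goes into the extension value...
    "definition": PATIENT + "#Patient.extension.value",
    "extension": [
        {
            # ...and a fixed value supplies the extension url.
            "url": SDC + "sdc-questionnaire-definitionExtractValue",
            "extension": [
                {"url": "definition", "valueUri": PATIENT + "#Patient.extension.url"},
                {"url": "fixed-value", "valueUri": ETHNICITY_EXTENSION_URL},
            ],
        }
    ],
}
```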
By the way, although not shown in the screenshot, we’re using a preferredTerminologyServer extension on the Patient to indicate that the ethnicity ValueSet is on the New Zealand terminology server. This is a very useful extension for coded data and just needs to be on an ancestor item to where the item is being displayed. If there are multiple defined on different items, the one closest to the item will be used.
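The ‘closest one wins’ rule can be sketched in a few lines of Python. This is just an illustration of the lookup as described above, not how any particular renderer implements it; the extension url is from my reading of the SDC guide, and the server urls and linkIds are placeholders.

```python
# Assumed SDC extension url; verify against the published IG.
TERMINOLOGY_SERVER = (
    "http://hl7.org/fhir/uv/sdc/StructureDefinition/"
    "sdc-questionnaire-preferredTerminologyServer"
)

form = {
    "resourceType": "Questionnaire",
    "extension": [{"url": TERMINOLOGY_SERVER, "valueUrl": "https://tx.example.org/default"}],
    "item": [
        {
            "linkId": "patient",
            "type": "group",
            "extension": [{"url": TERMINOLOGY_SERVER, "valueUrl": "https://tx.example.org/nz"}],
            "item": [{"linkId": "ethnicity", "type": "choice"}],
        },
        {"linkId": "specimen", "type": "group"},
    ],
}


def nearest_terminology_server(node, link_id, inherited=None):
    """Return the server url set on the item or its closest ancestor, else None."""
    own = next(
        (
            ext.get("valueUrl")
            for ext in node.get("extension", [])
            if ext.get("url") == TERMINOLOGY_SERVER
        ),
        None,
    )
    current = own or inherited  # a closer definition overrides an inherited one
    if node.get("linkId") == link_id:
        return current
    for child in node.get("item", []):
        found = nearest_terminology_server(child, link_id, current)
        if found is not None:
            return found
    return None
```

In this made-up form the ethnicity item picks up the server defined on its Patient parent, while the specimen item falls back to the one on the root.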
Now let’s think about those references.
Recall that we created a patientID variable on the Questionnaire root, and then used that value as the fullUrl (representing the id) when creating the patient resource. So that means that we can use the definitionExtractValue extension to create the reference.
Take a look at the Specimen item.

This item is doing a number of things.
- It’s setting the reference from the Specimen.subject.reference element to the Patient using definitionExtractValue. We specify the .reference element as it is the actual reference (as opposed to, say, the display element)
- It’s also establishing an extraction context (using definitionExtract) for Specimen for any children (which will be in addition to any other contexts that may have been established). We haven’t set a variable as we aren’t creating any references from this item.
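Written out as a Python dictionary, the two extensions on the Specimen item might look roughly like this. It’s a sketch under assumptions: the extension urls and sub-extension names come from my reading of the SDC guide, the linkId is made up, and the small lookup helper is only there to make the structure easier to inspect.

```python
# Assumed SDC extension base url; verify against the published IG.
SDC = "http://hl7.org/fhir/uv/sdc/StructureDefinition/"
SPECIMEN = "http://hl7.org/fhir/StructureDefinition/Specimen"

specimen_item = {
    "linkId": "specimen",  # hypothetical linkId
    "text": "Specimen",
    "type": "group",
    "extension": [
        {
            # Establishes the Specimen extraction context for this item and its children.
            "url": SDC + "sdc-questionnaire-definitionExtract",
            "extension": [{"url": "definition", "valueCanonical": SPECIMEN}],
        },
        {
            # Sets Specimen.subject.reference from the variable allocated on the root.
            "url": SDC + "sdc-questionnaire-definitionExtractValue",
            "extension": [
                {"url": "definition", "valueUri": SPECIMEN + "#Specimen.subject.reference"},
                {
                    "url": "expression",
                    "valueExpression": {
                        "language": "text/fhirpath",
                        "expression": "%patientID",
                    },
                },
            ],
        },
    ],
}


def sub_extension(extension, name):
    """Return the first sub-extension with the given url, or None."""
    return next((s for s in extension.get("extension", []) if s["url"] == name), None)


reference_ext = specimen_item["extension"][1]
reference_expression = sub_extension(reference_ext, "expression")["valueExpression"]["expression"]
```

Because the Patient’s fullUrl is set from the same %patientID variable, the reference and the Patient entry end up with the same value in the extracted Bundle.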
Setting the patient reference is straightforward as we defined the id on the Q root so every item can see it. But we also want to be able to establish a reference from the ServiceRequest to any Observations we create (the clinical note in this case).
There may well be multiple Observations, so we can’t just create a variable for Observation id on the root and use that, as the id would be the same for all observations. What we need to do is to create a new observation id for each one and use that as the reference (in a similar way to the Patient references). But there’s a twist. In order to be able to set the reference on ServiceRequest (remember the direction is ServiceRequest -> Observation), the ServiceRequest context must be established (using definitionExtract) by an ancestor of the observation. In other words, the Observation item needs to be a child of the ServiceRequest item.
If you take a look at the Clinical details item you can see this.

- We create a variable (using extractAllocateId) for the observation – obsId (it’s at the bottom of the list of extensions)
- Then we use that in definitionExtract to set the id for the Observation
- Finally we can use the same id to set the reference from the ServiceRequest to the Observation
Note that the extensions don’t need to be in any specific order on the item.
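Here’s how that nesting might look as a Python sketch. Several things are assumptions on my part: the linkIds, the extension urls (from my reading of the SDC guide), and in particular the use of ServiceRequest.supportingInfo as the element that holds the reference, since the example form may well use a different element. Child items that carry the note text are left out. The helper function is illustrative only; it lists the extraction contexts visible to an item.

```python
# Assumed SDC extension urls; verify against the published IG.
SDC = "http://hl7.org/fhir/uv/sdc/StructureDefinition/"
DEFINITION_EXTRACT = SDC + "sdc-questionnaire-definitionExtract"
FHIR = "http://hl7.org/fhir/StructureDefinition/"

clinical_details = {
    "linkId": "clinical-details",  # hypothetical linkId
    "type": "group",
    "extension": [
        # 1. Allocate a fresh id for this Observation.
        {"url": SDC + "sdc-questionnaire-extractAllocateId", "valueString": "obsId"},
        # 2. Use it as the fullUrl of the extracted Observation.
        {
            "url": DEFINITION_EXTRACT,
            "extension": [
                {"url": "definition", "valueCanonical": FHIR + "Observation"},
                {"url": "fullUrl", "valueString": "%obsId"},
            ],
        },
        # 3. Use it again for the reference on the enclosing ServiceRequest.
        #    supportingInfo is an assumed target element.
        {
            "url": SDC + "sdc-questionnaire-definitionExtractValue",
            "extension": [
                {
                    "url": "definition",
                    "valueUri": FHIR + "ServiceRequest#ServiceRequest.supportingInfo.reference",
                },
                {
                    "url": "expression",
                    "valueExpression": {"language": "text/fhirpath", "expression": "%obsId"},
                },
            ],
        },
    ],
}

service_request = {
    "linkId": "service-request",  # hypothetical linkId
    "type": "group",
    "extension": [
        {
            "url": DEFINITION_EXTRACT,
            "extension": [{"url": "definition", "valueCanonical": FHIR + "ServiceRequest"}],
        }
    ],
    "item": [clinical_details],  # the Observation item is a child of the ServiceRequest item
}


def contexts_in_scope(node, link_id, inherited=()):
    """Return the extraction contexts set on the item or its ancestors, outermost first."""
    own = [
        sub["valueCanonical"]
        for ext in node.get("extension", [])
        if ext.get("url") == DEFINITION_EXTRACT
        for sub in ext.get("extension", [])
        if sub.get("url") == "definition"
    ]
    here = list(inherited) + own
    if node.get("linkId") == link_id:
        return here
    for child in node.get("item", []):
        found = contexts_in_scope(child, link_id, here)
        if found is not None:
            return found
    return None
```

Both the ServiceRequest and Observation contexts are in scope for the clinical details item, which is what allows one item to set an element on each resource.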
So that’s how you can set up a Questionnaire to extract to a graph of resources – which will be delivered in a Bundle. You can see this in action in the clinFHIR Questionnaire Viewer as follows.
Select the Rendered Form tab.

This uses the Fhirpath lab to render the form using the CSIRO form renderer. You may need to click the ‘Refresh’ link above the display to render the form – there’s a timing issue I need to fix up.
Next click the ‘Set pre-populate’ link. This invokes the SDC pre-population functionality to set the patient name from defaults that the viewer provides. (This works as this Questionnaire has pre-population enabled. We won’t talk more about this here – a topic for another post.)

Note that the viewer will display the value for all items in the right hand pane. What’s happening behind the scenes is that after invoking the ‘pre-pop’ functionality, the viewer asks the lab to generate the QuestionnaireResponse resource with the form data and then displays that in the tab.
Enter data into the form. Note that the tab to the right displays the data as it is entered.
Now click the ‘Get extract bundle’ link. A couple of new tabs will appear to the right.
The “Extract resource graph” presents a graphical view of the extracted resources, clearly showing the resources and the references between them.
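If you want to check the references in an extracted Bundle yourself, something like the following Python sketch would do it. It isn’t how the viewer builds its graph (I’m not describing that code here), and the sample Bundle and its UUIDs are made up; it simply walks each entry’s resource looking for ‘reference’ elements.

```python
def reference_edges(bundle):
    """Return (source fullUrl, element path, target reference) for every reference found."""
    edges = []

    def walk(node, path, source):
        if isinstance(node, dict):
            ref = node.get("reference")
            if isinstance(ref, str):
                edges.append((source, path, ref))
            for key, value in node.items():
                walk(value, f"{path}.{key}", source)
        elif isinstance(node, list):
            for value in node:
                walk(value, path, source)

    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        walk(resource, resource.get("resourceType", "?"), entry.get("fullUrl"))
    return edges


# Made-up Bundle for illustration; real fullUrls are allocated by the extraction engine.
PATIENT_ID = "urn:uuid:11111111-1111-1111-1111-111111111111"
SPECIMEN_ID = "urn:uuid:22222222-2222-2222-2222-222222222222"

sample_bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {"fullUrl": PATIENT_ID, "resource": {"resourceType": "Patient"}},
        {
            "fullUrl": SPECIMEN_ID,
            "resource": {"resourceType": "Specimen", "subject": {"reference": PATIENT_ID}},
        },
    ],
}

edges = reference_edges(sample_bundle)
```

Each edge pairs the fullUrl of the referencing entry with the fullUrl it points to, so a reference that doesn’t match any entry in the Bundle is easy to spot.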

Note that the extraction engine will create a resource if there are any valid elements in it – whether they were entered by a user or from pre-population activities. For example, if you load the example form, click the pre-populate link and then the get extract bundle link you’ll get a Patient, ServiceRequest and Observation resource because items from the form that mapped to elements in the resource were pre-populated.
The extract resources list shows the resources in a list view, allowing you to view the details of the individual resources (actually, it’s the Bundle.entry element that contains the resource). In the screen below you can see the ServiceRequest with data and references.

You can also load the graph into the clinFHIR Bundle Viewer for a more detailed view.
So that’s a run through of a real example of using SDC to generate FHIR resources – it’s a powerful spec!
Incidentally the interaction between lab and viewer is done using web messaging – we’ll take a look at this in another post.