The FHIR questionnaire: part 2

Note that since this post was written, the Questionnaire resource got split in 2 with the Questionnaire being just a template, and the QuestionnaireResponse holding the data. More on that here.

In this post we’re going to take a closer look at the internals of the Questionnaire.

The Questionnaire resource has 2 main parts:

  • The ‘header’ part that describes the Questionnaire
  • The contents – questions and answers

The Questionnaire has a dual personality. If it only has questions then it is a template (and the status has options of draft, published and retired). If it has answers, then it is a form instance (and the status has options of in progress, completed and amended). It could be argued that this difference should be made explicit with a specific property – for the mean time we’ll refer to these personalities as a template or an instance/form.

In the header is:

  • Status was discussed in the previous post. It actually serves a couple of purposes:
    • It identifies whether the Questionnaire is a template or an instance of a form
    • It indicates the workflow state
  • Authored. This is the date that the Questionnaire was authored – specifically the actual version of this resource – ie the date is updated whenever a new version is created.
  • Subject. Who the instance refers to.
  • Author. The person who recorded the information in the instance. When the instance has been completed by different people (eg the nurse fills in the start of an assessment form and the doctor completes the rest), then different versions of the instance will have different authors, and the version history will tell who entered what into the instance.
  • Source. The person who actually supplied the information (who may not be the same as the subject). For example, a mother may be supplying information about her child. If the subject is supplying the information then this property is not needed. Similarly to the author, this could conceivably be different across versions.
  • Name. This is used particularly for templates, and is the equivalent of the name of a form in the paper world – e.g. ‘Pre-operative assessment’. (The spec talks about a pre-defined list of questions – but it means the same). Interestingly, there is no concept of a ‘version’ of the form (as opposed to the usual FHIR resource versioning). For example, suppose the ‘Pre operative assessment template’ is modified. You have a couple of options:
    • Just update the template (new FHIR version). You do need to be careful about backwards compatibility  if form instances refer to external entities for rendering.
    • Change the status of the original to ‘retired’, and create a new one with the same name (and identifier). It may make sense to have an extension to explicitly state the template version.
  • Identifier.  Also used for templates, this allows a template to be part of a business workflow – e.g. the template has the identifier of “POA-2013/15 revA” – or some such. Potentially you could the identifier.period to specify the date range over which the template was valid – so when you create  or retire the template, you set the identifier.period.start and identifier.period.end properties appropriately.
  • Encounter.  This is an optional reference to the encounter where the form was completed.

The actual content of the Questionnaire is contained in an element named ‘group’. There can only be one top level group in a Questionnaire and it is optional (A Questionnaire with no questions would seem to be a particularly useless beast, but you might want to create the template, and then later add the questions I guess). As an aside – I suspect that a better name for ‘group’ here might have been ‘content’ – and for clarity that is how we’re going to refer to it here.

The content can be quite complex, being a hierarchy of sections (which are represented by groups), questions within the groups and groups within the questions…

Lets start from the bottom by thinking about the Questions.

Each Question has the following properties.

  • Name. A CodeableConcept that contains the code for each question. For example, suppose the question is ‘How many babies have you had (Parity)’, then the code could be the snomed code 118212000
  • Text. The text of the question as shown to the user in the form
  • Answer[x]. The answer to the question – if there is a single answer that is not drawn from a value set (we’ll discuss this in a moment). There is a fixed set of datatypes you can use here, and it isn’t possible in the Questionnaire to dictate what datatype to use – however you could create a profile for this purpose if you needed to. (A possible enhancement to Questionnaire would be to allow this in the Questionnaire as well). The permissible datatypes are: String, Decimal, Integer, Date, dateTime and instant. Obviously, the answer element will only be present in a completed form, not in a template.
  • Options. This allows a question to specify a set of possible answers – as you would use in a ‘picklist’ or ‘autocomplete’ control on a form. The options element refers to a ValueSet resource – which is a very sophisticated resource with many different ways to define the set of answers (I should do a post on this resource some time).
  • Choice. This is the selection that the user made from the defined options. It’s possible to make more than one choice from the list of options.
  • Data[x]. This allows you to refer to a specific resource, or include an answer that is of a datatype not included in the the Answer[x] element.
    • If you are including an answer of a specific datatype – maybe a CodeableConcept –  then the element name will include the datatype – in this case dataCodeableConcept.
    • If you are referring to a resource, then the element name is dataResource, and the contents are the usual resource reference (reference and display)
  • Remarks. Comments that the author of the form wishes to include about the answer – perhaps they feel the answer is incomplete or inaccurate. It should not be used for the answer itself.
  • Group. Allow you to have ‘questions within questions’. For example, the main question could be ‘Are you a smoker’, and the ‘sub-questions’ are only relevant (and possibly only shown) when the answer is ‘yes’.

If you haven’t seen the format ‘ Answer[x]’  or ‘Data[x]’ before (i.e. the [x] bit), then what that means is that the element name includes the datatype. For example, if you used a string as the answer, then the element name is answerString; an instant would be answerInstant, and so on.

As you can see, there are a number of different ways of specifying the possible answers to a question. In fact there are 3, and they are mutually exclusive (you can only have 1 of them):

  • Answer[x] – when there is a single, simple answer
  • Choice – when you select one or more options from a set of possible answers (including a full terminology – or subset of a terminology)
  • Data[x] – when there is a different datatype or a FHIR resource that is the answer. (And this could be constructed from other data entered in the form, or a selected from an existing resource).

A question arises: If you construct a resource based on data entered through a Questionnaire, how can you record both the original data and the resource in the Questionnaire? As of now, I’m not sure what the answer is – presumably you could create a Provenance resource that links the two, though that seems overly complex. Another possibility would be to use the remarks property. Or perhaps an extension on the answer that references the resource is the way to go. I need to think a bit more about the use cases where this would apply.

Here’s an example of a simple form. It’s a pregnancy assessment with 2 questions; one will be a simple text answer and the other is a selection from a pre-defined list. To specify the list, we’ve included a ValueSet as a contained resource and expanded the options to include the name of each option so that they can easily be displayed to the user. (go check out the ValueSet in the spec if these terms are unfamiliar). If this particular set of options was used by multiple Questionnaires then it might make sense to save it as a separate ValueSet resource and reference it from the question rather than containing it.


<Questionnaire xmlns="http://hl7.org/fhir">
     <text>
         <status value="additional"/>
         <div xmlns="http://www.w3.org/1999/xhtml">
             <p>An example of a Questionnaire being used as a template</p>
             <p>Could Place the template HTML Here</p>
         </div>
     </text>
     <contained>
         <ValueSet id="vs1">
             <name value="Parity Options"></name>
             <description value="The list of possible responses for Parity"/>
             <status value="active"/>
             <compose>
                 <include>
                     <system value="http://snomed.info/sct"/>
                     <code value="127364007"/>
                     <code value="127365008"/>
                 </include>
             </compose>
             <expansion>
                 <timestamp value="2014-02-27T12:30:10Z"/>
                 <contains>
                     <system value="http://snomed.info/sct"/>
                     <code value="127364007"/>
                     <display value="Primagravida (First Pregnancy) "/>
                 </contains>
                 <contains>
                     <system value="http://snomed.info/sct"/>
                     <code value="127365008"/>
                     <display value="Second Pregnancy"/>
                 </contains>
             </expansion>
         </ValueSet>
     </contained>
     <status value="published"/>
     <authored value="2014-02-27"/>
     <name>
         <coding>
             <system value="http://orionhealth.com/fhir/questionnaire#templatecodes"/>
             <code value="pregAss"/>
             <display value="Pregnancy Assessment"></display>
         </coding>
     </name>
     <group>
         <name>
             <coding>
                 <system value="http://orionhealth.com/fhir/questionnaire#sectioncodes"/>
                 <code value="subj"/>
                 <display value="Subjective entries - what the patient says"/>
             </coding>
         </name>
         <header value="Subjective"/>
         <text value="Use this section to record the patient history"/>
         <group>
             <question>
                 <name>
                     <coding>
                         <system value="http://snomed.info/sct"/>
                         <code value="127364007"/>
                         <display value="Primagravida (First Pregnancy) "/>
                     </coding>
                 </name>
                 <text value="How many Pregnancies"/>

                 <choice>
                     <code value="53881005"/>
                     <display value="None"/>
                 </choice>
                 <options>
                     <reference value="#vs1"/>
                 </options>

             </question>
             <question>
                 <name>
                     <coding>
                         <system value="http://snomed.info/sct"/>
                         <code value="118212000"/>
                     </coding>
                 </name>
                 <text value="What is the Parity (Number of live births)"/>
                 <answerInteger value="0"/>

             </question>
         </group>
     </group>
 </Questionnaire>

As I was writing this post, there were a number of areas where I wondered if the Questionnaire could be modified to improve its usability (of course, it may simply be that I don’t fully understand the intentions behind it’s use – it’s one of the more complex resources). These are:

  • It would be nice to be able to be explicit about the version of a template. Eg – that this is the second revision of an assessment form. (This is different to the FHIR resource versioning)
  • Should there be an indicator – or a list – when a single form has had multiple authors or sources.
  • Should there be some explicit way of marking a Questionnaire if it is a template, or is inferring this from the status still be best way to go.
  • Should the top level ‘group’ be better named as  ‘content’
  • Allow a question to define the expected answer format (must use a profile at the moment)
  • Does the term ‘section’ make more sense than ‘groups’

This nicely leads me back to the connectathon. One of the key purposes of the FHIR connectathon – and the reason for choosing a theme – is to support implementers as they actually use these resources, and to feed back issues that arise during this  into the resource design process. I think that the Questionnaire is going to be one of the more commonly used resources, so it’s a good time to look at it in detail.

You really should be there!

We aren’t quite done with Questionnaire yet – next time we’ll take a closer look at the grouping structure, and the standard extensions.

FHIR Questionnaire: Connectathon 6

Note that since this post was written, the Questionnaire resource got split in 2 with the Questionnaire being just a template, and the QuestionnaireResponse holding the data. More on that here.

The theme for the next FHIR Connectathon has just been decided by the FHIR Management Group, and this time round we’re going to explore the Questionnaire resource. The reason for choosing this is that there is interest in this resource from a number of significant organizations in the healthcare space, and because it supports one of the core requirements in healthcare – the ability to collect information using a form, and to extract information from that form to populate or update a medical record, while always keeping a ‘copy’ of the form.

It’s important to appreciate that the purpose of the Questionnaire is in the collection of information – not to act as a ‘repository’ of that data for later searching. In other words, if the Questionnaire collects something that needs to be recorded outside of that form – like a Condition – then the correct thing to do is to create a formal Condition resource and store that. Searching for data within the Questionnaire is certainly technically feasible, but is discouraged – the spec doesn’t even define any standard search parameters for this.

It is still possible to refer to a Resource from the Questionnaire  – you might want to do this if sending a completed Questionnaire to someone else  and wish to include the created resources in a bundle – but to do that you must have already extracted the data, created the resource, and then placed a reference to it within the Questionnaire. In some ways, this is like a FHIR Document – where the Questionnaire is analogous to the Composition resource – although the purpose is quite different. And, like a document, you can include all the resources in a bundle and sign it. (We talked about signing in the post on tamper proof auditing)

If you want to be able to record that the resource was created from data in a questionnaire, then the Provenance resource should be used. Provenance.target would refer to the resource, and Provenance.agent would refer to the Questionnaire.

At a high level, there are at 2 ways to use the Questionnaire:

  • As a ‘template’, containing only the questions – and potentially layout. (Template is a really overloaded term unfortunately).
  • As a ‘form instance’, containing the questions and the answers to those questions, or with the answers only (in which you would need to refer to something else to render it).

The Questionnaire.status property is used to distinguish a specific resource – the diagram below shows how this works.

questionnaire workflow

When it comes to using a Questionnaire to create a form, a common workflow might be:

  1. Select a blank Questionnaire from a set established as templates. These would be Questionnaire resources with a code of ‘published’.
  2. Prepopulate the Questionnaire from data sources if available. For example, you might want to include the current medication list or conditions.
  3. Render the Questionnaire as a form for the user to complete
  4. Save the completed Questionnaire as an interim Questionnaire.
  5. Complete the Questionnaire.
  6. Construct any other resources from the Questionnaire, and place a reference to them in the Questionnaire.
  7. Save the Questionnaire resource

To store a Questionnaire, you save the resource in the data store in the same way as any other resource. As described above, there may be other resources that were created on the basis of data entered in the Questionnaire that can be referred to from the Questionnaire.

To display a completed Questionnaire (form) to an end user, the simplest way is to store that rendition in the text property of the resource. In that way, any system with minimal FHIR compliance can retrieve and display it. This also gives the creator of the form control over the display format, which senders often require. Of course, as the resource contains structured data it can be rendered in a different way if required – e.g. on a mobile device. If storage space is an issue, then an alternative is to store the answers only in the Questionnaire – with a minimal text property – and require the consumer to retrieve a display template and render locally. The same applies to the a template: you could include the layout in the text element, or require that the recipient render it.

Well that’s an overview of the Questionnaire resource. In the next post we’ll dig a bit more into its structure, and go over this workflow in more detail.

Tamper resistant auditing in FHIR

Had an interesting chat with a colleague here at Orion (Richard) about audit events and signing them, and how to be sure that they haven’t been tampered with (which is apparently a Meaningful Use requirement) so have made a note here for when I forget, as this is an area that I’m not that familiar with. I should also say that there are doubtless other ways of doing this and real life implementations really need to be done by those who are experts – it’s been said that poor security is worse than no security at all…

FHIR provides a specific resource – the SecurityEvent resource – which records some event of importance. This could be anything from creating, reading, deleting or updating other resources, and can serve as the basis for an audit log to answer questions like ‘who has accessed my record’. The resource is based on the IHE ATNA profile, and indicates ‘who‘ did ‘what’ and ‘when’.

The spec states that a server that stores SecurityEvents should not allow them to be updated or deleted – which makes sense for something related to audit – but for the truly paranoid, how can you know this? How can you know that a SecurityEvent resource has not been deleted or modified?

To answer this question, we need a bit of background on some security terminology.

  • A cryptographic hash is a string (sometimes called a digest) that is generated from a source (like a FHIR resource) using some hash function. SHA-256 is an example. The digest that is generated by the function will always be the same when the input is the same, but it is not possible (or more accurately is unfeasible) to regenerate the original from the digest. So, if you have a digest and the original, then you can be sure that the digest ‘matches’ the original by creating a digest yourself and checking that they are the same – kind of like a fingerprint identifies an individual person.
  • Public Key cryptography is a system where there are 2 ‘keys’ that are linked mathematically. It’s possible to encrypt something (producing a signature) with one of the keys, and then to re-generate the original using the other key (So this is ‘two-way’ – unlike the hash which is ‘one-way’). One of the keys is kept secret (the private key) and the other one is shared (a public key). Assuming that you have my public key, then I can encrypt a message with my private key and send it to you. You can use my public key to decode it – and you know that it came from me. The public key is often called a certificate.
  • Of course, for the above to work, you need to be able to trust that you have got my public key – and not that of someone else. This is where a certificate authority (CA) comes in. This is an organization that we both trust which asserts that a particular key pair (public and private) actually belongs to a given person (or organization / thing). You prove to the CA that you are who you say you are, and the CA signs your public certificate with their own one. So – if you get a message encrypted with my private key, and you have my public key (certificate), then you can check that the CA has validated that certificate, and then use it to decode the message. Provide you trust the Certificate Authority, then all is well. The infrastructure that supports all this is referred to as Public Key Infrastructure – or PKI.

Just a couple more things before we move on:

  • The computations required to sign and decode messages using public & private keys can take a long time to run with large input files. What people commonly do is to take a message, produce a hash from that and then sign the hash.
  • This discussion is all about proving who (or what) created a resource – the integrity of that resource. The actual encryption of the contents is a separate discussion.
  • It can get a lot more complicated than this! This is not a simple area…

So with the basics behind us, how can we be sure that our SecurityEvent resources have not been tampered with? One way is to use a signed Provenance resource that refers to the SecurityEvent. This would be created by the same application that created the SecurityEvent.

A key assumption to make is where do we get the signers public key (certificate) from? There are a number of options for this, and the details will vary for each particular implementation (as far as I am aware there is no intention for FHIR to move into this space).

One possibility is to actually store the certificate in the signature itself! This sounds silly, but the idea is that the certificate is signed by a Certificate Authority. So, if we trust that Authority, then we trust that they assigned it to the right person. We’ll use that for the example below.

This process would go something like this.

  1. The application creates and saves the SecurityEvent resource.
  2. Next, it creates a Provenance resource. The provenance resource will refer back to the SecurityEvent resource (Provenance.target).
  3. Next generate a Hash of the SecurityEvent resource.
  4. Create a signature by encrypting the Hash using the systems private key, and include the public key (certificate) in the signature, which is then saved in the Provenance resource. (Provenance.integritySignature).
  5. Save the Provenance resource.

All of the above steps would need to be executed on the server in a single transaction.

So, with that in place how can I check that a particular resource hasn’t been altered? Well:

  1. Retrieve the SecurityEvent resource I want to check
  2. Generate a Hash of the resource using the same hash function as before.
  3. Retrieve the Provenance resource that refers to that SecurityEvent. (This would be a query against Provenance – /Provenance?target={ SecurityEvent.ID} ). The element Provenance.integritySignature has the signature generated by the system that signed it.
  4. Extract the signers Certificate from the signature, and make sure that the issuing CA (in the Certificate) is one that I trust. (Actually, there can be a whole chain of CA’s – but let’s not go there…)
  5. Using that Certificate, decode the rest of the signature. This will produce the original Hash, which should be the same as the one generated in step 2. If they match, then you can be sure that the SecurityEvent is the same as when it was signed. If not, you have a problem.

So what stops the bad guy from altering the signature in the Provenance? Well, they don’t have the private key of the signing system, so can’t encode the hash in a way that can be decoded by the public key (certificate) of the signing system. If they do put their own certificate in, then the CA to which it refers (and that I trust) tells me it was someone else.

The whole thing depends on being able to keep the private key of the signing system secret, and having certificates – maybe from a CA – that we both trust.

Couple of questions that come to mind.

Why not just sign the SecurityEvent resource and add the signature to it directly? Well, the problem with that is that the signature is ‘signing’ the whole SecurityEvent resource. If the digest is a part of that resource, then you have to generate the digest and then add it to the resource and then… oh wait … we’re just modified the resource and therefore the digest is invalid…

How do you know if a SecurityEvent has been deleted? After all, the bad guy could delete both SecurityEvent and Provenance? One way around this is for each SecurityEvent to refer to the previously created SecurityEvent in a chain (You’d need an extension for this in FHIR). That way you can ‘walk the chain’ to make sure there are none missing – and provided you check each SecurityEvent as you go, you know that the bad guy hasn’t changed any of the links.

Security is a complex topic, and this merely scratches the surface of what is required to implement correctly – it’s a high level overview and not an implementation guide.  To understand more – John Moehrke is one of the acknowledged experts in this space in healthcare.

isModifier and mustSupport in FHIR

There’s been a discussion on the FHIR list about ‘mustSupport’ and ‘isModifier’ that I think I now understand – but I’m going write down for when I forget later! (Most of the wording below has been directly copied from comments by Grahame or Lloyd either in the spec or in comments on the list)

First, what  do they mean.

isModifier

From the spec:

If true, the value of this element affects the interpretation of the element or resource that contains it, and the value of the element cannot be ignored. Typically, this is used for status, negation and qualification codes. The effect of this is that the element cannot be ignored by systems: they SHALL either recognize the element and process it, and/or a pre-determination has been made that it is not relevant to their particular system.

Only the definition of an element can set IsModifier true – either the specification itself or where an extension is originally defined. Once set to false, it cannot be set to true in derived profiles. An element/extension that has isModifier=true SHOULD also have a minimum cardinality of 1, so that there is no lack of clarity about what to do if it is missing. If it can be missing, the definition SHALL make the meaning of a missing element clear.

 Then Lloyd commented:

We mark something as a modifierExtension not because a human might care about it, but rather because it might impact what a computer is doing with existing data.  If you’ve got a decisions support system that’s driving off of data elements and your extension changes the interpretation of any of those, then you need to flag to the decision support engine “don’t try to process this data unless you understand the extension”.  So long as the meaning of existing data is unchanged, it doesn’t matter how clinically relevant/important the data is, it’s not a modifierExtension.  (Data perceived to be clinically important/relevant by the originator should however be highlighted prominently in the narrative.)

 And Grahame added:

What makes something is a modifier is not that it might modify care, but that it might modify the understanding of existing data elements. This should be understood in a narrow sense, not an expansive one – it’s really about “is it safe to use the existing data without knowing the additional information”, not “would knowing the additional information influence my course of action in interpreting the data”.

mustSupport

From the spec

If true, conformant resource authors SHALL be capable of providing a value for the element and resource consumers SHALL be capable of extracting and doing something useful with the data element. If false, the element may be ignored and not supported.

“Something useful” is context dependent. This flag is never set to true by the FHIR specification itself – it is only set to true in profiles, and when the profile sets it true, it SHALL describe what it means for applications to support the element. In general, the question is what would a reasonable observer expect of a system that explicitly claims to “support” this element?.

And Lloyd added:

The purpose of “mustSupport” is for use in situations such as national program created profiles where they want to leave an element as “optional” (data won’t always exist and thus no need to include the element in all instances), but want to state that implementers playing in a particular space (and thus declaring conformance to the profile) must *support* that element (e.g. be able to capture, display, process, store, etc.)  The specific meaning of mustSupport would need to be declared by the Profile.  As an example, the panCanadian v3 profile for Patient identifies “date of death” as being “required” in our v3 message specification.  That means minOccurs=0, conformance=required.  Systems don’t have to send it, but they must be capable of sending, capturing and storing it.  With FHIR, we’d accomplish the same thing by setting minOccurs=0 and mustSupport=true in our profile.

and

mustSupport is meaningless at run time.  mustSupport only matters at conformance time.  Someone publishes a profile that says “If you want to comply with my profile (and thus get paid ‘meaningful use’ $$ or be eligible to respond to this RFP or be allowed to be installed in my hospital, etc.), you must support this element.  You declare that you support the profile in a contract or an RFP response or whatever.  It’s entirely possible that the profile will never be referenced in an instance at all.  (And if it were, it shouldn’t cause any change of behavior.)

 My Comments

Both mustSupport and isModifier flags are recorded in the profile resource.

isModifier can apply both to core elements (i.e. elements of resources that in the spec) and also to extensions. In some cases the base specification has marked elements as isModifier – eg MedicationPrescription.status. If you do define an extension in a profile, and mark it as ‘isModifier=true’, then when an actual resource instance contains this extension, it must use the modiferExtension element, rather than the usual extension element. This makes it really explicit to a consumer that this extension can modify the meaning of the resource.

Remember that all the core FHIR resources also have a profile attached to them. You can retrieve these from either Grahames or Ewouts server by a query against the profile resource – eg http://fhir.healthintersections.com.au/open/Profile?name=Person or http://spark.furore.com/fhir/Profile?name=Person

Updated: Grahame commented that all profiles are provided directly from the spec

mustSupport can apply to both elements and extensions, but are only set in a profile – i.e. the base specification has no elements marked as mustSupport. A resource instance will use an ordinary extension element (unless it it isModifer too of course).

Hope that all makes sense!