Where did that data come from?

This post grew out of a question from one of the Analysts at Orion Health. We’re in the process of embedding FHIR pretty deeply into our product stack – and part of that involves creating FHIR interfaces to our existing data repositories.

This particular repository takes data feeds from a number of sources – mostly in the form of v2 messages, but also including CCDA documents – and from them extracts clinical data of interest such as encounters, procedures, problems and so forth. Because the data comes from varied sources, one of the data items that the existing output displays to the user is where it came from – i.e. which facility and possibly which application. (As much as we’d like to get the clinician, this data is not generally available in v2 messages).

So the question they asked me was – ‘where does this stuff go in FHIR’?


Clinical Scenarios in FHIR: Adverse reaction

The final scenario that is a candidate for the clinical connectathon is one that deals with allergies and intolerance. Here are the details, but at a high level:

  • A patient is prescribed penicillin and 8 days into the course develops symptoms that are diagnosed as an allergy to penicillin, which is recorded in their allergy list, along with details of the nature of the allergy.
  • In addition, the patient (through a patient portal) updates their allergy list – adding an entry that is subsequently updated (or reconciled) by the clinician.


The FHIR questionnaire: part 2

In this post we’re going to take a closer look at the internals of the Questionnaire.

The Questionnaire resource has 2 main parts:

  • The ‘header’ part that describes the Questionnaire
  • The contents – questions and answers

The Questionnaire has a dual personality. If it only has questions then it is a template (and the status has options of draft, published and retired). If it has answers, then it is a form instance (and the status has options of in progress, completed and amended). It could be argued that this difference should be made explicit with a specific property – in the meantime we’ll refer to these personalities as a template or an instance/form.
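As a rough sketch of that inference (the function name and the two sets are my own illustration, not anything defined by the spec), you could derive the ‘personality’ from the status like this:

```python
# Status codes as listed above; grouping them into two sets is this post's
# reading of the resource, not an official FHIR API.
TEMPLATE_STATUSES = {"draft", "published", "retired"}
INSTANCE_STATUSES = {"in progress", "completed", "amended"}


def questionnaire_personality(status: str) -> str:
    """Return 'template' or 'instance' for a Questionnaire status code."""
    if status in TEMPLATE_STATUSES:
        return "template"
    if status in INSTANCE_STATUSES:
        return "instance"
    raise ValueError(f"unknown Questionnaire status: {status!r}")


print(questionnaire_personality("published"))  # template
print(questionnaire_personality("completed"))  # instance
```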

In the header is:

  • Status was discussed in the previous post. It actually serves a couple of purposes:
    • It identifies whether the Questionnaire is a template or an instance of a form
    • It indicates the workflow state
  • Authored. This is the date that the Questionnaire was authored – specifically the actual version of this resource – i.e. the date is updated whenever a new version is created.
  • Subject. Who the instance refers to.
  • Author. The person who recorded the information in the instance. When the instance has been completed by different people (eg the nurse fills in the start of an assessment form and the doctor completes the rest), then different versions of the instance will have different authors, and the version history will tell who entered what into the instance.
  • Source. The person who actually supplied the information (who may not be the same as the subject). For example, a mother may be supplying information about her child. If the subject is supplying the information then this property is not needed. Similarly to the author, this could conceivably be different across versions.
  • Name. This is used particularly for templates, and is the equivalent of the name of a form in the paper world – e.g. ‘Pre-operative assessment’. (The spec talks about a pre-defined list of questions – but it means the same). Interestingly, there is no concept of a ‘version’ of the form (as opposed to the usual FHIR resource versioning). For example, suppose the ‘Pre-operative assessment’ template is modified. You have a couple of options:
    • Just update the template (new FHIR version). You do need to be careful about backwards compatibility if form instances refer to external entities for rendering.
    • Change the status of the original to ‘retired’, and create a new one with the same name (and identifier). It may make sense to have an extension to explicitly state the template version.
  • Identifier. Also used for templates, this allows a template to be part of a business workflow – e.g. the template has the identifier of “POA-2013/15 revA” – or some such. Potentially you could use the identifier.period to specify the date range over which the template was valid – so when you create or retire the template, you set the identifier.period.start and identifier.period.end properties appropriately.
  • Encounter.  This is an optional reference to the encounter where the form was completed.

The actual content of the Questionnaire is contained in an element named ‘group’. There can only be one top level group in a Questionnaire and it is optional (A Questionnaire with no questions would seem to be a particularly useless beast, but you might want to create the template, and then later add the questions I guess). As an aside – I suspect that a better name for ‘group’ here might have been ‘content’ – and for clarity that is how we’re going to refer to it here.

The content can be quite complex, being a hierarchy of sections (which are represented by groups), questions within the groups and groups within the questions…

Let’s start from the bottom by thinking about the Questions.

Each Question has the following properties.

  • Name. A CodeableConcept that contains the code for each question. For example, suppose the question is ‘How many babies have you had (Parity)’, then the code could be the SNOMED code 118212000.
  • Text. The text of the question as shown to the user in the form
  • Answer[x]. The answer to the question – if there is a single answer that is not drawn from a value set (we’ll discuss this in a moment). There is a fixed set of datatypes you can use here, and it isn’t possible in the Questionnaire to dictate what datatype to use – however you could create a profile for this purpose if you needed to. (A possible enhancement to Questionnaire would be to allow this in the Questionnaire as well). The permissible datatypes are: string, decimal, integer, date, dateTime and instant. Obviously, the answer element will only be present in a completed form, not in a template.
  • Options. This allows a question to specify a set of possible answers – as you would use in a ‘picklist’ or ‘autocomplete’ control on a form. The options element refers to a ValueSet resource – which is a very sophisticated resource with many different ways to define the set of answers (I should do a post on this resource some time).
  • Choice. This is the selection that the user made from the defined options. It’s possible to make more than one choice from the list of options.
  • Data[x]. This allows you to refer to a specific resource, or include an answer that is of a datatype not included in the Answer[x] element.
    • If you are including an answer of a specific datatype – maybe a CodeableConcept –  then the element name will include the datatype – in this case dataCodeableConcept.
    • If you are referring to a resource, then the element name is dataResource, and the contents are the usual resource reference (reference and display)
  • Remarks. Comments that the author of the form wishes to include about the answer – perhaps they feel the answer is incomplete or inaccurate. It should not be used for the answer itself.
  • Group. Allows you to have ‘questions within questions’. For example, the main question could be ‘Are you a smoker’, and the ‘sub-questions’ are only relevant (and possibly only shown) when the answer is ‘yes’.

If you haven’t seen the format ‘Answer[x]’ or ‘Data[x]’ before (i.e. the [x] bit), then what that means is that the element name includes the datatype. For example, if you used a string as the answer, then the element name is answerString; an instant would be answerInstant, and so on.
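To make the naming rule concrete, here is a tiny sketch. The helper name choice_element_name is purely my own; it just capitalises the first letter of the datatype and appends it to the base name.

```python
def choice_element_name(base: str, datatype: str) -> str:
    """Build the concrete element name for an [x] element such as answer[x]."""
    # Only the first letter is upper-cased, so 'dateTime' becomes 'DateTime'.
    return base + datatype[0].upper() + datatype[1:]


print(choice_element_name("answer", "string"))         # answerString
print(choice_element_name("answer", "instant"))        # answerInstant
print(choice_element_name("data", "CodeableConcept"))  # dataCodeableConcept
```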

As you can see, there are a number of different ways of specifying the possible answers to a question. In fact there are 3, and they are mutually exclusive (you can only have 1 of them):

  • Answer[x] – when there is a single, simple answer
  • Choice – when you select one or more options from a set of possible answers (including a full terminology – or subset of a terminology)
  • Data[x] – when there is a different datatype or a FHIR resource that is the answer. (And this could be constructed from other data entered in the form, or selected from an existing resource).
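If you wanted to check that rule in code, a minimal sketch might look like the following. It assumes a question held as a plain dictionary keyed by element name (a JSON-like shape that I’ve chosen for illustration), and the function names are hypothetical.

```python
def answer_kinds(question: dict) -> set:
    """Return which of the three answer styles appear in a question."""
    kinds = set()
    for element in question:
        if element.startswith("answer"):   # answerString, answerInteger, ...
            kinds.add("answer[x]")
        elif element == "choice":
            kinds.add("choice")
        elif element.startswith("data"):   # dataCodeableConcept, dataResource, ...
            kinds.add("data[x]")
    return kinds


def has_conflicting_answers(question: dict) -> bool:
    """True when more than one of the mutually exclusive styles is present."""
    return len(answer_kinds(question)) > 1


parity = {"text": "What is the Parity", "answerInteger": 0}
mixed = {"text": "Are you a smoker", "answerString": "yes", "choice": [{"code": "x"}]}
print(has_conflicting_answers(parity))  # False
print(has_conflicting_answers(mixed))   # True
```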

A question arises: If you construct a resource based on data entered through a Questionnaire, how can you record both the original data and the resource in the Questionnaire? As of now, I’m not sure what the answer is – presumably you could create a Provenance resource that links the two, though that seems overly complex. Another possibility would be to use the remarks property. Or perhaps an extension on the answer that references the resource is the way to go. I need to think a bit more about the use cases where this would apply.

Here’s an example of a simple form. It’s a pregnancy assessment with 2 questions; one is a selection from a pre-defined list and the other has a simple integer answer. To specify the list, we’ve included a ValueSet as a contained resource and expanded the options to include the name of each option so that they can easily be displayed to the user. (Go check out the ValueSet in the spec if these terms are unfamiliar). If this particular set of options was used by multiple Questionnaires then it might make sense to save it as a separate ValueSet resource and reference it from the question rather than containing it.


<Questionnaire xmlns="http://hl7.org/fhir">
     <text>
         <status value="additional"/>
         <div xmlns="http://www.w3.org/1999/xhtml">
             <p>An example of a Questionnaire being used as a template</p>
             <p>Could Place the template HTML Here</p>
         </div>
     </text>
     <contained>
         <ValueSet id="vs1">
             <name value="Parity Options"></name>
             <description value="The list of possible responses for Parity"/>
             <status value="active"/>
             <compose>
                 <include>
                     <system value="http://snomed.info/sct"/>
                     <code value="127364007"/>
                     <code value="127365008"/>
                 </include>
             </compose>
             <expansion>
                 <timestamp value="2014-02-27T12:30:10Z"/>
                 <contains>
                     <system value="http://snomed.info/sct"/>
                     <code value="127364007"/>
                     <display value="Primigravida (First Pregnancy)"/>
                 </contains>
                 <contains>
                     <system value="http://snomed.info/sct"/>
                     <code value="127365008"/>
                     <display value="Second Pregnancy"/>
                 </contains>
             </expansion>
         </ValueSet>
     </contained>
     <status value="published"/>
     <authored value="2014-02-27"/>
     <name>
         <coding>
             <system value="http://orionhealth.com/fhir/questionnaire#templatecodes"/>
             <code value="pregAss"/>
             <display value="Pregnancy Assessment"></display>
         </coding>
     </name>
     <group>
         <name>
             <coding>
                 <system value="http://orionhealth.com/fhir/questionnaire#sectioncodes"/>
                 <code value="subj"/>
                 <display value="Subjective entries - what the patient says"/>
             </coding>
         </name>
         <header value="Subjective"/>
         <text value="Use this section to record the patient history"/>
         <group>
             <question>
                 <name>
                     <coding>
                         <system value="http://snomed.info/sct"/>
                         <code value="127364007"/>
                         <display value="Primigravida (First Pregnancy)"/>
                     </coding>
                 </name>
                 <text value="How many Pregnancies"/>

                 <choice>
                     <code value="53881005"/>
                     <display value="None"/>
                 </choice>
                 <options>
                     <reference value="#vs1"/>
                 </options>

             </question>
             <question>
                 <name>
                     <coding>
                         <system value="http://snomed.info/sct"/>
                         <code value="118212000"/>
                     </coding>
                 </name>
                 <text value="What is the Parity (Number of live births)"/>
                 <answerInteger value="0"/>

             </question>
         </group>
     </group>
 </Questionnaire>
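
To get a feel for how a client might walk this structure, here is a minimal sketch using Python’s standard xml.etree.ElementTree. It uses a cut-down copy of the form above (only the elements needed to show the nesting), and the helper question_texts is my own invention rather than part of any FHIR library.

```python
import xml.etree.ElementTree as ET

FHIR = "{http://hl7.org/fhir}"  # namespace prefix as ElementTree writes it

# A trimmed version of the example form, kept small for illustration.
SAMPLE = """
<Questionnaire xmlns="http://hl7.org/fhir">
  <status value="published"/>
  <group>
    <header value="Subjective"/>
    <group>
      <question>
        <text value="How many Pregnancies"/>
      </question>
      <question>
        <text value="What is the Parity (Number of live births)"/>
        <answerInteger value="0"/>
      </question>
    </group>
  </group>
</Questionnaire>
""".strip()


def question_texts(group, depth=0):
    """Yield (depth, text) for every question under a group, depth-first."""
    for child in group:
        if child.tag == FHIR + "question":
            text = child.find(FHIR + "text")
            value = text.get("value") if text is not None else ""
            yield (depth, value)
            # A question may itself hold groups ('questions within questions').
            for nested in child.findall(FHIR + "group"):
                yield from question_texts(nested, depth + 1)
        elif child.tag == FHIR + "group":
            yield from question_texts(child, depth + 1)


root = ET.fromstring(SAMPLE)
top_group = root.find(FHIR + "group")
for depth, text in question_texts(top_group):
    print("  " * depth + text)
```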

As I was writing this post, there were a number of areas where I wondered if the Questionnaire could be modified to improve its usability (of course, it may simply be that I don’t fully understand the intentions behind its use – it’s one of the more complex resources). These are:

  • It would be nice to be able to be explicit about the version of a template – e.g. that this is the second revision of an assessment form. (This is different to the FHIR resource versioning.)
  • Should there be an indicator – or a list – when a single form has had multiple authors or sources?
  • Should there be some explicit way of marking a Questionnaire as a template, or is inferring this from the status still the best way to go?
  • Would the top level ‘group’ be better named ‘content’?
  • Allow a question to define the expected answer format (you must use a profile at the moment).
  • Does the term ‘section’ make more sense than ‘group’?

This nicely leads me back to the connectathon. One of the key purposes of the FHIR connectathon – and the reason for choosing a theme – is to support implementers as they actually use these resources, and to feed back issues that arise during this into the resource design process. I think that the Questionnaire is going to be one of the more commonly used resources, so it’s a good time to look at it in detail.

You really should be there!

We aren’t quite done with Questionnaire yet – next time we’ll take a closer look at the grouping structure, and the standard extensions.

Tamper resistant auditing in FHIR

I had an interesting chat with a colleague here at Orion (Richard) about audit events, signing them, and how to be sure that they haven’t been tampered with (which is apparently a Meaningful Use requirement). I’ve made a note of it here for when I forget, as this is an area that I’m not that familiar with. I should also say that there are doubtless other ways of doing this, and real-life implementations really need to be done by those who are experts – it’s been said that poor security is worse than no security at all…

FHIR provides a specific resource – the SecurityEvent resource – which records some event of importance. This could be anything from creating, reading, deleting or updating other resources, and can serve as the basis for an audit log to answer questions like ‘who has accessed my record’. The resource is based on the IHE ATNA profile, and indicates ‘who‘ did ‘what’ and ‘when’.

The spec states that a server that stores SecurityEvents should not allow them to be updated or deleted – which makes sense for something related to audit – but for the truly paranoid, how can you know this? How can you know that a SecurityEvent resource has not been deleted or modified?

To answer this question, we need a bit of background on some security terminology.

  • A cryptographic hash is a string (sometimes called a digest) that is generated from a source (like a FHIR resource) using some hash function. SHA-256 is an example. The digest that is generated by the function will always be the same when the input is the same, but it is not possible (or more accurately is unfeasible) to regenerate the original from the digest. So, if you have a digest and the original, then you can be sure that the digest ‘matches’ the original by creating a digest yourself and checking that they are the same – kind of like a fingerprint identifies an individual person.
  • Public Key cryptography is a system where there are 2 ‘keys’ that are linked mathematically. It’s possible to encrypt something with one of the keys (when the private key is used, the result is a signature), and then to re-generate the original using the other key (so this is ‘two-way’ – unlike the hash, which is ‘one-way’). One of the keys is kept secret (the private key) and the other one is shared (the public key). Assuming that you have my public key, then I can encrypt a message with my private key and send it to you. You can use my public key to decode it – and you know that it came from me. The public key is usually distributed inside a certificate, which is why the two terms are often used interchangeably.
  • Of course, for the above to work, you need to be able to trust that you have got my public key – and not that of someone else. This is where a certificate authority (CA) comes in. This is an organization that we both trust which asserts that a particular key pair (public and private) actually belongs to a given person (or organization / thing). You prove to the CA that you are who you say you are, and the CA signs your certificate with its own private key. So – if you get a message encrypted with my private key, and you have my public key (certificate), then you can check that the CA has validated that certificate, and then use it to decode the message. Provided you trust the Certificate Authority, then all is well. The infrastructure that supports all this is referred to as Public Key Infrastructure – or PKI.
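
The ‘fingerprint’ behaviour of a hash is easy to see for yourself. This is just a sketch using Python’s standard hashlib module; the byte strings are made-up stand-ins for a serialized resource.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the input as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"record viewed by Dr Smith"   # made-up stand-in for a resource
tampered = b"record viewed by Dr Jones"

print(digest(original) == digest(original))  # True: same input, same digest
print(digest(original) == digest(tampered))  # False: any change alters the digest
print(len(digest(original)))                 # 64 hex characters, whatever the input size
```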

Just a few more things before we move on:

  • The computations required to sign and decode messages using public & private keys can take a long time to run with large input files. What people commonly do is to take a message, produce a hash from that and then sign the hash.
  • This discussion is all about proving who (or what) created a resource – the integrity of that resource. The actual encryption of the contents is a separate discussion.
  • It can get a lot more complicated than this! This is not a simple area…

So with the basics behind us, how can we be sure that our SecurityEvent resources have not been tampered with? One way is to use a signed Provenance resource that refers to the SecurityEvent. This would be created by the same application that created the SecurityEvent.

A key question is where we get the signer’s public key (certificate) from. There are a number of options for this, and the details will vary for each particular implementation (as far as I am aware there is no intention for FHIR to move into this space).

One possibility is to actually store the certificate in the signature itself! This sounds silly, but the idea is that the certificate is signed by a Certificate Authority. So, if we trust that Authority, then we trust that they assigned it to the right person. We’ll use that for the example below.

This process would go something like this.

  1. The application creates and saves the SecurityEvent resource.
  2. Next, it creates a Provenance resource. The provenance resource will refer back to the SecurityEvent resource (Provenance.target).
  3. Next generate a Hash of the SecurityEvent resource.
  4. Create a signature by encrypting the Hash using the system’s private key, and include the public key (certificate) in the signature, which is then saved in the Provenance resource. (Provenance.integritySignature).
  5. Save the Provenance resource.

All of the above steps would need to be executed on the server in a single transaction.

So, with that in place how can I check that a particular resource hasn’t been altered? Well:

  1. Retrieve the SecurityEvent resource I want to check
  2. Generate a Hash of the resource using the same hash function as before.
  3. Retrieve the Provenance resource that refers to that SecurityEvent. (This would be a query against Provenance – /Provenance?target={SecurityEvent.ID}). The element Provenance.integritySignature has the signature generated by the system that signed it.
  4. Extract the signer’s Certificate from the signature, and make sure that the issuing CA (in the Certificate) is one that I trust. (Actually, there can be a whole chain of CAs – but let’s not go there…)
  5. Using that Certificate, decode the rest of the signature. This will produce the original Hash, which should be the same as the one generated in step 2. If they match, then you can be sure that the SecurityEvent is the same as when it was signed. If not, you have a problem.
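
To make the signing and checking steps a little more tangible, here is a toy sketch. Please treat it as an illustration only: it uses ‘textbook’ RSA built from two small, well-known primes so that it runs with nothing but the standard library, which is not remotely secure, and it skips the certificate and CA check entirely. The resource content, the dictionary shape and the function names are all assumptions of mine; a real system would use a vetted crypto library.

```python
import hashlib

# TOY key pair, for illustration only. Both primes are well-known Mersenne
# primes; a real key is far larger and generated by a crypto library.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(resource: bytes) -> int:
    """SHA-256 digest of the resource, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(resource).digest(), "big") % N


def sign(resource: bytes) -> int:
    """Signing side: hash the resource, then transform the hash with the private key."""
    return pow(digest_as_int(resource), D, N)


def verify(resource: bytes, signature: int) -> bool:
    """Checking side: recover the hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(resource)


# Made-up serialized SecurityEvent and a Provenance-like record pointing at it.
security_event = b'{"resourceType": "SecurityEvent", "id": "se1"}'
provenance = {
    "target": "SecurityEvent/se1",
    "integritySignature": sign(security_event),
}

print(verify(security_event, provenance["integritySignature"]))         # True
print(verify(security_event + b" ", provenance["integritySignature"]))  # False
```

Note that the sketch hashes the exact bytes of the resource, so both sides would need to agree on how the resource is serialized before hashing.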

So what stops the bad guy from altering the signature in the Provenance? Well, they don’t have the private key of the signing system, so can’t encode the hash in a way that can be decoded by the public key (certificate) of the signing system. If they do put their own certificate in, then the CA to which it refers (and that I trust) tells me it was someone else.

The whole thing depends on being able to keep the private key of the signing system secret, and having certificates – maybe from a CA – that we both trust.

A couple of questions come to mind.

Why not just sign the SecurityEvent resource and add the signature to it directly? Well, the problem with that is that the signature is ‘signing’ the whole SecurityEvent resource. If the digest is a part of that resource, then you have to generate the digest and then add it to the resource and then… oh wait … we’ve just modified the resource and therefore the digest is invalid…
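
You can see the chicken-and-egg problem in a few lines. This is only a sketch: the resource is a made-up dictionary, the ‘digest’ element is hypothetical, and serializing with sorted keys is simply my way of getting repeatable bytes to hash.

```python
import hashlib
import json


def digest(resource: dict) -> str:
    """Hash a repeatable JSON serialization of the resource."""
    canonical = json.dumps(resource, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


event = {"resourceType": "SecurityEvent", "id": "se1"}
before = digest(event)

event["digest"] = before   # hypothetical element holding the digest itself
after = digest(event)

print(before == after)     # False: storing the digest changed what was hashed
```

Schemes that do embed a signature generally get around this by leaving the signature element out of what gets hashed, which is extra machinery that the separate Provenance resource avoids.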

How do you know if a SecurityEvent has been deleted? After all, the bad guy could delete both SecurityEvent and Provenance? One way around this is for each SecurityEvent to refer to the previously created SecurityEvent in a chain (You’d need an extension for this in FHIR). That way you can ‘walk the chain’ to make sure there are none missing – and provided you check each SecurityEvent as you go, you know that the bad guy hasn’t changed any of the links.
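
‘Walking the chain’ might look something like this. It’s a sketch under some assumptions of mine: the events live in a dictionary keyed by id, and each one carries a hypothetical ‘previous’ element (standing in for the extension mentioned above) that holds the id of the event before it. In practice you would also check each event’s signature as you go.

```python
def missing_links(events: dict, latest_id: str) -> list:
    """Walk back from the latest event; return ids that are referenced but absent."""
    missing = []
    current = latest_id
    while current is not None:
        event = events.get(current)
        if event is None:
            missing.append(current)
            break  # the chain cannot be followed past a deleted event
        current = event.get("previous")  # hypothetical link to the prior event
    return missing


store = {
    "se1": {"previous": None},
    "se2": {"previous": "se1"},
    "se3": {"previous": "se2"},
}
print(missing_links(store, "se3"))  # []

del store["se2"]                    # the 'bad guy' deletes an event
print(missing_links(store, "se3"))  # ['se2']
```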

Security is a complex topic, and this merely scratches the surface of what is required to implement correctly – it’s a high level overview and not an implementation guide. To understand more – John Moehrke is one of the acknowledged experts in this space in healthcare.