Mapping HL7 Version 2 to FHIR Messages

David Hay

10 years ago

When thinking about how FHIR is going to be implemented ‘for real’, it’s likely that ‘green fields’ applications – where there is no existing standard in use – will be early adopters – mobile is an obvious example.

But inevitably, we’re going to need to manage situations where HL7 standards are already in use and we want to introduce FHIR as well – i.e. the two standards are going to co-exist.

There’s a lot of work underway with mapping CDA (and CCDA in particular) to FHIR resources – indeed this work will provide a valuable ‘peer review’ for the completeness of FHIR resources, but in this post let’s have a look at how we might introduce FHIR into an existing HL7 version 2 based infrastructure.

Now, this is a really big area to cover, so we’ll use a specific – and small – Use Case to guide our discussion – we’ll take an HL7 version 2.4 ORU message (technically an ORU^R01 containing a radiology result), and produce a FHIR message that we can submit to a FHIR server, which will then save the Observation in the data store.

To keep things reasonably simple (yet be realistic) we’ll have a single OBX segment with the result in it. In practice, we’d likely have multiple OBX segments (plus NTE segments and others), and each OBX would become a separate Observation resource in our message. And note that some of the segments aren’t used – we’re only pulling out the data we need to populate our message. If we were planning a ‘2 way’ transform – i.e. retaining the ability to recreate the v2 message from the FHIR one – then we’d probably need to capture more of the data, and likely need some extensions as well.

In this post we’re just going to consider the creation of the FHIR message containing the Radiology observation (and supporting resources) – we can consider the architecture and workflow aspects of subsequently processing the message in another post.

And remember that HL7 v2 can be used in many different ways (if you’ve seen one v2 implementation…) so this analysis is highly specific to this use case.

Here’s a sample of what the v2 message might look like:

MSH|^~\&|Amalga HIS|BUM|New Tester|MS|20111121103141||ORU^R01|2847970-201111211031|P|2.4|||AL|NE|764|ASCII|||
PID||100005056|100005056||Dasher^Mary^""^^""|""|19810813000000|F||CA|Street 1^""^""^""^34000^SGP^^""~""^""^""^""^Danling Street 5th^THA^^""||326-2275^PRN^PH^^66^675~476-5059^ORN^CP^^66^359~-000-9999^ORN^FX^^66^222~^NET^X.400^a@a.a~^NET^X.400^dummy@hotmail.com|123456789^WPN^PH^^66|UNK|S|BUD||BP000111899|D99999^""||CA|Bangkok|||THA||THA|""|N
PV1||OPD   ||||""^""^""||||CNSLT|||||C|VIP|||6262618|PB1||||||||||||||||||||||||20101208134638
PV2|||^Unknown|""^""||||""|""|0||""|||||||||||||||||||||||||||||HP1
ORC|NW|""|BMC1102771601|""|CM||^^^^^""|||||||||""^""^^^""
OBR|1|""|BMC1102771601|""^Brain (CT)||20111028124215||||||||||||||||||CTSCAN|F||^^^^^ROUTINE|||""||||||""|||||||||||^""
OBX|1|FT|""^Brain (CT)||++++ text of report goes here +++|||REQAT|||F|||20111121103040||75929^Gosselin^Angelina

So there are 7 segments in our incoming HL7 v2 message:

Segment	Purpose	FHIR Resource
MSH	Message header	MessageHeader
PID	Patient Identification	Patient
PV1	Patient Visit	Not used in this example
PV2	Patient Visit – Additional data	Not used in this example
ORC	Common Order	Not used in this example
OBR	Observation Request	Observation
OBX	Observation	Observation, Provider
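To make the segment and field numbering concrete, here is a minimal sketch of pulling fields out of a v2 message by position. The helper names (parse_segments, field) are made up for illustration, the sample is a cut-down version of the message above, and the sketch ignores escape sequences and repeating fields, so it is nowhere near a full v2 parser.

```python
# Minimal HL7 v2 splitter: segments on carriage return/newline, fields on "|".
# Hypothetical helpers for illustration only (no escape or repetition handling).

SAMPLE = "\r".join([
    r"MSH|^~\&|Amalga HIS|BUM|New Tester|MS|20111121103141||ORU^R01|2847970-201111211031|P|2.4",
    "PID||100005056|100005056||Dasher^Mary",
    "OBX|1|FT|^Brain (CT)||report text|||REQAT|||F|||20111121103040||75929^Gosselin^Angelina",
])


def parse_segments(message):
    """Return a dict mapping segment name to a list of field lists."""
    segments = {}
    for line in message.replace("\n", "\r").split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        name = fields[0]
        if name == "MSH":
            # MSH-1 is the field separator itself, so re-insert it to keep
            # fields[n] aligned with MSH-n.
            fields.insert(1, "|")
        segments.setdefault(name, []).append(fields)
    return segments


def field(segments, name, index, default=""):
    """Return field `index` (1-based, as in 'OBX-5') of the first `name` segment."""
    fields = segments[name][0]
    return fields[index] if index < len(fields) else default


segments = parse_segments(SAMPLE)
print(field(segments, "MSH", 9))   # message type
print(field(segments, "OBX", 16))  # responsible observer
```

The one trap worth noting is MSH: because MSH-1 is the separator character, a naive split leaves every MSH field off by one unless it is put back.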

Here’s what our FHIR bundle will look like:

Resources

MessageHeader
Observation
Patient (Subject of the Observation)
Provider (Performer of the Observation)

As a basis for the analysis we’ll use the mappings that are in the FHIR specification (at the top of each resource page there are tabs for the content, examples, formal definitions, mappings & profiles).

So here’s how we can create each resource – with some notes about the choices we need to make.

Bundle

The bundle is an atom feed, and the id of the bundle is a globally unique ID (e.g. a UUID) that represents the FHIR message as a whole. This is distinct from the MessageHeader ID, which is important when dealing with errors that occur during messaging (see the spec for a discussion of this).
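As a rough illustration of the bundle id, here is a sketch that generates a UUID and wraps a set of resources. The dictionary layout and the new_bundle name are assumptions made for readability; a real bundle at this point in the spec would be serialized as an atom feed, not as this structure.

```python
import uuid


def new_bundle(message_header, *other_resources):
    """Wrap resources in a stand-in bundle with a globally unique id.

    The dict shape is illustrative only; it is not the atom feed format.
    """
    return {
        "id": f"urn:uuid:{uuid.uuid4()}",
        # The MessageHeader goes first, followed by the other resources.
        "entries": [message_header, *other_resources],
    }


bundle = new_bundle({"resourceType": "MessageHeader"}, {"resourceType": "Observation"})
print(bundle["id"])
```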

MessageHeader

A FHIR message is a bundle of resources, with the MessageHeader resource as the first resource (and a tag on the bundle). There’s a pretty good mapping between MSH and MessageHeader:

Element	V2 segment	Description
Identifier	MSH-10	Message Control ID
Timestamp	MSH-7	Message Date/time
Event	MSH-9.2	observation-provide. Derived from the second component of the Message Type field. Its value comes from HL7 table 0003
Source.name	MSH-3	Sending application name
Source.software	MSH-3	Sending application name
Source.endpoint	MSH-24	Sending network address
Destination.name	MSH-5	Receiving application
Destination.endpoint	MSH-25	Receiving network address
data		References to the ‘root’ resource of the message.

Some notes.

The identifier is set by the sending system, and should be unique in the context of the sender. Ideally it would be globally unique, but there’s no guarantee of that from the v2 side. From FHIR’s perspective, it needs to be unique within ‘the stream of messages’. In practice we’d expect it to be unique in the context of the sender – after all, it’s how they will be matching acknowledgements to the message.
The software element is a required field, and the mapping in the spec suggests using the SFT segment. However, that was only introduced in version 2.5, so we’ll duplicate the source.name here.
The event code is defined in the FHIR spec (there’s a more complete list here). These are not the same as the v2 ones, though the ‘strength’ of the binding means we could use the v2 ones if we wanted to. However, there’s a value that suits this implementation – observation-provide – so we’ll use that.
The enterer, author and receiver elements allow us to specify individuals (or organizations) from whom the message originated, or who it is for. We don’t need that here, but nice to know it’s available if needed.
In this case the data element will be a reference to the Observation. If we had multiple observations, then we’d probably include a List resource to group them together. And, of course, a message can get a lot more complicated than that.
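Pulling the table and notes together, a MessageHeader could be assembled along these lines. This is only a sketch: the dictionary layout, the build_message_header and hl7_ts_to_iso names, and the cid:obs1 reference are assumptions, and the timestamp helper assumes a full 14-digit v2 timestamp with no time zone.

```python
from datetime import datetime


def hl7_ts_to_iso(ts):
    """Convert a 14-digit HL7 v2 timestamp (YYYYMMDDHHMMSS) to ISO 8601."""
    return datetime.strptime(ts[:14], "%Y%m%d%H%M%S").isoformat()


def build_message_header(msh, observation_ref):
    """Build a MessageHeader-like dict.

    msh maps MSH field numbers to values, e.g. {3: "Amalga HIS", ...}.
    """
    return {
        "resourceType": "MessageHeader",
        "identifier": msh[10],                      # MSH-10 Message Control ID
        "timestamp": hl7_ts_to_iso(msh[7]),         # MSH-7 Message Date/time
        "event": {"code": "observation-provide"},   # chosen FHIR event code
        "source": {
            "name": msh[3],
            "software": msh[3],  # no SFT segment in v2.4, so duplicate the name
            "endpoint": msh.get(24, ""),
        },
        "destination": {"name": msh[5], "endpoint": msh.get(25, "")},
        "data": [{"reference": observation_ref}],   # the 'root' resource
    }


header = build_message_header(
    {3: "Amalga HIS", 5: "New Tester", 7: "20111121103141", 10: "2847970-201111211031"},
    "cid:obs1",
)
print(header["timestamp"])
```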

Observation

This is the resource that holds the actual thing that we want to represent – in this case a radiology result – and Observation is one of the most flexible resources in the FHIR stable.

Here’s the mapping summary table:

Element	V2 segment	comment
name	OBX-3
valueString	OBX-5	OBX-5 holds the actual value of the observation
interpretation	OBX-8
comments	NTE-3	See note below
appliesDateTime	OBX-14
issued	OBR-22
status	OBX-11	Observation Result Status
reliability	OBR-25
identifier	OBX-21
subject		Reference to the Patient
performer		Reference to the Practitioner

Some notes:

The name (or type) of observation (a radiology result) is stored in OBX-3. It’s a CE datatype in v2, which maps to a CodeableConcept in FHIR
The value of the observation will be in the Observation.value[x] element, where [x] is one of the specified FHIR datatypes. In v2, there are at least 3 fields that we’d look at to decide how to get the value.
- OBX-2 is the type of the Observation as defined in the HL7 v2 table 0125. For our radiology result, this will likely be FT or ST and the type of the value[x] element will be a string – making the full element name valueString. If we were building a generic parser, then we’d look at the value of OBX-2 and use that to decide what FHIR data type to use.
- OBX-5 holds the actual value of the observation. The details of this will depend on the datatype as discussed above of course…
- OBX-6 is the units of the value. Doesn’t apply here but in many situations it will. For example, if this was a Systolic Blood Pressure then the datatype would be Quantity, which has a units component to it.
According to the mapping in the spec, the time the report was issued can come from a number of different places, depending on the exact source and nature of the message. We’ll go with OBR-22 (Status change date/time) for our scenario.
The reliability of the result is a bit challenging – and it’s a required element in the Observation resource so we can’t ignore it. The mapping in the spec refers to OBX-8 (Abnormal Flags) and OBX-9 (Probability) – but these are more a reflection of the result within the expected range, rather than whether the result can be trusted. OBR-25 (Result Status) seems a better choice, so we’ll go with that.
If there was a comment about the result, then it would be stored in a separate NTE segment (or segments) immediately after the OBX. There isn’t a direct reference in v2 from the OBX to the NTE (or the other way round) – you have to infer that from the fact that the NTE follows the OBX.
We also need to think about what the Observation.text (the human readable narrative) should be. As the value is a string datatype, and assuming that the v2 datatype (OBX-2) was FT, we’ll enclose the whole value in <pre> elements, and perhaps add the date and author names there as well. I suspect that a proper consideration of how to generate narrative is a full topic in and of itself – perhaps another time.
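The notes above can be sketched as a small function that picks the value[x] element from OBX-2 and copies the other fields across. The value type table is deliberately partial and is an assumption, as are the function name and the dictionary layout. Status and reliability are copied as raw v2 values here; translating them to the FHIR code sets is left out.

```python
# Assumed, partial mapping from HL7 table 0125 value types to value[x] names.
VALUE_TYPE_TO_ELEMENT = {
    "FT": "valueString",
    "ST": "valueString",
    "TX": "valueString",
    "NM": "valueQuantity",
}


def null_to_empty(text):
    """Treat the HL7 v2 explicit null ("") as an empty string."""
    return "" if text == '""' else text


def build_observation(obx, obr, patient_ref, performer_ref):
    """Build an Observation-like dict; obx/obr map field numbers to raw values."""
    code, _, display = obx[3].partition("^")  # CE: identifier ^ text
    element = VALUE_TYPE_TO_ELEMENT.get(obx[2])
    if element is None:
        raise ValueError(f"No mapping for OBX-2 value type {obx[2]!r}")
    if element == "valueQuantity":
        value = {"value": float(obx[5]), "units": obx.get(6, "")}
    else:
        value = obx[5]
    return {
        "resourceType": "Observation",
        "name": {"coding": [{"code": null_to_empty(code), "display": display}]},
        element: value,
        "interpretation": obx.get(8, ""),
        "appliesDateTime": obx.get(14, ""),
        "issued": obr.get(22, ""),
        "status": obx.get(11, ""),        # raw v2 value, not yet a FHIR code
        "reliability": obr.get(25, ""),   # raw v2 value, not yet a FHIR code
        "subject": {"reference": patient_ref},
        "performer": [{"reference": performer_ref}],
    }


observation = build_observation(
    {2: "FT", 3: '""^Brain (CT)', 5: "report text", 8: "REQAT", 11: "F", 14: "20111121103040"},
    {25: "F"},
    "patient/1",
    "practitioner/2",
)
print(observation["valueString"])
```

A generic parser would extend the value type table and would also need a policy for types it doesn’t recognise; here they simply raise an error.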

Patient

In FHIR, the Patient is a separate resource (maybe not even on the same server as the observation), and the Observation will have a reference to it (as the subject). So we need to find the Patient in whatever server it is stored and retrieve its url. We might also need to create the Patient if we can’t find it – though this will depend on the policies of the implementation, which might require that the patient exist first.

The way we’ll search for a patient will be using the patient identifier from the PID segment. PID-3 has a list of patient identifiers that we can use – each being a CX datatype, so equivalent to FHIR’s Identifier datatype. We need to query the Patient server to see if the Patient already exists, using the appropriate identifier (based on the namespace). If it exists, then we have the reference. If not, then we can use the data in the PID segment to create a new Patient resource – if policy allows. Here are the steps:

Using the identifier from the v2 message, query the FHIR server as follows:

GET [patientserver]/patient?identifier={identifier}

If there’s a single match then we have the ID that we can use for the Observation.subject. (And we can place a copy of the Patient inside the bundle as well)
If there is more than one match, then that’s probably an error and we should reject the message or get human intervention (again, there will be some policy around this).
If there are no matches, then the next step depends on whether the patient store is on the same server as the observation.
- If it is, then create a new Patient resource using the data from the PID segment, and give it an ID with a cid: prefix (which will tell the server that it’s a new Patient resource and it will save it locally – resolving the reference to the Observation as it does so).
- If the patient is on a separate server, then we need to create and save a patient now, getting an ID back from the server as it is saved. We can then place the patient resource in the message (along with the correct ID). Of course there are transactional concerns here: what if we save the patient, but the rest of the message processing fails? Well, it doesn’t really matter – the next time a message comes through with this patient identifier we’ll find the one we just created, so no harm done.
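The decision steps above can be sketched as a single function. To keep it self-contained, the search and the remote create are passed in as callables, so no HTTP client is assumed; resolve_patient, its parameters and the cid:patient-… naming are all hypothetical, and raising an error on multiple matches is just one possible policy.

```python
def resolve_patient(identifier, search, create_remote=None, same_server=True):
    """Return (reference, is_new) for Observation.subject.

    search(identifier) -> list of matching patient ids; a hypothetical wrapper
        around GET [patientserver]/patient?identifier=...
    create_remote(identifier) -> id assigned by a separate patient server.
    """
    matches = search(identifier)
    if len(matches) == 1:
        return matches[0], False
    if len(matches) > 1:
        # Policy decision: here we reject and leave it to a human.
        raise LookupError(f"{len(matches)} patients match {identifier!r}; needs review")
    if same_server:
        # cid: marks a new resource carried in the bundle for the server to save.
        return f"cid:patient-{identifier}", True
    if create_remote is None:
        raise LookupError("No match and no way to create the patient remotely")
    return create_remote(identifier), True


print(resolve_patient("100005056", lambda ident: ["patient/42"]))
print(resolve_patient("100005056", lambda ident: []))
```

In a real implementation the create step would also need the PID data to populate the new Patient, and the policy checks would sit in front of both create branches.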

Note that it’s possible to offload all this lookup to the server by using the internal search facility of a transaction (we talked about that here) – but I’m not sure that functionality applies for message processing, and in any case we can’t assume that all servers will have that functionality, so we’ll do it the long way.

And also note that we are assuming that the server processing the message has the ability to create Patient resources locally. If not, then we’ll need to create the Patient as a separate step as we did with a remote Patient server. (We’ll think about message processing in more detail when we think about workflow in another post)

Provider

The observation has a performer element which is the person/device who performed the observation. In our case this will be a Provider though a device is another common use – we’ll think about the implications of that in another post.

The same considerations will apply to the Provider as they did for the Patient in terms of finding a resource and providing it as a reference to the Observation, but there are a couple of extra hooks:

There are more possible locations in the v2 message where we can get ‘performer’ information – and which one we choose will depend on the type of message (check out the mapping in the spec for some of the possibilities).
And depending on which one we choose – we’ll likely have less information about the provider in our message, making it harder to create one if it doesn’t already exist in our system (and the local policy allows this)

In our case we’ll choose OBX-16 – responsible observer. This has the added advantage that the v2 datatype for this field is XCN (Extended Composite ID Number and Name for Persons), which means that we should have enough detail to create a provider resource if we need to (provided that all the fields in the message are populated of course).
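As a small illustration of what OBX-16 gives us, here is a sketch that splits an XCN value into an identifier and name parts. It only looks at the first six components (ID, family name, given name, middle name, suffix, prefix), and the parse_xcn name and the output layout are assumptions, not the shape of any particular FHIR resource.

```python
def parse_xcn(value):
    """Split an XCN value (e.g. OBX-16) into identifier and name parts."""
    parts = value.split("^")
    parts += [""] * (6 - len(parts))  # pad so short values still unpack
    ident, family, given, middle, suffix, prefix = (
        "" if p == '""' else p for p in parts[:6]  # "" is the v2 explicit null
    )
    return {
        "identifier": ident,
        "name": {
            # The family name can carry sub-components after "&"; keep the surname.
            "family": family.split("&")[0],
            "given": [g for g in (given, middle) if g],
            "prefix": prefix,
            "suffix": suffix,
        },
    }


performer = parse_xcn("75929^Gosselin^Angelina")
print(performer)
```

With the sample message this yields an identifier of 75929 plus a family and given name, which is enough to search for the provider and, if policy allows, to create one.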

Last Words

When I started this exercise, I assumed that the mapping would be straightforward – and at a high level it is reasonably so. However, like all v2 implementations there is a considerable amount of choice when mapping – perhaps not too surprising given the complexity of healthcare. Even if vendors provide tooling with ‘standard’ mappings and automated narrative generation, I expect that most implementations are going to require some ‘tweaking’ of those mappings for individual resources…

And many of the choices made in this post can certainly be challenged!

The question of ‘policy’ arose quite a bit: what systems should do with new Patients, with multiple patients that share the same identifier, or when a Provider can’t be found. It turned out to be rather more important than I had anticipated.

Which reminds me of Grahame’s second law (he’s obviously a fan of Asimov): You can move complexity around, but you can’t make it go away.