Saturday, November 22, 2008

The DICOM Exposure attribute fiasco

Summary: The original ACR-NEMA standard specified ASCII numeric data elements for Exposure, Exposure Time and X-Ray Tube Current that could be decimal values; for no apparent reason DICOM 3.0 in 1993 constrained these to be integers, which for some modalities and subjects are too small to be sufficiently precise; CPs and supplements have been adding new data elements ever since to fix this with different scaling factors and encodings, so now receivers are faced with confusion; ideally receivers should look for all possible data elements and choose to display the most precise. Next time we do DICOM, we will do it right :)

Long Version:

Just how difficult can those of us who write standards for a living actually make an implementer's life ? Pretty difficult, is the answer, though largely this occurs as we strive to avoid breaking the installed base of existing applications that might never be upgraded.

Today I was responding to a question from a software engineer at a vendor of veterinary radiology equipment who had come to realize that the "normal" attribute for encoding Exposure Time was insufficiently precise, given that it was restricted to being an Integer String, and small things, like cats, may have exposure times shorter than a whole second. I say "normal attribute", because the original CR IOD, and most other IODs since, have used this and other attributes with similarly constrained encoding to describe X-Ray technique, and in some cases made these attributes mandatory or conditional. The attributes I am talking about are:

  • Exposure (0018,1152), which is IS VR
  • Exposure Time (0018,1150), which is IS VR
  • X-Ray Tube Current (0018,1151), which is IS VR
This problem was realized not too long after the standard was published and the resulting fix was published as final text in CP 77 in 1996, entitled "Wrong VR for exposure parameters". So, what's the problem, you might ask, it's fixed right ? Well, the problem is the nature of the fix.

A naive approach would be to just change the VR for the existing data element, say from Integer String (IS) to Decimal String (DS), which would then allow fractional values. The problem with this solution would be that recipients that expected a string formatted in a particular manner might fail, for example if the parser, or display text field or database column did not expect decimal values. I.e., existing implementations might be broken, which is something we always try to avoid when "correcting" the standard.
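
To make that failure mode concrete, here is a minimal Python sketch of a hypothetical legacy receiver that assumes the value always arrives as an Integer String. The function names parse_is and legacy_receiver_accepts are invented for illustration and do not come from any real toolkit.

```python
def parse_is(value: str) -> int:
    """Parse a DICOM Integer String (IS) the way a strict legacy receiver might."""
    # IS values may carry leading or trailing spaces, so strip before converting.
    return int(value.strip())


def legacy_receiver_accepts(value: str) -> bool:
    """Return True if the strict integer parser can handle the value."""
    try:
        parse_is(value)
        return True
    except ValueError:
        # A decimal point is enough to make int() reject the string.
        return False


if __name__ == "__main__":
    print(legacy_receiver_accepts("12 "))  # value encoded as IS
    print(legacy_receiver_accepts("0.5"))  # same element silently changed to DS
```

A database column typed as an integer, or a fixed-width display field, would be broken in much the same way by a silent change of VR.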

You might well ask why the standard makes the distinction between integer strings and decimal strings in the first place, or indeed allows for both binary and string encoding of integers and floating point values. For example, a number might be encoded as an integer string (IS), decimal string (DS), unsigned 16 bit short (US) or 32 bit long (UL) or signed 16 bit (SS) or signed 32 bit (SL) binary integer, or as a 32 bit (FL) or 64 bit (FD) IEEE floating point binary value. The original ACR-NEMA standard had fewer and less specific encoding choices; it specified only four choices for value representation, 16 bit binary (BI), 32 bit binary (BD), ASCII numeric (AN) and ASCII text (AT). Note that there was no distinction between signed and unsigned binary values, and no distinction between integer and decimal string numeric values, and no way to encode floating point values in a binary form (indeed the standard for encoding binary floating point values, IEEE 754, was released in the same year as the first ACR-NEMA standard, 1985, and certainly was not universally adopted for many years). Anyway, if you review the list of data elements, the authors of the ACR-NEMA standard seem to have taken the approach of encoding:
  • structural elements related to the encoding of the message (like lengths and offsets) and pixel value related (rows, columns, bits allocated) stuff as binary (16 or 32 bit as appropriate),
  • "real world" things as ASCII numeric, even things that could have been binary integers like counts of numbers of images, etc.
In ACR-NEMA, there was no indication of whether or not ASCII numeric values could be integers or decimal values or whether one or the other made sense. The authors of DICOM, in attempting to maintain some semblance of backward compatibility with ACR-NEMA and at the same time apply more precise constraints, re-defined all ACR-NEMA data elements of VR AN as either IS or DS, the former being the AN integer numbers (with new size constraints), and the latter being the AN fixed point and floating point numbers. In the process of categorizing the old data elements into either IS or DS, not only were the obvious integers (like counts of images and other things) made into integers, but it appears that any "real world" attribute that in somebody's expert opinion did not need greater precision than a whole integer was so constrained as well. If you look at the original 1993 Part 6 Data Dictionary, you will see a surprising number of these, not just the exposure-related data elements, but also other things like cine rates, R-R intervals, generator power, focal distance, velocities, depths of scan field, etc. It is hard to know what drove the decisions to constrain these, but perhaps it was related to the fact that many of the data elements were literal translations of what vendors already included in their own proprietary image file formats, and if some engineer in pre-historic times had allocated an integer rather than a fixed or floating point value for something, that arbitrary constraint found its way into the standard without much further evaluation or consideration.
Alternatively, the authors may have been of the common mindset that it was helpful to recipients to constrain the size, length or value range of data elements to the greatest extent possible, something that now seems counter-productive in a world of nearly unlimited bandwidth, storage capacity and computing power, but in the recent past could have been perceived as a significant performance benefit, even in an interchange standard.

Unfortunately, even though the DICOM standard introduced the concept of sending not only the value of a data element but also its type in the message, using the so-called "explicit value representation" transfer syntaxes, the new standard continued to support, and indeed require as the default, the "implicit value representation" that was equivalent to the way some vendors had implemented the ACR-NEMA standard over the network. Requiring only explicit VR would have allowed recipients to use the VR transmitted to decide what to do with the value, and opened the door to "fixing" incorrect VRs in the data dictionary. One could have required that recipients check and use the explicit VR. Unfortunately, by permitting implicit VR transfer syntaxes, the VR has to remain fixed forever, otherwise receivers have no way of knowing what to do with a value that is of an unexpected form. I am told that there was significant discussion of this issue with respect to the 1992 RSNA demonstration, and that implicit VR was allowed for the demonstration to maximize participation, with the intent that it not be included in the standard published in 1993, but there was not sufficient support to follow through with this improvement after all. In hindsight it is easy to criticize this short-sighted decision. On interchange media, added in 1995, only explicit VR transfer syntaxes are permitted, but by then it was too late.
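
The difference between the two families of transfer syntax is easy to see at the byte level. The following Python sketch encodes a single string-valued data element both ways, for the little endian case and only for VRs with the short 16 bit length form (such as IS and DS). The helper names are hypothetical, and this is an illustration rather than a complete encoder.

```python
import struct


def pad_even(text: str) -> bytes:
    """String values are padded with a trailing space to an even length."""
    raw = text.encode("ascii")
    return raw if len(raw) % 2 == 0 else raw + b" "


def encode_implicit_le(group: int, element: int, text: str) -> bytes:
    """Implicit VR Little Endian: tag, 32 bit length, value. No VR is sent,
    so the receiver must look the VR up in its own data dictionary."""
    value = pad_even(text)
    return struct.pack("<HHI", group, element, len(value)) + value


def encode_explicit_le(group: int, element: int, vr: str, text: str) -> bytes:
    """Explicit VR Little Endian (short form): tag, two character VR,
    16 bit length, value. The receiver can read the VR from the message."""
    value = pad_even(text)
    header = struct.pack("<HH2sH", group, element, vr.encode("ascii"), len(value))
    return header + value


if __name__ == "__main__":
    # Exposure Time (0018,1150) sent as the integer string "5".
    print(encode_implicit_le(0x0018, 0x1150, "5").hex(" "))
    print(encode_explicit_le(0x0018, 0x1150, "IS", "5").hex(" "))
```

In the implicit case nothing in the bytes says whether "5" is an IS or a DS, which is why the dictionary VR cannot safely be changed after the fact.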

So what does all this mean for our exposure-related attributes ? Given that one cannot reasonably change the VR of an existing data element, the only option was to add a new one. So this is what CP 77 did:
  • it described the problem with all three data elements
  • it described the historic lack of constraints in ACR-NEMA
  • it only fixed the problem for one of the data elements (Exposure (0018,1152)), without further explanation as to why only that one was addressed
  • it added a new data element, Exposure in μAs (0018,1153), to the data dictionary and added it as an optional attribute in the CR Image Module
  • it defined the new attribute to have a scaling factor 1,000 different than the original attribute, which was defined to be in mAs (as is normally displayed to the user)
  • it gave the new attribute a VR of IS
You might well ask
  • why CP 77 didn't just make the new data element a DS, keep the same units that were used previously and that are the normal units in which a user expects to see the value displayed ?
  • why not just call the data element something like Exposure (Decimal), or indeed use the same name and rename the old one to Exposure (Retired) or similar ?
  • why was the old attribute in the CR Image Module not simply retired or deprecated in some other way ?
I have no good answers to these questions, but unfortunately the CP 77 approach set a precedent for all subsequent changes of this type, which include the data elements listed in but not fixed by CP 77, which is perhaps why we have ended up with:
  • Exposure Time in μS (0018,8150), which is DS VR
  • Exposure in μAs (0018,1153), which is IS VR
  • X-Ray Tube Current in μA (0018,8151), which is DS VR
Thankfully, CP 187, which introduced the new data elements, did not repeat the same mistake of using an IS rather than DS VR, but did perpetuate the notion of adding a different scaling factor to disambiguate the new data element from the old. I have to take responsibility for this particular piece of stupidity, since I was doing the editing for the DX supplement and probably this CP also at the time. Surprisingly, and I can't remember why (probably an oversight on my part), though Exposure in μAs (0018,1153) got propagated into the CR and CT IODs, Exposure Time in μS (0018,8150) and X-Ray Tube Current in μA (0018,8151) did not, which often causes implementers reading PS 3.3 not to realize that these can be used to solve any precision problems for time and current as well as exposure. Another CP on this subject is probably in order.

There are several other problems than the VR and the scaling factor with this approach of fixing inappropriate VRs by adding optional attributes that mean the same thing as what they are intended to "replace", without actually retiring and removing the old attribute. Specifically:
  • How is a poor receiver to know which to use if it receives both (the sensible answer is to use the more precise one instead of the less precise one, but the standard does not require that) ?
  • What about an old receiver that has never heard of the new attribute (it will display the old less precise one) ?
  • Should a sender send both a less precise and a precise value, just to be able to allow such old receivers to display something rather than nothing (almost certainly yes) ?
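
As a rough illustration of the "send both" answer, here is a Python sketch of a sender that starts from an exposure in mAs and fills in both Exposure (0018,1152) and Exposure in μAs (0018,1153). The function name, the use of a plain dictionary keyed by tag, and the choice of round-half-up for the legacy integer are all assumptions made for the example; the standard does not say how to round.

```python
from decimal import Decimal, ROUND_HALF_UP

EXPOSURE = (0x0018, 0x1152)         # IS, in mAs
EXPOSURE_IN_UAS = (0x0018, 0x1153)  # IS, in μAs


def encode_exposure(exposure_mas: str) -> dict:
    """Return string values for both the legacy and the more precise attribute."""
    mas = Decimal(exposure_mas)
    # Legacy attribute: whole mAs only, so precision is lost here (assumed rounding).
    legacy = mas.to_integral_value(rounding=ROUND_HALF_UP)
    # CP 77 attribute: same quantity scaled by 1,000, still an integer string.
    micro = (mas * 1000).to_integral_value(rounding=ROUND_HALF_UP)
    return {
        EXPOSURE: str(int(legacy)),
        EXPOSURE_IN_UAS: str(int(micro)),
    }


if __name__ == "__main__":
    # A small exposure: the legacy value collapses to "0", the new one keeps "400".
    print(encode_exposure("0.4"))
```

An old receiver reading only (0018,1152) still gets something to display, while a newer one can prefer (0018,1153).
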
If you think this is unfortunate, guess what, with the new Enhanced IODs we decided to make things even "better" by introducing yet more new attributes, this time with a more conventional scaling factor but an FD value representation. These are used in the Enhanced CT IOD, as well as the new Enhanced XA/XRF, 3D X-Ray and similar IODs:
  • Exposure Time in ms (0018,9328), which is FD VR
  • X-Ray Tube Current in mA (0018,9330), which is FD VR
  • Exposure in mAs (0018,9332), which is FD VR
Note that this is not nearly as bad as it sounds, because these new attributes only occur nested inside the per-frame and shared functional group sequences, and hence will not occur in the "top level" dataset in a manner that might confuse receivers. Receivers of enhanced IOD images need to extract all their technique, positioning and other frame-specific annotation information from such sequences, and hence should always use the new attributes and never need to worry about encountering the old ones. These attributes are also mandatory attributes by the way, as is the convention with all of the Enhanced family of objects. The use of FD (or FL) rather than DS, by the way, has been the policy of WG 6 for some time now when introducing new non-integer numeric data elements, since the use of binary IEEE floats eliminates any ambiguity in encoding or parsing funky string values that are not described for DS, like infinity or NaN.

The problem with these new data elements is that now that they are in the data dictionary, some creative implementers of non-enhanced images have started to stuff them into the "old" IODs in order to send values with greater precision, instead of sending the intended CP 77 and CP 187 data elements. Strictly speaking this is legal as a so-called "Standard Extended SOP Class", but it creates an even greater problem for the receivers. When I first encountered someone doing this, I added a specific check to my dciodvfy validator to display an error if these attributes are present when they should not be in the DX IOD, and I have subsequently added the check to other "old" IODs as well, including CR, XA/XRF and CT; I also implemented some limited consistency checking when multiple attributes for the same concept are present, since I encountered examples where completely different values were present that made no sense at all. As more and more modalities implement the Enhanced family of objects, however, and include the ability to "fall back" to sending the "old" objects if the SCP does not support the new ones, and do it by copying the "new" attributes from the functional group sequences into the top level datasets of old IOD objects rather than converting them to the "old" attributes, we may see more proliferation of a multitude of different data elements in which the exposure parameters might be encoded.
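
One simple form such a consistency check could take is sketched below in Python. This is not how dciodvfy does it; the function name and the tolerance are assumptions for illustration. The idea is that the legacy integer value and the more precise value, once converted to the same unit, should differ by less than one whole unit, whether the sender rounded or truncated.

```python
def exposure_values_consistent(exposure_mas: int, exposure_uas: int) -> bool:
    """Compare Exposure (0018,1152), in mAs, with Exposure in μAs (0018,1153).

    The legacy value is an integer number of mAs, so it can legitimately
    differ from the precise value by rounding or truncation, but by less
    than one whole mAs (assumed tolerance).
    """
    precise_mas = exposure_uas / 1000
    return abs(exposure_mas - precise_mas) < 1.0


if __name__ == "__main__":
    print(exposure_values_consistent(3, 2500))    # legacy value was rounded up
    print(exposure_values_consistent(250, 2500))  # scaling mix-up in the sender
```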

So back to the problem of what a poor receiver (of non-enhanced IOD images) is to do ? The bottom line in my opinion is that a modern receiver should check for the presence of any of the alternative attributes that encode the exposure parameters, and use whatever it finds in order of greater precision. I implemented this rather crudely recently in the com.pixelmed.display.DemographicAndTechniqueAnnotations class in my PixelMed toolkit, if you are interested in taking a look at one approach to this; look for the use of the getOneOfThreeNumericAttributesOrNull() method.
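
For readers who prefer to see the idea rather than read the Java, here is a minimal Python sketch of the same general strategy. It is not a port of the PixelMed method; the names, the plain dictionary standing in for a parsed dataset, and the assumed preference order (FD attribute first, then the micro-unit attribute, then the legacy integer) are all choices made for this example.

```python
from typing import List, Mapping, Optional, Tuple

Tag = Tuple[int, int]

# Each entry is (tag, divisor to convert the stored value to the display unit),
# listed in the assumed order of preference.
EXPOSURE_TIME_MS: List[Tuple[Tag, float]] = [
    ((0x0018, 0x9328), 1.0),     # Exposure Time in ms, FD
    ((0x0018, 0x8150), 1000.0),  # Exposure Time in μS, DS
    ((0x0018, 0x1150), 1.0),     # Exposure Time, IS, ms
]
TUBE_CURRENT_MA: List[Tuple[Tag, float]] = [
    ((0x0018, 0x9330), 1.0),     # X-Ray Tube Current in mA, FD
    ((0x0018, 0x8151), 1000.0),  # X-Ray Tube Current in μA, DS
    ((0x0018, 0x1151), 1.0),     # X-Ray Tube Current, IS, mA
]
EXPOSURE_MAS: List[Tuple[Tag, float]] = [
    ((0x0018, 0x9332), 1.0),     # Exposure in mAs, FD
    ((0x0018, 0x1153), 1000.0),  # Exposure in μAs, IS
    ((0x0018, 0x1152), 1.0),     # Exposure, IS, mAs
]


def first_available(dataset: Mapping[Tag, object],
                    candidates: List[Tuple[Tag, float]]) -> Optional[float]:
    """Return the first usable candidate value, converted to the display unit."""
    for tag, divisor in candidates:
        raw = dataset.get(tag)
        if raw is None:
            continue
        try:
            return float(str(raw).strip()) / divisor
        except ValueError:
            # Empty or malformed value: fall through to the next candidate.
            continue
    return None


if __name__ == "__main__":
    # Both the legacy and the CP 187 attribute are present; the legacy one is "0".
    dataset = {(0x0018, 0x1150): "0", (0x0018, 0x8150): "500"}
    print(first_available(dataset, EXPOSURE_TIME_MS))
```

Converting everything to the conventional display unit (ms, mA, mAs) at this point means the annotation code never needs to know which attribute the value came from.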

If the foregoing sounds a little critical and sarcastic, it is intended to be. I continue to amaze myself with my own poor expedient decisions, lack of consistency and frequent carelessness when working on corrections and additions to the DICOM standard, and so this missive is intended to be as self-deprecating as it is critical of my contemporaries and predecessors. Much as we would like to change DICOM to make it "perfect", the need to correct problems and add functionality yet avoid breaking things that already work and avoid raising the implementation hurdle too high to be realistic are overriding; the result of compromise is significant "impurity".

If we ever had the chance to start DICOM all over again and "do it right", I am sure that despite our best intentions we would still manage to screw it up in equally egregious ways. We sometimes joke about doing a new standard called just "4", so-called because it would be the successor to DICOM 3.0, would not necessarily be just about images, and which would be an opportunity to skip past the morass that is HL7 version 3. I doubt that we would really do much better and would no doubt encounter Fred Brooks' "second system syndrome". Indeed, DICOM 3.0 being the successor to ACR-NEMA already suffers in that respect, perhaps being accurately described as an "elephantine, feature-laden monstrosity". From what little I know about HL7 v3, it is not exempt either.

David

2 comments:

drozzy said...

Some bugs in your post:
-"real world" things as ASCII numeric, even things [things] that could"
-"and I have subsequently the check "

Also should the com.pixelmed.display.DemographicAndTechniqueAnnotations implement Iterable?

drozzy said...

FYI: I am new to Dicom and pretty young developer too.

I have just one question for you: how do you manage to learn so much about this! How do you know so many overly complex things about the weirdest decisions! Most importantly - how do you not fall asleep while looking at this EXTREMELY boring subject?

It is really nice to see a post like this where the "standard making" person is caring about the implementer.

Dealing with Dicom is very scary and one simply loses breath when looking at the pile of documentation and specification.

It is an understatement to say that Dicom is an "elephantine, feature-laden monstrosity", primarily because it has nowhere the complexity of an operating system.
In my ignorant opinion one can probably compare it better with the XML standard or something like Atom/RSS specification.

It is very easy to criticize something as one does not need to be an expert to do so. So please excuse me, I am just very frustrated.

The first time I heard about Dicom, and then learned how old it was - I thought it was impossible. In my mind I thought "surely someone out there has found a better way", write an alternative, competing with Dicom specification.
Maybe there is no "better way", but I am amazed that no-one would even try.

PS: If you get a chance in the future, please post on the decision to send the "data-types" along with the values. Why could it not be just stated in the specification?