Saturday, October 31, 2015

The slings and arrows of outrageous radiologists - I want my FLA.

Summary: We don't need fewer arrows. We need more arrows more often. And we need better arrows (in the sense that they are hyperlinked to the findings in the report when images are rendered, i.e., are Findings-Linked Annotations (FLA)). The term "arrows" being a surrogate for "visual indication of location".

Long Version.

I came across the strangest article about "arrows" in EJR.

Now, I don't normally read EJR because it is a little expensive, it doesn't come along with any professional society membership I have, I don't work at an institution that gets it, most of its articles are not open access (there is an EJR Open companion journal though), and it doesn't have a lot of informatics content. But this paper happened to be quoted in full for some reason on Aunt Minnie Europe, so I got a squizz without having to wait to receive a pre-print from the authors via ResearchGate or some other mechanism.

The thesis of the radiologist authors seems to be that "arrows" on images pointing to findings are a bad thing, and that recipients of the report should read the report instead of having access to such visual aids.

This struck me as odd, from the perspective of someone who has spent the last two decades or so building and evangelizing about standards and systems to do exactly that, i.e., to make annotations on images and semantically link them to specific report content so that they can be visualized interactively (ideally through DICOM Structured Reports, less ideally through the non-semantic but more widely available DICOM Softcopy Presentation States, and in the worst case in a pre-formatted multimedia rather than plain text report).

What are the authors' arguments against arrows? To summarize (fairly I hope), arrows:
  • are aesthetically ugly, especially if multitudinous, and may obscure underlying features
  • draw attention from unmarked less obvious findings (may lead to satisfaction of search)
  • are not a replacement for the more detailed account in the report
  • are superfluous in the presence of the more detailed account in the report
  • might be removed (or not be distributed)
  • detract from the role of the radiologist as a "readily accessible collaborator"
For the sake of argument, I will assume that what the authors mean by "arrows" includes any "visual indication of location" rendered on an image, passively or interactively. They actually describe them as "an unspoken directional signal".

The authors appear to conflate the presence of arrows with either the absence of, or perhaps the ignorance of, the report ("relying on an arrow alone as a manifestation of our special capabilities", "are merely a figurative crutch we can very well do without").

I would never assert that arrows alone (or any form of selective annotation) substitute for a good report, nor, it would seem to me, would it be best or even common practice to fail to produce a full report. The implication in the paper seems to be that when radiologists use arrows (that they expect will be visible to the report recipient), they record less detail about the location in the report, or the recipient does not read the report. Is that actually the case? Do the authors put forth any evidence to support that assertion? No, they do not; nor any evidence about what recipients actually prefer.

I would completely agree with the authors that there is an inherent beauty in many images, and they are best served in that respect unadorned. That's why we have buttons to toggle annotations on and off, including not only arrows but those in the corners for demographics and management as well. And why lead markers suck. And who really cares whether we can check to see if we have the right patient or not? OK, so there are safety issues to consider, but that's another story.

As for concerns about satisfaction of search, one could equally argue that one should not include an impression or conclusion in a report either, since I gather few recipients will take the time to read more than that. Perhaps they should be forced to wade through reams of verbosity just in case they miss something subtle not restated in its entirety in the impression anyway. And there is no rule that says one can't point out subtle findings with arrows too. Indeed, I was led to believe during my training that it was the primary interpreting radiologist's function (and major source of added value) to detect, categorize and highlight (positively or negatively) those subtle findings that might be missed in the face of the obvious.

Wrt. superfluousness, I don't know about you, but when I read a long prose description in a report that attempts to describe the precise location of a finding, whether it uses:
  • identifiers ("in series 3, on slice 9, approximately 13.8 mm lateral to the left margin of the descending aorta", which assumes incorrectly that the recipient's viewer numbers things the same way the radiologist's does),
  • approximate regions ("left breast MLO 4 o'clock position"), or
  • anatomical descriptions ("apical segment of the right lower lobe")
even if I find something on the image that is plausibly or even undeniably associated with the description, I am always left wondering if I am looking at exactly the same thing as the reporting radiologist is talking about, and with the suspicion that I have missed something. My level of uncertainty is significantly higher than it needs to be. Arrows are not superfluous, they are complementary and add significant clarity.

Or to put it another way, there is a reason the wax pencil was invented.

In my ideal world, every significant localized finding in a report would be intimately linked electronically with a specific set of coordinates in an image, whether that be its center (which might be rendered as an arrow, or a cross-hair, or some other user interface element), or its outline (which might be a geometric shape like an ellipse or rectangle, or an actual outline or filled in region that has been semi-automatically segmented, if volume measurements are reported). Further, the display of such locations would be under my interactive control as a recipient (just as one turns on and off CAD marks, or applies presentation states selectively); this would address the "aesthetic" concern of the annotation obscuring underlying structure.
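To make the idea concrete, here is a minimal sketch in Java of the kind of data structure that a findings-linked annotation implies; the class and field names are purely illustrative and not drawn from any existing toolkit or standard:

import java.util.List;

// Illustrative only: a finding in the report linked to coordinates on a referenced image.
public class FindingLinkedAnnotation {

    public enum GraphicType { POINT, ELLIPSE, POLYLINE }    // rendered as an arrow/cross-hair, an outline, etc.

    private final String findingText;               // the sentence or phrase in the report
    private final String codedFinding;              // optional coded concept for the finding
    private final String sopInstanceUID;            // which image the coordinates refer to
    private final int frameNumber;                  // which frame, if multi-frame
    private final GraphicType graphicType;
    private final List<float[]> imageCoordinates;   // (column,row) pairs in image space

    public FindingLinkedAnnotation(String findingText, String codedFinding,
            String sopInstanceUID, int frameNumber,
            GraphicType graphicType, List<float[]> imageCoordinates) {
        this.findingText = findingText;
        this.codedFinding = codedFinding;
        this.sopInstanceUID = sopInstanceUID;
        this.frameNumber = frameNumber;
        this.graphicType = graphicType;
        this.imageCoordinates = imageCoordinates;
    }

    public String getFindingText()             { return findingText; }
    public String getCodedFinding()            { return codedFinding; }
    public List<float[]> getImageCoordinates() { return imageCoordinates; }
    // A viewer would toggle rendering of these on demand, and a report display would
    // scroll to and highlight the corresponding findingText when the graphic is selected.
}

This is, in essence, the pattern that DICOM SR already expresses, with TEXT or CODE content items that have SCOORD children referencing the images from which the coordinates were selected.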

We certainly have the standards. Coordinate references in reports were one of the core elements of Dean Bidgood's Structured Reporting (SR) initiative in DICOM ("Documenting the information content of images", 1997). I used a (contrived) example of a human-generated report to emphasize the point in Figure 1 of my 2000 DICOM SR textbook (long due for revision, I know). There was even work to port the DICOM SR coordinate reference pattern into HL7 CDA (although of late this has been de-emphasized in favor of leaving these in the DICOM realm and referencing them, e.g., in PS3.20).

Nor is this beyond the state of the art of authoring and rendering applications, even if it is not commonly implemented or used. The primary barriers to adoption seem to be:
  • the diversity of the heterogeneous mix of image display, voice reporting and report display systems that are difficult to integrate tightly enough to achieve this,
  • coupled with the real or perceived difficulty of enabling the radiologist to author more highly linked content without reducing their "productivity" (as currently incentivized).
In a world in which the standard of care in the community is the fax of a printed report, possibly coupled with a CD full of images with a brain-dead viewer (and no presentation state or structured report coordinate rendering), the issue of any arrows at all is probably moot. The financial or quality incentives are focused on embellishing the report not with clinically useful content but instead with content for reimbursement optimization. The best we can probably do for these scenarios is the (non-interactive) "multimedia report", i.e., the one that has the selected images or regions of images pre-windowed and embedded in the report with arrows and numbers shared with the findings in the prose, or similar. An old concept once labelled as an "illustrated" report, recently revisited or renamed (MERR), but still rarely implemented AFAIK.

Even within a single enterprise, the "hyperlink" between specific findings in the report content and the image annotations is usually absent. The EHR and PACS may be nominally "integrated" to the point of being able to trigger the PACS viewer whilst reading the report (whether to get Well Meaningful Use Brownie Points or to actually serve the needs of the users), and the PACS may be able to render the radiologist's arrows (e.g., if they are stored as presentation states in the PACS). While this scenario is way better than having no arrows at all, it is not IMHO as good as "findings-linked annotations" (let's call them FLA, since we need more acronyms like we need a hole in the head). Such limited integrated deployments are typically present when the lowest common denominator for "report interchange" is essentially the same old plain text report, perhaps "masquerading" as something more sophisticated (e.g., by wrapping the text in CDA or DICOM SR, with or without a few section headings but without "semantic" links from embedded findings to image coordinates or references to softcopy presentation states).

Likewise, though the radiology and cardiology professional societies have been strongly pushing so-called "structured reporting" again lately, these efforts are pragmatic and only an incremental extension to the lowest common denominator. They are still essentially limited to standardization of layout and section headings, and do not extend to visual hyperlinking of findings to images. Not to dismiss the importance of these efforts; they are a vital next step, and when adopted offer valuable improvements, but IMHO they are not sufficient to communicate most effectively with the report recipients.

So, as radiologists worry about their inevitable outsourcing and commodification, perhaps they should be more concerned about how to provide added value beyond the traditional verbose prose, rather than bemoaning the hypothetical (if not entirely spurious) disadvantages of visual cues. The ability to "illustrate" a report effectively may become a key component of one's "competitive advantage" at some point.

I suggest that we need more FLA to truly enable radiologists to be "informative and participatory as caregivers, alerting our colleagues with more incisiveness and counsel" (paraphrasing the authors). That is, to more effectively combine the annotations and the report, rather than to exaggerate the importance of one over the other.


PS. Patients read their reports and look at their images too, and they really seem to like arrows, not many of them being trained anatomists.

PPS. I thought for a moment that the article might be a joke, and that the authors were being sarcastic, but it's Halloween, not April Fool's; the paper was submitted in August and repeated on Aunt Minnie, so I guess it is a serious piece with the intention of being provocative rather than being taken literally. It certainly provoked me!

PPPS. Do not interpret my remarks to in any way advocate a "burned in" arrow, i.e., one that replaces the original underlying pixel values and which is then sent as the only "version" of the image; that is obviously unacceptable. I understand the authors' article to be referring to arrows in general and not that abhorrent encoding mechanism in particular.

Thursday, October 22, 2015

I think she's dead ... no I'm not ... Is PACS pining for the fiords?

Summary: The death of PACS, and its deconstruction, have been greatly exaggerated. Not just recently, but 12 years ago.

Long Version:

Mixing quotes from different Monty Python sketches (Death of Mary Queen of Scots, Pet Shop) is probably almost as bad as mixing metaphors, but as I grow older it is more effort to separate these early associations.

These lines came to mind when I was unfortunately reminded of one of the most annoying articles published in the last few years, "PACS in 2018: An Autopsy", which is in essence an unapologetic unsubstantiated promotion of the VNA concept.

Quite apart from the fact that nobody can agree on WTF a VNA actually is (despite my own lame attempt at a retrospective Wikipedia definition), this paper is a weird collage of observable technological trends in standards and products, marketing repackaging of existing technology with new labels, and fanciful desiderata that lack real market drivers or evidence of efficacy (or the regulatory (mis-)incentives that sometimes serve in lieu).

That's fine though, since it is reasonable to discuss alternative architectures and consider their pros and cons. But wait, surprise, there is actually very little if any substance there? No discussion of the relative merits or drivers for change? Is this just a fluff piece, the sort of garbage that one might see in a vendor's press release or in one of those junk mail magazines that clutter one's physical mailbox? All hype and no substance? What is it doing in a supposedly peer-reviewed scientific journal like JDI?

OK, so it's cute, and it's provocative, and let's give the paper the benefit of the doubt and categorize it as editorial rather than scientific, which allows for some latitude.

And no doubt, somewhat like Keeping Up with the Kardashians and its ilk, since folks seem to be obsessed with train wrecks, it is probably destined to become the "most popular JDI article of all time".

And let's be even more generous and forgive the drawing of pretty boxes that smells like "Marchitecture". Or, that it would be hard for a marketing executive to draft a more buzzword compliant brochure. And perhaps as an itemized list of contemporary buzzwords, it has some utility.

My primary issue is with the title, specifically the mention of "autopsy".

Worse, the author's follow up at the SIIM 2015 meeting in his opening address entitled "The Next Imaging Evolution: A World Without PACS (As We Know It)" perpetuated this theme of impending doom for PACS, a theme that dominated the meeting.

Indeed, though the SIIM 2015 meeting was, overall, very enjoyable and relatively informative, albeit repetitive, the main message I returned home with was the existence of a pervasive sense of desperation among the attendees, many of whom seem to fear not just commoditization (Paul Chang's theme in past years) but perhaps even total irrelevance in the face of the emerging "threat" that is enterprise image management. I.e., PACS administrators and radiologists are doomed to become redundant. Or at least they are if they don't buy products with different labels, or re-implement the same solutions with different technology.

When did SIIM get hijacked by fear-mongers and doubters? We should be demanding more rigidly defined areas of doubt and uncertainty ... wait, no, wrong radio show.

OK, I get that many sites are faced with the challenge of expanding imaging beyond radiology and cardiology, and indeed many folks like the VA have been doing that for literally decades. And I get that Meaningful Use consumes all available resources. And that leveraging commodity technology potentially lowers barriers to entry. And that mobile devices need to be integrated. And that radiology will no longer be a significant revenue stream as it becomes a cost rather than profit center (oops, who said that). But surely the message that change may be coming can be spun positively, as an opportunity rather than a threat, as incremental improvement rather than revolution. Otherwise uninformed decision makers as well as uneducated worker bees who respond to hyperbole rather than substance, or who are seeking excuses, may be unduly influenced in undesirable or unpredictable ways.

More capable commentators than I have criticized this trend of hyping the supposed forthcoming "death of PACS", ranging from Mike Cannavo to Herman O's review of SIIM 2015 and the equally annoying deconstruction mythology.

Call me a Luddite, but these sorts of predictions of PACS demise are not new; indeed, I just came across an old RSNA 2003 abstract by Nogah Haramati entitled "Web-based Viewers as Image Distribution Solutions: Is PACS Already a Dead Concept?". Actually, encountering that abstract was what prompted me to write this diatribe, and triggered the festering irritation to surface. It is interesting to consider the current state of the art in terms of web viewing and what is currently labelled as "PACS" in light of that paper, considering it was written and presented 12 years ago. Unfortunately I don't have the slides, just the abstract, but I will let you know if/when I do get hold of them.

One has to wonder to what extent recent obsession with this morbid terminology represents irresponsible fear mongering, detachment from whatever is going on in the "real world" (something I am often accused of), self-serving promotion of a new industry segment, extraordinary popular delusions and the madness of crowds, or just a desire to emulate the breathless sky-is-falling reporting style that seems to have made the transition from cable news even to documentary narrative (judging by the "Yellowstone fauna are doomed" program we watched at home on Animal Planet the other night). Where is David Attenborough when you need him? Oh wait, I think he's dead. No he's not!


plus c'est la même chose

Sunday, October 4, 2015

What's that 'mean', or is 'mean' 'meaningless'?

Summary: The current SNOMED code for "mean" used in DICOM is not defined to have a particular meaning of mean, which comes to light when considering adding geometric as opposed to arithmetic mean. Other sources like NCI Thesaurus have unambiguously defined terms. The STATO formal ontology does not help because of its circular and incomplete definitions.

Long Version:

In this production company closing logo for Far Field Productions, a boy points to a tree and says "what's that mean?"

One might well ask when reading DICOM PS3.16 and trying to decide when to use the coded  "concept" (R-00317, SRT, "Mean") (SCT:373098007).

This question arose when Mathieu Malaterre asked about adding "geometric mean", which means (!) it is now necessary to distinguish "geometric" from "arithmetic" mean.

As you probably know, DICOM prefers not to "make up" its own "concepts" for such things, but to defer to external sources when possible. SNOMED is a preferred such external source (at least for now, pending an updated agreement with IHTSDO that will allow DICOM to continue to add SNOMED terms to PS3.16 and allow implementers to continue to use them without license or royalty payments, like the old agreement). However, when we do this, we do not provide explicit (textual or ontologic) definitions, though we may choose to represent one of multiple possible alternative terms (synonyms) rather than the preferred term, or indeed make up our own "code meaning" (which is naughty, probably, if it subtly alters the interpretation).

So what does "mean" "mean"?

Well, SNOMED doesn't say anything useful about (R-00317, SRT, "Mean") (SCT:373098007). The SNOMED "concept" for "mean" has parents:

  > Qualifier value (qualifier value) 
     > Descriptor (qualifier value) 
        > Numerical descriptors (qualifier value)

which doesn't help a whole lot. This is pretty par for the course with SNOMED, even though some SNOMED "concepts" (not this one) have (in addition to their "Is a" hierarchy), a more formal definition produced by other types of relationship (e.g., "Procedure site - direct", "Method"), etc. I believe these are called "fully defined" (as distinct from "primitive").

So one is left to interpret the SNOMED "term" that is supplied as best one can.

UMLS has (lexically) mapped SCT:373098007 to UMLS:C1298794, which is "Mean - numeric estimation technique", and unfortunately has no mappings to other schemes (i.e., it is a dead end). UMLS seems to have either consciously or accidentally not linked the SNOMED-specific meaningless mean with any of (C0444504, UMLS, "Statistical mean"), (C2347634, UMLS, "Population mean") or (C2348143, UMLS, "Sample mean").

There is no UMLS entry for "arithmetic mean" that I could find, but the "statistical mean" that UMLS reports is linked to the "mean" from NCI Thesaurus, (C53319, NCIt, "Mean"), which is defined textually as one might expect, as "the sum of a set of values divided by the number of values in the set". This is consistent with how Wikipedia, the ultimate albeit evolving source of all knowledge, defines "arithmetic mean".

SNOMED has no "geometric mean" but UMLS and NCI Thesaurus do. UMLS:C2986759 maps to NCIt:C94906.
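For the record, the distinction being drawn is between the arithmetic mean, (x1 + x2 + ... + xn)/n, and the geometric mean, (x1 × x2 × ... × xn)^(1/n), which for positive values coincide only when all the values are equal.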

One might expect that one should be able to do better than arbitrary textual definitions for a field as formalized as statistics. Sure enough I managed to find STATO, a general-purpose STATistics Ontology, which looked promising on the face of it. One can poke around in it on-line (hint: look at the classes tab and expand the tree), or download the OWL file and use a tool like Protégé.

If you are diligent (and are willing to wade through the Basic Formal Ontology (BFO) based hierarchy):

> continuant
  > dependent continuant
    > generic dependent continuant
      > information content entity
        > data item
          > measurement data item
            > measure of central tendency
              > average value

one finally gets to a child, "average value", which has an "alternative term" of "arithmetic mean".


But wait, what is its definition? There is a textual annotation "definition" that is "a data item that is produced as the output of an averaging data transformation and represents the average value of the input data".

F..k! After all that work, can you say "circular"? I am sure Mr. Rogers can.

More formally, STATO says "average value" is equivalent to "is_specified_output_of some 'averaging data transformation'". OK, maybe there is hope there, so let's look at the definition of "averaging data transformation" in the "occurrent" hierarchy (don't ask; read the "Building Ontologies with Basic Formal Ontology" book).

Textual definition: "An averaging data transformation is a data transformation that has objective averaging". Equivalent to "(has_specified_output some 'average value') or (achieves_planned_objective some 'averaging objective')".


Shades of lexical semantics (Cruse is a good read, by the way), and about as useful for our purposes :(

At least though, we know that STATO:'average value' is a sub-class of STATO:'measure of central tendency', which has a textual definition of "a measure of central tendency is a data item which attempts to describe a set of data by identifying the value of its centre", so I guess we are doing marginally better than SNOMED in this respect (but that isn't a very high bar). Note that in the previous sentence I didn't show "codes" for the STATO "concepts", because it doesn't seem to define "codes", and just uses the human-readable "labels" (but Cimino-Desiderata-non-compliance is a subject for another day).

In my quest to find a sound ontological source for the "concept" of "geometric mean", I was also thwarted. No such animal in STATO apparently, yet, as far as I could find (maybe I should ask them).

So not only does STATO have useless circular definitions but it is not comprehensive either. Disappointed!

So I guess the best we can do in DICOM for now, given that the installed base (especially of ultrasound devices) probably use (R-00317, SRT, "Mean") a lot, is to add text that says when we use that code, we really "mean" "mean" in the sense of "arithmetic mean", and not the more generic concept of other things called "mean", and add a new code that is explicitly "geometric mean". Perhaps SNOMED will add a new "concept" for "geometric mean" on request and/or improve their "numerical descriptors" hierarchy, but in the interim either the NCI Thesaurus term NCIt:C94906 or the UMLS entry UMLS:C2986759 would seem to be adequate for our purposes. Sadly, the more formal ontologies have not been helpful in this respect, at least the one I could find anyway.
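For what it is worth, the sort of disambiguation being proposed is trivial to express in code; a minimal sketch follows (the class and method names are illustrative, and the assumption that the NCI Thesaurus codes quoted above are the ones we would adopt is just that, an assumption):

import java.util.HashMap;
import java.util.Map;

// Illustrative only: normalize the legacy DICOM/SNOMED "mean" code to an unambiguous concept.
public class MeanCodeDisambiguator {

    // A coded concept as a (codeValue, codingSchemeDesignator, codeMeaning) triplet, as in DICOM.
    public static final String[] LEGACY_MEAN     = { "R-00317", "SRT",  "Mean" };
    public static final String[] ARITHMETIC_MEAN = { "C53319",  "NCIt", "Mean" };              // defined as sum/count
    public static final String[] GEOMETRIC_MEAN  = { "C94906",  "NCIt", "Geometric Mean" };

    private static final Map<String,String[]> normalization = new HashMap<>();
    static {
        // Per the proposal: treat the legacy code as meaning "arithmetic mean" ...
        normalization.put("SRT:R-00317", ARITHMETIC_MEAN);
        // ... and use an explicitly geometric code for the new concept.
        normalization.put("NCIt:C94906", GEOMETRIC_MEAN);
    }

    public static String[] normalize(String codingScheme, String codeValue) {
        String[] unambiguous = normalization.get(codingScheme + ":" + codeValue);
        return unambiguous != null ? unambiguous : new String[] { codeValue, codingScheme, "" };
    }
}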

Maybe we should also be extremely naughty and replace all uses of (R-00317, SRT, "Mean") in the DICOM Standard with (R-00317, SRT, "Arithmetic mean"), just to be sure there is no ambiguity in the DICOM usage (and suggest to SNOMED that they add it as an alternative term). This would be less disruptive to the DICOM installed base than replacing the inadequately defined SNOMED code with the precisely defined NCI Thesaurus code.


PS. I italicize "concept" because there is debate over what SNOMED historically and currently defines "concept" to be, quite apart from the philosophical distinctions made by "realist" and "idealist" ontologists (or is it "nominalists" and "conceptualists"). I guess you know you are in trouble when you invoke Aristotle. Sort of like invoking Lincoln I suppose (sounds better when James McEachin says it).

Sunday, October 26, 2014

Keeping up with Mac Java - Bundling into Executable Apps

Summary: Packaging a Java application into an executable Mac bundle is not difficult, but has changed over time; JavaApplicationStub is replaced by JavaAppLauncher; manually building the package content files and hand editing the Info.plist is straightforward, but the organization and properties have changed. Still irritating that JWS/JNLP does not work properly in Safari.

Long Version.

I have long been a fan of Macs and of Java, and I have a pathological aversion to writing single-platform code, if for no other reason than my favorite platforms tend to vanish without much notice. Since I am a command-line weenie, use XCode only for text editing and never bother much with "integrated development environments" (since they tend to vanish too), I am also a fan of "make", and tend to use it in preference to "ant" for big projects. I am sure "ant" is really cool but editing all those build.xml files just doesn't appeal to me. This probably drives the users of my source code crazy, but c'est la vie.

The relevance of the foregoing is that my Neanderthal approach makes keeping up with Apple's and Oracle's changes to the way in which Java is developed and deployed on the Mac a bit of a challenge. I do need to keep up, because my primary development platform is my Mac laptop, since it has the best of all three "worlds" running on it, the Mac stuff, the Unix stuff and the Windows stuff (under Parallels), and I want my tools to be as useful to as many folks as possible, irrespective of their platform of choice (or that which is inflicted upon them).

Most of the tools in my PixelMed DICOM toolkit, for example, are intended to be run from the command line, but occasionally I try to make something vaguely useful with a user interface (not my forte), like the DoseUtility or DicomCleaner. I deploy these as Java Web Start, which fortunately continues to work fine for Windows, as well as for Firefox users on any platform, but since an unfortunate "security fix" from Apple, is not so great in Safari anymore (it downloads the JNLP file, which you have to go find and open manually, rather than automatically starting; blech!). I haven't been able to find a way to restore JNLP files to the "CoreTypes safe list", since the "XProtect.plist" and "XProtect.meta.plist" files in "/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/" don't seem to be responsible for this undesirable change in behavior, and I haven't found an editable file that is yet.

Since not everyone likes JWS, and in some deployment environments it is disabled, I have for a while now also been creating selected downloadable executable bundles, both for Windows and the Mac.

Once upon a time, the way to do this and build Mac applications was with a tool that Apple supplied called "jarbundler". This did the work of populating the tree of files that constitute a Mac application "bundle"; every Mac application is really a folder whose name ends in ".app", and it contains various property files and resources, etc., including a binary executable file. In the pre-Oracle days, when Apple supplied its own flavor of Java, the necessary binary file was "JavaApplicationStub", and jarbundler would stuff that into the necessary place when it ran. There is obsolete documentation of this still available from Apple.

Having used jarbundler once, to see what folder structure it made, I stopped using it and just manually cut and pasted stuff into the right places for each new application, and mirrored what jarbundler did to the Info.plist file when JVM options needed to be added (such as to control the heap size), and populated the resources with the appropriate jar files, updated the classpaths in Info.plist, etc. Automating updates to such predefined structures in the Makefiles was trivial. Since I was using very little if anything that was Apple-JRE specific in my work, when Apple stopped doing the JRE and Oracle took over, it had very little impact on my process. So now I am in the habit of using various bleeding edge OpenJDK versions depending on the phase of the moon, and everything still seems to work just fine (putting aside changes in the appearance and performance of graphics, a story for another day).

Even though I have been compiling to target the 1.5 JVM for a long time, just in case anybody was still on such an old unsupported JRE, I finally decided to bite the bullet and switch to 1.7. This seemed sensible when I noticed that Java 9 (with which I was experimenting) would no longer compile to such an old target. After monkeying around with the relevant javac options (-target, -source, and -bootclasspath) to silence various (important) warnings, everything seemed good to go.

Until I copied one of these 1.7 targeted jar files into a Mac application bundle, and thought hey, why not rev up the JVMVersion property from "1.5+" to "1.7+"? Then it didn't work anymore and gave me a warning about "unsupported versions".

Up to this point, for years I had been smugly ignoring all sorts of anguished messages on the Mac Java mailing list about some new tool called "appbundler" described by Oracle, and the Apple policy that executable apps could no longer depend on the installed JRE, but instead had to be bundled with their own complete copy of the appropriate JRE (see this link). I was content being a fat dumb and happy ostrich, since things were working fine for me, at least as soon as I disabled all that Gatekeeper nonsense by allowing apps from "anywhere" to run (i.e., not just from the App Store, and without signatures), which I do routinely.

So, when my exposed ostrich butt got bitten by my 1.7 target changes (or whatever other incidental change was responsible), I finally realized that I had to either deal with this properly, or give up on using and sharing Mac executables. Since I have no idea how many, if any, users of my tools are dependent on these executables (I suspect not many), giving up wouldn't have been so bad except that (a) I don't like to give up so easily, and (b) occasionally the bundled applications are useful to me, since they support such things as putting it in the Dock, dragging and dropping to an icon, etc.

How hard can this be I thought? Just run appbundler, right? Well, it turns out the appbundler depends on using ant, which I don't normally use, and its configuration out of the box doesn't seem to handle the JVM options I wanted to specify. One can download it, and there is documentation for it. I noticed it seemed to be a little old (two years) and doesn't seem to be actively maintained by Oracle, which is a bit worrying. It turns out there is a fork of it that is maintained by others (infinitekind) that has more configuration options, but this all seemed to be getting a little more complicated than I wanted to have to deal with. I found a post from Michael Hall on the Mac Java developers mailing list that mentioned a tool he had written, AppConverter, which would supposedly convert the old to the new. Sounded just like what I needed. Unfortunately, it did nothing when I tried it (did not respond to a drag and drop of an app bundle as promised).

I was a bit bummed at this point, since it looked like I was going to have to trawl through the source of one of the appbundler variants or AppConverter, but then I decided I would first try and just cheat, and see if I could find an example of an already bundled Java app, and copy it.

AppConverter turned out to be useful after all, if only to provide a template for me to copy, since when I opened it up to show the Package Contents, sure enough, it was a Java application, contained a copy of the java binary executable JavaAppLauncher, which is what is used now instead of JavaApplicationStub, and had an Info.plist that showed what was necessary. In addition, it was apparent that the folder where the jar files go has moved, from being in "Contents/Resources/Java" to "Contents/Java" (and various posts on the Mac Java developers mailing list mentioned that too).
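For those who have never looked inside one of these bundles, the new-style structure boils down to something like the following (the application and jar names are illustrative, and I have omitted the optional bundled JRE):

DicomCleaner.app/
  Contents/
    Info.plist
    MacOS/
      JavaAppLauncher        (the binary executable named by CFBundleExecutable)
    Java/
      DicomCleaner.jar       (plus any other jars needed on the classpath)
    Resources/
      DicomCleaner.icns      (icon and any other resources)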

So, with a bit of manual editing of the file structure and the Info.plist, and copying the JavaAppLauncher out of AppConverter, I got it to work just fine, without the need to figure out how to run and configure appbundler.

By way of example, here is the Package Contents of DicomCleaner the old way:

and here it is the new way:

And here is the old Info.plist:

and here is the new Info.plist:

Note that it is no longer necessary to specify the classpath (not even sure how to); apparently the JavaAppLauncher adds everything in Contents/Java to the classpath automatically.

Rather than have all the Java properties under a single Java key, the JavaAppLauncher seems to use a JVMMainClassName key rather than Java/MainClass, and JVMOptions, rather than Java/VMOptions. Also, I found that in the absence of a specific Java/Properties/apple.laf.useScreenMenuBar key, another item in JVMOptions would work.
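In other words, stripped of the XML plumbing, a new-style Info.plist boils down to entries along these lines (the values, and in particular the main class name, are illustrative only):

CFBundleName          DicomCleaner
CFBundleExecutable    JavaAppLauncher
CFBundleIconFile      DicomCleaner.icns
JVMMainClassName      com.example.DicomCleaner
JVMOptions            -Xmx1g
                      -Dapple.laf.useScreenMenuBar=true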

Why whoever wrote appbundler thought that they had to introduce these gratuitous inconsistencies, when they could have perpetuated the old Package Content structure and Java/Properties easily enough, I have no idea, but at least the structure is sufficiently "obvious" so as to permit morphing one to the other.

Though I had propagated various properties that jarbundler had originally included, and added one that AppConverter had used (Bundle display name), I was interested to know just what the minimal set was, so I started removing stuff to see if it would keep working, and sure enough it would. Here is the bare minimum that "works" (assuming you don't need any JVM options, don't care what name is displayed in the top line and despite the Apple documentation's list of "required" properties):

To reiterate, I used the JavaAppLauncher copied out of AppConverter, because it worked, and it wasn't obvious where to get it "officially".

I did try copying the JavaAppLauncher binary that is present in the "com/oracle/appbundler/JavaAppLauncher" in appbundler-1.0.jar, but for some reason that didn't work. I also poked around inside javapackager (vide infra), and extracted "com/oracle/tools/packager/mac/JavaAppLauncher" from the JDKs "lib/ant-javafx.jar", but that didn't work either (reported " ... Job failed to exec(3) for weird reason: 13"), so I will give up for now and stick with what works.

It would be nice to have an "official" source for JavaAppLauncher though.

In case it has any impact, I was using OS 10.8.5 and JDK 1.8.0_40-ea whilst doing these experiments.


PS. What I have not done is figure out how to include a bundled JRE, since I haven't had a need to do this myself yet (and am not motivated to bother with the AppStore), but I dare say it should be easy enough to find another example and copy it. I did find what looks like a fairly thorough description in this blog entry by Danno Ferrin about getting stuff ready for the AppStore.

PPS. I will refrain from (much) editorial comment about the pros and cons of requiring an embedded JRE in every tiny app, sufficeth to say I haven't found many reasons to do it, except for turn key applications (such as on a CD) where I do this on Windows a bit, just because one can. I am happy Apple/Oracle have enabled it, but surprised that Apple mandated it (for the AppStore).

PPPS. There is apparently also something from Oracle called "javafxpackager", which is pretty well documented, and which is supposed to be able to package non-FX apps as well, but I haven't tried it. Learning it looked more complicated than just doing it by hand. Digging deeper, it seems that this has been renamed to just "javapackager" and is distributed with current JDKs.

PPPPS. There is apparently an effort to develop a binary app that works with either the Apple or Oracle Package Contents and Info.plist properties, called "universalJavaApplicationStub", but I haven't tried that either.

Saturday, October 19, 2013

How Thick am I? The Sad Story of a Lonely Slice.

Summary: Single slice regions of interest with no multi-slice context or interval/thickness information may need to be reported as area only, not volume. Explicit interval/thickness information can and should be encoded. Thickness should be distinguished from interval.

Long Version.

Given a Region of Interest (ROI), no matter how it is encoded (as contours or segmented pixels or whatever), one can compute its area, using the pixel spacing (size) information. If a single planar ROI (on one slice) is grouped with a bunch of siblings on contiguous slices, then one can produce a sum of the areas. And if one knows the (regular) spacing between the slices (reconstruction interval in CT/MR/PET parlance), one can compute a volume from the sum of the areas multiplied by the slice spacing. Often one does not treat the top and bottom slice specially, i.e., the ROI is regarded as occupying the entire slice interval. Alternatively, one could consider the top and bottom slices as only being partially occupied, and perhaps halve their contribution.
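As a concrete illustration, the arithmetic is trivial; a minimal sketch (which assumes regularly spaced slices and makes no attempt to handle partial volume effects more cleverly than halving the end slices):

// Illustrative only: volume from a stack of planar ROI areas on regularly spaced slices.
public class RoiVolume {

    // areaPerSlice values in mm^2, sliceInterval in mm (the spacing between slice centers, not the thickness).
    public static double volumeFromAreas(double[] areaPerSlice, double sliceInterval, boolean halveEndSlices) {
        double sumOfAreas = 0;
        for (int i = 0; i < areaPerSlice.length; ++i) {
            double weight = 1.0;
            if (halveEndSlices && (i == 0 || i == areaPerSlice.length - 1)) {
                weight = 0.5;    // treat the first and last slices as only half occupied
            }
            sumOfAreas += weight * areaPerSlice[i];
        }
        return sumOfAreas * sliceInterval;    // mm^3
    }
}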

The slice interval is distinct from the slice "thickness" (Slice Thickness (0018,0050)), since data may be acquired and reconstructed such that there is either a gap between slices, or slices overlap, and in such cases, using the thickness rather than the interval would not return a volume representative of the object represented by the ROI(s). The slice interval is rarely encoded explicitly, and even if it is, may be unreliable, so one should compute the interval from the distance along the normal to the common orientation (parallel slices) using the Image Position (Patient) origin offset and the Image Orientation (Patient) row and column vectors. The Spacing Between Slices (0018,0088) is only officially defined for the MR and NM objects, though one does see it in CT images occasionally. In the past, some vendors erroneously encoded the gap between slices rather than the distance between their centers in Spacing Between Slices (0018,0088), so be wary of it.
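Computing the interval from the geometry, rather than trusting Slice Thickness (0018,0050) or Spacing Between Slices (0018,0088), amounts to projecting the Image Position (Patient) origins of two adjacent slices onto the normal to their common orientation; a minimal sketch (assuming parallel slices with the usual unit length, orthogonal orientation vectors, and with no error handling):

// Illustrative only: distance between two parallel slices measured along the normal to their
// common orientation, using DICOM Image Orientation (Patient) and Image Position (Patient).
public class SliceSpacing {

    // rowAndColumn is the 6-value Image Orientation (Patient); position1 and position2 are the two 3-value origins.
    public static double spacingAlongNormal(double[] rowAndColumn, double[] position1, double[] position2) {
        double[] row    = { rowAndColumn[0], rowAndColumn[1], rowAndColumn[2] };
        double[] column = { rowAndColumn[3], rowAndColumn[4], rowAndColumn[5] };
        // normal = row x column (cross product); unit length if row and column are unit and orthogonal, as DICOM requires
        double[] normal = {
            row[1]*column[2] - row[2]*column[1],
            row[2]*column[0] - row[0]*column[2],
            row[0]*column[1] - row[1]*column[0]
        };
        // project each origin onto the normal and take the difference
        double d1 = normal[0]*position1[0] + normal[1]*position1[1] + normal[2]*position1[2];
        double d2 = normal[0]*position2[0] + normal[1]*position2[1] + normal[2]*position2[2];
        return Math.abs(d2 - d1);
    }
}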

This all presupposes that one does indeed have sufficient spatial information about the ROI available, encoded in the appropriate attributes, which is the case for 2D contours defined relative to 3D slices (e.g., SR SCOORDS with referenced cross-sectional images), 3D contours (e.g., SR SCOORD3D or RT Structure Sets), and Segmentation objects encoded as image objects with plane orientation, position and spacing.

And it works nicely down to just two slices.

But what if one only has one lonely slice? Then there is no "interval" per se.

For 2D contours defined relative to 3D image slices one could consult the adjacent (unreferenced) image slices and deduce the slice interval and assume that was applicable to the contour too. But for 3D contours and segmentation objects that stand alone in 3D space, and may have no explicit reference to the images from which they were derived, if indeed there were any images and if indeed those images were not re-sampled during segmentation, then there may be no "interval" information available at all.

The RT Structure Set does handle this in the ROI Contour Module, by the provision of an (optional) Contour Slab Thickness (3006,0044) value, though it may interact with the associated Contour Offset Vector (3006,0045) such that the plane of the coordinates is not the center of the slab. See PS 3.3 Section C.

The Segmentation object, by virtue of inclusion of the Pixel Measures Sequence (functional group macro), which defines the Pixel Spacing, also requires the presence of the Slice Thickness attribute, but only if Volumetric Properties (0008,9206) is VOLUME or SAMPLED. And wouldn't you know it, the Segmentation IOD does not require the presence of Volumetric Properties :( That said, it is possible to encode it, so ideally one should; the question arises as to what the "thickness" of a segmentation is, and whether one should slavishly copy the slice thickness from the source images that were segmented, or whether one should use the interval (computed if necessary), since arguably one is segmenting the volume, regardless of how it was sampled. We should probably consider whether or not to include Spacing Between Slices (0018,0088) in the Pixel Measures Sequence as well, and to refine their definitions to make this clear.

The SR SCOORD3D content item attributes do not include interval or thickness. That does not prevent one from encoding a numeric content item to associate with it, though no standard templates currently do. Either way, it would be desirable to standardize the convention. Codes are already defined in PS 3.16 for (112225, DCM, "Slice Thickness") and (112226, DCM, "Spacing between slices") (these are used in the Image Library entries for cross-sectional images in the CAD templates).
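Pending standardization of such a convention, one purely illustrative possibility would be to hang a numeric content item off the same parent as the coordinates, along these lines (the value and units are, of course, just an example):

(finding content item, e.g., a CODE or TEXT)
  > CONTAINS > SCOORD3D (the contour coordinates of the lonely slice)
  > CONTAINS > NUM: (112226, DCM, "Spacing between slices") = 2.5 (mm, UCUM, "mm")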

Anyhow, from a recipient's perspective, given no explicit information and no referenced images there is no other choice than to report only area. If an image is referenced, and its interval or thickness are available, then one may be tempted to use it, but if they are different, which should one use? Probably the interval, to be consistent with the general case of multiple slices.

From a sender's perspective, should one explicitly encode interval or thickness information in the RT Structure Set, SR SCOORD3D, and Segmentation objects, even though it is not required? This is probably a good move, especially for single slice ROIs, and should probably be something considered by the standard for inclusion as a CP.


Monday, October 14, 2013

Binge and Purge ... Archive Forever, Re-compress or Discard ... PACS Lifecycle Management

Summary: Technical solutions and standards exist for implementing a hodge-podge of varied retention policies; teaching and research facilities should hesitate before purging or recompressing though; separating the decision making engine from the archive is desirable.

Long version:

As we continue to "binge" on imaging modalities that produce ever larger quantities of data, such as MDCT, breast tomosynthesis and maybe one day whole slide imaging, the question of duration of storage becomes more pressing.

An Australian colleague recently circulated a link to a piece entitled "What should we do with old PACS images?", in which Kim Thomas from eHealth Insider magazine discusses whether or not to discard old images, and how. The article nicely summarizes the UK situation, and concludes with the usual VNA hyperbole, but fails to distinguish the differences in practice settings in which such questions arise.

In an operational environment that is focused only on immediate patient care, risk and cost minimization, and compliance with regulatory requirements, the primary questions are whether or not it is cheaper to retain, re-compress or delete studies that are no longer necessary, and whether or not the technology in use is capable of implementing it. In such environments, there is little if any consideration given to "secondary re-use" of such images, such as for research or teaching. Typically a freestanding ambulatory setting might be in such a category, the priorities being quality, cost and competitiveness.

An extreme case of "early discarding" arises in Australia where, as I understand it, the policy of some private practices (in the absence of any statutory requirement to the contrary) is to hand the images to the patient and discard the local digital copy promptly. Indeed, this no doubt made sense when the medium was radiographic (as opposed to printed) film.

In many jurisdictions though, there is some (non-zero) duration required by a local regulation specific to medical imaging, or a general regulation for retention of medical records that includes images. Such regulations define a length of time during which the record must be stored and made available. There may be a statutory requirement for each facility to have a written policy in place.

In the US, the HIPAA Privacy Rule does not include medical record retention requirements, and the rules are defined by the states, and vary (see for instance, the ONC summary of State Medical Record Laws). Though not regulatory in nature, the ACR–AAPM–SIIM Technical Standard For Electronic Practice of Medical Imaging requires a written policy and that digital imaging data management systems must provide storage capacity capable of complying with all facility, state, and federal regulations regarding medical record retention. The current policy of the ACR Council is described in Appendix E Ownership, Retention and Patient Access to Medical Records of the 2012-2012 Digest of Council Actions. This seems a bit outdated (and still refers to "magnetic tapes" !). Google did reveal a draft of an attempt to revise this, but I am not sure of the status of that, and I will investigate whether or not our Standards and Interoperability group can help with the technical details. I was interested though, to read that:

"The scope of the “discovery rules” in other states mean that records should conceivably be held indefinitely. Evidence of “fraud” could extend the statute of limitations indefinitely."

Beyond the minimum required, whatever that might be, in many settings there are good reasons to archive images for longer.

In an academic enterprise, the needs of teaching and research must be considered seriously, and the (relatively modest) cost of archiving everything forever must be weighed against the benefit of maintaining a durable longitudinal record in anticipation of secondary re-use.

I recall as a radiology registrar (resident in US-speak) spending many long hours in film archives digging out ancient films of exotic conditions, using lists of record numbers generated by queries for particular codes (which had been diligently recorded in the limited administrative information system of the day), for the purpose of preparing teaching content for various meetings and forums. These searches went back not just years but decades, if I remember correctly. This would not have been possible if older material had been discarded. Nowadays in a teaching hospital it is highly desirable that "good cases" be identified, flagged, de-identified and stored prospectively (e.g., using the IHE Teaching File and Clinical Trial Export (TCE) profile). But not everyone is that diligent, or has the necessary technology deployed, and there will remain many situations in which the value of a case is not recognized except in retrospect.

Retrospective research investigations have a place too. Despite the need to perform prospective randomized controlled trials there will always be a place for observational studies in radiology. Quite apart from clinical questions, there are technical questions to be answered too. For example, suppose one wanted to compare the performance of irreversible compression algorithms for a specific interpretation task (or to demonstrate non-inferiority compared to uncompressed images). To attain sufficient statistical power to detect the absence of a small but clinically significant difference in observer performance, a relatively large number of cases would be required. Obtaining these prospectively, or from multiple institutions, might be cost prohibitive, yet a sufficiently large local historical archive might render the problem tractable. The further the question strays from those that might be answered using existing public or sequestered large image collections (such as those available through the NBIA or TCIA or ADNI or CardiacAtlas), the more often this is true.

Such questions also highlight the potential danger of using irreversible compression as a means of reducing storage costs for older images. Whilst such a strategy may or may not impinge upon the utility of the images for prior comparison or evidential purposes, it may render them useless for certain types of image processing research, such as CAD, and certainly so for research into compression itself.

Technologically speaking, as the eHI article reminds us, not all of the installed base of PACS have the ability to perform what is colloquially referred to as "life cycle management", especially if it is automated in some manner, based on some set of rules that implement configurable local policy. So, even if one decides that it is desirable to purge, one may need some technology refreshment to implement even a simple retention policy.

This might be as "easy" as upgrading one's PACS to a more recent version, or it might be one factor motivating a PACS replacement, or it might require some third party component, such as a VNA. One might even go so far as to separate the execution of the purging from the decision making about what to purge, using a separate "rules engine", coupled with a standard like IHE Image Object Change Management (IOCM) to communicate the purge decision (as I discussed in an old thread on Life Cycle Management in the UK Imaging Informatics Group). We added "Data Retention Policy Expired" as a KOS document title in DICOM CP 1152 specifically for this purpose.

One also needs a reliable source of data to drive the purging decision. Some parameters like the patient's age, visit dates, condition and types of procedure should be readily available locally; others may not, such as whether or not the patient has died. As I mentioned in that same UK thread, and has also been discussed in lifecycle, purging and deletion threads in the pacsadmin group, in the US we have the Social Security Administration's Death Master File available for this.

Since the necessary information to make the decision may not reside in the PACS or archive, but perhaps the HIS or EHR, separating the decision maker from the decision executor makes a lot of sense. Indeed, when you think about it, the entire medical record, not just the images, may need to be purged according to the same policy. So, it seems sensible to make the decision in one place and communicate it to all the places where information may be stored within an enterprise. This includes not only the EHR and radiology, but also the lab, histopathology, cardiology, and the visual 'ologies like ophthalmology, dermatology, etc. Whilst one day all databases, archives and caches may be centralized and consolidated throughout an enterprise (VNA panacea scenario), in the interim, a more loosely coupled solution is possible.

That said, my natural inclination as a researcher and a hoarder (with a 9 track tape drive and an 8" floppy drive in the attic, just in case) is to keep everything forever. Fortunately for the likes of me, disk is cheap, and even the power and HVAC required to maintain it are not really outrageously priced in the scheme of things. However, if you feel you really must purge, then there are solutions available, and a move towards using standards to implement them.


Sunday, September 29, 2013

You're gonna need a bigger field (not) ... Radix 64 Revisited

Summary: It is easy to fit a long number in a short string field by transcoding it to use more (printable) characters; the question is what encoding to use; there are more alternatives than you might think, but Base64 is the pragmatic choice.

Long Version.

Every now and then the subject of how to fit numeric SNOMED Concept IDs (defined by the SNOMED SCTID Data Type) into a DICOM representation comes up. These can be up to 18 decimal digits (and fit into a signed or unsigned 64 bit binary integer), whereas in DICOM, the Code Value has an SH (Short String) Value Representation (VR), hence is limited to 16 characters.

Harry Solomon suggested "Base64" encoding it, either always, or on those few occasions when the Concept ID really was too long (and then using a "prefix" to the value to recognize it).

The need arises because DICOM has always used the "old fashioned" SNOMED-RT style SnomedID values (like "T-A0100" for "Brain") rather than the SNOMED-CT style SNOMED Concept ID values (like "12738006"). DICOM was a relatively "early adopter" of SNOMED, and the numeric form did not exist in the early days (prior to the incorporation of the UK Read Codes that resulted in SNOMED-CT). Fortunately, SNOMED continues to issue the older style codes; unfortunately, folks outside the DICOM realm may need to use the newer style, and so converting at the boundary is irritating (and needs a dictionary, unless we transmit both). The negative impact on the installed base that depends on recognizing the old-style codes, were we to "change", is a subject for another day; herein I want to address only how it could be done.

Stuffing long numbers into short strings is a generic problem, not confined to using SNOMED ConceptIDs in DICOM. Indeed, this post was triggered as a result of pondering another use case, stuffing long numbers into Accession Number (also SH VR). So I thought I would implement this to see how well it worked. It turns out that there are a few choices to be made.

My first pass at this was to see if there was something already in the standard Java class library that supported conversion of arbitrary length base10 encoded integers into some other radix; I did not want to be constrained to only handling 64 bit integers.

It seemed logical to look at the arbitrary length numeric java.math.BigInteger class, and indeed it has a radix argument to its String constructor and toString() methods. It also has constructors based on two's-complement binary representations in byte[] arrays. Sounded like a no brainer.

Aargh! It turns out that BigInteger has an implementation limit on the size of the radix that it will handle. The maximum radix is 36 (the 10 digits plus 26 lowercase alphabetic characters that is the limit for java.lang.Character.MAX_RADIX). Bummer.

OK, I thought, I will hand write it, by doing successive divisions by the radix in BigInteger, and character encoding the modulus, accumulating the resulting characters in the correct order. Turned out to be pretty trivial.
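Here is roughly what that looks like (a minimal sketch, not production code; the alphabet is passed in so that any of the conventions discussed below can be plugged in):

import java.math.BigInteger;

// Illustrative only: encode an arbitrary-length non-negative decimal integer by successive
// division by the radix, with a caller-supplied alphabet (one character per digit value).
public class RadixStringEncoder {

    public static String encode(String decimalValue, String alphabet) {
        BigInteger radix = BigInteger.valueOf(alphabet.length());
        BigInteger value = new BigInteger(decimalValue);    // base 10 input
        if (value.signum() == 0) {
            return String.valueOf(alphabet.charAt(0));
        }
        StringBuilder buffer = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] quotientAndRemainder = value.divideAndRemainder(radix);
            buffer.append(alphabet.charAt(quotientAndRemainder[1].intValue()));    // remainder is the next digit
            value = quotientAndRemainder[0];
        }
        return buffer.reverse().toString();    // digits were accumulated least significant first
    }

    public static void main(String[] args) {
        String base64rfc2045 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        // The largest unsigned 64 bit integer (20 decimal digits) fits in 11 characters ...
        System.out.println(encode("18446744073709551615", base64rfc2045));    // "P//////////"
    }
}

Note that an 18 digit SNOMED Concept ID needs at most 10 such characters when the radix is 64, leaving room within the 16 character SH limit for the sort of recognizable prefix Harry suggested.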

Then I realized that I now had to choose which characters to select beyond the 36 that Java uses. At which point I noticed that BigInteger uses completely different characters than the traditional "Base64" encoding. "Base64" is the encoding used by folks who do anything that depends on MIME content encoding (email attachments or XML files with embedded binary payloads), as is defined in RFC 2045. Indeed, there are variants on "Base64" that handle situations where the two characters for 62 and 63 (normally '+' and '/' respectively) are problematic, e.g., in URLs (RFC 4648). Indeed RFC 4648 seems to be the most current definition of not only "Base64" and variants, but also "Base32" and "Base16" and so-called "extended hex" variants of them.

If you think about it, based on the long-standing hexadecimal representation convention that uses characters '0' to '9' for numeric values [0,9], then characters 'a' to 'f' for numeric values [10,15], it is pretty peculiar that "Base64" uses capital letters 'A' to 'J' for numeric values [0,9], and uses the characters '0' to '9' to represent numeric values [52,61]. Positively unnatural, one might say.

This is what triggered my dilemma with the built-in methods of the Java BigInteger. BigInteger returns strings that are a natural progression from the traditional hexadecimal representation, and indeed for a radix of 16 or a radix of 32, the values match those from the RFC 4648 "base16" and "base32hex" (as distinct from "base32") representations. Notably, RFC 4648 does NOT define a "base64hex" alternative to "base64", which is a bit disappointing.

It turns out that a long time ago (1992) in a galaxy far, far away, this was the subject of a discussion between Phil Zimmermann (of PGP fame) and Marshall Rose and Ned Freed on the MIME working group mailing list, in which Phil noticed this discrepancy and proposed that it be changed. His suggestion was rejected on the grounds that it would not improve functionality, would threaten the installed base, and was made at a relatively late stage in the development of the "standard". The choice of the encoding apparently traces back to the Privacy Enhanced Mail (PEM) RFC 989 from 1987. I dare say there was no love lost between Phil and the PEM/S-MIME folks, given that they were developers of competing methods for secure email, but you can read the exchange yourself and make up your own mind.

So I dug a little deeper, and it turns out that The Open Group Base Specifications (IEEE Std 1003.1) (POSIX, Single Unix Specification) also define how to encode radix 64 numbers as ASCII characters, in the specification of the a64l() and l64a() functions, which use '.' (dot) for 0, '/' for 1, '0' through '9' for [2,11], 'A' through 'Z' for [12,37], and 'a' through 'z' for [38,63]. Note that this is not part of the C standard library.

An early attempt at stuffing binary content into printable characters was the "uuencode" utility used in Unix-to-Unix copy (UUCP) implementations, such as were once used for mail transfer. It used the expedient of adding 32 (the US-ASCII space character) to each 6-bit (base 64) numeric value, which yields a range of printable characters.

Of course, from the perspective of stuffing a long decimal value into a short string and making it fit, it doesn't matter which character representation is chosen, as long as it is valid. E.g., a 64-bit unsigned integer, which has a maximum value of 18,446,744,073,709,551,615, i.e., 20 digits, is only 11 characters long when encoded with a radix of 64, regardless of the character choices.

For your interest, here is what each of the choices described above looks like, for single numeric values [0,63], and for the maximum unsigned 64 bit integer value:

Extension of Java and base16hex to hypothetical "base64hex":
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z : _

Unix a64l:
 . / 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z

Base64 (RFC 2045):
 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 + /

uuencode (note that space is the first character):
   ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _

Returning to DICOM then, the choice of what to use for a Short String (SH) VR is constrained to any US-ASCII (ISO IR 6) character that is not a backslash (used as a value delimiter in DICOM) and not a control character. This would exclude the uuencode representation, since it contains a backslash, but any of the other choices would produce valid strings. The SH VR is also case-preserving, which is a prerequisite for all of the choices other than uuencode. Were that not the case, we would need to define yet another encoding that was both case-insensitive and did not contain the backslash character. I can't think of a use for packing numeric values into the Code String (CS) VR, the only upper-case-only DICOM VR.

The more elegant choice in my opinion would be the hypothetical "base64hex", for the reasons Phil Z eloquently expressed, but ...

Pragmatically speaking, since RFC 989/1113/2045/4648-style "Base64" coding is so ubiquitous these days for bulk binary payloads, it would make no sense at all to buck that trend.

Just to push the limits though, if one uses all 94 printable US-ASCII characters except backslash, one can squeeze the largest unsigned 64 bit integer into 10 rather than 11 characters. However, for the 18 decimal digit longest SNOMED Concept ID, the length of the result is the same whether one uses a radix of 64 or 94, still 10 characters.
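
As a sanity check on those lengths, the same sort of successive-division logic can be used just to count digits; again, the class and method names here are mine, and the 18-digit value is merely the largest possible 18-digit number rather than any real SNOMED Concept ID:

import java.math.BigInteger;

public class RadixLengthCheck {
    // Count how many "digits" are needed to encode a non-negative value in the given radix.
    private static int lengthInRadix(BigInteger value,int radix) {
        BigInteger r = BigInteger.valueOf(radix);
        int length = 0;
        do {
            value = value.divide(r);
            ++length;
        } while (value.signum() > 0);
        return length;
    }

    public static void main(String[] args) {
        BigInteger maxUnsigned64 = new BigInteger("18446744073709551615");    // 20 decimal digits
        BigInteger largest18Digits = new BigInteger("999999999999999999");    // 18 decimal digits
        System.out.println(lengthInRadix(maxUnsigned64,64));      // 11
        System.out.println(lengthInRadix(maxUnsigned64,94));      // 10
        System.out.println(lengthInRadix(largest18Digits,64));    // 10
        System.out.println(lengthInRadix(largest18Digits,94));    // 10
    }
}

In fact, any radix of 85 or more is enough to get the maximum unsigned 64-bit value down to 10 characters, since 85 to the power of 10 just exceeds 2 to the power of 64.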


Thursday, September 12, 2013

What Template is that?

Summary: Determining what top-level template, if any, has been used to create a DICOM Structured Report can be non-trivial. Some SOP Classes require a single template, and an explicit Template ID is supposed to always be present, but if it isn't, the coded Document Title is a starting point, though not always an unambiguous one.

Long Version.

When Structured Reports were introduced into DICOM (Supplement 23), the concept of a "template" was somewhat nebulous, and was refined over time. Accordingly, the requirement to specify which template was used, if any, to author and format the content, was, and has remained, fairly weak.

The original intent, which remains the current intent, is that if a template was used, its identity should be explicitly encoded. A means for doing so is the Content Template Sequence. Originally this was potentially encoded at each content item, but its use was later clarified by CP 452. In short, the identification applies only to CONTAINER content items, and in particular to the root content item, and consists of a mapping resource (DCMR, in the case of templates defined in PS 3.16) and a string identifier.
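
To make that concrete, here is a minimal sketch of what encoding the root template identification might look like at creation time; I have used the dcm4che3 toolkit purely as an illustrative assumption (any toolkit with a sequence-capable data set model would do), and TID 2000 is just an example:

import org.dcm4che3.data.Attributes;
import org.dcm4che3.data.Sequence;
import org.dcm4che3.data.Tag;
import org.dcm4che3.data.VR;

public class RootTemplateIdentification {
    // Add a Content Template Sequence (0040,A504) identifying the root template
    // (here TID 2000 from the DICOM Content Mapping Resource) to the top-level
    // data set of an SR instance being created (i.e., to the root CONTAINER).
    public static void identifyRootTemplate(Attributes topLevelDataSet) {
        Sequence contentTemplateSequence = topLevelDataSet.newSequence(Tag.ContentTemplateSequence,1);
        Attributes item = new Attributes();
        item.setString(Tag.MappingResource,VR.CS,"DCMR");
        item.setString(Tag.TemplateIdentifier,VR.CS,"2000");
        contentTemplateSequence.add(item);
    }
}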

The requirement on its presence is:

"if a template was used to define the content of this Item, and the template consists of a single CONTAINER with nested content, and it is the outermost invocation of a set of nested templates that start with the same CONTAINER"

Since the document root is always a container, whenever one of the templates that defines the entire content tree of the SR is used, then by definition, an explicit Template ID is required to be present.

That said, though most SR producers seem to get this right, sometimes the Template ID is not present, which presents a problem. I don't think this can be excused by lack of awareness of the requirement, or of failure to notice CP 452 (from 2005), since the original requirement in Sup 23 (2000) read:

"Required if a template was used to define the content of this Item".

Certainly CP 452 made things clearer though, in that it amended the definition to not only apply to the content item, but also "its subsidiary" content items.

Some SR SOP Classes define a single template that shall be used, the KOS being one example, and the CAD family (including Mammo, Chest and Colon CAD) being others. So, even if an explicit Template ID is not present, the expected template can be deduced from the SOP Class. Sometimes though, such instances are encoded as generic (e.g., Comprehensive) SR, perhaps because an intermediate system did not support the more specific SOP Class, and so one still needs to check for the template identifier.

In the absence of a specific SOP Class or an explicit template identifier, what is a poor recipient to do? One clue can be the concept name of the top level container content item, which is always coded, and always present, and which is referred to as the "document title". In many cases, within the scope of PS 3.16, the same coded concept is used only for a single root template. For example, (122292, DCM, "Quantitative Ventriculography Report") is used only for TID 3202. That's helpful, at least as long as nobody other than DICOM (like a vendor) has re-used the same code to head a different template.

Other situations are more challenging. The basic diagnostic reporting templates, e.g., TID 2000, 2005 or 2006, are encoded in generic SOP Classes and furthermore don't have a single or unique code for the document title; rather, any code can be used, and a defined set of them, drawn from LOINC and corresponding to common radiological procedures, is provided. It is not at all unlikely that some other, completely different template might be used with the same code, such as (18747-6, LN, "CT Report") or (18748-4, LN, "Diagnostic Imaging Report").

One case of interest demonstrates that in the absence of an explicit Template ID, even a specific SOP Class and a relatively specific Document Title is insufficient. For Radiation Dose SRs, the same SOP Class is used for both CT and Projection X-Ray. Both TID 10001 Projection X-Ray Radiation Dose and TID 10011 CT Radiation Dose have the same Document Title, (113701, DCM, "X-Ray Radiation Dose Report").

One can go deeper into the tree though. One of the children of the Document Title content item is required to be (121058, DCM, "Procedure reported"). For a CT report, it is required to have an enumerated value of (P5-08000, SRT, "Computed Tomography X-Ray"), whereas for a Projection X-Ray report, it may have a value of (113704, DCM, "Projection X-Ray") or (P5-40010, SRT, "Mammography"), or something else, because these are defined terms.
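
As a sketch of the sort of fallback heuristic a recipient might apply to a dose SR, consider the following; SRDocument and Code are hypothetical placeholders for whatever content tree model one's own toolkit provides, not real classes from any library:

import java.util.List;

public class RootTemplateHeuristic {

    // Hypothetical coded entry (code value plus coding scheme designator).
    public static final class Code {
        public final String value;
        public final String scheme;
        public Code(String value,String scheme) { this.value = value; this.scheme = scheme; }
    }

    // Hypothetical minimal view of a parsed SR content tree; adapt to whatever toolkit is in use.
    public interface SRDocument {
        String getRootTemplateIdentifier();        // from Content Template Sequence, or null if absent
        Code getDocumentTitle();                   // concept name of the root CONTAINER
        List<Code> getProcedureReportedCodes();    // values of the (121058, DCM, "Procedure reported") children
    }

    // Guess the root template of a Radiation Dose SR when the explicit Template ID is missing,
    // using the Document Title and then the Procedure Reported content items, per TID 10001 and TID 10011.
    public static String guessDoseTemplate(SRDocument sr) {
        String explicit = sr.getRootTemplateIdentifier();
        if (explicit != null) {
            return explicit;                       // trust what the creator said
        }
        Code title = sr.getDocumentTitle();
        if (title != null && "113701".equals(title.value) && "DCM".equals(title.scheme)) {
            // same Document Title for both CT and Projection X-Ray dose reports,
            // so look at the Procedure Reported children instead
            for (Code procedure : sr.getProcedureReportedCodes()) {
                if ("P5-08000".equals(procedure.value) && "SRT".equals(procedure.scheme)) {
                    return "10011";                // CT Radiation Dose
                }
            }
            return "10001";                        // otherwise assume Projection X-Ray Radiation Dose
        }
        return null;                               // no heuristic matched
    }
}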

So, in short, at the root level, the absence of a Template ID is not the end of the world, and a few heuristics might be able to allow a recipient to proceed.

Indeed, if one is expecting a particular pattern based on a particular template, and that pattern "matches" the content of the tree that one has received, does it really matter? It certainly makes life easier, though, to match a top-level identifier than to have to write a matching rule for the entire tree.

Related to the matter of the identification of the "root" or "top level" template is that of recognizing subordinate or "mini" templates. As you know, most of PS 3.16 is taken up not by monstrously long single templates but rather by invocation of sub-templates. So there are sub-templates for identifying things, measuring things, etc. These are re-used inside lots of application-specific templates.

Certainly "top-down" parsing from a known root template takes one to content items that are expected to be present based on the "inclusion" of one of these sub-templates. These are rarely, if ever, explicitly identified during creation by a Template ID, even though one could interpret that as being a requirement if the language introduced in CP 452 is taken literally. Not all "included" sub-templates start with a container, but many do. I have to admit that most of the SRs that I create do not contain Template IDs below the Document Title either, and I should probably revisit that.

Why might one want to be able to recognize such a sub-template?

One example is being able to locate and extract measurements or image coordinate references, regardless of where they occur in some unrecognized root template. An explicit Template ID might be of some assistance in such cases, but pattern matching of sub-trees can generally find these pretty easily too. When annotating images based on SRs, for example, I will often just search for all SCOORDs, and explore the neighboring content items to find labels and measurements to display. Having converted an SR to an XML representation also allows one to use XSL-T match() clauses and XPath expressions to select even complex patterns, without requiring an explicit ID.
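
For example, something along these lines will find all the spatial coordinates in one pass, using nothing but the standard Java XML and XPath classes; note that the element name "content" and attribute name "valueType" are assumptions about a hypothetical SR-to-XML conversion, and will differ depending on the representation actually used:

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class FindSpatialCoordinates {
    public static void main(String[] args) throws Exception {
        // Parse some XML rendering of the SR content tree (file name supplied as the first argument).
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(args[0]);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select every SCOORD content item, wherever it occurs in the tree,
        // without needing to know or match the root template.
        NodeList scoords = (NodeList)xpath.evaluate("//content[@valueType='SCOORD']",doc,XPathConstants.NODESET);

        System.out.println("Found "+scoords.getLength()+" SCOORD content items");
        // ... then explore the parent and sibling content items for labels and measurements ...
    }
}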


Saturday, September 7, 2013

Share and share alike - CSIDQ

Summary: Image sharing requires the availability (download and transmission) of a complete set of images of diagnostic quality (CSIDQ), even if for a particular task, viewing of a lesser quality subset may be sufficient. The user then needs to be able to decide what they need to view on a case-by-case basis.

Long Version.

The title of this post comes from the legal use of the term "share and share alike", the equal division of a benefit from an estate, trust, or gift.

In the context of image sharing, I mean to say that all potential recipients of images, radiologists, specialists, GPs, patients, family, and yes, even lawyers, need to have the means to access the same thing: a complete set of images of diagnostic quality (CSIDQ). Note the emphasis on "have the means". CSIDQ seems to be a less unwieldy acronym than CSoIoDQ, so that's what I will use for notational convenience.

There are certainly situations in which images of lesser quality (or less than a complete set) might be sufficient, might be expedient, or indeed might even be necessary to enable the use case. A case in point being the need to make an urgent or rapid decision remotely when only a slow link is available.

For folks defining architectures and standards, and deploying systems to make this happen, it is essential to assure that the CSIDQ is available throughout. In practice, this translates to requiring that
  • the acquisition modality produce a CSIDQ,
  • the means of distribution (typically a departmental or enterprise PACS) in the local environment store and make available a CSIDQ,
  • the system of record where the acquired images are stored for archival and evidential purposes contain a CSIDQ,
  • any exported CD or DVD contain a CSIDQ,
  • any point-to-point transfer mechanism be capable of supporting transfer of a CSIDQ,
  • any "edge server" or "portal" that permits authorized access to the locally stored images be capable of sharing a CSIDQ on request,
  • any "central" archive to which images are stored also retain and be capable of distributing a CSIDQ,
  • any "clearinghouse" that acts as an intermediary be capable of transferring a CSIDQ.
These requirements apply particularly to the "Download" and "Transmit" parts of the Meaningful Use "View, Download and Transmit" (VDT) approach to defining sharing, as it applies to images and imaging results.

In other words, it is essential that whatever technologies, architectures and standards are used to implement Download and Transmit, that they be capable of supporting a CSIDQ. Otherwise, anything that is lost early in the "chain of custody", if you will, is not recoverable later when it is needed.

From a payload perspective, the appropriate standard for a CSIDQ is obviously DICOM, since that is the only widely (universally) implemented standard that permits the recipient to make full use of the acquired images, including importation, post-processing, measurement, planning, templating, etc. DICOM is the only format whose pixel data and meta data all medical imaging systems can import.

That said, it may be desirable to also provide Download of a subset, or a subset of lesser quality, or in a different format, for one reason or another. In doing so it is vital not to compromise the CSIDQ principle, e.g., by misleading a recipient (such as a patient or a referring physician) into thinking that anything less than a CSIDQ that has been downloaded is sufficient for future use (e.g., subsequent referrals). And it is vital not to discard the DICOM format meta data. EHR and PHR vendors need to be particularly careful about not making expedient implementation decisions in this regard that compromise the CSIDQ principle (and hence may be below the standard of practice, may be misleadingly labelled, may introduce the risk of a bad outcome, and may expose them to product liability or regulatory action).

Viewing is an entirely different matter, however.

Certainly, one can download a CSIDQ and then view it, and in a sense that is what the CD/DVD distribution mechanism is ... a "thick client" viewer is either already installed or executed from the media to display the DICOM (IHE PDI) content. This approach is typically appropriate when one wants to import what has been downloaded (e.g., into the local PACS) so that it can be viewed along with all the other studies for the patient. This is certainly the approach that most referral centers will want to adopt, in order to provide continuity of patient care coupled with familiarity of users with the local viewing tools. It is also equally reasonable to use for an "in office" imaging system, as I have discussed before. It is a natural extension of the current widespread CD importation that takes place, and the only difference is the mode of transport, not the payload.

For sporadic users though, who may have no need to import or retain a local copy of the CSIDQ, many other standard (WADO and XDS-I) and proprietary alternatives exist for viewing. Nowadays web-based image viewing mechanisms, including so-called "zero footprint" viewers, can provide convenient access to an interactively rendered version of whatever subset of the CSIDQ the user needs, with the appropriate quality, whether using client or server-side rendering, and irrespective of how and in what format the pixel data moves from server to client. Indeed, these same mechanisms may suffice even for the radiologist's viewing interface, as long as the necessary image quality is assured, there is access to the complete set, and the necessary tools are provided.

The moral being that the choice needs to be made by the user, and perhaps on the basis of whatever specific task they need to perform or question they want to answer. For any particular user (or type of user), there may be no single best answer that is generally applicable. For one patient, at one visit, the user might be satisfied with the report. On another occasion they might just want to illustrate something to the patient that requires only modest quality, and on yet another they might have a need to examine the study with the diligence that a radiologist would apply.

In other words, the user needs to be able to make the viewing quality choice dynamically. So, to enable the full spectrum of quality needs, the server needs to have the CSIDQ in the first place.


PS. By the way, do not take any of the foregoing to imply that irreversibly (lossy) compressed images are not of diagnostic quality. It is easy to make the erroneous assumptions that uncompressed images are diagnostic and compressed ones are not, or that DICOM images are uncompressed (when they may be encoded with lossy compression, including JPEG, even right off the modality in some cases), or that JPEG lossy compressed images supplied to a browser are not diagnostic. Sometimes they are and sometimes they are not, depending on the modality, task or question, method and amount of compression, and certainly last but not least, the display and viewing environment.

What "diagnostic quality" means and what constitutes sufficient quality and when, in general, and in the context of "Diagnostically Acceptable Irreversible Compression" (DAIC), are questions for another day. The point of this post is that the safest general solution is to preserve whatever came off the modality. Doing anything less than that might be safe and sufficient, but you need to prove it. Further, regardless of the quality of the pixel data, losing the DICOM "meta data" precludes many downstream use cases, including even simple size measurements.

PPS. This blog post elaborates on a principle that I attempted to convey during my recent testimony to the ONC HIT Standards Committee Clinical Operations Workgroup about standards for image sharing, which you can see, read or listen to if you have the stomach for it. If you are interested in the entire series of meetings at which other folks have testified or the subject has been discussed, here is a short summary, with links (or you can go to the group's homepage and follow the calendar links to past meetings, or to future meetings if you are interested in joining them):

2013-04-19 (initial discussion)
2013-06-14 (RSNA: Chris Carr, David Avrin, Brad Erickson)
2013-06-28 (RSNA: David Mendelson, Keith Dreyer)
2013-07-19 (lifeIMAGE: Hamid Tabatabaie, Mike Baglio)
2013-07-26 (general discussion)
2013-08-09 (general discussion)
2013-08-29 (standards: David Clunie)

Also of interest is the parent HIT Standards Committee:

2013-04-17 (establish goal of image exchange)

And the HIT Policy Committee:

2013-03-14 (prioritize image exchange)

PPPS. The concept of "complete set of images of diagnostic quality" was first espoused by an AMA Safety Panel that met with a group of industry folks (2008/08/27) to try to address the historical "CD problem". The problem was not the existence of the CD transport mechanism, which everyone is now eager to decry in favor of a network-based image sharing solution, but rather inconsistent formats, content and viewer behavior. The effort was triggered by a group of unhappy neurosurgeons in 2006 (AMA House of Delegates Resolution 539 A-06). They were concerned about potential safety issues caused by inadequate or delayed access to, or incomplete or inadequately displayed, MR images. To cut a long story short, a meeting with industry was proposed (Board of Trustees Report 30 A-07 and House of Delegates Resolution 523 A-08), and that meeting resulted in two outcomes.

One was the statement that we hammered out together in that clinical-industry meeting, which was attended not just by the AMA and MITA (NEMA) folks, but also representatives of multiple professional societies, including the American Association of Neurological Surgeons, Congress of Neurological Surgeons, American Academy of Neurology, American College of Radiology, American Academy of Orthopedic Surgeons, American College of Cardiology, American Academy of Otolaryngology-Head and Neck Surgery, as well as vendors, including Cerner, Toshiba, Philips, General Electric and Accuray, and DICOM/IHE folks like me. You can read a summary of the meeting, but the most important part is the recommendation for a standard of practice, which states in part:

"The American Medical Association Expert Panel on Medical Imaging (Panel) is concerned whether medical imaging data recorded on CD’s/DVD’s is meeting standards of practice relevant to patient care.  

The Panel puts forward the following statement, which embodies the standard the medical imaging community must achieve. 

  • All medical imaging data distributed should be a complete set of images of diagnostic quality in compliance with IHE-PDI.
This standard will engender safe, timely, appropriate, effective, and efficient care; mitigate delayed care and confusion; enhance care coordination and communication across settings of care; decrease waste and costs; and, importantly, improve patient and physician satisfaction with the medical imaging process."

More recently, the recommendation of the panel is incorporated in the AMA's discussion of the implementation of EHRs, in the Board of Trustees Report 24 A-13, which recognizes the need to "disseminate this statement widely".

The other outcome of the AMA-industry meeting was the development of the IHE Basic Image Review (BIR) Profile, intended to standardize the user experience when using any viewer. The original neurosurgeon protagonists contributed actively to the development of this profile, even to the extent of sacrificing entire days of their time to travel to Chicago to sit with us in IHE Radiology Technical Committee meetings. Sadly, adoption of that profile has been much less successful than the now almost universal use of IHE PDI DICOM CDs. Interestingly enough, with a resurgence of interest in web-based viewers, and with many new vendors entering the field, the BIR profile, which is equally applicable to both network and media viewers, could perhaps see renewed uptake, particularly amongst those who have no entrenched "look and feel" user interface conventions to protect.