Thursday, August 1, 2013

Lumpers vs Splitters - Anatomy and Procedures, Prefetching and Browsing

Summary: For remote access and pre-fetching, should one lump anatomic regions into a small number of categories, or retain the finer granularity inherent in the procedure codes and explicitly encoded in the images?

Long Version.

Of late you may have noticed a spate of posts from me about anatomy in the images, procedure codes, as well as pre-fetching. Needless to say these topics are related, and there is a reason for my recently renewed interest in researching these subjects.

You may or may not have noticed that in IHE XDS-I.b, there is a bunch of information included in the registry metadata that is specifically described for imaging use (see IHE RAD TF:3, section 4.68.4.1.2.3.2 XDSDocumentEntry Metadata).

The typeCode is supposed to contain the (single) Procedure Code. Unfortunately, since almost nobody currently uses standard sets of codes, these will usually contain local institution codes. So whilst their display name may be rendered and browseable by a human, they will not easily be recognized by a machine, e.g., for pre-fetching or hanging. The specification currently says typeCode should contain the Requested Procedure Code rather than the performed Procedure Code, which is an interesting choice, since what was requested is not always what was done.

There is also an eventCodeList that is currently defined to contain a code for the Modality (drawn from DICOM PS 3.16 CID 29), and one or more Anatomic Region codes (from DICOM CID 4).

Now, no matter where the anatomic codes come from (be they derived from the local or standard procedure codes, extracted from the images, from some mysterious out of band source, or entered by a human), there is a fairly long list of theoretical values and practical values that are actually encountered, depending on the scenario, whether it be radiology, cardiology, or some other specialty that is a source of images, like ophthalmology.

There are different potential human users of this information, whether it be radiologists viewing radiology images, those physicians who requested the imaging viewing radiology images (like an ophthalmologist requesting an MR of the orbits), or other specialists viewing their own images (like ophthalmologists, endoscopists, dermatologists, etc.). Even confining oneself to the radiology domain, the reasons for retrieving a set of images may vary.

One might think that there is no problem, since XDS-I.b requires that the anatomical information be present, and requires that it be drawn from a rich set of choices.

However, some folks seem to think that the set of choices of anatomical concepts is too rich and too long, and want to cut it down to just a short list, "lumping" a whole bunch of stuff together, rather than leaving it "split" into its fine-grained descriptions.

Why, one might ask, would one ever want to discard potentially useful information by such "coarsening" of the anatomical concepts in advance, when if there was a need to do so, one could easily do it on the querying end, when necessary?

So I did ask, and the result was a fairly vigorous and prolonged email "debate" back and forth between the "lumpers" and the "splitters". The net result is that neither side is convinced of the merits of the other's argument, and neither is interested in talking to the other anymore. So the process has stalled, and in the interim individual XDS "affinity domains" will do whatever they see fit, with their choices no doubt modulated by what their vendors are able or willing to deliver in this respect.

An obvious compromise would be to always send both coarse and fine codes. Unfortunately, since the eventCodeList is a flat list of codes, there is no easy way to communicate name-value pairs, and since coarse and fine grained anatomy come from the same coding scheme (SNOMED), there is no easy way to send both and distinguish them, which turns out to be important. At least not without a change to the underlying ITI requirements for XDS, and they are loath to make changes, apparently for fear of invalidating the installed base of XDS systems (modest sized though that might be at this early stage). Getting a slot added to send Accession Number was like pulling teeth from ITI, and nobody has the stomach for a repeat of that tedious exercise.
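To make the difficulty concrete, here is a minimal sketch in Python of what a recipient sees in a flat code list. The `Code` tuple, the `EXAMPLE-ANATOMY` scheme name and the `A-0001`/`A-0002` values are placeholders invented for illustration, not real SNOMED or XDS identifiers; only the shape of the problem matters.

```python
from typing import NamedTuple


class Code(NamedTuple):
    value: str    # code value (placeholder, not a real SNOMED identifier)
    scheme: str   # coding scheme designator
    display: str  # human-readable meaning


# A flat list carrying a modality plus BOTH a coarse and a fine anatomy code.
# The anatomy values and scheme name are illustrative placeholders.
event_code_list = [
    Code("CT", "DCM", "Computed Tomography"),
    Code("A-0001", "EXAMPLE-ANATOMY", "Abdomen"),
    Code("A-0002", "EXAMPLE-ANATOMY", "Colon"),
]


def anatomy_entries(codes, anatomy_scheme="EXAMPLE-ANATOMY"):
    """Return the entries a recipient can identify as anatomy by scheme alone."""
    return [c for c in codes if c.scheme == anatomy_scheme]


anatomy = anatomy_entries(event_code_list)

# Both entries are the same kind of thing as far as the list is concerned:
# same scheme, and no slot saying which is the coarse region and which is
# the fine-grained focus.
schemes_seen = {c.scheme for c in anatomy}
print(len(anatomy), schemes_seen)
```

A recipient could only tell the two apart by already knowing, out of band, which concepts a given affinity domain treats as "coarse".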

The context in which this arose initially was pre-fetching. One reasonable approach is pre-fetching all those studies in the same coarse group as the current study, and the expectation is that this would be better than pre-fetching everything, or nothing, or relying on workflow related reasons, such as pre-fetching the most recent studies or the study that one actually ordered in the first place, or studies of the same modality, or intended for the same recipient, etc.

However, one can potentially do a better job of pre-fetching if one applies more granular rules, and this is particularly the case when one has a specific clinical question or task to perform.

An example may help. Suppose one is interested in, say, a patient's screening virtual CT colonoscopy, whether one is a radiologist reporting it, or the ordering physician. And one wants to compare it with previous virtual CT colonoscopy. Should one pre-fetch all CTs of the abdomen for comparison (and there may be quite a few given that they are handed out in the emergency room like candies), not to mention whole body CT-PET scans that include the abdomen, etc.? Or should one pre-fetch only CTs of the colon? Now, if one could match procedure codes, and there was only one or a limited number of procedure codes for CT colonoscopy, one could match on that and ignore all the extraneous studies. But we have already established that procedure codes are currently largely non-standardized, and in any reasonably sized enterprise that has grown through acquisition or changed its EHR or RIS lately (can you say MU?), there may be a multitude of different coding schemes used in the archives.

So, the lumpers would say, send abdomen for the anatomy, and pre-fetch them all. The splitters would say send colon for the anatomy, and pre-fetch whatever comes out of rules you want to apply at the requesting end (lump with other abdomens if you want to, or not, depending on your preference, or the sophistication of your rules, and your knowledge of the question).

The clinical question really is important. If you are a vascular surgeon wondering about change in size of an aortic aneurysm, you might really want any imaging that included the abdominal aorta, for whatever reason, and not just cardiovascular images, and CT colonoscopy would include useful images in the axial set.

One can come up with all sorts of similar examples, perfusion brain CT or petrous temporal bone CT versus any head CT, coronary or pulmonary CT angiogram versus any chest CT, etc. Beyond radiology, does an ophthalmologist want all head and neck, or just eyes, or just retinas?

The "lumping" strategy required also depends on the use, since there may be potential ambiguities. Is a cervical spine lumped into "spine" or does it go with "head and neck", for example, and with multiple contribution sources, will they implement the same lumping decisions?
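The cervical spine ambiguity can be illustrated with a small Python sketch. The two lumping tables and the study identifiers below are hypothetical; they simply assume two contributing sources that made different, equally defensible, lumping decisions before registering.

```python
# Hypothetical lumping tables for two contributing sources.
SOURCE_A_LUMPING = {"cervical spine": "spine", "lumbar spine": "spine"}
SOURCE_B_LUMPING = {"cervical spine": "head and neck", "lumbar spine": "spine"}


def register(fine_anatomy, lumping):
    """Return the coarse label a source would register for a study."""
    return lumping[fine_anatomy]


# Two cervical spine studies of the same patient, registered by different sources.
registry = [
    ("study-1", register("cervical spine", SOURCE_A_LUMPING)),
    ("study-2", register("cervical spine", SOURCE_B_LUMPING)),
]


def query(entries, coarse_label):
    """Return study ids whose registered coarse label matches exactly."""
    return [study_id for study_id, label in entries if label == coarse_label]


spine_hits = query(registry, "spine")
print(spine_hits)  # ['study-1']
```

The second study is a cervical spine too, but a recipient asking for "spine" never sees it, and nothing in the registry reveals that it was missed.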

The point is that the requirements on the receiving end cannot be anticipated when the studies are registered in the first place; they are known only once the question is asked. Accordingly, in my opinion the richness should be recorded in the registry and available in the query, and the pre-fetching decision-making, including any "lumping" if appropriate, should be performed at the receiving end.
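As a rough sketch of what "lumping at the receiving end" could look like, the Python below keeps the fine-grained anatomy on each registered study and lets the recipient choose, per request, whether to match finely or coarsely. The `Study` class, the `COARSE_GROUP` table and the plain-string anatomy labels are assumptions made for illustration; a real system would use coded concepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    study_id: str
    modality: str
    anatomy: str  # fine-grained anatomy as registered (plain label for brevity)


# Recipient-owned lumping table (hypothetical); it can differ per site or per user.
COARSE_GROUP = {
    "colon": "abdomen",
    "liver": "abdomen",
    "abdomen": "abdomen",
    "chest": "chest",
}


def prefetch(current, priors, lump):
    """Select priors by exact anatomy, or by the recipient's coarse group if lump is True."""
    if not lump:
        return [p for p in priors if p.anatomy == current.anatomy]
    target = COARSE_GROUP.get(current.anatomy)
    if target is None:
        return []  # no lumping rule known for this anatomy
    return [p for p in priors if COARSE_GROUP.get(p.anatomy) == target]


priors = [
    Study("prior-1", "CT", "colon"),
    Study("prior-2", "CT", "abdomen"),
    Study("prior-3", "CT", "liver"),
    Study("prior-4", "CT", "chest"),
]
current = Study("current", "CT", "colon")

fine_ids = [p.study_id for p in prefetch(current, priors, lump=False)]
coarse_ids = [p.study_id for p in prefetch(current, priors, lump=True)]
print(fine_ids)
print(coarse_ids)
```

With fine codes in the registry both behaviors are available; had only "abdomen" been registered for every study, the first, narrower result could not be recovered.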

Retaining the more granular information is particularly important when one considers the possibility of using more sophisticated artificial intelligence approaches to pre-fetching, rather than simple heuristics or manually authored rules; you will find some references to those techniques in my recent pre-fetching post. Adaptive systems can learn what individual users (or sets of users in the same role) need based on what they are observed to actually view. But even simple rule-based pre-fetchers can be more sophisticated than just using a coarse list (e.g., the RadMapps approach based on string study descriptions).

Besides, if one believes in "lumping", it is not as if the task is very burdensome, no matter where it is performed, given the modest numbers of codes to deal with. Though I described the list of fine grained codes as "fairly long" earlier, it isn't really that long. Even were one to need to select from the list in a user interface, just like for a user interface for procedures (a much longer list than anatomic locations), there are tactics for presenting long lists in an easily navigable manner.

It is interesting to consider the history of the DICOM list in this respect. Over time the list has grown, from the original 19 that were CR-specific in DICOM 1993, to contain now 112 string values for Body Part Examined, most of which have been added to reflect experience in the field (e.g., what CR vendors started to send when they couldn't find a good match, or what other modalities needed). DICOM defines the SNOMED coded equivalents of all of those, plus various others that are used in specific objects (especially cardiology objects, and those for echocardiography in particular); the total is 340 coded concepts at the moment, many of which are not relevant to the application of anatomic region for a procedure for a registry and wouldn't be applicable, and some of which are the same concept but different meanings for different contexts (e.g., X and endo-X with same code). This is all summarized in DICOM PS 3.16 Annex L, which is related to CID 4. There are probably a few too many highly specific cardiovascular locations that got pulled in this way. There are a few specialties that have separate lists, e.g., ophthalmology, which have not been folded into Annex L yet, and do not have string equivalents for coded values. These lists may not be perfect, but they are a line in the sand and do reflect what people have asked for, over 20 years of experience with the standard.

So, in short, no more than a few hundred codes probably need to be mapped from the procedure codes (or acquired by some other means) at the sending end. And at the receiving end, no more than that few hundred codes need to be "lumped" to apply coarse pre-fetching rules, if that floats your boat. And since all the anatomy codes defined in DICOM CID 4 are SNOMED codes, the mapping is already right there for the implementer to extract in the relationships present in the SNOMED files.
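A sketch of how such a mapping might be derived from hierarchy relationships follows. The `PARENTS` dictionary is a toy stand-in that I made up for illustration; it is not extracted from SNOMED, and the actual relationships and their types would need to be read from the distributed files. The idea is only that one climbs from a fine concept until reaching a member of whatever coarse set one prefers.

```python
from collections import deque

# Toy child -> parents links (hypothetical, NOT real SNOMED content).
PARENTS = {
    "colon": ["large intestine"],
    "large intestine": ["abdomen"],
    "pituitary": ["brain"],
    "brain": ["head"],
    "cervical spine": ["spine", "neck"],
}

# The recipient's preferred coarse categories (also an assumption).
COARSE = {"abdomen", "head", "neck", "spine"}


def coarse_ancestors(concept, parents=PARENTS, coarse=COARSE):
    """Breadth-first climb from a concept to every coarse category it falls under."""
    found = set()
    seen = {concept}
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        if node in coarse:
            found.add(node)
            continue  # stop climbing once a coarse category is reached
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return found


print(sorted(coarse_ancestors("colon")))
print(sorted(coarse_ancestors("cervical spine")))
```

Returning a set rather than a single value is deliberate: a concept with more than one route upward (cervical spine in this toy data) lands in more than one coarse category, and the recipient can decide what to do about that.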

One concern that has been expressed is that there are too many anatomical codes to map to from one's local procedure code, and it is easier to map to a short list. I would argue the opposite, in that it is easier to map "XR wrist" to "wrist" than to "upper extremity", or "MR Pituitary" to "pituitary" rather than "head and neck". That is, a literal mapping doesn't require a knowledge of anatomy. Not to mention the fact that the better approach is to map one's local procedure codes to standard procedure codes (like SNOMED or LOINC or RadLex Playbook) in the first place, then extract the anatomy automatically from the ontologies that back those standards.
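To show how little machinery the literal mapping needs, here is a minimal Python sketch that looks for known anatomy terms in a local procedure description. The `ANATOMY_TERMS` list is a tiny hypothetical vocabulary and the matching is deliberately naive (whole words, case-insensitive); it is an illustration of the argument, not a recommended production matcher.

```python
import re

# Tiny illustrative vocabulary of fine-grained anatomy terms (an assumption).
ANATOMY_TERMS = ["wrist", "pituitary", "colon", "cervical spine"]


def literal_anatomy(description, terms=ANATOMY_TERMS):
    """Return the vocabulary terms that literally appear in a procedure description."""
    text = description.lower()
    matches = []
    # Try longer terms first so multi-word terms are reported ahead of shorter ones.
    for term in sorted(terms, key=len, reverse=True):
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            matches.append(term)
    return matches


print(literal_anatomy("XR wrist"))
print(literal_anatomy("MR Pituitary"))
print(literal_anatomy("XR Chest"))
```

No anatomical knowledge is encoded anywhere in this function; mapping "XR wrist" to a coarse category instead would require an additional table saying which limb a wrist belongs to.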

I asked a bunch of radiologists in the US and Australia what their preference was, fine or coarse grained anatomy, and they all expressed a preference for retaining the fine grained concept.

A similar sentiment was expressed by several UK radiologists in the UK Imaging Informatics Group when a short list was suggested. The interest in "lumping" in the UK is particularly surprising, when one considers that they all have to use the NICIP codes, which are not only already mapped to SNOMED, but are also already mapped to OPCS-4, which already contains fine-grained anatomy codes (their Z codes and O codes). If you read the UK forum posts carefully though, you will see a distinction suggested between using their standard procedure (rather than anatomy) code for plain radiography pre-fetching, versus "lumping" anatomy for cross-sectional modalities.

Anyhow, I am not certain that I have convinced anyone who already has their mind made up (that coarse codes are sufficient), nor anyone who is for some reason more intimidated by the comprehensive fine grained list in DICOM CID 4 than by a short and arbitrary list.

Personally though, given the limitations inherent in the XDS metadata model, I remain convinced that the more precise information is valuable, and the coarse information not only limits what a recipient can find but contaminates the information with noise (claiming more territory was imaged than actually was). Not only does this undermine the utility of XDS, but it creates an artificial distinction between what is possible using local PACS protocols like DICOM queries as opposed to cross-enterprise protocols, when instead we should be working to make such artificial distinctions transparent to the user. In my opinion, the remote user deserves the same level of pre-fetching and manual browsing performance that is achievable locally.

What do you think?

David

It is interesting to consider what concepts might be included in a lumped list.

The original IHE CP, which triggered this debate, proposed a list that consisted of:

Abdomen
Cardiovascular
Cervical Spine
Chest
Entire Body
Head
Lower Extremity
Lumbar Spine
Neck
Pelvis
Thoracic Spine
Upper Extremity

Not much use if you are a mammographer looking for last year's priors, for example, so at the very least it would make sense to add Breast.

The proposed UK forum list was initially:

Abdo
Body (esp for overlapping CT body areas)
Chest
Head
Heart
Lower Limb
Misc
Neck
Pelvis
Spine
Upper Limb
Vessels

to which there were later suggestions in the forum to add Breast and Bowel.

When the ACR ITIC was discussing appropriateness criteria work, it had found it helpful to group procedures for that specific purpose, and the list was:

Abdomen
Breast
Cardiac
Chest
Head
Lower extremity
Maxface-dental
Neck
Pelvis
Spine
Unspecified
Upper extremity
Whole body

Another source of interest is the RadLex PlayBook, which categorizes procedures by Body Region (e.g., abdomen), a very short list by comparison with the more fine-grained Anatomic Focus (e.g., pancreas) that is also used. That list is:

Abdomen
Abdomen and Pelvis
Bone
Breast
Cervical Spine
Chest
Face
Head
Lower Extremity
Lumbar Spine
Lumbosacral Spine
Neck
Pelvis
Spine
Thoracic Spine
Thoracolumbar Spine
Upper Extremity
Whole Body

The Canadian DI Standards collaborative working group (SCWG 10) short list for XDS (after they were not convinced by my argument that no short list is necessary) is currently proposed to be:

Abdomen
Breast
Cardiovascular
Cervical Spine
Chest
Entire Body
Head
Lower Extremity
Lumbar Spine
Neck
Pelvis
Thoracic Spine
Upper Extremity

When I asked various radiologists what they would prefer if they were forced to live with a coarse list only, one proposal was:

Abdomen
Breast
Cardiac
Cardiovascular (not heart)
Cervical Spine
Chest
Entire/Whole Body
Facial/dental
Head
Lower Extremity
Lumbar Spine
Neck
Pelvis
Thoracic Spine
Unspecified
Upper Extremity

There was then a discussion about whether Face should be separated from Brain within Head, and then what one should do about Base of Skull and Inner Ear, which serves to emphasize my point that it is difficult to come up with a list that satisfies every constituent.

To be fair, putting aside the fact that "unspecified" is undesirable, and that combined body parts may not be needed since one can send multiple codes (IHE XDS-I permits this), there is a lot of similarity between the proposals.

One might wonder about the apparent obsession with lumping regions within an upper or lower extremity category, and why one would want shoulders with wrists, etc. I suppose it might reflect the continuum of radiographic views that extend along the limbs (e.g., does humerus include shoulder and elbow?). Then again, if one were doing a skeletal survey for metastases one might want a category of Bone instead, in which would be included Skull, and all Spines, and Pelvis, and Chest (for ribs). Or for a skeletal survey for arthritis, just Joints and Spine perhaps.

What would your list be, if you needed one?

4 comments:

Unknown said...

Thank you for the thoughtful post. The debate rages on. Two other significant use cases, beyond pre-fetch and hanging protocols, came to mind: work allocation and reporting template mapping. Depending on the practice, having a very fine-grained list is useful to present just the right cases to the right specialist. Reporting templates provide greatest efficiency when tuned to the specific clinical question, which in many cases can be inferred from the body part examined.
Pre-fetching is the area where a coarse grain is probably most useful, although it does by nature risk greater resource consumption. It has been my experience that radiologists want ready access to related structures and may choose to wander a bit to satisfy curiosity. Having loosely related priors immediately available is often desired.
So, with competing requirements, perhaps your suggestion of both lumped and split is the answer. But does this exist in the known universe?

David Clunie said...

At SIIM 2014, Helen Chen described some useful work on using the relationships between anatomical parts to identify relevant priors; see "http://siim.org/siim2014/scientific-program/radbodykb-ontology-based-web-service-enhanced-search-anatomic-location".

Unknown said...

One challenge when trying to use SNOMED CT terms for Body Part is that the human readable terms often exceed the 16 character limit of DICOM Body Part Examined (BPE). This then leads to non-standard truncations of standardized terms, defeating the value.

David Clunie said...

Quite so ... that's why we have Annex L of PS3.16 to define shorter, capitalized versions ... it is incomplete though, but covers the most common ones. I should probably do a CP to fill out the lot. Let me know if there are any in particular you think have priority.