Sunday, September 29, 2013

You're gonna need a bigger field (not) ... Radix 64 Revisited

Summary: It is easy to fit a long number in a short string field by transcoding it to use more (printable) characters; the question is what encoding to use; there are more alternatives than you might think, but Base64 is the pragmatic choice.

Long Version.

Every now and then the subject of how to fit numeric SNOMED Concept IDs (defined by the SNOMED SCTID Data Type) into a DICOM representation comes up. These can be up to 18 decimal digits (and fit into a signed or unsigned 64 bit binary integer), whereas in DICOM, the Code Value has an SH (Short String) Value Representation (VR), hence is limited to 16 characters.

Harry Solomon suggested "Base64" encoding it, either always, or on those few occasions when the Concept ID really was too long (and then using a "prefix" to the value to recognize it).

The need arises because DICOM has always used the "old fashioned" SNOMED-RT style SnomedID values (like "T-A0100" for "Brain") rather than the SNOMED-CT style SNOMED Concept ID values (like "12738006"). DICOM was a relatively "early adopter" of SNOMED, and the numeric form did not exist in the early days (prior to the incorporation of the UK Read Codes that resulted in SNOMED-CT). Fortunately, SNOMED continues to issue the older style codes; unfortunately, folks outside the DICOM realm may need to use the newer style, and so converting at the boundary is irritating (and needs a dictionary, unless we transmit both). The negative impact on the installed base that depends on recognizing the old-style codes, were we to "change", is a subject for another day; herein I want to address only how it could be done.

Stuffing long numbers into short strings is a generic problem, not confined to using SNOMED ConceptIDs in DICOM. Indeed, this post was triggered as a result of pondering another use case, stuffing long numbers into Accession Number (also SH VR). So I thought I would implement this to see how well it worked. It turns out that there are a few choices to be made.

My first pass at this was to see if there was something already in the standard Java class library that supported conversion of arbitrary length base10 encoded integers into some other radix; I did not want to be constrained to only handling 64 bit integers.

It seemed logical to look at the arbitrary length numeric java.math.BigInteger class, and indeed it has a radix argument to its String constructor and toString() methods. It also has constructors based on two's-complement binary representations in byte[] arrays. Sounded like a no-brainer.

Aargh! It turns out that BigInteger has an implementation limit on the size of the radix that it will handle. The maximum radix is 36 (the 10 digits plus the 26 lowercase alphabetic characters, which is the ceiling imposed by java.lang.Character.MAX_RADIX); worse, for a larger radix toString() does not even throw, it just quietly falls back to radix 10. Bummer.

OK, I thought, I will hand write it, by doing successive divisions by the radix in BigInteger, encoding each remainder as a character, and accumulating the resulting characters in the correct order. Turned out to be pretty trivial.
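Just to make the approach concrete, here is a minimal sketch of that successive-division loop, shown in JavaScript using the arbitrary precision BigInt type for brevity rather than the Java BigInteger I actually used (an illustration, not the real code); the character set is supplied as a string indexed by digit value, which is where the choices discussed next come in:

    // Minimal sketch: encode a non-negative BigInt by successive division by the
    // radix, with the "digit" characters supplied as a string indexed by value.
    function encodeWithAlphabet(value, alphabet) {
        const radix = BigInt(alphabet.length);
        if (value === 0n) return alphabet[0];
        let encoded = "";
        while (value > 0n) {
            encoded = alphabet[Number(value % radix)] + encoded;   // prepend the next least significant digit
            value = value / radix;                                 // BigInt division truncates
        }
        return encoded;
    }

    const base64    = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";   // RFC 2045/4648
    const base64hex = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ:_";   // hypothetical, see below

    console.log(encodeWithAlphabet(18446744073709551615n, base64));      // "P//////////"
    console.log(encodeWithAlphabet(18446744073709551615n, base64hex));   // "f__________"

The same function works for any radix up to the length of whatever alphabet you care to supply.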

Then I realized that I now had to choose which characters to use beyond the 36 that Java uses. At which point I noticed that BigInteger uses completely different characters than the traditional "Base64" encoding. "Base64" is the encoding used by folks who do anything that depends on MIME content encoding (email attachments or XML files with embedded binary payloads), as defined in RFC 2045. There are also variants of "Base64" that handle situations where the two characters for 62 and 63 (normally '+' and '/' respectively) are problematic, e.g., in URLs (RFC 4648). Indeed, RFC 4648 seems to be the most current definition of not only "Base64" and its variants, but also "Base32" and "Base16" and the so-called "extended hex" variants of them.

If you think about it, based on the long-standing hexadecimal representation convention that uses characters '0' to '9' for numeric values [0,9], then characters 'a' to 'f' for numeric values [10,15], it is pretty peculiar that "Base64" uses capital letters 'A' to 'J' for numeric values [0,9], and uses the characters '0' to '9' to represent numeric values [52,61]. Positively unnatural, one might say.

This is what triggered my dilemma with the built-in methods of the Java BigInteger. BigInteger returns strings that are a natural progression from the traditional hexadecimal representation, and indeed for a radix of 16 or a radix of 32, the values match those from the RFC 4648 "base16" and "base32hex" (as distinct from "base32") representations. Notably, RFC 4648 does NOT define a "base64hex" alternative to "base64", which is a bit disappointing.

It turns out that a long time ago (1992) in a galaxy far, far away, this was the subject of a discussion between Phil Zimmermann (of PGP fame) and Marshall Rose and Ned Freed on the MIME working group mailing list, in which Phil noticed this discrepancy and proposed it be changed. His suggestion was rejected on the grounds that it would not improve functionality and would threaten the installed base, and was made at a relatively late stage in development of the "standard". The choice of the encoding apparently traces back to the Privacy Enhanced Mail (PEM) RFC 989 from 1987. I dare say there was no love lost between Phil and the PEM/S-MIME folks, given that they were developers of competing methods for secure email, but you can read the exchange yourself and make up your own mind.

So I dug a little deeper, and it turns out that The Open Group Base (IEEE Std 1003.1) (POSIX, Single Unix Specification) has a definition for how to encode radix 64 numbers as ASCII characters too, in the specification of the a64l() and l64a() functions, which uses '.' (dot) for 0, '/' for 1, '0' through '9' for [2,11], 'A' through 'Z' for [12,37], and 'a' through 'z' for [38,63]. Note that this is not part of the C standard library.

An early attempt at stuffing binary stuff into printable characters was used by the "uuencode" utility used in Unix-to-Unix copy (UUCP) implementations, such as was once used for mail transfer. It used the expedient of adding 32 (the US-ASCII code for the space character) to each 6 bit (base 64) numeric value, which yields a range of printable characters.

Of course, from the perspective of stuffing a long decimal value into a short string and making it fit, it doesn't matter which character representation is chosen, as long as it is valid. E.g., a 64 bit unsigned integer, which has a maximum value of 18,446,744,073,709,551,615, or 20 decimal digits, needs only 11 characters when encoded with a radix of 64 (each character carries 6 bits, and 11 x 6 = 66 >= 64), regardless of the character choices.

For your interest, here is what each of the choices described above looks like, for single numeric values [0,63], and for the maximum unsigned 64 bit integer value:

Extension of Java and base16hex to hypothetical "base64hex":
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z : _
f__________


Unix a64l:
 . / 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z
Dzzzzzzzzzz

Base64 (RFC 2045):
 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 + /
P//////////


uuencode (note that space is the first character):
   ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _
/__________


Returning to DICOM then, the choice of what to use for a Short String (SH) VR is constrained to be any US-ASCII (ISO IR 6) character that is not a backslash (used as a value delimiter in DICOM) and not a control character. This would exclude the uuencode representation, since it contains a backslash, but any of the other choices would produce valid strings. The SH VR is case-preserving, which is a prerequisite for all of the choices other than uuencode. Were that not the case, we would need to define yet another encoding that was both case-insensitive and did not contain the backslash character. I can't think of a use for packing numeric values into the Code String (CS) VR, the only DICOM VR that is restricted to upper case.
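Concretely, checking whether a candidate encoded value will survive in an SH attribute amounts to little more than the following sketch (JavaScript again, and the function name is mine):

    // Minimal sketch: is this string usable as a DICOM SH (Short String) value?
    // At most 16 characters, printable US-ASCII only, and no backslash.
    function isUsableAsShortString(value) {
        if (value.length > 16) {
            return false;
        }
        for (let i = 0; i < value.length; ++i) {
            const code = value.charCodeAt(i);
            if (code < 0x20 || code > 0x7e || code === 0x5c) {   // control character, non-ASCII, or backslash
                return false;
            }
        }
        return true;
    }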

The more elegant choice in my opinion would be the hypothetical "base64hex", for the reasons Phil Z eloquently expressed, but ...

Pragmatically speaking, since RFC 989/1113/2045/4648-style "Base64" coding is so ubiquitous these days for bulk binary payloads, it would make no sense at all to buck that trend.

Just to push the limits though, if one uses all 94 of the printable US-ASCII characters other than backslash (ignoring the niggle that space is the DICOM string padding character), one can squeeze the largest unsigned 64 bit integer into 10 rather than 11 characters. However, for the longest (18 decimal digit) SNOMED Concept ID, the length of the result is the same whether one uses a radix of 64 or 94, still 10 characters.

David


Thursday, September 12, 2013

What Template is that?

Summary: Determining what top-level template, if any, has been used to create a DICOM Structured Report can be non-trivial. Some SOP Classes require a single template, and an explicit Template ID is supposed to always be present, but if it isn't, the coded Document Title is a starting point, though not always an unambiguous one.

Long Version.

When Structured Reports were introduced into DICOM (Supplement 23), the concept of a "template" was somewhat nebulous, and was refined over time. Accordingly, the requirement to specify which template was used, if any, to author and format the content, was, and has remained, fairly weak.

The original intent, which remains the current intent, is that if a template was used, its identity should be explicitly encoded. A means for doing so is the Content Template Sequence. Originally this was potentially encoded at each content item, but this was later clarified by CP 452. In short, the identification applies only to CONTAINER content items, and in particular to the root content item, and consists of a mapping resource (DCMR, in the case of templates defined in PS 3.16) and a string identifier.

The requirement on its presence is:

"if a template was used to define the content of this Item, and the template consists of a single CONTAINER with nested content, and it is the outermost invocation of a set of nested templates that start with the same CONTAINER"

Since the document root is always a container, whenever one of the templates that defines the entire content tree of the SR is used, then by definition, an explicit Template ID is required to be present.

That said, though most SR producers seem to get this right, sometimes the Template ID is not present, which presents a problem. I don't think this can be excused by lack of awareness of the requirement, or by failure to notice CP 452 (from 2005), since the original requirement in Sup 23 (2000) read:

"Required if a template was used to define the content of this Item".

Certainly CP 452 made things clearer though, in that it amended the definition to apply not only to the content item, but also to "its subsidiary" content items.

Some SR SOP Classes define a single template that shall be used, the KOS being one example and the CAD family (Mammo, Chest and Colon CAD) being others. So, even if an explicit Template ID is not present, the expected template can be deduced from the SOP Class. Sometimes though, such instances are encoded as generic (e.g., Comprehensive) SR, perhaps because an intermediate system did not support the more specific SOP Class, and so one still needs to check for the template identifier.

In the absence of a specific SOP Class or an explicit template identifier, what is a poor recipient to do? One clue can be the concept name of the top level container content item, which is always coded, and always present, and which is referred to as the "document title". In many cases, within the scope of PS 3.16, the same coded concept is used only for a single root template. For example, (122292, DCM, "Quantitative Ventriculography Report") is used only for TID 3202. That's helpful, at least as long as nobody other than DICOM (like a vendor) has re-used the same code to head a different template.

Other situations are more challenging. The basic diagnostic reporting templates, e.g., TID 2000, 2005 or 2006, are encoded in generic SOP Classes and furthermore don't have a single or unique code for the document title; rather, any code can be used, and a defined set of them, drawn from LOINC, corresponds to common radiological procedures. It is not at all unlikely that some other completely different template might be used with the same code as (18747-6, LN, "CT Report") or (18748-4, LN, "Diagnostic Imaging Report"), for instance.

One case of interest demonstrates that in the absence of an explicit Template ID, even a specific SOP Class and a relatively specific Document Title are insufficient. For Radiation Dose SRs, the same SOP Class is used for both CT and Projection X-Ray. Both TID 10001 Projection X-Ray Radiation Dose and TID 10011 CT Radiation Dose have the same Document Title, (113701, DCM, "X-Ray Radiation Dose Report").

One can go deeper into the tree though. One of the children of the Document Title content item is required to be (121058, DCM, "Procedure reported"). For a CT report, it is required to have an enumerated value of (P5-08000, SRT, "Computed Tomography X-Ray"), whereas for a Projection X-Ray report, it may have a value of (113704, DCM, "Projection X-Ray") or (P5-40010, SRT, "Mammography"), or something else, because these are defined terms.
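By way of illustration, the heuristic for this particular case might look something like the following sketch (the function and argument names are hypothetical, and the coded entries are assumed to have already been extracted from the content tree by whatever parser is in use):

    // Hypothetical sketch: guess the root template of a Radiation Dose SR that lacks an
    // explicit Template ID, using the Document Title and the Procedure Reported code.
    function guessDoseTemplate(documentTitle, procedureReported) {
        if (documentTitle.value !== "113701" || documentTitle.scheme !== "DCM") {
            return null;                       // not an X-Ray Radiation Dose Report at all
        }
        if (procedureReported.value === "P5-08000" && procedureReported.scheme === "SRT") {
            return "TID 10011";                // CT Radiation Dose (enumerated value)
        }
        return "TID 10001";                    // Projection X-Ray Radiation Dose (defined terms, so only a guess)
    }

    // e.g., guessDoseTemplate({value: "113701", scheme: "DCM"}, {value: "113704", scheme: "DCM"}) returns "TID 10001"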

So, in short, at the root level, the absence of a Template ID is not the end of the world, and a few heuristics might be able to allow a recipient to proceed.

Indeed, if one is expecting a particular pattern based on a particular template, and that pattern "matches" the content of the tree that one has received, does it really matter? It certainly makes life easier though, to match a top level identifier, rather than having to write a matching rule for the entire tree.

Related to the matter of the identification of the "root" or "top level" template is that of recognizing subordinate or "mini" templates. As you know, most of PS 3.16 is taken up not by monstrously long single templates but rather by invocation of sub-templates. So there are sub-templates for identifying things, measuring things, etc. These are re-used inside lots of application-specific templates.

Certainly "top-down" parsing from a known root template takes one to content items that are expected to be present based on the "inclusion" of one of these sub-templates. These are rarely, if ever, explicitly identified during creation by a Template ID, even though one could interpret that as being a requirement if the language introduced in CP 452 is taken literally. Not all "included" sub-templates start with a container, but many do. I have to admit that most of the SRs that I create do not contain Template IDs below the Document Title either, and I should probably revisit that.

Why might one want to be able to recognize such a sub-template?

One example is being able to locate and extract measurements or image coordinate references, regardless of where they occur in some unrecognized root template. An explicit Template ID might be of some assistance in such cases, but pattern matching of sub-trees can generally find these pretty easily too. When annotating images based on SRs, for example, I will often just search for all SCOORDs, and explore around the neighborhood content items to find labels and measurements to display. Having converted an SR to an XML representation also allows one to use XSL-T match() clauses and an XPath expression to select even complex patterns, without requiring an explicit ID.
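For example, assuming an XML representation in which each content item is represented by an element named for its value type (toolkits differ, so treat the element name as an assumption), finding all the SCOORDs from JavaScript is a one-expression affair:

    // Sketch: locate SCOORD content items in an XML rendering of an SR, assuming
    // elements are named after their value types (scoord, num, text, ...).
    function findScoords(xmlDocument) {
        const found = [];
        const iterator = xmlDocument.evaluate("//scoord", xmlDocument, null,
                                              XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
        for (let node = iterator.iterateNext(); node !== null; node = iterator.iterateNext()) {
            found.push(node);   // then explore the neighboring content items for labels and measurements
        }
        return found;
    }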

David


Saturday, September 7, 2013

Share and share alike - CSIDQ

Summary: Image sharing requires the availability (download and transmission) of a complete set of images of diagnostic quality (CSIDQ), even if for a particular task, viewing of a lesser quality subset may be sufficient. The user then needs to be able to decide what they need to view on a case-by-case basis.

Long Version.

The title of this post comes from the legal use of the term "share and share alike", the equal division of a benefit from an estate, trust, or gift.

In the context of image sharing, I mean to say that all potential recipients of images, radiologists, specialists, GPs, patients, family, and yes, even lawyers, need to have the means to access the same thing: a complete set of images of diagnostic quality (CSIDQ). Note the emphasis on "have the means". CSIDQ seems to be a less unwieldy acronym than CSoIoDQ, so that's what I will use for notational convenience.

There are certainly situations in which images of lesser quality (or less than a complete set) might be sufficient, might be expedient, or indeed might even be necessary to enable the use case. A case in point being the need to make an urgent or rapid decision remotely when there is only a slow link available.

For folks defining architectures and standards, and deploying systems to make this happen, it is essential to assure that the CSIDQ is available throughout. In practice, this translates to requiring that
  • the acquisition modality produce a CSIDQ,
  • the means of distribution (typically a departmental or enterprise PACS) in the local environment store and make available a CSIDQ,
  • the system of record where the acquired images are stored for archival and evidential purposes contain a CSIDQ,
  • any exported CD or DVD contain a CSIDQ,
  • any point-to-point transfer mechanism be capable of supporting transfer of a CSIDQ,
  • any "edge server" or "portal" that permits authorized access to the locally stored images be capable of sharing a CSIDQ on request,
  • any "central" archive to which images are stored also retain and be capable of distributing a CSIDQ, and
  • any "clearinghouse" that acts as an intermediary be capable of transferring a CSIDQ.
These requirements apply particularly to the "Download" and "Transmit" parts of the Meaningful Use "View, Download and Transmit" (VDT) approach to defining sharing, as it applies to images and imaging results.

In other words, it is essential that whatever technologies, architectures and standards are used to implement Download and Transmit be capable of supporting a CSIDQ. Otherwise, anything that is lost early in the "chain of custody", if you will, is not recoverable later when it is needed.

From a payload perspective, the appropriate standard for a CSIDQ is obviously DICOM, since that is the only widely (universally) implemented standard that permits the recipient to make full use of the acquired images, including importation, post-processing, measurement, planning, templating, etc. DICOM is the only format whose pixel data and meta data all medical imaging systems can import.

That said, it may be desirable to also provide Download of a subset, or a subset of lesser quality, or in a different format, for one reason or another. In doing so it is vital not to compromise the CSIDQ principle, e.g., by misleading a recipient (such as a patient or a referring physician) into thinking that anything less than a CSIDQ that has been downloaded is sufficient for future use (e.g., subsequent referrals). And it is vital not to discard the DICOM format meta data. EHR and PHR vendors need to be particularly careful about not making expedient implementation decisions in this regard that compromise the CSIDQ principle (and hence may be below the standard of practice, may be misleadingly labelled, may introduce the risk of a bad outcome, and may expose them to product liability or regulatory action).

Viewing is an entirely different matter, however.

Certainly, one can download a CSIDQ and then view it, and in a sense that is what the CD/DVD distribution mechanism is ... a "thick client" viewer is either already installed or executed from the media to display the DICOM (IHE PDI) content. This approach is typically appropriate when one wants to import what has been downloaded (e.g., into the local PACS) so that it can be viewed along with all the other studies for the patient. This is certainly the approach that most referral centers will want to adopt, in order to provide continuity of patient care coupled with familiarity of users with the local viewing tools. It is also equally reasonable to use for an "in office" imaging system, as I have discussed before. It is a natural extension of the current widespread CD importation that takes place, and the only difference is the mode of transport, not the payload.

For sporadic users though, who may have no need to import or retain a local copy of the CSIDQ, many other standard (WADO and XDS-I) and proprietary alternatives exist for viewing. Nowadays web-based image viewing mechanisms, including so-called "zero footprint" viewers, can provide convenient access to an interactively rendered version of that subset of the CSIDQ that the user needs access to, with the appropriate quality, whether using client or server-side rendering, and irrespective of how and in what format the pixel data moves from server to client. Indeed, these same mechanisms may suffice even for the radiologist's viewing interface, as long as the necessary image quality is assured, there is access to the complete set, and the necessary tools are provided.

The moral being that the choice needs to be made by the user, and perhaps on the basis of whatever specific task they need to perform or question they want to answer. For any particular user (or type of user), there may be no single best answer that is generally applicable. For one patient, at one visit, the user might be satisfied with the report. On another occasion they might just want to illustrate something to the patient that requires only modest quality, and on yet another they might have a need to examine the study with the diligence that a radiologist would apply.

In other words, the user needs to be able to make the viewing quality choice dynamically. So, to enable the full spectrum of quality needs, the server needs to have the CSIDQ in the first place.

David

PS. By the way, do not take any of the foregoing to imply that irreversibly (lossy) compressed images are not of diagnostic quality. It is easy to make the erroneous assumptions that uncompressed images are diagnostic and compressed ones are not, or that DICOM images are uncompressed (when they may be encoded with lossy compression, including JPEG, even right off the modality in some cases), or that JPEG lossy compressed images supplied to a browser are not diagnostic. Sometimes they are and sometimes they are not, depending on the modality, task or question, method and amount of compression, and certainly last but not least, the display and viewing environment.

What "diagnostic quality" means and what constitutes sufficient quality and when, in general, and in the context of "Diagnostically Acceptable Irreversible Compression" (DAIC), are questions for another day. The point of this post is that the safest general solution is to preserve whatever came off the modality. Doing anything less than that might be safe and sufficient, but you need to prove it. Further, regardless of the quality of the pixel data, losing the DICOM "meta data" precludes many downstream use cases, including even simple size measurements.

PPS. This blog post elaborates on a principle that I attempted to convey during my recent testimony to the ONC HIT Standards Committee Clinical Operations Workgroup about standards for image sharing, which you can see, read or listen to if you have the stomach for it. If you are interested in the entire series of meetings at which other folks have testified or the subject has been discussed, here is a short summary, with links (or you can go to the group's homepage and follow the calendar link to past meetings, or to future meetings if you are interested in joining them):

2013-04-19 (initial discussion)
2013-06-14 (RSNA: Chris Carr, David Avrin, Brad Erickson)
2013-06-28 (RSNA: David Mendelson, Keith Dreyer)
2013-07-19 (lifeIMAGE: Hamid Tabatabaie, Mike Baglio)
2013-07-26 (general discussion)
2013-08-09 (general discussion)
2013-08-29 (standards: David Clunie)

Also of interest is the parent HIT Standards Committee:

2013-04-17 (establish goal of image exchange)

And the HIT Policy Committee:

2013-03-14 (prioritize image exchange)

PPPS. The concept of "complete set of images of diagnostic quality" was first espoused by an AMA Safety Panel that met with a group of industry folks (2008/08/27) to try to address the historical "CD problem". The problem was not the existence of the CD transport mechanism, which everyone is now eager to decry in favor of a network-based image sharing solution, but rather the problem of inconsistent formats, content and viewer behavior. The effort was triggered by a group of unhappy neurosurgeons in 2006 (AMA House of Delegates Resolution 539 A-06). They were concerned about potential safety issues caused by inadequate or delayed access or incomplete or inadequately displayed MR images. To cut a long story short, a meeting with industry was proposed (Board of Trustees Report 30 A-07 and House of Delegates Resolution 523 A-08), and that meeting resulted in two outcomes.

One was the statement that we hammered out together in that clinical-industry meeting, which was attended not just by the AMA and MITA (NEMA) folks, but also representatives of multiple professional societies, including the American Association of Neurological Surgeons, Congress of Neurological Surgeons, American Academy of Neurology, American College of Radiology, American Academy of Orthopedic Surgeons, American College of Cardiology, American Academy of Otolaryngology-Head and Neck Surgery, as well as vendors, including Cerner, Toshiba, Philips, General Electric and Accuray, and DICOM/IHE folks like me. You can read a summary of the meeting, but the most important part is the recommendation for a standard of practice, which states in part:

"The American Medical Association Expert Panel on Medical Imaging (Panel) is concerned whether medical imaging data recorded on CD’s/DVD’s is meeting standards of practice relevant to patient care.  

The Panel puts forward the following statement, which embodies the standard the medical imaging community must achieve. 

  • All medical imaging data distributed should be a complete set of images of diagnostic quality in compliance with IHE-PDI.
This standard will engender safe, timely, appropriate, effective, and efficient care; mitigate delayed care and confusion; enhance care coordination and communication across settings of care; decrease waste and costs; and, importantly, improve patient and physician satisfaction with the medical imaging process."

More recently, the recommendation of the panel is incorporated in the AMA's discussion of the implementation of EHRs, in the Board of Trustees Report 24 A-13, which recognizes the need to "disseminate this statement widely".

The other outcome of the AMA-industry meeting was the development of the IHE Basic Image Review (BIR) Profile, intended to standardize the user experience when using any viewer. The original neurosurgeon protagonists contributed actively to the development of this profile, even to the extent of sacrificing entire days of their time to travel to Chicago to sit with us in IHE Radiology Technical Committee meetings. Sadly, adoption of that profile has been much less successful than the now almost universal use of IHE PDI DICOM CDs. Interestingly enough, with a resurgence of interest in web-based viewers, and with many new vendors entering the field, the BIR profile, which is equally applicable to both network and media viewers, could perhaps see renewed uptake, particularly amongst those who have no entrenched "look and feel" user interface conventions to protect.

Friday, September 6, 2013

DICOM rendering within pre-HTML5 browsers

Summary: Retrieval of DICOM images, parsing, windowing and display using only JavaScript within browsers without using HTML5 Canvas is feasible.

Long Version.

Earlier this year, someone challenged me to display a DICOM image in a browser without resorting to HTML5 Canvas elements, using only JavaScript. This turned out to be rather fun and quite straightforward, largely due to the joy of Google searching to find all the various concepts and problems that other folks had already explored and solved, even if they were intended for other purposes. I just needed to add the DICOM-specific bits. As a consequence it took just a few hours on a Saturday afternoon to figure out the basics and in total about a day's work to refine it and make the whole thing work.

The crude demonstration JavaScript code, hard-wired to download, window (using the values in the DICOM header) and render a particular 16 bit MR image, can be found here and executed from this page. It is fully self-contained and has no dependencies on other JavaScript libraries. The code is ugly as sin, filled with commented out experiments and tests, and references to where bits of code and ideas came from, but hopefully it is short enough to be self-explanatory.

It seems to work in contemporary versions of Safari, Firefox, Opera, Chrome and even IE (although a little more slowly in IE, probably due to the need to convert some extra array stuff; it worked in IE 10 on Windows 7 but not IE 8 on XP, and I haven't figured out why yet). I was pleased to see that it also works on my Android phones and tablets.

Here is how it works ...

First task - get the DICOM binary object down to the client and accessible via JavaScript. That was an easy one, since as everyone probably knows, the infamous XMLHttpRequest can be used to pull pretty much anything from the server (even though its name implies it was designed to pull XML documents). The way to make it return a binary file is to use XMLHttpRequest.overrideMimeType() to make sure that no character set conversion is applied to the returned binary stream. This trick is due to Marcus Granado, whose archived blog entry can be found here, and it is also discussed along with other helpful hints at the Mozilla Developer Network site here. There is a little bit of further screwing around needed to handle various Microsoft Internet Explorer peculiarities related to what is returned, not in the responseText, but instead in the responseBody, and this needs an intermediate VBArray to get the job done (discussed in a StackOverflow thread).
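A minimal sketch of that first task (synchronous, and with a hypothetical URL, just to show the shape of the trick) looks like this:

    // Sketch: fetch a DICOM object as raw bytes; the x-user-defined character set stops
    // the browser from mangling byte values above 0x7f during conversion to a string.
    function fetchBinary(url) {
        const request = new XMLHttpRequest();
        request.open("GET", url, false);                                  // synchronous, for brevity only
        request.overrideMimeType("text/plain; charset=x-user-defined");
        request.send(null);
        const text = request.responseText;
        const bytes = new Array(text.length);
        for (let i = 0; i < text.length; ++i) {
            bytes[i] = text.charCodeAt(i) & 0xff;                         // keep only the low byte of each "character"
        }
        return bytes;                                                     // e.g., fetchBinary("mrimage.dcm")
    }

The Internet Explorer responseBody/VBArray detour mentioned above is left out of the sketch.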

Second task - parse the DICOM binary object. Once upon a time, using the bit twiddling functions in JavaScript might have been too slow, but nowadays that does not seem to be the case. It was pretty trivial to write a modest number of lines of code to skip the 128 byte preamble, detect the DICM magic string, then parse each data element successively, using explicit lengths to skip those that aren't needed and skipping undefined length sequences and items, and keeping track of only the values of those data elements that are needed for later stages (e.g., Bits Allocated), ignoring the rest. Having written just a few DICOM parsers in the past made this a lot easier for me than starting from scratch. I kept the line count down by restricting the input to explicit VR little endian for the time being, not trying to cope with malformed input, and just assuming that the desired data element values were those that occurred last in the data set. Obviously this could be made more robust in the future for production use (e.g., tracking the top level data set versus data sets nested within sequence items), but this was sufficient for the proof of concept.
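The skeleton of such a parser, restricted as described (explicit VR little endian, defined lengths only, no attempt to cope with malformed input), is short enough to show as a sketch:

    // Sketch: walk an explicit VR little endian file, recording the offset and length of
    // each data element; undefined length sequences and items are not handled here.
    function littleEndian16(bytes, offset) { return bytes[offset] + (bytes[offset + 1] << 8); }
    function littleEndian32(bytes, offset) { return littleEndian16(bytes, offset) + littleEndian16(bytes, offset + 2) * 0x10000; }

    function parseDataSet(bytes) {
        const elements = {};
        let offset = 132;                                          // skip the 128 byte preamble and "DICM"
        while (offset < bytes.length) {
            const group = littleEndian16(bytes, offset);
            const element = littleEndian16(bytes, offset + 2);
            const vr = String.fromCharCode(bytes[offset + 4], bytes[offset + 5]);
            let valueLength, valueOffset;
            if (vr === "OB" || vr === "OW" || vr === "OF" || vr === "SQ" || vr === "UT" || vr === "UN") {
                valueLength = littleEndian32(bytes, offset + 8);   // 2 reserved bytes then a 32 bit length
                valueOffset = offset + 12;
            }
            else {
                valueLength = littleEndian16(bytes, offset + 6);   // short form 16 bit length
                valueOffset = offset + 8;
            }
            const tag = ("0000" + group.toString(16)).slice(-4) + ("0000" + element.toString(16)).slice(-4);
            elements[tag] = { vr: vr, offset: valueOffset, length: valueLength };   // a real parser would keep only what it needs
            offset = valueOffset + valueLength;
        }
        return elements;
    }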

Third task - windowing a greater than 8 bit image. It would have been easy to just download an 8 bit DICOM image, whether grayscale or color, since then no windowing from 10, 12 or 16 bits to 8 would be needed, but that wouldn't have been a fair test. I particularly wanted to demonstrate that client-side interactivity using the full contrast and spatial resolution of the DICOM pixel data was possible. So I used the same approach as I have used many times before, for example in the PixelMed toolkit com.pixelmed.display.WindowCenterWidth class, to build a lookup table, indexed by all possible input values for the DICOM bit depth, containing the values to use for an 8 bit display. I did handle signed and unsigned input, as well as Rescale Slope and Intercept, but for this first cut, I have ignored special handling of pixel padding values, and other subtleties.
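For the unsigned case, and ignoring the rescale and padding subtleties just mentioned, the lookup table construction is essentially the standard linear window function (a sketch, not the PixelMed code):

    // Sketch: build an 8 bit display lookup table, indexed by stored pixel value, from
    // window center and width, for unsigned input of the stated bit depth.
    function buildWindowLookupTable(bitsStored, windowCenter, windowWidth) {
        const entries = 1 << bitsStored;
        const lut = new Array(entries);
        const lower = windowCenter - 0.5 - (windowWidth - 1) / 2;
        const upper = windowCenter - 0.5 + (windowWidth - 1) / 2;
        for (let storedValue = 0; storedValue < entries; ++storedValue) {
            if (storedValue <= lower) {
                lut[storedValue] = 0;
            }
            else if (storedValue > upper) {
                lut[storedValue] = 255;
            }
            else {
                lut[storedValue] = Math.round(((storedValue - (windowCenter - 0.5)) / (windowWidth - 1) + 0.5) * 255);
            }
        }
        return lut;                                   // e.g., buildWindowLookupTable(16, 300, 600)
    }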

These first three tasks are essentially independent of the rendering approach, and are necessary regardless of whether Canvas is going to be used or not.

The fourth and fifth tasks are related - making something the browser will display, and then making the browser actually display it. I found the clues for how to do this in the work of Jeff Epler, who described a tool for creating single bit image files in the browser (client side) to use as glyphs.

Fourth task - making something the browser will display. Since without Canvas one cannot write directly to a window, the older browsers need to be fed something they know about already. An image file format that is sufficient for the task, and which contributes no "loss" in that it can directly represent 8 bit RGB pixels, is GIF. But you say, GIF involves a lossless compression step, with entropy coding using LZW (the compression scheme that was at the heart of the now obsolete patent-related issues with using GIF). Sure it does, but many years ago, Tom Lane (of IJG fame) observed that because of the way LZW works, with an initial default code table in which the code (index) is the same as the value it represents, as long as one widens each code by one extra bit and resets the code table periodically, one can just send the original values as if they were entropy coded values. Add a bit of blocking and a few header values, and one is good to go with a completely valid uncompressed (albeit slightly expanded) bitstream that any GIF decoder should be able to handle. This concept is now immortalized in the libungif library, which was developed to be able to create "uncompressed GIF" files to avoid infringing on the Unisys LZW patent. Some of the details are described under the heading of "Is there an uncompressed GIF format?" in the old Graphic File Formats FAQ, which references Tom Lane's original post. In my implementation, I just made 9 bit codes from 8 bit values, added a clear code every 128 values, and made sure to stuff the bits into appropriate length blocks preceded by a length value, and it worked fine. And since I have 8 bit gray scale values as indices, I needed to populate the global color table that maps each gray scale index to an RGB triplet with the same intensity value (since GIF is an indexed color file format, which is why GIF is lossless for 8 bit single channel data, but lossy (needs quantization and dithering) for true color data with more than 256 different RGB values).
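Here is a sketch of just that inner "pretend LZW" step as I understand the trick, i.e., literal 9 bit codes packed least significant bit first, with a clear code issued often enough that the decoder's code width never grows; the GIF header, color table, image descriptor, sub-block framing and trailer still have to be wrapped around the resulting bytes:

    // Sketch: emit 8 bit pixel values as literal 9 bit LZW codes (minimum code size 8,
    // so clear = 256 and end of information = 257), packed least significant bit first,
    // resetting the decoder's table often enough that codes never grow beyond 9 bits.
    function packLiteralCodes(pixelValues) {
        const CLEAR = 256, END = 257, CODE_BITS = 9;
        const bytes = [];
        let bitBuffer = 0, bitCount = 0;
        function emit(code) {
            bitBuffer |= code << bitCount;
            bitCount += CODE_BITS;
            while (bitCount >= 8) {
                bytes.push(bitBuffer & 0xff);
                bitBuffer >>= 8;
                bitCount -= 8;
            }
        }
        emit(CLEAR);
        for (let i = 0; i < pixelValues.length; ++i) {
            if (i > 0 && i % 128 === 0) emit(CLEAR);   // keep the decoder's code table small
            emit(pixelValues[i]);
        }
        emit(END);
        if (bitCount > 0) bytes.push(bitBuffer & 0xff);
        return bytes;   // still needs to be split into length-prefixed sub-blocks of at most 255 bytes
    }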

Fifth task - make the browser display the GIF. Since JavaScript in the browser runs in a sort of "sand box" to prevent insecure access to the local file system, etc., it is not so easy to feed the GIF file we just made to the browser, say as an updated IMG reference on an HTML page. It is routine to update an image reference with an "http:" URL that comes over the network, but how does one achieve that with locally generated content? The answer lies in the "data:" URI that was introduced for this purpose. There is a whole web site, http://dataurl.net/, devoted to this subject. Here, for example, is a description of using it for inline images. It turns out that what is needed to display the locally generated GIF is to create a (big) string that is a "data:" URI with the actual binary content Base64 encoded and embedded in the string itself. This seems to be supported by all recent and contemporary browsers. I don't know what the ultimate size limits are for the "data:" URI, but it worked for the purpose of this demonstration. There are actually various online "image to data: URI converters" available for generating static content (e.g., at webSemantics, the URI Kitchen), but for the purpose of rendering DICOM images this needs to be done dynamically by the client-side JavaScript. Base64 encoding is trivial and I just copied a function from an answer on StackOverflow, and then tacked the Base64 encoded GIF file on the end of a "data:image/gif;base64," string, et voilà!
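The final step then reduces to something like this sketch (modern browsers provide btoa() for the Base64 step; older browsers were also a target here, hence the hand-rolled function mentioned above):

    // Sketch: Base64 encode the locally generated GIF bytes and hand them to an <img>
    // element via a "data:" URI; imageElement and gifBytes come from the earlier steps.
    function displayGif(imageElement, gifBytes) {
        let binary = "";
        for (let i = 0; i < gifBytes.length; ++i) {
            binary += String.fromCharCode(gifBytes[i]);        // one "character" per byte, values 0 to 255
        }
        imageElement.src = "data:image/gif;base64," + btoa(binary);
    }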

Anyway, not rocket science, but hopefully useful to someone. I dare say that in the long run the HTML Canvas element will make most of this moot, and there are certainly already a small but growing number of "pure JavaScript" DICOM viewers out there. I have to admit it is tempting to spend a little time experimenting more with this, and perhaps even write an entire IHE Basic Image Review Profile viewer this way, using either Canvas or the "GIF/data: URI" trick for rendering. Don't hold your breath though.

It would also be fun to go back through previous generations of browsers to see just how far back the necessary concepts are supported. I suspect that size limits on the "data:" URI may be the most significant issue in that respect, but one could conceivably break the image into small tiles, each of which was represented by a separate small GIF in its own small enough "data:" URI string. I also haven't looked at client-side caching issues. These tend to be significant when one is displaying (or switching between) a lot of images or frames. I don't know whether browsers handle caching of "data:" URI objects differently from those fetched via http, or indeed how they handle caching of files pulled via XMLHttpRequest.

Extending the DICOM parsing and payload extraction stuff to handle other uncompressed DICOM transfer syntaxes would be trivial detail work. For the compressed transfer syntaxes, for single and three channel 8 bit baseline JPEG, one can just strip the JPEG bit stream out of its DICOM encapsulated fragments, concatenate and Base64 encode the result, and stuff each frame in a "data:" URI with a media type of image/jpeg instead of image/gif. Same goes for DICOM encapsulated MPEG, I suppose, though that might really stretch the size limits of the "data:" URI.

Since bit-twiddling is not so bad in JavaScript after all, one could even write a JPEG lossless or JPEG-LS decoder in JavaScript that might not perform too horribly. After all, JPEG-LS was based on LOCO and that was simple enough to fly to Mars, so it should be a cakewalk in a modern browser; it is conceptually simple enough that even I managed to write a C++ JPEG-LS codec for it, some time back. That said, modest compression without requiring an image-specific lossless compression scheme can be achieved using gzip or deflate (zip) with HTTP compression, and may obviate the need to use a DICOM lossless compression transfer syntax, unless the server happens to already have files in such transfer syntaxes.

Doing the JPEG 12 bit DCT process might be a bit of a performance dog, but you never know until someone tries. Don't hold your breath for these from me any time soon though, but if I get another spare Saturday, you never know ...

Oops, spoke too soon, someone has already done a pure JavaScript JPEG decoder ...
 
nanos gigantum humeris insidentes (dwarfs standing on the shoulders of giants)

David