Saturday, January 21, 2012

Two (or More) Views of the World ...

Summary: Providing alternative "views" of studies via DICOM Q/R and other services allows legacy single frame and new enhanced original or converted multi-frame image header and pixel data to peacefully co-exist.

Long Version:

Needless to say, and tempting as it may be on this day of the South Carolina GOP Primary, I am not talking about alternative political views, but rather something far more interesting and important to the fate of the free world ...

As discussed in an earlier post, Framing the Big Study Problem, just because one does not get Enhanced CT/MR/PET multi-frame images from one's modalities does not mean that the "legacy" single frame images cannot be converted into a similar representation. The DICOM Standards Committee has agreed to a work item to define a standard mechanism for doing that, and work is ongoing.
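
To make the conversion concrete, here is a minimal sketch (in Python, using pydicom) of the "analysis" step: deciding which attributes are constant across a series of single-frame slices (candidates for the Shared Functional Groups) and which vary per slice (candidates for the Per-Frame Functional Groups). This is toy logic of my own; the draft's actual attribute-to-functional-group mapping rules are far more specific.

```python
from pydicom import dcmread

def partition_attributes(datasets):
    """Split top-level attributes into shared (constant across all
    slices) and per-frame (varying slice to slice) candidate sets."""
    shared, per_frame = {}, set()
    for elem in datasets[0]:
        tag = elem.tag
        if tag.group == 0x7FE0:  # leave the pixel data for aggregation
            continue
        values = [ds[tag].value if tag in ds else None for ds in datasets]
        if all(v == values[0] for v in values):
            shared[tag] = values[0]   # constant across slices -> shared
        else:
            per_frame.add(tag)        # varies slice to slice -> per-frame
    return shared, per_frame

# e.g. slices = [dcmread(f) for f in filenames]  # one file per slice
# shared, per_frame = partition_attributes(slices)
```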

However, an interesting question has arisen, and that is, how does one distinguish between the "original" images as received from the modality, and the "converted" ones?

Or to put this another way, if the converted images are naively "added" to the same study in the PACS, then in the worst case one ends up with twice as much data in the same study, and a study level retrieval would retrieve all of it, unless there were some filtering mechanism.

Indeed, when you think about it, the requester (Q/R SCU) probably doesn't care which images were "original" and which were "converted". What it really cares about is the form, i.e., whether they are "legacy" single-frame or enhanced multi-frame, and perhaps, in the latter case, whether they are "genuine" enhanced multi-frame with all the standard attributes and codes, or "enhanced legacy converted" multi-frame, with only a limited amount of standard stuff but with the consolidated "header" and aggregated pixel data.

At the last ad hoc WG meeting, we discussed the idea of having two (or perhaps more) "views" of the data, and providing the Q/R SCU with the ability to choose which view they wanted.

I am not a database guy (and don't play one on TV), and I am certainly not well versed in the theory of relational databases, but the concept of a database "view" is well established (see, for example, the Wikipedia description of a database view). As a historical note, the term "view" was even used in the very earliest relational database literature, to describe the concept of a "relational view" (Codd EF. A relational model of data for large shared data banks. CACM 1970 13:377), and indeed that paper also distinguishes the concept of a "stored set" from an "expressible set". Arguably, the concept of different views is perhaps similar to the concept of "subschemas" introduced in the CODASYL Database Task Group (DBTG) 1971 Report Data Definition Language (DDL), the idea therein being to restrict the application view of the database to a subset of the entire schema, and to allow multiple such subschemas, under the control of the database administrator. (It brings joy to the heart of an old COBOL programmer reminiscing about what fun this stuff used to be; I really must dig out my old manuals from storage.)
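
For the database-inclined, the analogy is easy to sketch: one underlying table of instances, and one SQL view per representation. A toy illustration using Python's sqlite3 (the schema and the 'SF'/'MF' form codes are entirely hypothetical, just to show the shape of the idea):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE instance (
        sop_instance_uid    TEXT PRIMARY KEY,
        study_instance_uid  TEXT,
        form                TEXT,     -- 'SF' single frame, 'MF' multi-frame
        converted           INTEGER   -- 1 if generated by conversion
    );
    -- Two "views" of the same study: each application sees only one form,
    -- and never sees both representations of the same acquisition.
    CREATE VIEW classic_view  AS SELECT * FROM instance WHERE form = 'SF';
    CREATE VIEW enhanced_view AS SELECT * FROM instance WHERE form = 'MF';
""")
```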

Anyway, it seems obvious to me that the basic idea of providing alternative ways to look at the same data and relationships is equally applicable to the access mechanisms that are relevant to this application, specifically the DICOM Query and Retrieval Service Class, or yet another mechanism for accessing the DICOM data, such as the HTTP-based services that DICOM WG 27 is developing (like QIDO). The definition of a "view" in the relational database world currently seems to be tied to the distinction between the "native" representation in the tables and something that is derived dynamically via a query or via stored procedures, but that is probably not a distinction that is useful to us from the perspective of defining the interface boundary (though it is relevant to implementation, vide infra).

For our purposes then, we need to present the user (or their agent, the workstation) with two or more "views" of the same set of slices (in the case of a cross-sectional modality like CT, MR or PET), such that their experience is identical, regardless of the underlying representation or physical encoding or means of transfer. If the application they are using is only single-frame aware, then the query from that application to the PACS will request and receive information about one set of objects, and if it is multi-frame aware it will see a different set of objects. Semantically these will contain identical information. What they will not see will be duplicates (i.e., the same acquisition in two different forms).

The idea of these two views is equally applicable to both the query for information about the study, and the retrieval of the study and its components. That is, when the query is performed, it should return attributes of only the instances in the view that is specified, and when a retrieval of those instances is performed (whether it be at the patient, study, series, instance (image) or even frame level), only those entities in the requested view should be returned.
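
On the SCP side, the effect of the view on both query and retrieval amounts to a filter over the instances considered for matching. A hypothetical Python fragment (the SOP Class subsets, the view tokens, and the instance records with a sop_class_uid attribute are all illustrative assumptions of mine, not anything the draft mandates):

```python
# Which SOP Classes belong to which view (illustrative subsets only)
LEGACY_SOP_CLASSES   = {"1.2.840.10008.5.1.4.1.1.4"}    # MR Image Storage
ENHANCED_SOP_CLASSES = {"1.2.840.10008.5.1.4.1.1.4.1"}  # Enhanced MR Image Storage

def visible_instances(instances, view):
    """Constrain what a C-FIND, C-MOVE or C-GET can see to one view."""
    wanted = LEGACY_SOP_CLASSES if view == "CLASSIC" else ENHANCED_SOP_CLASSES
    return [i for i in instances if i.sop_class_uid in wanted]
```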

Note that referential integrity is also important. By that I mean that references to the "original" images within presentation states, structured reports and other similar non-image objects, and indeed references from one image to another (e.g., from a transverse slice to a localizer, or from a derived image like an MPR to a source image), all need to be "updated" to point to the corresponding converted images, as necessary. In my opinion the appropriate "scope" for preservation of this referential integrity is the entire patient, not just the study, since a current study may refer to images or other instances within a prior study for the same patient. The net effect of this is that more objects than just images may need to be "converted" in order to support an internally consistent "view".
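
Mechanically, the remapping is straightforward once a map from original to converted SOP Instance UIDs exists. A pydicom-style sketch (the walk-based traversal is real pydicom; the surrounding workflow, including where uid_map comes from, is assumed):

```python
def remap_references(ds, uid_map):
    """Rewrite every Referenced SOP Instance UID found anywhere in the
    dataset (including nested sequences) to its converted equivalent."""
    def _visit(dataset, elem):
        if elem.keyword == "ReferencedSOPInstanceUID" and elem.value in uid_map:
            elem.value = uid_map[elem.value]
    ds.walk(_visit)   # pydicom recurses into sequence items for us
    return ds
```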

How this is achieved behind the scenes in terms of implementation offers interesting opportunities for performance optimization. When a PACS receives images in one form, it could convert them into the other form and store both, and index both in its database, though that would seem wasteful. An obvious optimization is to store only one form and dynamically create the other form on demand, as long as one does this in a deterministic way (such that the unique identifiers generated on the fly are always consistent with previous requests for the same view). Perhaps the determinism constraint could be satisfied simply by storing the generated UIDs (rather than the whole object) in the database, on initial insertion, or just the first time they were needed.
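
Another way to satisfy the determinism constraint, without storing anything at all, is a hash-based UID: pydicom's generate_uid() is deterministic when given entropy sources, so converting the same source instances always yields the same converted UIDs. The organizational root and the use of the view name as an entropy source below are my own illustrative choices:

```python
from pydicom.uid import generate_uid

ORG_ROOT = "1.2.3.4."  # hypothetical organizational root

def converted_uid(source_uids, view):
    """Deterministically derive a converted instance's UID from the UIDs
    of its source instances plus the view it belongs to."""
    return generate_uid(prefix=ORG_ROOT,
                        entropy_srcs=sorted(source_uids) + [view])
```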

An interesting side effect of this is that the implementer may have a choice of which internal representation is optimal for specific performance goals in specific scenarios. For example, on-demand retrieval for viewing, whether in proprietary viewers built into the PACS or on retrieval to standard displays or workstations, may be best served by using the multi-frame representation internally, since the aggregated bulk pixel data might be more efficiently compressed, indexed and streamed, and the consolidated "header" information, with its structure already factored into shared and per-frame information as well as with explicit dimensions, might be more quickly transferred, parsed, navigated, indexed and retrieved for annotation, etc. Yet this can be achieved without sacrificing the benefit of using a standard DICOM PS 3.10 object inside the archive, both long term and in the short term "live" store (or cache or whatever), whether it be local or central. If nothing else, the database tables indexing this stuff will certainly be more compact, given the lack of need for a per-slice image table entry for each file, or at least one with a full set of per-image attributes as columns.

Yet at any time, the original (as received) single frame representation can be recovered from this "more efficient" internal representation, as long as an appropriate full-fidelity round trip can be defined and implemented. Arguably, it is also easier and faster to convert enhanced multi-frame to single frame images than it is the other way around, since there is a lot less "analysis" that has to be performed (i.e., trying to decide what is common and what varies per-frame and to split it into the appropriate functional groups, and then to look for standard patterns of use of dimensions like space, time and cardiac cycle position). In my experiments to date, even for very large sets of slices, the rate-limiting step in either direction is disk I/O and header parsing, not the business logic that is required to do the grouping and ungrouping, but I would expect that in a real-world implementation this would depend heavily on the specifics of a particular large, scalable PACS application architecture.
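
The asymmetry is easy to see in code: going from enhanced to single frame needs no analysis pass at all, just a merge of the shared groups with each frame's own per-frame group. A toy pydicom sketch (the sequence keywords are real DICOM, the merge logic is grossly simplified):

```python
def frame_attributes(mf, frame_index):
    """Collect the functional group macros that apply to one frame of an
    enhanced multi-frame dataset: the shared groups plus that frame's own."""
    merged = {}
    shared    = mf.SharedFunctionalGroupsSequence[0]
    per_frame = mf.PerFrameFunctionalGroupsSequence[frame_index]
    for group_item in (shared, per_frame):
        for elem in group_item:        # each element is one macro sequence
            merged[elem.keyword] = elem.value
    return merged  # a real converter maps these back to top-level attributes
```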

As with web server front ends to PACS, it is very likely that some sort of intelligent pre-caching logic would need to be placed in the pipeline, such that predicted (as well as repeated) requests for a particular choice for a particular patient or study could be pre-computed and stored in the cache in the likely preferred form, if that is different from the natively stored form. For example, if it was apparent from the modality worklist that priors would be needed for comparison during reporting, and the priors were natively stored as single frame, yet the (new) modality produced enhanced multi-frame images, and the PACS and the reading workstation only supported multi-frame, then the PACS would know to pre-fetch the single frame priors, pre-convert them to enhanced multi-frame, and pre-load them to the reading station cache, as opposed to doing it on the fly if the user unexpectedly realized they needed these priors.

Some outstanding questions remain with respect to how best to negotiate and specify the selection of these views, how many views are necessary, whether it is necessary to have views that distinguish "original" (as received) objects from converted ones, and what the default view should be if it is not explicitly specified or if the view option is not negotiated. The last issue is particularly important with respect to legacy workstations and PACS that will not know how to request the option, yet may support the enhanced multi-frame objects, though it may be that support for enhanced multi-frame objects at the user-facing application level (as opposed to simple store-and-regurgitate in the PACS) is so poor that this is not a practical problem.

What I have written up so far in the draft supplement proposes that the negotiation of the ability to support views is an additional Extended Negotiation option on both the query and retrieval, and that if this option is successfully negotiated, a new Query/Retrieve View attribute can be included in the C-FIND, C-MOVE or C-GET request identifier, which then modulates the behavior of the SCP (i.e., effectively "filters" the set of data visible such that it is constrained to the specified view). I have specified this to also apply to the instance- and frame-level retrieval and the no-bulk-data retrieval, not because the view is necessary to select the objects (these already require SOP Instance UIDs at the instance level, which will be specific to the view), but rather because the retrieved objects will need to contain the correct referenced UIDs for the appropriate view (to maintain referential integrity within the view).
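
In pynetdicom terms, the SCU side of the proposal might look roughly like this. To be clear about assumptions: the Query/Retrieve View keyword and its "ENHANCED" value are per the draft (and require a data dictionary that knows the keyword), the extended negotiation payload byte is illustrative, and the SCP address is hypothetical:

```python
from pydicom.dataset import Dataset
from pynetdicom import AE
from pynetdicom.pdu_primitives import SOPClassExtendedNegotiation
from pynetdicom.sop_class import StudyRootQueryRetrieveInformationModelFind

ae = AE(ae_title="VIEWSCU")
ae.add_requested_context(StudyRootQueryRetrieveInformationModelFind)

# Extended negotiation of the view option (payload byte is illustrative)
ext = SOPClassExtendedNegotiation()
ext.sop_class_uid = StudyRootQueryRetrieveInformationModelFind
ext.service_class_application_information = b"\x01"

assoc = ae.associate("pacs.example.org", 104, ext_neg=[ext])  # hypothetical SCP
if assoc.is_established:
    ds = Dataset()
    ds.QueryRetrieveLevel = "SERIES"
    ds.StudyInstanceUID = "1.2.3.4.5"   # hypothetical study
    ds.SeriesInstanceUID = ""           # return key
    ds.QueryRetrieveView = "ENHANCED"   # the proposed new attribute
    for status, identifier in assoc.send_c_find(
            ds, StudyRootQueryRetrieveInformationModelFind):
        if status and status.Status in (0xFF00, 0xFF01) and identifier:
            print(identifier.SeriesInstanceUID)  # only the enhanced view
    assoc.release()
```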

The foregoing all applies to "pull" (query and retrieval) use-cases and does not affect push use-cases. If an SCU and an SCP have a choice in the matter of whether to send legacy single frame images, genuine enhanced multi-frame images, or enhanced legacy converted multi-frame images, this still needs to be handled either as a matter of configuration, or by dynamic SOP Class negotiation during Association Negotiation, with the opportunity for the SCU to "fall back" to a less desirable SOP Class in the normal way, as discussed in the earlier post.
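
For completeness, that push-side fallback is just ordinary presentation context negotiation. A pynetdicom sketch, using the MR SOP Classes as an example (the peer address is hypothetical, and the conversion step is elided):

```python
from pynetdicom import AE
from pynetdicom.sop_class import MRImageStorage, EnhancedMRImageStorage

ae = AE(ae_title="PUSHSCU")
ae.add_requested_context(EnhancedMRImageStorage)  # preferred form
ae.add_requested_context(MRImageStorage)          # fallback form

assoc = ae.associate("archive.example.org", 104)  # hypothetical peer
if assoc.is_established:
    accepted = {cx.abstract_syntax for cx in assoc.accepted_contexts}
    use_enhanced = EnhancedMRImageStorage in accepted
    # ... convert (or not) accordingly, then assoc.send_c_store(ds) ...
    assoc.release()
```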

In a way the concept of "views" is vaguely reminiscent of a very specific form of the "filtering" operation that was part of the old OSI "scoping and filtering" capability, which was potentially applicable to DICOM normalized objects but never included in the original standard. Of course, the OSI filtering was far more versatile, with boolean expressions; we are talking here about services on composite, not normalized, objects; and DICOM never did follow through on the prospect of generalized normalized object management, nor is it really applicable to these highly specific use cases. Still, it just goes to show that there is nothing new under the sun.

Anyway, it remains to be seen what the AHWG comes up with on Monday when we review the proposal and it is subject to more detailed scrutiny. Suffice it to say that in our meetings so far we have been making fairly rapid progress, thanks in no small part to the number of vendors and others who have already been experimenting with multi-frame representations, and hopefully we will have something fairly solid to share with the world at large shortly. Anyone with technical knowledge of the subject matter is of course welcome to join our group, physically or virtually, at any time; just let me know.

David