Sunday, May 8, 2016

To C-MOVE is human; to C-GET, divine

Summary: C-GET is superior to C-MOVE for use beyond the firewall; contrary to some misleading reports, it has NOT been retired from DICOM, and implementations do exist.

Long Version.

With apologies to Alexander Pope, I wanted to draw attention to what appears to be a common misconception, that DICOM C-GET is retired or obsolete or deprecated.

C-GET is not retired; it most definitely is alive and well, and more importantly, useful.

C-GET is especially useful for DICOM use over the public Internet, beyond the local area network.

As you know, by far the most common way to retrieve a study, series or individual instances is to use a C-MOVE request, which instructs the server (SCP) to initiate the necessary C-STORE operations on one or more different connections (associations) to transfer the data.

This necessitates:
  • the requester being able to listen for and accept inbound connections (i.e., be a C-STORE SCP),
  • that any impediments on the network (like firewalls) allow such inbound connections,
  • that the sender be configured with the host/IP address and port of the requester (since only the Destination AET is communicated in the C-MOVE request), and
  • that Network Address Translation (NAT) be correctly configured to forward the inbound connections to the requester.
By comparison, a C-GET request does not depend on separate associations being established, but rather "turns around" the same connection on which the request is made, and re-uses it to receive the inbound C-STORE operation. I.e., it is just like an HTTP GET in that all the data comes back on the same connection. It is similar in functionality to a Passive FTP transfer, though in ftp there are actually two separate connections, though both are initiated by the requester (one for commands and one for data).

With all three protocols, DICOM C-GET, HTTP GET and Passive FTP GET, there is:
  • no need for the requester to be able to respond to inbound connections
  • no need to configure firewalls to allow inbound connections or perform NAT, and
  • no need (other than for access control) to configure the sender to know anything about the requester.
Of course, firewalls may also restrict outbound connections, but that affects all protocols similarly.

All three protocols can of course communicate over secured channels, whether by using TLS or a VPN.

So, if C-GET is so useful, why is it not as commonly implemented?

Historically, when DICOM was first getting started and being used mostly for mini-PACS clusters of acquisition modalities and workstations, the thinking of the designers went something like this. First, I have to be able to send and receive images by pushing them around, so I have to implement C-STORE as an SCU and SCP. Now, the product manager says I have to allow users to pull them too, so the easiest way is to write a C-MOVE SCU and SCP to command that the transfer takes place, but I can just reuse the existing C-STORE SCU and SCP code that I have already written. I only have a handful of devices to connect on the LAN, so the administrative burden of configuring them all to know about each other is not an issue. QED.

As smaller systems were scaled to enterprise level, and larger proprietary systems added DICOM Q/R capability to allow the same mini-PACS workstations to gain access to the archive, the use of C-MOVE became entrenched, without much further thought being given to the potential future benefits of C-GET for use beyond the walls of the enterprise or on a really large scale. Much later, IHE specified C-MOVE for the Retrieve Images (RAD-16) transaction (in Year 2 for 2000), which subsequently became part of the Scheduled Workflow Profile, but did not mention C-GET, presumably because the conventional wisdom at the time was that C-MOVE was much more widely implemented.

So who does support C-GET?

A Google search reveals quite a few systems that do. There are some open source or freely available SCUs and SCPs too. When I monitor at Connectathons, it is extremely convenient to be able to retrieve stuff from testers' systems (to compare what they have with what is expected) without having to go and bother them to add my configuration for C-MOVE, and off hand I would guess about 15-25% of the systems respond to a C-GET, including, of course, the central archive, which for the last few years has been dcm4chee. Dave Harvey's publicly accessible server and PixelMed's support C-GET, as do clients like Osirix, though I don't think either ClearCanvas or K-PACS do:(

The tricky thing with implementing C-GET as an SCU is the Association Negotiation, and particularly the (annoying, gratuitous, arbitrary) limit on the total number of Presentation Contexts caused by the "odd integers between 1 and 255" requirement on the single byte Presentation-context-ID. The naive (though inefficient) approach of listing all possible (storage) SOP Classes permuted with all possible Transfer Syntaxes reaches that limit quickly nowadays. Allowing the SCP to choose the Transfer Syntax, and using SOP Classes in Study from an earlier STUDY level C-FIND (or using plausible SOP Classes based on Modalities in Study, or if these are not supported as return keys by the C-FIND SCP, Modality from a SERIES level C-FIND, or worst case, the SOP Class UID from an IMAGE level C-FIND) helps a lot with this, though does limit the re-usability of the Association if you want to keep it alive in a "connection pool" for later retrievals.

From a performance perspective, single connection C-GET and C-MOVE are similar, which is not surprising since both are often limited by latency effects on the synchronous C-STORE response. In the absence of Asynchronous Operations support, it is obviously easier to accelerate C-MOVE by opening multiple return Associations across which to spread the C-STORE operations, which one can't do with C-GET, unless one selectively retrieves at the IMAGE level, which is possible, but tedious to set up and requires an initial IMAGE level C-FIND to get SOP Instance UIDs. Using large multi-frame images instance mitigates this issue.

It would be interesting to see, for the simple pull use case, how close the C-GET with Asynchronous Operations support could approach raw socket transfer speeds though, and how it would compare with an HTTP GET or Passive FTP GET.

The security considerations (include channel confidentiality, access control and audit trail) would seem to be similar for C-GET and C-MOVE, and both TLS and user identity communication are available if necessary.

David

PS. I was motivated to write this when I noticed that Sébastien Jodogne says in Note 1 of his description of "C-Move: Query/retrieve" documenting his Orthanc server:

"Even if C-Move may seem counter-intuitive, it is the only way to initiate a query/retrieve. Once upon a time, there was a conceptually simpler C-Get command, but this command is now deprecated."

I asked Sébastien where he got this impression and attributes the source of his confusion to this post by Roni Zaharia. Both are incorrect in this respect.

During the great DICOM purge of 2006 (Sup 98), though the Patient/Study Only Query/Retrieve Information Model was retired from the Query/Retrieve Service, C-GET was left alone, and none of the other Supplements or CPs related to retirement touched it either. On the contrary, subsequent additions to the standard to support Instance and Frame Level Retrieve and Composite Instance Retrieve Without Bulk Data (Sup 119) extended the use of C-GET significantly.

Sébastien profusely apologizes for relying on hearsay and failing to check the standard, and hopes to implement C-GET when he has a chance.

PPS. I observe in passing that Roni also recommends the use of Patient Root rather than Study Root queries, which I would strongly disagree with. In the early days, many systems' databases were implemented with the study as the top level and the patient's identifiers and characteristics were managed as attributes of the study, if for no other reason than HIS/RIS integration was not as common as it is today, and patient level stuff was often inconsistent and/or incorrect. IHE, for example, when Q/R was added in Year Two, specified the Study Root C-FIND as required and the Patient Root as optional for the Query Images (RAD-14) and Retrieve Images (RAD-16) transactions, and that is still true in Scheduled Workflow today. I never use Patient Root if I can avoid it, and Roni's assertion that "everyone supports it" certainly didn't used to be true.

PPPS. Some old comp.protocols.dicom posts on the subject of C-GET include the following, which show the "evolution" of my thinking:

C-MOVE vs. C-GET
Difference between C-GET and C-MOVE
DICOM retrieve (C-GET-RQ) example anyone?
C-GET vs C-MOVE (was Retrieving off-line studies from DICOM archive)
C-Get versus C-Move, was Re: C-Move

No comments: