Sunday, November 18, 2007

An impending reality - the Patient Contributed Image Repository

Summary: A test site for the PCIR is now up and running, allowing contribution and downloading; feedback is sought on the idea, the contribution process, the contribution agreement, and the site design itself. Go to "http://www.pcir.org/".

Long Version.

As you may recall from an earlier blog entry, I have been exploring the feasibility of a repository to which members of the general public could contribute their own digital medical images. Rather than wait for some grand scheme involving multiple protagonists and sources of funding to come together, I thought that it might be easier just to "build it" in the hope that "they will come". The "they", in this case, being patients willing to contribute their data.

By leveraging some very simple tools, relatively cheap existing web and data hosting services, and my own time and funds, I found this to be relatively straightforward, at least for the initial pilot.

If you wish to take a look at the test site that I have created, go to "http://www.pcir.org/". Send any comments you have back to me at "mailto:dclunie@dclunie.com".

The principles behind this site are straightforward, if perhaps somewhat naive:
  • that patients have an interest in promoting the common good;
  • that they can be convinced that contributing their own images to the public domain is for the common good;
  • that if only a modest level of effort is required they would be willing to do so;
  • that patients have sufficient basic computer skills, equipment and fast enough connections to do so;
  • that patients will be satisfied that their privacy will be protected;
  • that providing unrestricted downloading will disseminate the images to the most users;
  • most important of all, that the images will actually be useful.
As I said in the introduction, the currently deployed site is a test site, and this is the pilot phase of the project. Success criteria for the pilot include confirmation that:
  • there is sufficient interest in the concept to justify proceeding
  • the ease of use is within range of the target audience of non-medical, non-IT patient contributors
  • the level of effort to obtain images +/- accompanying information is feasible
  • the type of images and information collected will be of sufficient use
  • the concept of anonymous contribution to the public domain stands up to legal and ethical scrutiny
The next steps, if the pilot is successful, are to:
  • form the non-profit corporation to manage the effort,
  • have the "contribution agreement" tidied up by the lawyers,
  • start soliciting contributions of images and funds from the general public, and
  • engage the advocacy organizations in promoting and supporting the effort.
If you have any interest in assisting with any aspect of this, please contact me directly. All features of the site are accessible from the PCIR home page.

Approach.

The basic approach is that:
  • patients agree to contribute their own images and documents TO THE PUBLIC DOMAIN
  • the PCIR receives and de-identifies their images and documents
  • the PCIR distributes the de-identified set to ANYONE WITHOUT RESTRICTION
Or, to put this another way, there is no "consent" to particular uses, and there are no "data use agreements".

The definition of "public domain" is somewhat nebulous in this context; it is well-defined with respect to "creative works" like art and music and literature (and even computer software), but it is unlikely that medical images are creative works. The term is also used in the context of patents and land. Perhaps there can be no formal definition of "public domain" with respect to medical images, or medical records in general, until the term is used in a legislative or common law context to apply to such things, or until it is declared that they should be treated as if they were creative works and subject to copyright (not that I am advocating the latter). Regardless, for the PCIR's purposes, the analogy to creative works may suffice to convey the intent of both the contributor and the PCIR in this regard.

Do patients even have the right to contribute their own images ? For that matter, who actually owns them ? Certainly in the US, the HIPAA Privacy Rule has clarified that patients have a right to a copy of their medical record, regardless of who owns the "original". This seems to be a general principle that spans international boundaries, including in Europe, where the Privacy Directive specifically addresses access rights in general, not just to medical records. We assume that medical images are to all practical intents and purposes also medical records; though some medical records departments in hospitals may deny this, that seems to be because they do not store them (nor have a responsibility to); the radiology department does. The PCIR agreement in its current draft proposes that contributors do have such a right and are agreeing that they are not constrained in any manner from exercising it. This seems to be a reasonable strategy until somebody argues otherwise.

Implementation.

You may also be interested in some of the details of how this test site currently works.

To make maintenance of the informative web pages tractable, I use Apache Forrest. This tool, as discussed in a previous blog entry, allows one to author the text, organization, and external links in a simple XML source format, and then to "build" the site using an appropriate "skin" to generate the look and feel. I can't really say enough good things about Forrest. I dare say many folks have commercial web site design tools that are more sophisticated and produce a more visually appealing result, but for the humble novice like myself who is more comfortable at the command line with a plain text editor, Forrest gets the job done.

The uploading tool that the patient uses to make a contribution is a Java applet. This was chosen because platform neutrality is a basic requirement; I dare say something Microsoft Windows-specific would cover the majority of potential contributors, but I do not want to exclude anyone if I can avoid it. Using Java applets requires that the contributor's browser be both capable of running applets and configured to allow them. The applet is invoked through an HTML page that will prompt the user's browser, or the user themselves if necessary, to install a sufficiently recent version of Java. The lowest level of JRE that will work is SE 5 (1.5), due to the need for support of various encryption features used by the applet.

The applet also requires access to resources on the local machine, both to read the files and CDs to be uploaded and to transfer them over the network to the PCIR. This requires a signed applet, and for the user to agree to "trust" it. It seems to have become commonplace nowadays for users to routinely click "yes I trust you", pretty much regardless of the source of the applet.

It is possible to use a "self-signed certificate" to sign a Java applet and allow it to work, as long as the user does not mind seeing a message that the "digital signature has not been verified"; if one goes to the trouble of obtaining a legitimate code signing certificate from a certifying authority that is installed in the JRE, then the user instead sees a message that the "digital signature has been verified". The difference, frankly, is a little subtle. For the purposes of the test site, the applet used has been signed by PixelMed's verifiable certificate from Comodo.

[As an aside, getting such a code signing certificate is actually reasonably cheap and not too difficult; being inherently cheap, I searched long and hard using Google to find the lowest cost provider. Verisign charges a fortune for these; Thawte is not much cheaper. Comodo themselves have a relatively high price on their own site, but their reselling partners are generally much cheaper. I ended up using KSoftware; their price is right ($USD 85 for one year), and though their web site sucks, and causes all sorts of browser error messages, and will not accept credit card numbers until you give up and use their PayPal payment method, eventually you get to the Comodo site and things go smoothly from there. Since PixelMed is a legitimate business entity already, I had no trouble providing the appropriate credentials (in this case a bank statement by fax) and got the certificate almost immediately. I was also worried about getting just a Microsoft Authenticode certificate, which is all Comodo offer, since I had read all sorts of earlier posts about how to convert the various certificate forms from one to another and into something that the Java jarsigner can use. I need not have worried; since I was using Firefox (on a Mac as it happens), when I picked up my certificate it got automatically saved in the browser's collection of certificates. All I needed to do was then "export" it (to a PKCS12 file, as it happens), specifying a password for that exported file that I would need every time I signed with it; it worked fine with jarsigner, by specifying the exported file as the "-keystore" command line option, and using the "-storetype pkcs12" option (though I am not sure if that is strictly necessary). The CAcert Wiki was somewhat helpful in figuring out some of this.]
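
One practical detail from that exercise: jarsigner also wants the alias of the signing key within the keystore on its command line, which keytool's "-list" option will show; alternatively, a few lines of Java will list the aliases in the exported PKCS12 file. This is just a sketch; the file name and password are placeholders for whatever was used when exporting:

    import java.io.FileInputStream;
    import java.security.KeyStore;
    import java.util.Enumeration;

    public class ListCodeSigningAliases {
        public static void main(String[] args) throws Exception {
            // load the certificate and private key exported from the browser into a PKCS12 file,
            // and list the alias names, one of which jarsigner needs on its command line
            KeyStore keystore = KeyStore.getInstance("PKCS12");
            FileInputStream in = new FileInputStream("exportedcertificate.p12");    // placeholder file name
            keystore.load(in, "exportpassword".toCharArray());                      // placeholder password
            in.close();
            for (Enumeration<String> aliases = keystore.aliases(); aliases.hasMoreElements();) {
                System.out.println(aliases.nextElement());
            }
        }
    }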

How does the applet manifest itself to the user ? Well, when the user navigates to the page, it checks to see whether they have agreed to the contribution agreement; if not, it asks them to do so, and then displays the applet in the page. The user can:
  • specify a reason for the exam that they are going to upload,
  • upload an entire CD (e.g., of DICOM images), or
  • upload selected image files (e.g., of scanned documents like reports)
Once they have chosen an upload option, a file dialog appears to allow them to choose what to upload, and then packaging, compression, encryption and transfer begins immediately. When the process is complete, they can upload more if they like.

You can try this out yourself by going to the PCIR upload page, and uploading your own images; please be sure that you really do agree to contribute these to the public domain if you do so.

What is happening behind the scenes is that:
  • on starting the applet, any existing session information (stored in local preferences) is checked, so as not to keep asking the user to re-agree
  • if it is a new session, then the agreement itself is downloaded from the web site (in order to keep the web site version and the applet-displayed version in concordance) and rendered in a dialog box to the user; they must agree to it in order to proceed
  • when "enter reason" is clicked, a pop-up dialog is opened with buttons that have automatically generated tear-off menus attached to them - these menus allow the user to choose from a pre-defined hierarchy (by category or by alphabetical nesting) of reasons for imaging exams (more about this later)
  • when upload disk or files is selected, a file chooser dialog appears; the reason for the separate buttons is two-fold; firstly, the default directory is different (e.g., the "My Computer" directory on Windows for CDs, or the "My Documents" folder on Windows for files); secondly, there is a well-known Java bug related to not being able to select entire drives under Windows
  • once the user has selected a CD or a set of files, these are packaged into a zip file, compressed whilst doing so, and encrypted using an AES symmetric cipher
  • the packaged, compressed and encrypted files are then transferred to the PCIR server, together with a copy of the symmetric key encrypted with the current PCIR uploading (RSA) public key, as well as an encrypted copy of the contribution agreement; the received files are not accessible for downloading
Note that the files chosen by the user never leave their computer other than in an encrypted form, satisfying a primary requirement of the upload process, namely to protect the contributor's privacy, which is of course of paramount concern.
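
For those curious about the mechanics, here is a minimal sketch of the hybrid encryption pattern just described; it is not the actual applet code, and the cipher transformations, key length and file name are illustrative assumptions (a real implementation would stream the payload through a CipherOutputStream rather than hold it all in memory):

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.security.KeyPair;
    import java.security.KeyPairGenerator;
    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;

    public class HybridEncryptionSketch {
        public static void main(String[] args) throws Exception {
            // generate a fresh AES key for this upload and use it to encrypt the zipped contribution
            KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
            keyGenerator.init(128);
            SecretKey contentKey = keyGenerator.generateKey();
            Cipher aes = Cipher.getInstance("AES");
            aes.init(Cipher.ENCRYPT_MODE, contentKey);
            byte[] encryptedPayload = aes.doFinal(readAllBytes(new File("contribution.zip")));    // placeholder file

            // then encrypt the AES key itself with the repository's RSA public key, so that only the
            // holder of the corresponding private key can recover it (and hence the payload); a freshly
            // generated key pair stands in here for the published PCIR uploading key
            KeyPair repositoryKeyPair = KeyPairGenerator.getInstance("RSA").generateKeyPair();
            Cipher rsa = Cipher.getInstance("RSA");
            rsa.init(Cipher.ENCRYPT_MODE, repositoryKeyPair.getPublic());
            byte[] encryptedContentKey = rsa.doFinal(contentKey.getEncoded());

            // the applet would then transfer encryptedPayload and encryptedContentKey together
            System.out.println(encryptedPayload.length + " payload bytes, " + encryptedContentKey.length + " key bytes");
        }

        private static byte[] readAllBytes(File f) throws IOException {
            byte[] bytes = new byte[(int) f.length()];
            DataInputStream in = new DataInputStream(new FileInputStream(f));
            try { in.readFully(bytes); } finally { in.close(); }
            return bytes;
        }
    }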

Once uploaded, the files enter a manually supervised de-identification process and all images and documents are both mechanically and visually checked for leakage of identifiable information, which is then removed. This includes:
  • editing of the pixel data to remove burned in identification
  • removal of all text strings that contain identifying information
  • checking and removal of either all private attributes, or those that are unsafe
Dates and times are normalized to an epoch, and longitudinal contributions (exams for the same patient on different dates) maintain their relative temporal interval. Some effort is applied to detecting separate contributions from the same individual on different occasions, both by matching one-way hash values derived from the original identifiers, and by detecting persistent session information ("cookies") set in the user's computer's preferences (which is helpful only if they use the same computer and the same account on it to perform successive uploads).
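
To illustrate those two mechanisms, here is a minimal sketch, not the actual PCIR code; the choice of hash algorithm, the identifiers fed into it, and the DICOM date handling are illustrative assumptions:

    import java.security.MessageDigest;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class DeidentificationSketch {
        // one-way hash of the original identifiers, so that later contributions from the same
        // individual can be recognized without the repository retaining the identifiers themselves
        public static String hashOfIdentity(String patientName, String patientID, String birthDate) throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest((patientName + "|" + patientID + "|" + birthDate).getBytes("UTF-8"));
            StringBuffer hex = new StringBuffer();
            for (int i = 0; i < digest.length; ++i) {
                hex.append(Integer.toHexString((digest[i] & 0xff) | 0x100).substring(1));
            }
            return hex.toString();
        }

        // shift all dates for one individual by the same offset, so that the true dates are removed
        // but the interval between longitudinal exams is preserved
        public static String normalizeDate(String dicomDate, Date earliestOriginalDate, Date epoch) throws Exception {
            SimpleDateFormat dicomForm = new SimpleDateFormat("yyyyMMdd");
            long offset = epoch.getTime() - earliestOriginalDate.getTime();
            return dicomForm.format(new Date(dicomForm.parse(dicomDate).getTime() + offset));
        }
    }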

On the download side, since this is only a test site, there is relatively little present; just a few examples. The primary requirement here is to make bulk downloading easy; no unwieldy "shopping cart" interfaces here. The de-identified exams are packaged up as a single set into bzip'd tar files. The reasoning for this is explained in the FAQ, but in short it is to make the most efficient use of the available bandwidth and storage, given that there is no need to "browse" or "visualize" individual images from the PCIR website itself; i.e., you need to download and unpackage the set to use them.

Entering Reasons and Other Conditions.

One of the core issues with having patients contribute their own images is that those images would be more useful in context than alone. The PCIR site tries to encourage the contributor to also scan their radiology and pathology reports, but frankly, this may be too burdensome for many of them. With luck, some uploaded CDs may contain at least the radiology reports. A modest amount of information may occasionally be present in the DICOM image headers. As a fallback position, something that the patient themselves is willing to enter might be better than no information at all.

Accordingly, I put some effort into constructing a set of menus from which the patient could choose a category. After looking at a bunch of different available coding schemes, including SNOMED, ICD-9-CM and ICD-10-CM, I finally settled on the Medical Subject Headings (MeSH) used by the NLM to index articles in medical journals. MeSH seemed to offer a comprehensive range of terms without being too detailed, is not encumbered by expensive or nationally-specific licensing restrictions, can be downloaded in an easily processable XML form, and most importantly, was already organized into hierarchies that translated well into menus. Some massaging was required for the lay person (e.g., to turn words like "neoplasm" into "cancer"), and a few critical categories were missing (such as for healthy screening exams).
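
To give a flavor of the processing involved, here is a minimal sketch of extracting the descriptor names and tree numbers from the MeSH descriptor XML into a sorted map from which nested menus can be built; the element names (DescriptorRecord, String, TreeNumber) are those of the MeSH descriptor file as I recall it, and the naive assumption that the first String element in each record is the descriptor name should be treated as just that, an assumption:

    import java.util.SortedMap;
    import java.util.TreeMap;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class MeSHTreeSketch {
        // build a map from MeSH tree number (e.g., "C04.588") to the descriptor name at that node;
        // because the map is sorted, a parent ("C04") always precedes its children, so a nested
        // menu can be built in a single pass by splitting each tree number on "."
        public static SortedMap<String,String> loadTreeNumbers(String fileName) throws Exception {
            SortedMap<String,String> namesByTreeNumber = new TreeMap<String,String>();
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(fileName);
            NodeList records = document.getElementsByTagName("DescriptorRecord");
            for (int i = 0; i < records.getLength(); ++i) {
                Element record = (Element) records.item(i);
                String name = record.getElementsByTagName("String").item(0).getTextContent();
                NodeList treeNumbers = record.getElementsByTagName("TreeNumber");
                for (int j = 0; j < treeNumbers.getLength(); ++j) {
                    namesByTreeNumber.put(treeNumbers.item(j).getTextContent(), name);
                }
            }
            return namesByTreeNumber;
        }
    }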

Let me know what you think of the result, which you can test by going to the PCIR upload page and clicking "Enter Reason".

David

Sunday, October 14, 2007

On generating and searching static web pages

Summary: Adding search capability to static pages is easy with Google; creating heavily template-based web pages with Apache Forrest is not quite so easy but worth the effort; there are some nice templates around, like Mollio.

Long Version.

Anyone who has looked at my web pages knows that they are lacking in style, both figuratively and literally, at least with respect to appearance. My pitiful excuse is that they started out as a place to disseminate the Medical Image Format FAQ, and since that started out as a set of plain text files posted via Usenet Newsgroups, the bulk of the material had no style to start with. Since then, I guess I have just focused more on content than appearance; shameful, I know.

But recently I have been thinking about how to create a more friendly web site, specifically in the context of the "Patient Contributed Image Repository" site that I am working on building. Since the primary audience will be ordinary people like patients and their families, a pleasant, professional appearance with easily navigable and clearly readable content is required. At the same time I have been working on the XML representation of the DICOM Standard, which we are doing in DocBook, and though I have previously used XSL-T quite a lot to transform structure and extract content, I have also been forced to learn something about CSS.

Accordingly, I began looking around for both nice "templates" to use, as well as ways of automating the transformation of structured content into web pages (without having to re-invent this from scratch).

I am looking only for straightforward navigation and layout, preferably CSS-based, since frames and tables seem to be regarded as passé these days. Frames in particular seem to be positively "harmful" in some folks' opinion. Table-based layout versus CSS-based layout seems to be more a question of ease versus browser compatibility (see for example, Tables Vs. CSS - A Fight to the Death, and Why avoiding tables (for layout) is important). Strict XHTML requires the use of styles anyway, forbidding the legacy appearance-related tags, though of course one can still use tables for layout; but the writing is on the wall: avoiding CSS is just not an option. But such stylesheets are potentially sufficiently complex that using somebody else's professionally designed template seems like a good tactic, especially if that professional is a trained and/or experienced graphic designer or artist.

In my hunt for nice templates for web sites (as opposed to entire documents), the only (free and reusable) ones I have come across so far that I liked enough to recommend are those from Mollio, which have a stark simplicity with sufficient functionality to match most typical modern web sites.

But I was still faced with having to do a lot of untidy manual cutting and pasting on multiple pages, as well as maintaining many internal and external navigation links. Most tedious. Being both an XSL-T and DocBook aficionado, I was most pleased to discover the "Website" package amongst the (many) types of DocBook stylesheet generated output possibilities, and even more pleased to discover that it was reasonably thoroughly documented in the standard text, "DocBook XSL: The Complete Guide". However, before getting too far into playing with it, I found what seemed to be a more "active" set of tools developed from DocBook Website called SilkPage. These stylesheets seemed to be quite a lot easier to use, and more thoroughly documented. Some preliminary experiments were quite promising.

However, yet more searching revealed the existence of the Apache Forrest project, which seems to have taken over where SilkPage left off (and indeed the SilkPage developer, Sina K. Heshmati, seems to have moved on to Forrest). This is dead easy to get going (all pure Java and client-side), including generating a "seed" set of pages with a single command, which can then be edited to include the outline, content and layout that you desire. Though Forrest is still in development and not officially released yet, what is currently supplied looks like it works pretty well, and the default appearance of the seed "project" looks pretty good using the supplied appearance ("skin"), with more skins promised in the future (if you don't want to create your own). You can see some real-world examples here.

Even better, Forrest promises DocBook support as well, though the primary "content" format seems to be something called "xdoc", which contains a limited set of tags and which I gather grew out of the Maven project. I haven't experimented sufficiently yet to decide whether xdoc or DocBook will be more suitable for my new web pages; there is probably a lot more tooling for the latter, but if the former is sufficient I may well opt for its simplicity.

Anyway, if I find anything better I will let you know, but for the time being Forrest seems to satisfy my relatively straightforward requirements of being able to create, and more importantly maintain, a non-trivial set of static page content with a contemporary appearance and navigation.

Of course it goes without saying that the pages will avoid the use of proprietary rubbish like Flash, which I (and it seems, many others) hate with a vengeance and regard as the modern equivalent of flashing text or banners. Indeed, I hate Flash so much that maybe I will start a "Flash-free validation service", maybe with a cute little logo to include on your site if you pass.

To the extent possible, I also want to avoid anything that might be configured off in the user's browser or require plugins or be non-portable across browsers, which includes Javascript, applets, etc.; I don't yet know to what extent Forrest relies on these. The Patient Contributed Image Repository site will allow uploading of files, so something like Java Web Start will probably be required, but there is no getting around that, unfortunately (I do love Java Web Start, by the way, and have had great fun experimenting with pages that automatically download the correct JRE on a platform-neutral and browser-neutral basis and load the right native libraries for JAI, etc., but that is a subject for another day).

I have mentioned static content several times, and this is a consequence of my preference for avoiding server-side deployment issues that require any particular choice of server pages, database, etc., if at all possible. For complex content this always raises the question of how to search for stuff, and in perusing the Forrest documentation I came across a page that addresses this question, which reminded me about the possibility of using Google to search a particular site.

The bottom line is that one just needs to get the right parameters into the URL. A normal Google search for the word "bla" looks like "http://www.google.com/search?q=bla". To constrain the search to a specific site only, such as my site, just add "&sitesearch=www.dclunie.com", giving "http://www.google.com/search?q=bla&sitesearch=www.dclunie.com". Note that the sitesearch parameter can include sub-folders. It is trivial to insert a simple form element in any static web page to do this, and there are some simple examples at Dave Taylor's page. Note in particular that no Javascript is required to do this, no indirection through anybody else's site is required (which some downloadable scripts for this seem to do, perhaps nefariously to gather your details), and you don't have to have a special account or be registered with Google. This is despite links to what apparently used to be the Google Free page describing this, "http://www.google.com/searchcode.html", which now redirects one to a page called Google Custom Search Engine, which seems to imply that more is necessary. I am sure there are more powerful features there, but the simple approach seems good enough.

It took me only a few minutes to augment my own home page with an ugly little search tool and configure it to search not just my own site, but also a few favorites, like the current DICOM standard. Indeed, since these blog pages, though created with Blogger, are actually stored at and served from my primary web site, they get searched as well. As do PDFs, which is particularly cool.

Anyway, just thought you might like to know. Not that I am promising to update my primary site so that it looks halfway decent anytime soon; as I mentioned, this is for a new project. The FAQ, though, is quite structured despite its hand-written content, so it might be possible to automate most of that conversion.

Friday, September 28, 2007

iPhoto and importing large numbers of images

Summary: Panasonic Lumix DMC-FX01 and Nikon D200 rule, iPhoto sucks, external iPhoto libraries still use gobs of temporary space on startup disk during large imports (need to move /private/var/tmp)

Long version:

It has been a while since I posted, which I will attribute partly to having been on a long, computer-free, vacation back in Australia, visiting my parents, and indulging my wife's passion for wild-flower photography. The latter accounts for the timing, in that spring is not perhaps the best time to visit due to the prospect of rain, but a few weeks in September are the only time that the flowers in Western Australia bloom. Which brings me to the subject of the post ...

We took two cameras. The first was a tiny little Panasonic Lumix DMC-FX01, a great little 6MP snapshot camera that we bought immediately when we saw a friend's, because it is one of the very, very few with a macro (5cm) mode. The second was a recently purchased Nikon D200. My old 35mm Nikon had finally given up the ghost (its tiny little almost 15-year-old brain just locked up one day), and on an impulse walking past Adorama on 18th St we decided to get back into the world of "real" photography. Though I have a good set of AF lenses that still work perfectly well with the D200, I also couldn't resist the impulse to replace our old manual macro lens with an AF Micro Nikkor 60mm f2.8, which worked very well for our wildflower photography in the field on this trip. The D200 has more than enough buttons and functions to make a programmer happy and yet in its default mode is easy enough for a normal human being to point and shoot; just be careful not to inadvertently move the auto-focus target away from center with the cursor buttons, which is easy to do. The best thing for me as an old Nikon user is that the buttons and functions are an incremental improvement on the film-based predecessors and hence fall naturally to hand.

So the cameras worked well and on returning with several thousand photos and a severe case of jet lag, I eagerly proceeded to import them into my wife's computer. Like me, Eleanor has always been a Mac aficionado. In her daily work as an artist, she lives and breathes Photoshop, but for expediency's sake uses iPhoto to manage her personal digital photographs. I am ashamed to say that her machine is not the latest and greatest, something that I will rectify next time she is between contracts and can tolerate some downtime, so it has a bit of a hybrid collection of disk drives to spread things around, which brings me to the primary subject of this post.

Since space on the internal startup hard drive was tight, I had recently moved the iPhoto library from its default location at "~/Pictures/iPhoto Library" to a folder on an external USB-attached drive (by copying the library with the Finder, and then holding Option during iPhoto startup, which prompts the user to create a new library or to locate one). This was with iPhoto 2.0, since I had not ever bothered to update iPhoto even when updating her system from Panther to Tiger. Anyhow, everything seemed to work just fine that way, in iPhoto's usual inimitable (sluggish) manner.

So back to importing. I plugged in the D200 via USB with its 8GB CF card with several thousand large JPEGs on board (told you I wasn't a serious photographer; no camera raw format for us at this stage). Told iPhoto to begin the import, and since it looked like it was going to take a really long time, left it overnight.

Surprise, surprise, in the morning a) there was a message that disk space was low on the startup disk, b) only 700 or so photos had imported, and c) the battery in the D200 was now depleted. No message within iPhoto that anything was wrong though, and in particular no message that, despite having started to import 2309 pictures, only a small set had succeeded.

Concluding that I was an idiot for not having ordered an external power supply for the camera in the first place (having either not thought about it or assumed that USB power might be available or suffice), I thought that perhaps that was the root cause of the problem, and ordered a Nikon EH-6 AC Adapter ($79.80 from Amazon; at the price there is no excuse for not having one of these). Whilst waiting for it to arrive I was of course unable to resist playing around though.

Let's swap batteries and just try again and pay a little more attention this time; but first I thought about the matter of running low on disk space on the startup disk. The iPhoto library was definitely on the external drive. There was nothing to speak of in /tmp, before or after quitting iPhoto. Strange, I thought. Assuming that perhaps iPhoto had just been allocating lots of memory to keep track of things, build thumbnails or whatever, and had run out of swap space (wherever that might happen to be stored, something I had not previously considered), I figured a reboot might be in order, and sure enough there was a gigabyte and a half or so spare back again after restarting the machine.

Repeated the import process (selecting not to reimport duplicates when the dialog appeared) and a few more were imported, but again not all, with no message from iPhoto to confirm this or explain why, and a shortage of startup space once again. Tried a few other things, like disabling the inactivity sleep in energy saver as well as moving the library from the external USB drive to another internal hard drive with lots of space, but no joy. Old software perhaps, I mused, time for an upgrade, since I had a family pack of iPhoto 6 kicking around that I occasionally used on my own machine.

Just to confirm that jet lag impairs decision making, this process did not of course go smoothly, though I did at least check that I had a pre-vacation backup of the iPhoto Library, which it turned out that I needed ! Somewhere in the process of repeating the import and/or upgrading iPhoto, I managed to completely corrupt the iPhoto database, I don't recall exactly at which point. No amount of rebuilding the thumbnails and/or the database (hold down Command and Option whilst starting iPhoto) succeeded in recovering it, despite the presence of all the photos themselves right there in the various folders in the library. Since my wife had put considerable effort into sorting her existing (pre-vacation) collection into albums, starting afresh by just re-importing all the photos was not an option, so I restored the old database from the Retrospect backups, started up iPhoto 6 which then wanted to upgrade the library, which it did OK this time, and things seemed more or less back to normal.

Repeated the import from the D200 with exactly the same results - incomplete import, consumption of space on the startup disk, and no helpful message from iPhoto. Each time I repeated the process iPhoto would import a handful more, but never the complete set. I was still running out of battery power, but when my external power supply arrived, nothing changed. Of course the camera remained powered up and mounted as a disk, but the import behavior remained the same.

Hmm; time for some more serious Googling. I could find no specific reference to large imports failing in a similar manner. I did find mention somewhere, though, of iPhoto using "/private/var/tmp", particularly in relation to such a folder not being cleaned up on restart, though that turned out not to be relevant. Having a more than passing familiarity with "/tmp", but never having heard of "/private/var/tmp", some investigation seemed in order.

Sure enough, when iPhoto begins importing, a folder is created in "/private/var/tmp" and it starts to fill up with hundreds if not thousands of large JPEGs named in the same pattern as the camera file names; it seems that the import process involves copying to this folder prior to copying into the iPhoto library folder. It doesn't seem to be the complete set though, and perhaps there is some queue mechanism with one thread copying from the camera to the temporary folder and another copying them to the library and then removing them.

Whatever, the net result is that a large iPhoto import consumes a huge amount of temporary space on the startup disk, and hence fails when that space runs out, even if the iPhoto library is on another disk with plenty of space. Furthermore, though the operating system warns about this (presumably because it competes with swap space), iPhoto itself remains silent, which I find particularly dismaying.

It was possible to work around this simply enough, by:

- create a temporary folder on another drive (e.g., "/Other/private/var/tmp") and give it the appropriate permissions (as root, using chmod to make it rwxrwxrwxt, i.e., mode 1777)
- move the original "/private/var/tmp" folder aside and replace it with a symbolic link (ln -s) to "/Other/private/var/tmp"

Then everything worked fine.

I was a little reluctant to mess with the temporary folder on a running system, and was too lazy to go down to single user mode, but I did quit all other applications and it worked OK; I wasn't game to reboot in this configuration though, in case "/private/var/tmp" is needed during boot before the other file systems are mounted, so I put it back to normal before rebooting.

So, in short, I can't say that I am that impressed by iPhoto's robustness or scalability. To be fair, I have not tried the very latest version, which I gather is iPhoto 7 (in iLife 2008), since one has to pay for the upgrade and it has some significant changes in user interface and library organization so may interfere with my wife's workflow.

But I find it hard to excuse a silent failure to import images, since that has the potential to lose a photographer's hard work if they are not very careful to check for this (e.g., by comparing the expected number of images with the actual number in the last imported "roll").

It's probably a bit harsh to go so far as to state that iPhoto sucks, but, as for most people, it only takes one really bad experience to put me off a product, and this has come close; thankfully no pictures were actually lost though. The last time I was this irritated at a product, to the point of publicly eviscerating it, was when Retrospect did not keep up with support for the internal DVD writers in the new Mac laptops, causing my entire personal backup strategy to go down the toilet. I guess I just have a low tolerance for inconvenience.

As far as my workaround is concerned, I would not recommend it, and post it here only to make folks aware of the problem. Messing with system temporary folders as root is obviously not something for the average digital photographer and Mac user to be doing, so I conclude that, in general, importing a large number of photographs requires a correspondingly large amount of free space on the startup disk to complete reliably, regardless of the location of the iPhoto library itself.

David

PS. Note that at no stage did I check the "remove from camera after import" button in iPhoto. I was not willing to risk the camera images being deleted without having been successfully imported.

Sunday, June 17, 2007

Where to get images for research and testing - Public collections, routine re-use, and the possibility of direct patient contributions

Summary: Large useful collections of publicly accessible medical images for testing and research are few in number; despite public initiatives to build such collections, progress is slow though improving; the additional possibility of having individual members of the public contribute their own images and data directly has been raised; logistic and legal concerns are significant but surmountable, and there would seem to be few privacy and human research regulation issues.

Long version:

I have long fantasized about the existence of a large collection of complete sets of images suitable for research and testing purposes, whether it be for testing image pixel data for different types of compression, display, analysis, or similar studies, or for more mundane tasks like checking for DICOM compliance or testing DICOM-capable tools like PACS and workstations against the installed base of equipment. Indeed, I first developed an interest in DICOM in the early nineties not for clinical interchange, but as a means of formatting and organizing my own teaching and research collections. Little did I know where that would lead !

Traditionally in academic research studies, one begins with a laborious exercise of collecting patient-related images prospectively or retrospectively; this often involves multi-site collaboration, approval by Institutional Review Boards (IRBs), etc.; this is very expensive, time consuming, and frankly, beyond the capabilities of many scientists, engineers, programmers and students who just want to test their ideas, algorithms and code. Further, the folks who need the images may not have the academic affiliations, credibility or stature to even get to first base as far as funding or approval is concerned.

Some of us are fortunate enough to be actively engaged in large scale multi-center clinical trials and industry testing collaborations and we can often find ways of re-purposing and reusing images gathered for other purposes, with the appropriate approvals and permissions. This avenue is not open to many folks who need images though. Some of the NIH folks are keen to remedy this problem by recruiting images from other studies and making them publicly accessible via such mechanisms as the National Cancer Image Archive (NCIA) and the Alzheimer's Disease Neuroimaging Initiative (ADNI) projects to name just a few of several. These projects emphasize the importance of gathering not just any images, but complete sets, in a relatively homogeneous manner with respect to acquisition protocol, at multiple time points in the course of diseases that need to be followed over time, and with additional related data, such as experts' assessment of lesion location and outcome and historical data where relevant. Such efforts still require significant resources and involve sometimes difficult negotiations with respect to funding and permission.

Another option that I have considered in the past is to somehow capture images and associated information as a "side effect" of routine clinical use. For example, many facilities are partially or totally digital already, with respect to images, diagnosis codes and reports if not the entire medical record. Further, many such sites already use "off-site storage" provided by third parties, either as their primary archive or to support disaster recovery. Would it be a difficult step to go a little further and automatically collect and de-identify all such images and related data and make them publicly available for research ?

From a legal perspective, possibly all it would take would be for the facility to add consent and authorization for such routine (as opposed to prospectively identified) re-use purposes; however, each IRB would undoubtedly weigh in with policy and risk-management related issues that might be difficult to get by. And frankly, many physicians might feel threatened by releasing what they otherwise consider their proprietary material, which potentially provides them with a competitive advantage with respect to grant applications and publishing papers. To put it another way, one would need to provide a facility with one hell of an incentive to get by the obstacles that naysayers might raise.

One such incentive might be to provide free or really cheap storage; how many CFOs or CIOs would drool over the possibility of reducing or eliminating bulk data storage costs if a third party (such as a non-profit organization established for the benefit of the public research community) were to underwrite these costs, on the proviso that the data be made available in de-identified form ? Such an incentive might serve to significantly undermine any opposition within an institution. It might be possible to leverage the capabilities of existing commercial providers of off-site archives, who could offer a reduced price for such data sets. Conversely however, less well-intentioned folks might see this as a commercial opportunity and explore the possibility of selling the data instead of making it publicly available for free.

Some existing archive providers also provide the opportunity for patients to contribute and maintain their own images, allowing access to their health care providers as appropriate, myNDMA being an example (though I noticed as I was researching this post that myNDMA are "accepting no new registrations at this time"). Patient empowerment and patient-centric control of one's own destiny is perhaps a concept whose time has come, though obviously only a subset of the population will be willing or able to take on such responsibility. An example of extending this concept to one's entire record is the MedCommons project.

On a previous occasion, frustrated by the difficulty of getting images from a broad range of installed modalities to test DICOM software, I had considered setting up a publicly accessible archive that would also allow anybody from the public at large to contribute. My plan was to canvass the community of digital imaging and PACS users as well as ordinary people undergoing imaging to submit material that I would then de-identify and make available for testing. At the time my primary interest was in the "DICOM-ishness" of the data and not the research applications, though I was interested in complete sets rather than individual images. I did not pursue this, since at about the same time NEMA was initiating an effort to gather images from modality vendors for similar sorts of testing (the NEMA DICOM Object Library). However I was sorely disappointed when, despite my strong protests, the NEMA vendors decided to keep this a closed and secret database not accessible to non-NEMA members or the public, which it remains to this day. Bet you didn't even know about it, did you ?

However, I was reminded of the possibility of direct patient contribution to image archives at a recent Cancer Research and Prevention Foundation Lung Cancer Workshop, during which the concept of approaching patients, people undergoing screening, and survivors for image contributions was raised. A lively conversation among the participants ensued, led by Jim Mulshine, David Yankelevitz and Rick Avila. In essence, most of the attendees were quite excited by this concept, particularly since there is an opportunity to leverage the good will of the survivor-driven charitable organizations to organize and promote such an activity. Kitware has kindly volunteered to coordinate some of this work and you can follow along on their Wiki once it gets under way. Though this was discussed in the context of lung cancer, and particularly with respect to gathering images for CAD testing and validation, the concept is obviously generalizable.

For example, in the absence of a good publicly available collection of images for digital (as opposed to digitized) mammography image compression research, one might consider attempting to build such a collection with the assistance of contributions from individual women. One of the obvious problems with this is the relatively low prevalence of disease; i.e., one might receive far more normal contributions than abnormal, which makes performing research on disease-enriched data more difficult, or conversely, means storing and curating a large amount of data for a relatively low yield of useful information. However, unlike the unfortunate situation for lung cancer, a far higher proportion of women either have a negative biopsy or survive their disease, and potentially a high yield of images with positive findings could be obtained from this group.

Another problem is the matter of gathering additional outcome data; for many types of experiment it is necessary to have some knowledge of the truth beyond what can be ascertained from the images themselves. Contribution of pathology reports and/or follow-up images would be desirable. The former presents problems in that these reports are less often accessible to patients (or screening participants) in digital form, though perhaps they could be scanned or faxed. The latter might be contributed on a separate occasion, but if de-identified, how are they to be linked to the same (anonymized) individual ?

In general, the problem of reliable de-identification and anonymization (or pseudonymization) on a large scale is hard. Sure, one can clean the DICOM header information well enough, especially if one can discard most of the string descriptive and private attributes without affecting reuse, though even that is non-trivial in the general case. The problem of identification burned into the pixel data can at least be detected in a subset of images (by automated algorithms examining header patterns as well as OCR-like analysis of pixel data), which can then be sequestered for manual review. Anything that is not an image though, such as a scanned or faxed document, or even a PDF or HL7 plain text or DICOM structured report, will likely require manual (and hence error-prone) attention. The resource burden of manual de-identification (and the QC process to check on it) is not to be underestimated.

One approach would be to have the contributors themselves actually perform the de-identification, by providing them with an appropriate web-deployed tool with which to contribute, view and edit the content; that way they could both do the work and absolve the archiver from future responsibility in this respect. Indeed, if all the work were performed client-side, the central server would not ever need to have access to or knowledge of the actual Protected Health Information (PHI), which might considerably simplify the necessary security measures. Continuity across contributions would be more difficult but could be achieved with some sort of registration or identity hash based mechanism. It would be a shame if this additional burden were to prove a disincentive to contribute, though.

Thorough de-identification in the general case remains non-trivial though, especially if one goes so far as to consider facial information possibly recognizable from a 3D rendering of images of the head; there are means to disrupt the data to prevent this, but that would make it useless for many (though not all) potential future uses. Though trials on the matter of recognizability are currently under way, there is no consensus on this yet, and perhaps it would be easiest just to have the contributor consent around this issue.

Indeed, on the matter of consent, this might be more challenging than all the procedural and technical and resource issues put together. One would have to be sure that the contribution agreement would stand legal scrutiny, cover all potential uses of the data, irrevocably, and allow for the archive maintainer to disclaim any liability. Liability might include not only privacy concerns, but also responsibility to feed back any findings with respect to the data to the contributor. For example, in the case of CAD testing, one would not want the contributor to have the (unrealistic) expectation that if a future CAD experiment found something undesirable that they would receive feedback that would impact their care. Such an agreement would somehow need to be "signed", presumably, to have any legal standing, and a mechanism to do this via the web at the time of contribution and to archive the signature would be necessary.

Note that I distinguish the matter of the individual contribution agreement with respect to permission and liability from the matter of permission from others. To my knowledge, at least in the US, there are no regulations that would govern the establishment of such a repository of images. Whilst the HIPAA Privacy and Security rules might provide helpful guidance, the repository would not in and of itself be a Covered Entity, and hence would not be subject to the rules. Further, since contributions would be directly from individuals rather than Covered Entities, no HIPAA provisions on the sending side would come into play.

Would some form of IRB approval be required, either to contribute, maintain, or use any of the data ? The US federal regulation on Protection of Human Subjects, which potentially applies to federally funded activities, specifically exempts "research involving the collection or study of existing data ... if these sources are publicly available or if the information is recorded ... in such a manner that subjects cannot be identified ..." (45 CFR 46.101(b)(4)).

However, whilst there might be no formal need for IRB approval, review of the policies and procedures and agreements by some form of central IRB might well be worthwhile to mitigate any concern that the rights of the contributors might be abused. Perhaps the NCI's Central IRB (CIRB) Initiative might be willing to take on this responsibility. One could envisage drafting a set of standard "open source" pre-approved documents that would allow any number of willing organizations to implement and replicate this strategy.

This is of course a somewhat US-centric view of the privacy and human research situation biased by my own experience; since any such repository might be open to global contributions, a further analysis of the issues in other countries is desirable.

But the bottom line is that there would seem to be few if any restrictions on a person who has access to their own record in electronic form using it in any manner they see fit, and hence contributing it to such a research collection for the public good. Whilst one may debate who actually "owns" the data, I hope few would be so crass as to attempt to restrict an individual's use of their own personal data in such a manner.

What remains now is for those of us who see merit in this approach to take action to make it happen, and in such a manner that the data becomes useful in advancing the state of the art.

David

Sunday, June 3, 2007

On the lack of DICOM Police, the example of IHE content profiles, and the need for usability standards and cross-certification ...

Summary: Neither DICOM nor IHE may be sufficient to solve users' real-world problems with the usability of imaging devices; neither a hypothetical DICOM police nor the existing IHE Connectathon process would solve this problem; and there may be a need for a new type of "usability" standard and certification process, even to the extent of cross-certification of combinations of devices.

Long version:

As everyone is fond of saying, there are no "DICOM police".

NEMA, for example, specifically disclaims responsibility for policing or enforcing compliance with the DICOM standard. There is no DICOM "certification".

Nor is there an "IHE police", nor, for the time being, IHE "certification".

Some folks are under the mistaken impression that successful participation at an IHE Connectathon represents some sort of certification, but what is tested at a Connectathon is not necessarily a product and may be a prototype, and often is not representative of what you can go out and buy, now or ever. Furthermore, the IHE tests are limited in scope and depth, not only by the limits of the "profiles" being tested but also by the rigor of the tests themselves. For example, though vendors may demonstrate transfer of images within a specified workflow with the correct identifiers during the Connectathon, whether those images will be usable in any meaningful fashion by the receiver is not tested. These issues may be addressed over time as the IHE testing approach matures and is revised, and more "content" profiles like NM and Mammo are developed and tested. The Connectathon is a fantastic cooperative effort and an enormous investment of time and resources that results in considerable progress, but the fact remains that products are not certified during this process.

Hence the publicly posted "Connectathon Results" are only a guide to what vendors might or might not choose to make available as product; one is left to rely on so-called "self-certification" by the vendors. Vendors dutifully provide DICOM Conformance Statements and IHE Integration Statements, which both guide users with respect to what features are supposed to be available and outline what a product is supposed to do, but it seems that not infrequently products remain deficient in some small or significant way, either with respect to what is claimed, or even with respect to correct implementation of the underlying standard.

Who then, will police the compliance of the vendor in this respect? Currently, this is left to the users, or the experts with whom they consult. The vendors mostly appear to act in good faith, but when problems arise some are none too swift to acknowledge that they are at fault or to provide a solution.

But even if there were a DICOM (or IHE) police, would it actually help the users ?

Take for example the matter of compliance with the standard with respect to the encoding of images for a particular modality, say projection X-ray using the DICOM DX image object. Consider a frontal chest X-ray, which, depending on whether it is taken AP or PA, might from the perspective of the pixels read out from the detector have the left or the right side of the patient orientated towards the right side of the image. Now, the DICOM standard does NOT say that the transmitted image must be oriented in any particular manner; rather it says that the orientation of the rows and columns must be sent. In this case the row orientation would be sent either as towards the patient's left, meaning that the pixel data if rendered that way would look the way (most) radiologists would expect it, or the row orientation might be sent towards the patient's right, meaning that the receiver could use this orientation to flip the image into the expected orientation.

And therein lies the rub, since no standard, DICOM or IHE, currently requires that the receiver flip the image into the "desired" orientation for display based on the encoded orientation parameters. So a completely DICOM (and IHE) compliant storage SCU (Acquisition Modality actor) could encode an image in one orientation, and a DICOM (and IHE) compliant storage SCP (Image Display actor) could display it, and the user would still be unsatisfied and have to manually flip the image. No DICOM (or IHE) police or certification or anything else would be able to solve this problem for the user, beyond explaining it.
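
For the curious, the kind of receiver behavior I have in mind is hardly rocket science; the following is a minimal sketch, not anything mandated by the standard, that flips a frontal image horizontally when the first (row) value of Patient Orientation (0020,0020) indicates that the patient's right side is towards the right of the image, ignoring for simplicity compound orientation values and the corresponding vertical case:

    import java.awt.geom.AffineTransform;
    import java.awt.image.AffineTransformOp;
    import java.awt.image.BufferedImage;

    public class OrientationFlipSketch {
        // rowDirection is the first value of Patient Orientation (0020,0020), i.e., the direction
        // towards the right-hand side of the image as sent; if that is the patient's Right, flip
        // horizontally so that the image hangs with the patient's left towards the viewer's right
        public static BufferedImage hangConventionally(BufferedImage image, String rowDirection) {
            if (rowDirection != null && rowDirection.startsWith("R")) {
                AffineTransform flip = AffineTransform.getScaleInstance(-1, 1);
                flip.translate(-image.getWidth(), 0);
                return new AffineTransformOp(flip, AffineTransformOp.TYPE_NEAREST_NEIGHBOR).filter(image, null);
            }
            return image;    // already in the expected orientation (or orientation unknown)
        }
    }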

Conversely, if the modality were to not send the orientation at all, and violate the DICOM standard in this respect, if the pixels happened to be oriented correctly, the user experience would be satisfactory, and no problem would be perceived (except perhaps for the absence of an orientation marker to indicate the side). Indeed this would typically be the case for devices that use the older CR image object in DICOM, which allows the orientation to be empty, ostensibly on the grounds that sometimes it won't be known (e.g., there is a plate reader but no means for the operator to enter this information on the QC workstation, if there is one).

The acquisition modality vendors may solve this problem by making the sending device configurable in such a manner as to "flip" the images as necessary to give the expected result at the other end, either automatically or with the assistance of the operator, but the fact remains that this sort of configurability is not required by the standards.

Another example would be the matter of display shutters, such as to blank out the perimeter around a circular or rectangular angiography or RF acquisition, so that it remains black regardless of whether the image is inverted or not. The DICOM standard defines their existence encoded within an image, but does not mandate their application by the display (unlike in a presentation state). I was recently reminded of this when there was a compatibility issue between one vendor's acquisition device and another's PACS. The modality was sending a display shutter and the PACS was ignoring it, and the resulting white background was unacceptable to the user. A modality vendor would typically provide a configuration option to burn in the background as black in this case (resulting in white when inverted, but you can't configure around everything), and so handle the lame PACS, but this particular modality did not have that feature. The PACS vendor had, I am told, only just released display shutter capability in a new and expensive release, so the user was essentially out of luck. Again, there would be no help from the DICOM police in this regard, assuming they could only act within the bounds of the "law" (what is written in the standard). Furthermore, it is very difficult to ascertain a priori from conformance statements what is possible in these situations, there typically being little if any documentation of the scope of configuration possible on the sending end, or the display behavior on the receiving end.
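
Honoring a circular shutter is not hard either; the following is a minimal sketch driven by the Display Shutter Module attributes Center of Circular Shutter (0018,1610), Radius of Circular Shutter (0018,1612) and Shutter Presentation Value (0018,1622). A display that really honors the shutter masks the perimeter at the end of its rendering pipeline, after any windowing or inversion, which is what keeps it dark when the image is inverted; here the mask is simply applied to an already-rendered 8-bit image, with the P-value collapsed to a gray level, purely for illustration:

    import java.awt.image.BufferedImage;

    public class CircularShutterSketch {
        // mask everything outside the circular shutter with the shutter gray level; in a real
        // display this would be done after windowing and inversion so that the perimeter stays dark
        public static void apply(BufferedImage image, int centerRow, int centerColumn, int radius, int shutterGray) {
            int rgb = (shutterGray << 16) | (shutterGray << 8) | shutterGray;
            for (int row = 0; row < image.getHeight(); ++row) {
                for (int column = 0; column < image.getWidth(); ++column) {
                    long dr = row - centerRow;
                    long dc = column - centerColumn;
                    if (dr * dr + dc * dc > (long) radius * radius) {
                        image.setRGB(column, row, rgb);
                    }
                }
            }
        }
    }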

So, one is inevitably led to the conclusion that the standards are insufficient to satisfy the users' needs in this regard, and that DICOM police or certification, whilst arguably necessary, would not be sufficient in their own right.

Or, to put it another way, there seems to be a need for "usability" standards, perhaps layered on top of the DICOM and IHE standards. This is an area that vendors may be reluctant to address, since such standards might erode what they see as "added value" (though many users might argue the same features are "bare necessities"), and because they are a source of risk: a vendor that fails to offer the complete spectrum of "usability" requirements might find its product unmarketable.

There are two categories of precedent for this sort of thing that may be relevant. One category includes the IHE Radiology "content" profiles, specifically NM and Mammography; the other is the federally-mandated certification effort, exemplified currently by the Certification Commission for Healthcare Information Technology (CCHIT).

The IHE content profiles differ from much of the other radiology work in IHE in that they are less about workflow and more about modality-display interaction. Anyone with NM experience knows exactly how woeful most general purpose PACS are with respect to handling NM images, whether in terms of providing interface tools with which NM physicians are comfortable or layout and reconstruction capability appropriate to different types of acquisition, not to mention analytic tools for quantitative measurements, especially cardiac. The NM folks (in the form of the SNM) finally said enough was enough and ultimately decided to work through the IHE framework to achieve their goal. I have little experience in this area, so cannot say to what extent this profile has actually influenced purchasable products or helped users in the real world, but the effort paved the way for content profiles that specify image display behavior in detail.

The IHE Mammo profile, on the other hand, is one that I was directly involved in. In this case a bunch of very disgruntled users who had faced the realities of owning multiple vendors' FFDM equipment and trying to use it in high volume environments expressed their disappointment at a special SCAR session, which resulted in the formation of a sub-committee in IHE to address the concerns, and ultimately in a profile that specified mutually compatible requirements for both modalities and displays.

The process by which the Mammo profile was developed is instructive. First the users expressed their concerns and requirements with respect to real world experience with products; second, the FFDM and dedicated display system vendors admitted that there were problems and expressed willingness to engage in a dialog; third, everyone met together face-to-face to hash out what the priorities were and where there was common ground. There was considerable argument on the fringes, especially with respect to exactly how much application behavior could be standardized or required as a minimum, and which of several competing solutions to choose for particular problems when there existed an installed base of incompatible solutions, but ultimately a reasonable compromise was reached. The users insisted that deployment be swift and arranged a series of public demonstrations at short intervals to ensure that progress was made.

What distinguishes the Mammo profile is that it is very specific about how displays behave, in particular what features they must have (e.g., the ability to display current and prior images at the same size regardless of vendor and detector pitch, to display true size, to display CAD marks, and to annotate in a particular way to meet regulatory and quality requirements), and about which DICOM header attributes to use, and in what manner, to implement those features. Further, given the different types of processing and grayscale contrast from the various detectors, the display is required to implement all of the possible grayscale contrast windowing and lookup table mechanisms, not just a vendor-specific subset. That is, in some cases the vendors agreed to standardize on the "intersection" of the various possibilities, and in other cases on the "union" of all of them, depending on the impact on the installed base and the usability of the feature.
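To illustrate what supporting the "union" of grayscale mechanisms means in practice, pydicom's apply_voi_lut helper will apply an explicit VOI LUT Sequence if one is present and otherwise fall back to Window Center/Width; the little wrapper around it is hypothetical:

    import pydicom
    from pydicom.pixel_data_handlers.util import apply_voi_lut

    def voi_transformed(path):
        # Apply whichever VOI (contrast) mechanism the image actually carries:
        # an explicit VOI LUT Sequence if present, otherwise Window Center/Width.
        # A profile-compliant display has to cope with all of them, not just
        # the subset its own favoured FFDM modality happens to send.
        ds = pydicom.dcmread(path)
        return apply_voi_lut(ds.pixel_array, ds)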

This cooperative effort seems successful so far, though I am biased in this assessment having been intimately involved. However, is it scalable to more ambitious "content", "functional" or "usability" specifications, either within IHE or elsewhere ?

The mammography effort was made considerably easier by the fact that the digital mammography user and vendor communities are relatively small and tightly focused, if by no other factor than the regulatory burden imposed by MQSA. Everyone knows everyone else, basically everybody gets along and likes one another, and it is hard to take too unreasonable a stance in this group for very long. A certain amount of "cat herding" was required of course, but on a level of difficulty scale of 1 to 10, I would rate this one about a 4.

One risk to scalability is that "users" will not bother to ask for the IHE profile in their RFPs and contracts, and will buy whatever non-compliant lame "mammo package" their existing PACS vendor deigns to offer and force their radiologists to use it. This risk could be mitigated if the FDA were to require that only certified products were used for primary interpretation, but this would be a very special case since mammography is about the only area in which the FDA has authority to regulate the practice of medicine, and is not generally applicable. Other organizations, like JCAHO or third party payors could require certified compliance, but would there be any benefit for them to do so ?

Another risk with respect to generalizing the approach is the lack of interest by users in developing usability standards. The mammography and NM examples were perhaps atypical in that there were highly motivated individuals to champion the cause who devoted enormous amounts of time and energy with the support of their organizations. Is this degree of user involvement likely to be repeatable in other areas where the problems may not be so acutely felt, where the scope is broader, or the problem is larger in scale ?

Likewise, there is the risk that the vendors will be unresponsive to such efforts. Both DICOM and IHE development have been characterized by the active participation (some might say total domination) of vendors and have as a consequence been at least somewhat successful. Externally imposed standards to which there may be outright vendor opposition would be less likely to be successful.

On the subject of scale, it is potentially enormous, if one were to go to the extent of defining the required functionality of an entire PACS with respect to usability of workflow and display. Anyone who has written requirements specifications and test scripts for the implementation of such products is familiar with the level of effort, but then again, since this has already been done internally by vendors many times over, it is not a new experience.

To that end it may be instructive to review the work of CCHIT so far; kick-started by federal funding and a requirement to certify ambulatory EHRs, this effort has produced some interesting materials, even if one is not a fan of the politics involved. On their web site you can find documentation of their process, the functional requirements against which certification takes place, the actual test scripts that are used, as well as the public comments received as these materials were being developed, which give an interesting insight into the vendors' opinion of the process and its expense, as well as of the heavy-handedness of CCHIT.

I have no involvement in this process at all, so can't speak to its success or value so far, and you can read the materials as well as I can. It is interesting, though, to review the functionality criteria for an ambulatory EHR and envisage how one might write similar criteria for a PACS, and likewise to review the test scripts for those criteria from the perspective of testing an Image Display with the same approach. To return to the example at the beginning of this entry, one could envisage a criterion for a PACS such as "shall be able to display a frontal chest x-ray rotated or flipped into the correct orientation based on the DICOM orientation description" and a corresponding test script entry with a range of test materials that included images encoded in a manner that required such flipping. This is exactly the sort of testing that we did for the IHE Mammo profile.
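As a sketch of what one automated entry in such a test script might look like (the criterion wording above is my own, and so is the assumption that the product under test can be driven to return the pixel matrix it actually displays), consider:

    import numpy as np
    import pydicom

    def test_frontal_chest_flip(display_under_test, test_image_path):
        # Hypothetical test script entry for the criterion "shall be able to
        # display a frontal chest x-ray rotated or flipped into the correct
        # orientation based on the DICOM orientation description".
        # The test material is deliberately encoded with the rows running
        # towards the patient's right, so a correct display must mirror it.
        ds = pydicom.dcmread(test_image_path)
        assert ds.PatientOrientation[0].startswith("R"), "wrong test material"

        displayed = display_under_test(test_image_path)

        # Pass if the displayed pixels are the horizontal mirror of the stored
        # pixels, i.e. the patient's left ends up on the viewer's right.
        assert np.array_equal(displayed, np.fliplr(ds.pixel_array))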

If this were to be done, would self-attestation or self-certification be sufficient or would there need to be in addition external verification and certification such as CCHIT performs ?

Who would require either form of certification ? The users themselves ? The payors ? The government ?

What would be the appropriate organization to perform such work ? Would CCHIT take on imaging or do they have enough on their plate, not to mention no expertise in this area ? Could or would IHE do it, particularly now that it has grown well beyond radiology into other domains that have their own issues and priorities ? Would ACR, who are all very eager to "accredit" modalities, be interested in or capable of this ? SIIM would perhaps be a logical choice, were it not for the apparent influence vendors have on their decision making process about things controversial. How about RSNA, or are they too invested in IHE already to begin a separate effort if one were thought to be necessary ?

Or is there a need for yet another independent organization to do this ? If so who would start it ? Who would run it ? Who would pay for it ?

And ultimately, would "standalone" certification against criteria be of sufficient benefit ? It would be a start, but if there is one thing that the IHE Connectathons have demonstrated it is that the proof is in the testing of multiple devices working together. To that end, does one need an infrastructure to support certification of permutations and combinations of devices inter-operating together, either in a test environment or in the field ?

One could envisage an approach in which the two (or more) vendors involved submitted a "joint application" for certification of a combination, evaluated against specific criteria based on the first actual deployment. Funding, implementing, monitoring and promulgating this information would be a challenge, but perhaps not insurmountable.

Imagine, in the display shutter example, that the forward-thinking purchaser of the PACS had included in their support contract a requirement that the PACS vendor participate in such cross-certification activities as new modalities were acquired by the site; likewise, before accepting the new modality, the site would have required the same of the modality vendor. If both had previously been cross-tested satisfactorily they would already be certified, and indeed the purchaser would have known this by consulting the certifying authority's web site; any limitations would have been publicly documented and disclosed. If the particular combination had not been tested, then a first-time test would need to be performed against the certification criteria, supervised by some sort of "designated examiner" trained and licensed by the certifying authority. The result, whether successful or not, would be promulgated in full. Fees to cover the cost would be payable by the pair of vendors, and they would recover this in their service contracts or purchase prices. If one or other of the vendors refused to participate, the user could still execute the (publicly available) test script themselves at their own expense, with or without an examiner; the results could still be promulgated with or without either vendor's prior approval, and failure might be a clue to the user not to accept the modality, or to plan to replace their PACS.

So we have come full circle, in that this is exactly the sort of paradigm that the IHE Connectathon supports. Except that it would involve products rather than experimental or prototype systems; the details of test script execution would be fully public, rather than categorized as a simple pass/fail or prevented from disclosure by confidentiality agreements; a considerably more comprehensive range of old and new products would be tested; the result would be a formal certification; the criteria would be at a level that addressed functionality and usability, not just message transfer and workflow; and users and sites could specify the certifications as criteria in their purchase and support contracts.

Or perhaps, the "great learning experience" for engineers, which is essentially what the Connectathon is, could be translated into a formalized process of direct, rather than only indirect (albeit very important), benefit to the user.

David
