Some background material for Semantic Web topics for Architecture Conference Call Feb 28

Peter Li has asked me to talk a bit on the Architecture call on Tuesday, Feb 28, about my proposal to use semantic web technologies within the OSEHRA efforts.  This is a continuation of the comments at http://osehra.org/blog/progress-rdfsparql-repository-vista-foundation-schema

The basic idea is to create a repository of all the elements required to define a VistA instance in a format that can be queried and manipulated with semantic web technology.  This consists of:

  • Naming every element with a unique Uniform Resource Identifier (URI),
  • Defining the relationships and dependencies between these elements using a formal schema language, such as RDF Schema (RDFS) or OWL (see http://www.metavista.name/foundation/foundation.rdfs for a prototype of this idea),
  • Making the repository available as an RDF/SPARQL endpoint, so that we have a single place from which a VistA instance can be queried to determine anything we want to know about the foundation: software releases, ontologies in use, device information, localizations, hardware configurations, logs, privacy and security information, and so on.
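
As a rough illustration of these three steps, here is a minimal sketch in plain Python.  It is not the prototype schema linked above: the example.org namespace, the dependsOn and partOf predicates, and the routine and package names are all hypothetical placeholders, and a real repository would use an RDF store and SPARQL rather than this toy in-memory pattern matcher.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical namespace: the actual URI scheme is not specified here.
BASE = "http://example.org/vista/"

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def uri(kind: str, name: str) -> str:
    """Mint a unique URI for one element of a VistA instance."""
    return f"{BASE}{kind}/{name}"


# Hypothetical predicates standing in for terms from a formal schema.
DEPENDS_ON = BASE + "schema#dependsOn"
PART_OF = BASE + "schema#partOf"

# Illustrative statements only; the names are made up.
TRIPLES: Set[Triple] = {
    (uri("routine", "ROUTINEA"), PART_OF, uri("package", "PackageX")),
    (uri("routine", "ROUTINEB"), PART_OF, uri("package", "PackageX")),
    (uri("routine", "ROUTINEC"), PART_OF, uri("package", "PackageY")),
    (uri("routine", "ROUTINEA"), DEPENDS_ON, uri("routine", "ROUTINEB")),
    (uri("routine", "ROUTINEC"), DEPENDS_ON, uri("routine", "ROUTINEB")),
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# "Which elements depend on ROUTINEB?"
dependents = [
    subject
    for subject, _, _ in match(
        TRIPLES, p=DEPENDS_ON, o=uri("routine", "ROUTINEB")
    )
]
```

The pattern passed to match plays the role that a triple pattern such as `?s :dependsOn :ROUTINEB` would play in a SPARQL query against the real endpoint.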

The Foundation repository can be used interactively via query languages, or programmatically via API calls.  It would serve as a common platform from which to manage other OSEHRA activities, such as:

  • Refactoring.  The XINDEX dependency information could be loaded into the repository, and then used as a resource in a refactoring workbench, such as an Eclipse plugin.  The foundation could be used to do a network analysis of the VistA code, to determine centrality of references, display dependencies graphically, or apply OWL-based inferences and consistency rules.
  • Portability.  The foundation could be used for the SKIDS module, to track which elements are being ported.  The consistency constraints of the foundation schema could also be used to ensure compatibility of the ported packages, including higher-level semantic issues such as clinical ontologies.
  • Privacy and security.  Dealing with privacy and security, particularly in a system as large-scale and dynamic as the VA or DoD, requires a great deal of sophistication.  http://wiki.hl7.org/index.php?title=Security_and_Privacy_Ontology is a semantic-web-oriented approach that addresses this, and looks like a good fit for inclusion in the foundation.  (As an aside, I led the security and privacy efforts for the original VistA and CHCS efforts, from incorporating them into the basic data dictionary work through federal certification.  To my knowledge, neither of these systems has been breached through technological means; the problems have always been authorized users abusing their authority.  If anyone knows of any technical breaches, please let me know.)
  • Architecture.  The repository would provide a common basis for the architecture group to connect its UML work to a more executable and queryable format in RDF/SPARQL.  It also provides a tool for managing installed VistA implementations in the field.  A possible extension of this idea is a broader vision of a semantic-web-oriented health information space, dealing with the full semantics of clinical interaction and medical informatics.
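
To make the network analysis mentioned under Refactoring a little more concrete, here is a minimal sketch, again in plain Python.  The caller-to-callee table is a hypothetical stand-in for dependency information of the kind XINDEX reports (it is not XINDEX's actual output format), the routine names are made up, and in-degree is only the simplest of many possible centrality measures.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical caller -> callees edges; names are illustrative only.
CALLS: Dict[str, Set[str]] = {
    "ROUTINEA": {"ROUTINEB", "ROUTINEC"},
    "ROUTINEB": {"ROUTINEC"},
    "ROUTINEC": set(),
    "ROUTINED": {"ROUTINEA", "ROUTINEC"},
}


def in_degree(calls: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count how many distinct routines reference each routine."""
    counts = {name: 0 for name in calls}
    for callees in calls.values():
        for callee in callees:
            counts[callee] = counts.get(callee, 0) + 1
    return counts


def rank_by_references(calls: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Most-referenced routines first; ties broken alphabetically."""
    return sorted(in_degree(calls).items(), key=lambda kv: (-kv[1], kv[0]))


def transitive_dependencies(calls: Dict[str, Set[str]], start: str) -> Set[str]:
    """Everything reachable from start by following call edges.

    If the graph contains a cycle through start, start itself
    appears in the result.
    """
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for callee in calls.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


ranking = rank_by_references(CALLS)
needed_by_d = transitive_dependencies(CALLS, "ROUTINED")
```

A routine at the top of the ranking is one that many others lean on, so changing it carries the widest risk; the transitive set is the list of elements that would have to travel with a routine if it were ported.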

Some background information.

Security and Privacy Ontology project of HL7.  A tip of the hat to Mike Davis of the VA, who has been active in this effort.

Conversation between Ward Cunningham and Tom Munnecke at Health Camp Oregon 2011.  Ward Cunningham, inventor of the Wiki, talks with me about some of the overlapping ideas he had in the design of the Wiki and I had in the original design of VistA.  Of particular interest is our focus on the notion of a dynamic information space connected by language references, rather than a static "integrated system" connected by pre-defined APIs.

Semantic VistA: Conor Dowling's work to create a FileMan Query Language based on a semantic web interpretation of the VistA data dictionary.

MITRE hData initiative: This is a step towards a semantic approach to clinical data, even if they are shy about calling it such.  "At its core, hData is a simple way of organizing data into small pieces in folders, with a corresponding set of XML Schemas or other appropriate media types (such as e.g. PDFs, PNGs, or DICOM containers) to describe the small pieces.  Currently, hData has one RESTful transport binding, which is being standardized at the Object Management Group.  Future transports may include SMTP or XMPP transports."

Tim Berners-Lee's TED talk on the Next Web.  Tim talks about his vision of Linked Data, which can be seen as the next generation (and, perhaps, a more easily understood version) of the Semantic Web.  (Around 1995, soon after Tim came to MIT from CERN, I arranged a meeting between him and VA folks including Rob Kolodner, Clayton Curtis, and Jim Demetriades.  We also met with MIT Professor Pete Szolovits and Harvard's Isaac Kohane.  This was the seminal meeting that led to the creation of My HealtheVet.)

The Promise of Semantic Web in Genome technology.  Creating a semantic web foundation for VistA would allow it to participate in (and manage the security and privacy of) the upcoming revolution in genomics.

Semantic web and Social Networks.  There is some very interesting research linking health and social networks in far richer ways than just "normal distribution" population studies.  This has far-reaching implications for dealing with global pandemics, vaccination, and other possible interactions with "the science of the individual" as discussed by Eric Topol in The Creative Destruction of Medicine.
