Database Elements Problem 1: Standard or Local Data

So then which aspects of what VISTA is made of make this more complex than it seems?
 
Many things. Let's start with the data.
 
In some files, such as the State file (#5), the contents are shared. It's not like the number of U.S. states changes depending on whether your VISTA system is running at a VA hospital in Tampa, Florida or Cheyenne, Wyoming. This file is not yet optimized for use by other nations, so at present it's pretty much just a description of U.S. states, which means its contents should be the same for all VISTA systems. Therefore, when we ship this file, we should include the data.
 
Not so other files, such as the Patient file (#2), whose contents are strictly local. This file holds site-specific data, so Tampa's Patient file needs to describe Tampa's patients, while Cheyenne's Patient file needs to describe Cheyenne's patients. Therefore, when we ship this file, we should not usually include the data - except for special circumstances, such as when we're setting up a local test environment.
 
This variable - whether the data are shared, that is, standardized, and therefore should be shipped with the file, or whether the definition alone should be shipped - makes a huge difference in how we version-control a file.
 
Version control is partly just about tracking changes over time. From that perspective, it's fine to keep track of a file's contents locally.
 
But version control is also partly about tracking deviations from a reference standard, to know if what we're running is standard or up to date or otherwise okay. From that perspective, tracking the contents of a file whose data are nonstandard by design would create a terrible signal-to-noise problem, since it would endlessly be reporting the "problem" that our site's Patient file has different contents than another site's. From a standards perspective, we need version control to be able to worry about whether the State file deviates from the standard but not whether the Patient file does.
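To make the signal-to-noise point concrete, here is a minimal sketch in Python of a deviation check that looks only at files whose data are meant to be standard. Everything in it is hypothetical: the FileSnapshot type, its standardized flag, and the deviation_report function are illustrations, not part of Fileman or of any existing VISTA tool, and a file's data are reduced to a simple set of entry names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileSnapshot:
    """Hypothetical, simplified picture of one Fileman file at one site."""
    number: str          # Fileman file number, e.g. "5" for the State file
    name: str
    standardized: bool   # assumed flag: True if data should match at every site
    entries: frozenset   # stand-in for the file's data


def deviation_report(local, reference):
    """Return names of standardized files whose data differ from the reference."""
    ref_by_number = {f.number: f for f in reference}
    deviations = []
    for f in local:
        if not f.standardized:
            continue  # local data are expected to differ, so this is not a problem
        ref = ref_by_number.get(f.number)
        if ref is None or ref.entries != f.entries:
            deviations.append(f.name)
    return deviations


reference = [
    FileSnapshot("5", "State", True, frozenset({"FLORIDA", "WYOMING"})),
    FileSnapshot("2", "Patient", False, frozenset()),
]
tampa = [
    FileSnapshot("5", "State", True, frozenset({"FLORIDA"})),        # an entry is missing
    FileSnapshot("2", "Patient", False, frozenset({"PATIENT,ONE"})), # local data
]
print(deviation_report(tampa, reference))  # prints ['State']
```

The Patient file differs from the reference too, but because it is not marked as standardized, the report stays silent about it and flags only the State file.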
 
In other words, there is inherently no single right answer to the question "Should a file's data be included in the elements we version-control?" The answer is "It depends on the file and on what we're using version control to do for us."
 
One could conceivably still automate this. VISTA's interface to version-control systems might include parameters, and two different version-control repositories or modalities might be used, one for just tracking change over time, the other for running deltas against the standard. Similarly, we would need to have a flag on each file to say whether its data are standardized.
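As a rough illustration of what such automation might decide, the following Python sketch combines the two inputs just described: a per-file flag and the purpose the repository serves. The names Purpose and include_data are invented for this example, and the per-file standardized flag is exactly the attribute that does not exist yet, so treat this as a sketch under those assumptions rather than a description of any real interface.

```python
from enum import Enum


class Purpose(Enum):
    """Hypothetical modes for the two kinds of repository."""
    TRACK_HISTORY = "track local change over time"
    COMPARE_TO_STANDARD = "run deltas against the standard"


def include_data(standardized: bool, purpose: Purpose) -> bool:
    """Decide whether a file's data should be version-controlled with its definition."""
    if purpose is Purpose.TRACK_HISTORY:
        # Tracking change over time: keeping local contents is fine.
        return True
    # Comparing to the standard: only standardized data are meaningful to diff.
    return standardized


# State file (#5): standardized data. Patient file (#2): strictly local data.
for name, standardized in [("State", True), ("Patient", False)]:
    for purpose in Purpose:
        print(name, "|", purpose.value, "|", include_data(standardized, purpose))
```

The only combination that excludes the data is a local file, such as Patient, in the repository used for comparison against the standard, which is the case that would otherwise drown the report in noise.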
 
Unfortunately, we have none of these things today. The OSEHRA Forum project is introducing automated interfaces between VISTA and external, industry-standard version-control systems for the first time, step by step, but this software does not yet exist in the field, and even the current software under development is not yet ready to perform this kind of automation.
 
For one thing, Fileman - like all other database systems I know of - is not truly aware of the existence of other databases, Fileman or otherwise. Each database thinks it is the only database in the cosmos. None of them keeps a table listing some or all of the other databases in the world, which makes it difficult to meaningfully express the idea that a file's contents might or might not be the same across different databases. For now we could add an attribute to the File file or the DD (data dictionary) that says whether a file's data are standardized. Fileman would have no full semantic appreciation of what that means, but the attribute would be enough to set the right default behavior when exporting and importing files. Fileman does not currently include such an attribute.
 
Therefore, at present, these kinds of distinctions are made manually, by expert application developers. In development shops that do not allow developers to become experts at any one application, because they are rotated from project to project on an ad hoc basis, the potential for error is high. So even such an easy-to-ask question as "Should we send the data?" can be difficult to answer correctly and consistently.
 
The answer so far to the question "What is VISTA made of?" is "Routine and non-routine software elements, some of which are complex to manage."
 
This is only the tip of the iceberg when it comes to the complexities introduced by non-routine software elements. More next time.