TDWG Subgroup on Accession Data

Stan Blum on semantics, syntax, and flat file data standards

Walter Berendsohn, October 4 1996, to Stan Blum: [HISPID] "trying to do semantics along with exchange protocol syntax". Could you very briefly state what harm is in that?

Stan Blum, October 4 1996:

I don't want you to think I'm really a zealot (unfair) about not mixing semantics and syntax, but there are two things that frustrated me several years ago when people mixed purposes (without understanding them). First, because these things were understood to be "data standards", and were given as a list of fields, naive people took them to mean "as long as my database can export these fields my database is OK". Lists of fields and definitions promoted a lot of flat-file, functionally-limited databases that held a lot of messy data.

Second, trying to specify semantics in a non-structural way (i.e., a list) will keep you from explaining all the structural semantics, so the specification is handicapped from the beginning. I firmly believe that data elements have only partial meaning if they are without a logical structural context. A data element value has to be about something -- hopefully the thing identified by the primary key of the table it's in. At least that's the way to express it in the relational model. Functional dependency is perhaps more specifically the issue -- the heart of data normalization -- which applies under all data models (relational, hierarchical, object, whatever).

When we convened the Soc Vert Paleo working group (~10 people) in New York (1990), we sat around a table for almost three days, and tried to define data fields (specify semantics) in the context of two flat files. It was not a pleasant experience. The problem was that people couldn't uncouple logical structure from physical implementation, and they didn't want to specify a whole bunch of "tables" because "no one would be able to get good performance out of a database with that many tables".

No doubt you've had the same experience with folks at TDWG. Aren't I preaching to the converted?
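
The functional-dependency point in the message above can be illustrated with a small sketch. It is not from the original correspondence; the field names (accession_no, taxon, family) and the sample rows are hypothetical, chosen only to show how a flat list of fields hides the fact that "family" is a statement about the taxon, not about the accession.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if every determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[f] for f in determinant)
        value = tuple(row[f] for f in dependent)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, fields):
    """Project rows onto the given fields, dropping duplicates (first-seen order)."""
    unique = {}
    for row in rows:
        unique.setdefault(tuple(row[f] for f in fields), {f: row[f] for f in fields})
    return list(unique.values())


# Hypothetical flat-file export: one record per accession, every field repeated.
flat = [
    {"accession_no": "A1", "taxon": "Quercus robur", "family": "Fagaceae"},
    {"accession_no": "A2", "taxon": "Quercus robur", "family": "Fagaceae"},
    {"accession_no": "A3", "taxon": "Fagus sylvatica", "family": "Fagaceae"},
]

# "family" depends on "taxon", so it belongs in a table keyed by taxon.
taxon_determines_family = fd_holds(flat, ["taxon"], ["family"])
accessions = project(flat, ["accession_no", "taxon"])  # facts about accessions
taxa = project(flat, ["taxon", "family"])              # facts about taxa

# A flat file permits a contradictory (misspelled) family for the same taxon.
messy = flat + [{"accession_no": "A4", "taxon": "Quercus robur", "family": "Fagacae"}]
messy_is_consistent = fd_holds(messy, ["taxon"], ["family"])
```

In the two-table form each family name is stored once per taxon, so the contradiction in the last row cannot be recorded; in the flat form nothing prevents it.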

This page last updated October 6, 1996. Contact: W. Berendsohn