Integrating metadata into the data model

Mathematical models define infinite-precision real numbers and functions with infinite domains, whereas computer data objects contain finite amounts of information and must therefore be approximations to the mathematical objects that they represent. Several forms of scientific metadata specify how computer data objects approximate mathematical objects, and these are integrated into our data model. For example, missing data codes (used for fallible sensor systems) may be viewed as approximations that carry no information. Any value or sub-object in a VIS-AD data object may be set to the missing value. Scientists often use arrays as finite samplings of continuous functions; satellite image arrays, for example, are finite samplings of continuous radiance fields. Sampling metadata, such as those that assign Earth locations to pixels and those that assign real radiances to coded (e.g., 8-bit) pixel values, quantify how arrays approximate functions and are integrated with VIS-AD array data objects.
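As an illustration of value-sampling metadata and missing data, the following is a minimal Python sketch, not VIS-AD code. The names `MISSING`, `LinearCalibration`, and `decode_row`, and the linear form of the calibration, are assumptions introduced here; the text says only that metadata assign real radiances to coded pixel values.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Hypothetical stand-in for the missing value: carries no information.
MISSING = None


@dataclass(frozen=True)
class LinearCalibration:
    """Assumed value-sampling metadata: radiance = offset + scale * code."""
    offset: float
    scale: float

    def decode(self, code: Optional[int]) -> Optional[float]:
        # A missing pixel code stays missing after decoding.
        if code is MISSING:
            return MISSING
        return self.offset + self.scale * code


def decode_row(codes: Sequence[Optional[int]],
               cal: LinearCalibration) -> List[Optional[float]]:
    """Turn coded (e.g., 8-bit) pixel values into real radiances."""
    return [cal.decode(c) for c in codes]


cal = LinearCalibration(offset=10.0, scale=0.5)
row = decode_row([0, 200, MISSING, 255], cal)
print(row)  # [10.0, 110.0, None, 137.5]
```

Keeping the calibration alongside the coded values, rather than decoding once and discarding it, preserves the record of how coarsely the array approximates the underlying radiance.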

The integration of metadata into our data model has practical consequences for the semantics of computation and display. For example, we define a data type goes_image as an array of ir radiances indexed by lat_lon values. Arrays of this data type are indexed by pairs of real numbers rather than by integers. If goes_west is a data object of type goes_image and loc is a data object of type lat_lon, then the expression goes_west[loc] is evaluated by picking the sample of goes_west nearest to loc. If loc falls outside the region of the Earth covered by goes_west pixels, then goes_west[loc] evaluates to the missing value. If goes_east is another data object of type goes_image, generated by a satellite with a different Earth perspective, then the expression goes_west - goes_east is evaluated by resampling goes_east to the samples of goes_west (i.e., by warping the goes_east image) before subtracting radiances. In Earth regions where the goes_west and goes_east images do not overlap, their difference is set to missing values. Thus metadata about map projections and missing data contribute to the semantics of computations.
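The indexing and subtraction semantics described above can be sketched in Python as follows. This is an illustrative approximation under stated assumptions, not the VIS-AD implementation: the class `SampledImage` is hypothetical, coverage is modeled by an assumed `max_dist` radius around each sample, and distances are computed in the plane of (lat, lon) rather than on the sphere.

```python
import math
from typing import Dict, Optional, Tuple

LatLon = Tuple[float, float]


class SampledImage:
    """Hypothetical image: radiances sampled at real-valued (lat, lon) points."""

    def __init__(self, samples: Dict[LatLon, Optional[float]], max_dist: float):
        self.samples = samples    # assumed non-empty
        self.max_dist = max_dist  # assumed coverage radius around each sample

    def __getitem__(self, loc: LatLon) -> Optional[float]:
        # Pick the sample nearest to loc (planar distance, a simplification).
        nearest = min(self.samples, key=lambda s: math.dist(s, loc))
        if math.dist(nearest, loc) > self.max_dist:
            return None  # loc lies outside the covered region: missing
        return self.samples[nearest]

    def __sub__(self, other: "SampledImage") -> "SampledImage":
        # Resample `other` to this image's samples, then subtract radiances.
        out: Dict[LatLon, Optional[float]] = {}
        for loc, value in self.samples.items():
            resampled = other[loc]
            if value is None or resampled is None:
                out[loc] = None  # no overlap, or missing input: missing
            else:
                out[loc] = value - resampled
        return SampledImage(out, self.max_dist)


goes_west = SampledImage(
    {(0.0, 0.0): 100.0, (0.0, 1.0): 110.0, (0.0, 2.0): 120.0}, max_dist=0.75)
goes_east = SampledImage(
    {(0.0, 1.2): 90.0, (0.0, 2.1): 95.0, (0.0, 3.0): 99.0}, max_dist=0.75)

print(goes_west[(0.0, 0.9)])  # 110.0
print(goes_west[(0.0, 5.0)])  # None
diff = goes_west - goes_east
print(diff.samples)
# {(0.0, 0.0): None, (0.0, 1.0): 20.0, (0.0, 2.0): 25.0}
```

The result takes its sampling from the left operand, which matches the description of resampling goes_east to the samples of goes_west. The nearest-sample search here is linear in the number of samples; a real system would presumably exploit the regular structure of the sampling instead.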

Metadata similarly contribute to display semantics. If both goes_east and goes_west are selected for display, the system uses the samplings of their indices to co-register the two images in a common Earth frame of reference. The samplings of 2-D and 3-D array indices need not be Cartesian. For example, the sampling of lat_lon may define virtually any map projection. Thus, data may be displayed in non-Cartesian coordinate systems.
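To make co-registration concrete, here is a small Python sketch, assuming that each image carries the (lat, lon) location of every sample and that the display passes all locations through one shared projection. The function names and the choice of the spherical Mercator projection are illustrative assumptions; the text does not say which projections or mechanisms VIS-AD uses.

```python
import math
from typing import Callable, List, Tuple

LatLon = Tuple[float, float]  # degrees
XY = Tuple[float, float]      # display coordinates on a unit sphere


def equirectangular(loc: LatLon) -> XY:
    """Plate carree: x follows longitude, y follows latitude."""
    lat, lon = loc
    return (math.radians(lon), math.radians(lat))


def mercator(loc: LatLon) -> XY:
    """Spherical Mercator: y = asinh(tan(lat)); undefined at the poles."""
    lat, lon = loc
    return (math.radians(lon), math.asinh(math.tan(math.radians(lat))))


def display_points(sample_locs: List[LatLon],
                   projection: Callable[[LatLon], XY]) -> List[XY]:
    """Place every sample of one image in the shared display frame."""
    return [projection(loc) for loc in sample_locs]


# Hypothetical sample locations; the two images share (30.0, -100.0).
west_locs = [(30.0, -120.0), (30.0, -100.0)]
east_locs = [(30.0, -100.0), (30.0, -80.0)]

west_xy = display_points(west_locs, mercator)
east_xy = display_points(east_locs, mercator)

# The shared Earth location lands on the same display point in both images.
print(west_xy[1] == east_xy[0])  # True
```

Because both images are placed by Earth location rather than by array index, swapping `mercator` for `equirectangular` changes the picture but not the alignment between the two images.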
