Documentation Guidelines
The documentation (i.e. the "Readme" file) that accompanies each project data set is as important as the data itself. It makes collaborators and other analysts aware of the data and helps them understand any limitations or special characteristics that may affect its use elsewhere. Documentation should accompany every data set submission and contain the information listed in the outline below. Not every data set will have information for every documentation category, but the outline (and content) should be followed as closely as possible so that documentation is consistent across all data sets. It is also recommended that a documentation file accompany each preliminary and final data set submission.
---TITLE: This should match the data set name
---AUTHOR(S):
-Name(s) of PI and all co-PIs
-Complete mailing address, telephone/facsimile Nos., web pages and E-mail address of PI
-Similar contact information for data questions (if different than above)
---FUNDING SOURCE AND GRANT NUMBER:
---DATA SET OVERVIEW:
-Introduction or abstract
-Time period covered by the data
-Physical location of the measurement or platform (latitude/longitude/elevation)
-Data source, if applicable (e.g. for operational data include agency)
-Any World Wide Web address references (i.e. additional documentation such as Project WWW site)
---INSTRUMENT DESCRIPTION:
-Brief text (i.e. 1-2 paragraphs) describing the instrument with references
-Figures (or links), if applicable
-Table of specifications (e.g. accuracy, precision, frequency)
---DATA COLLECTION AND PROCESSING:
-Description of data collection
-Description of derived parameters and processing techniques used
-Description of quality control procedures
-Data intercomparisons, if applicable
---DATA FORMAT:
-Data file structure, format and file naming conventions (e.g. column delimited ASCII, NetCDF, GIF, JPEG)
-Data format and layout (i.e. description of header/data records, sample records)
-List of parameters with units, sampling intervals, frequency, range
-Description of flags and codes used in the data, with definitions (e.g. good, questionable, missing, estimated)
-Data version number and date
---DATA REMARKS:
-PI's assessment of the data (e.g. disclaimers, instrument problems, quality issues)
-Missing data periods
-Software compatibility (i.e. list of existing software to view/manipulate the data)
---REFERENCES:
-List of documents cited in this data set description
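As a rough aid for authors, the outline above can be checked mechanically before submission. The following is a minimal Python sketch, not an EOL tool: the function name `missing_sections` is hypothetical, and it assumes the Readme is plain text whose section headings are spelled as in the outline and followed by a colon.

```python
# Section headings taken from the outline above.
REQUIRED_SECTIONS = [
    "TITLE",
    "AUTHOR(S)",
    "FUNDING SOURCE AND GRANT NUMBER",
    "DATA SET OVERVIEW",
    "INSTRUMENT DESCRIPTION",
    "DATA COLLECTION AND PROCESSING",
    "DATA FORMAT",
    "DATA REMARKS",
    "REFERENCES",
]


def missing_sections(readme_text):
    """Return the outline headings that do not appear in the Readme text.

    Matching is case-insensitive and looks for the heading followed by a
    colon (an assumption about how the Readme is written).
    """
    upper = readme_text.upper()
    return [name for name in REQUIRED_SECTIONS if name + ":" not in upper]


# Hypothetical Readme fragment used only to illustrate the check.
sample = "---TITLE: Example data set\n---AUTHOR(S):\n-Name(s) of PI\n---DATA FORMAT:\n"
print(missing_sections(sample))
```

Because not every category applies to every data set, a heading reported as missing is a prompt to review the Readme rather than an error.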
Data Format Guidelines
The EOL data management system is flexible: data in any format can be submitted to, stored in and retrieved from CODIAC. EOL is prepared to work with participants to bring their data to the archive and to make sure it is presented, with proper documentation, for exchange with other project participants. Because we anticipate receiving many data sets from the field sites in ASCII format, the guidelines below are intended to aid in the submission, integration and retrieval of these data. EOL will also work with participants submitting other formats, including NetCDF, AREA, HDF and GRIB, to assure access and retrieval capabilities within CODIAC.
The following ASCII data format guidelines are intended to help standardize the information provided with any data archived for the project. These guidelines are based on EOL experience in handling thousands of data files in differing formats. Specific suggestions are provided for naming a data file and for the content and layout of the header records and data records in each file. This information minimizes confusion and aids analysis when data are shared with other project participants. An example of the layout of an ASCII file using the guidelines is provided below. Keep in mind that it is not mandatory that the data be received in this format. However, data that follow the guidelines described below benefit from improved capabilities for integration, extraction, compositing and display via CODIAC.
Naming Convention
A) All data files should be uniquely named. For example, it is very helpful if the date is included in any image file name so that the file can be easily time registered. Also include an extension indicating the type of file:
e.g.
.gif = GIF image format file
.jpg = JPEG image format file
.txt = Text or ASCII format file
.cdf = NetCDF format file
.tar = archival format
If compressed, the file name should have an additional extension indicating the type of compression (e.g. .gz, .z).
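One way to apply the naming convention is to build each file name from a short prefix, the date and time of the data, the file type extension and, if applicable, the compression extension. The sketch below is illustrative only: the helper `build_file_name`, the `ctd` prefix and the `YYYYMMDD_HHMM` time stamp layout are assumptions, not part of the guidelines.

```python
from datetime import datetime

# File type extensions listed in the guidelines; the keys are illustrative labels.
TYPE_EXTENSIONS = {
    "gif": ".gif",
    "jpeg": ".jpg",
    "ascii": ".txt",
    "netcdf": ".cdf",
    "tar": ".tar",
}


def build_file_name(prefix, when, file_type, compression=None):
    """Build a name such as 'ctd_20090715_1830.txt.gz'.

    prefix      -- short identifier chosen by the data provider (hypothetical)
    when        -- datetime used to time-register the file
    file_type   -- one of the keys of TYPE_EXTENSIONS
    compression -- optional compression extension, e.g. 'gz'
    """
    stamp = when.strftime("%Y%m%d_%H%M")  # assumed time stamp layout
    name = f"{prefix}_{stamp}{TYPE_EXTENSIONS[file_type]}"
    if compression:
        # Accept the extension with or without a leading dot.
        name += "." + compression.lstrip(".")
    return name


print(build_file_name("ctd", datetime(2009, 7, 15, 18, 30), "ascii", compression="gz"))
# prints ctd_20090715_1830.txt.gz
```

Putting the date in a fixed-width, year-first form also makes the files sort chronologically in a directory listing.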
Spreadsheet and Columnar ASCII Data
For spreadsheet and columnar ASCII data to be converted into shape files, please use the following template:
Field name | Format or units | Notes |
CruiseID | text field | blank if not applicable (examples: HLY0601, PSEA0902) |
StationNum | text field | blank if not applicable |
StationNme | text field | blank if not applicable |
DataDate | text field, YYYY-MM-DD | other formats acceptable, e.g. YYYYMMDD |
DataYear | text field | blank if not applicable |
DataTime | text field, HH:mm:SS | other formats acceptable, e.g. HHmm or HHmmSS |
TimeZone | text field, UTC, MST, AKST, etc. | blank if not applicable |
UTCOffset | text field, number of hours from UTC, + or - | used if TimeZone is not UTC - blank if not applicable |
Latitude | decimal degrees | use negative values for south latitude; set to -999.99 if missing |
Longitude | decimal degrees | use negative values for west longitude; set to -999.99 if missing |
Depth | meters | if not applicable, should be set to -999.99 |
Data... | as many columns as needed |
The fields should appear in the order listed above. Any other variables can be included after Depth in any order. Use the number -999.99 for missing numeric values; for missing text values, use a blank field or "unknown" if applicable.
Note: Shape files limit field names to 10 characters. Certain words reserved by shape files, such as Date, Year and Time, cannot be used as field names in the data files.
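The template and the field name restrictions can be illustrated with a short Python sketch. Several things here are assumptions rather than requirements: the comma delimiter, the helper names `check_field_names` and `make_row`, the extra column `WaterTemp`, the sample values, and the treatment of every extra data column as numeric. Only the three reserved words named in the note are checked.

```python
import csv
import io

MISSING_NUMBER = -999.99  # missing-value code for numeric fields
TEMPLATE_FIELDS = [
    "CruiseID", "StationNum", "StationNme", "DataDate", "DataYear", "DataTime",
    "TimeZone", "UTCOffset", "Latitude", "Longitude", "Depth",
]
NUMERIC_FIELDS = {"Latitude", "Longitude", "Depth"}
RESERVED_WORDS = {"date", "year", "time"}  # only the words named in the note


def check_field_names(names):
    """Return a list of problems found in the proposed column names."""
    problems = []
    for name in names:
        if len(name) > 10:
            problems.append(f"{name}: longer than 10 characters")
        if name.lower() in RESERVED_WORDS:
            problems.append(f"{name}: reserved word")
    return problems


def make_row(record, extra_fields):
    """Order values as in the template and fill in missing-value codes."""
    row = []
    for field in TEMPLATE_FIELDS + list(extra_fields):
        value = record.get(field)
        if value is None:
            is_text = field in TEMPLATE_FIELDS and field not in NUMERIC_FIELDS
            # Blank for missing text; extra columns are assumed numeric.
            value = "" if is_text else MISSING_NUMBER
        row.append(value)
    return row


extra = ["WaterTemp"]  # hypothetical data column
record = {  # hypothetical sample values
    "CruiseID": "HLY0601", "DataDate": "2006-05-12", "TimeZone": "UTC",
    "Latitude": 62.5, "Longitude": -170.25, "WaterTemp": -1.2,
}

assert check_field_names(TEMPLATE_FIELDS + extra) == []
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(TEMPLATE_FIELDS + extra)
writer.writerow(make_row(record, extra))
print(buffer.getvalue())
```

Checking the column names before writing any data catches names that would be truncated or rejected when the file is later converted to a shape file.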
Further Information
Please direct any questions regarding data format(s) and documentation guidelines to the PacMARS data management support team (pacmars at eol dot ucar dot edu).