GAGE DATA README FILE


Table of Contents

   Introduction
   Archive file naming convention
   What is the format of the files ?
   What type of metadata is there for the sites, and what is the format ?
   What kind of quality control steps are taken ?
   References related to SHEF processing
   Who do I contact if I have a question ?


INTRODUCTION

The Climate Prediction Center (CPC), a component of the National
Centers for Environmental Prediction (NCEP), acquires gage-based
precipitation reports in near real-time from several thousand sites
across the contiguous 48 states of the USA. These data support a
variety of NCEP tasks, especially the prototype, real-time, hourly,
multi-sensor National Precipitation Analysis (NPA), a joint effort of
NCEP and the NWS Office of Hydrology (OH).

Approximately 3,000 automated, hourly rain gage observations are
available over the contiguous 48 states via the GOES Data Collection
Platform (DCP) administered by OH. These hourly reports are transmitted
to NCEP continuously throughout the day as SHEF-encoded messages, with
3 or 4 reports per transmission.

In addition, approximately 5,800 daily rain gage reports arrive each
day out of a total available set of 9,000. These daily precipitation
reports are collected by the 12 River Forecast Centers (RFC) and sent
to NCEP via AFOS in a SHEF-encoded message. These sites are
predominantly cooperative sites, in that a majority also belong to the
National Climatic Data Center (NCDC) cooperative reporting network.
However, many other sites belong to networks that do not fall within
the standard cooperative network grouping.


NATIONAL PRECIPITATION ANALYSIS ARCHIVE FILE NAMING CONVENTION

The gage data files are ASCII. The files are compressed using the UNIX
"compress" command, so "uncompress" must be used before decoding.

FILE NAME INFO: gage.lll.prcp.YYYYMMDD.Z

   lll:       Time duration of the reports
                 dly  - 24 hour precipitation amounts
                 hrly - 1 hour precipitation amounts

   YYYYMMDD:  Valid date
              This is the year, month, and day for the gage reports.
              MM and DD have a leading zero when MM or DD is less
              than 10.


WHAT IS THE FORMAT OF THE FILES ?

The format of the gage files is the same for the first 37 characters.
The daily gage reports carry additional information about the report
that is not part of the hourly reports. This additional information
appears after the common information.

COLUMNS    FIELD NAME
 1 -  4    Year    --  Time
 6 -  7    Month   --  of
 9 - 10    Day     --  the
12 - 13    Hour    --  Observation
15 - 16    Minute  --  (All digits, all times are GMT)
18 - 25    Site ID
27 - 28    Element Code (PP=Precipitation)
30 - 37    Element Value (Real number with 2 decimal places)
39 - 39    Revision Code (0=No, 1=Yes)
42 - 45    Duration Code (2001=One 24-hour day,
                          5004=Time period beginning at 7 am local
                               time and ending at the time of
                               observation)
47 - 47    Type Code (F=Forecast, P=Processed, R=Reading)
49 - 49    Source Code (G=GOES, M=Meteor Burst, P=Phone, Z=None)
51 - 51    Extreme Code (N=Minimum of day, X=Maximum of day, Z=None)
53 - 60    Data Source (ID of NWS office sending report, if known)
62 - 62    Changed (0=No, 1=Yes)
           Has the original value been changed? Only 1 update is
           allowed.
64 - 64    Quality Flag (Z=None, B=Buddy Check, C=Climatology Check,
                         R=Bad Zero Check, E=Estimated value,
                         V=Value verified, M=Manual override of B/C/R)


WHAT TYPE OF METADATA IS THERE FOR THE SITES AND WHAT IS THE FORMAT ?

The metadata for these sites consist mainly of the geographical
coordinates of the site, the elevation (if known), and the city and
state names. The primary site ID is based upon the National Weather
Service Communications Handbook #5. For those sites without such an ID,
the NESDIS platform ID may be used, or lastly an ID given to the site
by the owner of the network.
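
As an illustration of the file naming convention and the fixed-column
record layout given under "WHAT IS THE FORMAT OF THE FILES ?", the
Python sketch below builds an archive file name and splits one report
into named fields. It is only a sketch under stated assumptions: the
function and dictionary names are invented for this example, lines
longer than 37 characters are assumed to be daily reports, and the
sample record (site ID "ABCD1", data source "XYZ") is made up to match
the column table rather than taken from a real file.

```python
from datetime import date

# Column ranges (1-based, inclusive) copied from the record layout table.
COMMON_FIELDS = {
    "year": (1, 4), "month": (6, 7), "day": (9, 10),
    "hour": (12, 13), "minute": (15, 16),
    "site_id": (18, 25), "element": (27, 28), "value": (30, 37),
}
# Extra columns that appear only in daily reports.
DAILY_FIELDS = {
    "revision": (39, 39), "duration": (42, 45), "type": (47, 47),
    "source": (49, 49), "extreme": (51, 51), "data_source": (53, 60),
    "changed": (62, 62), "quality": (64, 64),
}
INT_FIELDS = {"year", "month", "day", "hour", "minute"}


def gage_file_name(duration, valid_date):
    """Build an archive name; duration is 'dly' or 'hrly' (hypothetical helper)."""
    return f"gage.{duration}.prcp.{valid_date:%Y%m%d}.Z"


def parse_gage_line(line):
    """Split one uncompressed gage report into a dict keyed by field name."""
    line = line.rstrip()
    fields = dict(COMMON_FIELDS)
    if len(line) > 37:  # assumption: anything past column 37 is a daily report
        fields.update(DAILY_FIELDS)
    record = {}
    for name, (first, last) in fields.items():
        text = line[first - 1:last].strip()
        if name in INT_FIELDS:
            record[name] = int(text)
        elif name == "value":
            record[name] = float(text)
        else:
            record[name] = text
    return record


# Made-up daily record, padded so that every field lands in its columns.
sample = (
    f"1999 01 15 12 00 {'ABCD1':<8} PP {0.25:8.2f} 0  2001 R Z Z "
    f"{'XYZ':<8} 0 Z"
)
report = parse_gage_line(sample)
print(gage_file_name("dly", date(1999, 1, 5)))  # gage.dly.prcp.19990105.Z
print(report["site_id"], report["value"], report["quality"])
```

Slicing by column position rather than splitting on whitespace keeps
the fields aligned even when a field such as Data Source is blank.
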

In most cases, the secondary ID for a site is the NCDC cooperative
network ID or NESDIS platform ID. Site IDs that are known to be invalid
are excluded in the initial processing at NCEP's main computer
facility. Some site IDs still lack metadata; NCEP continues to search
these out and incorporate them into its files.

COLUMNS    FIELD NAME
 1 -  8    Primary Site ID
10 - 15    Latitude (Real number with 2 decimal places, 0-90)
17 - 22    Longitude (Real number with 2 decimal places, 0-180)
24 - 31    Station Type
33 - 41    Secondary Site ID (-- = no value for this field)
44 - 48    Elevation (Integer number, feet above MSL)
50 - 51    State (Post Office Abbreviation)
53 - 78    Place Name (City or town name)


WHAT TYPE OF QUALITY CONTROL STEPS ARE TAKEN ?

For 24-hour reports, any value greater than 20 is excluded from the
dataset. Beginning with the Jan 1999 data sets, a quality flag is
assigned to each report, with Z being the default value. Please be
aware that a quality flag indicates only that a report is
questionable. It does not mean that the value is incorrect, especially
for flags B and C in certain circumstances. These flags are generated
within 24 hours of receipt and the quality control procedure is run
only once. Late arrival of reports or isolated storm events may cause
inappropriate flags to be assigned.

The following additional quality flags may be assigned to a 24-hour
report.

   B - Fails a buddy check against nearby reports.
   C - Value is more than 5 standard deviations from the climatology
       of that site, using the historical record from the National
       Climatic Data Center.
   R - Value is zero, yet radar reports that precipitation fell during
       that day.
   M - Manual override indicating that the report is OK despite a
       previous B/C/R flag setting. (This flag began with the
       1-Jun-2000 reports.)
   E - The value in the report is estimated, not an actual reading.
   V - Value has been verified as correct by procedures at the data
       source.

For hourly reports, no quality control is performed.
Because of this, there may be times when inappropriate (not incorrect)
values are recorded. This happens because these automatic reporting
sites may switch between incremental and cumulative reporting modes at
random. A sudden, large increase in values that persists for many
hours usually indicates cumulative reporting mode. The user can
difference successive values to arrive at an hourly, incremental value.

An analysis of the data indicated that a number of hourly
precipitation reports had invalid site IDs due to transmission errors.
Beginning in Jan 1997, the hourly precipitation files were revised to
exclude these bad data.

Additional information may be found at:
   http://www.emc.ncep.noaa.gov/mmb/mesoscale.html


REFERENCES RELATED TO SHEF PROCESSING

...Standard Hydrometeorological Exchange Format (SHEF), Version 1.2
   Hydrology Handbook No. 1, Office of Hydrology,
   National Weather Service


WHO TO CONTACT IF YOU HAVE QUESTIONS

   Stage IV analysis :                   Ying.Lin@noaa.gov
   Gage-only, multi-sensor techniques :  Dongjun.Seo@noaa.gov
   Precipitation data :                  Sid.Katz@noaa.gov


This file last updated on 7-Sep-2005
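

The quality control section notes that hourly sites may report in
cumulative mode and that the user can take differences to recover
hourly amounts. The Python sketch below shows one way that differencing
might look. It is not an NCEP procedure: the function name is invented
for this example, deciding which stretch of reports is cumulative is
left to the user, and returning None for a negative difference (a
possible counter reset or mode switch) is an assumption of this sketch.

```python
def cumulative_to_incremental(values):
    """Difference consecutive cumulative totals to get per-hour amounts.

    values: consecutive hourly reports from one site, oldest first, all
    taken from a stretch believed to be in cumulative mode.
    Returns one entry per hour after the first report.
    """
    increments = []
    for previous, current in zip(values, values[1:]):
        # Round to 2 decimals, the precision of the Element Value field.
        diff = round(current - previous, 2)
        # Assumption: a drop in the running total means a reset or a
        # mode change, so no hourly amount can be inferred for that hour.
        increments.append(diff if diff >= 0 else None)
    return increments


# Made-up running totals for five consecutive hours.
totals = [1.20, 1.20, 1.45, 1.95, 1.95]
print(cumulative_to_incremental(totals))  # [0.0, 0.25, 0.5, 0.0]
```

The first report in the stretch yields no increment because there is
no earlier total to subtract from it.
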