SURFACE LAND DAILY COOPERATIVE SUMMARY OF THE DAY TD-3200 Prepared by Charles Phillips National Climatic Data Center Federal Building Asheville, North Carolina 02 May 2000 This document was prepared by the U.S. Department of Commerce, National Oceanic and Atmospheric Administration, National Environmental Satellite Data and Information Service, National Climatic Data Center, Asheville, North Carolina. This document is designed to provide general information on the current, origin, format, integrity and the availability of this data file. Errors found in this document should be brought to the attention of the Data Base Administrator, NCDC. See topic 58 for a summary of this data set. Table of Contents Page Topic Number __________________________________________________________________ INTRODUCTORY TOPICS 1. Data Set ID................................................5 2. Data Set Name..............................................5 3. Data Set Aliases...........................................5 _________________________________________________________________ DESCRIPTION 4. Access Method and Sort for Archived Data...................5 5. Access Method and Sort for Supplied Data...................9 6. Element Names and Definitions.............................11 7. Start Date................................................18 8. Stop Date.................................................18 9. Coverage..................................................18 10. Location..................................................18 11. Keywords..................................................18 12. How to Order Data.........................................19 ________________________________________________________________ DATA CENTER 13. Archiving Data Center. ...................................19 _________________________________________________________________ PERSONNEL 14. Technical Contact.........................................19 15. Known Uncorrected Problems................................20 16. Quality Statement.........................................20 _________________________________________________________________ DATES 17. Revision Date.............................................20 _______________________________________________________________ OTHER DATA SETS 18. Source Data Sets..........................................20 19. Essential Companion Data Sets.............................20 20. Derived Data Sets.........................................20 _________________________________________________________________ SUMMARIZATION 21. References................................................21 22. Summary...................................................21 _________________________________________________________________ APPENDICES Appendix A: 3200fix_1.c program documented source code.........24 Appendix B: 3200fix_2.c program documented source code.........51 1. Data Set ID: 3200 2. Data Set Name: Cooperative Summary of the Day 3. Data Set Aliases: Surface Land Daily Cooperative Data Summary of the Day Data Co-op Data Climatological Data Daily Weather Data SOD 4. Access Method and Sort for Archived Data: TD-3200 data are archived in a variable length element file structure. The element file structure is designed to allow maximum flexibility in requesting data. Only those elements or groups of elements of particular interest need be ordered. A TD-3200 record contains one month of daily data of one data type for one weather station. The record consists of two distinct portions: The first portion, consisting of the first 30 characters of the record, is the ID section. The second portion, consisting of the rest of the record, is the data. These portions are discussed in more detail in section 6, "Element Types and Definitions". In general, there is only one record with a particular station, data type, year and month. Exceptions are the soil temperature data types and the "days with weather" data type, in which there may be multiple records. The record format lends itself to variable-length records, and that is one way the user may receive it. Data may also be received in a fixed length record structure described in topic 5 "Description: Access Method and Sort for Supplied Data". a. COBOL Data Description (1 example) b. FORTRAN Data Descriptions (2 examples) c. Control Language Notes d. List of Variables ("Elements") e. Schematic Variable Length Record Format Layout The following COBOL and FORTRAN statements are to be used as guidelines only. NCDC recognizes the fact that many different types of equipment are used in processing these data. It is impossible to cover all the idiosyncrasies of every system. a. COBOL Data Description This is a typical ANSI Standard COBOL Variable Length Description. FD INDATA LABEL RECORDS ARE STANDARD RECORDING MODE D BLOCK CONTAINS 12000 CHARACTERS DATA RECORD IS DATA-RECORD. 01 DATA-RECORD. 02 RECORD-TYPE PIC X(3). 02 STATION-ID PIC X(8). 02 ELEMENT-TYPE PIC X(4). 02 ELEMENT-UNITS-CODE PIC XX. 02 YEAR PIC 9(4). 02 MONTH PIC 99. 02 FILLER PIC 9(4). 02 NUMBER-VALUES PIC 9(3). 02 DAILY-ENTRY OCCURS 1 TO 100 TIMES DEPENDING ON NUMBER-VALUES. 04 DAY PIC 99. 04 HOUR PIC 99. 04 DATA-VALUE PIC S9(5) SIGN LEADING SEPARATE. 04 D-VAL REDEFINES DATA-VALUE. 05 SIGN-VAL PIC X. 05 DATA-IN PIC X(5). 04 FLAG-1 PIC X. 04 FLAG-2 PIC X. b. FORTRAN Data Description (1) FORTRAN 77 Example 1 This description is for those systems that can handle variable blocked records normally. IMPLICIT INTEGER (A-Z) OPEN (10,FILE = 'FILENAME',ACCESS = 'SEQUENTIAL', STATUS = 'OLD', + RFORM = 'VB',MRECL = 1230,TYPE = 'ANSI',BLOCK = + 12000) C LAST 2 lines of OPEN statement are SPERRY UNIQUE DEFINE FILE 10 (ANSI, VB, 1230, 12000) CHARACTER*3 RECTYP CHARACTER*8 STNID CHARACTER*4 ELMTYP CHARACTER*2 EUNITS CHARACTER*1 FLAG1, FLAG2 DIMENSION IDAY(100), IHOUR(100), IVALUE(100), FLAG1(100), + FLAG2(100) 10 READ (10,20,END=999,ERR=10) RECTYP, STNID, ELMTYP, EUNITS, IYEAR, + IMON, IFIL, NUMVAL, (IDAY(J), IHOUR(J), IVALUE(J), + FLAG1(J), FLAG2(J), J=1, NUMVAL) 20 FORMAT (A3, A8, A4, A2, I4, I2, I4, I3, 100(2I2, I6, 2A1)) (2) FORTRAN 77 Example 2 This description is for those systems that can't handle variable blocked records normally. PROGRAM TAPEREAD IMPLICIT INTEGER (A-Z) ..... OPEN(1,FILE=TAPE:',ACCESS='SEQUENTIAL',FORM=FORMATTED', + STATUS='OLD',READONLY) CHARACTER BUFFER*12000 ! YOUR MACHINE MUST SUPPORT ! CHARACTER VARIABLES THIS LARGE CHARACTER*3 RECTYP CHARACTER*8 STNID CHARACTER*4 ELMTYP CHARACTER*2 EUNITS CHARACTER*1 FLAG1, FLAG2 DIMENSION IDAY(100), IHOUR(100), IVALUE(100), FLAG1(100) + FLAG2(100) ..... NBYTES=0 5 NBEG=1 READ(1,101,END=99)BUFFER !READ IN PHYSICAL RECORD (BLOCK) 10 NBEG=NBEG+NBYTES READ(BUFFER(NBEG:NBEG+3,102)NBYTES !READ THE CONTROL WORD IF( NBYTES.EQ.0 )GO TO 5 READ(BUFFER(NBEG+4:NBEG+NBYTES-1),103) RECTYP, STNID, ELMTYP, + EUNITS, 1YEAR, IMON, IFIL, NUMVAL, (DAY(J), IHOUR(J), + IVALUE(J), FLAG1(J), FLAG2(J), J=1, NUMVAL) ..... ..... GO TO 10 99 CONTINUE ..... ..... STOP 'FINISHED' 101 FORMAT(A) 102 FORMAT(I4) 103 FORMAT (A3, A8, A4, A2, I4, I2, I4, I3, 100(212, 16, 2A1)) END c. Control Language Notes (1) IBM JCL Notes For ASCII Variable specify: LRECL = 1234 RECFM = DB OPTCODE = Q For EBCDIC Variable specify: LRECL = 1234 RECFM = VB (2) VAX DCL Notes $ MOUNT/FOREIGN/BLOCKSIZE=12000 MT: tapename TAPE: d. List of Variables ELEMENT WIDTH POSITION 001 RECORD TYPE (= DLY) 3 001-003 -- 002 STATION ID 8 004-011 | 003 METEOROLOGICAL ELEMENT TYPE 4 012-015 | 004 MET. ELEMENT MEASUREMENT UNITS CODE 2 016-017 -- ID 005 YEAR 4 018-021 | PORTION 006 MONTH 2 022-023 | 007 FILLER (= 9999) 4 024-027 | 008 NUMBER OF DATA PORTIONS THAT FOLLOW 3 028-030 -- 009 DAY OF MONTH 2 031-032 -- 010 HOUR OF OBSERVATION 2 033-034 | 011 SIGN OF METEOROLOGICAL ELEMENT VALUE 1 035 -- DATA VALUE OF METEOROLOGICAL ELEMENT 5 036-040 | PORTION QUALITY CONTROL FLAG 1 1 041 | QUALITY CONTROL FLAG 2 1 042 -- DATA GROUPS IN THE SAME FORM AS ELEMENT 12 043-054 DATA PORTION POSITIONS 31-42 REPEATED AS MANY 12 055-066 DATA PORTION TIMES AS NEEDED TO CONTAIN ONE MONTH 12 067-078 DATA PORTION OF RECORDS. ..... ..... 608 12 1219-1230 DATA PORTION e. Format (Variable Length Record Layout) 1. The first eight elements (positions 001-030) constitute the ID PORTION of the record and describe the characteristics of the entire record. The next six elements, the DATA PORTION of the record contains information about each meteorological element value reported. This portion is repeated for as many values as occur in the monthly record. 2. Each logical record is of variable length with a maximum of 774 characters. (In the FORTRAN and COBOL examples, a larger number of characters may be allowed for.) Each logical record contains a station's data for a specific meteorological element over a one month interval. The form of a record is: ID PORTION (30 Characters) Fixed length ******************************************************** * REC | STATION | ELEM | | | | | NUM > * TYP | ID | TYPE | UNT | YEAR | MO | FILL | VAL > ******|**********|******|************|****|******|****** * XXX | XXXXXXXX | XXXX | XX | XXXX | XX | XXXX | XXX > ******************************************************** ELEMENTS 001 002 003 004 005 006 007 008 DATA PORTION (12 Characters, repeated "NUM-VAL" times--up to 100) **************************************************************** < DY | HR | MET. ELEM | FL | FL | DY | HR | MET. ELEM | FL | FL > < | | | 1 | 2 | | | | 1 | 2 > < | |************| | | | |************| | > < | | | DATA | | | | | | DATA | | > < | | S | VALUE | | | | | S | VALUE| | > <****|****|****|*******|****|****|****|****|*****| *****|****|*** > < XX | XX | X | XXXXX | X | X | XX | XX| X | XXXXX| X | X > **************************************************************** ELEMENTS 009 010 011 012 013 014 015 016 017 018 019 020 ******************************** < DY | HR | MET. ELEM | FL | FL * < | | | 1 | 2 * < | |***********| | * < | | | DATA | | * < | | S | VALUE | | * <****|****|***|*******|****|***** < XX | XX | X | XXXXX | X | X * ******************************** ELEMENTS 603 604 605 606 607 608 3. The Number of Data Portions (position 008) for the logical record of type "DLY" ranges from 1 to 62. 5. Access Method and Sort for Supplied Data: In addition to a variable length record structure, users may also receive data in a fixed length record structure as described below. However, the user must specify whether to extract either the original or edited data values. Supplied data are in the same sort as archived data (see topic 4 "Description: Access Method and Sort for Archived Data"). Provided within this section are information and examples of how to access the fixed length data records, specifically: a. COBOL Data Description b. FORTRAN Data Description c. List of Variables ("Elements") d. Schematic Fixed Length Record Format Layout a. COBOL Data Description This is a typical ANSI Standard COBOL Fixed Record Length Description FD INDATA LABEL RECORDS ARE STANDARD (FOR STD LABEL TAPES) RECORDING MODE F BLOCK CONTAINS 15 RECORDS DATA IS DATA-RECORD 01 DATA RECORD 02 RECORD-TYPE PIC X(3). 02 STATION-ID PIC X(8). 02 ELEMENT-TYPE PIC X(4). 02 ELEMENT-UNITS PIC XX. 02 YEAR PIC 9(4). 02 MONTH PIC 99. 02 FILLER PIC 9(4). 02 NUMBER-VALUES PIC 9(3). 02 DAILY-ENTRY OCCURS 31 TIMES. 04 DAY PIC 99. 04 HOUR PIC 99. 04 DATA-VALUE PIC S9(5) SIGN LEADING SEPARATE. 04 D-VAL REDEFINES DATA-VALUE. 05 SIGN-VAL PIC X. 05 DATA-IN PIC X(5). 04 FLAG-1 PIC X. 04 FLAG-2 PIC X. b. FORTRAN Data Description FORTRAN 77 Example DEFINE FILE 10 (ANSI, FB, 402, 6030) CHARACTER*3 RECTYP CHARACTER*8 STNID CHARACTER*4 ELMTYP CHARACTER*2 EUNITS CHARACTER*1 FLAGI, FLAG2 DIMENSION IDAY(31), IHOUR(31), + IVALUE(31), FLAG1(31), FLAG2(31) 10 READ (10,20,END=999,ERR=10)RECTYP, STNID, ELMTYP, EUNITS, IYEAR, + IMON, IFIL, NUMVAL, (IDAY(J), IHOUR(J), IVALUE(J), + FLAG1(J), FLAG2(J), J=1, 31) 20 FORMAT (A3, A8, A4, A2, I4,I2, I4, 13, 31(212, I6, 2A1)) c. List of Variables ELEMENT WIDTH POSITION 001 RECORD TYPE (= DLY) 3 001-003-- 002 STATION ID 8 004-011 | 003 METEOROLOGICAL ELEMENT TYPE 4 012-015 | 004 MET. ELEMENT MEASUREMENT UNITS CODE 2 016-017 |--ID 005 YEAR 4 018-021 | PORTION 006 MONTH 2 O22-023 | 007 FILLER (= 9999) 4 024-027 | 008 NO. OF DATA PORTIONS THAT FOLLOW (= 031) 3 028-030-- 009 DAY OF MONTH 2 031-032-- 010 HOUR OF OBSERVATION 2 033-034 | DATA 011 SIGN OF METEOROLOGICAL ELEMENT VALUE 1 035 |--PORTION VALUE OF METEOROLOGICAL ELEMENT 5 036-040 | QUALITY CONTROL FLAG 1 1 041 | QUALITY CONTROL FLAG 2 1 042 -- DATA GROUPS IN THE SAME FORM AS ELEMENT 12 043-054 DATA PORTION POSITIONS 31-42 ARE REPEATED 12 055-066 DATA PORTION 31 TIMES. ..... ..... 194 12 391-402 DATA PORTION d. Format (Fixed Length Record Layout) 1. The first eight elements (positions 001-030) constitute the ID PORTION of the record and describe the characteristics of the entire record. The next six elements, the DATA PORTION of the record, contain information about each meteorological element value, reported. This portion is repeated 31 times. 2. Each logical record is fixed with 402 characters. Each logical record contains a station's data for a specific meteorological element over a one month interval. The form of a record is: ID PORTION (30 characters) Fixed Length ****************************************************** * REC | STATION | ELEM | | | | | NUM > * TYP | ID | TYPE | UNT | YEAR | MO | FILL | VAL > ** ***|**********|******|*****|******|****|******|**** > * XXX | XXXXXXXX | XXXX | XX | XXXX | XX | XXXX | XXX > ****************************************************** ELEMENTS 001 002 003 004 005 006 007 008 DATA PORTION (12 Characters, repeated 31 Times) ***************************************************** < DY | HR | MET. ELEM | FL | FL | DY | HR | MET. ELEM > < | |***********| | | | |********** > < | | | DATA | | | | | | DATA > < | | S | VALUE | | | | | S | VALUE > < ***|****|***********|****|****|****|****|***|****** > < XX | XX | X | XXXXX | X | X | XX | XX | X | XXXXX > ***************************************************** ELEMENTS 009 010 011 012 013 014 015 016 017 018 ******************************** < DY | HR | MET. ELEM | FL | FL * < | |***********| 1 | 2 * < | | | DATA | | * < | | S | VALUE | | * < ***|****|***|*******|****|***** < XX | XX | X | XXXXX | X | X * ******************************* ELEMENTS 189 190 191 192 193 194 6. Element Names and Definitions: RECORD TYPE The type of data stored in this record. (Value is "DLY"). Each record contains one month of daily values. STATION-ID This 8-character alphanumeric station identifier is assigned by the National Climatic Data Center. The first two digits refer to a state code (value range is 01-91; reference Table "A"). The next four digits refer to the Cooperative Network Index number (value range is 0001-9999). The last two digits are the Cooperative Network Division Number (value range is 01-10; 99 = Missing Division Number; reference Table "B"). METEOROLOGICAL ELEMENT-TYPE The type of meteorological elements stored in this record. Range of values are listed below. DYSW The different types of weather occurring that day (reference Table "C"; See topic 46 "Data Quality: Confidence Factors"). EVAP Daily evaporation (not reported when temperature below freezing). Unit Measurement, Inches & Hundredths of Inches. MNPN Daily minimum temperature of water in an evaporation pan (effective September 1963). Unit Measurement, Whole Degrees Fahrenheit. MXPN Daily maximum temperature of water in an evaporation pan (effective September 1963). Unit Measurement, Whole Degrees Fahrenheit. PRCP Daily precipitation. (Precipitation reading for 24 hours ending at time of observation. Trace is less than 0.005 inch. Unit Measurement, Inches to Hundredths. SNOW Daily Snowfall (Snowfall includes sleet). Amount is for 24-hour period ending at observation time. Hail was included with snowfall from July 1948 through December 1955. Hail occurring alone was not included with either snowfall or snow depth before and after that period. Trace is less than 0.05 inch. Unit Measurement, Inches to Tenths. SNWD Snow depth at observation time. (Snow depth is depth of snow on the ground at time of observation. Trace is depth less than 0.5 inch.) Unit Measurement, Whole Inches. TMAX Daily maximum temperature. (Maximum temperature reading for 24 hours ending at time of observation.) Unit Measurement, Whole Degrees Fahrenheit. TMIN Daily minimum temperature. (Minimum temperature reading for 24 hours ending at time of observation.) Unit Measurement, Whole Degrees Fahrenheit. TOBS Temperature at observation time. Unit Measurement, Whole Degrees Fahrenheit. WDMV 24-hour wind movement. Unit Measurement, Whole Miles. WTEQ Water equivalent of snow depth. (For principal stations only. Effective October 1963 for snow depth equal or greater than 2 inches). Unit Measurement, Inches to Tenths. SNyz Daily minimum soil temperature (see note below). Unit Measurement, whole degrees Fahrenheit. SOyz Soil temperature at observation time (see note below). Unit Measurement, whole degrees Fahrenheit. SXyz Daily maximum soil temperature (see note below). Unit Measurement, whole degrees Fahrenheit. Soils Note 1: Even though TD-3200 is daily data, many cases of two or more soil temperature measurements per day, at different times, can be found. Soils Note 2: Positions "y" and "z" of the soil temperatures are encoded using reference Table "D", e.g., SN12 indicates that the daily minimum soil temperatures that follow are measured in an area covered with grass and at a depth of four inches or 10 centimeters. METEOROLOGICAL ELEMENT MEASUREMENT UNITS CODE The units and decimal position (precision) of the data value for this record (reference Table "E"). See topic 45 "Data Quality: Known Uncorrected Problems" for additional details. YEAR This is the year of the record. Range of values is 1850-current year processed. MONTH This is the month of the record. Range of values is 01-12 LST. FILLER Filler value is 9999. NUMBER OF DATA PORTIONS THAT FOLLOW The number of data values in this record. Range of values is 01-62. NOTE: A record may contain fewer or more data values than you might expect. A maximum of two DATA PORTIONS are used for each day of the month so as to allow one original data value and one edited data value. At most, 62 data values may be contained in any given record (e.g., 30 + (62 x 12) = 774 characters). Thus, while a maximum of 1,230 characters has been assigned in some of the program examples, no more than 774 characters will be used for the daily data record types. If a particular data value was not taken or is unavailable, there is no entry for it. For meteorological elements observed once a day, if all the daily observations of a given month are received and pass QC checks, there will be one DATA PORTION for each day. And, if every value failed the quality control, there may be two DATA PORTIONS for every day of that month. When two DATA PORTIONS for a daily record are encountered (with the exception of DYSW), one finds that the original data values are flagged and the second DATA PORTION is the best available replacement. (See code definitions for the Flag 2 element). DAY OF MONTH Contains the day of the month on which an observation was made. Range of values is 01- 31. HOUR OF OBSERVATION Contains the hour of the daily observation. Hour of observation is reported using the 24-hour clock with values ranging from 00-23 LST, except in the cases of soil temperatures element-type (where the hour is 99 to indicate missing) and "days with weather" (where the hour is 24). Through June 1967 observations were designated as "AM" or "PM"; these values were set to 06 or 18 respectively during a conversion of earlier data to TD-3200. From July 1967 through 1981, all observations were set to hour 18 (because the majority are p.m. observations). Beginning January 1982, the actual hour of the observation is available and is indicated. (See topic 32 "Stations: Station Observation Schedule" for additional information). SIGN OF METEOROLOGICAL VALUE The algebraic sign of the meteorological data value is given as either a blank or a minus sign (-). Blank indicates a positive value and a minus sign represents a negative value. VALUE OF METEOROLOGICAL ELEMENT The actual data value is given as a five-digit integer. One major exception does exist however, for the DYSW (days with weather code) element-type values as explained in Table "C". For fixed length records only when a data value is missing, the sign of the data value is set to "-", the data value is set to "99999", flag position 1 is set to "M" and flag position 2 is blank. For variable-length records, the minus sign is omitted for any such values. A special case of missing data occurs in the precipitation data types when, for one or more days, the observer made no measurements. On a later day, the observer resumed measurements. The first new measurement showed the accumulated total of precipitation during all of the missing days. This case is represented in TD-3200 data by observations for two days: The first day is the first on which no measurement was made. It has a data value of "99999" (missing) and has "S" in flag 1. The second day is the day of the first new measurement. The data value of the second day is the accumulated total and has "A" or "B" in flag 1. FLAG1 The Data Measurement FLAG (reference Table "F"). FLAG2 The Data Quality FLAG (reference Table "G"). TABLES TABLE "A" State-Code Table 01 Alabama 28 New Jersey 02 Arizona 29 New Mexico 03 Arkansas 30 New York 04 California 31 North Carolina 05 Colorado 32 North Dakota 06 Connecticut 33 Ohio 07 Delaware 34 Oklahoma 08 Florida 35 Oregon 09 Georgia 36 Pennsylvania 10 Idaho 37 Rhode Island 11 Illinois 38 South Carolina 12 Indiana 39 South Dakota 13 Iowa 40 Tennessee 14 Kansas 41 Texas 15 Kentucky 42 Utah 16 Louisiana 43 Vermont 17 Maine 44 Virginia 18 Maryland 45 Washington 19 Massachusetts 46 West Virginia 20 Michigan 47 Wisconsin 21 Minnesota 48 Wyoming 22 Mississippi 49 Not Used 23 Missouri 50 Alaska 24 Montana 51 Hawaii 25 Nebraska 66 Puerto Rico 26 Nevada 67 Virgin Islands 27 New Hampshire 91 Pacific Islands TABLE "B" Cooperative Network Division Table NOTE: The division number for a station may change over time. HAWAII (STATE 51) ISLAND NAME DIVISION Kauai 01 Oahu 02 Molokai 03 Lanai 04 Maui 05 Hawaii 06 PACIFIC ISLANDS (STATE 91) Division 02 - East of 180th Meridian - Phoenix Islands, Line Islands, and American Samoa 03 - Western Pacific Islands, North of 12N 04 - Caroline and Marshall Islands TABLE "C" DYSW - Daily Occurrence of Weather Tables POR = Period of Record. 00 - Day of no occurrence 01 - Day with smoke or haze (POR through 1963 and 1982 to Present 02 - Day with fog (POR through 1963 and 1982 to Present) 04 - Day with drizzle (POR through 1963 and 1982 to Present) 05 - Day with ice pellets (sleet) 06 - Day with glaze 07 - Day with thunder 08 - Day with hail 09 - Day with dust or sand storm (POR through 1963 and 1982 to Present) 10 - Day with blowing snow 11 - Day with high wind (POR through 1963 and 1982 to Present) 12 - Day with tornado (POR through 1963 and 1982 to Present) 13 - Day with rain (1982 to Present) 14 - Day with snow (1982 to Present) Note: These two-character DYSW element- type codes are stored into the rightmost four characters of the data value portion of the meteorological element. The leftmost character is always zero (0). Within the four characters used, the weather codes are entered left justified. Thus, if one type of weather occurs during a day, the data values would appear as OXXOO, where XX is the appropriate weather code. If two types of weather occur, the data value will contain OXXYY, where XX is the first code and YY is the second. If more than two types of weather occur on the same day, they will be stored into additional records of the element-type code "DYSW" as needed. TABLE "D" Soil Temperature Table (y = Code for soil cover) (z = Code for soil depth) ******************************************* |Code| Cover | |Code | Depth | Depth | | | | | | (inches) | (cm) | |************| |*******|**********|*******| |y=1 | Grass | | | | | | 2 | Fallow| |z = 1 | 2 | 5 | | 3 | Bare | | | | | | | ground| | 2 | 4 | 10 | | 4 | Brome | | | | | | | grass | | 3 | 8 | 20 | | 5 | Sod | | | | | | 6 | Straw | | | | | | | mulc | | 4 | 20 | 50 | | 7 | Grass | | | | | | | muck | | 5 | 40 | 100 | | 8 | Bare | | | | | | | muck | | 0 | Unknown |Unknown| | 0 |Unknown| | | | | ******************************************* NOTE: Soil records are kept since 1982. Some stations may report soil temperatures at observation time twice a day. Separate records will occur for both observation times. TABLE "E" Units of Measurement Table Range of values where b = Blank: bF Whole degrees Fahrenheit (right justified HI Hundredths of inches bI Whole inches (right justified) bM Whole miles (right justified) NA No units applicable (nondimensional) TI Tenths of inches TABLE "F" Data Measurement Flag 1 A - Accumulated amount since last measurement. B - Accumulated amount includes estimated values (since last measurement). E - Estimated (see Table "G" for estimating method). J - Value has been manually validated. M - For fixed length records only. Flag1 is "M" if the data value is missing. In this case, the sign of the meteorological value is assigned "-" and the value of the meteorological element is assigned "99999". S - Included in a subsequent value. (data value = "00000" OR "99999"). T - Trace (data value = 00000 for a trace). ( - Expert system edited value, not validated. ) - Expert system approved edited value. Blank - Flag not needed. Notes: An original observation that is followed by an edited observation is indicated by "2" in Flag 2. Flag 1 values of "S" and "A" occur in pairs (i.e. a daily value will have "S" in Flag 1 and the next daily value will have "A" or "B" in Flag 1). These flags have traditionally been associated with PRCP and SNOW data, but can occur in any of the data types in which the data may be cumulative from day to day. TABLE "G" Data Quality Flag 2 0 - Valid data element. 1 - Valid data element (from "unknown" source, pre-1982). 2 - Invalid data element (subsequent value replaces original value). 3 - Invalid data element (no replacement value follows). 4 - Validity unknown (not checked). 5 - Original non-numeric data value has been replaced by its deciphered numeric value. A - Substituted TOBS for TMAX or TMIN. B - Time shifted value. C - Precipitation estimated from snowfall. D - Transposed digits. E - Changed units. F - Adjusted TMAX or TMIN by a multiple of + or -10 degrees. G - Changed algebraic sign. H - Moved decimal point. I - Rescaling other than F, G, or H. J - Subjectively derived value. K - Extracted from an accumulated value. L - Switched TMAX and/or TMIN. M - Switched TOBS with TMAX or TMIN. N - Substitution of "3 nearest station mean". O - Switched snow and precipitation data value. P - Added snowfall to snow depth. Q - Switched snowfall and snow depth. R - Precipitation not reported; estimated as "O". S*- Manually edited value. T - Failed internal consistency check. U - Failed areal consistency check (beginning Oct. 1992). * Manually edited value could be derived by any procedure - Procedure unspecified. 7. Start Date: Data from 1948 onward comprise the bulk of the data set. This is when punch cards were used to help summarize climatological data. Data from earlier periods were placed upon punch cards as a result of various agreements with state universities. Meteorological Element-Type Daily Temperature 1850 Daily Precipitation 1862 Daily Evaporation 1951 Soil Data 1982 8. Stop Date: Present 9. Coverage: Southernmost Latitude 15S Northernmost Latitude 75N Westernmost Longitude 130E Easternmost Longitude 60W 10. Location: North America > USA Alaska Hawaii US Pacific Islands Puerto Rico US Virgin Islands 11. Keywords: Atmospheric Dynamics Evaporation Daily Evaporation Daily Maximum Water Temperature in Evaporation Pan Daily Minimum Water Temperature in Evaporation Pan Atmospheric Temperature Daily Maximum Temperature Daily Minimum Temperature Observation Time Temperature Geography & Land Cover Soil Soil Temperature at Observation Time Daily Maximum Soil Temperature Daily Minimum Soil Temperature Precipitation Daily Precipitation Daily Snowfall Daily Snow Depth Water Equivalent of Snow Depth Wind Wind Movement Earth Science Atmospheric Meteorology Climatology Land Hydrology Agriculture Temperature Weather Drizzle Ice Pellets Glaze Thunder Hail Dust Sand Storm Blowing Snow Winds Tornado Rain Smoke Fog TD-3200 3200 12. How to Order Data: These data are available for purchase from the National Climatic Data Center, Climate Services Branch, Federal Building, 151 Patton Avenue, Room 120, Asheville, NC., 28801-5001, phone number (828)-271-4800. Also they can be ordered through NCDC's web site at www.ncdc.noaa.gov or via e-mail at orders@ncdc.noaa.gov. 13. Archiving Data Center: National Climatic Data Center, NOAA/NESDIS/NCDC 151 Patton Avenue, Room 120 Asheville, NC 28801-5001 14. Technical Contact: Climate Services Division NOAA/NCDC 151 Patton Avenue, Room 120 Asheville, NC 28801-5001 PHONE: 828-271-4800 15. Known Uncorrected Problems: Users should also be aware of a potential for a "lag" in the change of observation times used in the logical record and what is actually in practice at the site (that is, several months may be archived digitally under an "old" observation time before NCDC received notification). 16. Quality Statement: The quality is very good except for DYSW data, which is fair to poor, depending on the station. The "days with weather" element type (DYSW) cannot be used with any measure of confidence. Principal climatological stations operating 24 hours a day are expected to be the most reliable source of "days with weather". Reporting of this element by cooperative observers is not a requirement and criteria for reporting is not definitive. Most stations do not record this information. 17. Revision Date: May 13, 1993 Nov 11, 1993 Dec 17, 1993 Jan 5, 1994 Jan 13, 1994 Feb 2, 1994 Apr 13, 1994 Jun 17, 1996 Jan 21, 1997 Nov 20, 2000 Apr 30, 2001 18. Source Data Sets: This data set is currently compiled from data received digitally from the National Weather Service and other federal agencies. Historical and other sources are described in topic 17 "Historical and Current Data Sources". 19. Essential Companion Data Sets: TD-3200 requires use of NCDC's in-house Station History file. 20. Derived Data Sets: Summary of Month Climatological Data Publication Historical Climatology Network-Daily Freeze Data and Growing Degree Days 21. References: National Weather Service, 1987: Cooperative Program Management, Weather Service Operations Manual B-17 (revised), NOAA-NWS, Silver Springs, MD. Reek T., S. R. Doty, and T. Owen, 1991: ValHiDD Documentation and Users Guide, Internal NCDC document, 20 pp. 22. Summary: This data set has been used in countless climatological studies, legal litigations, insurance claims, and various other research applications. Background This data is a compilation of daily observations initially obtained from state universities, state cooperatives, and the National Weather Service. Currently, there are approximately 8,000 active stations, known as cooperative observers, but data are in these files for approximately 23,000 stations for various years. Selected Summary of the Day data from related file TD-3210 for National Weather Service "first order" or principal climatological stations and "second order" stations have also been included in this file. The period of record and number of stations varies among the states. Most states began collecting data during 1948, but some began in 1946. Prior to 1948, most of these data were collected through cooperative agreements with state universities and other state organizations. Many years of records were digitized with some going as far back as the 1850s. These data have received a high measure of quality control through computer and manual edits. These data are subjected to internal consistency checks, compared against climatological limits, checked serially, and evaluated against surrounding stations. Quality control "flags" are appended to each element value to show how they fared during the edit procedures and to indicate what, if any, action was taken. The historical data prior to 1982 were converted from existing files then placed in the element file structure format after being processed only through a gross value check. In November 1993 the entire historical period of record was processed through a stringent quality control. Another round of quality control in November 2000 increased the data set's quality still more. Quality control The historical data were converted from existing digital files and placed in the element structure format in January 1982. At that time the data were only processed through a gross value check. Shortly thereafter, NCDC instituted a greatly enhanced computer algorithm for operational automated validation of digital archives. The revised edit system performs internal consistency checks and evaluates against surrounding stations in addition to climatological limits and serial checks. Quality control flags are appended to each element to show how they fared during the edit procedures and to indicate what, if any, action was taken. The files then, consisted of observed values only prior to 1982, with both observed and edited values (as necessary) being supplied since 1982. Since 1982 the operational edit system has evolved into a Geographical Edit and Analysis (GEA) expert system which affords interactive graphics presentations for the human editors. As of 1991 additional capabilities to detect systematic errors in the daily data have been incorporated using the Validation of Historical Daily Data (ValHiDD) system. In November 1993 the entire historical period of record was independently processed (no human editing) through the ValHiDD system for 5 elements (temperature: maximum and minimum; precipitation: total, snowfall, and snow depth). Hence, the entire period of record for these elements offers observed with edited values. (See topic 45 "Data Quality: Known Uncorrected Problems.") In November 2000 and April 2001, two new quality control computer programs, designed to supplement ValHiDD, were run on the entire historical period of record for all elements. The full documentation and source codes for these programs, "3200fix_1.c" and "3200fix_2.c", are appended to the end of this section. Stations - The TD-3200 data base is comprised primarily of stations in the National Weather Service (NWS) cooperative station network. The vast majority of the observers are volunteers (non-paid, private individuals). However, the cooperative network also includes the NWS principal climatological stations, which are operated by highly trained observers. The NWS cooperative station network also includes stations that are supported by other federal agencies (e.g., the Department of Interior and Department of Transportation). Commonly the observers at these stations are employees of the Federal Aviation Administration (FAA), National Park Service (NPS), Bureau of Land Management (BLM), U.S. Forest Service (USFC), U.S. Geological Survey (USGS), and Tennessee Valley Authority (TVA). A handful of stations in the TD-3200 data base are stations of the U.S. Department of Defense (USDOD) (e.g., Air Force and Navy bases). The observing equipment used at all of the stations, whether at volunteer sites or federal installations, are calibrated and maintained by NWS field representatives, Cooperative Program Managers (CPMs) and Hydro-Meteorological Technicians (HMTs). Years ago, stations were selected according to a grid plan. Today stations are selected if another station closes or a different requirement exists for a specific area. When a previously established station closes, a new station compatible to it is established within 5 miles when possible. There are no specifics for distribution of stations, other than an attempt to give a reasonable sampling of the entire region. Currently there are approximately 8,000 active stations in the data set. Historically, approximately 23,000 stations are included for various years. Elevations for fixed surface locations for the data set are mostly below 1,000 meters above sea level. The minimum elevation is -30 meters and the maximum is 3,000 meters. The accuracy of the maximum-minimum temperature system (MMTS) is +/- 0.5 degrees C, and the temperature is displayed to the nearest 0.1 degree F. The observer records the values to the nearest whole degree F. A Cooperative Program Manager calibrates the MMTS sensor annually against a specially-maintained reference instrument. Observations The primary intent of the cooperative observing program today is the recording of 24-hour precipitation amounts, but about 55% of the stations also record maximum and minimum temperatures. These observations are for the 24-hour period ending at the time of observation and might be called Summary of the Day observations. The vast majority of observation times are near either 7:00 a.m. or 7:00 p.m. local time. Principal climatological stations are usually fully instrumented, and therefore record a more complete range of meteorological parameters than other stations. These observations are usually for the 24-hour period midnight to midnight. However, observation times do vary because of special program needs, and because many of the observers are unpaid volunteers with their own scheduling needs. Ideally, all observations should be taken at the same time. But it not possible to establish and enforce such requirements. The next best thing is to have adequate and accurate information about the observing practices at each station. Even this is an elusive target. Volunteer observers occasionally change their observing schedule without notifying either the NWS or NCDC; in some instances precipitation observations are taken in the morning and temperature observations in the evening, or vice versa. Users should also be aware of a potential for a time lapse between a change of observation times at the observing site and the first appearance of the time change in NCDC records (that is, several months may be archived under an "old" observation time before NCDC received notification). The TD-3200 data itself contains certain observation-time aberrations: Observations through June 1967 originally carried only the designation "AM" or "PM"; these were later set to 06 or 18 respectively. (The actual hour is shown in the Climatological Data bulletin.) From July 1967 through December 1981, all observation times are shown as 18 (and in fact the majority are p.m. observations. Users must take these facts into consideration. From January 1982 onward, the official time of observation is carried with the data. Computer Quality Control Program Documentation - Computer programs "3200fix_1.c" and "3200fix_2.c" are the latest quality control measures used on data set TD-3200. Both programs are written in the "C" language. Their source code is included in this version of the documentation. APPENDIX A: Program 3200fix_1.c /* WHAT THIS PROGRAM DOES: This program is the first of two that make corrections to NCDC TD-3200 data. The program itself is generic "C". ------------------------------------------------------------------------------ COMPILE AND RUN INSTRUCTIONS: The source code file is "3200fix_1.c". A suggested compile command for the SUN Solaris UNIX environment is cc -O -o 3200fix_1 3200fix_1 -lm (-O and -o are upper- and lower-case letters.) This command creates an optimized program named "3200fix_1". To run the program: Place all input files in one directory. Make that directory the current directory. Then type in 3200fix_1 where is the location of the program. can be omitted if the program is located in the current (input files) directory. ------------------------------------------------------------------------------ DETAILS: Five types of files are involved in running this program. All but (1) are ASCII text. (1) The program itself, "3200fix_1". (2) TD-3200 input files (in NCDC's standard element format for TD-3200). (3) One output data file per input file. (4) One output file that is a summary log of changes. (5) One output file that is a detailed log of changes. See (2): "3200fix_1" will attempt to open, as input files, all ASCII files in the current directory whose names do NOT contain any periods (.). Files that are directories or programs will be ignored, regardless of name. Records may be fixed or variable length; it makes no difference. See (3): One output file is created for each input file. Each output file name is the same as the input file name, except that the extension ".nd" is appended. Output files are variable length ASCII. See (4): One summary log file named "3200log.summary1" is created. The file shows the number of records read, deleted, modified, swapped and output for each input file and station. See (5): One detailed log file named "3200log.details1" is created. It shows each modified record before and after the changes. For deleted records, it shows the record before deletion. Because it contains record images, it is variable length ASCII like (3). ------------------------------------------------------------------------------ OTHER PROGRAM INFORMATION: Functions appear in alphanumeric order in the "Function Prototypes" section. Variable names try to suggest what the variable represents. But names longer than a dozen characters are rare and local work variable names are usually quite short, often 1 or 2 characters. (The author's prejudice is that overly long variable names are more clumsy than helpful.) Variable mames are composed of some combination of letters, underscores, and numbers. Underscores and numbers may occur in any name. Letters occur in all names. Letter case is used to indicate the type of variable. a. All letters are upper case in global definition names and macro names. b. An upper case 'G' is always the first character of a global variable name. Other letters are always lower case. c. All letters in function names and in names of local variables are lower case. A few words about pointers and copy-overwrites, methods heavily used in this program, can be found at the end of this file. */ /* Include files */ #include #include #include #include #include #include /* Definitions and macros */ #define MAXOBS 62 #define MAXLEN 3000 #define MYMAX(x,y) ((x) >= (y) ? (x) : (y)) #define NBETWEEN(x,y,z) ((x) >= (y) && (x) <= (z)) #define MYISDIGIT(x) NBETWEEN((x), '0', '9') /* Function prototypes */ int analyz1(char *, int *), dostat(), dupheader0(char *, char *), dupheader1(char *, char *), dupheader2(char *, char *), halfequal(char *, char *), leapyr(int), msgfiles(), myfgets(char *, int, FILE *), mystrlen(char *, int), ndays(char *), opndir(char *), whichisreal(char *, char *), zerovalue(char *) ; void closefiles(), logcounts(), mycsort(char **, int), mystrcpy(char *, char *), readprocess() ; FILE *infile(), *outfile() ; /* Global variable declarations */ struct stat Gstbuf ; struct dirent *Gdp ; DIR *Gdfd ; FILE *Gfpin, *Gfpout, *Gfp0, *Gfp1 ; char *Gfilnam ; char Gfilnam2[50], Gscomm[200] ; int Gnumobs1, Gnumobs2 ; long Gn_ins, Gn_outs, Gn_exacts, Gn_dysw, Gn_dups1, Gn_dups2, Gn_stubs, Gn_nogo, Gn_soils, Gn_cons ; /****************************************************************************/ main() { char **filnams ; int nelems ; register int i ; /* Create message files and open the current directory for reading. */ if (!msgfiles() || !opndir(".")) return ; filnams = (char **) malloc(sizeof(char **)) ; /* Read the directory listing and identify files to be processed. Not all entities from the directory listing are files. Some of the files may be ineligible for processing. */ i = -1 ; while ((Gdp = readdir (Gdfd)) != NULL) { Gfilnam = Gdp->d_name ; /* This "if" excludes directories, links, executable files, files whose names contain periods. The order of calls is significant: "dostat" returns file information including file name; the bitwise "and" ("&") expression excludes any directory item that isn't a nonexecutable file; "strchr()" eliminates a file with a period (.) in the name. */ if (!dostat() || (Gstbuf.st_mode & S_IXUSR) || strchr(Gfilnam, '.') != NULL) continue ; /** patch: Exclude year-to-date files ** if (strstr (Gfilnam, "ytd") != NULL) continue ; */ /* Make room in dynamic array 'filnams[]' for a new file name 'Gfilnam'. Copy 'Gfilnam' into the array. */ filnams = (char **) realloc(filnams, (++i + 1) * sizeof(char **)) ; filnams[i] = (char *) malloc(strlen(Gfilnam) + 2) ; strcpy(filnams[i], Gfilnam) ; } nelems = i + 1 ; /* Here is the whole point of having an array of file names: To put the file names in alphanumeric order before processing. */ mycsort(filnams, nelems) ; /* Process input files in alphanumeric order, fetching the names from array 'filnams'. */ for (i = 0 ; i < nelems ; i++) { Gfilnam = filnams[i] ; sprintf (Gfilnam2, "%s%s", Gfilnam, ".nd") ; /* Open this file and a new output file. */ if (!infile() || !outfile()) continue ; fprintf (Gfp1, "\n\n\n%s%s\n%s%s\n", "*******************************", "************************************", Gfilnam, " being processed...") ; readprocess() ; /* read & process file */ closefiles() ; logcounts() ; /* print "counts" log */ } } /****************************************************************************** * Given a record 'rec', deduce the actual number of observations and place the * number in 'num'. Ignore the number of observations given in columns 28-30. * Check for observations of incorrect length and for bad day-time values and * sequences. Return 1 on success and 0 on failure. */ int analyz1 (char *rec, register int *num) { register int j, ldata, comp ; register char *data, *thisdt, *lastdt ; /* Store the length of the record in 'ldata'. Exclude any trailing '\n' characters from the count. */ ldata = strlen (rec) ; while (*(rec + ldata - 1) == '\n') --ldata ; /* If the record is too short to have a complete header section (30 bytes), return 0 for error. */ if (ldata < 30) return 0 ; /* Let 'data' be the address 30 bytes past 'rec', and let 'ldata' be the length of the data part of the record. */ data = rec + 30 ; ldata -= 30 ; /* Length of data section only */ /* A 'ldata' value of 12 or greater is expected. 12 is the length of one observation. There may be up to 62 obs, allowing for one obs for each day of a 31-day month, and also one edited obs for each original obs. Compute the number of obs and store in 'num'. */ if (ldata >= 12) { *num = ldata / 12 ; /* However, if 'ldata' is not a MULTIPLE of 12 or if 'num' is more than 62 obs, return 0 for error, anyway. */ if (ldata % 12 != 0 || *num > 62) return 0 ; } /* If 'ldata' is less than 10, the record is a "stub": too short to have even one obs. Return 0 for error. */ else if (ldata < 10) { return 0 ; } /* Allow for the special case of 'ldata' == 10 or 11. This indicates that there is one obs, but it is missing one or both flags. This is known to occur in the data and is correctable. */ else { while (ldata < 12) { mystrcpy(data + ldata, data + ldata - 1) ; *(data + ldata) = ' ' ; ldata++ ; } *num = 1 ; } /* Return error on days in decreasing order, or equal order without a flag 2 == '2'. */ for (j = 1, thisdt = data ; j < *num ; j++) { lastdt = thisdt ; thisdt = data + (j*12) ; if ((comp = strncmp(lastdt, thisdt, 2)) >= 0) if (comp > 0 || *(lastdt + 11) != '2') return 0 ; } return 1 ; } /***************************************************************************** * Closeout for one input file. This contains code specific to the operating * system. */ void closefiles() { fclose (Gfpin) ; fclose (Gfpout) ; /* The next line builds a character string in 'Gscomm' that looks like a UNIX command line: "compress ". */ sprintf (Gscomm, "compress %s\n", Gfilnam) ; /* The next line calls the operating system software (UNIX in this case) to execute the command line. */ system (Gscomm) ; } /***************************************************************************** * The "stat" function returns information about the item named 'Gfilnam' */ int dostat() { if (stat (Gfilnam, &Gstbuf) == -1) { fprintf (Gfp0, "%s: Can't use stat function\n", Gfilnam) ; return 0 ; } return 1 ; } /***************************************************************************** * Check two records "rec1" and "rec2", already known to be identical in the * first 27 characters, for the possibility that one is the continuation of the * other, and that together they make up one month of data. If so, append in * an appropriate way the data section of the later record to the end of the * earlier record, then "delete" the later record by placing '\0' in column 1. */ int dupheader0(char *rec1, char *rec2) { char cnobs[4], *src, *tgt ; int *nobssrc, *nobstgt ; register int icomp ; register char *a, *b ; a = rec1 + 30 + ((Gnumobs1-1) * 12) ; b = rec2 + 30 ; /* 'a' is the address of the day of the last obs in 'rec1', and 'b' is the first obs in 'rec2'. Compare the days. "Strncmp(a,b,2) > 0" (last day in 'rec1' later than first day in 'rec2') is by far the most likely case because normal records begin with a low-numbered day and end with a high- numbered. */ if ((icomp = strncmp(a, b, 2)) > 0) /* "Original test line". */ { /* Normal records will take this path through the code. But wait. There's still a chance the records are two parts of the same month: Maybe "strcmp(a,b,2)" happened because the wrong ends of the records were tested. Change 'a' to the first obs in 'rec1' and 'b' to the last obs in 'rec2' and try again. */ a = rec1 + 30 ; b = rec2 + 30 + ((Gnumobs2-1) * 12) ; /* Now with the day sense reversed, "strncmp(a,b,2) < 0" (first day of 'rec1' earlier than last day of 'rec2') is expected for normal records. Another case, the days being equal, is highly unlikely, but will be assumed to show normal records. Either case causes "return 0" here. */ if (strncmp(a, b, 2) <= 0) return 0 ; /* Route of almost all normal records. */ /* Arrive here if "strncmp(a,b,2) > 0" in the latest test. The records are for the same station, same data type, and same date; if the last day of one record is earlier than the first day of the other, the two records should be concatenated. */ src = rec1 ; /* Set up "rec1" as "source" record. */ nobssrc = &Gnumobs1 ; tgt = rec2 ; /* Set up "rec2" as "target" record. */ nobstgt = &Gnumobs2 ; } /* Suppose "Original test line" was false and was instead "< 0". In this case it is known that the last day of one record is earlier than the first day of the other, and concatenation should proceed. */ else if (icomp < 0) { src = rec2 ; /* Set up "rec1" as "target" record. */ nobssrc = &Gnumobs2 ; tgt = rec1 ; /* Set up "rec2" as "source" record. */ nobstgt = &Gnumobs1 ; } /* Or suppose "Original test line" was false and was instead "== 0", a highly unlikely but possible result. The records are assumed to be normal. */ else if (icomp == 0) { return 0 ; } /* Arrive here if the records are to be combined. Append the data section of 'src' to 'tgt'. */ fprintf (Gfp1, "\n\n%s\n%s\n%s\n%s\n", "Dups due to fractured records." " To this record:", tgt, "The data section of this record was " "appended, then this record was deleted:", src) ; strcpy(tgt+30+(*nobstgt*12), src+30) ; /* Join in 'tgt'. */ *nobstgt += *nobssrc ; /* Compute new number of obs. */ *src = '\0' ; /* "Delete" 'src'. */ *nobssrc = 0 ; sprintf(cnobs, "%3.3d", *nobstgt) ; strncpy(tgt+27, cnobs, 3) ; /* Insert new number of obs in 'tgt'. */ fprintf (Gfp1, "%s\n%s\n", "The joined product was saved as this record:", tgt) ; return 1 ; } /***************************************************************************** * Check two records "rec1" and "rec2", already known to be identical in the * first 27 characters, for the possibility that one was intended to be a * correction to the other. If so, change the other as the correction record * indicates, then delete the correction record. */ int dupheader1(char *rec1, char *rec2) { register char *s, *t, *p ; register int i, j ; register char *srcp30, *tgtp30 ; register int nsrc, ntgt, lastj, comp, nsrcm1, ntgtm1 ; char *src, *tgt ; char cnum[4] ; /* One record may have been intended to be a correction for the other if one is quite short (the correction record) and the other is comparatively long (the intended target). If not, exit. */ if (Gnumobs1 <= 5 && Gnumobs2 >= 10) /* "Rec1" short & "rec2" long? */ { src = rec1 ; nsrc = Gnumobs1 ; tgt = rec2 ; ntgt = Gnumobs2 ; } else if (Gnumobs1 >= 10 && Gnumobs2 <= 5) /* "Rec1" long & "rec2" short? */ { src = rec2 ; nsrc = Gnumobs2 ; tgt = rec1 ; ntgt = Gnumobs1 ; } else /* Neither? */ { return 0 ; } fprintf (Gfp1, "\n\n%s\n%s\n", "Dups due to original record + record of " "corrections to be applied. This original record:", tgt) ; /* Do the following bits of arithmetic and save them to avoid having to do them over and over in the "for" loops below. */ nsrcm1 = nsrc - 1 ; ntgtm1 = ntgt - 1 ; srcp30 = src + 30 ; tgtp30 = tgt + 30 ; /* Go through 'src' day-times one by one. 's', 't', and 'p' are pointers used to point to different addresses inside the 'src' and 'tgt' obs. */ for (i = lastj = 0 ; i < nsrc ; i++) { s = srcp30 + (i*12) ; /* Current 'src' obs */ /* Experience shows that 'src' obs may have correct days and data values, yet still have many problems. Make a few fixes to 'src' before copy to 'tgt'. More serious fixes are deferred to another program. */ p = s + 4 ; if (strchr (" -+", *p) == NULL) *p = ' ' ; p = s + 10 ; /* 'src' obs 1st flag location */ /* On one of the internal 'src' obs, both flags may be blank. Standardize to ' ' and '4'. */ if (i != nsrcm1) /* 'src' obs is not the last obs? */ { if (*p == ' ' && *(++p) == ' ') *p = '4' ; } /* On the last 'src' obs, flag 2 or both flags may be blank or truncated. Standardize to ' ' and '4' and end the 'src' record correctly. */ else /* Last 'src' obs? */ { if (*p == ' ' || *p == '\n' || *p == '\0') /* flags */ *p = ' ' ; if (*(++p) == ' ' || *p == '\n' || *p == '\0') *p = '4' ; *(++p) = '\n' ; *(p+1) = '\0' ; } /* Now move forward in time through 'tgt' obs until a match with the current 'src' day is found, or until it can be reasoned that there is no match. This loop is restarted for each new 'src' obs. */ for (j = lastj ; j <= ntgt ; ++j) { t = tgtp30 + (j*12) ; /* COPY of current 'src' obs to 'tgt' obs happens at end of the "if()" structure that starts on the next code line. 3 of the 4 "if()" branches drop through to COPY. */ /* First part of "if()". Is current 'src' obs later than the last 'tgt' obs? If so, 't' points to the address just after the last flag of the last obs. */ if (j == ntgt) { *(t+2) = *(t-10) ; /* Copy time from previous obs */ *(t+3) = *(t-9) ; /* Subcase where 'src' obs is first of a pair of obs with the same day. First obs is original value; second is edited. Create a new end for 't' 24 characters to the right. The old end will be overwritten during COPY. */ if (*(s+12) == *s && *(s+13) == *(s+1) && i != nsrcm1) { strncpy (t+12, s+12, 12) ; /* Copy 2nd obs */ *(t+24) = '\n' ; /* End 'tgt' correctly */ *(t+25) = '\0' ; ntgt += 2 ; /* Gain two obs */ j += 2 ; /* Prepare to bypass 2nd obs */ } /* Normal subcase. Create a new end for 't' 12 characters to the right. The old end will be overwritten during COPY. */ else { *(t+12) = '\n' ; /* End 'tgt' correctly */ *(t+13) = '\0' ; ntgt++ ; /* Gain one obs */ } } /* Current 'src' obs is same day as current 'tgt' obs? */ else if ((comp = strncmp(s, t, 2)) == 0) { /* Case where 'src' obs is first of a pair of obs with the same day. First obs is the original value; second is edited. */ if (*(s+12) == *s && *(s+13) == *(s+1) && i != nsrcm1) { /* If there are not two 'tgt' obs with the same day, shift 'tgt' 12 characters to the right so that there will be. */ if (*(t+12) != *t || *(t+13) != *(t+1)) { mystrcpy (t+12, t) ; /* Shift 't' right */ ntgt++ ; /* Gain one obs */ } *(t+12) = *(s+12) ; /* Copy 2nd obs day */ *(t+13) = *(s+13) ; strncpy (t+16, s+16, 8) ; /* 2nd obs data & flags */ j += 2 ; /* Prepare to bypass 2nd obs */ } } /* Current 'src' obs is later than current 'tgt' obs? Then bypass this 'tgt' obs and fetch the next one. This is the only case that does not drop to the COPY, but instead keeps the inner loop going. */ else if (comp > 0) { continue ; } /* Current 'src' obs is earlier than current 'tgt' obs? Occurs when there is no 'tgt' obs with the same day as the current 'src' obs. */ else { /* Case where 'src' obs is first of a pair of obs with the same day. First obs is the original value; second is edited. */ if (*(s+12) == *s && *(s+13) == *(s+1) && i != nsrcm1) { mystrcpy (t+24, t) ; /* Shift 'tgt' 2 obs right */ ntgt += 2 ; /* Gain two obs */ } /* Normal subcase: 'src' obs is an ordinary obs. */ else { mystrcpy (t+12, t) ; /* Shift 'tgt' 1 obs right */ ntgt++ ; /* Gain one obs */ } } /* End of the "if()" structure. 3 of the 4 primary cases drop to here, execute the COPY, and "break" out of the "for()" loop. COPY copies current 'src' days, data values and flags (no times) to current 'tgt' obs. Also, the restart point 'lastj' of this inner "for" loop is set. */ *t = *s ; /* Copy day */ *(t+1) = *(s+1) ; /* ditto */ strncpy (t+4, s+4, 8) ; /* COPY data value and flags */ lastj = j ; break ; } /* End of "for (j = lastj..." */ } /* Make sure displayed number of observations is correct. Bump counters. "Delete" 'src' so that only 'tgt' will be written to output. */ sprintf (cnum, "%.3d", ntgt) ; strncpy (tgt+27, cnum, 3) ; Gn_dups1++ ; fprintf (Gfp1, "%s\n%s\n%s\n%s\n", "Was corrected and saved as:", tgt, "This correction record was deleted:", src) ; *src = '\0' ; return 1 ; } /***************************************************************************** * Check two records 'rec1' and 'rec2', already known to be identical in the * first 27 characters, that one is a corrected version of the other. If so, * keep the corrected record and delete the other, and return 1. Otherwise, * return 0. */ int dupheader2(char *rec1, char *rec2) { int relate ; static char cdups[] = "Dups due to original record + corrected record. ", ckept[] = "This corrected record kept:", cdele[] = "This original record deleted:", cfmt[] = "\n\n%s%s\n%s\n%s\n%s\n" ; /* One record is considered to be a corrected version of the other if all of the following are true: (1) The records have identical headers and at least 6 observations each. (2) A reasonable number of obs in one record match in day and data value with obs in the other record. See function "halfequal" for a fuller explanation. (3) One record has more obs than the other, or has more accumulation flags and error flags than the other. */ if (Gnumobs1 < 6 || Gnumobs2 < 6 || !halfequal(rec1, rec2)) { return 0 ; } else if (Gnumobs1 > Gnumobs2) { Gn_dups2++ ; fprintf (Gfp1, cfmt, cdups, ckept, rec1, cdele, rec2) ; *rec2 = '\0' ; } else if (Gnumobs1 < Gnumobs2) { Gn_dups2++ ; fprintf (Gfp1, cfmt, cdups, ckept, rec2, cdele, rec1) ; *rec1 = '\0' ; } else if ((relate = whichisreal(rec1, rec2)) > 0) { Gn_dups2++ ; fprintf (Gfp1, cfmt, cdups, ckept, rec1, cdele, rec2) ; *rec2 = '\0' ; } else if (relate < 0) { Gn_dups2++ ; fprintf (Gfp1, cfmt, cdups, ckept, rec2, cdele, rec1) ; *rec1 = '\0' ; } else { return 0 ; } return 1 ; } /***************************************************************************** * See if a "reasonable number" of the data values of the two records 'rec1' * and 'rec2' match. (At one time during development it was simply half, * hence the function name.) Type DYSW records should not be evaluated by * this program. */ int halfequal(char *rec1, char *rec2) { register char *reca, *recb, *a, *b ; register int i, startj, j, numobsa, numobsb, n_equal, comp, iszcom, n_nonzero, minequal ; numobsa = numobsb = n_equal = comp = iszcom = n_nonzero = minequal = 0 ; /* Many different record types occur, but each type is a member of one of two groups: (1) Zero (0) occurs frequently as a data value in most records, whether the records are related or not. Zero values cannot be used in comparing 'rec1' to 'rec2'. This group includes all precipitation data types. (2) Zero (0) is an ordinary data value, no different from any other value, and can be used in comparing 'rec1' to 'rec2'. This group includes all temperature data types. The following "if" detects a record in group (1) and counts the number of non-zero values for later reference. */ b = rec1 + 11 ; /* "b" is data type location */ if ((iszcom = strncmp (b, "PRCP", 4) == 0 || strncmp (b, "SNOW", 4) == 0 || strncmp (b, "SNWD", 4) == 0 || strncmp (b, "WDMV", 4) == 0 || strncmp (b, "WTEQ", 4) == 0) != 0) { /* Load 'reca' with the address of the first obs in 'rec1', and let 'numobsa' be the number of obs in 'rec1'. */ reca = rec1 + 30 ; recb = rec2 + 30 ; numobsa = Gnumobs1 ; /* Run "infinite" outer loop that terminates on the second pass via "if (reca == recb)". */ for (;;) { /* The inner loop sets 'a' to each obs, where the first obs is at 'reca'. If an obs' data value is found to be valid AND not zero (recall that this is group (1)), counter 'j' is incremented. */ for (i = j = 0 ; i < numobsa ; i++) { a = reca + (i*12) ; if (!zerovalue (a+4) && *(a+11) != '2') j++ ; } if (reca == recb) /* Exit outer loop on 2nd pass. */ break ; /* These lines are reached at the end of the first pass of the outer loop. Loads 'reca' with the address of the first obs in 'rec2', and sets 'numobsa' to the number of obs in 'rec2'. Because of the "break" above, these lines are run only once. */ reca = recb ; numobsa = Gnumobs2 ; n_nonzero = j ; } /* Set 'n_nonzero' to the largest number of zeroes found in one record. 'Minequal' will be the smallest "reasonable number" of matches mentioned above. Set it to 1 or 1/3 of 'n_nonzero', whichever is larger. */ if (j > n_nonzero) n_nonzero = j ; minequal = MYMAX (1, n_nonzero / 3) ; } /* Associate the 'a' group of variables ('a', 'reca', 'numobsa') with the shorter of 'rec1' and 'rec2'. The "b" group of similarly-named variables is associated with the remaining record. */ if (Gnumobs1 <= Gnumobs2) { reca = rec1 + 30 ; recb = rec2 + 30 ; numobsa = Gnumobs1 ; numobsb = Gnumobs2 ; } else { reca = rec2 + 30 ; recb = rec1 + 30 ; numobsa = Gnumobs2 ; numobsb = Gnumobs1 ; } /* Set a 'minequal' ("minimum reasonable number") value for group (2) records or for the special case of group (1) records where all data values in both records are zero. */ if (!iszcom || n_nonzero == 0) minequal = numobsb / 3 ; /* 'a' and 'b' point to the days of the current obs of the two records. The days are compared. When a day pair is the same, the data values are compared. When the data values are equal, 'n_equal' is increased by 1 if at least one of the following is true: (1) Record type is in group (2). (2) Record type is in group (1) and all values in both records are zero. (3) Record type is in group (1) and current matched values are non-zero. */ for (i = startj = 0 ; i < numobsa ; i++) { a = reca + (i*12) ; if (*(a+11) == '2') /* Don't compare invalids. */ continue ; for (j = startj ; j < numobsb ; j++) { b = recb + (j*12) ; /* Don't compare invalids or non-matching days. */ if ((comp = strncmp (a, b, 2)) > 0 || *(b+11) == '2') continue ; /* Do the stuff in the long comment before the outer loop. */ if (comp == 0 && strncmp (a+4, b+4, 6) == 0) if (!iszcom || !n_nonzero || !zerovalue (a+4)) n_equal++ ; startj = j ; break ; } } /* Is the number of equal values between the two records 'n_equal' equal to or greater than 'minequal', a "reasonable number"? Return 1 if yes and 0 if no. */ return (n_equal >= minequal) ; } /***************************************************************************** * Open a file for reading. */ FILE *infile() { if ((Gfpin = fopen (Gfilnam, "r")) == NULL) fprintf (Gfp0, "%s: Can't open for reading\n", Gfilnam) ; return Gfpin ; } /***************************************************************************** * Return 1 if integer "yr" is a leap year and 0 if not. Use the full year * for best results. */ int leapyr(register int yr) { return (yr % 4 == 0 && (yr % 100 != 0 || yr % 400 == 0)) ; } /***************************************************************************** * Generate summary report. */ void logcounts() { fprintf (Gfp0, "%s being processed....\n", Gfilnam) ; fprintf (Gfp0, " %10ld total records input\n", Gn_ins) ; fprintf (Gfp0, " %10ld short records without obs, deleted\n", Gn_stubs) ; fprintf (Gfp0, " %10ld dup headers, exact duplicate records, one record deleted\n", Gn_exacts) ; fprintf (Gfp0, " %10ld dup headers, correction done and short correction record deleted\n", Gn_dups1) ; fprintf (Gfp0, " %10ld dup headers, correct record kept, original record deleted\n", Gn_dups2) ; fprintf (Gfp0, " %10ld dup headers, complementary records, joined into one\n", Gn_cons) ; fprintf (Gfp0, "(%10ld dup headers, DYSW, nothing deleted )\n", Gn_dysw) ; fprintf (Gfp0, "(%10ld dup headers, soils, nothing deleted )\n", Gn_soils) ; fprintf (Gfp0, "(%10ld dup headers, reason unknown, nothing deleted )\n", Gn_nogo) ; fprintf (Gfp0, " %10ld total records deleted\n", Gn_ins - Gn_outs) ; fprintf (Gfp0, " %10ld total records output\n\n", Gn_outs) ; } /***************************************************************************** * Create two message files to report results */ int msgfiles() { if ((Gfp0 = fopen ("3200log.summary1", "w")) == NULL || (Gfp1 = fopen ("3200log.details1", "w")) == NULL) { fprintf (stderr, "Can't create message files. Aborting...\n") ; return 0 ; } return 1 ; } /****************************************************************************** * Sort, in ascending order, "n" elements of array of character pointers "acp". * Each element of the array is a pointer that points to a '\0'-terminated * character string; the sort uses the contents of these strings. */ void mycsort(register char *acp[], register int n) { register int i, j, k ; register char *save ; for (i = 1 ; i < n ; i++) { for (save = acp[i], j = 0 ; j < i ; j++) { if (strcmp(save, acp[j]) < 0) { for (k = i ; k > j ; k--) acp[k] = acp[k-1] ; acp[j] = save ; break ; } } } } /***************************************************************************** * This is a version of standard "C" library function "fgets()". It reads one * record to and including the newline (\n) and writes into buffer 's'. Or, * it reads and writes until "n"-1 characters are transfered, if that occurs * first. This function does two things differently from "fgets()": First, * it returns the number of characters read instead of a pointer to 's'. * Second, even if there is no newline, it compensates and reads the record * successfully. On EOF and no characters read, -1 is returned. */ int myfgets (register char *s, register int n, register FILE *fp) { register int c, m ; register char *cs, *sp27 ; static char hold[] = {'\0', '\0', '\0'} ; /* If three capital letters remain from the previous call to this function, prefix them onto 's' and start at column 4. */ if (*hold) { strncpy (s, hold, 3) ; *hold = '\0' ; m = 3 ; cs = s + m ; } /* Else start at column 1. */ else { m = 0 ; cs = s ; } sp27 = s + 27 ; /* Read until end of record or until 'm' equals or exceeds 'n'. */ while ((c = getc (fp)) != EOF) /* Read loop. */ { /* '\n' is the normal end of record indicator. */ if (c == '\n') { if (cs == s) continue ; /* Nothing in "s" yet? Ignore '\n'. */ else break ; /* Normal end of record path. */ } /* Stop if too many characters have been read. */ if (++m >= n) break ; /* Check for a '\0'. There shouldn't be any in the input file, but sometimes there are. '\0' as the second character of the division # is known to occur. */ if (!c) { if (cs == s+10) strncpy (cs++ - 1, "99", 2) ; /* At division # */ else *cs++ = ' ' ; /* Everywhere else */ continue ; } /* All other characters read are loaded here. */ *cs++ = c ; /* Should the record have ended already? Sometimes there is no closing '\n'. Detect this by looking for "DLY". These are the first three characters of any TD-3200 record and should not appear anywhere else. If they do appear somewhere else, they must be in the next record, so they are placed in "hold" to use on the next call to this function. */ if (cs > sp27 && strncmp("DLY", cs-3, 3) == 0) { cs -= 3 ; strncpy (hold, cs, 3) ; /* Copy "DLY" to 'hold' */ break ; } } /* End of read loop. */ /* Arrive here if the above read loop has terminated for any reason. End the input record correctly (with '\n' and '\0') and return the count of characters read. If EOF occurs at the end of a record, it is treated as if it were '\n'. On the next call to this function, when EOF will be the first character read and there is no record to return, this function sets the first character of "s" to '\0' and returns -1. */ if (c == EOF && cs == s) { *cs = '\0' ; return -1 ; } else { *cs++ = '\n' ; /* Force the correct record closure */ *cs = '\0' ; return ++m ; } } /***************************************************************************** * This is a version of library function "strcpy(a,b)". "Mystrcpy()" is * guaranteed to handle "overlap" properly (where "b" is overwritten during * the write to "a"). The ANSI "C" Standard is silent on how "strcpy()" should * behave on "overlap". */ void mystrcpy (register char *a, register char *b) { register char *p ; /* If address 'a' is not later than 'b', use a forward copy method. */ if (a <= b) { while (*a++ = *b++) ; } /* If 'a' is later than 'b', there is a possibility of data loss if the forward copy is used. Use reverse copy. */ else { for (p = b ; *b ; a++, b++) ; for (; p <= b ; *a-- = *b--) ; } } /***************************************************************************** * Version of "strlen()" that counts the number of characters in 's' and * returns that value, except if the number exceeds 'len', counting stops * and 'len' is returned. */ int mystrlen (register char *s, register int len) { register int i ; for (i = 0 ; i < len && *(s+i) ; i++) ; return i ; } /***************************************************************************** * Return the number of days in a month, given "y", the address of the first * character of a date string that begins with YYYYMM. */ int ndays(char *y) { int yr, mo ; sscanf(y+4, "%2d", &mo) ; switch (mo) { case 1: case 3: case 5: case 7: case 8: case 10: case 12: return 31 ; case 4: case 6: case 9: case 11: return 30 ; case 2: if (sscanf(y, "%4d", &yr) == 1) return (leapyr(yr)) ? 29 : 28 ; else return 0 ; default: return 0 ; } } /***************************************************************************** * Open directory for reading. */ int opndir (char *dirnam) { if ((Gdfd = opendir (dirnam)) == NULL) { fprintf (Gfp0, "Can't open directory %s to read files\n", dirnam) ; return 0 ; } return 1 ; } /***************************************************************************** * Open a file for writing. */ FILE *outfile() { if ((Gfpout = fopen (Gfilnam2, "w")) == NULL) fprintf (Gfp0, "%s: Can't open for writing\n", Gfilnam2) ; return Gfpout ; } /***************************************************************************** * Read records from an input file and write to output. Some records are not * written to output, which effectively "deletes" them; and some are modified * before writing. Most of the checks are for different types of "duplicate * records". */ void readprocess() { register int nrecsm1, i, j, *k, nread, dt ; register char *rec1, *rec2, *p ; /* These "static" variables are initialized the FIRST time this function is called. They retain the latest values on exit from this function, and still have those values on re-entry. */ static int nplaces = 1; static char **stns = NULL, toomany[] = "Too many characters read -- bad input file\n" ; nread = 0 ; nrecsm1 = 0 ; Gn_ins = Gn_outs = Gn_exacts = Gn_dysw = Gn_dups1 = Gn_dups2 = Gn_stubs = Gn_nogo = Gn_soils = Gn_cons = 0L ; /* Initialize resizeable array 'stns[]' (it's really a pointer-to-a-pointer) when this function is called for the FIRST time. */ if (stns == NULL) { stns = (char **) malloc(sizeof(char **)) ; stns[0] = (char *) malloc(MAXLEN) ; } /* Outer loop reads all records in this file. */ for(;;) { rec1 = stns[nrecsm1] ; /* 'Rec1' is where new record is stored */ nread = myfgets(rec1, MAXLEN, Gfpin) ; /* Read a new record */ /* When EOF was just returned, skip code that is designed for the new record. */ if (nread != -1) { if (nread >= MAXLEN) /* Bad input file? */ { fputs(toomany, Gfp0) ; return ; } Gn_ins++ ; /* If 'rec1' is a record fragment without even one observation, discard it by ignoring it; reuse its storage area for the next new record. */ if (mystrlen (rec1, 35) <= 34) { Gn_stubs++ ; fprintf (Gfp1, "\n\nShort record deleted:\n%s\n", rec1) ; continue ; } /* If the header in the most recently read record is the same as the header in the first record, increment "nrecsm1" and read the next record. On different headers (most common case), fall through the "if()". Usually 'rec1' points to 'stns[1]' or higher, and may be different from 'stns[0]'. An exception is on the first read from a file, when 'rec1' points to 'stns[0]'; the "if()" will test as "true" then. */ if (strncmp(rec1, stns[0], 27) == 0) { if (++nrecsm1 >= nplaces) { /* If adjustable array 'stns[]' is not big enough, make it bigger. */ nplaces = nrecsm1 + 1 ; stns = (char **) realloc(stns, nplaces * sizeof(char **)) ; stns[nrecsm1] = (char *) malloc(MAXLEN) ; } continue ; } } /* End of "if (nread != -1)" */ /* Arrive here when the latest record has a different header, or is EOF. Decrement "nrecsm1" by 1 so that the latest record is not processed yet. */ nrecsm1-- ; /* If the data type in record 'rec1' is soils or DYSW, increment the appropriate counter. */ dt = '\0' ; switch (*(p = stns[0]+11)) /* Test for soils or DYSW */ { case 'S': /* Soil test */ if (MYISDIGIT(*(p+2)) && MYISDIGIT(*(p+3))) if (strchr ("NOX", *(p+1)) != NULL) { Gn_soils += nrecsm1 + 1 ; dt = 's' ; } break ; case 'D': /* DYSW data type test */ if (strncmp (p+1, "YSW", 3) == 0) { Gn_dysw += nrecsm1 + 1 ; dt = 'd' ; } break ; } /* Do quality control checks on pairs of records with the same header that is stored in 'stns[]'. This loop runs only when there are at least two records with the same header; if 'nrecsm1' == 0 (only one record), there is nothing to do. */ for (i = 0 ; i < nrecsm1 ; i++) { rec1 = stns[i] ; if (!*rec1) /* Bypass "deleted" records */ continue ; sscanf(rec1 + 27, "%3d", &Gnumobs1) ; /* Look for certain types of date and obs-length errors in 'rec1' that this program will not attempt to handle. If so, leave the record to be output in its original form and get the next record. */ if (!analyz1(rec1, &Gnumobs1)) continue ; /* Compare all possible pairings of "stns[]" records. 'Rec1' and 'rec2' are the two being compared. */ for (j = i + 1 ; j <= nrecsm1 ; j++) { rec2 = stns[j] ; if (!*rec2) /* Skip "deleted" records */ continue ; /* When 'rec1' and 'rec2' are exact duplicates, 'rec2' is marked for deletion. Characters 28-30 (number of obs) are not considered. */ if (strcmp (rec1+30, rec2+30) == 0) { Gn_exacts++ ; *rec2 = '\0' ; continue ; } /* Don't do further checks on soils or DYSW data. Note that exact duplicates of those types were deleted, however. (See previous note.) */ else if (dt) { continue ; } sscanf(rec2 + 27, "%3d", &Gnumobs2) ; /* Look for certain types of date and obs-length errors in 'rec2' that this program will not attempt to handle. If so, get the next 'rec2'. */ if (!analyz1(rec2, &Gnumobs2)) { continue ; } /* One record may have been intended as a correction for the other. There are two subtypes. In either case one record is deleted. See the two functions for more information. */ else if (dupheader1(rec1, rec2) || dupheader2(rec1, rec2)) { continue ; } /* One record may be a continuation of the other. If so, join the records correctly in one of 'rec1' or 'rec2' and mark the other for deletion. */ else if (dupheader0(rec1, rec2)) { Gn_cons++ ; } /* The two records may not have a relationship that can be deduced by this program. (Maybe one station id is wrong and that record is from a different station.) */ else { Gn_nogo++ ; fprintf (Gfp1, "\n\nDups, unknown case. Both of " "these records kept:\n%s%s\n", rec1, rec2) ; } } /* End of "for (j = i + 1..." */ } /* End of "for (i = 0..." */ /* Output loop. Write to the output file all records in 'stns[]' that aren't marked as deleted. Count the writes. */ for (i = 0 ; i <= nrecsm1 ; i++) { rec1 = stns[i] ; if (*rec1) /* Output only if record is not "deleted" */ { fputs (rec1, Gfpout) ; Gn_outs++ ; } } /* Exit now if the latest record was EOF. */ if (nread == -1) break ; /* Copy the record 'stns[nrecsm1+1]', which was not processed, into 'stns[0]' where it will be processed on the next pass. Change 'nrecsm1' so that the next read will put a record in 'stns[1]'. */ strcpy(stns[0], stns[nrecsm1+1]) ; nrecsm1 = 1 ; } /* End of "for(;;)" */ } /****************************************************************************** * Count the number of error and accumulation flags in two records and return * the difference. */ int whichisreal(char *rec1, char *rec2) { register int i, j, numobs, *n ; register char *p, *recp40 ; int nflgs1, nflgs2 ; nflgs1 = nflgs2 = 0 ; numobs = Gnumobs1 ; n = &nflgs1 ; recp40 = rec1 + 40 ; for (;;) { for (i = 0 ; i < numobs ; i++) { p = recp40 + (i*12) ; if (strchr("ABES", *p) != NULL || strchr("01 ", *(p+1)) == NULL) ++*n ; } if (n == &nflgs2) break ; n = &nflgs2 ; recp40 = rec2 + 40 ; } return nflgs1 - nflgs2 ; } /****************************************************************************** * Determine if 6 characters starting at 'adr' evalaute to a numeric value of * zero. Return 1 if yes and 0 if no. */ int zerovalue (char *adr) { int val ; sscanf (adr, "%6d", &val) ; return !val ; } /* Discussion of methods used ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- Pointers and pointer arithmetic are much used in this program to copy character strings or to move to different locations in character strings. There is a discussion below, in case the concepts are unfamiliar. ------------------------------------------------------------------------------- First, basic information: In a character string copy in COBOL or FORTRAN, the variables involved in a character string copy are character variables. In FORTRAN, copying the character string stored in variable "source", to variable "target", is done by the expression target = source which is simple and to the point. The programmer does not need to know how the compiler does the copy internally. "C" is different in that the programmer must work with one character at a time and understand what happens. A character string copy is typically done with a function call like strcpy(target, source) ; "Strcpy()" is a "C" library function defined by the ANSI "C" Standard. "Target" and "source" represent character addresses. Either may look like a variable name or like an algebraic expression. The function copies the string one character at a time, and quits copying after a NULL character, the signal used by "C" to indicate the end of a character string, has been copied. An address is a number that "C" interprets as a memory location. An array variable or a pointer variable contains an address. An address can also be an algebraic expression that includes arrays or pointers: For example, if "a" is an address, then "a+12" is also an address. An array and a pointer are similar. The biggest difference is that the address stored in an array "variable" cannot be changed, but the address stored in a pointer variable can be changed. In this program, the reader will often see a pointer used to move to different positions in an array. Referring back to "target" and "source": These may be arrays, pointers, or expressions, as noted above; but they are simply character addresses. Whatever is stored at a character address is interpreted as a 1-byte ASCII character. A slightly different kind of copy is done by standard ANSI "C" library function strncpy(target, source, n) ; This function copies until a NULL character is copied, like "strcpy()", or until "n" (a positive integer) characters are copied, whichever occurs first. My preference is not to make a function call to copy three or fewer characters. Instead of strncpy(a, b, 2) ; I prefer the pointer expression form *a = *b ; *(a+1) = *(b+1) ; or the equivalent array form a[0] = b[0] ; a[1] = b[1] ; All three methods do the same thing. but the last two methods are probably faster than the function call. ------------------------------------------------------------------------------- More about pointer arithmetic and character string copying: One form of character string copying that is much used in this program is the copy-overwrite, in which pointer arithmetic is rampant. Typical are lines of code like (1) mystrcpy(a, a+12) ; (2) mystrcpy(a+12, a) ; where "mystrcpy()" is a general-purpose function in this program. "a" and "a+12" are character addresses inside the same long character string, whose starting address is typically be stored in some other variable that is not mentioned; "a+12" is 12 bytes to the right (toward higher addresses) of "a". "a-12" is 12 bytes to the left (toward lower addresses) of "a". Code line (1) copies, to "a", the character substring that begins at "a+12" and ends after the long string's NULL character. In effect, the 12 bytes between "a" and "a+11" are lifted out of the long string and discarded; the right end of the long string is slid to the left to fill the gap. The long string is 12 bytes shorter. This trick removes a 12-byte observation in a TD-3200 record. Code line (2) copies, to "a+12", the character substring that begins at "a" and ends after the long string's NULL character. In effect, the right end of the long string starting at "a+12" is slid to the right, leaving a 12-byte gap; and the 12 bytes between "a" and "a+11" are duplicated in the gap. The long string is 12 bytes longer. A new 12-byte observation in a TD-3200 record can be copied into the gap. [Charles Phillips] */ APPENDIX B: Program 3200fix_2.c /* WHAT THIS PROGRAM DOES: This program is the second of two that make corrections to NCDC TD-3200 data. The program itself is generic "C". ------------------------------------------------------------------------------ COMPILE AND RUN INSTRUCTIONS: The source code file is "3200fix_2.c". A suggested compile command for the SUN Solaris UNIX environment is cc -O -o 3200fix_2 3200fix_2 -lm (-O and -o are upper- and lower-case letters.) This command creates an optimized program named "3200fix_2". To run the program: Place all input files in one directory. Make that directory the current directory. Then type in 3200fix_2 where is the location of the program. can be omitted if the program is located in the current (input files) directory. ------------------------------------------------------------------------------ DETAILS: Five types of files are involved in running this program. All but (1) are ASCII text. (1) The program itself, "3200fix_2". (2) TD-3200 input files (in NCDC's standard element format for TD-3200). (3) One output data file per input file. (4) One output file that is a summary log of changes. (5) One output file that is a detailed log of changes. Referring to (2), input file names must be of the form "*.nd" where "*" is any group of characters that make a legal file name in your computer's operating system. The extension ".nd" is required. Referring to (3), an output file is created for each input file. Each output file name is the same as the input file name, except the extension is ".nd2". Output files contain variable-length records. Referring to (4), one summary log file named "3200log.summary2" is created. It shows the number of records read, deleted, modified, and output for each input file and station. Referring to (5), one detailed log file named "3200log.details2" is created. It shows each modified record before and after the changes. For deleted records, the original record is shown. Records with no changes do not appear in this log file. ------------------------------------------------------------------------------ OTHER USEFUL INFORMATION: Function prototypes appear in alphanumeric order in the "Function Prototypes" section. Functions appear in the following order in the code itself: (1) "Main()". (2) Specialized functions in alphanumeric order. (3) General-purpose functions in alphanumeric order. The phrase "significant obs" or "significant observation" is often seen in the comments. The basic meaning of this is "an edited obs, or an original obs without an edited obs following". Conversely, a "non-significant obs" is "an original obs with an edited obs following". Sometimes, a "significant obs" must also not be error-flagged. Variable names usually suggest what the variable represents. Names longer than a dozen characters are rare. Work/scratch variable names are usually quite short, often 1 letter. Names are composed of letters, underscores, and numbers. Letter case is significant: a. Global variable names always begin with an upper case letter, often a 'G'. Any other letters in the names are lower case. b. Global definition names and macro names never contain lower case letters. c. Ordinary variable names and function names never have upper case letters. Discussion of programming methods used is at the end of this file. */ /* Resource files. */ #include #include #include #include #include #include #include #include /* Definitions, constant */ #define DONT 0 #define FACTOR 5 #define MAXOBS 62 #define MAXRECS 30 #define NOSIGN 0 #define SIGN 1 #define MAXLEN ((MAXOBS*12) + 30) #define GNNONE -1 #define GNDYSW 0 #define GNEVAP 1 #define GNMNPN 2 #define GNMXPN 3 #define GNPRCP 4 #define GNSNOW 5 #define GNSNYZ 6 #define GNSNWD 7 #define GNSOYZ 8 #define GNSXYZ 9 #define GNTMAX 10 #define GNTMIN 11 #define GNTOBS 12 #define GNWDMV 13 #define GNWTEQ 14 /* Definitions, macro */ #define ERR_FLG(a) (strchr("23TU", *((a)+11)) != NULL) #define MAX(x,y) (((x) >= (y)) ? (x) : (y)) #define NBETWEEN(x,y,z) ((x) >= (y) && (x) <= (z)) #define SET_DATE(d) (((d) % 100 == 13) ? (d) + 88 : (d)) #define SIG_OBS1(a) (*((a)+1) != *((a)+13) || *(a) != *((a)+12)) #define SIG_OBS2(a) (SIG_OBS1(a) && !ERR_FLG(a)) /* Declarations, global variables */ FILE *Gfpout, *Gfplog0, *Gfplog1, *Gfpwork ; int Gnrec0, Gnrec, Gprcp_accum, Gsnow_accum, Gsnwd_accum, Gwteq_accum, Gwdmv_accum, Gevap_accum, Gin[15], Gout[15], Gmod[15], Gdel[15], Gelems[15][2], Gmean[15][12], Gsd[15][12], Gntypesfnd, Gndays[2][13] = {{0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}, {0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}} ; char *Gfilnam, Ginbuf[MAXLEN], Grec0[MAXRECS][MAXLEN], Grec[MAXRECS][MAXLEN], Earliest_date[7], Latest_date[7], Gtypesfnd[16][5], *Gtypes[] = { "DYSW", "EVAP", "MNPN", "MXPN", "PRCP", "SNOW", "SNYZ", "SNWD", "SOYZ", "SXYZ", "TMAX", "TMIN", "TOBS", "WDMV", "WTEQ" }, *Gstn = Ginbuf + 3, *Gdate = Ginbuf + 17, *Gda56 = Ginbuf + 21, *Gobs1 = Ginbuf + 30 ; /* Function prototypes */ int bad_chars(char *, int) ; void bldval(char *, int *, int), data_type_store(), check_time_place(char *, int, int), close_logs(), close_mainwork(), get_1_date(char *, int), get_all_stats() ; int get_data_type(char *), good_date(char *) ; void init1(), init2(), init3() ; int is_data_type_stored(char *), is_leap_year(int) ; void log_headers(char *, char *, char *), manage_block(char *), manage_detail_log(), manage_processing(), manage_summary_log(), mycsort(char **, int) ; int mygets(char *, int, FILE *) ; void mystrcpy(char *, char *) ; int numobs(char *) ; void open_logs(), open_mainwork() ; DIR *opndir (char *) ; FILE *opnfil(char *, char *) ; void num_recs_err(), prcp_accum(char *, int, int *) ; int prcp_snow_switched(char *, char *) ; void process_generic(int, int), process_prcp(int, int *), process_temp(int, int), qc_dysw() ; int qc_generic(char *, int) ; void qc_min_max(int, int, char, int) ; int qc_prcp(char *, int *), qc_prcp_AnoS(char *, int *, int), qc_prcp_SnoA(char *, int, int) ; void qc_prcp_snow_snwd_wteq(), qc_prcp_std0(char *, int) ; int qc_prcp_std1(char *, int), qc_prcp_std2(char *, int *), qc_prcp_zeroacc(char *, int *) ; void qc_snow_vs_prcp(), qc_snow_vs_snwd1(), qc_snow_vs_snwd2(), qc_snwd_vs_wteq() ; int qc_temp(char *, int, int), qc_temp2(char *) ; void read_err(int, char *, FILE *) ; int realmax(int), realmin(int) ; void rec_siz_err(), repair_minmax(char *, char *, int) ; int scale_err_1(char *, int, int), scale_err_2(char *, int), shift_event_1(char *) ; void write_1_date(), write_nobs(char *, int) ; /*----------------------------------------------------------------------------- *----------------------------------------------------------------------------- *----------------------------------------------------------------------------- *----------------------------------------------------------------------------- ****************************************************************************** * Main program. Open each input file and send it off to other functions that * do the actual processing. With no arguments, the program searches the * current directory for input files and processes all that end with the * characters ".nd". */ main() { struct dirent *dp ; DIR *dfd ; char **filnams ; int nelems ; register int i ; char gscomm[100] ; open_logs() ; /* Open the two log files */ filnams = (char **) malloc(sizeof (char **)) ; /* Open the current directory for reading. */ if ((dfd = opndir(".")) == NULL) return ; /* Read the names of directory listings, which are not all input file names. */ i = -1 ; while ((dp = readdir(dfd)) != NULL) { Gfilnam = dp->d_name ; /* Accept only names that end in ".nd" and store the names in an array. Only names of input files end with those characters. */ if (strcmp(Gfilnam + strlen(Gfilnam) - 3, ".nd") == 0) { filnams = (char **) realloc(filnams, (++i + 1) * sizeof(char **)) ; filnams[i] = (char *) malloc(strlen(Gfilnam) + 2) ; strcpy (filnams[i], Gfilnam) ; } } nelems = i + 1 ; /* And here is the whole point of having an array of file names: To put the file names in alphanumeric order before processing. */ mycsort(filnams, nelems) ; /* Loop to process the files. */ for (i = 0 ; i < nelems ; i++ ) { Gfilnam = filnams[i] ; manage_processing() ; sprintf(gscomm, "compress %s\n", Gfilnam) ; system(gscomm) ; /* unlink(Gfilnam) ; */ } close_logs() ; close_mainwork() ; } /*^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ * Group of specialized functions (written specifically for this program). */ /****************************************************************************** * Evaluate the data field of an observation for bad characters. If found, * return 1; else return 0. The obs is 12 characters long and starts at * address "s". The data field is characters 5 through 10. */ int bad_chars(register char *s, int sign) { int ierr ; char *z ; ierr = 0 ; s += 4 ; /* Address of arithmetic sign. */ /* Check the arithmetic sign. */ switch (*s) { /* Space and '-' are the usual values of the arithmetic sign, but don't allow '-' if "sign" is false. */ case '-': if (!sign) ierr = 1 ; break ; case ' ': case '0': case '+': break ; default: ierr = 1 ; /* Anything else is an error. */ break ; } /* If the sign is OK, check the data field. Any non-digit is an error. */ if (!ierr) for (z = ++s + 5 ; s < z ; s++) if (!isdigit(*s)) { ierr = 1 ; break ; } return ierr ; } /****************************************************************************** * Fill integer array 'arr[]', which has 31 elements, with integer values, * using significant obs values from record 'rec'. The 31 elements represent * the significant obs for the month indicated by 'rec'. Use -1 in each * element that does not have a corresponding non-error-flagged significant * obs in 'rec'. */ void bldval(char *rec, int arr[], int nobs) { int day, val ; register int i ; register char *a, *rec30 ; /* * Initialize all 31 'arr[]' element values to -1. */ for (i = 0 ; i < 31 ; i++) arr[i] = -1 ; rec30 = rec + 30 ; /* * Look at all obs in 'rec' and put an integer version of the data value into * 'arr[]' when appropriate. */ for (i = 0 ; i < nobs ; i++) { a = rec30 + (i*12) ; /* * The current obs is bypassed if it is non-significant, or has error * flags, or has a non-numeric data value, or has a data value of "99999". */ if (!SIG_OBS2(a) || !sscanf(a+4, "%6d", &val) || val == 99999) continue ; /* * Load 'arr[]' with 'val' when 'val' is non-negative. 'Day' is the day * number of the obs; 'day'-1 is the element number of 'arr[]' to be * loaded. 'Arr[]' has 31 elements to accomodate 31-day months; for * shorter months the last one, two or three elements are not loaded. */ if (val >= 0) if (sscanf(a, "%2d", &day)) arr[day-1] = val ; } } /****************************************************************************** * Check SNOW, SNWD, or WTEQ record 'rec' for frozen precip events in the U.S. * in times and places in which they could not possibly occur. 'Rec' has * 'nobs' observations and is of data type 'ntyp', which is 5 (SNOW), 7 * (SNWD), or 14 (WTEQ). (A separate function deals with possible data value * scale factor problems.) */ void check_time_place(char *rec, int nobs, int ntyp) { int casenum, val ; char state[3], month[3] ; static char snow_ever[] = "02,04,05,10,24,26,29,35,42,45,48,50", snow_rare[] = "01,08,16,22,38,66,67,91", summer[] = "06,07,08" ; register int i, limit ; register char *a, *rec30 ; /* * Characters 4-5 of 'rec' is a two-digit numeric code that represents a U.S. * state or region. For programming convenience, some of the state codes are * put in variables: * * 'Snow_ever' contains codes for the states of Alaska, Arizona, California, * Colorado, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, * and Wyoming. These are states where some stations are cold and snowy due * to high mountains or a far northern location. These stations can have * significant snow at any time. * * 'Snow_rare' is the states or regions of Alabama, Florida, Louisiana, * Mississippi, South Carolina, Puerto Rico, U.S. Virgin Islands, and U.S. * Pacific islands. These are warm areas, where the coldest stations rarely, * if ever, have snow, although hail is possible. * * Note that only the coldest stations in a state have been mentioned. 'Rec' * contains only the state/region of a station, not its elevation or latitude * or other information that might distinguish the station's climate from * another station's in the same state. As a result, each state must be * treated like its coldest and snowiest sites. This is why, for example, * Hawaii is not found in the 'snow_rare' group; it has 13,000 ft. volcanoes * that receive significant winter snow. * * 'Summer' is the warmest 3 months of the year: June, July, August. */ *state = *(rec+3) ; *(state+1) = *(rec+4) ; *(state+2) = '\0' ; /* * In 'snow_ever' states, significant snow anytime is possible; 'rec' does not * need to be tested. */ if (strstr(snow_ever, state) != NULL) return ; *month = *(rec+21) ; *(month+1) = *(rec+22) ; *(month+2) = '\0' ; /* * Most states can have large amounts of frozen precipitation during the colder * months. Test for this case by asking if 'month' is not a summer month and * if 'state' is not in the 'snow_rare' group. */ if (strstr(summer, month) == NULL && strstr(snow_rare, state) == NULL) return ; /* * For any other combination of 'month' and 'state', a limited amount of snow * or hail is possible, but it can't be a large amount. Set an upper limit * for a frozen precip data value in 'rec', based on 'ntyp'. */ limit = (ntyp == 5) ? 99 : 10 ; rec30 = rec + 30 ; for (i = 0 ; i < nobs ; i++) { a = rec30 + (i * 12) ; if (!SIG_OBS2(a) || !sscanf(a+5, "%5d", &val) || val == 99999) continue ; if (val > limit) { *(a+11) = 'T' ; *rec = 'M' ; } } } /****************************************************************************** * Close the log files. */ void close_logs() { fclose(Gfplog0) ; fclose(Gfplog1) ; } /****************************************************************************** * Close and delete the main work file. */ void close_mainwork() { fclose(Gfpwork) ; unlink("3200work.dat") ; } /***************************************************************************** * Store the data type of the current record in "Gtypesfnd". */ void data_type_store() { register int i ; i = get_data_type(Ginbuf + 11) ; if (NBETWEEN(i, 0, 14)) { (Gin[i])++ ; if (!is_data_type_stored(Gtypes[i])) strcpy(Gtypesfnd[Gntypesfnd++], Gtypes[i]) ; } } /****************************************************************************** * Given a date "date", find records for that date in general work file * "Gfpwork" and write them to array "Grec0". In the process, find the start * and stop element numbers for each data type and place those in array * "Gelems". */ void get_1_date(char *cdate, int init) { static char save[MAXLEN] ; int jlast, gnrec0 ; char *ginbuf, *gdate ; FILE *gfpwork ; register int j, icomp ; gfpwork = Gfpwork ; ginbuf = Ginbuf ; gdate = Gdate ; gnrec0 = j = 0 ; /* "Init" indicates whether work file "gfpwork" should be rewound and the first record read, or if, instead, reading is ongoing and a previously read record, stored in "save", should be brought up. */ if (init) { rewind(gfpwork) ; mygets(ginbuf, MAXLEN, gfpwork) ; } else { strcpy(ginbuf, save) ; } /* Search file "gfpwork" for records with date equal to "cdate". When found, copy them to array "Grec0". "Gelems" keeps track of the element numbers of the first and last records of each data type in "Grec0". The dates are in increasing order; when dates become later than "cdate", exit the loop. */ do { if ((icomp = strncmp(gdate, cdate, 6)) == 0) { if (gnrec0 >= MAXRECS) num_recs_err() ; jlast = j ; /* The record must be of a known data type. */ if ((j = get_data_type(ginbuf + 11)) > -1) { if (j != jlast && Gelems[j][1] == -1) Gelems[j][0] = gnrec0 ; Gelems[j][1] = gnrec0 ; strcpy(Grec0[gnrec0++], ginbuf) ; } } } while (icomp <= 0 && mygets(ginbuf, MAXLEN, gfpwork)) ; /* Save the number of records copied to "Grec0" and store the last record read in "save". */ Gnrec0 = gnrec0 ; strcpy(save, ginbuf) ; } /****************************************************************************** * Compute the means and sample standard deviations for the temperature data of * type "temp" in work file "Gfpwork". */ void get_all_stats() { register int i, j ; int ityp, nobs, mn, nn, val, n[15][12] ; char *a, *ginbuf, *gda56, *gobs1 ; FILE *gfpwork ; long sum[15][12] ; long long sumsq[15][12] ; long double dsum0, dm, d1 ; ginbuf = Ginbuf ; gfpwork = Gfpwork ; gda56 = Gda56 ; gobs1 = Gobs1 ; /* Initialization of means, standard deviations and temporary variables used during their computation. */ for (i = 2 ; i <= 12 ; i++) { switch (i) { case GNPRCP: case GNSNOW: case GNSNWD: continue ; } for (j = 0 ; j < 12 ; j++) { Gmean[i][j] = Gsd[i][j] = -1 ; n[i][j] = sum[i][j] = sumsq[i][j] = 0 ; } } /* Read all records in the work file. */ rewind (gfpwork) ; while (mygets(ginbuf, MAXLEN, gfpwork)) { ityp = get_data_type(ginbuf + 11) ; /* Ignore a record where the data type is not one of the temperatures. */ switch (ityp) { case GNNONE: case GNDYSW: case GNEVAP: case GNPRCP: case GNSNOW: case GNSNWD: case GNWDMV: case GNWTEQ: continue ; } sscanf(gda56, "%2d", &val) ; /* Also ignore a record with an invalid month. */ if (!NBETWEEN(val, 1, 12)) continue ; mn = val - 1 ; nobs = numobs(ginbuf) ; /* Number of observations. */ /* Initialize "a" to point to the day of the first obs; on each loop iteration, move "a" to the day of the next obs. */ for (i = 0 ; i < nobs ; i++) { a = gobs1 + (i * 12) ; /* Use this obs if it is a significant obs without error flags and if it has a numeric temperature value other than 99999. */ if (SIG_OBS2(a) && sscanf(a+4, "%6d", &val) && val != 99999) { /* The sum of squares, especially, may involve big numbers. */ (n[ityp][mn])++ ; /* Number of obs used. */ val += 460 ; /* Convert temperature to Rankine. */ sum[ityp][mn] += val ; /* Running sum. */ sumsq[ityp][mn] += val * val ; /* Running sum of squares. */ } } } /* Having gathered data values from the record, compute the mean and sample standard deviation from the number of obs "n", the sum of obs "sum", and the sum of squares of obs "sumsq". Calculate the mean and the sample standard deviation. */ for (i = 2 ; i <= 12 ; i++) { switch (i) { case GNPRCP: case GNSNOW: case GNSNWD: continue ; } for (j = 0 ; j < 12 ; j++) { nn = n[i][j] ; /* But do it only if "n[i][j]" shows that at least 15 obs were summed. If not, leave -1 in both. */ if (nn >= 15) { /* There are a goodly number of arithmetic type conversions in the following lines; to avoid problems, the lines are simple. See the variable declarations at the start of this function. */ /* Save the type "long" running sum of Rankine temperatures in type "long double" 'dsum0'. */ dsum0 = sum[i][j] ; /* Compute the arithmetic mean (sum/n) of Rankine temperatures and store it in type "long double" variable 'dm'. */ dm = dsum0 / nn ; /* Create a Fahrenheit mean from 'dm', then save it in type "long double" scratch variable 'd1'. */ d1 = dm - 460.0; /* Convert the type "long double" value of 'd1' to a rounded integer and store it in type "int" 'Gmean[i][j]'. */ Gmean[i][j] = (d1 < 0.0) ? d1 - 0.5 : d1 + 0.5 ; /* Multiply 'dm' by 'dsum0' ((sum*sum)/n) and store it in 'd1'. */ d1 = dm * dsum0 ; /* Now use the type "long long" running sum of squared Rankine temperatures to compute (sumsq-(sum*sum)/n)/(n-1) and store it in 'd1'. */ d1 = (sumsq[i][j] - d1) / (nn - 1) ; /* Compute the square root of 'd1', which is the sample standard deviation. Convert it to a rounded positive integer and put it in 'Gsd[i][j]'. */ Gsd[i][j] = sqrt(d1) + 0.5 ; } } } } /****************************************************************************** * String 'typ' begins with 4 characters which are a data type. Return that * type in the form of an integer. */ int get_data_type(char *typ) { int dt ; register char *a ; a = typ ; dt = GNNONE ; switch (*a) { case 'D': /* D??? */ if (strncmp(++a, "YSW", 3) == 0) dt = GNDYSW ; /* DYSW */ break ; case 'E': /* E??? */ if (strncmp(++a, "VAP", 3) == 0) dt = GNEVAP ; /* EVAP */ break ; case 'M': /* M??? */ if (strncmp(++a, "NPN", 3) == 0) dt = GNMNPN ; /* MNPN */ else if (strncmp(a, "XPN", 3) == 0) dt = GNMXPN ; /* MXPN */ break ; case 'P': /* P??? */ if (strncmp(++a, "RCP", 3) == 0) dt = GNPRCP ; /* PRCP */ break ; case 'S': /* S??? */ switch (*++a) { case 'N': /* SN?? */ if (strncmp(++a, "OW", 2) == 0) dt = GNSNOW ; /* SNOW */ else if (strncmp(a, "WD", 2) == 0) dt = GNSNWD ; /* SNWD */ else if (isdigit(*a++) && isdigit(*a)) dt = GNSNYZ ; /* SNyz */ break ; case 'O': /* SO?? */ if (isdigit(*++a) && isdigit(*++a)) dt = GNSOYZ ; /* SOyz */ break ; case 'X': /* SX?? */ if (isdigit(*++a) && isdigit(*++a)) dt = GNSXYZ ; /* SXyz */ break ; } break ; case 'T': /* T??? */ if (strncmp(++a, "MAX", 3) == 0) dt = GNTMAX ; /* TMAX */ else if (strncmp(a, "MIN", 3) == 0) dt = GNTMIN ; /* TMIN */ else if (strncmp(a, "OBS", 3) == 0) dt = GNTOBS ; /* TOBS */ break ; case 'W': /* W??? */ if (strncmp(++a, "DMV", 3) == 0) dt = GNWDMV ; /* WDMV */ else if (strncmp(a, "TEQ", 3) == 0) dt = GNWTEQ ; /* WTEQ */ break ; } return dt ; } /****************************************************************************** * Character string "dat" supposedly begins with a 6-digit date. If so, return * 1. Otherwise, return 0. */ int good_date(char *dat) { int yr, mn, count ; count = sscanf(dat, "%4d%2d", &yr, &mn) ; return (count == 2 && NBETWEEN(yr, 0, 9999) && NBETWEEN(mn, 1, 12)) ; } /****************************************************************************** * Initialization of variables before beginning a new input file. */ void init1() { Gprcp_accum = Gsnow_accum = Gsnwd_accum = Gwteq_accum = Gwdmv_accum = Gevap_accum = 0 ; Gtypesfnd[15][0] = '\0' ; } /****************************************************************************** * Initialization of variables before processing a new block of records from * one input file. */ void init2() { register int i ; for (i = 0 ; i < 15 ; i++) { Gin[i] = Gout[i] = Gmod[i] = Gdel[i] = 0 ; Gtypesfnd[i][0] = '\0' ; } Gntypesfnd = 0 ; strcpy(Earliest_date, "999999") ; strcpy(Latest_date, "000000") ; } /****************************************************************************** * Initialization of variables before processing a new day of records. */ void init3() { register int i ; for (i = 0 ; i < 15 ; i++) { Gelems[i][0] = 0 ; Gelems[i][1] = -1 ; } } /****************************************************************************** * See if "Gtypesfnd" contains data type "typ". If so, return 1. If not, * return 0. */ int is_data_type_stored(register char *typ) { register int i ; register char *p ; for (i = 0, p = Gtypesfnd[0] ; *p ; p = Gtypesfnd[++i]) if (strncmp(p, typ, 4) == 0) return 1 ; return 0 ; } /****************************************************************************** * Just a place to put some repeated code. */ void log_headers(char *str1, char *str2, char *str3) { char fmt[] = "\n\n\n%s\n%s %s being processed\n\n" ; fprintf(Gfplog0, fmt, str1, str2, str3) ; fprintf(Gfplog1, fmt, str1, str2, str3) ; } /****************************************************************************** * Manage the processing of a block of records, containing many data types, for * one station. They are handled one date at a time. */ void manage_block(char *stn) { int d, m0, sd0, mn, init, date1 ; char cdate[7] ; register int i, j, date, date2 ; sscanf(Earliest_date, "%d", &d) ; date1 = d ; sscanf(Latest_date, "%d", &d) ; date2 = d ; /* Compute means and standard deviations for all types read and all months. */ get_all_stats() ; /* Advance through each year-month between "Earliest_date" and "Latest_date". Use "get_1_date()" to find any records in work file "Gfpwork" with that date and write them to arrays "Grec0[]" and "Grec[]". Then use the various "process_..." functions to qc the records stored in "Grec[]". */ init = 1 ; for (date = date1 ; date <= date2 ; date = SET_DATE(date+1)) { mn = date % 100 ; sprintf(cdate, "%6.6d", date) ; init3() ; /* Write all records for this station for the current date in array "Grec0[]". "Gnrec0" is the total number of records in "Grec0[]". */ get_1_date(cdate, init) ; init = 0 ; if ((j = Gnrec0) == 0) continue ; /* Create a working array "Grec[]" and number of records "Gnrec" which initially are the same as "Grec0[]" and "Gnrec0". */ for (i = 0 ; i < j ; i++) strcpy(Grec[i], Grec0[i]) ; Gnrec = j ; /* Process one year-month of each data type, doing quality control and making changes as necessary to working array "Grec". The names of global integers like 'GNPRCP' indicate the data types that each function processes. */ process_generic(GNDYSW, NOSIGN) ; qc_dysw() ; process_prcp(GNEVAP, &Gevap_accum) ; process_generic(GNMNPN, SIGN) ; process_generic(GNMXPN, SIGN) ; process_prcp(GNPRCP, &Gprcp_accum) ; process_prcp(GNSNOW, &Gsnow_accum) ; process_prcp(GNSNWD, &Gsnwd_accum) ; process_generic(GNSNYZ, SIGN) ; process_generic(GNSOYZ, SIGN) ; process_generic(GNSXYZ, SIGN) ; process_temp(GNTMAX, mn) ; process_temp(GNTMIN, mn) ; process_temp(GNTOBS, mn) ; process_prcp(GNWDMV, &Gwdmv_accum) ; process_prcp(GNWTEQ, &Gwteq_accum) ; qc_min_max(GNMNPN, GNMXPN, 'L', DONT) ; qc_min_max(GNSNYZ, GNSXYZ, 'L', DONT) ; qc_min_max(GNSNYZ, GNSOYZ, 'M', DONT) ; qc_min_max(GNSOYZ, GNSXYZ, 'M', DONT) ; qc_min_max(GNTMIN, GNTMAX, 'L', DONT) ; qc_min_max(GNTMIN, GNTOBS, 'M', DONT) ; qc_min_max(GNTOBS, GNTMAX, 'M', DONT) ; qc_prcp_snow_snwd_wteq() ; qc_snow_vs_snwd1() ; qc_snow_vs_snwd2() ; qc_snow_vs_prcp() ; qc_snwd_vs_wteq() ; /* Write working array "Grec" to the output file. */ write_1_date() ; /* Compare "Grec" to "Grec0" and write the results to the detailed log. */ manage_detail_log() ; } /* After the whole station has been done, write counts to the summary log. */ manage_summary_log() ; } /****************************************************************************** * Manage writes to the detailed log file, using the contents of arrays * 'Grec0' and 'Grec'. */ void manage_detail_log() { register int j, i ; /* Look at each record in array "Grec[]". The straightforward method is the single loop "for (i = 0 ; i < Gnrec ; i++)". But we need to increment 'Gmod' and 'Gdel'. To do that, we need to know what is the current data type. A method using two nested loops is used. */ for (j = 0 ; j < 15 ; j++) { for (i = Gelems[j][0] ; i <= Gelems[j][1] ; i++) { /* * Check the first character in record 'Grec[i]'. */ switch (Grec[i][0]) { /* * 'D' means the record is unchanged since it was copied there. */ case 'D': break ; /* * 'M' means the record was modified. */ case 'M': ++(Gmod[j]) ; fprintf(Gfplog1, "%s\n%s\n%c%s\n", "This record was changed from/to:", Grec0[i], 'D', Grec[i] + 1) ; fprintf(Gfplog1, "\n") ; break ; /* * A null character '\0' means the record was "deleted". */ case '\0': ++(Gdel[j]) ; /* * "DYSW" records were deleted because their contents were * used in another record. Message to this effect. */ if (j == 0) fprintf(Gfplog1, "%s\n%s\n", "This record used to " "modify another record, then deleted:", Grec0[i]) ; /* * Non-"DYSW" records. */ else fprintf(Gfplog1, "%s\n%s\n", "This record deleted:", Grec0[i]) ; fprintf(Gfplog1, "\n") ; break ; } } } } /****************************************************************************** * Manage processing of one input file. */ void manage_processing() { static char save[MAXLEN] ; char filnam[200], laststn[7], dash[] = "--------------------------------" "-----------------------------------" ; register int nread, n ; register char *ginbuf, *gdate, *gstn, *earliest_date, *latest_date, *gfilnam ; FILE *gfpin, *gfpout, *gfpwork, *gfplog0 ; /* Open work file for overwrite. */ open_mainwork() ; /* Open input file. */ gfilnam = Gfilnam ; if ((gfpin = opnfil(gfilnam,"r")) == NULL) return ; /* Print input file message lines to the two log files and the screen. */ log_headers(dash, "File", gfilnam) ; /* printf("\nFile %s being processed\n", gfilnam) ; */ /* Create an output file name and attempt to open a file with that name for writing. */ strcpy(filnam, gfilnam) ; strcpy(filnam + strlen(filnam) - 3, ".nd2") ; if ((Gfpout = opnfil(filnam, "w")) == NULL) return; /* Initializations. */ ginbuf = Ginbuf ; gstn = Gstn ; gdate = Gdate ; earliest_date = Earliest_date ; latest_date = Latest_date ; gfpwork = Gfpwork ; gfplog0 = Gfplog0 ; gfpout = Gfpout ; *laststn = '\0' ; *(laststn+6) = '\0' ; init1() ; init2() ; /* Each loop iteration reads one record from an input file. Records for any one station are collected in the work file pointed to by "gfpwork". When a record is read for another station, or EOF occurs, the collected records are processed. */ while ((nread = mygets(ginbuf, MAXLEN, gfpin)) > 0) { if (nread >= MAXLEN) { read_err(MAXLEN, gfilnam, gfplog0) ; return ; } /* Make sure this record is labeled daily-resolution, element-style. */ if (strncmp(ginbuf, "DLY", 3) != 0) continue ; /* When a new station's record is read AND the new station is not the first station, process the previous station's records. */ if (strncmp(laststn, gstn, 6) != 0 && *laststn) { log_headers("...", "Station", laststn) ; strcpy(save, ginbuf) ; manage_block(laststn) ; fclose (gfpwork) ; open_mainwork() ; init2() ; strcpy(ginbuf, save) ; } /* Discard this record if the date is bad. If good, is it the earliest or latest date read so far? Update "earliest_date" or "latest_date" if appropriate. */ if (!good_date(gdate)) continue ; if (strncmp(earliest_date, gdate, 6) > 0) strncpy(earliest_date, gdate, 6) ; if (strncmp(latest_date, gdate, 6) < 0) strncpy(latest_date, gdate, 6) ; /* Track the data types read. */ data_type_store() ; /* Copy the current station number into "laststn", which is used at an earlier point in this "while()" loop to check for a new station. */ strncpy(laststn, gstn, 6) ; /* Write the record to the work file. */ fprintf(gfpwork, "%s\n", ginbuf) ; } fclose(gfpin) ; log_headers("...", "Station", laststn) ; manage_block(laststn) ; /* Do the last station. */ fclose(gfpwork) ; } /****************************************************************************** * Manage writing to the summary log. All records are from one station. */ void manage_summary_log() { register int i ; register char *p ; static char inpu[] = " records read : ", outp[] = " records written : ", mods[] = " records modified: ", dels[] = " records deleted : ", fmt[] = "%s%s%6d\n%s%s%6d\n%s%s%6d\n%s%s%6d\n\n" ; /* For each data type ("for()" statement), if there were any records for the data type ("if()" statement), print the summary lines for the data type. */ for (i = 0 ; i < 15 ; i++) { if (Gin[i]) { p = Gtypes[i] ; fprintf(Gfplog0, fmt, p, inpu, Gin[i], p, dels, Gdel[i], p, mods, Gmod[i], p, outp, Gout[i]) ; } } } /****************************************************************************** * Return the number of observations from columns 28-30 in TD-3200 record * "rec". */ int numobs(char *rec) { int nobs ; sscanf(rec+27, "%3d", &nobs) ; return nobs ; } /****************************************************************************** */ void num_recs_err() { fprintf(Gfplog0, "Too many records for one date near \n\n%s\n in file %s\n", Ginbuf, Gfilnam) ; exit(-1) ; } /****************************************************************************** * Open the log files. */ void open_logs() { Gfplog0 = fopen("3200log.summary2", "w") ; Gfplog1 = fopen("3200log.details2", "w") ; if (Gfplog0 == NULL || Gfplog1 == NULL) { printf("Cannot open log files.\n") ; exit(-1) ; } } /****************************************************************************** * Open the main work file. */ void open_mainwork() { if ((Gfpwork = fopen("3200work.dat", "w+")) == NULL) { printf("Cannot open work file.\n") ; exit(-1) ; } } /****************************************************************************** * Track ongoing accumulation status ("accum") through a month of precipitation * data whose first obs is at "reca_obs1". Goal is to set or reset "accum". */ void prcp_accum(register char *reca_obs1, register int nobs, register int *accum) { register int i ; register char *a ; /* "Accum" can be either 0 or 1 initially. 1 indicates an ongoing accumulation carried over from the previous month, so 0 should be by far the most common initial value. "Accum" is changed as flag 1 == 'S', 'A', or 'B'. */ for (i = 0 ; i < nobs ; i++) { a = reca_obs1 + (i * 12) ; if (!SIG_OBS2(a)) /* Skip non-significant obs and errors. */ continue ; if (*accum) { if (*(a+10) == 'A' || *(a+10) == 'B') *accum = 0 ; } else { if (*(a+10) == 'S') *accum = 1 ; } } } /****************************************************************************** * Return 1 if precipitation record "reca" and the snow records "recb" for the * same station and date are reversed, and return 0 otherwise. */ int prcp_snow_switched(char *reca, char *recb) { register char *a, *b ; register int i, j ; char *reca_obs1, *recb_obs1 ; int snowval, prcpval, icomp, nobsa, nobsb, nbigger, ncompared, nnzsnow, nnzprcp ; nobsa = numobs(reca) ; nobsb = numobs(recb) ; reca_obs1 = reca + 30 ; recb_obs1 = recb + 30 ; nbigger = ncompared = nnzsnow = nnzprcp = 0 ; /* "Reca" is the precip record and "recb" is the snow record. */ /* Loop through all obs in precipitation record "reca". Note "!nbigger". If "nbigger" != 0 -- that is, if even one snow obs is bigger than the precip obs for the same day, we assume the records are not reversed, and testing stops. */ for (i = j = 0 ; i < nobsa && !nbigger ; i++) { a = reca_obs1 + (i * 12) ; /* Ignore non-significant precip obs and any obs with errors. */ if (!SIG_OBS2(a)) continue ; /* Ignore any precip obs with a flag 1 of 'S' or with a zero or missing value. */ if (*(a+10) == 'S' || strncmp(a+5, "00000", 5) == 0 || strncmp(a+5, "99999", 5) == 0) continue ; /* Keep a running count of the number of nonzero precip values. */ nnzprcp++ ; /* Loop through all obs in snow record "recb". This loop is designed to stay in day synchronization with the "reca" loop. Same comment about "!nbigger" applies to this loop. */ for ( ; j < nobsb && !nbigger ; j++) { b = recb_obs1 + (j * 12) ; /* Compare dates. If "a" has an earlier day than "b", break the inner loop to get the next precip obs in the outer loop. */ if ((icomp = strncmp(a, b, 2)) < 0) break ; /* Skip the rest of the loop if "icomp" == 0 -- that is, if "a" and "b" are obs for same day. */ if (icomp == 0) { /* Ignore non-significant snow obs, and any snow obs with an error flag. */ if (!SIG_OBS2(b)) continue ; /* Ignore any snow obs with a flag 1 of 'S', or with a zero or missing data value. */ if (strncmp(b+5, "00000", 5) == 0 || strncmp(b+5, "99999", 5) == 0 || *(b+10) == 'S') continue ; /* Arrive here if "a" and "b" can be compared. Keep running counts of the number of nonzero snow values and of how many obs are compared. Convert the obs data values to integers. */ nnzsnow++ ; ncompared++ ; sscanf(a+4, "%6d", &prcpval) ; sscanf(b+4, "%6d", &snowval) ; /* Units of snow are 0.1 in and of precip are 0.01 in. For normal records, the snow value is expected to be larger than the precip value after allowing for the element format units difference. There may be cases where both snow and liquid precip occur and the snow amount is smaller, but this should not happen often. If the snow value is bigger, bump "nbigger" and exit the loops. */ if (snowval >= prcpval/10) nbigger++ ; } } } /* A real precip record has as many or more non-zero values as a real snow record for the same station and date. If so, return 0. If not, examine "ncompared" and "nbigger". */ return (nnzprcp >= nnzsnow || ncompared == 0) ? 0 : !nbigger ; } /****************************************************************************** * Manage processing of records of any data type that uses only "qc_generic()" * for quality control. */ void process_generic(int t, int sign) { register int i ; for (i = Gelems[t][0] ; i <= Gelems[t][1] ; i++) qc_generic(Grec[i], sign) ; } /****************************************************************************** * Manage processing of any of temperature data types "TMAX", "TMIN", or "TOBS". */ void process_temp(register int t, int mo) { register int i ; mo-- ; /* Call several quality control functions for each record "i". */ for (i = Gelems[t][0] ; i <= Gelems[t][1] ; i++) { qc_generic(Grec[i], SIGN) ; qc_temp(Grec[i], Gmean[t][mo], Gsd[t][mo]) ; qc_temp2(Grec[i]) ; } } /****************************************************************************** * Manage processing of any data type that does not have negative data values * and may have accumulations. As the name "process_prcp" implies, this * function was originally written for precipitation data types. */ void process_prcp(register int t, int *paccum) { register int i ; /* Call two quality control functions for each record "i". */ for (i = Gelems[t][0] ; i <= Gelems[t][1] ; i++) { qc_generic(Grec[i], NOSIGN) ; qc_prcp(Grec[i], paccum) ; } } /****************************************************************************** * Do quality control on DYSW data records for one station for one date. This * function standardizes the format. */ void qc_dysw() { register int i, first, last ; int nobsa0, nobsa, nobsb0, nobsb, imark, icomp, change1, change2 ; register char *a, *b ; char *reca, *recb, *lasta, *lastb, *reca_obs1, *recb_obs1 ; imark = change1 = change2 = nobsa0 = nobsa = nobsb0 = nobsb = 0 ; /* Look at DYSW records in "Grec" that have the same date. This qc function is different from other qcs because multiple records for one date are legal here. */ first = Gelems[GNDYSW][0] ; last = Gelems[GNDYSW][1] ; if (last == -1) return ; for (i = first ; i <= last ; i++) /* Process DYSW records */ { reca = Grec[i] ; /* The standard DYSW data field format is "0AABB", where "AA" and "BB" are codes for the first and second weather events of the day. If there was only one weather event, the format should look like "0AA00". If no weather event, "00000". But never "000AA". Convert any occurrence of "000AA" to "0AA00". (The first "if()" reduces the number of duplicate calls to this function.) */ if (first == last || i < last) if (shift_event_1(reca)) change1 = 1 ; /* If there is not another DYSW record after "reca", skip the rest of the code in the "for()" loop. */ if (i == last) continue ; /* When there is another DYSW record, run it through the same function. */ recb = Grec[i+1] ; if (shift_event_1(recb)) change2 = 1 ; nobsa0 = numobs(reca) ; nobsb0 = numobs(recb) ; /* It's convenient to let "reca" be the longer record. If it isn't the longer record now, make it so. */ if (nobsb0 > nobsa0) { a = recb ; recb = reca ; reca = a ; nobsa = nobsa0 ; nobsa0 = nobsb0 ; nobsb0 = nobsa ; icomp = change1 ; change1 = change2 ; change2 = icomp ; } /* Initialization. */ reca_obs1 = reca + 30 ; recb_obs1 = recb + 30 ; lasta = reca_obs1 + ((nobsa0-1) * 12) ; lastb = recb_obs1 + ((nobsb0-1) * 12) ; a = reca_obs1 ; b = recb_obs1 ; nobsa = nobsa0 ; nobsb = nobsb0 ; /* Loop through all obs in "reca" and "recb". */ while (a <= lasta && b <= lastb) /* Process obs. */ { /* If the day at "a" is earlier than the day at "b", move "a" to the next later obs. */ if ((icomp = strncmp(a, b, 2)) < 0) { a += 12 ; } /* If the day at "a" is the same as the day at "b", and "b" is OK, transfer the "b" events into "a" if "a" has unused space (code="00"). Blank any transferred events in "b". "Imark" is the number of transfers per obs: 0, 1, or 2. */ else if (icomp == 0) { if (*(b+11) == '3' || *(b+11) == '2') { b += 12 ; continue ; } imark = 0 ; if (*(a+6) == '0' && *(a+7) == '0') { strncpy(a, b, 12) ; strncpy(b+4, " 00000 ", 8) ; imark = 2 ; } else if (*(a+8) == '0' && *(a+9) == '0') { *(a+8) = *(b+6) ; *(a+9) = *(b+7) ; *(b+6) = *(b+8) ; *(b+7) = *(b+9) ; *(b+8) = '0' ; *(b+9) = '0' ; imark = 1 ; } /* There is one possible original-edited obs occurrence that needs special handling. This is when the obs at "a" is original without a following edited obs at "a"+12, AND "b" is original WITH a following edited obs at "b"+12. The next code line checks for this possibility. */ if (SIG_OBS1(a) && (a == reca_obs1 || SIG_OBS1(a-12)) && !SIG_OBS1(b)) { /* If no event was transferred from "b" to "a" earlier, no action is needed. Otherwise, open a 12-character gap at "a"+12. Then transfer the edited "b" obs into the gap, imitating the earlier transfer. */ b += 12 ; if (imark && *(b+11) != '3') { if (nobsa >= MAXOBS) rec_siz_err() ; nobsa++ ; a += 12 ; mystrcpy(a+12, a) ; lasta += 12 ; if (imark == 1) { *(b+6) = *(b+8) ; *(b+7) = *(b+9) ; *(b+8) = '0' ; *(b+9) = '0' ; strncpy(a, b, 12) ; } else { strncpy(a, b, 12) ; strncpy(b+4, " 00000 0", 8) ; } } } a += 12 ; b += 12 ; if (imark) change1 = change2 = 1 ; } /* If the day of the obs at "a" is greater than the day of the obs at "b", copy "b" obs into "a" and blank them in "b". */ else { if (nobsa >= MAXOBS) rec_siz_err() ; nobsa++ ; mystrcpy(a+12, a) ; lasta += 12 ; strncpy(a, b, 12) ; strncpy(b+4, " 00000 0", 8) ; a += 12 ; b += 12 ; change1 = change2 = 1 ; } } /* End process obs. */ /* If b <= lastb is still true, "reca" is finished but there are obs in "recb" remaining to be appended onto the end of "reca". */ while (b <= lastb) { if (nobsa >= MAXOBS) rec_siz_err() ; nobsa++ ; strncpy(a, b, 12) ; *(a+12) = '\0' ; lasta += 12 ; strncpy(b+4, " 00000 0", 8) ; a += 12 ; b += 12 ; change1 = change2 = 1 ; } /* If "reca" was modified, mark it so. */ if (change1) *reca = 'M' ; if (nobsa0 > nobsa) write_nobs(reca, nobsa) ; /* Delete all obs in "recb" that are empty of events as a result of the transfer. Bypass non-significant obs; but if an edited is deleted, its original becomes significant and must be checked. */ for (b = recb_obs1, icomp = 1 ; b <= lastb ; b += 12) { if (SIG_OBS1(b)) { if (strncmp(b+5, "0000", 4) == 0) { mystrcpy(b, b+12) ; nobsb-- ; lastb-- ; change2 = 1 ; b -= (icomp) ? 12 : 24 ; } icomp = 1 ; } else { icomp = 0 ; } } /* If all obs in "recb" were deleted, "delete" the entire record. If all of the obs in "recb" were not deleted, change the number of obs as appropriate. */ if (nobsb0 > nobsb) { if (nobsb == 0) { *recb = '\0' ; change2 = 0 ; /* See next note. */ } else { write_nobs(recb, nobsb) ; change2 = 1 ; } } /* If "recb" was modified, mark it so. Note, as the note above suggested, that a "deleted" "recb" is not counted among the modified. */ if (change2) *recb = 'M' ; } /* End process DYSW records. */ } /****************************************************************************** * Do generic quality control on data record "reca". Check for invalid * characters in the data field of each obs and change original data flags or * remove bad edited observations as appropriate. If "sign" is false, a minus * sign in the arithmetic sign field is an invalid character. Correct an * arithmetic sign of "+" or "0" to " ". Check for invalid days in a month and * delete the offending obs. Return 0 if no errors were found, and 1 if so. */ int qc_generic(char *reca, int sign) { register int i ; register char *a ; char *reca_obs1 ; int day, yr, mo, nobs0, nobs, ret, nday ; nobs0 = nobs = numobs(reca) ; reca_obs1 = reca + 30 ; ret = 0 ; sscanf(reca + 17, "%4d%2d", &yr, &mo) ; nday = Gndays[is_leap_year(yr)][mo] ; /* Loop through all observations in "reca". */ for (i = 0 ; i < nobs ; i++) { a = reca_obs1 + (i * 12) ; /* Let "a" point to day of each obs. */ /* Is the day bad? Cases of April 31, February 30, etc. are known. If so, delete the obs and loop back to get the next obs. */ sscanf(a, "%2d", &day) ; if (!NBETWEEN(day, 1, nday)) { mystrcpy(a, a+12) ; i-- ; nobs-- ; ret = 1 ; continue ; } /* Is the arithmetic sign '0' or '+'? Either one is a harmless error (but an error nonetheless) that is corrected to ' '. */ if (*(a+4) == '0' || *(a+4) == '+') { *(a+4) = ' ' ; ret = 1 ; } /* Deal with an undocumented value of flag 1 of no known use. There are just a few of these. */ if (*(a+10) == 'C') { *(a+10) = ' ' ; ret = 1 ; } /* Is flag 2 blank? It's a harmless error (but an error nonetheless) that is corrected to '0'. */ if (*(a+11) == ' ') { *(a+11) = '0' ; ret = 1 ; } /* Is "a" an edited obs? It is if there is a previous obs at "a-12" that appears to be an original obs followed by an edited. */ if (i > 0 && !SIG_OBS1(a-12)) { /* It's remotely possible that there are several edited obs in succession. If so, delete all but the rightmost. */ while (!SIG_OBS1(a)) { mystrcpy (a, a+12) ; nobs-- ; ret = 1 ; } /* Exit if there are no more obs remaining as a result of the action of the above loop. */ if (i >= nobs) continue ; /* Now it is safe to process the current obs as edited. The obs is deleted if it is not truly a correction to the original obs. */ if (*(a+11) == '3' || *(a+11) == '2' || bad_chars (a, sign) || (strncmp(a+5, "99999", 5) == 0 && *(a+10) != 'S')) { mystrcpy (a, a+12) ; nobs-- ; i-- ; *(a-1) = '3' ; ret = 1 ; } } /* Is the current obs an original obs not followed by an edited? This is by far the most common case in TD-3200 data. Actually, the following just checks for a significant obs. But the initial "if" of this "else if" would have caught a significant obs that is an edited obs; a significant obs that is an original not followed by an edited is the only kind remaining. */ else if (SIG_OBS1(a)) { /* If this original obs already has '3' in flag 2, do nothing. Otherwise, check for bad characters or a '2' in flag 2, and if found, put '3' in flag 2. */ if (*(a+11) != '3') { if (bad_chars (a, sign) || (strncmp(a+5, "99999", 5) == 0 && *(a+10) != 'S')) { *(a+11) = '3' ; ret = 1 ; } } } /* The "if-else if" of which this "else" is part, caught every kind of obs except an original obs followed by an edited. This "else" treats this kind of obs. It should have a '2' in flag 2. */ else { if (*(a+11) != '2') { ret = 1 ; *(a+11) = '2' ; } } } /* If the number of obs has changed due to editing, put the correct number in the number of obs field. */ if (nobs0 != nobs) { write_nobs(reca, nobs) ; ret = 1 ; } /* If editing has occurred, change the first character in the record ("reca") to 'M'. */ if (ret) *reca = 'M' ; return ret ; } /****************************************************************************** * This function compares obs from records of two different data types, where * the first type is a minimum-value type and the second is a maximum-value * type. (Two data types with this expected relationship are TMIN and TMAX.) * This function does appropriate record editing based on these expectations * and comparisons. */ void qc_min_max(int min, int max, char flg2, int enable) { register char *a, *b ; register int i, j, k, m, is_soils ; char *recmn, *recmx, *recmn_obs1, *recmx_obs1 ; int icomp, ngood, nbad, nobsmn, nobsmx, min1, min2, max1, max2, minval, maxval ; /* The records are located in array "Grec[]" at the element numbers defined by array "Gelems[][]". For simplicity, let "min1" and "min2" be element numbers for the minimum-type records and let "max1" and "max2" be for the maximum-type records. A value of -1 for the last two indicates there are no records available for the type. */ min1 = Gelems[min][0] ; min2 = Gelems[min][1] ; max1 = Gelems[max][0] ; max2 = Gelems[max][1] ; if (min2 == -1 || max2 == -1) return ; is_soils = (min == 6 || min == 8) ; /* Soils records? */ /* "Ngood" is the number of data value comparisons that are "good", i.e. where "min" < "max". "Nbad" is the number of those comparisons that are "bad", i.e. where "min" > "max". Equality, "min" == "max", is neither "good" nor "bad". */ ngood = nbad = 0 ; /* Allow multiple records only for time-of-observation data types, not min or max data types. */ if ((min1 != min2 && realmin(min)) || (max1 != max2 && realmax(max))) return ; /* Loop through all records. Since there can be more than one of each data type (esp. soil temperatures), use two "for()" loops to pair each min with each max. */ for (i = min1 ; i <= min2 ; i++) { for (j = max1 ; j <= max2 ; j++) { recmn = Grec[i] ; recmx = Grec[j] ; /* Don't do anything with this pair of records if they are soils and the digits in the last two positions of the data types don't match exactly. */ if (is_soils) if (*(recmn+13) != *(recmx+13) || *(recmn+14) != *(recmx+13)) continue ; /* Don't do anything with this pair of records if there is a major difference in the number of observations. */ nobsmn = numobs (recmn) ; nobsmx = numobs (recmx) ; if (abs(nobsmn - nobsmx) > 15) continue ; recmn_obs1 = recmn + 30 ; recmx_obs1 = recmx + 30 ; /* Loop through all obs in the min record. The "for()" loop check involving "ngood" and "enable" is to end the loop when one normal comparison is found. (See the last comment in this function for a discussion of how "enable" works.) */ for (k=m=0 ; (!ngood || enable) && k < nobsmn && m < nobsmx ; k++) { /* Set "a" to the day of the next obs of the min record. If the obs is not significant or if it has an error flag in flag 2, skip it. */ a = recmn_obs1 + (k * 12) ; if (!SIG_OBS2(a)) continue ; /* Attempt to assign "b" to a day in the max record that is the same day as the current "a" in the min record. "Icomp" tracks the success of this attempt. The following "do{}" loop runs as long as "icomp" > 0, i.e. the "b" obs day is earlier than the "a" obs day. Note that if this loop reaches a "b" day later than "a", i.e. "icomp" < 0, counter "m" remains unchanged and the "b" day does not change until "a" catches up. Non- significant obs and obs with error flags are ignored. */ icomp = 1 ; do { b = recmx_obs1 + (m * 12) ; if (SIG_OBS2(b)) icomp = strncmp(a, b, 2) ; } while (icomp > 0 && ++m < nobsmx) ; /* If "icomp != 0" is now true, it indicates that no "b" day was found to equal the current "a" day, because either a greater "b" day was found (see above comment), or else the last "b" day in the record was checked and was still too early. In that case, the remaining code in this "for()" loop should not be executed. */ if (icomp != 0) continue ; /* Prepare to compare numerically the data values of the "a" and "b" obs. */ sscanf(a+4, "%6d", &minval) ; sscanf(b+4, "%6d", &maxval) ; /* Increment "ngood" if the data value relationship is normal, or increment "nbad" if bad. Don't do anything if the values are equal, since that doesn't prove anything. */ if (minval < maxval) ngood++ ; else if (minval > maxval) nbad++ ; } /* End of "for (k=m=0 ; etc." */ /* One pair of records has been compared. If "nbad" == 0, no data value comparisons were bad ("min" > "max"). In this case nothing more is done to this pair of records. But when "nbad" > 0, at least one pair of obs compared abnormally, and further action IS required. */ if (nbad > 0) { /* Function "repair_minmax()" handles wrong relationships. On obs where "min" > "max", it does the following: If "enable" is 1, it replaces the min value with the max value. If "enable" is 2, it replaces the max value with the min value. If "enable" is 0, both obs are problematic but neither gets preference: both obs get a flag 2 of 'T'. (Currently, only 0 is used.) */ repair_minmax(recmn, recmx, enable) ; if (enable == 0) /* Changes to both records. */ { *recmn = 'M' ; *recmx = 'M' ; } else if (enable == 1) /* Changes to min record. */ { *recmn = 'M' ; } else if (enable == 2) /* Changes to max record. */ { *recmx = 'M' ; } } /* End of "if (nbad)" */ } /* End of "for (j = max1 ; j <= max2 ; j++)" */ } /* End of "for (i = min1 ; i <= min2 ; i++)" */ } /****************************************************************************** * Check each obs of one of any type of precipitation (PRCP, SNOW, or SNWD) or * wind movement (WDMV) record for format problems. Make corrections. */ int qc_prcp(char *reca, int *accum) { int nobs, nobs0, ret, is_S ; register char *reca_obs1 ; ret = 0 ; reca_obs1 = reca + 30 ; nobs0 = nobs = numobs(reca) ; /* * A few records have 's', a flag that is not described by the documentation, * in flag 1. Call a function to convert this flag to 'S'. */ qc_prcp_std0(reca, nobs) ; /* Run function "qc_prcp_std1()", which converts any occurrence of data value/ flag 1 of "00000S" to "99999S". One of three different integer values is returned by this function, depending on what it finds about obs with 'S' in flag 1. If it returns 0, it found no 'S' flags in the record; the "if (is_S)" code tests for that case; and, if true, functions "qc_prcp_std2()" and "qc_prcp_SnoA()" are not executed since their sole purpose is to fix problems with EXISTING 'S' flags. If it returns any nonzero value, it found 'S' flags. Then "is_S" is set to 1, and the two functions just noted are run. If it returns a nonzero value that is specifically 1. Then not only were 'S' flags found, but also problems were found and fixed; so "ret" is set to 1. */ if ((is_S = qc_prcp_std1(reca_obs1, nobs)) != 0) /* What I just said */ { if (is_S == 1) ret = 1 ; /* Check for several consecutive "99999S" obs. If found, delete all but the first. */ if (qc_prcp_std2(reca_obs1, &nobs)) ret = 1 ; /* Look for a "99999S" obs NOT followed by an obs with flag 1 == 'A', fix that problem however many times it occurs. */ if (qc_prcp_SnoA(reca_obs1, nobs, *accum)) ret = 1 ; } /* A different approach: Suppose there should be an 'S', but isn't? If an obs with 'A' or 'B' in flag 1 is not preceded by an obs with 'S' in flag 1, put one there. */ if (qc_prcp_AnoS(reca_obs1, &nobs, *accum)) ret = is_S = 1 ; /* Suppose an accumulation has a zero accumulated value? If so, the accumulation can be replaced with a series of obs with zero data values for the accumulation period. */ if (is_S) if (qc_prcp_zeroacc(reca_obs1, &nobs)) ret = 1 ; /* Check on how accumulations are resolved between this month and the previous and next months. Set "accum" accordingly; it's used when "qc_prcp" is called again. */ prcp_accum(reca_obs1, nobs, accum) ; /* If the number of obs changed, put that information in the record. If the record was changed, change the first character to 'M' for future reference. */ if (nobs0 != nobs) { write_nobs(reca, nobs) ; ret = 1 ; } if (ret) *reca = 'M' ; return ret ; } /****************************************************************************** * Look for flag 1 == 'A' or 'B' not preceded by flag 1 == 'S' in a record * of precipitation data that has "nobs" obs and whose first obs begins at * address "reca_obs1". Create the 'S' flag, and a new obs if necessary. */ int qc_prcp_AnoS(char *reca_obs1, int *nobs, int accum) { register char *a, *a2 ; register int j, k ; int lastday, thisday, ret ; char cday[3] ; /* 'S' in flag 1 means "accumulation period begins". 'A' in flag 1 means "accumulated amount". 'B' in flag 2 means "accumulated amount that includes estimated values". */ /* Loop through all obs. "Reca_obs1" points to the first obs. */ for (j = ret = thisday = 0 ; j < *nobs ; j++) { a = reca_obs1 + (j * 12) ; /* Ignore non-significant obs for now. Later, if a new "99999S" obs is to be added, non-significant obs can have a slight role in deciding where to place the new obs. */ if (!SIG_OBS1(a)) continue ; /* Get the current obs day as an integer after saving the previous value of the current obs day. */ lastday = thisday ; sscanf(a, "%2d", &thisday) ; /* If flag 1 is not 'A' or 'B' (by far the most common case), there is nothing to do except track accumulations ("accum") and go get the next obs. */ a2 = a + 10 ; /* Used here and below. */ if (*a2 != 'A' && *a2 != 'B') { if (*a2 == 'S') accum = 1 ; continue ; } /* If an accumulation, indicated by "accum" true, is ongoing, the 'A' or 'B' in flag 1 of the current obs is assumed to be good; the "accum" flag is reset and the next obs is retrieved. */ if (accum) { accum = 0 ; continue ; } /* If there is no ongoing accumulation, then the 'A' or 'B' in flag 1 is a problem. A change will be made, so "ret" is set to 1 now. */ ret = 1 ; /* If the previous obs day is the previous day, and the previous obs is not a "99999S" obs, 'A' or 'B' is assumed to be the wrong flag 1. It is replaced with a space. */ if (lastday == thisday - 1) { *a2 = ' ' ; } /* Or if the previous obs is not for the previous day, create a new "99999S" obs. */ else { accum = 0 ; /* New obs uses "lastday" + 1 for its day. Load the day into "cday" now. */ sprintf(cday, "%2.2d", lastday + 1) ; /* If "a" is the first obs, or an original obs without an edited, insert the new "99999S" obs at address "a", which adds 12 bytes to the record length. Otherwise, if "a" is an edited obs, insert at "a"-12 (the original obs address), instead. */ a2 = (j == 0 || SIG_OBS1(a-12)) ? a : a-12 ; mystrcpy(a2+12, a2) ; *a2 = *cday ; *(a2+1) = *(cday+1) ; strncpy(a2+4, " 99999S0", 8) ; ++*nobs ; /* The insertion has caused address "a"+12 to be an obs that's been looked at. We don't want to look at it again on the next iteration of this loop, so we advance "j" a step here; and of course the loop interation will advance "j" one more. */ j++ ; } } return ret ; } /****************************************************************************** * Look for flag 1 == 'S' not followed by flag 1 == 'A' or 'B'. Create an * 'A' flag when needed. "Accum" keeps track of whether an accumulation is * ongoing, or not. */ int qc_prcp_SnoA(char *reca_obs1, int nobs, int accum) { register char *a, *a2 ; register int j ; int ret ; /* Flag 1 'S' is "accumulated period begins". Flag 1 'A' is "accumulated amount". Flag 1 'B' is "accumulated amount that includes estimated values". */ for (j = ret = 0 ; j < nobs ; j++) { a = reca_obs1 + (j * 12) ; if (!SIG_OBS1(a)) continue ; a2 = a + 10 ; if (accum) { if (*a2 != 'S') { if (*a2 != 'A' && *a2 != 'B') { ret = 1 ; *a2 = 'A' ; } accum = 0 ; } } else { if (*a2 == 'S') accum = 1 ; } } return ret ; } /****************************************************************************** * Standardize format in a record 'rec' so that 's' in flag 1 is changed to * 'S'. Either means the start of an accumulation. The only place 's' was * known to occur was in a few WDMV records. */ void qc_prcp_std0(char *rec, int nobs) { register char *a, *rec_obs1 ; register int j ; rec_obs1 = rec + 30 ; for (j = 0 ; j < nobs ; j++) { a = rec_obs1 + (j * 12) ; if (!SIG_OBS2(a)) /* Ignore non-significant obs and errors. */ continue ; if (*(a + 10) == 's') { *(a + 10) = 'S' ; *rec = 'M' ; } } } /****************************************************************************** * Standardize format in a precipitation record so that obs with 'S' in flag 1 * have data value of " 99999" instead of " 00000". Return 0 if no 'S' flag is * found, -1 if 'S' is found but no changes are done, and 1 if " 00000" was * changed to " 99999". */ int qc_prcp_std1(char *reca_obs1, int nobs) { register char *a ; register int j ; int ret ; for (j = ret = 0 ; j < nobs ; j++) { a = reca_obs1 + (j * 12) ; if (!SIG_OBS2(a)) /* Ignore non-significant obs and errors. */ continue ; if (*(a+10) == 'S') /* "Accumulated period begins" */ { if (strncmp(a+5, "00000", 5) == 0) { strncpy(a+4, " 99999", 6) ; ret = 1 ; } else if (ret == 0) { ret = -1 ; } } } return ret ; } /****************************************************************************** * Standardize format in a precipitation record so that there are never two * "99999S" obs in an accumulation. */ int qc_prcp_std2(char *reca_obs1, int *nobs) { register char *a ; register int j ; int icomp, ret ; for (j = icomp = ret = 0 ; j < *nobs ; j++) { a = reca_obs1 + (j * 12) ; /* Non-significant obs, and cases of data value and flag 1 not equal to "99999S", are not a problem and are sent back to the top of the "for()" loop. */ if (strncmp(a+5, "99999S", 6) != 0 || !SIG_OBS1(a)) { icomp = 0 ; continue ; } /* "Icomp" is true only when there was a previous obs with "99999S". See if the current obs "a" is also "99999S". If so, delete it and keep deleting as long as "a" points to an obs with "99999S". */ if (icomp) { while (j < *nobs && strncmp(a+5, "99999S", 6) == 0) { ret = 1 ; mystrcpy(a, a+12) ; --*nobs ; } icomp = 0 ; j-- ; continue ; } /* Here is where "icomp" is set for an obs with "99999S" (possibly first in a sequence), as noted in the previous comment. */ icomp = 1 ; } return ret ; } /****************************************************************************** * Search for scale factor errors in PRCP, SNOW, SNWD and WTEQ records. Check * all except PRCP for occurrences at inappropriate times and places. */ void qc_prcp_snow_snwd_wteq() { int nobs, nrec0, nrec1 ; register int i ; char *rec ; /* * For all records of each type, look for scale factor errors. Usually, any * one type will have no records or one record. When there are no records, * 'nrec1' is set to -1, so that a loop does not run at all. */ nrec0 = Gelems[GNPRCP][0] ; nrec1 = Gelems[GNPRCP][1] ; for (i = nrec0 ; i <= nrec1 ; i++) { rec = Grec[i] ; nobs = numobs(rec) ; scale_err_1(rec, nobs, GNPRCP) ; } nrec0 = Gelems[GNSNOW][0] ; nrec1 = Gelems[GNSNOW][1] ; for (i = nrec0 ; i <= nrec1 ; i++) { rec = Grec[i] ; nobs = numobs(rec) ; check_time_place(rec, nobs, GNSNOW) ; scale_err_1(rec, nobs, GNSNOW) ; } nrec0 = Gelems[GNSNWD][0] ; nrec1 = Gelems[GNSNWD][1] ; for (i = nrec0 ; i <= nrec1 ; i++) { rec = Grec[i] ; nobs = numobs(rec) ; check_time_place(rec, nobs, GNSNWD) ; scale_err_2(rec, nobs) ; } nrec0 = Gelems[GNWTEQ][0] ; nrec1 = Gelems[GNWTEQ][1] ; for (i = nrec0 ; i <= nrec1 ; i++) { rec = Grec[i] ; nobs = numobs(rec) ; check_time_place(rec, nobs, GNWTEQ) ; scale_err_2(rec, nobs) ; } } /****************************************************************************** * Remove precipitation accumulations with accumulated values of zero, and * replace with a series of normal zero-valued observations. (Not done for an * accumulation that started in the previous month.) */ int qc_prcp_zeroacc(char *reca_obs1, int *nobs) { register char *a ; register int i, j ; int ret, firstday, lastday ; char cday[3], *p ; for (j = ret = 0 ; j < *nobs ; j++) { a = reca_obs1 + (j * 12) ; /* To bypass the next "continue", flag 1 must be 'S', the current obs must be a significant obs, and it must not be the last obs. */ if (*(a+10) != 'S' || !SIG_OBS1(a) || j == *nobs-1) continue ; /* To avoid the next "continue", the obs after this one must have a data value of "00000". */ if (strncmp(a+17, "00000", 5) != 0) continue ; /* "Firstday" is an integer version of the day of the current obs. "Lastday" is for the next obs. Of course "lastday" should be greater than "firstday". If not, "continue". */ sscanf(a, "%2d", &firstday) ; sscanf(a+12, "%2d", &lastday) ; if (firstday >= lastday) continue ; ret = 1 ; /* Changes will be made. */ /* Now create a zero precip obs for each day involved in the accumulation. Note that two of the obs, for the first and last days, already exist, so those are done separately from the interior days, where obs must be created. Flags used are ' ' and '0'. */ /* First day */ strncpy(a+4, " 00000 0", 8) ; p = a + 2 ; /* Interior days, if any. If only a first and last day are needed, then firstday + 1 == lastday, and the loop won't run. Note that, if the loop runs, "mystrcpy(a+12,a)" increases the number of obs by 1 on each loop iteration. */ for (i = firstday + 1 ; i < lastday ; i++) { sprintf(cday, "%2.2d", i) ; a += 12 ; ++*nobs ; j++ ; mystrcpy(a+12, a) ; *a = *cday ; *(a+1) = *(cday+1) ; *(a+2) = *p ; *(a+3) = *(p+1) ; strncpy(a+4, " 00000 0", 8) ; } /* Last day. */ strncpy(a+16, " 00000 0", 8) ; j++ ; } return ret ; } /****************************************************************************** * Compare each precipitation record with the snow record for the same year- * month, non-zero values only. If any precipitation values are larger than * corresponding snow values, take corrective action. */ void qc_snow_vs_prcp() { register int i, j, k, m ; register char *a, *b ; int val, icomp, nobsa, nobsb0, nobsb, prcp0, prcp1, snow0, snow1 ; char *reca, *reca_obs1, *recb, *recb_obs1, cval[7] ; prcp0 = Gelems[GNPRCP][0] ; prcp1 = Gelems[GNPRCP][1] ; snow0 = Gelems[GNSNOW][0] ; snow1 = Gelems[GNSNOW][1] ; /* * If there are no precip or no snow records, or if there are different * numbers of each, exit at once. */ if (prcp1 == -1 || snow1 == -1 || prcp1 - prcp0 != snow1 - snow0) return ; nobsa = nobsb0 = nobsb = 0 ; /* Look at each snow and precip record. Generally, there should be one of each. */ for (i = snow0, j = prcp0 ; i <= snow1 ; i++, j++) { /* Let "reca" be the snow record and "recb" the precip record. */ reca = Grec[i] ; nobsa = numobs(reca) ; reca_obs1 = reca + 30 ; recb = Grec[j] ; nobsb0 = nobsb = numobs(recb) ; recb_obs1 = recb + 30 ; /* Loop through all obs in the snow record. Skip non-significant obs and errors. */ for (k = m = 0 ; k < nobsa ; k++) { a = reca_obs1 + (k * 12) ; /* "a" is current snow obs. */ if (!SIG_OBS2(a)) continue ; /* Loop through the precipitation record obs until one is found with the same day as the current snow obs, or until the lack of one is sure. Skip non-significant obs and errors. */ icomp = 1 ; do { b = recb_obs1 + (m * 12) ; /* "b" is current precip obs. */ if (SIG_OBS2(b)) icomp = strncmp(a, b, 2) ; } while (icomp > 0 && ++m < nobsb) ; /* Finding a precipitation obs with a matching day is indicated by "icomp" == 0. */ if (icomp != 0) continue ; /* If the snow obs data value is non-numeric or zero or "missing" or part of an accumulation, it can't be used to make an estimated value. If the snow value is "00001", we simply won't bother. */ if (!sscanf(a+5, "%5d", &val) || val <= 1 || val == 99999) continue ; /* Precip needs an estimated value if it is non-numeric. */ if (sscanf(b+5, "%5d", &val)) { /* Precip needs an estimated value if it is zero but not a trace. */ if (val == 0) { if (*(b+10) == 'T') continue ; } /* Precip needs an estimated value if it is missing, but not part of an accumulation. */ else if (val == 99999) { if (*(b+10) == 'S') continue ; } /* Any other precip value does not need an estimate. */ else { continue ; } } /* Arrive here if an estimated PRCP value will be supplied. Set first character of the record to 'M' for future reference. */ *recb = 'M' ; /* Brief detour to take care of the case where the current PRCP obs is an original with no edited; if so, create an edited. */ if (m == 0 || SIG_OBS1(b-12)) { if (nobsb >= MAXOBS) rec_siz_err() ; mystrcpy(b+12, b) ; nobsb++ ; m++ ; *(b+11) = '2' ; b += 12 ; } /* Copy SNOW obs to PRCP obs directly without a units conversion. This works because both are integers that represent floating point values and, in getting from SNOW to PRCP, there are two units conversions that cancel each other. The first is where the SNOW amount must be divided by 10 to get the equivalent PRCP amount; the second is where a SNOW integer represents tenths of inches and must be multiplied by 10 to get the PRCP integer and its hundredths of inches. */ strncpy(b+4, a+4, 6) ; *(b+10) = 'E' ; /* "Estimated" */ *(b+11) = 'C' ; /* "Precip est. from snow" */ } if (nobsb0 != nobsb) write_nobs(recb, nobsb) ; /* See if the records appear to be reversed. If so, set flags on each of the suspect values. See function "repair_minmax()" for more information. */ if (prcp_snow_switched(recb, reca)) { *reca = *recb = 'M' ; repair_minmax(reca, recb, 0) ; } } } /***************************************************************************** * Using SNOW, do a quality control of SNWD that looks for the special case * of zero snow depth alongside zero snowfall, yet the snow depth is error- * flagged and the snowfall is not. */ void qc_snow_vs_snwd1() { int ii, jj, snow0, snow1, snwd0, snwd1, comp, ezgood ; register char *a, *b, *recsnow30, *recsnow, *recsnwd30, *recsnwd ; register int k, j, i, nobsnow, nobsnwd ; snow0 = Gelems[GNSNOW][0] ; snow1 = Gelems[GNSNOW][1] ; snwd0 = Gelems[GNSNWD][0] ; snwd1 = Gelems[GNSNWD][1] ; /* * Require the same number of SNOW and SNWD records. Otherwise comparisons * get too complicated. */ if (snow1 == -1 || snwd1 == -1 || snow1 - snow0 != snwd1 - snwd0) return ; /* * If we get this far, we have the same number of SNOW records as SNWD * records. The outer "for()" loop pairs the records. */ for (ii = snwd0, jj = snow0 ; ii <= snwd1 ; ii++, jj++) { recsnwd = Grec[ii] ; nobsnwd = numobs(recsnwd) ; recsnwd30 = recsnwd + 30 ; recsnow = Grec[jj] ; nobsnow = numobs(recsnow) ; recsnow30 = recsnow + 30 ; /* * Look at each obs after the first one in the current SNWD record. */ for (i = 1, j = 0 ; i < nobsnwd ; i++) { a = recsnwd30 + (i * 12) ; /* * We are interested in only significant, error-flagged, zero-valued * SNWD obs. */ if (!SIG_OBS1(a) || !ERR_FLG(a) || strncmp(a+5, "00000", 5) != 0) continue ; /* * Find the previous SNWD significant value. Was it error-free and * zero-valued? Set 'ezgood' to report the results. We are not * interested in a zero current obs if the previous obs was good and * non-zero. */ for (k = i-1, ezgood = 0 ; k >= 0 ; k--) { b = recsnwd30 + (k * 12) ; if (!SIG_OBS1(b)) continue ; if (!ERR_FLG(b) && strncmp(b+5, "00000", 5) == 0) ezgood = 1 ; break ; } /* * Check results. On failure, there is nothing to do about the * current obs. */ if (!ezgood) continue ; /* * If possible, find an error-free, zero-value SNOW obs for the same * day as the SNWD obs 'a' in this inner loop. Note that 'j' was * initialized at the start of the outer loop because, even though * this loop may be restarted several times, 'j' always starts from * its previous value. Set 'comp' to report the results. */ for (comp = -1 ; j < nobsnow ; j++) { b = recsnow30 + (j * 12) ; if (!SIG_OBS2(b) || strncmp(b+5, "00000", 5) != 0) continue ; if ((comp = strncmp(a, b, 2)) <= 0) break ; } /* * Check results. Unless 'comp' == 0, there is nothing to do with the * current 'a'. */ if (comp != 0) continue ; /* * Arrive here only if all test conditions on 'a' and 'b' have been * met. */ *(a+11) = '1' ; *recsnwd = 'M' ; } } } /***************************************************************************** * Using SNOW, do a quality control of SNWD. */ void qc_snow_vs_snwd2() { int snow0, snow1, snwd0, snwd1, snowval[31], snwdval[31], snowtot, snwddif, rounderr ; char cday[3] ; register char *a, *rec30snwd, *recsnwd, *rec30snow, *recsnow ; register int k, i, j, prevsnow, m, kref, nobsnow, nobsnwd ; snow0 = Gelems[GNSNOW][0] ; snow1 = Gelems[GNSNOW][1] ; snwd0 = Gelems[GNSNWD][0] ; snwd1 = Gelems[GNSNWD][1] ; /* * Require the same number of SNOW and SNWD records. Otherwise comparisons * get too complicated. */ if (snow1 == -1 || snwd1 == -1 || snow1 - snow0 != snwd1 - snwd0) return ; /* * If we get this far, we have the same number of SNOW records as SNWD * records. The outer loop pairs the records. */ for (i = snow0, j = snwd0 ; i <= snow1 ; i++, j++) { recsnow = Grec[i] ; nobsnow = numobs(recsnow) ; /* * Build an array 'snowval[]' of integer SNOW values from the current * SNOW record. There are up to 31 SNOW values (for 31-day months). * Some 'snowval[]' elements are likely to contain -1. See "bldval" * for more information. */ bldval(recsnow, snowval, nobsnow) ; rec30snow = recsnow + 30 ; recsnwd = Grec[j] ; nobsnwd = numobs(recsnwd) ; /* * Build an array of SNWD values just like the preceding SNOW. */ bldval(recsnwd, snwdval, nobsnwd) ; rec30snwd = recsnwd + 30 ; /* * An inner loop compares elements of the two arrays and looks for * inconsistencies between them. Programmer's note: Element number 'k' * is altered inside the inner loop when an error is found; see the line * "k = kref" near loop's end. */ prevsnow = 0 ; kref = -1 ; for (k = 0 ; k < 31 ; k++) { /* * Flag 'prevsnow' is a best-guess at whether there may have been snow * on the ground at the time of a previous 'k': true or false. If * true, 'prevsnow' becomes true and is not changed again in this * inner loop. */ if (!prevsnow) if (k > 0) if (snowval[k-1] > 0 || snwdval[k-1] > 0) prevsnow = 1 ; /* * Skip 'snwdval[k]' values of -1. */ if (snwdval[k] == -1) continue ; /* * If 'kref' still has its initial value of -1, give 'kref' the value * of 'k' and skip the rest of the loop. Do the same thing if snow * depth has not increased. The latter is what happens with a huge * majority of SNWD obs. */ if (kref == -1 || (snwddif = snwdval[k] - snwdval[kref]) <= 0) { kref = k ; continue ; } /* * Arrive here only if snow depth has increased. We have to have a * sum of snowfalls to compare this to; sum the snowfalls from * 'kref'+1 to 'k', inclusive. (Usually, 'kref'+1 == 'k'.) */ snowtot = 0 ; for (m = kref + 1 ; m <= k ; m++) if (snowval[m] != -1) snowtot += snowval[m] ; /* * SNOW integer values represent tenths of inches, but SNWD integer * values represent whole inches. To compare a sum of snowfalls and a * snow depth increase, there will be a conversion factor and a term * to allow for round-off. * * It's necessary to know whether any snowfall occurred this month * prior to 'snowval[k]' to set the round-off term to its best value: * If not, 5 (0.5); if so, 10 (1.0). A discussion of this idea * follows: * * Suppose there was no previous snow. Then a snow depth of 0 is * assumed to be exactly 0. If 44 (4.4) inches of snow falls when the * snow depth is 0, the new snow depth can be as much as 4.4, which * would round off to 4. A snow depth of 5 would be incorrect. * * Suppose there was previous snow. Then a snow depth of 0 may be as * much as 0.4 inches. If 44 (4.4) inches of snow falls when the snow * depth is given as 0, the new snow depth could be 0.4 + 4.4 = 4.8 * which rounds off to 5. So 5 would be an acceptable snow depth in * this case. */ rounderr = (prevsnow) ? 10 : 5 ; /* * It's OK if the snow depth increase, times 10 (conversion factor), * is less than or equal to the snowfall sum plus the round-off term. */ if (snwddif * 10 <= snowtot + rounderr) { kref = k ; continue ; } /* * Arrive here if the snowfall sum is inconsistent with the snow depth * difference. First make a character string version of 'k'+1. This * string is the day of the obs that caused the error. */ sprintf(cday, "%2.2d", k+1) ; /* * Loop through the SNWD record and find the first significant obs * whose day matches 'cday'. This is the SNWD obs that helped cause * the inconsistency. */ for (m = 0 ; m < nobsnwd ; m++) { a = rec30snwd + (m*12) ; if (!SIG_OBS1(a)) continue ; /* * Test 'cday' against the current obs day, which is two * characters starting at 'a'. Set flag 2 to 'T' ("Failed internal * consistency check"). Set the first character of the record to * 'M' to show that a change has happened. */ if (*(a+1) == *(cday+1)) if (*a == *cday) { *(a+11) = 'T' ; *recsnwd = 'M' ; break ; } } /* * Now do almost the same for SNOW, except skip obs with error flags. * We don't want to replace one error flag with another; it just * obscures the issue.... */ for (m = 0 ; m < nobsnow ; m++) { a = rec30snow + (m*12) ; if (!SIG_OBS2(a)) continue ; if (*(a+1) == *(cday+1)) { if (*a == *cday) { *(a+11) = 'T' ; *recsnow = 'M' ; break ; } } } /* * Now comes housekeeping in the wake of flagging. If the errors had * been known when "bldval(rec, snwdval)" was first run, 'snwdval[k]' * would have been -1. Make that change now. Loop dynamics are now * different; prepare to restart the loop at 'kref'+1. */ snwdval[k] = -1 ; k = kref ; /* * Both records and 'snwdval[k]' were changed, but 'snowval[k]' was * not. In order for the loop to behave consistently, one of the two * arrays should be treated as "truth". */ } } } /****************************************************************************** * Using SNWD, create estimates values in WTEQ where needed. */ void qc_snwd_vs_wteq() { register char *a, *b ; register int i, j, k, m ; int val, icomp, nobsa, nobsb0, nobsb, snwd0, snwd1, wteq0, wteq1 ; char *reca, *recb, *reca_obs1, *recb_obs1, cval[7] ; snwd0 = Gelems[GNSNWD][0] ; snwd1 = Gelems[GNSNWD][1] ; wteq0 = Gelems[GNWTEQ][0] ; wteq1 = Gelems[GNWTEQ][1] ; /* Require the same number of SNWD and WTEQ records. Otherwise comparisons get too complicated. */ if (snwd1 == -1 || wteq1 == -1 || snwd1 - snwd0 != wteq1 - wteq0) return ; /* Look at all SNWD and WTEQ records. (One of each is most likely.) */ for (i = snwd0, j = wteq0 ; i <= snwd1 ; i++, j++) { reca = Grec[i] ; nobsa = numobs(reca) ; reca_obs1 = reca + 30 ; recb = Grec[j] ; nobsb0 = nobsb = numobs(recb) ; recb_obs1 = recb + 30 ; /* Let "a" be the address of each SNWD obs, and use "b" for WTEQ. This "for()" with nested "do{}" in used in several "qc_" functions. */ for (k = 0, m = 0 ; k < nobsa && m < nobsb ; k++) { a = reca_obs1 + (k * 12) ; /* Ignore non-significant SNWD obs and errors. */ if (!SIG_OBS2(a)) continue ; icomp = 1 ; do { b = recb_obs1 + (m * 12) ; if (SIG_OBS1(b)) /* Ignore non-significant WTEQ obs. */ icomp = strncmp(a, b, 2) ; } while (icomp > 0 && ++m < nobsb) ; /* "Continue" is executed if no WTEQ obs could be found with a day equal to the current SNWD obs day. */ if (icomp != 0) continue ; /* If the SNWD obs data value is non-numeric or zero or "missing" or part of an accumulation, it can't be used to make an estimated value. Also, simply don't bother if the snow value is "00001". */ if (!sscanf(a+5, "%5d", &val) || val <= 1 || val == 99999) continue ; /* WTEQ needs an estimated value if it is non-numeric. */ if (sscanf(b+5, "%5d", &val)) { /* WTEQ needs an estimated value if it is zero but not a trace. */ if (val == 0) { if (*(b+10) == 'T') continue ; } /* WTEQ needs an estimated value if it is missing, but not part of an accumulation. */ else if (val == 99999) { if (*(b+10) == 'S') continue ; } /* Any other precip value does not need an estimate. */ else { continue ; } } /* Arrive here if SNWD will supply an estimated value for WTEQ. Set first character of the record to 'M' for future reference. */ *recb = 'M' ; /* The "do" loop above made sure that the current "WTEQ" obs is either an edited obs or an original obs that has no edited. If the second case, add a new edited obs while placing a '2' in flag 2 of the current obs. */ if (m == 0 || SIG_OBS1(b-12)) { if (nobsb >= MAXOBS) rec_siz_err() ; mystrcpy(b+12, b) ; *(b+11) = '2' ; /* "Invalid, replacement value follows" */ b += 12 ; m++ ; nobsb++ ; } /* Copy the SNWD obs value into the WTEQ obs value field. No units conversion is needed because: SNWD must be divided by 10 to get the water equivalent for WTEQ; and, since SNWD is in whole inches and WTEQ is in tenths of inches and both must be expressed as integers, SNWD must be multiplied by 10 to get WTEQ. The two conversions cancel each other. */ strncpy(b+4, a+4, 6) ; *(b+10) = 'E' ; /* "Estimated" */ *(b+11) = 'C' ; /* "Precip est. from snow" */ } /* If the number of obs was changed as a result of this function's operations, put the correct number into bytes 28-30. */ if (nobsb0 != nobsb) write_nobs(recb, nobsb) ; } } /****************************************************************************** * Check each obs of a temperature record "reca" for excessive departure from * the standard deviation "sd" and set flags or make corrections. Integer "m" * is the mean. This function can be used for any of the temperature data * types: TMAX, TMIN, TOBS, MNPN, MXPN, SNyz, SOyz, SXyz. */ int qc_temp(char *reca, int m, int sd) { register char *a ; register int i, lval, sdxf, nobs ; int nobs0, ret, val ; char cval[7], *reca_obs1 ; ret = 0 ; if (sd <= 0) return ret ; nobs0 = nobs = numobs(reca) ; reca_obs1 = reca + 30 ; sdxf = sd * FACTOR ; lval = 99999 ; /* Advance to each significant observation in the record. Skip non-significant obs and errors. */ for (i = 0 ; i < nobs ; i++) { a = reca_obs1 + (i * 12) ; if (!SIG_OBS2(a) || !sscanf(a+4, "%6d", &val) || abs(val) == 99999) continue ; /* An obs data value is "good" if the absolute difference between it and the mean is less than the sample standard deviation times FACTOR ('sdxf'), or if the absolute difference between the data value and the previous "good" data value is less than 9. The 'sdxf' test by itself may cause an extreme value in a real heat wave or cold wave to appear to be an error; the second test excuses such a value if it is not greatly different from the previous "good" value. (Note that 'lval' is always the previous "good" value -- not simply the previous value.) */ if (abs(m - val) < sdxf || abs(lval - val) < 15) { lval = val ; continue ; } /* Arrive here if a change will be made. Set the first character of the record to 'M' for future reference. */ ret = 1 ; *reca = 'M' ; if (i > 0 && !SIG_OBS1(a-12)) /* If edited obs. */ { /* If the obs value is not between -5 and +5, redo the earlier "abs()" test, except reverse the arithmetic sign on the obs value and use the sample standard deviation alone, not multiplied by "factor". If the new test looks good, assume the original arithmetic sign was wrong and correct it. This is done for an edited obs here. Original obs is done farther down. */ if (abs(val) > 5 && abs(m - (-val)) < sd) { sprintf(cval, "%6.5d", -val) ; strncpy(a+4, cval, 6) ; *(a+10) = 'E' ; /* "Estimated" */ *(a+11) = 'G' ; /* "Changed algebraic sign" */ } /* Or if the obs value is between -5 and 5, or if changing the sign didn't help, there is no remedy. Set flag 2 to 'T'. */ else { *(a+11) = 'T' ; } } /* Else current obs is original without a following edited. */ else { /* Same opposite sign test as in edited obs case. If the test looks good with opposite sign, set the obs flag to '2' and create an edited obs having the same value with opposite sign and "EG" for the two flags.*/ if (abs(val) > 5 && abs(m - (-val)) < sd) { if (nobs >= MAXOBS) rec_siz_err() ; mystrcpy(a+12, a) ; *(a+11) = '2' ; /* "Invalid, replacement value follows" */ nobs++ ; i++ ; a += 12 ; sprintf(cval, "%6.5d", -val) ; strncpy(a+4, cval, 6) ; *(a+10) = 'E' ; /* "Estimated" */ *(a+11) = 'G' ; /* Changed algebraic sign" */ } /* Else, if the test failed, there is no known remedy. Set flag 2 to 'T'. */ else { *(a+11) = 'T' ; /* "Failed internal consistency check" */ } /* End of if current obs is edited or original.... */ } /* End of advance to each significant observation. */ } /* End of advance to each temperature record. */ if (nobs0 != nobs) write_nobs(reca, nobs) ; return ret ; } /* End of function. */ /****************************************************************************** * This function handles the case where the observer simply repeats the same * temperature value in more than "maxequal" days of observations. Edited obs * are deleted if they are part of the problem; flag 2 of original obs are set * to 'T'. */ int qc_temp2(char *reca) { register char *a ; register int j, k ; int nobs0, nobs, nequal, ret, maxequal = 10 ; char *reca_obs1, lastval[6] ; nobs0 = nobs = numobs(reca) ; reca_obs1 = reca + 30 ; *lastval = '\0' ; ret = nequal = 0 ; /* Check all significant obs. Note that "j" goes past the last obs in the "for()" loop, different from most obs loops in this program. This is so that the code beginning with "if (nequal > maxequal)" doesn't have to be duplicated below the "for()" loop. */ for (j = 0 ; j <= nobs ; j++) { if (j < nobs) { a = reca_obs1 + (j * 12) ; /* Skip non-significant obs and errors. */ if (!SIG_OBS2(a)) continue ; /* If the current obs data value is equal to the previous obs data value, increment "nequal" counter; and, only on the first time it is incremented, load "k" with "j"'s current value to retain the location of the second obs in the equal sequence. */ if (strncmp(a+4, lastval, 6) == 0) { if (++nequal == 1) k = j ; continue ; } /* When the current data value is different, save it in "lastval". */ strncpy(lastval, a+4, 6) ; } /* Arrive here if the current obs is significant AND the current obs data value is not equal to the previous, OR failing that, we're past the last obs. When "nequal" is "maxequal" or more, we assume there is a problem. "j" is the number of either the first obs after the equal sequence, or of the last obs. */ if (nequal > maxequal) { ret = 1 ; /* Changes will be made. */ /* Starting with the second obs in the equal sequence, flag the equal values. */ for ( ; k < j ; ++k) { a = reca_obs1 + (k * 12) ; /* Skip non-significant obs and errors. Otherwise place a 'T' in flag 2. */ if (SIG_OBS2(a)) *(a+11) = 'T' ; /* "Failed internal consistency check" */ } } nequal = 0 ; } /* Make sure the correct number of obs are in positions 28-30. */ if (nobs0 != nobs) { write_nobs (reca, nobs) ; ret = 1 ; } /* If changes have been made, change position 1 to 'M'. */ if (ret) *reca = 'M' ; return ret ; } /****************************************************************************** * Return 1 if "Gtypes[n]" is one of the standard maximum temperature data * types. Return 0 if not. */ int realmax(int n) { register int m ; m = get_data_type(Gtypes[n]) ; return (m == GNMXPN || m == GNSXYZ || m == GNTMAX) ; } /****************************************************************************** * Return 1 if "Gtypes[n]" is one of the standard minimum temperature data * types. Return 0 if not. */ int realmin(int n) { register int m ; m = get_data_type(Gtypes[n]) ; return (m == GNMNPN || m == GNSNYZ || m == GNTMIN) ; } /***************************************************************************** * An input record is too large. */ void rec_siz_err() { fprintf(Gfplog0, "\n\n%s%s%s\n%.30s\n%s\n%s\n", "Record in file ", Gfilnam, " that begins with", Ginbuf, "is initially too long, or the addition", "of edited obs during quality control will make it too long." ) ; exit(-1) ; } /****************************************************************************** * "Min" and "max" are a min-max record pair where the relationship between * data values for the same day should be min <= max. When the relationship is * instead min > max, take correction action. If "enable" is 1, replace the * min with the max. If "enable" is 2, replace the max with the min. If * "enable" is 0, mark both obs with a flag 2 = "T". No original data is * destroyed. */ void repair_minmax(char *min, char *max, int enable) { register char *a, *b ; register int i, j ; int minval, maxval, nobsmn0, nobsmn, nobsmx0, nobsmx, icomp ; char *min_obs1, *max_obs1 ; nobsmn0 = nobsmn = numobs(min) ; nobsmx0 = nobsmx = numobs(max) ; min_obs1 = min + 30 ; max_obs1 = max + 30 ; /* Loop through the obs in the min record. */ for (i = j = 0 ; i < nobsmn && j < nobsmx ; i++) { a = min_obs1 + (i * 12) ; /* In some cases we allow flag 2 indicating invalid data ("SIG_OBS1()" allows it) and in other cases we don't ("SIG_OBS2()" doesn't allow it). */ if (enable == 2) { if (!SIG_OBS2(a)) continue ; } else { if (!SIG_OBS1(a)) continue ; } /* Find an obs in the max record with the same day as the min obs. Same statement about allowing invalid data applies. */ icomp = 1 ; do { b = max_obs1 + (j * 12) ; if (enable == 1) { if (SIG_OBS2(b)) icomp = strncmp(a, b, 2) ; } else { if (SIG_OBS1(b)) icomp = strncmp(a, b, 2) ; } } while (icomp > 0 && ++j < nobsmx) ; if (icomp != 0) continue ; /* Arrive here if there are obs in both records with the same day. Make an integer version of each data value. */ sscanf(a+4, "%6d", &minval) ; sscanf(b+4, "%6d", &maxval) ; /* If the min value is greater than the max value, there is a problem. */ if (minval > maxval) { /* Different handling based on value of "enable". */ /* When "enable" == 0, both obs are flagged with "T" (failed internal consistency check). The obs may be either edited, or original that has no edited -- both are done the same way. */ if (enable == 0) { if (!ERR_FLG(a)) *(a+11) = 'T' ; /* min obs. */ if (!ERR_FLG(b)) *(b+11) = 'T' ; /* max obs. */ } /* When "enable" == 1, the min obs is treated as the bad obs. If that obs is edited, its data value is replaced with the value from the max obs. Otherwise, an edited obs is created using the value from the max obs. Put "EA" into the two flags. */ else if (enable == 1) { if (i == 0 || SIG_OBS1(a-12)) { mystrcpy(a+12, a) ; *(a+11) = '2' ; a += 12 ; nobsmn++ ; i++ ; } strncpy(a+4, b+4, 6) ; *(a+10) = 'E' ; *(a+11) = 'A' ; } /* When "enable" == 2, the max obs is treated as the bad obs. If that obs is edited, its data value is replaced with the value from the min obs. Otherwise, an edited obs is created using the value from the min obs. Put "EA" in the two flags. */ else if (enable == 2) { if (j == 0 || SIG_OBS1(b-12)) { mystrcpy(b+12, b) ; *(b+11) = '2' ; b += 12 ; nobsmx++ ; j++ ; } strncpy(b+4, a+4, 6) ; *(b+10) = 'E' ; *(b+11) = 'A' ; } } } /* If the number of obs in either record has changed, put in the correct number of obs. */ if (nobsmn0 != nobsmn) write_nobs(min, nobsmn) ; if (nobsmx0 != nobsmx) write_nobs(max, nobsmx) ; } /****************************************************************************** * Given PRCP or SNOW record 'rec' with 'nobs' observations. For PRCP, any * value >= 60 inches is flagged with flag2 == T. For SNOW, the same happens * for any value >= 100 inches. 'Ntyp' must be either 'GNPRCP' or 'GNSNOW'. * Return 1 on error and 0 if none. */ int scale_err_1(char *rec, int nobs, int ntyp) { int val, ret, toohuge ; register int i ; register char *a, *a0, *ap10 ; /* * Set value of 'toohuge', an integer, which will be compared against the * integer data values found in the record. The integer data values represent * floating point numbers: PRCP is in hundredths of inches, so that a data * value of 6000 actually means 60.00 inches; and SNOW is in tenths of inches, * so that 1000 actually means 100.0 inches. (See the TD-3200 documentation * for more information.) 'Ntyp' is either 'GNPRCP' or 'GNSNOW'. */ toohuge = (ntyp == GNPRCP) ? 6000 : 1000 ; ret = 0 ; a0 = rec + 30 ; /* * Look at each obs. */ for (i = 0 ; i < nobs ; i++) { a = a0 + (i * 12) ; ap10 = a + 10 ; /* * Skip non-significant obs and error-flagged obs. Also skip obs that are * any part of an accumulation. */ if (!SIG_OBS2(a) || *ap10 == 'S' || *ap10 == 'A' || *ap10 == 'B') continue ; /* * Convert the current obs data value starting at location 'a'+4 to an * integer in 'val'. If the conversion fails, take no action. If the * conversion succeeds and the integer in 'val' is very large, assume a * scaling error and set flag 2 to 'T'. 99999 is valid; it means the * data value is "missing". */ if (sscanf(a+4, "%6d", &val) && val >= toohuge && val != 99999) { *(a+11) = 'T' ; ret = 1 ; } } if (ret) *rec = 'M' ; /* Mark record for later reference. */ return ret ; } /****************************************************************************** * Given SNWD or WTEQ record 'rec' with 'nobs' observations. For SNWD, any * increase >= 100 inches is flagged with flag2 == 'T'. For WTEQ, the same * happens for any increase >= 10 inches. Return 1 on error and 0 if not. */ int scale_err_2(char *rec, int nobs) { int val ; register int i, bigdif, toohuge, val_init, ret, val_valid ; register char *a, *a0, *ap10 ; /* * Set values of 'bigdif' and 'toohuge'. These are integers and are compared * to the integer data values found in the record. SNWD is in whole inches, * so that 100 actually means 100 inches; but WTEQ is in tenths of inches, so * that 100 means 10.0 inches. (See the TD-3200 documentation for more * information.) */ bigdif = 100 ; toohuge = 600 ; /* * Other initializations. Each of these is likely to have a different value * on different iterations of the coming "for ()" loop. */ ret = 0 ; val_init = val_valid = 123456 ; /* * Loop to check each obs for errors. */ for (i = 0, a0 = rec + 30 ; i < nobs ; i++) { a = a0 + (i * 12) ; ap10 = a + 10 ; /* * Certain obs should be ignored: Non-significant obs, obs with errors, * obs that are any part of an accumulation, obs with value 99999, and obs * with non-numeric data values. The sign byte at 'a'+4 can be ignored * because negative values were ruled out earlier. */ if (!SIG_OBS2(a) || *ap10 == 'S' || *ap10 == 'A' || *ap10 == 'B' || !sscanf(a+5, "%5d", &val) || val == 99999) continue ; /* * If 'val_valid' is still at its initialized value.... */ if (val_valid == val_init) { /* * If the current data value is "too huge", there is a scale error: * Set 'err' to 1, set flag 2 to 'T', and set 'ret' to 0 to indicate * that an error occurred. This is not a sensitive test and will miss * many errors; but the better "big difference" test (below) cannot be * used if 'val_valid' is still at its initialized value. */ if (val >= toohuge) { *(a+11) = 'T' ; ret = 1 ; } /* * If the current data value passed the "too huge" test, ensure that * 'err' is at 0. Also, use the opportunity to set 'val_valid'. */ else { val_valid = val ; } } /* * Else, if 'val_valid' has changed from its initial value.... */ else { /* * "Big difference" test: Check the difference between the current * value 'val' and a smaller earlier value 'val_valid'. It is an * error if 'val' is larger than 'val_valid' by as much as 'bigdif'. * On error, set flag 2 to 'T' and 'ret' to 1. */ if (val - val_valid >= bigdif) { *(a+11) = 'T' ; ret = 1 ; } /* * If the current data value passed the "big difference" test, ensure * that 'err' is at 0. Also, use the opportunity to set 'val_valid'. */ else { val_valid = val ; } } } /* End of "for (i = 0 ; i < nobs ; i++)" */ if (ret) *rec = 'M' ; /* Mark record for later reference. */ return ret ; } /****************************************************************************** * If DYSW record "rec" has any observations in "000AA" format, convert them to * "0AA00". If a conversion happens, return 1. Otherwise, return 0. */ int shift_event_1(char *rec) { register char *p ; register int i ; int ret, nobs ; nobs = numobs(rec) ; ret = 0 ; rec += 30 ; for (i = 0, p = rec ; i < nobs ; p = rec + (++i * 12)) { if (p[6] == '0' && p[7] == '0' && (p[8] != '0' || p[9] != '0')) { p[6] = p[8] ; p[7] = p[9] ; p[8] = '0' ; p[9] = '0' ; ret = 1 ; } } return ret ; } /****************************************************************************** * Write 1 year-month of output data, except where the first character is * '\0', indicating a "deleted" record, */ void write_1_date() { register int i, j ; for (i = 0 ; i < 15 ; i++) { for (j = Gelems[i][0] ; j <= Gelems[i][1] ; j++) { if (Grec[j][0]) { ++(Gout[i]) ; fprintf(Gfpout, "%c%s\n", 'D', Grec[j]+1) ; } } } } /****************************************************************************** * Write the number of obs "n" into positions 28-30 of record "rec". */ void write_nobs(char *rec, int n) { char cnobs[4] ; sprintf(cnobs, "%3.3d", n) ; rec[27] = cnobs[0] ; rec[28] = cnobs[1] ; rec[29] = cnobs[2] ; } /*^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ *^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ * Group of general utility functions. These functions are not specific to * this application. Could be used anywhere! */ /****************************************************************************** * Return 1 if integer "yr" is a leap year and 0 if not. */ int is_leap_year(int yr) { return (yr % 4 == 0 && (yr % 100 != 0 || yr % 400 == 0)) ; } /****************************************************************************** * The primary purpose of "mygets()" is to read one record from an ASCII file * "iop", minus the record-terminating '\n' character, and to write the record * to "s". "Mygets()" is similar to ANSI "C" library function "gets()". * * Details: "Mygets()" reads a character from the file pointed to by "iop", * and writes it to address "s". A second character is read from "iop", and * is written to address "s"+1, and so forth, until one of the following events * happens: * (1) A '\n' character is read. That character is not written anywhere. * (2) A EOF character is read. That character is not written anywhere. * (3) A total of "n"-1 characters has been read and written. * Then reading stops. A terminating '\0' character is written to "s". * * The user must ensure that the buffer following "s" can contain "n" * characters. * * "Mygets()" returns the number of characters written to "s", not including * the terminating '\0'; this value is equivalent to "strlen(s)". If the user * intends to read all records in a file, he/she should continue calling * "mygets()" until a zero (0) is returned. * */ int mygets(register char *s, int n, FILE *iop) { register int i, c ; n-- ; i = 0 ; while ((c = getc(iop)) != EOF && c != '\n') { *(s++) = c ; if (++i >= n ) break ; } *s = '\0' ; return i ; } /****************************************************************************** * Sort, in ascending lexical order, "n" elements of array of character * pointers "acp". Each element of the array is a pointer that points to a * '\0'-terminated character string; the sort uses the contents of these * strings. */ void mycsort(register char *acp[], register int n) { register int i, j, k, comp ; register char *save ; for (i = 1 ; i < n ; i++) { for (save = acp[i], j = 0 ; j < i ; j++) { /* * Need to ask if 'save' is lexically less than 'acp[j]'. One might * think that "strcmp(save,acp[j]) < 0" would do this, but maybe not. * "Strcmp(a,b)" may return 0 when 'a' and 'b' are "equal" but one is * a shorter string than the other, and therefore they are not * lexically equal. Thus extra code using 'comp' and "strlen()" to * cover that chance. */ if ((comp = strcmp(save, acp[j])) < 0 || (comp == 0 && strlen(save) < strlen(acp[j]))) { for (k = i ; k > j ; k--) acp[k] = acp[k-1] ; acp[j] = save ; break ; } } } } /***************************************************************************** * This is a more robust version of library function "strcpy(a,b)", which by * the ANSI standard is NOT guaranteed to handle "overlap" properly (where "b" * is overwritten during the write to "a"). "Mystrcpy(a,b)" does everything * "strcpy(a,b)" does, but does overlaps correctly. */ void mystrcpy(register char *a, register char *b) { register int i ; /* The forward-directed copy works correctly if "a" <= "b". */ if ( a <= b ) { while (*(a++) = *(b++)) ; } /* The backward-directed copy works correctly if "a" > "b". */ else { i = strlen(b) ; a += i ; b += i++ ; while (i--) *(a--) = *(b--) ; } } /***************************************************************************** * Open directory "dirnam" for reading. */ DIR *opndir (char *dirnam) { DIR *df ; if ((df = opendir(dirnam)) == NULL) { printf( "Can't open directory %s\n", dirnam) ; } return df ; } /***************************************************************************** * Open a file "fil" in mode "m" and write an error message on error. Return * a FILE pointer to the file, or return a NULL pointer on failure. */ FILE *opnfil(char *fil, char *m) { FILE *fp ; if ((fp = fopen(fil, m)) == NULL) { printf("Can't open file %s for %s\n", fil, (strchr(m, 'r') != NULL) ? "read" : "write") ; } return fp ; } /****************************************************************************** * Print a message if too many bytes were read earlier. "filnam" is the * name of the file which was read, and "fp" is a pointer to the file where the * message will be printed. */ void read_err(int max, char *filnam, FILE *fp) { fprintf(fp, "File %s: Read more than %d bytes.\n", filnam, max) ; } /* Discussion of methods used ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- Pointers and pointer arithmetic are much used in this program, mainly to copy or move through character strings. I thought it a good idea to discuss these ideas below, which the reader is free to skip if the concepts are not new. ------------------------------------------------------------------------------- First, basic information: In a character string copy in COBOL or FORTRAN, the variables involved in a copy would be character variables. In FORTRAN, copying character string "source" to "target" is done with the expression target = source "C" code may look like this if a single character is copied. But a character string copy is typically done with a function like strcpy(target, source) ; "Strcpy()" is an ANSI "C" Standard library function. The function copies "source" to "target", one character at a time, stopping after a NULL character has been copied. "C" assumes that a NULL character is the last character of a character string. "Target" and "source" do not give any information about string length; they represent character addresses. Either may look like a variable name or like an algebraic expression. An address is a number that "C" interprets as a memory location. Like most numbers in a program, this number may be stored in a variable or it may be the result of an algebraic expression. Variables that contain addresses are called array or pointer variables. Arrays and pointers are similar. The biggest difference is that the address and space requirements of an array are fixed when it is declared, but for a pointer they can be changed at any time. In this program, the reader will often see a pointer used as an array, or used to go to different positions in an array. A slightly different copy is done by standard ANSI "C" library function strncpy(target, source, n) ; This function copies until a NULL character is copied, like "strcpy()", or until "n" (a positive integer) characters are copied, whichever occurs first. My preference is not to make a function call to copy one or two characters. So instead of strncpy(a, b, 2) ; I prefer the pointer arithmetic form *a = *b ; *(a+1) = *(b+1) ; or the equivalent array element number form a[0] = b[0] ; a[1] = b[1] ; All three methods do the same thing, but the last two methods are probably faster than the function call. The last two methods copy one character per line. ------------------------------------------------------------------------------- More about pointer arithmetic and character string copying: One form of character string copying that is much used in this program is the copy-overwrite, in which pointer arithmetic is rampant. Typical are lines of code like (1) mystrcpy(a, a+12) ; (2) mystrcpy(a+12, a) ; where "mystrcpy()" is a general-purpose function in this program. "a" and "a+12" are character addresses inside the same long character string, whose starting address is typically stored in some other variable that is not mentioned; "a+12" is 12 bytes to the right (toward higher addresses) of "a". "a-12" is 12 bytes to the left (toward lower addresses) of "a". Code line (1) copies, to "a", the character substring that begins at "a+12" and ends after the long string's NULL character. In effect, the 12 bytes between "a" and "a+11" are lifted out of the long string and discarded; and the right end of the long string is slid to the left to fill the gap. The long string is 12 bytes shorter. This trick removes a 12-byte observation in a TD-3200 record. Code line (2) copies, to "a+12", the character substring that begins at "a" and ends after the long string's NULL character. In effect, the right end of the long string, starting at "a+12", is slid to the right, leaving a 12-byte gap; and the 12 bytes between "a" and "a+11" are duplicated in the gap. The long string is 12 bytes longer. A new 12-byte observation in a TD-3200 record can be copied into the gap. [Charles Phillips] */