About the Project
The volume validator tests for a variety of issues based on the most recent versions of the PDS Standards Reference, the Data Dictionary Document, and the data dictionary itself. In the limited cases where these documents are ambiguous, validation follows responses from the standards committee and/or the practices and findings of senior members of the PDS.
Validation Parts
The current validation process has two parts. The first is label validation, conducted using the product tools library, a collaborative effort between the User Center Technology (UCT) team at Ames and the Engineering Node (EN) at JPL. All label parsing and validation across EN and UCT tools should be consistent from this point on.
The second part validates the collection of files and folders that makes up a data set against the rules for assembling and relating these resources as defined by the PDS3 standard.
Local Data Dictionary Support
The validator supports local data dictionaries as follows. Any file in the document folder ending in ".ful" is treated as a dictionary. List values are merged; all other information is added or overwritten. There are currently no rules of precedence among multiple local data dictionaries, so do not count on a particular order if more than one local dictionary contains values for the same entry. Any changes or additions are applied when validating the containing data set.
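The merge rule above (list values merged, everything else added or overwritten) can be sketched as follows. This is only an illustration under assumed names: the dict-of-dicts layout, the field names, and the functions merge_definition and merge_dictionaries are hypothetical and are not taken from the validator or the product tools library.

```python
from copy import deepcopy


def merge_definition(base, local):
    """Merge one local dictionary entry into a base entry.

    List values are merged (order kept, duplicates skipped); every other
    field is added or overwritten by the local value.
    """
    merged = deepcopy(base)
    for field, value in local.items():
        if isinstance(value, list) and isinstance(merged.get(field), list):
            extra = [v for v in value if v not in merged[field]]
            merged[field] = merged[field] + extra
        else:
            merged[field] = deepcopy(value)
    return merged


def merge_dictionaries(master, local):
    """Return a new dictionary with the local entries merged into master."""
    result = deepcopy(master)
    for key, entry in local.items():
        result[key] = merge_definition(result.get(key, {}), entry)
    return result


# Hypothetical entries, for illustration only.
master = {
    "TARGET_NAME": {"type": "CHARACTER", "max_length": 30,
                    "values": ["MARS", "VENUS"]},
}
local = {
    "TARGET_NAME": {"max_length": 60, "values": ["VENUS", "TITAN"]},
    "MY_LOCAL_KEY": {"type": "INTEGER"},
}
merged = merge_dictionaries(master, local)
print(merged["TARGET_NAME"]["values"])  # ['MARS', 'VENUS', 'TITAN']
```

Because no precedence is defined between local dictionaries, applying two of them in a different order could give a different result for any non-list field they both set.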
Validation Process
The validation process works as follows. Clicking the validation button on the main validation page launches a Java applet on your computer. Once you browse to the root folder of a data set and click validate, the entire validation process is completed on your machine. During this process, status updates are sent to the validation web page to indicate progress. Once the validation is complete, the results are serialized and sent back to the server for display. The results are cached for the duration of your web session so that you may filter and modify the view of the results.
Issue List
The issues the validator checks for are listed below. Some issues the validator surfaces may not be listed here because they appear as a more generic error or bubble up as an exception. Every effort will be made to keep this list complete and up to date.
Generic Parsing Problems
- Too Many Tokens = Too many tokens were found for a statement. This may be a result of unquoted strings or similar errors.
- Bad Line Ending = Line ended with something other than carriage return followed by line feed (0x0D 0x0A).
- Missing ID = No id was found for the statement.
- Circular Reference = An include pointer resulted in a circular reference, for instance, pointing to itself.
- Line too Long = Line length exceeds recommended 78 chars (not including line ending chars, CRLF).
- Wrong Line Length = Line length does not match RECORD_LENGTH.
- Missing End Quote = Quoted text string was not terminated.
- Missing Object Terminator = Unable to find END_OBJECT statement for object.
- Missing Group Terminator = Unable to find END_GROUP statement for group.
- Missing End Statement = Labels must end in an END statement.
- Missing Comment Terminator = Comment was not terminated.
- Missing Record Bytes = Having a RECORD_TYPE of FIXED_LENGTH requires a value for RECORD_BYTES.
- Start Byte Of Attached Data Mismatch = The start byte found for attached data does not agree with the defined start byte.
- Possible Start Byte Of Attached Data Mismatch = The start byte found may not agree with the defined start byte if the data does not begin with white space.
- No Viable Alternative = Unable to parse statement at the given.
- Illegal Start of Statement = Unable to start a statement with the given token.
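As a rough illustration of two of the checks above, Bad Line Ending and Line too Long, the sketch below scans a label held as bytes. The function name check_label_lines and the returned tuple format are hypothetical; the real validator relies on the product tools library parser rather than anything this simple.

```python
def check_label_lines(data, max_length=78):
    """Return (line_number, issue_name) tuples for a label given as bytes."""
    issues = []
    lines = data.split(b"\n")
    # A trailing empty piece only means the data ended with a line feed.
    last_terminated = bool(lines) and lines[-1] == b""
    if last_terminated:
        lines.pop()
    for number, line in enumerate(lines, start=1):
        has_line_feed = number < len(lines) or last_terminated
        if has_line_feed and line.endswith(b"\r"):
            content = line[:-1]  # strip the CR; the LF is already gone
        else:
            content = line
            issues.append((number, "Bad Line Ending"))
        # Line length is measured without the CRLF terminator.
        if len(content) > max_length:
            issues.append((number, "Line too Long"))
    return issues


sample = (
    b"PDS_VERSION_ID = PDS3\r\n"
    + b'NOTE = "' + b"x" * 80 + b'"\r\n'
    + b"END\n"
)
print(check_label_lines(sample))
# [(2, 'Line too Long'), (3, 'Bad Line Ending')]
```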
Label Version Problems
- Missing Version = Could not find the PDS_VERSION_ID in the first line.
- Mislocated Version = PDS_VERSION_ID was not found on the first line.
- Version Present In Fragment = Label fragments should not contain a PDS_VERSION_ID.
- SFDU Present in Fragment = The label fragment should not contain an SFDU.
Key Problems
- Unknown Key = No definition was found for the key.
- Long Namespace = Namespace exceeds a max length.
- Long Identifier = Identifier exceeds a max length.
Value Problems
- Missing Value = No value was found for the given statement.
- Unknown Value = The given value is not in the list of valid values for the key. It may be that the value needs to be added to the dictionary.
- Invalid Value = The given value is not in the list of valid values for the key.
- Non Alphabetic = The value, restricted to alphabetic chars, contains non-alphabetic characters.
- Non Alphanumeric = The value, restricted to alphanumeric chars, contains non-alphanumeric characters.
- Invalid Characters = Found illegal characters. Only ASCII characters are allowed.
- Bad Double = A value expected to be a double could not be converted.
- Bad Integer = A value expected to be an integer could not be converted.
- Too Short = A value is less than its minimum length.
- Too Long = A value is longer than its maximum length.
- Exceeds Maximum = Exceeds maximum value.
- Less Than Minimum = Less than minimum value.
- Invalid Date = Could not cast value as date.
- Missing Date Parts = No year, month, or day found in date value.
- Extra Date Parts = Value has too many parts to be a date.
- Date Out Of Range = Date or time is out of range (e.g. 2/33/2009).
- Bad Year Length = Value for year is not 4 digits.
- Bad Month or Day Length = Month or day-of-year must be 2 or 3 digits in length.
- Bad Month Length = Month must be 2 digits in length.
- Bad Day Of Month Length = Day-of-month must be 2 digits in length.
- Bad Fractional Time Length = Fractional section of time must be 1 to 3 digits in length.
- Bad Time Section = Hours, minutes, and seconds of the date must be 2 digits in length.
- Illegal Character = Illegal character for value.
- Manipulated Value = The value is only valid when the case is changed or spaces are substituted with underscores.
- Type Mismatch = The value type is illegal for the given key.
- Bad Real = A value expected to be a real could not be converted.
- Signed Non Decimal = Non decimal values must not be signed.
- Bad Non Decimal = A value expected to be a non decimal could not be converted.
- Bad Non Decimal Radix = A non decimal value has an illegal radix. 2, 4, and 16 are the only radixes supported in the PDS.
- Unknown Units = Found units not defined in dictionary.
- Invalid Units = Units are invalid for the expected value type.
- Bad Value = A value was unable to be cast to a valid type.
- Placeholder Value = The value, "NULL", is intended as a placeholder and should be replaced before delivery.
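The date-related checks in this list can be pictured with a small sketch. It is a simplification that handles only the calendar form YYYY-MM-DD (no day-of-year form and no time portion), and the function name check_calendar_date is hypothetical; the order in which the real validator reports these issues is not specified here.

```python
import datetime


def check_calendar_date(value):
    """Return the issue names raised by a YYYY-MM-DD date string."""
    parts = value.split("-")
    if len(parts) > 3:
        return ["Extra Date Parts"]
    if len(parts) < 3 or not all(parts):
        return ["Missing Date Parts"]
    if not all(p.isascii() and p.isdigit() for p in parts):
        return ["Invalid Date"]
    year, month, day = parts
    issues = []
    if len(year) != 4:
        issues.append("Bad Year Length")
    if len(month) != 2:
        issues.append("Bad Month Length")
    if len(day) != 2:
        issues.append("Bad Day Of Month Length")
    if not issues:
        # Only test the calendar range once every part is well formed.
        try:
            datetime.date(int(year), int(month), int(day))
        except ValueError:
            issues.append("Date Out Of Range")
    return issues


print(check_calendar_date("2009-02-03"))  # []
print(check_calendar_date("2009-02-33"))  # ['Date Out Of Range']
print(check_calendar_date("2009-2-03"))   # ['Bad Month Length']
```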
Object / Group Problems
- Invalid Element = Object or group contains an element which is neither required nor optional.
- Missing Required Element = Object or group does not contain a required element.
- Invalid Object = Object or group contains an object which is neither required nor optional.
- Missing Required Object = Object or group does not contain a required object.
- Missing Required Group = Object or group does not contain a required group.
- Invalid Group = Object or group contains a group which is neither required nor optional.
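The required/optional logic behind these issues can be sketched as a pair of set comparisons. The definition layout and the function name check_elements below are hypothetical, and the element lists are illustrative rather than copied from the data dictionary. The same pattern would apply to child objects and groups.

```python
def check_elements(definition, found_elements):
    """Compare the elements found in an object with its definition.

    definition: dict with "required" and "optional" lists of element names.
    found_elements: element names present in the label object, in order.
    """
    required = set(definition["required"])
    allowed = required | set(definition["optional"])
    issues = []
    for name in sorted(required - set(found_elements)):
        issues.append(("Missing Required Element", name))
    for name in found_elements:
        if name not in allowed:
            issues.append(("Invalid Element", name))
    return issues


# Illustrative definition only; consult the data dictionary for real lists.
column_definition = {
    "required": ["NAME", "DATA_TYPE", "START_BYTE", "BYTES"],
    "optional": ["DESCRIPTION", "UNIT"],
}
print(check_elements(column_definition,
                     ["NAME", "START_BYTE", "BYTES", "COLOUR"]))
# [('Missing Required Element', 'DATA_TYPE'), ('Invalid Element', 'COLOUR')]
```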
Data Set Validation Errors
- Missing Referenced File = A referenced file was not found.
- Missing Label = A label file, defined in the index, was not found.
- Missing Catalog = A catalog file was not found.
- Un-Indexed Label = A label was not listed in an index.
- Illegal Indexed Label = A label that should not appear in the index was listed in the index.
- Unknown File = Found a file that is not defined by a label.
- Missing Required Folder = Missing a required folder.
- Missing Required File = Missing a required file.
- Missing Required Child = Missing a file required by the presence of a given folder.
- Bad Pointer Name = A label contains a pointer that does not follow pointer naming conventions.
- No Indexes = No index files were found.
- Column Number Mismatch = The number of columns found in a tabular data file does not match the number defined.
- Column Length Mismatch = The test value for a column does not match the byte length of the column definition.
- Column Type Mismatch = The sample value for a column was unable to be cast to the type defined for the column.
- Invalid Integer = The sample value for a column was invalid for the given type. It must be a 1, 2, or 4 byte signed integer.
- Invalid Date = The sample value for the column was not a valid date format. It must conform to the format YYYY-MM-DDThh:mm:ss.sss.
- Empty Directory = Folder contains no files.
- Empty File = File contains no data.
- Mismatched Case = Case of actual file path and described file path do not match.
- Column Length Mismatch = The column definition specified a given number of bytes, but fewer bytes remained on the test line after the specified start byte.
- Column Out Of Range = The column definition specified a given start byte which was beyond the end of the test line.
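The last two column checks come down to simple byte arithmetic on a test line. The sketch below assumes the start byte is 1-based, as START_BYTE is in PDS3 column definitions. The function name check_column_bounds is hypothetical and the logic is a guess at the general idea, not the validator's actual code.

```python
def check_column_bounds(line, start_byte, num_bytes):
    """Return an issue name, or None, for one column on one test line.

    line: the test line as bytes, without its line terminator.
    start_byte: 1-based position of the column's first byte.
    num_bytes: number of bytes the column definition declares.
    """
    offset = start_byte - 1  # convert to a 0-based index
    if offset >= len(line):
        return "Column Out Of Range"
    if offset + num_bytes > len(line):
        return "Column Length Mismatch"
    return None


line = b"0001 MARS"  # 9 bytes
print(check_column_bounds(line, 6, 4))   # None
print(check_column_bounds(line, 6, 8))   # Column Length Mismatch
print(check_column_bounds(line, 10, 2))  # Column Out Of Range
```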