Assess completeness of data set

After data entry is complete, assess the completeness of the data set, since data are commonly missing in many fields. If many of the records have significant amounts of missing data, certain kinds of mathematical analysis will not be possible or may give misleading results. To help determine the completeness of the data set, flag each record once all possible information for it has been collected.
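As an illustration only, the following minimal Python sketch shows one way such an assessment could be made. The field names and the rule that a record is "complete" when every listed field is filled in are assumptions made for this example; in practice the flag may be set by hand once no further information can be obtained.

```python
def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def field_completeness(records, fields):
    """Fraction of records with a non-missing value, per field."""
    total = len(records)
    return {
        f: (sum(1 for r in records if not is_missing(r.get(f))) / total
            if total else 0.0)
        for f in fields
    }


def flag_complete(records, fields):
    """Set record["complete"] to True when every listed field is filled in."""
    for r in records:
        r["complete"] = all(not is_missing(r.get(f)) for f in fields)
    return records


# Hypothetical example records (field names are illustrative only)
records = [
    {"accession": "A1", "genus": "Gossypium", "latitude": 12.5},
    {"accession": "A2", "genus": "", "latitude": None},
]
fields = ["accession", "genus", "latitude"]
print(field_completeness(records, fields))
flag_complete(records, fields)
```

Fields with a low completeness fraction are candidates for exclusion from analyses that cannot tolerate missing values.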

Check for errors

It is important to check for errors that may have been entered into the database. Many errors can be avoided through thoughtful design of the system, e.g. validation on data entry, stored default values, lower case/upper case conversion and range checking of numeric data.
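The design measures listed above might look roughly like the following sketch. It is not taken from any particular documentation system: the function name, the field names (genus, latitude, longitude) and the way defaults are supplied are all hypothetical.

```python
def clean_record(raw, defaults=None):
    """Return (cleaned_record, errors) for one hypothetical passport record."""
    # Default values: start from the defaults, then overlay non-empty input.
    record = dict(defaults or {})
    record.update({k: v for k, v in raw.items() if v not in (None, "")})
    errors = []

    # Case conversion: genus names are capitalised ("gossypium" -> "Gossypium").
    if "genus" in record:
        record["genus"] = str(record["genus"]).strip().capitalize()

    # Range checking of numeric fields.
    limits = (("latitude", -90.0, 90.0), ("longitude", -180.0, 180.0))
    for field, low, high in limits:
        if field in record:
            try:
                value = float(record[field])
            except (TypeError, ValueError):
                errors.append(f"{field}: not a number")
                continue
            if not low <= value <= high:
                errors.append(f"{field}: {value} outside {low}..{high}")
            record[field] = value
    return record, errors
```

Returning the list of problems, rather than rejecting the record outright, lets the person entering the data correct the value while the source document is still at hand.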

A useful way of identifying typing errors is to sort (or index) the records on a particular field and then examine the data for that field. Errors then become much more obvious, e.g. Gossyppium instead of Gossypium. The process can be repeated on other fields or on a combination of fields. Duplicate specimens should be located by sorting on the relevant fields (e.g. accession number, collector’s number, other numbers associated with the accession). This is advisable because many collectors send duplicate sets of herbarium specimens to different international herbaria, and germplasm accessions are commonly duplicated.
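The sorting approach described above can be sketched in Python as follows. The first function simply lists the distinct values of a field in sorted order for visual inspection. The second goes a step beyond the text and uses the standard-library difflib.SequenceMatcher to point out adjacent values that are nearly identical; the 0.85 threshold is an arbitrary assumption. The third groups records that share the same key fields. All field names are hypothetical.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def sorted_value_counts(records, field):
    """Distinct values of one field in sorted order, with occurrence counts."""
    counts = Counter(r[field] for r in records if r.get(field))
    return sorted(counts.items())


def suspicious_neighbours(values, threshold=0.85):
    """Pairs of adjacent sorted values that are nearly, but not exactly, equal."""
    ordered = sorted(set(values))
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if SequenceMatcher(None, a, b).ratio() >= threshold
    ]


def find_duplicates(records, key_fields):
    """Group records that share the same values in all key fields."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r.get(f) for f in key_fields)].append(r)
    return {key: group for key, group in groups.items() if len(group) > 1}
```

A value that occurs only once and sits next to a frequent, similar value in the sorted list is a likely typing error, but each flagged pair still needs to be checked by someone who knows the material, since closely spelled names can both be valid.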

Errors in latitude and longitude data can be detected by mapping the localities: erroneous records show up as obvious outliers in impossible places.
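Where plotting a map is not convenient, a crude numeric stand-in is to test each record against the region in which collecting is known to have taken place. The sketch below assumes decimal-degree coordinates stored under the hypothetical field names latitude and longitude, and a rectangular bounding box supplied by the user; it does not replace inspection on a map.

```python
def coordinate_outliers(records, bbox):
    """Return records whose coordinates are invalid or fall outside bbox.

    bbox is (min_lat, max_lat, min_lon, max_lon) in decimal degrees for the
    expected collecting region (an assumption of this sketch).
    """
    min_lat, max_lat, min_lon, max_lon = bbox
    outliers = []
    for r in records:
        lat, lon = r.get("latitude"), r.get("longitude")
        if lat is None or lon is None:
            continue  # missing coordinates are skipped, not treated as outliers
        valid = -90 <= lat <= 90 and -180 <= lon <= 180
        inside = min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
        if not (valid and inside):
            outliers.append(r)
    return outliers
```

Records returned by this check should be compared with the written locality description before any coordinate is changed.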


Copyright © 1997, International Plant Genetic Resources Institute.