2. Description of errors in data processing

Post date: 16-Jun-2014 19:55:54

Introduction:

Computers make errors because people give them faulty data or faulty instructions. This is captured by the acronym GIGO (Garbage In, Garbage Out). Errors in data processing can be classified as:

1) Transcription errors

2) Computation errors and

3) Algorithm errors

1) Transcription errors

Transcription errors occur during data entry. These errors include:

i) Misreading errors

ii) Transposition errors

i) Misreading errors

These errors are caused by the user reading the source document incorrectly and therefore entering wrong values, e.g. a user may enter 5 instead of S, or 0 instead of the letter O.

ii) Transposition errors

These errors occur when characters are entered in the wrong order, e.g. the user may enter 369 instead of 396.

However, transcription errors can be reduced by using data capture devices such as bar code readers, optical character readers, digital cameras and scanners, and by configuring the right data types in the database.
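To make the two kinds of transcription error concrete, here is a minimal sketch that compares a value on the source document with the value that was typed in and names the mismatch. The function name `classify_entry_error` and the sample values are assumptions made for illustration; a real system would more likely rely on double entry or check digits.

```python
def classify_entry_error(source: str, entered: str) -> str:
    """Compare a source value with the typed value and name the mismatch."""
    if source == entered:
        return "no error"
    if len(source) != len(entered):
        return "other"
    # Positions where the two strings disagree.
    diffs = [i for i, (a, b) in enumerate(zip(source, entered)) if a != b]
    # A transposition swaps two neighbouring characters.
    if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
            and source[diffs[0]] == entered[diffs[1]]
            and source[diffs[1]] == entered[diffs[0]]):
        return "transposition"
    # A single wrong character is treated here as a misreading.
    if len(diffs) == 1:
        return "misreading"
    return "other"


print(classify_entry_error("396", "369"))  # transposition
print(classify_entry_error("S12", "512"))  # misreading
```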

2) Computation errors

These are errors that occur when an arithmetic operation does not produce the expected result. They include overflow, underflow, truncation and rounding errors.

a> Overflow

An overflow error occurs when the result of a calculation is too large to be stored in the allocated memory space, e.g. if a value is stored in one byte (8 bits), an overflow will occur if the result of a calculation needs 9 bits.
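The 8-bit case can be sketched in Python. Python integers do not overflow on their own, so the helper `add_8bit` below (a hypothetical name) simulates an unsigned 8-bit store by keeping only the low 8 bits of the sum.

```python
def add_8bit(a: int, b: int) -> tuple:
    """Add two unsigned 8-bit values; return the stored result and an overflow flag."""
    true_sum = a + b
    overflowed = true_sum > 0xFF   # 255 is the largest value 8 bits can hold
    stored = true_sum & 0xFF       # keep only the low 8 bits
    return stored, overflowed


print(add_8bit(100, 27))   # (127, False)
print(add_8bit(200, 100))  # (44, True): the true sum 300 needs 9 bits
```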

b> Underflow

Underflow (also called "floating point underflow" or "arithmetic underflow") is a condition in a computer program where the result of a calculation is closer to zero than the smallest number the computer can actually store in memory.

Arithmetic underflow can occur when the true result of a floating point operation is smaller in magnitude (that is, closer to zero) than the smallest value representable as a normal floating point number in the target datatype. Underflow can in part be regarded as negative overflow of the exponent of the floating point value. For example, if the exponent part can represent values from −127 to 127, then a result with absolute value less than 2^−127 may cause underflow.
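As a small illustration, assuming Python's `float` is the usual 64-bit double (true on common platforms), a product whose true value is far below the smallest representable number is stored as zero:

```python
import sys

smallest_normal = sys.float_info.min  # smallest positive normal float
tiny = 1e-200

product = tiny * tiny                 # true result would be 1e-400
half = smallest_normal / 2            # below the normal range, stored with reduced precision

print(product)                        # 0.0
print(0.0 < half < smallest_normal)   # True
```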

c> Truncation

Truncation errors result from real numbers with a long fractional part that cannot fit in the allocated memory space. The computer truncates, or cuts off, the extra digits from the fractional part. For example, a number like 0.784969 can be truncated to three decimal places to become 0.784. The resulting number is not rounded off.
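The example can be reproduced with Python's `decimal` module, which avoids binary floating point surprises. The helper name `truncate` is an assumption for illustration.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(value: str, places: int) -> Decimal:
    """Cut off digits beyond `places` decimal places without rounding."""
    quantum = Decimal(10) ** -places  # places=3 gives Decimal("0.001")
    return Decimal(value).quantize(quantum, rounding=ROUND_DOWN)


truncated = truncate("0.784969", 3)
print(truncated)                        # 0.784
print(Decimal("0.784969") - truncated)  # 0.000969, the truncation error
```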

d> Rounding

Rounding errors result from raising or lowering a digit in a real number to obtain the required rounded number. For example, to round off 30.666 to one decimal place, we raise the first digit after the decimal point if its successor is 5 or more. In this case the successor is 6, so 30.666 rounded to one decimal place is 30.7. If the successor is below 5, e.g. 30.635, we round the number down to 30.6.
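The rule described here is commonly called "round half up". A minimal sketch using the `decimal` module follows; the helper name `round_half_up` is hypothetical. Python's built-in `round()` is not used because it rounds ties to the nearest even digit and works on binary floats.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: str, places: int) -> Decimal:
    """Round to `places` decimal places, raising the digit when the next one is 5 or more."""
    quantum = Decimal(10) ** -places  # places=1 gives Decimal("0.1")
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_half_up("30.666", 1))  # 30.7
print(round_half_up("30.635", 1))  # 30.6
print(round_half_up("30.65", 1))   # 30.7
```

The rounding error is the difference between the rounded and the original value, e.g. 30.7 − 30.666 = 0.034.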

3) Algorithm errors

An algorithm is a set of procedural steps followed to solve a given problem. Algorithms are used as design tools when writing programs. A wrongly designed algorithm results in a program that runs but gives erroneous output. Errors that result from wrong algorithm design are referred to as algorithm or logical errors.
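A short, made-up illustration of a logical error: both functions below run without crashing, but the first stops one item early and so returns a wrong total. The function names and the sample marks are assumptions for the example.

```python
def total_buggy(values):
    total = 0
    for i in range(len(values) - 1):  # logical error: skips the last item
        total += values[i]
    return total


def total_fixed(values):
    total = 0
    for v in values:                  # visits every item
        total += v
    return total


marks = [10, 20, 30]
print(total_buggy(marks))  # 30
print(total_fixed(marks))  # 60
```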

Data Integrity

Data integrity refers to the correctness and completeness of data entered in a computer or received from the information system. Integrity is measured in terms of the accuracy, timeliness and relevancy of data.

A> Accuracy

This is how close an approximation is to the actual value. For example, for a number like 34.247545, 34.2475 is more accurate than 34.2 because the deviation of the former from the actual value is much smaller than that of the latter.
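The deviation in this example can be computed as the absolute difference between the actual value and each approximation. The sketch below uses the `decimal` module so the printed differences are exact; the function name `deviation` is an assumption.

```python
from decimal import Decimal


def deviation(actual: str, approximation: str) -> Decimal:
    """Absolute difference between the actual value and an approximation."""
    return abs(Decimal(actual) - Decimal(approximation))


print(deviation("34.247545", "34.2475"))  # 0.000045
print(deviation("34.247545", "34.2"))     # 0.047545
```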

B> Timeliness

This is the relative accuracy of data with respect to the current state of affairs for which it is needed. Information should be available in time for decision making. For example, in a database used to prepare workers' salaries, information on workers' attendance over the month should be processed in time, before salaries are released each month.

C> Relevancy

Data entered into the computer must be relevant in order to get the expected output. It must meet the pertinent needs at hand and the requirements of the processing cycle.

Ways to minimize data integrity threats

  1. Back up data on secondary storage devices or on online storage services like Dropbox and Google Drive
  2. Control access to data by enforcing security measures
  3. Design user interfaces that minimize chances of invalid data entry
  4. Use error detection and correction software when transmitting data
  5. Use devices that capture data directly from the source, such as bar code readers, digital cameras and optical character readers
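As one possible illustration of error detection during transmission, the sketch below attaches a CRC-32 checksum to a message using Python's standard `zlib.crc32` and checks it on receipt. The helper names and the sample payload are assumptions; real transmission software may use different or stronger codes, and a checksum like this only detects errors, it does not correct them.

```python
import zlib


def attach_checksum(payload: bytes) -> tuple:
    """Return the payload together with its CRC-32 checksum."""
    return payload, zlib.crc32(payload)


def is_intact(payload: bytes, checksum: int) -> bool:
    """Recompute the checksum on receipt and compare it with the one sent."""
    return zlib.crc32(payload) == checksum


message, crc = attach_checksum(b"salary=396")
print(is_intact(message, crc))        # True
print(is_intact(b"salary=369", crc))  # False: the altered data is detected
```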

Quiz

  1. Define the following terms: a) data processing b) data processing cycle
  2. Using an illustration, describe the four primary stages of the data processing cycle.
  3. Outline the stages of data collection.
  4. You may have come across the term garbage in garbage out (GIGO). What is its relevance to errors in data processing?
  5. Explain the two types of transcription errors.
  6. State three types of computational errors.

a) Define the term data integrity.

b) Give three factors that determine the integrity of data.

c) State at least five ways of minimizing threats to data integrity.
