Floating Point Numbers

The term floating point is derived from the fact that there is no fixed number of digits before and after the decimal point; that is, the decimal point can float.

There are also representations in which the number of digits before and after the decimal point is set in advance; these are called fixed-point representations.
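As a rough illustration of the fixed-point idea, the sketch below stores values as integers scaled by 100, so exactly two decimal digits follow the point. The names SCALE, to_fixed and from_fixed are made up for this example and do not come from any library.

```python
from decimal import Decimal

SCALE = 100  # assumption for this sketch: two decimal digits after the point


def to_fixed(text):
    """Convert a decimal string such as '12.34' to a scaled integer (1234).

    Digits beyond the second decimal place are truncated.
    """
    return int(Decimal(text) * SCALE)


def from_fixed(value):
    """Format a scaled integer back into a string with two decimal places."""
    sign = "-" if value < 0 else ""
    whole, frac = divmod(abs(value), SCALE)
    return f"{sign}{whole}.{frac:02d}"


# Addition is plain integer addition; the point never moves.
total = to_fixed("12.34") + to_fixed("0.66")
print(from_fixed(total))
```

The position of the point is part of the convention (SCALE), not of the stored value, which is why the range is limited to whatever the integer can hold.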

Floating-point numbers are used in programming when a value cannot be stored as an integer, either because it has a fractional part or because its magnitude is too large or too small for the integer range of the machine. A computer stores numbers in registers of fixed width, so only a limited number of digits is available. Written out in full, very large numbers have long runs of zeros before the point and very small ones have long runs of zeros after it. The problem gets worse when decimal numbers are converted to binary, because each decimal digit needs more than three bits.

Any number that has digits after the radix point (in other words, a fractional part) cannot be stored as an integer and is usually represented as a floating-point number.

  • In the stored format the point is actually fixed, between the sign bit and the body of the significand
  • The exponent indicates the place value, that is, how far the point is shifted
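The split into a significand and an exponent can be inspected with Python's standard math.frexp and math.ldexp. This is only a sketch of the idea: frexp uses the convention that the point sits in front of the significand (0.5 <= m < 1), which is one possible convention and not necessarily the one a given hardware format stores.

```python
import math

# frexp splits x into (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
m, e = math.frexp(6.5)
print(m, e)

# ldexp is the inverse: it rebuilds the value from the two parts.
rebuilt = math.ldexp(m, e)

# Keeping the significand and changing only the exponent moves the point.
shifted = math.ldexp(m, 10)
print(rebuilt, shifted)
```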

Normalization

Floating-point numbers are usually normalized: the exponent is adjusted so that the leading digit (the most significant bit, MSB) of the significand is nonzero. In base 2 that leading bit is therefore always 1. A normalized significand satisfies:

1 <= significand < base
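The adjustment can be sketched as a small loop for base 2. The function name normalize is hypothetical, and the sketch assumes a positive, finite input; it is meant to show the idea, not how hardware performs the step.

```python
def normalize(x):
    """Return (significand, exponent) with 1 <= significand < 2
    and x == significand * 2**exponent.

    Illustrative only: assumes x is a positive, finite number.
    """
    if x <= 0:
        raise ValueError("this sketch handles positive numbers only")
    significand, exponent = x, 0
    while significand >= 2:   # too large: shift the point left
        significand /= 2
        exponent += 1
    while significand < 1:    # too small: shift the point right
        significand *= 2
        exponent -= 1
    return significand, exponent


print(normalize(6.5))      # 6.5     = 1.625 * 2**2
print(normalize(0.15625))  # 0.15625 = 1.25  * 2**-3
```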

Features

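  • The sign is stored in the first (most significant) bit of the word.
  • The value 127 is added to the true exponent before it is stored in the exponent field, i.e. the exponent is biased by 127 (the bias used by the 32-bit IEEE single-precision format).
  • The base is 2.
  • The first bit of the true significand of a normalized number is always 1, so there is no need to store it.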
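These features can be checked against a real value. The sketch below assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the field widths are not stated in the list above, and the function names are illustrative. It handles normalized numbers only.

```python
import struct


def float32_fields(x):
    """Split x, encoded as a 32-bit float, into (sign, stored_exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                      # first bit of the word
    stored_exponent = (bits >> 23) & 0xFF  # true exponent + 127
    fraction = bits & 0x7FFFFF             # significand without its leading 1
    return sign, stored_exponent, fraction


def float32_value(sign, stored_exponent, fraction):
    """Rebuild the value of a normalized number from its three fields."""
    significand = 1 + fraction / 2**23     # put the hidden leading 1 back
    return (-1) ** sign * significand * 2.0 ** (stored_exponent - 127)


# -6.5 = -1.625 * 2**2, so the stored exponent is 2 + 127 = 129.
fields = float32_fields(-6.5)
print(fields)
print(float32_value(*fields))
```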

In general, floating-point arithmetic is slower than fixed-point arithmetic, and for a given word size it offers fewer significant digits because some bits are spent on the exponent, but it can handle a much larger range of numbers.
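The trade-off between range and precision can be observed directly. This sketch assumes Python 3.9 or later (for math.ulp) and that Python's float is an IEEE 754 double, which holds on mainstream platforms but is an assumption here.

```python
import math
import sys

# The largest finite value shows the range on offer.
largest = sys.float_info.max
print(largest)

# The gap between neighbouring representable values grows with magnitude.
gap_near_one = math.ulp(1.0)
gap_near_1e16 = math.ulp(1e16)
print(gap_near_one, gap_near_1e16)

# Near 1e16 the gap is larger than 1, so adding 1.0 changes nothing.
lost = (1e16 + 1.0 == 1e16)
print(lost)
```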

Most floating-point values a computer stores are only approximations: the format has too few digits to hold the complete number, so the value has been rounded. One of the challenges in programming with floating-point values is ensuring that these approximations still lead to reasonable results.
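A short sketch of what such an approximation looks like in practice, assuming Python's float is an IEEE 754 double. Converting a float to Decimal reveals the exact value that was stored, and math.isclose is one common way to compare results within a tolerance.

```python
import math
from decimal import Decimal

# The exact value held for the literal 0.1 is close to, but not equal to, 1/10.
stored_tenth = Decimal(0.1)
print(stored_tenth)

# Because each operand is rounded, an exact comparison can fail...
exact_match = (0.1 + 0.2 == 0.3)

# ...so comparisons are usually made within a tolerance instead.
close_match = math.isclose(0.1 + 0.2, 0.3)
print(exact_match, close_match)
```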

If the programmer is not careful, small discrepancies in the approximations can snowball to the point where the final results become meaningless.

Because arithmetic with floating-point numbers requires a great deal of computing power, many microprocessors come with dedicated hardware, called a floating-point unit (FPU), specialized for performing floating-point arithmetic. When the FPU is a separate chip it is also called a math coprocessor or numeric coprocessor.
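The way small rounding discrepancies accumulate can be sketched by adding 0.1 one million times. An explicit loop is used so that every addition rounds separately, and math.fsum from the standard library is shown as one way to limit the drift. The sketch assumes IEEE 754 doubles.

```python
import math

values = [0.1] * 1_000_000

# Naive accumulation: each addition rounds, and the errors pile up.
naive = 0.0
for v in values:
    naive += v

# math.fsum tracks the lost low-order bits and rounds only once at the end.
accurate = math.fsum(values)

print(naive)
print(accurate)
print(naive - accurate)  # the accumulated drift
```

The drift here is small, but the same effect grows with the number of operations and with the difference in magnitude between the values being combined.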

See the IEEE 754 standard for floating-point arithmetic.