NORMALISATION OF FLOATING POINT NUMBERS

Normalisation is the process of converting a number to the form 0.ddddEn (for example 0.1234E-4): a fraction whose first digit after the point is the most significant digit, followed by an exponent. Very large and very small numbers are more readable when normalised, because their approximate magnitude can be read directly from the exponent. For example, 0.1673E-23 is much more readable than 0.000000000000000000000001673.

Usually, floating point numbers with smaller exponent magnitudes are printed in ordinary fixed-point (unnormalised) format, and numbers with exponent magnitudes greater than about 7 are printed in normalised format, though many programming languages allow the programmer to choose the format.

Floating point numbers are always stored in normalised format, whatever their magnitude. This prevents unnecessary loss of significant digits when very large or very small numbers are being stored.

Normalisation rules:

If an unnormalised number is less than 0.1, shift it to the left by the number of zeros between the point and the most significant digit, and set the exponent to minus this number.

Example:
0.00001234 --> 0.1234E-4

If an unnormalised number is greater than or equal to 1.0, shift it to the right by the number of digits to the left of the point, and set the exponent to this number.

Examples:
1.234 --> 0.1234E1
1234567.89 --> 0.123456789E7

A number that is already at least 0.1 and less than 1.0 needs no shift, and its exponent is 0.

These rules are applicable in any number base.

EXAMPLES OF BINARY FLOATING POINT CALCULATIONS

In these examples, the representation given in lecture 7 is used. A form such as 1.1011E2 denotes the binary value 1.1011 multiplied by 2**2.

6.75 + 19.5 =

(2**2 + 2**1 + 2**(-1) + 2**(-2)) + (2**4 + 2**1 + 2**0 + 2**(-1)) =

2**2 * (2**0 + 2**(-1) + 2**(-3) + 2**(-4)) + 2**4 * (2**0 + 2**(-3) + 2**(-4) + 2**(-5)) =

1.1011E2 + 1.00111E4.

At this stage the numbers are in normalised binary floating point form, with a single one to the left of the point. Each number has positive sign, and so bit 31 is 0. The exponents are stored in excess-128 notation, and so are represented as 130 (10000010 binary) and 132 (10000100 binary) respectively. The one to the left of the point is always present in this notation (except for 0.0), and is therefore not stored explicitly.
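The storage scheme just described (sign in bit 31, an 8-bit excess-128 exponent, and a mantissa whose leading one is left implicit) can be sketched in Python. This is a minimal illustration, not code from the lecture: the function names `pack` and `show` are hypothetical, the field positions (exponent in bits 30 to 23, mantissa in bits 22 to 0) are an assumption read off the description above, and the sketch truncates instead of rounding and rejects 0.0.

```python
import math

EXCESS = 128          # exponent bias described in the text
MANTISSA_BITS = 23    # assumed: 32 bits minus the sign bit and 8 exponent bits

def pack(value):
    """Pack a non-zero float into the assumed 32-bit layout."""
    if value == 0.0:
        raise ValueError("0.0 has no leading one and needs a special pattern")
    sign = 1 if value < 0 else 0
    # math.frexp returns m, e with value = m * 2**e and 0.5 <= m < 1 ...
    m, e = math.frexp(abs(value))
    # ... so rescale to 1.0 <= mantissa < 2.0, the form used in the text.
    mantissa, exponent = m * 2.0, e - 1
    # Drop the implicit leading one and keep 23 fraction bits (truncating).
    fraction = int((mantissa - 1.0) * (1 << MANTISSA_BITS))
    stored_exponent = exponent + EXCESS
    if not 0 <= stored_exponent <= 255:
        raise OverflowError("exponent does not fit in 8 bits")
    return (sign << 31) | (stored_exponent << MANTISSA_BITS) | fraction

def show(word):
    """Format a 32-bit word as four space-separated bytes."""
    bits = format(word, "032b")
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))

print(show(pack(6.75)))   # 01000001 01011000 00000000 00000000
print(show(pack(19.5)))   # 01000010 00011100 00000000 00000000
```

Working from `math.frexp` avoids a hand-written normalisation loop; the only adjustment needed is the factor of two between its 0.5 to 1.0 mantissa range and the 1.0 to 2.0 range used here.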
The numbers are stored as

S<--exp--><-------mantissa-------->
01000001 01011000 00000000 00000000 (6.75), and

S<--exp--><-------mantissa-------->
01000010 00011100 00000000 00000000 (19.5).

To add the two numbers, it is necessary to separate each number's exponent and mantissa, and then adjust the two numbers so that their exponents are the same.

01000001 01011000 00000000 00000000 -->
0 10000010 10110000000000000000000 -->
0 10000010 1.10110000000000000000000 (6.75)

01000010 00011100 00000000 00000000 -->
0 10000100 00111000000000000000000 -->
0 10000100 1.00111000000000000000000 (19.5)

Now, add 2 to the exponent of the binary representation of 6.75, so that it matches the larger exponent, and shift the mantissa 2 bits to the right.

0 10000010 1.10110000000000000000000 -->
0 10000100 0.01101100000000000000000 (6.75)

The two mantissas can now be added.

  0.01101100000000000000000
+ 1.00111000000000000000000
  -------------------------
  1.10100100000000000000000

The exponent of each number is the same (10000100), and as there was no carry out of the position to the left of the point, the sum is already normalised and its exponent is the same.

To store the sum, simply remove the one to the left of the point and join together sign, exponent and mantissa, thus:

01000010 01010010 00000000 00000000

The sum can be converted into decimal as follows:

Note that the sign is positive, because bit 31 is 0.
Subtract 128 from the exponent to get the true exponent: 00000100 or 4.
Put back the one to the left of the point: 1.101001.
At this stage, sign = +, exponent = 4, mantissa = 1.101001.
Shift the mantissa 4 places (the value of the exponent) to the left. This gives the value of the result in binary: +11010.01.

11010.01 = 2**4 + 2**3 + 2**1 + 2**(-2) = 16 + 8 + 2 + 0.25 = 26.25

2.5 * 4.25 =

(2**1 + 2**(-1)) * (2**2 + 2**(-2)) =

2**1 * (2**0 + 2**(-2)) * 2**2 * (2**0 + 2**(-4)) =

1.01E1 * 1.0001E2 =

(1.01 * 1.0001)E3 =

((1 * 1.0001) + (0 * 0.10001) + (1 * 0.010001)) * 2**3 =

  1.0001
+ 0.010001
  --------
  1.010101 * 2**3 =

1010.101 = 2**3 + 2**1 + 2**(-1) + 2**(-3) = 8 + 2 + 0.5 + 0.125 = 10.625.
The internal representations of the numbers have been omitted here.
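The two worked examples can be replayed on bit patterns with a short Python sketch. It is an illustration under stated assumptions, not the lecture's own method: the names `unpack`, `repack`, `add`, `multiply` and `to_float` are hypothetical, the field positions are assumed to be sign in bit 31, exponent in bits 30 to 23 and mantissa in bits 22 to 0, only positive non-zero operands are handled, and bits shifted out are truncated, not rounded.

```python
EXCESS = 128
MANTISSA_BITS = 23
HIDDEN_ONE = 1 << MANTISSA_BITS      # the implicit one to the left of the point
FRACTION_MASK = HIDDEN_ONE - 1

def unpack(word):
    """Return (stored exponent, 24-bit mantissa with the leading one restored)."""
    exponent = (word >> MANTISSA_BITS) & 0xFF
    mantissa = (word & FRACTION_MASK) | HIDDEN_ONE
    return exponent, mantissa

def repack(exponent, mantissa):
    """Renormalise after a carry, drop the leading one, rejoin (sign = 0)."""
    while mantissa >= (HIDDEN_ONE << 1):   # mantissa >= 2.0: shift right
        mantissa >>= 1
        exponent += 1
    return (exponent << MANTISSA_BITS) | (mantissa & FRACTION_MASK)

def add(a, b):
    ea, ma = unpack(a)
    eb, mb = unpack(b)
    # Align: shift the mantissa belonging to the smaller exponent to the right.
    if ea < eb:
        ma >>= eb - ea
        ea = eb
    else:
        mb >>= ea - eb
    return repack(ea, ma + mb)

def multiply(a, b):
    ea, ma = unpack(a)
    eb, mb = unpack(b)
    # True exponents add, so one excess is removed from the stored sum.
    exponent = ea + eb - EXCESS
    # The raw product carries 46 fraction bits; keep the top 23 of them.
    mantissa = (ma * mb) >> MANTISSA_BITS
    return repack(exponent, mantissa)

def to_float(word):
    exponent, mantissa = unpack(word)
    return mantissa / HIDDEN_ONE * 2.0 ** (exponent - EXCESS)

six_75 = 0b01000001010110000000000000000000        # 1.1011  * 2**2
nineteen_5 = 0b01000010000111000000000000000000    # 1.00111 * 2**4
two_5 = ((EXCESS + 1) << MANTISSA_BITS) | (0b01 << 21)     # 1.01   * 2**1
four_25 = ((EXCESS + 2) << MANTISSA_BITS) | (0b0001 << 19)  # 1.0001 * 2**2

total = add(six_75, nineteen_5)
print(format(total, "032b"), to_float(total))   # ...01010010... 26.25
print(to_float(multiply(two_5, four_25)))       # 10.625
```

Keeping the mantissas as integers scaled by 2**23 makes the alignment shift and the product truncation explicit, which mirrors the hand calculation more closely than floating-point arithmetic on the host machine would.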