Here you can ask questions and find or give answers to organizational, academic and other questions about studying computer science.

0 votes

Is there a list of alternative ways to convert numbers to and from floating point format?

The method described in the chapter Floating Point Arithmetic always leads me to make mistakes.

Thank you in advance
in # Mandatory Modules Bachelor by (190 points)

1 Answer

+1 vote
Best answer

I think the best way to do this is the following, which does not require remembering many complicated formulas:

  1. Normalize the given number: halve or double it until the mantissa is between 1 and 2, while keeping it as a product with a power of 2. Example: 0.2 = 0.2*2^0 = 0.4*2^{-1} = 0.8*2^{-2} = 1.6*2^{-3}
  2. Convert the mantissa to radix 2: 1.6 = 1 + 1/2 + 1/16 + 1/32 + eps, i.e., 1.10011_2 + eps; if you need four bits after the point, compute five here (the additional fifth bit, the red one, is called the rounding bit).
  3. Consider the two representable numbers just below and just above the given number; in the example, these are 1.1001_2 * 2^{-3} and 1.1010_2 * 2^{-3}. Which one is nearer? This is easily seen as follows: if eps = 0 and the red digit is 1, we are exactly in the middle between the two representable numbers; otherwise, a red 1 points to the upper number and a red 0 to the lower one. In the example, the red digit is 1 and eps > 0, so the upper number is the nearer one. In the middle case, tie-breaking is needed, which is done according to the rounding mode.
  4. The rest depends on the rounding mode and on whether you have a special case (denormal number, overflow, etc.).
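
Steps 1 and 2 above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names `normalize` and `mantissa_bits` are made up for this post, the input is assumed to be positive, and exact fractions are used so that the remainder eps is not disturbed by the machine's own floating point rounding.

```python
from fractions import Fraction

def normalize(x):
    """Step 1: return (m, e) with x == m * 2**e and 1 <= m < 2 (x must be > 0)."""
    m, e = Fraction(x), 0
    while m >= 2:      # too large: halve the mantissa, raise the exponent
        m /= 2
        e += 1
    while m < 1:       # too small: double the mantissa, lower the exponent
        m *= 2
        e -= 1
    return m, e

def mantissa_bits(m, n):
    """Step 2: first n bits after the binary point of m (1 <= m < 2).

    Also returns the scaled remainder; it is > 0 exactly when eps > 0.
    """
    frac = m - 1       # drop the leading 1
    bits = []
    for _ in range(n):
        frac *= 2
        bit = int(frac >= 1)
        bits.append(bit)
        frac -= bit
    return bits, frac

m, e = normalize(Fraction(1, 5))     # 0.2 as an exact fraction
bits, rest = mantissa_bits(m, 5)     # four mantissa bits plus the rounding bit
print(m, e, bits, rest > 0)
```

For 0.2 this gives the mantissa 8/5 = 1.6 with exponent -3 and the bits 1, 0, 0, 1, 1, where the last bit is the rounding bit and the remainder is nonzero.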

Does this help? Read also the answer to

by (139k points)
Thank you. I think I understood everything except the rounding part.
Do I first round using epsilon and my rounding bit, and then round again with the given rounding mode (to nearest even, toward zero, ...)?
And what about the hidden bit? If I use a hidden bit, would I have 1.1001, and without a hidden bit just 1001?
For the rounding part, see the slides, in particular, slide 34. The red bit is r (called the rounding bit), and s (the sticky bit) is true whenever eps>0.
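
To make the roles of r and s concrete, here is a small Python sketch of the usual decision "do I add one to the last kept bit?". It is not taken from the slides; the function name `round_up` and the mode names are my own, and the rules are the commonly used ones for the four standard rounding directions.

```python
def round_up(lsb, r, s, mode="nearest_even", negative=False):
    """Decide whether to add 1 to the last kept mantissa bit.

    lsb: last kept bit, r: rounding bit, s: sticky bit (True iff eps > 0).
    The mantissa is treated as a magnitude; `negative` is the sign of the number.
    """
    inexact = bool(r or s)
    if mode == "nearest_even":
        # above the middle: up; exactly in the middle: up only if lsb is odd
        return bool(r and (s or lsb))
    if mode == "toward_zero":
        return False                      # plain truncation
    if mode == "toward_pos_inf":
        return inexact and not negative
    if mode == "toward_neg_inf":
        return inexact and negative
    raise ValueError("unknown rounding mode: " + mode)

# 0.2 with four mantissa bits: kept bits 1001, r = 1, eps > 0
print(round_up(lsb=1, r=1, s=True))       # True, so 1.1001 becomes 1.1010
```

So there is only one rounding step: r and s tell you where the exact value lies, and the rounding mode decides what to do with that information. If adding 1 carries out of the mantissa (1.1111 becoming 10.0000), the result has to be normalized again.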

Yes, with a hidden bit, you omit the 1 to the left of the binary point (and therefore you have to calculate one more bit for the mantissa).
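
A small Python sketch of what ends up in the stored mantissa field in both cases, assuming a field of four bits. The helper name `significand_field` is made up for this illustration, and it only truncates; rounding would still have to be applied afterwards as described above.

```python
from fractions import Fraction

def significand_field(m, width, hidden_bit):
    """Bits stored in a mantissa field of `width` bits for 1 <= m < 2 (truncated)."""
    # With a hidden bit the leading 1 is not stored, so one more bit is computed.
    total = width + 1 if hidden_bit else width
    rest = Fraction(m)
    digits = []
    for _ in range(total):
        bit = int(rest >= 1)
        digits.append(bit)
        rest = (rest - bit) * 2
    return digits[1:] if hidden_bit else digits

m = Fraction(8, 5)                                    # 1.6 = 1.10011..._2
print(significand_field(m, 4, hidden_bit=True))       # [1, 0, 0, 1]  stands for 1.1001
print(significand_field(m, 4, hidden_bit=False))      # [1, 1, 0, 0]  stands for 1.100
```

With the hidden bit, the same four stored bits carry five significant bits of the mantissa.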
