Floating point conversion

1 Answer

Best answer

I think the best way to do this is the following, which does not require remembering many complicated formulas:

Normalize the given number: halve or double it until the mantissa is at least 1 and less than 2, compensating with a power of 2 so that the value stays the same. Example: 0.2 = 0.2*2^0 = 0.4*2^{-1} = 0.8*2^{-2} = 1.6*2^{-3}
Convert the mantissa to radix 2: 1.6 = 1 + 1/2 + 1/16 + 1/32 + eps, i.e., 1.10011_2 + eps. If you need four fraction bits, compute five here; the additional fifth bit is called the rounding bit.
Consider the two representable numbers just below and just above the given number; in the example, these are 1.1001_2 * 2^{-3} and 1.1010_2 * 2^{-3}. Which one is nearer? This is easily seen as follows: if eps = 0 and the rounding bit is 1, we are exactly in the middle between the two representable numbers; otherwise, a rounding bit of 1 points to the upper number and a rounding bit of 0 to the lower one. The middle case needs tie-breaking, which is done according to the rounding mode. In the example, the rounding bit is 1 and eps > 0, so the nearest number is the upper one, 1.1010_2 * 2^{-3}.
The rest depends on the rounding mode and on whether you have a special case (denormal number, overflow, etc.).
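
The steps above can be sketched in Python as follows. This is only an illustrative sketch and not part of the original answer: the function names normalize, to_binary_float, and show are made up here, only positive inputs are handled, the rounding mode is assumed to be round-to-nearest with ties to even, and special cases (denormal numbers, overflow) are ignored. Exact fractions are used so that eps is not disturbed by the machine's own floating point rounding.

```python
from fractions import Fraction

def normalize(x):
    """Halve/double x until 1 <= mantissa < 2; return (mantissa, exponent)."""
    m, e = Fraction(x), 0
    if m <= 0:
        raise ValueError("this sketch only handles positive numbers")
    while m >= 2:
        m, e = m / 2, e + 1
    while m < 1:
        m, e = m * 2, e - 1
    return m, e

def to_binary_float(x, frac_bits=4):
    """Round x to 1.f * 2^e with frac_bits fraction bits (nearest, ties to even).

    Returns (kept, e, rounding_bit, eps), where kept is the integer whose
    binary digits are the leading 1 followed by the fraction bits.
    """
    m, e = normalize(x)
    scaled = m * 2 ** (frac_bits + 1)              # move fraction + rounding bit left of the point
    bits = scaled.numerator // scaled.denominator  # leading 1, fraction bits, rounding bit
    eps = (scaled - bits) / 2 ** (frac_bits + 1)   # remainder beyond the rounding bit
    kept, rounding_bit = bits >> 1, bits & 1
    # Round up if above the middle, or exactly in the middle with an odd last bit.
    if rounding_bit == 1 and (eps > 0 or kept & 1):
        kept += 1
    if kept == 2 ** (frac_bits + 1):               # mantissa became 10.000..., renormalize
        kept, e = kept >> 1, e + 1
    return kept, e, rounding_bit, eps

def show(kept, e):
    """Format the result in the notation used in the answer."""
    b = format(kept, "b")
    return f"{b[0]}.{b[1:]}_2 * 2^{e}"

kept, e, rounding_bit, eps = to_binary_float("0.2")
print(show(kept, e))       # 1.1010_2 * 2^-3
print(rounding_bit, eps)   # 1 1/160
```

For 0.2 the rounding bit is 1 and eps = 1/160 = 0.00625 is nonzero, so the sketch picks the upper neighbor, matching the reasoning above. Other rounding modes would only change the condition that decides whether kept is incremented.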

Does this help? See also the answer to https://q2a.cs.uni-kl.de/1697/conversion-to-resyfloat-if-x-1

answered Aug 22, 2020 by KS (171k points)
selected Aug 23, 2020 by akippler

Thank you. I think I understood everything except the rounding part.
Do I first round using epsilon and my rounding bit, and then round again with the given rounding mode (to nearest even, to zero, ...)?
And what about the hidden bit? If I use a hidden bit, would I have 1.1001, and without a hidden bit just 1001?

commented Aug 22, 2020 by akippler (190 points)
edited Aug 23, 2020 by akippler
