How are float and double stored?
Most C implementations follow the IEEE 754 standard for representing floating-point values in memory. Unlike an int, which is stored directly as a binary integer, a floating-point value is split into a sign, an exponent, and a mantissa, and these parts are stored together.
According to IEEE 754, the floating point values consist of 3 components:
- Sign Bit: This represents the sign of the number. 0 represents positive while 1 represents negative.
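- Biased Exponent: The exponent can be either negative or positive, so it is not stored directly. Instead, a fixed bias (127 for float, 1023 for double) is added to it so that the stored value is never negative.
- Normalized Mantissa: The mantissa holds the precision bits of the number. The value is written in binary scientific notation as 1.xxx x 2^e, and only the bits after the binary point are stored; the leading 1 is implied.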
C float Memory Representation
The size of the float is 32-bit, out of which:
- The most significant bit (MSB) is used to store the sign of the number.
- The next 8 bits are used to store the exponent.
- The remaining 23 bits are used to store the mantissa.
Example
Let’s take 65.125 as a decimal number that we want to store in the memory.
Converting to binary form, we get:
65 = 1000001
0.125 = 0.001
So, 65.125 = 1000001.001 = 1.000001001 x 2^6
Normalized mantissa = 000001001
Now, according to the standard, we get the biased exponent by adding the exponent to 127:
Biased exponent = 127 + 6 = 133 = 10000101
The sign bit is 0 (positive).
So, the IEEE 754 representation of 65.125 is:
0 10000101 00000100100000000000000
C double Memory Representation
The size of the double is 64-bit, out of which:
- The most significant bit (MSB) is used to store the sign of the number.
- The next 11 bits are used to store the exponent.
- The remaining 52 bits are used to store the mantissa.
Example
Let’s take the same number, 65.125.
From above, 65.125 = 1.000001001 x 2^6
Normalized mantissa = 000001001
Now, according to the standard, the bias is 1023. So,
Biased exponent = 1023 + 6 = 1029 = 10000000101
The sign bit is 0 (positive).
So, the IEEE 754 representation of 65.125 is:
0 10000000101 0000010010000000000000000000000000000000000000000000
C Float and Double
float and double are two primitive data types in C that are used to store numbers with a fractional part. Both store floating-point numbers, but they differ in the precision with which they can store a value.
In this article, we will study each of them in detail, their memory representation, and the difference between them.