Generating a 64-bit Double on an AVR

I’m doing some floating-point operations on an AVR (using AVR Studio and WinAVR). It took me half a day to figure out what was going wrong, but apparently in avr-libc “double” is just another word for “single”: both are 32-bit types.

Single precision is just fine for the computations I’m doing, but to comply with a specified packet format I have to transmit the final computed values as IEEE 754-2008 64-bit doubles, and the conversion isn’t straightforward. After a fruitless search I’m working on my own function to convert the floats to the byte values they would have as doubles, but so far I’m not having much luck. Does anyone know of a library or some example code that would help with this?





In case anyone has a similar problem in the future (it could happen!), here’s an example for the ATmega324P. It takes a 32-bit double ‘float32’ and generates a 64-bit integer ‘float64’ containing the bits that would represent the same number as a 64-bit double. As a check, it sends float64 out one of the USARTs, MSB first:

#define F_CPU 7372800
#define UBRR 3//UBRR value for 115200 baud at 7372800 Hz

#include <avr/io.h>

union floatBytes{//view the same 32 bits as a float or as a raw integer
	double	f;//"double" is 32 bits in avr-libc
	unsigned long	b;
};

void USART1_Init(unsigned int ubrr){//Initialize USART hardware & settings for Serial Radio
	UBRR1H=(unsigned char)(ubrr>>8);//set baud rate
	UBRR1L=(unsigned char) ubrr;
	UCSR1B=(1<<TXEN1);//enable transmitter only
	UCSR1C=(1<<UCSZ10)|(1<<UCSZ11);// Set frame format: 8 data bits, 1 stop bit
}

void USART1_Trans(unsigned char data){//Transmit a byte of data over USART to the Serial Radio
	while(!(UCSR1A&(1<<UDRE1)));//wait for the transmit buffer to be empty
	UDR1=data;//load the byte into the transmit buffer
}

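unsigned long long doubleUp(union floatBytes float32){
	unsigned long long sign=((unsigned long long)(float32.b>>31))<<63;
	unsigned long fraction=float32.b&0x007FFFFF;
	int exponent=(int)((float32.b>>23)&0xFF);

	if((float32.b&0x7FFFFFFF)==0){//special case for +/- 0
		return sign;
	}
	if(exponent==0xFF){//special case for +/- infinity, NAN
		return sign|0x7FF0000000000000ULL|(((unsigned long long)fraction)<<29);
	}
	if(exponent==0){//subnormal single: normalize it, since it is a normal double
		exponent=1;
		while(!(fraction&0x00800000)){//shift until the leading 1 reaches bit 23
			fraction<<=1;
			exponent--;
		}
		fraction&=0x007FFFFF;//the leading 1 is implicit again
	}
	exponent+=1023-127;//re-bias: single uses a bias of 127, double uses 1023
	return sign|(((unsigned long long)exponent)<<52)|(((unsigned long long)fraction)<<29);
}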

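int main(){
	unsigned char i;
	double float32=-7.51e-32;//32 bits wide in avr-libc

	unsigned long long float64=doubleUp((union floatBytes)float32);//cast to union is a GCC extension

	USART1_Init(UBRR);
	for(i=0;i<8;i++){//send float64 out USART1, MSB first
		USART1_Trans((unsigned char)(float64>>(56-8*i)));
	}

	return 0;
}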
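
If you want to sanity-check the conversion before flashing, one option is to build the same bit manipulation on a PC and compare it against the compiler's own float-to-double widening. This is only a rough sketch: it assumes a host where float is 32-bit and double is 64-bit IEEE 754, and the names float_bits_to_double_bits and matches_native are made up for this example (they are not part of avr-libc). It re-biases the exponent and normalizes subnormal inputs, which the conversion needs in order to agree with a native double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side version of the conversion, using fixed-width types.
   Takes the raw bits of a 32-bit float, returns the raw bits of the same
   value as a 64-bit double. */
static uint64_t float_bits_to_double_bits(uint32_t b)
{
    uint64_t sign = (uint64_t)(b >> 31) << 63;
    uint32_t fraction = b & 0x007FFFFFu;
    int32_t exponent = (int32_t)((b >> 23) & 0xFFu);

    if ((b & 0x7FFFFFFFu) == 0)            /* +/- 0 */
        return sign;
    if (exponent == 0xFF)                  /* +/- infinity, NaN */
        return sign | 0x7FF0000000000000ULL | ((uint64_t)fraction << 29);
    if (exponent == 0) {                   /* subnormal float: normalize */
        exponent = 1;
        while (!(fraction & 0x00800000u)) {
            fraction <<= 1;
            exponent--;
        }
        fraction &= 0x007FFFFFu;           /* leading 1 becomes implicit */
    }
    exponent += 1023 - 127;                /* single bias 127 -> double bias 1023 */
    return sign | ((uint64_t)exponent << 52) | ((uint64_t)fraction << 29);
}

/* Returns 1 if the conversion agrees bit-for-bit with the host compiler's
   own float -> double widening for the value f. */
static int matches_native(float f)
{
    uint32_t in;
    uint64_t expected;
    double d = (double)f;

    assert(sizeof f == sizeof in);         /* assumption: 32-bit float */
    assert(sizeof d == sizeof expected);   /* assumption: 64-bit double */
    memcpy(&in, &f, sizeof in);
    memcpy(&expected, &d, sizeof expected);
    return float_bits_to_double_bits(in) == expected;
}
```

Calling matches_native(-7.51e-32f) with the test value from the AVR code, plus a few edge cases (zero, a subnormal, a very large value), should return 1 each time. NaNs are best left out of this comparison, since hardware may alter the payload when widening.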

