No no, don’t click away yet! You haven’t seen these before – this will cover the modern floating point instructions that use the XMM registers.
This article discusses the SSE floating point instructions in the context of x86_64 assembly language. SSE stands for “Streaming SIMD Extensions”, a group of instructions that work on lots of data at once. The subset we’ll be exploring today is SSE2. It was introduced with the Pentium 4 in 2000 and is part of the baseline x86_64 instruction set, so every 64-bit x86 CPU supports it. The SIMD part of the name stands for “Single Instruction, Multiple Data”: these instructions take in multiple values at once and do work on all of them. I will not be covering the full extent of that here.
Let’s laser focus on the assembly today. The best resource out there on this topic is the Intel Intrinsics Guide. It is great documentation, but you need a trained eye for it. Let’s jump right in!
Reading the Intrinsics Guide

The Intel Intrinsics Guide focuses mainly on documenting C and C++ wrappers, but it still provides us with valuable information. If you know some C/C++, the function definition gives some insight into what the instruction does, but that knowledge is not required.
On the left, you can filter for the kind of instruction you’d like. On the top right, you can filter by function name or instruction. Clicking on an instruction opens an interface below with details. An important one for an assembly dev is the Instruction field, which gives the exact assembly instruction used. The Description field describes how it works, and Operation gives a programmatic explanation.
The names of the SSE2 float instructions are quite obtuse, so let’s cover that now. Consider the divss instruction. This instruction works on any XMM register.
To break it down:
- div: division
- s: it works on Scalar values (just one at a time)
- s: it works on Single-precision floating point numbers (the normal 32-bit floats)
To put it all together, the divss instruction divides one single-precision floating point number by another.
Let’s consider movsd:
- mov: do a move, aka copy the value
- s: it is scalar, working on only one value
- d: it works on double-precision floating point numbers (64-bit floats)
Pretty straightforward right? It’s worth reading over these examples a couple of times because they are important context.
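To see the naming scheme at work, here is a small sketch that decodes a few more mnemonics the same way. It assumes NASM syntax, which is my guess based on the snippets later in this article, and the register choices are arbitrary.

```nasm
; Pattern: <operation> + s (scalar) + s or d (single or double precision)
divss  xmm0, xmm1   ; div  + s + s: xmm0 = xmm0 / xmm1, as 32-bit floats
divsd  xmm0, xmm1   ; div  + s + d: xmm0 = xmm0 / xmm1, as 64-bit floats
addss  xmm2, xmm3   ; add  + s + s: xmm2 = xmm2 + xmm3, as 32-bit floats
sqrtsd xmm4, xmm5   ; sqrt + s + d: xmm4 = sqrt(xmm5), as a 64-bit float
movss  xmm6, xmm7   ; mov  + s + s: copy the low 32-bit float of xmm7 into xmm6
```

Once the pattern clicks, you can usually guess what an unfamiliar scalar instruction does before looking it up.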
The Registers in Depth
The SSE2 float instructions work on the XMM registers. There are 16 of them, XMM0 to XMM15, and each is 128 bits wide. Today’s explanation will not delve into using the entire register; for the purposes of this article you can think of each one as holding a single float or a single double-float.
The calling convention (System V AMD64, used on Linux and macOS) passes floating point arguments to functions in XMM0 through XMM7, and the called function is free to modify any XMM register. XMM0 and XMM1 are used to return values.
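As a rough illustration of that convention, here is a minimal NASM sketch of a function that takes two doubles and returns one. The name sum_of_squares is made up for this example, and the code assumes the System V AMD64 ABI used on Linux and macOS.

```nasm
; double sum_of_squares(double a, double b)   (hypothetical example function)
; Assumes the System V AMD64 ABI: a arrives in xmm0, b arrives in xmm1,
; and the double result is returned in xmm0.
section .text
global sum_of_squares

sum_of_squares:
    mulsd xmm0, xmm0    ; xmm0 = a * a
    mulsd xmm1, xmm1    ; xmm1 = b * b (argument registers are ours to clobber)
    addsd xmm0, xmm1    ; xmm0 = a*a + b*b
    ret                 ; the result is already in xmm0
```

From C, this could be declared as double sum_of_squares(double a, double b); and linked against the assembled object file.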
SSE2 Floats Cheat Sheet
Name | Summary |
divsd | divide two double-floats |
movsd | move a double-float |
cvtsi2sd | convert an integer to a double-float |
cvtsd2si | convert a double-float to an integer |
comisd | compare two double-floats, setting EFLAGS, useful for branching |
addsd | add one double-float to another |
mulsd | multiply two double-floats |
subsd | subtract two double-floats |
Those are the most useful instructions I have found for working with x86_64 floats so far. They each have plenty of variants, so it’s always worth looking them up if you think you need something similar.
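To show how several of these fit together, here is a hedged sketch of a Celsius to Fahrenheit conversion that goes from integer to double and back. The function name c_to_f is hypothetical, and the code assumes NASM syntax and the System V AMD64 ABI (integer argument in rdi, integer result in rax).

```nasm
; long c_to_f(long celsius)   (hypothetical example function)
section .text
global c_to_f

c_to_f:
    cvtsi2sd xmm0, rdi     ; xmm0 = (double)celsius
    mov      eax, 9
    cvtsi2sd xmm1, eax     ; xmm1 = 9.0
    mulsd    xmm0, xmm1    ; xmm0 = celsius * 9.0
    mov      eax, 5
    cvtsi2sd xmm1, eax     ; xmm1 = 5.0
    divsd    xmm0, xmm1    ; xmm0 = celsius * 9.0 / 5.0
    mov      eax, 32
    cvtsi2sd xmm1, eax     ; xmm1 = 32.0
    addsd    xmm0, xmm1    ; xmm0 = celsius * 9.0 / 5.0 + 32.0
    cvtsd2si rax, xmm0     ; rax = result, rounded using the current rounding mode
    ret
```

Note that cvtsd2si rounds according to the current rounding mode (round to nearest by default). If you want C-style truncation instead, the variant to look up is cvttsd2si.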
Some important notes are:
- Each of these instructions takes one or more XMM registers.
- You cannot directly load an immediate float; you have to load it from the .data section, convert it from an integer, or obtain it some other way.
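Since there is no immediate form, here is a sketch of a few ways to get a constant into an XMM register. It assumes NASM syntax (including NASM's rel keyword for RIP-relative addressing), and the label two_and_a_half is made up for this example.

```nasm
section .data
two_and_a_half: dq 2.5                  ; hypothetical constant, stored as a 64-bit double

section .text
    movsd    xmm0, [rel two_and_a_half] ; 1) load it from memory: xmm0 = 2.5
    mov      eax, 3
    cvtsi2sd xmm1, eax                  ; 2) convert an integer: xmm1 = 3.0
    mov      rax, 0x3FF0000000000000    ; 3) the IEEE 754 bit pattern of 1.0 ...
    movq     xmm2, rax                  ;    ... copied bit for bit into xmm2
    pxor     xmm3, xmm3                 ; 4) all bits zero is 0.0
```

Loading from memory is the most general option. The integer conversion only works for whole numbers, and the bit pattern trick requires you to work out the encoding by hand.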
Example Snippets
mulsd xmm0, xmm0 ; xmm0 = xmm0 * xmm0

cvtsi2sd xmm1, rdi ; xmm1 = (double)rdi

; return normal_msg unless [normal_const] < xmm1
; (jb is also taken when either operand is NaN)
movsd xmm0, [normal_const]
ucomisd xmm0, xmm1
jb nm_end
lea rax, [normal_msg]
ret
nm_end:
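For a complete compare-and-branch example, here is a minimal sketch of a function that caps a double at an upper limit. The names clamp_max and limit are hypothetical, and the code assumes NASM syntax and the System V AMD64 ABI (argument and result in xmm0).

```nasm
section .data
limit: dq 100.0               ; hypothetical upper bound

section .text
global clamp_max

; double clamp_max(double x)   (hypothetical example function)
clamp_max:
    movsd   xmm1, [rel limit] ; xmm1 = limit
    ucomisd xmm0, xmm1        ; compare x with limit, setting ZF, PF and CF
    jp      .done             ; PF = 1 means unordered: x is NaN, return it unchanged
    jbe     .done             ; CF = 1 or ZF = 1 means x <= limit, return x
    movsd   xmm0, xmm1        ; otherwise x > limit, so return limit
.done:
    ret
```

The comparison sets the flags like an unsigned integer compare would, which is why the unsigned jumps (ja, jae, jb, jbe) are the ones to use after comisd and ucomisd. The jp check is written out here only to make the NaN case visible, since jbe would also be taken for an unordered result.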