Sunday, 8 April 2012

x86 code generation quality

People often claim that it is difficult to improve upon the code generated by a modern compiler using hand-written assembler. This blog post takes a look at the code generated by GCC for a simple function and finds that it implements impressive high-level optimizations but the generated code leaves a lot of room for improvement.

Consider the recursive integer Fibonacci function that may be defined in C like this:

int fib(int n) { return (n<2 ? n : fib(n-1)+fib(n-2)); }

Writing this function in assembler by hand we obtain the following 12 instructions:

fib:
 cmp eax, 2
 jl fin
 push eax
 add eax, -1
 call fib
 push eax
 mov eax, [esp+4]
 add eax, -2
 call fib
 add eax, [esp]
 add esp, 8
fin:
 ret

Benchmarking this solution we find that it takes 3.25s to compute fib(40).

With the -O2 option we find that GCC produces the following 25 assembler instructions:

_fib:
 push edi
 push esi
 push ebx
 sub esp, 16
 mov edi, DWORD [esp+32]
 cmp edi, 1
 jbe L4
 mov ebx, edi
 xor esi, esi
L3:
 lea eax, [ebx-1]
 mov DWORD [esp], eax
 call _fib
 sub ebx, 2
 add esi, eax
 cmp ebx, 1
 ja L3
 and edi, 1
L2:
 lea eax, [esi+edi]
 add esp, 16
 pop ebx
 pop esi
 pop edi
 ret
L4:
 xor esi, esi
 jmp L2

Benchmarking we find that the performance of this solution is indistinguishable from our assembler. So the compiler generated over twice as much code for no performance improvement. Moreover, our assembler solution was comprehensible whereas the output from the compiler includes a wider vocabulary of instructions (and, xor and lea) as well as a dramatic rearrangement. Specifically, the code generated by the compiler contains only a single recursive call rather than two recursive calls.

The transformation made by the compiler cleverly turned one of the two recursive calls into a loop. Disassembling the generated code by hand we can reconstruct the following algorithmically-similar C program:

int loop(int a, int n) { return (n<=1 ? a+n : loop(a+fib(n-1), n-2)); }
int fib(int n) { return loop(0, n); }

Compiling this to assembler by hand we obtain the following 15-instruction program:

fib3:
 push ebx
 push ecx
 mov ebx, 0
 mov ecx, eax
 jmp loop2

loop1:
 lea eax, [ecx-1]
 call fib
 add ebx, eax
 sub ecx, 2
loop2:
 cmp ecx, 1
 jg loop1

 lea eax, [ebx+ecx]
 pop ecx
 pop ebx
 ret

Note that we have made one non-trivial rearrangement to create this assembler. Specifically, the loop containing a branch has been rotated so that the branch is the loop and, therefore, the entry point to the loop is now in the middle of the code. This removes one of the two branches from a naive translation.

Whereas our previous assembler and the C code both took 3.25s to compute fib(40), this new assembler takes just 2.52s. These results imply that GCC is applying some productive high-level optimizations but their benefit is lost in poor code generation. Concretely, the compiler successfully replaced one call with a jmp and, consequently, removed some stack operations but the remaining code was too inefficient for this to pay off.

Interestingly, applying the same optimization to equivalent F# code reduces the running time from 4.44s to 2.64s and in OCaml from 3.21s to 2.77s. Here is the resulting code:

let rec loop a n =
  if n<2 then a+n else
    loop (a + fib(n-1)) (n-2)
and fib n = loop 0 n

Compiling with gcc -O3 produces far more code but the running time is reduced down to 2.01s. The most productive optimization in the case of this function appears to be to unroll the recursive calls once and simplify, in particular reusing the result of the two calls to fib(n-3), resulting in this program:

let rec fib = function
  | 0 | 1 as n -> n
  | 2 -> 1
  | 3 -> 2
  | n -> fib(n-2) + 2*fib(n-3) + fib(n-4)

This reduces the running time to just 0.72s in F# and 0.39s in OCaml.

8 comments:

saynte said...

Your assembly isn't comparable to GCC's as it (appears) to be generating code that uses the cdecl calling convention, as it fetches the arguments from the stack.

I think this would introduce at least one or two extra instructions in your assembly, and give a better comparison (C-compliant code to C-compliant code).

Flying Frog Consultancy Ltd. said...

This problem does not require any specific calling convention.

saynte said...

No the problem doesn't, but you wrote one solution in C (thus using a particular calling convention) then asked GCC to compile it; C does require a specific calling convention.

Since you're hand-coding your solution you can throw away that requirement, but GCC can't take such liberties.

Jules said...

GCC chould compile a top level wrapper around fib that uses cdecl, and use a better one to do the recursion.

Flying Frog Consultancy Ltd. said...

@Saynte: As Jules said, there is nothing to stop GCC from taking such liberties.

saynte said...

@Jules, you're right, I suppose they could do that. However, these issues are not the ones brought up in the article, which is little more than: Baha, look at GCC generate longer code with weird instructions.

@Flying Frog, or you could just tell it to pass by register (fastcall convention). It just doesn't give much confidence in your method to leave such fundamental details (ie, one program is written in C) out in the wind. If you're going to compare the quality of assembly that GCC generates, why not compare it to other C compilers? Why compare it to hand-rolled assmebly, F#, and OCaml? You could have tried it with Clang, although on my machine it does worse than GCC.

Samuel A. Falvo II said...

He has a point. The extra time taken by gcc's code could be due to cache effects (line spills/fetches), so replicating calling conventions is one detail that should be replicated just to eliminate the possibility.

That being said, I wish I could've seen your version of F#'s generated assembly. I suspect it would have yielded valuable insights.

I'm a big fan of hand-coded assembly over generated code, but today's compilers usually gets things right, or "right enough."

otr said...

Otr214428
INTERNATIONAL CONCEPT OF WORK FROM HOME
Work from home theory is fast gaining popularity because of the freedom and flexibility that comes with it. Since one is not bound by fixed working hours, they can schedule their work at the time when they feel most productive and convenient to them. Women & Men benefit a lot from this concept of work since they can balance their home and work perfectly. People mostly find that in this situation, their productivity is higher and stress levels lower. Those who like isolation and a tranquil work environment also tend to prefer this way of working. Today, with the kind of communication networks available, millions of people worldwide are considering this option.

Women & Men who want to be independent but cannot afford to leave their responsibilities at home aside will benefit a lot from this concept of work. It makes it easier to maintain a healthy balance between home and work. The family doesn't get neglected and you can get your work done too. You can thus effectively juggle home responsibilities with your career. Working from home is definitely a viable option but it also needs a lot of hard work and discipline. You have to make a time schedule for yourself and stick to it. There will be a time frame of course for any job you take up and you have to fulfill that project within that time frame.

There are many things that can be done working from home. A few of them is listed below that will give you a general idea about the benefits of this concept.

Baby-sitting
This is the most common and highly preferred job that Women & Men like doing. Since in today's competitive world both the parents have to work they need a secure place to leave behind their children who will take care of them and parents can also relax without being worried all the time. In this job you don't require any degree or qualifications. You only have to know how to take care of children. Parents are happy to pay handsome salary and you can also earn a lot without putting too much of an effort.

Nursery
For those who have a garden or an open space at your disposal and are also interested in gardening can go for this method of earning money. If given proper time and efforts nursery business can flourish very well and you will earn handsomely. But just as all jobs establishing it will be a bit difficult but the end results are outstanding.

Freelance
Freelance can be in different wings. Either you can be a freelance reporter or a freelance photographer. You can also do designing or be in the advertising field doing project on your own. Being independent and working independently will depend on your field of work and the availability of its worth in the market. If you like doing jewellery designing you can do that at home totally independently. You can also work on freelancing as a marketing executive working from home. Wanna know more, email us on workfromhome.otr214428@gmail.com and we will send you information on how you can actually work as a marketing freelancer.


Internet related work
This is a very vast field and here sky is the limit. All you need is a computer and Internet facility. Whatever field you are into work at home is perfect match in the software field. You can match your time according to your convenience and complete whatever projects you get. To learn more about how to work from home, contact us today on workfromhome.otr214428@gmail.com and our team will get you started on some excellent work from home projects.


Diet food
Since now a days Women & Men are more conscious of the food that they eat hence they prefer to have homemade low cal food and if you can start supplying low cal food to various offices then it will be a very good source of income and not too much of efforts. You can hire a few ladies who will help you out and this can be a good business.

Thus think over this concept and go ahead.