This superoptimizer compiles snippets of C code to their most efficient assembly implementation. The result is guaranteed to be the best implementation of the snippet, because the optimizer brute-forces the space of assembly implementations bottom-up. It is slow, though: producing a 6-line assembly output takes some hours of processing.
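To make the bottom-up brute-force idea concrete, here is a minimal sketch in Python. It is not the project's code: the five-instruction accumulator machine, the names `OPS`, `run`, and `superoptimize`, and the use of test inputs to check equivalence are all assumptions made for illustration. Candidates are enumerated shortest first, so the first program that matches the target on every test input is also a shortest one within this toy instruction set.

```python
from itertools import product

MASK = 0xFFFFFFFF  # emulate 32-bit wraparound arithmetic

# Hypothetical toy instruction set: one accumulator `a`, one read-only input `x`.
OPS = {
    "add a, x": lambda a, x: (a + x) & MASK,
    "sub a, x": lambda a, x: (a - x) & MASK,
    "shl a, 1": lambda a, x: (a << 1) & MASK,
    "neg a":    lambda a, x: (-a) & MASK,
    "not a":    lambda a, x: ~a & MASK,
}

def run(program, x):
    """Execute a sequence of instruction names; the accumulator starts as x."""
    a = x & MASK
    for op in program:
        a = OPS[op](a, x)
    return a

def superoptimize(target, tests, max_len=4):
    """Return a shortest program agreeing with `target` on all `tests`, or None.

    Programs are tried in order of increasing length (bottom-up), so the
    first match has minimal length. Agreement on test inputs is only
    evidence of equivalence, not a proof.
    """
    for length in range(max_len + 1):
        # len(OPS) ** length candidates at this length: the source of the slowness.
        for program in product(OPS, repeat=length):
            if all(run(program, x) == (target(x) & MASK) for x in tests):
                return list(program)
    return None
```

Under these assumptions, `superoptimize(lambda x: 3 * x, [0, 1, 2, 7, 0xFFFFFFFF])` finds a two-instruction program. The candidate count grows as `len(OPS) ** length`, which is why an exhaustive search over a real instruction set can take hours for only a handful of instructions.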
Source: GitHub
Web: Superoptimizer (pt-br)