bitops: provide an inline implementation of find_first_bit
find_first_bit has started to be used heavily in TCG code. The current
implementation based on find_next_bit is not optimal and can't be
optimized be the compiler if the bit array has a fixed size, which is
the case most of the time.
This new implementation does not use find_next_bit and is yet small
enough to be inlined.