Another patch for the dilate/erode step method, still without any functional changes.
This time it keeps the general algorithm but uses the tile system to make it
multithreaded. I could not measure a speedup on my 2-core laptop, but I hope
it will be faster with more cores. The immediately visible improvement, though, is
that tiles appear as soon as they are calculated, so a dilate/erode node no longer
blocks until the whole image has been calculated.
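
To illustrate the idea (this is not the patch code, only a minimal Python sketch with hypothetical names such as dilate_tile and dilate_tiled, and it assumes a plain square window rather than the exact step kernel): each tile is computed independently from the unchanged input, reading up to `radius` pixels beyond its own border, and is written to the output as soon as its worker finishes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def dilate_tile(src, x0, y0, x1, y1, radius):
    """Dilate one tile; reads src up to `radius` pixels beyond the tile border."""
    h, w = len(src), len(src[0])
    out = []
    for y in range(y0, y1):
        row = []
        for x in range(x0, x1):
            # Clamp the square window to the image bounds.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            row.append(max(src[j][i] for j in ys for i in xs))
        out.append(row)
    return x0, y0, out

def dilate_tiled(src, radius, tile=2, workers=4):
    """Split the image into tiles and dilate them on a thread pool."""
    h, w = len(src), len(src[0])
    dst = [[0] * w for _ in range(h)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = [
            pool.submit(dilate_tile, src, x, y,
                        min(x + tile, w), min(y + tile, h), radius)
            for y in range(0, h, tile)
            for x in range(0, w, tile)
        ]
        # Tiles are copied into the result as soon as each one is done,
        # in completion order rather than submission order.
        for job in as_completed(jobs):
            x0, y0, block = job.result()
            for dy, row in enumerate(block):
                dst[y0 + dy][x0:x0 + len(row)] = row
    return dst
```

Erode would be the same with min instead of max. Because every tile only reads the input, the tiled result is identical to processing the whole image in one go. Note that pure-Python threads will not show a real speedup here; the sketch is only meant to show the structure.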
Till then, David.