Another optimization of tangent space calculation
Don't use quick sort for small arrays: bubble sort is much faster on them thanks to better cache coherency. This is similar to what qsort() from libc does. We could also experiment with unrolling the sort for very small arrays, for example 3 and 4 element arrays. This reduces tangent space calculation time for the dragon model from 3.1 sec to 2.9 sec.
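The idea described above can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions, not the code from this commit: it sorts plain int arrays rather than SEdge records keyed by channel, and the names hybrid_sort, bubble_sort, sort3, swap_int and SMALL_SORT_CUTOFF are hypothetical. The 3-element special case shows what the suggested unrolling experiment might look like.

```c
/* Hypothetical cutoff for this sketch; the commit uses 16 for its edge records. */
#define SMALL_SORT_CUTOFF 16

static void swap_int(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* Unrolled sort for exactly 3 elements: three compare-exchange steps. */
static void sort3(int *v)
{
    if (v[0] > v[1]) swap_int(&v[0], &v[1]);
    if (v[1] > v[2]) swap_int(&v[1], &v[2]);
    if (v[0] > v[1]) swap_int(&v[0], &v[1]);
}

/* Plain bubble sort over v[0..n-1], used only for small ranges. */
static void bubble_sort(int *v, int n)
{
    int i, j;
    for (i = 0; i < n - 1; i++) {
        for (j = 0; j < n - i - 1; j++) {
            if (v[j] > v[j + 1]) swap_int(&v[j], &v[j + 1]);
        }
    }
}

/* Quick sort over the inclusive range [left, right] that hands
 * small ranges to the simpler sorts instead of recursing further. */
static void hybrid_sort(int *v, int left, int right)
{
    int n = right - left + 1;
    int i, j, pivot;

    if (n < 2) return;
    if (n == 3) { sort3(v + left); return; }
    if (n < SMALL_SORT_CUTOFF) { bubble_sort(v + left, n); return; }

    /* Hoare-style partition around the middle element's value. */
    pivot = v[left + n / 2];
    i = left;
    j = right;
    while (i <= j) {
        while (v[i] < pivot) i++;
        while (v[j] > pivot) j--;
        if (i <= j) {
            swap_int(&v[i], &v[j]);
            i++;
            j--;
        }
    }
    if (left < j) hybrid_sort(v, left, j);
    if (i < right) hybrid_sort(v, i, right);
}
```

Note that each small-range branch returns once its range is sorted, so the quick sort partitioning never runs on it. The best cutoff value depends on element size and hardware, so it is worth measuring rather than assuming.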
@@ -1677,6 +1677,19 @@ static void QuickSortEdges(SEdge * pSortBuffer, int iLeft, int iRight, const int
        }
        return;
    }
    else if(iElems < 16) {
        int i, j;
        for (i = 0; i < iElems - 1; i++) {
            for (j = 0; j < iElems - i - 1; j++) {
                int index = iLeft + j;
                if (pSortBuffer[index].array[channel] > pSortBuffer[index + 1].array[channel]) {
                    sTmp = pSortBuffer[index];
                    pSortBuffer[index] = pSortBuffer[index + 1];
                    pSortBuffer[index + 1] = sTmp;
                }
            }
        }
    }

    // Random
    t=uSeed&31;