I'm hoping someone can help me understand why the change described below improves performance.
I have a large table (10mil+ records), but I only need to pull out specific records for what I'm doing. To do this, I've set up a loop based on the clustered index, which is an artificial key. This subset is selected in a subquery along with my other conditions. It all looks something like...
WHERE rowid>@rowid AND rowid<=@rowid+@span
INNER JOIN TableA A ON BT.[Fields]=A.[Fields w/ Clustered Index]
CROSS APPLY FunctionB(BT.[Fields]) B
INNER JOIN TableC C ON B.[Field]=C.[Field w/ Clustered]
AND BT.[DateFields] BETWEEN C.[DateFields]
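Assembled, the batching pattern looks roughly like this (table, column, and function names are placeholders standing in for my real schema, and `<other conditions>` stands for the rest of my filters):

```sql
DECLARE @rowid BIGINT = 0;
DECLARE @span  BIGINT = 100000;   -- batch size per loop iteration

WHILE @rowid < @maxrowid
BEGIN
    SELECT ...
    FROM (
        SELECT *
        FROM BigTable
        WHERE rowid > @rowid AND rowid <= @rowid + @span
          AND <other conditions>
    ) BT
    INNER JOIN TableA A
        ON BT.SomeField = A.SomeField        -- A's clustered index key
    CROSS APPLY dbo.FunctionB(BT.SomeField) B
    INNER JOIN TableC C
        ON B.SomeField = C.SomeField         -- C's clustered index key
       AND BT.SomeDate BETWEEN C.StartDate AND C.EndDate;

    SET @rowid = @rowid + @span;
END
```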
When this is run on its own, the plan uses two hash matches (BT joined to A, then that result joined to B, the table the function produces). However, when I make one simple change to the subquery, adding SELECT TOP (@span), the plan switches to nested loops for the joins and performance improves at least 10x. Since the range predicate already guarantees the subquery can never return more than @span rows, the TOP cannot change the result set.
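For reference, the modified subquery is just this (names are placeholders for my real schema):

```sql
-- Same subquery with TOP added. The rowid range already limits the
-- batch to at most @span rows, so TOP (@span) never truncates results;
-- it only changes the plan the optimizer picks.
SELECT TOP (@span) *
FROM BigTable
WHERE rowid > @rowid AND rowid <= @rowid + @span
  AND <other conditions>
```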
I just want to understand why adding TOP (@span) has this impact.