Alberto Ferrari writes many good blog posts on SQL and SSIS. In one of his older posts he describes what he considers a bug in the SSIS script component: the output buffer is marked as completed as soon as the ProcessInput method returns from the last row of the last buffer.
This is not a bug, and if you think about it, it's actually correct for the stream to be marked as completed once the component returns from processing it. When synchronous outputs are used, the output buffer is the input buffer, i.e. they're the same block of RAM. So when the component returns from processing the buffer, that buffer is passed on to the next downstream component. In other words, SSIS does not wait for other threads in your component to complete their work before passing the buffer on. Not only is this not a bug, it's actually a performance booster and, in general, a really good thing.
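To make the shared-buffer behaviour concrete, here is a small Python model of it. This is only a sketch of the idea, not SSIS code: a real script component is written in C# or VB.NET, and every name below (process_input_sync, release, the row layout) is made up for illustration. The event simply stands in for a worker thread that is still busy when the component returns.

```python
import threading

def process_input_sync(buffer, release):
    """Hypothetical synchronous component: starts a worker, then returns at once."""
    def worker():
        release.wait()  # stand-in for slow work that is still in progress
        for row in buffer:
            row["total"] = row["qty"] * row["price"]

    thread = threading.Thread(target=worker)
    thread.start()
    return thread  # in this model, nothing downstream waits on the thread

# One buffer object plays both roles: input buffer and output buffer.
buffer = [
    {"qty": 2, "price": 5.0, "total": None},
    {"qty": 3, "price": 1.5, "total": None},
]
release = threading.Event()
worker_thread = process_input_sync(buffer, release)

# The component has returned, so "downstream" reads the very same rows now.
downstream_view = [row["total"] for row in buffer]

# Only afterwards does the worker get to finish its calculation.
release.set()
worker_thread.join()
late_view = [row["total"] for row in buffer]

print(downstream_view)  # [None, None]
print(late_view)        # [10.0, 4.5]
```

The downstream read sees the rows before the worker has filled them in, which is the behaviour Alberto ran into.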
The solution to Alberto's problem is reasonably simple, and should be a rule for your SSIS development work:
Always use asynchronous outputs when writing threaded components
By doing so, you are in complete control of how your output buffers are issued to downstream components, and no rows will be passed downstream while a thread is still calculating values for that row.
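The same rule can be sketched in Python. Again, this is a model under assumptions rather than the SSIS API: process_input_async, compute_total and the separate output_rows list are hypothetical names, with output_rows standing in for an asynchronous output buffer that the component owns. The point is that a row is added to the output only once its worker has finished with it.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_total(row):
    """Work done on a worker thread; returns a new, finished row."""
    return {**row, "total": row["qty"] * row["price"]}

def process_input_async(input_rows, output_rows, pool):
    """Hypothetical asynchronous component: emits only completed rows."""
    # pool.map yields results in input order and blocks until each is ready,
    # so nothing is appended to the output while it is still being computed.
    for finished in pool.map(compute_total, input_rows):
        output_rows.append(finished)

input_rows = [{"qty": 2, "price": 5.0}, {"qty": 3, "price": 1.5}]
output_rows = []  # separate from the input, unlike the synchronous case

with ThreadPoolExecutor(max_workers=2) as pool:
    process_input_async(input_rows, output_rows, pool)

totals = [row["total"] for row in output_rows]
print(totals)  # [10.0, 4.5]
```

Because the output is a separate collection, the component decides when each row becomes visible downstream, and the input rows are never touched by the workers.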