SSIS has been my midnight mistress for months now, and at the risk of sounding off-color I've compiled a short list of things that ought to be changed in the next release of SSIS:
1) The diagram auto-layout desperately needs to be rethought and replaced.
2) The Data Viewer should at least allow "live vs. page-at-a-time" modes, and showing bottleneck indicators on pipelines would be every bit as useful as showing the row count.
3) Manual tuning of buffers should either go away (replaced by an automated solution), or at the very least be augmented with debug helpers and visual feedback.
4) Lookup and SCD components need to be binned and replaced with components that add more flexibility in how lookup data is retrieved and how matches are performed (range-value matching would be a great start), that allow for multiple joins (lookup), and that enable sliding-window lookups (as opposed to partial caching).
5) Only the most useless component properties are exposed to the data-flow's expression engine, instead of the most useful (e.g. the Source's SQL Query, the Destination's MaxInsertCommitSize, etc.).
6) The overall rigidity, and resulting fragility, that the LineageID implementation brings to SSIS. Most other ETL packages allow you to disconnect and reconnect pipelines without a litany of "missing field" dialogs. There should at least be a "take your best guess" toggle button on a toolbar.
7) Overall UI layout is counter-productive. I want to be able to see and edit variables, expressions and key properties without a cluttered workspace. Ideally a call-out style popup when hovering over a component would do the trick.
8) Inability to change scope of existing variables without deleting and recreating them.
9) The merge and sort components. For real-world data warehousing, these two components are the ultimate pitfalls... both run out of steam long before swap-disk is depleted. The merge component in particular desperately needs to be rewritten so that it does not swamp downstream buffers. Sort needs to be rewritten to intelligently handle sub-sorting of streams (i.e. re-sorting a subset of already-sorted input), as well as to yield windowed sorts, which would enable a new class of downstream lookups.
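To make the range-value matching wished for in item 4 a bit more concrete, here is a minimal Python sketch of the idea. It is not SSIS code and does not use any SSIS API; the `RANGES` table, its band labels, and the `range_lookup` helper are all hypothetical, and it assumes the ranges are sorted by start value and do not overlap.

```python
from bisect import bisect_right

# Hypothetical lookup table: (range_start, range_end_exclusive, key).
# Assumed to be sorted by range_start, with no overlapping ranges.
RANGES = [
    (0, 100, "low"),
    (100, 500, "medium"),
    (500, 10_000, "high"),
]
STARTS = [start for start, _end, _key in RANGES]


def range_lookup(value):
    """Return the key of the range containing value, or None if nothing matches."""
    # Find the last range whose start is <= value.
    i = bisect_right(STARTS, value) - 1
    if i < 0:
        return None
    start, end, key = RANGES[i]
    # The candidate only matches if value also falls below the range's end.
    return key if start <= value < end else None
```

An incoming row's value is matched against a band rather than a single equal key, which is the kind of match the stock equality-only Lookup cannot express directly.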