Parallelism is a great technique to improve the performance of your data integration operations. SSIS natively supports the parallel execution of packages, tasks, and transformations. The trick to successful parallelism is to configure operations within the constraints of your system resources.
Within SSIS, the control flow for each package is controlled by a setting called MaxConcurrentExecutables, which specifies the maximum number of SSIS threads that can execute in parallel per package. By default, this is set to -1, which translates to the number of logical machine processors plus 2.
If SSIS runs on a dedicated server and you have a lot of operations that run in parallel, you will likely want to increase this setting if some of the operations do a lot of waiting for external systems to reply. On the other hand, if you do not have a dedicated SSIS machine and your data integration application runs alongside several other applications, you may need to reduce this setting to avoid resource conflicts.
As you design packages for parallelism, you need to decide whether to run some or all of the operations in the package in parallel. As with buffer sizing decisions, decisions about parallelism are best made when you take into account available system resources.
Consider the tradeoffs of different design approaches that apply parallelism to a package that reads data from a source database, aggregates the data four different ways, and then loads each aggregated data set into a different destination table.
Note: This information was gathered during a 64-bit performance study at the Unisys Center of Excellence using Unisys ES7000 servers. For more detailed information regarding this study, please see the ETL Performance white paper at www.unisys.com