Details
Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Currently we run dsdgen as a single thread to generate the entire data set,
usually when deploying a local load from data stored on the UM node.
We run it in multiple threads only when we have to
generate data and distribute the data source files across PM nodes, e.g. to deploy cpimport load modes m2 and m3.
It has been observed that preparing bigger data sets takes a long time, especially with Scale Factor 1000 and above and/or when all data is stored on the UM or on a few PMs.
TPC-DS data generation needs to be optimized for multi-core CPU environments.
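
One possible approach, sketched here only as an illustration and not as the fix that was actually applied: split the generation into chunks with dsdgen's -PARALLEL and -CHILD options and run one dsdgen process per chunk on the same node. The helper names (build_dsdgen_commands, run_all), the binary path and the output directory are assumptions made for this example; dsdgen is also assumed to be started from a directory where it can find its tpcds.idx file.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_dsdgen_commands(scale, out_dir, children, binary="./dsdgen"):
    """Return one dsdgen command line per chunk (hypothetical helper).

    Chunks are numbered from 1, matching dsdgen's -CHILD option.
    """
    base = [binary, "-SCALE", str(scale), "-DIR", out_dir]
    if children <= 1:
        # A single chunk needs no parallel flags.
        return [base]
    return [
        base + ["-PARALLEL", str(children), "-CHILD", str(child)]
        for child in range(1, children + 1)
    ]


def run_all(commands, run=subprocess.run):
    """Run all commands concurrently and return their exit codes in order.

    Threads are enough here: each one only waits on an external dsdgen
    process, so the CPU-bound work happens in the child processes.
    """
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        results = list(pool.map(run, commands))
    return [result.returncode for result in results]
```

For example, `run_all(build_dsdgen_commands(1000, "/data/tpcds", 8))` would start eight dsdgen processes, each writing its own chunk files into /data/tpcds. The chunk count would normally be chosen to match the number of cores available on the node, and a non-zero exit code in the returned list indicates a chunk that needs to be regenerated.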