Parallelism

A large fraction of studies using Pythia follow the same general steps: first the Pythia object(s) and histograms are initialized. Then there is a loop where events are generated using Pythia::next. These events are subsequently analyzed one by one, and the resulting statistics are stored in histograms. Finally, the histograms are normalized and the results are plotted.

In this procedure, the events are usually generated independently of each other, so in principle they can be generated in parallel. The PythiaParallel class provides a simple interface for doing this. Objects of this class behave similarly to Pythia objects in that they can be configured using readString and readFile, and are initialized using init. The difference is that instead of having a next method that generates a single event, they have the method run, which generates a number of events in parallel and analyzes them using a user-defined callback function. The program flow using the parallelism framework is as follows:

  1. In addition to the standard header file, the parallel framework header must be included.
     
        #include "Pythia8/Pythia.h" 
        #include "Pythia8/PythiaParallel.h" 
    
  2. The next step is to create, configure, and initialize a parallel generator object. This process is identical to that for a standard Pythia object.
     
        PythiaParallel pythia; 
        pythia.readString("HardQCD:all = on"); 
        pythia.init(); 
    
  3. Finally, the PythiaParallel object is run with a user function which performs the necessary analysis on the events. In this example, an event multiplicity histogram is filled.
     
        Hist mult("mult", 100, -0.5, 799.5); 
        pythia.run(10000, [&](Pythia& pythiaNow) { 
          int nFinal = 0; 
          for (int i = 0; i < pythiaNow.event.size(); ++i) 
            if (pythiaNow.event[i].isFinal()) ++nFinal; 
          mult.fill( nFinal ); 
        }); 
    
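    Here the syntax [&](Pythia& pythiaNow) may be unfamiliar. This is a lambda function that takes a single argument, a reference to the Pythia instance that generated the event, and that captures by reference the variables it uses from the enclosing scope, here the mult histogram object.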
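The capture behaviour of such a lambda can be tried out without Pythia. The following is a minimal sketch in which forEachValue is a hypothetical stand-in for a method that invokes a callback once per item (it is not part of Pythia), and countEven passes it a lambda that takes one argument and modifies a local counter captured by reference.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a method that invokes a callback once per item.
void forEachValue(const std::vector<int>& values,
                  const std::function<void(int)>& callback) {
  for (int v : values) callback(v);
}

// Count the even values. The lambda takes one argument (v) and captures
// the local counter nEven by reference through [&].
int countEven(const std::vector<int>& values) {
  int nEven = 0;
  forEachValue(values, [&](int v) {
    if (v % 2 == 0) ++nEven;
  });
  return nEven;
}
```

Because nEven is captured by reference, the changes made inside the lambda are visible after forEachValue returns, in the same way that the mult histogram is filled from within the callback above.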

The scope of this class is to provide a lightweight way of parallelising simple Pythia studies. As such, it may not offer the necessary features to support more complicated use cases. The specific features are documented below, and examples are given in main161, main162, main163, and main204.

vector<long> PythiaParallel::run(long nEvents, function<void(Pythia&)> callback)  
vector<long> PythiaParallel::run( function<void(Pythia&)> callback)  
this is the main method of PythiaParallel, analogous to Pythia::next. The method generates nEvents events in parallel, distributing the tasks automatically to different Pythia instances. If nEvents is not specified, Main:numberOfEvents is used instead. The callback is a function that will be called when each event is generated. It receives the Pythia instance that generated the event as an argument, and the event can then be accessed through Pythia::event. If an event fails to generate successfully, it will not be passed to the callback. By default, callbacks are synchronized. That is, only one callback can be active at the same time, which means it is safe to e.g. write to histograms from within the callback. Asynchronous processing of callbacks can be enabled by setting Parallelism:processAsync = on (see below). Returns a vector<long> containing the number of events successfully generated by each thread. If any events fail to generate, the entries will sum to a number that is smaller than the requested number of events.
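As an illustration of how the return value might be used, the sketch below sums the per-thread counts and compares the total with the requested number of events. The helper names totalGenerated and hadFailures are hypothetical and not part of Pythia; they only operate on a vector<long> such as the one returned by run.

```cpp
#include <numeric>
#include <vector>

// Sum the per-thread event counts, e.g. the vector returned by run.
long totalGenerated(const std::vector<long>& perThread) {
  return std::accumulate(perThread.begin(), perThread.end(), 0L);
}

// True if fewer events were generated successfully than were requested.
bool hadFailures(const std::vector<long>& perThread, long nRequested) {
  return totalGenerated(perThread) < nRequested;
}
```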

bool PythiaParallel::init()  
bool PythiaParallel::init( function<void(Pythia&)> customInit)  
initializes each Pythia instance and returns whether the initialization was successful.
argument customInit : If specified, this function will be called for each Pythia instance after it has been constructed and its settings have been set, but before calling Pythia::init.

void PythiaParallel::foreach(function<void(Pythia&)> action)  
perform the specified action for each Pythia instance. This can be useful for doing custom finalization on each instance, e.g. combining histograms.

void PythiaParallel::foreachAsync( function<void(Pythia&)> action)  
as PythiaParallel::foreach, but the actions are performed for all Pythia instances in parallel.

double PythiaParallel::weightSum() const  
returns the sum of weights from all Pythia instances, as given by Info::weightSum().

double PythiaParallel::sigmaGen() const  
returns the weighted average over all Pythia instances of the generated cross section, as given by Info::sigmaGen().
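The text does not state which weights enter this average. The sketch below shows one plausible reading, assuming that each instance's cross section is weighted by that instance's sum of event weights. This choice of weights is an assumption, and weightedAverage is a hypothetical helper, not the Pythia implementation.

```cpp
#include <cstddef>
#include <vector>

// Weighted average of per-instance cross sections.
// Assumption: weights[i] is the sum of event weights of instance i.
double weightedAverage(const std::vector<double>& sigmas,
                       const std::vector<double>& weights) {
  double num = 0., den = 0.;
  for (std::size_t i = 0; i < sigmas.size() && i < weights.size(); ++i) {
    num += weights[i] * sigmas[i];
    den += weights[i];
  }
  // Return 0 if there is nothing to average over.
  return (den > 0.) ? num / den : 0.;
}
```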

The following settings are available for the parallelism framework.

mode  Parallelism:numThreads   (default = 0; minimum = 0)
Number of threads to run in parallel. If set to 0, the number of threads will be estimated using std::thread::hardware_concurrency; if the program is unable to determine the number of threads this way, initialization will fail.
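The fallback logic described here can be sketched as follows. The function resolveNumThreads is hypothetical and only mirrors the behaviour described above; it relies on the fact that std::thread::hardware_concurrency returns 0 when the number of hardware threads cannot be determined.

```cpp
#include <thread>

// Resolve the number of threads to use. A return value of 0 signals
// failure, i.e. the thread count could not be determined.
// The second argument defaults to the hardware estimate and is only
// exposed so that the logic can be exercised with fixed values.
unsigned resolveNumThreads(
    unsigned requested,
    unsigned detected = std::thread::hardware_concurrency()) {
  if (requested > 0) return requested;  // explicit setting takes priority
  return detected;                      // 0 if the estimate is unavailable
}
```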

mvec  Parallelism:seeds   (default = {})
The seeds to use for each Pythia object. If empty, Random:seed will be used, incrementing the seed by 1 for each object. If non-empty, it must have a number of entries equal to Parallelism:numThreads.
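A minimal sketch of the two rules above is given below. Both helpers are hypothetical. The sketch assumes a positive base seed and assumes that the first object receives the base seed itself, with each further object receiving the previous seed plus one; the exact offset used by Pythia is not specified in the text.

```cpp
#include <cstddef>
#include <vector>

// Default seeds when Parallelism:seeds is empty.
// Assumption: object i receives baseSeed + i.
std::vector<int> defaultSeeds(int baseSeed, int numThreads) {
  std::vector<int> seeds;
  for (int i = 0; i < numThreads; ++i) seeds.push_back(baseSeed + i);
  return seeds;
}

// A user-supplied seed list is acceptable if it is empty or has exactly
// one entry per thread.
bool seedsAreValid(const std::vector<int>& seeds, int numThreads) {
  return seeds.empty()
      || seeds.size() == static_cast<std::size_t>(numThreads);
}
```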

flag  Parallelism:processAsync   (default = off)
By default, PythiaParallel::run will generate events in parallel, which are then processed serially. The advantage of this serial kind of processing is that it prevents race conditions, which can occur for example if two threads are trying to simultaneously write to the same histogram. Normally, event generation is far more time-consuming than analysis, and the time loss from processing in serial is completely negligible. However, there are situations where the analysis takes a non-negligible amount of time compared to event generation. In such scenarios, the user may enable parallel processing of the analysis by setting Parallelism:processAsync = on. In this case, the user is responsible for preventing race conditions, possibly by using mutex and lock_guard objects. An example of how this can be done is shown in main163.cc.
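The locking pattern mentioned above can be sketched independently of Pythia. In the example below, LockedCounts is a hypothetical histogram stand-in (not the Pythia Hist class) whose fill method is protected by a std::mutex and std::lock_guard, and fillConcurrently fills it from several std::thread workers at once. It is meant only as an illustration of the technique; main163.cc remains the reference for use with PythiaParallel.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical histogram stand-in whose fill() is guarded by a mutex.
class LockedCounts {
public:
  explicit LockedCounts(std::size_t nBins) : counts_(nBins, 0) {}

  // Only one thread at a time can hold the lock and update the bins.
  void fill(std::size_t bin) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (bin < counts_.size()) ++counts_[bin];
  }

  // Total number of entries over all bins.
  long total() {
    std::lock_guard<std::mutex> lock(mutex_);
    long sum = 0;
    for (long c : counts_) sum += c;
    return sum;
  }

private:
  std::mutex mutex_;
  std::vector<long> counts_;
};

// Fill one shared histogram from several threads simultaneously and
// return the total number of entries.
long fillConcurrently(int nThreads, int fillsPerThread) {
  LockedCounts hist(10);
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t)
    workers.emplace_back([&hist, fillsPerThread, t]() {
      for (int i = 0; i < fillsPerThread; ++i) hist.fill((t + i) % 10);
    });
  for (std::thread& w : workers) w.join();
  return hist.total();
}
```

Without the lock_guard in fill, concurrent increments of the same bin could be lost, and the total would no longer be guaranteed to equal the number of fill calls.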

mode  Parallelism:index   (default = -1)
When a Pythia instance is passed to the callback function in PythiaParallel::run, this setting will contain an index that is unique to each instance, starting at 0 for the first instance. This index is particularly useful if Parallelism:processAsync = on. For example, if each instance writes to its own histogram, this index can be used to select which histogram to write to. Note that the user should not write to this setting directly.
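One way to use such an index is to keep one histogram per instance and merge them after the run, which avoids locking during the fill. The sketch below shows only the merging step, using plain vectors of bin counts as a stand-in for histograms. The Counts alias and mergeCounts are hypothetical names; in a real program the position in the outer vector would correspond to the value of Parallelism:index.

```cpp
#include <cstddef>
#include <vector>

// Bin contents of one histogram (stand-in for a real histogram class).
using Counts = std::vector<long>;

// Merge per-instance histograms bin by bin. Entry i of perInstance is
// assumed to have been filled only by the instance with index i.
Counts mergeCounts(const std::vector<Counts>& perInstance) {
  Counts merged;
  for (const Counts& c : perInstance) {
    if (c.size() > merged.size()) merged.resize(c.size(), 0);
    for (std::size_t i = 0; i < c.size(); ++i) merged[i] += c[i];
  }
  return merged;
}
```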

flag  Parallelism:balanceLoad   (default = off)
When this flag is on, all Pythia instances produce the same number of events, ±1 if the number of threads does not evenly divide the number of events. If the flag is off, each instance instead keeps generating events until the desired total has been reached. Running with the flag off can be significantly more efficient if the event generation time varies strongly between events (e.g. as it does in central vs. peripheral heavy ion collisions), but one disadvantage is that the number of events generated per instance is random and can vary between runs. Since each instance has its own random seed, this means that the final statistics might vary slightly between runs. Thus, this flag should be turned on to ensure that repeated runs generate exactly the same statistics, which can be useful for testing purposes. Even if this flag is on, events might still be processed in different orders between runs.
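The balanced distribution can be illustrated with a short sketch. The function balancedSplit is hypothetical; it assigns every thread the same number of events and hands the remainder, one event each, to the first few threads. Which instances receive the extra event in Pythia is not stated in the text, so that detail is an assumption.

```cpp
#include <vector>

// Split nEvents over nThreads so that the per-thread counts differ by at
// most one. Assumption: the first (nEvents % nThreads) threads get the
// extra event.
std::vector<long> balancedSplit(long nEvents, int nThreads) {
  std::vector<long> perThread;
  if (nThreads <= 0) return perThread;
  const long base = nEvents / nThreads;
  const long extra = nEvents % nThreads;
  for (int i = 0; i < nThreads; ++i)
    perThread.push_back(base + (i < extra ? 1 : 0));
  return perThread;
}
```

With a fixed split of this kind and fixed seeds, each instance generates the same sequence of events in every run, which is what makes the combined statistics reproducible.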