Contents

  1. Parallel processing can be used to speed up calculations.
  2. The function must be costly enough to justify the messaging overhead.
  3. Take account of how work is being distributed.
  4. The normal rule of using vector operations still applies. (prefer .Q.fc)
  5. Parallelising on different functions/data is possible


Parallel processing can be used to speed up calculations

The function must be costly enough to justify the messaging overhead

When the function is fast, the overhead of passing the data to the threads outweighs the benefit of parallelising.
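A toy cost model makes the trade-off concrete. This is a Python sketch, not q, and every name and number in it is hypothetical: it assumes each item pays a fixed transfer cost on top of its share of the work.

```python
def serial_time(n_items, cost_per_item):
    """Time to process every item on one thread."""
    return n_items * cost_per_item


def parallel_time(n_items, cost_per_item, n_threads, overhead_per_item):
    """Work is divided across threads, but every item still pays a transfer cost.

    Assumption: the transfer cost is paid serially and is the same for each item.
    """
    return n_items * cost_per_item / n_threads + n_items * overhead_per_item


# Hypothetical units: 1000 items, 4 threads, transfer cost of 2 per item.
cheap_serial = serial_time(1000, 1)
cheap_parallel = parallel_time(1000, 1, 4, 2)
costly_serial = serial_time(1000, 100)
costly_parallel = parallel_time(1000, 100, 4, 2)

print(cheap_serial, cheap_parallel)    # cheap function: parallel is slower
print(costly_serial, costly_parallel)  # costly function: parallel is faster
```

Under this model, parallelising only pays off once the per-item cost exceeds overhead * s / (s - 1) for s threads.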

Take account of how work is being distributed

  • peach - With 2 slaves, assigns the items at 0,2,4,6.. to one thread and those at 1,3,5,7.. to the other.
  • .Q.fc - Cuts the vector into s equal-sized contiguous pieces, where s is the number of slaves, and passes each whole piece to a thread at once.
The differences in how peach and .Q.fc split work amongst threads.
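The two splits can be modelled in a few lines. This is a Python sketch of the distribution only, not the q implementation; `peach_split` and `fc_split` are hypothetical names, and how the remainder is handled when the vector does not divide evenly is an assumption.

```python
def peach_split(items, n_threads):
    """Round-robin: thread t receives items t, t+n, t+2n, ..."""
    return [items[t::n_threads] for t in range(n_threads)]


def fc_split(items, n_threads):
    """Contiguous pieces of near-equal size, one piece per thread.

    Assumption: any remainder is spread over the first pieces.
    """
    size, extra = divmod(len(items), n_threads)
    pieces, start = [], 0
    for t in range(n_threads):
        end = start + size + (1 if t < extra else 0)
        pieces.append(items[start:end])
        start = end
    return pieces


items = list(range(8))
print(peach_split(items, 2))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(fc_split(items, 2))     # [[0, 1, 2, 3], [4, 5, 6, 7]]
```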

Performance for semi-realistic data sizes

Conclusion

  1. each was consistently slow.
  2. peach was faster, except on data b.
  3. .Q.fc was slightly faster still, except on a.

Be careful when sending unbalanced workloads to peach or .Q.fc. It would be nice if they were smarter and batched the work into groups, queuing them to each processor.
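To see why balance matters, the per-thread load can be modelled for each split. This is a Python sketch with made-up costs, not a measurement; the function names are hypothetical, and the elapsed time is taken to be that of the busiest thread.

```python
def round_robin_loads(costs, n_threads):
    """Total cost per thread when items are dealt out 0,2,4.. / 1,3,5.. style."""
    return [sum(costs[t::n_threads]) for t in range(n_threads)]


def contiguous_loads(costs, n_threads):
    """Total cost per thread when the vector is cut into contiguous pieces.

    Assumption: pieces use ceiling division, so the last piece may be shorter.
    """
    size = max(1, -(-len(costs) // n_threads))
    return [sum(costs[i:i + size]) for i in range(0, len(costs), size)]


alternating = [10, 1] * 4           # expensive and cheap items interleaved
front_loaded = [10] * 4 + [1] * 4   # all the expensive items first

print(round_robin_loads(alternating, 2), contiguous_loads(alternating, 2))
print(round_robin_loads(front_loaded, 2), contiguous_loads(front_loaded, 2))
```

In this model each scheme has an input that leaves one thread with almost all the work, which is why neither is safe for unbalanced data.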

.Q.fc appears to use less memory, but this test isn't really representative. In real-world usage I've seen .Q.fc explode memory-wise, particularly when the vectors it passes to threads are just over a power of 2, as the memory manager then allocates basically double the space needed. peach is less likely to explode like that.
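The power-of-2 effect is simple arithmetic. This is a Python sketch assuming an allocator that rounds every block up to the next power of two, as described above; it ignores object headers, and `allocated_bytes` is a hypothetical helper.

```python
def allocated_bytes(requested_bytes):
    """Smallest power of two that is >= requested_bytes (requested_bytes >= 1)."""
    return 1 << (requested_bytes - 1).bit_length()


for requested in (1024, 1025):
    allocated = allocated_bytes(requested)
    print(requested, allocated, round(allocated / requested, 3))
```

A piece of exactly 1024 bytes fits its block, while one of 1025 bytes takes a 2048-byte block, so almost half the allocation is wasted.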

The normal rule of using vector operations still applies. (prefer .Q.fc)
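One way to picture the rule: a function that works on whole vectors is called once per piece when it receives contiguous pieces, but once per item when items are handed over individually. This is a Python sketch with hypothetical names, not q; the call counter stands in for per-call overhead.

```python
calls = {"n": 0}


def square_all(xs):
    """A 'vector' function: one call handles a whole list."""
    calls["n"] += 1
    return [x * x for x in xs]


def item_style(f, items):
    """One call per item, as when each item is passed on its own."""
    return [f([x])[0] for x in items]


def piece_style(f, items, n_pieces):
    """One call per contiguous piece, so f sees a whole vector each time."""
    size = max(1, -(-len(items) // n_pieces))  # ceiling division
    out = []
    for i in range(0, len(items), size):
        out.extend(f(items[i:i + size]))
    return out


items = list(range(8))

calls["n"] = 0
by_item = item_style(square_all, items)
item_calls = calls["n"]

calls["n"] = 0
by_piece = piece_style(square_all, items, 2)
piece_calls = calls["n"]

print(item_calls, piece_calls)  # 8 calls versus 2 for the same result
```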

Parallelising on different functions/data is possible
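The idea is to pair each function with its own data and hand the pairs to the workers. This is a Python analogue using the standard library thread pool, not q; `run_tasks` and the task list are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(tasks, n_threads=2):
    """Run each (function, data) pair on a worker and return results in task order."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(f, data) for f, data in tasks]
        return [fut.result() for fut in futures]


# Three unrelated jobs, each with its own function and its own data.
tasks = [(sum, [1, 2, 3]), (max, [4, 9, 2]), (sorted, [3, 1, 2])]
results = run_tasks(tasks)
print(results)  # [6, 9, [1, 2, 3]]
```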