2010.02.23 - Zaranek - Parallel Computing with Matlab
- Types of parallelism
- Task parallel
- parfor
- jobs and tasks
- Data parallel
- distributed
- spmd
- Matlab has a structure of:
- Client - standard instance - multithreaded
- Worker - by default singlethreaded
- Toolbox to do all of this fun stuff: Parallel Computing Toolbox
- Multicore processors
- "Simple" parallelization (Some toolboxes leverage the basic Matlab parallel functions)
- Optimization toolbox
- optimset(options,'UseParallel','always')
- Open up workers
- matlabpool open 2
- matlabpool close
- matlabpool size
- Use about 10x the number of workers for the measured overhead
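
The pool commands and the `UseParallel` option above could be combined roughly as follows. This is a minimal sketch, assuming the Optimization Toolbox is installed and using the 2010-era `matlabpool` syntax from these notes (later releases replaced it with `parpool`). The Rosenbrock objective, the bounds, and the choice of the `active-set` algorithm are illustrative assumptions, not something from the talk.

```matlab
% Open a pool of two local workers (2010-era syntax from the notes).
matlabpool open 2

% Ask the solver to estimate gradients in parallel on the pool.
% 'active-set' is an assumed choice of algorithm for this sketch.
options = optimset('Algorithm', 'active-set', 'UseParallel', 'always');

% Hypothetical objective: the Rosenbrock function.
rosen = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;

x0 = [-1; 2];            % starting point
lb = [-5; -5];           % lower bounds
ub = [ 5;  5];           % upper bounds

% Only bound constraints are used, so the linear and nonlinear
% constraint arguments are passed as empty.
[xOpt, fval] = fmincon(rosen, x0, [], [], [], [], lb, ub, [], options);

% Release the workers when done.
matlabpool close
```

For an objective this cheap the parallel overhead likely outweighs any gain; the option pays off when each function evaluation is expensive.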
- Parameter sweeps of ODEs
- parfor
- No nesting of parfor
- Automatic load balancing
- Variables are "sliced"
- Workaround for the two restrictions below is to wrap the section of code in a function:
- Cannot break or return
- Cannot introduce variables (eval, load, global)
- Some workarounds.
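
A parameter sweep of an ODE with `parfor`, as mentioned above, might look like the sketch below. The damped oscillator, the swept damping values, and all variable names are hypothetical choices for illustration. The pool commands use the 2010-era `matlabpool` syntax from these notes.

```matlab
% Hypothetical sweep: damped oscillator x'' + c*x' + k*x = 0,
% varying the damping coefficient c.
matlabpool open 2

cValues = linspace(0.1, 2, 20);    % damping coefficients to sweep
k = 5;                             % fixed stiffness, broadcast to all workers
minX = zeros(size(cValues));       % sliced output: one element per iteration

parfor i = 1:numel(cValues)
    c = cValues(i);                % cValues is sliced by the loop index
    odefun = @(t, x) [x(2); -c*x(2) - k*x(1)];
    [~, x] = ode45(odefun, [0 10], [1; 0]);
    minX(i) = min(x(:, 1));        % largest undershoot for this damping value
end

matlabpool close
```

Each iteration is independent and writes only to its own element of `minX`, which is what lets `parfor` slice the variables and balance the iterations across workers.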
- "Simple" parallelization (Some toolboxes leverage the basic Matlab parallel functions)
- Splitting data (data parallel)
- Client-side distributed arrays
- Computer clusters
- Leverage Amazon Elastic Compute Cloud (EC2)? See whitepaper
- findResource()
- distributed.rand()
- Distributes the data EVENLY across the nodes of the cluster
- methods(var)
- Shows all of the parallel operations that you can do on the distributed data
- gather(var)
- Bring back data from distributed array
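
The distributed-array workflow above (create, operate, gather) can be sketched as follows. This assumes an open pool and uses the 2010-era constructor `distributed.rand` from these notes; the array size and variable names are illustrative.

```matlab
matlabpool open 2

% Create a 1000-by-1000 random matrix whose data lives on the workers.
A = distributed.rand(1000);

% List the operations that are overloaded for distributed arrays.
methods(A)

% Operate on the distributed data; the result stays distributed.
colSums = sum(A, 1);

% Bring the (small) result back to the client as an ordinary array.
localSums = gather(colSums);

matlabpool close
```

Gathering only the reduced result, rather than `A` itself, keeps the data transferred to the client small.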
- spmd
- Single program, multiple data: a way to get finer-grained control
- labindex, numlabs
- gives you control of individual nodes
- MPI (message passing interface)
- labSend, labReceive, labSendReceive, labBroadcast
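
A small `spmd` block using `labindex`, `numlabs`, and `labBroadcast` might look like this. It is a sketch under the 2010-era names used in these notes (newer releases rename several of these functions). The broadcast value and variable names are made up for illustration.

```matlab
matlabpool open 2

spmd
    % Lab 1 chooses a value and broadcasts it; the other labs receive it.
    if labindex == 1
        sharedValue = labBroadcast(1, 42);
    else
        sharedValue = labBroadcast(1);
    end

    % Every lab runs the same code on its own data.
    localResult = sharedValue + labindex;
    fprintf('Lab %d of %d computed %d\n', labindex, numlabs, localResult);
end

% On the client, localResult is a Composite indexed by lab number.
fromFirstLab = localResult{1};

matlabpool close
```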
- Scheduling
- job = batch('matlabFile', 'FileDependencies',{'odesystem.m'})
- Send either files or path dependencies
- Wrapper for createJob
- Supported 3rd party schedulers: Torque, Platform, Windows HPC
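
The `batch` call above could be followed up roughly like this. It is a sketch assuming a script named `matlabFile.m` that depends on `odesystem.m`, as in the notes, and the 2010-era job functions (`destroy` was later replaced by `delete`).

```matlab
% Submit the script to the default scheduler and ship its dependency along.
job = batch('matlabFile', 'FileDependencies', {'odesystem.m'});

% Block until the job has finished running on the worker.
wait(job);

% Load the variables from the script's workspace into the client workspace.
load(job);

% Remove the job and its data from the scheduler (2010-era cleanup call).
destroy(job);
```

Because `batch` returns immediately, the client stays free for other work between the submission and the `wait` call.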