
cluster-rnn

a distributed Torch7 RNN cluster over MPI

README.md (751B)


# word-rnn_with_mpit
Running a complex Torch7 RNN implementation over a cluster with MPIT.

Both word-rnn and MPIT are complex projects, but the key step is adding the
code from the core word-rnn script to the MPIT execution script. If the
variables match, the EAMSGD optimizer should be able to use the available
cluster to accelerate the training process.

Assuming all dependencies are installed, run the program like this:

    mpirun -np 11 -f ../machinefile th mlaunch.lua

Here `11` is the number of available cores in the cluster, `../machinefile`
points to the MPI machinefile, and `mlaunch.lua` is configured with the
specific Torch script you want to run.
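
If you would rather not hard-code the process count, it can be derived from the machinefile. The sketch below is only an illustration: `count_slots` is a hypothetical helper, not part of this repository, and it assumes an MPICH-style machinefile where each line is either `host` or `host:slots` (a bare host counts as one slot). Other MPI implementations use different hostfile formats, so adjust accordingly.

```shell
# Hypothetical helper (not part of this repo): sum the process slots
# listed in an MPICH-style machinefile.
# Assumption: each non-blank, non-comment line is "host" or "host:slots".
count_slots() {
    awk -F: '
        /^[[:space:]]*(#|$)/ { next }          # skip comments and blank lines
        { n += ($2 == "" ? 1 : $2) }           # bare host counts as 1 slot
        END { print n + 0 }                    # print 0 for an empty file
    ' "$1"
}

# Possible usage (commented out so the snippet has no side effects):
# mpirun -np "$(count_slots ../machinefile)" -f ../machinefile th mlaunch.lua
```

Whether every slot should be used as a training process depends on how `mlaunch.lua` assigns roles, so treat the computed number as an upper bound rather than a required value.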

To use or test the trained model, run:

    th sample.lua cv/some_checkpoint.t7 -gpuid -1