PlaidML - eiichiromomma/CVMLAB GitHub Wiki

(DeepLearning) PlaidML

マルチプラットフォームなDeep Learningフレームワーク。Metal対応なのでmacOSのAMDのGPUもそれなりの速さで使える。

導入(Anaconda)

配布元に書いてある通りにやればpipだけで導入できる。virtualenvのところだけcondaに置き換えて作業する。

conda create -n plaidml python=3.6
source activate plaidml
pip install plaidml-keras plaidbench

でセットアップをして使う計算資源を指定する。

plaidml-setup
PlaidML Setup (0.3.5)

Thanks for using PlaidML!

Some Notes:
  * Bugs and other issues: https://github.com/plaidml/plaidml
  * Questions: https://stackoverflow.com/questions/tagged/plaidml
  * Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
  * PlaidML is licensed under the GNU AGPLv3

Default Config Devices:
   No devices.

Experimental Config Devices:
   llvm_cpu.0 : CPU (LLVM)
   opencl_amd_amd_radeon_r9_m380_compute_engine.0 : AMD AMD Radeon R9 M380 Compute Engine (OpenCL)
   opencl_cpu.0 : Intel CPU (OpenCL)
   metal_amd_radeon_r9_m380.0 : AMD Radeon R9 M380 (Metal)

Using experimental devices can cause poor performance, crashes, and other nastiness.

Enable experimental device support? (y,n)[n]:

のように出てくる。LinuxだとDefault Config DevicesにGeforceが選択されたりするのだが,No devicesとなっているので,experimentalなデバイスを使うのでyでEnter。

Enable experimental device support? (y,n)[n]:y

Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:

   1 : llvm_cpu.0
   2 : opencl_amd_amd_radeon_r9_m380_compute_engine.0
   3 : opencl_cpu.0
   4 : metal_amd_radeon_r9_m380.0

Default device? (1,2,3,4)[1]:

どれを使うか聞いてくる。openclはmacOSだとパフォーマンスが落ちるのでmetalの4を使う。

Default device? (1,2,3,4)[1]:4

Selected device:
    metal_amd_radeon_r9_m380.0

PlaidML sends anonymous usage statistics to help guide improvements.
We'd love your help making it better.

Enable telemetry reporting? (y,n)[y]:

Almost done. Multiplying some matrices...
Tile code:
  function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.

Save settings to /Users/username/.plaidml? (y,n)[y]:
Success!

で上記のように匿名での情報提供の可否に回答した後で,設定を保存するか回答して設定完了。

テスト

plaidbenchが用意されている。

plaidbench keras mobilenet

結論から言うとMetalが一番速い。

Running 1024 examples with mobilenet, batch size 1
INFO:plaidml:Opening device "metal_amd_radeon_r9_m380.0"
(中略)
Example finished, elapsed: 0.6172780990600586 (compile), 11.181226015090942 (execution), 0.010919166030362248 (execution per example)
Correctness: PASS, max_error: 1.7974503862205893e-05, max_abs_error: 1.0952353477478027e-06, fail_ratio: 0.0

OpenCL+macOSだと

Running 1024 examples with mobilenet, batch size 1
INFO:plaidml:Opening device "opencl_amd_amd_radeon_r9_m380_compute_engine.0"
(中略)
Example finished, elapsed: 0.6245379447937012 (compile), 18.993228912353516 (execution), 0.01854807510972023 (execution per example)
Correctness: PASS, max_error: 1.7974503862205893e-05, max_abs_error: 1.0952353477478027e-06, fail_ratio: 0.0

llvmだと

Running 1024 examples with mobilenet, batch size 1
INFO:plaidml:Opening device "llvm_cpu.0"
(中略)
Example finished, elapsed: 4.430587291717529 (compile), 99.72919416427612 (execution), 0.0973917911760509 (execution per example)
Correctness: PASS, max_error: 1.7511049009044655e-05, max_abs_error: 6.556510925292969e-07, fail_ratio: 0.0

opencl_cpuだと

Running 1024 examples with mobilenet, batch size 1
INFO:plaidml:Opening device "opencl_cpu.0"
(中略)
Example finished, elapsed: 1.7653231620788574 (compile), 205.1401641368866 (execution), 0.20033219153992832 (execution per example)
Correctness: FAIL, max_error: 0.005533537361770868, max_abs_error: 0.00035209953784942627, fail_ratio: 0.076

macOSでGPUの状態を監視

アクティビティモニタで「ウインドウ」-「GPUの履歴」でグラフが出てくる