AWS Inferentia

AWS Neuron SDK

  • SDK to deploy ML inference on Amazon EC2 Inf1 instances
  • Consists of a compiler, runtime, and profiling tools
  • Pre-installed in AWS Deep Learning AMIs, AWS Deep Learning Containers and Amazon SageMaker
  • Can also be installed in your custom environment without a framework

AWS EC2 Inf1

  • Up to 16 AWS Inferentia chips per instance
  • 2nd generation Intel Xeon Scalable processors
  • Up to 100 Gbps networking
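
To make the "up to 16 chips" figure concrete, here is a minimal Python sketch that picks the smallest Inf1 size for a given number of NeuronCores. The size table (chip and vCPU counts) and the four-NeuronCores-per-chip constant are written from memory of the published Inf1 line-up, so treat them as assumptions and check the current EC2 documentation. `Inf1Size` and `smallest_size_for` are hypothetical helper names, not part of any AWS library.

```python
from dataclasses import dataclass

# Assumption: each Inferentia chip exposes four NeuronCores.
NEURON_CORES_PER_CHIP = 4


@dataclass(frozen=True)
class Inf1Size:
    name: str
    chips: int
    vcpus: int

    @property
    def neuron_cores(self) -> int:
        return self.chips * NEURON_CORES_PER_CHIP


# Assumed Inf1 sizes, ordered from smallest to largest.
INF1_SIZES = [
    Inf1Size("inf1.xlarge", chips=1, vcpus=4),
    Inf1Size("inf1.2xlarge", chips=1, vcpus=8),
    Inf1Size("inf1.6xlarge", chips=4, vcpus=24),
    Inf1Size("inf1.24xlarge", chips=16, vcpus=96),
]


def smallest_size_for(cores_needed: int) -> Inf1Size:
    """Return the first (smallest) size with at least `cores_needed` NeuronCores."""
    for size in INF1_SIZES:
        if size.neuron_cores >= cores_needed:
            return size
    raise ValueError(f"no Inf1 size offers {cores_needed} NeuronCores")


if __name__ == "__main__":
    for size in INF1_SIZES:
        print(size.name, size.chips, "chips,", size.neuron_cores, "NeuronCores")
```

Under these assumptions the largest size tops out at 16 chips, matching the bullet above.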

Machine learning workflow

  • Build your model in one of the popular machine learning frameworks, such as TensorFlow, PyTorch, or MXNet
  • Train your model on GPU instances such as P3 or P3dn
  • Deploy your model on Inf1 instances by using the AWS Neuron SDK
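
The deploy step can be sketched for PyTorch as follows. This assumes the `torch-neuron` package from the Neuron SDK is installed; importing `torch_neuron` is what makes `torch.neuron.trace` available. The helper names `compile_for_inf1` and `load_on_inf1` and the default file name are illustrative only. The imports are done inside the functions so the file can still be loaded on a machine that does not have the SDK.

```python
def compile_for_inf1(model, example_input, out_path="model_neuron.pt"):
    """Compile a trained PyTorch model for Inferentia and save the result."""
    # Lazy imports: only needed when compilation is actually requested.
    import torch
    import torch_neuron  # noqa: F401  (registers the torch.neuron namespace)

    model.eval()
    # Ahead-of-time compile the model for the example input shape.
    model_neuron = torch.neuron.trace(model, example_inputs=[example_input])
    model_neuron.save(out_path)
    return out_path


def load_on_inf1(path="model_neuron.pt"):
    """Load a previously compiled model on an Inf1 instance."""
    import torch
    import torch_neuron  # noqa: F401  (needed so the Neuron ops are known)

    return torch.jit.load(path)
```

A typical use would be to call `compile_for_inf1(trained_model, torch.zeros(1, 3, 224, 224))` once after training, copy the saved file to the Inf1 instance, and then call `load_on_inf1()` there and run the returned model on inputs of the same shape.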

See also

  • Groq | Habana Labs | Graphcore
  • Google TPU
  • AWS EC2 | AWS Elastic Inference
  • AWS Graviton | AWS Inferentia