AWS Inferentia - AshokBhat/notes GitHub Wiki
AWS Inferentia
AWS Neuron SDK
- SDK to deploy ML inference on Amazon EC2 Inf1 instances
- Consists of a compiler, runtime, and profiling tools
- Pre-installed in AWS Deep Learning AMIs, AWS Deep Learning Containers, and Amazon SageMaker
- Can also be installed in your custom environment without a framework
AWS EC2 Inf1
- Up to 16 AWS Inferentia chips
- 2nd generation Intel Xeon Scalable processors
- Up to 100 Gbps networking
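To make "up to 16 chips" concrete, here is a minimal sketch that maps each Inf1 size to its chip count and derives the NeuronCore total. The size table and the figure of 4 NeuronCores per first-generation Inferentia chip come from AWS's published Inf1 lineup, not from these notes. The names `INF1_CHIPS` and `neuron_cores` are made up for illustration.

```python
# Inferentia chips per Inf1 instance size (from AWS's published Inf1 lineup;
# listed here as an illustration, not taken from these notes).
INF1_CHIPS = {
    "inf1.xlarge": 1,
    "inf1.2xlarge": 1,
    "inf1.6xlarge": 4,
    "inf1.24xlarge": 16,
}

# Each first-generation Inferentia chip exposes 4 NeuronCores.
NEURON_CORES_PER_CHIP = 4


def neuron_cores(instance_type: str) -> int:
    """Return the total NeuronCores available on an Inf1 instance size."""
    return INF1_CHIPS[instance_type] * NEURON_CORES_PER_CHIP


if __name__ == "__main__":
    for name, chips in INF1_CHIPS.items():
        print(f"{name}: {chips} chip(s), {neuron_cores(name)} NeuronCores")
```

The largest size, `inf1.24xlarge`, is the one that reaches the 16-chip ceiling.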
Machine learning workflow
- Build your model in one of the popular machine learning frameworks
- Use GPU instances such as P3 or P3dn to train your model
- Deploy your model on Inf1 instances by using the AWS Neuron SDK
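The deploy step can be sketched as follows, assuming the model was built in PyTorch and the `torch-neuron` package from the Neuron SDK is installed. The helper names `compile_for_inf1` and `load_on_inf1` and the default file name are hypothetical. Treat this as an outline of the compile-then-load flow rather than a complete deployment.

```python
def compile_for_inf1(model, example_input, out_path="model_neuron.pt"):
    """Compile a trained PyTorch model for Inferentia and save it to disk.

    Assumes torch and torch-neuron are installed; the imports are kept
    inside the function so this file can be read without them.
    """
    import torch
    import torch_neuron  # noqa: F401  (registers the torch.neuron namespace)

    model.eval()  # compile in inference mode
    # Trace the model with a representative input; the Neuron compiler
    # produces a TorchScript module targeting Inferentia.
    model_neuron = torch.neuron.trace(model, example_inputs=[example_input])
    model_neuron.save(out_path)
    return out_path


def load_on_inf1(path="model_neuron.pt"):
    """Load a previously compiled model on an Inf1 instance."""
    import torch
    import torch_neuron  # noqa: F401  (needed so Neuron ops can be resolved)

    return torch.jit.load(path)
```

On an Inf1 instance, usage would look roughly like `model = load_on_inf1(); output = model(example_input)`, with the same input shape that was used at compile time.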
See also
- Groq | Habana Labs | Graphcore
- Google TPU
- AWS EC2 | AWS Elastic Inference
- AWS Graviton | AWS Inferentia