AWS Inferentia
AWS Neuron SDK
- SDK to deploy ML inference on Amazon EC2 Inf1 instances
- Consists of a compiler, runtime, and profiling tools
- Pre-installed in AWS Deep Learning AMIs, AWS Deep Learning Containers, and Amazon SageMaker
- Can also be installed in your custom environment without a framework
AWS EC2 Inf1
- up to 16 AWS Inferentia chips
- 2nd generation Intel Xeon Scalable processors
- up to 100 Gbps networking
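
To make "up to 16 chips" concrete, here is a small sketch that maps Inf1 sizes to chip and NeuronCore counts. The size table and the figure of 4 NeuronCores per chip are not from this page; they are recalled from AWS's published Inf1 specifications and should be checked against the current instance documentation. The helper names `neuron_cores` and `smallest_size_for` are hypothetical.

```python
# Assumption: each first-generation Inferentia chip exposes 4 NeuronCores.
NEURON_CORES_PER_CHIP = 4

# Assumption: Inf1 sizes as published by AWS (verify before relying on them).
INF1_SIZES = {
    "inf1.xlarge": {"chips": 1, "vcpus": 4},
    "inf1.2xlarge": {"chips": 1, "vcpus": 8},
    "inf1.6xlarge": {"chips": 4, "vcpus": 24},
    "inf1.24xlarge": {"chips": 16, "vcpus": 96},
}


def neuron_cores(instance_type):
    """Return the total NeuronCore count for an Inf1 instance type."""
    return INF1_SIZES[instance_type]["chips"] * NEURON_CORES_PER_CHIP


def smallest_size_for(cores_needed):
    """Return the smallest Inf1 size with at least `cores_needed` NeuronCores, or None."""
    by_size = sorted(INF1_SIZES.items(), key=lambda kv: (kv[1]["chips"], kv[1]["vcpus"]))
    for name, _spec in by_size:
        if neuron_cores(name) >= cores_needed:
            return name
    return None
```

For example, `smallest_size_for(5)` skips the single-chip sizes and picks `inf1.6xlarge`.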
Machine learning workflow
- Build your model in one of the popular machine learning frameworks
- Use GPU instances such as P3 or P3dn to train your model
- Deploy your model on Inf1 instances by using AWS Neuron SDK
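
The deploy step can be sketched for PyTorch as follows. This is a minimal sketch, assuming the `torch-neuron` package from the Neuron SDK is installed and the model is already trained; the function names `compile_for_inf1` and `load_on_inf1` and the default file name are hypothetical, and the calls should be checked against the Neuron SDK version in use.

```python
def compile_for_inf1(model, example_input, output_path="model_neuron.pt"):
    """Compile a trained PyTorch model for Inferentia and save the result."""
    # Imported lazily so this file still loads on machines without the Neuron SDK.
    import torch_neuron  # assumption: provided by the `torch-neuron` package

    model.eval()  # switch to inference mode before tracing
    # The compiler traces the model using the example input's shape and dtype.
    neuron_model = torch_neuron.trace(model, example_inputs=[example_input])
    neuron_model.save(output_path)
    return output_path


def load_on_inf1(path="model_neuron.pt"):
    """Load a compiled model on an Inf1 instance for inference."""
    import torch
    import torch_neuron  # noqa: F401  (import registers the Neuron operators)

    return torch.jit.load(path)
```

Compilation does not need an Inferentia chip, so `compile_for_inf1` can run on a build machine, while `load_on_inf1` is meant to run on the Inf1 instance itself.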
See also
- Groq | Habana Labs | Graphcore
- Google TPU
- AWS EC2 | AWS Elastic Inference
- AWS Graviton | AWS Inferentia