Amazon Web Services today announced Amazon Elastic Inference, a new service that lets customers attach GPU-powered inference acceleration to any Amazon EC2 instance and reduce deep learning inference costs by up to 75%.
"What we typically see is that the average utilization of the GPUs on P3 instances is about 10 to 30 percent, which is pretty wasteful. With Elastic Inference, you don't have to waste all that cost and all that GPU," said AWS CEO Andy Jassy on stage at the AWS re:Invent conference earlier today. "[Amazon Elastic Inference] is a pretty significant game changer in being able to run inference much more cost-effectively."
Amazon Elastic Inference will also be available for Amazon SageMaker notebook instances and endpoints, "bringing acceleration to built-in algorithms and to deep learning environments," the company writes in a blog post. It will support the TensorFlow, Apache MXNet, and ONNX machine learning frameworks.
It is available in three sizes:
- eia1.medium: 8 teraflops of mixed-precision performance.
- eia1.large: 16 teraflops of mixed-precision performance.
- eia1.xlarge: 32 teraflops of mixed-precision performance.
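To give a sense of how an accelerator is attached, here is a sketch of launching an EC2 instance with an eia1.medium accelerator via the AWS CLI's `run-instances` command. The AMI, subnet, and security-group IDs are placeholders, and the account would also need the Elastic Inference VPC endpoint and IAM permissions set up:

```shell
# Launch a c5.large instance with an eia1.medium accelerator attached.
# All resource IDs below are placeholders, not real resources.
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type c5.large \
    --elastic-inference-accelerator Type=eia1.medium \
    --subnet-id subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0
```

Because the accelerator is network-attached rather than a physical GPU in the instance, customers can pair a modest CPU instance with just as much acceleration as their inference workload needs.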
Dive deeper into the new service here.