Google details machine learning improvements to depth estimation in Pixel 3's Portrait Mode




Unlike other phones with a Portrait Mode, the Pixel line makes do with a single rear camera. With the Pixel 3, Google turned to machine learning to improve depth estimation and "produce even better results in portrait mode."

With the Pixel 2, Google was able to compute depth from a single camera using dual-pixel autofocus, also known as Phase-Detection Autofocus (PDAF). At a high level, a neural network determines which pixels belong to people and which belong to the background.

PDAF pixels capture two slightly different views of a scene, and the algorithm looks for horizontal parallax motion of the background between them:

Since parallax is a function of a point's distance from the camera and the distance between the two viewpoints, we can estimate depth by matching each point in one view with its corresponding point in the other.
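As a rough illustration of that relationship, here is a minimal sketch in Python assuming a simple pinhole stereo model; the baseline and focal-length numbers are invented for the example and are not Pixel hardware specs.

```python
import numpy as np

def depth_from_disparity(disparity_px, baseline_m, focal_length_px):
    """Pinhole stereo model: depth is inversely proportional to parallax.

    disparity_px    -- horizontal shift (pixels) of a point between the
                       two PDAF views
    baseline_m      -- distance between the two viewpoints (metres); for
                       dual-pixel AF this is a fraction of a millimetre
    focal_length_px -- focal length expressed in pixels (illustrative)
    """
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    # Suppress the divide-by-zero warning for points at infinite depth.
    with np.errstate(divide="ignore"):
        return np.where(disparity_px > 0,
                        baseline_m * focal_length_px / disparity_px,
                        np.inf)

# A tiny dual-pixel baseline makes the parallax very small, which is
# why the raw estimate is so error-prone.
print(depth_from_disparity([2.0, 0.5, 0.0],
                           baseline_m=0.0005, focal_length_px=3000))
```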

However, this technique struggles because the motion between the two views is so slight, which leads to depth-estimation errors and "unpleasant artifacts."

[Image: Pixel 3 Portrait Mode machine learning; learned depth results in fewer errors]

With the Pixel 3, Google looked for other visual cues present in an image and used machine learning to train an algorithm on them.

For example, points far from the plane of focus appear less sharp than those close to it, giving us a defocus depth cue.
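To make the defocus cue concrete, here is a small sketch using the thin-lens circle-of-confusion formula; the focal length and aperture values are illustrative, not the Pixel 3's actual optics.

```python
def blur_diameter_mm(subject_dist_mm, focus_dist_mm,
                     focal_len_mm=4.4, aperture_mm=2.5):
    """Thin-lens circle of confusion: how blurry a point at
    subject_dist_mm looks when the lens is focused at focus_dist_mm.
    Points on the focus plane return 0; blur grows as a point moves
    away from that plane -- the defocus depth cue.
    """
    magnification = focal_len_mm / (focus_dist_mm - focal_len_mm)
    return (aperture_mm * magnification
            * abs(subject_dist_mm - focus_dist_mm) / subject_dist_mm)

# Blur diameter at several subject distances, lens focused at 1 m.
for d in (500, 1000, 2000, 4000):  # millimetres
    print(d, round(blur_diameter_mm(d, focus_dist_mm=1000), 4))
```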

Moreover, even when viewing an image on a flat screen, we can accurately tell how far away things are because we know the rough size of everyday objects (for example, the number of pixels a face covers in a photo tells us roughly how far away it is). This is called a semantic cue.
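A toy version of that semantic cue, assuming a pinhole projection; the focal length and "typical face size" values are made up for illustration.

```python
def distance_from_face_m(face_height_px, focal_length_px=3000.0,
                         typical_face_height_m=0.24):
    """Semantic cue: knowing the rough real-world size of an everyday
    object (here, a human head) lets us estimate its distance from how
    many pixels it spans. All constants here are assumptions.
    """
    return focal_length_px * typical_face_height_m / face_height_px

# A face spanning 300 px sits roughly 2.4 m away under these numbers.
print(distance_from_face_m(300))
```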

The training data was collected with a "Frankenphone" rig consisting of five Pixel 3 phones synchronized over Wi-Fi to capture an image simultaneously. High-quality ground-truth depth is then computed using structure from motion and multi-view stereo.
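For a sense of how multiple views yield depth, below is a minimal sketch of linear (DLT) two-view triangulation with NumPy. Real structure-from-motion and multi-view stereo pipelines are far more involved, and the camera matrices here are invented for the example.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.
    P1, P2 -- 3x4 camera projection matrices
    x1, x2 -- matched 2D image points (pixels) in each view
    A multi-view pipeline solves this for every matched point across
    all five phones to recover high-quality depth.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # homogeneous -> Euclidean coordinates

# Two synthetic cameras 10 cm apart observing a point at 2 m depth.
K = np.array([[1000., 0., 500.], [0., 1000., 500.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.], [0.]])])
print(triangulate(P1, P2, (600., 550.), (550., 550.)))  # ~[0.2 0.1 2.0]
```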

Specifically, we trained a convolutional neural network, written in TensorFlow, that takes the PDAF pixels as input and learns to predict depth. It is this new, improved ML-based method of depth estimation that powers Portrait Mode on the Pixel 3.
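Google has not published the network's architecture, but a toy Keras encoder-decoder gives a feel for the task's input and output shapes; everything below (layer sizes, input resolution, the two-channel PDAF stack) is an assumption for illustration, not Google's model.

```python
import tensorflow as tf

def toy_depth_net(height=256, width=256):
    """Toy stand-in for a PDAF-to-depth network: takes a stack of two
    PDAF views and regresses a dense depth map of the same resolution.
    """
    inp = tf.keras.Input(shape=(height, width, 2))  # two PDAF views
    x = tf.keras.layers.Conv2D(32, 3, strides=2, padding="same",
                               activation="relu")(inp)
    x = tf.keras.layers.Conv2D(64, 3, strides=2, padding="same",
                               activation="relu")(x)
    x = tf.keras.layers.Conv2DTranspose(32, 3, strides=2, padding="same",
                                        activation="relu")(x)
    x = tf.keras.layers.Conv2DTranspose(16, 3, strides=2, padding="same",
                                        activation="relu")(x)
    depth = tf.keras.layers.Conv2D(1, 3, padding="same")(x)  # depth map
    return tf.keras.Model(inp, depth)

model = toy_depth_net()
# Trained against the Frankenphone ground-truth depth in practice.
model.compile(optimizer="adam", loss="mae")
```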

To ensure fast results, we use TensorFlow Lite, a cross-platform solution for running machine learning models on mobile and embedded devices, together with the Pixel 3's powerful GPU to compute depth quickly despite our unusually large inputs. We then combine the resulting depth estimates with masks from our person segmentation neural network to produce stunning Portrait Mode results.
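A minimal sketch of that TensorFlow Lite workflow, reusing the toy model above; running on the Pixel 3's GPU would additionally go through TFLite's GPU delegate, which is omitted here.

```python
import numpy as np
import tensorflow as tf

# Convert the Keras model above to TensorFlow Lite for on-device use.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Run one inference with the TFLite interpreter on a dummy input.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"],
                       np.zeros(inp["shape"], dtype=np.float32))
interpreter.invoke()
depth_map = interpreter.get_tensor(out["index"])
print(depth_map.shape)  # one dense depth map per input image
```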


Check out 9to5Google on YouTube for more information.
