ML with Intel OpenVINO Toolkit — Super-Resolution [Part 2]

Learn to utilize Super-Resolution in this second installment of a four-part series on the Intel OpenVINO toolkit

Published in Heartbeat · 5 min read · Jul 1, 2022

From manually developing film in canisters to shooting photos on a mobile device and sharing them online in real time, how we capture images has come a long way. Image resolutions have improved vastly, from 240p to 8K, and displays keep changing to match, with the help of technologies like OLED screens.

But what if you have an image or video that was taken a long time ago and you want to show it on a modern, high-resolution display? By comparison, the old footage has poor resolution and looks grainy, which makes for a lacking viewing experience. Thanks to deep learning, there are now ML models capable of increasing the level of detail in an image or video. This process is called super-resolution. Let’s dive in!

What is super-resolution?

Deep learning has achieved state-of-the-art results for image and video super-resolution. A model can fill in plausible details in the high-resolution output, even where those details are essentially unknown in the low-resolution input. Have you ever watched an investigative series where detectives zoom in on a picture, and suddenly the image quality improves and details appear that weren’t present previously? That’s super-resolution. Below is an example of a low-resolution image with super-resolution applied to improve its quality.

Src: https://towardsdatascience.com/deep-learning-based-super-resolution-without-using-a-gan-11c9bb5b6cd5

OpenVINO and Super-Resolution

OpenVINO is an open-source toolkit developed and maintained by Intel that helps developers optimize deep learning models and deploy them on Intel hardware through its inference engine. It allows developers to use models trained with popular frameworks like Keras, TensorFlow, PyTorch, and more, while boosting performance in various ML tasks.

https://docs.openvino.ai/latest/index.html

OpenVINO has a deep learning model in its open_model_zoo repository that can achieve this super-resolution task both for images and videos. It is based on the paper Attention-Based Approach for Single Image Super-Resolution. The methodology outlined in this paper seeks to address the main challenge in super-resolution, which is the recovery of high-frequency details such as tiny textures. Most state-of-the-art methods lack specific modules to address this issue, leading to blurry images and videos. The attention-based method seeks to discriminate between textured and smooth areas, leading to high-frequency enhancement, and better performance and visual effects.


How It Works

The network architecture of this model consists of two parts: the feature reconstruction network and the attention-producing network. The feature reconstruction network is a convolutional neural network that is responsible for recovering high-frequency details, and eventually reconstructing the high-resolution image.

The feature reconstruction network further consists of three parts: a convolutional layer for feature extraction, multiple stacked DenseRes blocks, and a subpixel convolution layer that acts as an up-sampling module as shown in the diagram above.
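
The up-sampling idea behind a subpixel convolution layer can be illustrated without any deep learning framework. The sketch below shows only the rearrangement step (often called pixel shuffle or depth-to-space): the convolution that produces the feature planes is omitted. The function name `pixel_shuffle` and the plane ordering (row offset first, then column offset) are assumptions made for illustration and are not taken from OpenVINO's implementation.

```python
def pixel_shuffle(planes, r):
    """Rearrange r*r feature planes of size H x W into one (H*r) x (W*r) plane."""
    if len(planes) != r * r:
        raise ValueError("expected r*r input planes")
    height, width = len(planes[0]), len(planes[0][0])
    out = [[0] * (width * r) for _ in range(height * r)]
    for k, plane in enumerate(planes):
        dy, dx = divmod(k, r)  # sub-pixel offset that this plane fills
        for y in range(height):
            for x in range(width):
                out[y * r + dy][x * r + dx] = plane[y][x]
    return out


# Four 1x2 planes become one 2x4 plane (upscaling factor r = 2).
planes = [[[1, 5]], [[2, 6]], [[3, 7]], [[4, 8]]]
upscaled = pixel_shuffle(planes, 2)
```

Each low-resolution position contributes an r x r patch of output pixels, one from each plane, which is how the layer trades channel depth for spatial resolution.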

The attention-producing network is used solely to selectively enhance high-frequency features in the image. The architecture is inspired by UNet, and its purpose is to extract high-frequency components such as edges, then combine them with the low-resolution image as input to the network. As shown in the image above, the network consists of a contracting path (left side), an expansive path (right side), and skip connections. In the contracting path, low-level features are extracted from the image, followed by max-pooling to reduce the dimension of the data. In the expansive path, the low-level features are combined with the high-level features. The output can be read as a segmentation map that indicates whether a region contains texture and therefore needs to be repaired by the feature reconstruction network.
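
To make the max-pooling step of the contracting path concrete, here is a minimal pure-Python sketch. The 2x2 window with a stride of 2 is an assumption (it is the common choice in UNet-style networks); the article does not state the window size, and `max_pool_2x2` is a hypothetical helper name.

```python
def max_pool_2x2(plane):
    """Downsample a 2-D plane by keeping the maximum of each 2x2 block."""
    height, width = len(plane), len(plane[0])
    return [
        [
            max(plane[y][x], plane[y][x + 1], plane[y + 1][x], plane[y + 1][x + 1])
            for x in range(0, width - 1, 2)  # a trailing odd column is dropped
        ]
        for y in range(0, height - 1, 2)  # a trailing odd row is dropped
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)
```

The 4x4 map shrinks to 2x2 while the strongest response in each block survives, which is what lets later layers see a wider area of the image at a lower cost.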

It is important to note that OpenVINO’s implementation has a reduced number of channels and some changes to the network architecture. It enhances the resolution of the input image by a factor of 3.

Prerequisites

Make sure you have Python 3.7 and CMake installed in your development environment, then install OpenVINO. Refer to this tutorial to get started.

Download the model

For this tutorial, you can manually download the required model files to run the inference engine provided by OpenVINO:
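
A minimal sketch of that download step, using only the Python standard library. It relies on two assumptions: that the model is the Open Model Zoo entry named `single-image-super-resolution-1033` (the variant that upscales by a factor of 3), and that you supply the base URL of the directory hosting its files. An OpenVINO model in IR format consists of an `.xml` file (topology) and a `.bin` file (weights). The helper names are hypothetical.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: the x3 model in Open Model Zoo carries this name.
MODEL_NAME = "single-image-super-resolution-1033"


def model_file_urls(base_url, model_name=MODEL_NAME):
    """Return the URLs of the two IR files (.xml topology, .bin weights)."""
    base = base_url.rstrip("/")
    return [f"{base}/{model_name}{ext}" for ext in (".xml", ".bin")]


def download_model(base_url, target_dir="model", model_name=MODEL_NAME):
    """Download both IR files into target_dir and return their local paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for url in model_file_urls(base_url, model_name):
        destination = target / url.rsplit("/", 1)[-1]
        if not destination.exists():  # skip files that are already present
            urlretrieve(url, str(destination))
        paths.append(destination)
    return paths
```

Calling `download_model("https://example.com/path/to/FP16")` would place both files in a local `model` folder; the URL here is a placeholder and must be replaced with the real location of the model files. The `omz_downloader` tool that ships with the `openvino-dev` package is another way to fetch Open Model Zoo models by name.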

Implementation

For this tutorial, we are going to use a simple script that loads the super-resolution model, performs inference using an input image and then outputs the high-resolution image:
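
A minimal sketch of such a script, assuming the `openvino.runtime` Python API (OpenVINO 2022.1 or later), `opencv-python`, and `numpy` are installed. It also assumes the model takes two inputs, the low-resolution image and a bicubic-upsampled copy at the output size, and returns values in the range 0 to 1; check the model's documentation before relying on either assumption. The function names are hypothetical.

```python
def nchw_to_cv_size(shape):
    """Convert an NCHW shape to the (width, height) order that cv2.resize expects."""
    _, _, height, width = list(shape)
    return width, height


def super_resolve(model_xml, image_path, output_path, device="CPU"):
    """Run the super-resolution model on one image and save the result."""
    # Imported lazily so the helper above works without these packages installed.
    import cv2
    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model=model_xml)
    compiled = core.compile_model(model=model, device_name=device)

    # Assumption: input 0 is the low-resolution image and input 1 is the same
    # image upsampled with bicubic interpolation to the output size.
    low_res_port, bicubic_port = compiled.inputs
    output_port = compiled.output(0)

    image = cv2.imread(str(image_path))  # BGR, HxWx3, uint8
    if image is None:
        raise FileNotFoundError(image_path)

    def to_tensor(port):
        resized = cv2.resize(image, nchw_to_cv_size(port.shape),
                             interpolation=cv2.INTER_CUBIC)
        # HWC -> NCHW with a batch dimension of 1
        return np.expand_dims(resized.transpose(2, 0, 1), 0).astype(np.float32)

    result = compiled({
        low_res_port.any_name: to_tensor(low_res_port),
        bicubic_port.any_name: to_tensor(bicubic_port),
    })[output_port]

    # Assumption: the network returns values in [0, 1]; scale back to 8-bit.
    output = np.clip(result.squeeze(0).transpose(1, 2, 0) * 255, 0, 255)
    cv2.imwrite(str(output_path), output.astype(np.uint8))
    return output.shape[:2]  # (height, width) of the saved image
```

With the model files in a local `model` folder, a call such as `super_resolve("model/single-image-super-resolution-1033.xml", "input.jpg", "output.png")` would write the upscaled image to `output.png`. Because the input sizes are read from the model itself, the image is resized to whatever resolution the network expects rather than being hard-coded.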

Conclusion

Super-resolution can come in handy for various use cases in industries like forensics, healthcare imaging, entertainment, and more. OpenVINO provides an easy-to-use platform to help developers get started, and this model could be quite useful if run in the cloud or on desktop computers that have Intel hardware.
