The goal of the “Helsinki Deblur Challenge” (HDC) was to develop a general-purpose deblurring procedure for a specific task: deblurring images of characters at 20 levels of gradually increasing blur. The processed images are then fed to an optical character recognition (OCR) engine, and the deviation of the predicted characters from the ground-truth characters is quantified using the Levenshtein distance. We call this crucial metric the OCR score.
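To make the metric concrete, here is a minimal Python sketch of the Levenshtein distance. The function names `levenshtein` and `ocr_score` are illustrative, and the mapping of the distance to a 0 to 100 scale is an assumption made for this example; the exact normalisation used in the challenge evaluation may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ocr_score(predicted: str, truth: str) -> float:
    """Hypothetical score in [0, 100]; 100 means the strings are identical.
    The normalisation by the longer string length is an assumption."""
    longest = max(len(predicted), len(truth))
    if longest == 0:
        return 100.0
    return 100.0 * (1.0 - levenshtein(predicted, truth) / longest)
```

With this normalisation, a single wrong character in a four-character string costs a quarter of the score, so the metric penalises every character the OCR engine misreads after deblurring.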
The network architecture forming the backbone of our data-driven pipeline is an adjusted version of the prominent U-Net. We add group normalisation, an introductory pooling layer, and more down- and up-sampling steps to the CNN. These modifications improve performance because they increase the network's field of view.
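The effect of extra down-sampling steps on the field of view can be illustrated with the standard receptive-field recursion for stacked convolution and pooling layers. The sketch below is not the actual network configuration: the helper `encoder`, the 3x3 convolutions, the 2x2 poolings and the chosen depths are all assumptions made for illustration.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of (kernel_size, stride) layers."""
    rf, jump = 1, 1  # jump: spacing in input pixels between neighbouring outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def encoder(depth, convs_per_level=2, initial_pool=False):
    """Hypothetical U-Net encoder path: 3x3 convolutions on each level,
    separated by 2x2 poolings, optionally preceded by one extra pooling."""
    layers = [(2, 2)] if initial_pool else []
    for level in range(depth):
        layers += [(3, 1)] * convs_per_level
        if level < depth - 1:
            layers.append((2, 2))  # down-sampling step between levels
    return layers


if __name__ == "__main__":
    for depth, pool in [(4, False), (6, False), (6, True)]:
        rf = receptive_field(encoder(depth, initial_pool=pool))
        print(f"depth={depth}, initial_pool={pool}: receptive field {rf} px")
```

Each additional pooling doubles the spacing between neighbouring feature-map positions, so every later convolution covers a larger region of the input. This matters for strong blur levels, where the information belonging to one character is spread over a wide area of the image.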
The key to our deblurring pipeline is a solid understanding of the blurring operator. Modelling the forward operation as a convolution followed by a moustache distortion brings several advantages. By augmenting the limited training dataset of 200 samples with an arbitrary amount of synthesized data, the network's generalization capacity is drastically improved. Furthermore, including natural images in the training set forces the network to become a general-purpose deblurrer rather than a character classifier and generator.
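A forward model of this kind can be sketched in a few lines of NumPy. Everything specific here is an assumption for illustration: the disk-shaped point spread function, the polynomial radial model r(1 + k1 r^2 + k2 r^4) (a moustache shape arises when k1 and k2 have opposite signs), the nearest-neighbour resampling, and the function names. In the pipeline described above, the parameters of the forward model are learned from data rather than set by hand.

```python
import numpy as np


def disk_kernel(radius: float) -> np.ndarray:
    """Normalised disk-shaped point spread function (assumed blur shape)."""
    size = 2 * int(np.ceil(radius)) + 1
    yy, xx = np.mgrid[:size, :size] - size // 2
    kernel = (xx**2 + yy**2 <= radius**2).astype(float)
    return kernel / kernel.sum()


def blur(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Circular convolution via the FFT; the image must be at least kernel-sized."""
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Move the kernel centre to index (0, 0) so the result is not shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def radial_distort(image: np.ndarray, k1: float, k2: float) -> np.ndarray:
    """Resample the image: each output pixel at normalised radius r reads from
    radius r * (1 + k1 r^2 + k2 r^4), using nearest-neighbour lookup."""
    h, w = image.shape
    yy, xx = np.mgrid[:h, :w].astype(float)
    cy, cx = (h - 1) / 2, (w - 1) / 2
    scale = max(cy, cx, 1.0)
    y, x = (yy - cy) / scale, (xx - cx) / scale
    r2 = x**2 + y**2
    factor = 1 + k1 * r2 + k2 * r2**2
    src_y = np.clip(np.rint(y * factor * scale + cy), 0, h - 1).astype(int)
    src_x = np.clip(np.rint(x * factor * scale + cx), 0, w - 1).astype(int)
    return image[src_y, src_x]


def forward(image, radius, k1, k2):
    """Assumed forward operator: convolution followed by radial distortion."""
    return radial_distort(blur(image, disk_kernel(radius)), k1, k2)
```

Applying `forward` to arbitrary sharp images, including natural images, yields as many synthetic (sharp, blurred) training pairs as needed, which is the augmentation idea described above.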
Using the learned parameters of the forward model, we correct the radial distortion as a pre-processing step. This greatly benefits the subsequent U-Net processing, as it renders the problem approximately translation invariant.
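One simple way to undo a radial distortion is to invert the radius mapping numerically. The sketch below assumes a polynomial radial model r_d = r_u (1 + k1 r_u^2 + k2 r_u^4) and a fixed-point iteration; both the model and the function names are assumptions for illustration, not a description of the actual pre-processing code.

```python
def distort_radius(r_u: float, k1: float, k2: float) -> float:
    """Assumed radial model: undistorted radius -> distorted radius."""
    return r_u * (1 + k1 * r_u**2 + k2 * r_u**4)


def undistort_radius(r_d: float, k1: float, k2: float, iterations: int = 50) -> float:
    """Invert the radial model by fixed-point iteration.
    Converges for moderate coefficients, where the mapping is close to identity."""
    r_u = r_d  # initial guess: no distortion
    for _ in range(iterations):
        r_u = r_d / (1 + k1 * r_u**2 + k2 * r_u**4)
    return r_u


if __name__ == "__main__":
    k1, k2 = -0.1, 0.05  # opposite signs, as in a moustache-type distortion
    r_d = distort_radius(0.8, k1, k2)
    print(r_d, undistort_radius(r_d, k1, k2))
```

Once the distortion is removed, what remains of the forward operator is approximately a plain convolution, which acts identically at every image position. That matches the structure of a convolutional network, whose filters are also shared across positions.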
Finally, the networks are pre-trained on about 75,000 samples of synthesized data and then fine-tuned on mainly original HDC data.
The 20 networks successfully deblur all 20 levels, reaching average OCR scores between 84.3 and 98.6 on our validation split and winning the HDC21 challenge.
Presented by Theophil Trippe