The starting point
I started with one of the strongest architectures for image recognition, the ResNet architecture. With just the basic architecture I got a spot in the top 50%.
I didn't use transfer learning, since I could train the network from scratch using Kaggle's free GPUs and I wanted to modify the network for
further improvements. Kaggle offers 30 hours of free GPU time each week for members, which is a good way to train your networks for free.
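As a rough illustration of this starting point, a ResNet trained from scratch in PyTorch might be set up along these lines. This is a sketch rather than the actual competition code, and it assumes single-channel digit images with 10 classes:

```python
# Sketch of a ResNet trained from scratch (no pretrained weights) for
# grayscale digit images with 10 classes. Illustrative only.
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # random initialization, no transfer learning
# Adapt the first convolution to single-channel input instead of RGB.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
```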
Improvements
I started by adding image augmentation to increase the size of my training set by stretching, rotating and warping the characters. I was careful not to over-rotate
the characters, since that would make the algorithm learn features that are not part of the numbers. Finding good values for the image augmentation moved me further
up the leaderboard. I didn't add noise, for example, since both the sample and evaluation images were noise-free.
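To give an idea of what this kind of augmentation can look like, here is a minimal torchvision sketch with small rotations and mild stretching and warping. The parameter values are illustrative, not the ones I ended up using, and no noise is added since the images are clean:

```python
from torchvision import transforms

# Illustrative augmentation: small rotations, slight shifts, mild stretching
# and shearing. No added noise, since the images are noise-free.
train_tfms = transforms.Compose([
    transforms.RandomAffine(
        degrees=10,            # keep rotations small
        translate=(0.1, 0.1),  # slight shifts
        scale=(0.9, 1.1),      # mild stretching
        shear=5,               # light warping
    ),
    transforms.ToTensor(),
])
```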
I also changed the activation function of the network from ReLU to Mish, since that activation function has shown promising results in early testing.
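Mish is defined as x * tanh(softplus(x)) and is easy to drop in as a module wherever ReLU is used; a minimal PyTorch version looks like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x))."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))
```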
I also modified the skip connections to add a weight that controls how active each skip connection should be.
I found a research paper detailing this approach, but unfortunately I have lost the reference.
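Since I can no longer point to the paper, the following is only a sketch of the general idea: a residual block where a learnable scalar controls how much the skip connection contributes.

```python
import torch
import torch.nn as nn

class WeightedSkipBlock(nn.Module):
    """Residual block with a learnable weight on the skip connection (sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)
        # Learnable weight controlling how active the skip connection is.
        self.skip_weight = nn.Parameter(torch.ones(1))

    def forward(self, x):
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.act(out + self.skip_weight * x)
```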
What did the winners do that I didn't?
Most teams that scored higher than me used the following two techniques:
Many of them used pseudo-labeling, where the model's own predictions on the unlabeled test set are added back into the training data.
They also trained multiple networks and let the different networks vote on each number, choosing the answer with the most votes. This is practical on Kaggle, since network parameters can be stored in and loaded from files (a sketch of such voting follows below).
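Here is roughly what the voting could look like, with each set of parameters loaded from a saved file. The `make_model` factory and the file paths are placeholders, not names from the competition code:

```python
import torch

def ensemble_predict(model_paths, make_model, images):
    """Majority vote over several saved models (illustrative sketch)."""
    votes = []
    for path in model_paths:
        model = make_model()
        model.load_state_dict(torch.load(path, map_location="cpu"))
        model.eval()
        with torch.no_grad():
            votes.append(model(images).argmax(dim=1))
    # Most common predicted class per image across all models.
    return torch.stack(votes).mode(dim=0).values
```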
Source code
The code I ran in the competition can be found here:
Thanks to
Spctrm for giving me time to explore the AI/ML topic, the Fast.ai community, and Kaggle for hosting the competition.
This article is a follow-up to an AI night hosted by Spctrm where the competition was discussed, along with how to get started with ML without a budget.
Published 02 Mar 2020