GoogLeNet won the classification task of ILSVRC 2014. VGGNet was the runner-up in classification but beat GoogLeNet in, and won, the localization task of ILSVRC 2014.
Top-5 error improved from 16.4% (AlexNet, 8 layers) to 7.3% (VGG-16).
A very important paper on CNNs. It uses only 3x3 CONV layers, and many later networks are based on the VGG architecture.
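A rough sketch of why stacking only 3x3 CONV layers is attractive: with stride 1, three stacked 3x3 layers see the same 7x7 input region as a single 7x7 layer but need fewer weights. The helper names and the channel count of 64 are illustrative assumptions, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel=3):
    # With stride 1, each extra layer widens the receptive field by (kernel - 1).
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    # Weights of one conv layer with `channels` inputs and outputs (no bias).
    return kernel * kernel * channels * channels


C = 64  # assumed channel count, for illustration only
three_3x3 = 3 * conv_weights(3, C)  # 27 * C^2 weights
one_7x7 = conv_weights(7, C)        # 49 * C^2 weights

print(stacked_receptive_field(3), three_3x3, one_7x7)
```

The stack also inserts a non-linearity after every 3x3 layer, which a single large filter does not.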
Instead of resizing directly to 224px, each training image is rescaled so that its smaller side is 256~512px. A 224x224px crop is then taken, which contains the object fully or partially. This has the effect of data augmentation with scaling and translation, which helps to reduce overfitting.
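The rescale-then-crop step can be sketched as below. This is a minimal illustration using only the standard library: it computes the resized dimensions and the crop box, not the pixel resampling itself. The function name, its parameters, and the uniform sampling of the scale are assumptions made for the example.

```python
import random

CROP = 224  # side of the square crop fed to the network


def sample_scale_jitter_crop(width, height, s_min=256, s_max=512, rng=random):
    """Return the resized (width, height) and a random CROP x CROP box."""
    # Pick the target length of the smaller side (assumed uniform in the range).
    s = rng.randint(s_min, s_max)
    scale = s / min(width, height)
    # Keep the aspect ratio; guard against rounding below the crop size.
    new_w = max(CROP, round(width * scale))
    new_h = max(CROP, round(height * scale))
    # A random offset gives the translation part of the augmentation.
    left = rng.randint(0, new_w - CROP)
    top = rng.randint(0, new_h - CROP)
    return (new_w, new_h), (left, top, left + CROP, top + CROP)


size, box = sample_scale_jitter_crop(500, 375, rng=random.Random(0))
print(size, box)
```

A larger sampled scale makes the 224px crop cover a smaller part of the image, so the object may be only partially visible. A smaller scale keeps more of it in view.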