An Overview of Image Classification Architecture

Image classification (IC) is a central problem in computer vision; many other vision tasks, such as object detection and instance segmentation, build on it. It is therefore worth studying IC thoroughly and understanding its nuances. In this blog post, I will go over the history, the breakthrough ideas, and the current state of affairs, and explain how it all comes back to IC.

It all begins with LeNet, proposed by LeCun et al. It brought together the convolutional (CONV) layer, then the pooling (POOL) layer, and finally the fully connected (FC) layer. Modern convolutional neural networks (CNNs) keep all of these basic building blocks, but it took a couple of other breakthroughs to make them tick.
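To make the CONV, POOL, FC pipeline concrete, here is a minimal sketch that traces feature-map shapes through a LeNet-style stack. The layer sizes (32x32 input, 5x5 convolutions with 6 and 16 filters, 2x2 pooling, FC widths 120/84/10) are the commonly cited LeNet-5 values and are an assumption here, since the post does not list them; `conv_out` and `trace_lenet` are hypothetical helper names.

```python
# Sketch only: layer sizes follow the commonly cited LeNet-5 layout,
# which is an assumption (the post itself gives no sizes).

def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_lenet(input_size=32):
    """Return (layer_name, output_shape) pairs for a LeNet-5-like network."""
    shapes = []
    size = input_size

    # CONV: 5x5 kernels, 6 filters, no padding
    size = conv_out(size, kernel=5)
    shapes.append(("conv1", (6, size, size)))

    # POOL: 2x2 window, stride 2 halves the spatial size
    size = conv_out(size, kernel=2, stride=2)
    shapes.append(("pool1", (6, size, size)))

    # Second CONV + POOL pair, now with 16 filters
    size = conv_out(size, kernel=5)
    shapes.append(("conv2", (16, size, size)))
    size = conv_out(size, kernel=2, stride=2)
    shapes.append(("pool2", (16, size, size)))

    # FC layers operate on the flattened feature map
    shapes.append(("flatten", (16 * size * size,)))
    for name, width in (("fc1", 120), ("fc2", 84), ("fc3", 10)):
        shapes.append((name, (width,)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_lenet():
        print(name, shape)
```

The point of the sketch is the pattern rather than the exact numbers: each CONV/POOL pair shrinks the spatial size while increasing the number of channels, and the FC layers then map the flattened features to class scores.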

A CNN's three basic components are DATA, COMPUTE, and ALGORITHM.

I'll use an analogy to illustrate the three components and their roles. Think of a CNN as an engine, with data as the fuel, compute as the spark plug, and the algorithm as the engine design. All three must work together for the engine to run smoothly.

Each CNN model (engine) below is described in terms of these three components, so we can see how far we've come.

  • LeNet: a basic working engine design (conv, pool, fc). Fuel was insufficient (MNIST has 70k images, against roughly 1.2 million training images in the ImageNet challenge set), and the spark plug was unavailable (CPU instead of GPU).
  • AlexNet: improved the basic design (ReLU, normalization). Data was sufficient (ImageNet), and higher-quality compute (GPU) was available.
  • ZFNet: refined the basic design (smaller first-layer filters and stride, more filters in the middle layers) rather than changing the layout; same data and same quality of compute.
  • VGG: a large, bulky engine built by brute force (many more layers), but with the added benefit of a modular design; earlier designs were irregular.
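The modularity credited to VGG can be sketched in a few lines: the whole convolutional trunk is generated from a short list of (number of convs, channels) blocks. The block list below is the commonly cited VGG-16 layout and is an assumption, as the post gives no layer details; `build_vgg_layers` and `count_conv_params` are hypothetical helpers, not part of any library.

```python
# Sketch only: VGG16_BLOCKS follows the commonly cited VGG-16 layout
# (an assumption); helper names are hypothetical.

# Each entry is (number of 3x3 conv layers, output channels) for one block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def build_vgg_layers(blocks, in_channels=3):
    """Expand a block spec into a flat list of (kind, in_ch, out_ch) layers."""
    layers = []
    for num_convs, out_channels in blocks:
        for _ in range(num_convs):
            layers.append(("conv3x3", in_channels, out_channels))
            in_channels = out_channels
        # Every block ends with the same 2x2 max-pool
        layers.append(("maxpool2x2", in_channels, in_channels))
    return layers


def count_conv_params(layers):
    """Weights plus biases of all 3x3 conv layers in the list."""
    return sum(
        9 * cin * cout + cout
        for kind, cin, cout in layers
        if kind == "conv3x3"
    )


if __name__ == "__main__":
    layers = build_vgg_layers(VGG16_BLOCKS)
    for layer in layers:
        print(layer)
    print("conv parameters:", count_conv_params(layers))
```

Changing the depth of the engine is then a matter of editing the block list, which is what makes the design modular; the parameter count shows why it is also bulky.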

Classification of architectures:

  • prehistoric: basic design laid out, no fuel or plug
  • historic: got the fuel and spark plug, improvements in design
  • modern: more efficient design for bigger and better engines, improvement in plug
  • post-modern: lighter engine with comparable performance,