N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras

Abstract

We introduce N-ImageNet, a large-scale dataset targeted for robust, fine-grained object recognition with event cameras. The dataset is collected using programmable hardware in which an event camera consistently moves around a monitor displaying images from ImageNet. N-ImageNet serves as a challenging benchmark for event-based object recognition, due to its large number of classes and samples. We empirically show that pretraining on N-ImageNet improves the performance of event-based classifiers and helps them learn with few labeled data. In addition, we present several variants of N-ImageNet to test the robustness of event-based classifiers under diverse camera trajectories and severe lighting conditions, and propose a novel event representation to alleviate the performance degradation. To the best of our knowledge, we are the first to quantitatively investigate the consequences caused by various environmental conditions on event-based object recognition algorithms. N-ImageNet and its variants are expected to guide practical implementations for deploying event-based object recognition algorithms in the real world.

Video

What Are Event Cameras?

Event cameras are neuromorphic sensors that encode visual information as a sequence of events. In contrast to conventional frame-based cameras that output absolute brightness intensities, event cameras respond to brightness changes. The following figure shows a visual description of how event cameras function compared to conventional cameras. Notice how brightness changes are encoded as 'streams' in the spatio-temporal domain.

Dataset Overview

N-ImageNet is a large-scale event dataset that enables training and benchmarking object recongition algorithms using event camera input. The dataset surpasses all existing datasets in both size and label granularity.

Data Acquisition Setup

To capture N-ImageNet, we design custom hardware to trigger perpetual camera motion. The device consists of two geared motors connected to a pair of perpendicularly adjacent gear racks where the upper and lower motors are responsible for vertical and horizontal motion, respectively. Each motor is further linked to a programmable Arduino board, which can control the camera movement.

Object Recognition Performance Analysis

Due to its size and label diversity, N-ImageNet can function as a challenging benchmark for event-based object recognition. The plot below shows the classification accuracy of existing event-based recognition algorithms on N-ImageNet. There exists a large performance gap with the ImageNet state-of-the-art, which suggests that mastering N-ImageNet is still a long way to go. We expect N-ImageNet to foster development of event classifiers that could readily function in the real world.

Useful Links for Benchmarking and Downloading

Public Benchmark: Check out the public benchmark on event-based object recognition available at the following link. Feel free to upload new results to the benchmark!

Downloading Full N-ImageNet: Refer to the following link to download N-ImageNet. Note the full dataset size will be around 400 GB, so prepare a sufficient amount of disk space before downloading! Please leave an email to Junho Kim if you are in urgent need of N-ImageNet and the file share links are not working.

Downloading Mini N-ImageNet: Starting from 2022, we publicly released a smaller version of N-ImageNet, called mini N-ImageNet. The dataset contains 100 classes, which is 1/10 of the original N-ImageNet. We expect the dataset to enable quick and light-weight evaluation of new event-based object recognition methods. To download the dataset, please refer to the follwoing link.

Downloading Pretrained Models: To download the pretrained models, refer to the following link.

BibTeX

@InProceedings{Kim_2021_ICCV,
    author    = {Kim, Junho and Bae, Jaehyeok and Park, Gangin and Zhang, Dongsu and Kim, Young Min},
    title     = {N-ImageNet: Towards Robust, Fine-Grained Object Recognition With Event Cameras},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {2146-2156}
}