Multi-View 3D Object Detection Network for Autonomous Driving
Proposes a sensor-fusion framework that takes a lidar point cloud and RGB images as input and predicts oriented 3D bounding boxes. Laser scanners capture accurate depth information, while cameras capture much richer semantic information; accordingly, lidar-based methods achieve more accurate 3D localization, while image-based methods achieve higher accuracy in 2D box evaluation.
- Encodes the sparse 3D point cloud with a compact multi-view representation (see the encoding sketch after this list).
- Generates 3D candidate boxes efficiently from the bird's-eye-view representation of the point cloud.
- Uses a deep fusion scheme designed to combine region-wise features from multiple views.
- Outperforms the prior state of the art by around 25% AP on 3D localization and around 30% AP on 3D detection (KITTI benchmark).
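A minimal sketch of the bird's-eye-view encoding idea, assuming a point cloud of (x, y, z, intensity) rows as in KITTI. The crop ranges, grid resolution, and number of height slices are illustrative assumptions, not the paper's exact values, and the intensity channel takes the max reflectance per cell as a simple stand-in for the paper's "reflectance of the highest point".

```python
import numpy as np

def point_cloud_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                       z_range=(-2.0, 1.0), res=0.1, num_slices=4):
    """Encode a lidar point cloud (N x 4: x, y, z, intensity) as
    bird's-eye-view maps: per-slice height, intensity, and density."""
    x, y, z, r = points[:, 0], points[:, 1], points[:, 2], points[:, 3]

    # Keep only the points inside the crop region.
    mask = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z, r = x[mask], y[mask], z[mask], r[mask]

    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    xi = ((x - x_range[0]) / res).astype(np.int32)
    yi = ((y - y_range[0]) / res).astype(np.int32)

    height_maps = np.zeros((num_slices, h, w), dtype=np.float32)
    intensity_map = np.zeros((h, w), dtype=np.float32)

    # Height maps: max height per cell within each vertical slice.
    slice_h = (z_range[1] - z_range[0]) / num_slices
    for s in range(num_slices):
        lo = z_range[0] + s * slice_h
        sel = (z >= lo) & (z < lo + slice_h)
        np.maximum.at(height_maps[s], (xi[sel], yi[sel]), z[sel] - lo)

    # Intensity map: max reflectance per cell (illustrative stand-in).
    np.maximum.at(intensity_map, (xi, yi), r)

    # Density map: log-normalized point count per cell.
    counts = np.zeros((h, w), dtype=np.float32)
    np.add.at(counts, (xi, yi), 1.0)
    density_map = np.minimum(1.0, np.log(counts + 1.0) / np.log(64.0))

    return np.concatenate(
        [height_maps, intensity_map[None], density_map[None]], axis=0)
```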
- The MV3D detection network consists of two parts:
1. 3D proposal network
- Uses the bird's-eye-view representation of the point cloud to generate 3D box proposals.
2. Region-based fusion network
- Combines region-wise features from the multiple views to classify proposals and regress oriented 3D boxes (a fusion sketch follows this list).
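A minimal sketch of the deep fusion scheme, assuming each view's ROI features have already been pooled to a common size. The layer width and number of fusion stages are illustrative, and PyTorch is used only for convenience; at each stage every view applies its own transformation to the shared feature, and an element-wise mean re-joins the results.

```python
import torch
import torch.nn as nn

class DeepFusion(nn.Module):
    """Sketch of MV3D-style deep fusion over three views:
    bird's-eye view, front view, and RGB image."""
    def __init__(self, dim=512, num_stages=3, num_views=3):
        super().__init__()
        # One per-view transform per fusion stage.
        self.stages = nn.ModuleList([
            nn.ModuleList([
                nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
                for _ in range(num_views)
            ])
            for _ in range(num_stages)
        ])

    def forward(self, bv, fv, rgb):
        # Initial element-wise mean join of the per-view ROI features.
        fused = (bv + fv + rgb) / 3
        for stage in self.stages:
            # Per-view transforms, then element-wise mean re-join.
            fused = torch.stack([f(fused) for f in stage]).mean(dim=0)
        return fused

# Usage: three per-ROI feature vectors of matching size.
bv, fv, rgb = (torch.randn(8, 512) for _ in range(3))
print(DeepFusion()(bv, fv, rgb).shape)  # torch.Size([8, 512])
```

Compared with early fusion (concatenate once at the input) or late fusion (combine only the outputs), this interleaving lets features from the different views interact at several depths of the network.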

In MV3D:
Input → lidar bird's-eye view, lidar front view, RGB image
Proposals generated on → lidar bird's-eye view
The proposals are projected into the three views (bird's-eye view of the lidar, front view of the lidar, and the image), and the resulting region-wise features are then passed through the neural network (a projection sketch follows).
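A hedged sketch of the projection step. It assumes a KITTI-style 3x4 camera projection matrix and that the box corners have already been transformed into the camera frame before the image projection (the lidar-to-camera transform is omitted); the BEV grid resolution is an assumed placeholder.

```python
import numpy as np

def box3d_corners(center, size, yaw):
    """8 corners of a 3D box given center (x, y, z), size (l, w, h),
    and heading angle yaw around the vertical axis (z-up frame)."""
    l, w, h = size
    x_c = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2
    y_c = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2
    z_c = np.array([0.0, 0.0, 0.0, 0.0, h, h, h, h])
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (rot @ np.stack([x_c, y_c, z_c])).T + center

def project_to_image(corners, P):
    """Project 3D corners (camera frame) to pixels with a 3x4
    projection matrix P, then take the enclosing 2D box."""
    pts = np.hstack([corners, np.ones((8, 1))]) @ P.T
    pts = pts[:, :2] / pts[:, 2:3]  # perspective divide
    return np.array([pts[:, 0].min(), pts[:, 1].min(),
                     pts[:, 0].max(), pts[:, 1].max()])

def project_to_bev(corners, res=0.1):
    """Drop the height axis and return the enclosing box in
    bird's-eye-view grid cells."""
    xy = corners[:, :2] / res
    return np.array([xy[:, 0].min(), xy[:, 1].min(),
                     xy[:, 0].max(), xy[:, 1].max()])

# Hypothetical usage: corners in the lidar frame for the BEV box; for
# the image box they would first be moved into the camera frame.
corners = box3d_corners(np.array([10.0, 2.0, 0.0]), (3.9, 1.6, 1.5), 0.3)
bev_box = project_to_bev(corners)
```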