No More Struggle for Your SLAM Methodology with eCapture Depth Camera


Overview of SLAM

SLAM, or “simultaneous localization and mapping,” is a challenging problem in robotics, especially in indoor applications where the Global Positioning System (GPS) is unavailable for localizing a robot. The keyword “simultaneous” refers to building a map of the environment without any prior knowledge while, at the same time, localizing the robot within that map, without human intervention or ad hoc localization infrastructure. LiDAR was the popular sensor for SLAM in the beginning; camera-based SLAM emerged later and is called “visual SLAM” (vSLAM). Compared with LiDAR-based SLAM, vSLAM captures images of the environment rather than only collecting depth information from scanning, so it has richer information to work with. Machine learning can therefore be applied to extract semantic and contextual object information, giving the higher level of scene understanding that effective navigation and perception tasks usually require.

Feature-based vSLAM vs. Stereo vSLAM

Source: Wavelab Waterloo slides. They are well worth checking out for a deeper look at the many direct methods and a high-level comparison with feature-based SLAM.

vSLAM methodologies are quite diverse. Feature-based vSLAM, as its name suggests, extracts feature points from the images and constructs the map by minimizing the reprojection error, a step for which depth information is essential. That is why a depth camera plays a critical role in vSLAM. Feature-based vSLAM has been studied and developed for decades and has been the mainstream approach in vSLAM.
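To make the reprojection error concrete, here is a minimal Python sketch under a simple pinhole camera model. The intrinsics (fx, fy, cx, cy), the function names, and the numbers are illustrative assumptions, not values or APIs from any eYs3D product, and the camera motion is reduced to a pure translation for brevity.

```python
import math

def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift a pixel (u, v) with a measured depth (metres) to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def project(point_cam, fx, fy, cx, cy):
    """Project a 3D point in the camera frame to pixel coordinates (pinhole model)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)

def reprojection_error(point_cam, observed_px, fx, fy, cx, cy):
    """Pixel distance between where a map point projects and where its feature was observed."""
    u, v = project(point_cam, fx, fy, cx, cy)
    return math.hypot(u - observed_px[0], v - observed_px[1])

# Hypothetical intrinsics for a 640x480 image (assumed values).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# A feature seen at pixel (420, 240) with 2.0 m depth becomes a 3D map point.
map_point = back_project(420.0, 240.0, 2.0, FX, FY, CX, CY)

# Assume the camera then moves 0.1 m to the right (translation only),
# so the point shifts by -0.1 m in x in the new camera frame.
moved_point = (map_point[0] - 0.1, map_point[1], map_point[2])

# The feature is re-detected at (396, 240) in the new image.
error_px = reprojection_error(moved_point, (396.0, 240.0), FX, FY, CX, CY)
print(error_px)
```

A feature-based pipeline adjusts the camera pose and map points so that the sum of such errors over all matched features is as small as possible. The depth measurement is what lets `back_project` place the feature in 3D in the first place.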

Stereo vSLAM, on the other hand, uses full stereo images to deliver a dense or semi-dense map. It has recently drawn more attention because effective navigation and obstacle avoidance require more detailed information about the environment than a sparse map can provide. Another advantage of stereo vSLAM is that it can run on a standard CPU. Its robustness can be harder to guarantee, however, because the mapping relies on the light intensity of the scene to calculate the photometric error.
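The sensitivity to lighting can be illustrated with a small Python sketch of a photometric error, taken here as the mean squared intensity difference between two equally sized image patches. This is one common formulation chosen as an assumption for illustration. The function name and the sample patches are hypothetical.

```python
def photometric_error(ref_patch, cur_patch):
    """Mean squared intensity difference between two equally sized grayscale patches."""
    if len(ref_patch) != len(cur_patch):
        raise ValueError("patches must have the same number of rows")
    total = 0.0
    count = 0
    for ref_row, cur_row in zip(ref_patch, cur_patch):
        if len(ref_row) != len(cur_row):
            raise ValueError("patch rows must have the same length")
        for ref_value, cur_value in zip(ref_row, cur_row):
            total += (ref_value - cur_value) ** 2
            count += 1
    if count == 0:
        raise ValueError("patches must not be empty")
    return total / count

# A tiny 2x2 grayscale patch (hypothetical intensity values).
reference = [[10, 20], [30, 40]]

# The same scene content, but every pixel is 5 levels brighter.
brighter = [[value + 5 for value in row] for row in reference]

same_scene_error = photometric_error(reference, reference)
lighting_change_error = photometric_error(reference, brighter)
print(same_scene_error, lighting_change_error)
```

The geometry is identical in both comparisons, yet the uniformly brighter patch produces a nonzero error. That is why methods built on raw intensities tend to be more affected by illumination changes than methods built on matched feature points.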

eCapture Depth Camera “Interleave Mode” Frees Developers from Making a Choice

As there is no perfect vSLAM methodology yet, users often face a dilemma when choosing one. Among the various vision hardware options for vSLAM, a stereo camera seems the more flexible one because it can output stereo images as well as depth information computed by its ASIC. However, that alone does not resolve the primary issue of “making a choice between the methodologies of vSLAM.”

The interleave mode of eYs3D’s stereo camera frees users from this dilemma. Under interleave mode, two image formats are output in alternating frames: for example, a depth map in one frame and a stereo image in the next, as shown in Fig. 2. This allows a user to implement feature-based and stereo vSLAM at the same time. In eYs3D’s ASV camera module, the IR projector is managed for interleave mode as well. The IR dot projector is alternately turned on and off under precise timing control so that a high-quality depth map and an artifact-free stereo image can both be delivered. Interleave mode is not limited to these two image formats; the combination of output formats can be customized to suit the vSLAM methodology. In other words, interleave mode facilitates “algorithm fusion,” which combines the strengths of the methods and lets each compensate for the other’s drawbacks.
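On the host side, an interleaved stream has to be separated again before each algorithm can consume it. The Python sketch below shows one way this could look. The `Frame` type, its fields, and the pairing rule are assumptions made for illustration and do not represent the eYs3D SDK.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

@dataclass
class Frame:
    """Hypothetical frame record; not an eYs3D SDK type."""
    index: int     # running frame counter from the camera
    kind: str      # "depth" or "stereo" (assumed labels)
    payload: object = None

def pair_interleaved(frames: Iterable[Frame]) -> List[Tuple[Frame, Frame]]:
    """Pair each depth frame with the stereo frame that immediately follows it.

    A depth frame whose successor is missing (for example, a dropped frame)
    is discarded so that every pair comes from adjacent capture times.
    """
    pairs: List[Tuple[Frame, Frame]] = []
    pending_depth = None
    for frame in frames:
        if frame.kind == "depth":
            pending_depth = frame
        elif (
            frame.kind == "stereo"
            and pending_depth is not None
            and frame.index == pending_depth.index + 1
        ):
            pairs.append((pending_depth, frame))
            pending_depth = None
        else:
            pending_depth = None
    return pairs

# Simulated stream: depth on even indices, stereo on odd indices (assumed order).
stream = [Frame(i, "depth" if i % 2 == 0 else "stereo") for i in range(6)]
pairs = pair_interleaved(stream)
print([(d.index, s.index) for d, s in pairs])

# The same stream with frame 3 dropped: depth frame 2 has no partner.
lossy_pairs = pair_interleaved([f for f in stream if f.index != 3])
print([(d.index, s.index) for d, s in lossy_pairs])
```

Each pair could then be routed to two consumers, with the depth frame going to the feature-based pipeline and the stereo frame to the stereo pipeline. Pairing only adjacent frames keeps the two inputs close in time, which matters when the camera is moving.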

Unbox an eYs3D camera and try the interleave mode. Enjoy your vSLAM trip.
