Of the many methods of measuring depth with a camera, the Kinect 1.0 sensor falls within a broad range of technologies that rely on triangulation. Triangulation is the process used by stereo-imaging systems, and it is how the human visual system perceives depth.

Figure 1: Illustration of the stereo-imaging method.

The process is illustrated in Fig. 1, where I show two points in space at varying distances from the camera. Looking at the two images as viewed by the stereo-camera pair, the blue sphere, being closer to the cameras, has a greater disparity in position from camera A to camera B. That is, the blue sphere appears to move almost three-quarters of the cameras' fields of view, while the red sphere moves only half of this distance. This disparity in travel distance between the red and blue spheres is a phenomenon known as parallax, whereby closer objects produce greater degrees of parallax than distant objects.

Of course, the Kinect 1.0 sensor doesn't have two cameras performing triangulation. Instead, it relies on triangulation between a near-infrared camera and a near-infrared laser source to perform a process called structured light. Structured light is one of the earliest methods of 3-D scanning, in which a single light stripe mechanically sweeps across a target object; from the large sequence of images taken as the stripe sweeps across the target, a complete 3-D surface can be reconstructed. Figure 2 shows how this would work for a single frame of video, where the position of the laser appears to move left and right with depth, such that the farther to the right, the closer the target surface is to the camera. Note that with just a single laser stripe, a single frame of video can only reconstruct a single point of the target surface per row of the image sensor. So a 640x480 camera could reconstruct, at most, 480 unique points in space. Hence, we need to sweep the laser so that we can generate many points in each row. The Kinect 1.0 avoids mechanical sweeping by projecting a fixed pseudo-random pattern of dots, so that every frame contains many points in every row.

Of course, you're probably wondering why perspective distortion doesn't make the dots look smaller and closer together as the reflecting surface gets farther away from the camera. It doesn't, because the camera and the laser projector are epipolar rectified. That is, the camera and projector have matching fields of view: as the reflecting surface gets farther from the sensor, the light from the laser projector spreads out, since it is a cone of laser light that grows ever larger the farther you get from its source. So the cone of light that is the projector is expanding at the same rate as the lines of sight of the camera's pixels are expanding. What this means in the captured image is that dots in the projected pattern appear to move left to right with distance, not up and down. So by simply tracking a dot's horizontal coordinate, the Kinect can tell you how far that dot is from the camera sensor.

The problem for the Kinect 1.0 sensor lies in the manner in which it relies on small windows in the captured image. That is, it needs to detect individual dots, and then it needs to find neighboring dots as well. From a constellation of these dots, it can identify the exact constellation of points in the projected dot pattern. Without a constellation of points, there is no way for the Kinect processor to uniquely identify a dot in the projected pattern. We call this an ambiguity, and it means the sensor cannot derive a depth estimate. For computer gaming, this isn't really a problem, because body parts are large enough that whole constellations fit inside the pixels forming your arm. For measuring thin objects like hair or, perhaps, a utility cord as thin as a single image pixel, this is a significant obstacle, and it's the major roadblock that the Kinect 2.0 attempts to address.

The Microsoft Kinect 2.0 sensor relies upon a novel image sensor that indirectly measures the time it takes for pulses of laser light to travel from a laser projector, to a target surface, and then back to an image sensor. How is this possible? Well, quite easily, if you consider that, in the time it takes for one complete clock cycle of a 1 GHz processor, a pulse of light travels about 1 foot.
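The stereo relation described above (closer points shift more between the two views) reduces to the classic triangulation formula Z = f * B / d. Here is a minimal sketch in Python; the focal length and baseline are made-up illustrative values, not any real Kinect calibration:

```python
# Depth from stereo disparity: Z = f * B / d.
# focal_px and baseline_m are illustrative placeholders, not real
# calibration values for any particular sensor.

def depth_from_disparity(disparity_px: float,
                         focal_px: float = 580.0,
                         baseline_m: float = 0.075) -> float:
    """Depth in meters for a feature that shifts `disparity_px` pixels
    between the left and right views."""
    if disparity_px <= 0:
        raise ValueError("zero disparity implies a point at infinity")
    return focal_px * baseline_m / disparity_px

# A large shift (the blue sphere) means a small depth; a small shift
# (the red sphere) means a large depth.
near = depth_from_disparity(60.0)
far = depth_from_disparity(15.0)
assert near < far
```

This is the same reasoning as the parallax example: the blue sphere's larger disparity maps to a smaller Z.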
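The one-point-per-row limit of a single-stripe scanner can be made concrete: per frame, a 480-row sensor yields at most 480 points, one per row, at the column where the stripe is brightest. A small sketch using a synthetic frame (no real camera data; the stripe position is invented for illustration):

```python
# Single-stripe structured light: each sensor row contributes at most
# one reconstructed point, namely the column where the stripe peaks.

WIDTH, HEIGHT = 640, 480

def stripe_columns(frame):
    """frame: HEIGHT x WIDTH nested lists of brightness values.
    Returns, for each row, the column index of the brightest pixel
    (taken to be the laser stripe)."""
    return [max(range(WIDTH), key=lambda c: row[c]) for row in frame]

# Synthetic frame: dark background, bright vertical stripe at column 320.
frame = [[255 if c == 320 else 0 for c in range(WIDTH)]
         for _ in range(HEIGHT)]
cols = stripe_columns(frame)
assert len(cols) == 480           # at most 480 points per frame
assert all(c == 320 for c in cols)
```

Sweeping the stripe (or, as in the Kinect 1.0, projecting a dot pattern) is what lifts this 480-points-per-frame ceiling.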
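The closing claim about clock cycles and the speed of light is easy to verify, and the same arithmetic underlies time-of-flight depth, where depth is half the round-trip distance of the pulse. A quick sketch:

```python
# Sanity-check: in one clock cycle of a 1 GHz processor, light travels
# about one foot.
C = 299_792_458           # speed of light, m/s
CYCLE_S = 1 / 1e9         # one 1 GHz clock period, seconds

dist_m = C * CYCLE_S      # distance light covers in one cycle, ~0.3 m
dist_ft = dist_m / 0.3048
print(round(dist_ft, 2))  # -> 0.98, i.e. about one foot

def tof_depth_m(round_trip_s: float) -> float:
    """Time-of-flight depth: the pulse travels out and back, so the
    surface is at half the round-trip distance."""
    return C * round_trip_s / 2
```

So a surface 3 m away returns the pulse after roughly 20 ns, which is why the Kinect 2.0 sensor measures this timing indirectly rather than with a simple counter.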