Trying to teach computers how to 'see' a tree
Boston — Visual Integration by Semantic Interpretation of Natural Scenes (VISIONS) is as much an eyeful as it is a mouthful. VISIONS is the forerunner of ''seeing'' computers that may one day be mounted on your car for human-free driving, placed on mobile robots for delivering mail at the office, or attached to satellites for monitoring weather patterns. While practical application of computer vision is still years off, researchers are pressing ahead with efforts to ''teach'' computers to recognize and interpret objects and scenes.
Computer ''vision'' works through a process that converts electronic ''images'' from a video camera into numeric values, though digital scanning and laser ranging devices may eventually be employed. The computer then segments the image and compares it with ''images'' already programmed into it.
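As a rough illustration of the ''segmenting'' step, the sketch below treats an image as a grid of numbers and groups neighboring cells that share a value into regions. It is a toy example only: the function name `segment`, the sample grid, and the rule that pixels must be exactly equal are assumptions made here, not details of VISIONS or any other system in this article.

```python
from collections import deque


def segment(image):
    """Label 4-connected regions of equal pixel value.

    `image` is a list of rows of numbers (a hypothetical digitized picture).
    Returns (labels, count): a grid of region ids and the number of regions.
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # already assigned to a region
            value = image[r][c]
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill over up/down/left/right neighbors.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and image[ny][nx] == value):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return labels, count


# A tiny made-up "image": each number stands in for a brightness or color code.
image = [
    [9, 9, 9, 9],
    [9, 3, 3, 9],
    [5, 5, 5, 5],
]
labels, count = segment(image)
```

Real scenes rarely contain runs of identical values, so a practical segmenter would have to group pixels that are merely similar; exact equality is used here only to keep the idea visible.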
The VISIONS system at the University of Massachusetts, one of roughly a dozen programs at universities and corporate laboratories nationwide, is capable of recognizing bushes, trees, and houses from still photographs. Other labs are concentrating on developing three-dimensional imagery, ways of interpreting motion, and stereo vision that mimics human eyesight for determining distance and depth.
''The advent of computer vision is tremendously significant when coupled with artificial intelligence as a whole,'' says Bruce Bullock, head of Intelligence System Group for Hughes Aircraft Corporation. ''Vision is the bottleneck keeping us from significantly wider computer applications.''
A few commercially applicable computer vision systems - ''mere shadows'' of what scientists are contemplating - have already reached the marketplace. Used mainly for inspection purposes and employing existing technology, they view the silhouettes of brightly lit objects moving along a conveyor belt or stationary on a table and compare them with a programmed image of what the object is supposed to look like.
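The silhouette comparison used by such inspection systems can be pictured with a small sketch. Here a silhouette is a grid of 1s (object) and 0s (background), and a part passes if enough cells agree with the stored image. The function names, the 3-by-3 shapes, and the 95 percent threshold are all hypothetical choices for illustration, not a description of any commercial product.

```python
def silhouette_match(observed, template):
    """Return the fraction of cells where two binary silhouettes agree."""
    if len(observed) != len(template) or any(
        len(o_row) != len(t_row) for o_row, t_row in zip(observed, template)
    ):
        raise ValueError("silhouettes must have the same dimensions")
    total = sum(len(row) for row in template)
    same = sum(
        o == t
        for o_row, t_row in zip(observed, template)
        for o, t in zip(o_row, t_row)
    )
    return same / total


def passes_inspection(observed, template, threshold=0.95):
    """Accept the part if its silhouette agrees closely enough (assumed threshold)."""
    return silhouette_match(observed, template) >= threshold


# Stored image of what the part is supposed to look like (a plus shape).
template = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
# A part with its center missing: 8 of 9 cells still agree.
defective = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
```

With these made-up shapes, `passes_inspection(template, template)` accepts the part, while `passes_inspection(defective, template)` rejects it. A comparison this simple assumes the part always arrives in the same position and orientation.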
The first commercial use of sophisticated computer vision is likely to be within the next three to five years in conjunction with industrial robots, says Dr. Edward Riseman, chairman of the Department of Computer and Information Science at the University of Massachusetts. Currently the use of ''blind'' industrial robots is made more expensive because of the need for parts to arrive in a particular order, orientation, and time sequence. The addition of a camera-computer combination would give the robot greater flexibility in such tasks as spot welding.
By far the largest expenditure of research money for computer vision is in defense. Eventually, computers that ''see'' are expected to be standard equipment for missile guidance systems and for satellites used in arms treaty verification.
''What you're going to see is typical technology transfer,'' says Mr. Bullock. ''The Department of Defense is putting so much money into it now, that is where the spinoffs, the breakthroughs, are going to come. On the industrial side, they are going to reach a plateau. There won't be enough money to build increasingly sophisticated machinery. The capability for a second generation will come out of DOD.''
Currently, the VISIONS system at the University of Massachusetts can see a number of blue areas that are side by side and near the top of an image and conclude that they are the sky. In addition, the computer compares what it sees with descriptions of objects it has been programmed with. But such programming is no easy chore.
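The ''blue and near the top means sky'' reasoning can be written as a simple rule. The sketch below is a guess at the general flavor of such a rule, not the actual VISIONS program: the region format, the helper names `is_blue` and `label_sky`, and the 40 percent cutoff are all assumptions.

```python
def is_blue(rgb):
    """Crude color test: the blue channel dominates red and green."""
    r, g, b = rgb
    return b > r and b > g


def label_sky(regions, image_height, top_fraction=0.4):
    """Mark each region True if it is blue and centered in the top of the image.

    Each region is a dict with 'color' (average RGB) and 'rows' (top, bottom),
    where row 0 is the top edge of the picture. Both keys are hypothetical.
    """
    results = []
    for region in regions:
        top, bottom = region["rows"]
        center = (top + bottom) / 2
        near_top = center < top_fraction * image_height
        results.append(is_blue(region["color"]) and near_top)
    return results


# Made-up regions from a 100-row image.
regions = [
    {"color": (90, 140, 220), "rows": (0, 30)},   # blue, near the top
    {"color": (60, 100, 200), "rows": (70, 90)},  # blue, but low (a pond?)
    {"color": (40, 120, 50), "rows": (10, 60)},   # high up, but green (a tree?)
]
sky_flags = label_sky(regions, image_height=100)
```

Only the first region is labeled sky. Even this toy rule hints at why the programming is hard: a blue roof near the top of the frame would fool it.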
''I never knew how complicated a tree was until I tried to describe one to a computer,'' says Terry Weymouth, a graduate student with the VISIONS program.
Even more difficult are problems such as shadows, texture, motion, and ''edging'' - defining the edge of an image. The human eye can fill in the implied space in a cartoonist's image or caricature; the computer has trouble doing that with natural scenes.
Though there isn't one particular stumbling block hindering the acceleration of research, the lack of sufficient hardware is a complicating factor. In particular, there's a need for more powerful, yet less expensive computers that could handle the massive number of calculations involved.
''The practice of computer vision is behind the state of the art in research,'' says Takeo Kanade, head of the vision laboratory of the Robotics Institute at Carnegie-Mellon. ''Engineering problems abound. We know the solutions in many cases but we can't prove them technologically.''
Scientists predict that the future development of computer vision will be a slow, evolutionary process. ''There won't be any major breakthroughs in computer vision development,'' says Dr. Riseman. ''It is so complex and there are so many parts to it that no single advancement can constitute a breakthrough. We're not going to see a complete vision system overnight.''