Some examples of images to classify


The question came up: why would I go to the trouble of training a neural network, rather than just checking "is the ground in front of me gray-colored?"

I've tried simple color sensing before, and it works OK for "find an orange cone against a non-orange background." It does not work well for "separate a gray concrete floor from a gray concrete wall" or "separate a horizontal mulch ground from a vertical tree trunk" or "separate a well-manicured lawn from a well-manicured hedge or square hay bale."
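For reference, here is roughly what that simple color test looks like. This is a minimal sketch assuming OpenCV, with illustrative (not tuned) HSV thresholds and an arbitrary choice of "the bottom quarter of the frame" as the ground region:

```python
import cv2
import numpy as np

def ground_ahead_is_gray(frame_bgr, sat_max=40, val_min=60, val_max=200):
    """Naive color test: is the strip of image just ahead mostly gray?

    "Gray" here means low saturation and mid-range brightness in HSV.
    All thresholds are illustrative examples, not tuned values.
    """
    h, w = frame_bgr.shape[:2]
    # Look at the bottom quarter of the frame, roughly "the ground in front of me".
    roi = frame_bgr[int(h * 0.75):, :]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    sat = hsv[:, :, 1]
    val = hsv[:, :, 2]
    gray_mask = (sat < sat_max) & (val > val_min) & (val < val_max)
    # Call it "go" if most of the strip looks gray.
    return gray_mask.mean() > 0.8
```

The trouble is that a gray wall and a gray floor pass exactly the same test, which is why this approach breaks down on the cases above.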

Here are some example images to show the input, and the manual classification I'm making of "stop" (red), "go" (green), and "interest" (blue). These are the classes I'm training the network to recognize.

[Image grid: Merged, Classified, and Source views of three example scenes: dirt, mulch, and cone.]

Again, these are manually labeled images, part of the hundreds I have to paint myself to start teaching the computer what to think.
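For concreteness, painted red/green/blue strokes could be decoded into per-pixel training labels along these lines. The channel-dominance rule, the ignore value, and the class indices are my assumptions for illustration, not a description of the actual labeling tool:

```python
import numpy as np

# Hypothetical convention: pure red = stop, green = go, blue = interest.
CLASSES = {"stop": 0, "go": 1, "interest": 2}

def labels_from_overlay(overlay_rgb, threshold=128):
    """Turn a hand-painted RGB overlay into a per-pixel class map.

    overlay_rgb: HxWx3 uint8 array. A pixel counts as a class if its
    channel (R for stop, G for go, B for interest) is bright and dominant.
    Unpainted pixels get -1, meaning "ignore during training".
    """
    r = overlay_rgb[:, :, 0].astype(np.int16)
    g = overlay_rgb[:, :, 1].astype(np.int16)
    b = overlay_rgb[:, :, 2].astype(np.int16)
    labels = np.full(overlay_rgb.shape[:2], -1, dtype=np.int8)
    labels[(r > threshold) & (r > g) & (r > b)] = CLASSES["stop"]
    labels[(g > threshold) & (g > r) & (g > b)] = CLASSES["go"]
    labels[(b > threshold) & (b > r) & (b > g)] = CLASSES["interest"]
    return labels
```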

Once this training is done, I will move on to object detection, which will let me sense "people" versus "other rovers" and such, and presumably make better safety decisions.

Finally, I'm looking at integrating stereo vision based on patch-based similarity measures for disparity (image separation). This would let me build a 3D model of the surroundings, which, combined with the stop/go information, would allow for autonomous navigation even absent encoder input.
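To make "patch-based similarity for disparity" concrete, here is a brute-force sketch using sum of absolute differences (SAD) over rectified image pairs. The patch size, disparity range, and SAD as the cost function are illustrative assumptions; a real implementation would use something like OpenCV's StereoBM, which is far faster:

```python
import numpy as np

def disparity_sad(left, right, patch=7, max_disp=64):
    """Patch-based disparity via sum of absolute differences (SAD).

    left, right: rectified grayscale images (HxW uint8), so matching
    points lie on the same row. For each pixel, slide a patch along the
    same row of the right image and keep the horizontal shift (disparity)
    with the lowest SAD cost. Purely illustrative, and very slow.
    """
    half = patch // 2
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.float32)
    L = left.astype(np.int32)
    R = right.astype(np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            ref = L[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = None, 0
            for d in range(max_disp):
                cand = R[y - half:y + half + 1, x - d - half:x - d + half + 1]
                cost = np.abs(ref - cand).sum()
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

Given the camera focal length f (in pixels) and the stereo baseline B, each disparity d then maps to depth Z = f*B/d, which is where the 3D model comes from.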

At the SparkFun AVC, there is a significant bonus for vehicles that don't use GPS, and I've found GPS slow and inaccurate enough that I'd rather depend on a gyroscope, accelerometer, and magnetometer (compass), together with high-resolution rotation encoders on each wheel. If I could add image-based movement detection / SLAM on top of that, I feel I'd have a very robust solution.
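As an illustration of the gyro-plus-compass part, a complementary filter is one simple way to fuse a fast-but-drifting yaw rate with a slow-but-absolute compass heading. This is a generic sketch, not the rover's actual filter, and the blend factor alpha is an arbitrary example value:

```python
import math

def fuse_heading(prev_heading, gyro_z, dt, mag_heading, alpha=0.98):
    """Complementary filter for heading: trust the gyro short-term,
    the magnetometer long-term.

    prev_heading: previous fused heading (radians)
    gyro_z:       yaw rate from the gyro (radians/second)
    dt:           timestep (seconds)
    mag_heading:  absolute heading from the magnetometer (radians)
    alpha:        blend factor; higher means more gyro, less compass
    """
    # Integrate the gyro for a smooth short-term estimate.
    gyro_estimate = prev_heading + gyro_z * dt
    # Pull the estimate toward the compass, handling angle wraparound
    # by computing the error through atan2.
    error = math.atan2(math.sin(mag_heading - gyro_estimate),
                       math.cos(mag_heading - gyro_estimate))
    return gyro_estimate + (1.0 - alpha) * error
```

The design choice here is that the gyro is smooth but drifts over time, while the compass is noisy but doesn't drift, so blending a small fraction of the compass correction in every step gives the best of both.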