CS395T: Autonomous Robots -- Assignment 3
Assignment 3: Low-Level and High-Level Vision
This assignment has been designed to give you a feel for the problems of
doing visual processing in a constrained mobile robot domain. The
assignment consists of two parts: Part I focuses on the problem of Color
Segmentation, while Part II introduces you to the process of Object
Recognition. The associated code and data for the entire assignment
can be found at: /robosoccer/assignments/prog3/ on
Vieri. All directories mentioned below refer to sub-directories
under this directory.
Both parts of the assignment are to be turned in by the
due date, but some sections may take considerably more work than
others - Part I is definitely simpler than Part II. Pace yourself
accordingly - please do not leave the tasks to the day before they are
due.
Part I - Color Segmentation.
Your job here is to develop a Classifier that performs Color
Segmentation.
I(a) Normal Testing conditions:
- You will be given a set of N input pixel values and the
corresponding color labels, i.e., a set of training samples, each
consisting of the color space values (Y, Cb, Cr) at an image pixel and
the corresponding color class (C) that the pixel belongs to (the ground
truth):
Y1, Cb1, Cr1, C1
Y2, Cb2, Cr2, C2
...
YN, CbN, CrN, CN
The training dataset is provided (TrainingData.txt) in the
directory mentioned above. It can also be downloaded here
- In the current problem domain, the color class can be one of eight
different colors in the robot's environment: Pink (0), Yellow (1),
Blue (2), Orange (3), Red (5), White (7), Green (8) and Black (9);
the numbers in parentheses denote the corresponding class labels
-- the values that C can possibly take.
- The problem then reduces to a discrete classification problem
wherein, given an input pixel, we need to determine which of the color
classes it belongs to (or, in case an exact match is not possible,
which class it is closest to). Though real-time performance is
desired, at this stage you do not have to worry about the
computational complexity of your technique.
- You have full freedom to use any technique, either something that
you develop or something that you have read in one of the assigned (or
related) readings. It could, for example, be one of the popular
machine learning algorithms, such as nearest neighbor, decision
trees, Bayesian approaches, support vector machines or neural
networks (a minimal nearest-neighbor sketch is given at the end of
this part). The course on Machine Learning
by Dr. Mooney and the book
by Tom Mitchell are good sources for more information on these
topics. Or make up something new!
- Remember that you just have a list of pixel values and color
labels. Nothing is specified regarding how likely this color
labeling is. So, do not worry if the results are not great. This is
just to give you a feel for the problem.
- The desired output is a model that, given an image pixel, can
assign a suitable color class to it. To test your model, you can use
the test images provided in the directory: HW3Images/. The test
images are the ones with the names consisting of yuvh and a
4-digit image number. The images are in the .ppm format and a
function to read/write such a file is provided (see below).
- The images in the directory are in the YCbCr format and
are not visually informative. So the corresponding RGB color
space images are also provided (names beginning with
rgbh). Some sample results are provided in the same directory
(names beginning with segh). Your results are only expected
to look reasonable - they need not be perfect.
- You are also given a file of C functions, ClassHW3.c, in the
same directory, which contains a few functions that you could use to make
your job easier. Some of the functions are:
- YCbCr2RGB(): Converts an input pixel from the YCbCr color space
to RGB.
- Convert_Raw_Image(): Converts an image from YCbCr to
RGB - takes care of reading in .ppm files.
- Segment_Input_Image(): Segments an input image and saves it in a
visually informative format.
- Classifier(): Takes in an input pixel and classifies it and sets
a suitable RGB value to be written into the output file.
- Provide_Color_Class(): Basically the function that needs to be
filled in - currently classifies each input pixel as BLACK.
- Feel free to change or replace any of the functions, or to devise
an entirely new strategy.
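For concreteness, here is a minimal sketch of the kind of nearest-neighbor
classifier suggested above. It is an illustration under stated assumptions,
not the provided interface: it assumes TrainingData.txt holds one
comma-separated "Y, Cb, Cr, C" sample per line, and the names
loadTrainingData() and classifyPixel() are hypothetical -- in practice you
would call something like classifyPixel() from inside Provide_Color_Class().

    #include <algorithm>
    #include <cstddef>
    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>

    // One labeled training sample: a YCbCr pixel and its color class label (0-9).
    struct Sample { int y, cb, cr, label; };

    // Read "Y, Cb, Cr, C" lines from the training file (line format assumed).
    std::vector<Sample> loadTrainingData(const char *path) {
        std::vector<Sample> data;
        std::ifstream in(path);
        std::string line;
        while (std::getline(in, line)) {
            std::replace(line.begin(), line.end(), ',', ' ');
            std::istringstream ss(line);
            Sample s;
            if (ss >> s.y >> s.cb >> s.cr >> s.label)
                data.push_back(s);
        }
        return data;
    }

    // 1-nearest-neighbor: return the label of the closest training sample,
    // using squared Euclidean distance in YCbCr space.
    int classifyPixel(const std::vector<Sample> &data, int y, int cb, int cr) {
        int best = 9;        // default to Black (9), as the given Provide_Color_Class() does
        long bestDist = -1;
        for (std::size_t i = 0; i < data.size(); ++i) {
            long dy = y - data[i].y, dcb = cb - data[i].cb, dcr = cr - data[i].cr;
            long d = dy * dy + dcb * dcb + dcr * dcr;
            if (bestDist < 0 || d < bestDist) { bestDist = d; best = data[i].label; }
        }
        return best;
    }

    int main() {
        std::vector<Sample> data = loadTrainingData("TrainingData.txt");
        // Classify one example pixel; Provide_Color_Class() would do this per image pixel.
        std::cout << classifyPixel(data, 120, 100, 180) << std::endl;
        return 0;
    }

A linear scan through all N training samples per pixel is slow, but that is
fine here since real-time performance is not required for Part I; a
precomputed lookup table over the YCbCr cube (or a k-d tree) would be the
natural next step if speed becomes an issue.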
I(b) Changed Illumination conditions:
Just to give you a feel for the sensitivity of color segmentation to
illumination changes, try your classifier on the set of images
captured under a different set of illumination conditions -- directory
Illum2/. Your results will probably not be good, but don't
worry about it - there is no need to tune your classifier for the
different conditions.
Deliverables:
- Your implementation of the classifier function or some other
system that you developed. Also a directory of results on the set of
sample images provided (for example, the output of
Segment_Input_Image(), if you use that function).
- Note that your program may be tested on other images that are not
mentioned here.
Part II - Object Recognition.
Here, the objective is to design a system that recognizes
objects in the input images. At this point, you are given a vision
system that can perform low-level visual processing to a reasonably
good degree. This implies that the vision system works up to the stage
where the output is a set of bounding boxes for the various
candidate blobs in each image. It also provides a separate list of
blobs corresponding to each color. The properties of the blobs (that
can be used for further processing) are explained in the team
tech-report. You can either decide to use the entire system or only
the output from the color segmentation stage.
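To make that interface concrete, the sketch below shows roughly what a
per-color blob list could look like. The field names are purely
illustrative assumptions; the real BoundingBox structure and its fields are
defined in the provided code and described in the team tech-report.

    // Illustrative only -- the real BoundingBox structure in the provided
    // vision code defines the actual field names.
    struct CandidateBlob {
        int xMin, yMin, xMax, yMax;   // image-plane bounding box of the blob
        int numPixels;                // number of run-length pixels merged into it
        CandidateBlob *next;          // next candidate blob of the same color (NULL at end)
    };
    // Conceptually, the vision system hands you one such linked list per
    // color class, and your object-recognition code walks those lists.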
Note: All the testing for this section should be done on the
larger field with all the (overhead) lamps switched on. Ask
somebody at the lab if you are not sure how to operate the
lamps. Remember to switch them on only when you need to use
them, but please do not turn the lamps on and off frequently. Also,
when turning them on or off, please do so slowly. Hopefully, you have
realized by now that color segmentation is sensitive to illumination
changes. Remember, if any of the lamps need to be replaced,
everybody shall have to wait until that is done.
- The required code can be found in the soccer/ subdirectory
(under /robosoccer/assignments/prog3/) on Vieri. This in turn
has two major subdirectories, Brain/ and CommObject/, and a
few other utilities. All the code you need is within Brain/;
the code in CommObject/ is only there to help in the visualization
process (through UTAssist/) if you desire - more on this below.
- Note that for this assignment, the motion module has be
completely removed and hence the robot will perform no motion at
all - do not be surprised if the robot does not stretch or
stand up on waking up. It will just stay flat on the ground.
We will put the vision and motion modules together for a later
assignment though you can go ahead and play with that task right now!
- Also note that the task of finding objects under various
tilt-pan-roll combinations of the robot's camera is not something you
need to consider now. Assume that your code will be tested only
under the conditions where the robot is on the ground and only the
head pan (left-right) motion is performed.
- To use the UTAssist tool, you need to be familiar with it -
try to get started on that a little before you actually begin working
on this part of the assignment.
- To get started on this assignment, you need to first copy over
the soccer/ directory to your home directory. Then, as usual,
you enter this directory (the main directory of the program) and type
make, which creates the appropriate .BIN files. Next,
type make install, which puts the binary files in the
MS/ directory. You are now ready to load the code onto the robot
using the same procedure as in Assignment 1.
- Within Brain/, there are two main subdirectories,
Vision/ and UTAssistClient/, and a few more files, of
which you need to be familiar only with Brain.{cc,h}
and Debug.{cc,h}. Once again, the UTAssistClient/
directory is just to help you visualize things better - debugging vision
code without such a visualization tool can be very difficult!
- The Brain runs the overall algorithm, which in this case
consists of just processing an image (vision->processImage())
and then displaying various patterns on the LEDs on the robot's face
(debug->Update()). Right now the code is set up to light up all
the face LEDs every cycle, but it can easily be modified using the
patterns that have already been defined for you -- see
UpdateDisplay() in Brain.cc. If the appropriate flags are
set to true in the vision implementation, the suitable LEDs shall
automatically light up.
- Within the Vision/ directory is an implementation of vision
on the robot, SampleVision/, in fact parts of the code we
actually use on the robot. In this implementation you are only given
code that performs vision up to the stage of finding bounding
boxes and you have to extend it further to actually find the
objects of interest -- more info on that below. You can re-implement
previous approaches (from the assigned reading) or come up with
something new!
- For full credit your algorithm should be able to detect the orange
ball, the yellow goal, and one of the two yellow/pink beacons. If you
cannot get everything working, focus on the orange ball and the yellow
goal. Only minimal points shall be deducted if you cannot get the
beacon. Your system need not be perfect, but you should try to avoid
too many false positives. Test it when the robot is near and up to a
half field-length from the object in question, and also when no object
is present. Note that it is not sufficient to just identify the
largest blob of a given color in the image, since you would then report
the object in many situations in which it is not actually present; a
sketch of simple plausibility checks follows just after this list.
- If you are ambitious, you can have the algorithm detect both
goals, the ball and multiple beacons. Once again, feel free to re-implement a
previously developed approach and/or to come up with something new...
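Since the largest blob of a color is not enough by itself (see above), a
few cheap plausibility checks on each candidate go a long way toward
avoiding false positives. The sketch below is one possible set of checks
for the ball; the specific thresholds, and the idea of testing size,
aspect ratio, and fill density, are assumptions you would have to tune on
the real robot rather than values taken from the provided code.

    // Simple sanity checks on an orange-ball candidate, given its bounding-box
    // width/height and the number of orange pixels inside it.
    // All thresholds are illustrative and need tuning on real images.
    bool plausibleBall(int width, int height, int numPixels) {
        if (width < 4 || height < 4)
            return false;                              // too small: probably noise
        double aspect = (double)width / (double)height;
        if (aspect < 0.5 || aspect > 2.0)
            return false;                              // a ball blob should be roughly square
        double density = (double)numPixels / (double)(width * height);
        if (density < 0.4)
            return false;                              // a filled circle covers ~pi/4 of its box
        return true;
    }

Checks like these reject scattered orange noise while still accepting a
partially occluded ball out to roughly half a field-length.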
The system is set up to provide output in two forms:
- The system combines the vision system with the LED display system
and lights up specific LEDs when specific objects are seen on the
image. The appropriate flags have been provided in the vision module
and if they are set to be true the LEDs shall automatically
light up.
- The system provides the position in the image plane of the
various objects seen at any given instant (see the example after this
list). To print it out to the screen, you can use the OSYSPRINT
function and telnet to the robot. If you are interested, you can also
calculate the position of the objects relative to the robot.
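For the second form of output, a debugging print might look like the sketch
below. It assumes OSYSPRINT follows the usual OPEN-R double-parenthesis
convention and that the header name is OPENR/OSyslog.h; the ballX/ballY
variables are hypothetical names for whatever your detection code computes.

    #include <OPENR/OSyslog.h>   // header name assumed; provides OSYSPRINT

    // Hypothetical helper: report the ball's image-plane position each frame.
    // The output appears on a telnet console connected to the robot.
    void reportBall(bool ballFound, int ballX, int ballY) {
        if (ballFound) {
            // Note the double parentheses required by the OSYSPRINT macro.
            OSYSPRINT(("Ball seen at image position (%d, %d)\n", ballX, ballY));
        }
    }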
To get from the given code to the desired outputs, you first need some
information on the code:
- The vision code you need is in the files:
SampleVision.{cc,h} in the SampleVision/ directory. In
particular, the code for creating run-lengths and merging them into
BoundingBoxes is in the function ProcessImage() in
SampleVision.cc.
- Specifically, you need not worry about the code until the point
where it says Find_Objects(). That is the function which you
shall have to define to detect the objects of interest. Within that
function, you are given the links to the first blob in the linked list
of valid blobs for each color, for example,
colorListHead[YELLOW]. For more info on these blobs, look at
the BoundingBox structure. A skeleton illustrating one way to use
these lists is sketched just after this list.
- The functions Find_Blue_Goal(), Find_Yellow_Goal(),
Find_Ball() are place-holders -- you need to define them to
actually find the goals and ball.
- The function Save_Send_UTAssist_Data() saves the useful
object properties in vision_objects to be sent to the UTAssist
tool. All the code for compressing and sending data to UTAssist
is already implemented - you only need to set the data in the suitable
structures.
- There are some additional functions in the file, provided for your
use (if you so desire), to perform basic operations on BoundingBoxes
and image files. In particular, setting the flag savetemp --
look for savetemp = true; near the top of the function
ProcessImage() -- saves the current image and/or
the segmented version of it as a .ppm file on the memory
stick. Currently, it saves the input image (in YCbCr color
space) when you ask for a large image through UTAssist. There
are several other options (image formats and color spaces) which you
can find in the SampleVision.h file.
- Finally, to send (receive) data to (from) UTAssist, you need to
set the same server port number in the file
CommObjectConfig.h (in the CommObject/ directory under
soccer/) and in the file Server.java under
UTAssist. Two people cannot run UTAssist on the same
port. Currently the port is set to 52341 in both files. I would
suggest that you choose a number between 52320 and 52340 and stick
to it.
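Putting the pieces together, a Find_Objects() skeleton might walk each
color's blob list and hand the best surviving candidate to the place-holder
functions. Everything below about the BoundingBox fields, the list-head
index for orange, and the function signatures is an assumption for
illustration -- use whatever SampleVision.h actually defines.

    #include <cstddef>

    // Skeleton only: field names (next, xMin, ..., numPixels) and signatures
    // are assumed stand-ins for the real definitions in SampleVision.{cc,h}.
    struct BoundingBox {
        int xMin, yMin, xMax, yMax, numPixels;
        BoundingBox *next;                  // next valid blob of the same color
    };
    bool plausibleBall(int width, int height, int numPixels);  // checks from the earlier sketch

    // Walk the orange blob list (e.g. colorListHead[ORANGE], assuming an index
    // analogous to the YELLOW one mentioned above) and keep the best candidate.
    BoundingBox *Find_Ball(BoundingBox *orangeBlobs) {
        BoundingBox *best = NULL;
        int bestArea = 0;
        for (BoundingBox *b = orangeBlobs; b != NULL; b = b->next) {
            int w = b->xMax - b->xMin + 1;
            int h = b->yMax - b->yMin + 1;
            if (!plausibleBall(w, h, b->numPixels))
                continue;                   // reject implausible orange blobs
            if (w * h > bestArea) { bestArea = w * h; best = b; }
        }
        return best;   // NULL if no plausible ball; the caller sets BallFound accordingly
    }

Find_Yellow_Goal() and Find_Blue_Goal() could follow the same pattern over
their own color lists, with goal-specific plausibility checks.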
Next, some information on the UTAssist debugging tool:
- Copy the code over from the directory UTAssist/ under
/robosoccer/assignments/prog3/. Enter the directory and type
javac *.java to generate all the necessary .class
files. Then typing java Server should get the server up and
running. If UTAssist runs too slowly or runs out of memory, you might
try starting it by typing java -Xmx600m Server.
- For the purposes of this assignment, you need to be able to look
at the images that the robot sends over. To do so, under the main menu
on UTAssist, choose: Tools->VisionObjectsViewer and then
Tools->CommandViewer.
- On the CommandViewer you can select the desired options on
the suitable connection and click on Update to send this info to the
robot. For example, to see the segmented image on the robot with IP 40,
which is, say, connected on Connection 0:, you would select the
suitable check-box on the CommandViewer and click on
Update. You should hear the robot's ear click each time you
send such a message.
For some more information on UTAssist, take a look at this page: UTAssist Info. Some of the
information there may be outdated, such as the numbers on the sizes of
various messages. You shall get more information and a demo on
UTAssist in class.
Finally, some information on the LED interface that has already been set
up for you:
- The various LED patterns have already been set up for you. If you
set the suitable flags (for example, see
Brain::getDebug()->BlueGoalFound in SampleVision.cc and
Brain.cc), the LEDs shall automatically light up.
- Currently the LEDs are set up to mirror the field positions. The LED
in the left central region of the robot's face lights up for the yellow
goal, while the symmetric one on the right lights up for the
blue goal. Setting BallFound = true; lights up an LED at
the center. The beacons with pink on top shall light up the
suitable LEDs on the left side of the robot's face -
pink-on-blue lights up the lower left LED. A similar arrangement
exists for the other two beacons.
- For more information on LED arrangement, see:
ModelInformationGuide.pdf.
- Currently the default setting assumes that all the beacons and the
ball can be seen in each visual frame, and hence the corresponding LEDs
remain lit all the time. They should actually light up only when the
corresponding objects are seen, as in the short sketch after this list.
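So, after your detection code runs, set the flags from that frame's
results rather than leaving them always true. A minimal fragment (not
standalone code) for the end of your detection pass in SampleVision.cc,
assuming the flag members referenced above:

    // Light the LEDs only for objects actually detected in this frame.
    // BallFound and BlueGoalFound are the flags referenced above; other flag
    // names (e.g. for the yellow goal) are assumed to follow the same pattern.
    Brain::getDebug()->BallFound     = (ballBlob != NULL);   // ballBlob: your detection result
    Brain::getDebug()->BlueGoalFound = blueGoalDetected;     // hypothetical local variable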
Deliverables:
- You shall be asked to demo your code in class. The robot shall be
placed at different positions on the field and the head will be panned
(left-right). Your robot shall have to recognize the presence of
specific objects and display an appropriate output (LEDs or telnet).