Undergraduate Research in the IVILAB

The Interdisciplinary Visual Intelligence group has an active program for involving undergraduates in research. We strongly encourage capable and interested undergraduate students to become involved with research early on. Doing so is becoming increasingly critical for preparing for grad school and has also helped many students get jobs in industry. Undergraduates can do research for pay, as part of academics (e.g., independent studies or honors projects), as volunteers, or some combination of these over time (most common).

What the IVILAB does. Computer vision is about algorithms for understanding images (and video). Our approach is to develop semantic representations (models) for what is relevant to real problems, and infer the particulars of the models from image data (the "evidence"). We apply this approach to understanding objects and scenes, human activities (from video), biological structure from images, and web images with captions. In addition, we apply computer vision methods to make educational video more accessible (the SLIC project), and we apply our inference and modeling experience to biological problems such as population genetics where the data is molecular (not images). If any of these topics seem interesting to you, read on!

Life in the IVILAB as an undergraduate researcher. Undergraduate researchers working with the computer vision group are paired up with one or more graduate student mentors based on which project(s) they become involved with. The vision group follows a model that integrates teaching and mentoring, tasks where undergraduates can be productive relatively quickly (e.g., data collection, labeling/marking images, and straightforward programming tasks), and longer-range research work where graduate students invest time to guide undergraduates on research projects. Here the expectation is that undergraduate researchers will become invaluable to the project after some time, which has proven true on multiple occasions. The computer vision group is an especially supportive, friendly, and welcoming research group. Undergraduates are encouraged to take part in all lab activities. Active undergraduates typically have desk space with other vision students, which further helps undergraduates become productive in research.

Training. The vision group runs training seminar sequences such as the summer "boot camp" where we train new students on the IVILAB computation infrastructure, basic programming in C/C++, effective use of Unix, and vision lab software development conventions. Additional training sessions, either during the academic year or on different topics, are arranged as needed. When there is interest, we run a weekly vision seminar that is aimed at the undergraduate level (link to the especially well-organized 2009 version). Finally, undergraduate research students are encouraged to attend our ongoing vision meetings where we discuss papers and ideas related to computer vision.

Publishing. Undergraduate researchers working with the vision group have a great record of contributing sufficiently to research projects that they become authors on papers. So far, twelve undergraduates have been authors on nineteen vision lab papers and three abstracts. Click here for the list.

Outreach. The vision group undertakes a number of broader education/outreach activities, and we have had the good fortune of lots of help from undergraduates. For example, undergraduates who work with us in the summer are strongly encouraged to help out with our Integration of Science and Computing (ISC) summer camp, which is a lot of fun and very rewarding.

IVILAB undergraduate researchers past and present. Students who have participated in the vision lab as undergraduates include Matthew Johnson (honors student, graduated December 2003), Abin Shahab (honors student, graduated May 2004), Ekaterina (Kate) Taralova (now at CMU), Juhanni Torkkola (now at Microsoft), Andrew Winslow (now at Tufts), Daniel Mathis, Mike Thompson, Sam Martin, Johnson Truong (now at SMU), Andrew Emmott (headed to Oregon State), Ken Wright, Steve Zhou, Phillip Lee, James Magahern, Emily Hartley, Steven Gregory, Bonnie Kermgard, Gabriel Wilson, Alexander Danehy, Daniel Fried, Joshua Bowdish, Lui Lui, Ben Dicken, Haziel Zuniga, Mark Fischer, Matthew Burns, Racheal Gladysz, Salika Dunatunga (honors, now at U. Penn), Kristle Schulz (honors), and Soumya Srivastava.

Further information. Undergraduates interested in the vision lab's research should contact Kobus by e-mail (kobus AT cs DOT arizona DOT edu), or the contact person for specific projects. Some of our projects that enjoyed the help of undergraduates are showcased below. Additional vision lab projects are listed on a page linked here. All vision lab projects have the potential to provide good undergraduate research experience.


Examples of undergraduate research in the computer vision lab


Understanding Scene Geometry

The image(*) to the right shows undergraduate Emily Hartley determining the geometry of an indoor scene and the parameters of the camera that took the picture of the scene. Such data is critical for both training and validating systems that automatically infer scene geometry, the camera parameters, the objects within the scene, and their location and pose.

Funding for undergraduates provided by REU supplement to NSF CAREER grant IIS-0747511.

(*) Photo credit Robert Walker Photography.


Semantically Linked Instructional Content (SLIC)

The image shows a screenshot of the SLIC educational video browsing system, which is an excellent project for undergraduates interested in multimedia and web design. For more information, see the SLIC project page or contact Yekaterina (Kate) Kharitoova (ykk AT email DOT arizona DOT edu). SLIC has led to several publications with undergraduate authors, and several undergraduate students are currently working on it.

SLIC is partly supported by NSF grant EF-0735191.


Aligning image caption words with image elements

There are now millions of images online with associated text (e.g., captions). Information in captions is either redundant (e.g., the word dog occurs, and the dog is obvious) or complementary (e.g., there is sky above the dog, but it is not mentioned). Redundant information allows us to train machine learning methods to predict one of these modalities from the other. Alternatively, complementary information in the modalities can disambiguate uncertainty (see "Word Sense Disambiguation with Pictures" below), or provide for combined visual and textual searching and data mining. Under the guidance of Ph.D. student Luca del Pero, undergraduates Phil Lee, James Magahern, and Emily Hartley have contributed to research on using object detectors to improve the alignment of natural language captions to image data, which has already led to a publication for them. For more information on this project, contact Luca del Pero (delpero AT cs DOT arizona DOT edu).

Funding for undergraduates provided by ONR and REU supplement to NSF CAREER grant IIS-0747511.

On the left is the baseline result; the image on the right shows the result with detectors.


Simultaneously tracking many objects with applications to biological growth

Simultaneously tracking many objects with overlapping trajectories is hard because you do not know which detections belong to which objects. The vision lab has developed a new approach to this problem and has applied it to several kinds of data. For example, the image to the right shows tubes that are growing out of pollen specks (not visible) towards ovules (out of the picture) in an in vitro plant fertilization experiment. The paths of the pollen tubes are green, and the tracks automatically found are superimposed in red. This work has led to a publication for undergraduate researcher Phil Lee. This project is being led by Ph.D. student Ernesto Brau. For more information contact Ernesto (ernesto AT cs DOT arizona DOT edu).

This project is in collaboration with the Palanivelu lab.

Funding for this project provided by NSF grant IOS-0723421.


Identifying machine parts using CAD models

CAD models provide the 3D structure of many man-made objects such as machine parts. This project aims to find objects in images based on these models. However, since the data is most readily available as triangular meshes, 3D features that are useful for matching 2D images must be extracted from mesh data. Undergraduate Emily Hartley has contributed software for this task, and undergraduate Andrew Emmott has contributed software for matching extracted 3D features to 2D images. They have been mentored by Ph.D. student Luca del Pero. For more information, contact him (delpero AT cs DOT arizona DOT edu).

Funding for undergraduates provided by NSF Grant 0758596 and REU supplement to NSF CAREER grant IIS-0747511.


Inferring Plant Structure from Images

Quantifying plant geometry is critical for understanding how subtle details in form are caused by molecular and environmental changes. Developing automated methods for determining plant structure from images is motivated by the difficulty of extracting these details by human inspection, together with the need for high throughput experiments where we can test against a large number of variables.

To get numbers for structure we fit geometric models of plants to image data. The picture on the right shows multiple views of an Arabidopsis plant (top), two views (bottom, left), and fits of the skeleton to the image data, projected using camera models corresponding to those two views. Undergraduates Sam Martin and Emily Hartley have helped collect the image data, arrange feature extraction, and create ground truth data fits to it for training and evaluation. This project is led by Kyle Simek. For more information, contact him (ksimek AT cs DOT arizona DOT edu).

This project is in collaboration with the Palanivelu and Wing labs.

Funding for undergraduates provided by the NSF funded iPlant project, via the University of Arizona UBRP program and an REU supplement to NSF CAREER grant IIS-0747511.


Modeling and visualizing Alternaria

To the right is a labeled model of the fungus Alternaria generated by a stochastic L-system built by undergraduate researcher Kate Taralova. For more information, follow this link.
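For readers unfamiliar with the term, a stochastic L-system grows a structure by repeatedly rewriting symbols in a string, choosing among several possible replacements at random. The sketch below is a generic illustration only, not the Alternaria model built for this project: the symbols, the rules in `RULES`, their probabilities, and the function name `expand` are all invented for the example.

```python
import random

# Hypothetical rules: each symbol maps to a list of (probability, replacement).
# "F" stands for a segment, "[" and "]" open and close a branch,
# "+" and "-" stand for turning one way or the other.
RULES = {
    "F": [(0.5, "F[+F]F"), (0.3, "F[-F]F"), (0.2, "FF")],
}

def expand(axiom, rules, steps, rng):
    """Rewrite every symbol of the string `steps` times, picking rules at random."""
    current = axiom
    for _ in range(steps):
        out = []
        for symbol in current:
            options = rules.get(symbol)
            if options is None:
                out.append(symbol)  # symbols without rules are copied unchanged
                continue
            weights = [p for p, _ in options]
            replacements = [r for _, r in options]
            out.append(rng.choices(replacements, weights=weights, k=1)[0])
        current = "".join(out)
    return current

if __name__ == "__main__":
    # A fixed seed makes the random growth repeatable.
    print(expand("F", RULES, 3, random.Random(0)))
```

Running the script with different seeds gives different branching strings from the same rules, which is what makes the system stochastic. Turning such a string into a picture requires a separate drawing step that is not shown here.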

This project is in collaboration with the Pryor lab.

Support for undergraduates provided by TRIFF and an REU supplement to a department of computer science NSF research infrastructure grant.


Inactive and Subsumed Projects


Word sense disambiguation with pictures

Many words in natural language are ambiguous, as illustrated here by the word "bank". Typically, resolving such ambiguity is attempted by looking at nearby words in the passage. Computer vision lab undergraduate researcher Matthew Johnson played a key role in the development of a novel method for adding information from accompanying illustrations to help reduce the ambiguity. The system learns from a database of images that certain word senses (e.g., meanings of bank found with outdoor photos) are associated with certain kinds of image features. This association is then used to incorporate information in illustrations to help disambiguate the word under consideration. This work led to two publications for Matthew.

Support for undergraduates provided by TRIFF and an REU supplement to a department of computer science NSF research infrastructure grant.


Vision system for flying robots

These three images illustrate work by computer science students on a UA multi-department effort to compete in an international aerial robotics competition, which is largely an event for undergraduates. Here computer-controlled planes and/or helicopters work towards accomplishing a mission specified by the contest organizers. Part of the current task is to find a building with a particular symbol on it (left), identify the doors and windows of that building, and then identify which doors and windows are open so that a sub-vehicle can be launched through the portal. The middle figure shows the symbol identification software being tested from a moving vehicle to simulate flight. The far right figure shows a view from the computer science department with lines found in this image and the matching lines found in a companion image. The students use the shift (shown in green) between matching edges to estimate the distance to the edge, which is used to help analyze the structures. Images provided by undergraduate researcher Ekaterina (Kate) Taralova (now at CMU).
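The idea of turning a shift between two images into a distance can be sketched with the standard relation for an idealized, rectified pair of views: Z = f * B / d, where f is the focal length in pixels, B is the distance between the two camera positions, and d is the shift (disparity) in pixels. This is a generic illustration under those assumptions, not the students' code; the function name and the numbers are made up for the example.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Estimate distance (in meters) from the pixel shift between two views.

    Assumes an idealized rectified image pair: the two camera positions
    differ only by a sideways translation of `baseline_m` meters.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to estimate a distance")
    return focal_px * baseline_m / disparity_px

if __name__ == "__main__":
    # Hypothetical values: 800-pixel focal length, views taken 0.5 m apart,
    # and an edge that shifts by 20 pixels between the two images.
    print(depth_from_disparity(800.0, 0.5, 20.0))
```

Larger shifts correspond to nearer edges, and shifts close to zero correspond to very distant ones, which is why small disparities give unreliable distance estimates.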

Browsing large image collections

A screen shot of a program for browsing large digital art image databases that is being developed by undergraduate students in computer science at the U of A. (Art images courtesy of the Fine Arts Museum of San Francisco). Contributions to this project have been made by undergraduates Matthew Johnson and John Bruce.

Evaluation of image segmentation algorithms

Two images which have been segmented by three different methods. U of A undergraduate students in computer science are involved in research to evaluate the quality of such methods. Segmentation quality is quantified by the degree to which the regions are useful to programs which automatically recognize what is in the images. Contributions have been made by Abin Shahab.