Thursday, July 22, 2010

Call for coherent, systematic field summaries

The overview

As a Ph.D. student, figuring out what past researchers have already done is by far the hardest problem I have come across. The current expectation is that every student will browse dozens of conferences and journals going back decades. This long and tedious process has two common pitfalls. First, if even a single paper in a single conference is missed, the student may spend months working on an idea that was already developed, and all of that time is wasted. Second, even if every relevant paper is found, students almost always learn what happened in each paper individually and rarely draw links between papers. These links are critically important, as described below.

The details

I recently attended the International Computer Vision Summer School (2010). In one session, students were asked to read three papers and trace the ideas in them back as far as possible in the literature, in the form of a tree. After reviewing the submissions, the organizer was quick to point out that there was very little overlap among the trees submitted for these three papers. He went on to explain that the three papers had essentially presented exactly the same concept, just from slightly different angles and using different terminology and notation. Thus, the trees for the three papers should in fact have been identical! This was an excellent concrete example of the problem - it takes someone who is already an expert to extract these deep and extremely important connections from a literature review. Leaving it up to every new student is not only fruitless (they will not get the correct information out anyway), it is also an enormous duplication of effort! There should be a system in place where efforts are pooled to do this completely and correctly a single time.


Previous attempts


"Survey papers" are occasionally written. These are very close to a good solution. However, they typically reflect too few viewpoints, and they appear too infrequently. These two issues are directly addressed in the following section.


The proposal


My proposal for action has two phases. The first is the "catch up" phase, and the second is the "maintenance" phase.


"Phase 1: Catch Up"


The "catch up" phase is the longest and most difficult, but also the most helpful.
I propose that the general population of a field nominate and elect a committee of the most qualified, accomplished, and knowledgeable experts in that field. These people would be charged with two tasks. The first is splitting the field into an appropriate number of sub-fields. I can only speak of my own field, Computer Vision, which one could break down into "Structure from Motion", "Object Recognition", "SLAM", etc. There should be 3-5 experts in each of these sub-fields on the committee. The second task falls to each sub-committee: producing a survey paper of the work in its area, starting as far back as possible and going to the present year. This will certainly be a large document with many, many references. However, it is important not to get lost in the task of listing references. The connections between papers and the evolution of each idea are the central point of this whole project. The payment for this exercise is an overwhelming sense of advancing the state of the art of scientific research procedures, as well as a resume line item which indicates that you are a recognized expert.


"Phase 2: Maintenance"


This is the easy phase! This process must be performed yearly (or at some other regular interval). Again, a committee must be selected. This time, however, all that is needed is a short review of what has happened in the sub-field in the last year. References should NOT, for the most part, be more than 1 year old. This keeps these reviews linear and sequential, making them extremely easy to follow.




Potential problems


After some initial conversations with a few experts in the field, it is apparent that a project like this could raise some political issues. You may get people complaining "Why is my paper not included in the survey!?". You may also expose parallels that the original authors did not realize, making them feel "foolish". It is my opinion that the progress of the field and the rapid absorption of young researchers are much more important than protecting an individual from this type of silly whining.

Potential benefits

If students could read a couple of these documents and be fully caught up on the state of the art of their sub-field, many new doors would be opened. First, people would not be so restricted to a single sub-field. It would be possible to keep current in multiple sub-fields very easily by simply reading through these documents when they are released each year. I have often seen a solution from an outside field adapted to a problem in my own field with amazing results. Second, students could move forward confident that their work is on a track that the field is actually interested in. They could also be certain that their work has not been previously attempted. The time savings, when multiplied by the number of students, are incredible. By applying the correct resources (the experts) in the correct places (a directed effort to produce these systematic summaries), a much more efficient community can certainly be achieved.
