Thursday, July 22, 2010

Call for coherent, systematic field summaries

The overview

As a Ph.D. student, figuring out what has already been done by past researchers is by far the hardest problem I have come across. The current expectation and requirement is that every student will browse tens of conferences and journals going back decades. At the end of this long and tedious process, there are several common pitfalls. First, missing even a single paper in a single conference can cost months of work, wasted on an idea that turns out to have already been developed. Second, even if every relevant paper is located, students almost always digest each paper individually and rarely draw links between papers. This is critically important, as described below.

The details

I recently attended the International Computer Vision Summer School (2010). In one session, students were asked to read 3 papers and trace the ideas in these papers back as far as possible in the literature in the form of a tree. After reviewing the submissions, the organizer was quick to point out that there was very little overlap in the trees submitted for these 3 papers. He went on to explain that the 3 papers had essentially presented exactly the same concept, just from slightly different angles and using different terminology and notation. Thus, the trees for the 3 papers should in fact have been identical! This was an excellent concrete example of the problem: it takes an "already expert" to extract these deep and extremely important connections from a literature review. Leaving it up to every new student is not only fruitless (they will not extract the correct information anyway), it is also an enormous duplication of effort! There should be a system in place where efforts are pooled to do this completely and correctly a single time.


Previous attempts


Survey papers are occasionally written, and they come very close to a good solution. However, they suffer from a lack of viewpoints as well as a lack of frequency. These two issues are directly addressed in the following section.


The proposal


My proposal for action has two phases: a "catch up" phase followed by a "maintenance" phase.


"Phase 1: Catch Up"


The "catch up" phase is the longest and most difficult, but also the most helpful.
I propose that the general population of a field nominate and elect a committee of experts: the most qualified, accomplished, and knowledgeable people in the given field. These people would be charged with two tasks. The first is splitting the field into an appropriate number of subfields. I can only speak of my field of Computer Vision; one could break this field down into "Structure from Motion", "Object Recognition", "SLAM", etc. There should be 3-5 experts in each of these sub-fields on the committee. Each sub-committee is then charged with producing a survey of the work in its area, starting as far back as possible and continuing to the present year. This will certainly be a large document with many, many references; however, it is important not to get lost in the task of listing references. Drawing the connections between papers and following the evolution of each idea is the central aim of this whole project. The payment for this exercise is an overwhelming sense of advancing the state of the art of scientific research procedures, as well as a resume line item indicating that you are a recognized expert.


"Phase 2: Maintenance"


This is the easy phase! This process must be performed yearly (or at some other regular interval). Again, a committee must be selected. However, all that must be done is a short review of what has happened in the sub-field in the last year. References should NOT, for the most part, be more than a year old. This keeps the reviews linear and sequential, making them extremely easy to follow.




Potential problems


After some initial conversations with field experts, it is apparent that a project like this could raise some political issues. You may get people complaining, "Why is my paper not included in the survey!?" You may also expose parallels that the original authors did not realize, making them feel "foolish". In my opinion, the progress of the field and the rapid absorption of young researchers are much more important than protecting an individual from this type of silly whining.

Potential benefits

If students could read a couple of these documents and be fully caught up on the state of the art of their sub-field, many new doors would open. First, people would not be so restricted to a single sub-field. It would be possible to keep current in multiple sub-fields very easily by simply reading through these documents when they are released each year. I have seen many cases where a solution from an outside field was adapted to a problem in the field with amazing results. Second, students could move forward confident that their work is actually on a track the field is interested in. They could also be certain that their work has not been previously attempted. The time savings, multiplied by the number of students, is incredible. By applying the correct resources (the experts) in the correct places (a directed effort to produce these systematic summaries), a much more efficient community can certainly be achieved.

Conference Summary Committees

At every professional conference, hundreds of papers are presented. This can quickly become quite overwhelming. For people in attendance, the game plan seems to be to scan the list of titles in the conference schedule to see which posters and talks seem most interesting and/or most applicable to the individual's research objectives. To be sure, a major goal of conference-going is to network and make new contacts. However, there should be another major goal which is often talked about but mostly overlooked: keeping current with ideas and discoveries in fields related to, but not exactly in, your research area. This is nearly impossible by simply walking around and looking at posters.

Enter the solution. At each conference, a panel should either be appointed or elected, consisting of leading experts in many or most of the sub-fields represented at the conference. These experts should meet at the end of the conference to decide what its serious contributions were. It is no big secret that the majority of papers submitted to a conference are incremental improvements on existing methods with mildly better results. However, it is quite tough to pick out the "serious" papers without a solid background in the sub-field they came from. Therefore, it should be up to this proposed panel to construct a short document (<5 pages or so) summarizing the contributions of the conference. This would allow not only conference attendees to receive the "take home messages" at the end of the conference, but also people who were not able to attend to get the big picture of what they missed. Handing a colleague a DVD with 400 abstracts and papers and saying "here are the proceedings" is almost certain to provoke the same exercise of scanning titles and reading only papers relevant to his current research. If, instead, one could hand a colleague a 5-page document and say "this is what happened at the conference", the entire field would stay much more informed and up to date.

Students are not being prepared for industry

I can only speak about my field (computer vision and image processing), but I imagine the situation is similar across the board. What we learn in college are "the fundamentals" - the theoretical (often too much so) ideas of many topics. We are seldom asked to implement these ideas in software. When we are, it is done with absolutely no consideration of the process - that is, you can use whichever language you want, whichever method you like for revision control (including none!), work by yourself or in a group of whatever size and composition you choose, and the list goes on. The only thing that matters is the result. In an industrial setting, exactly the opposite is true. Working on a team of programmers is critical. You must understand how to share responsibilities, ideas, and code. These are the most important skills for success in any real setting, and they are rarely exercised - and definitely not taught - in college.

After a recent interaction with the hiring manager for GE Global Research Center, I have learned that they actually plan for at least an entire year of negative productivity from new students. That is, new hires are an investment: they hire a new student with the intention of training them for at least a year before they start adding value to the company. It seems to me that this transition should be much, much smoother. It should (clearly?!) be part of the responsibility of post-secondary education institutions to prepare students for their next life role as an employee.

Tuesday, June 1, 2010

Stop letting students play online during class!


Walk into almost any college classroom and you will almost certainly see at least a handful of students staring at a laptop screen, clearly not paying attention to the instructor. In this discussion, let's ignore the underlying reasons that the students are not paying attention (most times it is the lecturer's fault for not engaging the students properly). No matter whose fault it is, fooling around on the internet during class is rude, unacceptable, and should (and can!) be stopped!

I am not suggesting laptops be banned from classrooms. In fact, they can be extremely valuable tools for taking notes, communication, demonstration, and much more. However, it is rather obvious when a student is doing something other than class-related work. It is very simple: with a quick glance around the room during a pause in speech, those students whose eyes shift to the lecturer to see what the pause is for are paying attention; those whose eyes remain glued to the screen are not! An even more effective way of determining this is by walking a lap around the room (though this may not be feasible in a large lecture hall). It pains me to see faculty blindly continuing a lecture when many students are clearly off playing in “internet land”. Use your authority and tell them to put away the laptops!

I am actually NOT an advocate of required class attendance, but if a student does come to class, it is extremely rude not to pay attention, and even worse, to become a distraction to other students who did come with the intent of paying attention.

As Monica Bulger mentions here: http://monicabulger.com/2010/04/banning-laptops-doesnt-solve-the-distraction-problem/
laptops are certainly not the only source of distraction in a classroom. However, actively engaging in non-class related activities on a laptop is an obvious distraction and one that can be easily prevented with a simple "Hey! Put away the laptops!".

Wednesday, May 26, 2010

Granularity of Grading Scales

One thing that has always bothered me is the granularity of grading scales. The point of giving a grade should be to classify how well the student learned the material. It seems reasonable to classify this level of understanding into “not at all”, “not very much”, “ok”, “pretty well”, and “excellent”, which seem to correspond to the typical F, D, C, B, A. What does NOT make sense is to assign a number on a scale of 0-100. A grade of 67 seems to indicate that 67 percent of the material was learned. This is, however, not at all the case. Rather, it means that the student answered 67% of the particular questions posed on this assignment correctly. It is extremely rare for faculty to ask exactly the right set of questions to determine whether every concept was learned in a reasonable way, so this number is just about meaningless. I have always been in favor of oral exams. I find it extremely easy, within a 5-minute conversation with a student, to classify their understanding of what they should have learned into one of the five categories described above. I guess at some level you’d have to buy into my “Teach the Why not the How” concept (http://daviddoria.blogspot.com/2010/05/teach-why-not-how.html) to realize that it doesn’t really matter if a student is able to produce the correct numerical value on an exam question, but it is EXTREMELY CRUCIAL that they understand the general “what is going on”. I understand that oral exams are not reasonable in large classes, and this would certainly need to be addressed in order to implement such a system on a large scale.

Monday, May 24, 2010

Why does no one care that professors aren't trained as instructors?

It has always seemed extremely odd and unacceptable to me that faculty members at most universities, while being experts in their areas of research, have not received even a single hour of training on how to be an effective educator. While such expert status may make someone the best person to teach a very specialized topics course on their research interests, how does it make them the most qualified person to teach introductory-level, or even advanced undergraduate, courses? One could argue that a student who has completed the course is equally qualified to teach it. Though the professor may know many more advanced topics, these rarely help in explaining basic principles; in fact, they may make the explanations more convoluted.

Many parents are so insistent on having these (supposedly more qualified) instructors that some universities have instituted policies preventing graduate students from teaching classes. A graduate student who is interested in teaching would serve as a much better instructor than a "distinguished" faculty member who learned the material over 40 years ago, hasn't changed his teaching style to keep up with modern trends in education research and learning-style evolution, and is frankly uninterested in teaching at all at this point in his career. It is unfortunate that parents don't consider these facts when deciding, especially so passionately, who should be teaching their student.

For any other job, training is an intensely integral part of the job. Pilots of airplanes must log thousands and thousands of hours before they are allowed to fly. There are even federal regulations to ensure that every airplane pilot is not only trained appropriately, but also can demonstrate that his training has resulted in him being an excellent pilot. However, for arguably the most important job, educating the next generation of the world, no one blinks an eye at the zero hours of training logged by the pilots of the classrooms.

My recommendation is a "basic training" for faculty. When any university hires a new faculty member, the new hire should be required to attend a several-week training program run by a nationally standardized group. It is the responsibility of this group to have thorough training, practice, and examination programs in place to ensure new faculty are going to be effective educators. This initial training would be a massive step forward, but it cannot end here! Almost no certification is a "get it once and have it for life" type of deal. Every 5 years, faculty should be required to attend a 1-2 week "re-certification". The training staff would be provided with the course reviews that the faculty member has gathered since the last session, and they would work together to address any issues that may have arisen, as well as introduce new technologies and methods that can (and usually should) be incorporated into the instruction.

It is well known that there is much griping among students, such as "All of my classes are terrible!" and "The professor just rambles at the blackboard!". With this type of faculty training system in place, there should be no way for these complaints to continue, making students happier, smarter, and much better engineers of tomorrow.

Thursday, May 20, 2010

Teach the "Why", Not the "How"

Let us consider a college Calculus course. The timeline of the course goes approximately like this:

Week 1 - What is a derivative?
Week 2-10 - Practice manually computing derivatives of hundreds of functions.
Week 11 - What is an integral?
Week 12-20 - Learn many methods for performing integration manually and practice this on hundreds of complex functions.

They have missed the point completely. With today's technology, on a device as simple as a handheld calculator, one can find a derivative (10 weeks of practice) in a single line:

diff(f(x),x)

The same goes for integration:

int(f(x),x)
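Those two one-liners are written in a generic computer-algebra style. As a concrete sketch (my choice of tool, not part of the original argument), here is the same pair of operations in Python with SymPy, using an arbitrary example function f(x) = x²·sin(x):

```python
# Symbolic differentiation and integration, one line each, via SymPy.
from sympy import symbols, sin, diff, integrate

x = symbols('x')
f = x**2 * sin(x)      # an arbitrary example function

df = diff(f, x)        # derivative: x**2*cos(x) + 2*x*sin(x)
F = integrate(f, x)    # an antiderivative of f

print(df)
print(F)
```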

What is it, then, that we humans are needed for at all? The answer is threefold:

1) To be able to produce an f(x) out of a real life problem

2) To know when to type one of those lines into the calculator or computer

3) To understand how to interpret the results

Consider a typical, simplified workflow in an industrial application. You work at a marble factory. Your boss says "Dave, we need you to make a box out of this cardboard square so that we can ship the most marbles." You go back to your office and draw a sketch of an unfolded box. You come up with an equation for the volume of the resulting box in terms of a couple of parameters. You realize that what you need to do is find the maximum of this volume function. You know that by equating the derivative of a function to zero you will find the extrema! You then type

solve(diff(f(x),x)=0,x)

into the nearest computing device. It tells you that there are two solutions. You realize that one is probably a minimum and one is probably the maximum you are looking for. Another couple of lines to verify this:

v(solution1)
v(solution2)

You pick the solution that produces the larger volume. You are done! You report to your boss that you know how to make the biggest box out of the cardboard blank, and he is happy with your work.
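The whole workflow above can be sketched in a few lines of Python with SymPy. The specifics are my own illustrative assumptions, not from the story: a square cardboard blank of side 1, with x the size of the square cut from each corner before folding, giving volume v(x) = x·(1 - 2x)².

```python
# Box workflow: build v(x), find the zeros of its derivative, keep the best.
from sympy import symbols, diff, solve

x = symbols('x', positive=True)
v = x * (1 - 2*x)**2                 # volume of the folded box (side-1 blank)

candidates = solve(diff(v, x), x)    # the extrema: here x = 1/6 and x = 1/2
best = max(candidates, key=lambda s: v.subs(x, s))

print(best)              # x = 1/6: cut 1/6 from each corner
print(v.subs(x, best))   # maximum volume 2/27
```

Note that, exactly as argued here, the human contribution is writing down v(x) and interpreting which root is the maximum; the mechanical solving is deferred to the machine.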

Here is a rough breakdown of the knowledge required to solve this problem:

50% - Setting up the volume function
40% - Knowing that the zeros of the derivatives are the extrema
10% - Interpret what the roots mean

Note that 0% was reserved for knowing HOW to solve for the roots. Computers can do this :)

Your boss may ask you "How did you do it?". To this you can say "I developed an equation for the volume of the box with a couple of parameters that you can see in this sketch. I then maximized this function to determine the appropriate parameters." You will NOT have to answer the question "How did you solve that equation?". This is because not only is it unimportant, it is also assumed that you have used all of the tools at your disposal to prevent human error.

Let's revisit the Calculus class. How much time did they spend teaching you how to develop functions from practical situations? Almost none. How necessary is this? VERY! How much time did they spend teaching you HOW to solve the problems? Almost 10 weeks. How necessary is that? NOT AT ALL!

The solution here is to shift the instruction away from the "how" and focus on the "why".

A skilled team of mathematicians and computer scientists has figured out how to handle almost any operation on almost any function you can develop. If you are not on that team, you do not need to know what happens inside the box; you only have to be intelligent about what you feed to the box, and you have to know what to do with the information the box gives you back. THIS is engineering.

Certainly we do not want these mechanics/details to become a lost art. By not teaching every student these details, we are not losing the art. There are hundreds of text books published with the mechanics/methods for these operations. If one ever does need to know the details (perhaps they've joined the team to write the next software package!), any of these books can be consulted.

I am ABSOLUTELY NOT condoning the typical malpractice of students just typing equations into a calculator without knowing what they mean. In fact, many teachers have banned calculators in classrooms for this reason. Unfortunately, these teachers have missed the point. If their questions can be answered by the calculator alone, they are asking the WRONG QUESTIONS! As described above, we need to focus almost entirely on what the calculator CANNOT do, and explain that it is OK to defer the parts that the calculator CAN do to the calculator! A math teacher may tell you "No, no, look at all of these real life problems the students are doing for homework!". Look more closely. These "real life" problems are typically just a mask over a mundane "do the computation" type of problem. Rather than just renaming variables, the emphasis needs to be on "Look how many exciting things you can do now that you know these concepts!".

We have to be careful, though, not to say "Here is the theory, now you are done." This is very bad. Theory without practice is reserved for scientists and mathematicians! But we are engineers! We must say instead "Here is the theory, now we're going to discuss and give you many examples of typical use cases". This is what will put students on the right track to becoming excellent, problem-solving engineers.