Friday, July 20, 2012

ICFP 2012 Post Mortem

Update: The results of the first round have been published. Our submission was eliminated in the first round (not a huge surprise) and we placed 154th out of a mere 221 submissions that the organizers actually received. We definitely could have done better, but this is not that awful, especially considering some of the bugs that were found after the submission. Only the top 117 submissions advanced to the next round and we were not that far off. Ah, now I can't wait until next year...

After a very long coding session with very little sleep, ICFP concluded and my team did fairly well for my first time on my own and my teammate's first time altogether. In the end, we did submit a program, which is more than I can say for some years that I have competed (and I do mean seriously competed and never submitted, not just tinkering around).

Whew, that was close…

I am always surprised at how long it takes to package up your program and submit it. I got our submission in at 6:59 AM, less than a minute before the deadline. It was quite the nail-biting finish, to be sure.

The Successes

I found this year's problem to be excellent. I never felt that I was completely lost for long periods of time and yet I never felt that I would completely explore all aspects of the problem, either. This year's problem seemed much less mathematically deep than those of some other years, but it was interesting and very easy to get started with, which makes for a very nice ICFP contest problem.

The problem this year involved writing a program that played a little mining video game (here is a JavaScript version written by another contestant). A good submission would find an optimal path through the mine to collect as many "lambdas" as possible, avoid dying (rocks falling on top of it or drowning), and find its way to the exit. This year's problem was very accessible. You could submit something immediately that would be a valid program (something that just printed an A for "abort"). Further, there was basically no barrier to improving that solution.

What was suspiciously lacking this year was the deep math or computation theory under it all. This seemed to me to be a basic "search as best you can until your time runs out" sort of problem. Then again, we didn't have a real computer science guy or mathematician on the team this year, just a web developer and a computational physicist, so I wonder what other people found that we missed.

Also, I am pretty shocked that the organizers didn't allow people to submit maps to try other people's robots against. I guess it doesn't quite jibe with the storyline, but you could make it fit. Adding direct competition is a good way to make a problem adaptive to the skill set of the contestants. Also, no leaderboard? Bummer. But these are minor things. The task was all in all very good.

The organizers were also excellent. I say that because I never had to contact them and I never felt (for long periods of time) that the problem description made no sense. Even when I felt (and might still feel) that the task statement was ambiguous, or in some cases just plain wrong, they provided ample examples to clarify at least how they interpreted the task statement's meaning.

My teammate was also excellent. He stuck with it through the entire weekend for very long hours and provided much help in thinking out solutions and strategies as well as hashing out the meaning of various parts of the problem. I only hope he found the contest as fun as I did. All in all a pretty good year for ICFP. Definitely the most fun I have had with ICFP in a long time.

The Remote Development Environment

The collaborative editing tools were mostly successful. There are some annoying (and productivity-hurting) bugs that we hit using Rudel, usually when disconnecting and reconnecting later, but for the most part it worked quite well. The lack of graphics was a real detriment for me. I could have set up a visualization program very easily if there were a good way to use it over the network, but instead we fought with rendering the maps to Emacs buffers. My teammate suggested we use Flash for visualization, something he has a lot of experience with. While I was reluctant at the time, due to my aversion to using Flash in general, I now think that this would have been a good course of action.

In the end, however, I very much enjoyed the text-based setup we actually put together. This allowed us to have meaningful printed representations, which are invaluable for debugging a program and something that we would have implemented anyway. After three straight days of watching our robot wander around a mine I feel like I have been playing Nethack for the last 72 hours, battling a wily rust monster. On the right is an example of our robot performing a search on a mine I made up using a particularly bad search heuristic. Each frame is a system state the robot is considering in the A* search.

In my opinion, the usage of Rudel made for much less "work off the reservation", basically people working on some aspect of the problem that really isn't that important to the group. In addition, the fact that we were working on the same image gave a certain urgency to changes that temporarily broke functionality. I still think this is a net positive for development of this type, but I would also like to try it more to determine if it truly is.

I also feel that I did a good job organizing the team, save for the number of people I was able to attract (a bit of a pat on the back, but whatever). The EC2 server where we hosted the Lisp image worked pretty flawlessly. Google+ Hangouts worked very well for sharing people's screens (although Google transcoded the video to an illegible resolution upon the YouTube broadcast). This all worked very well, all things considered. The end bill for running a "small" instance on Amazon EC2 for the weekend, plus two days beforehand to upgrade to Debian Testing and to make sure everything was working, plus storing the hard drive images, was around $10. In my opinion this is very reasonable. Below are the recordings that we took of the coding sessions. We stopped on the second day, and what is there is long, unedited, and, as I mentioned, mostly illegible, but here it is anyway.

I think that actually developing on a machine very similar to the one it will be judged on was a good idea. In the end, I built an executable core image and included a simple script to build the executable. I believe that I screwed up: I needed to explicitly install SBCL in the install script, which I did not. Luckily, because the image should be binary compatible with the judges' machine, even if the install script fails to run, it will still use the binary that I included in the tarball.

The Shortcomings

As good a year as this was for ICFP, there were some disappointments, for me at least.

The biggest disappointment this year was that I was unable to get more people from the small Common Lisp community interested in this team. Posts on Reddit, Lispforum, c.l.l., #lisp, #lisp-lab, and even Hacker News resulted in only one interested teammate before the time of the competition. When the competition started, our live coding of the event drove some limited interest and we got two or three interested parties, but none of them were really able to participate with us for various reasons: one individual was confused about the time frame of the competition and missed it, one couldn't get Google+ Hangouts to work in their country (IIRC), and another wanted to watch the Rudel session and may have contributed, but I was understandably timid about granting either ssh access to an unknown party or opening the Rudel server port to the Internet (Rudel doesn't have a read-only mode that I know of). Each of these issues could have been avoided with even a single day of prep/testing time, and I probably could have found a solution with a lower barrier to joining in mid-competition.

I do wonder, however, how much of the low participation was due to the tools I chose to use. Not every Common Lisp programmer uses Slime and Emacs (or some other hypothetical Swank- and Obby-capable editor). Some others may have been turned off by the Google+ usage. I imagine that other Lispers participated, just not with us. Perhaps we can do better next year.

The time separation (7 hours) was actually a pretty big problem. The idea was to have a meeting before one of us went to sleep and another when one of us woke up, to catch everybody up. This proved difficult, as by the time we got around to a meeting, one of us had usually passed out. It is hard to stop time-sensitive work for a meeting. That said, we dealt with it and overcame. More people would have alleviated this significantly, particularly if they were scattered around in different time zones.

One real criticism of myself is that I didn't take on a leadership role. In retrospect I think that I should have. Not to be a hard-ass, mind you, but to provide a direction that I wanted to work towards. Once someone sets a direction, I find it is easier for someone else to voice their dissent, which is a good thing. I also suffered from a bit of software architectitis, where I started making packages and files for no good reason. In the end we had around 9 files and more than 6 packages, some circularly dependent. Because of this complication of the program, things needed to be refactored at the last minute to get a submission together, and that refactoring proved more difficult than it should have been. This was one of the main factors that resulted in us barely submitting in time.

The last irritating bit was that, as per usual, I didn't quite perform as well as I wanted. The way it should have gone (giving a liberal amount of time for each step):

  1. Friday 13:00 UTC: We are basically playing Dig-Dug, okay! (Up until the very end I was expecting fire-breathing dragons to show up)
  2. Friday 14:00 UTC: We need to find good paths in a graph
  3. Friday 15:00 UTC: Parser implemented
  4. Friday 20:00 UTC: Simulator implemented
  5. Saturday 00:00 UTC: A* search implemented
  6. Saturday - Monday 12:00 UTC: Alternating updating the simulator and tweaking heuristic function until it works well
  7. ? UTC: Some brilliant insight that makes our program awesome

We were roughly on schedule until step 4, IIRC. Things didn't continue on schedule for a number of reasons. First, everything just takes longer than we think it should. One big setback came when I implemented the simulator in a way that did not use the task description's notation; my teammate wanted it to use that notation (so index values matched, etc.). It was a valid concern, and I had to change it. This caused more bugs than would usually be there in a very crucial part of the program, but it may have avoided other bugs cropping up from confusion in later development. This was not a huge setback, as these bugs just changed the rules of the game and didn't crash things, so certain aspects of the game were weird from time to time.

Looking back, the thing that took way more time than it should have was the decision to settle on an A* search for finding solutions, which should have been the first thing we tried and tweaked from there. By the end of day two of the competition, I had grabbed Peter Norvig's PAIP code and was using this as a search mechanism (using beam search). It wasn't until the middle of day three that I switched to A*, which is much more appropriate. Besides making me question my competency as a programmer, this really made me question why we don't have the code from PAIP in a form that is easy to download (a.k.a. Quicklisp accessible) and use. I think I will make this one of my projects: maintaining a modern version of some of Norvig's code, particularly the search tools (I have long wanted to package an improved version of the CAS he provides).
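For anyone who has not written one, here is roughly what the A* idea looks like. This is a minimal sketch in Python, not our Lisp code and not Norvig's: it assumes a static grid where `#` is a wall and every step costs 1, and it ignores falling rocks, water, and scoring entirely. The names `a_star` and `MINE` and the Manhattan-distance heuristic are made up for illustration. Unlike a beam search, which throws away all but the best few candidates at each step, A* keeps the whole frontier and always expands the state with the lowest cost-so-far plus estimated cost-to-go.

```python
import heapq

# Hypothetical toy mine: '#' is a wall, everything else is walkable.
MINE = [
    "#####",
    "#R..#",
    "#.#.#",
    "#..L#",
    "#####",
]

def a_star(grid, start, goal):
    """Return a shortest list of (row, col) cells from start to goal, or None."""
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # (estimated total, cost so far, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[nr])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = g + 1
            if new_cost < cost.get((nr, nc), float("inf")):
                cost[(nr, nc)] = new_cost
                came_from[(nr, nc)] = cell
                heapq.heappush(
                    frontier, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                )
    return None  # goal unreachable
```

Calling `a_star(MINE, (1, 1), (3, 3))` returns a five-cell path from the `R` to the `L`. In the real contest the search state would have to be the whole mine rather than a single cell, since rocks move as the robot does, and the heuristic is where all of the tweaking goes.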

Programming competitions and software development

As I often feel after an ICFP competition, I think that I could do much better if I attempted it the next weekend. I'm not talking about giving us a few more days. That wouldn't have helped at all. I was pretty exhausted by the end of it. I mean that competing in the ICFP contest, getting code out the door in a short period of time, requires a certain frame of mind that is distinctly different from my usual state of mind when developing. You have to re-adjust how your brain works in order to be successful in these short deadline competitions.

I simply do not work this way in my job or hobby projects. I tend to enjoy the process of developing a well thought out program, or better yet, a document, as I very much like literate programming. A well written program, in my opinion, is a program that reads like a journal article. This is not a mindset conducive to the ICFP contest, to say the least.

At the start I was using Git to keep track of changes. This is a lot of development overhead. That overhead tends to be worth it in ordinary software development, even beneficial, but it is a downright detriment to your productivity in a competition like ICFP. When I finally stopped using Git, things went a lot faster. At the start, I spent time thinking about organization. Later, I just hacked code wherever it would fit. Again, this eliminated a lot of development overhead. There were even a few moments when I started writing documentation for the theory of why I was doing this or that. I quickly stopped when I noticed what I was doing. It was as if I was unlearning any and all good practices that I have learned over the years in order to compete in ICFP effectively. After I got into a quick development mindset, the whole thing became much easier, but it took days for me to reach that point.

I have come to the conclusion that being good at short deadline contests relates to actual software development about the same way as running relates to driving a car. Both running and driving move you places, but they are extremely different activities. Granted, you do have to worry about similar problems: don't run into things, you need some form of fuel to get you moving, you have rubber objects that insulate you from the ground. But being a good runner doesn't make you a good driver, and being a good driver doesn't make you a good runner. You are thinking about two vastly different sets of important factors when you are doing each. You optimize for different goals and each is good for achieving different things. Driving is clearly faster for long-term goals, but it is often faster to sprint a short distance than to get into a car and drive there, particularly if the path is off-road. However, as with running and driving, I see no reason why people shouldn't train for both if they want to be good at both.
