Sunday, December 16, 2012

A Flat Dark Theme for Unity/Gnome 3

I care a fair amount about the style and visual nature of my environment, not only from an appreciation of nice things, but also from a utilitarian standpoint. I use Unity and, up until today, had been using a theme called Zukitwo, in particular the "brave" version of that theme, because it was reasonably dark and stylish. Today I switched to Boje, a darker theme (the window contents are actually dark) and one that I think I like a bit better. If you kind of like the minimalist thing that Google has been doing with their websites ever since Plus came out, and you would like a dark version of that style for your Unity/Gnome 3 desktop, Boje might work well for you.

All you need to do is download the theme and extract it into the ".themes" folder in your home directory. Then you can select it using the Advanced Settings program, under the theme option on the sidebar: set the GTK+ and Window theme to Boje. Once you have done this your computer will, more or less, honor this theme. Technically, this only applies to GTK apps, but luckily most applications that you encounter are GTK applications.

There is one hiccup that I've found, however. Firefox, for whatever reason, uses the system theme to color certain elements of web pages. Specifically, input forms (e.g. text boxes, radio buttons, drop-down selectors) will have the background color of the system theme and, to make matters worse, tend to have the web site's text color. This means that many sites will have black bars on them which are actually search fields. You can type in them and they do what they are supposed to do, but you cannot see the text and they look ugly, which is more than a little bit annoying. To fix this (at least partially), Firefox allows you to insert overriding CSS styles into the pages you visit via the file ~/.mozilla/firefox/yourprofile.default/chrome/userContent.css. If you edit this file and insert a few styles that request that Firefox use white backgrounds on text fields, pages will at least render readably. This is a general problem with dark themes and Firefox and really should be fixed.
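As a rough sketch of the userContent.css fix, here is a small Python helper that writes such an override file. The CSS rules and the function name `write_user_content` are illustrative assumptions, not the exact styles I use; the profile directory is passed in by the caller because its name differs on every machine.

```python
from pathlib import Path

# Illustrative override rules (an assumption, tweak to taste): force a light
# background and dark text on form controls regardless of the GTK theme.
USER_CONTENT_CSS = """\
input, textarea, select {
    background-color: white !important;
    color: black !important;
}
"""


def write_user_content(profile_dir):
    """Write the override stylesheet to <profile_dir>/chrome/userContent.css."""
    chrome_dir = Path(profile_dir) / "chrome"
    chrome_dir.mkdir(parents=True, exist_ok=True)  # "chrome" may not exist yet
    target = chrome_dir / "userContent.css"
    target.write_text(USER_CONTENT_CSS)
    return target


# Show what would be written; nothing touches the disk until the helper is called.
print(USER_CONTENT_CSS)
```

Calling `write_user_content` with the path of your own Firefox profile directory creates the file; Firefox typically needs a restart before it picks up the change. Note that the helper overwrites any existing userContent.css.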

What About The Rest Of The World?

There is one issue with all this: while you might enjoy a dark theme, the rest of the world seems to have decided on bright backgrounds (something about thinking they look simple, or clean, or minimal). When you combine that with the fact that probably more than half your time is going to be spent in a browser window, you are going to be staring at and interacting with a lot of brightly themed interfaces anyway. But there is something we can do: we can extend our little hack for Firefox above to all web content, and do it for Chrome as well, using Stylish. Stylish is an extension for Firefox and Chrome/Chromium that allows you to fiddle with CSS styles (and perhaps more) on web pages that you visit. Think of it as a limited version of GreaseMonkey that is geared towards tweaking the visual style of pages. With Stylish installed, you just click a button when you encounter a page that you don't like and find a style that fixes what bugs you. There are over 41,000 user-submitted styles out there, but importantly they don't usually work forever (pages constantly change and the Stylish scripts need to be tweaked to keep up). This means that the vast majority of these don't work completely. It is a process of trial and error, but it is easy to turn off malfunctioning style scripts. I feel that they are worth the effort, as they are the best possible solution to this problem when they work, and sometimes they do. They allow you to do the following:


When all else fails

I find that it is good to have a quick and dirty method to save your eyes if you have a non-GTK app, your Stylish script is broken, or things are just generally not working. For this, I use Compiz's "Negative" plugin, which you can configure using the Compiz Configuration Settings Manager, or ccsm. This plugin allows you to invert the video on a per-window basis at the press of a key chord. I have bound "Super-n" to this functionality. This is far from ideal: it inverts all of the colors in the window, which makes video and images look like crap. But it is a useful fall-back method.

Update: Firefox can really look the part if you install the "Dark Bright-Aero" theme.

Thursday, November 1, 2012

No More Coursera Starter Kit

I just finished the latest Coursera assignment minutes before the deadline (a deadline that was extended due to Hurricane Sandy). The assignment was not hard, but it certainly was time consuming. Clearly I will be unable to put out a starter kit in any timely manner. But, more importantly, I realized something when working through the assignment: the code that they have offered is a mess. The act of porting this stuff to Common Lisp would be a nightmare. It is much better to rewrite the stuff. The problem with this is that the actual assignments more or less require you to use the Octave code, as it does things like sample from a random distribution. It would be annoyingly difficult to guarantee that any code I wrote would sample the distribution the same way, especially since they use a built-in MATLAB/Octave routine. I would have to research what PRNG Octave uses, likely implement it, and make sure to seed it with whatever Octave is using.
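The reproducibility concern can be illustrated with a small sketch. It uses Python's standard `random` module purely as a stand-in (this is not the generator Octave uses, and the helper name `draw_samples` is made up for illustration): the same generator with the same seed reproduces its samples exactly, while a different seed, or a different underlying algorithm, yields different numbers.

```python
import random


def draw_samples(seed, count=5):
    """Draw `count` standard normal samples from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


same_a = draw_samples(seed=42)
same_b = draw_samples(seed=42)
different = draw_samples(seed=43)

print(same_a == same_b)     # same seed and same algorithm: identical samples
print(same_a == different)  # different seed: the samples diverge
```

Matching the course's numerical answers would require both halves of that condition to hold across two different language runtimes, which is the hard part.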

The other thing I realized when working with the perceptron code is that writing starter code sucks for the programmer. Writing code that has trivial pieces missing from it on purpose is not how I want to spend my time. Creating teaching materials requires a crazy amount of prep time, and I don't care enough about education to spend all that time only to be left, in the end, with nothing but a classroom exercise that cannot even be used for tinkering at the REPL.

This has led me to the decision to stop my already feeble attempts to maintain a Common Lisp port of the neural network starter code. However, I still find that porting code is extremely useful in making sure you understand the concepts involved in the code. I have always felt that once I can write the code for something, I tend to truly understand it. So I will be taking the provided code from the course as a starting point and creating my own neural network related code, closely in parallel with the course, in Common Lisp. I will not be leaving out the missing pieces as they do in the class.

I have decided that this won't really violate any rules of the course. Looking at my perceptron code and the code that was released by the class organizers, I doubt that any person would see one as derivative of the other if they didn't already know I was working off the other. Further, I doubt that any person that couldn't answer the questions in the programming assignments using the MATLAB code could look at my code and identify the pieces that needed to be ported back. And it isn't as if executing the Lisp code will help them at all. As I said, I doubt I could actually write code that would give the correct numerical answers for the homework even if I wanted to, because of the heavy MATLAB/Octave emphasis in the provided code. I was able to just barely do it in the perceptron case, and it probably took about half the development time to figure out how I would load the MATLAB data and exactly match their form of the learning algorithm.

If you are following this work, I will probably stay on the same git repository for a while, but might switch to another more appropriately named one in due time.  Hopefully I will have a little toy neural network library to play with in a few weeks.

Friday, October 19, 2012

Neural Networks, Coursera, and Common Lisp

There is a course on machine learning using artificial neural networks offered at Coursera this "semester". I am taking it, and it is the first class I have taken with Coursera. I had forgotten how nice it is to learn a new subject.

Besides the fact that artificial neural networks have always interested me, one of the reasons I decided to take this class is to become somewhat fluent in Python, which for some reason I thought would be one of the supported languages.  Turns out it isn't, but no matter.  Python is popular enough that someone will always port to Python.

I figure I might as well do the same for Common Lisp.  This is very late for the first assignment, and I make no promises that it will be more timely in the future.  I decided not to do a line by line porting of the Octave or Python code to Lisp because, well that's just not how we Lispers do things.  It is cleaner in its current form, but harder to prove correct at a glance.  Also, I will be using some packages that I have been working on but are not officially released, such as my zgnuplot library.  I have been using this plotting library with in-house code for a long time, but it is still pretty rough around the edges and the interface is still in flux (hopefully headed towards something a bit better).  But whatever, it's out there, feel free to do what you will with it.

Note to people that might fork this: the rule on Coursera (well, at least in this course) is that you shouldn't post solutions. So don't work off this repo, plug your solution into it, and publish it online. You should probably work in a private branch that you don't publish to GitHub.

Tuesday, September 25, 2012

Stallman on Steam on GNU/Linux

In July, Valve announced that they were in the process of porting Steam and a handful of games to GNU/Linux. Shortly after, Stallman put forward this statement on the benefits of porting Steam to GNU/Linux. In this delightful and surprising opinion statement, he argues that the gains from video-game-loving people switching their OS, the foundation of all of their computation, to a Libre alternative outweigh the damage done by Steam moving to GNU/Linux. Now, we should all note that while Stallman can give his opinion until he is blue in the face, it doesn't necessarily mean anything. GNU/Linux is Libre Software and, by the definition of Libre Software, has nothing to say about what software can or cannot be run on the system.1 Be that as it may, Stallman's opinion does hold some weight for some of the community.

What is interesting is that this is a huge departure from the opinions Stallman typically expresses. From what I've seen, Stallman has never been one for compromise when it comes to Libre Software, even when that compromise would almost certainly result in benefit to the Libre Software community. As an example, you can listen to Bryan Lunduke and Chris Fisher's conversation with Stallman on the "GNU/Linux Action Show." Lunduke, who dominated the discussion, was looking for tips on how he could transition from a proprietary software developer into a Libre Software developer. In particular, he was looking for information on how he could do this and still support his family. Now, I'm not claiming that Lunduke is a programming superstar, or that Lunduke's software is so great that having it released as open source is a clear and substantial benefit for the community. I don't really know what effect his software would have for anybody. I will say that having a guide to transitioning to Libre Software development, even if only partially, would be a great help to the community, and reiterating the message of the Free Software Movement to any captive audience is likewise a benefit. Stallman's response was implicitly that an independent developer (i.e. someone that makes their living selling digital copies of their software) simply cannot currently support themselves with Libre Software development. What he actually said, however, was that Lunduke should seek a new career. Specifically, his point seemed to be that comparing the viability of "for profit" Libre Software development to proprietary software development was a futile exercise since proprietary development is unethical and akin to "burglary". It might or might not make you money, but it should not be done for ethical reasons.
While I agree with the gist of the statement, arguing with extremes like comparing a relatively minor restriction of people's freedom to something like burglary is frowned upon by most people. More to the point, I feel that many more minds can be won over when an understanding and softer, but still firm, voice is used. I invite you to watch the whole interview, as infuriating as it is. By the way, if you come off thinking that this Lunduke fellow was treated unfairly and that Stallman is a d-bag, you might watch the following week's episode where Lunduke throws a multitude of ad hominem attacks against Stallman, which will probably make you reassess where the d-baggery lies.

I bring up all of this so that the context of what Stallman's opinion of compromise has been in the past is clear. This opinion on Steam seems nearly 180 degrees out of phase from the standard Stallman opinion. It is not that I see this as antithetical to the advancement of Libre Software, quite the contrary as I will argue, but I do see it as not in keeping with Stallman's usual position on the mixing of Libre and proprietary software, much less DRM-laden software. It should not be lost on us that Steam is perhaps the most visible DRM-peddling marketplace in existence today. This is the software that Stallman is tacitly ok with.

I wonder if this is a change in Stallman's strategy for promoting Libre Software usage. I wonder if he now sees more benefit in performing a bit of social engineering rather than just leading by example. At the very least, for people that say that Stallman is not able or not willing to compromise or see the bigger picture (which I have been guilty of a few times), this should demonstrate otherwise. Whatever the reason behind this change of opinion, tone, or otherwise, I think that this is a good thing that hopefully we will see more of.

A Libre Software ecosystem

I think that Stallman might be understating some of the indirect benefits of having Steam on GNU/Linux. Windows dominates the computing world because it has many users. It has those users because it has developers writing software. It of course has those developers writing software because it has all those users, who are willing to pay money. This self-sustaining ecosystem is something to be sought in the Libre Software world. Anything that can be used to bootstrap this should be sought out. And yes, this means that people in the Libre Software world will need to get used to the idea of paying for what they want if they desire this kind of ecosystem.2 This is not difficult to work out. A developer only has so much time, and some of this time needs to be used to make money to live; thus if they can make money developing free software, there will be more free software. This is the reason I feel that the funding problem is the number one problem to tackle for the Free Software Movement.

The problem is, an ecosystem takes a long time to build up unless you are willing to pump vast amounts of money into it, and maybe it doesn't grow even if you do. It takes a long time to build, but what if you can have an already established ecosystem migrate to you? Valve has been relatively candid about their dislike of the direction that Microsoft is headed in, and who could blame them? There are some shaky decisions coming out of Microsoft and things are probably going to get much worse before they get better (if they do). I am expecting a pretty large exodus of users fairly soon.

The question becomes, where will these users go? Presumably, many will switch to OS X. However, as many have noted, Apple has some quite disturbing opinions (along with the track record to back it up) about what rights the users and developers of software for their iOS should have. When these limitations by Apple land on OS X, I expect that it will very likely discourage many Windows users from moving to that OS. This is made all the more troubling when you see Microsoft starting to tread down the same path Apple travelled as they seek to regain some of their lost market share and profits. These kinds of decisions by the major players in the field are a strong incentive for diversifying your product to a stable, yet non-restrictive OS.

Thus Valve decided to port Steam to GNU/Linux. I think the motivation of Valve's decision is just plain business sense. They want to buy some "Windows and Microsoft falls apart and not everybody likes Apple" insurance. This is a good way to hedge their bets on where their users will end up and ensure a longer life should Microsoft continue to make poor decisions. Further, they will have a substantive advantage over late comers to the GNU/Linux platform if it does indeed become popular amongst gamers. But the reasons for their decisions aren't what I am focusing on; I am interested in what effect we should expect for Libre Software.

The benefit and harm of Steam on GNU/Linux

In a recent post (see the last section), I made a particularly Raymond-esque argument that the importance of having something be Libre is directly proportional to how important that software is to your life and livelihood.3 I concluded that proprietary video games are "ok" so long as they don't come with even a hint of DRM. Because of this "no DRM" standard, I will most likely never pay any money for any software sold via Steam, nor do I think anybody should for that matter, though I know and accept that many will.

The point is, once you accept this argument for proprietary games and proprietary stops being your cut-off for completely unacceptable, you start seeing shades of gray and you will most likely come to the conclusion that proprietary may be bad, but DRM is worse. It is a matter of degree. DRM in software marks that extra step past software that is merely proprietary but non-abusive to something which is, in fact, abusive. DRM is in its very essence not in the user's best interest, the epitome of an anti-feature. In addition, DRM is very often more abusive than the original authors intended it to be, hurting "legitimate users" as well as copyright infringers. DRM is almost by definition the act of abusing the proprietary nature of the application in order to actively restrict a user's freedom in a very overt and measurable way. DRM is the actual boogie man, or one of them, that the FSF has been warning about for all these years. Here he is, in the flesh, right where we can all see him.

This makes me wonder if this move by Stallman is a bit more diabolical than it initially seems. I suspect that if there is one place where proprietary software is unlikely to "teach users that the point is not freedom," video games are it. To the average user, a video game is a piece of software that is the epitome of luxury. From a practical point of view, a video game is only as good as the art that goes into it, and from Stallman's point of view, Libre art is of much lower priority than Libre Software (which I agree with). What I think might be the ultimate result of DRM video games in GNU/Linux is a better appreciation and understanding of the main goals of freedom in Libre Software rather than an erosion of these ideals.

Letting the boogie man run around your town and terrorize your citizens often results in people increasing their support for the white knights that have devoted their lives to protecting users from that boogie man. What better way to ensure a boogie man that will be noticed but not cause too much damage than to let in DRM, which is today basically synonymous with proprietary software abuse, but only in video games, which are hugely popular but basically synonymous with luxury? We may even get a few new white knights out of it. When these new GNU/Linux users boot up and find a wealth of software, in fact an entire operating system, all of which is devoid of DRM, lock-in, and excessively high pricing, they might, just might, start to associate proprietary with restriction, DRM, and "annoying to use" while, at the same time, starting to associate Libre/Free/Open Source with freedom and, gasp, "easy to use." In fact, if we continue to produce these pretty impressive Libre video games, people might start to wonder what they need Steam for at all.

And if Steam isn't overtly abusive? What if people accept it and like it? You don't get the effect I describe, true, but you still have Stallman's original point: more users of Libre Software means more freedom for users. You also get more people familiar with developing for GNU/Linux and, arguably, increase the pool of developers that might write Libre Software in their free time. More or less, this is a win however you cut it, but it might be a bigger win than Stallman lets on.

The year of the GNU/Linux Desktop?

I know, this has been speculated about to death. But with MS actively imploding under user disappointment and Apple encroaching further and further on users' freedoms, there could very likely be a sizable portion of the community that is just lost enough to jump to Ubuntu or similar.

The upshot of all this is that GNU/Linux is actually in a very good position to accomplish at least Torvalds's initial dream, a sizable install base for Linux-based systems on the desktop/laptop, and a large part of the FSF's goal, to get more people using more Libre Software. I know it has been said before, but right now I see a possible inflection point. If Microsoft continues to tank on their desktop/laptop software (which seems very likely), Apple continues to make their desktop/laptop computing environment more like the restrictive iOS, GNU/Linux gets Steam and game developers start to develop for GNU/Linux, and provided we find a good way to financially support Libre Software, there is a very good chance of a vast increase in the install and developer share. This increase will, at the very least, increase the number of users that have control over their own computers, even if there will be a few non-essential pieces of software that will be out of their control due to DRM. Stallman is correct to encourage Valve's port of Steam to the GNU/Linux platform.

Footnotes:

1 It should also be noted that prior to fairly new positions by Apple in their iOS, this would be a silly thing to even bother to say. Before that it would be considered financial suicide to do so. Thank you, Apple…

2 That is not to say that I don't see the great benefit in having Libre Software also be gratis, but these ideas are not totally at odds with one another.

3 I actually have walked back my opinion a bit. I got to thinking about my sister, who sees video games as a social tool. As a social tool, this now becomes an important part of her life. I would wager that there are many people out there just like my sister. It would be arrogant of me to argue that just because I find something to be "purely a luxury" means that this is true for everyone. If I wish others to respect my opinions on what is important enough to be Libre Software, I should respect that others might find something I think of as non-essential as important enough as well.

Wednesday, September 19, 2012

The Cost of Fuel

I saw this image on Reddit today and thought, hey, that's not right. All methods of transportation require fuel. For the bike it just happens that the fuel is food rather than gasoline or diesel. As it turns out, gas stations tend to ubiquitously offer two things: fuel for automobiles and snacks for humans. So, in actuality, the price is not only completely inaccurate, we can actually calculate approximately what that price should be.

The price of gasoline is set per gallon here in the US. In order to compare it to food we need a common quantity, one that combines the energy density of the fuel and the efficiency of the engine. The quantity that really cuts to the chase, combining all of these factors implicitly, is the distance that you can travel. Specifically, we seek to compare the cost per unit of distance traveled, which is actually how you should compare most energy sources: cost divided by some benefit measure.

A car can typically travel around 30 miles on one gallon of gasoline. The price of gas around where I live is on the order of 4 USD per gallon. This means that it costs 13.3 cents to travel a mile in your car.

A simple Google search tells us that a typical bicycle rider burns around 50 kcals per mile. At a gas station, with their premium on convenience, you can expect to spend something like 2 USD on a glorified bottle of sugar water which contains a total of 125 kcals of energy. That bottle carries you 2.5 miles, which means that it costs you 80 cents to travel a mile on your bike.

Therefore, if you are purchasing fuel from a gas station and have the choice between travel by car or travel by bicycle, and care only about the cost of fuel to do so, it is much cheaper, six times cheaper, to purchase your fuel in the form of gasoline and drive your automobile around than to purchase food and ride your bicycle around.
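The arithmetic above can be checked with a short Python sketch. The function names are made up for illustration, and the inputs are just the rough figures quoted in the text (30 miles per gallon, 4 USD per gallon, 50 kcal per mile, 2 USD for 125 kcal), not measured data.

```python
def car_cost_per_mile(price_per_gallon, miles_per_gallon):
    """Fuel cost of driving one mile."""
    return price_per_gallon / miles_per_gallon


def bike_cost_per_mile(price_per_snack, kcal_per_snack, kcal_per_mile):
    """Food cost of riding one mile, buying every calorie as gas station snacks."""
    miles_per_snack = kcal_per_snack / kcal_per_mile
    return price_per_snack / miles_per_snack


car = car_cost_per_mile(price_per_gallon=4.00, miles_per_gallon=30)
bike = bike_cost_per_mile(price_per_snack=2.00, kcal_per_snack=125, kcal_per_mile=50)

print(f"car:  {100 * car:.1f} cents per mile")
print(f"bike: {100 * bike:.1f} cents per mile")
print(f"bike / car: {bike / car:.1f}")
```

With these inputs the car comes out at about 13.3 cents per mile, the bike at 80 cents per mile, and the ratio at about 6, matching the numbers above.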

But that's not all. The URL on the gas station sign tells me that this station is in Austria. They are on the Euro and the current cost of gasoline there is something like 1.4 Euro per liter, so I suppose the price at this station is 1.46 Euro for a liter of diesel. Another Google search tells us our 2 USD sugar water would cost 1.50 Euro assuming gas station snack foods cost the same as they do in the US. This corresponds to 37.5 euro cents to travel a km on bike to 11.4 euro cents to travel a km in a car, a ratio of 3.28, smaller but still favors travel by automobile. This means that the price on the sign should say something like 4.787 Euro.
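Here is the same comparison in metric units as a Python sketch. The function names are hypothetical, and the 30 miles per gallon figure is carried over from the US estimate as an assumption. The script uses exact conversion factors (1.609344 km per mile, 3.785411784 liters per US gallon), so expect small differences from the hand-rounded figures in the text.

```python
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784


def car_eur_per_km(eur_per_liter, miles_per_gallon):
    """Fuel cost per km for a car whose economy is given in miles per US gallon."""
    km_per_liter = miles_per_gallon * KM_PER_MILE / LITERS_PER_US_GALLON
    return eur_per_liter / km_per_liter


def bike_eur_per_km(eur_per_snack, kcal_per_snack, kcal_per_mile):
    """Food cost per km for a rider who buys every calorie as snacks."""
    kcal_per_km = kcal_per_mile / KM_PER_MILE
    km_per_snack = kcal_per_snack / kcal_per_km
    return eur_per_snack / km_per_snack


diesel_price = 1.46  # EUR per liter, as read off the sign
car = car_eur_per_km(diesel_price, miles_per_gallon=30)
bike = bike_eur_per_km(eur_per_snack=1.50, kcal_per_snack=125, kcal_per_mile=50)
ratio = bike / car

# Price the sign would need to show for "bike fuel" on a per-liter-equivalent basis.
sign_price = diesel_price * ratio

print(f"car:  {100 * car:.1f} euro cents per km")
print(f"bike: {100 * bike:.1f} euro cents per km")
print(f"ratio: {ratio:.2f}")
print(f"sign price: {sign_price:.3f} EUR")
```

The exact conversion gives a ratio a little under 3.3 and a sign price a little under 4.8 Euro, in line with the rounded estimate above.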

Tuesday, August 7, 2012

Why I Have a Flattr Button

Update (Jan. 9, 2015): Many new services have come into vogue since I originally posted this, many of which are close enough to what I initially had in mind that they deserve mention here. Of these, Patreon comes to the forefront as nearly optimal, not exactly what I wanted, but pretty good. Also, I found that my idea that I allude to in this post turns out to not be such a good idea. I thought that developers would be happy to see people providing funding mechanisms to them, even if they didn't request it. I legitimately felt that, during the bootstrapping phase of the service, it would be a good idea to have bots crawl websites, repos, and mailing lists, train a model to determine how funds should be distributed amongst a project, and then distribute them via unintrusive email notifications to the developers. Turns out someone else was doing something similar to this and got seriously chewed out by several developers, so, that was a bad idea.

At the same time as I was rethinking my idea (which never found traction with people I pitched it to) Flattr changed something about their buttons so now they rarely show on my pages, and other stuff. So, yeah, nothing other than sending BTC is likely to work right now and I haven't gotten around to fixing it.

Flattr is a micropayment service geared towards funding artists (primarily bloggers) and community tech support. I am writing this blog as a hobby, as a sort of portfolio and record of things I have worked on, and as a motivation to learn new things and learn them to the point that I can explain them to others. Direct monetary gain has never come into the equation. I am also of the opinion that nothing I have written here actually warrants "Flattr"-y. However, perhaps someday something will. I would just let this silently go up and see if it reaps any rewards, but I have had the idea of funding in Libre Software and art in general on my mind a lot lately.

The Future of Funding Art

There is a deeper reason behind the Flattr button and the Gittip button right beside it. I put the Flattr button up as I feel that they got a lot of things right when designing their funding service. I mostly support their method of micropayments, and just using their service is in some way a support of their business, as is writing this post. But it isn't just Flattr that I support; it is other services like Gittip and, to a lesser extent, Indiegogo, Kickstarter, and a slew of other payment systems that have popped up recently. Each of these services, as I see it, strikes a blow towards toppling an empire built on false scarcity, built on copyright. This is a false scarcity because scarcity of art that can be represented electronically simply doesn't exist. Once produced, software, literature, music, and movies are not a scarce commodity. They are commodities which are infinitely reproducible at next to zero cost. The sole purpose of copyright is to allow us to integrate those perfectly copyable things into our current market system, in which value is based on scarcity. This act of limiting reproducibility via legality using copyright and, more recently, via technology using DRM is the act of setting up a false scarcity. The fact that this is a false scarcity is the reason that people reject the propaganda campaigns to relabel copyright infringement as theft, and the reason that some don't see it as wrong at all.


I don't think that the proliferation of services like these is a coincidence or that the recent success stories are a fluke. Services like these are the first harbingers of a fundamental change to the market for these non-scarce things. This change will be so fundamental that we will probably not even say that there is a "market" for software, literature, music, or movies in a few decades, as there will be something else, something based on pledges and donations that has yet to be completely defined. This won't be the death of art; quite the contrary, it will be a rejuvenation. It may, however, be the death of companies that make their money by limiting the freedom of users. How do I know this? I don't, I just have hope. But as Kay says, "the best way to predict the future is to invent it." I still feel that these services, or any service I have seen, haven't quite gotten it right for the markets I care the most about (Flattr and Gittip seem quite close but slightly misdirected). From my point of view, there is ample area for innovation in this "sector", and I (and a few friends) have an idea in the works which I hope to announce not too long from now, if things go well and someone doesn't beat us to it.

Why no Advertisements

"What about advertisements?" you might ask. They are certainly a funding model that nobody would argue with. After all, they cost the user nothing. Ads are certainly a good fit for some situations. However, I do disagree with advertisements based on the same principles as above. They actually do cost the user something in terms of time and, once again, ad revenue is based on the fact that the user cannot, for either legal or technical reasons, copy the content, strip out the ads, and re-host it. I don't see using AdBlock Plus or a DVR with commercial skipping technology as that different from stripping DRM from a piece of media, other than the fact that removal of DRM is illegal in the USA and circumventing ads is not (though this could change in the future). Ads add a new negative aspect as well, in that they degrade the content by cluttering it and have a tendency to produce biased information. So, no, I don't think ads are a funding system to be embraced if you have another option available, and if there are no other options, we should be actively seeking new solutions.

To make a long story short: even if you think all that is a bunch of nonsense, I hope that you at the very least find this little Flattr button to be much less intrusive than ads and much more easily ignored.

Friday, July 20, 2012

ICFP 2012 Post Mortem

Update: The results of the first round have been published. Our submission was eliminated in the first round (not a huge surprise) and we got 154th place out of a mere 221 submissions that the organizers actually received. We definitely could have done better, but this is not that awful, especially considering some of the bugs that were found after the submission. Only the top 117 submissions advanced to the next round and we were not that far off. Ah, now I can't wait until next year...

After a very long coding session with very little sleep, ICFP concluded and my team did fairly well for my first time on my own and my teammate's first time altogether. In the end, we did submit a program, which is more than I can say for some years that I have competed (and I do mean seriously competed and never submitted, not just tinkering around).

Whew, that was close…

I am always surprised at how long it takes to package up your program and submit it. I got our submission in at 6:59 AM, less than a minute before the deadline. It was quite the nail biting finish, to be sure.

The Successes

I found this year's problem to be excellent. I never felt that I was completely lost for long periods of time and yet I never felt that I would completely explore all aspects of the problem, either. This year's problem seemed much less mathematically deep than those of some other years, but it was interesting and very easy to get started with, which makes for a very nice ICFP contest problem.

The problem this year involved writing a program that played a little mining video game (here is a JavaScript version written by another contestant). A good submission would find an optimal path through the mine to collect as many "lambdas" as possible, avoid dying (from rocks falling on top of it or from drowning), and find its way to the exit. This year's problem was very accessible. You could submit something immediately that would be a valid program (something that just printed an A for "abort"). Further, there was basically no barrier to improving that solution.

What was suspiciously lacking this year was the deep math or computation theory under it all. This seemed to me to be a basic "search as best you can until your time runs out" sort of problem. Then again, we didn't have a real computer science guy or mathematician on the team this year, just a web developer and a computational physicist, so I wonder what other people found that we missed.

Also, I am pretty shocked that the organizers didn't allow people to submit maps to try other people's robots against. I guess it doesn't quite jibe with the storyline, but you could make it fit. Adding direct competition is a good way to make a problem adaptive to the skill set of the contestants. Also, no leader board? Bummer. But these are minor things. The task was all in all very good.

The organizers were also excellent. I say that because I never had to contact them and I never felt (for long periods of time) that the problem description made no sense. Even when I felt (and might still feel) that the task statement was ambiguous, or in some cases just plain wrong, they provided ample examples to clarify at least how they interpreted the task statement's meaning.

My teammate was also excellent. He stuck with it through the entire weekend for very long hours and provided much help in thinking out solutions and strategies as well as hashing out the meaning of various parts of the problem. I only hope he found the contest as fun as I did. All in all a pretty good year for ICFP. Definitely the most fun I have had with ICFP in a long time.

The Remote Development Environment

The collaborative editing tools were mostly successful. There are some annoying (and productivity-hurting) bugs that we hit using Rudel, usually when disconnecting and reconnecting later, but for the most part it worked quite well. The lack of graphics was a real detriment for me. I could have set up a visualization program very easily if there were a good way to use it over the network, but instead we fought with rendering the maps to Emacs buffers. My teammate suggested we use Flash for visualization, something he has a lot of experience with. While I was reluctant at the time, due to my aversion to using Flash in general, I now think that this would have been a good course of action.

In the end, however, I very much enjoyed the text-based setup we actually put together. It gave us meaningful printed representations, which are invaluable for debugging a program and which we would have implemented anyway. After three straight days of watching our robot wander around a mine I feel like I have been playing Nethack for the last 72 hours, battling a wily rust monster. On the right is an example of our robot performing a search on a mine I made up using a particularly bad search heuristic. Each frame is a system state the robot is considering in the A* search.

In my opinion, the usage of Rudel made for much less "work off the reservation", basically people working on some aspect of the problem that really isn't that important to the group. In addition, the fact that we were working on the same image gave a certain urgency to changes that temporarily broke functionality. I still think this is a net positive for development of this type, but I would also like to try it more to determine if it truly is.

I also feel that I did a good job organizing the team, save for the number of people I was able to attract (a bit of a pat on the back, but whatever). The EC2 server where we hosted the Lisp image worked pretty flawlessly. Google+ Hangouts worked very well for sharing people's screens (although Google transcoded the video to an illegible resolution upon the YouTube broadcast). This all worked very well, all things considered. The end bill for running a "small" instance on Amazon EC2 for the weekend, plus two days beforehand to upgrade to Debian Testing and to make sure everything was working, plus storing the hard drive images, was around $10. In my opinion this is very reasonable. Below are the recordings that we took of the coding sessions. We stopped recording on the second day, and what is there is long, unedited, and, as I mentioned, mostly illegible, but here it is anyway.

I think that actually developing on a machine very similar to the one the submission will be judged on was a good idea. In the end, I built an executable core image and included a simple script to build the executable. I believe that I screwed up: I needed to explicitly install SBCL in the install script, and I did not. Luckily, because the image should be binary compatible with the judges' machine, even if the install script fails to run, it will still use the binary that I included in the tarball.

The Shortcomings

As good a year as this was for ICFP, there were some disappointments, for me at least.

The biggest disappointment this year was that I was unable to get more people from the small Common Lisp community interested in this team. Posts on Reddit, Lispforum, c.l.l., #lisp, #lisp-lab, and even Hacker News resulted in only one interested teammate before the time of the competition. When the competition started, our live coding of the event drove some limited interest and we got two or three interested parties, but none of them were really able to participate with us for various reasons: one individual was confused about the time frame of the competition and missed it, one couldn't get Google+ Hangouts to work in their country (IIRC), and another wanted to watch the Rudel session and may have contributed, but I was understandably timid about granting either ssh access to an unknown party or opening the Rudel server port to the Internet (Rudel doesn't have a read-only mode that I know of). Each of these issues could have been avoided with even a single day of prep/testing time, and I probably could have found a solution with a lower barrier to joining in mid-competition.

I do wonder, however, how much of the low participation was due to the tools I chose to use. Not every Common Lisp programmer uses Slime and Emacs (or some other hypothetical Swank and Obby capable editor). Some others may have been turned off by the Google+ usage. I imagine that other Lispers participated, just not with us. Perhaps we can do better next year.

The time separation (7 hours) was actually a pretty big problem. The idea was to have a meeting before one of us went to sleep and when one woke up to catch everybody up. This proved difficult as by the time we got around to a meeting, one of us had usually passed out. It is hard to stop time sensitive work for a meeting. That said, we dealt with it and overcame. More people would have alleviated this significantly, particularly if they are scattered around in different time zones.

One real criticism of myself is that I didn't take on a leadership role. In retrospect I think that I should have. Not to be a hard ass, mind you, but to provide a direction that I wanted to work towards. Once someone sets a direction, I find it is easier for someone else to voice their dissent, which is a good thing. I also suffered from a bit of software architectitis, where I started making packages and files for no good reason. In the end we had around 9 files and more than 6 packages, some circularly dependent. Because of this complication of the program, things needed to be refactored at the last minute to get a submission together, and that refactoring proved more difficult than it should have been. This was one of the main factors that resulted in us barely submitting in time.

The last irritating bit was that, as per usual, I didn't quite perform as well as I wanted. The way it should have gone (giving a liberal amount of time for each step):

  1. Friday 13:00 UTC: We are basically playing Dig-Dug, okay! (Up until the very end I was expecting fire breathing dragons to show up)
  2. Friday 14:00 UTC: We need to find good paths in a graph
  3. Friday 15:00 UTC: Parser implemented
  4. Friday 20:00 UTC: Simulator implemented
  5. Saturday 00:00 UTC: A* search implemented
  6. Saturday - Monday 12:00 UTC: Alternating updating the simulator and tweaking heuristic function until it works well
  7. ? UTC: Some brilliant insight that makes our program awesome

We were roughly on schedule until step 4, IIRC. Things didn't continue on schedule for a number of reasons. First, everything just takes longer than we think it should. One big setback came when I implemented the simulator in a way that did not use the task description's notation; my teammate wanted it to use that notation (so index values matched, etc.). It was a valid concern, and I had to change it. This caused more bugs than would usually be there in a very crucial part of the program, but it may have avoided other bugs cropping up from confusion in the later development. This was not a huge setback, as these bugs just changed the rules of the game and didn't crash things, so certain aspects of the game were weird from time to time. Looking back, the thing that took way more time than it should have was the decision to settle on an A* search for finding solutions, which should have been the first thing we tried and tweaked from there. By the end of day two of the competition, I had grabbed Peter Norvig's PAIP code and was using this as a search mechanism (using beam search). It wasn't until the middle of day three that I switched to A*, which is much more appropriate. Besides making me question my competency as a programmer, this really made me question why we don't have the code from PAIP in an easy to download (a.k.a. Quicklisp accessible) and easy to use form. I think I will make this one of my projects, maintaining a modern version of some of Norvig's code, particularly the search tools (I have long wanted to package an improved version of the CAS he provides).
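
For readers who have not met it, here is a generic textbook A* sketch in Python. It is not our contest code (that was Common Lisp built on the PAIP search tools); the function names, the toy grid, and the Manhattan heuristic are all assumptions made purely for illustration.

```python
import heapq
from itertools import count


def a_star(start, is_goal, neighbors, heuristic):
    """Return the list of states from start to a goal, or None.

    neighbors(state) yields (next_state, step_cost) pairs. With a
    consistent heuristic the returned path has minimal total cost.
    """
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(heuristic(start), next(tie), 0, start, None)]
    parent = {}  # expanded state -> predecessor on its best path
    best_cost = {start: 0}
    while frontier:
        _, _, cost, state, prev = heapq.heappop(frontier)
        if state in parent:
            continue  # already expanded via a path at least as cheap
        parent[state] = prev
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt, step in neighbors(state):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + heuristic(nxt)
                heapq.heappush(frontier, (priority, next(tie), new_cost, nxt, state))
    return None


def grid_neighbors(grid):
    """Build a neighbors function for a list-of-strings grid ('#' = wall)."""
    def neighbors(pos):
        r, c = pos
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
                yield (nr, nc), 1
    return neighbors


def manhattan_to(goal):
    """Heuristic: Manhattan distance to a fixed goal cell."""
    return lambda pos: abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])


if __name__ == "__main__":
    toy_mine = ["..#",
                ".##",
                "..."]
    goal = (2, 2)
    print(a_star((0, 0), lambda p: p == goal, grid_neighbors(toy_mine), manhattan_to(goal)))
```

The whole contest, in this framing, comes down to what goes into `neighbors` (the simulator) and `heuristic` (the part you tweak until time runs out).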

Programming competitions and software development

As I often feel after an ICFP competition, I think that I could do much better if I attempted it the next weekend. I'm not talking about giving us a few more days. That wouldn't have helped at all. I was pretty exhausted by the end of it. I mean that competing in the ICFP contest, getting code out the door in a short period of time, requires a certain frame of mind that is distinctly different from my usual state of mind when developing. You have to re-adjust how your brain works in order to be successful in these short deadline competitions.

I simply do not work this way in my job or hobby projects. I tend to enjoy the process of developing a well thought out program, or better yet, a document, as I very much like literate programming. A well written program, in my opinion, is a program that reads like a journal article. This mindset is not conducive to the ICFP contest, to say the least.

At the start I was using Git to keep track of changes. This is a lot of development overhead. That overhead tends to be worth it in software development, even beneficial, but it is a downright detriment to your productivity in a competition like ICFP. When I finally stopped using Git, things went a lot faster. At the start, I spent time thinking about organization. Later, I just hacked code wherever it would fit. Again, this eliminated a lot of development overhead. There were even a few moments when I started writing documentation for the theory of why I was doing this or that. This was quickly stopped when I noticed what I was doing. It was as if I was unlearning any and all good practices that I have learned over the years in order to compete in ICFP effectively. After I got into a quick development mindset, the whole thing became much easier, but it took days for me to reach that point.

I have come to the conclusion that being good at short deadline contests relates to actual software development about the same way as running relates to driving a car. Both running and driving move you places, but they are extremely different activities. Granted, you do have to worry about similar problems: don't run into things, you need some form of fuel to get you moving, you have rubber objects that insulate you from the ground, but being a good runner doesn't make you a good driver, and a good driver doesn't make a good runner. You are thinking about two vastly different sets of important factors when you are doing each. You optimize for different goals and each is good for achieving different things. Driving is clearly faster for long term goals, but it is often faster to sprint a short distance than to get into a car and drive there, particularly if the path is off road. However, as with running and driving, I see no reason why people shouldn't train for both if they want to be good at both.

Wednesday, July 18, 2012

Efficiency of Pseudo Random Numbers in Lisp

I came across Alasdair's posts regarding his exploration of Pseudo Random Number Generators, or PRNG, in high and low level languages. These posts show timings of a hypothetical PRNG for different implementation languages and "big integer" libraries. This PRNG is defined as:

\[ x_{0} = b = 7,~~ p=2^{31} - 1 \]

\[ x_{i} = b^{x_{i-1}}\pmod{p} \]

His python version follows (you can go to his site to see the C versions):

def printpowers(n):
  a,p = 7,2^31-1
  for i in range(10^n):
    a = power_mod(7,a,p)
    if mod(i,10^(n-1))==0:
       print i,a
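
The listing above appears to rely on Sage, where ^ means exponentiation and power_mod and mod are built in; in plain Python, ^ is bitwise XOR. As a hedged sketch, the same recurrence in plain Python 3 could look like the following. The function name, the default arguments, and the choice to return the samples instead of printing them are assumptions made for illustration, not part of Alasdair's code.

```python
def print_powers(n, b=7, p=2**31 - 1):
    """Iterate x_i = b**x_(i-1) mod p from x_0 = b and sample every 10**(n-1) steps."""
    samples = []
    x = b  # x_0 = b
    step = 10 ** (n - 1)
    for i in range(10 ** n):
        x = pow(b, x, p)  # three-argument pow does modular exponentiation
        if i % step == 0:
            samples.append((i, x))
    return samples


if __name__ == "__main__":
    for i, x in print_powers(2):
        print(i, x)
```

The built-in three-argument pow keeps every intermediate value below p, so no big-integer library is needed here.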

When I read this post, I wondered how SBCL would stack up, so I took literally one minute to port the C version into a Lisp version (porting the Python version might have gone faster) and evaluated it in SBCL. I spent absolutely no time at all thinking about optimization. Afterward, when writing this post, I made a few cosmetic changes to make it a little more pleasing to the eye, but the execution is identical. My Lisp version looks like this:

(defun print-powers (n)
  (let ((b 7)
        (p (- (expt 2 31) 1)))
    (iter (for i below (expt 10 n))
      (for x
           initially (cl-primality::expt-mod b 7 p)
           then (cl-primality::expt-mod b x p))
      (when (zerop (mod i 100000))
        (collect x)))))

Here are the times taken to calculate one million PRNs in the various implementations. Because I am comparing implementations across different systems, I cannot compare directly, so I compiled the GMP version and used that to give a scaling factor (I also tried to test how the Python version compared, but that wouldn't run with the information Alasdair provided). This means that only the bold values in the "calibrated time" column are actually measured. The others are approximated by assuming the performance differences between my machine and Alasdair's can be summed up as a simple numeric constant. Results are in seconds.

Implementation    Reported time    Calibrated time
Python            74               110
C GMP             0.90             1.3
C no big ints     1.5              2.2
C MPIR            0.78             1.1
C PARI            0.52             0.75
C TomMath         13               18
Lisp (SBCL)       -                1.8
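
The calibration described above amounts to one multiplication per row. A minimal sketch, assuming the figures as I read them from the table and assuming that the C GMP and SBCL timings are the two measured locally (the dictionary and variable names are only illustrative):

```python
# Reported times in seconds on Alasdair's machine, as read from the table.
reported = {
    "Python": 74.0,
    "C GMP": 0.90,
    "C no big ints": 1.5,
    "C MPIR": 0.78,
    "C PARI": 0.52,
    "C TomMath": 13.0,
}

# Timings measured on my machine (assumed to be these two rows).
gmp_measured = 1.3
sbcl_measured = 1.8

# Assumption from the text: the two machines differ by one constant factor.
scale = gmp_measured / reported["C GMP"]

# Scale every reported time onto my machine.
calibrated = {name: seconds * scale for name, seconds in reported.items()}

for name, seconds in calibrated.items():
    print(f"{name:14s} {seconds:8.2f} s")

# Ratio behind the Python versus Lisp comparison.
python_vs_sbcl = calibrated["Python"] / sbcl_measured
print(f"Python / SBCL ratio: {python_vs_sbcl:.0f}")
```

The table entries are these products rounded to about two significant figures, so small disagreements in the last digit are expected.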

Judging from the fact that Alasdair describes himself as a high level language programmer and this is his first foray into the world of low level optimization, it is safe to assume that some of these numbers could be made significantly lower. In particular it is surprising that the C calculation without any big integer math at all does so poorly. Also, the extremely poor performance of TomMath is something to be investigated. Be that as it may, the "one minute to write" Lisp version holds up pretty well against all competitors in the timing tests and absolutely shines compared to the Python version. It maintains Python's readability and is nearly two orders of magnitude faster.

Though I am quite fond of several aspects of Common Lisp, I am not really in the business of promoting Common Lisp as an alternative to everything. That said, I absolutely do think it has a place in scientific and mathematical computing, quite possibly more than Python does. I think that Common Lisp gives a clear path from tinkering at the REPL, to quickly prototyping an algorithm, to optimizing it if necessary, then porting to a lower level if absolutely necessary. Much the same can probably be said of Python; the crucial difference is that the initial algorithm prototype in Common Lisp will likely be within a factor of two of your initial C implementation, rather than a factor of 100. The fact that most software in academia never goes past the prototyping phase (it usually isn't necessary to optimize in order to get a publishable result) makes this all the more important. These results reinforce my opinion that Python should not be used outside of a glue language for low level libraries and, of course, I/O-bound programming such as user interfaces.

That difference between Python and Lisp (SBCL in this case) is partly in the compiler, to be sure, but also in the community and its opinions of what is important. If you are willing to wait, that gap in the quality of the compiler will certainly become smaller, but that might not be a priority for the Python community. While you wait, some other language may come into vogue, perhaps improving or hurting performance. I think that in the meantime, however, Common Lisp has a very large advantage in this area that should not be ignored.

Saturday, July 7, 2012

Quicklisp July Update

I was scanning the IRC #lisp logs for any replies to my ICFP advertisement and saw this:

19:41:31 *Xach* adds a bunch of smithzv libraries

That seemed odd; I checked the Quicklisp libraries and there was nothing new in them. Then the announcement of the July release came across my RSS feed today and I was surprised to see:

New: asdf-encodings, backports, cl-6502, cl-factoring, cl-libusb, cl-neo4j, cl-nxt, cl-openstack, cl-permutation, cl-plumbing, cl-primality, cl-protobufs, clx-xkeyboard, coretest, hh-web, lisp-interface-library, parse-float, pythonic-string-reader, recur, single-threaded-ccl, sip-hash.

Emphasis mine. This is pretty awesome, four libraries I maintain have found their way into the Quicklisp repository. I guess the work I did refactoring some of my libraries paid off.

This is kind of scary, as well. This means that more people are using my libraries, even libraries that haven't seen an official "this is ready to go" stamp of approval from me. While it is certainly true that CL-Primality and CL-Factoring are solid and ready to go (CL-Factoring is largely centered around another man's work done years ago, I'm just bringing it up to date), the library CL-Plumbing is kind of not extremely useful (yet) and I really wanted to integrate Pythonic-String-Reader with Named-Readtables or something like it to modularize the read table changes that it effects. CL-Plumbing, in particular, has already been changed incompatibly in my local repo.

That said, it is very good to see vibrancy in the CL community, even if it means some scrambling on my part every once in a while and some non-backward compatible interface changes here and there.

Adventures in Collaborative Coding With Common Lisp

Update: I've posted a post mortem of the team's attempt this year.

In anticipation of the upcoming ICFP contest (by the way, still looking for teammates; we could use a handful more before I feel we have saturated our workload), I started looking into collaborative tools for coding. I am aware of a large set of tools that might be useful. This post will describe some of these and how we might use them. I am looking at using some subset of Emacs (of course), Slime, Rudel, Mumble, Google+ hangouts, VNC, X11 forwarding (perhaps using XPra), Dropbox, perhaps Git, and naturally ssh to tie it all together.

Collaborative Editing Topology

The basic topology will be like this. I don't know if this helps anybody, but it looks pretty.

Communication between collaborators happens via Mumble and Google+. Google+ has the nice feature that whatever happens in the Hangout will be mirrored to a live YouTube stream and will be saved for future viewing. Files can be exchanged using Dropbox. Rudel allows us to quickly work together and see what others are doing.

The production server is communicated with via Slime/Swank, X forwarding and/or VNC, all via an ssh tunnel. We need X forwarding or VNC in order to make any sort of graphical stuff painless (well, less painful). After experimentation, this is still quite painful. Still looking for a good solution here.

At the end of this post is a pair of videos of a collaborative session I had with one other person on Thursday. Together the two videos are quite long, and the quality of the video is quite low (much lower than during the actual Hangout), so low that you cannot actually read the text. I'm still trying to figure out how to save the session data well. This was my second attempt to get some kind of example for this post and I felt I couldn't sit on it any longer with the ICFP Contest quickly approaching.

The production server

The first step is to set up a server that can host your Lisp image. This server can really be anything, but you should keep in mind that giving users Swank access to a server is basically giving them shell access at the Lisp image's privilege level. This means that unless you really trust your collaborators, you should be wary of using a server you care about.

I chose to host off of Amazon EC2 as you only pay per hour of use, so I can start up a fresh system, set it up, and ditch it hours or days later without paying for a month as you might in other places. In a subsequent post (to be posted soon, I hope), I will detail how to set up an EC2 instance for this purpose.

This server will host a Rudel session, a lisp image with a swank connection, optionally a VNC server and Mumble server, and be connected via Dropbox.

Slime/Swank

Most Common Lisp people are probably intimately familiar with Slime and Swank. We are going to be using Slime and Swank to set up a communal Lisp image. Multiple users are going to connect to it and awesomeness will ensue.

We can also have local Lisp images for quicker and/or dirtier work (we don't want to eval broken code on the communal server if we can help it) and anything that needs graphics to run. This is simple and Slime/Swank is ready to go using M-x slime-connect. Use the Slime selector to quickly switch between open connections.

Rudel

Rudel is a collaborative editing library for Emacs. It can use many backends, but we will be using the Obby backend as that was the only one that was easy to set up. Once Rudel is loaded, one person can host a session, which is then joined by any number of participants. We will be hosting the Rudel session from the production server. Buffers within Emacs can then be published by one party and subscribed to by any number of other people. After this, that buffer on each computer will hold the same contents, updated in real time as people code. Text edited by a particular user will be marked in his/her specific background color. There are some packages you will need:

apt-get install emacs23 gnutls-bin avahi-daemon avahi-utils

Setting up Rudel is easy so long as you get the correct version (the one from SourceForge). Once you download and extract it, just add this single line to your .emacs file.

(load-file "~/.emacs.d/rudel-0.2-4/rudel-loaddefs.el")

Note that Rudel uses "C-c c" as a prefix command, which is weird to me, so if you use "C-c c" for anything, either remove that binding, or bind that after you load Rudel, so you can effectively clobber their bindings.

With a couple of exceptions (see below) Rudel is pretty painless to use: just join the session, subscribe to some buffers or publish your own, and start editing. It is a good idea to have your Rudel session hosted by an Emacs instance on the production server (so you don't have to kill your Emacs to reset any problems). This is also a good idea just so your computer isn't the single point of failure for the team. You can go to sleep and shut down your computer without affecting others.

One issue for lisp programming is that you can't share the REPL buffer (slime-repl-mode can't be simply turned on and off, nor can you insert text into it all willy-nilly like Rudel assumes it can). However, you can share a Slime scratch buffer, or any buffer that is in slime-mode, which is basically just as good. Google+ allows you to make more involved presentations at the REPL between collaborators if that is needed.

One annoying thing about Rudel is that it seems to be impossible to actually leave a session. When you attempt to leave the session via rudel-end-session, you are disconnected and unsubscribed from all of the buffers, but your login remains and the server keeps the connection open (I believe). This doesn't seem too bad until you try to join again and realize that you can't because your username (and possibly color) is currently in use. To get around this, I just append a number to the end of my username in order to make it fresh every time. Regarding colors, I just pick a garish one when logging in and then change it to something better once I have joined (once you have joined, you can have the same color as someone else). Most likely, most people will work with colors turned off anyway.

Another annoying (but logically consistent) feature of Rudel is that M-x undo will undo other people's edits as well as your own. This is sometimes desired, but often not if there are two people writing code concurrently. If kill-undo is burned into your muscle memory instead of kill-yank, then you might have some problems. I am trying to come up with a workaround for that particular case. Other cases can be handled by simply specifying the region and using undo within that region (see the undo help page).

Git, Rudel, and Dropbox

The summary of this section is that these tools don't work together, at all, at a fundamental level. Use Rudel. People can use Git on a person by person basis. Just give up on the idea of sharing a source directory via Dropbox. It is a lost cause to try combining Dropbox, Rudel, and Git simultaneously. Be warned that Git will be crippled when using it this way: you can't do any of the good Git stuff like branches, reverts, and merges, as it will mess up everybody else's Rudel buffers. Always unsubscribe, do any fancy Git commands you like, then republish (possibly under a different name) or subscribe and replace the entire file (presumably with the approval of the people sharing the file).

While I have never participated in a short-deadline contest and actually used a version control system, I am sufficiently sold on the idea of distributed version control that I would like the option of using it here. Using a version control system has become integral to how I think development should be done in general. Development should be broken into smaller, separable tasks which should be made as commits with commit logs telling the future developers (a.k.a. you) why this was done. Developing without it would make me feel naked, or at least haphazard.

That said, there is a problem with using git and Dropbox at the same time. In fact, it is a very fundamental problem. Dropbox is attempting to make two or more directories seem to be the same, no matter what computer you are on. Git, on the other hand, explicitly works under the assumption that the directories are on two different computers and are absolutely independent.

For an example of this conflict, consider a group of people collaborating on a project using Dropbox. As one person edits files on his computer, these edits are quietly sent to the other computers (which causes you to have to constantly revert buffers in Emacs and leads to conflicts, but let's say we are okay with that). If you are using Git, any change to the repo will also be synced. This seems good at first, until you realize that the index is in the repo. This means that you can't develop like Git wants you to develop, incrementally building up the index, crafting your commit, then committing. Each developer would step on the others' toes as they add to the index. Instead you need a process where you build your index in your head, then put a freeze on development to commit, e.g. "hey, nobody do anything, I am going to commit something." This basically eliminates most of the positives that Git brings to the table.

I tried several schemes of moving the .git directory out of the Dropbox folder, which fixes this whole index problem. When you do this, you get back all of that Git goodness, but you lose the idea of a synced repo, so why use Dropbox at all? In fact it is worse than that. You now have two conflicting ideas of what the merged repo will be. You cannot combine the two, only discard one and accept the other. So, I submit that Dropbox and Git just do not mix for this purpose. It can't be done in a sane way.

Everything I just described regarding Git and Dropbox is also true of Git and Rudel. Rudel, however, comes with the extra limitation that you can't just change files on disk anymore. The buffers might be saved to your disk, but the real buffer is in the "cloud". So, changing the file on disk or reverting to a file on disk will break Rudel. From what I have seen, your buffer will no longer be in sync with others. It is important to note that you could actually reconcile this limitation by replacing revert-buffer with something that edits the changed lines in a Rudel approved fashion. But right now, this is not supported.

I lean towards using Rudel as I feel it is more important. We will still use Dropbox for easy file transfer, and people can still try to use Git so long as they don't attempt to use any commands that will change the buffers on disk. No one should try to concurrently develop in the same synced Dropbox folder, though. I might set up a backup script that will run every 5 minutes or so and take a snapshot (using rdiff-backup, for instance) of my files so that we can roll back to previous versions if everything hits the fan.
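
A minimal sketch of such a snapshot loop, using only the Python standard library in place of rdiff-backup (so it makes full copies, not incremental ones). The function names, the directory layout, and the timestamp format are illustrative assumptions.

```python
import shutil
import time
from pathlib import Path


def snapshot(source, backup_root):
    """Copy the tree at `source` into a new timestamped directory under `backup_root`."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / stamp
    suffix = 1
    while dest.exists():  # two snapshots within the same second
        dest = Path(backup_root) / f"{stamp}-{suffix}"
        suffix += 1
    shutil.copytree(source, dest)  # creates missing parent directories too
    return dest


def run_forever(source, backup_root, interval_seconds=300):
    """Take a snapshot every `interval_seconds` (5 minutes by default)."""
    while True:
        snapshot(source, backup_root)
        time.sleep(interval_seconds)
```

Rolling back is then just a matter of copying a file out of the snapshot directory you want. For a long contest, rdiff-backup or similar would be kinder to the disk, since this sketch duplicates every file on every run.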

Another thing that can be done with Git is to have a single person in charge of version control of a given file. That person will watch what other people are doing and make commits as needed. That person can also institute reverts, branches, and merges, but such actions really need to be done via a safer mechanism than a simple git-checkout or git-merge. The person in charge of version control should really unpublish the file and republish it once the change is made.

Google+ and Mumble

I really like Google+ hangouts; they are about as close to sitting at the same desk as someone else as video chat has ever gotten. As nice as Google+ is, it is a bit annoying to have to leave that CPU/network hog running non-stop in order to communicate. This can be partially handled by Mumble, a VoIP push-to-talk program. It is lightweight and can be left running non-stop without many issues. I'm not sure which will be better in practice.

VNC and X Forwarding

VNC has a lot of issues when it comes to connecting two peers, particularly two peers that might be using an ISP that won't allow incoming connections from the Internet. It works well with one computer acting as a server that is accessible from the Internet. However, even when you have VNC working, you can usually count any OpenGL out of the equation. Attempting to use OpenGL on EC2 resulted in a crash of the Lisp system, if I remember correctly. Replacing the OpenGL drivers with a software renderer might help (in fact it seems necessary, as the EC2 server has no video card for a hardware driver to make sense). But the main issue is the lag, which is pretty bad, and the general frame rate. However, you can certainly set up a GUI with buttons, combo boxes, static images, etc., and it will work fine. The only real issue is real-time graphics.

X forwarding is another option. It can be made pretty efficient with the help of XPra or NX, and with XPra, at least, the window can be detached and re-attached by someone else. But this is not really collaborative. To my knowledge, there is no way to use this technology (or technology like it, e.g. XPra or NX) in a collaborative way. If you do choose to use it and you are using EC2, I could only get it to work if I installed the software rendering drivers (otherwise the Lisp system crashed; this is actually not that uncommon if you are using CL-GLUT).

The Experience

This is a really neat experience. Using Rudel and sharing a Lisp image was very new to me, and it felt like there was a lot of potential there. It will take me, and probably others, some time to actually wrap our heads around all of the implications here. I oftentimes found myself forgetting that I can edit the buffer while someone else is editing something else, or that I can evaluate that code and it will instantly become available to the other users. I can only imagine that this parallel development could scale nicely with more people. You do need to coordinate the development, but this is always true. Problems need to be broken into distinct, separable subtasks and the solutions to those subtasks need to have well defined interfaces in order to prevent breaking other people's code, but this is, again, always part of any development with more than one person. There are also times where it is very clearly a win, for instance, when writing unit tests or debugging at the REPL.

Of course, if you are really interested in playing with this, hopefully this post and optionally my subsequent post on setting up EC2 will allow you and your friends to try this out yourself. Also, I'd once again like to put in a plug for my ICFP team, join us, it will be fun. Beyond that, at least for the time being, I will put out a standing offer that if you want to have a collaborative coding session with me, in Common Lisp, let me know and I will probably be happy to participate.

Here are the videos of the coding session. Again, I apologize for the quality and the slowness of the development (Oleg and I are still learning each other's style). The task we were setting out to accomplish was to design a program that could solve a maze.


Other things we could have used

There are tons of tools out there. I am aware that you can do a lot with communal Screen sessions if you are willing to limit yourself to the terminal. You can also just run Emacs over X11 forwarding (Emacs has the capability to spawn frames on different displays). This might work, but some have said that Emacs can freeze if one of the users drops their connection (I suppose without closing the window).

If anybody knows of any other awesome tools, or a better set up like this, please comment below, I'd love to hear about it.

Wednesday, July 4, 2012

Ubuntu 12.04 vs. Emacs Key Bindings

As I am becoming more at home in the Unity interface I came across the annoying problem that Unity binds the <Control><Alt>t chord for starting a terminal window. While that is a fine chord for that, and I believe that starting a new terminal should be as easy as possible, that key-binding is already used for something I use even more frequently, transpose-sexps in Emacs.

In the past this could easily be rebound by using ccsm, the CompizConfig Settings Manager, and entering the Gnome Compatibility plug-in and setting it to something else. Unity (at least what comes with 12.04) ignores this binding, it seems. In fact it seems that there are several places where this binding might be set. I remembered stumbling upon a list of bindings in MyUnity, or UbuntuTweak, or some other third-party tweaking app, but I have long since forgotten where that was. But in every place I found, disabling that key-binding never had any effect.

I finally took the time to work out a solution today. The solution is to get my hands dirty and use gconf-editor directly. I don't think gconf-editor is included in the default install of Ubuntu, so you need to:

sudo aptitude install gconf-editor

Start the program and search for keys that have the word "terminal" in them. I found three places where that key-binding was specified and I wiped out the value in each, though it was the first key value that seemed to matter. Then if you wish to set a binding, run CompizConfig Settings Manager and edit the key binding to start a terminal under the Gnome Compatibility plug-in. Once again, CompizConfig Settings Manager doesn't come installed by default, so:

sudo aptitude install ccsm

I don't think that Canonical is really interested in promoting deep customization of their OS or window manager, something that is not very GNU/Linux or Libre Software like. This is a very different zeitgeist from the GNU/Linux of a decade ago, or even five years ago. Maybe that is why I had such trouble changing this binding. How is this not a bug that would have been fixed in 11.04? It's all fine, though, so long as they don't take that extra step of actually obstructing people from customizing things.

Update: I recently reinstalled Ubuntu 12.04 and used none of the old configuration files in the new install. After this, setting the shortcut under Settings -> Keyboard -> Shortcuts (tab) -> Launchers does the correct thing. No need for gconf-editor or ccsm or anything else.

Wednesday, June 27, 2012

Cedet Interferes with Slime

Just in case this has bitten others: Cedet (at least the current 1.1 version) does not play well with Slime. It clobbers some Slime facilities, like arg-list documentation in the mode line, and it seems to map capital letter key bindings to lower case key bindings (like C-c C-d A seems to run C-c C-d a which is slime-apropos). There are probably other annoying "step on other libraries' toes" sorts of bugs as well. My advice is that if you are seeing odd Slime behavior, or odd anything behavior, pull the Cedet stuff out of your .emacs file and try without it.

This almost certainly has to do with the way many Cedet setup guides direct you to enable a bunch of global-xxx-mode settings. However, I don't have any lines like this in my .emacs file and still see odd behavior, so I wouldn't be surprised if this is the default.

I found a page that briefly sketches how you might enable Cedet only locally. This seems a bit involved and there is no guarantee of it resolving the problem. So, I guess I am back to using my memory to figure out structure members like some kind of animal.

Sunday, June 24, 2012

ICFP Contest Call for Teammates

Update: I have posted an overview of the tools that were used as well as a post mortem for the team's attempt this year.

I have been competing in the ICFP Contest for the last 5 years or so and will do so again this year. The contest begins in 3 weeks: it starts on July 13 at 12:00 UTC and will continue until July 16 at 12:00 UTC.

For those that don't know, the ICFP Contest is a long running competition that poses a single hard, and sometimes mathematically deep, question for the participants to solve in a 3-day time period. The task is a secret which is revealed at the beginning of the contest. The problems are hard to solve, but (usually) easy to understand.

To give some idea of what kinds of problems you might expect, previous years have had some awesome tasks like:

  1. Designing control systems for a Mars rover
  2. Designing a flight system for orbiting satellites
  3. Writing an AI for a complex card game

And also some interesting, but esoteric problems like:

  1. Reverse engineering an alien machine code from a compiled executable and data mining that executable
  2. Writing code in various, odd computational models (typically involving writing an interpreter and an optimizing compiler)

In the past I have worked either by myself or in a group of people where each used their language of choice. I have found that when you work by yourself it is easy to lose motivation, particularly when you get stuck and you have no one to bounce ideas off of. When you work with others it is easy to stay with it, but when you are each using your own language, a great deal of effort is spent in porting solutions between various languages or in designing flexible interfaces and protocols between the various code pieces. What I would like to do this year is work with a team of people using Common Lisp for the main development.

The idea would be to get together a group of Lisp programmers that will communicate over IRC, Google+ hangouts (group video chat), or even a Mumble VoIP server and share files using Dropbox and/or GitHub. We can even try some more cutting edge collaborative setups such as shared VNC and/or screen sessions, collaborative editing using Rudel with Emacs, or just connecting to a single Lisp image with multiple swank connections if we are able to try it out prior to the contest start. Last year, my team used a setup with some of these elements to facilitate development with people from Boulder, Colorado to southern Georgia to Seoul, South Korea. It worked pretty well.

The contest is sponsored by Facebook this year, which means that there will be some monetary prizes, which is unusual. The real reward, however, is working for a few days on an interesting problem that you might never have thought about before. I am hoping to have a team that aims to have fun with this competition. We might have a few late nights if you are up for it, but the main goal is to enjoy the competition and not work anybody to the bone that doesn't want to do that.

Anyway, if you are interested in spending a weekend participating and would like to work with me, using Common Lisp, please contact me by leaving a comment or sending mail directly to zachkostsmith@gmail.com.

Update: I have written a post about the collaborative tools that I hope to use this year.

Saturday, April 7, 2012

Kickstarter Video Games and GNU/Linux

I actually don't play video games very much (maybe a couple hours per week at peak, but more commonly zero), but I have been buying a lot of them the last few years (Humble Bundle), and helping to fund several game development initiatives on Kickstarter. What the heck is going on?

Long video game hiatus

First, this is coming from someone who, as a young man, adored video games and spent way too much money on them. But when I got to college I stopped for financial and study reasons. Up until a few years ago, I bought very few video games. To put it in perspective, I stopped buying games back when they came in gigantic boxes, back when actual voice acting was reason enough to buy any game, when DRM was called copy protection and it oftentimes involved looking up a particular word in the manual, or a spinny decoder wheel thing. However, the way I see it now, video games are more important than ever to my future happiness.

Something new has started recently. Relatively well known developers are starting Kickstarter campaigns to fund video game development. This is pretty cool in itself, but what makes it much cooler is that many of the developers are going to be making truly cross-platform games (MS Win, MacOS, GNU/Linux).

I am putting out money for these games that I might, in all reality, never play. I'm doing this for one reason: they are showing a willingness to support GNU/Linux systems. Now, don't pigeonhole me as a person that thinks all software should be FOSS (even though that is what I believe); my reasoning is more that I want and need a more stable platform from GNU/Linux than is currently available, and I think that video games are an important first step towards that goal.

Video games drive stability

Here is the way I see it. Video games make beaucoup bucks. With that money, people start to think about how they might make it easier to make more money, and that is where the stability will increase. If you look at the shortcomings of GNU/Linux as compared to MS Windows systems, the most glaring problem that I see is that hardware is poorly supported. The best way to get hardware supported is to impose a financial incentive to have them work and work well. That financial incentive scales with the number of users. The number of users is highly correlated with the number of well programmed flashy computer games. I will go so far as to say that there is likely a causal relationship where a vibrant games market draws more users.

But there aren't enough GNU/Linux users out there

Of course, to a studio that sells millions of copies of a game, the number of home PC GNU/Linux users might seem negligible. I actually don't think this is the case. Now, I don't know if anything can ever dethrone MS Windows in the PC game market, but if we look at the numbers from such promotions as the Humble Indie Bundles, we see that a good chunk of that money is coming from GNU/Linux users. Fairly consistently, we have seen the money from GNU/Linux users hit around one eighth to one quarter of the total money raised. In the last bundle, GNU/Linux contributions hit around the one sixth mark; that's \$100k that would have been lost if not for the GNU/Linux support. We can discuss why this is the case (my guess is a combination of record numbers for GNU/Linux users and a form of video game starvation), but I think it is hard to argue that this number is negligible.

For a project like Double Fine Adventure, perhaps it is not too far off to think that \$550k was due to the GNU/Linux support, and perhaps \$350k for Wasteland 2. To contrast, one wonders what to expect for another project I would have been very interested in, Shadowrun Returns, which has announced that they will not have GNU/Linux support. Are they missing out on one sixth of their possible funding? Is the cost of porting to GNU/Linux more than that lost funding? Update: Shadowrun Returns has announced that if \$1M is reached, there will be an effort to port to GNU/Linux after the release. This is pretty nice, and while it would be great to have Linux support promised now and without the post-release delay, I'm pretty sure that \$1M will be reached and any support is better than none. Yay! I won't pick on Al Lowe here, as budget is his primary concern right now, and it is too soon to tell what is going on with Jane Jensen's project. I wish you both the best and I'll be watching for other (hopefully new) titles on Kickstarter. I will say that I am kind of over LSL (much more of a Space Quest and King's Quest fan) but if there was a GNU/Linux version, I'd throw \$15 at it to support it.

So, here is the point. I think these projects should be supported. Go do it, now. But I also think people might want to start considering GNU/Linux as a deployment platform, and not for the novelty of it à la id Software, but for the profit of it. Now, I realize that game developers are not reading this blog (I know this is true because nobody is reading this blog ;) but maybe if you are reading it, and you run into someone that does make these kinds of decisions, some of these ideas will leak through during your conversation.

Also, since I don't want to post on this stuff again, a quick note. The Humble Bundle started as a promotion where the games were released as FOSS after a certain goal was reached. This is no longer the case or this goal is very high. I guess I just want to relay my own data point to the world on the value of libre source. I paid \$30 for the first bundle after I heard of the possible source release. Now I have dwindled to around \$5 for each. Just a word of advice, releasing the code as FOSS is valuable to people and people are willing to pay more for it.

Uh, did you forget about Libre Software?

Oh, right… Well all of this stuff happening in our little nook of the world of video game development is great, but not exactly perfect. However, I have a differing opinion on video games than the FSF has.

To me, closed source is 'okay' for DRM-free, non-mission critical software.

Contrast this against something like a computer algebra system that you use for your job, a financial books program that you use for your personal or work finances, or an operating system where you run everything. Having these be closed source means that you have relinquished a great deal of knowledge of how useful your work is, how correct and accurate your financial records are, and how safe any of your private data is. Having these be open source but non-Libre means that you have relinquished control over these things, but at least you know where you stand. In an ideal world, everything would be FLOSS and we would have come up with a widely applicable funding mechanism, but this isn't an ideal world yet. If there is going to be something non-Libre in the mix, I suppose that a video game, which is fundamentally a luxury, is the best one to have, but only so long as there is no DRM attached to it. This is a big 'but' and I cannot stress this enough. If there is one place where we have demonstrable proof of the abuse of proprietary software, year after year after year, it is with DRM for video games.

So, let me reiterate. Libre games would be a boon for gamers. I would like games better if they were Libre, all else being equal. But the importance of a particular piece of software being Libre Software is directly proportional to how much of my life depends on it. For video games, this is sufficiently unimportant that I feel that it is okay to support DRM-free video games, particularly if there are fringe benefits like attracting more users, developers, and hardware vendors to GNU/Linux.

Saturday, March 31, 2012

Running a Command Whenever a File Changes

Edit: I actually had to republish this as I didn't understand the Blogger interface. Because of this, I added a new section where I combine this into a script that you can actually use. Enjoy.

Every once in a while you probably would like to have a command run every time a particular file changes. I know I did when I was writing some LaTeX-based presentations with Beamer and was using the latex to dvips to gs chain of commands to compile the document into a PDF file. I came up with a way to ease these steps by detecting changes in the document's DVI file that Emacs writes for me. I later thought of a couple of other methods and figured I would put them up here in case anybody ever needed something similar. I included all of them as you never know when you need to put together a hack on some old or limited system, or just a system that you don't control and thus cannot install new software on.

I decided that the interface of my little script should be to specify a file to watch, the watch file, and a command to run when it changes, the on change command. The on change command should take one argument, the watch file.
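
None of the scripts below check their arguments, so a mistyped file name just makes them sit there silently. As a minimal sketch of the interface described above, here is a guard that could run before the watch loop starts. The function name check_args and the exit codes (borrowed from the sysexits.h convention) are my own assumptions, not part of the original scripts.

```shell
#!/bin/bash

# Hypothetical argument check for: on-file-change WATCH_FILE ON_CHANGE_COMMAND
check_args()
{
    # Exactly two arguments: the watch file and the on change command
    if [ $# -ne 2 ]
    then
        echo "usage: on-file-change WATCH_FILE ON_CHANGE_COMMAND" >&2
        return 64
    fi
    # The watch file has to exist before we can watch it
    if [ ! -e "$1" ]
    then
        echo "on-file-change: no such file: $1" >&2
        return 66
    fi
    # Only the first word of the command is looked up, so the command
    # string may still carry its own options
    if ! command -v "${2%% *}" > /dev/null
    then
        echo "on-file-change: command not found: ${2%% *}" >&2
        return 127
    fi
    return 0
}
```

In a script you would call it as check_args "$@" || exit $? right after the shebang line.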

Watch File Modification Times

In this version we look at the watch file modification time and compare it to a reference time that is updated every time the watch file changes. This is definitely a cheap way to find these updates. The easiest way I could find to check the modification time of a file was to create a temporary file which acts as a time stamp. We just check which file is newer and touch the temporary file any time we want to update the time stamp.

#!/bin/bash

watched_file=$1
command=$2

# First we need a temporary file to test against
marker=$(mktemp /tmp/on-change-XXXXXX)

# We have to trap the EXIT signal as we want to clean up our tmp file
trap 'rm -f "$marker"' EXIT

# Loop until killed
while true
do
    # If the watched_file is newer than the marker...
    if [ "$watched_file" -nt "$marker" ]
    then
        # Wait for a bit to make sure that the file is done being modified
        sleep 1
        # ... run the command (unquoted so it can carry its own options) ...
        $command "$watched_file"
        # ... and reset the marker
        touch "$marker"
    fi
    sleep 2
done

One thing to note is that this script, like many scripts, has a race condition in it. If it happens to detect a modification of the watch file but the modification is not yet complete, it will trigger the on change command and then probably get an error from it, then detect another modification on the next iteration. This will continue until the program modifying the watched file has completed its work. This is bound to happen for short polling times (here we use two seconds) or long running programs that modify the watch file. This is the reason I added a one second delay after the modification time test. The delay is to try and give time for anything that is currently underway to complete. It helps, but of course it will not always be long enough.

A possible problem here is that it is easy to trick this into doing more work than it should. The watch file might be repeatedly touched but never actually modified, which would lead to unnecessary execution of the on change command. The key is to note that the newer modification time is a necessary but not sufficient condition for the file to have changed. It leads to false positives, but is very cheap.

It should be noted that, in the case of compiling LaTeX documents, while this can happen, it is a pretty rare event. But, if we are building a general purpose tool, it is something we should worry about.
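
To see the false positive in isolation, here is a small demonstration, separate from the scripts in this post. The file names and the fixed time stamps are made up for the example, and it assumes md5sum is available. It touches a file without changing its contents and then asks both tests whether the file changed.

```shell
#!/bin/bash

# Work in a scratch directory that is removed when the script exits
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

watched="$workdir/watched.txt"
marker="$workdir/marker"

echo "same contents" > "$watched"
old_hash=$(md5sum "$watched" | cut -d ' ' -f 1)

# Give the two files explicit, different time stamps: the watched file
# ends up newer than the marker although its contents never change
touch -t 202001010000 "$marker"
touch -t 202001020000 "$watched"

# The modification time test reports a change...
if [ "$watched" -nt "$marker" ]
then
    mtime_says=changed
else
    mtime_says=unchanged
fi

# ...while the hash test does not
new_hash=$(md5sum "$watched" | cut -d ' ' -f 1)
if [ "$new_hash" != "$old_hash" ]
then
    hash_says=changed
else
    hash_says=unchanged
fi

echo "mtime test: $mtime_says, hash test: $hash_says"
```

The modification time test should report a change while the hash test does not, which is exactly the "necessary but not sufficient" situation.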

Watch File Hash

How can we eliminate the reprocessing of identical files? File hashes.

The next technique is to poll the file for actual file changes. We don't need to keep an old copy of the file, or anything like that; we just keep track of the file's old hash value. Every two seconds (plus the time it takes to hash the file) we compute the file's hash and compare it to the old hash. If they differ, we run the specified on change command and update the stored hash value.

#!/bin/bash

watched_file=$1
command=$2

# Compute the reference hash value
hash=$(md5sum "$watched_file" | cut -d ' ' -f 1)
while true
do
    # Grab the file's hash
    newhash=$(md5sum "$watched_file" | cut -d ' ' -f 1)
    # If it has changed...
    if [ "$newhash" != "$hash" ]
    then
        # ...run the command...
        $command "$watched_file"
        # ...and record the new hash
        hash=$newhash
    fi
    sleep 2
done

This has the advantage that it never needs to run the on change command unless there is an actual change in the file. This means that for an expensive on change command the method works very well. However, for very large watch files the hashing becomes unnecessarily expensive.

However, the race condition is still there and this time it is worse. Whereas in the modification time test method we could try to give the modifying process extra time to complete, this time we don't have that option. We just have to blindly try again until we get it right.

Use INotify

As is often the case, you find something that is a better solution than what you have hacked together after you have long since found a good enough solution to your problem. This is the case with me and INotify.

The INotify kernel facility was designed for just the problem I was attacking. It provides, via a kernel interface, a way to hook into the file system and receive notifications on events such as reading from, writing to, opening, and closing files. If you are on one of the mainstream distros and have been keeping things even remotely up to date, you probably have an INotify ready kernel, but might not have the shell tools installed. We will be using inotifywait, which blocks until a specified file system event is triggered. In this case, we are interested in file modification events, so we will pass the option -e modify to the program. Just a note, this is taken almost verbatim from the inotifywait man page.

#!/bin/bash

watched_file=$1
command=$2

while inotifywait -e modify "$watched_file"; do
    # Wait a bit, in case the modifier is still working
    sleep 1
    # Then run the command
    $command "$watched_file"
done

This is much simpler than the other methods. A real strength here is that this is short enough that it really doesn't need to be a script at all. You can just memorize this. This is definitely a pretty good version, but it still has that race condition. So, one last version, where we will wait for the file to not be modified for a second before we run the command on it. We can do this because inotifywait provides a timeout option (-t) and exits with a status of 2 if it timed out.

#!/bin/bash

watched_file=$1
command=$2

while inotifywait -e modify "$watched_file";
do
    # Keep waiting until a full second passes without a modification
    while [ 2 != "$(inotifywait -e modify -t 1 "$watched_file" \
                    1> /dev/null 2> /dev/null; echo $?)" ]
    do
        echo waiting...
    done
    $command "$watched_file"
done

This catches most of the race condition issues, I think, but the price you pay is that you have to wait at least one second after the file has been modified before the file can be processed by your script.

Of course, without a proper synchronization mechanism, which needs to be agreed upon by both programs and thus is largely incompatible with the shell idea of small self-contained programs, you will never get rid of this race condition. It is firmly up to the user to ensure that two different programs are not accessing the same file at the same time. (Actually, I think we can go a long way towards solving this by using lsof and checking if the watched file is still open. I don't have time to explore this, but lsof $watched_file seems to be promising.)
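
As a rough sketch of that lsof idea (untested against real editors, so treat it as a starting point rather than a fix), the helper below polls until no process has the file open. The function name wait_until_closed and the retry count are my own assumptions. It relies on lsof exiting with a non-zero status when it finds no open instance of the named file, and it only sees processes that your user is allowed to inspect.

```shell
#!/bin/bash

# Hypothetical helper: wait_until_closed FILE [TRIES]
# Returns 0 once lsof no longer reports FILE as open, or 1 if it is
# still open after TRIES checks spaced one second apart.
wait_until_closed()
{
    local file=$1
    local tries=${2:-10}

    # Without lsof we cannot tell, so do not block the caller
    command -v lsof > /dev/null || return 0

    while [ "$tries" -gt 0 ]
    do
        # lsof exits non-zero when nothing has the file open
        if ! lsof -- "$file" > /dev/null 2>&1
        then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

In the scripts above you could call wait_until_closed "$watched_file" just before running the on change command. Another process can still open the file right after the check, so this narrows the race but does not remove it.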

Combine The Methods

We can have the best of all of these methods by combining them into one monster script that checks for changes using INotify (with a modification time check as a fallback) and then confirms that the file has actually changed by comparing the file hashes. But I won't bother because this is already too long, both in words and time spent typing and editing.

Update: here it is. I changed the interface, now you specify a file and the complete command you want to run on change.

#!/bin/bash

watched_file=$1
# Everything after the file name is the command to run; an array keeps
# arguments that contain spaces intact
command=("${@:2}")

# We start with a bogus hash
hash=null

if command -v inotifywait > /dev/null
then
    function wait_till_unchanging()
    {
        # inotifywait exits with 2 on timeout, i.e. no change for a second
        while \
            [ "$(inotifywait -e modify -t 1 "$watched_file" \
            1> /dev/null 2> /dev/null; echo $?)" != 2 ];
        do
            echo waiting...
            sleep 1
        done
    }
else
    function wait_till_unchanging()
    {
        # Wait until there isn't a change for one second
        touch "$marker"
        cont=1
        while [ 1 = "$cont" ]
        do
            cont=0
            sleep 1
            # If the watched_file is newer than the marker...
            if [ "$watched_file" -nt "$marker" ]
            then
                # reset the marker
                touch "$marker"
                cont=1
            fi
        done
    }
fi

if command -v md5sum > /dev/null
then
    function if_changed_run ()
    {
        # Grab the file's hash
        newhash=$(md5sum "$watched_file" | cut -d ' ' -f 1)

        # If it has changed (this is always run the first time as $hash is null)...
        if [ "$newhash" != "$hash" ]
        then
            # ...run the command...
            "${command[@]}"

            # ...and record the new hash
            hash=$newhash
        fi
    }
else
    function if_changed_run ()
    {
        # ...run the command...
        "${command[@]}"
    }
fi

if command -v inotifywait > /dev/null
then
    while inotifywait -e modify "$watched_file";
    do
        wait_till_unchanging
        if_changed_run
    done
else
    # First we need a temporary file to test against
    marker=$(mktemp /tmp/on-change-XXXXXX)

    # We have to trap the EXIT signal as we want to clean up our tmp file
    trap 'rm -f "$marker"' EXIT

    while true
    do
        # If the watched_file is newer than the marker...
        if [ "$watched_file" -nt "$marker" ]
        then
            wait_till_unchanging
            if_changed_run
        fi
        sleep 1
    done
fi

Using It To Automate LaTeX Builds

I said I wanted this to make compiling LaTeX documents easier. In order to do that you just use something like:

on-file-change presentation.dvi dvipdf

Update: with the new interface, it looks like this:

on-file-change presentation.dvi dvipdf presentation.dvi

Of course the best method would be to figure out how to make Emacs run this post processing for me. However, I have so far failed to figure that out and this little tool is applicable to more than just this scenario anyway.

Update: I did find out how to do this the right way in Emacs. Turns out there are a lot of different ways and not all of them work. For instance, the first answer on that page doesn't work for me. I use AUCTeX, which means that you can temporarily change the LaTeX compilation mode to use pdflatex by using "C-c C-t C-p" in the buffer or permanently set it by adding (setq TeX-PDF-mode t) in your .emacs file.