Sunday, March 13, 2016

7DRL Post Mortem

I have officially declared my attempt this year a failure, but I got quite far in spite of a major service interruption in the web services I maintain (which soaked up about half the week to remedy) and a project due for an Android class I was taking (which took up a solid day). I was able to gain some shaky footing in the Godot engine and start to understand how you might make a game with it.

While this attempt didn't reach the point of a playable game, it was a success in many ways, and I think I'll keep at it and perhaps have something to demo at my next Indie Game Developer meetup. If I can get the basic mechanics of the game in place before then, even if it lacks the rogue-like aspects I wanted, I will be pretty pleased with the accomplishment. Further, this puts the project at a pausing point just in time for our development sprints using Unity3D.

However, the challenge is over, and I wanted to put together a post about what I learned along the way.

The Good

Using Godot, it was very quick to get certain things up and running, and I even got some of the behavior I wanted basically for free. That isn't even counting the things it gives you outright: handling input in a reasonable way, portability to half a dozen platforms, and graphics and sound abstracted behind an easy-to-use interface. These tend to come with any game engine, but they are worth noting here because Godot does them pretty well.

Godot seems to be well thought out. No matter what I wanted to do, there seemed to be something in place to support it. This is a pretty great feeling when you are coming from the world of incomplete engines or building everything yourself. I feel like I'm programming, which is a good thing, but I also recognize that the scaffolding they provide makes it all go much faster. GDScript isn't that bad a language and, while the built-in text editor sucks and the developers seem to have something against embracing the one true faith, it works well enough to limp through basic editing tasks.

  • I was able to get an avatar walking around the screen using keyboard for motion and mouse for looking/aiming.
  • I was able to get basic collision detection for the player with walls, enemies, and objects in the levels.
  • I was able to use the collision system and ray casting to allow interaction of the player with elements in the level. For example, the player can open a chest, but only if they are standing facing it and within the range of their arm.
  • I was able to use the 2D light occlusion to provide line of sight out of the box.
  • I was able to get some of the visual feel I wanted by using a pixelating shader.
  • I was able to get some animations close to how I would like them (the walking and aiming animations) and incorporate some of the mechanics (such as movement while aiming is slowed).
  • I was able to use the navigation polygon facility in the engine to route enemies around the map.

All in all, this got nowhere near a 'game', per se, but it quickly became something where you could walk around and interact with things, and it was clear how to move forward.

The Bad

Most of the 'bad' was simply that I underestimated all of the work that goes into making a game. I knew I would underestimate it, but I think it bears mentioning anyway.

Even as your ideas start to gel in your head, implementing them in the editor is quite the bottleneck. I basically can't imagine a world where this won't be true, but it sure is aggravating. You have a vision, be it good or bad, but it takes time to get it out. For example, while the art was difficult to brainstorm (how do you make something scary from a top-down view?), it was even harder to create somewhat decent sprites, export them from Inkscape as separately moving components, and then key-frame the animations in Godot. Only then do you see that what you were trying doesn't actually achieve the effect you wanted, and you need to go back to the drawing board all over again. This iteration process takes a lot of time.

One limitation Godot itself imposed: performance in the engine is somewhat suspect. Very early in this process I needed to decide between making a 3D game and a 2D game. I felt that 3D would provide more flexibility in how the visuals would appear and would require less fiddly tweaking of animations; in 3D I wouldn't need to split my sprites into separate pieces to key the animation, as that would all be handled by the renderer. However, after playing with the 3D platformer demo, I was struck by how poorly it performed on my Intel graphics chip. Using the discrete card gave passable performance, but I was looking to make a lightweight game, and this engine was struggling to run a demo on my system, so I chose the 2D route. This directed many of my decisions from then on… and it makes me wonder whether it directed them badly.

I think the rest are basic limitations of working with sprites; not much to do about that. Godot has some functionality to render 3D models into 2D sprites, but I didn't experiment enough to know whether that would work. I had limited time and a decision needed to be made.

The Ugly

There were parts of Godot that are just awful, or at least awful for someone who approached it the way I did. I thought that developing a game would be something like writing a program. It is not, or at least it is not if you are using Godot, and probably not in any other engine with a GUI interface.

I develop software for a living. I might be doing all sorts of things with that software: automating processes on our servers, creating web pages, building intricate back-end networks, modeling complex systems, or doing scientific/data analysis, but I'm always writing code. As part of that, I'm a bit of a snob when it comes to how this should be done. When you develop code, you should be doing it with a text editor and a version control system. You develop by finding something you would like to change, making that change, and testing it to make sure it works (potentially revising the change over several iterations), and then selecting those, and only those, changes to make a commit with an at least somewhat descriptive commit message. I pride myself on the fact that each commit deals with only one thing, and does it in as clear and concise a way as possible. You can look at the code that changed and it will only be code that relates to the purpose of the commit.

This mode of development is incredibly hard, if not impossible, when using Godot.

Adding a node to a scene generally changes many lines in the file. At minimum, because the nodes in the scene carry an integer index, all later indexes are shifted. I have no idea why they use an integer index rather than the node's position in the file; my guess is that they couldn't be bothered to write a decent parser for their text-based scene format.

But that isn't the worst of it: the general state of the editor window is also saved in the scene file. This includes the transformations (position, rotation, scaling) of all of the nodes, even when those transformations are part of an animation. Seriously. If you advance the position in an animation, the new positions of the nodes are actually saved to disk! Why? Why not simply save the baseline transformation and then calculate the effective transformation from the animation and animation position before drawing? My guess is that they didn't think this through entirely. Whatever the case, I got into the habit of moving all of my animations back to time 0 before making a commit, just to work around this.

The scene file also tracks metadata about the state of the editor, like which view is open (the 2D view or the script editor). I think this is pretty dumb, but they are probably just following suit from Blender, which does something similar if I recall correctly.

However, I will say that this is just plain difficult to do right. The point of a graphical interface is that it automates many menial tasks for you. Unfortunately, by covering up those menial tasks, the editor is quick to produce changes that are large and far-reaching from a single click of a button. I have opened a scene, clicked it once, and watched the diff against what is in the repository grow to hundreds of changes. In these cases I usually see no change at all in the scene, nothing appears different in the editor, but something very significant happened to the file on disk. Invisible changes to your files are the root of many a headache in software systems. All those times your spouse is tearing their hair out cursing at MS Word or Excel, trying to understand why everything looks the same but behaves differently: these invisible changes are often why. I want reproducibility. How can I get reproducibility if I can't even be sure of the contents of what is on disk?

This is not in any way limited to Godot. Heck, it isn't even limited to graphical editors like Godot and Unity (or Inkscape and Blender, for that matter); it shows up in any editor that abstracts some process so the user doesn't need to worry about it. Even Eclipse, and practically any 'enterprise' (a.k.a. bloated) IDE, tends to hide stray whitespace from the user, which encourages users to produce far-from-clean commit diffs (and causes good maintainers to reject their pull requests).

So, in total, I think they need to seriously rethink their format for saving files. The new text-based scene and resource files were advertised as a version-control-friendly format. I very much love the idea that there could be such a format, and the great step forward they made here was making it human-readable text, but the goal of making scenes and resources source-control friendly has simply not been met.

Update: After writing this, I think I have come to a 'solution'. While it is far from my ideal solution, I'm simply not going to develop Godot projects under source control. I am going to treat Godot as I would Inkscape or Blender: as a way to edit visual media, without caring what is going on under the hood in the file formats.

Perhaps I will put my scripts under source control, but for the rest of the project I am going to do something much simpler: treat the entire thing as a black box and take periodic backups. I wrote a script that backs up my working directory every 10 minutes (if it has changed) and thins out the old backups as it goes (using rsync with hardlinking); a sketch of the idea appears below. This will be a better way to work on this. I don't get to keep track of changes, but at least I know I can restore a relevant previous version of the project if something goes wrong. If we are being honest, unless you can look at the content of each commit diff and understand what is happening, you basically have a file format that is a black box, and you shouldn't attempt to use version control. I expect that I will use a similar setup for my Unity3D projects. Perhaps some good has come of what I originally thought of as an unmitigated failure with the Godot engine. This was a hard lesson to learn, and I feel a bit naked without a version control safety blanket, but I needed to realign my thoughts on how this sort of GUI-based game development works.
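
Here is a minimal sketch of that backup loop (the paths and the fixed 10-minute sleep are illustrative assumptions, and the thinning of old backups is omitted; my actual script does more):

#!/bin/bash
# Periodically snapshot a working directory using rsync hardlinks.
SRC="$HOME/projects/my-game"
DST="$HOME/backups/my-game"

while true; do
    stamp=$(date +%Y%m%d-%H%M%S)
    # Files unchanged since the last snapshot become hardlinks, so each
    # snapshot only costs the space of what actually changed.
    rsync -a --link-dest="$DST/latest" "$SRC/" "$DST/$stamp/"
    ln -sfn "$DST/$stamp" "$DST/latest"
    sleep 600  # every 10 minutes
done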

Sunday, March 6, 2016

7DRL Announcement

I haven't used this blog in a long time, and I find less and less time to do so. But since I really want some public outlet for my ideas, I am going to start it up again with something that will hopefully be fun: I'm going to take part in the 7 Day Rogue-like challenge.

This will not be my first attempt at something like this. In the past I tried to take part in a game jam, but back then I simply didn't have the time. I'm not sure I have more time this year, but the circumstances are different, so I'm hoping the outcome will be different as well. I have the opportunity to work on game development (for actual money) with a friend of mine and, while I'm not sure I want to become a professional game developer as he has, I want to support him in his endeavor.

I will use the 7 day rogue-like challenge to cut my teeth a bit at making games with more modern tools. In the past I have prototyped games using homemade or half-baked game "engines". This time I will be using an established game engine, Godot. This means I will need to adapt to their way of thinking in order to get things done, but it also means I get the benefit of their architecture and design, and I get to repurpose a lot of content to make this go faster. I'm hoping that after this experience I will have a better understanding of what goes into making games on these game engine/IDE things.

I chose Godot because it is Free Software that runs well on GNU/Linux. It has plenty of warts and there are very capable proprietary competitors out there that run on GNU/Linux (mostly), but I'm going to learn Godot first and I'll learn those later.

So, without further ado, here is the announcement:

My entry into the 7 day rogue-like will be a real-time dungeon crawler: an action game where you attempt to descend into the depths of a dungeon to reach a final boss and defeat him/her. The main mechanic that I hope is somewhat interesting is that each floor of the dungeon has a timer that starts the instant you arrive. At some random time after you reach a floor, gates will open and unleash mobs of monsters to kill you. The goal is to spend enough time on each floor to find the powerups, but get off the floor before the mobs of monsters overpower you.

The vision I have for the game is a cyberpunk horror setting. This is mostly because I've been playing some Shadowrun and, after finding it deeply unsatisfying, started to wonder how I would do it better.

The visual style of the game is going to use pixelation to evoke the same kind of imaginative aspects that ascii art can produce (and to hide my lack of artistic prowess). This game will probably look more than a little bit like Teleglitch.

I have a bunch of other thoughts about this game, some of them jotted down, but I haven't put any code or art down yet. I started my attempt on March 5th, 2016 at 22:00 UTC.

Tuesday, September 30, 2014

Ubuntu Upgrade Woes

I tried to upgrade to the newest LTS release last week. Let's get this out of the way: relevant XKCD. It took three solid, long days, which is more than anybody should ever be expected to spend installing an operating system for a desktop (though I must admit I once spent a full week working through Linux From Scratch in my off time). All said and done, I was left with an (almost) fully functional Ubuntu installation, so I am pretty happy.

To put this in context, I have a whole bunch of rants about how much Ubuntu sucks filed away on my computer, but this will probably be the first one to actually see the light of day. The only reason I bring it up is that I seem to have, sort of, figured out why this was causing me problems and how to fix it. However, I don't really know of a place to put this. I know what worked for me, but I don't understand it fully, and I'm not really in a position to give technical advice to anybody.

So, I'll get to the short of it with the hopes that anybody in my same shoes that is frantically Googling about might land on this page. If you:

  1. Are running Ubuntu (or probably any other GNU/Linux that supports EFI boot) on a MacBook Pro (at least the 6,2 model, but perhaps others that use hybrid video technology).
  2. Are having a tough time getting the Nvidia drivers to work (they install, but you end up with a blank screen on the next boot).

You might be suffering from the same issue I had: you need to ensure that you are booting in emulated BIOS mode rather than using Apple's EFI. For me, this means holding the Option key on boot and hand-selecting the Ubuntu drive, but Google says there are other ways to do it.

This actually solves one of the big mysteries I've had since I started using Ubuntu on Apple computers: the poor battery life under Ubuntu versus OS X (in my and my friend's experience, it basically cuts the idle lifetime by a factor of two to three). At least one reason for the poor battery life (and perhaps the major contributor) is that the machine is stuck on the discrete Nvidia GPU and never switches to the integrated i915 card.
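
If you want to check which GPU you are actually running on, the kernel's vgaswitcheroo interface is one way to do it (this assumes debugfs is mounted and the open i915/nouveau drivers are in use; the proprietary driver does not expose it):

# Shows which GPU is currently active on a hybrid graphics machine
sudo cat /sys/kernel/debug/vgaswitcheroo/switch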

This caught me by surprise, as I wasn't aware that booting by different methods would leave the hardware in different states (i.e. boot one way and the system is stable and usable; boot another way and you get crappy performance at best, and very likely an install that crashes every time it tries to start Xorg). It is a shame that Ubuntu picks the unstable EFI way to set up the booting process… however…

You really can't blame Ubuntu for the decision. Apple used a non-standard EFI implementation for (at least) this series of MacBook Pros. They also used a non-standard hybrid video technology, based on Optimus but sufficiently different that none of the Bumblebee stuff works yet. I think people are working on it, and YouTube shows some videos of proof-of-concept graphics switching, but it is a pretty old laptop so I'm not holding my breath.

As a side note, while the whole installation debacle should have soured me on 14.04, I have to say that it is extremely well polished compared to 12.04. Of course it could be better, but I absolutely love how Unity looks once you install a few themes and fiddle with Unity Tweak Tool to bend it to your taste.

Also, I know many people who would ask, "was it worth it?" Is it worth it to work for three days to get a GNU/Linux installation up and running? Absolutely. I am vastly more productive when working within GNU/Linux. Should it have taken less time? Absolutely, and that is the real question to ask. Would another distribution provide similar features for much less investment?

Tuesday, April 1, 2014

Some Emacs hacks for GDB and other stuff

I am fighting with GDB in Emacs today and thought I would share a few hacks that make things more livable. These get hackier as they go, so… be warned.

First, GDB's many-windows mode creates a bunch of dedicated windows. This can be pretty annoying; in fact, dedicated windows in general can be pretty annoying. If I want to change the buffer a window displays, don't you dare stop me. The Emacs Wiki has some advice code that disables the setting of dedicated windows during GDB startup (bottom of the page). But to be clear, this is a problem with the Emacs interface, not the GDB interface. The GDB interface has a good reason not to want those buffers changed, and it is annoying to use if they do change. The problem is when I know that I really do want them to change, and Emacs makes me jump through hoops to do it. So, let's fix the underlying problem. Here is a bit of code for your init files that will let you change buffers even if the window is dedicated.

(defun undedicate-window (&optional window)
  (set-window-dedicated-p (or window (get-buffer-window)) nil))

;; Removing annoying dedicated buffer nonsense
(defun switch-to-buffer! (buffer-or-name &optional norecord force-same-window)
  "Like switch-to-buffer but works for dedicated buffers \(though
it will ask first)."
  (interactive
   (list (read-buffer-to-switch "Switch to buffer: ") nil 'force-same-window))
  (when (and (window-dedicated-p (get-buffer-window))
             (yes-or-no-p "This window is dedicated, undedicate it? "))
    ;; Clear the dedicated flag before switching
    (undedicate-window))
  (switch-to-buffer buffer-or-name norecord force-same-window))

I just set a global key binding of (kbd "C-x b") to switch-to-buffer! (actually I use a global minor mode to keep track of all of my keybindings, but the result is the same). This now acts exactly like switch-to-buffer unless the window is dedicated, in which case it asks if you want to "undedicate" the window first. Now Emacs won't reuse these windows willy-nilly, but you can still do what you think is best.
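
For reference, the binding itself is just:

(global-set-key (kbd "C-x b") 'switch-to-buffer!)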

Second, dedicated windows are so convenient (now that they are convenient to undedicate) that you might find you want to set the dedicated flag on windows that you don't want Emacs to change. So here is a function for that. If you want a dedicated completion window, just set the flag on that window and you won't have to worry about it getting taken over by some other Emacs pop-up.

(defun toggle-window-dedication (&optional window)
  (interactive)
  (let ((window (or window (get-buffer-window))))
    (set-window-dedicated-p window (not (window-dedicated-p window)))))

(global-set-key (kbd "C-x d") 'toggle-window-dedication)

Third, I really hate that performing any action in GUD tends to make the source buffer you are working in jump back to the execution point. This is a problem if you are setting several breakpoints or printing several values from the source file. Thus I came up with this hack (and it really is a hack) to make the problem go away.

(defun nice-gud-print (arg)
  (interactive "p")
  (save-excursion
    (gud-print arg)
    (sleep-for .1)))
(global-set-key (kbd "C-x C-a C-p") 'nice-gud-print)
(defun nice-gud-break (arg)
  (interactive "p")
  (save-excursion
    (gud-break arg)
    (sleep-for .1)))
(global-set-key (kbd "C-x C-a C-b") 'nice-gud-break)
(defun nice-gud-tbreak (arg)
  (interactive "p")
  (save-excursion
    (gud-tbreak arg)
    (sleep-for .1)))
(global-set-key (kbd "C-x C-a C-t") 'nice-gud-tbreak)

The sleep-for call is necessary to make save-excursion actually work. My guess is that GUD queues requests and executes them later. So without the sleep, my call to GUD returns, and with it the save-excursion environment exits, long before GUD processes the command and resets the view to the execution point. Not pretty, but it works for me.

Lastly, I find it helpful to have a key binding for gdb-restore-windows. That way you can easily jump to a particular window, make its buffer full-screen, and then return to the debugger view later. However, if you like a vertically split screen as I do (i.e. you like 80-column-wide programs), it is often even better to have a toggle between just the two center windows and the full debugger view. So, here is a function to do that:

(require 'cl)  ; for first, second, third, fourth, and rest

(defun gdb-windows-toggle (&optional full-restore)
  (interactive "P")
  (let ((window-tree (first (window-tree))))
    (cond (full-restore
           ;; With a prefix argument, unconditionally restore the full
           ;; debugger layout
           (gdb-restore-windows))
          ((and (listp window-tree)
                (= (length window-tree) 5))
           ;; We are in the full debugger layout; collapse to just the
           ;; center (source) windows
           (let ((buffers (if (listp (fourth window-tree))
                              (mapcar 'window-buffer
                                      (rest (rest (fourth window-tree))))
                              (list (window-buffer (fourth window-tree)))))
                 (first-selected (or (windowp (fourth window-tree))
                                     (eql (first (window-list))
                                          (third (fourth window-tree))))))
             (delete-other-windows)
             (when (> (length buffers) 1)
               (split-window-horizontally))
             (cl-loop for buffer in buffers
                      for window in (window-list)
                      do (progn (undedicate-window window)
                                (set-window-buffer window buffer)))
             (unless first-selected
               (select-window (second (window-list))))))
          ((or (windowp window-tree)
               (and (= (length window-tree) 4)
                    ;; horizontal split
                    (not (first window-tree))
                    ;; No further splits
                    (windowp (third window-tree))
                    (windowp (fourth window-tree))))
           ;; We are in the collapsed view; restore the full layout and
           ;; put the current buffers back into the center windows
           (let ((current-buffers
                   (if (windowp window-tree)
                       (list (window-buffer window-tree))
                       (mapcar 'window-buffer (rest (rest window-tree)))))
                 (first-selected (or (windowp window-tree)
                                     (eql (first (window-list))
                                          (third window-tree)))))
             (gdb-restore-windows)
             (let ((windows (rest (rest (fourth (first (window-tree)))))))
               (when (= (length current-buffers) 1)
                 (delete-window (second windows)))
               (cl-loop for buffer in current-buffers
                        for window in windows
                        do (progn (undedicate-window window)
                                  (set-window-buffer window buffer)))
               (if first-selected
                   (select-window (first windows))
                   (select-window (second windows))))))
          (t ;; If all else fails, just restore the windows
           (gdb-restore-windows)))))

(global-set-key (kbd "C-c g") 'gdb-windows-toggle)

That is long and ugly, and could probably be made much simpler, but it does the trick and does it pretty well. Maybe if these really work out I can get someone involved with Emacs to include some of these little hacks (cleaned up, of course). It would be nice to deal with the fact that GDB is fundamentally broken for me right now, but… we'll see where those bug reports go.

Monday, January 13, 2014

Visualization of the Game of Life

I was working with visualization a bit today and needed a system to visualize, so I did the Game of Life. It turned out pretty, so I decided I should share.

The game is played in the XY plane. Older generations are moved down in Z and their color is darkened by 20%. Each run starts with random live and dead cells. The first runs use free boundary conditions, while around 2:50 I switch to a toroidal system of size 20 in X and Y. You can clearly see gliders at several places in the video; for instance, there is one at the 2:00 mark.
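
For flavor, here is a minimal sketch of the update rule with the toroidal boundary conditions described above (the function names are illustrative; this is not the code from the video):

(defun count-neighbors (grid x y)
  "Count the live neighbors of cell (X, Y), wrapping around the edges."
  (let ((w (array-dimension grid 0))
        (h (array-dimension grid 1)))
    (loop :for dx :from -1 :to 1
          :sum (loop :for dy :from -1 :to 1
                     :unless (and (= dx 0) (= dy 0))
                       :sum (aref grid (mod (+ x dx) w)
                                  (mod (+ y dy) h))))))

(defun life-step (grid)
  "Return a fresh grid holding the next generation of GRID."
  (let* ((w (array-dimension grid 0))
         (h (array-dimension grid 1))
         (next (make-array (list w h) :element-type 'bit)))
    (dotimes (x w next)
      (dotimes (y h)
        (let ((n (count-neighbors grid x y)))
          (setf (aref next x y)
                ;; Live cells survive with 2 or 3 neighbors; dead cells
                ;; are born with exactly 3.
                (if (if (= 1 (aref grid x y)) (<= 2 n 3) (= n 3)) 1 0)))))))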

If you care, the simulation and visualization were all done from inside a Lisp session and take around 90 lines of code (including blank ones). The stuttering frame rate is due to the screen-capture software, RecordMyDesktop. I'm not sure how to do better, other than grabbing the frames from the OpenGL display buffer, writing them to bitmap files, and making my own video with FFmpeg (which is what I do for presentations).

I need to figure out how to make this a live wallpaper for my phone...

Friday, October 25, 2013

CL-Libsvm Usage

At work we decided that we needed a bit of machine learning to automate a problem. I recommended using LIBSVM, as it is very established (read: old, with many bindings available). I decided to see what was involved in using it this morning, but found that it took an embarrassingly long time to get it working the way I expected out of the box. This is my shortcoming: I had forgotten too much of the machine learning I used to know. But this post will hopefully help others get it working more quickly.

First, SVMs are pretty versatile, and there are a lot of choices to make when you start using LIBSVM. It is often not clear what you should use for which task, and I don't have the expertise to tell you what is best. For that information, you should check out this guide by the library's author. What I will show here is a quick and dirty approach.

I will be using Gabor Melis' bindings for LIBSVM. They work well enough, but they issue many warnings when you load and use them, as they use a now-deprecated syntax of CFFI. Just be aware.

We will be using a radial basis function (RBF) kernel to model the "a1a" training data and test data that are available on the LIBSVM website. Already this is a bit odd, as the LIBSVM people have split their data such that ~5% is in the training set while ~95% is in the test set. This is probably due to the somewhat expensive nature of training SVMs. But, whatever the case, we will stick with it.

Here is a simple procedure to read in the data:

(ql:quickload '(:iterate :cl-ppcre :alexandria))

(use-package :iterate)

(defun parse-data-file (file)
  "Read in a data file, returning a vector where each element is a
list of the form (target-value sparse-input-vector) suitable for
cl-libsvm."
  (coerce
   (iter (for line :in-file file :using 'read-line)
     (collect
         (list (read-from-string line)
               (let ((result nil))
                 (ppcre:do-matches-as-strings (match "\\w+:\\w+" line)
                   ;; Each match is an "index:value" pair; parse it into
                   ;; a cons cell of (index . value)
                   (let ((separator (position #\: match)))
                     (push (cons (read-from-string (subseq match 0 separator))
                                 (read-from-string
                                  (subseq match (+ 1 separator))))
                           result)))
                 (coerce (reverse result) 'vector)))))
   'vector))

(defun randomize-order (data)
  "Return a random permutation of the data."
  (alexandria:shuffle (copy-seq data)))

The parse-data-file function returns a vector where each element corresponds to a line in the data file, mapped into a list of the form (target-val sparse-vector-input). A sparse vector is a vector where each element is a cons cell containing an index as its car and a value as its cdr.
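
For example, a data line in the LIBSVM sparse format, like "-1 3:1 11:1 14:1", parses into something of the form:

(-1 #((3 . 1) (11 . 1) (14 . 1)))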

We need to further divide the test data so that we have a set we can use to validate the parameters of the RBF SVM (the "validation" set) while we use the rest to determine the accuracy of the model (the "test" set). We could just cut the test data in half, but that would make our validation calculations very slow and, as we will see in a moment, validation becomes the bottleneck of the calculation. I have chosen my validation set to be 2000 points. Whatever size you choose is a trade-off between accuracy and speed: 2000 points is about my limit when playing around at the REPL (I would choose something larger if it showed benefits for the model's accuracy). Before dividing the data, we must first randomize it to ensure that no structure of how the data was collected shows up in the analysis. This can be done with the randomize-order function above and a couple of calls to subseq.

(let ((test-data (randomize-order (parse-data-file #p"~/Desktop/a1a.t"))))
  (defparameter *validation-set* (subseq test-data 0 2000))
  (defparameter *test-set* (subseq test-data 2000)))

Now, to the nitty-gritty of CL-Libsvm. First, you'll need to install it, and it is not yet available in Quicklisp, so clone it from Gabor's Git repo. Naturally, you will also need LIBSVM itself installed (you don't actually need libsvm-dev or libsvm-tools, but why not have them anyway):

cd ~/quicklisp/local-projects
git clone

# Install LIBSVM if needed
sudo aptitude install libsvm3 libsvm-dev libsvm-tools

CL-Libsvm is basically structured like LIBSVM, so the same documentation applies here. We must first create a problem which contains inputs and the expected outputs, then create a parameter structure that contains the parameters that define how the SVM operates (numerical constants as well as the type of SVM and kernel), and finally combine these two into a model which can be used to make predictions.

This can be done easily enough for any particular RBF parameters \(C\) and \(\gamma\).

(ql:quickload :cl-libsvm)

(let ((data (parse-data-file #p"~/Desktop/a1a")))
  (defparameter *problem* (libsvm:make-problem (map 'vector 'first data)
                                               (map 'vector 'second data))))

;; With c and gamma bound to your chosen parameter values
(defparameter *parameter* (libsvm:make-parameter :c c :gamma gamma))

(defparameter *model* (libsvm:train *problem* *parameter*))

But we want to do this for many different values of \(C\) and \(\gamma\) in order to find what parameters give us the best performance. We could do several things to find the optimum values. We will be following the LIBSVM procedure and just search the parameter space on a grid. We should note from the manual that it is probably in our best interests to search in \(\log C\) and \(\log \gamma\).

(defparameter *optimization-grid*
  (iter :outer
    (for log-c :from 1.5 :to 3.5 :by .1)
    (iter (for log-gamma :from -5 :to -3.5 :by .1)
      (in :outer
          (collecting
           (list* log-c log-gamma
                  (let* ((params (libsvm:make-parameter
                                  :c (exp log-c)
                                  :gamma (exp log-gamma)))
                         (model (libsvm:train *problem* params)))
                    (list (quality model *validation-set*
                                   log-c log-gamma)))))))))

Note that there is a missing function here. We never defined quality. This function is meant to take a model and some testing data and determine a measure of how good the model is performing. For this I chose to use the Matthews Correlation Coefficient with the threshold for the prediction set to \(0.5\).
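
For reference, in terms of the true/false positive/negative counts, the coefficient is

\[ \textrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \]

A handy property is that the denominator can only be zero when the numerator is also zero, so it is safe to define those degenerate cases as zero, which is what the code below does.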

(defun logistic (x)
  "The logistic sigmoid, 1/(1 + exp(-x))."
  (/ (+ 1 (exp (- x)))))

(defun quality (model test-data log-c log-gamma)
  "Use the Matthews Correlation Coefficient to measure how well the model does."
  (iter (for (target input) :in-vector test-data)
    (let ((p (if (< 0.5 (logistic (libsvm:predict model input)))
                 1 -1)))
      (cond ((and (= p 1) (/= target p))
             (summing 1 :into false-positives))
            ((and (= p -1) (/= target p))
             (summing 1 :into false-negatives))
            ((and (= p 1) (= target p))
             (summing 1 :into true-positives))
            ((and (= p -1) (= target p))
             (summing 1 :into true-negatives))))
    (finally
     (let ((quality
             ;; Compute the MCC.  If the numerator is zero, the
             ;; denominator may be zero as well, so define that case
             ;; as zero.
             (if (= 0 (- (* true-positives true-negatives)
                         (* false-positives false-negatives)))
                 0
                 (/ (- (* true-positives true-negatives)
                       (* false-positives false-negatives))
                    (sqrt (* (+ true-positives false-positives)
                             (+ true-positives false-negatives)
                             (+ true-negatives false-positives)
                             (+ true-negatives false-negatives)))))))
       ;; Print some output so we know what it's doing
       (format t "log-c = ~A, log-gamma = ~A~@
                  TP = ~A, TN = ~A, FP = ~A, FN = ~A~@
                  Quality = ~A~%"
               log-c log-gamma
               true-positives true-negatives false-positives false-negatives
               quality)
       (return quality)))))

When we put this all together and play around with the ranges of the plots, we get a plot that looks like this:

From this we can tell that there is likely some kind of optimum around \(\log C = 2.10\) and \(\log \gamma = -4.27\).

You might wonder how I was able to determine, looking at those blocky heat maps, where that tiny maximum was. While you can do what the LIBSVM docs suggest and move to finer and finer grids to find optimal points, I find this pretty annoying and finicky. I opted to use some, as yet unreleased, bindings for the NLopt optimization library. With NLopt we can do a global optimization followed by a local optimization, which requires next to zero human intervention and finds a better optimum than I doubt I could find otherwise. (A small caveat: it is entirely possible that the rough nature of my MCC heat map is merely an artifact of the small validation set size. I don't have the patience to test this for a silly example.)

;; Global optimization (better grab a snack).  What follows is the
;; objective function and options passed to the optimizer from the
;; as-yet-unreleased NLopt bindings; the enclosing call is omitted here.
 (lambda (log-c log-gamma)
   (let* ((params (libsvm:make-parameter
                   :c (exp log-c)
                   :gamma (exp log-gamma)))
          (model (libsvm:train *problem* params)))
     (- (quality model *validation-set* log-c log-gamma))))
 '(1 1)
 :lower-bounds '(-5 -5)
 :upper-bounds '(7 7)
 :abs-xtol 1d-1)

;; Fit parameters = (2.071364331304816d0 -4.265683211751565d0)
;; Validation MCC = -0.5509856550306286d0

;; Local optimization (this should be quick), polishing the optimum
;; found by the global stage
 (lambda (log-c log-gamma)
   (let* ((params (libsvm:make-parameter
                   :c (exp log-c)
                   :gamma (exp log-gamma)))
          (model (libsvm:train *problem* params)))
     (- (quality model *validation-set* log-c log-gamma))))
 '(2.071364331304816d0 -4.265683211751565d0)
 :lower-bounds '(-5 -5)
 :upper-bounds '(7 7)
 :abs-xtol 1d-5)

;; Fit parameters = (2.096969188326027d0 -4.268553108908674d0)
;; Validation MCC = -0.5522135970232868d0

(let* ((params (libsvm:make-parameter
                :c (exp 2.096969188326027d0)
                :gamma (exp -4.268553108908674d0)))
       (model (libsvm:train *problem* params)))
  (quality model *test-set* 2.096969188326027d0 -4.268553108908674d0))

;; Test set MCC = 0.5497032935038368d0

This gives the optimum at \(\log C = 2.10\) and \(\log \gamma = -4.27\) and a Matthews Correlation Coefficient of \(0.55\) (measured against the test set, naturally).

Wednesday, July 31, 2013

Rudel Survival Guide

We are gearing up for our collaborative effort for ICFP, so I figured it might be nice to write a little "how to make it work" guide for Rudel, the collaborative editing framework for Emacs. To get this out of the way first: Rudel doesn't work out of the box. In order to use it, we had to hack a bit on the source to correct what we can only guess are legitimate errors. That said, I certainly don't have the expertise on the system to dictate the correct way to solve these problems.

As a note, throughout this contest our setup will be:

  • Rudel v0.3
  • Obby backend
  • TCP transport
  • Port 6522
  • Not using the built-in Rudel/Obby encryption
  • All connections tunneled over SSH (for encryption/access control)
  • No global or user passwords

The first step for using Rudel is, naturally, to install Rudel. If you have Emacs v24, you may use the package manager via M-x package-list-packages. This will make sure it also gets any dependencies, though I don't think there are any that aren't already part of Emacs. If you are not using Emacs v24, you will need to install package.el (and there the package manager actually will be useful, as I believe it will have to track down some dependencies). This can be done like this:

cd .emacs.d

Then, from Emacs, M-x load-file that package.el file. After that you can continue as if you were using v24 (except where noted).

Rudel is available via the Marmalade repository. In order to enable the Marmalade repository, you should add something like this early in your .emacs file:

;; (load "~/.emacs.d/package.el") ;; If not v24

;; Load up the new package manager
(require 'package)
;; Add the Marmalade repo
(add-to-list 'package-archives
             '("marmalade" . "") t)

The install/compile should seemingly complete without any issues. (If you are not on v24, there might be an issue with package displaying the info pages of certain packages, Rudel among them. Instead of using M-x package-list-packages, use M-x package-install and specify "rudel" when it asks.)

Now the trouble, er, fun begins.

1.1 We broke your Emacs

Try closing and restarting Emacs. On your next start, Emacs will fail to read your .emacs file with an error like:

Symbol's function definition is void: eieio-defclass-autoload

This is because there are some bugs in Rudel that keep it from loading properly. My solution is to not use the standard method of loading Rudel. The first order of business is stopping it from loading the correct-but-broken way via the package manager. Go to the shell and move the rudel directory somewhere the package manager cannot find it:

cd .emacs.d
mv elpa/rudel-0.3 ./

Now Emacs should read your .emacs without issue (because it is no longer trying to load Rudel).

The next order of business: we want to be able to load and use Rudel. To do this, we will run Rudel's compile script. Do a M-x load-file on .emacs.d/rudel-0.3/rudel-compile.el. This will compile and generate some files; most importantly for our purposes, it will generate rudel-loaddefs.el. Perform a M-x load-file on .emacs.d/rudel-0.3/rudel-loaddefs.el and Rudel should be operational. Try it out: use M-x rudel-host-session to start a session (protocol obby, transport tcp, port 6522), join that session via M-x rudel-join-session, and try publishing a buffer with M-x rudel-publish-buffer. This should all work.

We want to make this permanent, so we should add something like:

(load-file "~/.emacs.d/rudel-0.3/rudel-loaddefs.el")

…to our .emacs file after the (package-initialize) statement.

Look, I know this is extremely hackish, but I think it will work for everybody. It is the only way I have consistently gotten things working.

1.2 Joining and leaving sessions

So, as best I can tell, this is the status of this functionality in Rudel: you can join sessions and you can leave them, but as far as the server cares, you can never leave. This doesn't seem like much of a problem at first, but here is how problems start to manifest.

  1. A username must be unique: each time you log in, you have to pick a name you haven't used before, not the one you used last time. This manifests as a lot of "smithzv", "smithzv2", "smithzv3", etc. Luckily, you shouldn't be disconnected often.
  2. A color must be unique at login time: this one isn't as bad, as you can change colors once you are in the session using rudel-change-color. Good practice is to log in with a garish, infrequently used color and then immediately change it to something more appropriate. There is no need to worry about conflicts after you have logged in.

1.3 Undo and Rudel

So, one of the biggest problems I have with collaborative editing in Rudel is that undo is treated very literally. If you undo, you implicitly expect it to apply to your edits. However, with Rudel, where someone could be editing somewhere else in the file, undo is happy to go about undoing their work as they type, rather than the edits you just made.

The end result is that in a collaborative editing environment, undo is a very dangerous thing, doubly so when undo means what it does in Emacs land (i.e. you can redo by undoing an undo). Basically, if you are using Rudel, you probably should not be using undo at all. This is a pretty tall order if you have deeply internalized undo into your editing habits (as I have).

It strikes me that a helpful heuristic would be to undo only within regions that contain only your edits (or even, using something like multiple regions, to allow undo of anything you have edited but not things others have). This means that if you are the only person editing a buffer, undo works exactly as expected; if others are editing the buffer, it will only undo your edits. Note that this isn't always what you want.

However, I'm not sure such a heuristic is possible (or, more likely, it is possible but hard to implement). I'll take a look. It seems that for safe, useful undoing, you need to tell everybody working in that buffer to stop what they are doing while you perform your undos.

I realize that other undo systems can be used with Emacs. For instance, there is Undo-Tree. I am not sure how well these work (or really how they work). Perhaps someone who is better versed in these tools can enlighten us.

1.4 When things go wrong

There are times when, try as it might, Rudel screws up and the files become out of sync. This happens fairly infrequently, thank goodness, but when it does, Rudel does not handle it gracefully. There is no "resync" function that I can see. This means that if your buffer says one thing but someone else's says something else (this usually manifests as what looks like a typo, but if you fix it, another person goes back and changes it back to its typo state), you will have to do something pretty drastic to get Rudel working correctly again. There are a couple of things that will work, but both are pretty annoying:

  1. Ditch this buffer, have everybody unsubscribe, rename the buffer so it will share under a different name, then republish. This way everything has started fresh with this buffer.
  2. Restart the entire Rudel session.

Of course, the first method is preferred to the second.

1.5 Dealing with Annoying Colors

If you start using Rudel, sometimes a collaborator will pick a color that doesn't work with your theme or general aesthetic. Luckily, there is something you can do about it, even if they refuse to change their color (or are away from keyboard): you can turn off all coloring by author. Simply set the variable rudel-overlay-author-display to nil and there will be no more colors. This is pretty necessary right now because Rudel is ignorant of the difference between light and dark themes, so two locally appropriate choices might be completely inappropriate for the remote party.
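
In .emacs terms, that is simply:

;; Disable Rudel's per-author coloring of text
(setq rudel-overlay-author-display nil)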