Friday, June 10, 2011

Syncing Emacs with Google Documents

Before anything, sorry for the long post. If I had more time, I would have made it shorter. I made a promise to myself that I would spend the time to write up things I do, mainly so I get practice doing so, and to have a record of things I've done.

Recently I had an interview with Google. At Google they try to weed out applicants with a quick technical phone interview before they fly you out for an actual interview. As part of this interview, a shared Google Document is set up as a kind of virtual white-board so you can show your programming ability. Never mind that I would rather be sharing a Screen session, VNC connection, or even a live screen cast of my desktop; this is the way they decided to do it. The problem is, Google Documents are not meant to be a code editor, they are meant to be a tool for writing documents. As such, I would be left without the aid of auto-completion, argument lists, and (most importantly) the Emacs key bindings that, by now, are etched into my spinal column. So, a few minutes after receiving an email stating a Google Document would be used, I started coding up a program that would allow me to program in Emacs, but have my work displayed in Google Documents. See the video for an idea of what I'm talking about.


I quickly found the hooks: before-change-functions and after-change-functions. Strictly speaking, they are not normal hooks, whose functions take no arguments; Emacs calls them abnormal hooks, and they are simply lists of functions that are called with arguments before and after each change, respectively. These would allow me to find what changes are being made locally so I can send them to Google Docs.
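To make the argument passing concrete, here is a minimal sketch that only records what the two hooks receive. The names my-change-log, my-log-before-change and my-log-after-change are made up for this illustration and are not part of the mirroring code.

```elisp
(defvar my-change-log nil
  "Hypothetical list of recorded change events, newest first.")

(defun my-log-before-change (start end)
  "Record the region START..END that is about to be modified."
  (push (list 'before start end
              (buffer-substring-no-properties start end))
        my-change-log))

(defun my-log-after-change (start end old-length)
  "Record the new text in START..END and the length of the text it replaced."
  (push (list 'after start end old-length
              (buffer-substring-no-properties start end))
        my-change-log))

(with-temp-buffer
  ;; The final t makes each hook buffer-local, so other buffers are untouched.
  (add-hook 'before-change-functions #'my-log-before-change nil t)
  (add-hook 'after-change-functions #'my-log-after-change nil t)
  (insert "abc")          ; one before/after pair for the insertion
  (delete-region 2 3))    ; one before/after pair for deleting "b"
```

The before function sees the old text in the region, while the after function sees the new text plus the length of what it replaced. That is all the information the mirroring needs.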

Now came one hurdle: how do I interface with Google Docs? There is a Greasemonkey and Python script that allows you to download Google Documents from the command line. Unfortunately, I could not find a script made for uploading documents, which is really the main part of this. Further, while the script worked for downloading, it was far too slow for interactive editing. The time from hitting return to retrieved document was on the order of 5-10 seconds. I needed something that could run extremely fast, on the order of the time it takes to type a key.

I eventually decided that I should try to mirror the edits between the two documents by sending keystrokes to the browser window. Next hurdle: in X, as far as I know, keystrokes are only sent to the focused window. And since I would be using Emacs, I couldn't rightly send keystrokes to Firefox at the same time. Or could I? I quickly decided to run a separate X server which would run an instance of Firefox which could be focused at the same time as Emacs on my normal X server. Now there are several ways to do this (e.g. actually running a new server on a different virtual terminal, or playing around with Xnest, which runs an X server inside a window in your host X server) but in the end I decided to use the awesome programs xpra and Xvfb. Xpra is a program designed to be an alternative to X forwarding.


Xpra and Xvfb

As an aside, X forwarding allows you to run graphical applications on remote machines while having the graphics display locally. This is different from VNC and RDP in that you don't have access to the entire X session, just windows associated with applications you are running. X forwarding has become less used in the Unix world, mostly due to a lack of knowledge of its existence, I believe. This is not to say that there aren't good reasons for it not to be used. It actually has a pretty severe problem on high latency (a.k.a. high ping time) network connections.

Xpra was also designed to get around X forwarding's poor performance, as well as to add abilities like detaching GUI programs and re-attaching from a different point on the network, just like with screen. It truly is very neat. All of its neatness aside, I just wanted to use it to handle Xvfb, which is a virtual X server, like Xnest, but one that doesn't have to be displayed at all. You can start programs under an Xvfb server and they will happily do their business thinking they are displaying windows, but no window is drawn to any screen. Xvfb sounded like just what I needed. First I started the Xvfb server on display ":1" using xpra:

xpra start :1

I started Firefox under the Xvfb server via:

DISPLAY=:1 firefox -no-remote -P some-other-profile

This is more complicated than normal because I leave Firefox running. The option -no-remote tells Firefox to not check for other, currently running instances of Firefox before starting. If Firefox sees a running instance of itself, it will choose to just open a new window rather than start a new instance. The option -P tells Firefox to use a different profile (which you must create prior to this) for the session. This is necessary as a different running instance of Firefox will have a lock held on your normal profile. If you are willing to just close Firefox, you can omit both of those options.

At this point, nothing should be displayed. If you want to attach the Firefox window to the current X server via Xpra, just use xpra attach and it should appear. You can navigate this to your Google Document.


Sending input to the window

Up until now, we have not even discussed how you are going to send input to the window. Luckily, this can be accomplished by many tools. In fact, I ran into three without even looking really (Xnee which includes gnee and cnee, xte, and xdotool). I have even heard people describing directly inserting characters into the input devices in /dev. I ended up using xdotool because (1) it seemed more powerful than xte and (2) cnee crashed my (main ":0") X server. With xdotool you can send keys (with modifiers), strings, mouse motion and clicks, and manipulate windows, like bringing them into focus.

# Send an 'a', a 'B', and a 'C'
DISPLAY=:1 xdotool key a B shift+c

# This types "hello how are you"
DISPLAY=:1 xdotool type "hello how are you"

# This moves the mouse to 100,100 relative to the top-left corner of the window,
# then right click (presses and releases the 3rd button)
DISPLAY=:1 xdotool mousemove --window 41466948 100 100 click 3

# This brings window with id 41466948 into focus
DISPLAY=:1 xdotool windowfocus 41466948

This is fine and dandy, but what if we want to input large sections of text? Even if we use the type command, xdotool is still just typing every letter, albeit at ~100 characters/second. Further, due to a bug/shortcoming/whatever in xdotool, I couldn't figure out how to make it send text that includes a newline character (or an ampersand or question mark for that matter). It seems worth our while to come up with other ways of interfacing with the Google Documents window. The other way I came up with is the clipboard.
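Part of the trouble with such characters may come from the shell rather than from xdotool, so one thing worth trying is Emacs's built-in shell-quote-argument. The sketch below only builds the command string. The name my-gd-type-command is made up for this illustration, and I have not checked whether xdotool then types every such character correctly.

```elisp
(defun my-gd-type-command (text)
  "Return a shell command string asking xdotool on display :1 to type TEXT.
TEXT is quoted with `shell-quote-argument', so characters such as
`&', `?', spaces and newlines reach xdotool as one literal argument.
Caveat (assumption): TEXT starting with `-' may still be read as an
option by xdotool."
  (concat "DISPLAY=:1 xdotool type " (shell-quote-argument text)))

;; Build (but do not run) a command for a string the shell would
;; otherwise split at the `&&'.
(my-gd-type-command "x && y")
```

The resulting string can be handed to shell-command in place of the hand-escaped version.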

X has a clipboard (actually it has three), and that clipboard can be accessed via the command line. Again, there are multiple tools that can do this, including xclip and xsel. While the functionality of the two programs is nearly identical, I chose to use xsel as it seemed a little more robust. Using xsel, we can send large sections of text to the clipboard on display ":1" (remember, each X server has its own clipboard), then send a "ctrl+v" using xdotool. More interesting, however, is that copying in the ":1" server allows us to gain some information on what is in the other file and, more importantly as it turns out, the location of the point in the Google Document.
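The paste route can be sketched as two steps: feed the text to xsel on its standard input, then send ctrl+v with xdotool. This is only a sketch under assumptions. The names my-gd-display, my-gd-window, my-gd-paste-plan and my-gd-run-plan are made up here, and the window id is the one used in the listings further down.

```elisp
(defvar my-gd-display ":1"
  "Assumed display name of the hidden X server.")

(defvar my-gd-window "4194333"
  "Assumed id of the browser window on that display.")

(defun my-gd-paste-plan (text)
  "Return the steps needed to paste TEXT, as (PROGRAM STDIN ARG...) lists."
  (list
   ;; Step 1: put TEXT on the CLIPBOARD selection of the hidden display.
   (list "xsel" text "--display" my-gd-display "-i" "-b")
   ;; Step 2: focus the browser window and press ctrl+v in it.
   (list "xdotool" nil "windowfocus" my-gd-window "key" "ctrl+v")))

(defun my-gd-run-plan (plan)
  "Run each step of PLAN without a shell, so no escaping is needed."
  (let ((process-environment
         (cons (concat "DISPLAY=" my-gd-display) process-environment)))
    (dolist (step plan)
      (let ((program (car step))
            (stdin (cadr step))
            (args (cddr step)))
        (with-temp-buffer
          (when stdin (insert stdin))
          ;; The temporary buffer's contents become the program's stdin.
          (apply #'call-process-region (point-min) (point-max)
                 program nil nil nil args))))))
```

Calling (my-gd-run-plan (my-gd-paste-plan "some text")) would then perform the paste. Keeping the plan separate from its execution also makes it easy to inspect what would be run.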


Putting it together

Most of this involves coding in Emacs Lisp. The point of everything here is to keep track of where the point is in the Google Document while making alterations. If we ever lose track of where the point is in the document, we will corrupt the document with every edit. In the following, I use the word "document" to refer to the Google document and "document point" to refer to the cursor location in that document. The word "buffer" refers to things in Emacs, and "buffer point" refers to the point in the Emacs buffer, i.e. the cursor position. I use the word function here because that's what Emacs calls them. In reality these are all procedures called for their side effects.

We will need some basics:

  1. A function that will send a key to the window
  2. A function that will send a string to the window
  3. A function to properly escape strings sent through the shell
  4. Functions that move left and right
  5. A function that moves the point by some relative amount

    ;; `loop', `incf', `decf' and `rest' come from the CL compatibility package.
    (require 'cl)

    (defvar gd-point 0 "A variable holding the document point")
    
    (defun gdmir-gdocify-string (str)
      "Escape every character.  This means that the shell shouldn't
    mess with any of this."
      (replace-regexp-in-string "\\(.\\)" "\\\\\\1" str))
    
    ;; The parameter INSTANT is used in the buffered input optimization, but not
    ;; here.
    (defun gdmir-send-key (keys &optional instant)
      (shell-command (concat "DISPLAY=:1 xdotool windowFocus 4194333 key "
                             keys )))
    
    (defun gdmir-send-string (string)
      (shell-command (concat "DISPLAY=:1 xdotool windowFocus 4194333 type "
                             string )))
    
    (defun gdmir-move-left (n)
      "Move the document point N characters left, reducing GD-POINT by N.
    The buffer point doesn't change."
      (loop repeat n
            do (gdmir-send-key "Left")
               (decf gd-point)) )
    
    (defun gdmir-move-right (n)
      "Move the document point N characters right, increasing GD-POINT by N.
    The buffer point doesn't change."
      (loop repeat n
            do (gdmir-send-key "Right")
               (incf gd-point) ))
    
    (defun gdmir-move-to-relative-point (point-difference)
      "Move the document point to \(+ GD-POINT POINT-DIFFERENCE).  The
    buffer point doesn't change."
      (if (< point-difference 0)
          (gdmir-move-left (abs point-difference))
          (gdmir-move-right (abs point-difference)) ))
    

Now for the meat and potatoes: the hook functions. We will define a function that sets the after and before change functions. The before function, gdmir-before-edit, which receives the start and end point of the forthcoming edit, moves the document point to the end of the edit range, then saves the original string, and finally deletes the original text [1]. The after function, gdmir-after-edit, sends the commands required to type the new text.

(defun split-for-xdotool (string)
  "This function runs through the input and replaces instances of
`?', `&', and newlines with symbols corresponding to these
characters.  We return a list containing strings and the special
symbols in the order they should be output.  We have to use the
special symbols due to the fact that xdotool, or bash, or
something cannot handle these symbols, even if they are quoted."
  (rest
   (let ((count 0))
     (loop for segment in (split-string string "[\n?&]")
           appending
        (list (when (> count 0)
                (cond ((eql (aref string (- count 1)) (aref "?" 0))
                       'question )
                      ((eql (aref string (- count 1)) (aref "&" 0)) 'amp)
                      ((eql (aref string (- count 1)) (aref "\n" 0))
                       'ret )))
              segment )
           do (incf count (1+ (length segment))) ))))
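
To show the shape of the list this produces, here is a small stand-alone equivalent written as a plain character loop. The name my-split-special is made up for this illustration. It is meant to return the same list as split-for-xdotool for the same input.

```elisp
(defun my-split-special (string)
  "Split STRING into text chunks and the symbols `question', `amp', `ret'.
Chunks and symbols are returned in the order they should be sent."
  (let ((specials '((?? . question) (?& . amp) (?\n . ret)))
        (chunk "")
        (out '()))
    (dolist (ch (string-to-list string))
      (let ((sym (cdr (assq ch specials))))
        (if sym
            ;; Close the current chunk, then emit the special symbol.
            (progn (push chunk out)
                   (push sym out)
                   (setq chunk ""))
          (setq chunk (concat chunk (string ch))))))
    (push chunk out)
    (nreverse out)))

(my-split-special "a?b\nc")   ; => ("a" question "b" ret "c")
```

Empty chunks are kept, so a string ending in "&" yields an empty string as its last element, which the caller can skip.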

(defvar *prechange-text* nil "This holds the contents of the
edited region prior to the edit.  We don't use this, but I can
think of a few reasons we might want to save it." )

(defun gdmir-before-edit (start end)
  (gdmir-move-to-relative-point (- end gd-point))
  (setf *prechange-text* (list start end (buffer-substring start end)))
  ;; Delete the string in GD before in Emacs
  (loop for i below (- end start)
        do (decf gd-point)
           (gdmir-send-key "BackSpace") ))

(defun gdmir-after-edit (start end old-length)
  (when (< 0 (- end start))
    (loop for line in (split-for-xdotool (buffer-substring start end))
          do (cond ((eql line 'question)
                    (gdmir-send-key "shift+slash") )
                   ((eql line 'amp)
                    (gdmir-send-key "shift+7") )
                   ((eql line 'ret)
                    (gdmir-send-key "Return") )
                   ((< 0 (length line))
                    (gdmir-send-string (gdmir-gdocify-string line)) )))
    (incf gd-point (length (buffer-substring start end))) ))

(defun insert-change-hook ()
  (setf *change-hook-in-effect* t)
  (push
   'gdmir-before-edit
   before-change-functions )
  (push
   'gdmir-after-edit
   after-change-functions ))

We also can use a few helper functions which we will implement as key-bindings.

  1. A function that syncs the points in the buffers (C-x SPC when mirroring is on)
  2. A function that turns the mirroring on and off (C-x SPC when off will turn
    it on, if it's on and the point is already synced, it will be turned off)
  3. A set of functions that moves the point in the browser window (M-<direction
    keys>). These are really handy for a quick reposition of the document
    point.

    (defun sync-points ()
      "Set GD-POINT to the buffer point."
      (setf gd-point (point)) )
    
    (defvar *change-hook-in-effect* nil "Used to determine if our
    change hooks are already doing their thing.")
    
    (defun setup-mirror ()
      "Set up the mirroring environment.  This should really be a
    minor mode, but since this is basically a throw away hack, I'm
    not going to bother."
      (setf old-after-change-functions after-change-functions
            old-before-change-functions before-change-functions )
      (local-set-key (kbd "M-<left>")
                     (lambda () (interactive) (gdmir-send-key "Left" t)) )
      (local-set-key (kbd "M-<right>")
                     (lambda () (interactive) (gdmir-send-key "Right" t)) )
      (local-set-key (kbd "M-<up>")
                     (lambda () (interactive) (gdmir-send-key "Up" t)) )
      (local-set-key (kbd "M-<down>")
                     (lambda () (interactive) (gdmir-send-key "Down" t)) )
      (local-set-key (kbd "C-x SPC")
                     (lambda ()
                       (interactive)
                       (cond ((and after-change-functions
                                   *change-hook-in-effect*
                                   (= (point) gd-point) )
                              (message "Mirroring disabled")
                              (setf *change-hook-in-effect* nil
                                    after-change-functions old-after-change-functions
                                    before-change-functions old-before-change-functions ))
                             ((and after-change-functions
                                   *change-hook-in-effect* )
                              (message "Syncing points")
                              (sync-points) )
                             (t
                              (message "Mirroring enabled")
                              (setf *change-hook-in-effect* t)
                              (insert-change-hook)
                              (sync-points) )))))
    

Optimizations

While all of this works, it works quite slowly. Just moving from one point to another in the document can take a considerable time as it involves moving side to side perhaps thousands of times. Here we consider two areas where there can be considerable improvements.


Incorporating other moves than side to side

The first optimization I made was to allow for motion using the up and down keys. This is difficult as we don't know how the lines are wrapped in the Google Doc, so we don't know how far that moves in the file. This can be attacked by noting that if we hold down shift as we move, the region will be selected. We can copy that to the clipboard [2] and read it in Emacs, allowing us to find the length of the string selected, and therefore the motion in the document. This almost works. However, it turns out that whitespace at the beginning and end of a selection is often elided when copied to the clipboard (not sure why). Another issue: if you make a selection of only whitespace, it may not be copied to the clipboard at all, leaving what was on there before.

This means that we may lose track of our document point if we start or end on whitespace in a vertical move. The way I chose to deal with this is to place a vertical bar, "|", at the beginning of each line. To move vertically, I move the point to the beginning of the line (before the vertical bar) and then move up or down, selecting the difference. This method ensures that the selection includes no end whitespace (other than the newline at the end). The newline at the end of the selection is deleted, but it is compensated for by the vertical bar. As long as we only move one line at a time, we can tell how far we've gone in the vertical direction.

To add this change, we must modify the gdmir-move-left and gdmir-move-right functions to take two steps when we pass a newline in order to skip over the vertical bar in the left hand column. We must also have our change hooks check to see if we are deleting a newline (we must delete an extra character) and if we are inserting a newline (we must add a new vertical bar).
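The bookkeeping behind those extra steps is simple enough to state as a function: crossing a stretch of buffer text costs one key press per character, plus one more per newline for the vertical bar that follows it in the document. A minimal sketch, where the name my-gd-keypresses is made up for this illustration:

```elisp
(require 'cl-lib)

(defun my-gd-keypresses (text)
  "Return how many Left or Right presses cross TEXT in the document.
Each character costs one press, and each newline costs one extra
press to step over the vertical bar that follows it."
  (+ (length text) (cl-count ?\n text)))

;; Five buffer characters, one of them a newline:
(my-gd-keypresses "ab\ncd")   ; => 6
```

The same count applies to BackSpace presses when deleting, which is why the before-change function sends an extra BackSpace at column zero.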

(defun gdmir-move-left (n)
  (save-excursion
    (goto-char gd-point)
    (loop repeat n
          do (when (= 0 (current-column))
               (gdmir-send-key "Left") )
             (gdmir-send-key "Left")
             (decf gd-point)
             (backward-char) )))

(defun gdmir-move-right (n)
  (save-excursion
    (goto-char gd-point)
    (loop repeat n
          do (gdmir-send-key "Right")
             (incf gd-point)
             (forward-char)
             (when (= 0 (current-column))
               (gdmir-send-key "Right") ))))

(defun gdmir-before-edit (start end)
  (gdmir-move-to-relative-point (- end gd-point))
  (setf *prechange-text* (list start end (buffer-substring start end)))
  ;; Delete the string in GD before in Emacs
  (save-excursion
    (goto-char gd-point)
    (loop for i below (- end start)
          do (progn
               (when (= 0 (current-column))
                 ;; Clear out code marker
                 (gdmir-send-key "BackSpace") )
               (decf gd-point)
               (gdmir-send-key "BackSpace")
               (backward-char) ))))

(defun gdmir-after-edit (start end old-length)
  (when (< 0 (- end start))
    (loop for line in (split-for-xdotool (buffer-substring start end))
          do (cond ((eql line 'question)
                    (gdmir-send-key "shift+slash") )
                   ((eql line 'amp)
                    (gdmir-send-key "shift+7") )
                   ((eql line 'ret)
                    (gdmir-send-key "Return")
                    (gdmir-send-string "\\|") )
                   ((< 0 (length line))
                    (gdmir-send-string (gdmir-gdocify-string line)) )))
    (incf gd-point (length (buffer-substring start end))) ))

(defun insert-change-hook ()
  (setf *change-hook-in-effect* t)
  (push
   'gdmir-before-edit
   before-change-functions )
  (push
   'gdmir-after-edit
   after-change-functions ))

Then we can introduce gdmir-move-up, gdmir-move-down, and extend gdmir-move-to-relative-point so it makes use of this new ability.

;; *EMS-PER-LINE* will be used to guess at when a line might be wrapped.  If we
;; over-estimate, vertical moves will be broken
(defvar *ems-per-line* 50
  "A conservative estimate for how many `m's are in a line in the google document." )

(defun gdmir-move-to-zero-column ()
  (save-excursion
   (goto-char gd-point)
   (gdmir-move-left (max (- (current-column) (- *ems-per-line* 1)) 0))
   (goto-char (- gd-point (max (- (current-column) (- *ems-per-line* 1)) 0)))
   (cond ((< (current-column) *ems-per-line*)
          (gdmir-send-key "Home Right" t)
          (setf gd-point (line-beginning-position)) )
         (t (gdmir-move-left
             (current-column) )))))

(defun gdmir-grab-selection ()
  "To read from the clipboard"
  (gdmir-send-key "ctrl+c" t)
  (shell-command-to-string "xsel --display :1 -o -b") )

(defun gdmir-move-up ()
  (save-excursion
    (goto-char gd-point)
    ;; Move to the leftmost position on the screen
    (gdmir-move-to-zero-column)
    (goto-char gd-point)
    ;; Move to the left of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Left")
    ;; Move up, selecting the difference
    (let ((orig-line (line-number-at-pos)))
      (loop until (/= (line-number-at-pos) orig-line)
            do (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key shift+Up")
               (let ((selection (gdmir-grab-selection)))
                 (backward-char (length selection)) )
               ;; This doesn't move the point, it just moves to the left of the
               ;; selection and unselects the text.
               (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Left") ))
    (setf gd-point (point))
    ;; Move to the right of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Right") ))

(defun gdmir-move-down ()
  (save-excursion
    (goto-char gd-point)
    ;; Move to the leftmost position on the screen
    (gdmir-move-to-zero-column)
    (goto-char gd-point)
    ;; Move to the left of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Left")
    ;; Move down, selecting the difference
    (let ((orig-line (line-number-at-pos)))
      (loop until (/= (line-number-at-pos) orig-line)
            do (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key shift+Down")
               (let ((selection (gdmir-grab-selection)))
                 (forward-char (length selection)) )
               ;; This doesn't move the point, it just moves to the right of the
               ;; selection and unselects the text.
               (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Right") ))
    (setf gd-point (point))
    ;; Move to the right of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Right") ))

(defun gdmir-move-to-relative-point (point-difference)
  (let ((target (+ gd-point point-difference)))
    ;; 80 is a guess at how far we have to be going before moving vertically
    ;; might be of benefit
    (cond ((< point-difference -80)
           ;; Go until you have passed the point.
           (while (> gd-point target)
             (gdmir-move-up) ))
          ((> point-difference 80)
           ;; Go until you have passed the point.
           (while (< gd-point target)
             (gdmir-move-down) )))
    ;; Then we will finish off with side to side movement
    (let ((point-difference (- target gd-point)))
      (if (< point-difference 0)
          (gdmir-move-left (abs point-difference))
          (gdmir-move-right (abs point-difference)) ))))

The motion to the beginning of the line is still annoying, but as you can see we sped that up slightly by using the "home" key. This moves the point to the beginning of the line. You don't even have to select in this move as we can calculate the point at the beginning of the line in Emacs. There is one problem with this, however: since lines can wrap, and pressing "home" in Google Docs brings you to the beginning of the visible line, this cannot be used if you are on a wrapped line. I have the program move horizontally until it is within a conservative estimate of the maximum line width, then use the "home" key. You might think it a good idea to use the "shift+home" key combo repeatedly, interspersed with "left" key presses, counting how far we've gone, but the same issue remains: if we happen to start on some whitespace, we may miss some characters at the end of the line.


Buffering Commands

Buffering the input commands speeds things up in general because it removes some overhead from the input sending process. The way we have it set up right now is to have each keystroke start a bash shell and run the xdotool command to send input to the document. Just starting bash contributes a pretty big overhead. By buffering the commands, we can collect several commands and send them all at once, spreading that overhead over many keys. All in all this seems to speed up input by a factor of two or so. This is a pretty significant improvement [3]. In fact, this beats the above method of motion for all but very long yet unwrapped lines.

In order to incorporate this kind of buffering from within Emacs, we can just hold a list of pending commands that need to be sent to the document. We will modify the gdmir-send-key function to push onto that list rather than actually send the key via xdotool. If the buffer gets to a certain size, we flush the commands to the document. In addition, since sending a string via xdotool is the last thing one can do in an invocation of xdotool, we also must force a flush of all stored commands before we send the string.

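(defvar *pending-keys* nil
  "Keys waiting to be sent to the document, most recent first.")

(defvar *last-flush* nil
  "Time of the most recent flush, as returned by `float-time'.")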

(defun flush-commands (&optional string)
  (let* ((cmd (if *pending-keys*
                  (apply #'concat "key " (mapcar (lambda (x) (concat " " x)) (reverse *pending-keys*)))
                  " " ))
         (cmd (if string
                  (concat cmd " type " string)
                  cmd )))
    (when (or *pending-keys* string)
      (shell-command (concat "DISPLAY=:1 xdotool windowFocus 4194333 " cmd)) )
    (setf *pending-keys* nil)
    (setf *last-flush* (float-time)) ))

(defun gdmir-send-key (keys &optional instant)
  (push keys *pending-keys*)
  (when (or instant (< 20 (length *pending-keys*)))
    (flush-commands) ))

(defun gdmir-send-string (string)
  (flush-commands string) )

In addition to all of this, we will also need to alter several other functions to force flushing as some of them require synchronous execution, like anything that involves the clipboard.

(defun gdmir-move-to-zero-column ()
  (flush-commands)
  (save-excursion
   (goto-char gd-point)
   (gdmir-move-left (max (- (current-column) (- *ems-per-line* 1)) 0))
   (goto-char (- gd-point (max (- (current-column) (- *ems-per-line* 1)) 0)))
   (cond ((< (current-column) *ems-per-line*)
          (gdmir-send-key "Home Right" t)
          (setf gd-point (line-beginning-position)) )
         (t (gdmir-move-left
             (current-column) )))))

(defun gdmir-move-up ()
  (flush-commands)
  (save-excursion
    (goto-char gd-point)
    ;; Move to the leftmost position on the screen
    (gdmir-move-to-zero-column)
    (goto-char gd-point)
    ;; Move to the left of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Left")
    ;; Move up, selecting the difference
    (let ((orig-line (line-number-at-pos)))
      (loop until (/= (line-number-at-pos) orig-line)
            do (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key shift+Up")
               (let ((selection (gdmir-grab-selection)))
                 (backward-char (length selection)) )
               ;; This doesn't move the point, it just moves to the left of the
               ;; selection and unselects the text.
               (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Left") ))
    (setf gd-point (point))
    ;; Move to the right of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Right") ))

(defun gdmir-move-down ()
  (flush-commands)
  (save-excursion
    (goto-char gd-point)
    ;; Move to the leftmost position on the screen
    (gdmir-move-to-zero-column)
    (goto-char gd-point)
    ;; Move to the left of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Left")
    ;; Move down, selecting the difference
    (let ((orig-line (line-number-at-pos)))
      (loop until (/= (line-number-at-pos) orig-line)
            do (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key shift+Down")
               (let ((selection (gdmir-grab-selection)))
                 (forward-char (length selection)) )
               ;; This doesn't move the point, it just moves to the right of the
               ;; selection and unselects the text.
               (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Right") ))
    (setf gd-point (point))
    ;; Move to the right of the code delimiter
    (shell-command "DISPLAY=:1 xdotool windowFocus 4194333 key Right") ))

(defun gdmir-before-edit (start end)
  (gdmir-move-to-relative-point (- end gd-point))
  (setf *prechange-text* (list start end (buffer-substring start end)))
  ;; Delete the string in GD before in Emacs
  (save-excursion
    (goto-char gd-point)
    (loop for i below (- end start)
          do (progn
               (when (= 0 (current-column))
                 ;; Clear out code marker
                 (gdmir-send-key "BackSpace") )
               (decf gd-point)
               (gdmir-send-key "BackSpace")
               (backward-char) ))))

(defun gdmir-after-edit (start end old-length)
  (when (< 0 (- end start))
    (loop for line in (split-for-xdotool (buffer-substring start end))
          do (cond ((eql line 'question)
                    (gdmir-send-key "shift+slash") )
                   ((eql line 'amp)
                    (gdmir-send-key "shift+7") )
                   ((eql line 'ret)
                    (gdmir-send-key "Return")
                    (gdmir-send-string "\\|") )
                   ((< 0 (length line))
                    (gdmir-send-string (gdmir-gdocify-string line)) )))
    (incf gd-point (length (buffer-substring start end))) )
  (flush-commands) )

(defun insert-change-hook ()
  (setf *change-hook-in-effect* t)
  (push
   'gdmir-before-edit
   before-change-functions )
  (push
   'gdmir-after-edit
   after-change-functions ))

And lastly we probably want to set an idle timer to run flush-commands when nothing is happening. This ensures that no changes sit indefinitely in the key buffer. After a few idle seconds (four here), any pending keys are sent to the document.

(setf *idle-flusher*
  (run-with-idle-timer 4 t (lambda () (flush-commands))) )

Two other ideas for speeding things up come to mind, though I did not attempt them for lack of free time.

  1. Asynchronous Input: Buffering commands is also a good idea for a completely
    different reason if we can buffer them outside of Emacs. Emacs, as lovely
    as it is, doesn't allow for multiple threads. This means that we have to
    wait as the commands are processed. If we are able to buffer commands to a
    different process, we could have concurrent execution. We only need to
    synchronize the input with Emacs when we want to get the contents of the
    Google Doc, i.e. when we are examining the clipboard.
  2. Command optimization: The commands can also be optimized. For instance,
    when you yank to an Emacs buffer (a.k.a. paste), Emacs apparently writes it,
    then deletes it, then writes it again. I'm not sure why this happens, nor
    did I ever notice until the mechanism of writing and deleting was slowed
    down by a factor of 1000 or so. In principle, before the buffer is flushed
    to output, it could pass the contents through an optimizer, that would
    reduce this particular delete-redraw cycle to no action.
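
The second idea can be sketched as a small peephole pass over the pending keys before they are flushed. This is only an illustration and not something I ran against the document. The name my-optimize-keys is made up, and it deliberately treats only single-character key names as cancellable, so motion keys and Return are left alone.

```elisp
(defun my-optimize-keys (keys)
  "Return KEYS, given in send order, with type-then-delete pairs removed.
A BackSpace cancels the key before it only when that key is a
single-character key name such as \"a\", i.e. one that typed a character."
  (let ((stack '()))
    (dolist (key keys)
      (if (and (string= key "BackSpace")
               stack
               (= (length (car stack)) 1))
          ;; The previous key typed one character; drop both keys.
          (pop stack)
        (push key stack)))
    (nreverse stack)))

(my-optimize-keys '("a" "b" "BackSpace" "BackSpace" "c"))   ; => ("c")
```

Since *pending-keys* is kept newest first, it would have to be reversed before being passed in, just as flush-commands already does before building the xdotool command.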

Conclusions

Let me take a moment to relay a few observations I made while developing this interface with a "cloud" program. First, this was very hard and convoluted. I don't think that Google has any interest in stopping me from doing this, but at times I actually thought there was an antagonistic entity on the other end thwarting my attempts. If you copy and paste from a Google Doc, whitespace might get eaten (I think this is happening due to X), or indentation might get messed up (this is almost certainly happening on the Google side). I had weird cases where more than four spaces in a row were converted to tabs, and bizarre rules on what and how much whitespace was eaten at the beginning and end of the selection. You saw how we dealt with this in the code (adding a vertical bar before each line), but it is just the nature of the beast that this is a fragile setup. It would really be nice if Google made a true virtual white-board, or used an existing one (they must exist). They could put a public API on it and anyone trying to do this would be giddy.

Second, because it's a cloud application, Google can upgrade and make incompatible changes at any time. The compatibility doesn't matter much to humans as they are adaptable, but to a program, it really throws a wrench in the works. For instance, I swear that when I first started developing, copying a block of text and pasting it in a Google Doc preserved the indentation. A few days later I tried it and this was no longer the case. It seemed like a bug that would bother people, so I wouldn't be surprised if it is fixed by now. Many people are excited by how fast an application can move when the developer decides the upgrade schedule. I guess I often prefer the stability of getting to decide that myself.

In the end this worked well enough to use, which is the most important thing. It is pretty basic functionality, but good enough to really help when programming in Google Docs (see footnote 4). I wouldn't be surprised at all, though, if I tried this in a few weeks and found that it didn't work anymore. In the very end, however, it went unused, as the phone interview consisted only of pen-and-paper questions. No need to despair, though: the problem was a fun one to tackle. It included interesting things like pushing the X windowing system further than most users ever do, really exploiting the extensibility of Emacs, and playing with the flexibility of the GNU/Linux operating system in general. A few years ago I would have thought this too hard to actually do, and over a decade ago in Windows I would have thought this impossible (this might be possible in OS X; I imagine fiddling with the clipboard and sending keys might be the hard bits). All in all, this was a fun little project, at least during some of the development.

Footnotes:

1 There is actually a bug here. The change region that Emacs reports before a
change does not always match the region it reports after the change. For
instance, try M-x capitalize-word: the word gets deleted, but only its first
letter is written back. I'm not sure if this is my misunderstanding of the
arguments Emacs sends, or if it is a bug in Emacs itself. Luckily it seems to
be very rare.

2 Yeah, I know, if it's selected it is already on the selection clipboard.
I use the ctrl+c/ctrl+v one as it gives a bit more control.

3 To go further, one could attempt to reduce the xdotool delay
between key presses (which defaults to 12 ms). I didn't attempt this as I
assumed that it might reduce the robustness of the method (i.e. keystrokes might
be missed).

4 I originally had visions of actually mirroring the buffers.
I.e. you tell Emacs which buffers you want displayed in the document and it will
grab them, insert the code marker, perhaps a simple frame marking the buffer you
are in, and send it to the clipboard to be pasted on the document. I actually
had Emacs sending entire buffers at one point. After I started hitting more and
more issues with clipboards and how Google Docs messes with indentation, or
whitespace, or other things, I decided that this was not worth the effort.

7 comments:

  1. Excellent post, Zach. I hope you got the job!

  2. Isn't the purpose of coding in a Google Doc to eliminate the 'aid of auto-completion, argument lists, and (most importantly) the Emacs key bindings'?

    1. Interesting thought, but I doubt it (or I would hope not). I think that the point of Google Docs was to provide a platform-independent, surefire way to communicate text during the interview.

      There is a chance that some might see the capabilities of a good IDE as an unfair advantage because candidates are using more than what is in their heads. I think this is akin to saying that we should interview race car drivers by measuring how fast they can run the 50m. Presumably, the interview should allow the interviewer to test the capability of the interviewee to do the job that they would be doing at the company. If you are interviewing for a job at a Java shop, I don't think it is out of line to see if the candidate knows how to use Eclipse. I am reminded of this: https://blogs.msdn.com/b/ericlippert/archive/2011/02/14/what-would-feynman-do.aspx particularly the end:

      Interviewer: Well I think that concludes this portion of the interview. Before we let you go for the day do you have any questions for me about this company, this team or the job?

      RPF: Yes. When you build software algorithms, do you build systems using well-established software engineering principles to produce software that conforms to industry standards and practices?

      Interviewer: Of course.

      RPF: And do you use software analysis tools, like profilers, debuggers, theorem provers, and so on, to facilitate detection and diagnosis of flaws?

      Interviewer: Yes, again, of course we do.

      RPF: Then why would you ask an interview question that tests my willingness to abandon industry-standard, well-established techniques...

  3. Fantastic!! I'm actually in a similar situation: I have an interview in 10 days which is going to use google docs for whiteboarding. I love emacs, but I'm a rank novice at Elisp. Could you host your code on github or something (and maybe a few instructions to setup the stuff)? That'll be of great help. Thanks! Regardless of the interview, I'd be thrilled to bits to have my emacs configured like you did in the video. Thanks!

    1. Ok, I will, but you should note that I tried it today and it doesn't work... Well it works almost as well as it did when I wrote it (which was already not working very well). I have not touched this since I wrote the original post. Remember that my conclusion from this experiment was that this sort of thing is a bad idea and likely cannot be made to work, as changes that Google makes to their JavaScript will break it in subtle ways without warning.

      You should probably not use this code for anything, especially anything important. There are correct ways to do this and this approach isn't one of them (i.e. don't use Google Docs). Also, this was my first foray into Emacs Lisp... don't assume that this is well written by any stretch.

      All said, here is a Gist: https://gist.github.com/smithzvk/8362474

  4. Thanks a lot Zach. I shall diligently follow your advice on not using this for anything important.

  5. What a great post! In my field (astronomy) colleagues are turning from Dropbox to Google Docs for writing e.g., proposals jointly - but having emacs typing habits ingrained on my backbone also (for example hitting ESC q all the time for paragraph reformat) I hate it. The setup you describe is far beyond my ability - but why doesn't Google itself offer an emacs typist setting for its Docs? If you got the job, make them do so?
