Using flymake to check erb templates

November 5, 2010 – 9:19 pm

Live syntax checking code in Emacs with flymake is extremely
useful. It’s quite easy to use for syntax checking scripting
languages or for running code analysis tools in the
background. Flymake’s initial goal, however, was syntax checking
compiled languages like C by running a custom make target. The
flexibility needed to make all of this work makes configuring
flymake quite involved.

However, for syntax checking a scripting language, you can usually copy
a recipe off the EmacsWiki and be done with it. There’s also a
recipe for ruby, but there was nothing to syntax check ERB
templates with embedded ruby code.

This is what an ERB template usually looks like:

<h1>My awesome blog!</h1>
<% @posts.each do |post| %>
  <div class="post">
    <h2><%= post.title %></h2>
    <%= post.body %>
  </div>
<% end %>

Configuring flymake consists of these steps:

  1. Write a flymake init function that takes the contents of the
    current buffer, writes it to a temporary file, sets
    flymake-temp-source-file-name and returns a list with a program to
    call and the arguments to pass to it. This program should return
    with a non-zero exit code in case of errors or warnings.
  2. Tell flymake to use this function for a certain type of file by
    adding an entry to flymake-allowed-file-name-masks.
  3. You might need to add an entry to flymake-err-line-patterns
    with a regex that matches the output of your syntax checker.
  4. Turn on flymake-mode.

For making flymake work with ERB the trick is to get the init
function right. Instead of just copying the buffer we have to
transform it from ERB to plain ruby. This can be done by calling
the erb command line utility with the -x switch. The lines in the
resulting ruby code correspond to their equivalents in the ERB
template, which is crucial for flymake to highlight errors
correctly. Here is the code:

(defun flymake-erb-init ()
  (let* ((check-buffer (current-buffer))
         (temp-file (flymake-create-temp-inplace (buffer-file-name) "flymake"))
         (local-file (file-relative-name
                      temp-file
                      (file-name-directory buffer-file-name))))
    ;; Convert the whole buffer from ERB to plain ruby with "erb -x"
    ;; and write the result to the temporary file.
    (save-excursion
      (save-restriction
        (widen)
        (with-temp-file temp-file
          (let ((temp-buffer (current-buffer)))
            (set-buffer check-buffer)
            (call-process-region (point-min) (point-max) "erb" nil temp-buffer nil "-x")))))
    (setq flymake-temp-source-file-name temp-file)
    (list "ruby" (list "-c" local-file))))
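The conversion step can also be tried from a shell before wiring it into Emacs. The following is a minimal sketch, assuming erb and ruby are on your PATH; the function name check_erb is made up for this illustration. On newer Ruby versions the generated code may start with an extra coding comment line, in which case the reported line numbers could be off by one.

```shell
# check_erb FILE: convert FILE to plain ruby with "erb -x" and let
# "ruby -c" syntax-check the result without running it.
# The exit status is that of "ruby -c": zero if the syntax is fine.
check_erb() {
    erb -x "$1" | ruby -c
}
```

Calling it as check_erb some-template.erb (a placeholder file name) should print either a syntax confirmation or error messages that start with a file name and a line number, which is the shape the error pattern for flymake has to match.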

You now need to tell flymake to use this function for files with an
extension of .erb:

(push '("\\.erb$" flymake-erb-init) flymake-allowed-file-name-masks)

If you’re still working with files that have a .rhtml extension you
can use this instead:

(push '("\\.\\(erb\\|rhtml\\)$" flymake-erb-init) flymake-allowed-file-name-masks)

Flymake doesn’t come with a pattern for ruby’s error messages, but
you might already have this line if you’ve got syntax
checking for plain ruby files set up:

(push '("^\\(.*\\):\\([0-9]+\\): \\(.*\\)$" 1 2 nil 3) flymake-err-line-patterns)

As you can only add to a list once the library defining it is
loaded you either have to (require 'flymake) first or wrap these
statements in an eval-after-load:

(eval-after-load "flymake"
  '(progn
     (push '("\\.\\(erb\\|rhtml\\)$" flymake-erb-init) flymake-allowed-file-name-masks)
     (push '("^\\(.*\\):\\([0-9]+\\): \\(.*\\)$" 1 2 nil 3) flymake-err-line-patterns)))

Finally you have to make sure flymake-mode is turned on for all .erb
files. If you’re using html-mode for your templates this might
be all you need:

(add-hook 'html-mode-hook 'flymake-mode-on)

However, ERB templates can be used in different places: for
configuration files, YAML fixtures or maybe even CSS. And you might
also have static HTML pages that are not ERB templates and don’t
need to be checked. A possible way to turn on flymake-mode for all
files with an .erb extension would be to turn it on from the
find-file-hook like this:

(defun turn-on-flymake-for-erb-files ()
  (when (string-match "\\.\\(erb\\|rhtml\\)$" (buffer-file-name))
    (flymake-mode 1)))
(add-hook 'find-file-hook 'turn-on-flymake-for-erb-files)

Here’s the complete code for you to copy and paste:

(defun flymake-erb-init ()
  (let* ((check-buffer (current-buffer))
         (temp-file (flymake-create-temp-inplace (buffer-file-name) "flymake"))
         (local-file (file-relative-name
                      temp-file
                      (file-name-directory buffer-file-name))))
    ;; Convert the whole buffer from ERB to plain ruby with "erb -x"
    ;; and write the result to the temporary file.
    (save-excursion
      (save-restriction
        (widen)
        (with-temp-file temp-file
          (let ((temp-buffer (current-buffer)))
            (set-buffer check-buffer)
            (call-process-region (point-min) (point-max) "erb" nil temp-buffer nil "-x")))))
    (setq flymake-temp-source-file-name temp-file)
    (list "ruby" (list "-c" local-file))))
(eval-after-load "flymake"
  '(progn
     (push '("\\.\\(erb\\|rhtml\\)$" flymake-erb-init) flymake-allowed-file-name-masks)
     (push '("^\\(.*\\):\\([0-9]+\\): \\(.*\\)$" 1 2 nil 3) flymake-err-line-patterns)))
(defun turn-on-flymake-for-erb-files ()
  (when (string-match "\\.\\(erb\\|rhtml\\)$" (buffer-file-name))
    (flymake-mode 1)))
(add-hook 'find-file-hook 'turn-on-flymake-for-erb-files)

Setting default-directory for Mercurial MQ patches in Emacs

July 6, 2010 – 3:27 pm

Emacs’ diff-mode is a great tool to work with patches. You can move inside a patch by files or by hunks, it highlights the changes in each line and you can apply and revert individual hunks. However, diff-mode doesn’t work out-of-the-box with Mercurial’s MQ extension. To make it work, we first have to make Emacs recognize an MQ patch automatically like this:

(add-to-list 'auto-mode-alist '("\\.hg/patches/" . diff-mode))

However, applying and reverting hunks will not work, because Emacs cannot find the files mentioned in the patch: it assumes that the paths are more or less relative to where the patch file lies. To fix this we add a function to diff-mode-hook:

(defun mq-patch-set-default-directory ()
  (when (string= ".hg" (nth 2 (reverse (split-string default-directory "/"))))
    (setq default-directory (expand-file-name (concat default-directory "../../")))))
(add-hook 'diff-mode-hook 'mq-patch-set-default-directory)

A possible usage scenario for diff-mode with MQ could be that you want to apply parts of an (unapplied) MQ patch to your working copy, maybe because the patch as a whole doesn’t work anymore or you want to throw it away and only intend to keep a few bits of it.

Whatever you’re doing with this, I hope you’ll find it useful.

Recovering data from a broken NTFS hard drive

April 26, 2009 – 3:23 pm

Today I want to tell you about my recent adventure in data recovery. A friend of mine had a broken USB disk that was no longer readable. The single 230 GB partition was formatted with NTFS, and neither Windows nor Ubuntu (with the NTFS-3g driver, I assume) was willing to read it. The disk contained photos, audio files and videos; the mission was to restore at least the photos.

Make a disk image

The first step for a data recovery project should be to make an image of the drive or partition: when a drive starts to lose data because of physical errors on the disk, these errors tend to spread, things get worse, and at some point you will not get any data out of it anymore. Every tool will encounter the same problems when trying to read defective sectors on the disk, and it will not be possible to repair these.

By the way, if your data is really valuable you shouldn’t even try to recover it yourself, that is, if the drive shows signs of some physical damage. You should disconnect it as soon as possible and hand it to a professional data recovery company – if it is worth a few hundred or thousand Euros.

The data I was working with, however, wasn’t business critical. So, after consulting $SEARCH_ENGINE, I made an image of the drive with dd_rescue – this works similarly to a normal “dd” but handles I/O errors more gracefully. But here things started to get confusing: there are two programs for this purpose with almost identical names:

  • Kurt Garloff’s original dd_rescue tool uses the executable named “dd_rescue”; the Debian/Ubuntu package is named “ddrescue”.
  • Antonio Diaz Diaz’s new and improved GNU ddrescue provides an executable named “ddrescue”; the Debian/Ubuntu package carries the name “gddrescue”.

The latter is the one to choose: you get much better progress information – copying hundreds of gigabytes takes quite some time, so you want to know what’s going on – and the capability to interrupt the process and continue where you left off with the help of a log file. After finding out about that the hard way I got my image with

sudo ddrescue -r3 /dev/sdb1 hdimage logfile

where “-r3” means: “in case of an error, retry 3 times” and /dev/sdb1 is the name of the partition of the USB disk, obviously.

Unfortunately, the resulting image still couldn’t be mounted. “ddrescue” only reported a few bad sectors on the disk, but they were obviously enough to make file access impossible. Another idea that I wasn’t able to pursue: it might have been possible to repair the NTFS filesystem with a virtualized Windows instance running in VirtualBox – but VirtualBox only takes complete disks as images, not single partitions. If I had made an image of the complete disk instead, including the partition table, this might have worked out. I didn’t feel like copying the 230 GB over into a new image with a partition table and also didn’t have enough free disk space to do it.

Recovery tools: file carvers

The next step was trying to recover as much of the data as possible. I had successfully used a “file carver” before to recover images from my digital camera’s memory card after the FAT filesystem became corrupted. A file carver is a program that scans a raw binary stream for the headers of known file types, like that of JPEG images or MP3 audio files, and tries to extract the contents, completely ignoring the file system. The advantage is that it doesn’t matter how broken your filesystem is – the program doesn’t have to know anything about the filesystem’s structure. It can also recover deleted files. The disadvantage is that you lose all information that is stored in the filesystem, the file name and directory structure. It’s also prone to errors for fragmented file systems, which also means that you’re less likely to succeed when recovering large files.
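To make the idea of scanning a raw byte stream for known headers a bit more concrete, here is a minimal sketch in shell. It only reports the offsets at which the JPEG start marker (the bytes FF D8 FF) occurs; real carvers like the ones below also have to find the end of each file and validate its contents. The function name jpeg_offsets and the sample bytes are made up for this illustration, and the approach is far too slow for a real disk image.

```shell
#!/bin/sh
# Print the byte offset of every JPEG start marker (FF D8 FF) in a file.
jpeg_offsets() {
    od -An -v -t x1 "$1" |   # dump every byte as two hex digits
    tr -s ' ' '\n' |         # put each byte on its own line
    awk 'NF {
             # p2 and p1 hold the two previous bytes, n is the offset
             # of the current byte
             if (p2 == "ff" && p1 == "d8" && $1 == "ff") print n - 2
             p2 = p1; p1 = $1; n++
         }'
}

# Build a small sample "image" with two fake JPEG headers in it.
sample=$(mktemp)
printf 'AAAA\377\330\377\340BBBB\377\330\377' > "$sample"
jpeg_offsets "$sample"   # prints 4 and 12
rm -f "$sample"
```

Note that nothing here looks at the filesystem, which is exactly why file names and directories are lost with this kind of tool.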

I tried two tools from this category, “foremost” and “photorec”. “foremost” is a simple command line tool, you call it like this:

foremost -i hdimage -o recovered -v

and it will sort the files it can find by file type into sub folders of “recovered”.

Photorec has a curses interface. It also takes hints about the structure of the image, like presence of a partition table or filesystem type. It is part of the “testdisk” package. The command line invocation shows that this tool was ported from DOS:

photorec /log /debug /d output-directory hdimage

Recovery tools: Sleuth Kit

Researching further, I stumbled upon the Sleuth Kit and Autopsy. These are forensic analysis tools and therefore are designed to recover data that someone deliberately tried to hide or destroy. The Sleuth Kit is a suite of command line tools which Autopsy is a web frontend for. Autopsy comes with its own web server. I started it with these commands:

mkdir my-autopsy-dir/
autopsy -d my-autopsy-dir/
firefox http://localhost:9999/autopsy

Getting around the web interface can be a bit confusing: you have to create a “case” first, then add a host to investigate and finally a hd image to look at. Anyway, the time it took me to get used to autopsy wasn’t wasted because I now was able to see the complete contents of the original NTFS filesystem! I was able to look at the data, browse the filesystem, download single files and compute MD5 sums. However, autopsy offers no feature for copying whole directory trees. This is because it is intended for forensic analysis rather than data recovery. So you, the computer forensics expert, are supposed to look at every single file and make notes about it which in turn are then recorded in the “case”.

I wasn’t really interested in a forensic analysis of the contents of my friend’s drive so I took a closer look at the command line tools. The relevant commands from the Sleuth Kit are “fls” for listing files in an image and “icat” for getting at the contents. You use “fls” like this:

fls -urp hdimage

where -u means that I’m not interested in deleted files, -r that I want a recursive listing and -p that I need to have the full path for every file. The output looks something like this:

d/d 180-144-8:  some-dir
d/d 5192-144-1: some-dir/some sub dir
r/r 5190-128-3: some-dir/some sub dir/some_file.exe
r/r 5188-128-3: some-dir/some sub dir/another_file.jpg

The funny numbers in the second column are the “inode” of the file, which you need to feed into “icat” to get the contents. So how do you recover a whole directory tree with these tools? What I should have done is use a script like this one:

fls -urp "$IMAGE" |
while read -r type inode name; do
    case $type in
        d/d) mkdir -p "$name" ;;
        r/r) icat "$IMAGE" "$(echo "$inode" | sed 's/://g')" > "$name" ;;
    esac
done

But I was lazy and so I saved the file listing in a text file which I turned into a big shell script using Emacs’ rectangle functions, regular expressions and keyboard macros. This wasn’t working so well: there were some funny characters in the file names I forgot to escape, like single quotes and backticks. So, as always, it turned out to be more work doing it “the easy way”. However, in the end I was able to completely recover the data from the partition.
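The field splitting that such a loop relies on can be checked without an image. This is a small sketch that feeds sample lines in the format of the listing above into the same kind of “read” loop; the function name parse_fls is made up for this illustration. Because the name is read as the last field and only ever used inside double quotes, spaces, single quotes and backticks in file names need no escaping.

```shell
# Split fls-style lines into type, inode (without the trailing colon)
# and name, and print the three fields separated by "|".
parse_fls() {
    while read -r type inode name; do
        printf '%s|%s|%s\n' "$type" "${inode%:}" "$name"
    done
}

parse_fls <<'EOF'
d/d 180-144-8:  some-dir
r/r 5188-128-3: some-dir/some sub dir/another_file.jpg
EOF
# prints:
# d/d|180-144-8|some-dir
# r/r|5188-128-3|some-dir/some sub dir/another_file.jpg
```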

Analyzing the data

Since I now had all the data back, after having tried other recovery methods first, this can serve as a nice real-world benchmark of the usefulness of the file carving tools I used.

Just counting how many files these tools think they’ve found doesn’t help us much, we also need to know if the recovered files were really complete and undamaged. I did a quick check with the files the Sleuth Kit recovered, and all files I checked seemed to be ok: the photos were fine and the videos and mp3s played without any hiccups. So, let’s assume that the data I got from the Sleuth Kit is really genuine. To find out about the identity of the recovered files, I computed the MD5 hash for all of them with this little script:

for tool in foremost photorec sleuthkit; do
    find $tool -type f -print0 | xargs -0 md5sum | tee md5sums/${tool}.txt
done

And here’s a script I hacked together to do some analysis on these files:

#!/bin/bash
# The arguments after the extension are md5sum listing files, one per tool.
md5s_by_ext() {
    local ext=$1
    shift
    grep -hi "\.${ext}\$" "$@" | awk '{ print $1 }'
}
unique_md5s_by_ext() {
    md5s_by_ext "$@" | sort | uniq
}
unique_md5s() {
    cat "$@" | awk '{ print $1 }' | sort | uniq
}
clean_wc() {
    wc -l | sed 's/ //g'
}
# One table row: found, matching and matching+ext counts per tool.
common_files() {
    local ext="$1"
    echo -ne "${ext}\t"
    echo -ne $(unique_md5s_by_ext $ext sleuthkit | clean_wc) "\t"
    for tools in photorec foremost "photorec foremost"; do
        echo -ne $(unique_md5s_by_ext $ext $tools | clean_wc) "\t" \
            $(comm -12 <(unique_md5s             $tools) <(unique_md5s_by_ext $ext sleuthkit) | clean_wc)"\t"\
            $(comm -12 <(unique_md5s_by_ext $ext $tools) <(unique_md5s_by_ext $ext sleuthkit) | clean_wc)"\t"
    done
    echo
}
common_files_total() {
    echo -e "total\t"\
         $(unique_md5s sleuthkit         | clean_wc) "\t"\
         $(unique_md5s photorec          | clean_wc) "\t"\
         $(comm -12 <(unique_md5s photorec) <(unique_md5s sleuthkit) | clean_wc) "\t\t"\
         $(unique_md5s foremost          | clean_wc) "\t"\
         $(comm -12 <(unique_md5s foremost) <(unique_md5s sleuthkit) | clean_wc) "\t\t"\
         $(unique_md5s photorec foremost | clean_wc) "\t"\
         $(comm -12 <(unique_md5s foremost photorec) <(unique_md5s sleuthkit) | clean_wc)
}
echo -e "\tsleuthkit\tphotorec\t\t\tforemost\t\t\tphotorec+foremost"
common_files_total
for i in jpg gif mp3 avi mpg zip rar exe cab dll txt htm rtf pdf doc xls; do
    common_files $i
done

And here are the results as a really ugly table:

                  photorec                       foremost                       photorec+foremost
       sleuthkit  found  matching  matching+ext  found  matching  matching+ext  found  matching  matching+ext
total       4391   6600      3669             -   1210       771             -   6960      3718             -
jpg          831    768       711           711    853       747           747    901       755           755
gif            1      1         0             0     46         1             1     47         1             1
mp3         3218   4697      2851          2851      0         0             0   4697      2851          2851
avi          128      5         0             0      5         0             0     10         0             0
mpg            1    207         0             0      1         0             0    208         0             0
zip            5      3         3             3     13         0             0     16         3             3
rar           25     29        24            24     30         8             8     50        24            24
exe           37     60         4             4     78         6             6     83         6             6
cab            0      3         0             0      0         0             0      3         0             0
dll           10     69         6             6     71         7             7     80         8             8
txt           12    699         4             4      1         0             0    700         4             4
htm            6      0         2             0      3         1             1      3         2             1
rtf            1      2         0             0      0         0             0      2         0             0
pdf            0      1         0             0      1         0             0      1         0             0
doc            7     15         6             5     16         0             0     31         6             5
xls            0      2         0             0      0         0             0      2         0             0

This table needs a bit of an explanation:

  • “found” means the number of files the tool extracted from the image
  • “matching” means the number of files the tool found that are identical with files recovered with the sleuth kit
  • “matching+ext” means that we’ve also got the extension right

Foremost recovered almost 90% of the images, with Photorec following close behind at 85%; Photorec found only 8 photos that foremost couldn’t identify. Looking at other data types, Photorec is clearly superior: it found 24 of the 25 RAR files present in the image, while foremost only got 8 of them right. And only Photorec was able to recover any mp3s: it found 89% of them, but we also got quite a few false positives here. Neither of the tools was able to recover any movies – possibly because they were fragmented on disk.
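For the record, these percentages come straight from the table: the “matching+ext” count of each tool divided by the number of files the Sleuth Kit recovered for that extension. A small sketch of the arithmetic, with pct as a made-up helper name:

```shell
# pct A B: print A as a percentage of B with one decimal place.
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f%%\n", 100 * a / b }'
}

pct 747 831      # foremost, jpg  -> 89.9%
pct 711 831      # photorec, jpg  -> 85.6%
pct 2851 3218    # photorec, mp3  -> 88.6%
pct 24 25        # photorec, rar  -> 96.0%
pct 8 25         # foremost, rar  -> 32.0%
echo $((755 - 747))   # photos only photorec identified -> 8
```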


So here comes the take home message:

  • Use GNU ddrescue to make a hard drive image first.
  • If your filesystem is not mountable, try the sleuth kit – you might get all your data back including the file names and directory structure.
  • If the sleuth kit fails and you’re trying to recover some photos, “foremost” might help you.
  • If the sleuth kit fails and you’re looking for something other than images, give “photorec” a shot. Anyway, it’s less likely in this case that you’ll get your data back.

Tiling window managers talk

March 31, 2009 – 12:48 am

I just finished the English translation of the speaker notes for the talk about tiling window managers I gave at the C-Base for the Ubuntu Berlin user group last Thursday. Here are the original notes in German and here’s the English translation.
As for my last few talks, I took my notes down with org-mode, and I’m deeply impressed with how easy it is to get a good-looking HTML page out of it. And the javascript-interface just rocks!
For those who couldn’t attend, I will update my notes with some screenshots of the window managers I presented when I find the time to do so, so watch this space for updates.

Finally putting my personal configuration files under version control

March 23, 2009 – 1:36 pm

Version controlling your config files seems to be like flossing: everyone knows it’s a good idea and that they should be doing it, but they never get around to it. I finally got around to putting my files under version control – again. I used to use darcs, but I ran into the dreaded exponential merge issues, so this time I’ll try git for this. Mercurial would have been an even nicer choice, but git seems to have a bigger mind share these days and is a bit more difficult to learn, so I’d rather learn some git.

While having your personal config files under version control is a good idea, making your home directory version controlled isn’t – it’s not easy to see what files are under version control, and some operations just take ages because they have to crawl through all your personal files. So I created a git controlled directory named “.vc” and created symbolic links from the config files’ usual locations to the files in there. As I add more files to version control, this will get tedious to do by hand, so I wrote a little script to help me that I want to share with you. I just called it “setup” and put it into .vc as well:

#!/bin/sh
cd "$(dirname "$0")"
VC_DIR=$(pwd)   # reconstructed: the directory this script lives in
find . \( -path \*/.git \) -prune -o -type f -not -name "$(basename "$0")" | \
    while read -r path; do
        if [ "X$(basename "$path")" = "X.git" ]; then
            continue
        fi
        destination="$HOME/$path"   # reconstructed: where the link goes
        if [ -e "$destination" ]; then
            if [ -L "$destination" ]; then
                echo "Symlink for $path exists, skipping."
                continue
            else
                echo "backing up existing $destination to ${destination}.vc_backup"
                mv "$destination" "${destination}.vc_backup"
            fi
        fi
        echo "doing symlink for $path"
        ln -s "$VC_DIR/$path" "$HOME/$path"
    done

Showing the current directory in Emacs’ mode line

May 23, 2008 – 3:28 pm

Today I got tired of always looking up where all these little files named “_show.rhtml”, “_list.rhtml” and their ilk are living, and patched the Emacs mode line to include the last element of the current buffer’s directory. Another recipe describes something very similar, but it repeats the whole definition of mode-line-format, which might break with the next version of Emacs. It’s much cleaner to add to a variable that is used in the mode line:

(setq-default mode-line-buffer-identification
              (cons
               '(:eval (replace-regexp-in-string "^.*/\\(.*\\)/" "\\1/" default-directory))
               mode-line-buffer-identification))

Live syntax-checking JavaScript with Emacs

May 6, 2008 – 10:00 am

There are quite a few options for doing live syntax checks from within Emacs. A good one is using Steve Yegge’s relatively new js2-mode for javascript editing, which has a javascript parser built in. But that is not what this blog post will be about.

The other option is to use flymake with some command line javascript syntax checker. Two possible syntax checkers are Mozilla’s stand alone SpiderMonkey interpreter “smjs” with the “-s” (strict) option and Douglas Crockford’s JsLint running under Rhino. There are reasons to use each of them:

  • SpiderMonkey is the javascript engine currently used in Firefox. So, if it can not parse your javascript, neither can Firefox.
  • JsLint does more than just syntax checking: it also checks for bad style and possible programming errors, like missing semicolons and braces or assignments in if conditions. This can be good, but it’s also a hindrance when you’re working on code that was not written to make JsLint happy and you don’t want to refactor it. Furthermore, the style enforced by JsLint might differ from your programming style. As it says on the web page: “Warning! JSLint will hurt your feelings.”
    I, for instance, like to leave out the braces after an if statement if the executed block is itself just one statement – it takes up less space, and if I ever want to add to this block Emacs will indent the code correctly, so I won’t forget to add the braces afterwards. However, maybe JsLint is right, because you sometimes have to collaborate with people that don’t use Emacs – maybe I should do them the favor.

Setting up SpiderMonkey

On an Ubuntu system the stand alone SpiderMonkey interpreter is provided by the package “spidermonkey-bin”. SpiderMonkey has the problem that it is just the javascript engine, but does not include the DOM, and there is no global window object. So if your javascript code does more than just defining functions you need to do some mocking. This is fortunately very easy to do.

The code I wanted to check also installs event handlers with prototype’s Event.observe method, so I had to mock that, too:

// fake the global window object
var window = this;
var Event = { observe: function() {} };

You will most probably need to extend this for your code. Now you can do a syntax check from the command line like this:

smjs -s -e "load('/path/to/mock.js')" file-to-check.js

Setting up JsLint

JsLint is itself written in javascript. To use it from the command line you need the javascript interpreter Rhino. It is, not surprisingly, provided on Ubuntu by the package “rhino”. JsLint for rhino is available here. To check a javascript file with rhino, just call it like this:

rhino /path/to/jslint.js file-to-check.js

Integrating both into Emacs

The following code allows you to do live syntax checking from within Emacs with either SpiderMonkey or JsLint. You can switch between them using “C-c t”. The code uses a circular list structure to keep the list of available methods, just for the heck of it.

(setq flymake-js-method 'spidermonkey)
(defun flymake-js-toggle-method ()
  "Switch between the SpiderMonkey and JsLint checkers."
  (interactive)
  ;; METHODS is a circular list, so the element after the current
  ;; method is always the other one.
  (let ((methods '#1=(spidermonkey jslint . #1#)))
    (setq flymake-js-method (cadr (memq flymake-js-method methods)))))
(defun flymake-js-init ()
  (let* ((temp-file (flymake-init-create-temp-buffer-copy
                     'flymake-create-temp-inplace))
         (local-file (file-relative-name
                      temp-file
                      (file-name-directory buffer-file-name))))
    (if (eq flymake-js-method 'spidermonkey)
        (list "smjs" (list "-s" "-e" (format "load('%s')" (expand-file-name "/path/to/mock.js")) local-file))
      (list "rhino" (list (expand-file-name "/path/to/jslint.js") local-file)))))
(eval-after-load "flymake"
  '(progn
     (add-to-list 'flymake-allowed-file-name-masks
                  '("\\.js\\(on\\)?$" flymake-js-init flymake-simple-cleanup flymake-get-real-file-name))
     (add-to-list 'flymake-err-line-patterns
                  '("^\\(.+\\)\:\\([0-9]+\\)\: \\(SyntaxError\:.+\\)\:$" 1 2 nil 3))
     (add-to-list 'flymake-err-line-patterns
                  '("^\\(.+\\)\:\\([0-9]+\\)\: \\(strict warning: trailing comma.+\\)\:$" 1 2 nil 3))
     (add-to-list 'flymake-err-line-patterns
                  '("^Lint at line \\([[:digit:]]+\\) character \\([[:digit:]]+\\): \\(.+\\)$" nil 1 2 3))))
(defun my-js-setup-flymake ()
  (flymake-mode 1)
  (local-set-key (kbd "C-c n") 'flymake-goto-next-error)
  (local-set-key (kbd "C-c p") 'flymake-goto-prev-error)
  (local-set-key (kbd "C-c t") 'flymake-js-toggle-method))
(add-hook 'javascript-mode-hook 'my-js-setup-flymake)

The result

And here is what it looks like with SpiderMonkey:

[Screenshot: Emacs with flymake running spidermonkey]

And here is an example of some code and what opinion JsLint has about it:

[Screenshot: Emacs with flymake running jslint]

Varnish, a faster Squid

May 5, 2008 – 5:09 pm

I just stumbled across Varnish, an HTTP cache and modern alternative to Squid. The config file is compiled into a shared library at startup, logging is done in-memory and paging out unused pages to disk is left to the OS. With Squid, an unused cached page is explicitly written to disk by the proxy, while the OS has most probably already swapped it out anyway.