content/articles/take-back-your-site-with-nanoc.textile

----- 
type: article
tags:
- website
- ruby
- programming
- writing
date: 2009-09-15 13:32:51.049000 +02:00
permalink: take-back-your-site-with-nanoc
title: "Take back your site, with nanoc!"
filters_pre:
- erb
- redcloth
toc: true
summary: "A quick overview of the nanoc site compiler, and how I turned my Typo-powered dynamic site into a static one, discovering the pleasures of blogging 'like a hacker'."
-----

Back in 2004, when I bought the h3rald.com domain, this site was static. At the time I hardly knew HTML and CSS, nevermind server-side languages, so I remember creating a _pseudo-template_ for the web site layout and using it whenever I wanted to create a new page, to preserve the overall look-and-feel. This was a crude and inefficient strategy, of course: whenever I changed the layout I had to replicate the change in all the pages of the site – the whole eight of them.

Five years later, after rebuilding this web site "seven times":/h3rald/ using different backends (PHP + CakePHP, Ruby + Rails + Typo, etc.), I decided to make it static again, this time with a twist. It was only after reading a "post":http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker.html by Tom Preston-Werner ("GitHub":http://www.github.com co-founder) that I finally decided to give it a try. Today, the 8th release of this web site is 100% static: if you load any page, there's no server-side interpretation going on; you're just browsing a plain HTML page, at most with a few AJAX calls. But let's start from the beginning...

h3. Why I don't need a blog platform

There's nothing inherently wrong with blog platforms like WordPress: they allow _anyone_ to publish content on the web using a user-friendly administration area. They were built with one thing in mind: making publishing content on the web as simple as possible, even for people who don't know anything about HTML, let alone server-side scripting.

What about people who _do_ know about web development though? Do they still need a blog platform? Depends. If you are comfortable with editing files using a text editor, if you enjoy using the command-line on a daily basis, if you like programming and _hacking_ a little bit, if you don't really care about fancy and user-friendly administration backends... _then you probably don't_.

All you need is a system to transform a bunch of source files into a web site. The good news is that such a system exists – and you're spoiled for choice!

h3. Introducing site compilers

The first _site compiler_ I discovered was "Webby":http://webby.rubyforge.org/:

bq. [...] Webby works by combining the contents of a page with a layout to produce HTML. The layout contains everything common to all the pages — HTML headers, navigation menu, footer, etc. — and the page contains just the information for that page. You can use your favorite markup language to write your pages; Webby supports quite a few.

There are quite a few applications like Webby, such as:
* "nanoc":http://nanoc.stoneship.org/ 
* "Rassmalog":http://snk.tuxfamily.org/lib/rassmalog/doc/guide.html 
* "Jekyll":http://www.jekyllrb.com/ 
* "WebGen":http://webgen.rubyforge.org/ 
* "Rog":http://rog.rubyforge.org/ 
* "Rote":http://rote.rubyforge.org/ 
* "Hobix":http://hobix.com/ 
* "RakeWeb":http://rakeweb.rubyforge.org/wiki/wiki.pl 
* "RubyFrontier":http://www.apeth.com/RubyFrontierDocs/default.html 
* "StaticMatic":http://staticmatic.rubyforge.org/ 
* "StaticWeb":http://staticweb.rubyforge.org/ 
* "ZenWeb":http://www.zenspider.com/ZSS/Products/ZenWeb/ 
* "YurtCMS":http://yurtcms.roberthahn.ca/ 
* "NanoBlogger":http://nanoblogger.sourceforge.net/ 

There are probably even more, with different features, but they all try to solve the same problem: provide a way to generate static web sites in an automated way.

I spent some time reading about each one of them, "evaluating the pros and cons":http://github.com/h3rald/h3rald/issues/closed#issue/1 and in the end I decided to go for "nanoc":http://nanoc.stoneship.org/, simply because it was the only one that seemed to fit all my needs.

h3. A quick overview of nanoc

nanoc is a nifty tool written in Ruby suitable for _[...] building small to medium-sized websites_. In other words, anything which doesn't involve some fancy user interaction. As far as blogs are concerned, the only user interaction is _comments_ – but that's fine, because there's more than one web service for that, such as "Disqus":http://disqus.com/ or "IntenseDebate":http://intensedebate.com/.

h4. Some details on the project

Compared to the alternatives, nanoc is one of the most mature and best maintained, having hit its 3.0 release just a few weeks ago. Its creator, Denis Defreyne, uses it for his own "web site":http://stoneship.org/ and is involved with the project on a daily basis, both coding and offering support to nanoc users like myself who regularly ask questions on the "nanoc user group":http://groups.google.com/group/nanoc.

Denis also seems very concerned about keeping documentation up-to-date – something that really impressed me from a technical writer's point of view. The "tutorial":http://nanoc.stoneship.org/tutorial/ he put together will get you started in no time, and the "manual":http://nanoc.stoneship.org/manual/ will explain everything else you may possibly want to know. When release 3.0 came out he even put together a "migration guide":http://nanoc.stoneship.org/migrating/. If this is still not enough and you don't mind spending some time extending the system, nanoc's "RDoc documentation":http://nanoc.stoneship.org/doc/3.0.0/ is very comprehensive compared to other Ruby projects.

h4. Sites, Items and data sources

!>/img/pictures/nanoc-structure.png! 

nanoc ships with a really neat command line tool that can do most of the work for you. @nanoc3 create_site h3rald@ will create a new web site in a folder called h3rald. The contents of this folder are laid out according to a particular logic (_convention over configuration_, remember?):

* *content* – your articles, pages, stylesheets, images, ...all the site content and assets.
* *layouts* – the site layouts (and partial layouts)
* *lib* – place your custom Ruby code and vendor libraries here
* *output* – your "compiled" site, ready to be deployed
* *config.yaml* – your site's configuration file. The only one (and it's just a few lines) 
* *Rakefile* – place any custom Rake task here
* *Rules* – defines the rules for compilation, layout and routing

Here's the default @config.yaml@ file:

<% highlight :yaml do %>
--- 
data_sources: 
- items_root: /
  layouts_root: /
  type: filesystem_compact
  output_dir: output
<% end %>

A _data source_ in nanoc defines where data is retrieved from to create the web site. By default, the "filesystem_compact":http://nanoc.stoneship.org/doc/3.0.0/Nanoc3/DataSources/FilesystemCompact.html data source requires that you create two files in the @/content@ folder for each article or page of your web site:
* One containing the actual content of the page
* Another for the page's arbitrary metadata

By personal preference, I chose the "filesystem_combined":http://nanoc.stoneship.org/doc/3.0/Nanoc3/DataSources/FilesystemCombined.html data source, which allows you to combine the content and the metadata of a page in a single file.

The source code for this very article, for example, starts like this:

<% highlight :text do %>
-----
type: article
tags:
- website
- ruby
- programming
- writing
date: 2009-09-15 13:32:51.049000 +02:00
permalink: take-back-your-site-with-nanoc
title: "Take back your site, with nanoc!"
toc: true
-----
Back in 2004, when I bought the h3rald.com domain, this site was static. At the time I hardly 
knew HTML and CSS, nevermind server-side languages, so I remember creating a _pseudo-template_ for
 the web site layout and using it whenever I wanted to create a new page, to preserve the overall look-and-feel. 
This was a crude and inefficient strategy, of course: whenever I changed the layout I had to replicate the change
 in all the pages of the site &ndash; the whole eight of them.
<% end %>

When the site is compiled, the content goes through a Textile filter, and the metadata is made available to layouts, where it can be used to generate tag links automatically, for example.
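To make the combined format more concrete, here is a minimal sketch of how such a file could be split into metadata and content. This is not nanoc's own code: the @parse_combined@ method is a hypothetical name, and the sketch assumes the layout shown above (a line of dashes, a YAML block, another line of dashes, then the page body).

```ruby
require 'yaml'

# Hypothetical helper, not part of nanoc: splits a "combined" source file
# into a metadata hash and the remaining page content.
def parse_combined(source)
  match = source.match(/\A-{3,}[ \t]*\n(.*?)\n-{3,}[ \t]*\n(.*)\z/m)
  raise ArgumentError, 'no metadata section found' unless match
  # The text between the two dashed lines is plain YAML
  meta = YAML.safe_load(match[1]) || {}
  [meta, match[2]]
end

source = <<~PAGE
  -----
  type: article
  tags:
  - website
  - ruby
  title: "Take back your site, with nanoc!"
  toc: true
  -----
  Back in 2004, when I bought the h3rald.com domain, this site was static.
PAGE

meta, content = parse_combined(source)
puts meta['title']
puts meta['tags'].join(', ')
puts content
```

The metadata ends up in an ordinary hash, which is why it is so easy to use from layouts and helpers later on.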

h4. Layouts, filters, and helpers

Layouts in nanoc are similar to layouts and views in Rails, but much simpler. The same applies to helpers. Here's a snippet from my "default layout":http://github.com/h3rald/h3rald/tree/master/layouts/default.erb:

<% highlight :text  do %>
        <div id="container">
          <!-- CONTENT START -->
          <div id="content" class="clearfix&lt;%= (@item[:permalink] == 'home') ? ' home' : ' standard' %&gt;">
            <h2>&lt;%= @item[:title] %&gt;</h2>
            &lt;%   case @item[:type]
                when 'article' then%&gt;
                <div id="content-header">
                  &lt;%= render 'article_meta', :article => @item %&gt;
                </div>
              &lt;% end %&gt;
              <hr />
              <div id="content-body">
                &lt;%= yield %&gt;
              </div>
              <div id="content-footer">
                <div class="share">
                  <script type="text/javascript" src="http://w.sharethis.com/button/sharethis.js#publisher=6e34d60c-b14e-4c19-9b2f-7c35a9f0ab09&amp;type=website&amp;linkfg=%23a4282d"></script>
                  &lt;% if @item[:feed] then %&gt;
                  <a href="&lt;%= @item[:feed_url] || @item[:feed]+"rss/" %&gt;" type="application/rss+xml" rel="alternate"><img src="/images/theme/feed-icon-14x14.png" alt="#"/>H3RALD - &lt;%= @item[:feed_title]%&gt;</a>
                  &lt;% end %&gt;
                </div>
                &lt;%= render 'article_buttons' if @item[:type] == 'article' %&gt;
              </div>
            </div>
<% end %>

This source code snippet shows quite a few features of nanoc's layouts:
* You can access the metadata of the page which is being rendered using the <notextile><code>@item</code></notextile> variable, so <notextile><code>@item[:title]</code></notextile> returns the page's title, for example.
* Layouts can be nested, and behave like Rails partials. The @render@ method takes a string parameter (the name of the layout to render) and an optional hash parameter to pass variables to the layout.
* The @yield@ method is used to include the content of a page.
* Layouts support any kind of filter, like ERB for example. Go crazy.

Helpers can be used in layouts to perform common tasks, like creating links, feeds, navigation elements and so on. Check the "source code docs":http://nanoc.stoneship.org/doc/3.0.0/ for more info, and of course feel free to create your own as you see fit.
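As an illustration of how small a custom helper can be, here is a sketch of one that turns an item's tags into links. The module name @TagLinksSketch@, the @tag_links@ method and the @/tags/name/@ URL scheme are all assumptions made for this example, and the item is a plain hash standing in for a real nanoc item.

```ruby
require 'cgi'

# Hypothetical helper module: the name and the /tags/<name>/ URL scheme
# are assumptions made for this sketch.
module TagLinksSketch
  # Builds a comma-separated list of HTML links from an item's :tags metadata
  def tag_links(item)
    Array(item[:tags]).map do |tag|
      %(<a href="/tags/#{CGI.escape(tag)}/">#{CGI.escapeHTML(tag)}</a>)
    end.join(', ')
  end
end

include TagLinksSketch

# A plain hash stands in for a nanoc item here
item = { :title => 'Take back your site, with nanoc!', :tags => ['website', 'ruby'] }
puts tag_links(item)
```

In a real site a module like this would presumably live in the @lib@ folder and be included so that layouts can call it.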

Finally, filters are used to filter content markup. nanoc ships with "almost everything you need":http://nanoc.stoneship.org/manual/#list-of-built-in-filters, from Textile to Haml to RDoc, but nothing stops you from creating your own, and it's dead easy.

h4. Rules and tasks

While tasks (as in Rake tasks) do not constitute a huge part of nanoc (but as usual, you may need to create your own to perform custom operations), Rules became, as of version 3, one of the key concepts to grasp in order to make everything work. Rules are stored in the @Rules@ file of your nanoc site and can be used to:
* Define routes, i.e. where pages are deployed in the output folder.
* Define how pages are compiled, which filters to apply to a particular set of pages, which layouts to use, etc.
* Define how layouts are handled, which filters to apply to a particular layout, etc.

You can find more information in the "manual":http://nanoc.stoneship.org/manual/#rules, along with other important information, but for now, let's say you should be familiar with _most_ of nanoc's jargon and how it works. Let's see what you can do with it, in practice.
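To give a feel for what a routing rule decides, here is a sketch in plain Ruby (deliberately not nanoc's Rules syntax, for which the manual is the reference). The @output_path_for@ method is a hypothetical name, and the sketch assumes the common convention of writing each page to an @index.html@ file inside its own directory.

```ruby
# Illustration only: plain Ruby, not nanoc's Rules DSL.
# Maps an item identifier such as "/articles/foo/" to a file in the
# output folder, so that the page can be served from a clean URL.
def output_path_for(identifier, output_dir = 'output')
  File.join(output_dir, identifier, 'index.html')
end

puts output_path_for('/articles/take-back-your-site-with-nanoc/')
puts output_path_for('/')
```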

h3. Migrating from your blog platform

Since version 7, h3rald.com has been powered by the "Typo":http://www.typosphere.org blog platform. If you are not familiar with it, let's just say it's a sort of WordPress built on top of Rails: database backend, pretty admin front-end, tags, comments, and all sorts of things a blog may need. While Typo is pleasant enough to use, it has all the inherent disadvantages of any other similar platform:
* It relies on a database
* It relies on server-side scripting to render pages
* It uses a complex caching mechanism to produce, ultimately, semi-static pages
* It may be subject to exploits, attacks, high server loads, and similar
* You can't really customize it beyond a certain point
* You have to upgrade your backend frequently, and doing so is often not as painless as you may expect
* You can't use versioning tools like git for your content, as it's stored in a database

I'm not claiming that nanoc is blogging's silver bullet (it was not created for that), but for sure:
* It _does not_ rely on a database
* It _does not_ rely on server-side scripting to render pages (not in real-time, anyway)
* It _does not_ need a complex caching mechanism simply because it produces static pages
* It is definitely less prone to nasty things
* It's extremely flexible and hackable with very little effort
* You don't have to upgrade all the time, but it is _really_ painless if you decide to
* You can use git and similar: your content is in plain old text files

Rants aside, suffice it to say I recently convinced myself that switching from Typo to nanoc was a _good thing_, so let's see how it worked out.

h4. Posts, pages and comments

Out of Typo's MySQL database, I just wanted to get the following data:
* Pages and posts
* Tags
* Comments

Following the approach used by "Jekyll":http://github.com/mojombo/jekyll, I decided to use the simple and powerful "Sequel":http://sequel.rubyforge.org/ gem. I'm sorry to disappoint you, but the whole migration process can be summarized with the following Rake task:

<% highlight :ruby do %>
  require 'pathname'
  require 'sequel'

  task :migrate, :db, :usr, :pwd, :host do |t, args|
    raise RuntimeError, "Please provide :db, :usr, :pwd" unless args[:db] && args[:usr] && args[:pwd]
    db = Sequel.mysql args[:db], :user => args[:usr], :password => args[:pwd], :host => args[:host] || 'localhost'
    # Remove all existing pages!
    dir = Pathname(Dir.pwd)/'content'
    dir.rmtree if dir.exist?
    dir.mkpath
    # Prepare page data: published posts and all pages
    dataset = db[:contents].where("state = 'published' OR type = 'Page'")
    total = dataset.count 
    c = 1
    total_tags = []
    dataset.each do |a|
      puts "Migrating [#{c}/#{total}]: '#{a[:title]}'..."
      # Everything except the body ends up in the page's metadata
      meta = {}
      meta['tags'] = get_tags a[:keywords]
      meta['comments'] = get_comments db, a[:id]
      meta['permalink'] = a[:permalink] || a[:name]
      meta['title'] = a[:title]
      meta['type'] = a[:type].downcase
      meta['date'] = a[:published_at]
      meta['toc'] = true
      meta['filters_pre'], extension = get_filter db, a[:text_filter_id]
      contents = convert_code_blocks meta, a[:body]+a[:extended].to_s
      write_page meta, contents, extension
      c += 1
    end
  end
<% end %>

That's it. Well, almost: you can find the @get_comments@, @get_tags@ and @get_filter@ methods in a separate "utility file":http://github.com/h3rald/h3rald/tree/master/lib/utils.rb. Nothing special really, just a few convenience methods wrapping queries or simply processing data. Note how all information, including tags and legacy comments, is saved in each page's metadata. The @write_page@ method simply creates a file in the @/content@ folder.
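For readers who don't want to dig through the utility file, here is a rough sketch of what a method like @write_page@ could look like. It is not the actual implementation: @write_page_sketch@ is a hypothetical name, it takes the target folder as an extra argument, and it assumes the file name is built from the permalink and that metadata is written as YAML between two dashed lines, as in the combined format shown earlier.

```ruby
require 'yaml'
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for write_page: writes metadata and contents
# to a single file named after the page's permalink.
def write_page_sketch(dir, meta, contents, extension)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "#{meta['permalink']}.#{extension}")
  # to_yaml starts with a "---" line; drop it and use the dashed separators instead
  yaml = meta.to_yaml.sub(/\A---[ \t]*\n/, '')
  File.write(path, "-----\n#{yaml}-----\n#{contents}")
  path
end

# Try it out in a temporary folder rather than a real content folder
Dir.mktmpdir do |dir|
  meta = { 'permalink' => 'hello-world', 'title' => 'Hello', 'tags' => ['ruby'] }
  path = write_page_sketch(dir, meta, "h3. Hello\n", 'textile')
  puts File.read(path)
end
```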

h4. Filters and highlighters

On my old site, I used mainly Textile and Markdown to write posts. However, some of my really old articles used BBCode, whose corresponding filter is not available in nanoc. No worries, I soon found out that creating a new nanoc filter came down to this:

<% highlight :ruby do %>
require 'rubygems'
require 'bb-ruby'

class BbcodeFilter < Nanoc3::Filter
  identifier :bbcode

  def run(content, args)
    content.bbcode_to_html
  end

end
<% end %>

Yes, that's it. Granted, the @bb-ruby@ gem does all the work, but notice how easy it is to just plug in new Ruby code into nanoc's architecture!

The next big challenge was code highlighting. After some quick research, I found at least half a dozen possible solutions to highlight source code. Some were JavaScript-based, others were based on a server-side language like PHP, Ruby or Python. Again, I looked at Jekyll for inspiration and discovered they integrated the "Pygments":http://www.pygments.org _Python_ library. Why use a Python library for code highlighting in a Ruby-based project? Because there's nothing to stop you (if you can run Python on your server, that is), because it looks very neat and because it supports a lot of different programming languages.

Lazy as I am, I more or less dropped "Chris Wanstrath's Ruby wrapper":http://github.com/h3rald/h3rald/blob/master/lib/albino.rb into my @/lib@ folder (I just used Open3 instead of Open4 for Windows compatibility), and monkey-patched nanoc's filtering helper as follows:

<% highlight :ruby do %>
module Nanoc3::Helpers::Filtering

  def highlight(syntax, &block)
    # Seamlessly ripped off from the filter method...
    # Capture block
    data = capture(&block)
    # Reconvert <% %>
    data.gsub! /&lt;%/, '<%'
    data.gsub! /%&gt;/, '%>'
    # Filter captured data
    filtered_data = "\n<notextile>"+Albino.colorize(data, syntax)+"</notextile>\n" rescue data 
    # Append filtered data to buffer
    buffer = eval('_erbout', block.binding)
    buffer << filtered_data
  end

end

include Nanoc3::Helpers::Filtering
<% end %>

There you go, another thing sorted.

h4. Tags and Feeds

Adding tagging support was a tiny bit more tricky. nanoc supports content tagging out-of-the-box through metadata and a simple helper, but I wanted to create tag pages (with feeds). Nothing too difficult though, it all came down to a simple Rake task:

<% highlight :ruby do %>
  task :tags do
    site = Nanoc3::Site.new('.')
    site.load_data
    dir = Pathname(Dir.pwd)/'content/tags'
    dir.rmtree if dir.exist?
    dir.mkpath
    tags = {}
    # Collect tag and page data
    site.items.each do |p|
      next unless p.attributes[:tags]
      p.attributes[:tags].each do |t|
        if tags[t]
          tags[t] = tags[t]+1
        else
          tags[t] = 1 
        end
      end
    end
    # Write pages
    tags.each_pair do |k, v|
      write_tag_page dir, k, v
      write_tag_feed_page dir, k, 'RSS'
      write_tag_feed_page dir, k, 'Atom'
    end
  end
<% end %>

Again, you can find all the other simple utility methods in my "utility file":http://github.com/h3rald/h3rald/tree/master/lib/utils.rb.
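The tag-counting loop at the heart of that task can be tried in isolation. The following sketch uses plain hashes in place of nanoc items (an assumption made so that it runs without a site), and @count_tags@ is a hypothetical name.

```ruby
# Counts how many items use each tag. Items without tags are skipped.
def count_tags(items)
  items.each_with_object(Hash.new(0)) do |item, counts|
    Array(item[:tags]).each { |tag| counts[tag] += 1 }
  end
end

# Plain hashes stand in for nanoc items here
items = [
  { :title => 'First',  :tags => ['ruby', 'website'] },
  { :title => 'Second', :tags => ['ruby'] },
  { :title => 'Untagged page' }
]

count_tags(items).each_pair do |tag, count|
  puts "#{tag}: #{count}"
end
```

Using a hash with a default value of zero avoids the explicit check for tags that have not been seen yet.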

When it came to feeds, I decided to create a new method for the Blogging helper to create RSS feeds, although nanoc does come with an Atom feed generator:

<% highlight :ruby do %>
  def rss_feed(params={})
    require 'builder'
    require 'time'
    prepare_feed params
    # Create builder
    buffer = ''
    xml = Builder::XmlMarkup.new(:target => buffer, :indent => 2)
    # Build feed
    xml.instruct!
    xml.rss(:version => '2.0') do
      xml.channel do
        xml.title @item[:title]
        xml.language 'en-us'
        xml.lastBuildDate @item[:last][:date].rfc822
        xml.ttl '40'
        xml.link @site.config[:base_url]
        xml.description
        @item[:articles].each do |a|
          xml.item do
            xml.title a[:title]
            xml.description @item[:content_proc].call(a)
            xml.pubDate a[:date].rfc822
            xml.guid url_for(a)
            xml.link url_for(a)
            xml.author @site.config[:author_email]
            xml.comments url_for(a)+'#comments'
            a[:tags].each do |t|
              xml.category t
            end
          end
        end
      end
    end
    # Return the generated XML
    buffer
  end
<% end %>

Nothing too daunting, once you get used to Ruby's XML builder. I followed a similar approach for my "monthly archives":/archives.

h4. 3rd-party services

Finally, the interactive bits. I basically turned to third-party services and a bit of jQuery for everything which required user interaction or pulling data from other web sites. Here's a list of services and APIs I currently use:

* "IntenseDebate":http://intensedebate.com/, for comments.
* "Google AJAX Search API":http://code.google.com/apis/ajaxsearch/web.html for internal site-wide search.
* "Twitter JSON API":http://apiwiki.twitter.com/ to fetch tweets.
* "Delicious JSON API":http://delicious.com/help/json to fetch delicious bookmarks.
* "BackType JSON API":http://www.backtype.com/developers to fetch comments from other sites.
* "GitHub JSON API":http://develop.github.com/ to fetch GitHub commits for most of my "projects":/projects

If you want to know how I integrated them, check out my "/js folder":http://github.com/h3rald/h3rald/tree/master/content/js; it was very simple, really.

h3. Conclusion

I am very happy that I switched to nanoc. It didn't take me long, and I spent most of the time on non-nanoc issues (brushing up on jQuery, CSS, graphics, etc.). Of course, knowing the Ruby programming language helps, and if you're not comfortable with hacking your way around a little bit, then maybe it's not for you.

!</img/pictures/nanoc-compile.png!

Personally, I've been waiting for something like nanoc for a long time: its simple yet powerful architecture lets you do virtually anything with it. For the first time in a long time, I feel like I'm in complete control of my web site: I know every bit of it, and if I want to change the way it works or looks I only have to touch a few files.

nanoc's metadata is mindblowing for its simplicity and power: although you're not dealing with a database, you can query your content in the easiest ways possible. Whenever I needed a way to easily access pages, filter them, add extra logic to them, I just added metadata. If you forget something, you don't have to change your database tables, create new relationships or anything of the sort, you simply add metadata to pages.
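Here is a small sketch of what "querying" metadata amounts to in practice. It uses plain hashes in place of nanoc items so that it runs on its own, and the @latest_articles@ method and its parameters are hypothetical.

```ruby
# Returns the most recent articles carrying a given tag.
# No database involved: just select, sort and slice.
def latest_articles(items, tag, limit = 5)
  items.select { |i| i[:type] == 'article' && Array(i[:tags]).include?(tag) }
       .sort_by { |i| i[:date] }
       .reverse
       .first(limit)
end

# Plain hashes stand in for nanoc items here
items = [
  { :title => 'Old post',  :type => 'article', :tags => ['ruby'],    :date => Time.utc(2008, 1, 10) },
  { :title => 'New post',  :type => 'article', :tags => ['ruby'],    :date => Time.utc(2009, 9, 15) },
  { :title => 'Off topic', :type => 'article', :tags => ['writing'], :date => Time.utc(2009, 5, 1) },
  { :title => 'About',     :type => 'page' }
]

latest_articles(items, 'ruby').each { |a| puts a[:title] }
```

If a new kind of query is needed, it is usually enough to add one more key to the pages' metadata and one more condition to the block.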

Be warned that tweaking nanoc gets addictive very quickly: you soon end up creating silly little tasks for making things just the way you want. For me, adding a new article to my blog now just means this:

<% highlight :text do %>
$ rake site:article name=take-back-your-site-with-nanoc
$ vim content/articles/take-back-your-site-with-nanoc
... write & close the file ...
$ nanoc3 compile
<% end %>

...Exactly what I need. Nothing more, nothing less.