overstimulate

Simple Search

Sun, 01 Oct 2006 javascript rake search comments

Since converting to a rake-powered site, I've missed having search. To that end, I've started experimenting with implementing search on the client side.

My (naive) implementation builds an index that maps each word to the list of articles containing it, stored as JSON. Each number is a position in the articles array:


  var words = {
    'hello': [0, 1],
    'world': [1]
  };
  var articles = [
    ['/article/first', 'Title of post with hello world in it'],
    ['/article/second', 'Title of post about world of svg']
  ];

You can see my articles.json, which I generate with a rake task:


  require 'yaml'
  require 'erb'

  file 'html/search/articles.json' => ['html/articles/index.html', 'html/search', 'theme/articles.json'] do
    # Load every article, using the file name (minus .yml) as its permalink.
    articles = FileList['data/articles/*.yml'].collect { |fn|
      data = YAML.load(File.read(fn))
      data['permalink'] = File.basename(fn, '.yml')
      data
    }
    # Newest articles first.
    articles = articles.sort_by { |d| d['created_at'] }.reverse

    # Map each word to the articles whose body contains it.
    all_words = {}
    articles.each do |article|
      body = article['body']
      # Note: gsub! edits article['body'] in place.
      body.gsub!(/<[^>]*>/m, ' ')       # strip HTML tags
      body.gsub!(/[^a-zA-Z0-9]/, ' ')   # everything else non-alphanumeric becomes a space
      words = body.downcase.split(' ')
      words.uniq!
      words.each do |w|
        all_words[w] ||= []
        all_words[w] << article
      end
    end

    # The ERB template sees articles and all_words through binding.
    article_json_template = ERB.new(File.read('theme/articles.json'))
    File.open('html/search/articles.json', 'w') { |f| f.write article_json_template.result(binding) }
  end
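
For comparison, here is a rough JavaScript sketch of the same tokenizing and indexing steps. It is not part of the site: tokenize, buildIndex and demoIndex are hypothetical names, and unlike the rake task it stores article positions directly rather than going through a template.

```javascript
// Hypothetical helper: split an HTML body into unique lowercase words,
// mirroring the rake task (strip tags, turn non-alphanumerics into spaces).
function tokenize(body) {
  const text = body
    .replace(/<[^>]*>/g, ' ')
    .replace(/[^a-zA-Z0-9]/g, ' ')
    .toLowerCase();
  // A Set keeps the first occurrence of each word and drops repeats.
  return Array.from(new Set(text.split(/\s+/).filter(Boolean)));
}

// Hypothetical helper: map each word to the positions of the bodies containing it.
function buildIndex(bodies) {
  // A prototype-less object has no inherited keys such as "constructor".
  const index = Object.create(null);
  bodies.forEach((body, position) => {
    for (const word of tokenize(body)) {
      (index[word] = index[word] || []).push(position);
    }
  });
  return index;
}

const demoIndex = buildIndex(['<p>Hello, world!</p>', '<p>The world of SVG</p>']);
// demoIndex.world lists both articles; demoIndex.hello lists only the first.
```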

Then I can provide a simple "live search" for a single word via:


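  function search(word) {
    // Collect matching article indexes; object keys de-duplicate them.
    var matches = {};
    // An own-property check avoids false hits on inherited names such as "constructor".
    if (Object.prototype.hasOwnProperty.call(words, word)) {
      for (var i = 0; i < words[word].length; i++) {
        matches[words[word][i]] = true;
      }
    }

    var results = '';

    for (var article_idx in matches) {
      results += '<a href="' + articles[article_idx][0] + '">' + articles[article_idx][1] + '</a><br />';
    }
    document.getElementById('results').innerHTML = results;
  }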

A somewhat more complicated version is in use on my search page (view source). It searches whenever the input field changes and supports multiple words.
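
One way multi-word search could work, as a minimal sketch and not the code on the search page: look up each word and keep only the articles that appear in every list. The names searchAll and sampleWords are made up for illustration, and treating the query as "all words must match" is an assumption.

```javascript
// Sample index in the same shape as above: word -> list of article positions.
const sampleWords = { hello: [0, 1], world: [1], svg: [1, 2] };

// Hypothetical helper: positions of articles that contain every word in the query.
function searchAll(query, index) {
  const terms = query.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  if (terms.length === 0) return [];
  let result = null;
  for (const term of terms) {
    // Words missing from the index contribute an empty list.
    const postings = Object.prototype.hasOwnProperty.call(index, term) ? index[term] : [];
    // Start from the first word's list, then intersect with each later one.
    result = result === null
      ? postings.slice()
      : result.filter((position) => postings.includes(position));
  }
  return result;
}

searchAll('hello world', sampleWords); // [1]
```

To wire this to an input field, the result positions would be rendered the same way the single-word search renders its matches.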

It would be nice to add substring matching, support for boolean operations on multiple words, and a compact representation (bloom filters?). Anyone else doing search on the client side?
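
Substring matching could be sketched by scanning every word in the index and merging the lists of those that contain the typed fragment. This is only an illustration under that assumption: searchSubstring and sampleIndex are hypothetical names, and the scan is linear in the number of distinct words, so it may be slow for a large index.

```javascript
// Sample index: word -> list of article positions.
const sampleIndex = { hello: [0, 1], help: [2], world: [1] };

// Hypothetical helper: positions of articles containing any word
// that includes the given fragment.
function searchSubstring(fragment, index) {
  const needle = fragment.toLowerCase();
  if (needle === '') return [];
  const matches = new Set();
  for (const word of Object.keys(index)) {
    if (word.includes(needle)) {
      index[word].forEach((position) => matches.add(position));
    }
  }
  // Sort numerically so the output order does not depend on key order.
  return Array.from(matches).sort((a, b) => a - b);
}

searchSubstring('hel', sampleIndex); // [0, 1, 2]
```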


Jesse Andrews
open source, web browsers, web services, web sites & folk dancing. contacts/sites