MarkItUp Markdown footnote button

Here’s a simple addition to markItUp, that will prompt the user for a footnote number, a footnote text and then insert the number after the selection and the footnote text at the end of the full text. As handy extra feature, the plugin will save the previous entered footnote and not ask for further numbers during the session, but instead increment the number each time the button is pressed.

Add this to your code to set.js inside your settings object:

{name:'Insert footnote',
	beforeInsert: function(h) {
		instructions = "Type in the number of the footnote:)";
		if (!h.last_footnote) {
			h.last_footnote = prompt(instructions);
		} else {
			if (!isNaN(h.last_footnote))
				h.last_footnote = parseInt(h.last_footnote) +1;
			else
				h.last_footnote = prompt(instructions);
		}
		h.footnote_content = prompt("Type in the text of the footnote:");
	},
	closeWith: function(h) {
		return '[^' + h.last_footnote + ']';
	},
	afterInsert: function(h) {
		h.textarea.value += '[^' + h.last_footnote + ']: ' +
			h.footnote_content;
	}},

And put this icon in the set’s images folder: footnote

After that be sure to modify style.css so it says something like (if 14 was the placement of the footnote button):

.markItUp .markItUpButton14 a	{
	background-image:url(images/footnote.png);
}
Posted in Web | Tagged , | Leave a comment

Django localized date template filter

UPDATE! This is going to be redundant in Django 1.2, in which you can add DATE_FORMAT to your django.po files.

I’ve often been frustrated that using settings.DATE_FORMAT does not give a localized date. Granted that the name of a month may be localized, but the format string does not change. So let’s start out by modifying settings.py. We wrap our default date format in a ugettext so the makemessages command will detect it, and we need to make it a dummy function, because the i18n library cannot be imported in settings.py due to circularity (it depends on settings.py).

ugettext = lambda s: s
DATE_FORMAT = ugettext('N j, Y')

Run compilemessages and type in your localized date formats. Now we need a template filter that uses a localized format for calling the Django date format function. This is really simple:

from django.template.defaultfilters import stringfilter
from django.utils import dateformat
from django.utils.translation import ugettext
from django.conf import settings
 
@register.filter()
def localdate(value):
    """Format date with localized date format"""
    format = ugettext(settings.DATE_FORMAT)
    return dateformat.format(value, format)

And done. Using the filter is straight forward:


Date: {{ my_date|localdate }}

Posted in Django, Python | Tagged , , | Leave a comment

Presenting: django-simple-wiki

It was bothering me that all the wikis I tried, all had either errors, feature lacks, too many dependencies or were simply unmaintained. So I decided to create yet another one. Curiously, the third hit when googling ‘django wiki’ is Create a wiki in 20 minutes. Luckily that’s not really true, so all the PHP guys and MediaWiki can continue breathing. This took me several days.

Google Code project page

Demo website

Hierarchy and relations
First of all, as in the Trac wiki system, I chose to create a system for hierarchy, meaning that it’s possible to create an article and then create sub-articles. The hierarchy does not support multiple inheritance, because it needs to be basis for the permission system. That’s where the relation system comes in place: All articles can contain symmetrical relations to any other articles in the hierarchy.

Parsing
Python and Django supports Markdown pretty much out of the box, so it’s an obvious choice to use this for parsing. The HTML features of normal Markdown have been removed, so all HTML is escaped in django-simple-wiki. And parsing is static, so every time a revision is created, the contents are passed and stored. This means that the contents of the article itself are not supposed to be dynamic. On the other hand, it is desirable to avoid parsing contents for every page hit. The parsing area of the application is only a few lines of code, and can be expanded if further parsing needs to be done, or someone wants to replace Markdown completely. For instance, if no parsing is done and HTML escaping is disabled, the wiki becomes a very simple CMS.

Curious issues
There are a few out standing problems:

  • Permission system is related to User entries in the Django auth system. But maybe this is too much of an annoyance, if the project already has groups setup in the existing auth system. On the other hand, other users would be bothered to setup both wiki groups and user groups, if the permission system was linked to user groups. And directly linking articles to user groups would require wiki-related groups to be created directly in the auth system.
  • Since relations are symmetrical, what should happen if one article is locked, but a user modifies it’s relations by deleting them from related articles?
  • Title editing: The title can only be created once, since it is coupled to the ‘slug’ of the article. A user can deliberately create a completely different title, which is fine, but should subsequent editing be allowed, which would add complexity to the revision system?
  • Article deletion: When an article is deleted from the backend it shouldn’t worry anyone. But if the feature is added to the frontend, we would want to handle maliciousness etc. But should we really store all these files and revisions? Should we alert admins, so they can do the final cleanup?
Posted in Django, Python | Tagged , , , , | 1 Comment

Getting started with Data Parallel Haskell

DPH is a work-in-progress, and not everything is yet easy to realize. But you should still prepare yourself as a Haskell programmer, because this is really where Haskell can set off into uncharted areas of popularity and innovation.

As an exercise for a programming course, I developed a spell checker, that runs in parallel and was motivated to write this short howto on getting to grips with DPH. First of all, DPH comes as a builtin library with GHC 6.10.1+, so it’s easy to get started. This howto is compatible with 6.10.1. But the DPH team has, as of March 2009, warned that significant fixes are added in the GHC trunk, and you should run the following command before installing (ie. recompiling GHC), so I advice you to do so for your own benefit.

$ ./darcs-all --dph get

Now, what you need to know is: DPH consists of two things, a compiler module that transforms your code for parallelism (‘vectorization’) and some special libraries that you should use in your code, as they are compatible with the vectorizer.

DPH can help you achieve two kinds of parallelism: Normal (unlifted) data parallelism and nested data parallelism. What the vectorizer works on is the latter, and for simple one-dimentional parallelism, you don’t need to know much, so let’s just get that done in a few lines:

module SpellCheckUnlifted where
 
import Types
import Trie
import Data.Array.Parallel.Unlifted
import GHC.PArr
 
type DocumentU = Array Word
 
spellCheck :: Dictionary -> Word -> Bool
spellCheck dic w = lookupTrie dic w
 
spellCheckU :: Dictionary -> Document -> [Word]
spellCheckU dic doc = 
  fromP ( filterP (\s -> not (spellCheck dic s)) (toP doc) )

As you can see, it’s fairly simple: Import Data.Array.Parallel.Unlifted and GHC.PArr. Then use functions toP and fromP to transform a list into a parallel array and back. The filterP function will execute functions in parallel on your array.

Download all source code of spell checker example.

Nested parallel example

The strong feature of DPH is nested parallelism, and you should think your problem through before engaging in this. In my approach, I start out with a non-parallel problem, and I think this is probably the classic example; we model our problems non-parallel and then consider how to transform them. My example is not suitable for nested parallelism, as it obviously has too little complexity and too much overhead for parallelism, but nonetheless there’s some interesting points to be gained:

  • The problem set (a list of words) is easily transformed
  • The sub-problems (looking up in a Trie data structure) is nested
  • The data types are non-trivial and need some convertion
{-# LANGUAGE PArr #-}
{-# OPTIONS -fvectorise #-}
module SpellCheckDPH where
 
import qualified Prelude
import Data.Array.Parallel.Prelude
import Data.Array.Parallel.Prelude.Word8
import qualified Data.Array.Parallel.Prelude.Int as I
 
type WordP = [:Word8:]
type DocumentP = [:WordP:]
type DictionaryP = [:TrieP:]
data TrieP = NodeP Word8 DictionaryP | LeafP Word8
 
isNode :: Word8 -> TrieP -> Bool
isNode w t = case t of
  (NodeP w' dic) -> w == w'
  _ -> False
 
isLeaf :: Word8 -> TrieP -> Bool
isLeaf w t = case t of
  (LeafP w') -> w == w'
  _ -> False
 
lookupTrieP :: DictionaryP -> I.Int -> WordP -> Bool
lookupTrieP dic i w =
    let len = lengthP w
        in if (I.<) ((I.+) i 1) len
              then let node = filterP (isNode (w!:i)) dic
                       in if (I.==) (lengthP node) 0
                             then False
                             else case node!:0 of
                                    (NodeP _ d) -> lookupTrieP d ((I.+) i 1) w
                                    _ -> False
              else (I.==) (lengthP (filterP (isLeaf (w!:i)) dic)) 1
 
spellCheckP :: DictionaryP -> DocumentP -> DocumentP
spellCheckP dic doc = filterP (\s -> not (lookupTrieP dic 0 s)) doc

Notice the first two lines. These tell GHC to add some syntax to handle the parallel arrays ([:a:]) and to let the vectorizer handle the code. Secondly, we import the DPH prelude and hide the normal Prelude. You should keep normal GHC library functions separate from DPH! spellCheckP does not convert a normal datatype into a parallel array, because we don’t want anything that cannot be vectorized in this module.

lookupTrieP is not a pretty function, but it has to work with arrays and a limited number of utility functions in the DPH Prelude. And BEWARE! Not all functions in GHC.Parr have been implemented in the DPH Prelude, which is because they simply don’t work with nested parallel arrays, yet, so you simly shouldn’t import this module.

You’ll also notice how I avoided polymorphism. Since DPH only supports Ints, Doubles and Word8 (unsigned 8-bit int), it isn’t of any use. When you transform your datatypes into nested parallel types, you will loose generality. But you can still use data types with different constructors etc. And that’s the power of DPH!

Now we only have to transform our nested data structure and problem into parallel arrays. This is done in a seperate module (SpellCheck.hs in the example).

c2w :: Char -> W.Word8
c2w c = fromIntegral (ord c)
w2c :: W.Word8 -> Char
w2c w = chr (fromIntegral w)
wArr2Str :: [W.Word8] -> String
wArr2Str wArr = map w2c wArr
 
t2P :: Trie Char -> TrieP
t2P (Leaf c) = LeafP (c2w c)
t2P (Node c d) = NodeP (c2w c) (dic2P d)
 
dic2P :: Dictionary -> DictionaryP
dic2P ds = toP $ map t2P ds
 
doc2P :: Document -> DocumentP
doc2P doc = let doc' = filter (\s -> not $ s=="") doc
              in toP $ map (toP.(\s -> map c2w s)) doc'

Download all source code of spell checker example.

Compiling and testing

When you compile, you have to use ‘-fdph-par’ for parallel libraries and ‘-fpar-seq’ for a similar, but sequential library. Also, you have to add ‘-Odph’ to instruct GHC to use the parallel compiler module, ‘-threaded’ because we’re using threads and ‘-fcpr-off’ to turn off the CPR optimizer (a trade-off, but I don’t know how severe).

$ ghc --make -fdph-par -Odph -threaded SpellCheck

Now, let’s test it and see the results. To run with to cores, we add ‘+RTS -N2 -RTS’, and to get some diagnostic output, we add ‘+RTS -N2 -sstderr -RTS’. My results are not very good, because I only have one core :)

$ ./SpellCheck +RTS -N2 -sstderr -RTS /usr/share/dict/words document.txt 1
   1,024,145,348 bytes allocated in the heap
     580,407,632 bytes copied during GC
      33,537,816 bytes maximum residency (16 sample(s))
      15,817,400 bytes maximum slop
              97 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0:  1945 collections,    25 parallel,  3.92s,  4.47s elapsed
  Generation 1:    16 collections,    13 parallel,  1.98s,  2.38s elapsed

  Parallel GC work balance: 1.57 (79357576 / 50670026, ideal 2)

  Task  0 (worker) :  MUT time:   7.88s  (  2.43s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  1 (worker) :  MUT time:   7.88s  (  2.47s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  Task  2 (worker) :  MUT time:   1.99s  (  2.47s elapsed)
                      GC  time:   5.90s  (  6.85s elapsed)

  Task  3 (worker) :  MUT time:   7.88s  (  2.47s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.99s  (  2.47s elapsed)
  GC    time    5.90s  (  6.85s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    7.88s  (  9.32s elapsed)

  %GC time      74.8%  (73.5% elapsed)

  Alloc rate    515,130,747 bytes per MUT second

  Productivity  25.2% of total user, 21.3% of total elapsed
Posted in Computers | Tagged , | Leave a comment

GTK and scrolling without scrollbars

There’s still some work to do on the full screen plugin for Rhythmbox, but the current version is very usable indeed. The latest addition is scrolling by hovering the track list.

I changed the display from a normal fixed table with 3 tracks to a gtk.Layout with a gtk.VBox containing n tracks. In a Layout widget it’s possible to freely place and move child widgets, so by detecting motion events on the edges of the Layout you can emulate scrolling by moving a child widget accordingly with Layout.move(widget, x, y). Unfortunately I get a rather nasty blinking effect when scrolling too fast, and I don’t have an explanation for this, so I’d be glad to hear from anyone who can help.

def track_layout_scroll(self, widget, event):
    time_step = 10 #msecs
    ycoord = event.y
    accel_factor = 10 #how many pixels to scroll at the edge
    edge_distance = 100.0 #pixels
    layout_size = widget.get_size()
    top_dist = edge_distance - ycoord
    bot_dist = edge_distance - layout_size[1] + ycoord
 
    if top_dist > 0:
        accel = -1 - (top_dist / edge_distance) * accel_factor
    elif bot_dist > 0:
        accel =  1 + (bot_dist / edge_distance) * accel_factor
    else:
        accel = 0.0
 
    if self.scroll_event_id:
        gobject.source_remove(self.scroll_event_id)
 
    if not accel == 0.0:
        self.scroll_event_id = gobject.timeout_add(time_step, self.do_scrolling, accel, widget)
 
def do_scrolling(self, accel, layout_widget):
    step = int(1*accel)
    if step == 0:
        return
    vbox_size = self.vbox.size_request()
    layout_size = layout_widget.get_size()
    scroll_height = vbox_size[1]-layout_size[1]
    if self.scroll_y + step < 0:
        self.scroll_y = 0
    elif self.scroll_y + step > scroll_height:
        self.scroll_y = scroll_height
    else:
        self.scroll_y += step
 
    self.track_layout.move(self.vbox, 0, -self.scroll_y)
    return self.scroll_y > 0 and self.scroll_y < scroll_height

The code is related to my Rhythmbox plugin, but I’m sure you get the idea. Also please note, that track_layout_scroll has to receive notify_motion_event from the Layout widget, and that you have to set a size for the Layout widget with set_size().

Update: By playing around with time_step (lowering it to be more exact) and slowing down the acceleration, I managed to almost make the white flashes disappear.

Posted in Python | Tagged , , , , , | Leave a comment