TaskPaper (BNF) Grammar

Here’s my attempt at an annotated TaskPaper grammar. Before I turn it into a ProseMirror schema, can I get some feedback on:

  • Anything I missed or got wrong
  • Better terminology to match TaskPaper convention. (Specifically, I didn’t know if there was a better term for “content.”)

Interesting things I discovered while writing the grammar:

  • ‘:’ is a valid project name (however, it messes up parsing of bodyContentString)
  • “@” and “@()” are valid tags. (And you can even retrieve the value of 'data-' via the API.)
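
Both observations follow from the terminal patterns in the grammar below, which allow zero characters before the colon and zero characters after the @. Here is a rough Python sketch to check this. The helper names `is_project_name` and `parse_tag` are made up for illustration, and the patterns assume that tag names stop at whitespace and that a backslash escapes a closing parenthesis. It is not how TaskPaper itself parses.

```python
import re

# Patterns adapted from the terminal tokens in the grammar (assumptions noted above).
PROJECT_NAME = re.compile(r"[^:]*:")
# '@', a possibly empty name, then an optional parenthesised value in which
# a backslash-escaped ')' does not end the value.
TAG = re.compile(r"@([^@()\s]*)(?:\(((?:\\\)|[^)])*)\))?")


def is_project_name(text: str) -> bool:
    """True if the whole string is a project name: non-colon characters, then ':'."""
    return PROJECT_NAME.fullmatch(text) is not None


def parse_tag(text: str):
    """Return (name, value) if the whole string is one tag, otherwise None.

    value is None when the tag has no parentheses and '' when they are empty.
    """
    m = TAG.fullmatch(text)
    if m is None:
        return None
    return m.group(1), m.group(2)
```

With these definitions, `is_project_name(":")` is true, and both `parse_tag("@")` and `parse_tag("@()")` return a tag with an empty name; they differ only in whether the value is missing or empty.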

Thanks!

///--- GRAMMAR ---///
// A TaskPaper document is a series of items.
taskpaperDocument : {item} ;

// An item starts with optional indents followed by a project, task, or note.
// An item ends with a newline preceded by optional white space.
item : {INDENT} (project | task | note) [WHITESPACE] NEWLINE ;

// A project starts with a project name (that ends in a colon).
// A project line may optionally end with tags.
project: PROJECT_NAME {tag} ;

// A task starts with a dash and white space (any indents are consumed by item).
// Any text after that forms the content of the task.
task : TASK_OPEN WHITESPACE content ;

// A note consists of content alone (any indents are consumed by item).
note: content ;

// A tag starts with whitespace followed by the @ symbol and tag name.
// A tag may end with an optional value enclosed in parentheses.
tag: WHITESPACE TAG_NAME [VALUE_OPEN [VALUE] VALUE_CLOSE] ;

// Content is a series of tags and/or text.
content: {tag | TEXT} ;



///--- TERMINAL TOKENS ---///
// Project names are 0 or more non-colon characters terminated by a colon.
PROJECT_NAME: /[^:]*:/ ;

// Tasks are indicated by starting with the special dash character.
TASK_OPEN: '-' ;

// Tag names are the @ symbol followed by 0 or more characters that are not whitespace, @, or parentheses.
TAG_NAME: /@[^@()\s]*/ ;

// Parentheses enclose the value of a tag.
VALUE_OPEN:  '(' ;
VALUE_CLOSE: ')' ;

// The value can be anything except for the value close.
// If the VALUE_CLOSE is escaped, it is treated as part of the value.
VALUE: /(\\\)|[^)])*/ ;

// TaskPaper only indents with tabs.
INDENT: '\t' ;

TEXT: /./ ;
WHITESPACE: /[ \t]+/ ;
NEWLINE: '\n' ;
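
As a sanity check before building the schema, the grammar can be exercised with a small line parser. This is only a rough sketch in Python under some assumptions of mine: whitespace inside a line means spaces or tabs, tag names stop at whitespace, a backslash escapes a closing parenthesis inside a value, and a tag only counts when it starts the content or follows whitespace. `Item` and `parse_item` are hypothetical names, and none of this reflects how TaskPaper parses internally.

```python
import re
from dataclasses import dataclass

# One tag: '@', a name, and an optional parenthesised value in which
# a backslash-escaped ')' does not end the value.
# The lookbehind requires the tag to start the text or follow whitespace.
TAG = re.compile(r"(?<!\S)@([^@()\s]*)(?:\(((?:\\\)|[^)])*)\))?")
# Same tag shape without capture groups, for embedding in PROJECT.
TAG_NOCAP = r"@[^@()\s]*(?:\((?:\\\)|[^)])*\))?"
# project: PROJECT_NAME {tag}
PROJECT = re.compile(r"[^:]*:(?:[ \t]+" + TAG_NOCAP + r")*")
# task: TASK_OPEN WHITESPACE content
TASK = re.compile(r"-[ \t]+(.*)")


@dataclass
class Item:
    kind: str   # "project", "task", or "note"
    depth: int  # number of leading tabs
    text: str   # the line without indentation or trailing whitespace
    tags: dict  # tag name -> value (None when the tag has no parentheses)


def parse_item(line: str) -> Item:
    """Classify one line of a document according to the grammar above."""
    body = line.rstrip("\n")
    stripped = body.lstrip("\t")
    depth = len(body) - len(stripped)
    # Test for a task before trimming, so a bare "- " still counts as a task.
    is_task = TASK.fullmatch(stripped) is not None
    text = stripped.rstrip()
    if is_task:
        kind = "task"
    elif PROJECT.fullmatch(text):
        kind = "project"
    else:
        kind = "note"
    tags = {m.group(1): m.group(2) for m in TAG.finditer(text)}
    return Item(kind, depth, text, tags)
```

For example, `parse_item("\t- buy milk @due(2024-01-01) @done\n")` yields a task at depth 1 with the tags `due` and `done`, while `parse_item(":\n")` is classified as a project. Because the sketch follows the grammar literally, a line such as `a:b:` falls through to a note, since PROJECT_NAME does not allow a colon inside the name.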