Highlight documentation

Plug-Ins

The plug-in interface lets you modify syntax parsing and colouring. A plug-in can extend the output's header or footer, and it can attach additional information to recognized syntax elements.
A common task is to define a new set of syntax keywords or to add items to existing keyword groups. More advanced plug-ins may add tooltips based on ctags input, or JavaScript source code folding, to HTML output.

Script structure

The following script contains the basic plug-in structure which has no effect on the output:

Description="Boilerplate plugin"

-- optional parameter: syntax description
function syntaxUpdate(desc)
end

-- optional parameter: theme description
function themeUpdate(desc)
end

-- optional parameter: syntax description
function formatUpdate(desc)
end

Plugins={
  { Type="theme", Chunk=themeUpdate },
  { Type="lang", Chunk=syntaxUpdate },
  { Type="format", Chunk=formatUpdate },
}

The first line contains a description which gives a short summary of the plug-in's effects.

The next lines contain function definitions: the syntaxUpdate function is applied to syntax definition scripts (*.lang), whereas the themeUpdate function is applied to colour themes (*.theme).
The formatUpdate function is not executed in a syntax or theme Lua state. It only returns strings which override or enhance the document header and footer.
The function names are not mandatory; any name may be used as long as the Plugins array refers to it.
The desc parameter contains the description of the syntax definition or colour theme. This can be used to restrict modifications to certain kinds of input (e.g. only C code should be enhanced with syslog keywords).

The Plugins array connects the functions to the Lua states which are created when highlight loads a lang or theme script. In this example, themeUpdate is connected to the theme state and syntaxUpdate to the lang state, while formatUpdate is registered with the "format" type.
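
As an illustration of the desc check described above, the following sketch adds a few syslog function names as a new keyword group, but only for C and C++ input. It is a minimal sketch under assumptions: the entry layout { Id=..., List={...} } mirrors what *.lang files typically use, the group ID 5 is assumed to be unused by the syntax, and the description string "C and C++" is taken from the complete example later in this document.

```lua
Description="Add syslog function names as keywords to C and C++ (sketch)"

function syntaxUpdate(desc)
  -- restrict the modification to C and C++ input
  if desc ~= "C and C++" then
    return
  end

  -- assumption: a keyword group is a table entry of the form
  -- { Id=<group ID>, List={ <keywords> } }, as in *.lang files
  -- assumption: group ID 5 is not yet used by the syntax definition
  Keywords[#Keywords+1] = { Id=5, List={ "openlog", "syslog", "closelog" } }
end

Plugins={
  { Type="lang", Chunk=syntaxUpdate },
}
```

For any other syntax the function returns early and leaves the Keywords table untouched.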

Syntax definition elements

The following list includes all items which influence the syntax highlighting:

Syntax definition items:

Comments: table
Description: string
Digits: string
EnableIndentation: boolean
Identifiers: string
IgnoreCase: boolean
Keywords: table
NestedSections: table
Operators: string
PreProcessor: table
Strings: table

Document modification items:

HeaderInjection: string
FooterInjection: string

Read only (internal highlighting states):

HL_STANDARD: number
HL_BLOCK_COMMENT: number
HL_BLOCK_COMMENT_END: number
HL_EMBEDDED_CODE_BEGIN: number
HL_EMBEDDED_CODE_END: number
HL_ESC_SEQ: number
HL_ESC_SEQ_END: number
HL_IDENTIFIER_BEGIN: number
HL_IDENTIFIER_END: number
HL_KEYWORD: number
HL_KEYWORD_END: number
HL_LINENUMBER: number
HL_LINE_COMMENT: number
HL_LINE_COMMENT_END: number
HL_NUMBER: number
HL_OPERATOR: number
HL_OPERATOR_END: number
HL_PREPROC: number
HL_PREPROC_END: number
HL_PREPROC_STRING: number
HL_STRING: number
HL_STRING_END: number
HL_UNKNOWN: number
HL_REJECT: number

Read only (other):

HL_PLUGIN_PARAM: string (set with --plug-in-param)
HL_LANG_DIR: string (path of language definition directory)


Read only (output document format):

HL_OUTPUT: number (selected format)
HL_FORMAT_HTML: number
HL_FORMAT_XHTML: number
HL_FORMAT_TEX: number
HL_FORMAT_LATEX: number
HL_FORMAT_RTF: number
HL_FORMAT_ANSI: number
HL_FORMAT_XTERM256: number
HL_FORMAT_SVG: number
HL_FORMAT_BBCODE: number
HL_FORMAT_PANGO: number
HL_FORMAT_ODT: number

Functions:

AddKeyword: function
OnStateChange: function
Decorate: function
DecorateLineBegin: function
DecorateLineEnd: function

IMPORTANT: Functions will only be executed if they are defined as local functions within the "lang" chunk function referenced in the Plugins array. They will be ignored when defined elsewhere in the script.
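
The document modification items and HL_PLUGIN_PARAM from the list above can be combined. The following is only a sketch: it assumes that assigning a string to HeaderInjection inside the lang chunk is sufficient, and that HL_PLUGIN_PARAM is either unset or empty when no --plug-in-param option was given. The fallback text is a hypothetical placeholder.

```lua
Description="Inject a comment into the HTML header (sketch)"

function syntaxUpdate(desc)
  -- only HTML and XHTML output understand the injected comment
  if HL_OUTPUT ~= HL_FORMAT_HTML and HL_OUTPUT ~= HL_FORMAT_XHTML then
    return
  end

  -- assumption: the parameter may be nil or an empty string if not set
  local param = HL_PLUGIN_PARAM
  if param == nil or param == "" then
    param = "no parameter given"   -- hypothetical fallback text
  end

  -- "--" must not appear inside an HTML comment, so runs of dashes
  -- are collapsed; highlight itself does not validate the string
  local note = (param:gsub("%-%-+", "-"))

  HeaderInjection = "<!-- " .. note .. " -->"
end

Plugins={
  { Type="lang", Chunk=syntaxUpdate },
}
```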

Function OnStateChange

This function is a hook which is called if an internal state changes (e.g. from HL_STANDARD to HL_KEYWORD if a keyword is found). It can be used to alter the new state or to manipulate syntax elements.

OnStateChange(oldState, newState, token, kwGroupID)

  Hook Event: Highlighting parser state change
  Parameters: oldState:  old state
              newState:  intended new state
              token:     the current token which triggered
                         the new state
              kwGroupID: if newState is HL_KEYWORD, the parameter
                         contains the keyword group ID
  Returns:    Correct state to continue OR HL_REJECT

Return HL_REJECT if the recognized token and state should be discarded. In that case only the first character of token will be outputted, using the state defined as oldState.

Examples:

function OnStateChange(oldState, newState, token, kwgroup)
   if newState==HL_KEYWORD and kwgroup==5 then
      AddKeyword(token, 5)
   end
   return newState
end

This function adds the current token to the internal keyword list if the keyword belongs to keyword group 5. If keyword group 5 is defined by a regex, this token will be recognized later as a keyword even if the regex does not match.

function OnStateChange(oldState, newState, token)
   if token=="]]" and oldState==HL_STRING and newState==HL_BLOCK_COMMENT_END then
      return HL_STRING_END
   end
   return newState
end

This function resolves a Lua parsing issue with the "]]" close delimiter which ends both comments and strings.
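
The HL_REJECT return value can be sketched in the same way. The rule used here is purely hypothetical: it assumes that keyword group 4 is defined by a regex which produces false positives for tokens shorter than three characters.

```lua
Description="Discard very short matches of keyword group 4 (sketch)"

function syntaxUpdate(desc)
  function OnStateChange(oldState, newState, token, kwGroupID)
    -- hypothetical rule: short tokens of group 4 are false positives
    if newState == HL_KEYWORD and kwGroupID == 4 and #token < 3 then
      -- discard token and state; per the description above, only the
      -- first character is outputted using oldState
      return HL_REJECT
    end
    return newState
  end
end

Plugins={
  { Type="lang", Chunk=syntaxUpdate },
}
```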

Function AddKeyword

This function will add a keyword to one of the internal keyword lists. It has no effect if the keyword was added before. Keywords added with AddKeyword will remain active for all files of the same syntax if highlight is in batch mode.

AddKeyword(keyword, kwGroupID)

  Parameters: keyword:   string which should be added to a keyword list
              kwGroupID: keyword group ID of the keyword
  Returns:    true if successful

AddPersistentState functions

AddPersistentState stores keywords and keyword ranges in a plug-in file. If the syntax contains elements which depend on a context, you can highlight them even though this context is lost in other input files or code sections.

The invocation of AddPersistentState will cause highlight to save a plug-in as a temporary file and to parse input files again using this plug-in if necessary.

AddPersistentState(keyword, kwGroupID)

  Parameters: keyword:   string which should be added to a keyword list
              kwGroupID: keyword group ID of the keyword
  Returns:    true if successful

AddPersistentState(lineno, kwGroupID, column, length)

  Parameters: lineno:    line number
              kwGroupID: the keyword group ID
              column:    column
              length:    length of the keyword
  Returns:    true if successful
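
The keyword variant can presumably be used like AddKeyword in the OnStateChange example above. The following is a sketch under that assumption; keyword group 5 is again assumed to be defined by a regex in the syntax definition.

```lua
Description="Persist keywords of group 5 across input files (sketch)"

function syntaxUpdate(desc)
  function OnStateChange(oldState, newState, token, kwGroupID)
    -- assumption: group 5 is defined by a regex; each match is
    -- stored so that it is also recognized where the regex fails
    if newState == HL_KEYWORD and kwGroupID == 5 then
      AddPersistentState(token, 5)
    end
    return newState
  end
end

Plugins={
  { Type="lang", Chunk=syntaxUpdate },
}
```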

Decorate functions

The Decorate function is a hook which is called if a syntax token has been identified. It can be used to alter the token or to add additional text in the target output format (e.g. hyperlinks).

Decorate(token, state, kwGroupID)

  Hook Event: Token identification
  Parameters: token:     the current token
              state:     the current state
              kwGroupID: if state is HL_KEYWORD, the parameter
                         contains the keyword group ID
  Returns:    Altered token string or nothing if original token should be
              outputted

The functions DecorateLineBegin and DecorateLineEnd are called if a new line starts or ends. They can be used to add special formatting to lines of code.

DecorateLineBegin(lineNumber)

  Hook Event: output of a new line
  Parameters: lineNumber: the current line number
  Returns:    A string to be prepended to a new line (or nothing)

DecorateLineEnd(lineNumber)

  Hook Event: output of a line ending
  Parameters: lineNumber: the current line number
  Returns:    A string to be appended to a line (or nothing)

IMPORTANT: The return value of Decorate functions will be embedded in the formatting tags of the output format. The return values are not modified or validated by highlight.

Example:

function Decorate(token, state)
  if (state == HL_KEYWORD) then
    return  string.upper(token)
  end
end

This function converts all keywords to upper case.
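
The line hooks can be sketched in a similar way. The example below wraps every output line in a span element carrying the line number, loosely following what the packaged code folding plug-in is described to do. The id prefix "line" is a hypothetical name, and the sketch assumes that HL_OUTPUT is already set when the lang chunk runs.

```lua
Description="Wrap each HTML output line in a span with an id (sketch)"

function syntaxUpdate(desc)
  -- span tags only make sense in HTML or XHTML output
  if HL_OUTPUT ~= HL_FORMAT_HTML and HL_OUTPUT ~= HL_FORMAT_XHTML then
    return
  end

  function DecorateLineBegin(lineNumber)
    -- "line" is a hypothetical id prefix
    return string.format('<span id="line%d">', lineNumber)
  end

  function DecorateLineEnd(lineNumber)
    -- close the span opened by DecorateLineBegin
    return '</span>'
  end
end

Plugins={
  { Type="lang", Chunk=syntaxUpdate },
}
```

Since highlight does not validate the returned strings, both hooks have to be kept in step so that every opened tag is closed again.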

Theme chunk elements

The following list includes all items which influence colour and font attributes:

Output formatting items:

Default: table
Canvas: table
Number: table
Escape: table
String: table
StringPreProc: table
BlockComment: table
PreProcessor: table
LineNum: table
Operator: table
LineComment: table
Keywords: table

Custom theme items:

Injections: table


Read only (output document format):

HL_OUTPUT: number
HL_FORMAT_HTML: number
HL_FORMAT_XHTML: number
HL_FORMAT_TEX: number
HL_FORMAT_LATEX: number
HL_FORMAT_RTF: number
HL_FORMAT_ANSI: number
HL_FORMAT_XTERM256: number
HL_FORMAT_TRUECOLOR: number
HL_FORMAT_SVG: number
HL_FORMAT_BBCODE: number
HL_FORMAT_ODT: number
HL_FORMAT_PANGO: number
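
A theme chunk can change the output formatting items directly. The following sketch renders comments in italics. It assumes that each style table accepts an Italic field next to its colour, as theme scripts usually define it; that field name is not confirmed by the list above.

```lua
Description="Render comments in italics (sketch)"

function themeUpdate(desc)
  -- assumption: style tables accept an Italic field as in *.theme files
  if BlockComment then
    BlockComment.Italic = true
  end
  if LineComment then
    LineComment.Italic = true
  end
end

Plugins={
  { Type="theme", Chunk=themeUpdate },
}
```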

Format chunk elements

Read only (output document format):

HL_OUTPUT: number
HL_FORMAT_HTML: number
HL_FORMAT_XHTML: number
HL_FORMAT_TEX: number
HL_FORMAT_LATEX: number
HL_FORMAT_RTF: number
HL_FORMAT_ANSI: number
HL_FORMAT_XTERM256: number
HL_FORMAT_TRUECOLOR: number
HL_FORMAT_SVG: number
HL_FORMAT_BBCODE: number
HL_FORMAT_PANGO: number
HL_FORMAT_ODT: number

Functions:

DocumentHeader: function
DocumentFooter: function

Function DocumentHeader

This function will be executed when a new document is generated. It can override or extend the document header section.

DocumentHeader(numFiles, currFile, options)

  Hook Event: output of a new file's header
  Parameters: numFiles: number of files to be generated
              currFile: current file counter
              options: Map of the following options
              options.title: document title
              options.encoding: document encoding
              options.fragment: true if header/footer should not be outputted
              options.font: font name
              options.fontsize: font size

  Returns:    [string, boolean?] (or nothing)
              The string contains the new document header
              The boolean value indicates if the string should replace the default
              header (false=default) or if it should be appended to it (true).

Function DocumentFooter

This function will be executed when a new document is generated. It can override or extend the document footer section.

DocumentFooter(numFiles, currFile, options)

  Hook Event: output of a new file's footer
  Parameters: see DocumentHeader

  Returns:    [string, boolean?] (or nothing)
              The string contains the new document footer
              The boolean value indicates if the string should replace the default
              footer (false=default) or if it should precede it (true).
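
Both hooks can be sketched together. The example adds an HTML comment after the default header and before the default footer and leaves fragment output untouched. It is a minimal sketch which assumes that the functions have to be defined inside the "format" chunk, analogous to the lang chunk functions; the comment texts are made up for illustration.

```lua
Description="Mark begin and end of each HTML document (sketch)"

function formatUpdate(desc)

  function DocumentHeader(numFiles, currFile, options)
    -- return nothing to keep the default header
    if HL_OUTPUT ~= HL_FORMAT_HTML or options.fragment then
      return
    end
    local note = string.format("<!-- begin %s (file %d of %d) -->",
                               options.title, currFile, numFiles)
    -- true: append the string to the default header
    return note, true
  end

  function DocumentFooter(numFiles, currFile, options)
    -- return nothing to keep the default footer
    if HL_OUTPUT ~= HL_FORMAT_HTML or options.fragment then
      return
    end
    -- true: the string precedes the default footer
    return string.format("<!-- end %s -->", options.title), true
  end
end

Plugins={
  { Type="format", Chunk=formatUpdate },
}
```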

Complete example

-- first add a description of what the plug-in does
Description="Add qtproject.org reference links to HTML, LaTeX or RTF output"

-- the syntaxUpdate function contains code related to syntax recognition
function syntaxUpdate(desc)

  -- if the current file is not a C or C++ file we exit
  if desc~="C and C++" then
     return
  end

  -- this function returns a qt-project reference link of the given token
  function getURL(token)
     -- generate the URL
     local url='http://qt-project.org/doc/qt-4.8/'..string.lower(token).. '.html'

     -- embed the URL in a hyperlink according to the output format
     -- first HTML, then LaTeX and RTF
     if (HL_OUTPUT== HL_FORMAT_HTML or HL_OUTPUT == HL_FORMAT_XHTML) then
        return '<a class="hl" target="new" href="'
               .. url .. '">'.. token .. '</a>'
     elseif (HL_OUTPUT == HL_FORMAT_LATEX) then
        return '\\href{'..url..'}{'..token..'}'
     elseif (HL_OUTPUT == HL_FORMAT_RTF) then
        return '{{\\field{\\*\\fldinst HYPERLINK "'
               ..url..'" }{\\fldrslt\\ul\\ulc0 '..token..'}}}'
     end
  end

  -- the Decorate function will be invoked for every recognized token
  function Decorate(token, state)

    -- we are only interested in keywords, preprocessor or default items
    if (state ~= HL_STANDARD and state ~= HL_KEYWORD and
        state ~=HL_PREPROC) then
      return
    end

    -- Qt keywords start with Q, followed by an upper and a lower case letter
    -- if this pattern applies to the token, we return the URL
    -- if we return nothing, the token is outputted as is
    if string.find(token, "Q%u%l")==1 then
      return getURL(token)
    end

  end
end

-- the themeUpdate function contains code related to the theme
function themeUpdate(desc)
  -- the Injections table can be used to add style information to the theme

  -- HTML: we add additional CSS style information to beautify hyperlinks,
  -- they should have the same color as their surrounding tags
  if (HL_OUTPUT == HL_FORMAT_HTML or HL_OUTPUT == HL_FORMAT_XHTML) then
    Injections[#Injections+1]=
      "a.hl, a.hl:visited {color:inherit;font-weight:inherit;}"

  -- LaTeX: hyperlinks require the hyperref package, so we add this here
  -- the colorlinks and pdfborderstyle options remove ugly boxes in the output
  elseif (HL_OUTPUT==HL_FORMAT_LATEX) then
    Injections[#Injections+1]=
      "\\usepackage[colorlinks=false, pdfborderstyle={/S/U/W 1}]{hyperref}"
  end
end

-- load the chunks
Plugins={
  { Type="lang", Chunk=syntaxUpdate },
  { Type="theme", Chunk=themeUpdate },
}

Selection of packaged plugins

bash_functions.lua

Description: Add function names to keyword list

Features: Adds new keyword group based on a regex, defines OnStateChange, uses AddKeyword

theme_invert.lua

Description: Invert colours of the original theme

Features: Modifies all colour attributes of the theme script, uses Lua pattern matching

ctags_html_tooltips.lua

Description: Add tooltips based on a ctags file (default input file: tags)

Features: Uses file input (defined by the CLI option --plug-in-param) and parses tags data before Decorate is called.

outhtml_curly_brackets_matcher.lua

Description: Shows matching curly brackets in HTML output.

Features: Uses Decorate to add span tags with unique ids to opening and closing brackets. Adds JavaScript with HeaderInjection variable. Inserts additional CSS styles with Injections variable.

outhtml_keyword_matcher.lua

Description: Shows matching keywords in HTML output.

Features: Uses Decorate to add span tags with unique ids to opening and closing brackets. Uses OnStateChange to assign an internal ID to each keyword. Adds JavaScript with HeaderInjection variable. Inserts additional CSS styles with Injections variable.

outhtml_codefold.lua

Description: Adds code folding for C style languages, Pascal, Lua and Ruby to HTML output

Features: Uses DecorateLineBegin and DecorateLineEnd to add ID-spans to each line. Applies Decorate to each code block delimiter to add onClick event handlers. Adds JavaScript with HeaderInjection and FooterInjection variables. Inserts additional CSS styles with Injections variable.
