Generic language module

This module is a generic language highlighting engine. It is driven by specifications read from file generic.conf.

Since it is a class, it can be subclassed to serve custom needs, such as speed optimisation for specific languages.

new ($writeDB, $pathname, $releaseid, $lang)

Method new creates a new language object.

  1. $writeDB

    a boolean integer requesting to store language properties (human-readable type description) into the database

  2. $pathname

    a string containing the name of the file to parse

  3. $releaseid

    a string containing the release (version) of the file to parse

  4. $lang

    a string which is the key for the specification hash 'langmap' in file generic.conf

This method is called by Lang's new method, but the third argument is different! The returned object will be Lang's value.

The full unfiltered content of generic.conf is stored in the object structure.

To make sure identifiers can be recognised, a default pattern (covering at least C/C++ and Perl) is copied into the language specification if none is found.
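The fallback described above can be sketched as follows. This is only an illustration, not the module's code: the key name 'identdef', the helper ensure_identdef and the default pattern itself are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical default: a letter or underscore followed by word characters.
my $DEFAULT_IDENT = '[A-Za-z_]\w*';

# Copy the default pattern into a language specification that lacks one.
# The key name 'identdef' is an assumption made for this sketch.
sub ensure_identdef {
    my ($spec) = @_;
    $spec->{'identdef'} = $DEFAULT_IDENT
        unless defined $spec->{'identdef'};
    return $spec;
}

my $c_spec    = ensure_identdef({});                            # gets the default
my $lisp_spec = ensure_identdef({ 'identdef' => '[^\s()]+' });  # keeps its own
```

A specification that already carries a pattern is left untouched, so language-specific identifier rules always take precedence over the default.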

read_config ($writeDB)

Internal function (not method!) read_config reads in language descriptions from the configuration file.

  1. $writeDB

    a boolean integer requesting to store language properties (human-readable type description) into the database

Sets in global variable $config_contents a reference to a hash equivalent to the configuration file. The differences are:

Loading the file and transforming it is only executed once, saving the overhead of processing the config file each time.

However, the mapping between ctags tags and their human-readable counterpart is stored in every database for every language. The mapping, as a table index in the DB, is kept in a new hash 'typeid'.
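The load-once behaviour can be illustrated with a small cache. In this sketch, load_spec is a hypothetical stand-in for reading and transforming generic.conf, $load_count exists only to make the effect visible, and the per-database storage of the type mapping is left out entirely.

```perl
use strict;
use warnings;

my $config_contents;   # shared cache, filled on the first call only
my $load_count = 0;    # instrumentation for this sketch

# Hypothetical stand-in for reading and transforming the configuration file.
sub load_spec {
    $load_count++;
    return { 'C' => { 'reserved' => [ 'if', 'while' ] } };
}

# Load on the first call, then hand back the cached reference.
sub read_config_sketch {
    $config_contents = load_spec() unless defined $config_contents;
    return $config_contents;
}

my $first  = read_config_sketch();
my $second = read_config_sketch();   # served from the cache
```

Both calls return the same reference, and the expensive step runs a single time.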

indexfile ($name, $path, $fileid, $index, $config)

Method indexfile is invoked during genxref to parse and collect the definitions in a file.

  1. $name

    a string containing the LXR file name

  2. $path

    a string containing the OS file name

    When files are stored in VCSes, $path is the name of a temporary file.

  3. $fileid

    an integer containing the internal DB id for the file/revision

  4. $index

    a reference to the index (DB) object

  5. $config

    a reference to the configuration object

The effective job is done by ctags. This method is only a wrapper around ctags to retrieve its results and store them in the database.
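As an illustration of the wrapper's role, the sketch below decodes lines in the ctags tags-file format (name, file, address followed by ;" and a kind letter, separated by tabs). It assumes addresses are plain line numbers, as produced when ctags is asked for numeric ex commands; the helper parse_tag_line and the array standing in for the database are hypothetical, and no ctags process is launched.

```perl
use strict;
use warnings;

# Split one tags-format line: name <Tab> file <Tab> address;" <Tab> kind ...
sub parse_tag_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^!_TAG_/;              # skip pseudo-tag header lines
    my ($name, $file, $addr, $kind) = split /\t/, $line;
    # Assumption: the address is a line number terminated by ;"
    return unless defined $addr && $addr =~ /^(\d+);"$/;
    return { 'name' => $name, 'file' => $file, 'line' => $1, 'kind' => $kind };
}

my @definitions;   # stand-in for the database
for my $line ("main\tdemo.c\t12;\"\tf\n", "!_TAG_FILE_SORTED\t1\t/0=unsorted/\n") {
    my $tag = parse_tag_line($line);
    push @definitions, $tag if $tag;
}
```

Only the first sample line yields a definition; the pseudo-tag header is ignored.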

referencefile ($name, $path, $fileid, $index, $config)

Method referencefile is invoked during genxref to parse and collect the references in a file.

  1. $name

    a string containing the LXR file name

  2. $path

    a string containing the OS file name

    When files are stored in VCSes, $path is the name of a temporary file.

  3. $fileid

    an integer containing the internal DB id for the file/revision

  4. $index

    a reference to the index (DB) object

  5. $config

    a reference to the configuration object

Using SimpleParse's nextfrag, it focuses on "untyped" fragments (a.k.a. code fragments) from which symbols are extracted. User symbols, if already declared, are entered in the reference database.
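The filtering can be pictured with the sketch below. It does not call SimpleParse: fragments are modelled as hypothetical [type, text] pairs in which an empty type marks plain code, %declared stands in for the symbols already known to the database, and the identifier pattern is an assumption.

```perl
use strict;
use warnings;

my %declared = map { $_ => 1 } qw(total count);   # symbols already known

# Fragments as [type, text] pairs; an empty type marks plain code.
my @frags = (
    [ 'comment', '/* total so far */' ],
    [ '',        'total += count + 1;' ],
    [ 'string',  '"count"' ],
);

my @references;   # stand-in for the reference database
for my $frag (@frags) {
    my ($type, $text) = @$frag;
    next if $type ne '';                       # only untyped fragments
    while ($text =~ /\b([A-Za-z_]\w*)\b/g) {
        push @references, $1 if $declared{$1}; # keep declared symbols only
    }
}
```

The occurrences of total and count inside the comment and the string are not recorded, because those fragments carry a type.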

parsespec ()

Method parsespec returns the list of category specifications for this language.

The language specification is a list of hashes describing the delimiters for the different categories, such as code, string, include, comment etc.

Each category is defined by a set of 2 or 3 regexps describing the delimiters: opening, ending and, optionally, locking delimiters.
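A specification of this shape might look like the sketch below for a C-like language. The key names 'type', 'open', 'close' and 'lock', the patterns, and the helper category_at are all assumptions made for illustration; the locking delimiter is carried in the data but not used by the helper.

```perl
use strict;
use warnings;

# Hypothetical specification; key names and patterns are assumptions.
my @spec = (
    { 'type' => 'comment', 'open' => '/\*', 'close' => '\*/' },
    { 'type' => 'comment', 'open' => '//',  'close' => '$'   },
    { 'type' => 'string',  'open' => '"',   'close' => '"', 'lock' => '\\\\.' },
);

# Return the category whose opening delimiter matches at the start of $text,
# or 'code' when none does.
sub category_at {
    my ($text) = @_;
    for my $cat (@spec) {
        my $open = $cat->{'open'};
        return $cat->{'type'} if $text =~ /\A(?:$open)/;
    }
    return 'code';
}
```

For example, category_at('/* note */') selects the comment category, while text matching no opening delimiter falls back to code.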

flagged ($flag)

Method flagged returns true (1) if the designated flag is present in the language-specific hash 'flags'.

  1. $flag

    a string containing the flag name
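A minimal sketch of such a lookup follows, assuming the flags are kept as a list of strings under key 'flags' and that 0 is returned when the flag is absent. The function name flagged_sketch and the flag 'case_insensitive' are hypothetical.

```perl
use strict;
use warnings;

# Return 1 if $flag appears in the description's 'flags' list, 0 otherwise.
# A missing 'flags' entry is treated as an empty list.
sub flagged_sketch {
    my ($langdesc, $flag) = @_;
    return 1 if grep { $_ eq $flag } @{ $langdesc->{'flags'} || [] };
    return 0;
}

my $pascal = { 'flags' => ['case_insensitive'] };
my $c      = {};   # no flags at all
```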

processinclude ($frag, $dir)

Method processinclude is invoked to process a generic include directive.

  1. $frag

    a string containing the directive

  2. $dir

    an optional string containing a preferred directory for the include'd file

Algorithm

Note:

processcode ($code)

Method processcode is invoked to process the fragment as generic code.

  1. $code

    a string to mark

Basically, look for anything that looks like an identifier, and if it is then make it a hyperlink, unless it's a reserved word in this language.
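The idea can be sketched with a single substitution. The identifier pattern, the reserved-word list and the link layout (ident?i=...) are assumptions made for this example, and HTML escaping of the surrounding code is ignored.

```perl
use strict;
use warnings;

my %reserved = map { $_ => 1 } qw(if else while return);

# Wrap every non-reserved identifier in a hyperlink.
# The URL layout and the identifier pattern are assumptions of this sketch.
sub mark_code {
    my ($code) = @_;
    $code =~ s{\b([A-Za-z_]\w*)\b}
              { $reserved{$1} ? $1 : "<a href=\"ident?i=$1\">$1</a>" }ge;
    return $code;
}

my $marked = mark_code('if (count) return total;');
```

Here count and total become links, while if and return are copied through unchanged.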

isreserved ($frag)

Method isreserved returns true (1) if the word is present in the language-specific 'reserved' list.

  1. $frag

    a string containing the word to check

In the case of a case-insensitive language, comparisons are made between uppercase versions of the words.
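The comparison rule can be sketched as below. The helper is_reserved_sketch is hypothetical: it receives the reserved list and a boolean $insensitive that stands in for however the language marks itself as case-insensitive.

```perl
use strict;
use warnings;

# Return 1 if $word is in the reserved list, 0 otherwise.
# For a case-insensitive language both sides are uppercased first.
sub is_reserved_sketch {
    my ($word, $reserved, $insensitive) = @_;
    $word = uc($word) if $insensitive;
    for my $r (@$reserved) {
        my $candidate = $insensitive ? uc($r) : $r;
        return 1 if $candidate eq $word;
    }
    return 0;
}

my @pascal_reserved = qw(begin end procedure);
```

With the flag set, BEGIN and begin are both recognised; without it, only the exact spelling in the list matches.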

language ()

Method language is a shorthand notation for $lang->{'language'}.

langinfo ($item)

Method langinfo is a shorthand notation to extract sub-hashes from language description {'langmap'}{'language'}.

  1. $item

    a string containing the name of the sub-hash to look up