Index module

This module defines the abstract access methods to the database. If needed, the methods are overridden in the specific modules.

new ($dbname)

new is the Index object constructor. It dispatches to the specific constructor based on its argument.

  1. $dbname

    a string containing the configuration parameter 'dbname' describing the engine and the characteristics of the DB

Note:

The specific constructor is responsible for creating hash elements in $self containing "cooked" queries (meaning they have been processed by the DBD prepare method).

They are mentioned by the Requires paragraphs in the following method descriptions.
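The dispatch performed by new can be pictured with the minimal sketch below. It assumes, without confirmation from this description, that 'dbname' is a DBI-style string whose second colon-separated field names the engine. The function pick_backend and the package names it returns are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;

# Hypothetical mapping from engine name to the package providing the
# specific constructor; the names are illustrative only.
my %backend_for = (
    mysql  => 'Index::Mysql',
    pg     => 'Index::Postgres',
    sqlite => 'Index::SQLite',
);

# Return the package that should build the object for $dbname,
# or undef when the engine is not recognised.
sub pick_backend {
    my ($dbname) = @_;
    # Assumed layout: "dbi:<engine>:<engine-specific characteristics>"
    my ($engine) = $dbname =~ m/^dbi:([^:]+):/i;
    return undef unless defined $engine;
    return $backend_for{ lc $engine };
}

print pick_backend('dbi:mysql:dbname=lxr'), "\n";
```

A real constructor would then load the selected package and call its own constructor, which is where the "cooked" queries get created.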

read_open ()

read_open "prepares" the transactions which read from the database.

Separating read and write transactions serves two purposes.

First, it ensures that faulty code will not corrupt the database when the write transactions have not been enabled. Second, it improves initialisation speed and decreases memory footprint when only browsing the tree.

write_open ()

write_open "prepares" the transactions which write into the database. They are only used by the indexing utility.

Separating read and write transactions serves two purposes.

First, it ensures that faulty code will not corrupt the database when the write transactions have not been enabled. Second, it improves initialisation speed and decreases memory footprint when only browsing the tree.

write_close ()

write_close removes the write-enable transactions.
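The life cycle of read and write transactions can be sketched as follows. This is a toy model under stated assumptions, not the module's code: closures stand in for prepared statements, a hash stands in for the database, and the keys files_select and files_insert as well as the helper run are hypothetical.

```perl
use strict;
use warnings;

# A toy in-memory "database" standing in for the real engine.
my %table;

# Stand-ins for prepared statements: each one is a closure.
sub read_open {
    my ($self) = @_;
    $self->{files_select} = sub { return $table{ $_[0] } };
}

sub write_open {
    my ($self) = @_;
    $self->{files_insert} = sub { $table{ $_[0] } = $_[1] };
}

sub write_close {
    my ($self) = @_;
    delete $self->{files_insert};
}

# Run a named statement, refusing if it has not been prepared.
sub run {
    my ($self, $name, @args) = @_;
    my $stmt = $self->{$name}
        or die "statement '$name' is not prepared\n";
    return $stmt->(@args);
}

my $index = {};
read_open($index);
# A browsing session never calls write_open, so a stray write fails early.
my $ok = eval { run($index, 'files_insert', 'a.c', 1); 1 };
print "write refused: $@" unless $ok;

write_open($index);
run($index, 'files_insert', 'a.c', 1);
write_close($index);
print "a.c has id ", run($index, 'files_select', 'a.c'), "\n";
```

Because the write statements simply do not exist until write_open is called, faulty browsing code cannot reach them, and a browsing session never pays for preparing them.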

uniquecountersinit ($prefix)

uniquecountersinit initialises the unique counters for file, symbol and type ids.

This is a new extension method for derived object usage.

  1. $prefix

    a string containing the database table prefix

Several database engines perform better when fields with unique attributes are fed from cached counters instead of the built-in features. The gain comes from the fact that the used (incremented) value is not written back to disk immediately (fewer commits).

This trick is valid because we write to the DB only at genxref time and DB loading is single-threaded.
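The cached-counter idea can be sketched as below. The sketch assumes a single writer, as stated above; the hash %db_counters stands in for the values stored in the DB, and the names counters_init, next_id and counters_save are hypothetical.

```perl
use strict;
use warnings;

# Toy persistent store standing in for the counters kept in the DB.
my %db_counters = (file => 40, symbol => 1000);
my $db_writes   = 0;

# In-memory copies, loaded once at the start of the session.
my %counter;

sub counters_init { %counter = %db_counters }

# Hand out the next unique id without touching the DB.
sub next_id {
    my ($kind) = @_;
    return ++$counter{$kind};
}

# Write the final values back once, at the end of the session.
sub counters_save {
    %db_counters = %counter;
    $db_writes++;
}

counters_init();
my @ids = map { next_id('file') } 1 .. 3;
counters_save();
print "ids: @ids; DB writes: $db_writes\n";
```

Three ids are handed out for a single write to the store. With several concurrent writers, two sessions could hand out the same id, which is why the single-threaded loading assumption matters.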

CAUTION!

fileid ($filename, $revision)

fileidifexists ($filename, $revision)

fileid returns a unique id for a file with a given revision, creating it if it does not exist.

fileidifexists is similar, but returns undef if the given revision is unknown, which can happen if the revision was created after the latest genxref indexing run.

  1. $filename

    a string containing the path relative to 'sourceroot'

  2. $revision

    the revision for the file

    CAUTION: this is not a release id! It is computed by method filerev in the Files classes.

The result is used as an index between the different DB tables to refer to the file.
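The difference between the two methods can be sketched with plain functions working on a hash. This is an illustration only: the hash %files stands in for the files table, the counter $next_id is hypothetical, and the real methods go through the database.

```perl
use strict;
use warnings;

my %files;         # "filename\0revision" => id, standing in for the files table
my $next_id = 0;   # hypothetical id generator

# Lookup only: undef when the (file, revision) pair is unknown.
sub fileidifexists {
    my ($filename, $revision) = @_;
    return $files{"$filename\0$revision"};
}

# Lookup, creating a fresh id when the pair is unknown.
sub fileid {
    my ($filename, $revision) = @_;
    my $id = fileidifexists($filename, $revision);
    unless (defined $id) {
        $id = ++$next_id;
        $files{"$filename\0$revision"} = $id;
    }
    return $id;
}

my $first = fileid('/src/main.c', '1.4');
print "same id on second call\n" if fileid('/src/main.c', '1.4') == $first;
print "revision 1.5 is unknown\n"
    unless defined fileidifexists('/src/main.c', '1.5');
```

Browsing code would use the lookup-only form so that displaying a revision unknown to the index never creates a record.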

Requires:

getallfilesinit ($releaseid)

getallfilesinit prepares things for nextfile.

  1. $releaseid

    the release (or version) for which all recorded files should be returned

The subroutine executes the allfiles_select transaction. Results are retrieved one by one through nextfile.

Requires:

nextfile ()

nextfile is an iterator running over all files making up a version of the source tree, as known from the database.

A file description is returned for each call until it returns undef, at which time it must no longer be called.
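The calling pattern for getallfilesinit and nextfile can be sketched as below. The stand-in functions work on an in-memory list, and the shape of a file description (here a two-element array reference holding a path and a revision) is an assumption made for the example.

```perl
use strict;
use warnings;

my @pending;   # rows of the current result set

# Stand-in for running the query: load the rows for one release.
sub getallfilesinit {
    my ($releaseid) = @_;
    my %rows_for = (
        'v1' => [ ['/a.c', '1.1'], ['/b.c', '1.3'] ],
    );
    @pending = @{ $rows_for{$releaseid} || [] };
}

# Return the next row, or undef once the set is exhausted.
sub nextfile { return shift @pending }

getallfilesinit('v1');
my @seen;
while (defined(my $file = nextfile())) {
    push @seen, $file->[0];
}
print "files: @seen\n";
```

The loop stops at the first undef and does not call nextfile again, as required above.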

Requires:

setfilerelease ($fileid, $releaseid)

setfilerelease marks the file referred to by $fileid as part of $releaseid.

  1. $fileid

    an integer representing a file in the DB

  2. $releaseid

    the release (or version) containing the file

Requires:

The final result is as many records in the releases tables as versions of this file. All these records point to the same item in the files table.

The releaseid is any tag under which the file in this state is known by the VCS. The revision, stored in the files table, is a canonical identification of the file state. The file state will be parsed and cross-referenced only once, thus reducing genxref processing time, but the result may still be referenced by any tag.
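The relationship between the two tables can be pictured with the sketch below. The hashes %files and %releases are simplified stand-ins whose layout is assumed for the illustration, and this setfilerelease is a plain function, not the method itself.

```perl
use strict;
use warnings;

# One row in the files table: a single (path, revision) state.
my %files = (7 => { path => '/lib/util.c', revision => '1.12' });

# Releases table: fileid => { releaseid => 1 }.
my %releases;

sub setfilerelease {
    my ($fileid, $releaseid) = @_;
    $releases{$fileid}{$releaseid} = 1;   # recording the same tag twice is harmless
}

setfilerelease(7, $_) for qw(v2.0 v2.1 v2.1 HEAD);

my @tags = sort keys %{ $releases{7} };
print scalar(@tags), " release records, ", scalar(keys %files), " file record\n";
```

The file state is stored once and parsed once, while each tag only adds a small record pointing to it.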

removerelease ($fid, $releaseid)

removerelease deletes one release from the set associated with a base revision.

  1. $fid

    the unique id for a base revision file

  2. $releaseid

    the release (or version) containing the file

Requires:

fileindexed ($fileid)

fileindexed returns true if the file referred to by $fileid has already been indexed; otherwise, it returns false.

  1. $fileid

    an integer representing a file in the DB

Requires:

setfileindexed ($fileid)

setfileindexed marks the file referred to by $fileid as being indexed.

Since indexing (i.e. symbol definition collecting) is usually done outside LXR, indexing time is not updated.

  1. $fileid

    an integer representing a file in the DB

Requires:

filereferenced ($fileid)

filereferenced returns true if the file referred to by $fileid has already been parsed for references; otherwise, it returns false.

  1. $fileid

    an integer representing a file in the DB

Requires:

setfilereferenced ($fileid)

setfilereferenced marks the file referred to by $fileid as having been parsed for references.

Indexing time is updated for user information.

  1. $fileid

    an integer representing a file in the DB

Note:

Requires:

filetimestamp ($filename, $revision)

filetimestamp retrieves the time when the file was parsed for references.

  1. $filename

    a string containing the path relative to 'sourceroot'

  2. $revision

    the revision for the file

    CAUTION: this is not a release id! It is computed by method filerev in the Files classes.

Requires:

symdeclarations ($symname, $releaseid)

symdeclarations returns an array containing the set of declarations for the symbol in this release.

  1. $symname

    the symbol name

  2. $releaseid

    the release (or version) containing the file

Requires:

setsymdeclaration ($symname, $fileid, $line, $langid, $type, $relsym)

setsymdeclaration records a declaration in the DB.

  1. $symname

    the symbol name

  2. $fileid

    the unique id which identifies a file AND a release

  3. $line

    the line number of the declaration

  4. $langid

    an integer key for the language

  5. $type

    the type of the symbol

  6. $relsym

    an optional relation to some other symbol

Requires:

symreferences ($symname, $releaseid)

symreferences returns an array containing the set of references to the symbol in this release.

  1. $symname

    the symbol name

  2. $releaseid

    the release (or version) containing the file

Requires:

setsymreference ($symname, $fileid, $line)

setsymreference records a reference in the database if the symbol is already present (as a declaration).

  1. $symname

    the symbol name

  2. $fileid

    the unique id which identifies a file AND a release

  3. $line

    the line number of the reference

Requires:

Since release 1.0, setsymreference includes part of issymbol, so that the latter function is no longer needed when referencing files and MUST NOT be used in referencefile functions.

issymbol ($symname, $releaseid)

issymbol returns true (1) for an existing symbol in a given release according to the DB, false (0) otherwise.

  1. $symname

    the symbol name

  2. $releaseid

    the release (or version) containing the file

Requires:

This function is used during browsing to decide whether the symbol should be highlighted or not.

Since release 1.0, this function is no longer used during the usage collecting pass. It can now have its own independent cache strategy, but it MUST NOT be called outside the browsing pass.

symid ($symname)

symid returns a unique id for a symbol.

If the symbol is unknown, it is inserted into the DB with a zero reference count. The reference count is adjusted by the methods which add a definition or a usage. Decrementing the reference count is only done when purging the database.

  1. $symname

    the symbol name

Requires:

symname ($symid)

symname returns the symbol name from a symbol id.

  1. $symid

    the unique id for a symbol

Requires:

decid ($writeflag, $lang, $string)

decid retrieves a unique id for a declaration type in a given language. If this declaration is not yet in the DB, record it if the write flag is set.

  1. $writeflag

    a flag which, when set, causes a declaration not yet in the DB to be recorded

  2. $lang

    the unique id for the language

  3. $string

    the text for the declaration (from {'typemap'}{letter} in a generic.conf language description)

Requires:

These records are in fact the text for the language types.

The text retrieval function is not implemented because it is implicitly done in the symdeclarations query.

CAVEAT!

commit ()

Commit the last set of operations and start a new transaction.

If transactions are not supported, it's OK for this to be a no-op.

forcecommit ()

Commit the database now, even if auto commit mode is in effect.

This method should not be overridden in specific drivers.

emptycache ()

emptycache empties the internal symbol cache.

This function should be called before parsing each new file. Otherwise, too much memory will be used and processing will become very slow.

Note:

flushcache ($full)

flushcache flushes the internal symbol cache.

  1. $full

    optional argument to force 0-count write back

    (When creating the database, reference counts are incremented. Consequently, if the final count is still zero, the symbol has not been referenced and there is no need to overwrite the record. On the contrary, when purging the database, reference counts may decrement to zero and it is then mandatory to update the record so that it can later be purged or correctly updated.)

This function should be called at the end of file processing. It writes the cached symbol reference count into the appropriate symbol records of the DB.

To minimize I/O, reference counts are negated when entered into the cache. The counts are turned back positive when they need to be incremented. Thus strictly positive values show which symbols have been referenced. Only these are flushed to the DB.

The cache is then emptied.
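The negated-count scheme can be sketched as follows. This is one possible reading of the description above, not the module's code: the hash %db_count stands in for the symbol records, and the helpers cache_load and add_reference are hypothetical.

```perl
use strict;
use warnings;

my %db_count = (foo => 5, bar => 2, baz => 0);  # reference counts "on disk"
my %cache;                                      # symbol => cached count
my @written;                                    # log of DB updates

# Load a symbol into the cache, negating its count to mark it as unchanged.
sub cache_load {
    my ($sym) = @_;
    $cache{$sym} = -$db_count{$sym} unless exists $cache{$sym};
}

# Count one more reference; a negative (unchanged) count is first made positive.
sub add_reference {
    my ($sym) = @_;
    cache_load($sym);
    $cache{$sym} = -$cache{$sym} if $cache{$sym} < 0;
    $cache{$sym}++;
}

# Write back only the counts that changed, then empty the cache.
# With $full set, zero counts are written back as well.
sub flushcache {
    my ($full) = @_;
    for my $sym (sort keys %cache) {
        my $count = $cache{$sym};
        next unless $count > 0 || ($full && $count == 0);
        $db_count{$sym} = $count;
        push @written, $sym;
    }
    %cache = ();
}

cache_load($_) for qw(foo bar baz);   # all three are looked up
add_reference('foo');                 # only foo is actually referenced
flushcache();
print "written: @written\n";
```

Only foo is written back although three symbols were cached: bar stays negative and baz stays at zero, so both are skipped. Calling flushcache(1) would additionally write back the zero count of a symbol such as baz.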

Requires:

purgefile ($fid, $releaseid)

purgefile deletes data related to an obsoleted file in the DB.

Data associated with the designated file are erased from the tables.

  1. $fid

    the unique id for a base revision file

  2. $releaseid

    the release (or version) containing the file

Requires:

"Relation" symbol (from definitions) reference count must be decremented first. After that, order of definitions/usages deletion is irrelevant.

Symbols are not deleted when their reference count decrements to zero because the same file (in a more recent version) is supposed to be indexed soon: a majority of the symbols will be re-entered in the database.

Release erasure is done in another sub since this erasure can occur also when no definition/usage deletion is necessary. The relevant code is thus written only once.

purge ($releaseid)

purge selectively deletes data in the DB.

Data associated with a release are erased from the tables.

Order of erasure is critical to comply with foreign key constraints between the different tables and to guarantee correctness of resulting database structure.

Once we know which base version files will be deleted, definitions and usages in these files are erased, which decrements the symbol reference counts. Symbols with a zero reference count are then deleted.

After this step, no definition or usage is left pointing to the candidate files. Releases are deleted, decrementing the references in status. Status records with zero references are then deleted (files cannot be deleted first because there is a "foreign key constraint" on files to status). Files are implicitly deleted by a trigger from status deletion.

  1. $releaseid

    the target release (or version)
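The order of erasure described above can be illustrated with the toy model below. All table layouts are assumptions made for this sketch, only usages are modelled (definitions would be handled the same way), and the trigger is replaced by an explicit deletion.

```perl
use strict;
use warnings;

# Toy tables; the layout is an assumption made for this sketch only.
my %releases = (1 => { 'v1' => 1, 'v2' => 1 }, 2 => { 'v1' => 1 });
my %status   = (1 => 2, 2 => 1);          # fileid => number of releases
my %files    = (1 => '/a.c', 2 => '/b.c');
my %symbols  = (foo => 2, bar => 1);      # symbol => reference count
my @usages   = ([1, 'foo'], [2, 'foo'], [2, 'bar']);   # [fileid, symbol]

sub purge {
    my ($releaseid) = @_;

    # 1. Files known only under this release will disappear entirely.
    my %doomed = map { $_ => 1 }
        grep { $releases{$_}{$releaseid}
               && scalar(keys %{ $releases{$_} }) == 1 }
        keys %releases;

    # 2. Erase their usages, decrementing the symbol counts.
    @usages = grep {
        my $gone = $doomed{ $_->[0] };
        $symbols{ $_->[1] }-- if $gone;
        !$gone;
    } @usages;

    # 3. Delete the symbols nobody references any more.
    delete @symbols{ grep { $symbols{$_} == 0 } keys %symbols };

    # 4. Delete the release records, decrementing the status counts.
    for my $fileid (keys %releases) {
        $status{$fileid}-- if delete $releases{$fileid}{$releaseid};
    }

    # 5. Delete zero-count status rows; the files row goes with them
    #    (done by a trigger in the description, done by hand here).
    for my $fileid (grep { $status{$_} == 0 } keys %status) {
        delete $status{$fileid};
        delete $files{$fileid};
        delete $releases{$fileid};
    }
}

purge('v1');
print "remaining files: ", join(', ', sort values %files), "\n";
```

File 1 survives because it is still known under v2, while file 2, its usages and the symbol bar that only it referenced are all gone.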

Requires:

Note:

Todo:

purgeall ()

purgeall deletes all data in the DB.

This is a brutal way of erasing everything, e.g. for --reindexall --allversions. It is much more efficient than a sequence of purge on every version.

Requires:

uniquecountersreset ($force)

uniquecountersreset restarts the counters from 0.

  1. $force

    an integer used to force the $xxxini variables

    If different from 0, this forces uniquecounterssave to write the reset values to the DB if immediately called after this method.

    It is better to call the method a second time with argument 0 to avoid any unforeseen side-effects, though there should be none.

uniquecounterssave ()

uniquecounterssave stores in the DB the current values of the file, symbol and type counters for later sessions.

dropuniversalqueries ()

dropuniversalqueries deactivates all "universal" query statements to prevent annoying "Disconnect invalidates xx active statement handles ..." messages from disturbing the end user. Derived instances are responsible for killing their own queries.

Most are probably overkill since execute or fetchrow_array may already have deactivated the statement.

Must be called before final_cleanup, i.e. before disconnecting.

saveperformance ($releaseid, @wtimes)

saveperformance writes genxref's milestone times to the DB.

  1. $releaseid

    the release (or version) for which performance data should be saved

  2. $reindex

    full reindex flag

  3. $step

    a single-character string identifying the step

  4. $starttime

    the starting time of the step (in seconds)

  5. $endtime

    the completion time of the step (in seconds)

Note:

Requires:

getperformance ($releaseid)

getperformance retrieves genxref's milestone times from the DB.

  1. $releaseid

    the release (or version) for which performance data should be returned

Requires:

final_cleanup ()

final_cleanup executes last-minute actions on the database, then disconnects.

Must be called before the Index object disappears.

post_processing ()

post_processing executes maintenance actions on the database at end of genxref processing.

Must be the last action called before the Index object disappears.