warning ($msg)
fatal ($msg)
tmpcounter ()
_edittime ($thetime)
indexstate ($who)
nonvarargs ()
urlargs (@args)
fileref ($desc, $css, $path, $line, @args)
diffref ($desc, $css, $path, @args)
idref ($desc, $css, $id, @args)
incref ($name, $css, $file, @paths)
http_wash ($name)
http_encode ($name)
fixpaths ($node)
minimal_http_headers ()
std_http_headers ($who)
httpinit ()
clean_release ($releaseid)
clean_identifier ($id)
clean_path ($path)
httpclean ()
This module contains HTTP initialisation and various HTML tag generation.
Note:
It initially contained nearly all support routines except those for the "object" collections (files, index, lang), and was then correctly named the "common" module. Its size grew beyond maintainability and readability and forced a split into smaller, specialized modules. Consequently, its name should be changed to reflect its present content.
warning ($msg)
Function warning (hook for the warn statement) issues a warning message into the error log and optionally on screen.
$msg
a string containing the message
The message is prefixed with Perl context information. It is printed on STDERR and if enabled on STDOUT as an HTML fragment.
To prevent HTML mayhem, HTML tag delimiters are replaced by their entity name equivalent.
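The entity replacement described above can be sketched as follows. This is a minimal illustration, not the LXR code: the helper name escape_tag_delimiters is hypothetical, and only the two tag delimiters are replaced, as the text states.

```perl
use strict;
use warnings;

# Hypothetical helper (assumption, not the LXR implementation):
# replace HTML tag delimiters by their entity names so that a message
# can be embedded in a page without being parsed as markup.
sub escape_tag_delimiters {
    my ($msg) = @_;
    $msg =~ s/</&lt;/g;
    $msg =~ s/>/&gt;/g;
    return $msg;
}

print escape_tag_delimiters('unexpected <b> in template'), "\n";
```

The escaped string can then be printed inside the HTML fragment, while the raw message still goes to STDERR unchanged.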
This function is called after successful initialisation. There is no need to check for HTTP header state, since early errors are fatal and handled by the next function. However, the <HTML> tag and <BODY> element may not yet have been emitted if this is an error on the page header template.
Note:
Since it proved a valuable debugging aid, the function has been modified so that it can be used very early in LXR initialisation. Variable $HTMLheadOK tells whether the "standard" header part of the page has already been sent to screen. If not, some general purpose header is emitted to support HTML layout of the warning message.
Of course, when the standard header part is later emitted, some of its components will be discarded (or not properly set) by the browser because they occur at an inappropriate location (not HTML-compliant). This happens only in exceptional circumstances, usually requiring a fix by the LXR administrator.
fatal ($msg)
Function fatal (hook for the die statement) issues an error message and quits.
$msg
a string containing the message
Full Perl context information is given and tentative LXR configuration data is dumped (on STDERR).
The message is printed both on STDERR and in the HTML stream.
If variable $HTTP_inited is not set, HTTP standard headers have not yet been emitted. In this case, minimal headers and HTML initial elements (start of stream, <HEAD> element and start of body) are printed before the message and the HTML page is properly closed.
Note:
The message may be emitted after the final closing </HTML> tag if some regular HTML precedes the call to this subroutine. This is not HTML-compliant. Some browsers may complain.
tmpcounter ()
Function tmpcounter returns a unique id for numbering temporary files.
_edittime ($thetime)
Function _edittime returns a human-readable date/time in a string.
$thetime
an integer containing a UTC time in seconds since the epoch
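As an illustration of turning epoch seconds into a readable string, here is a minimal sketch. The helper name format_utc_time and the output layout are assumptions; the actual format produced by _edittime is not stated in this documentation.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: render a UTC epoch time as a readable string.
# The "YYYY-MM-DD hh:mm:ss" layout is an assumption chosen for the example.
sub format_utc_time {
    my ($thetime) = @_;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($thetime));
}

print format_utc_time(0), "\n";    # prints 1970-01-01 00:00:00
```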
indexstate ($who)
Function indexstate returns the most recent indexation time for the current tree or 0 if it is not indexed yet, -1 if indexing crashed, -2 if indexing is in progress.
$who
a string containing the main script name (used to avoid retrieving records unrelated to the present script, mainly in the perf case)
The times table records pertaining to the current version of the tree are read in: first the global "in-progress" sentinel record, then either genxref's free-text indexing dates or its declaration parsing dates.
If the "in-progress" record is still there, genxref is still working or has crashed (depending on the sign of the end time).
If no date record was retrieved, the tree has not been indexed. Otherwise, the termination time of the usage collection step is returned. Incremental indexing takes precedence over full reindexing because incremental times are erased when full reindexation takes place.
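A caller might interpret the return value along these lines. This is only a sketch: describe_index_state is a hypothetical name, and a positive value is assumed to be a time in seconds since the epoch.

```perl
use strict;
use warnings;

# Hypothetical caller-side helper: translate the status codes documented
# for indexstate into a short message. Positive values are assumed to be
# epoch seconds.
sub describe_index_state {
    my ($state) = @_;
    return 'not indexed yet'      if $state == 0;
    return 'indexing crashed'     if $state == -1;
    return 'indexing in progress' if $state == -2;
    return 'indexed at ' . scalar(gmtime($state)) . ' UTC';
}

print describe_index_state(-2), "\n";
```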
nonvarargs ()
Function nonvarargs returns an array containing "key=value" elements from the original URL query string not related to LXR "variables".
A non "variable" key is identified by its "sigil", an underscore ("_"). Any other key is ignored.
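The selection rule can be sketched as below. The helper name underscore_args and the sample keys are hypothetical, and the query string is assumed to be already available as plain text split on "&" or ";".

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the "key=value" pairs whose key starts
# with the underscore sigil; every other pair is ignored.
sub underscore_args {
    my ($query) = @_;
    my @kept;
    foreach my $pair (split /[&;]/, $query) {
        push @kept, $pair if $pair =~ /^_/;
    }
    return @kept;
}

# Sample keys are made up for the illustration.
print join(' ', underscore_args('v=4.2&_showall=1&a=x86&_i=foo')), "\n";
```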
urlargs (@args)
Function urlargs returns a string representing its argument and the current state of the "variables" set, suitable for use as the query part of a URL.
@args
an array containing "key=value" elements
To avoid progressive lengthening of the resulting string, the "key=value" strings for default variable values are deleted from the array.
All elements are concatenated with the standard ampersand separator ("&") and prefixed with a question mark ("?"). This string can be used as is in a URL.
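The two steps above (dropping default values, then joining) might look like this. It is a sketch under assumptions: build_query is a hypothetical name, the defaults are passed in as a hash reference rather than read from the LXR configuration, and an empty result is returned as an empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: drop "key=value" pairs that merely restate a
# default, then join the rest with "&" and prefix with "?".
sub build_query {
    my ($defaults, @args) = @_;
    my @kept = grep {
        my ($key, $value) = split /=/, $_, 2;
        !(defined $value
          && exists $defaults->{$key}
          && $defaults->{$key} eq $value);
    } @args;
    # Assumption: nothing left means no query part at all.
    return @kept ? '?' . join('&', @kept) : '';
}

print build_query({ v => 'head', a => 'x86' },
                  'v=head', 'a=arm', '_showall=1'), "\n";
```

Removing defaults keeps links short and stable: a link that only restates the default version does not grow each time it is regenerated.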
fileref ($desc, $css, $path, $line, @args)
Function fileref returns an <A> link to a specific line of a source file.
$desc
a string for the user-visible part of the link, usually the file name
$css
a string containing the CSS class for the link
$path
a string containing the HTML path to the source file
$line
an integer containing the line number to reference (or void)
@args
an array containing "key=value" elements
Notes:
All non-alphanumeric characters in $path are URL-quoted to avoid conflicts between unconstrained file names and URL reserved characters.
Since line anchor ids in LXR are at least 4 characters in length, the line number is padded with zeros on the left if necessary.
The @args argument is used to pass state and makes use of sub urlargs.
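The zero-padding of the line anchor can be sketched as follows. The helper name sketch_fileref and the URL layout are assumptions made for the illustration; URL-quoting of the path and the query part built by urlargs are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: build a link whose fragment is the line number
# padded on the left with zeros to at least 4 characters.
# The href layout is an assumption, not the actual LXR URL scheme.
sub sketch_fileref {
    my ($desc, $css, $path, $line) = @_;
    my $anchor = (defined $line && $line ne '')
               ? '#' . sprintf('%04d', $line)
               : '';
    return qq{<a class="$css" href="$path$anchor">$desc</a>};
}

print sketch_fileref('main.c', 'fref', '/source/main.c', 42), "\n";
```

Line numbers of 5 digits or more are left as they are, since the format only sets a minimum width.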
diffref ($desc, $css, $path, @args)
Function diffref returns an <A> link for the first step of difference display selection.
$desc
a string for the user-visible part of the link, usually the file name
$css
a string containing the CSS class for the link
$path
a string containing the HTML path to the source file
@args
an array containing "key=value" elements
Except for the $line argument, the interface is identical to that of sub fileref. See notes above.
Since script diff can be controlled through some URL arguments, a call is made to sub nonvarargs to keep the values of these arguments between calls.
idref ($desc, $css, $id, @args)
Function idref returns an <A> link to the cross reference list of an identifier.
$desc
a string for the user-visible part of the link, usually the identifier
$css
a string containing the CSS class for the link
$id
a string containing the name of the identifier to search
@args
an array containing "key=value" elements
Since script ident can be controlled through some URL arguments, a call is made to sub nonvarargs to keep the values of these arguments between calls.
incref ($name, $css, $file, @paths)
Function incref returns an <A> link to an include'd file or undef if the file is unknown.
$name
a string for the user-visible part of the link, usually the file name
$css
a string containing the CSS class for the link
$file
a string containing the HTML path to the include'd file
@paths
an array containing a list of base directories to search for the file
If the include'd file does not exist (as determined by sub incfindfile), the function returns undef. Otherwise, it returns an <A> link as computed by sub fileref.
http_wash ($name)
Function http_wash returns its argument reversing the effect of a URL-quote.
$name
a string to URL-unquote
http_encode ($name)
Function http_encode returns its argument URL-quoted.
$name
a string to URL-quote
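URL-quoting and its inverse can be illustrated with the pair below. This is a sketch, not the LXR code: the names url_quote and url_unquote are hypothetical, the set of quoted characters (everything non-alphanumeric) is an assumption, and the input is assumed to be a byte string.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every non-alphanumeric byte by %XX.
sub url_quote {
    my ($name) = @_;
    $name =~ s/([^A-Za-z0-9])/sprintf('%%%02X', ord($1))/ge;
    return $name;
}

# Hypothetical helper: turn every %XX sequence back into its byte.
sub url_unquote {
    my ($name) = @_;
    $name =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $name;
}

my $quoted = url_quote('a b+c');
print "$quoted\n";                  # prints a%20b%2Bc
print url_unquote($quoted), "\n";   # prints a b+c
```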
fixpaths ($node)
Function fixpaths fixes its node argument to prevent unexpected access to files or directories.
$node
a string for the path to fix
This is a security function. If the node argument contains any /../ part, it is removed with the preceding part. Also /./ and all repeating / are replaced by a single slash.
The OS will then be presented only "canonical" paths without access computation, minimizing the risk of unwanted access.
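The canonicalisation rules can be sketched component by component. The helper name canonical_path is hypothetical, and the sketch makes assumptions beyond the text: the path is treated as absolute, a ".." at the root is silently dropped, and no directory test is performed.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse repeated slashes, drop "." components and
# remove each ".." together with the component that precedes it.
sub canonical_path {
    my ($node) = @_;
    my @kept;
    foreach my $part (split m{/+}, $node) {
        next if $part eq '' || $part eq '.';
        if ($part eq '..') {
            pop @kept;    # no effect when already at the root
            next;
        }
        push @kept, $part;
    }
    my $path = '/' . join('/', @kept);
    # Keep a trailing slash so that a directory still looks like one.
    $path .= '/' if @kept && $node =~ m{/\z};
    return $path;
}

print canonical_path('/a/b/../c//d/./e'), "\n";    # prints /a/c/d/e
print canonical_path('/../../etc/passwd'), "\n";   # prints /etc/passwd
```

Because ".." can never climb above the root of the result, the string handed to the OS always stays inside the tree it is later appended to.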
Caution!
Any use of this sub before full LXR context initialisation (i.e. before return from sub httpinit) is doomed to fail because the test for directory type needs a proper value in $releaseid. This failure is invisible: it does not lead to a run-time error, it just returns a nonsensical status.
minimal_http_headers ()
Function minimal_http_headers outputs minimal HTTP headers for emergency situations during early initialisation.
std_http_headers ($who)
Function std_http_headers outputs the "expected" HTTP headers and a blank line to switch to content (body) mode.
$who
an optional string containing the main script name
Presently, only the Last-Modified and Content-Type headers are output.
If $who is undefined, current time is used in the Last-Modified header which is an elegant way to tell the browser to refresh its cache. This is useful for "dynamic" views like perf or showconfig.
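The header block described above might be produced as in this sketch. The names http_date and sketch_headers, the charset and the way the timestamp is passed in are assumptions; only the two header names and the terminating blank line come from the text.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format epoch seconds as an HTTP date.
# Names are spelled out by hand to stay independent of the locale.
sub http_date {
    my ($t) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($t);
    return sprintf('%s, %02d %s %04d %02d:%02d:%02d GMT',
                   $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900,
                   $hour, $min, $sec);
}

# Hypothetical helper: when no time is supplied, fall back to "now" so
# that the browser treats the page as freshly modified.
sub sketch_headers {
    my ($mtime) = @_;
    $mtime = time() unless defined $mtime;
    return 'Last-Modified: ' . http_date($mtime) . "\n"
         . "Content-Type: text/html; charset=utf-8\n"   # charset assumed
         . "\n";                                        # end of headers
}

print sketch_headers(0);
```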
httpinit ()
Function httpinit parses the URL, cleans up the parameters and sets up the LXR "variables".
It initializes the global variables (the LXR context) and HTTP output.
Information extracted from the URL is stored into hash $HTTP.
This sub is also responsible for HTTP state transition from one invocation to the other. The URL (query) arguments are spread into 4 name spaces identified by a "sigil":
-none-: standard 'variables'
exclamation mark (!): override 'variables' value
tilde (~): difference 'variables'
underscore (_): LXR operational parameter
httpinit deals only with the first 2 namespaces.
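The sigil convention can be illustrated by a small classifier. The helper name namespace_of, the labels it returns and the sample keys are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: map a query key to one of the four name spaces
# according to its leading sigil.
sub namespace_of {
    my ($key) = @_;
    return 'override'    if $key =~ /^!/;
    return 'difference'  if $key =~ /^~/;
    return 'operational' if $key =~ /^_/;
    return 'standard';
}

# Sample keys are made up for the illustration.
foreach my $key ('v', '!v', '~v', '_showall') {
    printf "%-10s %s\n", $key, namespace_of($key);
}
```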
clean_release ($releaseid)
Function clean_release returns its argument if the release exists, otherwise the default value for variable 'v'.
$releaseid
a string containing the release (version) to check
Note:
This filtering breaks with CVS if a file is not targeted i.e. directory listing or identifier query.
For a directory, the default release is not a pain, since it is easy to change it to the desired one as soon as a file is accessed. The provided release is however kept for the case where the directory display comes from a link in a file and the user then jumps to another file in the directory. It is assumed that the user usually wants both files with the same version.
For an identifier query, under some VCS, the provided release could be reverted to the default one if the file possibly given in the query string does not exist in this version. It is recommended to submit user queries (as opposed to those from a link in a source) without path info.
clean_identifier ($id)
Function clean_identifier returns its argument after removing "unusual" characters.
$id
a string representing the identifier
Caveat!
When adding new languages, check that the definition of "unusual" in this sub does not conflict with the lexical form of identifiers.
clean_path ($path)
Function clean_path returns its argument truncated to known good characters.
$path
a string containing the path to check
The path is truncated at the first character that is not HTML-quote conformant. Every sub-path equal to /./ or /../ is then removed.
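The two steps (truncation, then removal of /./ and /../) can be sketched as below. The helper name sketch_clean_path and, above all, the set of accepted characters are assumptions; the actual "known good" set is not listed in this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the longest leading run of accepted
# characters, then delete every "/./" and "/../" sub-path.
# The accepted set (letters, digits and _ . , + - /) is an assumption.
sub sketch_clean_path {
    my ($path) = @_;
    my ($clean) = $path =~ m{\A([A-Za-z0-9_.,+\-/]*)};
    # Repeat until no "/./" or "/../" is left.
    1 while $clean =~ s{/\.\.?/}{/};
    return $clean;
}

print sketch_clean_path('/a/./b/../c<script>'), "\n";    # prints /a/b/c
```

Unlike the fixpaths sketch of path resolution, a /../ here is simply deleted and the preceding component is kept.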
Note:
Is this really necessary, since it restricts the user's choice of filenames, even if the set covers the common needs? All that is needed to protect against malicious attacks is to "quote" HTML reserved characters.
httpclean ()
Function httpclean does the final clean up.
To be called when all processing is done.