Site Search Help
Full-text searching covers all pages on this site except the person,
publication, and lab/project pages. We use the Alkaline search
engine. This help page is a copy of the search tips
listed on the Alkaline web site.
Introduction and Basics
Alkaline Search Engine finds documents on an internet site, on several internet sites, or in an
intranet domain. To search for information, type a sequence
of words that define what you are looking for. The search engine outputs
a list of results, best results first. Alkaline searches exact words and word heuristics
(parts of words) only. It does not perform fuzzy searches or match misspelled words. Alkaline does not search
phrases (yet).
Simple Search is done by typing a word.
Searching for light will find all pages containing
light, lightning, delighted, etc. It will also find pages
with Light and Lightning because searching is case-insensitive.
Searching Multiple Words is done by typing a sequence
of words separated by spaces. Searching for ricky blue will find
all pages containing Ricky, tricky, blue, blues, etc. Pages containing
both words are shown first in the results.
Case-Sensitive search can be enabled by using a single capital letter inside a word.
For example, searching for Intranet let will find all pages
containing Intranet and letter, but will not list pages
with just intranet or Letter.
Entire words can be searched by using quotes. Searching for
"net" will find pages containing
net and Net. Searching for "Net" will
find only pages containing Net, because the capital N enables
case-sensitive search.
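The matching rules above can be sketched in a few lines of Python. This is only an illustration, not Alkaline's implementation: the function names `term_matches` and `rank` are hypothetical, case sensitivity is assumed to apply per term, and results are ordered simply by how many query words a page matches, since the real ranking formula is not described here.

```python
import re

def term_matches(term: str, text: str) -> bool:
    """Check one search term against a page text (illustrative rules only)."""
    # A term wrapped in double quotes must match as an entire word.
    whole = len(term) >= 2 and term.startswith('"') and term.endswith('"')
    word = term.strip('"')
    # Assumption: a capital letter in the term makes this term case-sensitive.
    flags = 0 if any(c.isupper() for c in word) else re.IGNORECASE
    pattern = re.escape(word)
    if whole:
        pattern = r"\b" + pattern + r"\b"
    return re.search(pattern, text, flags) is not None

def rank(query: str, pages: list[str]) -> list[str]:
    """Return pages matching any word, pages matching more words first."""
    terms = query.split()
    scored = [(sum(term_matches(t, p) for t in terms), p) for p in pages]
    # sorted() is stable, so ties keep their original order.
    return [p for score, p in sorted(scored, key=lambda sp: -sp[0]) if score > 0]
```

For instance, `rank("ricky blue", pages)` would list a page containing both "tricky" and "blues" ahead of a page containing only "blue".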
Boolean Search
Boolean Search lets you look up pages that contain one word and do not
contain another. To require that a page contain a word, place a
+ sign in front of the word. To
exclude pages containing a word, place a - sign
in front of it. For example, searching for
+net -"Internet" will show pages containing
net, network, etc. only if they do not contain Internet.
Note that searching for -word or +word -word
will produce no results.
Refined Boolean Search can be done by mixing a boolean expression with normal
words. Searching +net +Vestris hello will return pages containing
both net and Vestris, listing those that also contain hello first.
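As a rough illustration of the boolean rules, the sketch below splits a query into required (+), excluded (-), and optional words and filters a list of page texts. The names `contains`, `parse_boolean`, and `boolean_search` are hypothetical, whole-word matching for quoted terms is ignored for brevity, and case sensitivity is assumed to apply per word.

```python
def contains(word: str, text: str) -> bool:
    # Assumption: a capital letter in the word makes this comparison case-sensitive.
    if any(c.isupper() for c in word):
        return word in text
    return word.lower() in text.lower()

def parse_boolean(query: str):
    """Split a query into (required, excluded, optional) word lists."""
    required, excluded, optional = [], [], []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:].strip('"'))
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].strip('"'))
        else:
            optional.append(token.strip('"'))
    return required, excluded, optional

def boolean_search(query: str, pages: list[str]) -> list[str]:
    required, excluded, optional = parse_boolean(query)
    if not required and not optional:
        return []  # a query made only of -word terms produces no results
    hits = []
    for page in pages:
        if any(contains(w, page) for w in excluded):
            continue
        if not all(contains(w, page) for w in required):
            continue
        score = sum(contains(w, page) for w in optional)
        if not required and score == 0:
            continue
        hits.append((score, page))
    hits.sort(key=lambda h: -h[0])  # pages with more optional words first
    return [page for _, page in hits]
```

With this sketch, `+word -word` returns nothing because every page that satisfies the requirement is also excluded.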
Meta Data
Meta Search can be done by specifying a meta tag followed by a colon, for example
author:pushkin. You can get refined meta results too: queries such as
author:+"Pushkin" and +author:"Pushkin"
will produce the same output.
Searching author:"Daniel Doubrovkine" is equivalent to
author:"Daniel" author:"Doubrovkine". All rules about case-sensitivity,
full and partial words are preserved with the meta search. The meta tag name is never case-sensitive
and is always exact.
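The equivalences described above can be illustrated with a small, hypothetical parser that reduces meta terms to (tag, word) pairs. The function name `expand_meta` and the regular expression are assumptions made for this sketch, and the + marker is simply dropped rather than interpreted.

```python
import re

# tag, colon, then either a quoted value or a single unquoted word;
# an optional + is accepted before the tag or before the value.
META_TERM = re.compile(r'\+?(\w+):\+?(?:"([^"]*)"|(\S+))')

def expand_meta(query: str) -> list[tuple[str, str]]:
    """Return one (tag, word) pair per word of every meta term in the query."""
    pairs = []
    for match in META_TERM.finditer(query):
        tag = match.group(1).lower()  # the tag name is never case-sensitive
        value = match.group(2) if match.group(2) is not None else match.group(3)
        for word in value.split():
            pairs.append((tag, word))  # the word keeps its case
    return pairs
```

Under these assumptions, `author:"Daniel Doubrovkine"` and `author:"Daniel" author:"Doubrovkine"` expand to the same list of pairs.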
Scope Restriction
To define a Scope means to specify a more precise location of a document.
An Alkaline search engine is capable of indexing multiple sites with the same daemon. For example, the Vestris Inc.
search page looks for documents at http://www.vestris.com and http://db.infomaniak.ch.
Host Scope is specified by adding the rightmost part of a host entry, such as
host:.host.com to the search
string. For example, to search "vestris" in documents at http://www.vestris.com, the following command should be
issued: host:.vestris.com vestris. It is possible to specify multiple hosts by writing
more than one host: elements in your search string. Example:
host:.ch host:.vestris.com +vestris -alkaline will return all documents
indexed at *.ch, *.vestris.com containing "vestris" and not containing "alkaline".
Path Scope is specified by adding the leftmost part of a path without the leading slash,
such as path:software/xreplace to the search string.
For example, to search "vestris" in documents at http://www.vestris.com/software/xreplace, the following command
should be issued: host:www.vestris.com path:software/xreplace vestris. Note that
host: and path: can be mixed, and that multiple path: entries can be specified.
Url Scope is specified by adding the leftmost part of a complete url without the leading
http://, such as url:www.vestris.com/software/xreplace to the search string.
For example, to search "vestris" in documents at http://www.vestris.com/software/xreplace, the following command
should be issued: url:www.vestris.com/software/xreplace vestris.
File Extension Scope is specified by adding the rightmost part of a filename
without the leading dot, such as ext:cpp,h to the search string.
Multiple extensions can be specified either separated by commas or by adding more than
one ext: parameter to the search string.
For example, to search "void" in all .cpp documents, the following command should be
issued: ext:cpp void
Alkaline will return documents matching the search query and ANY of the host:, path: or url: scope
specifiers AND the ext: scope delimiter if present. As usual if no scope
is specified, the entire indexed domain is searched.
It is possible to specify multiple values in all scope delimiters by separating them with commas or by adding multiple
scope options to the search string.
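The way the scope specifiers combine (ANY of host:, path:, or url:, AND ext: if present) can be sketched as a single predicate over a document URL. The function `in_scope` and its parameters are hypothetical names for this illustration, and the port number and query string of the URL are ignored.

```python
from urllib.parse import urlparse

def in_scope(url: str, hosts=(), paths=(), urls=(), exts=()) -> bool:
    """Return True if the document URL satisfies the given scope values."""
    parts = urlparse(url)
    host = parts.hostname or ""
    path = parts.path.lstrip("/")   # path: values have no leading slash
    full = host + parts.path        # url: values have no leading http://
    location_given = bool(hosts or paths or urls)
    location_ok = (
        any(host.endswith(h) for h in hosts)       # rightmost part of the host
        or any(path.startswith(p) for p in paths)  # leftmost part of the path
        or any(full.startswith(u) for u in urls)   # leftmost part of the URL
    )
    # ext: values are given without the leading dot.
    ext_ok = not exts or any(path.endswith("." + e) for e in exts)
    return (location_ok or not location_given) and ext_ok
```

For example, `in_scope("http://db.infomaniak.ch/a.cpp", hosts=[".ch", ".vestris.com"], exts=["cpp", "h"])` is true, because one host value matches and the extension is among those listed.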
Date Scope
To search documents modified after or before a specified date
add before: and/or after: to the search string
followed by a valid date of one of the following forms:
You can specify both before and after items, for example before:12.05.1999 after:01.05.1999
will return all documents between these two dates.
Note that the bounds are NEVER included in the search results.
Search Options Forcing
To search all words case-sensitive, opt:case should
be added to the search string.
To search all pages containing ALL words (AND operator), opt:and
should be added to the search string. The default behaviour of Alkaline is to search all pages containing ANY of the
words and producing best results first.
To force searching of whole words only, opt:whole should
be added to the search string. The default behaviour of Alkaline is to search partial matches.
It is of course possible to specify more than one such option by separating them with commas or by adding multiple opt:
entries to the search string. For example, ant opt:whole,case will return all
pages containing the exact word "ant".
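A minimal sketch of how opt: entries might be separated from the search words is shown below. The function name `split_options` is hypothetical, and the sketch only collects option names without applying them.

```python
def split_options(query: str) -> tuple[list[str], set[str]]:
    """Separate opt: entries from the remaining search words."""
    options: set[str] = set()
    words: list[str] = []
    for token in query.split():
        if token.startswith("opt:"):
            # Options may be comma-separated within a single opt: entry.
            options.update(o for o in token[4:].split(",") if o)
        else:
            words.append(token)
    return words, options
```

Both `ant opt:whole,case` and `ant opt:whole opt:case` yield the word list `["ant"]` and the option set `{"whole", "case"}`.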
Hints and Techniques
Punctuation is not indexed, but future versions might use it for phrasal search.
Searching for a word together with a comma will be unsuccessful, because the comma is
treated like any other punctuation mark.
A dash "-" is an ordinary character, except when placed at the beginning of an
unquoted word, where it acts as a boolean search operator. Words containing a dash can be indexed and searched.
Alkaline can search and index numbers and words with digits.
Accented characters, such as those in French,
are translated into their unaccented variants (é to e,
â to a, etc.) at both search and index time.
Multilingual servers using Alkaline might have this feature removed to enable searching
of Russian text, for example.
Hint: Do not search for "the" or "it"; they are also indexed
for statistical purposes. Instead, try to be as consistent as possible. Medium-sized
rare words (of about 6 characters) are the ones found fastest.
The Robotics Institute is part of the
School of Computer Science,
Carnegie Mellon University.
This page maintained by robotwebmaster@ri.cmu.edu.