The Document Library: search expressions

If you want to read more about the text search technology used here, you’ll need to know that our MySQL document database uses the InnoDB engine and the boolean full-text search method, not the natural language one. The most relevant page is probably mysql.com’s documentation on boolean full-text searches.

Remember that you are searching document titles and descriptions that we’ve provided for each document – not the documents themselves.
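For the curious, here is a minimal sketch of how a search expression might be handed to MySQL in boolean mode. The table name `documents` and the columns `id`, `title` and `description` are hypothetical, as is the `%s` placeholder style (used by common Python MySQL drivers); this is not necessarily how the library itself is implemented.

```python
def build_search_query(expression):
    """Return (sql, params) for a boolean-mode full-text search.

    Assumption: a table `documents` with a FULLTEXT index over
    (title, description). These names are illustrative only.
    """
    sql = (
        "SELECT id, title, description "
        "FROM documents "
        "WHERE MATCH(title, description) AGAINST (%s IN BOOLEAN MODE)"
    )
    # The expression is passed as a bound parameter, never pasted into the SQL.
    return sql, (expression,)


sql, params = build_search_query('clima* -climate')
```

Only the title and description columns appear in the MATCH clause, which is why the text of the documents themselves is never searched.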

A search expression contains words, or search terms. Each word may contain only alphanumeric characters. Words are separated by spaces.

Words are not case-sensitive, so “France” is the same as “france”.

Search expressions may be no longer than fifty characters.

Words in document descriptions and titles have been compiled into an index, except if they’re less than three characters long, or on a list of common words (stopwords). Consequently any word in a search expression either less than three characters long, or among the stopwords, will be ignored; you'll get a warning message with your results if that happens.
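The rule above can be sketched in a few lines of Python. The function name `ignored_words` is hypothetical, and the stopword set is simply the list given at the end of this page; the real warning message is produced by the site, not by this code.

```python
import re

# Stopwords as listed on this page (assumed to match the live database).
STOPWORDS = {
    "a", "about", "an", "are", "as", "at", "be", "by", "com", "de", "en",
    "for", "from", "how", "i", "in", "is", "it", "la", "of", "on", "or",
    "that", "the", "this", "to", "was", "what", "when", "where", "who",
    "will", "with", "und", "www",
}
MIN_WORD_LENGTH = 3  # words shorter than this are not indexed


def ignored_words(expression):
    """Return the words in the expression that the search would ignore."""
    # Pull out runs of alphanumeric characters; anything else is skipped.
    words = re.findall(r"[A-Za-z0-9]+", expression)
    return [
        w for w in words
        if len(w) < MIN_WORD_LENGTH or w.lower() in STOPWORDS
    ]


print(ignored_words("sea level in UK"))
```

Here “in” is flagged as a stopword and “UK” as too short, while “sea” and “level” pass.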

As well as words, a search expression may contain operators.
Any word may be immediately preceded by just one of the operators + or -.
Any word may be immediately followed by the operator *.

There are some sample search expressions below.

the details:

+ preceding a word: Indicates that a word must be present in a document’s description or title.
- preceding a word: Indicates that a word must not be present.

It acts only to exclude document descriptions and titles that are otherwise matched by other search terms.

A search that contains only terms preceded by - returns no documents; it does not return “all document descriptions or titles except those containing any of the excluded terms”.
(no operator): By default (when neither + nor - is specified), the word is optional, but a document whose description or title contains it is rated higher.
* following a word: Truncation (or wildcard) operator. Unlike the other operators, it is appended to the word to be affected. The wildcarded word is treated as a prefix that, to produce any results, must be present at the start of one or more words in the index.

Note: * is only valid at the end of a word.
"  " enclosing words: The enclosed phrase matches only documents whose description or title contains the phrase literally, as it was typed. Non-word characters need not be matched exactly: phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".

If the phrase contains no words that are in the index, the result is empty. The words might not be in the index because they do not exist in the text, because they are stopwords, or because they are shorter than three characters. So, for instance, the phrase
"in UK"
will never be found; “in” is a stopword, and “UK” is too short.
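To make the operator rules above concrete, here is a toy matcher in Python. It is only an illustration of the rules as described on this page, not the MySQL implementation: it ignores stopwords, minimum word length and ranking, and the names `matches`, `term_matches` and `words_of` are invented for this sketch.

```python
import re


def words_of(text):
    """Lower-cased alphanumeric words of a text, in order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(doc_words, phrase_words):
    """True if phrase_words occur consecutively, in order, in doc_words."""
    n = len(phrase_words)
    if n == 0:
        return False
    return any(doc_words[i:i + n] == phrase_words
               for i in range(len(doc_words) - n + 1))


def term_matches(term, doc_words):
    if term.startswith('"'):          # quoted phrase
        return contains_phrase(doc_words, words_of(term))
    if term.endswith("*"):            # wildcard: prefix of some word
        prefix = term[:-1].lower()
        return any(w.startswith(prefix) for w in doc_words)
    return term.lower() in doc_words  # plain word


def matches(expression, text):
    """Would this title or description be returned for the expression?"""
    doc_words = words_of(text)
    terms = re.findall(r'[+-]?(?:"[^"]*"|\S+)', expression)
    found_positive = False
    for term in terms:
        op = term[0] if term[0] in "+-" else ""
        hit = term_matches(term[len(op):], doc_words)
        if op == "-":
            if hit:
                return False          # excluded word is present
        elif op == "+":
            if not hit:
                return False          # required word is missing
            found_positive = True
        elif hit:
            found_positive = True     # an optional word is present
    # A search made only of excluded terms matches nothing.
    return found_positive


print(matches("clima* -climate", "Le climat en France"))
print(matches("-climate", "Sea level rise"))
```

The final `return found_positive` is what makes an expression containing only - terms return no documents.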

Other operators (lower relevance: <; increase relevance: >; negation: ~; grouping: ( and ); and @distance), even though MySQL full-text search allows them, aren’t supported or allowed here.
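The syntax rules on this page can be gathered into a small checker. This is a hedged sketch: the function name `validate` and the wording of the messages are invented, and it assumes that a quoted phrase may also carry a leading + or -, which this page does not state.

```python
import re

MAX_LENGTH = 50                      # "no longer than fifty characters"
UNSUPPORTED = set("<>~()@")          # operators not allowed here
# One optional + or -, an alphanumeric word, one optional trailing *.
TERM = re.compile(r"[+-]?[A-Za-z0-9]+\*?")


def validate(expression):
    """Return a list of problems; an empty list means the syntax looks fine."""
    problems = []
    if len(expression) > MAX_LENGTH:
        problems.append("expression is longer than 50 characters")
    bad = sorted(set(expression) & UNSUPPORTED)
    if bad:
        problems.append("unsupported operator(s): " + " ".join(bad))
    # Set quoted phrases aside (assumption: an optional +/- prefix is allowed),
    # then check each remaining space-separated term.
    without_phrases = re.sub(r'[+-]?"[^"]*"', " ", expression)
    for term in without_phrases.split():
        if set(term) & UNSUPPORTED:
            continue                 # already reported above
        if not TERM.fullmatch(term):
            problems.append(f"malformed term: {term}")
    return problems


print(validate("clima* -climate"))
print(validate("*france"))
```

The checker covers syntax only; whether a word is too short or is a stopword is a separate matter that produces a warning rather than an error.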

Results are not ordered by any measure of how well the search expression matches document descriptions or titles.

some more examples:

pattern: returns all documents whose description or title contains:
Hensen: the name “Hensen”
clima* -climate: a word beginning “clima”, but not the word “climate”; probably “climat”
Fran* Fren*: documents referring to things French, and also franking machines, and frenetic.
brit* engl* uk: documents referring to things British or English. You’ll get a warning message, because “uk” is too short.
COP2*: documents referring to COP22, COP23….
"sea level": just that

The thirty-six stopwords in the current database environment appear to be:

a  about  an  are  as  at  be  by  com  de  en  for  from  how  i  in  is  it  la  of  on  or  that  the  this  to  was  what  when  where  who  will  with  und  the  www

If you’d like to recommend an addition to our library, email us.