Unary Operators

Set Unary Operators:
| Operator | Description |
|---|---|
| NOT | Set complement. |
| WITHIN[:field] | Records with elements within the specified field. The RPN queries "term WITHIN:field" and "field/term" are equivalent, but "field/term" is preferred for performance. |
| WITHIN[:daterange] | Records whose dates fall within the range. |
| INCLUSIVE[:field] | Inclusive within: all hits (and only those) are elements in the specified field. Matching records have their hits exclusively in the specified field and nowhere else. |
| XWITHIN[:field] | Absolutely NOT in the specified field. |
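
As a rough illustration of the set semantics described above, the following Python sketch models each hit as a (record, field) pair and applies the four operators to a toy result. The Hit class, the helper names and the sample data are hypothetical and not part of the engine. Reading XWITHIN as "no hit inside the field" is an assumption based on the one-line description, and nested fields are ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    record: str  # record identifier (hypothetical)
    field: str   # field containing the matching element


def records(hits):
    """All records that have at least one hit."""
    return {h.record for h in hits}


def op_not(all_records, result):
    """NOT: set complement relative to every record in the index."""
    return set(all_records) - set(result)


def op_within(hits, field):
    """WITHIN:field: records with at least one hit inside the field."""
    return {h.record for h in hits if h.field == field}


def op_inclusive(hits, field):
    """INCLUSIVE:field: records whose hits are all inside the field."""
    return {r for r in records(hits)
            if all(h.field == field for h in hits if h.record == r)}


def op_xwithin(hits, field):
    """XWITHIN:field: records with no hit inside the field (assumed reading)."""
    return records(hits) - op_within(hits, field)


hits = [
    Hit("r1", "TITLE"),
    Hit("r1", "BODY"),
    Hit("r2", "TITLE"),
    Hit("r3", "BODY"),
]
all_records = {"r1", "r2", "r3", "r4"}

print(sorted(op_within(hits, "TITLE")))            # ['r1', 'r2']
print(sorted(op_inclusive(hits, "TITLE")))         # ['r2']
print(sorted(op_xwithin(hits, "TITLE")))           # ['r3']
print(sorted(op_not(all_records, records(hits))))  # ['r4']
```

Record r1 shows the difference between WITHIN and INCLUSIVE: it has a hit in TITLE, so WITHIN keeps it, but it also has a hit elsewhere, so INCLUSIVE drops it.
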
Special Unary Operators
| Operator | Description |
|---|---|
| BOOST:fff.ff | Boost the score of the set by fff.ff (Score = Score * fff.ff). |
| REDUCE[:nnn] | Reduce the set to those records with nnn matching terms. NOTE: REDUCE:metric is a special kind of unary operator that trims the result. |
| TRIM:fff.ff | Trim the set to contain at most a given number of records. If fff.ff is an integer, it is the maximum number. If fff.ff is a floating point number between 0 and 1, it is taken as a fraction of the total number of records in the index: for an index with 1 million records, TRIM:0.1 means at most 100000. A floating point number > 1 is taken as the integer component plus the fractional component applied to the number of records. Example: 1000.01 for the index above = 1000 + 10000 = 11000. |
| HITCOUNT:nnn | Trim the set to contain only those records with, when nnn is positive, at least nnn hits. When nnn is negative, the set contains only those records with no more than -nnn hits. |

Example: HITCOUNT:10 would return those with no less than 10 hits. HITCOUNT:-10 would return those records with up to 10 hits but not more. The combination HITCOUNT:-10 HITCOUNT:10 creates the set of records with exactly 10 hits.
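
The HITCOUNT thresholds in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hitcount function and the {record: hits} mapping are hypothetical, and treating 0 as "no filtering" is a guess, since the text only describes positive and negative values.

```python
def hitcount(result, n):
    """Filter a {record: hit_count} mapping as HITCOUNT:n is described.

    n > 0 keeps records with at least n hits; n < 0 keeps records with
    no more than -n hits. n == 0 keeps everything (an assumption).
    """
    if n >= 0:
        return {rec: hits for rec, hits in result.items() if hits >= n}
    return {rec: hits for rec, hits in result.items() if hits <= -n}


result = {"a": 3, "b": 10, "c": 25}

at_least_10 = hitcount(result, 10)
at_most_10 = hitcount(result, -10)
# Applying both operators in sequence leaves exactly 10 hits.
exactly_10 = hitcount(hitcount(result, -10), 10)

print(sorted(at_least_10))  # ['b', 'c']
print(sorted(at_most_10))   # ['a', 'b']
print(sorted(exactly_10))   # ['b']
```
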

One may specify this as HITCOUNT>nnn (HITCOUNT>10 is equivalent to HITCOUNT:11), HITCOUNT>=nnn (same as HITCOUNT:nnn), HITCOUNT
| Operator | Description |
|---|---|
| SORTBY:<keyword> | Sort the set (reserved names for <keyword>: Key, Hits, Date, Index, Score, AuxCount, Newsrank, Function, Category, ReverseHits, ReverseDate, etc.). |

NOTE: The default sort **MUST** be set to unsorted for the query sort to propagate into the final set.
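
The arithmetic behind the TRIM argument described earlier in this section can be checked with a small Python sketch. The function name trim_limit is hypothetical, and parsing the argument as a decimal string and truncating the fractional part of the result are assumptions of this sketch. The engine's own rounding is not specified in the text.

```python
from decimal import Decimal


def trim_limit(argument: str, total_records: int) -> int:
    """Maximum set size implied by TRIM:<argument> for a given index size.

    The integer component is taken as an absolute count and the fractional
    component as a fraction of the total number of records in the index.
    """
    value = Decimal(argument)  # exact decimal arithmetic, no float error
    if value < 0:
        raise ValueError("TRIM argument must be non-negative")
    whole = int(value)         # integer component
    fraction = value - whole   # fractional component, 0 <= fraction < 1
    return whole + int(fraction * total_records)


print(trim_limit("500", 1_000_000))      # 500
print(trim_limit("0.1", 1_000_000))      # 100000
print(trim_limit("1000.01", 1_000_000))  # 11000
```

The last two calls reproduce the figures from the TRIM description: 0.1 of 1 million records is 100000, and 1000.01 gives 1000 + 10000 = 11000.
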
Unary Neo-Operators (Sets)
| Operator | Description |
|---|---|
| FILE:<filespec> | The set of all records whose input file path matches <filespec> (example: FILE:shakesp*.xml). |
| EXTENSION:<ext> | The set of all records whose input file has the extension <ext> (example: EXTENSION:cgm). |
| KEY:<keyspec> | The set of all records whose key matches <keyspec>. |
| DOCTYPE:<doctype> | The set of all records whose doctype (index format) matches <doctype>. |

Note: The above specifications fully support wildcards. They may not be used alone, only as part of a query sentence with at least one term and a binary operator. Example (Infix):

"hedgehog" and FILE:shakespeare.*

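To make the combination of a term with a neo-operator concrete, here is a minimal Python sketch of the infix example above. It uses Python's fnmatch module as a stand-in for the engine's wildcard matching, which may follow different rules. The record numbers, file names and term index are invented for illustration, and file names are given without directories to sidestep how paths are matched.

```python
from fnmatch import fnmatchcase
import os.path

# Hypothetical index: record number -> input file name.
INPUT_FILES = {
    1: "shakespeare.xml",
    2: "shakespeare.sgml",
    3: "milton.xml",
    4: "plan.cgm",
}

# Hypothetical term index: term -> records containing it.
TERM_INDEX = {"hedgehog": {1, 3}}


def file_set(filespec):
    """FILE:<filespec>: records whose input file matches the pattern."""
    return {rec for rec, name in INPUT_FILES.items()
            if fnmatchcase(name, filespec)}


def extension_set(ext):
    """EXTENSION:<ext>: records whose input file has a matching extension."""
    return {rec for rec, name in INPUT_FILES.items()
            if fnmatchcase(os.path.splitext(name)[1].lstrip("."), ext)}


# "hedgehog" and FILE:shakespeare.*  (binary AND is a set intersection)
result = TERM_INDEX["hedgehog"] & file_set("shakespeare.*")

print(sorted(result))                    # [1]
print(sorted(file_set("shakesp*.xml")))  # [1]
print(sorted(extension_set("cgm")))      # [4]
```
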

The WITHIN:field operator can be quite effective in exploring (partially unknown) structure paths in XML data.

In our Shakespeare example (Jon Bosak's SGML/XML markup of Shakespeare's works), the paths to the LINE elements, where things are said, are:
- PLAY\ACT\EPILOGUE\SPEECH\LINE
- PLAY\ACT\PROLOGUE\SPEECH\LINE
- PLAY\PROLOGUE\SPEECH\LINE
- PLAY\INDUCT\SCENE\SPEECH\LINE
- PLAY\INDUCT\SPEECH\LINE
- PLAY\ACT\SCENE\SPEECH\LINE

And the only path in which LINE has a child is:
- PLAY\ACT\SCENE\SPEECH\LINE\STAGEDIR

To search for a term in the field LINE, one would typically not use one of these operators but simply issue LINE/term. The standard field search is faster and more efficient than WITHIN. WITHIN (and its friends) are, however, very powerful and extremely useful.

By specifying in an RPN query: LINE/term WITHIN:PROLOGUE
one can restrict the matches to those terms in a LINE that are within a PROLOGUE.

Multiple unary operators can be applied so one could also say: term WITHIN:LINE WITHIN:PROLOGUE.

To search for the term "love" in LINEs that are in a scene: LINE/love WITHIN:SCENE

To restrict this to scenes that are in an act (and not those in an induct): LINE/love WITHIN:SCENE WITHIN:ACT

If we wanted those explicitly not in INDUCT: LINE/love WITHIN:SCENE XWITHIN:INDUCT
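
The way these RPN queries narrow the result can be traced with a toy evaluator in Python. This is a sketch under simplifying assumptions, not the engine's implementation: it places one hypothetical hit for "love" under each of the six LINE paths listed above, treats a hit as being "within" a field when the field name appears anywhere in its path, and filters hits rather than records. The evaluate function and its data structures are invented for illustration.

```python
LINE_PATHS = [
    r"PLAY\ACT\EPILOGUE\SPEECH\LINE",
    r"PLAY\ACT\PROLOGUE\SPEECH\LINE",
    r"PLAY\PROLOGUE\SPEECH\LINE",
    r"PLAY\INDUCT\SCENE\SPEECH\LINE",
    r"PLAY\INDUCT\SPEECH\LINE",
    r"PLAY\ACT\SCENE\SPEECH\LINE",
]

# Toy index: one hit for the term "love" under each path.
HITS = [("love", tuple(path.split("\\"))) for path in LINE_PATHS]


def evaluate(query):
    """Evaluate an RPN query of field/term, term, WITHIN: and XWITHIN: tokens."""
    stack = []
    for token in query.split():
        if token.startswith("XWITHIN:"):
            field = token[len("XWITHIN:"):]
            # Unary operator: pop one set, keep hits outside the field.
            stack.append([h for h in stack.pop() if field not in h[1]])
        elif token.startswith("WITHIN:"):
            field = token[len("WITHIN:"):]
            # Unary operator: pop one set, keep hits inside the field.
            stack.append([h for h in stack.pop() if field in h[1]])
        elif "/" in token:
            field, term = token.split("/", 1)
            stack.append([h for h in HITS if h[0] == term and field in h[1]])
        else:
            stack.append([h for h in HITS if h[0] == token])
    if len(stack) != 1:
        raise ValueError("query must reduce to a single set")
    return ["\\".join(path) for _, path in stack[0]]


for path in evaluate("LINE/love WITHIN:SCENE WITHIN:ACT"):
    print(path)  # PLAY\ACT\SCENE\SPEECH\LINE

for path in evaluate("LINE/love WITHIN:SCENE XWITHIN:INDUCT"):
    print(path)  # PLAY\ACT\SCENE\SPEECH\LINE
```

On this toy data, LINE/love WITHIN:SCENE alone returns two paths (the scene in the induct and the scene in the act), and each further unary operator pops that set from the stack and pushes a narrower one.
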

The mechanism is very powerful, flexible and more generic than the XPath models, but we are, after all, also more generic and abstract than XML.