The installation of Index Manager includes a handful of configuration files that can be modified to tweak or fine-tune the way searches are performed. This topic deals with what those files are and how you can modify them to suit your needs.
We strongly advise you to make backup copies of these files before you edit them!
Adding or removing noise words
Where: C:\Programdata\FotoWare\Index Manager\Index Control\dtsearch.noi
The dtsearch.noi file is a plain text file that can be edited in Notepad. It contains all of Index Manager's noise words, i.e. words that Index Manager will simply ignore when indexing your archives. Hence, words in this list will not be searchable.
You may want to modify the noise word list to include common filler words in your language that you deem unnecessary for search. Even if you do not want to add anything, it is worth skimming through the list to make sure that none of the words currently listed coincide with words that have a different meaning in your language and would therefore be useful to search for.
In the standard dtsearch.noi file that is installed, the noise words are listed in alphabetical order, but this is not required. You can add words to the top or bottom of the list as you prefer.
If you choose to modify the dtsearch.noi file you will need to completely rebuild your indexes to incorporate the changes.
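The backup-then-edit routine described above can be sketched in Python. This is only an illustration, not a FotoWare tool: the function name add_noise_words is made up, and the sketch assumes that the file holds one noise word per line and can be read as UTF-8. Check both assumptions against your own dtsearch.noi before relying on it.

```python
import shutil
from pathlib import Path


def add_noise_words(noi_path, new_words, encoding="utf-8"):
    """Back up the noise word file, then append words that are not listed yet.

    Assumes one noise word per line (an assumption, see the text above).
    Returns the list of words that were actually added.
    """
    path = Path(noi_path)
    # Keep a copy of the original next to it, e.g. dtsearch.noi.bak
    shutil.copy2(path, path.with_name(path.name + ".bak"))

    text = path.read_text(encoding=encoding)
    existing = {line.strip().lower() for line in text.splitlines() if line.strip()}

    added = []
    for word in new_words:
        word = word.strip()
        if word and word.lower() not in existing:
            existing.add(word.lower())
            added.append(word)

    if added:
        # Make sure the appended words start on a new line
        if text and not text.endswith("\n"):
            text += "\n"
        path.write_text(text + "\n".join(added) + "\n", encoding=encoding)
    return added
```

The words are appended at the bottom because the list does not have to be in alphabetical order. After running something like this, the indexes still have to be rebuilt as described above.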
Configuring stemming
Where: C:\Programdata\FotoWare\Index Manager\Index Control\Stemming.dat
By defining stemming rules you can control how Index Manager interprets searches for morphological variations of a word. In simple terms, stemming lets you search for, say, the singular form of a noun and have the search engine also return files whose metadata contains the plural form.
The Stemming.dat file is a plain text file that can be opened and edited in e.g. Notepad. It contains standard stemming rules that relate to English. Note, however, that the search engine has no way of handling exceptions from the general rule. Hence, you may find that certain words fall outside the scope of the rules configured simply because they have special conjugation patterns.
Information about how to configure stemming can be found in the Stemming.dat file itself. Changing the contents of the stemming file will require you to completely rebuild the index(es) to incorporate the changes.
Usage note: To enable stemming searches in FotoStation, use the advanced search dialog and enable Use word stemming under Search Options. In FotoWeb, stemming can be enabled site-wide for all users on the Searching page under Site Settings.
Configuring synonyms
Where: C:\Programdata\FotoWare\Index Manager\Index Control\THESAUR.XML
The thesaurus file is not enabled by default. After installation you will find that it is called Sample-THESAUR.XML.
Renaming the file to THESAUR.XML will enable it, but you will have to completely rebuild your indexes to incorporate the changes. Since it is an XML file, you may want to use an XML editor to modify it, as that makes it easier to tell the different nodes apart.
The contents of a standard THESAUR.XML file will look something like this:
<?xml version="1.0" encoding="UTF-8" ?>
<dtSearchUserThesaurus>
  <Item>
    <Name>usa</Name>
    <Synonyms>usa "United States"</Synonyms>
  </Item>
  <Item>
    <Name>fast food</Name>
    <Synonyms>"fast food" hamburger pizza taco</Synonyms>
  </Item>
</dtSearchUserThesaurus>
Inside each <Item> node you will find two child elements, <Name> and <Synonyms>.
To define synonyms, insert the operative word in the <Name> field, then insert both the operative word and any synonyms in the <Synonyms> field, as shown above. Note how "United States" and "fast food" are put in quotation marks to indicate that the two words form a compound noun.
Usage note: To enable synonym searches in FotoStation, use the advanced search dialog and enable Use Synonym List under Search Options. In FotoWeb, synonym searching can be enabled site-wide for all users on the Searching page under Site Settings.
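If you have many synonym groups to add, the <Item> structure described above can also be generated with a script. The following Python sketch uses only the standard library. The function name add_synonym_item is hypothetical, and the sketch assumes the file has the same layout as the sample shown earlier. Make a backup first, since the file is rewritten in place.

```python
import xml.etree.ElementTree as ET


def add_synonym_item(thesaurus_path, name, synonyms):
    """Append an <Item> for `name` to a thesaurus file laid out like the sample."""
    tree = ET.parse(thesaurus_path)
    root = tree.getroot()  # expected to be <dtSearchUserThesaurus>

    def quote(term):
        # Multi-word terms are wrapped in quotation marks, as in the sample file
        return f'"{term}"' if " " in term else term

    item = ET.SubElement(root, "Item")
    ET.SubElement(item, "Name").text = name
    # The operative word comes first in <Synonyms>, followed by its synonyms
    terms = [name] + [s for s in synonyms if s != name]
    ET.SubElement(item, "Synonyms").text = " ".join(quote(t) for t in terms)

    # Rewrites the file in place, including the XML declaration
    tree.write(thesaurus_path, encoding="UTF-8", xml_declaration=True)
```

For example, add_synonym_item("THESAUR.XML", "fast food", ["hamburger", "pizza", "taco"]) would add an item equivalent to the second one in the sample. The indexes must still be rebuilt afterwards.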
Configuring allowed characters
Where: C:\Programdata\FotoWare\Index Manager\Index Control\english.abc
Index Manager operates with Unicode, as does XMP. Hence, all special characters will be indexed and searchable. There may, however, be reasons why you would want to treat certain characters in a special way. That's where english.abc comes into play.
The english.abc file is a plain text file and can be edited with e.g. Notepad.
It consists of four nodes, or blocks of character definitions: [Letters], [Hyphens], [Spaces] and [Ignore]. Each is described below.
Important: A character may only occur in one of the nodes. Hence, you will get strange search results if, for example, the backslash character is listed both under [Letters] and [Ignore].
[Letters]
This section contains the valid letters that the indexing engine will allow. Because Index Manager uses Unicode, the letters A-Z are in the list only for historical reasons. You can add additional characters by appending them at the bottom of the list. For example, if you want to index a backslash (\) as a valid, searchable character, insert it four times on a new line below the line for z:
z z Z z
\ \ \ \
[Hyphens]
Characters defined as hyphens will be construed both as searchable characters in their own right and as a space.
Hence, Reuters/Photographer can be found by searching for that exact string, but it can also be found by searching for Reuters photographer.
Important: Hyphens can only be used once per string. Hence, they cannot for example be used for date fields that contain more than one dash (-) as separators.
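The double interpretation of a hyphen character can be pictured with a small Python sketch. This is a toy model of the behaviour described above, not the search engine's actual algorithm, and the function name hyphen_search_terms is invented for the illustration. It handles a single value without spaces.

```python
def hyphen_search_terms(value, hyphen_chars="/"):
    """Toy model: terms under which `value` would be findable if the
    characters in `hyphen_chars` were listed under [Hyphens]."""
    # Reading 1: the hyphen character is searchable in its own right
    terms = [value]
    # Reading 2: the hyphen character acts like a space
    spaced = value
    for ch in hyphen_chars:
        spaced = spaced.replace(ch, " ")
    words = spaced.split()
    if words != [value]:
        terms.extend(words)
    return terms


print(hyphen_search_terms("Reuters/Photographer"))
# ['Reuters/Photographer', 'Reuters', 'Photographer']
```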
[Spaces]
These characters are construed as spaces by the search engine.
[Ignore]
Characters listed here will be completely ignored by the search engine, and the remaining indexed string will be closed up. For example, if the apostrophe were listed here, O'Brien would be indexed as OBrien.
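Because a character may only occur in one of the four blocks, it can be useful to check an edited file for conflicts before rebuilding the indexes. The Python sketch below is one way to do that. It assumes that each block starts with a header line such as [Letters] and that the characters on the following lines are separated by whitespace. The function name find_conflicting_characters is hypothetical, and the assumed layout should be compared with your own english.abc first.

```python
import re
from collections import defaultdict


def find_conflicting_characters(abc_text):
    """Return {character: [section names]} for characters found in more than one section."""
    seen = defaultdict(set)
    section = None
    for line in abc_text.splitlines():
        stripped = line.strip()
        # Assumed header format: a name in square brackets on its own line
        header = re.fullmatch(r"\[([A-Za-z]+)\]", stripped)
        if header:
            section = header.group(1)
            continue
        if section is None:
            continue  # skip anything before the first header
        for token in stripped.split():
            seen[token].add(section)
    return {ch: sorted(secs) for ch, secs in seen.items() if len(secs) > 1}
```

An empty result means that no character was found in more than one block. A result such as {"\\": ["Ignore", "Letters"]} means that the backslash is listed under both [Letters] and [Ignore] and should be removed from one of them.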