Security School

Final Exam / Answer No. 5

5) Identify the two most common errors associated with keyword searching across e-mail messages.

The two most common keyword-searching errors are ignoring case sensitivity and failing to stem words properly.

Case sensitivity is the easy one: most keyword searching tools are case-sensitive by default, so you have to turn case sensitivity off any time you're doing policy-based keyword searches. Forgetting to do so is the number one error that most people make.
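As a minimal sketch of the case problem, the Python snippet below runs the same keyword search with and without case sensitivity. The function name `contains_keyword` and the sample message are made up for illustration; they are not from any particular e-mail filtering product.

```python
import re

def contains_keyword(text: str, keyword: str, ignore_case: bool = True) -> bool:
    """Return True if the keyword appears anywhere in the text.

    Hypothetical helper: the keyword is escaped so it is matched
    literally, and case is ignored unless the caller opts out.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(re.escape(keyword), text, flags) is not None

message = "CONFIDENTIAL: do not forward this message."

# A case-sensitive search misses the upper-case keyword in the message.
case_sensitive_hit = contains_keyword(message, "confidential", ignore_case=False)

# Turning case sensitivity off catches it.
case_insensitive_hit = contains_keyword(message, "confidential")
```

Here `case_sensitive_hit` is False and `case_insensitive_hit` is True, which is the policy-search error described above in miniature.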

Stemming is a more significant problem and one that is not handled easily. Without stemming, you have to search for every variation of the word that you're looking for. For example, you can't simply search for the whole word 'poop' because you won't catch the important variations 'poopy,' 'poops,' 'pooped' and 'pooping.' If you instead ignore the white space on either side of the word (spaces, line breaks, tabs and other formatting characters) and match 'poop' anywhere, you'll end up with every word that has 'poop' in it, such as 'nincompoop' (used to describe the person who wanted you to search for poop). Good regular expression tools and search engines handle word stemming automatically for you; more primitive ones require you to handle this kind of stemming by yourself.
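One way to handle stemming by hand, sketched here in Python, is to anchor the keyword at word boundaries and list the suffixes you want to accept. The helper name `build_keyword_pattern` and the suffix list are assumptions chosen to fit the example above. This is not a real stemming algorithm, only a hand-written approximation of one.

```python
import re

def build_keyword_pattern(stem, suffixes=("", "y", "s", "ed", "ing")):
    """Build a case-insensitive pattern for a stem plus listed suffixes.

    Hypothetical helper: \\b anchors the match at word boundaries, so
    the stem is not found inside longer, unrelated words.
    """
    # Try longer suffixes first; the empty suffix matches the bare stem.
    ordered = sorted(suffixes, key=len, reverse=True)
    alternatives = "|".join(re.escape(s) for s in ordered)
    return re.compile(rf"\b{re.escape(stem)}(?:{alternatives})\b", re.IGNORECASE)

pattern = build_keyword_pattern("poop")
sample = "He pooped, she poops; what a nincompoop. Pooping is poopy. poop"

# Word-boundary matching with explicit suffixes.
stemmed_hits = pattern.findall(sample)

# Naive substring matching, for comparison: it also hits 'nincompoop'.
substring_hits = re.findall("poop", sample, re.IGNORECASE)
```

With this sample, `stemmed_hits` contains the five intended variations and skips 'nincompoop', while `substring_hits` has six entries because the naive search also matches inside 'nincompoop'. The trade-off is that every suffix must be listed by hand, which is the manual work that automatic stemming spares you.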



This was first published in April 2005
