HP Integrated Archive Platform User Manual

HP Integrated Archive Platform User Guide
Version 2.0
Includes information about using the Integrated Archive Platform (IAP) Web UI.
For additional user information on Email Archiving software for Microsoft Exchange and IBM Domino, see the HP Email Archiving software for Microsoft Exchange User Guide and HP Email Archiving software for IBM Domino User Guide contained in those products.
Part number: PDF
Second edition: November 2008
Legal and notice information
© Copyright 2004-2008 Hewlett-Packard Development Company, L.P.
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license.
The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Microsoft, Windows, Windows XP, and Windows NT are U.S. registered trademarks of Microsoft Corporation.
Adobe and Acrobat are trademarks of Adobe Systems Incorporated.
Contents
About this guide ..... 7
  Intended audience ..... 7
  Prerequisites ..... 7
  Related documentation ..... 7
  Document conventions and symbols ..... 7
  HP technical support ..... 8
  Subscription service ..... 8
  Other web sites ..... 8
1 IAP overview ..... 11
  Understanding document archiving ..... 11
  Understanding searching and document indexing ..... 12
    Indexed document types ..... 12
    Message MIME types (advanced users) ..... 12
2 IAP Web Interface ..... 17
  Logging in and out ..... 17
  Understanding the user interface ..... 17
    Using the toolbar ..... 17
    Search basics ..... 18
  Common tasks ..... 18
    Completing simple searches ..... 19
    Completing advanced searches ..... 20
    Displaying query or search results ..... 24
    Saving query or search criteria ..... 26
    Saving query or search results ..... 27
    Sending query or search results ..... 28
    Exporting query or search results ..... 29
    Accessing saved criteria ..... 29
    Accessing saved results ..... 29
    Copying saved results to a quarantine repository ..... 30
    Deleting quarantine repositories ..... 30
    Searching audit log repositories ..... 31
    Changing your password ..... 34
    Changing your language ..... 34
  Troubleshooting ..... 34
    Unable to display saved results ..... 35
    Problems exporting results ..... 35
3 Query expression syntax and matching ..... 37
  Query expressions ..... 37
  Word characters ..... 37
    Word characters and separators ..... 38
    Regular expression definition of English word characters ..... 38
  Letters and digits in different character sets ..... 38
    Letters and digits defined ..... 38
    Letters and digits in files ..... 38
  Matching words ..... 39
  Matching similar words ..... 40
    Fuzzy words ..... 40
    Measuring word similarity ..... 40
  Matching word sequences ..... 40
    Simple word sequences ..... 40
    Proximity word sequences ..... 41
    Matching word sequences in attachments ..... 41
  Boolean query expressions ..... 43
  Nested Boolean query expressions ..... 44
  Query expression examples ..... 44
Index ..... 47
Figures
1 IAP Web Interface toolbar ..... 17
2 Simple Search page ..... 20
3 Advanced Search page (email content type) ..... 21
4 Query Results page (email content type) ..... 24
5 Query results navigation bar ..... 25
6 Save Criteria page ..... 27
7 Save Results page ..... 28
8 Saved Criteria view, Query Manager page ..... 29
9 Saved Results view, Query Manager page ..... 30
10 Simple Search page (document content type) ..... 31
11 Advanced Search page (document content type) ..... 32
12 Audit log details ..... 34
Tables
1 Document conventions ..... 7
2 EAs applications ..... 11
3 Office 2007 supported file extensions and MIME types ..... 13
4 Office 2007 supported features ..... 14
5 Office 2007 supported properties ..... 14
6 Toolbar buttons, IAP Web Interface ..... 18
7 IAP Web Interface tasks ..... 19
8 Additional advanced search query fields ..... 22
9 Query results navigation bar ..... 26
10 Logged actions and descriptions ..... 33
11 Additional advanced search query fields (for audit log repository searches) ..... 33
12 Supported character sets ..... 39
13 Excel spreadsheet ..... 42
14 Boolean query expressions ..... 43
15 Query expression examples ..... 45
About this guide
This guide provides information about using the IAP Web Interface.
For additional information on using and configuring Email Archiving software for Microsoft Exchange and IBM Domino, see the HP Email Archiving software for Microsoft Exchange User Guide and HP Email Archiving software for IBM Domino User Guide contained in those products.
Intended audience
This guide is intended for users of the IAP Web UI.
Prerequisites
Prerequisites for using this product include:
• Knowledge of using web browsers
Related documentation
In addition to this guide, HP provides the following on the IAP documentation CD:
• HP Integrated Archive Platform Administrator Guide
• HP Integrated Archive Platform System Release Notes
In addition, online help is available for the IAP Platform Control Center (PCC) and IAP Web Interface. The IAP Web Interface online help is a subset of this guide. The IAP PCC online help is a subset of the HP Integrated Archive Platform Administrator Guide.
Document conventions and symbols
Table 1 Document conventions
Convention                                 Element
Blue text: Table 1                         Cross-reference links and email addresses
Blue, underlined text: http://www.hp.com   Web site addresses
Bold text                                  • Keys that are pressed
                                           • Text typed into a GUI element, such as a box
                                           • GUI elements that are clicked or selected, such as menu and list items, buttons, tabs, and check boxes
Italic text                                Text emphasis
Monospace text                             • File and directory names
                                           • System output
                                           • Code
                                           • Commands, their arguments, and argument values
Monospace, italic text                     • Code variables
                                           • Command variables
Monospace, bold text                       Emphasized monospace text
WARNING!
Indicates that failure to follow directions could result in bodily harm or death.
CAUTION:
Indicates that failure to follow directions could result in damage to equipment or data.
IMPORTANT:
Provides clarifying information or specific instructions.
NOTE:
Provides additional information.
TIP:
Provides helpful hints and shortcuts.
HP technical support
Telephone numbers for worldwide technical support are listed on the HP support web site: http://www.hp.com/support/.
Collect the following information before calling:
• Technical support registration number (if applicable)
• Product serial numbers
• Product model names and numbers
• Applicable error messages
• Operating system type and revision level
• Detailed, specific questions
For continuous quality improvement, calls may be recorded or monitored.
Subscription service
HP strongly recommends that customers register online using the Subscriber’s choice web site: http://www.hp.com/go/e-updates.
Subscribing to this service provides you with email updates on the latest product enhancements, newest driver versions, and firmware documentation updates as well as instant access to numerous other product resources.
After subscribing, locate your products by selecting Storage and then Storage Archiving under Product Category.
Other web sites
For other product information, see the following HP web sites:
• http://www.hp.com
• http://www.hp.com/go/storage
• http://www.hp.com/service_locator
• http://www.hp.com/support/manuals
1 IAP overview
This section introduces HP Integrated Archive Platform from a user perspective.
IAP is a fault-tolerant, secure system of hardware and software that archives files and email messages for your organization, and lets you search for archived documents. IAP provides the following main functions:
• Automatic, active data archiving (email and specific types of documents) that helps your organization meet regulatory requirements.
• Interactive data querying to search for and retrieve archived data according to various criteria.
The IAP Web Interface allows you to use your web browser to search for documents archived on the system, and save and reuse your search-query definitions and results. See “IAP Web Interface” on page 17 and “Query expression syntax and matching” on page 37.
To interact with the system, you can use the following EAs applications:
Table 2 EAs applications
Application: EAs for Microsoft Exchange (customer option)
What You Can Do: Search for email messages using Microsoft Outlook with an Exchange mail server. View and work with archived email messages. See the HP EAs for Microsoft Exchange User Guide, which is included on the HP EAs for Exchange option documentation CD; it is also available on http://www.hp.com.
Application: EAs for Domino (customer option)
What You Can Do: Search for email messages using IBM Lotus Notes with a Domino mail server. View and work with archived email messages. See the HP EAs for IBM Domino User Guide, which is included on the HP EAs for Domino option documentation CD; it is also available on http://www.hp.com.
The IAP Web Interface is available to all users. EAs for Exchange and for Domino are independent customer options. Depending on the configuration of your system, each may or may not be available to you.
Understanding document archiving
IAP archives files and email messages associated with registered users. With EAs, you can find and retrieve archived documents to which you have access.
Archiving involves physically storing copies of a document (file or email message), but also virtually storing it in one or more repositories. A repository is an abstract data store, which is a virtual collection of documents associated with routing rules (for storing) and user access control lists (for retrieving):
• Documents associated with a given user are archived to a given set of repositories. User-repository associations are defined by routing rules.
• A user has query and retrieval access to a given set of repositories. This is controlled by access control lists associated with each repository.
Most users have query and retrieval access to only their own documents, which are archived in their individual repositories. The system automatically archives, in your individual repository, all email messages associated with your email account; that is, all messages you send or receive.
In addition to being automatically routed to your individual repository, your email is probably also routed to one or more other repositories established by your company or organization. For example, a company audit repository may be used to keep track of all company email. Some users have access to other repositories besides their own. For example, your manager or supervisor may have access to your repository.
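The relationship between routing rules and access control lists can be pictured with a small model. The following Python sketch is illustrative only: the user names, repository names, and function names are hypothetical, and it does not describe how IAP stores or evaluates its rules.

```python
# Hypothetical routing rules: each user maps to the repositories that
# receive a copy of that user's documents when they are archived.
ROUTING_RULES = {
    "alice": {"alice-repo", "company-audit"},
    "bob": {"bob-repo", "company-audit"},
}

# Hypothetical access control lists: each repository maps to the users
# who are allowed to query it and retrieve documents from it.
ACCESS_CONTROL = {
    "alice-repo": {"alice", "carol"},  # carol plays the role of alice's manager
    "bob-repo": {"bob"},
    "company-audit": {"auditor"},
}


def repositories_for_archiving(user):
    """Return the repositories a user's documents are routed to."""
    return set(ROUTING_RULES.get(user, set()))


def searchable_repositories(user):
    """Return the repositories a user may query, according to the ACLs."""
    return {repo for repo, users in ACCESS_CONTROL.items() if user in users}
```

In this model, alice's email is stored in both her own repository and the audit repository, but she can search only her own; the routing side and the access side are deliberately independent.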
Understanding searching and document indexing
You can search for any documents archived in your repository (or any other repositories to which you have access), whether the documents are email messages or files. When you search for a document, your query is checked against an index of words that is updated each time a document is archived.
Indexing the contents of a document involves cataloging the document words to prepare them for later searching. Separators (such as punctuation) between words are ignored during indexing. Note that there is a time delay from when files are archived to when they are indexed. Documents archived less than an hour ago may or may not appear in query or search results depending on the system’s configuration.
You can search the contents of a document only if the contents have been indexed. You can search for other kinds of files only by using external identifying information.
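As a rough illustration of what an index of words means, the following Python sketch builds a tiny word index in memory. It is a teaching aid under stated assumptions, not the IAP indexing engine: the class and function names are hypothetical, words are assumed to be runs of ASCII letters and digits, and matching is assumed to be case-insensitive.

```python
import re
from collections import defaultdict

# Assumption: a word is a run of ASCII letters or digits; everything else
# (spaces, punctuation) is treated as a separator and ignored.
WORD = re.compile(r"[A-Za-z0-9]+")


def words(text):
    """Split text into lowercase words, dropping separators."""
    return [w.lower() for w in WORD.findall(text)]


class WordIndex:
    """Maps each word to the set of document IDs that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def archive(self, doc_id, text):
        """Update the index when a document is archived."""
        for word in words(text):
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return IDs of documents containing every word in the query."""
        terms = words(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

Because punctuation is dropped both when archiving and when querying, a query such as "costs!" finds a document containing "costs." in this model.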
Indexed document types
In addition to email messages, the following files are indexed:
• Plain text files
• Rich text files (.rtf)
• HTML (HyperText Markup Language) files
• Files used by the following Microsoft Office programs: Word, Excel, PowerPoint, and Access
• PDF (Portable Document Format) files viewed with Adobe Acrobat Reader
• Zip files
• Embedded messages (RFC 822 messages)
NOTE:
Email message formatting has no bearing on indexing. Only the words you see in your email client are indexing candidates. Invisible source-code words, such as HTML markup tags, are ignored.
NOTE:
For zip files and embedded messages, the content inside the files is expanded and indexed. We support indexing of MS Office files for MS Office 2007 and prior releases.
Message MIME types (advanced users)
An email message can contain message parts of possibly different MIME (Multipurpose Internet Mail Extensions) Content-Types. The following Content-Types are indexed and each corresponds to one of the indexed document types:
text/xml
text/plain
text/html
application/rtf
application/msword
application/vnd.ms-excel
application/vnd.ms-powerpoint
application/msaccess
application/pdf
application/zip
An email message that is entirely plain text, not MIME, is indexed. Also, if an email message has been attached to another email message, the attached email message is not indexed.
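For advanced users who want to see which parts of a message fall under the Content-Types listed above, the following Python sketch uses the standard library email package to list the Content-Type of each message part and keep only the listed types. The function name is hypothetical, the set contains only the ten types listed above (not the Office 2007 types), and the sketch does not model how IAP treats messages attached to other messages.

```python
from email.message import EmailMessage, Message

# The Content-Types listed in this section as indexed.
INDEXED_CONTENT_TYPES = frozenset({
    "text/xml",
    "text/plain",
    "text/html",
    "application/rtf",
    "application/msword",
    "application/vnd.ms-excel",
    "application/vnd.ms-powerpoint",
    "application/msaccess",
    "application/pdf",
    "application/zip",
})


def indexed_part_types(message: Message) -> list:
    """Return the Content-Types of message parts that appear in the list."""
    return [
        part.get_content_type()
        for part in message.walk()
        if part.get_content_type() in INDEXED_CONTENT_TYPES
    ]


# Usage: a message with a text body, a PDF attachment, and a PNG image.
example = EmailMessage()
example["Subject"] = "Report"
example.set_content("See the attached report.")
example.add_attachment(b"%PDF-1.4", maintype="application", subtype="pdf",
                       filename="report.pdf")
example.add_attachment(b"\x89PNG", maintype="image", subtype="png",
                       filename="logo.png")
print(indexed_part_types(example))
```

A message that has no MIME structure at all is reported by the email package as text/plain, which matches the statement that entirely plain-text messages are indexed.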
IAP 2.0 provides document indexing support for Microsoft Office 2007. The supported MIME types and extensions are shown in Table 3. Support for features is shown in Table 4. Support for properties is shown in Table 5.
Office 2007 documents archived prior to installing IAP 1.6.1 or 2.0 will not be indexed or content searchable.
Table 3 Office 2007 supported file extensions and MIME types
File extension  File type                                                   MIME type
.docx           Microsoft Office Word 2007 document                         application/vnd.openxmlformats-officedocument.wordprocessingml.document
.docm           Office Word 2007 macro-enabled document                     application/vnd.ms-word.document.macroEnabled.12
.dotx           Office Word 2007 template                                   application/vnd.openxmlformats-officedocument.wordprocessingml.template
.dotm           Office Word 2007 macro-enabled document template            application/vnd.ms-word.template.macroEnabled.12
.xlsx           Microsoft Office Excel 2007 workbook                        application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
.xlsm           Office Excel 2007 macro-enabled workbook                    application/vnd.ms-excel.sheet.macroEnabled.12
.xltx           Office Excel 2007 template                                  application/vnd.openxmlformats-officedocument.spreadsheetml.template
.xltm           Office Excel 2007 macro-enabled workbook template           application/vnd.ms-excel.template.macroEnabled.12
.xlam           Office Excel 2007 add-in                                    application/vnd.ms-excel.addin.macroEnabled.12
.pptx           Microsoft Office PowerPoint 2007 presentation               application/vnd.openxmlformats-officedocument.presentationml.presentation
.pptm           Office PowerPoint 2007 macro-enabled presentation           application/vnd.ms-powerpoint.presentation.macroEnabled.12
.ppsx           Office PowerPoint 2007 slide show                           application/vnd.openxmlformats-officedocument.presentationml.slideshow
.ppsm           Office PowerPoint 2007 macro-enabled slide show             application/vnd.ms-powerpoint.slideshow.macroEnabled.12
.potx           Office PowerPoint 2007 template                             application/vnd.openxmlformats-officedocument.presentationml.template
.potm           Office PowerPoint 2007 macro-enabled presentation template  application/vnd.ms-powerpoint.template.macroEnabled.12
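The mapping in Table 3 can be expressed as a simple lookup. The following Python sketch is a convenience illustration only; the dictionary and function names are hypothetical, and the case-insensitive extension handling is an assumption rather than documented IAP behavior.

```python
from pathlib import PurePath

_OOXML = "application/vnd.openxmlformats-officedocument."

# File extensions and MIME types taken from Table 3.
OFFICE_2007_MIME_TYPES = {
    ".docx": _OOXML + "wordprocessingml.document",
    ".docm": "application/vnd.ms-word.document.macroEnabled.12",
    ".dotx": _OOXML + "wordprocessingml.template",
    ".dotm": "application/vnd.ms-word.template.macroEnabled.12",
    ".xlsx": _OOXML + "spreadsheetml.sheet",
    ".xlsm": "application/vnd.ms-excel.sheet.macroEnabled.12",
    ".xltx": _OOXML + "spreadsheetml.template",
    ".xltm": "application/vnd.ms-excel.template.macroEnabled.12",
    ".xlam": "application/vnd.ms-excel.addin.macroEnabled.12",
    ".pptx": _OOXML + "presentationml.presentation",
    ".pptm": "application/vnd.ms-powerpoint.presentation.macroEnabled.12",
    ".ppsx": _OOXML + "presentationml.slideshow",
    ".ppsm": "application/vnd.ms-powerpoint.slideshow.macroEnabled.12",
    ".potx": _OOXML + "presentationml.template",
    ".potm": "application/vnd.ms-powerpoint.template.macroEnabled.12",
}


def office_2007_mime_type(file_name):
    """Return the Table 3 MIME type for a file name, or None if not listed."""
    # Assumption: extensions are compared without regard to case.
    return OFFICE_2007_MIME_TYPES.get(PurePath(file_name).suffix.lower())
```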
NOTE:
The following items are not yet supported:
• Notes within PowerPoint slides
• Spreadsheet names within Excel
• Some embedded OLE objects
• Certain text within Excel charts
Also, some documents converted to Microsoft Office version 2007 by the Office File converter may not be properly indexed.
Table 4 Office 2007 supported features
Feature                               Microsoft Word  Microsoft PowerPoint  Microsoft Excel
Contents                              Yes             Yes                   Yes
Table                                 Yes             Yes                   Yes
Textbox                               Yes             Yes                   Yes
Header/Footer                         Yes             Yes                   No
Comment                               No              No                    No
FootNote/EndNote                      Yes             N/A                   N/A
Signature                             No              No                    No
Chart                                 No              No                    No
Object (Microsoft Office, WordPad …)  No              No                    No
Embedded Objects                      No              No                    No
Notes                                 N/A             No                    N/A
WordArt                               No              No                    No
SmartArt                              No              No                    No
Sheet’s name                          N/A             N/A                   No
Table 5 Office 2007 supported properties
Type / Property                   Microsoft Word, PowerPoint, and Excel
Document Properties
  Author                          Yes
  Title                           Yes
  Subject                         Yes
  Keywords                        Yes
  Category                        Yes
  Status                          Yes
  Comments                        Yes
  Location                        No
Advanced Properties: General
  Type                            No
  Location                        No
  Size                            No
  MS-DOS name                     No
  Created                         No
  Modified                        No
  Accessed                        No
  Attributes                      No
Advanced Properties: Summary
  Title                           Yes
  Subject                         Yes
  Author                          Yes
  Manager                         No
  Company                         No
  Category                        Yes
  Keywords                        Yes
  Comments                        Yes
  Hyperlink base                  No
  Template                        No
Advanced Properties: Statistics
  Created                         No
  Modified                        No
  Accessed                        No
  Printed                         No
  Last saved by                   Yes
  Revision number                 Yes
  Statistics                      No
Advanced Properties: Contents
  Document Contents               No
Advanced Properties: Custom
  Checked by                      Yes
  Client                          Yes
  Date completed                  Yes
  Department                      Yes
  Destination                     Yes
  Disposition                     Yes
  Division                        Yes
  Document number                 Yes
  Editor                          Yes
  Forward to                      Yes
  Group                           Yes
  Language                        Yes
  Mailstop                        Yes
  Office                          Yes
  Owner                           Yes
  Project                         Yes
  Publisher                       Yes
  Purpose                         Yes
  Received from                   Yes
  Recorded by                     Yes
  Recorded date                   Yes
  Reference                       Yes
  Source                          Yes
  Status                          Yes
  Telephone number                Yes
  Typist                          Yes
2 IAP Web Interface
Use this web-based tool to search for documents archived in the system. You can also save and reuse query or search criteria and results.
Major topics include:
• Logging in and out, page 17
• Understanding the user interface, page 17
• Common tasks, page 18
• Troubleshooting, page 34
Logging in and out
Before logging in for the first time, see your system administrator for the URL to use and for the list of supported web browsers. (Microsoft Internet Explorer 6.0 and 7.0 are recommended.)
The IAP Web Interface can be accessed from a client desktop using regular http protocol as well as secure http protocol (https), for better protection during authentication. The IAP Web UI now supports secure http by default. If you use regular http, you will be redirected automatically to secure https protocol. Once logged in, you can use regular http if needed.
To access the IAP Web Interface:
1. In the Address field of your web browser, enter the URL (web address) that was provided to you by your system administrator. The web browser displays a login screen.
2. Enter your user name and password (provided by your system administrator). Both fields are case-sensitive. Click Login. The Simple Search page is displayed.
3. To log out, click LogOut in the toolbar.
NOTE:
For access to the audit log repository, submit a request to your administrator.
Understanding the user interface
User interface topics include:
• Using the toolbar, page 17
• Search basics, page 18
Using the toolbar
Each page of the IAP Web Interface has a toolbar at the top.
Figure 1 IAP Web Interface toolbar
The following table describes each button:
Table 6 Toolbar buttons, IAP Web Interface
Button: New Search
Description: Click to display the Simple Search page, where you can submit a query. See “Completing simple searches” on page 19. To display the Advanced Search page, point to this button and click Advanced search from the menu. See “Completing advanced searches” on page 20.
Button: Query Manager
Description: Click to display the Query Manager page, where you can display saved queries (see “Accessing saved criteria” on page 29) and results (see “Accessing saved results” on page 29).
Button: Preferences
Description: Click to display the Preferences page, where certain users can change their password for accessing the IAP Web Interface. See “Changing your password” on page 34. You can also use the Preferences menu to change the language of the user interface. See “Changing your language” on page 34. Supported languages include English, French, German, Spanish, Portuguese, Chinese (traditional), Chinese (Taiwanese), Korean, and Japanese.
Button: Help
Description: Click for online help about the IAP Web Interface.
Button: LogOut
Description: Click to log out of the IAP Web Interface.
Search basics
Using the Search for field
The Search for field is available when using the Simple Search and Advanced Search pages. Use this field to search for specific words in a document (email messages or files).
The query syntax allowed is described in “Query expression syntax and matching” on page 37. You can enter simple words, words with wildcards, or a more sophisticated query involving Boolean expressions or word sequences.
NOTE:
To narrow the search to the documents you want to find, make your search text as specific as possible. In general, the more information provided in the Search for field, the narrower the search. If the field is blank (empty), all documents within the specified date range of the query are returned.
Searching indexed contents
The Search for field is checked for a match against the indexed contents of documents.
• For email messages, the Search for field applies to words in the message body, but not in other message fields such as Subject, From, or To. In addition, the Search for field applies to message attachments that are indexed document files.
• For files, the Search for field applies only to indexed document files. Other types of files do not have indexed contents so their contents cannot be searched.
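The scope of the Search for field can be summarized with a small model. The following Python sketch is hypothetical throughout (class names, field names, and the function are invented for illustration) and makes two simplifying assumptions: words are separated by whitespace only, and several words in the query must all be present. The real query syntax is richer and is described in the query expression chapter.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Attachment:
    name: str
    text: str
    indexed: bool  # True only for the indexed document types


@dataclass
class ArchivedEmail:
    subject: str
    sender: str
    sent: date
    body: str
    attachments: list = field(default_factory=list)


def searchable_words(message):
    """Words the Search for field can match: body plus indexed attachments."""
    texts = [message.body]
    texts += [a.text for a in message.attachments if a.indexed]
    return {word.lower() for text in texts for word in text.split()}


def search_for(messages, text, start, end):
    """Return messages in the date range whose searchable words match.

    A blank search text returns every message in the date range.
    Subject and sender are deliberately not consulted.
    """
    terms = [word.lower() for word in text.split()]
    return [
        m for m in messages
        if start <= m.sent <= end
        and all(term in searchable_words(m) for term in terms)
    ]
```

In this model, a message whose only mention of a word is in its Subject is not returned, while a message that mentions the word in an indexed attachment is.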
Common tasks
Use the following table as a quick reference for performing common tasks.
Table 7 IAP Web Interface tasks
Task: Search for archived documents
Reference: “Completing simple searches” on page 19 and “Completing advanced searches” on page 20
Task: Display or print the query or search results
Reference: “Displaying query or search results” on page 24
Task: Save query or search criteria
Reference: “Saving query or search criteria” on page 26
Task: Save the results of a search
Reference: “Saving query or search results” on page 27
Task: Send the results to your email account
Reference: “Sending query or search results” on page 28
Task: Export the results
Reference: “Exporting query or search results” on page 29
Task: Display or delete saved criteria of a search
Reference: “Accessing saved criteria” on page 29
Task: Display the saved results of a search
Reference: “Accessing saved results” on page 29
Task: Save archived documents to a quarantine repository
Reference: “Copying saved results to a quarantine repository” on page 30
Task: Delete a quarantine repository
Reference: “Deleting quarantine repositories” on page 30
Task: Search the audit log repository
Reference: “Searching audit log repositories” on page 31
Task: Change your password
Reference: “Changing your password” on page 34
Task: Change your language
Reference: “Changing your language” on page 34
Completing simple searches
The Simple Search page searches for documents (email messages or files) containing words you enter in the Search for field. In the Search for field, you can enter simple words, words with wildcards (*), or a more sophisticated query involving Boolean expressions or word sequences. The Simple Search page is simpler than the Advanced Search page only because there are fewer fields you can search on.
To complete a simple search:
1. Click New Search in the toolbar. The Simple Search page is displayed.
Figure 2 Simple Search page
2. Search using all of the following fields on the Simple Search page:
• Content Type: Use email to search for email message files. Use document to search the AuditLog repository as described in Searching audit log repositories, or to search for files in a repository such as those migrated using the HP File Archiving software (formerly known as FMA). To search for documents stored via HP File Archiving software, select “document” instead of “email” in the dropdown box, choose the repository to search, and then use an empty search string.
NOTE:
Using “document” does not search for email sent by Exchange or Domino.
• Timeframe: The time period to search. This includes the last-modified date of a file or the date an email message was sent.
• Where to Search: The repository to search. A repository is a virtual collection of documents (email messages and files). Only the repositories to which you have access are displayed. At a minimum, you have access to your own repository.
• Search for: Searches for words in the document or message body, but not in message fields such as Subject, From, or To.
3. When you have finished defining your query, click Find Now to start the search. The Query Results page is displayed.
Completing advanced searches
For more specific searches, use the Advanced Search page. In addition to the fields of the Simple Search page, the Advanced Search page provides additional query fields to help you refine your search.
To complete an advanced search:
1. Point to New Search in the toolbar and click Advanced search from the menu. The Advanced Search page is displayed.
Figure 3 Advanced Search page (email content type)
NOTE: Figure 3 shows the Advanced Search page for the email content type. The document content type form varies slightly. See Table 8 for an explanation of the differences.
2. Search using the following fields on the Advanced Search page:
• Content Type: Use email to search for email message files. Use document to search the AuditLog repository as described in Searching audit log repositories, or to search for files in a repository such as those migrated using the HP File Archiving software (formerly known as FMA). To search for documents stored via HP File Archiving software, select “document” instead of “email” in the dropdown box, choose the repository to search, and then use an empty search string.
NOTE:
Using “document” does not search for email sent by Exchange or Domino.
• Search For: Searches for words in the document or message body, but not in message fields such as Subject, From, or To.
• By TimeFrame: The time period to search. This includes the last-modified date of a file or the date an email message was sent. As an alternative to the By TimeFrame field, you can define a time period to search by specifying the Start and end (To) dates. For example, to search for documents dated between March 8, 2003 and March 23, 2003, enter 03/08/2003 in the Start field and 03/23/2003 in the To field.
• Where to Search: The repository to search. A repository is a virtual collection of documents (email messages and files). Only the repositories to which you have access are displayed. At a minimum, you have access to your own repository.
3. To refine the search, use the additional query fields (as shown in Table 8). The available fields depend on the Content Type you select.
Table 8 Addition al advanced search query elds
Query Field Matches (in the Document)
Email Content Type Only
Subject
From
To / Cc
The Subject message eld.
The From message eld.
If the email repository belongs to a user with audit privileges (i.e. the user is a compliance user), this value will match message recipients in the Rcpt To, To, Bcc, Cc, and Apparently-To message elds. If the email repository belongs to a user without audit privileges, this value will only match message recipients in the To, Cc, and Apparently-To elds.
22
IAP Web Interface
Query Field Matches (in the Document)
The Outlook Exchange folder to which the email belongs. It supports wildcard search like other elds do. (Folder name will appear as a eld if it has been enabled by the system administrator in the domain.jcml le.)
Example queries: 2006 searches all leaf folders with the name “2006,” for example, \Inbox\2006
and \Inbox\test\2006, but not \Inbox.
\Inbox\2006 searches only folder 2006 in path \Inbox\2006. \Inbox\* searches Inbox and all its subfolders and nested folders. \2006 searches only the root folder \2006. \Inbox\2006\personal searches only the folder personal in the path
\Inbox\2006\personal. 2006\personal searches the folder 2006\personal, anywhere in the path. 2006\personal\* searches the folder 2006\personal and all its subfolders
andnestedfolders.
Folder Name
sent* searches all folders that start with “sent,” for example, SentItems and SentEmail. Search is not case sensitive.
bugs\20* searches all folders in “bugs” that start with “20,” for example
\bugs\2006 and \bugs\20data.
NOTE:
Core folders are Outlook hardcoded folders, like Inbox, Outbox, Sync
Issues\Conf
licts. If the core folder only has one level (like Inbox), thereisnolimitation. Ifthecorefolderhasmorethanonelevel(likeSync Issues\Conflicts) and you want to search the emails in the second level (Conf
Issues\Co for “Sync I
licts), type both levels or use wildcards. Searching for “Sync
nicts” will nd email in Sync Issues\Conflicts. Searching
ssues\Con*” will nd email in Sync Issues\Conflicts.But,
searching for Conicts will not nd email in Sync Issues\Conflicts.
NOTE:
To search any folder which has space inside, use quotes, for example: "my folder".
Query Field: Matches (in the Document)
Attachment Name: The file name of a message attachment. (The contents of indexed document attachments are searched using the Search for field.)
Message ID: The MessageID message field from Outlook, a message identification number (not all messages have MessageIDs). Use this field mainly for audit searches.
To display the MessageID field in Outlook:
1. Double-click to open the message in its own window.
2. Select View > Options. The Message Options dialog box is displayed.
3. If the message has a MessageID, the field is shown in the Internet headers field of the Message Options dialog box. Example: Message-ID: <LISTMANAGER-115380-9228-2003.03.04-17.34.24-user#hp.com@lists.FrameUsers.com>.
Document (File) Content Type Only
Document Name: File name, not including the file extension.
Document Path: File path. As for any other text query field, separators such as slash ( / ), backslash ( \ ), and colon ( : ) are ignored, and the query words are searched in any order. For example, query text c:\abc\xyz will match path abc:\xyz\c, as well as path c:\abc\xyz.
To ensure that path components are searched in order, enclose the field text in double-quotes ( ” ) to use a word-sequence query. See “Query expression syntax and matching” on page 37 for a complete definition of the Search Engine query syntax.
Extension: File extension. Example: doc for a Microsoft Word file.
Title: Title of the document. Only some files have associated titles. For example, to see the title of a Word document, select File > Properties in Word. The Title field is shown on the Summary panel of the displayed Properties dialog box.
Author: Author of the document. Only some files have associated authors. For example, to see the author of a Word document, select File > Properties in Word. The Author field is shown on the Summary panel of the displayed Properties dialog box.
4. When you have finished defining your query, click Find Now to start the search. The Query Results page is displayed.
Displaying query or search results
When you submit a query, the results are displayed on the Query Results page. From the Query Results page, you can save, send, or export searches and results by clicking More Options.
To display the results:
1. Display the Query Results page by completing one of the following tasks:
• Submit a simple (see “Completing simple searches” on page 19) or advanced search (see “Completing advanced searches” on page 20).
• Submit a search from previously saved criteria (see “Accessing saved criteria” on page 29).
• Access previously saved results (see “Accessing saved results” on page 29).
The Query Results page is displayed.
Figure 4 Query Results page (email content type)
NOTE: Figure 4 shows the Query Results page for the email content type. The Query Results page for the document content type varies slightly, but has the same functionality.
2. From the Query Results page, complete any of the following tasks:
• To display the contents of an email or document in the viewing pane, click the item from the list once. Clicking the item twice will instead open the preview pane as a new window.
• To display a different group of 50 results, click the different symbols in the query results navigation bar. See “Query results navigation bar” on page 25 for more information.
• To select all displayed documents or clear all selected documents, click Check All or Uncheck All, respectively.
• To print the Query Results page, use the browser print button. (The Print Current Table List printer icon has been removed.)
• To save the current criteria, see “Saving query or search criteria” on page 26.
• To save the current results, see “Saving query or search results” on page 27.
• To mail all or selected results to your mailbox, see “Sending query or search results” on page 28.
• To export all or selected results, see “Exporting query or search results” on page 29.
NOTE:
Users who have been given "compliance" authorization (generally compliance officers) are able to view all recipients (including Bcc addresses).
Query results navigation bar
When the results are retrieved, the most recent documents are displayed first.
NOTE:
Documents archived less than an hour ago may or may not appear in results depending on the system’s configuration.
Fifty results (maximum) are shown on the Query Results page. You can use the query results navigation bar to display different groups of 50 results.
Figure 5 Query results navigation bar
The query results navigation bar shows the following information:
Table 9 Query results navigation bar
Item: Description
bars: From left to right, the five bars represent subsequent pages of 50 results (maximum). Click a bar to display its page of results. The dark bar represents the currently displayed results. Note: To see just which documents a given bar represents, hold the mouse pointer over it momentarily to display a tooltip.
arrows: Click an arrow to display a different page of results:
• Move the results display forward by 50 (single arrow), 100 (double arrow), or 500 (triple arrow) documents.
• Move the results display back by 50 (single arrow), 100 (double arrow), or 500 (triple arrow).
For example, if the current page shows results 1-50, clicking the right double-arrow displays results 101-150.
status: Text indicating the current status of results retrieval:
• Query Still In Progress: Searching is in progress. Less than 50 results have been found so far.
• Click Above: Searching is in progress. At least 50 results have been found and displayed. You can click a navigation bar to display a set of 50 results while the search is in progress.
• Query Results Complete: Searching is complete if the query produces no more than 500 query results. If there are more than 500 results, the search for the first 500 results is complete. To search for additional results with the same query, click the right triple-arrow symbol to retrieve all possible results.
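The arrow behavior described in Table 9 can be sketched as simple arithmetic on the number of the first displayed result. This is only an illustration, not how the IAP Web Interface is implemented: the function name, the arrow labels, the assumption that the single arrow is the 50-result step, and the clamping at result 1 are all hypothetical. It also ignores the total number of results found.

```python
PAGE_SIZE = 50  # results shown per page, as stated in the guide

# Step sizes for the arrows. Mapping "single" to 50 is an assumption;
# the guide names the double-arrow (100) and triple-arrow (500) explicitly.
ARROW_STEP = {"single": 50, "double": 100, "triple": 500}


def page_after_click(first_result, arrow, direction="right"):
    """Return (first, last) result numbers shown after clicking an arrow."""
    step = ARROW_STEP[arrow]
    if direction == "left":
        step = -step
    # Assumption: the display never moves before the first result.
    new_first = max(1, first_result + step)
    return new_first, new_first + PAGE_SIZE - 1


print(page_after_click(1, "double"))           # forward by 100 from results 1-50
print(page_after_click(101, "single", "left"))  # back by 50 from results 101-150
```

With results numbered from 1, moving forward by 100 from the page showing 1-50 lands on the page that starts at result 101.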
Result retrieval batches
When you submit a query or search, the first 500 results are retrieved. To see more results for a query that returns more than 500 results, you must click the right triple-arrow symbol. When there are more than 500 results, the Query Results Complete status message means that the first batch of 500 results has been retrieved. It does not mean that all results have been retrieved.
Saving query or search criteria
After you submit a search, you can save the query or search criteria.
To save criteria:
1. Display the Query Results page by completing one of the following tasks:
• Submit a simple (see “Completing simple searches” on page 19) or advanced search (see “Completing advanced searches” on page 20).
• Search the audit log repository (see “Searching audit log repositories” on page 31).
• Access previously saved results (see “Accessing saved results” on page 29).
2. From the Query Results page, click More Options, and then click Save Current Search Criteria. Or right-click and select Save criteria. The Save Criteria page is displayed.
Figure 6 Save Criteria page
3. Enter the name of the criteria you are saving in the Save Query Criteria as field. To erase text entered in the Save Query Criteria as field, click Clear.
NOTE:
Special characters @ $ % ^ & * # ( ) [ ] / \ { + } ‘ ~ = | are not allowed.
4. Click Save Now.
5. To access the saved criteria, see “Accessing saved criteria” on page 29.
Saving query or search results
After you submit a search, you can save the results. The search results are saved for two weeks and then deleted.
NOTE:
Any search results you saved using the IAP Web Interface are deleted after two weeks. The web interface also contains a button for manually deleting saved query results.
Deleting search results does not delete the items on the IAP. The actual items remain on the IAP according to the retention period set by your administrator.
If you need to save the results longer than two weeks, consider copying the results to a quarantine repository. A quarantine repository allows you to save search results for an infinite retention period. For more information, see “Copying saved results to a quarantine repository” on page 30.
If the search locates a large number of documents, saving the results is useful. For example, if you are completing a large audit query, you can save the results while the query is still processing and then retrieve all the results at a later time.
When you save query results of a large search, the query is resubmitted as a background process that retrieves all results, no matter how many. Because the query runs in the background, you can continue to use the IAP Web Interface (for example, by submitting other queries).
To save results:
1. Display the Query Results page by completing one of the following tasks:
• Submit a simple (see “Completing simple searches” on page 19) or advanced search (see “Completing advanced searches” on page 20).
• Submit a search from previously saved criteria (see “Accessing saved criteria” on page 29).
2. From the Query Results page, click More Options, and then click Save Current Results. Or right-click and select Save results. The Save Results page is displayed.
Figure 7 Save Results page
3. Enter the name of the results you are saving in the Save Search Results as field. To erase text entered in the Save Search Results as field, click Clear.
NOTE:
Special characters @ $ % ^ & * # ( ) [ ] / \ { + } ‘ ~ = | are not allowed.
4. Click Save Now.
5. To access the saved results, see “Accessing saved results” on page 29.
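If you generate names for saved criteria or results outside the Web Interface (for example, from a naming convention), a quick check against the special-character rule in the NOTE above can avoid a rejected name. The following is a minimal sketch only; the function names are hypothetical, and the quote character is reproduced exactly as printed in this guide, so the interface may in fact reject a different quote or backtick character.

```python
# Characters the guide lists as not allowed in saved criteria/results names.
# The single quote is copied as printed; treat it as an assumption.
FORBIDDEN_CHARACTERS = set("@$%^&*#()[]/\\{+}‘~=|")


def invalid_characters(name):
    """Return the forbidden characters found in a proposed name, sorted."""
    return sorted(set(name) & FORBIDDEN_CHARACTERS)


def is_valid_name(name):
    """True if the proposed name contains none of the listed characters."""
    return not invalid_characters(name)


print(is_valid_name("Audit Q3 2008"))   # no listed characters
print(invalid_characters("audit#3/a"))  # lists the offending characters
```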
NOTE:
The results are not retrieved in chronological order, but are sorted chronologically after they have all been retrieved (query processing is Finished). If you access the saved results before the query is Finished, the results are not sorted.
Sending query or search results
You can send search results to your email account. This includes results that you have placed in a quarantine repository.
To send results:
1. Display the Query Results page by completing one of the following tasks:
• Submit a simple (see “Completing simple searches” on page 19) or advanced search (see “Completing advanced searches” on page 20).
• Submit a search from previously saved criteria (see “Accessing saved criteria” on page 29).
• Access previously saved results (see “Accessing saved results” on page 29).
2. From the Query Results page, select the check box next to each item you want to send. Skip this step if you are sending all items.
3. Click More Options to open the menu.
4. To send all results, click Send All Items. To send the selected items, click Send Checked Items. A confirmation message is displayed when the items are sent.
Exporting query or search results
For information on exporting query or search results, see your HP EAs for Microsoft Exchange User’s Guide or HP EAs for IBM Domino User’s Guide as appropriate. Those user guides are included on the documentation CD in those products.
Accessing saved criteria
If you save the criteria of a query or search, you can access them from the Query Manager page. Each item listed shows the name of the saved criteria and the date you saved them.
To access saved criteria:
1. Click Query Manager in the toolbar. The Query Manager page displays.
2. Click Saved Criteria to display all previously saved searches.
Figure 8 Saved Criteria view, Query Manager page
3. Complete any of the following tasks:
• To resubmit the query, click Reload. The Advanced Search page with the saved criteria already entered is displayed. You can then resubmit the query by clicking Find Now.
• To delete the saved criteria, click Delete. You are not prompted to confirm the deletion.
• To switch to the Saved Results view, click Saved Results. Or right-click and select Saved Query Results.
Accessing saved results
You can access saved results from the Query Manager page. This includes results that you have placed in a quarantine repository. Each item listed shows the name of the results, its status (if unfinished), and start and end dates of the search.
To access saved results:
1. Click Query Manager in the toolbar. The default Query Manager page displays all saved results. You can also access this view by clicking Saved Results on the Query Manager page.
Figure 9 Saved Results view, Query Manager page
2. Complete any of the following tasks:
• To display the results, click Reload. The Query Results page is displayed.
• To copy the saved results to the quarantine repository, click Start. A completed message appears in the row when the contents are in the quarantine repository.
• To delete the saved results, click Delete.
• To switch to the Saved Criteria view, click Saved Criteria. Or right-click and select Saved Query Criteria.
NOTE:
Results are automatically deleted after two weeks. If you need to save the results longer than two weeks, consider copying the results to a quarantine repository. See “Copying saved results to a quarantine repository” on page 30.
Copying saved results to a quarantine repository
You can copy items listed in the saved query results to another destination repository called a quarantine repository. If the results are quarantined, the search results are not deleted from the database and have an infinite retention period. This feature is useful for compliance purposes or if you need to save your results for more than two weeks.
NOTE:
Any search results you saved using the IAP Web Interface are deleted after two weeks. Deleting search results does not delete the items on the IAP. The actual items remain on the IAP according to the retention period set by your administrator.
You have access to your quarantine repositories automatically and work with quarantine repositories as you would any repository, such as searching within the repository. For instance, you can select a quarantine repository from the Where to Search list.
To copy the saved results to the quarantine repository:
1. Click Query Manager in the toolbar to access the saved results.
2. Click Start in the Quarantine column to copy those saved results to the quarantine repository. A completed message appears in the row when the contents are in the quarantine repository. You cannot see the quarantine repository until you log in again.
Deleting quarantine repositories
If needed, you can delete the contents from a quarantine repository. If a quarantine repository is marked for deletion, the repository contents are removed during the next retention cycle, and either the references to the data are removed or the data itself is removed.
NOTE:
Deleting a quarantine repository does not delete the items on the IAP. The actual items remain on the IAP according to the retention period set by your administrator.
To delete a quarantine repository:
1. Click Query Manager in the toolbar to access the saved results.
2. Click Delete in the Quarantine column to remove access to that quarantine repository and delete its contents during the next retention cycle.
Searching audit log repositories
Audit log repositories are not available to all users. Users with access to an audit log repository can choose the audit log repository when they select a repository to search. If the audit log repository does not appear in the Where to Search list when you specify search criteria, you do not have access. For access to the audit log repository, submit a request to your administrator.
The audit log repository provides a method for creating a surveillance system log. This is useful for demonstrating your company is adhering to surveillance processes. You can complete either a simple or advanced search when searching the repository. If you need to search for multiple criteria, complete an advanced search.
NOTE:
An email search from an audit log repository will produce no results; you need to do a document search.
To search the audit log repository:
1. Select the document content type from the simple or advanced search form:
• For a simple search, click New Search in the toolbar. From the Content Type list, select document. The following form is displayed.
Figure 10 Simple Search page (document content type)
• For an advanced search, point to New Search in the toolbar and click Advanced search from the menu. From the Content Type list, select document. The following form is displayed.
TIP:
To search for multiple items, use the advanced search form.
Figure 11 Advanced Search page (document content type)
2. From the Timeframe list, select the time period to search. This field searches the audit logs stored to the IAP during a specified time period.
Advanced searches only: As an alternative to the By Timeframe field, you can define a time period to search by specifying the Start and end (To) dates. For example, to search for documents dated between March 8, 2003 and March 23, 2003, enter 03/08/2003 in the Start field and 03/23/2003 in the To field.
3. From the Where to Search list, select the audit log repository.
4. In the Search for field, enter one of the following criteria to search for a specific user or action:
• User ID: Enter the login name of the user, such as jdoe.
• First Name: Enter the first name from the LDAP directory for the user, such as John.
• Last Name: Enter the last name from the LDAP directory for the user, such as Doe.
• Logged actions: Enter one or more of the actions listed in the following table. Or, leave this field blank to search for all logged actions.
Table 10 Logged actions and descriptions
Logged Action: Description
Search: Simple or advanced searches performed by the user.
Query Result: Information about the query results returned or displayed.
Navigation: Navigation through the query results.
View Message: Emails displayed from the query results.
View Document: Documents displayed from the query results.
Download: Email attachments downloaded by the user.
Mail: Emails or documents sent to the user.
Export: Emails or documents exported by the user.
Save Query: Queries saved by the user.
Save Query Result: Query results saved by the user.
Start Quarantine: Saved query results quarantined by the user.
Delete Quarantine: Quarantined query results deleted by the user.
TIP:
You can use Boolean expressions AND, OR, and NOT when entering search criteria.
5. Advanced searches only: To refine the search, use the additional query fields as shown in the following table.
Table 11 Additional advanced search query fields (for audit log repository searches)
Query Field: Matches
Document Name: Name of the component generating the audit log. For example: IAP Web Interface. This is the only option available at this time.
Document Path: Host or IP address of the host where the audit log was generated. For example: hp-s0-1-93.hp.com.
Extension: File extension. Not used because the audit log is always an XML file.
Title: Not used because it is not applicable to audit log files.
Author: Used to search for a specific user. Enter one of the following criteria:
• User ID: Enter the login name of the user, such as jdoe.
• First Name: Enter the first name from the LDAP directory for the user, such as John.
• Last Name: Enter the last name from the LDAP directory for the user, such as Doe.
6. Click Find Now to start the search. The Query Results page displays the following information:
• User: User for which the audit log was created.
• Session Start: Start time of the user session.
• Session End: End time of the user session.
• Size: Size of the session audit log file.
• Server: Server (HTTP portal) on which the audit log session was captured.
• Date: Date the audit log file was archived.
7. To display the contents of an audit log file in the viewing pane, click the item from the list. If needed, click New Window to display the audit log content in a new window.
Figure 12 Audit log details
8. From the Query Results page, you can also save the query or search criteria you entered. See “Saving query or search criteria” on page 26.
Changing your password
Depending on how your system is configured, either your password is the same as your Windows or Lotus Notes password, or you have the option to manage your password. The following information applies to users who can change their password through the Web Interface.
For security reasons, change your password periodically. Change your password immediately after you log in for the first time.
To change your password for accessing the IAP Web Interface:
1. Click Preferences in the toolbar.
2. Enter your current password (Old Password), enter the New Password twice, and then click Change.
Changing your language
To change the language of the user interface:
1. Click Preferences in the toolbar.
2. Enter the language you prefer. Supported languages are English, French, German, Spanish, Portuguese, Chinese (traditional), Chinese (Taiwanese), Korean, and Japanese.
Troubleshooting
Troubleshooting topics include:
Unable to display saved results, page 35
Problems exporting results, page 35
Unable to display saved results
Search results are saved for two weeks and then deleted. If you save the results of a query, but the retention settings delete the files before the end of the two weeks, the results still appear in the application, but cannot be reloaded from the saved search results. Clicking the saved results displays an error because the application cannot find the saved results on IAP.
Problems exporting results
For information on problems exporting query or search results, see your HP EAs for Microsoft Exchange User’s Guide or HP EAs for IBM Domino User’s Guide as appropriate. Those user guides are included on the documentation CD in those products.
3 Query expression syntax and matching
Query expression syntax and matching describes the IAP Web Interface syntax to use to search and retrieve archived documents (files or email messages), and explains how queries are matched against documents.
Major topics include:
• Query expressions, page 37
• Word characters, page 37
• Letters and digits in different character sets, page 38
• Matching words, page 39
• Matching similar words, page 40
• Matching word sequences, page 40
• Boolean query expressions, page 43
• Nested Boolean query expressions, page 44
• Query expression examples, page 44
Query expressions
Query expressions can be as simple or as complex as needed. The essential idea behind document retrieval is that query words are compared with document words to find a match. You can also:
• Look for document words that are textually similar, but not necessarily identical, to query words. See “Matching similar words” on page 40.
• Look for word sequences in a document: words that are near each other, and in a particular order. See “Matching word sequences” on page 40.
• Combine query words using logical (Boolean) operators (AND, OR, NOT). See “Boolean query expressions” on page 43.
Together, these query constructs provide considerable power to find what you need, provided you learn to use them well.
The way query expressions are interpreted is similar to the way documents are indexed when archived. Text is parsed (broken down) into words. Remaining characters are considered separators and ignored. Query expressions are fundamentally composed of words, no matter how complex the expression.
For indexing and searching, a word need not belong to a natural language, such as English. For example, wt6_ht3 is a valid document word or query word. Query words can contain wildcards, such as in f??t.
Word characters
When the system examines a query expression to determine its words, some characters are not included in query words, but are treated as word separators. When a document is archived, indexing determines which document words are available for searching in the same way.
Learning the rules of creating query words means also learning the rules of document indexing and, therefore, what words you can search for.
Word characters and separators
Word characters include all uppercase and lowercase letters, digits, and the following additional characters:
• _ (underscore)
• # (number/pound/hash sign)
• & (ampersand)
All other characters are separators (except in queries, wildcards ? and *, and special query characters ~, ", -, and !).
However, && by itself is not a word. It is a Boolean operator. When combined with at least one more word character, && can be part of a word. For example, a&&b is a word.
Query analysis and document indexing are not case-sensitive. Uppercase and lowercase letters are treated the same.
Regular expression definition of English word characters
The following regular expression provides, in succinct form, a complete specification of English word characters (except for treatment of && as a non-word):
[A-Za-z0-9_#&]+
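To see which words a piece of English text would produce under this rule, the regular expression can be tried directly. The following is a minimal sketch, not the IAP indexer itself: the function name is hypothetical, it covers only the English (ASCII) case, and it handles only the one exception stated above (&& standing alone is not a word).

```python
import re

# The English word-character rule quoted in the guide.
WORD_RE = re.compile(r"[A-Za-z0-9_#&]+")


def split_into_words(text):
    """Split text into lowercase words; everything else is a separator."""
    words = [word.lower() for word in WORD_RE.findall(text)]
    # "&&" by itself is a Boolean operator, not a word.
    return [word for word in words if word != "&&"]


print(split_into_words(r"c:\abc\xyz"))   # path separators are ignored
print(split_into_words("wt6_ht3"))       # underscore is a word character
print(split_into_words("Cat && a&&b"))   # lone && dropped, a&&b kept
```

Lowercasing reflects the statement that query analysis and indexing are not case-sensitive.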
Letters and digits in different character sets
Topics include:
• Letters and digits defined, page 38
• Letters and digits in files, page 38
Letters and digits defined
All letters and digits are word characters. What IAP considers a letter or digit depends on the character set encoding used. For US ASCII encoding, letters are uppercase and lowercase English letters (A-Z, a-z). For ISO 8859-1 (Latin-1) encoding, used for Western European languages, accented letters are included. Most ideographic characters, such as those used in Asian languages, are also considered letters.
Whatever the language and encoding used for a particular document (file or email message), IAP maps encoded characters to the Unicode 2.0 standard. The Unicode 2.0 standard is then used to determine if a given character is a letter or a digit (or neither):
• A letter is any Unicode character in one of the following Unicode categories: Ll (lowercase letter), Lu (uppercase letter), Lt (titlecase letter), Lm (modifier letter), or Lo (other letter).
• A digit is any Unicode character whose Unicode name contains the word DIGIT, provided it is not in the range \u2000 (en quad = en space) through \u2FFF (ideographic description - future).
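The two rules above can be approximated with Python's standard unicodedata module. This is a sketch under a clear assumption: Python ships a much newer Unicode database than the Unicode 2.0 standard that IAP uses, so results for characters added or reclassified after Unicode 2.0 may differ. The function names are hypothetical.

```python
import unicodedata

LETTER_CATEGORIES = {"Ll", "Lu", "Lt", "Lm", "Lo"}


def is_letter(ch):
    """True if the character is in one of the five letter categories."""
    return unicodedata.category(ch) in LETTER_CATEGORIES


def is_digit(ch):
    """True if the Unicode name contains the word DIGIT, excluding
    characters in the range U+2000 through U+2FFF."""
    if 0x2000 <= ord(ch) <= 0x2FFF:
        return False
    return "DIGIT" in unicodedata.name(ch, "").split()


print(is_letter("é"), is_letter("中"), is_letter("_"))
print(is_digit("7"), is_digit("\u0663"), is_digit("\u2460"))
```

Here "\u0663" is ARABIC-INDIC DIGIT THREE and "\u2460" is CIRCLED DIGIT ONE; the latter falls inside the excluded range.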
Letters and digits in files
Although all letters and digits are word characters, their treatment in files (including email message attachments) depends on the character encoding used. You can search for any words in email message bodies and headers, regardless of the encoding.
You can search for words in files (including email body, header, attachments, and indexed documents) provided the character encoding is one of the following:
Table 12 Supported character sets
Supported character set: Description
ISO-8859-1: Western European, extended ASCII
WINDOWS-1252: (Code pages supported by Windows) Latin 1
US-ASCII: 7-bit American Standard Code for Information Interchange
UTF-8: Universal (all languages)
ISO-8859-2: Eastern European
KOI8-R: Cyrillic (Russian and Bulgarian)
ISO-8859-5: Cyrillic (Bulgarian, Belarusian, Russian)
WINDOWS-1251: Cyrillic
WINDOWS-1254: (Code pages supported by Windows) Turkish
ISO-8859-9: Turkish
GB18030: Chinese (Mainland)
BIG5: Chinese (Taiwan)
GB2312: Chinese (Mainland)
EUC-KR: Korean
KS_C-5601-1987: Korean
ISO-2022-JP: Japanese
EUC-JP: Japanese
SHIFT-JIS: Japanese
Matching words
Matching words is not case-sensitive: cat, Cat, cAt, and CAT all match. Corresponding uppercase and lowercase letters, such as A and a, are treated the same in all respects.
There are two kinds of query words: words that contain occurrences of one or both of the wildcard characters * and ?, and literal words that do not contain wildcards.
• Literal words that do not contain wildcards
• Words containing occurrences of one or both wildcard characters * and ?
A literal word in a query expression matches the same word, character for character (case ignored), in an archived document. A word with wildcard characters (* or ?) matches a document word in the same way, character by character, except for the following:
• A ? matches any single character in a document word. For example, b??t matches beat, beet, boat, blot, best, bust, bout, and so on.
• An * matches any sequence of characters in a document word, including a sequence of no characters. For example, f*t matches the document words foot, feet, fit, fault, and ft; and f* matches any document word beginning with f.
You can use any number of wildcard characters (* or ?) in a query word, but you cannot use a wildcard at the beginning of a query word. An error message results. For example, *ion is not a valid query.
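The wildcard rules above map naturally onto a regular expression, which can be handy for checking what a query word would match before you run a search. This is an illustrative sketch only; the function names are hypothetical and the IAP search engine does not necessarily work this way.

```python
import re


def wildcard_to_regex(query_word):
    """Translate a query word with ? and * into a compiled regex."""
    if query_word[:1] in ("*", "?"):
        # The guide states a query word cannot begin with a wildcard.
        raise ValueError("a query word cannot begin with a wildcard")
    parts = []
    for ch in query_word:
        if ch == "*":
            parts.append(".*")   # any sequence, including none
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def word_matches(query_word, document_word):
    """True if the whole document word matches the query word."""
    return wildcard_to_regex(query_word).fullmatch(document_word) is not None


print(word_matches("b??t", "boat"), word_matches("b??t", "bet"))
print(word_matches("f*t", "ft"), word_matches("f*", "Fault"))
```

The match is anchored to the whole word (fullmatch) and ignores case, in line with the rules for literal words.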
Matching similar words
Topics include:
• Fuzzy words, page 40
• Measuring word similarity, page 40
Fuzzy words
You can search for document words that are textually similar to a given literal query word (that is, one containing no wildcards). To do this, append a tilde (~) character to the word, creating a fuzzy word. For example, the fuzzy word define~ matches the similar words defined and definite, but does not match defining, definition, indefinite, or pine. It also matches define itself.
Measuring word similarity
The edit distance (also called Levenshtein distance) between two words is the number of single-character operations (deletion, replacement, or insertion) required to change one word into the other word.
For example, the edit distance between define and pine is three: two deletions (d and e) and one replacement (f by p). The distance between define and definite is also three (e replaced by i; te inserted).
The search engine considers define more similar to definite than to pine, even though the edit distances are the same (three), because the edit distance (number of character changes) is compared to the word length (of the shorter of the query and document words). Two words are closer, for querying purposes, if it takes less to change one word into the other word relative to their lengths.
The similarity ratio used by the search engine is d/min(query, doc), where d is the edit distance, min is a function that returns the lesser of its arguments, and query and doc are the lengths of the query word and document word, respectively. A fuzzy word matches a document word if this ratio is no more than 0.5.
Examples:
Words Compared: Similarity Ratio: Match?
define, definite: 3/min(6,8)=3/6=0.5: yes
define, pine: 3/min(6,4)=3/4=0.75: no (0.75>0.5)
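The similarity ratio can be reproduced with a textbook Levenshtein implementation. The sketch below is an illustration under stated assumptions, not the search engine's code: the function names are hypothetical, and a minimal edit distance is used. For define and definite, the minimal distance is 2 (insert i and t before the final e), whereas this guide counts a three-step path, so the sketch reports a ratio of 2/6 where the guide shows 3/6. The match outcome is the same in both cases. The engine's actual results may also differ from this formula for other words.

```python
def edit_distance(a, b):
    """Minimal number of single-character deletions, insertions, or
    replacements needed to turn word a into word b (Levenshtein)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # replace or keep
            ))
        previous = current
    return previous[-1]


def similarity_ratio(query_word, document_word):
    """d / min(len(query), len(doc)), as described in the guide."""
    q, d = query_word.lower(), document_word.lower()
    return edit_distance(q, d) / min(len(q), len(d))


def fuzzy_matches(query_word, document_word, threshold=0.5):
    """True if the similarity ratio is no more than the threshold."""
    return similarity_ratio(query_word, document_word) <= threshold


print(similarity_ratio("define", "pine"), fuzzy_matches("define", "pine"))
print(fuzzy_matches("define", "definite"), fuzzy_matches("define", "defined"))
```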
Matching word sequences
You can use word sequences to nd documents with words that occur in a specied order and are separated by a specied maximum distance.
Topics include:
Simple word sequences, page 40
• Proximity word sequences, page 41
• Matching word sequences in attachments, page 41
Simple word sequences
To search for an ordered sequence of words, use a simple word sequence, which is a list of literal query words (no wildcards) separated by spaces (or other separators) and enclosed in quotes ("). A document matches a simple word sequence if all words occur in the document in the same order, with no intervening words.
For example, the sequence "like a rolling stone" does not match a document with the text like a large rolling stone because of the intervening word large.
Proximity word sequences
You can use simple word sequences to search for words separated by separators but not by other words. To search for document words that are in an ordered sequence, but might be separated by other words, use a proximity word sequence.
To write a proximity word sequence, use the same syntax as a simple word sequence, but append a tilde (~) character to the second quote, and follow that with a numeric proximity value. The proximity value represents the maximum number of other document words that can occur between any two successive words of the sequence. A document matches a proximity word sequence if all words occur in the document in the same order, with at most N intervening words, where N is the proximity value.
For example, the sequence "bird garden stone"~3 matches any document that has these three words in this order, with bird and garden separated by no more than three words, and garden and stone separated by no more than three words. This sequence matches a document with the text a bird in the rose garden is near a stone because there are at most three words between successive sequence words. This sequence also matches a bird garden with a stone for the same reason.
Simple word sequences are a special case of proximity word sequences: "..." is the same as "..."~0. Any documents found by "..."~N are also found by "..."~M, when M>N.
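The matching rule for simple and proximity word sequences can be sketched as a small backtracking search over the document's words. This is an illustration only, assuming English text split with the word-character rule from “Word characters”; the function names are hypothetical and do not describe the search engine's implementation.

```python
import re

WORD_RE = re.compile(r"[A-Za-z0-9_#&]+")


def words_of(text):
    """Split text into lowercase words, ignoring separators."""
    return [word.lower() for word in WORD_RE.findall(text)]


def sequence_matches(sequence, document, proximity=0):
    """True if the sequence words occur in order in the document with at
    most `proximity` other words between successive sequence words."""
    query = words_of(sequence)
    doc = words_of(document)
    if not query:
        return False

    def search(query_index, start, end):
        if query_index == len(query):
            return True
        for pos in range(start, min(end, len(doc))):
            if doc[pos] == query[query_index]:
                # The next word may follow after 0..proximity other words.
                if search(query_index + 1, pos + 1, pos + 2 + proximity):
                    return True
        return False

    # The first sequence word may appear anywhere in the document.
    return search(0, 0, len(doc))


text = "a bird in the rose garden is near a stone"
print(sequence_matches("bird garden stone", text, proximity=3))
print(sequence_matches("bird garden stone", text, proximity=2))
print(sequence_matches("like a rolling stone", "like a large rolling stone"))
```

Calling the function with proximity=0 gives the simple word sequence behavior, which mirrors the statement that "..." is the same as "..."~0.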
Matching word sequences in attachments
This section discusses word matching in attachments. Like other documents, IAP renders attachment documents (like spreadsheets and PDF files) into text words. When IAP renders a document, it follows the document application’s internal representation of the file.
Certain file types, for example spreadsheets, look very different internally than they do externally. This means that the word sequence in the external application representation, which the end user sees, may differ from the internal application representation. IAP query matching uses the internal application representation. Below are a couple of examples to illustrate.
Example 1. Separators are ignored
IAP renders text into words. Remaining characters such as periods, commas, spaces, and newlines are considered separators and are ignored. Phrase queries ignore all formatting elements and non-word characters. The following original plain text of:
“This was news to Mr. Smith.
Johnson, however, knew better.”
matches the phrase query of:
“Smith Johnson”
This is because internally, the two plain text sentences are represented as one long string of continuous words: “This was news to Mr Smith Johnson however knew better”.
Example 2. Sequence is not intuitive
Internally in an attachment’s original application, a large multi-page document or a single page spreadsheet equates to a long text sequence. Text may not appear in the same sequence internally as it appears externally. Also, multiple instances of the same text in certain file types are represented as a single instance.
Spreadsheets
Look at the external representation of the following example spreadsheet.
Table 13 Excel spreadsheet
United States Presidents named John
John Adams                 1797-1801
John Quincy Adams          1825-1829
John Fitzgerald Kennedy    1961-1963
John Tyler                 1841-1845
The specific order in which the text in the cells is stored internally depends on:
The version of the product, for example Excel or Quattro Pro, used to generate the spreadsheet
The insertion order for the spreadsheet text
For the spreadsheet above, assuming the cell text for names was entered in displayed order from top left to bottom right (John Adams was entered first) and the title and dates were entered after all the names were entered, most versions of spreadsheets store the text internally as follows:
John Adams John Quincy John Fitzgerald Kennedy Tyler United States Presidents named John 1797–1801 1825–1829 1961–1963 1841–1845
Note the following features of the internal representation:
Text sharing: Where certain text appears in more than one cell in the spreadsheet, the text may
appear only once in the internal representation. In this example, this is the case with the text “John” and “Adams”. (Note that not all versions of Excel consistently share text in exactly this way.)
This text sharing only occurs at the level of the entire text of a cell, and never occurs within cells. Thus,
“John Quincy” and “John Fitzgerald” remain whole and independent.
Even accounting for text sharing, the specific ordering of various cell text in the internal representation
does not necessarily follow presentation order, and instead often follows insertion order.
Because of these factors, text sequence matches in an Excel spreadsheet (for example) are only consistent with the spreadsheet as viewed in Excel if the matched text appears wholly within a cell. Sequences that span the text of more than one cell can match in inconsistent ways, depending on the precise version and editing history of that spreadsheet.
For the spreadsheet and order of insertion shown above, the following queries would match:
"John Adams"
"Adams John"
"Quincy John"
"John Fitzgerald Kennedy"
"Presidents named John"
And, the following queries would not match:
"John Tyler"
"Quincy Adams"
"John Quincy Adams"
"John Adams 1797–1801"
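The spreadsheet behavior described in this section (text sharing at the level of whole cells, stored in insertion order) can be modeled roughly as follows. This is a simplified sketch, not the real file format of Excel or any other spreadsheet product. The function names are hypothetical, and the sketch assumes that words are runs of letters, digits, or underscores.

```python
import re

def internal_text(cells_in_insertion_order):
    """Join cell texts in insertion order, storing each distinct text once."""
    # dict.fromkeys keeps only the first occurrence of each text, in order.
    return " ".join(dict.fromkeys(cells_in_insertion_order))

def phrase_matches(phrase, text):
    """True if the words of `phrase` occur consecutively, in order, in `text`."""
    needle = re.findall(r"\w+", phrase)
    haystack = re.findall(r"\w+", text)
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )

# Names entered first (top left to bottom right), then the title, then the dates.
cells = [
    "John", "Adams", "John Quincy", "Adams",
    "John Fitzgerald", "Kennedy", "John", "Tyler",
    "United States Presidents named John",
    "1797-1801", "1825-1829", "1961-1963", "1841-1845",
]
stored = internal_text(cells)
print(stored)
for query in ["John Adams", "Adams John", "Quincy John", "John Tyler", "Quincy Adams"]:
    print(query, "->", phrase_matches(query, stored))
```

Under this model, the second "Adams" and the second "John" are dropped because those cell texts were already stored, which is why "Adams John" matches while "John Tyler" and "Quincy Adams" do not.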
PDF documents
PDF documents are another case where the internal text representation can vary widely from the visible presentation in PDF readers. Some issues that can arise:
Text sequences can appear out of order on the same page depending on how the page was
composed.
Text can appear doubled or can have spacing inserted into or removed from the internal
representation to assist some specific visual presentation.
In general, PDF documents generated via print drivers are far more susceptible to these issues than PDF documents generated directly using Acrobat and other such composing tools. However, because of the nature of PDF itself, even they are not immune.
Boolean query expressions
You can combine words, fuzzy words, and word sequences using the Boolean (logical) operators AND, OR, and NOT (these must be uppercase). The following table describes the Boolean operators, where exp, exp1, and exp2 represent a word, fuzzy word, word sequence, or other Boolean query expression.
Table 14 Boolean query expressions
Syntax                              Matches
NOT exp                             all documents that do not match exp
  alternative syntax: - exp
  alternative syntax: ! exp
exp1 OR exp2                        all documents that match either exp1 or exp2
exp1 AND exp2                       all documents that match both exp1 and exp2
  alternative syntax: exp1 && exp2
  alternative syntax: exp1 exp2
NOTE:
The second alternative syntax for AND indicates that AND is the default connective in query expressions. You do not need to supply AND explicitly. It is assumed if neither AND nor OR is used explicitly. For example, the query peace quiet is equivalent to the query peace AND quiet.
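One way to picture the operators in Table 14 is as set operations over the archived documents: AND is an intersection, OR is a union, and NOT is a complement. The following sketch uses a small made-up corpus. The document IDs, the word sets, and the function names are all hypothetical, and nothing here reflects how IAP evaluates queries internally.

```python
# Hypothetical corpus: document ID -> set of words in that document.
docs = {
    1: {"peace", "quiet"},
    2: {"peace"},
    3: {"quiet", "silence"},
    4: {"noise"},
}
all_docs = set(docs)

def matching(word):
    """Return the IDs of all documents that contain the word."""
    return {doc_id for doc_id, doc_words in docs.items() if word in doc_words}

def implicit_and(query):
    """Treat a plain list of words as if AND joined them."""
    result = set(all_docs)
    for word in query.split():
        result &= matching(word)
    return result

peace, quiet = matching("peace"), matching("quiet")
print(peace & quiet)                # peace AND quiet
print(peace | quiet)                # peace OR quiet
print(peace & (all_docs - quiet))   # peace AND NOT quiet
print(implicit_and("peace quiet") == (peace & quiet))
```

The last line reflects the note above: a query that lists words without an operator behaves like the same words joined by AND.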
A NOT expression must be combined, using AND or OR, with another expression other than NOT. A query cannot consist solely of negative criteria.
NOT quiet                       illegal
(NOT quiet) AND (NOT blue)      illegal
NOT quiet AND blue              legal
NOT quiet OR nois*              legal
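The rule against purely negative queries can be expressed as a small recursive check. The sketch below represents a query as nested tuples, such as ("AND", ("NOT", "quiet"), "blue"), which is a hypothetical structure chosen for illustration. It treats an expression as purely negative when it is a NOT, or when every operand of an AND or OR is purely negative. This guide does not state how IAP handles deeper nesting, so that part is an assumption.

```python
def purely_negative(expr):
    """True if the expression contains no positive criterion at its top level."""
    if isinstance(expr, str):
        return False              # a plain word is a positive criterion
    operator, *operands = expr
    if operator == "NOT":
        return True               # assumption: any NOT counts as negative
    # AND / OR: negative only if every operand is negative
    return all(purely_negative(operand) for operand in operands)

def is_legal(expr):
    return not purely_negative(expr)

print(is_legal(("NOT", "quiet")))
print(is_legal(("AND", ("NOT", "quiet"), ("NOT", "blue"))))
print(is_legal(("AND", ("NOT", "quiet"), "blue")))
print(is_legal(("OR", ("NOT", "quiet"), "nois*")))
```

The four calls correspond to the four example queries listed above.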
You must provide the proper number of arguments for a Boolean operator or an error message results: one argument for NOT (- or !), two arguments for AND (&&) and OR. For example, the following queries result in an error message.
alpha NOT: Missing argument for NOT
AND alpha: Missing argument for AND
Boolean operators must be surrounded by one or more separators, typically white space. For example, the query peas&&carrots is not equivalent to the query peas && carrots; peas&&carrots is a single word (& is a word character).
Negation operators (- and !) are exceptions to this rule. They must be preceded by a separator, but they need not be followed by a separator. For example, carrot-a6 is a single query word, but carrot -a6, like carrot (- a6), is equivalent to the Boolean expression carrot AND (NOT a6).
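The separator rules for operators can be illustrated with a small tokenizer. This is a sketch under simplifying assumptions, not IAP's parser: only white space and parentheses are treated as separators, the && form is rewritten to AND, and a leading - or ! is rewritten to NOT. The function name and the token format are hypothetical.

```python
import re

def tokenize(query):
    """Split a query into words, operators, and parentheses."""
    tokens = []
    # Assumption: only white space and parentheses separate tokens here.
    for chunk in re.findall(r"[()]|[^\s()]+", query):
        if chunk in ("(", ")"):
            tokens.append(chunk)
            continue
        # A leading - or ! is a negation operator; it needs a separator
        # before it but not after it.
        while chunk[:1] in ("-", "!"):
            tokens.append("NOT")
            chunk = chunk[1:]
        if chunk == "&&":
            tokens.append("AND")
        elif chunk:
            tokens.append(chunk)
    return tokens

for query in ["peas&&carrots", "peas && carrots", "carrot-a6", "carrot -a6", "carrot (- a6)"]:
    print(repr(query), "->", tokenize(query))
```

In this sketch, peas&&carrots and carrot-a6 stay whole because the operator characters are not preceded by a separator, while carrot -a6 and carrot (- a6) both yield a NOT applied to a6.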
Nested Boolean query expressions
You can nest Boolean query expressions by using Boolean expressions as arguments of Boolean expressions. For example, the following query searches for documents containing bird, but not garden or stone:
bird AND NOT (garden OR stone)
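Nesting can be pictured as a tree in which each operator applies to the results of its arguments. The following sketch evaluates such a tree against the set of words in a single document. The nested-tuple representation and the function name are hypothetical and are used only to illustrate the idea.

```python
def matches(expr, doc_words):
    """Evaluate a nested Boolean expression against a document's word set."""
    if isinstance(expr, str):
        return expr in doc_words
    operator, *operands = expr
    if operator == "NOT":
        return not matches(operands[0], doc_words)
    if operator == "AND":
        return all(matches(operand, doc_words) for operand in operands)
    if operator == "OR":
        return any(matches(operand, doc_words) for operand in operands)
    raise ValueError(f"unknown operator: {operator}")

# bird AND NOT (garden OR stone)
query = ("AND", "bird", ("NOT", ("OR", "garden", "stone")))

print(matches(query, {"bird", "sky"}))
print(matches(query, {"bird", "garden"}))
print(matches(query, {"stone", "wall"}))
```

Only the first document satisfies the query: the second contains garden, and the third does not contain bird.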
Query expression examples
The following are examples of query expressions.
Table 15 Query expression examples
Query expression                      Finds documents with ...
peace OR quiet                        Either peace or quiet, or both, in either order.
peace quiet                           Both peace and quiet, in either order.
peace AND quiet
peace && quiet
peace&&quiet                          The single word peace&&quiet.
peace or quiet                        The three words peace, or, and quiet, in any order. or is a word. The OR operator must be uppercase.
not quiet                             The words not and quiet. The NOT operator must be uppercase.
NOT quiet                             Illegal. A query cannot be purely negative and must have some positive expression.
peace & quiet                         The three words peace, &, and quiet, in any order. & is a word character. The AND operator is implied.
peace | quiet                         Both peace and quiet. | is a separator. The AND operator is implied.
peace AND NOT quiet                   The word peace but not quiet.
peace && -quiet
-quiet && peace
peace AND quiet OR silence            Avoid using. Parentheses are needed: peace AND (quiet OR silence).
quiet OR silence AND peace            Avoid using. Parentheses are needed: quiet OR (silence AND peace).
pea*                                  Any word starting with pea such as: pea, peas, peace, or peach.
pea*c*                                Words such as: peace, peach, or peaches.
"peace quiet"                         Both peace and quiet, in that order, with no intervening words. Examples: peace quiet or peace $^%+{} quiet.
"peace quiet"~1                       Both peace and quiet, in that order, separated by at most one word. Examples: peace and quiet; peace, now;%^$ quiet; peace quiet; or peace george quiet.
peace~                                Words similar to peace such as: peaches, piece, place, or plate.
(peace quiet) AND NOT "peace quiet"   Both peace and quiet, but not peace followed immediately by quiet. Examples: quiet peace; quiet blue peace; or peace, water, land, quiet.
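Two of the forms in the examples table, wildcards and proximity sequences, can be sketched as follows. This is an illustration under stated assumptions rather than a description of IAP's matching engine: a word is taken to be a run of letters, digits, or underscores, * is taken to stand for zero or more characters, matching is case-sensitive, and the proximity check handles only two-word sequences. All function names are hypothetical.

```python
import re

def words(text):
    # Assumption: anything that is not a letter, digit, or underscore separates words.
    return re.findall(r"\w+", text)

def wildcard_matches(pattern, word):
    """True if `word` fits `pattern`, where * stands for zero or more characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, word) is not None

def proximity_matches(first, second, n, text):
    """True if `second` follows `first` with at most n intervening words."""
    ws = words(text)
    for i, word in enumerate(ws):
        # Positions i+1 through i+1+n leave at most n words in between.
        if word == first and second in ws[i + 1:i + 2 + n]:
            return True
    return False

for candidate in ["pea", "peas", "peace", "peach", "peaches"]:
    print(candidate, wildcard_matches("pea*", candidate), wildcard_matches("pea*c*", candidate))

for sample in ["peace and quiet", "peace, now;%^$ quiet", "peace quiet", "quiet peace"]:
    print(repr(sample), proximity_matches("peace", "quiet", 1, sample))
```

With a proximity value of 0, proximity_matches behaves like the plain phrase query "peace quiet", because no intervening words are allowed.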