Microsoft EXCHANGE 2000 1 User Manual

Exchange 2000 Operations Guide
Version 1.0
ISBN: 1-4005-2762-7
Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless oth erwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
© 2001 Microsoft Corporation. All rights reserved. Microsoft, MS-DOS, Windows, Windo ws NT , Active Directory, Outlook, and Visual
Basic are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
ii
Contents
Chapter 1
Introduction 1
Microsoft Operations Framework (MOF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
How to Use This Guide. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Efficiency, Continuity, and Security. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Chapter Outlines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Planning and Deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Service Level Agreements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Support. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Related Topics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Chapter 2
Capacity and Availability Management 11
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Chapter Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Capacity Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Availability Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Service Hours . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Service Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Minimizing System Failures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Minimizing System Recovery Time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Performance Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Making Changes to the Registry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
No Performance Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Optimizable Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Tuning Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Upgrading from Exchange 5.5 to Exchange 2000 . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Tuning the Message Transfer Agent (MTA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Tuning SMTP Transport. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Store and ESE Tuning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
iii
Contentsiv
Tuning Active Directory Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Tuning Outlook Web Access (OWA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Hardware Upgrades . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Chapter 3
Change and Configuration Management 37
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Chapter Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Change Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Defining Change Type. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Software Control and Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Configuration Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Tools for Configuration Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Configuration Management and Change Management . . . . . . . . . . . . . . . . . . . . . . 45
Configuration Items . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Maintaining the Configuration Management Database . . . . . . . . . . . . . . . . . . . . . . 49
Exchange System Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Chapter 4
Enterprise Monitoring 53
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Chapter Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Performance Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
System Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Exchange 2000 Objects and Counters to Monitor . . . . . . . . . . . . . . . . . . . . . . . . . 55
Windows 2000 Objects and Counters to Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Centralized Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Event Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Event Viewer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Centralized Event Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Availability Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Monitoring and Status Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Centralized Availability Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Client Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Contents v
Chapter 5
Protection 71
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Chapter Start Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Chapter End Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Chapter Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Protection Against Hacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Firewall Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Anti-Virus Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Staying Current . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Dealing With Virus Infection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Blocking Attachments at the Client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Disaster Recovery Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Backing Up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Restoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Recovery Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Chapter 6
Support 89
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Chapter Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Providing Support for End Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Reducing End User Support Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Exchange Problem Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Glossary 97
Content Lead
Andrew Mason – Microsoft Prescriptive Architecture Group
Key Authors
Paul Slater – ContentMaster Kent Sarff – Microsoft Consulting Services Sasha Frljanic – Microsoft Consulting Services
Reviewers
Jon LeCroy – Microsoft ITG Thomas Applegate – Microsoft ITG Erik Ashby – Microsoft Exchange 2000 Product Group Chase Carpenter – Microsoft Consulting Services
vii
1
Introduction
Introduction
W elcome to the Microsoft® Exchange 2000 Server Operations Guide. This guide is designed to give you the best information available on managing operations within an Exchange 2000 environment.
To manage Exchange in a day-to-day environment, an operations team needs to perform a wide variety of procedures, including server monitoring, backup, verification of scheduled events, protection against attack, and user support. This guide includes instructions for the procedures along with steps for dealing with unresolved issues in a timely manner.
Microsoft Operations Framework (MOF)
For operations to be as efficient as possible in your environment, you must manage them effectively. To assist you, Microsoft has developed the Microsoft Operations Framework (MOF). This is essentially a collection of best practices, principles, and models providing you with technical guidance. Following MOF guidelines should help you to achieve mission­critical production system reliability, availability, supportability, and manageability on Microsoft products.
The MOF process model is split into four integrated quadrants. These are as follows:
Changing
Operating
Supporting
Optimizing
Together, the phases form a spiral life cycle (see Figure 1.1) that can apply to the opera­tions of anything from a specific application to an entire operations environment with
Microsoft Exchange 2000 Operations Guide — Version 1.02
Optimize cost, performance, capacity, and availability.
SLA
Review
Track and resolve incidents, problems, and inquiries quickly. Facilitate CRM.
Release
Approved
Operations
Review
Introduce new service, solutuions, technologies, systems, applications, hardware, and
Release
Review
Execute day-to-day operations tasks effectively.
Figure 1.1
MOF Process Model
multiple data centers. In this case, you will be using MOF in the context of Exchange 2000 operations.
The process model is supported by 20 service management functions (SMFs) and an integrated team model and risk model. Each quadrant is supported with a corresponding ding operations management review (also known as review milestone), during which the effectiveness of that quadrant’ s SMFs ar e assessed.
Chapter 1: Introduction 3
It is not essential to be a MOF expert to understand and use this guide, but a good under­standing of MOF principles will assist you in managing and maintaining a reliable, avail­able, and stable operations environment.
If you wish to learn more about MOF and how it can assist you to achieve maximum reliability, availability, and stability in your enterprise, visit www.microsoft.com/mof for more detailed information. For prescriptive MOF information on all 20 service manage­ment functions, complementing the Exchange-specific information found in this guide, examine the detailed operations guides at:
http://www.microsoft.com/technet/win2000/win2ksrv/default.asp
How to Use This Guide
While this guide is designed to be read from start to finish, you may wish to “dip in” to the guide to assist you in particular problem areas. To assist you in doing so, the guide contains a number of symbols that you will not find elsewhere. It is very important that you read the following section if you are going to get the most out of your piecemeal approach.
Efficiency, Continuity, and Security
Not every Exchange Operations manager thinks in terms of the MOF. Another way of considering operations is in terms of the categories in which they fit. The wide variety of tasks that constitute Exchange 2000 operations can be divided into three broadly overlap­ping groups. Figure 1.2 on the next page shows these groups and how the operations fit within them.
Microsoft Exchange 2000 Operations Guide — Version 1.04
Efficiency
Performance T uning Exchange System Policies Capacity Management
Group Policies Backup
Storage
Management
Hardware Upgrades
Performance
Monitoring
Disaster Recovery
Support
Anti-Virus Event
Monitoring
Change
Management
Security Policies
Firewall Issues
Exchange System Policies
AD Group membership
Continuity
UPS Recovery T esting Availability Monitoring Availablity Management
Security
Figure 1.2
Exchange 2000 Operations divided into groups
This guide covers all three areas described above. Although the chapters are structured according to Microsoft operations principles, you will find information about all of these areas in the guide.
Chapter Outlines
This guide consists of the following chapters, each of which takes you through a part of the operations process. Each chapter is designed to be read in whole or in part, according to your needs.
Chapter 2 – Capacity and Availability Management
To continue to function as it should, Exchange must be managed over time as the load on the system increases. The chapter looks at the different tasks that you may need to perform as the Exchange environment is used more.
This chapter deals with these topics:
Capacity management
Availability management
Performance tuning
Hardware upgrades
Chapter 3 – Change and Configuration Management
Chapter 1: Introduction 5
This chapter presents many of the processes used to manage an Exchange 2000 environment. These processes will help you to evaluate, control, and document change and configuration within your organization.
This chapter deals with the following:
Change management
Configuration management (including use of Exchange system policies)
Chapter 4 – Enterprise Monitoring
To track any problems and to ensure that your Exchange 2000 Server is running efficiently, you need to monitor it effectively . Monitoring is not something that should occur only when there are problems, but should be a continuous part of your maintenance program. The chapter shows how you can monitor your Exchange 2000 to deal with any problems as (or preferably before) they arise.
This chapter deals with these topics:
Creating an enterprise monitoring framework
Performance monitoring
Event monitoring
Microsoft Exchange 2000 Operations Guide — Version 1.06
A vailability monitoring
Proactive monitoring
A vailability prediction
Chapter 5 – Protection
To protect your Exchange environment from failure, you need good protection from intrusion and attack, along with a documented and tested disaster -recovery pr ocedure to cope with system failure. The chapter shows how to ensure that your server running Exchange is protected against these eventualities.
This chapter deals with the following:
Firewall issues
Anti-virus protection
Disaster-r ecovery procedures
Recovery testing
Backup
Restore
Chapter 6 – Support
An effective support environment allows you to deal more efficiently with unforeseen issues, increasing the reliability of your Exchange environment. This chapter shows how to manage your support in an Exchange 2000 environment.
This chapter deals with these topics:
Helpdesk support
Problem management
Planning and Deployment
To make the most out of your Exchange 2000 environment, you should make sure that your operations are carefully planned and structured. The best way of ensuring that your operations are efficient is to have operations intrinsically involved in the planning and deployment phases of Exchange 2000, providing valuable input into those processes.
Many times deployment teams do not involve the operations team in the project until it is near completion. If you are going to perform successful operations from the outset, you should make sure that you plan effectively for the operations team to take over the infra­structure and processes. Making sure that at least one member of operations attends all planning and deployment meetings will help to ensure that your design and implementa­tion takes operations needs into account.
Operations procedures need to be defined during the planning and deployment process. Service level agreements, disaster-recovery documents, and monitoring pr ocedur es all need
to be created at this stage, because waiting until the system is live could be too late. The operations team should be using the planning phase (and in some cases the deployment phase) to test procedures that are defined, such as those for disaster recovery.
Planning and deployment are covered in more detail as part of the Exchange 2000 Upgrade Series. You will find this at the following Web site:
http://www.microsoft.com/technet/exchange/guide/default.asp
Service Level Agreements
Your goal for successful operations is to produce a high quality of service at a reasonable cost. The definition of a high quality of service will vary according to the needs of your organization. The level of service you offer will generally be a compromise between quality-of-service requirements and the costs necessary to provide it.
Central to the idea of successful operations is the service level agreement (SLA) process. Success or failure of an operations environment is measured against the requirements of the SLA. It is therefore very important that you define the SLA realistically according to the resources you can devote to your operations environment.
Note: When it is internal to IT, an SLA is often referred to as an operating level agreement (OLA). For the sake of clarity, this guide refers to SLAs only. However, the recommendations in the guide apply to OLAs as well as SLAs.
Chapter 1: Introduction 7
Your SLAs should be a commitment to providing service in four different areas:
Features
Performance
Recovery
Support
Features
Here you state which Exchange services you will be offering to the client base. These would include some or all of the following:
E-mail (via some or all of MAPI, POP3, IMAP4, and OWA)
Defined mailbox size for different categories of user
Public folder access
Ability to schedule meetings
Instant messaging
Chat
Video conferencing
Microsoft Exchange 2000 Operations Guide — Version 1.08
Performance
Here you show the performance you would expect from each of the previously mentioned features. This would include some or all of the following:
Service availability (this may be given across all services or on a service-by-service
basis)
Service hours
Mail delivery times (note that you are not able to guarantee mail delivery times, either
to the Internet or within your organization, if you use the Internet as part of your intra­organization mail topology)
Mailbox replication times (dependent on Active Directory™ service)
Recovery
Here is where you will specify what you expect in terms of disaster recovery. Although you naturally hope that disaster never strikes, it is important to assume that it will and have recovery times that will meet in a number of disaster-recovery scenarios. These include the following:
Recovery from failed Exchange Store
Recovery from total server failure
Support
Here you specify how you will offer support to the user community, and also how you would deal with problems with your Exchange Server environment. You would include commitments on the following:
Helpdesk response time
When and how problems will be escalated
Of course it is one thing to determine what your SLAs should measure, and quite another to come up with the right figures for them. You should always define them realistically according to the needs of the business. Realistically is the key word here. For example, while your business might want Exchange to have guaranteed uptime of 100 percent, it is unrealistic to require this in an SLA because a single incident anywhere within your organization will cause you to fail to meet the SLA.
Y our operations environment should be built around meeting the r equirements of your SLAs.
Summary
This chapter has introduced you to this guide and summarized the other chapters in it. It has also provided brief descriptions of both service level agreements and planning and deployment. Now that you understand the organization of the guide, you can decide whether to read it from beginning to end, or whether you want to read selected portions. Remember that effective, successful operations require effort in all areas, not just improve­ments in one area, so that if you decide to read the Supporting chapter first, you should go back and read the other chapters as well.
Related Topics
The Microsoft Operations Framework provides technical guidance and industry best practices that encompasses the complete IT Service Management environment, including capacity management, availability management, configuration management, service monitoring and control, service level management, and their inter-relationships. For more information on the Microsoft Operations Framework, go to:
http://www.microsoft.com/mof For prescriptive MOF information on capacity management, availability management,
configuration management, service monitoring and control, and service level management, please review the detailed operations guides that can be found at:
Chapter 1: Introduction 9
http://www.microsoft.com/technet/win2000/win2ksrv
2
Capacity and Availability Management
Introduction
In the vast majority of cases, the load on your Exchange 2000 Server computers will increase over time. Companies increase in size, and as they do, the number of Exchange users increases. Existing users tend to use the messaging environment more over time, not only for traditional e-mail, but also for other collaborative purposes (for example, voicemail, fax, instant messaging, video conferencing). The load on the messaging environment will also vary over the course of the day (for example, there may be a morning peak) and could vary seasonally in response to increased business activity.
The aim of your operations team should be to minimize the effect of the increased load on your users, at all times keeping within the requirements set by your service level agreement (SLA). You will need to ensure that existing servers running Exchange are able to cope with the load placed upon them (and upgrade hardware if appropriate).
Another important requirement of the operations team is to minimize system downtime at all times. The level of downtime your organization is prepared to tolerate needs to be clearly set out in the SLA, separated into scheduled and unscheduled downtime. Many organizations can cope perfectly well with scheduled downtime, but unscheduled down­time almost always needs to be kept to a minimum.
Exchange 2000 is predominantly self-tuning, but there are areas where tuning your servers running Exchange will result in an improvement in performance. It is important to identify these areas and tune where appropriate.
Inevitably there will come a point where the load on your servers running Exchange is such that hardware upgrades are required. If you manage this process effectively, you can significantly reduce the costs associated with upgrading.
Microsoft Exchange 2000 Operations — Version 1.012
Chapter Sections
This chapter covers the following procedures:
Capacity management
Availability management
Performance tuning
Hardware upgrades
After reading this chapter, you will be familiar with the requirements for capacity and availability management in an Exchange 2000 environment and the steps necessary to ensure that the requirements of your SLA are met.
Capacity Management
Capacity management is the planning, sizing, and controlling of service capacity to ensure that the minimum performance levels specified in your SLA are exceeded. Good capacity management will ensure that you can provide IT services at a reasonable cost and still meet the levels of performance you have agreed with the client.
This section will help you meet your capacity management targets for an Exchange 2000 environment.
Of course, whether an individual server reaches its SLA targets will depend greatly upon the functions of that server. In Exchange 2000, servers can have a number of different functions, so you will need to ensure that you categorize servers according to the functions they perform and treat each category of server as an individual case. In particular, do not consider servers purely in terms of the number of mailboxes they hold.
When you are looking at the capacity of a server running Exchange, consider the following:
How many mailboxes are on the server?
What is the profile of the users? (Light, medium, or heavy use of e-mail; do they use
other services, such as video-conferencing?)
How much space do users require for mailboxes?
How many public folders are on the server?
How many connectors on the server are on the server?
How many distribution lists are configured to be expanded by the server?
Is the server a front-end server?
Is the server a domain controller/Global Catalog server? (generally not recommended)
Generally, the more functions a server has, the fewer users that server will be able to support on the same hardware. To gain the maximum capacity from your servers, consider having servers dedicated to a specialized function. In many cases your planning will have
Chapter 2: Capacity and Availability Management 13
resulted in specialized hardware for specialized functions, for example, in the case of front­end servers.
To ensure that you manage capacity appropriately for your Exchange 2000 server , you need a great deal of information about current and projected usage of your server running Exchange. Much of this information will come from monitoring. You will need informa­tion about patterns of usage and peak load characteristics. This information will need to be collected on a server-by-ser ver basis, because a problem with a single server in an Exchange 2000 environment can result in a loss of performance for thousands of users. The perfor ­mance of your network is also critical in ensuring delivery times and timely updating of Exchange directory information.
The main areas you should monitor to ensure that your servers running Exchange exceed your SLAs’ capacity requirements include the following:
CPU utilization
Memory utilization
Hard-disk space used
Paging levels
Network utilization
Delivery time within and between routing groups
Delivery time to and from foreign e-mail systems within your organization
Delivery time to and from the Internet (although this depends greatly on minute-by-
minute performance of your connection to the Internet and the availability of band­width to other messaging environments)
Time for directory updates to complete
You will find more information on monitoring in Chapter 4. It is fairly common to choose the size the disks of a server running Exchange based on how
many mailboxes you plan to have on the server multiplied by the maximum allowable size of each mailbox. Using this approach, however, will generally not help you to meet your SLAs. Y ou should strongly consider approaching this problem from a differ ent perspective. When determining the capacity of your server running Exchange, consider basing it on the time it takes to recover a server from your backup media. Recovery time is generally very important in organizations because downtime can be extremely costly. If you are using a single store on your server running Exchange, use the following procedure to help you size it.
1. Divide the recovery time of a database (defined in your SLA) by half. Around half the recovery time will generally be spent on data recovery, the rest on running diagnostic tools on the recovered files, database startup (which includes replaying all later message logs) and making configuration changes. Of course, this is only a general figure—the longer you leave for recovery time, the smaller the proportion of that time is required for configuration changes.
2. Determine in a test environment how much data can be restored in this time.
Microsoft Exchange 2000 Operations — Version 1.014
3. Divide this figure by the maximum mailbox size you have determined for the server
(again listed in your SLA). This will give you the number of mailboxes you can put on the database.
As an example, assume that your SLA defines a recovery time of four hours for a database. In testing, your recovery solution can restore 2 gigabytes (GB) of Exchange data per hour and your maximum mailbox size is 75Mb.
Using the preceding procedure, the following calculations can be done.
1. The recovery time divided by 2 is 2 hours.
2. 4 GB of Exchange data can be restored in this time.
3. 4 GB divided by 75 MB is 54 Mailboxes.
In this example, if you wanted to provide more mailboxes per server, you would either have to a) alter your SLA to increase recovery time or b) find a faster restore solution.
Each server running Exchange can support up to 20 stores, spread across 4 storage groups (in Enterprise edition). If your server running Exchange is configured with multiple stores, recovery times can be more difficult to calculate. Stores in the same storage group are always recovered in series, whereas ones in separate storage groups can be recovered in parallel. You may also have created multiple stores to allow you to offer different SLAs to different categories of user (for example, you might isolate managers on one store so that you can offer them faster recovery times than the rest of the organization.) If you do have multiple stores, you will need to consider the SLA on each store and the order in which stores will be recovered to accurately determine recovery time.
As a result of the first two steps in the preceding calculations, you will have a figure for the maximum amount of data located in your information stores. Generally, you should at least double this to determine the appropriate disk capacity for the disks containing your stores. This will allow you to perform offline maintenance much more quickly as files can be quickly copied to a location on the same logical disk.
By using a key SLA to define your capacity, you are creating an environment in which you are far more likely to meet the targets you set.
Sizing servers to meet SLAs is crucial, but servers must also meet user performance expec­tations. Using Microsoft and third-party tools, ensure that your predicted user usage will be accommodated on the servers.
If you have sized your database according to the techniques mentioned here, you should be able to ensure that your database is kept to a manageable size. However, keeping an eye on the size of your Exchange 2000 databases is still important. In a large enterprise, it is typical for users to be moved from one server to another quite often, and for users to be deleted. This can result in significant fragmentation of databases, which results in large database sizes, even if you do keep the number of mailboxes below the levels you have determined.
To deal with this problem, you should continually monitor available disk space on your servers running Exchange. If the RAID array containing the stores gets close to half full, an alert should be sent indicating the problem, and that the Exchange Database might need to be defragmented offline. To do this, perform an alternate server restore (see Chapter 5 for details) and then defragment the database on this alternate server. If this is successful and results in a significant reduction in database size, you can perform the defragmentation at the next scheduled maintenance time.
Performing the alternate server restore also has the advantage of ensuring that your backup and restore procedures are working effectively. You should check this regularly in any case. This is also covered in more detail in Chapter 5.
Probably the most important thing to remember when performing capacity planning is to size conservatively. Doing so will minimize availability problems, and the cost reduction will generally more than compensate for any excess capacity costs.
As well as looking at technical issues, you will need to examine staffing levels when you are capacity planning. As your Exchange 2000 environment grows, you might need more people to support the increased load. In particular, if there are more users requiring increased services, there is likely to be a greater need for help desk support.
Availability Management
Chapter 2: Capacity and Availability Management 15
A vailability management is the process of ensuring that any given IT service consistently and cost-effectively delivers the level of availability required by the customer. It is not just concerned with minimizing loss of service, but also with ensuring that appropriate action is taken if service is lost.
One of the main aims of Exchange 2000 operations is to ensure that Exchange is avail­able as much as possible and that both planned and unplanned interruptions to service are minimized. Availability in an organization is typically defined by your SLAs in two ways—service hours and service availability.
Service Hours
These are the hours when the Exchange services should be available. Typically, for a large organization this will be all but a very few hours a month. Defining your service hours allows you to create defined windows when offline maintenance of your servers running Exchange can be performed without breaching the terms of your SLA.
You might choose to define in the SLA the exact times when Exchange services might be unavailable. For example, you might state that Exchange services might be unavailable for four hours every first Saturday of the month. However, in large organizations it is often more practical to commit to, for example, no more than four hours of scheduled downtime per month, with a week’ s notice of any scheduled change. This allows changes to be made much more easily across the organization, at times when the right staff can be devoted to the tasks.
Microsoft Exchange 2000 Operations — Version 1.016
Of course, just because you have allowed for a certain amount of downtime per server per month, this does not mean that you have to use it, and in most cases you will not. On the other hand, just because you haven’t performed offline maintenance one month does not mean that the hours can be carried over to the following month. Your user community will be very unhappy if you take a system down for 2 days, even if it has been up solidly for 2 years!
You might wish to define different service hours for the different services available in Exchange (mail, public folders, etc). This would depend on the amount of offline mainte­nance that is typically required for each service. For example, you might determine that your SMTP bridgehead servers and firewall servers never require offline maintenance and so might set the level of service hours for mail delivery significantly higher than for mail­box access. If you are prepared to spend the appropriate money on resources, it is very possible to achieve extremely low levels of scheduled downtime, and this can be reflected in your SLA.
Service Availability
Service availability is a measure of how available your Exchange services are during the service hours you have defined. In other words, it defines the levels of unscheduled down­time you can tolerate within your organization. Typically levels of availability in an SLA of an enterprise are between 99.9 and 99.999 percent. This corresponds to a downtime of as much as 525 and as few as 5 minutes per service per year.
Of course, ANY unscheduled downtime is inconvenient at best, and very costly at worst, so you need to do your best to minimize it.
To ensure high levels of availability, you need to consider two key questions:
How often, on average, is there downtime for a service?
How long does it take to recover the service if there is downtime?
Once you have considered these questions, you can set about minimizing the number of times a service fails and the time taken to recover that service.
A vailability management is intrinsically linked with capacity management. If capacity is not managed properly, then overloaded servers running Exchange might fail, causing availability problems. A classic example of this would be running out of disk space on a server running Exchange, which would result in the databases shutting down and in users losing a number of services.
Minimizing System Failures
To minimize the frequency of failure in Exchange 2000, you need do the following:
Decrease single points of failure
Increase the reliability of Exchange 2000 itself.
Chapter 2: Capacity and Availability Management 17
Decreasing Single Points of Failure
You can maintain availability in Exchange 2000, even in the event of a failure, provided you ensure that it is not a single point of failure. In some areas, such as database corrup­tion, it is not possible to eliminate single points of failure, but in many cases you can guard against individual failures and still maintain reliability. An obvious example is the direc­tory. By having multiple domain controllers and Global Catalog servers available in any part of your network, you maintain availability of Exchange even in the event of failure of a particular domain controller or Global Catalog server. Having local domain controllers or Global Catalog servers keeps Exchange available in the event of a non-local network failure.
Using front-end servers is another way to avoid single points of failure. The failure of a single front-end server will have no effect on the availability of Exchange to non-MAPI clients. The clients will simply be rerouted to another front-end server, with no loss of service.
Exchange 2000 routing can be modified to minimize single points of failure. In particular, you can modify Routing Group connectors to ensure that there are multiple bridgeheads available, and thus maintain delivery from one part of the organization to another. You can also set up Routing Group meshes, which consist of a series of fully interconnected Routing Groups with multiple possible routes between them.
Multiple messaging routes between servers are useless if they all rely on the same net­work connections and the network goes down. You should therefore ensure that there are multiple network paths (using differing technologies) that Exchange and Windows 2000 can use.
One of the most significant single points of failure is a mailbox server. This can affect very large numbers of users, depending on the server. Mailbox servers can be clustered to ensure their continued high availability. If you are running Exchange 2000 on Windows 2000 Advanced Server, you can cluster over two nodes and you have two possible ways to cluster the servers—active/passive and active/active. Active/passive clustering is the current recom­mended clustering implementation for Exchange. If you choose to implement active/active clustering, you should realize that it requires careful planning to ensure that Exchange can fail over correctly to the other node. With Service Pack 1 of Exchange 2000 and Windows 2000 Datacenter server, you can have four nodes in your cluster. In this implementation consider active/active/active/passive clustering.
In a standard clustered environment, however, the disk array is still the single point of failure, so you should think seriously about using a storage area network (SAN) to maxi­mize the availability of all your servers running Exchange.
If you are creating truly redundant Exchange 2000 servers, you shouldn’t stop at the disk subsystem. Y our servers should be equipped with r edundant RAID controllers, network interface cards (NICs), and power supplies. In fact, you should aim to have redundancy everywhere.
Single points of failure can also be created by improper maintenance of systems. For example, if you are using a RAID 5 array on a server running Exchange with a hot spare,
Microsoft Exchange 2000 Operations — Version 1.018
the disk subsystem becomes a single point of failure once that hot spare is invoked. If you have robust systems in place, you must ensure that any failures are resolved promptly. Make sure that you have notification and monitoring procedures in place and a system for resolving problems.
Remember that Exchange relies on Active Directory and Global Catalog servers to function. If no domain controllers are available to a server running Exchange, stores will dismount. If Global Catalog servers are unavailable, Exchange clients will not function (as they require a Global Catalog server to access the Global Address List). You should minimize single points of failure on these servers as much as possible, or at least ensure that you have redundant servers in every location.
Finally, do not forget non-computing issues. You can have the most robust e-mail system in the world and then find that it falls apart due to a fire in a building, a power failure, or theft of server hardware or data. You should take precautions against all these possibilities. This would include ensuring that you have taken the following into account:
Good physical security
Protection from fire
Protection from flooding
Concealed power switches
Air conditioning
UPS systems
Alternate power generation
You will need to make sure that all of these services are in place and that you have defined a drill to deal with their failure. For example, you should ensure that there are personnel on call for all emergency systems.
You might also wish to house your servers running Exchange in separate locations from one another to help reduce the impact of such events.
Good availability management is intrinsically linked with good change and configuration management. If you manage change and configuration well, you are well positioned to have good availability of your servers running Exchange. You will learn more about change and configuration management with Exchange 2000 in Chapter 3.
Increasing the Reliability of Exchange 2000
While Exchange 2000 is a very robust messaging system, like any product, there are configurations that in particular cases could result in a loss of reliability. In an Enterprise environment, it is important to guard against these difficulties by continually monitoring Exchange. For more detail on monitoring, see Chapter 4.
Chapter 2: Capacity and Availability Management 19
One area where you can guard against problems is database errors. Database errors can be caused by a number of factors, but they are typically hardware related. You will be able to minimize these by doing the following:
Ensure that your hardware is on the Hardware Compatibility List
Checking Event Viewer for database-related errors
Periodically running the Information Store Integrity Checker (isinteg.exe) on the
database to check for errors
Part of your maintenance program should also include routinely searching the Microsoft W eb site (www.microsoft.com/exchange) for any issues that need to be resolved by patches and/or service packs. The patches and service packs will be tested and recorded as part of your change-management program, which is covered in more detail in Chapter 3.
Minimizing System Recovery Time
To recover from failure in an Exchange 2000 environment as quickly as possible, you need to be thoroughly prepared. You will need the following:
Available hardware
Complete configuration information
A recent, working backup
An effective disaster-recovery procedure
Fast access to support resources
Staff availability to perform the restore
System recovery is covered in more detail in Chapter 5.
Performance Tuning
When you tune for performance, you are aiming to reduce your system’s transaction response time. Performance tuning can take a number of forms, including the following:
Balancing workloads between servers
Balancing disk traffic on individual servers
Using memory efficiently on servers
Upgrading hardware
The most effective factor in improving performance comes from upgrading the hard­ware on your servers that are running Exchange. However, regardless of the hardware, there are a number of software changes that you can make to maximize the efficiency of Exchange 2000.
Microsoft Exchange 2000 Operations — Version 1.020
While obtaining the best performance from your Exchange 2000 computers is always an important goal, it is crucial to be cautious in your tuning changes. You should track all alterations in case you make a change that inadvertently reduces performance. Making one change at a time makes it easy to identify which change needs to be reversed.
Customer variation is probably the greatest variable in tuning Exchange for optimum per­formance. Even customers with similar needs often choose solutions that differ significantly . Hardware varies in the number and speed of the processors, the available RAM, the number of disks, and the disk configuration chosen (RAID level).
Exchange 2000 can be configured with different numbers of storage groups and databases, and can be clustered with other servers accessing a central storage subsystem. The server load will vary based on the total number of users with mailboxes on the server, the number of users logged on at a given time, the actions they are performing, and any additional load imposed by the routing of outside messages through the server.
Whenever you are doing performance tuning, you should consider the cost of extensive analysis versus the benefits you expect to get from the tuning. Put simply, if you need to analyze an individual server extensively to gain a 5 percent performance gain, it is prob­ably not worth it, since you could easily spend a fraction of the money on buying better hardware. Not only that, but in some cases performance tuning might become ineffective as the load on the server increases, meaning further analysis might be required after a change has been made. For this reason, this guide does not cover extensive performance analysis; instead it concentrates on the performance tuning changes that are easy to identify. This usually involves modifying settings in the registry.
Making Changes to the Registry
Before you edit the registry, make sure that you understand how to restore it if a problem occurs. For information on how to do this, view the “Restoring the Registry” Help topic in Registry Editor (Regedit.exe) or the “Restoring a Registry Key” Help topic in Regedt32.exe.
Warning: Using Registry Editor incorrectly can cause serious problems that might require you to reinstall your operating system. Microsoft cannot guarantee that problems resulting from the incorrect use of Registry Editor can be solved. Use Registry Editor at your own risk.
If you are unfamiliar with the registry and how to change it, consult an expert. Even if you are very familiar with the registry, you should always carefully document the changes that you make, and monitor your system after each change.
For information about how to edit the registry, view the “Change Keys and Values” Help topic in Registry Editor (Regedit.exe) or the “Add and Delete Information in the Registry”
Chapter 2: Capacity and Availability Management 21
and “Edit Registry Information” Help topics in Regedt32.exe. Note that you should back up the registry before you edit it. If you are running the Microsoft Windows NT® or Microsoft Windows® 2000 operating system, you should also update your emergency repair disk (ERD).
No Performance Optimizer
The Performance Optimizer (also known as PerfWiz) is an Exchange 5.5 tool that enables you to specify how an Exchange 5.5 computer is to be configured—for example as a private store or a public store server. In addition, you can limit an Exchange 5.5 server’s memory usage and specify how many users it would be expected to handle. Based on your choices, files written by the various Exchange components (information store, message transfer agent [MT A], and so forth) can be assigned to specific fixed disks, depending upon available storage.
Exchange 2000 does not use a tool like Performance Optimizer. One reason for this is that the new release of Exchange is better capable of performing certain tasks as they are required, such as dynamically changing certain parameters, spinning up more threads, and so on. To go beyond these dynamic changes, the administrator must manually optimize disk utilization and manually modify registry keys, but fewer such registry changes are necessary.
Optimizable Features
Features in Exchange 2000 that can be optimized include:
Disks
Message transfer agent (MTA)
Simple Mail T ransfer Protocol (SMTP)
Information-store database
Extensible storage engine (ESE) cache and log buffers
Active Directory connector
Active Directory integration
Installable file system (IFS) handle cache, credentials cache, and mailbox cache,
DSAccess cache and DSProxy
Post Office Protocol v3 (POP3) and Internet Mail Access Protocol (IMAP) settings
Outlook Web Access (OWA)
Microsoft Exchange 2000 Operations — Version 1.022
T uning Considerations
The efficiency and capacity of Microsoft Exchange 2000 depends on the administrator’ s choices of server and storage hardware, and on the installation’ s topology. These should be chosen based on expected types and levels of usage. Exchange can be made more efficient through changes to various registry settings on the Exchange computer.
There are three main types of tuning parameters:
Those with fixed optimal values (or values that can be treated as such)
Those that can be dynamically tuned by the software
Those that must be manually tuned (using setup, the exchange system manager, the
registry, and the Active Directory Services Interface Edit tool ADSIEdit)
Some parameters may need to be manually tuned for the following reasons:
Hardware or Exchange configuration information may be needed and this information
cannot or will not be obtained dynamically.
Server load information may also be required; this cannot be obtained dynamically
either.
Upgrading from Exchange 5.5 to Exchange 2000
When Exchange 5.5 is upgraded to Exchange 2000, some registry keys altered by PerfWiz retain their PerfWiz values, some do not, and some keys no longer appear in the registry or they appear in a different location. This means that there might be significant differ­ences between an Exchange 2000 Server that has been upgraded from Exchange 5.5 and one that is a new installation, installed on new hardware. Because there are significant performance improvements with Exchange 2000, and because optimizations do not necessarily transfer from Exchange 5.5, it is best to start from scratch in evaluating the optimization of Exchange 2000. It is useful in this process to know your Exchange 5.5 settings before upgrading to Exchange 2000. The text file WINNT\System32\perfopt.log provides a record of those registry keys and disk assignments changed by PerfWiz.
Tuning the Message Transfer Agent (MTA)
As mentioned earlier, Exchange 2000 does not include a Performance Optimization wizard, mainly because the majority of Exchange 2000 components are self-tuning. However, when the MTA is installed, its tuning state reflects that of an Exchange 5.5 computer that has never been performance optimized.
In scenarios where an organization only has servers running Exchange 2000, the MTA does not perform any processing, and so does not need to be performance tuned. However, when your servers co-exist with X.400-based messaging systems and other foreign systems (such as Lotus cc:Mail, Lotus Notes, Novell GroupWise, and Microsoft Mail) the MTA might be used heavily and you should consider tuning the MT A registry parameters. You will also need to tune the MTA if there is substantial co-existence with Exchange 5.5
Loading...
+ 92 hidden pages