Intel IRP-TR-03-10


Integrating Portable and Distributed Storage

N. Tolia, J. Harkes, M. Kozuch, M. Satyanarayanan



Intel may make changes to specifications and product descriptions at any time, without notice.

Copyright © Intel Corporation 2003

* Other names and brands may be claimed as the property of others.

Integrating Portable and Distributed Storage

Niraj Tolia†‡, Jan Harkes, Michael Kozuch, M. Satyanarayanan†‡

Carnegie Mellon University, Intel Research Pittsburgh


We describe a technique called lookaside caching that combines the strengths of distributed file systems and portable storage devices, while negating their weaknesses. In spite of its simplicity, this technique proves to be powerful and versatile. By unifying distributed storage and portable storage into a single abstraction, lookaside caching allows users to treat devices they carry as merely performance and availability assists for distant file servers. Careless use of portable storage has no catastrophic consequences.

1 Introduction

Floppy disks were the sole means of sharing data across users and computers in the early days of personal computing. Although they were trivial to use, considerable discipline and foresight were required of users to ensure data consistency and availability, and to avoid data loss: if you did not have the right floppy at the right place and time, you were in trouble! These limitations were overcome by the emergence of distributed file systems such as NFS [17], Netware [8], LanManager [24], and AFS [7]. In such a system, responsibility for data management is delegated to the distributed file system and its operational staff.

Personal storage has come full circle in the recent past. There has been explosive growth in the availability of USB- and Firewire-connected storage devices such as flash memory keychains and portable disk drives. Although very different from floppy disks in capacity, data transfer rate, form factor, and longevity, their usage model is no different. In other words, they are just glorified floppy disks and suffer from the same limitations mentioned above. Why then are portable storage devices in such demand today? Is there a way to use them that avoids the messy mistakes of the past, where a user was often awash in floppy disks trying to figure out which one had the latest version of a specific file? If loss, theft or destruction of a portable storage device occurs, how can one prevent catastrophic data loss? Since human attention grows ever more scarce, can we reduce the data management demands on attention and discipline in the use of portable devices?

We focus on these and related questions in this paper. We describe a technique called lookaside caching that combines the strengths of distributed file systems and portable storage devices, while negating their weaknesses. In spite of its simplicity, this technique proves to be powerful and versatile. By unifying “storage in the cloud” (distributed storage) and “storage in the hand” (portable storage) into a single abstraction, lookaside caching allows users to treat devices they carry as merely performance and availability assists for distant file servers. Careless use of portable storage has no catastrophic consequences.

Lookaside caching has very different goals and design philosophy from a PersonalRAID system [18], the only previous research that we are aware of on usage models for portable storage devices. Our starting point is the well-entrenched base of distributed file systems in existence today. We assume that these are successful because they offer genuine value to their users. Hence, our goal is to integrate portable storage devices into such a system in a manner that is minimally disruptive of its existing usage model. In addition, we make no changes to the native file system format of a portable storage device; all we require is that the device be mountable as a local file system at any client of the distributed file system. In contrast, PersonalRAID takes a much richer view of the role of portable storage devices. It views them as first-class citizens rather than as adjuncts to a distributed file system. It also uses a customized storage layout on the devices. Our design and implementation are much simpler, but also more limited in functionality.

We begin in Section 2 by examining the strengths and weaknesses of portable storage and distributed file systems. In Sections 3 and 4, we describe the design and implementation of lookaside caching. We quantify the performance benefit of lookaside caching in Section 5, using three different benchmarks. We explore broader use of lookaside caching in Section 6, and conclude in Section 7 with a summary.


2 Background

To understand the continuing popularity of portable storage, it is useful to review the strengths and weaknesses of portable storage and distributed file systems. While there is considerable variation in the designs of distributed file systems, there is also a substantial degree of commonality across them. Our discussion below focuses on these common themes.

Performance: A portable storage device offers uniform performance at all locations, independent of factors such as network connectivity, initial cache state, and temporal locality of references. Except for a few devices such as floppy disks, the access times and bandwidths of portable devices are comparable to those of local disks. In contrast, the performance of a distributed file system is highly variable. With a warm client cache and good locality, performance can match local storage. With a cold cache, poor connectivity and low locality, performance can be intolerably slow.

Availability: If you have a portable storage device in hand, you can access its data. Short of device failure, which is very rare, no other common failures prevent data access. In contrast, distributed file systems are susceptible to network failure, server failure, and a wide range of operator errors.

Robustness: A portable storage device can easily be lost, stolen or damaged. Data on the device becomes permanently inaccessible after such an event. In contrast, data in a distributed file system continues to be accessible even if a particular client that uses it is lost, stolen or damaged. For added robustness, the operational staff of a distributed file system perform regular backups and typically keep some of the backups off site to allow recovery after catastrophic site failure. Backups also help recovery from user error: if a user accidentally deletes a critical file, he can recover a backed-up version of it. In principle, a highly disciplined user could implement a careful regimen of backup of portable storage to improve robustness. In practice, few users are sufficiently disciplined and well-organized to do this. It is much simpler for professional staff to regularly back up a few file servers, thus benefiting all users.

Sharing/Collaboration: The existence of a common name space simplifies sharing of data and collaboration between the users of a distributed file system. This is much harder if done by physical transfers of devices. If one is restricted to sharing through physical devices, a system such as PersonalRAID can be valuable in managing complexity.

Consistency: Without explicit user effort, a distributed file system presents the latest version of a file when it is accessed. In contrast, a portable device has to be explicitly kept up to date. When multiple users can update a file, it is easy to get into situations where a portable device has stale data without its owner being aware of this fact.

Capacity: Any portable storage device has finite capacity. In contrast, the client of a distributed file system can access virtually unlimited amounts of data spread across multiple file servers. Since local storage on the client is merely a cache of server data, its size only limits working set size rather than total data size.

Security: The privacy and integrity of data on portable storage devices rely primarily on physical security. A further level of safety can be provided by encrypting the data on the device, and by requiring a password to mount it. These can be valuable as a second layer of defense in case physical security fails. Denial of service is impossible if a user has a portable storage device in hand. In contrast, the security of data in a distributed file system is based on more fragile assumptions. Denial of service may be possible through network attacks. Privacy depends on encryption of network traffic. Fine-grain protection of data through mechanisms such as access control lists is possible, but relies on secure authentication using a mechanism such as Kerberos [19].

Ubiquity: A distributed file system requires operating system support. In addition, it may require environmental support such as Kerberos authentication and specific firewall configuration. Unless a user is at a client that meets all of these requirements, he cannot access his data in a distributed file system. In contrast, portable storage only depends on widely supported low-level hardware and software interfaces. If a user sits down at a random machine, he can be much more confident of accessing data from portable storage in his possession than from a remote file server.

3 Lookaside Caching

Our goal is to exploit the performance and availability advantages of portable storage to improve these same attributes in a distributed file system. The resulting design should preserve all other characteristics of the underlying distributed file system. In particular, there should be no compromise of robustness, consistency or security. There should also be no added complexity in sharing and collaboration. Finally, the design should be tolerant of human error: improper use of the portable storage device (such as using the wrong device or forgetting to copy the latest version of a file to it) should not hurt correctness.

Lookaside caching is an extension of AFS2-style whole-file caching [7] that meets the above goals. It is based on the observation that virtually all distributed file system protocols provide separate remote procedure calls (RPCs) for access of meta-data and access of data content. Lookaside caching extends the definition of meta-data to include a cryptographic hash of data content. This extension only increases the size of meta-data by a modest amount: just 20 bytes if SHA-1 [11] is used as the hash. Since hash size does not depend on file length, it costs very little to obtain and cache hash information even for many large files. Using POSIX terminology, caching the results of “ls -lR” of a large tree is feasible on a small client, even if there is not enough cache space for the contents of all the files in the tree. This continues to be true even if one augments stat information for each file or directory in the tree with its SHA-1 hash.
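The fixed cost of the extension can be illustrated with a minimal sketch. The FileMeta record and the describe() and demo() helpers are hypothetical names chosen for illustration; they are not Coda data structures. The sketch only shows that the SHA-1 digest occupies 20 bytes whether the file holds ten bytes or a million.

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical meta-data record: stat-like fields plus a content hash."""
    length: int
    mtime: float
    sha1: bytes  # 20 bytes for SHA-1, independent of file length


def describe(path: str, chunk_size: int = 1 << 16) -> FileMeta:
    """Build meta-data for a file, hashing its content in fixed-size chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    st = os.stat(path)
    return FileMeta(length=st.st_size, mtime=st.st_mtime, sha1=digest.digest())


def demo() -> Tuple[FileMeta, FileMeta]:
    """Describe a 10-byte and a 1,000,000-byte temporary file."""
    metas = []
    for size in (10, 1_000_000):
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(b"x" * size)
        try:
            metas.append(describe(tmp.name))
        finally:
            os.unlink(tmp.name)
    return metas[0], metas[1]
```

Both records returned by demo() carry a digest of the same size, which is why caching hashes for a large tree stays cheap even when the file contents would not fit in the cache.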

Once a client possesses valid meta-data for an object, it can use the hash to redirect the fetch of data content. If a mounted portable storage device has a file with matching length and hash, the client can obtain the contents of the file from the device rather than from the file server. Whether it is beneficial to do this depends, of course, on factors such as file size, network bandwidth, and device transfer rate. The important point is that possession of the hash gives a degree of freedom that clients of a distributed file system do not possess today.
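The trade-off among file size, network bandwidth, and device transfer rate can be made concrete with a back-of-the-envelope estimate. The cost model below is our own simplifying assumption, not a policy taken from the implementation: the network fetch costs one round trip plus transfer time, the device read costs transfer time only, and hashing cost is ignored. All function and parameter names are hypothetical.

```python
def network_seconds(size_bytes: int, bandwidth_bits_per_s: float,
                    round_trip_s: float) -> float:
    """Rough time to fetch a file from the server: one round trip plus transfer."""
    return round_trip_s + (size_bytes * 8) / bandwidth_bits_per_s


def device_seconds(size_bytes: int, device_bytes_per_s: float) -> float:
    """Rough time to read the file once from the portable device."""
    return size_bytes / device_bytes_per_s


def lookaside_is_beneficial(size_bytes: int, bandwidth_bits_per_s: float,
                            round_trip_s: float,
                            device_bytes_per_s: float) -> bool:
    """True when the device read is estimated to be faster than the network fetch."""
    return (device_seconds(size_bytes, device_bytes_per_s)
            < network_seconds(size_bytes, bandwidth_bits_per_s, round_trip_s))
```

Under this model, a 10 MB file behind a 100 Kb/s link with a 0.1 s round trip takes roughly 800 s to fetch, against about 10 s from a device that reads 1 MB/s. For a 1 KB file on a 100 Mb/s LAN, a slow device can lose to the network.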

Since lookaside caching treats the hash as part of the meta-data, there is no compromise in consistency. The underlying cache coherence protocol of the distributed file system determines how closely client state tracks server state. There is no degradation in the accuracy of this tracking if the hash is used to redirect access of data content. To ensure no compromise in security, the file server should return a null hash for any object on which the client only has permission to read the meta-data.
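The null-hash rule can be sketched as a server-side check. The function name and the boolean permission flag are hypothetical stand-ins for whatever access control the file server applies. The likely rationale, stated here as our reading, is that a real hash would let a client confirm a guess about the content of a file it is not allowed to read.

```python
import hashlib
from typing import Optional


def hash_for_client(content: bytes, can_read_data: bool) -> Optional[bytes]:
    """Return the SHA-1 of the content, or None (a null hash) when the
    client may read only the meta-data of the object."""
    if not can_read_data:
        return None  # reveal nothing that depends on the file content
    return hashlib.sha1(content).digest()
```

A client that receives a null hash has nothing to look aside with and falls through to the ordinary fetch path, where the server's usual permission checks apply.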

Lookaside caching can be viewed as a degenerate case of the use of file recipes, as described by Tolia et al. [22]. In that work, a recipe is an XML description of file content that enables block-level reassembly of the file from content-addressable storage. One can view the hash of a file as the smallest possible recipe for it. The implementation using recipes is considerably more complex than our support for lookaside caching. In return for this complexity, synthesis from recipes may succeed in many situations where lookaside fails.

4 Prototype Implementation

We have implemented lookaside caching in the Coda file system on Linux. The user-level implementation of the Coda client cache manager and server code greatly simplified our effort since no kernel changes were needed. The implementation consists of four parts: a small change to the client-server protocol; a quick index check (the “lookaside”) in the code path for handling a cache miss; a tool for generating lookaside indexes; and a set of user commands to include or exclude specific lookaside devices.

The protocol change replaces two RPCs, ViceGetAttr() and ViceValidateAttrs(), with the extended calls ViceGetAttrPlusSHA() and ViceValidateAttrsPlusSHA() that have an extra parameter for the SHA-1 hash of the file. ViceGetAttr() is used to obtain meta-data for a file or directory, while ViceValidateAttrs() is used to revalidate cached meta-data for a collection of files or directories when connectivity is restored to a server. Our implementation preserves compatibility with legacy servers. If a client connects to a server that has not been upgraded to support lookaside caching, it falls back to using the original RPCs mentioned above.
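The fallback behavior can be sketched with stub classes. Everything in this sketch is a hypothetical Python stand-in: the real RPCs are the Coda calls named above, and how the client actually detects a legacy server is not described here. We assume, for illustration only, that an unsupported call raises an error.

```python
from typing import Optional, Tuple


class RpcNotSupported(Exception):
    """Raised by a stub when the server does not implement the requested RPC."""


class LegacyServer:
    """Stand-in for a server that offers only the original attribute call."""

    def get_attr(self, fid: str) -> dict:
        return {"fid": fid, "length": 5}

    def get_attr_plus_sha(self, fid: str) -> Tuple[dict, bytes]:
        raise RpcNotSupported("server predates lookaside caching")


class UpgradedServer(LegacyServer):
    """Stand-in for a server that also returns the SHA-1 with the attributes."""

    def get_attr_plus_sha(self, fid: str) -> Tuple[dict, bytes]:
        return self.get_attr(fid), b"\x00" * 20  # placeholder 20-byte hash


def fetch_attrs(server: LegacyServer, fid: str) -> Tuple[dict, Optional[bytes]]:
    """Try the extended call first; fall back to the original one if needed."""
    try:
        return server.get_attr_plus_sha(fid)
    except RpcNotSupported:
        # No hash is available, so lookaside is simply skipped for this object.
        return server.get_attr(fid), None
```

With a legacy server the client obtains the same meta-data as before and a missing hash, so every cache miss is serviced by the server exactly as it was without lookaside caching.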

The lookaside occurs just before the execution of the ViceFetch() RPC to fetch file contents. Before network communication is attempted, the client consults one or more lookaside indexes to see if a local file with identical SHA-1 value exists. Trusting in the collision resistance of SHA-1 [10], a copy operation on the local file can then be a substitute for the RPC. To detect version skew between the local file and its index, the SHA-1 hash of the local file is re-computed. In case of a mismatch, the local file substitution is suppressed and the cache miss is serviced by contacting the file server. Coda's consistency model is not compromised, although some small amount of work is wasted on the lookaside path.
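The lookaside path described above can be sketched as follows. This is an illustration under stated assumptions, not the Coda code: the index is modeled as an in-memory dictionary from SHA-1 digest to local path, and the fetch_from_server callback is a hypothetical stand-in for the ViceFetch() RPC.

```python
import hashlib
import os
import shutil
import tempfile
from typing import Callable, Dict, Tuple


def sha1_of(path: str, chunk_size: int = 1 << 16) -> bytes:
    """Recompute the SHA-1 of a local file."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()


def service_cache_miss(expected_sha1: bytes, index: Dict[bytes, str],
                       cache_path: str,
                       fetch_from_server: Callable[[str], None]) -> str:
    """Fill cache_path from a matching local file, else from the server."""
    candidate = index.get(expected_sha1)
    if candidate is not None and os.path.isfile(candidate):
        # Re-hash to detect skew between the index and the file it describes.
        if sha1_of(candidate) == expected_sha1:
            shutil.copyfile(candidate, cache_path)
            return "lookaside"
    fetch_from_server(cache_path)  # stale or missing entry: use the server
    return "server"


def demo() -> Tuple[str, bytes, str, bytes]:
    """Exercise one valid index entry and one stale entry."""
    workdir = tempfile.mkdtemp()
    local = os.path.join(workdir, "local.bin")
    cache = os.path.join(workdir, "cache.bin")
    with open(local, "wb") as f:
        f.write(b"hello")

    def fake_fetch(dst: str) -> None:
        with open(dst, "wb") as out:
            out.write(b"from server")

    good = hashlib.sha1(b"hello").digest()
    first = service_cache_miss(good, {good: local}, cache, fake_fetch)
    with open(cache, "rb") as f:
        first_data = f.read()

    stale = hashlib.sha1(b"older contents").digest()  # index no longer matches
    second = service_cache_miss(stale, {stale: local}, cache, fake_fetch)
    with open(cache, "rb") as f:
        second_data = f.read()
    shutil.rmtree(workdir)
    return first, first_data, second, second_data
```

In the stale case the only loss is the time spent re-hashing the local file, which corresponds to the wasted work on the lookaside path mentioned above.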

The index generation tool walks the file tree rooted

