02-52 | Florian Schintke, Alexander Reinefeld
On the Cost of Reliability in Large Data Grids
Abstract: Global grid environments provide not only massive aggregated computing power but also an unprecedented amount of distributed storage space. Unfortunately, dynamic changes caused by component failures, local decisions, and irregular data updates make it difficult to use this capacity efficiently.
In this paper, we address the problem of improving data availability in the presence of unreliable components. We present an analytical model for determining an optimal combination of distributed replica catalogs, catalog sizes, and replica servers. Empirical simulation results confirm the accuracy of our theoretical analysis.
Our model captures the characteristics of highly dynamic environments like peer-to-peer networks, but it can also be applied to more centralized, less dynamic grid environments like the European DataGrid.
Keywords: replication, availability, reliability, distributed catalog, data grid
CR: C.4, D.4.2, E.5
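
The trade-off named in the abstract (how many replica catalogs and replica servers are needed for a given availability) can be illustrated with a minimal sketch. This is not the model from the paper, which the abstract does not spell out. It is the textbook approximation that assumes every component fails independently and that a data item is reachable when at least one catalog copy and at least one replica server are up. The function names `any_up`, `data_availability`, and `min_replicas` and all parameter values are hypothetical.

```python
def any_up(p_up: float, copies: int) -> float:
    """Probability that at least one of `copies` independent components is up.

    Assumption (not from the paper): failures are independent and each
    component is up with the same probability `p_up`.
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must lie in [0, 1]")
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return 1.0 - (1.0 - p_up) ** copies


def data_availability(p_catalog: float, n_catalogs: int,
                      p_server: float, n_replicas: int) -> float:
    """Availability when a lookup needs one live catalog AND one live replica."""
    return any_up(p_catalog, n_catalogs) * any_up(p_server, n_replicas)


def min_replicas(p_server: float, target: float, limit: int = 10_000) -> int:
    """Smallest replica count whose 'at least one up' probability meets `target`."""
    if not (0.0 < p_server <= 1.0 and 0.0 <= target < 1.0):
        raise ValueError("need 0 < p_server <= 1 and 0 <= target < 1")
    for n in range(1, limit + 1):
        if any_up(p_server, n) >= target:
            return n
    raise RuntimeError("target not reachable within limit")


if __name__ == "__main__":
    # Hypothetical numbers: each catalog and each server is up 90% of the time.
    print(data_availability(0.9, 2, 0.9, 2))
    print(min_replicas(0.9, 0.995))
```

Under these assumptions every extra copy adds less availability than the one before it, which is why the cost of reliability grows quickly as the target approaches 1.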