介绍 | 大狗精读 PBFT 论文(二)

1 Introduction

Malicious attacks and software errors are increasingly common. The growing reliance of industry and government on online information services makes malicious attacks more attractive and makes the consequences of successful attacks more serious. In addition, the number of software errors is increasing due to the growth in size and complexity of software. Since malicious attacks and software errors can cause faulty nodes to exhibit Byzantine (i.e., arbitrary) behavior, Byzantine-fault-tolerant algorithms are increasingly important.

恶意攻击和软件错误越来越普遍。行业和政府对在线信息服务的日益依赖使恶意攻击更具吸引力,并且使成功攻击的后果更加严重。另外,由于软件系统的规模和复杂性的增加,软件系统错误的数量也在增加。由于恶意攻击和软件错误可能导致故障节点表现出拜占庭(即任意)行为,因此拜占庭容错算法变得越来越重要。

This paper presents a new, practical algorithm for state machine replication that tolerates Byzantine faults. The algorithm offers both liveness and safety provided at most (n-1)/3 out of a total of replicas are simultaneously faulty. This means that clients eventually receive replies to their requests and those replies are correct according to linearizability .The algorithm works in asynchronous systems like the Internet and it incorporates important optimizations that enable it to perform efficiently.

本文提出了一种新的实用的状态机复制算法,可以容忍拜占庭式错误。该算法同时提供了活动性和安全性,但同时最多有 ( n – 1 ) / 3 个副本同时出现故障。这意味着客户端最终会收到对其请求的答复,并且遵循线性一致性( linearizability ),这些答复是正确的。该算法在 Internet 等异步系统中运行,并且结合了重要的优化功能,使其能够高效执行。

There is a significant body of work on agreement and replication techniques that tolerate Byzantine faults. However, most earlier work either concerns techniques designed to demonstrate theoretical feasibility that are too inefficient to be used in practice, or assumes synchrony, i.e., relies on known bounds on message delays and process speeds. The systems closest to ours, Rampart and SecureRing , were designed to be practical, but they rely on the synchrony assumption for correctness, which is dangerous in the presence of malicious attacks. An attacker may compromise the safety of a service by delaying non-faulty nodes or the communication between them untilthey are tagged asfaulty and excluded from the replica group. Such a denial-of-service attack is generally easier than gaining control over a non-faulty node.

在协议和复制技术上可以容忍拜占庭式错误的工作量很大。 但是,大多数较早的工作要么涉及「旨在证明理论上的」可行性的技术,而这些技术的效率太低而无法在实践中使用,要么涉及同步性,即依赖消息延迟和处理速度在已知范围内。 最接近我们的系统, Rampart 和 SecureRing 被设计为实用的,但是它们依靠同步假设来确保正确性,这在存在恶意攻击的情况下是危险的。 攻击者可能会延迟非故障节点或它们之间的通信,直到它们被标记为故障节点并从副本组中排除,从而损害服务的安全性。 通常,这种服务拒绝攻击(DOS)比获得对非故障节点的控制要容易得多。

Our algorithm is not vulnerable to this type of attack because it does not rely on synchrony for safety. In addition, it improves the performance of Rampart and SecureRing by more than an order of magnitude as explained in Section 7. It uses only one message round trip to execute read-only operations and two to execute read-write operations. Also, it uses an efficient authentication scheme based on message authentication codes during normal operation; public-key cryptography, which was cited as the major latency [29] and throughput [22] bottleneck in Rampart, is used only when there are faults.

我们的算法不容易受到此类攻击,因为它不依赖同步来保证安全。 此外,如第 7 节所述,它将 Rampart 和SecureRing 的性能提高了一个数量级以上。它仅使用一次消息往返执行只读操作,使用两次消息往返执行读写操作。 此外,它在正常操作期间使用基于消息身份验证代码的有效身份验证方案; 被称为 Rampart 中的主要延迟[29]和吞吐量[22]瓶颈的公钥密码学,仅在出现故障时才使用。

To evaluate our approach, we implemented a replication library and used it to implement a real service: a Byzantine-fault-tolerant distributed file system that supports the NFS protocol. We used the Andrew benchmark [15] to evaluate the performance of oursystem. The results show that our system is only 3% slower than the standard NFS daemon in the Digital Unix kernel during normal-case operation.

为了评估我们的方法,我们实现了一个复制库,并使用它来实现真实的服务:支持 NFS 协议的拜占庭容错分布式文件系统。 我们使用 Andrew 基准[15]来评估我们系统的性能。 结果表明,在正常情况下,我们的系统仅比 Digital Unix 内核中的标准 NFS 后台程序慢 3%。

Thus, the paper makes the following contributions:

  • It describes the first state-machine replication protocol that correctly survives Byzantine faults in asynchronous networks.

  • It describes a number of important optimizations that allow the algorithm to perform well so that it can be used in real systems.

  • It describes the implementation of a Byzantine-faulttolerant distributed file system.

  • It provides experimental results that quantify the cost of the replication technique.

因此,本篇论文做出了如下贡献:

  • 描述了第一个状态机复制协议,该协议可以在异步网络中正确地克服拜占庭式错误。

  • 描述了许多重要的优化,这些优化使算法能够很好地执行,因此可以在实际系统中使用。

  • 描述了拜占庭容错分布式文件系统的实现。

  • 提供了复制技术的量化的成本的试验结果。

The remainder of the paper is organized as follows. We begin by describing our system model, including our failure assumptions. Section 3 describes the problem solved by the algorithm and states correctness conditions. The algorithm is described in Section 4 and some important optimizations are described in Section 5. Section 6 describes our replication library and how we used it to implement a Byzantine-fault-tolerant NFS. Section 7 presents the results of our experiments. Section 8 discusses related work. We conclude with a summary of what we have accomplished and a discussion of future research directions.

在本文的其余部分安排如下。 我们首先描述我们的系统模型,包括我们的故障假设。 第 3 节描述了该算法解决的问题并陈述了正确性条件。 该算法在第 4 节中进行了介绍,而一些重要的优化则在第 5 节中进行了介绍。第 6 节介绍了我们的复制库以及我们如何使用它来实现拜占庭式容错的 NFS 。 第7节介绍了我们的实验结果。 第 8 节讨论相关工作。 最后,我们总结了已完成的工作并讨论了未来的研究方向。