In this paper, we consider the problem of detecting whether a compromised router is maliciously manipulating its stream of packets. In particular, we are concerned with a simple yet effective attack in which a router selectively drops packets destined for some victim. Unfortunately, it is quite challenging to attribute a missing packet to a malicious action because normal network congestion can produce the same effect. Modern networks routinely drop packets when the load temporarily exceeds their buffering capacities. Previous detection protocols have tried to address this problem with a user-defined threshold: too many dropped packets imply malicious intent. However, this heuristic is fundamentally unsound; setting this threshold is, at best, an art and will certainly create unnecessary false positives or mask highly focused attacks. We have designed, developed, and implemented a compromised router detection protocol that dynamically infers, based on measured traffic rates and buffer sizes, the number of congestive packet losses that will occur. Once the ambiguity from congestion is removed, subsequent packet losses can be attributed to malicious actions. We have tested our protocol in Emulab and have studied its effectiveness in differentiating attacks from legitimate network behavior.
1 INTRODUCTION

THE Internet is not a safe place. Unsecured hosts can expect to be compromised within minutes of connecting to the Internet, and even well-protected hosts may be crippled by denial-of-service (DoS) attacks. However, while such threats to host systems are widely understood, it is less well appreciated that the network infrastructure itself is subject to constant attack as well. Indeed, through combinations of social engineering and weak passwords, attackers have seized control over thousands of Internet routers. Even more troubling is Mike Lynn's controversial presentation at the 2005 Black Hat Briefings, which demonstrated how Cisco routers can be compromised via simple software vulnerabilities. Once a router has been compromised in such a fashion, an attacker may interpose on the traffic stream and manipulate it maliciously to attack others: selectively dropping, modifying, or rerouting packets.

Several researchers have developed distributed protocols to detect such traffic manipulations, typically by validating that traffic transmitted by one router is received unmodified by another. However, all of these schemes, including our own, struggle in interpreting the absence of traffic. While a packet that has been modified in transit represents clear evidence of tampering, a missing packet is inherently ambiguous: it may have been explicitly blocked by a compromised router or it may have been dropped benignly due to network congestion. In fact, modern routers routinely drop packets due to bursts in traffic that exceed their buffering capacities, and the widely used Transmission Control Protocol (TCP) is designed to cause such losses as part of its normal congestion control behavior.
Thus, existing traffic validation systems must inevitably produce false positives for benign events and/or produce false negatives by failing to report real malicious packet dropping.

In this paper, we develop a compromised router detection protocol that dynamically infers the precise number of congestive packet losses that will occur. Once the congestion ambiguity is removed, subsequent packet losses can be safely attributed to malicious actions. We believe our protocol is the first to automatically predict congestion in a systematic manner and that it is necessary for making any such network fault detection practical.

In the remainder of this paper, we briefly survey the related background material, evaluate options for inferring congestion, and then present the assumptions, specification, and a formal description of a protocol that achieves these goals. We have evaluated our protocol in a small experimental network and demonstrate that it is capable of accurately resolving extremely small and fine-grained attacks.
2 BACKGROUND

There are inherently two threats posed by a compromised router. The attacker may subvert the network control plane (e.g., by manipulating the routing protocol into false route updates) or may subvert the network data plane and forward individual packets incorrectly. The first set of attacks has seen the widest interest and the most activity, largely due to their catastrophic potential. By violating the routing protocol itself, an attacker may cause large portions of the network to become inoperable. Thus, there have been a variety of efforts to impart authenticity and consistency guarantees on route update messages, with varying levels of cost and protection. We do not consider this class of attacks in this paper.

Instead, we have focused on the less well-appreciated threat of an attacker subverting the packet forwarding process on a compromised router. Such an attack presents a wide set of opportunities, including DoS, surveillance, man-in-the-middle attacks, replay and insertion attacks, and so on. Moreover, most of these attacks can be trivially implemented via the existing command shell languages in commodity routers.

The earliest work on fault-tolerant forwarding is due to Perlman, who developed a robust routing system based on source routing, digitally signed route-setup packets, and reserved buffers. While groundbreaking, Perlman's work required significant commitments of router resources and high levels of network participation to detect anomalies. Since then, a variety of researchers have proposed lighter weight protocols for actively probing the network to test whether packets are forwarded in a manner consistent with the advertised global topology.
Conversely, the 1997 WATCHERS system detects disruptive routers passively via a distributed monitoring algorithm that detects deviations from a "conservation of flow" invariant. However, work on WATCHERS was abandoned, in part due to limitations in its distributed detection protocol, its overhead, and the problem of ambiguity stemming from congestion.

Finally, our own prior work broke the problem into three pieces: a traffic validation mechanism, a distributed detection protocol, and a rerouting countermeasure. In that work, we focused on the detection protocol, provided a formal framework for evaluating the accuracy and precision of any such protocol, and described several practical protocols that allow scalable implementations. However, we also assumed that the problem of congestion ambiguity could be solved, without providing a solution. This paper presents a protocol that removes this assumption.
3 INFERRING CONGESTIVE LOSS
In building a traffic validation protocol, it is necessary to explicitly resolve the ambiguity around packet losses. Should the absence of a given packet be seen as malicious or benign? In practice, there are three approaches for addressing this issue:
Static threshold. Low rates of packet loss are assumed to be congestive, while rates above some predefined threshold are deemed malicious.
Traffic modeling. Packet loss rates are predicted as a function of traffic parameters, and losses beyond the prediction are deemed malicious.
Traffic measurement. Individual packet losses are predicted as a function of measured traffic load and router buffer capacity. Deviations from these predictions are deemed malicious.

Most traffic validation protocols, including WATCHERS, Secure Traceroute, and our own prior work, analyze aggregate traffic over some period of time in order to amortize monitoring overhead over many packets. For example, one such validation protocol maintains packet counters in each router to detect if traffic flow is not conserved from source to destination. When a packet arrives at router r and is forwarded to a destination that will traverse a path segment ending at router x, r increments an outbound counter associated with router x. Conversely, when a packet arrives at router r via a path segment beginning with router x, it increments its inbound counter associated with router x. Periodically, router x sends a copy of its outbound counters to the associated routers for validation. Then, a given router r can compare the number of packets that x claims to have sent to r with the number of packets it counts as being received from x, and it can detect the number of packet losses.

Thus, over some time window, a router simply knows that out of m packets sent, n were successfully received. To address congestion ambiguity, all of these systems employ a predefined threshold: if more than this number is dropped in a time interval, then one assumes that some router is compromised. However, this heuristic is fundamentally flawed: how does one choose the threshold?

In order to avoid false positives, the threshold must be large enough to include the maximum number of legitimate congestive packet losses possible over a measurement interval. Thus, any compromised router can drop that many packets without being detected. Unfortunately, given the nature of the dominant TCP, even small numbers of losses can have significant impacts.
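The counter-based conservation-of-flow check described above can be sketched as follows. This is a minimal, single-process illustration under assumed names (the `Router` class and its methods are hypothetical), not the paper's distributed implementation:

```python
from collections import defaultdict

class Router:
    """Sketch of per-peer conservation-of-flow counters (illustrative only)."""

    def __init__(self, name):
        self.name = name
        # outbound[x]: packets forwarded along path segments ending at router x
        self.outbound = defaultdict(int)
        # inbound[x]: packets received via path segments beginning at router x
        self.inbound = defaultdict(int)

    def forward(self, segment_end):
        # A packet is forwarded toward a path segment ending at `segment_end`.
        self.outbound[segment_end] += 1

    def receive(self, segment_start):
        # A packet arrives via a path segment beginning at `segment_start`.
        self.inbound[segment_start] += 1

    def validate(self, peer):
        """Compare the peer's claimed outbound count against our inbound count.
        Returns the number of packets missing on the segment from `peer` to us."""
        claimed_sent = peer.outbound[self.name]
        counted_received = self.inbound[peer.name]
        return claimed_sent - counted_received

# Example: x claims to have sent 100 packets to r; r counted only 97 arrivals.
x, r = Router("x"), Router("r")
for _ in range(100):
    x.forward("r")
for _ in range(97):
    r.receive("x")
losses = r.validate(x)
print(losses)  # 3 packets are missing
```

Note what the check does and does not provide: the validator learns only that 3 of 100 packets went missing, with no way to tell congestive drops from malicious ones. That residual ambiguity is precisely what forces these systems onto a threshold.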
Subtle attackers can selectively target the traffic flows of a single victim and, within these flows, drop only those packets that cause the most harm. For example, losing a TCP SYN packet used in connection establishment has a disproportionate impact on a host because the retransmission time-out must necessarily be very long (typically 3 seconds or more). Other seemingly minor attacks that cause TCP time-outs can have similar effects, a class of attacks that has been well described in the literature.

All things considered, it is clear that the static threshold mechanism is inadequate, since it allows an attacker to mount vigorous attacks without being detected.

Instead of using a static threshold, if the probability of congestive losses can be modeled, then one could resolve ambiguities by comparing measured loss rates to the rates predicted by the model. One approach for doing this is to predict congestion analytically as a function of individual traffic flow parameters, since TCP explicitly responds to congestion. Indeed, the behavior of TCP has been extensively studied
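As background (a standard result from the TCP modeling literature, not a contribution of this paper), the kind of analytic model alluded to here typically takes the form of the well-known steady-state throughput relation, which ties a flow's achievable rate to its loss probability p:

```latex
% Steady-state TCP throughput as a function of loss rate (Mathis et al. model):
% MSS = maximum segment size, RTT = round-trip time, p = packet loss probability.
T \;\approx\; \frac{\mathrm{MSS}}{\mathrm{RTT}} \cdot \frac{C}{\sqrt{p}},
\qquad C = \sqrt{3/2}
```

The inverse dependence on \sqrt{p} is why even a small number of targeted drops is so damaging: modestly raising a victim flow's effective loss rate sharply cuts its throughput, while remaining well under any aggregate threshold.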