Though delay-tolerant networks (DTNs) arise in many situations and may take many forms, our terminology in this paper is slanted towards the particular example of rural area DTNs. The use of this concrete example aids exposition and provides motivation, but does not reduce the applicability of our work to other types of DTNs.
Figure 1 illustrates a typical rural area DTN.
Figure 1: A Typical Rural Area DTN
- The approach is applicable to villages and rural areas with no Internet connectivity due to geographic or economic constraints.
- There is an Internet connection available in a nearby town and a transport medium from the rural area to the town in the form of a vehicle, such as a bus or a car.
- The terminal with Internet connectivity is called the gateway. A transport medium that carries data from the end users in a village to a gateway is called a mobile router.
- There is also a special static router called a kiosk, which serves as a computing facility for DTN users. The kiosk also provides a persistent data transfer facility, so users do not have to wait for a mobile router to show up.
- There are two types of end users: mobile users, who use their own personal devices to connect directly to routers (typically a kiosk), and kiosk users, who use a shared terminal at a kiosk. Our anonymous secure communication architecture mainly targets mobile users. However, if a kiosk is trusted, our architecture provides equivalent security and anonymity to kiosk users.
Achieving security and privacy in such disconnected networks is a demanding task, but it is necessary in hostile environments with malicious attackers or even just passive listeners. In rural area DTNs, security and privacy are necessary to effectively implement concepts like e-governance, citizen journalism, distance education and telemedicine. In a hostile environment, secure and anonymous DTN communication can provide an efficient way to let informers transfer information while hiding their identity from an enemy. Therefore, the utility of a DTN is greatly expanded when the DTN provides end-to-end security and privacy. The limitations of DTNs require the design of new security and privacy protocols for DTNs, which forms the basis for this work.
Quantum cryptography involves a surprisingly elaborate suite of specialized protocols, which we term “QKD protocols.” Many aspects of these protocols are unusual – both in motivation and in implementation – and may be of interest to specialists in communications protocols.
Fig. 1: Effects of an unbalanced interferometer on a photon
We have designed this engine so it is easy to “plug in” new protocols, and expect to devote considerable time in coming years to inventing new QKD (Quantum Key Distribution) protocols and trying them in practice. As shown in Fig. 1, these protocols are best described as sub-layers within the QKD protocol suite. Note, however, that these layers do not correspond in any obvious way to the layers in a communications stack, e.g., the OSI layers. As will be seen, they are in fact closer to being pipeline stages.
Sifting is the process whereby Alice and Bob winnow away all the obvious “failed qubits” from a series of pulses. As described in the introduction to this section, these failures include those qubits where Alice’s laser never transmitted, Bob’s detectors didn’t work, photons were lost in transmission, and so forth. They also include those symbols where Alice chose one basis for transmission but Bob chose the other for receiving.
At the end of this round of protocol interaction – i.e. after a sift and sift response transaction – Alice and Bob discard all the useless symbols from their internal storage, leaving only those symbols that Bob received and for which Bob’s basis matches Alice’s. In general, sifting dramatically prunes the number of symbols held by Alice and Bob. For instance, assume that 1% of the photons that Alice tries to transmit are actually received at Bob and that the system noise rate is 0. On average, Alice and Bob will happen to agree on a basis 50% of the time in BB84. Thus only 50% x 1% of Alice’s photons give rise to a sifted bit, i.e., 1 photon in 200. A transmitted stream of 1,000 bits therefore would boil down to about 5 sifted bits.
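The sifting step above can be sketched in a few lines. This is an illustrative model only, not the interface of the protocol engine described here; the function name and its inputs (Alice's bits and bases, Bob's bases, and a mask of which pulses Bob actually detected) are assumptions for the sake of the example.

```python
def sift(alice_bits, alice_bases, bob_bases, received_mask):
    """Keep only symbols Bob actually detected and for which his
    measurement basis matches Alice's transmission basis."""
    sifted = []
    for bit, a_basis, b_basis, got in zip(alice_bits, alice_bases,
                                          bob_bases, received_mask):
        if got and a_basis == b_basis:
            sifted.append(bit)
    return sifted

# Four pulses: pulse 1 has mismatched bases, pulse 2 was never detected.
print(sift([0, 1, 1, 0], ['+', 'x', '+', 'x'],
           ['+', '+', '+', 'x'], [True, True, False, True]))  # [0, 0]
```

With the 1% detection rate and 50% basis agreement quoted above, a list of 1,000 transmitted bits run through such a filter would indeed retain roughly 5 entries on average.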
Error correction allows Alice and Bob to determine all the “error bits” among their shared, sifted bits, and correct them so that Alice and Bob share the same sequence of error-corrected bits. Error bits are ones that Alice transmitted as a 0 but Bob received as a 1, or vice versa. These bit errors can be caused by noise or by eavesdropping.
Error correction in quantum cryptography has a very unusual constraint, namely, evidence revealed in error detection and correction (e.g. parity bits) must be assumed to be known to Eve, and thus to reduce the hidden entropy available for key material. As a result, there is very strong motivation to design error detection and correction codes that reveal as little as possible in their public control traffic between Alice and Bob. Our first approach for error correction is a novel variant of the Cascade protocol and algorithms. The protocol is adaptive, in that it will not disclose too many bits if the number of errors is low, but it will accurately detect and correct a large number of errors (up to some limit) even if that number is well above the historical average.
Our version works by defining a number of subsets (currently 64) of the sifted bits and forming the parities of each subset. In the first message, the list of subsets and their parities is sent to the other side, which then replies with its version of the parities. The subsets are pseudo-random bit strings, from a Linear-Feedback Shift Register (LFSR) and are identified by a 32-bit seed for the LFSR. Once an error bit has been found and fixed, both sides inspect their records of subsets and subranges, and flip the recorded parity of those that contained that bit. This will clear up some discrepancies but may introduce other new ones, and so the process continues.
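The subset-parity exchange can be sketched as follows. Note the LFSR taps used here are illustrative, not the actual polynomial of the implementation described, and the helper names are invented for the example; the point is only that a 32-bit seed deterministically identifies a whole pseudo-random subset, so both sides can recompute it without transmitting the subset itself.

```python
def lfsr_subset(seed, n):
    """Derive a pseudo-random subset of n sifted-bit positions from a
    32-bit LFSR seed.  Taps below are illustrative, not the real ones."""
    state = seed & 0xFFFFFFFF
    subset = []
    for i in range(n):
        bit = state & 1
        # Galois-style LFSR step: shift, XOR taps when output bit is 1.
        state = (state >> 1) ^ (0x80200003 if bit else 0)
        if bit:
            subset.append(i)
    return subset

def parity(bits, subset):
    """XOR of the bits selected by the subset."""
    p = 0
    for i in subset:
        p ^= bits[i]
    return p

# Alice sends (seed, parity); Bob recomputes the same subset from the seed.
# A parity mismatch means the subset holds an odd number of error bits,
# which can then be localized by binary search within the subset.
```

Because a flipped bit inside a subset flips that subset's parity, fixing one error forces both sides to update the recorded parities of every subset containing it, exactly as described above.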
Since these parity fields are revealed in the interchange of “error correction” messages between Alice and Bob, these bits must be taken as known to Eve. Therefore, the QKD protocol engine records the amount of information revealed (lost) due to parity fields, and later requires a compensating level of privacy amplification to reduce Eve’s knowledge to acceptable levels.
Privacy amplification is the process whereby Alice and Bob reduce Eve’s knowledge of their shared bits to an acceptable level. This technique is also often called advantage distillation.
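The essential operation is compression: Alice and Bob hash their shared bits down to a shorter string, so that Eve's partial knowledge of the input tells her essentially nothing about the output. Real systems draw the hash from a 2-universal family chosen by public coin flips; the sketch below substitutes SHA-256 purely as a stand-in compressor, and the function name and salt parameter are assumptions for the example.

```python
import hashlib

def privacy_amplify(bits, out_len, salt):
    """Compress a list of shared 0/1 bits down to out_len bits.
    A real QKD system would use a randomly selected universal hash;
    SHA-256 with a public salt is only an illustrative stand-in."""
    digest = hashlib.sha256(bytes(bits) + salt).digest()
    # Unpack the first out_len bits of the digest.
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(out_len)]
```

Since Alice and Bob apply the same function to the same corrected bit string, they obtain identical shorter keys, while the compression discounts the parity bits Eve learned during error correction.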
Authentication allows Alice and Bob to guard against “man in the middle attacks,” i.e., allows Alice to ensure that she is communicating with Bob (and not Eve) and vice versa. Authentication must be performed on an ongoing basis for all key management traffic, since Eve may insert herself into the conversation between Alice and Bob at any stage in their communication. The original BB84 paper described the authentication problem and sketched a solution to it based on universal families of hash functions, introduced by Wegman and Carter. This approach requires Alice and Bob to already share a small secret key, which is used to select a hash function from the family to generate an authentication hash of the public correspondence between them. By the nature of universal hashing, any party who didn’t know the secret key would have an extremely low probability of being able to forge the correspondence, even an adversary with unlimited computational power. The drawback is that the secret key bits cannot be re-used even once on different data without compromising the security. Fortunately, a complete authenticated conversation can validate a large number of new, shared secret bits from QKD, and a small number of these may be used to replenish the pool.
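A minimal sketch of the idea, for fixed-length messages only: the classic strongly universal family h_{a,b}(m) = (a·m + b) mod p, keyed by two secret values drawn from the shared pool. The modulus and encoding below are illustrative choices, not the construction of the system described; practical schemes first compress long, variable-length transcripts (e.g. by polynomial evaluation) before applying such a family.

```python
P = (1 << 61) - 1  # a Mersenne prime; an illustrative modulus choice

def wc_tag(message: bytes, key_a: int, key_b: int) -> int:
    """Authentication tag from the strongly universal family
    h_{a,b}(m) = (a*m + b) mod p.  The keys (a, b) come from the
    shared secret pool and must never be reused on different data."""
    m = int.from_bytes(message, "big") % P
    return (key_a * m + key_b) % P
```

Because the family is strongly universal, an attacker without (a, b) can forge a valid tag for an altered message with probability at most 1/p, regardless of computational power, which is exactly the property the text attributes to Wegman-Carter hashing.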
There are many further details in a practical system which we will only mention in passing, including symmetrically authenticating both parties, limiting the opportunities for Eve to force exhaustion of the shared secret key bits, and adapting the system to network asynchrony and retransmissions. Another important point: it is insufficient to authenticate just the QKD protocols; we must also apply these techniques to authenticate the VPN data traffic.
Broadly stated, QKD (Quantum Key Distribution) offers a technique for coming to agreement upon a shared random sequence of bits within two distinct devices, with a very low probability that other devices (eavesdroppers) will be able to make successful inferences as to those bits’ values. In specific practice, such sequences are then used as secret keys for encoding and decoding messages between the two devices. Viewed in this light, QKD is quite clearly a key distribution technique, and one can rate QKD’s strengths against a number of important goals for key distribution, as summarized in the following paragraphs.
Confidentiality of Keys: Confidentiality is the main reason for interest in QKD. Public key systems suffer from an ongoing uncertainty as to whether decryption truly is mathematically intractable. Thus key agreement primitives widely used in today’s Internet security architecture, e.g., Diffie-Hellman, may perhaps be broken at some point in the future. This would not only hinder future ability to communicate but could also reveal past traffic. Classic secret key systems have suffered from different problems, namely, insider threats and the logistical burden of distributing keying material. Assuming that QKD techniques are properly embedded into an overall secure system, they can provide automatic distribution of keys that may offer security superior to that of their competitors.
Authentication: QKD does not in itself provide authentication. Current strategies for authentication in QKD systems include prepositioning of secret keys at pairs of devices, to be used in hash-based authentication schemes, or hybrid QKD-public key techniques. Neither approach is entirely appealing. Prepositioned secret keys require some means of distributing these keys before QKD itself begins, e.g., by human courier, which may be costly and logistically challenging. Furthermore, this approach appears open to denial of service attacks in which an adversary forces a QKD system to exhaust its stockpile of key material, at which point it can no longer perform authentication. On the other hand, hybrid QKD-public key schemes inherit the possible vulnerabilities of public key systems to cracking via quantum computers or unexpected advances in mathematics.
Sufficiently Rapid Key Delivery: Key distribution systems must deliver keys fast enough so that encryption devices do not exhaust their supply of key bits. This is a race between the rate at which keying material is put into place and the rate at which it is consumed for encryption or decryption activities. Today’s QKD systems achieve on the order of 1,000 bits/second throughput for keying material, in realistic settings, and often run at much lower rates. This is unacceptably low if one uses these keys in certain ways, e.g., as one-time pads for high speed traffic flows. However it may well be acceptable if the keying material is used as input for less secure (but often secure enough) algorithms such as the Advanced Encryption Standard. Nonetheless, it is both desirable and possible to greatly improve upon the rates provided by today’s QKD technology.
Robustness: This has not traditionally been taken into account by the QKD community. However, since keying material is essential for secure communications, it is extremely important that the flow of keying material not be disrupted, whether by accident or by the deliberate acts of an adversary (i.e. by denial of service). Here QKD has provided a highly fragile service to date, since QKD techniques have implicitly been employed along a single point-to-point link. If that link were disrupted, whether by active eavesdropping or indeed by a fiber cut, all flow of keying material would cease. In our view a meshed QKD network is inherently far more robust than any single point-to-point link, since it offers multiple paths for key distribution.
Distance- and Location-Independence: In the ideal world, any entity can agree upon keying material with any other (authorized) entity in the world. Rather remarkably, the Internet’s security architecture does offer this feature – any computer on the Internet can form a security association with any other, agreeing upon keys through the Internet IPsec protocols. This feature is notably lacking in QKD, which requires the two entities to have a direct and unencumbered path for photons between them, and which can only operate for a few tens of kilometers through fiber.
Resistance to Traffic Analysis: Adversaries may be able to perform useful traffic analysis on a key distribution system, e.g., a heavy flow of keying material between two points might reveal that a large volume of confidential information flows, or will flow, between them. It may thus be desirable to impede such analysis. Here QKD in general has had a rather weak approach, since most setups have assumed dedicated, point-to-point QKD links between communicating entities, which clearly lays out the underlying key distribution relationships.
The concept of a mobile router has been discussed among researchers of ad-hoc and mobile Internet networks for some time. When operating in the context of Internet-style protocols, mobile routers generally present challenges with respect to node addressing and routing performance when the topology or link performance changes. These challenges have their analogues in DTN, but with a somewhat different set of constraints. Once again, while we can learn from the approaches considered for ad-hoc routing and Mobile IP, the applicability of these approaches is limited because, generally speaking, the network model for these efforts is the traditional (static) network where the nodes are fully connected.
In the context of DTN, a mobile router is a device capable of store-and-forward operations, and thus represents more of a data carrying entity than a conventional router that happens to be moving. These data carrying “routers” have been called Data Mules. This term has emerged to describe situations where some entity (the mule) provides storage for messages and is physically moved from point to point to provide transit connectivity. This method of message transport is not purely academic fantasy: the DakNet project connects hard-to-reach villages via limited range RF transmissions from a portable computer on a commuter bus that performs store-and-forward data transfers (via SMTP). In another example, the Wizzy Digital Courier project in South Africa transfers e-mail messages and Web searches on a USB key that is carried by a bicycle or motorcycle rider between schools. Finally, the postal system has been proposed as a viable message carrying entity, and entire computers are sometimes shipped in order to move very large quantities of data. The benefit of data mules is essentially the possibility of extremely high capacity and throughput in exchange for large delivery delays.
While the DTN architecture embraces the concept of data mules, it appears to suggest abstracting them as edges in the DTN network multigraph. This may seem most appropriate, as the mule intuitively appears to be a communication link with a particularly high delay (and possibly high capacity). However, a mule generally has finite storage which is unequal to its bandwidth-delay product. We therefore consider the following question: Should the Data Mule be modeled as a node or a link?
A network link (graph edge) is generally characterized by its delay and throughput and the vertices it interconnects. These edges are directional, so that different performance characteristics can be captured for asymmetric bidirectional links. Links are typically considered to be passive in that they execute neither path selection nor scheduling decisions. Nodes, conversely, tend to be thought of as active entities (with finite storage) that make forwarding decisions about the messages transiting through them.
Using the active/passive characterization of nodes and edges, it then seems natural to represent a mule as a node instead of an edge, given its activity with respect to message routing. If we apply the same reasoning to passive entities (e.g. USB drives or DVDs), then we naturally conclude they should be characterized as edges. However, as we shall now explore, this method of partitioning may reveal a false dichotomy.
To make effective forwarding decisions, nodes should maintain state about their communication contacts and act on this state as necessary. This is straightforwardly implemented for an active device (a bus with a computer aboard), but less clear for a passive device. Taking a USB drive as an example, it is fairly simple to arrange for the drive to store state representing, at a minimum, the set of nodes that it has encountered (and/or will likely encounter). With this in mind, there is little fundamental distinction between a USB drive (passive, storage only) and a router-equipped bus (active, storage and processing); both can be considered mules. One could easily imagine a software architecture wherein inserting a USB drive into a host machine causes that host to automatically make a set of forwarding decisions based on state stored on the drive itself. In this way, the USB drive (in conjunction with its host) resembles a message router that employs checkpointing to persistent storage. In other words, the USB drive is a node that remains dormant until “activated” by a host.
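The "dormant node" idea above can be sketched concretely: the drive carries its routing state as a small file (JSON here, purely as an illustrative encoding), and the host replays that state on mount to decide which local bundles to copy aboard. All field names (`likely_contacts`, `destination`) are invented for this example, not part of any DTN specification.

```python
import json

def forward_decisions(drive_state_json, local_bundles):
    """On mount, the host reads the state stored on the drive and selects
    which locally queued bundles to copy aboard.  The drive itself stays
    passive; the host performs the routing computation on its behalf."""
    state = json.loads(drive_state_json)
    likely = set(state["likely_contacts"])  # nodes the drive will meet
    return [b for b in local_bundles if b["destination"] in likely]

state = json.dumps({"likely_contacts": ["kiosk-A", "gateway"]})
bundles = [{"id": 1, "destination": "gateway"},
           {"id": 2, "destination": "kiosk-B"}]
print(forward_decisions(state, bundles))  # only bundle 1 is loaded
```

Here the routing logic runs on whatever host the drive is plugged into, which is exactly why the node/edge distinction blurs: the decision process travels with the storage.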
Thus, while the DTN architecture embraces the concept of mobile nodes and data mules, it does not fully specify how they should be included in the network model. Through the careful consideration of mobility and mules, we have refined the DTN architectural description in order to provide a basis for the design of our implementation.
The core of the Web is formed by document standards and exchange protocols. Here we describe tree-structured documents transferred as SGML or XML. HTML, an SGML application, is the most commonly used document format on the Web. HTML represents documents as a tree using a fixed set of elements (tags), where the SGML DTD (Document Type Definition) puts constraints on how elements can be nested. Each node in the hierarchy has a name (the element-name), a set of name-value pairs known as its attributes, and content: a sequence of sub-elements and text (data).
XML is a rationalisation of SGML using the same tree-model, but removing many rarely used features as well as abbreviations that were introduced in SGML to make the markup easier for humans to type and read. XML documents are used to represent text using custom application-oriented tags, as well as serving as a serialization format for arbitrary data exchange between computers. XHTML is HTML based on XML rather than SGML. A stable Prolog term-representation for SGML/XML trees plays a role similar to that of the DOM (Document Object Model) representation in use in the object-oriented world.
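The term-representation in question maps each element to a Prolog term of the shape element(Name, Attributes, Content). A Python analogue (used here only because the examples in this rewrite are in Python; the tuple layout mirrors the Prolog term, it is not the library's API) makes the structure concrete:

```python
import xml.etree.ElementTree as ET

def to_term(elem):
    """Mirror the Prolog element(Name, Attributes, Content) term as a
    nested Python tuple.  Attributes appear as (Name, Value) pairs and
    Content is a list mixing text strings and sub-element terms."""
    content = []
    if elem.text and elem.text.strip():
        content.append(elem.text.strip())
    for child in elem:
        content.append(to_term(child))
        if child.tail and child.tail.strip():
            content.append(child.tail.strip())
    return ("element", elem.tag, sorted(elem.attrib.items()), content)

doc = ET.fromstring('<p class="x">Hello <b>world</b></p>')
print(to_term(doc))
# ('element', 'p', [('class', 'x')], ['Hello', ('element', 'b', [], ['world'])])
```

As in the Prolog representation, the tree structure lives in the nesting while text and attribute values are atomic leaves, which is what makes the representation stable and easy to traverse.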
Some issues have been subject to debate.
- Representation of text by a Prolog atom is biased by the use of SWI-Prolog, which has no length limit on atoms, and whose atoms can represent Unicode text. At the same time, SWI-Prolog stacks are limited to 128 MB each. Using atoms, only the structure of the tree is represented on the stack, while the bulk of the data is stored on the unlimited heap. Using lists of character codes is another possibility, adopted by both PiLLoW and ECLiPSe. Two observations make lists less attractive: lists use two cells per character, while practical experience shows text is frequently processed as a unit only. For (HTML) text-documents we profit from the compact representation of atoms. For XML documents representing serialized data-structures we profit from the frequent repetition of the same value.
- Attribute values of multi-value attributes (e.g. NAMES) are returned as a Prolog list. This implies the DTD must be available to get unambiguous results. With SGML this is always true, but not with XML.
- Optionally, attribute values of type NUMBER or NUMBERS are mapped to Prolog numbers. In addition to the DTD issues mentioned above, this conversion also suffers from possible loss of information. Leading zeros and the particular floating-point notation used are lost after conversion. Prolog systems with bounded arithmetic may also not be able to represent all values. Still, automatic conversion is useful in many applications, especially those involving serialized data-structures.
- Attribute values are represented as Name=Value. Using Name(Value) is an alternative. The Name=Value representation was chosen for its similarity to the SGML notation and because it avoids the need for univ (=..) for processing argument-lists.
Implementation: The SWI-Prolog SGML/XML parser is implemented as a C-library that has been built from scratch to arrive at a lightweight parser. The total source is 11,835 lines. The parser provides two interfaces. Most natural to Prolog is load_structure(+Src, -DOM, +Options), which parses a Prolog stream into a term as described above. Alternatively, sgml_parse/2 provides an event-based parser making call-backs on Prolog for the SGML events. The call-back mode can deal with unbounded documents in streaming mode. It can be mixed with the term-creation mode, where the handler for begin calls the parser to create a term-representation for the content of the element. This feature is used to process long files with a repetitive record structure in limited memory.
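The event-based mode is the same idea as SAX-style parsing: instead of building a tree, the parser fires a callback per begin/end event, so memory use stays bounded regardless of document size. A Python sketch of the pattern (standing in for the Prolog call-back interface, not reproducing it) using the standard library's SAX parser:

```python
import xml.sax

class Events(xml.sax.ContentHandler):
    """Callback-style parsing in the spirit of sgml_parse/2: one handler
    call per event, so unbounded documents can be streamed rather than
    materialized as a whole tree."""
    def __init__(self):
        self.log = []
    def startElement(self, name, attrs):
        self.log.append(("begin", name))
    def endElement(self, name):
        self.log.append(("end", name))

handler = Events()
xml.sax.parseString(b"<a><b/></a>", handler)
print(handler.log)
# [('begin', 'a'), ('begin', 'b'), ('end', 'b'), ('end', 'a')]
```

Mixing modes as described above corresponds to a begin-handler that hands control to a tree-builder for just that element's content, which is how a long file of repetitive records can be processed one record-term at a time in limited memory.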