Abstract:
Censorship circumvention systems such as Tor are highly vulnerable to
network-level filtering. Because the traffic generated by these systems
is disjoint from normal network traffic, it is easy to recognize and
block, and once the censors identify network servers (e.g., Tor bridges)
assisting in circumvention, they can locate all of their users.
CloudTransport is a new censorship-resistant communication system that hides users' network traffic by tunneling it through a cloud storage service such as Amazon S3. The goal of CloudTransport is to increase the censors' economic and social costs by forcing them to use more expensive forms of network filtering, such as large-scale traffic analysis, or else risk disrupting normal cloud-based services and thus causing collateral damage even to the users who are not engaging in circumvention.
CloudTransport's novel passive-rendezvous protocol ensures that there are no direct connections between a CloudTransport client and a CloudTransport bridge. Therefore, even if the censors identify a CloudTransport connection or the IP address of a CloudTransport bridge, this does not help them block the bridge or identify other connections.
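The passive-rendezvous idea can be illustrated with a minimal sketch. The class and key names below are hypothetical, and an in-memory dictionary stands in for a real cloud storage bucket; the point is only that client and bridge each talk to the storage provider, never to each other.

```python
class CloudStore:
    """Stands in for a cloud storage bucket (e.g., Amazon S3): the client
    and the bridge communicate only through the storage provider."""
    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        # Retrieve and remove the object, if present.
        return self._objects.pop(key, None)


def client_send(store, session_key, payload):
    # The client uploads its traffic as an object under a per-session key.
    store.put(f"{session_key}/up", payload)


def bridge_poll(store, session_key):
    # The bridge polls the same key; a censor observes only client<->cloud
    # and bridge<->cloud connections, no direct client<->bridge link.
    return store.get(f"{session_key}/up")


store = CloudStore()
client_send(store, "sess42", b"GET / HTTP/1.1")
data = bridge_poll(store, "sess42")
```

Because the bridge only ever appears as another cloud customer, learning its IP address does not let the censor enumerate or block clients.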
CloudTransport can be used as a standalone service, a gateway to an anonymity network like Tor, or a pluggable transport for Tor. It does not require any modifications to the existing cloud storage, is compatible with multiple cloud providers, and hides the user's Internet destinations even if the provider is compromised.
Abstract:
Modern network security rests on the Secure Sockets Layer (SSL) and
Transport Layer Security (TLS) protocols. Distributed systems, mobile
and desktop applications, embedded devices, and the entire secure Web rely
on SSL/TLS for protection against network attacks. This protection
critically depends on whether SSL/TLS clients correctly validate X.509
certificates presented by servers during the SSL/TLS handshake protocol.
We design, implement, and apply the first methodology for large-scale testing of certificate validation logic in SSL/TLS implementations. Our first ingredient is "frankencerts," synthetic certificates that are randomly mutated from parts of real certificates and thus include unusual combinations of extensions and constraints. Our second ingredient is differential testing: if one SSL/TLS implementation accepts a certificate while another rejects the same certificate, we use the discrepancy as an oracle for finding flaws in individual implementations.
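The differential-testing oracle itself is simple to state; the sketch below uses toy validators (the names and the certificate fields are illustrative stand-ins for real SSL/TLS implementations and real X.509 certificates).

```python
from itertools import combinations

def differential_oracle(cert, validators):
    """Run every validator on the same certificate and return the pairs
    of implementations that disagree; any disagreement flags a potential
    certificate-validation flaw in at least one of them."""
    verdicts = {name: accept(cert) for name, accept in validators.items()}
    return [(a, b) for a, b in combinations(verdicts, 2)
            if verdicts[a] != verdicts[b]]

# Toy validators: the "lenient" one (incorrectly) never verifies that the
# issuing certificate is authorized to act as a certificate authority.
validators = {
    "strict":  lambda c: c["issuer_is_ca"] and not c["expired"],
    "lenient": lambda c: not c["expired"],   # misses the CA check
}

# A frankencert-style input: unexpired, but issued by a non-CA certificate.
frankencert = {"issuer_is_ca": False, "expired": False}
discrepancies = differential_oracle(frankencert, validators)
```

In the real system, the randomly mutated frankencerts supply the unusual inputs, and discrepancies between implementations such as OpenSSL and GnuTLS are then triaged by hand.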
Differential testing with frankencerts uncovered 208 discrepancies between popular SSL/TLS implementations such as OpenSSL, NSS, CyaSSL, GnuTLS, PolarSSL, MatrixSSL, etc. Many of them are caused by serious security vulnerabilities. For example, any server with a valid X.509 version 1 certificate can act as a rogue certificate authority and issue fake certificates for any domain, enabling man-in-the-middle attacks against MatrixSSL and GnuTLS. Several implementations also accept certificate authorities created by unauthorized issuers, as well as certificates not intended for server authentication.
We also found serious vulnerabilities in how users are warned about certificate validation errors. When presented with an expired, self-signed certificate, NSS, Safari, and Chrome (on Linux) report that the certificate has expired — a low-risk, often ignored error — but not that the connection is insecure against a man-in-the-middle attack.
These results demonstrate that automated adversarial testing with frankencerts is a powerful methodology for discovering security flaws in SSL/TLS implementations.
Abstract:
Hybrid mobile applications (apps) combine the features of Web
applications and "native" mobile apps. Like Web applications, they
are implemented in portable, platform-independent languages such as
HTML and JavaScript. Like native apps, they have direct access to local
device resources — file system, location, camera, contacts, etc.
Hybrid apps are typically developed using hybrid application frameworks such as PhoneGap. The purpose of the framework is twofold. First, it provides an embedded Web browser (for example, WebView on Android) that executes the app's Web code. Second, it supplies "bridges" that allow Web code to escape the browser and access local resources on the device.
We analyze the software stack created by hybrid frameworks and demonstrate that it does not properly compose the access-control policies governing Web code and local code, respectively. Web code is governed by the same origin policy, whereas local code is governed by the access-control policy of the operating system (for example, user-granted permissions in Android). The bridges added by the framework to the browser have the same local access rights as the entire application, but are not correctly protected by the same origin policy. This opens the door to fracking attacks, which allow foreign-origin Web content included into a hybrid app (e.g., ads confined in iframes) to drill through the layers and directly access device resources. Fracking vulnerabilities are generic: they affect all hybrid frameworks, all embedded Web browsers, all bridge mechanisms, and all platforms on which these frameworks are deployed.
We study the prevalence of fracking vulnerabilities in free Android apps based on the PhoneGap framework. Each vulnerability exposes sensitive local resources — the ability to read and write contacts list, local files, etc. — to dozens of potentially malicious Web domains. We also analyze the defenses deployed by hybrid frameworks to prevent resource access by foreign-origin Web content and explain why they are ineffectual.
We then present NoFrak, a capability-based defense against fracking attacks. NoFrak is platform-independent, compatible with any framework and embedded browser, requires no changes to the code of the existing hybrid apps, and does not break their advertising-supported business model.
Abstract:
Decoy routing is a recently proposed approach for censorship
circumvention. It relies on cooperating ISPs in the middle of
the Internet to deploy so-called "decoy routers" that proxy
network traffic from users in the censorship region. A recent study,
published in an award-winning CCS 2012 paper, suggested that censors in
highly connected countries like China can easily defeat decoy routing
by selecting Internet routes that do not pass through the decoys.
This attack is known as "routing around decoys" (RAD).
In this paper, we perform an in-depth analysis of the true costs of the RAD attack, based on actual Internet data. Our analysis takes into account not just the Internet topology, but also business relationships between ISPs, monetary and performance costs of different routes, etc. We demonstrate that even for the most vulnerable decoy placement assumed in the RAD study, the attack is likely to impose tremendous costs on the censoring ISPs. They will be forced to switch to much more costly routes and suffer from degradation in the quality of service.
We then demonstrate that a more strategic placement of decoys will further increase the censors' costs and render the RAD attack ineffective. We also show that the attack is even less feasible for censors in countries that are not as connected as China since they have many fewer routes to choose from.
The first lesson of our study is that defeating decoy routing by simply selecting alternative Internet routes is likely to be prohibitively expensive for the censors. The second, even more important lesson is that a fine-grained, data-driven approach is necessary for understanding the true costs of various route selection mechanisms. Analyses based solely on the graph topology of the Internet may lead to mistaken conclusions about the feasibility of decoy routing and other censorship circumvention techniques based on interdomain routing.
Abstract:
Code injection attacks continue to plague applications that incorporate
user input into executable programs. For example, SQL injection
vulnerabilities rank fourth among all bugs reported in CVE, yet all
previously proposed methods for detecting SQL injection attacks suffer
from false positives and false negatives.
This paper describes the design and implementation of Diglossia, a new tool that precisely and efficiently detects code injection attacks on server-side Web applications generating SQL and NoSQL queries. The main problems in detecting injected code are (1) recognizing code in the generated query, and (2) determining which parts of the query are tainted by user input. To recognize code, Diglossia relies on the precise definition due to Ray and Ligatti. To identify tainted characters, Diglossia dynamically maps all application-generated characters to shadow characters that do not occur in user input and computes shadow values for all input-dependent strings. Any original characters in a shadow value are thus exactly the taint from user input.
Our key technical innovation is dual parsing. To detect injected code in a generated query, Diglossia parses the query in tandem with its shadow and checks that (1) the two parse trees are syntactically isomorphic, and (2) all code in the shadow query is in shadow characters and, therefore, originated from the application itself, as opposed to user input.
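A stripped-down sketch of the shadow-character idea follows. Whitespace tokenization stands in for Diglossia's actual dual SQL parse, and the keyword set is illustrative; the invariant is the same: application characters are mapped one-for-one into a shadow alphabet that cannot occur in user input, so any code token whose shadow counterpart contains original characters must have come from user input.

```python
SHADOW_OFFSET = 0xE000  # private-use Unicode range, never in user input

def to_shadow(s):
    return "".join(chr(ord(c) + SHADOW_OFFSET) for c in s)

def is_shadow(c):
    return ord(c) >= SHADOW_OFFSET

def build_query(parts):
    """parts: list of (text, from_app). Build the query and, in tandem,
    its shadow: application characters become shadow characters, user
    input is kept as-is, so taint is exactly the non-shadow characters."""
    query  = "".join(text for text, _ in parts)
    shadow = "".join(to_shadow(text) if from_app else text
                     for text, from_app in parts)
    return query, shadow

SQL_KEYWORDS = {"SELECT", "FROM", "WHERE", "OR", "AND", "UNION"}

def injected_code(query, shadow):
    """Tokenize the query and its shadow in tandem; flag code tokens
    whose shadow counterpart contains original (user-supplied) chars."""
    flagged, pos = [], 0
    for tok in query.split():
        start = query.index(tok, pos)
        pos = start + len(tok)
        stok = shadow[start:pos]
        if tok.upper() in SQL_KEYWORDS and not all(map(is_shadow, stok)):
            flagged.append(tok)
    return flagged

app_head = "SELECT * FROM users WHERE name = '"
app_tail = "';"
q_ok,  s_ok  = build_query([(app_head, True), ("alice", False), (app_tail, True)])
q_bad, s_bad = build_query([(app_head, True), ("' OR '1'='1", False), (app_tail, True)])
```

For the benign input no code token is tainted, while the classic `' OR '1'='1` injection surfaces an `OR` keyword built from user characters.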
We demonstrate that Diglossia accurately detects both SQL and NoSQL code injection attacks while avoiding the false positives and false negatives of prior methods. By recasting the problem of detecting injected code as a string propagation and parsing problem, we gain substantial improvements in efficiency and precision over prior work. Our approach does not require any changes to the databases, Web servers, or Web browsers, adds virtually unnoticeable performance overhead, and is deployable today.
Abstract:
Genome-wide association studies (GWAS) have become a popular method
for analyzing sets of DNA sequences in order to discover the genetic
basis of disease. Unfortunately, statistics published as the result of
GWAS can be used to identify individuals participating in the study.
To prevent privacy breaches, even previously published results have
been removed from public databases, impeding researchers' access to
the data and hindering collaborative research. Existing techniques for
privacy-preserving GWAS focus on answering specific questions, such as
correlations between a given pair of SNPs (DNA sequence variations).
This does not fit the typical GWAS process, where the analyst may not
know in advance which SNPs to consider and which statistical tests to use,
how many SNPs are significant for a given dataset, etc.
We present a set of practical, privacy-preserving data mining algorithms for GWAS datasets. Our framework supports exploratory data analysis, where the analyst does not know a priori how many and which SNPs to consider. We develop privacy-preserving algorithms for computing the number and location of SNPs that are significantly associated with the disease, the significance of any statistical test between a given SNP and the disease, any measure of correlation between SNPs, and the block structure of correlations. We evaluate our algorithms on real-world datasets and demonstrate that they produce significantly more accurate results than prior techniques while guaranteeing differential privacy.
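The basic building block of such algorithms can be sketched with the Laplace mechanism. For illustration only, the query below counts SNPs whose p-value falls below a significance threshold and assumes sensitivity 1; the paper derives the correct sensitivity for each GWAS statistic, and the function names here are hypothetical.

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling of the Laplace distribution with the given scale.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_significant_snp_count(p_values, threshold, epsilon, rng):
    """Differentially private count of significant SNPs: the true count
    plus Laplace noise of scale sensitivity/epsilon (sensitivity taken
    as 1 here purely for illustration)."""
    true_count = sum(p < threshold for p in p_values)
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(7)
p_values = [0.001, 0.2, 0.04, 0.6, 0.0003]   # toy per-SNP p-values
noisy = private_significant_snp_count(p_values, 0.05, epsilon=0.5, rng=rng)
```

The noise is unbiased, so repeated queries average out to the true count of 3 significant SNPs, while any single released value protects individual participants.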
Abstract:
Perceptual, "context-aware" applications that observe their environment
and interact with users via cameras and other sensors are becoming
ubiquitous on personal computers, mobile phones, gaming platforms,
household robots, and augmented-reality devices. This raises new
privacy risks.
We describe the design and implementation of Darkly, a practical privacy protection system for the increasingly common scenario where an untrusted, third-party perceptual application is running on a trusted device. Darkly is integrated with OpenCV, a popular computer vision library used by such applications to access visual inputs. It deploys multiple privacy protection mechanisms, including access control, algorithmic privacy transforms, and user audit.
We evaluate Darkly on 20 perceptual applications that perform diverse tasks such as image recognition, object tracking, security surveillance, and face detection. These applications run on Darkly unmodified or with very few modifications and minimal performance overheads vs. native OpenCV. In most cases, privacy enforcement does not reduce the applications' functionality or accuracy. For the rest, we quantify the tradeoff between privacy and utility and demonstrate that utility remains acceptable even with strong privacy protection.
Abstract:
In response to the growing popularity of Tor and other censorship
circumvention systems, censors in non-democratic countries have
increased their technical capabilities and can now recognize and block
network traffic generated by these systems on a nationwide scale.
New censorship-resistant communication systems such as SkypeMorph,
StegoTorus, and CensorSpoofer aim to evade censors' observations by
imitating common protocols like Skype and HTTP.
We demonstrate that these systems completely fail to achieve unobservability. Even a very weak, local censor can easily distinguish their traffic from the imitated protocols. We show dozens of passive and active methods that recognize even a single imitated session, without any need to correlate multiple network flows or perform sophisticated traffic analysis.
We enumerate the requirements that a censorship-resistant system must satisfy to successfully mimic another protocol and conclude that "unobservability by imitation" is a fundamentally flawed approach. We then present our recommendations for the design of unobservable communication systems.
Abstract:
We present πBox, a new application platform that prevents apps
from misusing information about their users. To strike a useful balance
between users' privacy and apps' functional needs, πBox shifts much
of the responsibility for protecting privacy from the app and its users
to the platform itself. To achieve this, πBox deploys (1) a sandbox
that spans the user's device and the cloud, (2) specialized storage and
communication channels that enable common app functionalities, and (3)
an adaptation of recent theoretical algorithms for differential privacy
under continual observation. We describe a prototype implementation
of πBox and show how it enables a wide range of useful apps with
minimal performance overhead and without sacrificing user privacy.
Abstract:
The postMessage mechanism in HTML5 enables Web content from different
origins to communicate with each other, thus relaxing the same origin
policy. It is especially popular in websites that include third-party
content. Each message contains accurate information about its origin,
but the receiver must check this information before accepting the message.
The responsibility for preventing cross-origin attacks is thus partially
delegated from the Web browser to the implementors of postMessage
receiver functions.
We collected postMessage receivers from the Alexa top 10,000 websites and found that many perform origin checks incorrectly or not at all. This results in exploitable vulnerabilities in 84 popular sites, including cross-site scripting and injection of arbitrary content into local storage.
We propose two defenses. The first uses pseudo-random tokens to authenticate the source of messages and is intended for the implementors of third-party content. The second, based on a Content Security Policy extension, is intended for website owners. The two defenses are independent and can be deployed jointly or separately.
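One way to realize the token-based defense is to have the embedder and the third-party frame share a pseudo-random token out of band (e.g., in the iframe URL fragment) and authenticate each message under it. The sketch below uses an HMAC for this purpose; the function names and message format are illustrative, not the paper's exact scheme.

```python
import hashlib
import hmac
import secrets

def new_channel_token():
    # Shared out of band between the embedding page and the third-party
    # frame, e.g., passed in the iframe URL fragment.
    return secrets.token_bytes(16)

def sign(token, message):
    return hmac.new(token, message.encode(), hashlib.sha256).hexdigest()

def receive(token, message, tag):
    """Receiver-side check standing in for a postMessage handler: accept
    only messages carrying a valid MAC under the shared token, so a
    missing or wrong origin check no longer lets foreign frames inject
    messages."""
    return hmac.compare_digest(sign(token, message), tag)

token = new_channel_token()
msg = "resize:300x250"
tag = sign(token, msg)
ok = receive(token, msg, tag)                 # authentic message accepted
forged = receive(token, "steal:cookies", tag)  # tampered message rejected
```

Unlike an `event.origin` string comparison, this check cannot be subverted by a receiver that forgets to validate, as long as the token stays secret from other frames.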
Abstract:
Access-control policies in Web applications ensure that only authorized
users can perform security-sensitive operations. These policies usually
check user credentials before executing actions such as writing to
the database or navigating to privileged pages. Typically, every
Web application uses its own, hand-crafted program logic to enforce
access control. Within a single application, this logic can vary between
different user roles, e.g., administrator or regular user. Unfortunately,
developers often forget to include proper access-control checks.
This paper presents the design and implementation of FixMeUp, a static analysis and transformation tool that finds access-control errors of omission and produces candidate repairs. FixMeUp starts with a high-level specification that indicates the conditional statement of a correct access-control check and automatically computes an interprocedural access-control template (ACT), which includes all program statements involved in this instance of access-control logic. The ACT serves as both a low-level policy specification and a program transformation template. FixMeUp uses the ACT to find faulty access-control logic that misses some or all of these statements, inserts only the missing statements, and ensures that unintended dependences did not change the meaning of the access-control policy. FixMeUp then presents the transformed program to the developer, who decides whether to accept the proposed repair.
Our evaluation on ten real-world PHP applications shows that FixMeUp is capable of finding subtle access-control bugs and performing semantically correct repairs.
Abstract:
SSL (Secure Sockets Layer) is the de facto standard for secure Internet
communications. Security of SSL connections against an active network
attacker depends on correctly validating public-key certificates presented
when the connection is established.
We demonstrate that SSL certificate validation is completely broken in many security-critical applications and libraries. Vulnerable software includes Amazon's EC2 Java library and all cloud clients based on it; Amazon's and PayPal's merchant SDKs responsible for transmitting payment details from e-commerce sites to payment gateways; integrated shopping carts such as osCommerce, ZenCart, Ubercart, and PrestaShop; AdMob code used by mobile websites; Chase mobile banking and several other Android apps and libraries; Java Web-services middleware - including Apache Axis, Axis 2, Codehaus XFire, and Pusher library for Android - and all applications employing this middleware. Any SSL connection from any of these programs is insecure against a man-in-the-middle attack.
The root causes of these vulnerabilities are badly designed APIs of SSL implementations (such as JSSE, OpenSSL, and GnuTLS) and data-transport libraries (such as cURL) which present developers with a confusing array of settings and options. We analyze perils and pitfalls of SSL certificate validation in software based on these APIs and present our recommendations.
Abstract:
Modern systems keep long memories. As we show in this paper, an
adversary who gains access to a Linux system, even one that implements
secure deallocation, can recover the contents of applications' windows,
audio buffers, and data remaining in device drivers - long after the
applications have terminated.
We design and implement Lacuna, a system that allows users to run programs in "private sessions." After the session is over, all memories of its execution are erased. The key abstraction in Lacuna is an ephemeral channel, which allows the protected program to talk to peripheral devices while making it possible to delete the memories of this communication from the host. Lacuna can run unmodified applications that use graphics, sound, USB input devices, and the network, with only 20 percentage points of additional CPU utilization.
Abstract:
We describe a new side-channel attack. By tracking changes in the
application's memory footprint, a concurrent process belonging to
a different user can learn its secrets. Using Web browsers as the
target, we show how an unprivileged, local attack process - for example,
a malicious Android app - can infer which page the user is browsing,
as well as finer-grained information: whether she is a paid customer,
her interests, etc.
This attack is an instance of a broader problem. Many isolation mechanisms in modern systems reveal accounting information about program execution, such as memory usage and CPU scheduling statistics. If temporal changes in this public information are correlated with the program's secrets, they can lead to a privacy breach. To illustrate the pervasiveness of this problem, we show how to exploit scheduling statistics for keystroke sniffing in Linux and Android, and how to combine scheduling statistics with the dynamics of memory usage for more accurate adversarial inference of browsing behavior.
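The inference step can be sketched as signature matching over publicly readable accounting data. The page names and footprint traces below are made up; a real attacker would sample a victim's memory usage (e.g., from procfs) and compare the observed sequence of deltas against pre-recorded per-page signatures.

```python
def best_match(observed, signatures):
    """Match an observed sequence of memory-footprint samples against
    per-page signatures using squared distance; the page whose signature
    is closest is the attacker's guess for what the victim is browsing."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(signatures, key=lambda page: dist(observed, signatures[page]))

# Hypothetical signatures: memory-usage samples recorded while loading
# each candidate page in an identical browser.
signatures = {
    "bank.example/login": [120, 40, 310, 15],
    "mail.example/inbox": [300, 220, 80, 95],
    "news.example/front": [90, 500, 60, 400],
}

observed = [118, 44, 305, 20]   # noisy trace sampled from the victim
guess = best_match(observed, signatures)
```

No privilege is needed: the attack process only reads the same public statistics the OS exposes to any local user.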
Abstract:
We systematically describe two classes of evasion exploits against
automated malware detectors. Chameleon attacks confuse the
detectors' file-type inference heuristics, while werewolf attacks
exploit discrepancies in format-specific file parsing between the
detectors and actual operating systems and applications. These attacks
do not rely on obfuscation, metamorphism, binary packing, or any other
changes to malicious code. Because they enable even the simplest, easily
detectable viruses to evade detection, we argue that file processing
has become the weakest link of malware defense. Using a combination of
manual analysis and black-box differential fuzzing, we discovered 45 new
evasion exploits and tested them against 36 popular antivirus scanners,
all of which proved vulnerable to various chameleon and werewolf attacks.
Abstract:
Web applications written in languages such as PHP and JSP are notoriously
vulnerable to accidentally omitted authorization checks and other
security bugs. Existing techniques that find missing security checks in
library and system code assume that (1) security checks can be recognized
syntactically and (2) the same pattern of checks applies universally
to all programs. These assumptions do not hold for Web applications.
Each Web application uses different variables and logic to check the
user's permissions. Even within the application, security logic varies
based on the user's role, e.g., regular users versus
administrators.
This paper describes RoleCast, the first system capable of statically identifying security logic that mediates security-sensitive events (such as database writes) in Web applications, rather than taking a specification of this logic as input. We observe a consistent software engineering pattern - the code that implements distinct user role functionality and its security logic resides in distinct methods and files - and develop a novel algorithm for discovering this pattern in Web applications. Our algorithm partitions the set of file contexts (a coarsening of calling contexts) on which security-sensitive events are control dependent into roles. Roles are based on common functionality and security logic. RoleCast identifies security-critical variables and applies role-specific variable consistency analysis to find missing security checks. RoleCast discovered 13 previously unreported, remotely exploitable vulnerabilities in 11 substantial PHP and JSP applications, with only 3 false positives.
This paper demonstrates that (1) accurate inference of application- and role-specific security logic improves the security of Web applications without specifications, and (2) static analysis can discover security logic automatically by exploiting distinctive software engineering features.
Abstract:
Even experienced developers struggle to implement security policies
correctly. For example, despite 15 years of development, standard Java
libraries still suffer from missing and incorrectly applied permission
checks, which enable untrusted applications to execute native calls or
modify private class variables without authorization. Previous techniques
for static verification of authorization enforcement rely on manually
specified policies or attempt to infer the policy by code-mining. Neither
approach guarantees that the policy used for verification is correct.
In this paper, we exploit the fact that many modern APIs have multiple, independent implementations. Our flow- and context-sensitive analysis takes as input an API, multiple implementations thereof, and the definitions of security checks and security-sensitive events. For each API entry point, the analysis computes the security policies enforced by the checks before security-sensitive events such as native method calls and API returns, compares these policies across implementations, and reports the differences. Unlike code-mining, this technique finds missing checks even if they are part of a rare pattern. Security-policy differencing has no intrinsic false positives: implementations of the same API must enforce the same policy, or at least one of them is wrong!
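The differencing step can be sketched as a set comparison per entry point. The entry-point and check names below are toy stand-ins for the analysis results the paper computes over real Java Class Library implementations.

```python
def policy_diff(implementations):
    """For each API entry point, collect the set of security checks each
    implementation enforces before its security-sensitive events, and
    report the entry points where the sets differ. Implementations of
    the same API must enforce the same policy, so any difference means
    at least one of them is wrong."""
    diffs = {}
    for ep in set().union(*implementations.values()):
        policies = {name: frozenset(impl.get(ep, ()))
                    for name, impl in implementations.items()}
        if len(set(policies.values())) > 1:
            diffs[ep] = policies
    return diffs

# Toy per-entry-point policies: one implementation forgets the
# checkExec permission check before a native exec call.
sun     = {"File.delete": {"checkDelete"}, "Runtime.exec": {"checkExec"}}
harmony = {"File.delete": {"checkDelete"}, "Runtime.exec": set()}
diffs = policy_diff({"Sun": sun, "Harmony": harmony})
```

Because the oracle is agreement between independent implementations rather than a mined pattern, a check missing from only one implementation is flagged even if it follows no common idiom.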
Our analysis finds 20 new, confirmed security vulnerabilities and 11 interoperability bugs in the Sun, Harmony, and Classpath implementations of the Java Class Library, many of which were missed by prior analyses. These problems manifest in 499 entry points in these mature, well-studied libraries. Multiple API implementations are proliferating due to cloud-based software services and standardization of library interfaces. Comparing software implementations for consistency is a new approach to discovering "deep" bugs in them.
Abstract:
Web applications are vulnerable to semantic attacks such as denial of
service due to infinite loops caused by malicious inputs and unauthorized
database operations due to missing security checks. Unlike "conventional"
threats such as SQL injection and cross-site scripting, these attacks
exploit bugs in the logic of the vulnerable application and cannot be
discovered using data-flow analysis alone.
We give the first characterization of these types of vulnerabilities in PHP applications, develop novel inter-procedural algorithms for discovering them in PHP source code, and implement these algorithms as part of SAFERPHP, a framework for static security analysis of PHP applications. SAFERPHP uncovered multiple, previously unreported vulnerabilities in several popular Web applications.
Abstract:
Inter-domain routing in today's Internet is plagued by security and reliability issues (e.g., prefix hijacking), which are often caused by malicious or Byzantine misbehavior. We argue that route selection policies must move beyond static preferences that select routes on the basis of static attributes such as route length or which neighboring AS is advertising the route.
We prove that route convergence in the presence of Byzantine misbehavior requires that the route selection metric include the dynamics of route updates as a primary component. We then describe a class of simple dynamic policies which consider the observed "ages" of routes. These gerontocratic policies can be combined with static preferences and implemented without major infrastructural changes. They guarantee convergence when adopted universally, without sacrificing most of the flexibility that autonomous systems enjoy in route selection. We empirically demonstrate that even if adopted unilaterally by a single autonomous system, gerontocratic policies yield significantly more stable routes, are more effective at avoiding prefix hijacks, and are as responsive to legitimate route changes as other policies.
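A gerontocratic policy can be sketched in a few lines: prefer the route whose last update is oldest, breaking ties with a static preference such as path length. The route representation below is illustrative, not the paper's exact formulation.

```python
def gerontocratic_choice(routes, now):
    """Select the route with the greatest observed age (time since its
    last update); among equally old routes, prefer the shorter AS path.
    A simplified sketch of a dynamic, age-based selection policy."""
    return max(routes,
               key=lambda r: (now - r["last_update"], -len(r["as_path"])))

# Toy routing table: one long-stable route and one recently flapping one.
routes = [
    {"as_path": [3356, 174, 7018], "last_update": 100},  # stable for long
    {"as_path": [6939, 7018],      "last_update": 950},  # recently flapping
]
best = gerontocratic_choice(routes, now=1000)
```

The stable route wins despite its longer AS path, which is exactly the bias toward long-lived routes that penalizes an attacker who must keep announcing fresh hijacked routes.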
Abstract:
We present a new approach to verifying that a completely untrusted, platform-as-a-service cloud is correctly executing an outsourced Web application.
Abstract:
Many commercial websites use recommender systems to help customers locate
products and content. Modern recommenders are based on collaborative
filtering: they use patterns learned from users' behavior to make
recommendations, usually in the form of related-items lists. The scale
and complexity of these systems, along with the fact that their outputs
reveal only relationships between items (as opposed to information about
users), may suggest that they pose no meaningful privacy risk.
In this paper, we develop algorithms which take a moderate amount of auxiliary information about a customer and infer this customer's transactions from temporal changes in the public outputs of a recommender system. Our inference attacks are passive and can be carried out by any Internet user. We evaluate their feasibility using public data from popular websites Hunch, Last.fm, LibraryThing, and Amazon.
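The core of such an inference attack can be sketched as diffing public related-items lists over time. The item names, the voting threshold, and the data below are illustrative; the paper's algorithms use more careful scoring.

```python
def infer_transactions(aux_items, before, after, k=2):
    """Passive inference sketch: if a new item simultaneously appears in
    the related-items lists of at least k items the target customer is
    known (from auxiliary information) to have bought, infer that the
    customer also bought that new item."""
    votes = {}
    for item in aux_items:
        gained = set(after.get(item, [])) - set(before.get(item, []))
        for new_item in gained:
            votes[new_item] = votes.get(new_item, 0) + 1
    return {item for item, v in votes.items() if v >= k}

# Public related-items lists observed at two points in time.
before = {"book_A": ["book_X"], "book_B": ["book_Y"], "book_C": []}
after  = {"book_A": ["book_X", "book_Z"],
          "book_B": ["book_Y", "book_Z"],
          "book_C": ["book_Q"]}

# Auxiliary knowledge: the target customer bought books A, B, and C.
inferred = infer_transactions(["book_A", "book_B", "book_C"], before, after)
```

Only `book_Z` gathers enough correlated movement to be attributed to the target customer, while the isolated change to `book_C`'s list is ignored as noise.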
Abstract:
TxBox is a new system for sandboxing untrusted applications.
It speculatively executes the application in a system transaction,
allowing security checks to be parallelized and yielding significant
performance gains for techniques such as on-access anti-virus scanning.
TxBox is not vulnerable to TOCTTOU attacks and incorrect mirroring
of kernel state. Furthermore, TxBox supports automatic recovery: if
a violation is detected, the sandboxed program is terminated and all
of its effects on the host are rolled back. This enables effective
enforcement of security policies that span multiple system calls.
Abstract:
DNS cache poisoning is a serious threat to today's Internet. We develop a
formal model of the semantics of DNS caches, including the bailiwick rule
and trust-level logic, and use it to systematically investigate different
types of cache poisoning and to generate templates for attack payloads.
We explain the impact of the attacks on DNS resolvers such as BIND,
MaraDNS, and Unbound and their implications for several defenses against
DNS cache poisoning.
An essay on "personally identifiable information" and its role in digital privacy and privacy protection technologies.
Abstract:
Software developers are increasingly choosing memory-safe languages.
As a result, semantic vulnerabilities---omitted security checks,
misconfigured security policies, and other software design errors---are
supplanting memory-corruption exploits as the primary cause of security
violations. Semantic attacks are difficult to detect because they
violate program semantics, rather than language semantics.
This paper presents PECAN, a new dynamic anomaly detector. PECAN identifies unusual program behavior using history sensitivity and depth-limited context sensitivity. Prior work on context-sensitive anomaly detection relied on stack-walking, which incurs overheads of 50% to over 200%. By contrast, the average overhead of PECAN is 5%, which is low enough for practical deployment.
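The combination of history sensitivity and depth-limited context sensitivity can be sketched as follows. This is a simplified model with hypothetical event and frame names, not PECAN's implementation: each event is profiled together with the top `depth` frames of its calling context and the immediately preceding event.

```python
class AnomalyDetector:
    """Sketch of history- and depth-limited-context-sensitive anomaly
    detection: behavior unseen during training is reported as anomalous."""

    def __init__(self, depth=2):
        self.depth = depth
        self.profile = set()
        self.last = None          # history: the previous event

    def _key(self, event, stack):
        # Depth-limited context: only the top `depth` stack frames.
        return (event, tuple(stack[-self.depth:]), self.last)

    def reset(self):
        self.last = None          # start a fresh run

    def train(self, event, stack):
        self.profile.add(self._key(event, stack))
        self.last = event

    def check(self, event, stack):
        anomalous = self._key(event, stack) not in self.profile
        self.last = event
        return anomalous


det = AnomalyDetector(depth=2)
det.train("open", ["main", "load_config"])
det.train("read", ["main", "load_config"])

det.reset()
ok1 = det.check("open", ["main", "load_config"])            # seen in training
ok2 = det.check("read", ["main", "load_config"])            # seen in training
bad = det.check("read", ["main", "handle_request", "parse"])  # new context
```

Truncating the context to a fixed depth is what avoids the cost of full stack-walking while still distinguishing the same event reached from different call sites.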
We evaluate PECAN on four representative real-world attacks from security vulnerability reports. These attacks exploit subtle bugs in Java applications and libraries, using legal program executions that nevertheless violate programmers' expectations. Anomaly detection must balance precision and sensitivity: high sensitivity leads to many benign behaviors appearing anomalous (false positives), while low sensitivity may miss attacks. With application-specific tuning, PECAN efficiently tracks depth-limited context and history and reports few false positives.
Abstract:
We present Airavat, a MapReduce-based system which provides strong
security and privacy guarantees for distributed computations on sensitive
data. Airavat is a novel integration of mandatory access control and
differential privacy. Data providers control the security policy for
their sensitive data, including a mathematical bound on potential privacy
violations. Users without security expertise can perform computations on
the data, but Airavat confines these computations, preventing information
leakage beyond the data provider's policy.
Our prototype implementation demonstrates the flexibility of Airavat on several case studies. The prototype is efficient, with run times on Amazon's cloud computing infrastructure within 32% of a MapReduce system with no security.
Abstract:
As networked systems grow in complexity, they are increasingly vulnerable to denial-of-service (DoS) attacks involving resource exhaustion. A single malicious input can trigger high-complexity behavior such as deep recursion in a carelessly implemented server, exhausting CPU time or stack space and making the server unavailable to legitimate clients. These DoS attacks exploit the semantics of the target application, are rarely associated with network traffic anomalies, and are thus extremely difficult to detect using conventional methods.
We present SAFER, a static analysis tool for identifying potential DoS vulnerabilities and the root causes of resource-exhaustion attacks before the software is deployed. Our tool combines taint analysis with control dependency analysis to detect high-complexity control structures whose execution can be triggered by untrusted network inputs.
When evaluated on real-world networked applications, SAFER discovered previously unknown DoS vulnerabilities in the Expat XML parser and the SQLite library, as well as a new attack on a previously patched version of the wu-ftpd server. This demonstrates the importance of understanding and repairing the root causes of DoS vulnerabilities rather than simply blocking known malicious inputs.
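The kind of vulnerability SAFER targets can be illustrated with a toy example (not from the paper): a naively recursive parser whose stack depth is controlled entirely by a single crafted input.

```python
def parse_nested(s, i=0):
    """Naive recursive-descent parse of nested parentheses.
    Stack depth equals nesting depth, which the sender controls."""
    if i < len(s) and s[i] == "(":
        inner, j = parse_nested(s, i + 1)  # one stack frame per nesting level
        assert j < len(s) and s[j] == ")", "unbalanced input"
        return ("group", inner), j + 1
    return None, i

benign = "(" * 50 + ")" * 50              # modest nesting: parses fine
crafted = "(" * 100_000 + ")" * 100_000   # single input engineered to exhaust the stack

parse_nested(benign)
try:
    parse_nested(crafted)
except RecursionError:
    print("resource exhaustion: recursion limit exceeded")
```

Because the complexity lives in the application's semantics, nothing about the packet stream looks anomalous; the untrusted-input-to-recursion dependency is exactly the kind of pattern a static taint and control-dependency analysis can flag before deployment.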
Abstract:
Operators of online social networks are increasingly sharing
potentially sensitive information about users and their relationships
with advertisers, application developers, and data-mining researchers.
Privacy is typically protected by anonymization, i.e., removing
names, addresses, etc.
We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.
Our de-anonymization algorithm is based purely on the network topology, does not require creation of a large number of dummy "sybil" nodes, is robust to noise and all existing defenses, and works even when the overlap between the target network and the adversary's auxiliary information is small.
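A drastically simplified sketch of topology-based re-identification (illustrative only; the paper's algorithm is far more robust to noise): starting from one seed pair, the mapping is extended by matching nodes whose already-mapped neighbors overlap. Both graphs and all names below are invented.

```python
# Adjacency sets: the auxiliary graph uses real names, the "anonymized"
# target graph uses numeric labels (alice->1, bob->2, carol->3, dave->4).
aux = {
    "alice": {"bob", "carol"},
    "bob":   {"alice", "carol", "dave"},
    "carol": {"alice", "bob"},
    "dave":  {"bob"},
}
anon = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2},
    4: {2},
}

def propagate(aux, anon, mapping):
    """Greedy propagation: map an unmapped aux node to the unmapped anon node
    sharing the most already-mapped neighbors (degree difference breaks ties)."""
    changed = True
    while changed:
        changed = False
        for u in aux:
            if u in mapping:
                continue
            mapped_nbrs = {mapping[v] for v in aux[u] if v in mapping}
            used = set(mapping.values())
            candidates = [n for n in anon if n not in used]
            if not mapped_nbrs or not candidates:
                continue
            best = max(candidates,
                       key=lambda n: (len(anon[n] & mapped_nbrs),
                                      -abs(len(anon[n]) - len(aux[u]))))
            if anon[best] & mapped_nbrs:
                mapping[u] = best
                changed = True
    return mapping

print(propagate(aux, anon, {"alice": 1}))  # recovers the full relabeling
```

Note that only edges are used, no node attributes; this is what "based purely on the network topology" means in practice.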
Abstract:
We present an efficient protocol for the privacy-preserving, distributed
learning of decision-tree classifiers. Our protocol allows a user to
construct a classifier on a database held by a remote server without
learning any additional information about the records held in the
database. The server does not learn anything about the constructed
classifier, not even the user's choice of feature and class attributes.
Our protocol uses several novel techniques to enable oblivious classifier construction. We evaluate a prototype implementation and demonstrate that it is efficient enough for practical scenarios.
Abstract:
Re-identification is a major privacy threat to public datasets
containing individual records. Many privacy protection algorithms rely
on generalization and suppression of "quasi-identifier" attributes such
as ZIP code and birthdate. Their objective is usually syntactic
sanitization: for example, k-anonymity requires that each
"quasi-identifier" tuple appear in at least k records, while
l-diversity requires that the distribution of sensitive attributes
for each quasi-identifier have high entropy. The utility of sanitized
data is also measured syntactically, by the number of generalization
steps applied or the number of records with the same quasi-identifier.
In this paper, we ask whether generalization and suppression of quasi-identifiers offer any benefits over trivial sanitization which simply separates quasi-identifiers from sensitive attributes. Previous work showed that k-anonymous databases can be useful for data mining, but k-anonymization does not guarantee any privacy. By contrast, we measure the tradeoff between privacy (how much can the adversary learn from the sanitized records?) and utility, measured as accuracy of data-mining algorithms executed on the same sanitized records.
For our experimental evaluation, we use the same datasets from the UCI machine learning repository as were used in previous research on generalization and suppression. Our results demonstrate that even modest privacy gains require almost complete destruction of the data-mining utility. In most cases, trivial sanitization provides equivalent utility and better privacy than k-anonymity, l-diversity, and similar methods based on generalization and suppression.
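For reference, the syntactic definition under discussion can be checked in a few lines (toy data with invented attribute names):

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """k-anonymity: each quasi-identifier tuple appears in at least k records."""
    counts = Counter(tuple(r[a] for a in quasi_ids) for r in records)
    return all(c >= k for c in counts.values())

records = [  # already generalized: ZIP and birth year partially suppressed
    {"zip": "787**", "birth": "198*", "disease": "flu"},
    {"zip": "787**", "birth": "198*", "disease": "cancer"},
    {"zip": "787**", "birth": "198*", "disease": "flu"},
    {"zip": "606**", "birth": "197*", "disease": "asthma"},
    {"zip": "606**", "birth": "197*", "disease": "flu"},
]

print(is_k_anonymous(records, ["zip", "birth"], k=2))  # True
print(is_k_anonymous(records, ["zip", "birth"], k=3))  # False: one group has only 2
```

Trivial sanitization would instead publish the quasi-identifier columns and the disease column as two unlinked tables, which vacuously satisfies the same syntactic goal without any generalization at all.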
Abstract:
We present a new class of statistical de-anonymization attacks
against high-dimensional micro-data, such as individual preferences,
recommendations, transaction records and so on. Our techniques are
robust to perturbation in the data and tolerate some mistakes in the
adversary's background knowledge.
We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world's largest online movie rental service. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber's record in the dataset. Using the Internet Movie Database as the source of background knowledge, we successfully identified the Netflix records of known users, uncovering their apparent political preferences and other potentially sensitive information.
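The core of such an attack can be sketched as a simple scoring function (toy data; the paper's scoring is statistical and handles rating dates and noise much more carefully):

```python
def score(aux, record):
    """Count how many of the adversary's known (movie, rating) pairs
    approximately match the candidate record (tolerating +/-1 noise)."""
    return sum(1 for movie, rating in aux.items()
               if abs(record.get(movie, -99) - rating) <= 1)

dataset = {  # "anonymized" ratings, record IDs instead of names
    "rec_017": {"Brazil": 5, "Fight Club": 4, "Amelie": 5, "Alien": 2},
    "rec_042": {"Titanic": 3, "Alien": 5, "Heat": 4},
    "rec_101": {"Brazil": 2, "Titanic": 5, "Heat": 1},
}
aux = {"Brazil": 5, "Amelie": 4, "Alien": 2}  # a few approximate ratings

best = max(dataset, key=lambda rid: score(aux, dataset[rid]))
print(best)  # rec_017
```

With high-dimensional data, a handful of approximate attribute values typically matches one record far better than all others, which is why "a little bit" of auxiliary knowledge suffices.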
Abstract:
Many basic tasks in computational biology involve operations on individual
DNA and protein sequences. These sequences, even when anonymized, are
vulnerable to re-identification attacks and may reveal highly sensitive
information about individuals.
We present a relatively efficient, privacy-preserving implementation of fundamental genomic computations such as calculating the edit distance and Smith-Waterman similarity scores between two sequences. Our techniques are cryptographically secure and significantly more practical than previous solutions. We evaluate our prototype implementation on sequences from the Pfam database of protein families, and demonstrate that its performance is adequate for solving real-world sequence-alignment and related problems in a privacy-preserving manner.
Furthermore, our techniques have applications beyond computational biology. They can be used to obtain efficient, privacy-preserving implementations for many dynamic programming algorithms over distributed datasets.
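The underlying computation is the textbook edit-distance recurrence; the contribution is evaluating it obliviously. In the clear, the dynamic program is:

```python
def edit_distance(a, b):
    """Standard Wagner-Fischer dynamic program. The privacy-preserving
    protocol evaluates the same recurrence obliviously, so neither party
    sees the other's sequence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb))) # substitution
        prev = cur
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```

The same row-by-row structure is shared by many dynamic programs (including Smith-Waterman), which is why the techniques generalize beyond edit distance.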
Abstract:
We investigate the problem of verifying location claims of mobile
devices, and propose a new property called simultaneous distance
modification (SDM). In localization protocols satisfying the SDM
property, a malicious device can lie about its distance from the
verifiers, but all distances can only be altered by the same amount.
We demonstrate that the SDM property guarantees secure verification of
location claims with a small number of verifiers even if some of them
maliciously collude with the device. We also present several lightweight
localization protocols that satisfy the SDM property.
Abstract:
We present an efficient protocol for privacy-preserving evaluation of
diagnostic programs, represented as binary decision trees or branching
programs. The protocol applies a branching diagnostic program with
classification labels in the leaves to the user's attribute vector.
The user learns only the label assigned by the program to his vector;
the diagnostic program itself remains secret. The program's owner does
not learn anything. Our construction is significantly more efficient
than protocols obtained by direct application of generic secure
multi-party computation techniques.
We use our protocol to implement a privacy-preserving version of the Clarify system for software fault diagnosis, and demonstrate that its performance is acceptable for many practical scenarios.
Abstract:
We design and evaluate a lightweight route verification mechanism that
enables a router to discover route failures and inconsistencies between
advertised Internet routes and actual paths taken by the data packets.
Our mechanism is accurate, incrementally deployable, and secure against
malicious intermediary routers. By carefully avoiding any cryptographic
operations in the data path, our prototype implementation achieves an
overhead of less than 1% on a 1 Gbps link, demonstrating that our method
is suitable even for high-performance networks.
Abstract:
Probe-response attacks are a new threat for collaborative intrusion
detection systems. A probe is an attack which is deliberately crafted so
that its target detects and reports it with a recognizable "fingerprint"
in the report. The attacker then uses the collaborative infrastructure to
learn the detector's location and defensive capabilities from this report.
We analyze the fundamental tradeoff between a collaborative network's ability to detect epidemic threats and the security of its individual participants against probe-response attacks. We then design and evaluate a collaborative detection system which provides protection against probe-response attacks. Unlike previously proposed collaborative detection networks, our system supports alert sharing while limiting exposure of members' identities.
Abstract:
The transmission of voice communications as datagram packets over IP
networks, commonly known as Voice-over-IP (VoIP) telephony, is rapidly
gaining wide acceptance. With private phone conversations being
conducted on insecure public networks, security of VoIP communications
is increasingly important. We present a structured security analysis
of the VoIP protocol stack, which consists of signaling (SIP), session
description (SDP), key establishment (SDES, MIKEY, and ZRTP) and secure
media transport (SRTP) protocols. Using a combination of manual and
tool-supported formal analysis, we uncover several design flaws and
attacks, most of which are caused by subtle inconsistencies between the
assumptions that protocols at different layers of the VoIP stack make
about each other.
The most serious attack is a replay attack on SDES, which causes SRTP to repeat the keystream used for media encryption, thus completely breaking transport-layer security. We also demonstrate a man-in-the-middle attack on ZRTP, which allows the attacker to convince the communicating parties that they have lost their shared secret. If they are using VoIP devices without displays and thus cannot execute the "human authentication" procedure, they are forced to communicate insecurely, or not communicate at all, i.e., this becomes a denial of service attack. Finally, we show that the key derivation process used in MIKEY cannot be used to prove security of the derived key in the standard cryptographic model for secure key exchange.
Abstract:
Bluetooth is a popular standard for short-range wireless communications.
Bluetooth device pairing enables two mobile devices to authenticate each
other and establish a secure wireless connection.
We present a formal analysis of authentication properties of Bluetooth device pairing. Using the ProVerif cryptographic protocol verifier, we first analyze the standard device pairing protocol specified in the Bluetooth Core Specification, which relies on short, low-entropy PINs for authentication. Our analysis confirms a previously known guessing attack.
We then analyze a recently proposed Simple Pairing protocol. Simple Pairing involves Diffie-Hellman-based key establishment, in which authentication relies on a human visual channel: owner(s) of the mobile devices confirm the established keys by manually comparing the respective hash values of the parameters used to generate each key, as displayed on the devices' screens. This form of authentication presents an interesting modeling challenge. We demonstrate how to formalize it in ProVerif. Our analysis shows that authentication can fail when the same device is involved in concurrent Simple Pairing sessions. We discuss the implications of this authentication failure for typical Bluetooth usage scenarios. We then refine our model to incorporate session identifiers, and prove that the authentication properties of Simple Pairing hold in the new model.
Out-of-band human verification based on image- or audio-matching is increasingly used for authentication of mobile devices. This study is the first step towards automated analysis of formal models of human-authenticated protocols.
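The human-comparison step can be modeled concretely. A minimal sketch of the compare-the-digits idea (not the Bluetooth specification's actual functions; the key material and placeholder names are invented):

```python
import hashlib
import secrets

def display_value(pk_a, pk_b, na, nb):
    """Six-digit check value derived from the key-establishment parameters.
    Simplified stand-in; real Simple Pairing uses dedicated HMAC-based functions."""
    h = hashlib.sha256(pk_a + pk_b + na + nb).digest()
    return int.from_bytes(h[:4], "big") % 1_000_000

na, nb = secrets.token_bytes(16), secrets.token_bytes(16)
pk_a, pk_b = b"device-A-public-key", b"device-B-public-key"  # placeholders

# Owners compare the digits shown on both screens; they match only if both
# devices hashed the same parameters, so a key-substituting MITM is detected.
honest = display_value(pk_a, pk_b, na, nb)
mitm = display_value(b"mitm-public-key", pk_b, na, nb)
print(honest == display_value(pk_a, pk_b, na, nb))  # True
print(honest == mitm)
```

The modeling challenge is that the "channel" carrying the comparison is a human, which is what the ProVerif formalization has to capture.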
Abstract:
We present an efficient construction of Yao's "garbled circuits"
protocol for securely computing any two-party circuit on committed inputs.
The protocol is secure in a universally composable way in the presence of
malicious adversaries under the decisional composite residuosity
(DCR) and strong RSA assumptions, in the common reference string model.
The protocol requires a constant number of rounds (four to five in the
standard model, two to three in the random oracle model, depending on whether
both parties receive the output), O(|C|) modular exponentiations
per player, and a bandwidth of O(|C|) group elements, where
|C| is the size of the computed circuit.
Our technical tools are of independent interest. We propose a homomorphic, semantically secure variant of the Camenisch-Shoup verifiable cryptosystem, which uses shorter keys, is unambiguous (it is infeasible to generate two keys which successfully decrypt the same ciphertext), and allows efficient proofs that a committed plaintext is encrypted under a committed key.
Our second tool is a practical four-round (two-round in ROM) protocol for committed oblivious transfer on strings (string-COT) secure against malicious participants. The string-COT protocol takes a few exponentiations per player, and is UC-secure under the DCR assumption in the common reference string model. Previous protocols of comparable efficiency achieved either committed OT on bits, or standard (non-committed) OT on strings.
Abstract:
Denial of service (DoS) attacks are a growing threat to the availability
of Internet services. We present dFence, a novel network-based defense
system for mitigating DoS attacks. The main thesis of dFence is
complete transparency to the existing Internet infrastructure,
with no software modifications at either routers or end hosts.
dFence dynamically introduces special-purpose middlebox devices into the
data paths of the hosts under attack. By intercepting both directions of
IP traffic (to and from attacked hosts) and applying stateful defense
policies, dFence middleboxes effectively mitigate a broad range of
spoofed and unspoofed attacks. We describe the architecture of the
dFence middlebox, mechanisms for on-demand introduction and removal,
and DoS mitigation policies, including defenses against DoS attacks on
the middlebox itself. We evaluate our prototype implementation based
on Intel IXP network processors.
Abstract:
Many applications of mix networks such as anonymous Web browsing require
relationship anonymity: it should be hard for the attacker to
determine who is communicating with whom. Conventional methods for
measuring anonymity, however, focus on sender anonymity instead.
Sender anonymity guarantees that it is difficult for the attacker to
determine the origin of any given message exiting the mix network, but
this may not be sufficient to ensure relationship anonymity. Even if
the attacker cannot identify the origin of messages arriving to some
destination, relationship anonymity will fail if he can determine with
high probability that at least one of the messages originated from a
particular sender, without necessarily being able to recognize this
message among others.
We give a formal definition and a calculation methodology for relationship anonymity. Our techniques are similar to those used for sender anonymity, but, unlike sender anonymity, relationship anonymity is sensitive to the distribution of message destinations. In particular, Zipfian distributions with skew values characteristic of Web browsing provide especially poor relationship anonymity. Our methodology takes route selection algorithms into account, and incorporates information-theoretic metrics such as entropy and min-entropy. We illustrate our methodology by calculating relationship anonymity in several simulated mix networks.
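The entropy metrics in question are straightforward to compute; a quick comparison (with an assumed skew of 1.0 and 1000 destinations, parameters chosen for illustration) shows why Zipfian destination distributions hurt:

```python
from math import log2

def entropy(p):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(x * log2(x) for x in p if x > 0)

def min_entropy(p):
    """Min-entropy: determined solely by the most likely outcome."""
    return -log2(max(p))

n = 1000
uniform = [1 / n] * n
weights = [1 / r for r in range(1, n + 1)]   # Zipf with skew 1.0 (assumed)
total = sum(weights)
zipf = [w / total for w in weights]

print("uniform:", round(entropy(uniform), 2), "bits; min-entropy:",
      round(min_entropy(uniform), 2))
print("zipf:   ", round(entropy(zipf), 2), "bits; min-entropy:",
      round(min_entropy(zipf), 2))
```

Under the uniform distribution both metrics equal log2(1000), about 9.97 bits; under the Zipfian one the most popular destination alone carries over 13% of the mass, so min-entropy collapses to roughly 2.9 bits, and the attacker's guess "this sender talks to the popular site" is right disturbingly often.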
Abstract:
Mix networks are a popular mechanism for anonymous Internet
communications. By routing IP traffic through an overlay chain of mixes,
they aim to hide the relationship between its origin and destination.
Using a realistic model of interactive Internet traffic, we study the
problem of defending low-latency mix networks against attacks based on
correlating inter-packet intervals on two or more links of the mix chain.
We investigate several attack models, including an active attack which involves adversarial modification of packet flows in order to "fingerprint" them, and analyze the tradeoffs between the amount of cover traffic, extra latency, and anonymity properties of the mix network. We demonstrate that previously proposed defenses are either ineffective, or impose a prohibitively large latency and/or bandwidth overhead on communicating applications. We propose a new defense based on adaptive padding.
Abstract:
Over the last several years, there has been an emerging interest in the
development of wide-area data collection and analysis centers to help
identify, track, and formulate responses to the ever-growing number
of coordinated attacks and malware infections that plague computer
networks worldwide. As large-scale network threats continue to evolve
in sophistication and extend to widely deployed applications, we expect
that interest in collaborative security monitoring infrastructures
will continue to grow, because such attacks may not be easily diagnosed
from a single point in the network. The intent of this position paper
is not to argue the necessity of Internet-scale security data sharing
infrastructures, as there is ample research and operational examples that
already make this case. Instead, we observe that these well-intended
activities raise a unique set of risks and challenges.
We outline some of the most salient issues faced by global network security centers, survey proposed defense mechanisms, and pose several research challenges to the computer security community. We hope that this position paper will serve as a stimulus to spur groundbreaking new research in protection and analysis technologies that can facilitate the collaborative sharing of network security data while keeping data contributors safe and secure.
Abstract:
The output of a data mining algorithm is only as good as its inputs, and
individuals are often unwilling to provide accurate data about sensitive
topics such as medical history and personal finance. Individuals may
be willing to share their data, but only if they are assured that it
will be used in an aggregate study and that it cannot be linked back
to them. Protocols for anonymity-preserving data collection provide
this assurance, in the absence of trusted parties, by allowing a set of
mutually distrustful respondents to anonymously contribute data to an
untrusted data miner.
To effectively provide anonymity, a data collection protocol must be collusion resistant: even if all dishonest respondents collude with a dishonest data miner in an attempt to learn the associations between honest respondents and their responses, they will be unable to do so. To achieve collusion resistance, previously proposed protocols for anonymity-preserving data collection require a number of communication rounds quadratic in the number of respondents, and employ (sometimes incorrectly) complicated cryptographic techniques such as zero-knowledge proofs.
We describe a new protocol for anonymity-preserving, collusion-resistant data collection. Our protocol requires only a linear number of communication rounds, and achieves collusion resistance without relying on zero-knowledge proofs. This makes it especially suitable for data mining scenarios with a large number of respondents.
Abstract:
Cryptographic security for key exchange and secure session establishment
protocols is often defined in the so-called ``adaptive corruptions''
model. Even if the adversary corrupts one of the participants in the
middle of the protocol execution and obtains the victim's secrets such
as the private signing key, the victim must be able to detect this
and abort the protocol. This is usually achieved by adding a key
confirmation message to the protocol. Conventional symbolic methods
for protocol analysis assume unforgeability of digital signatures, and
thus cannot be used to reason about security in the adaptive corruptions
model.
We present a symbolic protocol logic for reasoning about authentication and key confirmation in key exchange protocols. The logic is cryptographically sound: a symbolic proof of authentication and secrecy implies that the protocol is secure in the adaptive corruptions model. We illustrate our method by formally proving security of an authenticated Diffie-Hellman protocol with key confirmation.
Abstract:
We present three case studies, investigating the use of probabilistic
model checking to automatically analyse properties of probabilistic
contract signing protocols. We use the probabilistic model checker
PRISM to analyse three protocols: Rabin's probabilistic protocol for
fair commitment exchange; the probabilistic contract signing protocol
of Ben-Or, Goldreich, Micali, and Rivest; and a randomised protocol for
signing contracts of Even, Goldreich, and Lempel. These case studies
illustrate the general methodology for applying probabilistic model
checking to formal verification of probabilistic security protocols.
For the Ben-Or et al. probabilistic contract signing protocol, we demonstrate the difficulty of combining fairness with timeliness. If, as required by timeliness, the judge responds to participants' messages immediately upon receiving them, then there exists a strategy for a misbehaving participant that brings the protocol to an unfair state with arbitrarily high probability, unless unusually strong assumptions are made about the quality of the communication channels between the judge and honest participants. We quantify the tradeoffs involved in the attack strategy, and discuss possible modifications of the protocol that ensure both fairness and timeliness.
For the Even et al. probabilistic contract signing protocol, we demonstrate that the responder enjoys a distinct advantage. With probability 1, the protocol reaches a state in which the responder possesses the initiator's commitment, but the initiator does not possess the responder's commitment. We then analyse several variants of the protocol, exploring the tradeoff between fairness and the number of messages that must be exchanged between participants.
Abstract:
We consider scenarios in which two parties, each in possession of a
graph, wish to compute some algorithm on their joint graph in
a privacy-preserving manner, that is, without leaking any information
about their inputs except that revealed by the algorithm's output.
Working in the standard secure multi-party computation paradigm, we present new algorithms for privacy-preserving computation of APSD (all pairs shortest distance) and SSSD (single source shortest distance), as well as two new algorithms for privacy-preserving set union. Our algorithms are significantly more efficient than generic constructions. As in previous work on privacy-preserving data mining, we prove that our algorithms are secure provided the participants are honest but curious.
Abstract:
Human-memorable passwords are a mainstay of computer security.
To decrease vulnerability of passwords to brute-force dictionary attacks,
many organizations enforce complicated password-creation rules and require
that passwords include numerals and special characters. We demonstrate
that as long as passwords remain human-memorable, they are vulnerable to
"smart-dictionary" attacks even when the space of potential passwords
is large.
Our first insight is that the distribution of letters in easy-to-remember passwords is likely to be similar to the distribution of letters in the users' native language. Using standard Markov modeling techniques from natural language processing, this can be used to dramatically reduce the size of the password space to be searched. Our second contribution is an algorithm for efficient enumeration of the remaining password space. This allows application of time-space tradeoff techniques, limiting memory accesses to a relatively small table of "partial dictionary" sizes and enabling a very fast dictionary attack.
We evaluated our method on a database of real-world user password hashes. Our algorithm successfully recovered 67.6% of the passwords using a 2 x 10^9 search space. This is a much higher percentage than Oechslin's "rainbow" attack, which is the fastest currently known technique for searching large keyspaces. These results call into question the viability of human-memorable character-sequence passwords as an authentication mechanism.
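The Markov-filtering insight fits in a few lines (the corpus below is a tiny invented stand-in; a real attack would train on a large sample of the users' native language):

```python
from collections import defaultdict
from math import log

# Train character-bigram statistics on a toy "native language" corpus.
corpus = "the quick brown fox jumps over the lazy dog password letmein dragon"
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def logprob(word):
    """Average bigram log-likelihood; higher means more language-like."""
    total = 0.0
    for a, b in zip(word, word[1:]):
        row = counts[a]
        total += log((row[b] + 1) / (sum(row.values()) + 27))  # add-one smoothing
    return total / max(len(word) - 1, 1)

# The attacker enumerates the high-probability region of the space first:
print(logprob("thepass") > logprob("xqzvkjw"))  # True
```

Ranking candidates by such a model, and enumerating them in probability order, is what shrinks the effective search space far below the nominal size of the character-sequence space.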
Abstract:
We investigate whether it is possible to encrypt a database and then
give it away in such a form that users can still access it, but only in a
restricted way. In contrast to conventional privacy mechanisms that aim
to prevent any access to individual records, we aim to restrict the
set of queries that can be feasibly evaluated on the encrypted database.
We start with a simple form of database obfuscation which makes database records indistinguishable from lookup functions. The only feasible operation on an obfuscated record is to look up some attribute Y by supplying the value of another attribute X that appears in the same record (i.e., someone who does not know X cannot feasibly retrieve Y). We then (i) generalize our construction to conjunctions of equality tests on any attributes of the database, and (ii) achieve a new property we call group privacy. This property ensures that it is easy to retrieve individual records or small subsets of records from the encrypted database by identifying them precisely, but "mass harvesting" queries matching a large number of records are computationally infeasible.
Our constructions are non-interactive. The database is transformed in such a way that all queries except those explicitly allowed by the privacy policy become computationally infeasible, i.e., our solutions do not rely on any access-control software or hardware.
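The lookup-function idea can be sketched with a hash-derived key (a toy stand-in for the construction, not the paper's actual scheme; attribute values are invented):

```python
import hashlib
import os

def obfuscate(x, y):
    """Store attribute Y encrypted under a key derived from attribute X,
    so only someone who already knows X can recover Y."""
    salt = os.urandom(16)
    pad = hashlib.sha256(salt + x.encode()).digest()
    ct = bytes(a ^ b for a, b in zip(y.encode().ljust(32, b"\x00"), pad))
    tag = hashlib.sha256(pad).digest()[:8]  # lets lookup detect a wrong X
    return salt, ct, tag

def lookup(record, x):
    salt, ct, tag = record
    pad = hashlib.sha256(salt + x.encode()).digest()
    if hashlib.sha256(pad).digest()[:8] != tag:
        return None                          # wrong X: Y remains hidden
    return bytes(a ^ b for a, b in zip(ct, pad)).rstrip(b"\x00").decode()

rec = obfuscate("555-1234", "alice@example.org")  # invented attribute values
print(lookup(rec, "555-1234"))  # alice@example.org
print(lookup(rec, "555-0000"))  # None
```

Retrieving one record is a single cheap lookup, but harvesting the database requires guessing X for every record, which is the asymmetry behind group privacy.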
Abstract:
We present a cryptographically sound formal method for proving
correctness of key exchange protocols. Our main tool is a fragment of
a symbolic protocol logic. We demonstrate that proofs of key agreement
and key secrecy in this logic imply simulatability in Shoup's secure
multi-party framework for key exchange. As part of the logic, we present
cryptographically sound abstractions of CMA-secure digital signatures and
a restricted form of Diffie-Hellman exponentiation, which is a technical
result of independent interest. We illustrate our method by constructing
a proof of security for a simple authenticated Diffie-Hellman protocol.
Abstract:
We describe a cryptographically sound formal logic for proving protocol
security properties without explicitly reasoning about probability,
asymptotic complexity, or the actions of a malicious attacker.
The approach rests on a new probabilistic, polynomial-time semantics
for an existing protocol security logic, replacing an earlier semantics
that uses nondeterministic symbolic evaluation. While the basic form
of the protocol logic remains unchanged from previous work, there are
some interesting technical problems involving the difference between
efficiently recognizing and efficiently producing a value, and involving
a reinterpretation of standard logical connectives that seems necessary
to support certain forms of reasoning.
Abstract:
Availability is a critical issue in modern distributed systems. While many
techniques and protocols for preventing denial of service (DoS) attacks
have been proposed and deployed in recent years, formal methods for
analyzing and proving them correct have not kept up with the state of the
art in DoS prevention. This paper proposes a new protocol for preventing
malicious bandwidth consumption, and demonstrates how game-based formal
methods can be successfully used to verify availability-related security
properties of network protocols.
We describe two classes of DoS attacks aimed at bandwidth consumption and resource exhaustion, respectively. We then propose our own protocol, based on a variant of client puzzles, to defend against bandwidth consumption, and use the JFKr key exchange protocol as an example of a protocol that defends against resource exhaustion attacks. We specify both protocols as alternating transition systems (ATS), state their security properties in alternating-time temporal logic (ATL) and verify them using MOCHA, a model checker that has been previously used to analyze fair exchange protocols.
Abstract:
We propose a scheme for privacy-preserving escrow of financial
transactions. The objective of the scheme is to preserve privacy and
anonymity of the individual user engaging in financial transactions
until the cumulative amount of all transactions in a certain category,
for example all transactions with a particular counterparty in any single
month, reaches a pre-specified threshold. When the threshold is reached,
the escrow agency automatically gains the ability to decrypt the escrows
of all transactions in that category (and only that category).
Our scheme employs the probabilistic polling idea of Jarecki and Odlyzko [JO97], amended by a novel robustness mechanism which makes the scheme secure against malicious parties. When submitting the escrow of a transaction, with probability proportional to the amount of the transaction, the user reveals a share of the key under which all his transactions are encrypted. Therefore, the fraction of shares that are known to the escrow agency is an accurate approximation of the fraction of the threshold amount that has been transacted so far. When the threshold is reached, with high probability the escrow agency possesses all the shares that it needs to reconstruct the key and open the escrows. Our main technical contribution is a novel tool of robust probabilistic information transfer, which we implement using techniques of optimistic fair 2-party computation.
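The polling idea admits a quick back-of-the-envelope simulation (all parameters invented for illustration):

```python
import random

random.seed(7)                 # deterministic simulation
THRESHOLD = 10_000             # cumulative amount at which escrows should open
N_SHARES = 2_000               # the user's key is split into this many shares

def run(amounts):
    """Each transaction reveals each key share independently with probability
    amount/THRESHOLD, so the revealed fraction tracks the transacted fraction."""
    revealed = set()
    for amount in amounts:
        p = amount / THRESHOLD
        for s in range(N_SHARES):
            if random.random() < p:
                revealed.add(s)
    return len(revealed) / N_SHARES

# After transacting half the threshold, roughly 1-(1-0.01)^50, about 39%,
# of shares are known; at the full threshold the agency can reconstruct
# the key with high probability.
frac = run([100] * 50)         # 50 transactions of 100 = THRESHOLD / 2
print(round(frac, 2))
```

The robustness mechanism in the paper addresses what this sketch ignores: a malicious user who simply lies about the probability with which shares are revealed.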
Abstract:
We demonstrate that for any well-defined cryptographic protocol, the
symbolic trace reachability problem in the presence of an Abelian group
operator (e.g., multiplication) can be reduced to solvability of
a decidable system of quadratic Diophantine equations. This result
enables complete, fully automated formal analysis of protocols that
employ primitives such as Diffie-Hellman exponentiation, multiplication,
and XOR, with a bounded number of role instances, but without imposing
any bounds on the size of terms created by the attacker.
Abstract:
We propose a formal model for reputation-based trust management.
In contrast to credential-based trust management, in our framework an
agent's reputation serves as the basis for trust. For example, an
access control policy may consider the agent's reputation when deciding
whether to offer him a license for accessing a protected resource.
The underlying semantic model is an event semantics inspired by the
actor model, and assumes that each agent has only partial knowledge
of the events that have occurred. Restrictions on agents' behavior
are formalized as licenses, with ``good'' and ``bad'' behavior
interpreted as, respectively, license fulfillment and violation.
An agent's reputation comprises four kinds of evidence: completely
fulfilled licenses, ongoing licenses without violations or misuses,
licenses with violated obligations, and misused licenses. This approach
enables precise formal modeling of scenarios involving reputations,
such as financial transactions based on credit histories and information
sharing between untrusted agents.
Abstract:
We present a practical scheme for Internet-scale collaborative analysis
of information security threats which provides strong privacy guarantees
to contributors of alerts. Wide-area analysis centers are proving
a valuable early warning service against worms, viruses, and other
malicious activities. At the same time, protecting individual and
organizational privacy is no longer optional in today's business climate.
We propose a set of data sanitization techniques that enable community
alert aggregation and correlation, while maintaining privacy for alert
contributors. Our approach is practical, scalable, does not rely on
trusted third parties or secure multiparty computation schemes, and does
not require sophisticated key management.
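One common sanitization building block compatible with these goals is keyed hashing of identifying fields: equal values map to equal tags, so the analysis center can still correlate alerts about the same source without learning raw addresses. This is a sketch of that idea only; the paper's techniques differ in detail:

```python
import hmac, hashlib

def sanitize(alert, key):
    """Replace the identifying source field with a keyed hash.
    Contributors sharing `key` produce consistent tags, enabling
    aggregation and correlation; outsiders cannot invert the tags.
    (Illustrative building block, not the paper's full scheme.)"""
    def tag(value):
        return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]
    return {"src": tag(alert["src"]),
            "dst_port": alert["dst_port"],      # coarse fields kept in clear
            "signature": alert["signature"]}

a1 = sanitize({"src": "10.0.0.7", "dst_port": 445,
               "signature": "MS-SMB-worm"}, b"site-key")
a2 = sanitize({"src": "10.0.0.7", "dst_port": 445,
               "signature": "MS-SMB-worm"}, b"site-key")
assert a1["src"] == a2["src"]           # correlation still works
assert "10.0.0.7" not in a1.values()    # raw address never leaves the site
```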
Abstract:
The variety of possible anonymity network topologies has spurred much
debate in recent years. In a synchronous batching design, each batch
of messages enters the mix network together, and the messages proceed
in lockstep through the network. We show that a synchronous batching
strategy can be used in various topologies, including a free-route
network, in which senders choose paths freely, and a cascade network,
in which senders choose from a set of fixed paths. We show that
free-route topologies can provide better anonymity as well as better
message reliability in the event of partial network failure.
Abstract:
We propose a practical abuse-resilient transaction escrow scheme with
applications to privacy-preserving audit and monitoring of electronic
transactions. Our scheme ensures correctness of escrows as long as at
least one of the participating parties is honest, and it ensures privacy
and anonymity of transactions even if the escrow agent is corrupt or
malicious. The escrowed information is secret and anonymous, but the
escrow agent can efficiently find transactions involving some user in
response to a subpoena or a search warrant. Moreover, for applications
such as abuse-resilient monitoring of unusually high levels of certain
transactions, the escrow agent can identify escrows with particular
common characteristics and automatically (i.e., without a subpoena)
open them once their number has reached a pre-specified threshold.
Our solution for transaction escrow is based on the use of Verifiable Random Functions. We show that by tagging the entries in the escrow database using VRFs indexed by users' private keys, we can protect users' anonymity while enabling efficient and, optionally, automatic de-escrow of these entries. We give a practical instantiation of a transaction escrow scheme utilizing a simple and efficient VRF family secure under the DDH assumption in the Random Oracle Model.
Abstract:
Several related research efforts have led to three different ways of
specifying protocol security properties by simulation or equivalence.
Abstracting the specification conditions away from the computational
frameworks in which they have been previously applied, we show that when
asynchronous communication is used, universal composability, black-box
simulatability, and process equivalence express the same properties of
a protocol. Further, the equivalence between these conditions holds for
any computational framework, such as process calculus, that satisfies
certain structural properties. Similar but slightly weaker results are
achieved for synchronous communication.
Abstract:
We demonstrate that the symbolic trace reachability problem for
cryptographic protocols is decidable in the presence of an Abelian group
operator and modular exponentiation from arbitrary bases. We represent
the problem as a sequence of symbolic inference constraints and reduce
it to a system of linear Diophantine equations. For a finite number of
protocol sessions, this result enables fully automated, sound and complete
analysis of protocols that employ primitives such as Diffie-Hellman
exponentiation and modular multiplication without imposing any bounds
on the size of terms created by the attacker, but taking into account
the relevant algebraic properties.
Abstract:
We use the probabilistic model checker PRISM to analyze the Crowds
system for anonymous Web browsing. This case study demonstrates
how probabilistic model checking techniques can be used to formally
analyze security properties of a peer-to-peer group communication system
based on random message routing among members. The behavior of group
members and the adversary is modeled as a discrete-time Markov chain,
and the desired security properties are expressed as PCTL formulas.
The PRISM model checker is used to perform automated analysis of the
system and verify anonymity guarantees it provides. Our main result is
a demonstration of how certain forms of probabilistic anonymity degrade
when group size increases or random routing paths are rebuilt, assuming
that the corrupt group members are able to identify and/or correlate
multiple routing paths originating from the same sender.
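The degradation effect can be reproduced with a direct Monte Carlo simulation of the Crowds routing chain (a sanity check in the same spirit as the PRISM model, not the paper's DTMC itself; parameter names are illustrative):

```python
import random

def crowds_path(n, c, p_f):
    """One Crowds path in a crowd of n members, the last c corrupt.
    Returns the member a corrupt node first sees as its predecessor,
    or None if no corrupt member lands on the path."""
    current = 0                      # member 0 is the honest sender
    while True:
        nxt = random.randrange(n)
        if nxt >= n - c:             # forwarded to a corrupt member
            return current
        current = nxt
        if random.random() > p_f:    # coin says: deliver to destination
            return None

def predecessor_is_sender(n, c, p_f, trials=50000):
    """Estimate P(first corrupt node's predecessor is the true sender,
    given that some corrupt node observes the path)."""
    hits = seen = 0
    for _ in range(trials):
        obs = crowds_path(n, c, p_f)
        if obs is not None:
            seen += 1
            hits += (obs == 0)
    return hits / seen

random.seed(0)
rate = predecessor_is_sender(n=10, c=3, p_f=0.8)
print(f"P(predecessor = sender) ~ {rate:.2f}")  # well above the 1/7 uniform guess
```

Correlating several rebuilt paths from the same sender multiplies such observations, which is exactly how the anonymity degradation analyzed in the paper arises.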
Abstract:
We propose a new specification framework for information hiding properties
such as anonymity and privacy. The framework is based on the concept of a
function view, which is a concise representation of the attacker's
partial knowledge about a function. We describe system behavior as a set
of functions, and formalize different information hiding properties in
terms of views of these functions. We present an extensive case study,
in which we use the function view framework to systematically classify
and rigorously define a rich domain of identity-related properties,
and to demonstrate that privacy and anonymity are independent.
The key feature of our approach is its modularity. It yields precise, formal specifications of information hiding properties for any protocol formalism and any choice of the attacker model as long as the latter induce an observational equivalence relation on protocol instances. In particular, specifications based on function views are suitable for any cryptographic process calculus that defines some form of indistinguishability between processes. Our definitions of information hiding properties take into account any feature of the security model, including probabilities, random number generation, timing, etc., to the extent that it is accounted for by the formalism in which the system is specified.
Abstract:
A contract signing protocol lets two parties exchange digital signatures
on a pre-agreed text. Optimistic contract signing protocols enable
the signers to do so without invoking a trusted third party. However,
an adjudicating third party remains available should one or both signers
seek timely resolution. We analyze optimistic contract signing protocols
using a game-theoretic approach and prove a fundamental impossibility
result: in any fair, optimistic, timely protocol, an optimistic player
yields an advantage to the opponent. The proof relies on a careful
characterization of optimistic play that postpones communication to
the third party. Since advantage cannot be completely eliminated from
optimistic protocols, we argue that the strongest property attainable
is the absence of provable advantage, i.e., abuse-freeness
in the sense of Garay-Jakobsson-MacKenzie.
Abstract:
We present decidability results for the verification of cryptographic
protocols in the presence of equational theories corresponding to
XOR and Abelian groups. Since the perfect cryptography assumption
is unrealistic for cryptographic primitives with visible algebraic
properties such as XOR, we extend the conventional Dolev-Yao model by
permitting the intruder to exploit these properties. We show that the
ground reachability problem is in NP for the extended intruder theories in
the cases of XOR and Abelian groups. This result follows from a normal
proof theorem. Then, we show how to lift this result in the XOR case:
we consider a symbolic constraint system expressing the reachability
(e.g., secrecy) problem for a finite number of sessions. We prove
that such a constraint system is decidable, relying in particular on an
extension of combination algorithms for unification procedures. As a
corollary, this enables automatic symbolic verification of cryptographic
protocols employing XOR for a fixed number of sessions.
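The gap between the perfect-cryptography view and the extended intruder is easy to see concretely: an intruder who treats an XOR of two keys as an opaque blob learns nothing, but one who may apply the algebraic laws x^x = 0 and x^0 = x can cancel a known component.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# The classical Dolev-Yao intruder cannot decompose xor(k1, k2).
# The extended intruder, given the blob and k2, recovers k1 by
# applying the group law: (k1 ^ k2) ^ k2 = k1.
k1, k2 = b"\x13\x37\xc0\xde", b"\xde\xad\xbe\xef"
blob = xor(k1, k2)          # observed on the network
recovered = xor(blob, k2)   # intruder cancels the known key
assert recovered == k1
```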
Abstract:
Exponential growth in digital information gathering, storage,
and processing capabilities inexorably leads to conflict between
well-intentioned government or commercial datamining, and fundamental
privacy interests of individuals and organizations. This paper
proposes a mechanism that provides cryptographic fetters on the mining
of personal data, enabling efficient mining of previously-negotiated
properties, but preventing any other uses of the protected personal data.
Our approach does not rely on complete trust in the analysts to use the
data appropriately, nor does it rely on incorruptible escrow agents.
Instead, we propose conditional data escrow where the data generators,
not the analysts, hold the keys to the data, but analysts can verify
that the pre-negotiated queries are enabled. Our solution relies on
verifiable, anonymous, and deterministic commitments which play the
role of tags that mark encrypted entries in the analyst's database.
The database owner cannot learn anything from the encrypted entries, or
even verify his guess of the plaintext on which these entries are based.
On the other hand, the verifiable and deterministic property ensures
that the entries are marked with consistent tags, so that the database
manager learns when the number of entries required to enable some query
reaches the pre-negotiated threshold.
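The threshold mechanism can be sketched with a plain deterministic tag standing in for the paper's verifiable anonymous commitment (all names and values here are illustrative):

```python
import hashlib
from collections import Counter

def tag(gen_key: bytes, event: str) -> str:
    """Deterministic stand-in for the paper's commitment: equal events
    produce equal tags, so the database manager can count matching
    entries without learning the plaintext behind them."""
    return hashlib.sha256(gen_key + event.encode()).hexdigest()

THRESHOLD = 3          # pre-negotiated number of matching entries
counts = Counter()
for event in ["wire>10k", "wire>10k", "login", "wire>10k"]:
    t = tag(b"key-held-by-data-generator", event)
    counts[t] += 1
    if counts[t] == THRESHOLD:
        print("threshold reached; query enabled for tag", t[:8])
```

In the real scheme the tags are additionally verifiable (the manager can check they were formed consistently) and hiding even against guessing attacks, which a bare hash does not provide.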
Abstract:
We consider the so-called "cryptographic protocols," whose aim is to ensure
some security properties when communication channels are not reliable.
Such protocols usually rely on cryptographic primitives. Even if it is
assumed that the cryptographic primitives are perfect, the security goals
may not be achieved: the protocol itself may have weaknesses which can be
exploited by an attacker. We survey recent work on decision techniques
for cryptographic protocol analysis.
Abstract:
Optimistic contract signing protocols allow two parties to commit to a
previously agreed upon contract, relying on a third party to abort or
confirm the contract if needed. These protocols are relatively subtle,
since there may be interactions between the subprotocols used for normal
signing without the third party, aborting the protocol through the third
party, or requesting confirmation from the third party. With the help of
Murphi, a finite-state verification tool, we analyze two related contract
signing protocols: the optimistic contract signing protocol of Asokan,
Shoup, and Waidner, and the abuse-free contract signing protocol of
Garay, Jakobsson, and MacKenzie. For the first protocol, we discover
that a malicious participant can produce inconsistent versions of the
contract or mount a replay attack. For the second protocol, we discover
that negligence or corruption of the trusted third party may allow abuse
or unfairness. In this case, contrary to the intent of the protocol,
the cheated party is not able to hold the third party accountable.
We present and analyze modifications to the protocols that avoid these
problems and discuss the basic challenges involved in formal analysis
of fair exchange protocols.
Abstract:
The reachability problem for cryptographic protocols with non-atomic
keys can be solved via a simple constraint satisfaction procedure.
Abstract:
We develop an imperative calculus that provides a formal model for both
single and mixin inheritance. By introducing classes and mixins as the
basic object-oriented constructs in a lambda-calculus with records and
references, we obtain a system with an intuitive operational semantics.
New classes are produced by applying mixins to superclasses. Objects are
represented by records and produced by instantiating classes. The type
system for objects uses only functional, record, and reference types,
and there is a clean separation between subtyping and inheritance.
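The slogan "new classes are produced by applying mixins to superclasses" has a direct rendering in any language with first-class classes; a minimal Python sketch (names invented for illustration):

```python
class Point:
    """A superclass with one piece of state."""
    def __init__(self, x):
        self.x = x

def Movable(Base):
    """A mixin, modeled as a function from classes to classes."""
    class Mixed(Base):
        def move(self, dx):
            self.x += dx
            return self.x
    return Mixed

MovablePoint = Movable(Point)   # mixin application produces a new class
p = MovablePoint(1)             # objects are instances of the new class
print(p.move(2))                # → 3
```

The calculus makes this pattern precise with records and references, and keeps the subtyping relation on object types independent of which mixins built the class.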
Abstract:
We present an imperative calculus for a class-based language.
By introducing classes as the basic object-oriented construct in a
lambda-calculus with records and references, we obtain a system with an
intuitive operational semantics. Objects are instantiated from classes
and represented by records. The type system for objects uses only
functional, record, and reference types, and there is a clean separation
between subtyping and inheritance. We demonstrate that the calculus is
sound and sufficiently expressive to model advanced language features
such as inheritance with method redefinition, multi-level encapsulation,
and modular object construction.
Abstract:
We describe two state reduction techniques for finite-state models of
security protocols. The techniques exploit certain protocol properties
that we have identified as characteristic of security protocols. We prove
the soundness of the techniques by demonstrating that any violation of
protocol invariants is preserved in the reduced state graph. In addition,
we describe an optimization method for evaluating parameterized rule
conditions, which are common in our models of security protocols.
All three techniques have been implemented in the Murphi verifier.
Abstract:
The Secure Sockets Layer (SSL) protocol is analyzed using a finite-state
enumeration tool called Murphi. The analysis is presented using a
sequence of incremental approximations to the SSL 3.0 handshake protocol.
Each simplified protocol is "model-checked" using Murphi, with the
next protocol in the sequence obtained by correcting errors that Murphi
finds automatically. This process identifies the main shortcomings in
SSL 2.0 that led to the design of SSL 3.0, as well as a few anomalies in
the protocol that is used to resume a session in SSL 3.0. In addition
to some insight into SSL, this study demonstrates the feasibility of
using formal methods to analyze commercial protocols.