Secure Web Applications Group

Research Projects

Understanding and Improving Client-Side Web Security

The Web is arguably the most popular platform for information exchange today. To allow for a better user experience, much functionality is shifted towards the client. This shift also increases the complexity of client-side code and hence the attack surface (Stock et al., 2017). This can be exhibited in increased vulnerabilities such as Client-Side Cross-Site Scripting. We therefore develop novels methods to find such flaws at scale (Lekies et al., 2013), (Steffens et al., 2019), (Steffens and Stock, 2020), analyze their nature (Stock et al., 2015), and develop and evaluate potential solutions (Stock et al., 2014), (Musch et al., 2019) Furthermore, our work considers other threat models, such as through the compromise of trusted third-party sites (Meiser et al., 2021).

Content Security Policy

Of particular focus is CSP, a way of mitigating a range of attacks to limit a site's susceptibility to XSS, Clickjacking, or TLS downgrading attacks. Our group has already conducted numerous studies on the subject, ranging from historic analyses of CSP deployment (Roth et al., 2020), inconsistencies in framing control through CSP and XFO (Calzavara et al., 2020) as well as inconsistencies within the same origin (Calzavara et al., 2021) to bypasses through script gadgets (Roth et al., 2020) and the inability of sites to deploy CSP because of their third-party script providers (Steffens et al., 2021).

Usability of Web Security

Although detection of many types of web-based flaws has been in the focus of researchers over the previous years, notifying affected parties barely got any attention. In one research line, we try to identify potential channels for notification and evaluate their effectiveness (Stock et al., 2016). Also, we try to improve not only on technical measures like avoiding spam filters, but also try to understand the human aspects of a notification, such as how different wording might influence the success of a notification (Stock et al., 2018). Furthermore, our group investigates how to improve mechanisms such as CSP to make them more usable.

Malicious JavaScript

With its prevalence in the browser, JavaScript also makes for a prime target for attackers. Therefore, our group researches new ways of detecting malicious JavaScript in the wild. Specifically, this subsumes work in which we automatically generate signatures for exploit kits, alleviating the burden of malware analysists (Stock et al., 2016). In addition, our work focusses on detection of malicious JavaScript in general through methods of machine learning (Fass et al., 2018; Fass et al., 2019) and novel ways of bypassing existing static analysis tools (Fass et al., 2019).

Publications

Peer-reviewed publications with contributions by members of the Secure Web Applications Group:

2021

Nguyen, T.T., Backes, M., Marnau, N., Stock, B., 2021. Share First, Ask Later (or Never?) - Studying Violations of GDPR’s Explicit Consent in Android Apps, in: USENIX Security.
Meiser, G., Laperdrix, P., Stock, B., 2021. Careful Who You Trust: Studying the Pitfalls of Cross-Origin Communication, in: ACM AsiaCCS.
[show abstract] [paper]
In the past, Web applications were mostly static and most of the content was provided by the site itself. Nowadays, they have turned into rich client-side experiences customized for the user where third parties supply a considerable amount of content, e.g., analytics, advertisements, or integration with social media platforms and external services. By default, any exchange of data between documents is governed by the Same-Origin Policy, which only permits to exchange data with other documents sharing the same protocol, host, and port. Given the move to a more interconnected Web, standard bodies and browser vendors have added new mechanisms to enable cross-origin communication, primarily domain relaxation, postMessages, and CORS. While prior work has already shown the pitfalls of not using these mechanisms securely (e.g., omitting origin checks for incoming postMessages), we instead focus on the increased attack surface created by the trust that is necessarily put into the communication partners. To that end, we report on a study of the Tranco Top 5,000 to measure the prevalence of cross-origin communication. By analyzing the interactions between sites, we build an interconnected graph of the trust relations necessary to run the Web. Subsequently, based on this graph, we estimate the damage that can be caused through real-world exploitability of existing client-side XSS flaws.
Moog, M., Demmel, M., Backes, M., Fass, A., 2021. Statically Detecting JavaScript Obfuscation and Minification Techniques in the Wild, in: Dependable Systems and Networks (DSN).
[show abstract]
JavaScript is both a popular client-side programming language and an attack vector. While malware developers transform their JavaScript code to hide its malicious intent and impede detection, well-intentioned developers also transform their code to, e.g., optimize website performance. In this paper, we conduct an in-depth study of code transformations in the wild. Specifically, we perform a static analysis of JavaScript files to build their Abstract Syntax Tree (AST), which we extend with control and data flows. Subsequently, we define two classifiers, benefitting from AST-based features, to detect transformed samples along with specific transformation techniques. Besides malicious samples, we find that transforming code is increasingly popular on Node.js libraries and client-side JavaScript, with, e.g., 90% of Alexa Top 10k websites containing a transformed script. This way, code transformations are no indicator of maliciousness. Finally, we showcase that benign code transformation techniques and their frequency both differ from the prevalent malicious ones.
Agarwal, S., Stock, B., 2021. First, Do No Harm: Studying the manipulation of security headers in browser extensions, in: MADWeb.
[show abstract] [paper]
Browser extensions are add-ons that aim to enhance the functionality of native Web applications on the client side. They intend to provide a rich end-user experience by leveraging feature-rich privileged JavaScript APIs, otherwise inaccessible for native applications. However, numerous large-scale investigations have also reported that extensions often indulge in malicious activities by exploiting access to these privileged APIs such as ad injection, stealing privacy-sensitive data, user fingerprinting, spying user activities on the Web, and malware distribution. In this work, we instead focus on tampering with security headers. To that end, we analyze over 186K Chrome extensions, publicly available on the Chrome Web Store, to detect extensions that actively intercept requests and responses and tamper with their security headers by either injecting, dropping, or modifying them, thereby undermining the security guarantees that these headers typically provide. We propose an automated framework to detect such extensions by leveraging a combination of static and dynamic analysis techniques. We evaluate our proposed methodology by investigating the extensions’ behavior against Tranco Top 100 domains and domains targeted explicitly by the extensions under test and report our findings. We observe that over 2.4K extensions actively tamper with at least one security header, undermining the purpose of the server-delivered, client-enforced security headers.
Steffens, M., Musch, M., Johns, M., Stock, B., 2021. Who’s Hosting the Block Party? Studying Third-Party Blockage of CSP and SRI, in: NDSS.
[show abstract] [paper] [additional content]
The Web has grown into the most widely used application platform for our daily lives. First-party Web applications thrive due to many different third parties they rely on to provide auxiliary functionality, like maps or ads, to their sites. In this paper, we set out to understand to what extent this outsourcing has adverse effects on two key security mechanisms, namely Content Security Policy (CSP; to mitigate XSS) and Subresource Integrity (SRI; to mitigate third-party compromises) by conducting a longitudinal study over 12 weeks on 10,000 top sites. Under the assumption that a first party wants to deploy CSP and SRI and is able to make their code base compliant with these mechanisms, we assess how many sites could fully deploy the mechanisms without cooperation from their third parties. For those unable to do so without cooperation, we also measure how many third parties would jointly have to make their code compliant to enable first-party usage of CSP and SRI. To more accurately depict trust relations, we rely on holistic views into inclusion chains within all pages of the investigated sites. In addition, based on a combination of heuristics and manual validation, we identify different eTLD+1s belonging to the same business entity, allowing us to more accurately discerning parties from each other. Doing so, we show that the vast majority of sites includes third-party code which necessitates the use of unsafe-inline (75%) or unsafe-eval (61%), or makes deployment of strict-dynamic impossible (76%) without breakage of functionality. For SRI, based on the analysis of a single snapshot (within less than 12 hours), we also show that more than half of all sites cannot fully rely on SRI to protect them from third-party compromise due to randomized third-party content.
Calzavara, S., Urban, T., Tatang, D., Steffens, M., Stock, B., 2021. Reining in the Web’s Inconsistencies with Site Policy, in: NDSS.
[show abstract] [paper]
Over the years, browsers have adopted an ever-increasing number of client-enforced security policies deployed by means of HTTP headers. Such mechanisms are fundamental for web application security, and usually deployed on a per-page basis. This, however, enables inconsistencies, as different pages within the same security boundaries (in form of origins or sites) can express conflicting security requirements. In this paper, we formalize inconsistencies for cookie security attributes, CSP and HSTS, and then quantify the magnitude and impact of inconsistencies at scale by crawling 15,000 popular sites. We show numerous sites endanger their own security by omission or misconfiguration of the aforementioned mechanisms, which lead to unnecessary exposure to XSS, cookie theft and HSTS deactivation. We then use our data to analyse to which extent the recent Origin Policy proposal can fix the problem of inconsistencies. Unfortunately, we conclude that the current Origin Policy design suffers from major shortcomings which limit its practical applicability to address security inconsistencies, while catering to the need of real-world sites. Based on these insights, we propose Site Policy, designed to overcome Origin Policy’s shortcomings and make any insecurity explicit. We make a prototype implementation of Site Policy publicly available, along with a support toolchain for initial policy generation, security analysis, and test deployment.

2020

Steffens, M., Stock, B., 2020. PMForce: Systematically Analyzing postMessage Handlers at Scale, in: ACM CCS.
[show abstract] [paper]
The Web has become a platform in which sites rely on intricate interactions that span across the boundaries of origins. While the Same-Origin Policy prevents direct data exchange with documents from other origins, the postMessage API offers one relaxation that allows developers to exchange data across these boundaries. While prior manual analysis could show the presence of issues within postMessage handlers, unfortunately, a steep increase in postMessage usage makes any manual approach intractable.To deal with this increased work load, we set out to automatically find issues in postMessage handlers that allow an attacker to execute code in the vulnerable sites, alter client-side state, or leak sensitive information. To achieve this goal, we present an automated analysis framework running inside the browser, which uses selective forced execution paired with lightweight dynamic taint tracking to find traces in the analyzed handlers that end in sinks allowing for code-execution or state alterations. We use path constraints extracted from the program traces and augment them with Exploit Templates, i.e., additional constraints, ascertaining that a valid assignment that solves all these constraints produces a code-invoking or state-manipulating behavior. Based on these constraints, we use Z3 to generate postMessages aimed at triggering the insecure functionality to prove exploitability, and validate our findings at scale. We use this framework to conduct the most comprehensive experiment studying the security issues of postMessage handlers found throughout the top 100,000 most influential sites yet, which allows us to find potentially exploitable data flows in 252 unique handlers out of which 111 were automatically exploitable.
Roth, S., Backes, M., Stock, B., 2020. Assessing the Impact of Script Gadgets on CSP at Scale, in: ACM AsiaCCS.
[show abstract] [paper]
The Web, as one of the core technologies of modern society, has profoundly changed the way we interact with people and data through social networks or full-fledged office Web applications. One of the worst attacks on the Web is Cross-Site Scripting (XSS), in which an attacker is able to inject their malicious JavaScript code into a Web application, giving this code full access to the victimized site. To mitigate the impact of markup injection flaws that cause XSS, support for the Content Security Policy (CSP) is nowadays shipped in all browsers. Deploying such a policy enables a Web developer to whitelist from where script code can be loaded, essentially constraining the capabilities of the attacker to only be able to execute injected code from said whitelist. As recently shown by Lekies et al., injecting script markup is not a necessary prerequisite for a successful attack in the presence of so-called script gadgets. These small snippets of benign JavaScript code transform non-script markup contained in a page into executable JavaScript, opening the door for bypasses of a deployed CSP. Especially in combination with CSP’s logic in handling redirected resources, script gadgets enable attackers to bypass an otherwise secure policy. In this paper, we therefore ask the question: is deploying CSP in a secure fashion even possible without a priori knowledge of all files hosted on even a partially trusted origin?To answer this question, we investigate the severity of the findings of Lekies et al., showing real-world Web sites on which, even in the presence of CSP and without code containing such gadgets being added by the developer, an attacker can sideload libraries with known script gadgets, as long as the hosting site is whitelisted in the CSP. In combination with the aforementioned redirect logic, this enables us to bypass 10% of otherwise secure CSPs in the wild. To further answer our main research question, we conduct a hypothetical what-if analysis. Doing so, we automatically generate sensible CSPs for all of the Top 10,000 sites and show that around one-third of all sites would still be susceptible to a bypass through script gadget sideloading due to heavy reliance on third parties which also host such libraries.
Calzavara, S., Roth, S., Rabitti, A., Backes, M., Stock, B., 2020. A Tale of Two Headers: A Formal Analysis of Inconsistent Click-Jacking Protection on the Web, in: USENIX Security.
[show abstract] [paper]
Click-jacking protection on the modern Web is commonly enforced via client-side security mechanisms for framing control, like the X-Frame-Options header (XFO) and Content Security Policy (CSP). Though these client-side security mechanisms are certainly useful and successful, delegating protection to web browsers opens room for inconsistencies in the security guarantees offered to users of different browsers. In particular, inconsistencies might arise due to the lack of support for CSP and the different implementations of the underspecified XFO header. In this paper, we formally study the problem of inconsistencies in framing control policies across different browsers and we implement an automated policy analyzer based on our theory, which we use to assess the state of click-jacking protection on the Web. Our analysis shows that 10% of the (distinct) framing control policies in the wild are inconsistent and most often do not provide any level of protection to at least one browser. We thus propose recommendations for web developers and browser vendors to mitigate this issue. Finally, we design and implement a server-side proxy to retrofit security in web applications.
Roth, S., Barron, T., Calzavara, S., Nikiforakis, N., Stock, B., 2020. Complex Security Policy? A Longitudinal Analysis of Deployed Content Security Policies, in: NDSS.
[show abstract] [paper]
The Content Security Policy (CSP) mechanism was developed as a mitigation against script injection attacks in 2010. In this paper, we leverage the unique vantage point of the Internet Archive to conduct a historical and longitudinal analysis of how CSP deployment has evolved for a set of 10,000 highly ranked domains. In doing so, we document the long-term struggle site operators face when trying to roll out CSP for content restriction and highlight that even seemingly secure whitelists can be bypassed through expired or typo domains. Next to these new insights, we also shed light on the usage of CSP for other use cases, in particular, TLS enforcement and framing control. Here, we find that CSP can be easily deployed to fit those security scenarios, but both lack wide-spread adoption. Specifically, while the underspecified and thus inconsistently implemented X-Frame-Options header is increasingly used on the Web, CSP’s well-specified and secure alternative cannot keep up. To understand the reasons behind this, we run a notification campaign and subsequent survey, concluding that operators have often experienced the complexity of CSP (and given up), utterly unaware of the easy-to-deploy components of CSP. Hence, we find the complexity of secure, yet functional content restriction gives CSP a bad reputation, resulting in operators not leveraging its potential to secure a site against the non-original attack vectors.

2019

Fass, A., Backes, M., Stock, B., 2019. JStap: A Static Pre-Filter for Malicious JavaScript Detection, in: ACSAC.
[show abstract] [paper] [slides]
Given the success of the Web platform, attackers have abused its main programming language, namely JavaScript, to mount different types of attacks on their victims. Due to the large volume of such malicious scripts, detection systems rely on static analyses to quickly process the vast majority of samples. These static approaches are not infallible though and lead to misclassifications. Also, they lack semantic information to go beyond purely syntactic approaches. In this paper, we propose JStap, a modular static JavaScript detection system, which extends the detection capability of existing lexical and AST-based pipelines by also leveraging control and data flow information. Our detector is composed of ten modules, including five different ways of abstracting code, with differing levels of context and semantic information, and two ways of extracting features. Based on the frequency of these specific patterns, we train a random forest classifier for each module. In practice, JStap outperforms existing systems, which we reimplemented and tested on our dataset totaling over 270,000 samples. To improve the detection, we also combine the predictions of several modules. A first layer of unanimous voting classifies 93% of our dataset with an accuracy of 99.73%, while a second layer–based on an alternative modules’ combination–labels another 6.5% of our initial dataset with an accuracy over 99%. This way, JStap can be used as a precise pre-filter, meaning that it would only need to forward less than 1% of samples to additional analyses. For reproducibility and direct deployability of our modules, we make our system publicly available.
Fass, A., Backes, M., Stock, B., 2019. HideNoSeek: Camouflaging Malicious JavaScript in Benign ASTs, in: CCS.
[show abstract] [paper] [slides]
In the malware field, learning-based systems have become popular to detect new malicious variants. Nevertheless, it has been shown that attackers with specific and internal knowledge of a target system may be able to produce input samples which are misclassified. In practice, the assumption of strong attackers is not realistic as it implies access to insider information. We instead propose HideNoSeek, a novel and generic camouflage attack, which evades the entire class of detectors based on syntactic features, without needing any information about the system it is trying to evade. Our attack consists of changing the constructs of a malicious JavaScript sample to imitate a benign syntax. In particular, HideNoSeek uses malicious seeds and searches for similarities at the Abstract Syntax Tree (AST) level between the seeds and traditional benign scripts. Specifically, it replaces benign sub-ASTs by identical malicious ones and adjusts the benign data dependencies–without changing the AST–, so that the malicious semantics is kept after execution. In practice, we are able to generate 91,020 malicious scripts from 22 malicious seeds and 8,279 benign web pages. In addition, we can hide on average 14 malicious samples in a benign AST of the Alexa top 10, and 13 in each of the five most popular JavaScript libraries. In particular, a standard trained classifier has over 99.7% false-negatives with HideNoSeek inputs, while a classifier trained on such samples has over 96% false-positives, rendering the targeted static detectors unreliable.
Musch, M., Steffens, M., Roth, S., Stock, B., Johns, M., 2019. ScriptProtect: Mitigating Unsafe Third-Party JavaScript Practices, in: ACM AsiaCCS.
[show abstract] [paper]
The direct client-side inclusion of cross-origin JavaScript resources in Web applications is a pervasive practice to consume third-party services and to utilize externally provided libraries. The downside of this practice is that such external code runs in the same context and with the same privileges as the first-party code. Thus, all potential security problems in the code directly affect the including site. To explore this problem, we present an empirical study which shows that more than 25% of all sites affected by Client-Side Cross-Site Scripting are only vulnerable due to a flaw in the included third-party code. Motivated by this finding, we propose ScriptProtect, a non-intrusive transparent protective measure to address security issues introduced by external script resources. ScriptProtect automatically strips third-party code from the ability to conduct unsafe string-to-code conversions. Thus, it effectively removes the root-cause of Client-Side XSS without affecting first-party code in this respective. As ScriptProtect is realized through a lightweight JavaScript instrumentation, it does not require changes to the browser and only incurs a low runtime overhead of about 6%. We tested its compatibility on the Alexa Top 5,000 and found that 30% of these sites could benefit from ScriptProtect’s protection today without changes to their application code.
Laperdrix, P., Avoine, G., Baudry, B., Nikiforakis, N., 2019. Morellian analysis for browsers: Making web authentication stronger with canvas fingerprinting, in: DIMVA.
[show abstract]
In this paper, we present the first fingerprinting-based authentication scheme that is not vulnerable to trivial replay attacks. Our proposed canvas-based fingerprinting technique utilizes one key characteristic: it is parameterized by a challenge, generated on the server side. We perform an in-depth analysis of all parameters that can be used to generate canvas challenges, and we show that it is possible to generate unique, unpredictable, and highly diverse canvas-generated images each time a user logs onto a service. With the analysis of images collected from more than 1.1 million devices in a real-world large-scale experiment, we evaluate our proposed scheme against a large set of attack scenarios and conclude that canvas fingerprinting is a suitable mechanism for stronger authentication on the web.
Steffens, M., Rossow, C., Johns, M., Stock, B., 2019. Don’t Trust The Locals: Investigating the Prevalence of Persistent Client-Side Cross-Site Scripting in the Wild, in: NDSS.
[show abstract] [paper]
The Web has become highly interactive and an important driver for modern life, enabling information retrieval, social exchange, and online shopping. From the security perspective, Cross-Site Scripting (XSS) is one of the most nefarious attacks against Web clients. Research has long since focussed on three categories of XSS: reflected, persistent, and DOM-based XSS. In this paper, we argue that our community must consider at least four important classes of XSS and present the first systematic study of the threat of Persistent Client-Side XSS, caused by the insecure usage of client-side storages. While the existence of this class has been acknowledged, especially by the non-academic community like OWASP, prior works have either only found such flaws as side effects of other analyses or focused on a limited set of applications to analyze. Therefore, the community lacks in-depth knowledge about the actual prevalence of Persistent Client-Side XSS in the wild. To close this research gap, we leverage taint tracking to identify suspicious flows from client-side persistent storage (Web Storage, cookies) to dangerous sinks (HTML, JavaScript, and script.src). We discuss two attacker models capable of injecting malicious payloads into these storages, i.e., a network attacker capable of temporarily hijacking HTTP communication (e.g., in a public WiFi), and a Web attacker who can leverage flows into storage or an existing reflected XSS flaw to persist their payload. With our taint-aware browser and these models in mind, we study the prevalence of Persistent Client-Side XSS in the Alexa Top 5,000 domains. We find that more than 8% of them have unfiltered data flows from persistent storages to a dangerous sink, which showcases the developers’ inherent trust in the integrity of storage content. Even worse, if we only consider sites that make use of data originating from storages, 21% of the sites are vulnerable. For those sites with vulnerable flaws from storage to sink, we find that at least 70% are directly exploitable by our attacker models. Finally, investigating the vulnerable flows originating from storages allows us to categorize them into four disjoint categories and propose appropriate mitigations.

2018

Fass, A., Krawczyk, R., Backes, M., Stock, B., 2018. JaST: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript, in: DIMVA.
[show abstract] [paper]
JavaScript is a browser scripting language initially created to enhance the interactivity of web sites and to improve their user-friendliness. However, as it offloads the work to the user’s browser, it can be used to engage in malicious activities such as Crypto-Mining, Drive-by-Download attacks, or redirections to web sites hosting malicious software. Given the prevalence of such nefarious scripts, the anti-virus industry has increased the focus on their detection. The attackers, in turn, make increasing use of obfuscation techniques, so as to hinder analysis and the creation of corresponding signatures. Yet these malicious samples share syntactic similarities at an abstract level, which enables to bypass obfuscation and detect even unknown malware variants. In this paper, we present JaSt, a low-overhead solution that combines the extraction of features from the abstract syntax tree with a random forest classifier to detect malicious JavaScript instances. It is based on a frequency analysis of specific patterns, which are either predictive of benign or of malicious samples. Even though the analysis is entirely static, it yields a high detection accuracy of almost 99.5% and has a low false-negative rate of 0.54%.
Stock, B., Pellegrino, G., Li, F., Backes, M., Rossow, C., 2018. Didn’t you hear me? — Towards more successful Web Vulnerability Notifications, in: NDSS.
[show abstract] [paper] [slides]
After treating the notification of affected parties as mere side-notes in research, our community has recently put more focus on how vulnerability disclosure can be conducted at scale. The first works in this area have shown that while notifications are helpful to a significant fraction of operators, the vast majority of systems remain unpatched. In this paper, we build on these previous works, aiming to understand why the effects are not more significant. To that end, we report on a notification experiment targeting more than 24,000 domains, which allowed us to analyze what technical and human aspects are roadblocks to a successful campaign. As part of this experiment, we explored potential alternative notification channels beyond email, including social media and phone. In addition, we conducted an anonymous survey with the notified operators, investigating their perspectives on our notifications. We show the pitfalls of email-based communications, such as the impact of anti-spam filters, the lack of trust by recipients, and hesitations to fix vulnerabilities despite awareness. However, our exploration of alternative communication channels did not suggest a more promising medium. Seeing these results, we pinpoint future directions in improving security notifications.

2017

Stock, B., Johns, M., Steffens, M., Backes, M., 2017. How the Web Tangled Itself: Uncovering the History of Client-Side Web (In)Security, in: USENIX Security.
[show abstract] [paper] [slides] [additional content]
While in its early days, the Web was mostly static, it has organically grown into a full-fledged technology stack. This evolution has not followed a security blueprint, resulting in many classes of vulnerabilities specific to the Web. Even though the server-side code of the past has long since vanished, the Internet Archive gives us a unique view on the historical development of the Web’s client side and its (in)security. Uncovering the insights which fueled this development bears the potential to not only gain a historical perspective on client-side Web security, but also to outline better practices going forward. To that end, we examined the code and header information of the most important Web sites for each year between 1997 and 2016, amounting to 659,710 different analyzed Web documents. From the archived data, we first identify key trends in the technology deployed on the client, such as the increasing complexity of client-side Web code and the constant rise of multi-origin appli- cation scenarios. Based on these findings, we then assess the advent of corresponding vulnerability classes, investigate their prevalence over time, and analyze the security mechanisms developed and deployed to mitigate them. Correlating these results allows us to draw a set of overarching conclusions: Along with the dawn of JavaScript-driven applications in the early years of the millennium, the likelihood of client-side injection vulnerabilities has risen. Furthermore, there is a noticeable gap in adoption speed between easy-to-deploy security headers and more involved measures such as CSP. But there is also no evidence that the usage of the easy-to- deploy techniques reflects on other security areas. On the contrary, our data shows for instance that sites that use HTTPonly cookies are actually more likely to have a Cross-Site Scripting problem. Finally, we observe that the rising security awareness and introduction of dedicated security technologies had no immediate impact on the overall security of the client-side Web.
Backes, M., Rieck, K., Skoruppa, M., Stock, B., Yamaguchi, F., 2017. Efficient and Flexible Discovery of PHP Application Vulnerabilities, in: Euro S&P.
[show abstract] [paper]
The Web today is a growing universe of pages and applications teeming with interactive content. The security of such applications is of the utmost importance, as exploits can have a devastating impact on personal and economic levels. The number one programming language in Web applications is PHP, powering more than 80% of the top ten million websites. Yet it was not designed with security in mind, and, today, bears a patchwork of fixes and inconsistently designed functions with often unexpected and hardly predictable behavior that typically yield a large attack surface. Consequently, it is prone to different types of vulnerabilities, such as SQL Injection or Cross-Site Scripting. In this paper, we present an interprocedural analysis technique for PHP applications based on code property graphs that scales well to large amounts of code and is highly adaptable in its nature. We implement our prototype using the latest features of PHP 7, leverage an efficient graph database to store code property graphs for PHP, and subsequently identify different types of Web application vulnerabilities by means of programmable graph traversals. We show the efficacy and the scalability of our approach by reporting on an analysis of 1,854 popular open-source projects, comprising almost 80 million lines of code.

2016

Backes, M., Holz, T., Rossow, C., Rytilahti, T., Simeonovski, M., Stock, B., 2016. On the Feasibility of TTL-based Filtering for DRDoS Mitigation, in: RAID.
[show abstract] [paper]
One of the major disturbances for network providers in recent years have been Distributed Reflective Denial-of-Service (DRDoS) attacks. In such an attack, the attacker spoofs the IP address of a victim and sent a flood of tiny packets to vulnerable services which then respond with much larger replies to the victim. Led by the idea that the attacker cannot fabricate the number of hops between the amplifier and the victim, Hop Count Filtering (HCF) mechanisms that analyze the Time to Live of incoming packets have been proposed as a solution. In this paper, we evaluate the feasibility of using Hop Count Filtering to mitigate DRDoS attacks. To that end, we detail how a server can use active probing to learn TTLs of alleged packet senders. Based on data sets of benign and spoofed NTP requests, we find that a TTL-based defense could block over 75% of spoofed traffic, while allowing 85% of benign traffic to pass. To achieve this performance, however, such an approach must allow for a tolerance of +/-2 hops. Motivated by this, we investigate the tacit assumption that an attacker cannot learn the correct TTL value. By using a combination of tracerouting and BGP data, we build statistical models which allow to estimate the TTL within that tolerance level. We observe that by wisely choosing the used amplifiers, the attacker is able to circumvent such TTL-based defenses. Finally, we argue that any (current or future) defensive system based on TTL values can be bypassed in a similar fashion, and find that future research must be steered towards more fundamental solutions to thwart any kind of IP spoofing attacks.
Stock, B., Pellegrino, G., Rossow, C., Johns, M., Backes, M., 2016. Hey, You Have a Problem: On the Feasibility of Large-Scale Web Vulnerability Notification, in: USENIX Security.
[show abstract] [paper] [slides] [additional content]
Large-scale discovery of thousands of vulnerable Web sites has become a frequent event, thanks to recent advances in security research and the rise in maturity of Internet-wide scanning tools. The issues related to disclosing the vulnerability information to the affected parties, however, have only been treated as a side note in prior research. In this paper, we systematically examine the feasibility and efficacy of large-scale notification campaigns. For this, we comprehensively survey existing communication channels and evaluate their usability in an automated notification process. Using a data set of over 44,000 vulnerable Web sites, we measure success rates, both with respect to the total number of fixed vulnerabilities and to reaching responsible parties, with the following high-level results: Although our campaign had a statistically significant impact compared to a control group, the increase in the fix rate of notified domains is marginal. If a notification report is read by the owner of the vulnerable application, the likelihood of a subsequent resolution of the issues is sufficiently high: about 40%. But, out of 35,832 transmitted vulnerability reports, only 2,064 (5.8%) were actually received successfully, resulting in an unsatisfactory overall fix rate, leaving 74.5% of Web applications exploitable after our month-long experiment. Thus, we conclude that currently no reliable notification channels exist, which significantly inhibits the success and impact of large-scale notification.
Stock, B., Livshits, B., Zorn, B., 2016. Kizzle: A Signature Compiler for Detecting Exploit Kits, in: DSN.
[show abstract] [paper] [slides]
In recent years, the drive-by malware space has undergone significant consolidation. Today, the most common source of drive-by downloads are socalled exploit kits (EKs). This paper presents Kizzle, the first prevention technique specifically designed for finding exploit kits. Our analysis shows that while the JavaScript delivered by kits varies greatly, the unpacked code varies much less, due to the kits authors’ code reuse between versions. Ironically, this well-regarded software engineering practice allows us to build a scalable and precise detector that is able to quickly respond to superficial but frequent changes in EKs. Kizzle is able to generate anti-virus signatures for detecting EKs, which compare favorably to manually created ones. Kizzle is highly responsive and can generate new signatures within hours. Our experiments show that Kizzle produces high-accuracy signatures. When evaluated over a four-week period, false-positive rates for Kizzle are under 0.03%, while the false-negative rates are under 5%.

2015

Stock, B., Kaiser, B., Pfistner, S., Lekies, S., Johns, M., 2015. From Facepalm to Brain Bender: Exploring Client-Side Cross-Site Scripting, in: CCS.
[show abstract] [paper]
Although studies have shown that at least one in ten Web pages contains a client-side XSS vulnerability, the prevalent causes for this class of Cross-Site Scripting have not been studied in depth. Therefore, in this paper, we present a large-scale study to gain insight into these causes. To this end, we analyze a set of 1,273 real-world vulnerabilities contained on the Alexa Top 10k domains using a specifically designed architecture, consisting of an infrastructure which allows us to persist and replay vulnerabilities to ensure a sound analysis. In combination with a taint-aware browsing engine, we can therefore collect important execution trace information for all flaws. Based on the observable characteristics of the vulnerable JavaScript, we derive a set of metrics to measure the complexity of each flaw. We subsequently classify all vulnerabilities in our data set accordingly to enable a more systematic analysis. In doing so, we find that although a large portion of all vulnerabilities have a low complexity rating, several incur a significant level of complexity and are repeatedly caused by vulnerable third-party scripts. In addition, we gain insights into other factors related to the existence of client-side XSS flaws, such as missing knowledge of browser-provided APIs, and find that the root causes for Client-Side Cross-Site Scripting range from unaware developers to incompatible first- and third-party code.
Lekies, S., Stock, B., Wentzel, M., Johns, M., 2015. The Unexpected Dangers of Dynamic JavaScript, in: USENIX Security.
[show abstract] [paper] [additional content]
Modern Web sites frequently generate JavaScript on-the- fly via server-side scripting, incorporating personalized user data in the process. In general, cross-domain access to such sensitive resources is prevented by the Same-Origin Policy. The inclusion of remote scripts via the HTML script tag, however, is exempt from this policy. This exemption allows an adversary to import and execute dynamically generated scripts while a user visits an attacker-controlled Web site. By observing the execution behavior and the side effects the inclusion of the dynamic script causes, the attacker is able to leak private user data leading to severe consequences ranging from privacy violations up to full compromise of user accounts. Although this issues has been known for several years under the term Cross-Site Script Inclusion, it has not been analyzed in-depth on the Web. Therefore, to systematically investigate the issue, we conduct a study on its prevalence in a set of 150 top-ranked domains. We observe that a third of the surveyed sites utilize dynamic JavaScript. After evaluating the effectiveness of the deployed countermeasures, we show that more than 80% of the sites are susceptible to attacks via remote script inclusion. Given the results of our study, we provide a secure and functionally equivalent alternative to the use of dynamic scripts.

2014

Stock, B., Lekies, S., Mueller, T., Spiegel, P., Johns, M., 2014. Precise Client-side Protection against DOM-based Cross-Site Scripting, in: USENIX Security.
[show abstract] [paper] [slides] [additional content]
The current generation of client-side Cross-Site Scripting filters rely on string comparison to detect request values that are reflected in the corresponding response’s HTML. This coarse approximation of occurring data flows is incapable of reliably stopping attacks which leverage nontrivial injection contexts. To demonstrate this, we conduct a thorough analysis of the current state-of-the-art in browser-based XSS filtering and uncover a set of conceptual shortcomings, that allow efficient creation of filter evasions, especially in the case of DOM-based XSS. To validate our findings, we report on practical experiments using a set of 1,602 real-world vulnerabilities, achieving a rate of 73% successful filter bypasses. Motivated by our findings, we propose an alternative filter design for DOM-based XSS, that utilizes runtime taint tracking and taint-aware parsers to stop the parsing of attacker-controlled syntactic content. To examine the efficiency and feasibility of our approach, we present a practical implementation based on the open source browser Chromium. Our proposed approach has a low false positive rate and robustly protects against DOM-based XSS exploits.
Stock, B., Johns, M., 2014. Protecting Users Against XSS-based Password Manager Abuse, in: ACM AsiaCCS.
[show abstract] [paper] [slides]
To ease the burden of repeated password authentication on multiple sites, modern Web browsers provide password managers, which offer to automatically complete password fields on Web pages, after the password has been stored once. Unfortunately, these managers operate by simply inserting the clear-text password into the document’s DOM, where it is accessible by JavaScript. Thus, a successful Cross-site Scripting attack can be leveraged by the attacker to read and leak password data which has been provided by the password manager. In this paper, we assess this potential threat through a thorough survey of the current password manager generation and observable characteristics of password fields in popular Web sites. Furthermore, we propose an alternative password manager design, which robustly prevents the identified attacks, while maintaining compatibility with the established functionality of the existing approaches.
Stock, B., Lekies, S., Johns, M., 2014. DOM-basiertes Cross-Site Scripting im Web: Reise in ein unerforschtes Land., in: GI Sicherheit.
[show abstract] [paper]
Cross-site Scripting (XSS) ist eine weit verbreitete Verwundbarkeitsklasse in Web-Anwendungen und kann sowohl von server-seitigem als auch von client-seitigem Code verursacht werden. Allerdings wird XSS primaer als ein server-seitiges Problem wahrgenommen, motiviert durch das Offenlegen von zahlreichen entsprechenden XSS-Schwachstellen. In den letzten Jahren kann jedoch eine zunehmende Verlagerung von Anwendungslogik in den Browser beobachtet werden; eine Entwicklung, die im Rahmen des sogenannten Web 2.0 begonnen hat. Dies legt die Vermutung nahe, dass auch client-seitiges XSS an Bedeutung gewinnen koennte. In diesem Beitrag stellen wir eine umfassende Studie vor, in der wir, mittels eines voll-automatisierten Ansatzes, die fuehrenden 5000 Webseiten des Alexa Indexes auf DOM-basiertes XSS untersucht haben. Im Rahmen dieser Studie, konnten wir 6.167 derartige Verwundbarkeiten identifizieren, die sich auf 480 der untersuchten Anwendungen verteilen.

2013

Lekies, S., Stock, B., Johns, M., 2013. 25 Million Flows Later - Large-scale Detection of DOM-based XSS, in: CCS.
[show abstract] [paper] [slides]
In recent years, the Web witnessed a move towards sophisticated client-side functionality. This shift caused a significant increase in complexity of deployed JavaScript code and thus, a proportional growth in potential client-side vulnerabilities, with DOM-based Cross-site Scripting being a high impact representative of such security issues. In this paper, we present a fully automated system to detect and validate DOM-based XSS vulnerabilities, consisting of a taint-aware JavaScript engine and corresponding DOM implementation as well as a context-sensitive exploit generation approach. Using these components, we conducted a large-scale analysis of the Alexa top 5000. In this study, we identified 6,167 unique vulnerabilities distributed over 480 domains, showing that 9,6% of the examined sites carry at least one DOM- based XSS problem.
Johns, M., Lekies, S., Stock, B., 2013. Eradicating DNS Rebinding with the Extended Same-Origin Policy, in: USENIX Security.
[show abstract] [paper] [additional content]
The Web’s principal security policy is the Same-Origin Policy (SOP), which enforces origin-based isolation of mutually distrusting Web applications. Since the early days, the SOP was repeatedly undermined with variants of the DNS Rebinding attack, allowing untrusted script code to gain illegitimate access to protected network resources. To counter these attacks, the browser vendors introduced countermeasures, such as DNS Pinning, to mitigate the attack. In this paper, we present a novel DNS Rebinding attack method leveraging the HTML5 Application Cache. Our attack allows reliable DNS Rebinding attacks, circumventing all currently deployed browser-based defense measures. Furthermore, we analyze the fundamental problem which allows DNS Rebinding to work in the first place: The SOP’s main purpose is to ensure security boundaries of Web servers. However, the Web servers themselves are only indirectly involved in the corresponding security decision. Instead, the SOP relies on information obtained from the domain name system, which is not necessarily controlled by the Web server’s owners. This mismatch is exploited by DNS Rebinding. Based on this insight, we propose a light-weight extension to the SOP which takes Web server provided information into account. We successfully implemented our extended SOP for the Chromium Web browser and report on our implementation’s interoperability and security properties.