
Analyzing the Limitations of OWASP JuiceShop as a Benchmarking Target for DAST Tools


Bar Hofesh, co-founder of Bright Security, serves as its CTO. A globally recognized security and technology expert, Bar has played many roles, including CISO, system architect, and security and DevSecOps advisor, at over 10 companies. As a leader and researcher, he has multiple publications and projects in cybersecurity. CISO and MCITP certified.
February 27, 2024
7 minutes

Table of Contents

  1. Introduction
  2. The Purpose of Benchmarking
  3. Approaching DAST Testing
  4. Why JuiceShop Falls Short
  5. Conclusion

Introduction

OWASP JuiceShop is a widely used Capture The Flag (CTF) application for penetration testing (PT) teams. It offers a gamified experience built around logical puzzles. While it serves that purpose well, it is not a suitable benchmarking target for Dynamic Application Security Testing (DAST), and in this post we will explain why. Before we dive into the concerns with using JuiceShop as a DAST benchmarking target, let us first define why and how we should approach DAST benchmarking.

The Purpose of Benchmarking

In the realm of DAST, benchmarking involves comparing the performance, capabilities, and efficacy of various tools in identifying and mitigating security vulnerabilities. The primary goal is to select a DAST solution that aligns with the unique requirements and objectives of an organization’s security strategy. As such, we should also make sure the benchmarking target resembles the organization’s end target applications as closely as possible. This is a key reason why selecting very old benchmarking targets built on obsolete technologies, such as DVWA or bWAPP, or targets that do not behave like real-world applications, does not align with the end goal: finding the best tool for the job, with the job being testing the organization’s real-world applications.

Approaching DAST Testing

To extract maximum value from DAST benchmarking, it’s crucial to adopt a comprehensive testing approach. Consider the following key aspects:

a. Ability to Test Modern Technologies: Ensure that the DAST tool supports and effectively tests applications built on modern technologies. Compatibility with diverse tech stacks is vital for addressing the ever-evolving nature of web applications.

Examples of technologies we should ensure are present in a modern benchmark include:

  1. Modern backend languages such as NodeJS, Go, Elixir, etc.
  2. Modern frontend frameworks such as React, Angular, and Vue.js.
  3. Modern architectures: SPA, backend/frontend API communicating over REST/GraphQL.
  4. Dynamic application behavior: JS events, complicated DOM, frontend logic.
  5. Modern stack: PostgreSQL, NoSQL, a modern web server, etc.
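As a quick illustration of why the SPA point matters for a scanner, the tool first has to recognize that it is looking at a client-rendered shell rather than server-rendered HTML. A minimal heuristic sketch in Python (the mount-point markers and the 200-character cutoff are illustrative assumptions, not a production rule):

```python
import re

# Markers that commonly indicate a client-rendered SPA shell;
# this list is illustrative, not exhaustive.
SPA_MARKERS = (
    'id="root"',   # React convention
    'id="app"',    # Vue convention
    "<app-root",   # Angular convention
)

def looks_like_spa(html: str) -> bool:
    """Heuristic: a near-empty HTML body plus a known mount point
    usually means the real content is rendered client-side."""
    has_marker = any(marker in html for marker in SPA_MARKERS)
    # Strip tags; SPA shells ship almost no visible text.
    visible_text = re.sub(r"<[^>]+>", "", html).strip()
    return has_marker and len(visible_text) < 200
```

A scanner that classifies the target this way knows it must drive a real browser (or JS engine) to see the actual attack surface instead of parsing the raw HTML shell.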

b. Modern Vulnerabilities: Evaluate the tool’s proficiency in detecting modern vulnerabilities. The benchmarking process should include testing for threats beyond traditional issues, such as those related to cloud services, microservices, and serverless architectures.

Examples of modern vulnerabilities we should ensure are present in a modern benchmark include:

  1. Cloud resources: AWS S3 issues, Google Storage, Azure Blobs, API key leaks and secrets.
  2. API Security: GraphQL misconfiguration, OWASP API top 10, business constraint issues, business logic issues.
  3. Authorization: JWT issues, privilege escalation issues, access control misconfiguration.

c. Authentication Scenarios: Assess the DAST tool’s capability to handle various authentication mechanisms. Robust testing should encompass scenarios involving single sign-on (SSO), multi-factor authentication (MFA), and other authentication protocols to provide a holistic security assessment.
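A small but telling detail when evaluating this: many modern applications do not signal session expiry with a clean 401, so the scanner needs its own re-authentication heuristic. A simplified sketch of such a check (the login-page markers are assumptions and would need tuning per target):

```python
def needs_reauth(status_code: int, body: str) -> bool:
    """Decide whether the scanner should replay its login flow.

    Standard-conforming apps return 401/403 when a session dies;
    many SPAs instead serve the login view with a 200, so we also
    look for login-form markers in the body (a heuristic, not a rule).
    """
    if status_code in (401, 403):
        return True
    lowered = body.lower()
    return "password" in lowered and ("login" in lowered or "sign in" in lowered)
```

A DAST engine would call this on every response and transparently re-run its configured login recording (form, SSO, or MFA flow) before retrying the request.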

d. Crawling and Discovery: The tool’s ability to thoroughly crawl and discover the application’s attack surface is critical. Effective crawling ensures comprehensive coverage of the application, uncovering hidden vulnerabilities that may escape less sophisticated tools.
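The crawling requirement can be illustrated with a stripped-down link extractor built on the Python standard library; a production crawler adds JS execution, depth limits, and richer scope rules on top of this:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag

class LinkExtractor(HTMLParser):
    """Collect absolute, de-duplicated, in-scope links from one page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href or href.startswith(("mailto:", "javascript:")):
            return
        # Resolve relative links and drop fragments before de-duplicating.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        if absolute.startswith(self.base_url):  # simple same-origin scope rule
            self.links.add(absolute)

# Example run against a snippet of hypothetical markup:
parser = LinkExtractor("https://target.example/")
parser.feed('<a href="/login">Log in</a> <a href="#score-board">Hidden</a>')
```

Note that purely fragment-based or JS-driven navigation (common in SPAs like JuiceShop) yields nothing here, which is exactly why DOM-aware crawling matters.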

e. API and Backend Testing: With the rise of API-centric architectures, a robust DAST tool should extend its testing capabilities to APIs and backend services. Evaluate how well the tool can identify vulnerabilities in API endpoints across different API technologies such as REST, GraphQL, and others. We should also make sure the DAST tool supports multiple ways to map and identify all of the different API endpoints (loading schemas, handling introspection, and allowing editing or manual setup of specific API endpoints).
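For the schema-loading part, the core of “map every endpoint” can be sketched as a walk over an OpenAPI document’s `paths` object (field names follow the OpenAPI 3 specification; error handling is omitted):

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}

def endpoints_from_openapi(spec: dict) -> list[tuple[str, str]]:
    """Flatten an OpenAPI `paths` object into sorted (METHOD, path) pairs.

    Keys in a path item that are not HTTP methods (e.g. `parameters`,
    `summary`) are skipped."""
    found = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            if method.lower() in HTTP_METHODS:
                found.append((method.upper(), path))
    return sorted(found)
```

The same enumeration idea applies to GraphQL via an introspection query, and good tools also let you add endpoints manually when no schema exists.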

Now that we agree on the requirements for an effective benchmark, we need to ensure our benchmark target lets us exercise all of these points. This keeps us as true as possible to the actual targets we will test for the organization: the benchmark should encompass multiple modern vulnerabilities and behave, and be architected, in a way that resembles real-world applications as much as possible.

Why JuiceShop Falls Short

Gamified Approach and Logical Puzzles:

OWASP Juice Shop’s design heavily emphasizes a play-like approach, incorporating logical puzzles that may not align with real-world application security challenges.

One prominent example is the scenario where a user is prompted to “Reset the password of Bjoern’s OWASP account via the Forgot Password mechanism with the truthful answer to his security question.” To solve this scenario, one needs to either watch Bjoern’s OWASP lecture from 2018 to see his playthrough of JuiceShop, or go to his Twitter and scroll until a post about his favorite cat, “Zaya,” happens to come into view.

Another good example is the “Receive a coupon code from the support chatbot” challenge. To win this one, a user needs to “bully” the chatbot, asking again and again for a coupon code until the bot gives up and supplies one.
Many similar “vulnerabilities” have been programmed into JuiceShop. While this makes the application a very fun PT puzzle platform, these issues are hardly in the realm of real-world vulnerabilities, or issues that a DAST tool is expected to find.

Limited Automated Vulnerability Detection:

Certain vulnerabilities within Juice Shop cannot be efficiently detected through automated means. An illustrative example involves extracting security question answers from external sources like YouTube videos. This kind of manual intervention and information retrieval, as demonstrated by Bjoern Kimminich himself in a conference talk, highlights the inherent limitations of automated vulnerability detection in Juice Shop.

Non-Conformity to HTTP Standards:

A major drawback of Juice Shop lies in its non-conformity to HTTP standards. Every page, whether it exists or not, returns a 200 OK status, creating potential confusion for DAST tools that rely on standard status codes for interpretation.

Because the application uses only relative links, every such non-existent URL can endlessly inflate the sitemap if the tool is not configured to handle this situation.
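A scanner can defend against this “everything is 200” behavior with a soft-404 baseline: request a random, almost certainly nonexistent path once, then compare every candidate page body against that baseline. A minimal sketch using standard-library similarity (the 0.9 threshold is an assumption to tune per target):

```python
import difflib
import secrets

def random_probe_path() -> str:
    """A path that almost certainly does not exist on the target."""
    return "/" + secrets.token_hex(16)

def is_soft_404(baseline_body: str, candidate_body: str,
                threshold: float = 0.9) -> bool:
    """Treat a 200 response as a hidden 'not found' page when its body
    is nearly identical to the known-nonexistent baseline body."""
    ratio = difflib.SequenceMatcher(None, baseline_body, candidate_body).ratio()
    return ratio >= threshold
```

With this in place, pages that echo the baseline shell are pruned from the sitemap instead of being treated as newly discovered attack surface.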

Furthermore, the application employs unconventional HTTP response statuses, such as returning 500 Internal Server Error for unauthorized access, a departure from the industry-standard 401 or 403.

Moreover, much has been invested to make the application behave in ways that complicate an automated scanner’s job, so that PT players cannot “cheat” the game using automated tools. This includes other complicated scenarios such as forms that are not really forms: JS events attached to images, and fields that do not open, or are not editable, until an icon is clicked.
One good example can be seen when inspecting the image sources on the main page: each image has multiple event listeners attached, each one triggering a different behavior.

Another good example is the “search” bar, which hides a DOM XSS: the search bar does not exist in the DOM until a click or touch event occurs, at which point the DOM enables it.

Another example is “Directory Listing.” Usually this issue refers to a misconfiguration at the server level that enables browsing directories from the browser. In JuiceShop, instead, the behavior comes from an in-app directory-browsing library that lets you go through the files in a specific folder. This is not what we would classify as “Directory Listing”; it is more of an application feature inside JuiceShop.

There are other examples of behavior that is very human-centric, designed to make sure automated tools have a hard time parsing the target and managing to run scans.

Conclusion

In conclusion, while OWASP Juice Shop provides an engaging platform for PT teams and serves its intended purpose as a gamified CTF application, it falls short as an ideal benchmarking target for DAST tools. Its unique design choices, non-standard HTTP practices, and deliberate anti-automation features pose challenges that diverge from the realistic security scenarios encountered in actual applications. To ensure comprehensive security testing and benchmarking, it is crucial to consider applications that more closely emulate real-world conditions. As the cybersecurity landscape evolves, the need for reliable and realistic benchmarks becomes increasingly vital in fortifying applications against emerging threats.

This is why we should consider proper modern benchmarks like the following:

  1. BrokenCrystals (sources at GitHub: NeuraLegion/brokencrystals) – a broken application, very vulnerable!
  2. DVGA (GitHub: dolevf/Damn-Vulnerable-GraphQL-Application) – an intentionally vulnerable implementation of Facebook’s GraphQL technology, to learn and practice GraphQL security.
  3. vAPI (GitHub: roottusk/vapi) – Vulnerable Adversely Programmed Interface, a self-hostable API that mimics OWASP API Top 10 scenarios through exercises.
  4. crAPI (GitHub: OWASP/crAPI) – completely ridiculous API.
