What is GraphQL?
Simply put, GraphQL is a query language for APIs. It’s most often used for communication between a client and a server. Its biggest advantage is bandwidth efficiency: the client describes exactly the data it needs against a typed schema and receives it in a single query.
However, given how widely it is used, GraphQL is also exposed to a range of vulnerabilities, and you’ll want to be sure that your queries are well protected, as these issues can open serious security holes in your app.
In this article:
- GraphQL Security Challenges
- The 5 Most Common GraphQL Security Vulnerabilities
- GraphQL Security Best Practices
- GraphQL Security with Bright
GraphQL Security Challenges
If implemented properly, GraphQL is an extremely elegant methodology for data retrieval. GraphQL offers more back-end stability and increased query efficiency.
Please note the phrase “if implemented properly.” The problem is that many teams adopt GraphQL without considering what it means for their system and what security implications come with it.
With GraphQL, security concerns have changed. Thanks to the architectural differences and nuances, some security concerns have gone away, but others have been amplified.
In this article, we are going to cover the security concerns that an API system supporting GraphQL should acknowledge.
The 5 Most Common GraphQL Security Vulnerabilities
1. Inconsistent Authorization Checks
When assessing GraphQL-based applications, flaws in authorization logic are a common issue. While GraphQL helps implement proper data validation, API developers are left to implement authentication and authorization on their own. The multiple “layers” of resolvers used in GraphQL APIs add complexity, since authorization checks are required both in query-level resolvers and in resolvers that load additional data.
Generally, we see two types of authorization flaws in GraphQL APIs. The first and most common is seen when authorization is handled directly by resolvers at the GraphQL API layer. Authorization checks then need to be performed separately in each location to prevent an exploitable authorization flaw. This is compounded as the API schema grows more complex and more distinct resolvers become responsible for access control to the same data.
In our demo API example below, there are several ways to retrieve a listing of Post objects: a client can retrieve a list of users, retrieve public posts, or simply retrieve a post by its numeric ID. For example, the following query might be used to read all of the currently logged-in user’s posts:
query ReadMyPosts {
  # "me" returns the current user
  me {
    # then, resolve the posts
    posts {
      # finally, return the content
      # and whether this is a public post or not.
      public
      content
    }
  }
}
However, each of these paths to a post has its own logic for checking access. In particular, the code that retrieves a post by its ID (for example, the GetPostById function in lib/gql/types/post.ts of the source repository) has no authorization checks in place. This allows an attacker to perform the GraphQL equivalent of a traditional insecure direct object reference attack and retrieve any post they want, whether it is public or private. Our database assigns Post object IDs in ascending order, so they are easy to guess:
query ReadPost {
  # we shouldn't be able to read post "1"
  post(id: 1) {
    public
    content
  }
}
The example might seem simple, but similar issues are often found in real-world GraphQL deployments. A similar problem was recently disclosed to the HackerOne bug bounty program, where an attacker was able to read the email addresses of users they had sent an invitation to by username. (The intended behavior is to only allow access to the email address if it was originally used to create the invitation object.)
GraphQL documentation provides guidance on performing authorization safely. The advice is simple: instead of performing authorization logic inside resolver functions, all of it should be performed by the business-logic layer underneath. This results in all authorization checks being performed in one location, which makes applying constraints easier and more consistent.
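As a rough illustration of that advice, the sketch below keeps the access rule in a single function inside a business-logic service, so that every path to a post passes through the same check. The types, the in-memory data, and the names postService and canReadPost are hypothetical and are not taken from the demo repository.

```typescript
// Hypothetical types for illustration; not taken from the demo repository.
interface User { id: number; }
interface Post { id: number; authorId: number; public: boolean; content: string; }

// In-memory stand-in for the database.
const posts: Post[] = [
  { id: 1, authorId: 1, public: false, content: "private draft" },
  { id: 2, authorId: 1, public: true, content: "hello world" },
];

// The single place where the access rule for posts lives.
function canReadPost(viewer: User | null, post: Post): boolean {
  return post.public || (viewer !== null && viewer.id === post.authorId);
}

// Business-logic layer: every lookup path goes through canReadPost.
const postService = {
  getById(viewer: User | null, id: number): Post | null {
    const post = posts.find((p) => p.id === id);
    return post !== undefined && canReadPost(viewer, post) ? post : null;
  },
  listByAuthor(viewer: User | null, authorId: number): Post[] {
    return posts.filter((p) => p.authorId === authorId && canReadPost(viewer, p));
  },
};

// Resolvers stay thin and contain no authorization logic of their own.
const resolvers = {
  post: (_root: unknown, args: { id: number }, context: { user: User | null }) =>
    postService.getById(context.user, args.id),
};
```

With this shape, a new resolver that exposes posts through another path only needs to call the service; it cannot forget the check, because the check is not its responsibility.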
2. REST Proxies Allow Attacks on Underlying APIs
To adapt an existing REST API for GraphQL clients, you will usually begin the transition by implementing the new GraphQL interface as a thin proxy layer on top of internal REST APIs. In a very simple implementation, the API resolver will simply “translate” requests to the REST API format, and the response will be formatted in a way that the GraphQL client can understand.
For example, the resolver for user(id: 1) could be implemented in the GraphQL proxy layer by making a request to GET /api/users/1 on the backend API. If this is implemented unsafely, an attacker is able to modify the path or parameters sent to the backend API, which presents a limited form of SSRF. If the attacker provides the ID 1/delete, the GraphQL proxy layer might instead request GET /api/users/1/delete with its own credentials, which is not the intended request. One could say this is not an ideal REST API design, but similar scenarios are not uncommon in real-world implementations.
We implemented the getAsset resolver in the following manner:
getAsset: {
  type: GraphQLString,
  args: {
    name: {
      type: GraphQLString
    }
  },
  resolve: async (_root, args, _context) => {
    // Vulnerable: args.name is placed in the URL without any validation.
    const filename = args.name;
    const results = await axios.get(`http://localhost:8081/assets/${filename}`);
    return results.data;
  }
}
In this instance, the resolver takes the name of the asset we’re requesting and appends it directly to the path of the backend service. Nothing restricts the name to a plain file name, so it can contain path traversal sequences that point outside the assets directory. Using the query below, we retrieve the secret file in the root directory:
query ReadSecretFile {
  getAsset(name: "../secret")
}
To protect against this type of vulnerability, validate any parameter that is passed on to another service. One option is to have the GraphQL schema type validator require a number for the file name, if numbers are the only valid inputs for this request. Alternatively, you can implement your own validation of input values: GraphQL will validate the types, but format validation is left to you. A custom scalar type can be used to apply custom validation rules to a commonly used type.
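A minimal sketch of such format validation is shown below, assuming asset names are plain file names. The allow-list pattern and the buildAssetUrl helper are hypothetical; the backend URL is the one from the resolver above.

```typescript
// Hypothetical allow-list: letters, digits, dot, underscore, and hyphen only.
const ASSET_NAME = /^[A-Za-z0-9._-]+$/;

function buildAssetUrl(name: string): string {
  // Reject anything containing a slash, plus the special names "." and "..".
  if (!ASSET_NAME.test(name) || name === "." || name === "..") {
    throw new Error("Invalid asset name");
  }
  // Encode anyway, so the value can only ever occupy a single path segment.
  return `http://localhost:8081/assets/${encodeURIComponent(name)}`;
}
```

The resolver would then request buildAssetUrl(args.name) instead of interpolating args.name directly, so a value such as "../secret" is rejected before any backend request is made.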
3. Missing Validation of Custom Scalars
At its leaves, the data GraphQL works with is made up of scalar values, whether it is input data or returned output. There are five built-in scalar types: Int, Float, String, Boolean, and ID.
As a developer, however, you can create your own custom scalar types, for example, a date and time type.
While this is very useful, care is required, as the responsibility for validating and sanitizing user input for a custom type lies with you. If you’re using JavaScript, for example, you would implement the scalar’s parseValue and parseLiteral functions to reject malformed input.
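The sketch below shows what those functions might look like for a date and time scalar. It is written as a plain configuration object so that it stands alone; with graphql-js, an object of this shape would be passed to the GraphQLScalarType constructor. The ValueNode interface is a minimal stand-in for the AST node that the library hands to parseLiteral, and the name dateTimeScalarConfig is hypothetical.

```typescript
// Minimal stand-in for the AST node a GraphQL library passes to parseLiteral.
interface ValueNode { kind: string; value?: string; }

// Shared validation: accept only strings that parse to a valid date.
function toDate(input: unknown): Date {
  if (typeof input !== "string") {
    throw new TypeError("DateTime must be a string");
  }
  const date = new Date(input);
  if (isNaN(date.getTime())) {
    throw new TypeError(`Invalid DateTime: ${input}`);
  }
  return date;
}

const dateTimeScalarConfig = {
  name: "DateTime",
  // Outgoing values: Date -> ISO 8601 string.
  serialize(value: unknown): string {
    if (!(value instanceof Date)) {
      throw new TypeError("Expected a Date");
    }
    return value.toISOString();
  },
  // Values supplied through query variables.
  parseValue(value: unknown): Date {
    return toDate(value);
  },
  // Values written inline in the query text.
  parseLiteral(ast: ValueNode): Date {
    if (ast.kind !== "StringValue") {
      throw new TypeError("DateTime must be a string literal");
    }
    return toDate(ast.value);
  },
};
```

Note that the Date constructor is lenient about formats, so a production scalar would probably also enforce a strict pattern before parsing.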
You may also want to be careful with GraphQL libraries that provide ready-made, loosely typed scalars, as these can create vulnerabilities in your application. While such a type is easier to use, it skips the validation a strict schema would give you, which causes problems you want to avoid as a security-conscious developer.
For example, in the code below, we’ve used the GraphQLJSON type from the graphql-json library as the input of a password reset mutation.
export const PasswordReset: GraphQLFieldConfig<any, any, any> = {
  type: UserType,
  args: {
    input: {
      // Vulnerable: GraphQLJSON accepts arbitrary JSON, so no field is type-checked.
      type: GraphQLJSON
    },
  },
  resolve: async (_root, args, context) => {
    if (args.input.username === undefined || args.input.reset_token === undefined || args.input.new_password === undefined) {
      throw new Error("Must provide username, new_password, and reset_token.")
    }
    // Vulnerable: the unchecked input values are passed straight into the query.
    const user = await db.User.findOne({where: {username: args.input.username, resetToken: args.input.reset_token}})
    if (user) {
      // Update the user in the database first, and wait for the write to finish.
      user.password = await argon2.hash(args.input.new_password);
      await user.save();
      // Now, return it.
      context.user = user;
      context.session.user_id = user.id;
      return user;
    }
    else {
      throw new Error('The password reset token you submitted was incorrect.')
    }
  }
}
The API queries the database using the username and password reset token taken from the input. However, because the input arrives as untyped JSON, these values reach the query without being properly checked, which creates a vulnerability.
Our password reset function takes in a JSON object that contains a username, a new password, and a password reset token. The API backend queries the database to check whether the token is correct, passing the username and reset token values straight from the input. Since our application uses the Sequelize ORM, which allows complex operators to be embedded in queries, supplying an object in place of a string lets an attacker craft a query similar to NoSQL injection techniques. In the example below, the reset token is replaced with a “greater than the empty string” operator that matches any non-empty token, so the attacker resets the user’s password to “RTest!” without knowing the token:
mutation ResetPassword {
  passwordReset(input: {username: "Helena_Simonis", new_password: "RTest!", reset_token: {gt: ""}}) {
    username
  }
}
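One way to close this hole while keeping a JSON input is to check that every field is a plain string before it gets anywhere near the ORM. The sketch below assumes a hypothetical helper, parsePasswordResetInput, that the resolver would call first. Declaring the input as a GraphQL input object type with String fields would achieve the same result at the schema level.

```typescript
interface PasswordResetInput {
  username: string;
  reset_token: string;
  new_password: string;
}

// Hypothetical helper: narrows untyped JSON to the exact shape the resolver expects.
function parsePasswordResetInput(input: unknown): PasswordResetInput {
  if (typeof input !== "object" || input === null) {
    throw new Error("input must be an object");
  }
  const { username, reset_token, new_password } = input as Record<string, unknown>;
  // Objects such as {gt: ""} fail this check, so no ORM operator can be smuggled in.
  if (
    typeof username !== "string" ||
    typeof reset_token !== "string" ||
    typeof new_password !== "string"
  ) {
    throw new Error("username, new_password, and reset_token must be strings.");
  }
  if (reset_token.length === 0) {
    throw new Error("reset_token must not be empty.");
  }
  return { username, reset_token, new_password };
}
```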
4. Failure to Appropriately Rate-limit
Rate-limiting, and DoS protection in general, is harder for GraphQL APIs because of their complexity. A single GraphQL query can contain multiple actions, so the amount of server resources a request will consume cannot be known from the number of requests alone. This makes the load unpredictable, and it means that you cannot use the same strategy for limiting the number of requests to a GraphQL API as you usually would with a REST API.
Even the smallest queries can easily “explode” in terms of execution complexity. Here’s an example of a query where a User has a set of Posts, each of which has an Author, who in turn has Posts. As you can see, the query, even though it looks small and simple, is actually very complex.
query Recurse {
  allUsers {
    posts {
      author {
        posts {
          author {
            posts {
              author {
                posts {
                  id
                }
              }
            }
          }
        }
      }
    }
  }
}
If we add another layer to this query, the complexity is compounded even further. The most common strategy to prevent DoS attacks in GraphQL is to put a limit on the query depth. Although this can be quite limiting, it is a simple strategy to implement and it protects your application from this class of query.
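A depth limit can be sketched as a recursive walk over the query’s selection tree. The FieldNode interface below is a simplified, hypothetical stand-in for the AST a GraphQL parser would produce, and the function names are illustrative only.

```typescript
// Simplified stand-in for a parsed query: each field may select sub-fields.
interface FieldNode {
  name: string;
  selections?: FieldNode[];
}

// Depth of the deepest field chain; an empty selection list has depth 0.
function queryDepth(fields: FieldNode[]): number {
  let deepest = 0;
  for (const field of fields) {
    deepest = Math.max(deepest, 1 + queryDepth(field.selections ?? []));
  }
  return deepest;
}

// Reject the query before execution if it nests deeper than allowed.
function assertDepthWithin(fields: FieldNode[], maxDepth: number): void {
  const depth = queryDepth(fields);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds the limit of ${maxDepth}`);
  }
}
```

The check runs before any resolver is called, so a rejected query costs the server little more than parsing.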
An alternative solution to this would be to implement a complexity score system. Every part of the query gets its own complexity score. Should the total score exceed a predetermined amount, the query is rejected. Although this is a popular solution, it’s also one that is subjective and difficult to implement in practice.
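As a rough sketch of such a scoring system, the function below charges one point per field and multiplies the cost of a field’s children when that field returns a list. The multipliers are hypothetical and would need tuning for a real schema, which is exactly the subjective part mentioned above.

```typescript
// Simplified stand-in for a parsed query: each field may select sub-fields.
interface FieldNode {
  name: string;
  selections?: FieldNode[];
}

// Hypothetical weights: fields that return lists multiply the cost of their children.
const LIST_MULTIPLIERS = new Map<string, number>([
  ["allUsers", 10],
  ["posts", 10],
]);

// Each field costs 1, plus the (possibly multiplied) cost of everything beneath it.
function complexity(fields: FieldNode[]): number {
  let total = 0;
  for (const field of fields) {
    const childCost = complexity(field.selections ?? []);
    const multiplier = LIST_MULTIPLIERS.get(field.name) ?? 1;
    total += 1 + multiplier * childCost;
  }
  return total;
}
```

With these weights, a query selecting allUsers, then posts, then id scores 111, and every additional posts and author level multiplies the score by roughly ten, so a fixed budget cuts off recursive queries quickly.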
In the example below, suppose we’re relying on rate-limiting to prevent brute-forcing of the password reset token. The problem with GraphQL, in this case, is that a single query can contain multiple actions, so one request can carry a large number of guesses and slip past a limit that only counts requests.
mutation BruteForce {
  p000000: passwordReset(input: {username: "Helena_Simonis", new_password: "CarveSystems!", reset_token: "000000"}) {
    username
  }
  p000001: passwordReset(input: {username: "Helena_Simonis", new_password: "CarveSystems!", reset_token: "000001"}) {
    username
  }
  ...
  p999999: passwordReset(input: {username: "Helena_Simonis", new_password: "CarveSystems!", reset_token: "999999"}) {
    username
  }
}
In this case, rate-limiting on individual mutation types may be a useful mitigation, along with using harder-to-guess password reset tokens.
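One possible form of per-mutation limiting is to count how often each field appears in a single request, regardless of alias, and to reject the request once a configured cap is exceeded. The RootField shape and the limits map below are hypothetical.

```typescript
// Simplified view of a request's top-level fields; aliases do not affect the count.
interface RootField {
  alias?: string;
  name: string;
}

// Throws if any field appears more often in one request than its configured limit.
function assertFieldLimits(rootFields: RootField[], limits: Map<string, number>): void {
  const counts = new Map<string, number>();
  for (const field of rootFields) {
    const count = (counts.get(field.name) ?? 0) + 1;
    counts.set(field.name, count);
    const limit = limits.get(field.name);
    if (limit !== undefined && count > limit) {
      throw new Error(`Field "${field.name}" may appear at most ${limit} time(s) per request`);
    }
  }
}
```

With a limit of one for passwordReset, the BruteForce mutation above would be rejected at its second alias. A per-user or per-IP counter across requests is still needed on top of this.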
5. Introspection Reveals Non-public Information
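It can be beneficial to add “hidden” API endpoints that provide functionality that is not accessible to the general public: hidden administrative functionality, or an API endpoint for facilitating server-to-server communication, for example. However, GraphQL’s introspection feature lets any client query the server for its schema, so such endpoints are easy to discover. Development tools such as GraphQL IDEs use introspection to dynamically retrieve the schema. On a public API, introspection might improve the developer experience, but if the schema contains non-public functionality, consider disabling it in production.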
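Many GraphQL servers offer a configuration switch for turning introspection off. Where none is available, one option is to reject queries that select the introspection root fields __schema or __type. The sketch below assumes the same kind of simplified, hypothetical selection tree a parser would produce.

```typescript
// Simplified stand-in for a parsed query: each field may select sub-fields.
interface FieldNode {
  name: string;
  selections?: FieldNode[];
}

// The two introspection entry points defined by the GraphQL specification.
const INTROSPECTION_FIELDS = new Set(["__schema", "__type"]);

// True if the query selects an introspection field anywhere in its tree.
function usesIntrospection(fields: FieldNode[]): boolean {
  return fields.some(
    (field) =>
      INTROSPECTION_FIELDS.has(field.name) || usesIntrospection(field.selections ?? []),
  );
}
```

The __typename meta-field is deliberately left alone here, since client libraries commonly request it and it reveals only the type of an object the client can already read.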
GraphQL Security Best Practices
Following a handful of best practices goes a long way toward securing GraphQL. There are plenty of ways to harden your queries against malicious use. Things you should focus on when securing your apps include:
Query Timeouts
Query timeouts are a simple, yet incredibly effective way of limiting the operational window for the attacker. By using query timeouts, you’re basically setting a fixed limit on how long a single query can be executed.
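A timeout can be enforced cooperatively: the server creates a deadline when the query starts, and resolvers check it between units of work. The QueryDeadline class and the loadPosts function below are hypothetical names used only for this sketch; the injectable clock exists to make the behavior easy to test.

```typescript
class QueryDeadline {
  private readonly expiresAt: number;
  private readonly now: () => number;

  // timeoutMs: how long the whole query may run; now: clock source (injectable for tests).
  constructor(timeoutMs: number, now: () => number = () => Date.now()) {
    this.now = now;
    this.expiresAt = now() + timeoutMs;
  }

  // Throws once the deadline has passed; call this between units of work.
  check(): void {
    if (this.now() > this.expiresAt) {
      throw new Error("Query timed out");
    }
  }
}

// Hypothetical resolver helper: stops loading as soon as the deadline has passed.
function loadPosts(deadline: QueryDeadline, ids: number[]): number[] {
  const loaded: number[] = [];
  for (const id of ids) {
    deadline.check();
    loaded.push(id);
  }
  return loaded;
}
```

In a real server, the deadline would typically be created once per request and stored on the resolver context, so that every resolver shares the same time budget.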
Limiting Query Depth
Perhaps the biggest issue in GraphQL security is unbounded queries. An attacker is able to send huge, deeply nested queries to your server, potentially resulting in a denial of service.
However, by limiting query depth to a reasonable level, you effectively shut down this possibility, as the server can reject overly nested queries before executing them.
GraphQL Security with Bright
Bright offers the most advanced API testing automation. This allows you to test your GraphQL queries for any potential vulnerabilities. One of the biggest benefits of Bright is that you can deploy full depth parsing, meaning that you can parse varying interactive definitions (in this case, a good example would be an XML object inside a GraphQL query).
Bright offers simple updates on any potential vulnerabilities that your web application might have, and you can code freely knowing your apps are well guarded.
