By security practitioners, for security practitioners novacoast federal | Pillr | novacoast | about innovate
By security practitioners, for security practitioners

ChatGPT From A Pen Tester’s Point Of View

ChatGPT has significant potential to transform the landscape of penetration testing by playing the role of comprehensive assistant, but this powerful resource is not without risk.

In recent months, extensive research has been conducted on ChatGPT in offensive security, including fields such as exploit creation, phishing emails, OSINT gathering, and even attacking ChatGPT directly with prompt injections. However, we were also curious about how it could assist a pen tester in their routine tasks and responsibilities, as it have been shown to be useful in other cybersecurity roles.

In this article, we’ll explore how ChatGPT can make the day-to-day job of penetration testers easier and more efficient, while also discussing the implications of these advancements for low-skilled bad actors seeking to compromise digital assets. Our examples will be primarily focused on web application testing, but concepts covered are universal. We will discuss three key ChatGPT use cases:

  1. Exploiting Unfamiliar Technologies: Penetration testers often have to research a wide range of technologies during engagements. We will examine if ChatGPT can condense information, save time, and improve coverage, while discussing its potential to lower the bar for unskilled attackers to execute working exploits. 
  2. Custom Methodologies: Beyond general methodologies like OWASP WTSG, penetration testers create custom tailored strategies for more intricate bespoke applications and environments. We will investigate how ChatGPT can streamline this process, making it more efficient and accessible to junior testers, and assess its impact on the risk posed by bad actors.
  3. Reporting: Effectively communicating penetration test results to diverse audiences is crucial. We will explore how ChatGPT can help convey complex vulnerabilities simply and present findings in both technical and business contexts, ensuring clear understanding across all stakeholders

Exploiting Unfamiliar Technologies

Penetration testers encounter a wide range of technologies, requiring them to spend considerable time researching documentation and articles. ChatGPT can serve as an intelligent research assistant, quickly grasping the intricacies of these technologies and condensing the relevant information, which can save time and ensure greater coverage in security assessments. Example:

ChatGPT – Twig Template Engine Injection

I haven't tested an application using Twig templating engine, however I am familiar with Jinja2 and general concept of Server-Side Template Injection.

Could you summarise what unique characteristics, syntax, and behaviour of Twig I should be mindful of during testing for SSTI?


While Twig shares similarities with Jinja2 and other templating engines, it has some unique characteristics, syntax, and behavior that you should be aware of when testing for Server-Side Template Injection (SSTI). Here's a summary of these aspects:

1. Syntax:

- Twig uses double curly braces `{{}}` for outputting expressions and curly braces with percentage signs `{%%}`}` for control structures (e.g., loops, conditionals).
- Filters in Twig are applied using the pipe operator `|` (e.g. `{{ variable|filter }}`).
- Functions and tests are invoked using parentheses and the 'is' keyword, respectively (e.g., `{{function()}`, `{{ variable is test }}`).

2. Sandbox Mode:

- Twig has a built-in sandbox mode that restricts the use of certain tags, filters, and functions to increase security. However, an insecure or misconfigured sandbox can still be vulnerable to SSTI. Test the application to determine if the sandbox is enabled and properly configured.

3. Template Inheritance:

- Twig supports template inheritance using the 'extends keyword and block tags `{% block block_name %}`. This can be relevant when testing for SSTI in applications using multiple templates and layout inheritance.

4. Automatic Output Escaping:

- Twig enables automatic output escaping by default, which mitigates the risk of Cross-Site Scripting (XSS) vulnerabilities. However, when testing for SSTI, pay attention to any instances where the raw filter or the autoescape tag is used, as they can disable output escaping and lead to security issues.

5. Object Access:

- Twig allows access to an object's properties and methods using the dot notation (e.g., `{{ }}`, `{{ object.method() }}` ). When testing for SSTI, attempt to access sensitive objects or methods that could expose sensitive information or lead to code execution.

During testing for SSTI in applications using the Twig templating engine, focus on identifying unsanitized user input that is passed to the template engine, and try to inject Twig-specific syntax to execute arbitrary code or access sensitive data. Keep in mind the unique characteristics, syntax, and behavior of Twig to create targeted payloads and thoroughly assess the application's security posture.

By providing condensed information on unfamiliar technologies, ChatGPT enables penetration testers to focus on discovering vulnerabilities and crafting exploits more effectively. This makes it easier to attack niche technologies which would have otherwise had a larger knowledge barrier.

Furthermore, it can even assist in crafting concrete exploits against these systems, a topic which we will not cover in detail in this article but has been widely discussed online. This lowered barrier to entry may allow unskilled attackers to transition from limited background knowledge to executing working exploits more rapidly. Example:

ChatGPT – Give me a payload

Give a payload that allows me to execute commands via an SSTI vulnerability on Jinja2 templating engine


In the case of the Jinja2 templating engine, you can use the following payload to execute commands via an SSTI vulnerability:

```python {{ self.__init__globals__.__builtins__.__import__('os').popen('id').read() }}```

This payload will execute the 'id' command, which displays user and group information for the current system user.

Replace `'id'` with the command you want to execute. Keep in mind that using such a payload can be potentially dangerous and should be used responsibly within the scope of a penetration test.

If you have any questions or need assistance with another payload, feel free to ask.

It worked:

Results of using ChatGPT’s suggestion

Custom Methodologies

The second use case focuses on generating custom methodologies for penetration testing.

A proper penetration test follows trusted methodologies, such as PTES, NIST, or OWASP WSTG/MSTG among many others. However, for most environments, and particularly custom web, mobile, or IoT applications, these methodologies can be too generic and don’t consider unique elements. It takes an experienced pen tester to take these standards and extend them into a tailored methodology that considers the unique aspects of an application, such as its features, frameworks, languages, and more. This is a crucial step for comprehensive security assessment.

Let’s see if ChatGPT can assist us with an example for a small made-up web application:

ChatGPT – Imaginary Web App

You have been commissioned for a penetration test of a simple web application. The application is built using a PH back-end, and makes use of the Twig templating engine. It consists of a simple login form, taking a username and password. On successful authentication it grants a JWT token which can be used to view a secret token for the user.

Write a testing methodology for the above application, including vulnerabilities and techniques for the technologies mentioned.


Below is a testing methodology for the described web application, focusing on the technologies used (PHP, Twig templating engine, JWT tokens) and the functionalities present (login form, secret token access).

1. Information Gathering:

- Enumerate the application's directory structure and identify any hidden files, directories, or backup files.
- Fingerprint the PHP version, web server, and any other components used in the application
- Investigate the use of the Twig templating engine and identify its version and any known vulnerabilities.

2. Authentication Testing:

- Test the login form for common vulnerabilities, such as SQL Injection, weak password policies, username enumeration, and insecure credential storage.
- Attempt authentication bypass techniques, such as parameter manipulation, session hijacking, or cookie tampering.
- Inspect JWT token generation and storage. Check for insecure algorithms, weak signing keys, and token leakage.

3. Authorization Testing:

- Test for horizontal and vertical privilege escalation by manipulating the JWT token and testing access to resources or functionalities that should be restricted.
- Verify the proper enforcement of role based access control (RBAC) and ensure that users can only accees their own secret tokens.

4. Input Validation and Output Encoding:

- Test the application for Cross-Site Scripting (XSS) vulnerabities, including both stored and reflected XSS, by injecting malicious payloads in input fields and observing the output.
- Assess the Iwin templating engine's handling of user input and outout encoding. Verify that it correctly escapes and encodes data to prevent injection attacks

5. JWT Token Testing:

- Analyze the JWI token's structure and inspect the header, payload, and signature.
- Test for token tampering, such as modifying the payload, changing the algorithm, or forging a new signature.
- Test for token replay attacks by resending the same token to gain unauthorized access or perform unintended actions.

6. Miscellaneous Tests:

- Test for Server-Side Request Forgery (SSRF) and Local File Inclusion (LFI) vuinerabilities that could be exploited due to the use of PHP.
- Check for proper implementation of security headers, such as Content Security Policy (CSP), X-Content-Type-Options, and X-XSS-Protection, to mitigate potential attacks.

7. Reporting and Remediation:

- Document all findings, including identified vulnerabilities, their potential impact, and recommended remediation actions.
- Communicate the results to the application owner and provide guidance on addressing the identified issues.
- Retest the application after remediation actions have been implemented to ensure that the vulnerabilities have been effectively resolved.

ChatGPT not only can help generate an initial methodology, but due to its context-aware nature we can dig down into each step, request more information, and even get step-by-step instructions.

This makes the result resemble something closer to a dynamic testing assistant, rather than a strict methodology:

ChatGPT – Detailed Breakdown of Specific Step

Provide a more detailed breakdown of token tampering from step 5, including instructions on how to test for these vulnerabilities.


Token tampering refers to the manipulation ofJWT tokens in an attempt toexploit vulnerabilities in their implementation or configuration. Here's a more detailed breakdown of token tampering tests and how to perform them.

1. Test for weak signing keys:

- Extract the public key or the shared secret used for token signature verification from the server or client-side code.
- Use tools like John the Ripper or Hashcat to attempt to crack the key, especially if it appears to be weak or short. 
- If the key is cracked, generate a new token with a forged signature and test if the application accepts it.

2. Test for 'none' algorithm vulnerability:

- Decode the JWT token using a tool like o ra custom script.
- Modify the token's header to change the "alq" field to "none" and remove the signature.
- Encode the modified token and send it to the application to see if it accepts the token without signature verification.

3. Test for algorithm switching:

- Decode the JWT token and inspect the "alg" field in the header (e.g. "HS256" for HMAC-SHA256 or "RS256" for RSA-SHA256).
- Switch the algorithm in the header (e.g. change "HS256" to "RS256") and create a new token with a forged signature using the modified header.
- Send the modified token to the application to see if it accepts the token with the switched algorithm.

4. Test for payload manipulation:

- Decode the JWT token andinspect the payload, identifying claims such as user roles, privileges, or user identifiers.
- Modify the payload to attempt privilege escalation or unauthorized access (e.g., change a "user" role to "admin").
- Create a new token with the modified payload and the original signature.
- Send the modified token to the application to see if it accepts the token and grants elevated privileges or unauthorized access.

5. Test for token replay attacks:

- Capture a valid JWT token, either from your own session or another user's session.
- Send the captured token to the application in a request to perform an action or access a resource.
- Observe whether the application accepts the replayed token and processes the request, which would indicate a vulnerability to token replay attacks.

To perform these tests, use tools like Burp Suite, OWASP, ZAP, or custom scripts to intercept and modify JWT tokens, as well as online tools like for encoding and decoding tokens. Always ensure that you have the appropriate consent and authorization before conducting any penetration testing activities.

Utilizing ChatGPT as part of this process can lead to more efficient and thorough assessments, minimizing blind spots and optimizing the use of resources. However, this increased efficiency and accessibility could also extend to unskilled attackers, enabling them to discover hidden attack surfaces, and execute more targeted attacks with less effort.


In the final use case, we explore how ChatGPT can enhance the reporting aspect of penetration testing.

Communicating the results of a security assessment in a clear and meaningful way to a wide range of audiences, such as programmers, sysadmins, CISOs, and CEOs, is critical to the success of a penetration test. If our findings are clearly understood, it leads to more effective remediation and risk mitigation, and more value for our customers. Let’s look at a small example:

ChatGPT – Reporting Assistance

I confirmed that a SSTI vulnerability is present in the application. The username parameter submitted is echoed back on the login page, and can be injected with Twig templates. I achieved code execution on the host and was able to read the secrets of all users.

Write an executive summary aimed at the CEO, who has little technical background and is primarily concerned with the business' reputation.

Then, write a technical summary aimed at the CISO who has both technical knowledge and insight into business functions.

Finally, write a concise issue description aimed at a developer. It should contain information on how to remediate the finding.


Executive Summary (for CEO):

During our recent penetration test of your web application, we discovered a critical vulnerability that poses a significant risk to your business and customers. This vulnerability, known as Server-Side Template Injection (SSTI), allows an attacker to execute unauthorized commands on your application server, potentially gaining access to sensitive customer data and secrets.

The exploitation of this vulnerability could have severe consequences, including the loss of customer trust, damage to your company's reputation, and the potential for regulatory penalties due to non-compliance with data protection regulations. We strongly recommend addressing this issue promptly to ensure the security and privacy of your customers and maintain their trust in your business. Our team is available to provide further guidance and support in mitigating this risK.

Technical Summary (for CISO):

During our penetration test, we identified an SSTI vulnerability in the username parameter of the login page. The vulnerability stems from unsanitized user input being passed to the Twig templating engine, allowing an attacker to inject malicious templates and execute arbitrary code on the server. By exploiting this vulnerability, we were able to gain code execution on the host and access sensitive data, including secrets for all users.

The impact of this vulnerability is substantial, as it poses a serious risk to the confidentiality and integrity of customer data. Immediate remediation is essential to prevent potential unauthorized access, data breaches, and the resulting reputational and regulatory consequences.

Issue Description (for Developer):

Vulnerability: Server-Side Template Injection (SSTI) in Username Parameter

Description: The application is vulnerable to SSTI due to unsanitized user input being passed to the Twig templating engine on the login page. This allows an attacker to inject Twig templates and execute arbitrary code on the server, potentially accessing sensitive information and secrets.

Remediation Steps:

1. Ensure proper input validation and output encoding for all user-supplied data.
2. Use the Twig 'autoescape' feature to escape user input before rendering it in templates.
3. If possible, enable Twig's sandbox mode, and configure it to restrict access to potentially dangerous tags, filters, and functions.
4. Regularly update the Twig templating engine to the latest version to ensure it is protected against known vulnerabilities.

By implementing these remediation steps, you can effectively mitigate the risk associated with the SSTI vulnerability and ensure the security of your web application.

ChatGPT can assist penetration testers in tasks such as conveying complex vulnerabilities in simple terms, or presenting findings in a business context that resonates with non-technical staff. The tester can utilize ChatGPT to craft tailored explanations and recommendations for different target audiences.

However, care must be taken that sensitive information (for example names of clients, hosts, affected application endpoints, or anything that could be used to derive these) must not be submitted. As with all online tools, ChatGPT should be treated as un untrusted third party.


Keep in mind that the technology is not without pitfalls. ChatGPT has been known to have blind spots, provide incorrect information, or “hallucinate”.  Watch it enthusiastically explain an Azure role which doesn’t exist:

ChatGPT – Hallucination Example

I gained access to an Azure account with the MFAHandlerAdminCoordinator role. Could you tell me about this role?


The MFAHandlerAdminCoordinator role is an Azure Active Directory (Azure AD) role that grants permissions related to Multi-Factor Authentication (MFA) administration. This role allows the user to access and manage MFA settings, configurations, and policies, as well as view MFA reports.

However, it is important to note that this role does not grant full access to all Azure AD features or resources. The MFAHandlerAdminCoordinator role is designed to provide a more focused set of permissions specifically related to MFA management, in line with the principle of least privilege.

Some of the tasks that a user with the MFAHandlerAdminCoordinator role can perform include:

1. Configure MFA settings, such as verification options, fraud alerts, and one-time bypass
2. Enable or disable MFA for individual users or groups.
3. Manage MFA service settings, such as trusted IPs and MFA server settings.
4. Access MFA reports and monitor usage patterns.

It is crucial to securely manage access to this role, as improper MFA configuration may lead to security vulnerabilities or unauthorized access to resources. Ensure that only trusted and qualified personnel are granted the MFAHandlerAdminCoordinator role to maintain a secure environment.

ChatGPT does not replace an experienced tester when it comes to conducting pen tests, and it should never be the only source of information for any part of an engagement. However, it can serve as a tool to help work more efficiently and deliver more comprehensive results. These perks are amplified for junior testers, but they need to take even greater care as they are more susceptible to being misled by erroneous information.

Unlike ethical hackers, who strive to provide comprehensive assessments and accurate information to their clients, bad actors have a singular focus: to compromise their target’s security. This difference in objectives means that bad actors are less concerned with thoroughness or the consequences of receiving incorrect information; they can adapt a trial and error approach and utilize ChatGPT to an even greater extent.


In summary, ChatGPT has significant potential to transform the penetration testing landscape in various ways. By streamlining custom methodologies, facilitating knowledge acquisition for unfamiliar technologies, and enhancing reporting, this tool can make the day-to-day job of penetration testers more efficient and accessible. However, the ability of ChatGPT to provide context-aware guidance and step-by-step instructions is a double-edged sword, offering valuable support not only for legitimate penetration testers, but bad actors as well.

As we embrace the power of ChatGPT and other AI-driven tools in cybersecurity, it is crucial to strike a balance between leveraging these technologies for good and mitigating the potential risks associated with their misuse. By fostering a culture of cybersecurity awareness and constantly adapting to the ever-evolving threat landscape, organizations can stay ahead of the curve, ensuring the security of their digital assets and maintaining the trust of their customers.

About The Author

Laszlo Nagy is a penetration tester at the Novacoast Attack Team (NCAT), engaging on a wide range of security assessment projects. He also researches and develops new tools and applications to assist the endeavors of security practitioners.

Previous Post

Weekly Top Ten Cybersecurity Stories – 4.21.2023

Next Post

Weekly Top Ten Cybersecurity Stories – 4.28.2023

Innovate uses cookies to give you the best online experience. If you continue to use this site, you agree to the use of cookies. Please see our privacy policy for details.