OWASP Java HTML Sanitizer
Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.
Versions
Version | Release Date |
---|---|
20220608.1 | Jun 8, 2022 |
20211018.2 | Oct 18, 2021 |
20211018.1 | Oct 18, 2021 |
20200713.1 | Jul 13, 2020 |
20200615.1 | Jun 15, 2020 |
20191001.1 | Oct 2, 2019 |
20190610.1 | Jun 10, 2019 |
20190503.1 | May 3, 2019 |
20190325.1 | Mar 25, 2019 |
20181114.1 | Nov 14, 2018 |
20180219.1 | Feb 19, 2018 |
20171016.1 | Oct 16, 2017 |
20170515.1 | May 15, 2017 |
20170512.1 | May 12, 2017 |
20170411.1 | Apr 12, 2017 |
20170408.1 | Apr 8, 2017 |
20170329.1 | Mar 29, 2017 |
20160924.1 | Sep 24, 2016 |
20160827.1 | Aug 27, 2016 |
20160628.1 | Jun 28, 2016 |
20160614.1 | Jun 14, 2016 |
20160526.1 | May 26, 2016 |
20160422.1 | Apr 22, 2016 |
20160413.1 | Apr 14, 2016 |
20160203.1 | Feb 3, 2016 |
20151202.2 | Dec 3, 2015 |
20150501.1 | May 2, 2015 |
1.1 | Oct 8, 2015 |
r239 | Jun 2, 2014 |
r232 | May 8, 2014 |
r223 | Mar 1, 2014 |
r209 | Sep 5, 2013 |
r198 | Jul 22, 2013 |
r173 | May 16, 2013 |
r164 | May 3, 2013 |
r163 | Apr 24, 2013 |
r156 | Feb 26, 2013 |
r136 | Jan 25, 2013 |
OWASP Java HTML Sanitizer
The OWASP HTML Sanitizer Project provides Java-based sanitization of untrusted HTML.
About
The OWASP HTML Sanitizer is a fast and easy-to-configure HTML sanitizer written in Java that lets you include HTML authored by third parties in your web application while protecting against XSS. Its only dependencies are Guava and JSR 305; the other jars are needed only by the test suite. The JSR 305 dependency is compile-only and is needed only for annotations. This code was written with security best practices in mind, has an extensive test suite, and has undergone adversarial security review. A good place to get started with the OWASP Java HTML Sanitizer is https://github.com/OWASP/java-html-sanitizer/blob/master/docs/getting_started.md.
Benefits
- Very easy to use. It allows for simple programmatic POSITIVE policy configuration (see below). No XML config.
- Actively maintained by Mike Samuel from Google’s AppSec team!
- Passing 95+% of AntiSamy’s unit tests plus many more.
- This is code from the Caja project that was donated by Google. It offers high performance and low memory utilization.
- Java 1.5+
- Provides 4X the speed of AntiSamy sanitization in DOM mode and 2X the speed of AntiSamy in SAX mode.
Questions
- How was this project tested? This code was written with security best practices in mind, has an extensive test suite, and has undergone adversarial security review.
- How is this project deployed? This project is best deployed through Maven.
Licensing
The OWASP HTML Sanitizer is free to use and is dual licensed under the Apache 2 License and the New BSD License.
How to Use
Creating a HTML Policy
1. Use prepackaged policies
`PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);`
`String safeHTML = policy.sanitize(untrustedHTML);`
2. Configure own policy
`PolicyFactory policy = new HtmlPolicyBuilder()`
`    .allowElements("a")`
`    .allowUrlProtocols("https")`
`    .allowAttributes("href").onElements("a")`
`    .requireRelNofollowOnLinks()`
`    .build();`
`String safeHTML = policy.sanitize(untrustedHTML);`
3. Define custom policies
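You can write custom policies:
`PolicyFactory policy = new HtmlPolicyBuilder()`
`    .allowElements("p")`
`    .allowElements(`
`        new ElementPolicy() {`
`          public String apply(String elementName, List<String> attrs) {`
`            attrs.add("class");`
`            attrs.add("header-" + elementName);`
`            return "div";`
`          }`
`        }, "h1", "h2", "h3", "h4", "h5", "h6")`
`    .build();`
`String safeHTML = policy.sanitize(untrustedHTML);`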
Please note that the elements “a”, “font”, “img”, “input” and “span” need to be explicitly whitelisted using the `allowWithoutAttributes()` method if you want them to be allowed through the filter when these elements do not include any attributes.
4. Use ebay / slashdot policies
You can also use the default “ebay” and “slashdot” policies.
The Slashdot policy allows the following tags (“a”, “p”, “div”, “i”, “b”, “em”, “blockquote”, “tt”, “strong”, “br”, “ul”, “ol”, “li”) and only certain attributes. This policy also allows the custom Slashdot tags “quote” and “ecode”.
CSS Sanitization
CSS sanitization is challenging.
We disallow position:sticky and position:fixed so that client code can use a position:relative;overflow:hidden to contain self-styling sanitized snippets. Embedders of sanitized content do have to consistently do that and make sure that contributed content is clearly demarcated.
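The containment described above might be applied by the embedding code roughly as follows. This is only a sketch: the class `SanitizedSnippetContainer`, the method `wrap`, and the `user-content` class name are hypothetical and not part of the sanitizer library, and the input is assumed to be the output of an earlier `policy.sanitize(...)` call.

```java
/**
 * Hypothetical embedder-side helper (not part of the OWASP library).
 * Wraps already-sanitized HTML in a clipping container.
 */
final class SanitizedSnippetContainer {
  // position:relative makes the wrapper the containing block for absolutely
  // positioned descendants; overflow:hidden clips content drawn outside it.
  static final String CONTAINER_STYLE = "position:relative;overflow:hidden";

  private SanitizedSnippetContainer() {}

  /** Expects HTML that has ALREADY been sanitized; performs no sanitization. */
  static String wrap(String sanitizedHtml) {
    // The class name is an assumed hook for visually demarcating
    // contributed content in the embedder's stylesheet.
    return "<div class=\"user-content\" style=\"" + CONTAINER_STYLE + "\">"
        + sanitizedHtml
        + "</div>";
  }

  public static void main(String[] args) {
    System.out.println(wrap("<p>Hello</p>"));
  }
}
```

The wrapper only helps if every place that renders contributed content goes through it, which is why the consistency requirement above matters.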
Most CSS attacks require a payload to specify selectors which the sanitizer should not allow. Unproxied images do allow tracking and, by positioning below the fold, can track whether a user scrolls down. Embedders do need to use URL rewriting if they allow background styling and use sensible Referrer-Policy and related headers.
That said, even if care is taken, CSS has a large attack surface, so not using it puts you in a safer place.
Inline/Embedded Images
Inline images use the data URI scheme to embed images directly within web pages. The following describes how to allow inline images in an HTML Sanitizer policy.
1) Add the “data” protocol to your whitelist, for example with `allowUrlProtocols("data")`.
2) You can then allow an attribute with an extra check, thus:
`.allowAttributes("src")`
`.matching(...)`
`.onElements("img")`
3) There are a number of things you can do in the matching part, such as allowing a narrower pattern instead of allowing every data URL.
4) Since `allowUrlProtocols("data")` allows data URLs anywhere URLs are allowed, you might also want to add a matcher to any other URL attributes that rejects anything with a colon that does not start with http:, https:, or mailto:
`.allowAttributes("href")`
`.matching(...)`
`.onElements("a")`
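The matchers in steps 2 to 4 could be expressed as regular expressions along the following lines. This is a sketch under stated assumptions, not the project's recommended patterns: the class name, the field names, and the choice of image types are all hypothetical. Only `java.util.regex` is used here, so the patterns can be tried without the sanitizer on the classpath; a `Pattern` like these would be what gets passed to `.matching(...)`.

```java
import java.util.regex.Pattern;

/** Hypothetical holder for URL attribute patterns (names are assumptions). */
final class UrlAttributePatterns {
  // Accepts only base64-encoded PNG, GIF, or JPEG data URLs.
  // The list of image types is an assumed allow-list; adjust as needed.
  static final Pattern INLINE_IMAGE = Pattern.compile(
      "^data:image/(?:png|gif|jpeg);base64,[A-Za-z0-9+/]+={0,2}$");

  // Accepts values starting with http:, https:, or mailto:, and values that
  // contain no colon at all (for example relative URLs). Anything else that
  // contains a colon, such as javascript: or data:, is rejected.
  static final Pattern LINK = Pattern.compile(
      "^(?:(?:https?|mailto):.*|[^:]*)$",
      Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

  private UrlAttributePatterns() {}

  static boolean isAllowedImageSrc(String value) {
    return INLINE_IMAGE.matcher(value).matches();
  }

  static boolean isAllowedHref(String value) {
    return LINK.matcher(value).matches();
  }

  public static void main(String[] args) {
    System.out.println(isAllowedImageSrc("data:image/png;base64,iVBORw0KGgo="));
    System.out.println(isAllowedHref("javascript:alert(1)"));
  }
}
```

Note that `INLINE_IMAGE` as written accepts only data URLs, so a policy that should also allow ordinary `https:` image sources would need a broader pattern.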
News and Events
- [18 Oct 2021] v20211018.2 Released, addresses issue with `<select>` elements
- [10 Sep 2020] Migrate OWASP wiki page
- [20 Feb 2018] Update 20180219.1 — addresses iOS/MacOS “text bomb”
- [28 June 2016] v20160628.1 Released
- [14 Apr 2016] v20160413.1 Released
- [1 May 2015] Move to GitHub
- [2 July 2014] v239 Released
- [3 Mar 2014] v226 Released
- [5 Feb 2014] New Wiki
- [4 Sept 2013] v209 Released
Related Projects
Roadmap
- Maintaining a fully featured HTML sanitizer is a lot of work. We intend to continue to handle community questions and bug reports in a very timely manner.
- There are no plans for major new features other than supporting incoming requests for advanced sanitization such as additional HTML5 support.
Releases: OWASP/java-html-sanitizer
Release 20220608.1
- Fix bugs in CSS tokenization
- Fix decoding of HTML character references that lack semicolons, like `&para`, in HTML attribute values, which affected URL query parameters.
v20211018.2
Changes how we avoid problems with special tags inside `<select>` elements. Instead of complicating the rendering of `<select>` elements in all cases, we now just close special elements when they are embedded in `<select>` elements, so no text under a `<select>` is interpreted as anything other than PCDATA.
20211018.1
This release fixes a vulnerability tracked as CVE-2021-42575.
20200713.1
Improves SVG and MathML support.
Now policies don’t lower-case element and attribute names that are defined in either the SVG or MathML schemas.
Be aware that SVG’s `<textArea>` is now distinct from HTML’s `<textarea>`.
20190610.1
19 Feb 2018