PHP: HTML code as text


html_entity_decode

html_entity_decode() is the opposite of htmlentities() in that it converts HTML entities in the string to their corresponding characters.

More precisely, this function decodes all the entities (including all numeric entities) that a) are necessarily valid for the chosen document type — i.e., for XML, this function does not decode named entities that might be defined in some DTD — and b) whose character or characters are in the coded character set associated with the chosen encoding and are permitted in the chosen document type. All other entities are left as is.
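As a rough illustration of the behaviour described above, the following sketch round-trips a string through htmlentities() and html_entity_decode(), decodes numeric entities, and shows an undefined entity being left alone. The variable names and the made-up entity &foobar; are only for illustration.

```php
<?php
// Encode a string, then decode it again to show the round trip.
$original = 'I\'ll "walk" the <b>dog</b> now';
$encoded  = htmlentities($original, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$decoded  = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Decimal, hexadecimal and named forms of the same character are all decoded.
$numeric = html_entity_decode('&#8364; &#x20AC; &euro;', ENT_QUOTES | ENT_HTML401, 'UTF-8');

// "&foobar;" is a made-up name, so it is not a valid entity and is left as is.
$unknown = html_entity_decode('&foobar;', ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($decoded === $original);
var_dump($numeric);
var_dump($unknown);
```

Passing the flags and the encoding explicitly keeps the result independent of the PHP version and of the default_charset setting.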

Parameters

string: The input string.

flags: A bitmask of one or more of the following flags, which specify how to handle quotes and which document type to use. The default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.

Available flags constants

Constant Name	Description
ENT_COMPAT	Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES	Will convert both double and single quotes.
ENT_NOQUOTES	Will leave both double and single quotes unconverted.
ENT_SUBSTITUTE	Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of returning an empty string.
ENT_HTML401	Handle code as HTML 4.01.
ENT_XML1	Handle code as XML 1.
ENT_XHTML	Handle code as XHTML.
ENT_HTML5	Handle code as HTML 5.
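The effect of the quote flags and of the document type flag can be checked with a short sketch like the one below. The sample strings are assumptions chosen for illustration; the flags themselves are the constants listed in the table.

```php
<?php
$input = '&quot;double&quot; and &#039;single&#039;';

// ENT_COMPAT decodes double quotes only.
$compat = html_entity_decode($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES decodes both double and single quotes.
$quotes = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

// ENT_NOQUOTES leaves both kinds of quote entities untouched.
$noquotes = html_entity_decode($input, ENT_NOQUOTES, 'UTF-8');

// The document type decides which named entities exist:
// &euro; is defined for HTML 4.01 but is not predefined in XML 1.
$html = html_entity_decode('&euro;', ENT_QUOTES | ENT_HTML401, 'UTF-8');
$xml  = html_entity_decode('&euro;', ENT_QUOTES | ENT_XML1, 'UTF-8');

var_dump($compat, $quotes, $noquotes, $html, $xml);
```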

encoding: An optional argument defining the encoding used when converting characters.

If omitted, encoding defaults to the value of the default_charset configuration option.

Although this argument is technically optional, you are highly encouraged to specify the correct value for your code if the default_charset configuration option may be set incorrectly for the given input.

The following character sets are supported:

Supported charsets

Charset	Aliases	Description
ISO-8859-1	ISO8859-1	Western European, Latin-1.
ISO-8859-5	ISO8859-5	Little used cyrillic charset (Latin/Cyrillic).
ISO-8859-15	ISO8859-15	Western European, Latin-9. Adds the Euro sign, French and Finnish letters missing in Latin-1 (ISO-8859-1).
UTF-8		ASCII compatible multi-byte 8-bit Unicode.
cp866	ibm866, 866	DOS-specific Cyrillic charset.
cp1251	Windows-1251, win-1251, 1251	Windows-specific Cyrillic charset.
cp1252	Windows-1252, 1252	Windows specific charset for Western European.
KOI8-R	koi8-ru, koi8r	Russian.
BIG5	950	Traditional Chinese, mainly used in Taiwan.
GB2312	936	Simplified Chinese, national standard character set.
BIG5-HKSCS		Big5 with Hong Kong extensions, Traditional Chinese.
Shift_JIS	SJIS, SJIS-win, cp932, 932	Japanese
EUC-JP	EUCJP, eucJP-win	Japanese
MacRoman		Charset that was used by Mac OS.
''		An empty string activates detection from script encoding (Zend multibyte), default_charset and current locale (see nl_langinfo() and setlocale()), in this order. Not recommended.

Note: Any other character sets are not recognized. The default encoding will be used instead and a warning will be emitted.
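To see why the encoding argument matters, the sketch below decodes the same entity into two different charsets and inspects the resulting bytes. The sample inputs are assumptions for illustration; bin2hex() is used only to make the byte values visible.

```php
<?php
$input = 'caf&eacute;';

// In UTF-8 the decoded character takes two bytes (c3 a9).
$utf8 = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// In ISO-8859-1 the same character is the single byte e9.
$latin1 = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');

// The euro sign does not exist in ISO-8859-1, so the entity is left as is.
$euro = html_entity_decode('&euro;', ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');

var_dump(bin2hex($utf8));
var_dump(bin2hex($latin1));
var_dump($euro);
```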

Return Values

Returns the decoded string.


htmlspecialchars_decode

This function is the opposite of htmlspecialchars() . It converts special HTML entities back to characters.

The converted entities are: &amp;, &quot; (when ENT_NOQUOTES is not set), &#039; (when ENT_QUOTES is set), &lt; and &gt;.
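A small comparison may help show how this differs from html_entity_decode(): only the special entities listed above are converted, and other named entities stay encoded. The sample string is an assumption for illustration.

```php
<?php
$input = '&lt;p&gt;Caf&eacute; &amp; bar&lt;/p&gt;';

// Only the special entities are decoded; &eacute; is left untouched.
$special = htmlspecialchars_decode($input);

// html_entity_decode() also decodes &eacute; into the target encoding.
$all = html_entity_decode($input, ENT_QUOTES | ENT_HTML401, 'UTF-8');

var_dump($special);
var_dump($all);
```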

Parameters

string: The string to decode.

flags: A bitmask of one or more of the following flags, which specify how to handle quotes and which document type to use. The default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.

Available flags constants

Constant Name	Description
ENT_COMPAT	Will convert double-quotes and leave single-quotes alone.
ENT_QUOTES	Will convert both double and single quotes.
ENT_NOQUOTES	Will leave both double and single quotes unconverted.
ENT_SUBSTITUTE	Replace invalid code unit sequences with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; (otherwise) instead of returning an empty string.
ENT_HTML401	Handle code as HTML 4.01.
ENT_XML1	Handle code as XML 1.
ENT_XHTML	Handle code as XHTML.
ENT_HTML5	Handle code as HTML 5.

Return Values

Returns the decoded string.

Changelog

Version	Description
8.1.0	The default value of flags changed from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.
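Because the default changed in 8.1.0, code that must behave the same on older and newer PHP versions can pass the flags explicitly. The sketch below contrasts the old and new defaults on an assumed sample string.

```php
<?php
$input = 'It&#039;s &quot;quoted&quot;';

// Old default (before 8.1.0): single-quote entities are left encoded.
$legacy = htmlspecialchars_decode($input, ENT_COMPAT);

// New default (8.1.0 and later), spelled out so it does not depend on the version.
$current = htmlspecialchars_decode($input, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401);

var_dump($legacy);
var_dump($current);
```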


A PHP component to convert HTML into a plain text format

soundasleep/html2text

html2text is a very simple script that uses DOM methods to convert HTML into a format similar to what would be rendered by a browser — perfect for places where you need a quick text representation. For example:

<html>
<title>Ignored Title</title>
<body>
  <h1>Hello, World!</h1>

  <p>This is some e-mail content.
  Even though it has whitespace and newlines, the e-mail converter
  will handle it correctly.

  <p>Even mismatched tags.</p>

  <div>A div</div>
  <div>Another div</div>
  <div>A div<div>within a div</div></div>

  <a href="http://foo.com">A link</a>

</body>
</html>

Hello, World!

This is some e-mail content. Even though it has whitespace and newlines, the e-mail converter will handle it correctly.

Even mismatched tags.

A div
Another div
A div
within a div

[A link](http://foo.com)
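The README says the script relies on DOM methods. The following is a much-simplified sketch of that general idea, not html2text's actual implementation: the function name sketch_html_to_text is hypothetical, it assumes the PHP DOM extension is installed and that the input is plain ASCII, and it does not add line breaks between blocks or render links the way the library does.

```php
<?php
// Hypothetical helper, NOT part of html2text: collect the text of <body> via DOM.
function sketch_html_to_text(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally so malformed markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // textContent joins all descendant text nodes; entities are already decoded.
    // Runs of whitespace are then collapsed into single spaces.
    return trim((string) preg_replace('/\s+/', ' ', $body->textContent));
}

echo sketch_html_to_text('<body><h1>Hello</h1><p>Fish &amp; chips</p></body>'), "\n";
```

A real converter has to walk the tree node by node to decide where line breaks and link markers go, which is the part the library handles.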

You can use Composer to add the package to your project:

{
    "require": {
        "soundasleep/html2text": "~1.1"
    }
}

And then use it quite simply:

$text = \Soundasleep\Html2Text::convert($html);

You can also include the supplied html2text.php and use $text = convert_html_to_text($html); instead.

Option	Default	Description
ignore_errors	false	Set to true to ignore any XML parsing errors.
drop_links	false	Set to true to render links as just My Link rather than as [My Link](http://foo.com).
char_set	'auto'	Specify a character set. Pass multiple character sets (comma separated) to detect the encoding; the default is ASCII,UTF-8.

Pass along options as a second argument to convert , for example:

$options = array(
    'ignore_errors' => true,
    // other options go here
);
$text = \Soundasleep\Html2Text::convert($html, $options);

Some very basic tests are provided in the tests/ directory. Run them with composer install && vendor/bin/phpunit .

Class 'DOMDocument' not found

You need to install the PHP XML extension for your PHP version, e.g. apt-get install php7.4-xml.

html2text is licensed under MIT, making it suitable for both Eclipse and GPL projects.

Also see html2text_ruby, a Ruby implementation.
