DOMDocument::saveHTML
Creates an HTML document from the DOM representation. This function is usually called after building a new DOM document from scratch, as in the example below.
Parameters
node
Optional parameter to output a subset of the document.
Return Values
Returns the HTML, or false if an error occurred.
Examples
Example #1 Saving an HTML tree into a string
<?php
$doc = new DOMDocument('1.0');
// Build <html><head><title>This is the title</title></head></html>
$root = $doc->createElement('html');
$root = $doc->appendChild($root);
$head = $doc->createElement('head');
$head = $root->appendChild($head);
$title = $doc->createElement('title');
$title = $head->appendChild($title);
$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);
// Serialize the tree to an HTML string
echo $doc->saveHTML();
?>
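Because saveHTML() returns false on failure, calling code may want to check the result before using it. The following is a minimal sketch only; the helper name `renderDocument()` is hypothetical and not part of the DOM extension.

```php
<?php
// Hypothetical helper (not part of the DOM extension): serialize a document
// and fail loudly if saveHTML() reports an error.
function renderDocument(DOMDocument $doc): string
{
    $html = $doc->saveHTML();
    if ($html === false) {
        throw new RuntimeException('saveHTML() failed');
    }
    return $html;
}

// Build the same small tree as in Example #1.
$doc = new DOMDocument('1.0');
$root = $doc->appendChild($doc->createElement('html'));
$head = $root->appendChild($doc->createElement('head'));
$head->appendChild($doc->createElement('title', 'This is the title'));

$html = renderDocument($doc);
echo $html;
```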
See Also
- DOMDocument::saveHTMLFile() — Dumps the internal document into a file using HTML formatting
- DOMDocument::loadHTML() — Load HTML from a string
- DOMDocument::loadHTMLFile() — Load HTML from a file
User Contributed Notes 18 notes
As of PHP 5.4 and libxml 2.6, there is a simpler approach. When you load the HTML like this:
$html->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
the output will contain no doctype, html, or body tags.
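As a rough illustration of the note above, the sketch below loads a fragment that already has a single root element with both flags and checks what saveHTML() returns. The sample markup is an assumption chosen for the example, and the flags require a PHP/libxml build that defines them.

```php
<?php
// Sample fragment (an assumption for this sketch); it has one root element.
$content = '<div><p>Hello world!</p></div>';

$doc = new DOMDocument();
// NOIMPLIED: do not add implied <html>/<body>; NODEFDTD: do not add a default doctype.
$doc->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$out = $doc->saveHTML();
echo $out;
```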
When saving an HTML fragment loaded with the LIBXML_HTML_NOIMPLIED option, the result ends up "broken" because libxml requires a root element. libxml attempts to fix the fragment by adding a closing tag at the end of the string, based on the first opening tag it encounters in the fragment.
For example:
<h1>Foo</h1><p>bar</p>
will end up as:
<h1>Foo<p>bar</p></h1>
The easiest workaround is to add a root tag yourself and strip it later:
$html->loadHTML('<html>' . $content . '</html>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$content = str_replace(array('<html>', '</html>'), '', $html->saveHTML());
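A variation on this workaround, sketched here under stated assumptions, avoids string replacement: wrap the fragment in a temporary `<div>`, then serialize each child of that wrapper with saveHTML($node). This is a different technique from the str_replace() approach in the note, and the function name `saveFragment()` is hypothetical.

```php
<?php
// Hypothetical helper: round-trip an HTML fragment through DOMDocument
// without picking up a doctype, <html>, <body>, or the temporary wrapper.
function saveFragment(string $content): string
{
    $doc = new DOMDocument();
    // The temporary <div> gives libxml the single root element it needs.
    $doc->loadHTML('<div>' . $content . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $out = '';
    // Serialize only the wrapper's children, so the wrapper itself is dropped.
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$result = saveFragment('<h1>Foo</h1><p>bar</p>');
echo $result, PHP_EOL;
```

Any DOM manipulation would go between loading and the serialization loop; the serializer may insert line breaks between sibling elements.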
This method, as of 5.2.6, will automatically add <html> and <body> tags to the document if they are missing, without asking whether you want them. In my application, I needed to use the DOM methods to manipulate just a fragment of HTML, so these tags were rather unhelpful.
Here’s a simple hack to remove them in case, like me, all you wanted to do was perform a few operations on an HTML fragment.
I am using this solution to prevent tags and the doctype from being added to the HTML string automatically:
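<?php
$html = '<h1>Hello world!</h1>';
// Wrap the fragment in a <div> so it can be located after parsing
$html = '<div>' . $html . '</div>';
$doc = new DOMDocument;
$doc->loadHTML($html);
// Serialize only the <div>, then strip the leading "<div>" (5 characters) and the trailing "</div>" (6 characters)
echo substr($doc->saveXML($doc->getElementsByTagName('div')->item(0)), 5, -6);
// Outputs: "<h1>Hello world!</h1>"
?>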
Since PHP 5.3.6, DOMDocument->saveHTML() accepts an optional DOMNode parameter, similarly to DOMDocument->saveXML().
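A short sketch of the node parameter in use: only the selected element and its descendants are serialized, so the surrounding doctype, html, and body tags are not part of the result. The sample markup and variable names are assumptions made for this example.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div id="a"><p>Hello</p></div></body></html>');

// Pick the node to serialize; here, the first <div> in the document.
$div = $doc->getElementsByTagName('div')->item(0);

// Passing a node limits the output to that subtree.
$fragment = $doc->saveHTML($div);
echo $fragment, PHP_EOL;
```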
If you load HTML from a string, ensure the charset is set.
Otherwise the charset will default to ISO-8859-1.
Tested in PHP 5.2.9-2 and PHP 5.2.17.
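One common way to set the charset when loading from a string is to prepend a meta content-type declaration, sketched below. It assumes the input string is UTF-8; the sample text is invented for the example, and the `\u{...}` escapes need PHP 7 or later.

```php
<?php
// UTF-8 input written with escapes so the example does not depend on the
// encoding of this source file.
$html = "<p>Gr\u{00FC}\u{00DF}e</p>";

$doc = new DOMDocument();
// Prepending a charset declaration tells the parser how to decode the bytes.
$doc->loadHTML(
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
);

$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;
```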
saveHTML() ignores the DOMDocument->encoding property. It saves the HTML document in the encoding specified in the <meta> tag of the source (loaded) HTML document.
Example:
file.html: the encoding of the file must match the encoding specified in its <meta> tag.
<?php
error_reporting(-1);
$document = new DOMDocument('1.0', 'UTF-8');
$document->preserveWhiteSpace = false;
$document->loadHTMLFile('file.html');
$document->formatOutput = true;
$document->encoding = 'UTF-8'; // ignored by saveHTML()
$htm = $document->saveHTML();
echo "Bytes written: " . file_put_contents('file_new.html', $htm);
?>
file_new.html will be encoded in Windows-1251 (not in UTF-8).
Using saveHTML() together with file_put_contents() lets you work around a shortcoming of the saveHTMLFile() method.
See my comment on the saveHTMLFile() method:
http://php.net/manual/ru/domdocument.savehtmlfile.php
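The saveHTML() plus file_put_contents() combination might look like the sketch below. It makes no claim about how saveHTMLFile() behaves; the inline source document and the temporary file location are assumptions for the example.

```php
<?php
// Source document with an explicit charset declaration in its <meta> tag.
$source = '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    . '<title>Demo</title></head><body><p>Hello</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($source);

$html = $doc->saveHTML();
if ($html === false) {
    throw new RuntimeException('saveHTML() failed');
}

// Write the string ourselves instead of calling saveHTMLFile().
$path = tempnam(sys_get_temp_dir(), 'dom');
$bytes = file_put_contents($path, $html);
echo "Bytes written: $bytes\n";

unlink($path); // clean up the temporary file
```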
To solve the script tag problem just add an empty text node to the script node and DOMDocument will render nicely.
To prevent script tags from being output as <script />, you can use the DOMDocumentFragment class:
<?php
$doc = new DOMDocument();
$doc->loadXML($xmlstring);
$fragment = $doc->createDocumentFragment();
/* Append the script element to the fragment using raw XML strings (will be preserved in their raw form) and, if successful, proceed to insert it in the DOM tree */
if ($fragment->appendXML("<script></script>")) {
    $xpath = new DOMXPath($doc);
    $resultlist = $xpath->query("//*[local-name() = 'html']/*[local-name() = 'head']"); /* namespace-safe method to find all head elements that are children of the html element; should only return 1 match */
    foreach ($resultlist as $headnode) { // insert the script tag
        $headnode->appendChild($fragment);
    }
}
$doc->saveXML(); /* and our script tags will still be <script></script> */
?>
If you want a simpler way to get around the <script /> problem:
<?php
$script = $doc->createElement('script');
// Creating an empty text node forces <script></script>
$script->appendChild($doc->createTextNode(''));
$head->appendChild($script);
?>
If you created your DOMDocument object using loadHTML() (where the source is from another site) and want to pass your changes back to the browser, make sure the HTTP Content-Type header matches the value of your meta content-type tag, because modern browsers seem to ignore the meta tag and trust only the HTTP header. For example, if you're reading an ISO-8859-1 document and your web server is claiming UTF-8, you need to correct it using the header() function.
<?php
header('Content-Type: text/html; charset=iso-8859-1');
?>
return html from a php function
Hi, thank you for your help. I had to amend your code slightly by taking out the $ sign when calling the key inside the foreach. I also added an if statement for best-practice reasons.
function machineJobs($jobs) {
    $output = "";
    foreach ($jobs as $job) {
        // Only render rows that have every expected key
        if (isset($job['id']) and isset($job['jobTitle']) and isset($job['jobStartDate']) and isset($job['jobDuration']) and isset($job['qty'])) {
            $output .= "<tr>";
            $output .= "<td>" . $job['id'] . "</td>";
            $output .= "<td>" . $job['jobTitle'] . "</td>";
            $output .= "<td>" . $job['jobStartDate'] . "</td>";
            $output .= "<td>" . $job['jobDuration'] . "</td>";
            $output .= "<td>" . $job['qty'] . "</td></tr>";
        }
    }
    return $output;
}
Alexandre Babeanu