PHP DOM to JSON


Parse an HTML page to a JSON representation of the DOM tree

I have a file, index.html, that contains several div tags. I am trying to fetch the content of all the div tags in the page, but I am getting the content of only the first one. I need the content of every div in the page. Here is my code:

<?php
// Function to get the attributes of an HTML tag
function get_attribute_contents($element) {
    $obj_attribute = array();
    foreach ($element->attributes as $attribute) {
        $obj_attribute[$attribute->name] = $attribute->value;
    }
    return $obj_attribute;
}

// Function to get contents of a child element of an HTML tag
function get_child_contents($element) {
    $obj_child = array();
    foreach ($element->childNodes as $subElement) {
        if ($subElement->nodeType != XML_ELEMENT_NODE) {
            if (trim($subElement->wholeText) != "") {
                $obj_child["value"] = $subElement->wholeText;
            }
        } else {
            if ($subElement->getAttribute('id')) {
                $obj_child[$subElement->tagName . "#" . $subElement->getAttribute('id')] = get_tag_contents($subElement);
            } else {
                $obj_child[$subElement->tagName] = get_tag_contents($subElement);
            }
        }
    }
    return $obj_child;
}

// Function to get the contents of an HTML tag
function get_tag_contents($element) {
    $obj_tag = array();
    if (get_attribute_contents($element)) {
        $obj_tag["attributes"] = get_attribute_contents($element);
    }
    if (get_child_contents($element)) {
        $obj_tag["child_nodes"] = get_child_contents($element);
    }
    return $obj_tag;
}

// Function to convert a DOM element to an object
function element_to_obj($element) {
    $object = array();
    $tag = $element->tagName;
    $object[$tag] = get_tag_contents($element);
    return $object;
}

// Function to convert an HTML string to a DOM element
function html_to_obj($html) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    return element_to_obj($dom->documentElement);
}

// Reading the contents of an HTML file
$html = file_get_contents('index.html');
header("Content-Type: text/plain");

// Converting the HTML to JSON
$output = json_encode(html_to_obj($html));

// Writing the JSON output to an external file
$file = fopen("js_output.json", "w");
fwrite($file, $output);
fclose($file);

echo "HTML to JSON conversion has been completed.\n";
echo "Please refer to js_output.json to view the JSON output.";
?>

When I run this code on my HTML file, the JSON output contains the content of only the first div tag.


The reason you see only one DIV element is that you are building an associative array keyed by tag name. The DIVs sit on the same tree level and therefore share the same key, so each one overwrites the previous one as you iterate over them.
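The overwriting can be reproduced without any DOM code. The following is a minimal sketch, assuming a hypothetical list of three sibling divs (the `$siblings` data is made up for illustration): keying by tag name leaves a single entry (here, the last one assigned), while pushing into an indexed array keeps all three.

```php
<?php
// Hypothetical sibling data: three <div> elements on the same tree level.
$siblings = [
    ['tag' => 'div', 'text' => 'first'],
    ['tag' => 'div', 'text' => 'second'],
    ['tag' => 'div', 'text' => 'third'],
];

// Keyed by tag name: every div writes to the same key, so only one survives.
$keyed = [];
foreach ($siblings as $node) {
    $keyed[$node['tag']] = $node['text'];
}

// Pushed into an indexed array: every div gets its own slot.
$indexed = [];
foreach ($siblings as $node) {
    $indexed[] = ['tagName' => $node['tag'], 'value' => $node['text']];
}

echo json_encode($keyed), "\n";
echo json_encode($indexed), "\n";
```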

Your code is a mess, and I think it's too much for something that simple. Here is my version of your code, which parses an HTML DOM element into an associative PHP array.

Note: to avoid overwriting elements with the same tag, I simply push the children into an indexed array and store the tag name as a field of each element.

A simple recursive approach (packed into a static class):


<?php
class DomToArray {

    /* Method to get the attributes of an element
     * @param $element -> Object DomElement
     * @return Array
     */
    private static function get_attribute_contents($element) {
        $obj_attribute = [];
        if ($element->hasAttributes()) {
            foreach ($element->attributes as $attribute) {
                $obj_attribute[$attribute->name] = $attribute->value;
            }
        }
        return $obj_attribute;
    }

    /* Recursive method to walk the DOM tree and extract the metadata we need
     * @param $element -> Object DomElement
     * @param &$tree -> Array Element
     * @param $text -> String || null
     * @return Array
     */
    private static function get_tag_contents($element, &$tree, $text = null) {
        // The node representation in our JSON model
        $tree = array(
            "tagName"     => ($element->nodeType === 1 ? $element->tagName : $element->nodeName),
            "nodeType"    => $element->nodeType,
            "attributes"  => self::get_attribute_contents($element),
            "value"       => $text,
            "child_nodes" => []
        );
        // Iterate over the children and parse them recursively:
        if ($element->hasChildNodes()) {
            foreach ($element->childNodes as $subElement) {
                $text = null;
                if ($subElement->nodeType === 3) {
                    // Collapse whitespace; this also removes \r and \n
                    $text = trim(preg_replace('/\s+/', ' ', $subElement->textContent));
                    if (empty($text)) continue; // Skip empty text nodes.
                }
                self::get_tag_contents($subElement, $tree["child_nodes"][], $text);
            }
        }
    }

    /* Main method to convert an HTML string to an array of nested elements
     * that represents the DOM tree.
     * @param &$html -> String
     * @return Array
     */
    public static function html_to_obj(&$html) {
        $dom = new DOMDocument();
        $dom->loadHTML($html);
        $tree = [];
        self::get_tag_contents($dom->documentElement, $tree);
        return $tree;
    }
}

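Now consider this program and input:

$source = "
<div class=\"issue-message\">
    Rename this package name to match the regular expression '^[a-z]+(\.[a-z][a-z0-9]*)*$'.
    <button class=\"button-link issue-rule icon-ellipsis-h little-spacer-left\" aria-label=\"Rule Details\"></button>
</div>
<div class=\"issue-message\">
    Replace this use of System.out or System.err by a logger.
    <button class=\"button-link issue-rule icon-ellipsis-h little-spacer-left\" aria-label=\"Rule Details\"></button>
</div>
";
$array_tree = DomToArray::html_to_obj($source);
echo json_encode($array_tree);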

The output will be:

{
  "tagName": "html",
  "nodeType": 1,
  "attributes": [],
  "value": null,
  "child_nodes": [
    {
      "tagName": "body",
      "nodeType": 1,
      "attributes": [],
      "value": null,
      "child_nodes": [
        {
          "tagName": "div",
          "nodeType": 1,
          "attributes": { "class": "issue-message" },
          "value": null,
          "child_nodes": [
            {
              "tagName": "#text",
              "nodeType": 3,
              "attributes": [],
              "value": "Rename this package name to match the regular expression '^[a-z]+(\\.[a-z][a-z0-9]*)*$'.",
              "child_nodes": []
            },
            {
              "tagName": "button",
              "nodeType": 1,
              "attributes": {
                "class": "button-link issue-rule icon-ellipsis-h little-spacer-left",
                "aria-label": "Rule Details"
              },
              "value": null,
              "child_nodes": []
            }
          ]
        },
        {
          "tagName": "div",
          "nodeType": 1,
          "attributes": { "class": "issue-message" },
          "value": null,
          "child_nodes": [
            {
              "tagName": "#text",
              "nodeType": 3,
              "attributes": [],
              "value": "Replace this use of System.out or System.err by a logger.",
              "child_nodes": []
            },
            {
              "tagName": "button",
              "nodeType": 1,
              "attributes": {
                "class": "button-link issue-rule icon-ellipsis-h little-spacer-left",
                "aria-label": "Rule Details"
              },
              "value": null,
              "child_nodes": []
            }
          ]
        }
      ]
    }
  ]
}
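Since the original goal was the content of every div, a tree in this shape can be walked again to pull that content out. The following is a rough sketch, not part of the answer: `collect_text()` and `text_by_tag()` are hypothetical helper names, and `$tree` is a shortened, made-up tree in the same shape as the output above. Nested matches are not reported separately; only the outermost matching node is.

```php
<?php
// Recursively gather the non-null "value" entries at and below a node.
function collect_text(array $node)
{
    $texts = [];
    if ($node['value'] !== null) {
        $texts[] = $node['value'];
    }
    foreach ($node['child_nodes'] as $child) {
        $texts = array_merge($texts, collect_text($child));
    }
    return $texts;
}

// Return the joined text of every node with the given tag name, in document order.
function text_by_tag(array $node, $tagName)
{
    if ($node['tagName'] === $tagName) {
        return [implode(' ', collect_text($node))];
    }
    $found = [];
    foreach ($node['child_nodes'] as $child) {
        $found = array_merge($found, text_by_tag($child, $tagName));
    }
    return $found;
}

// Small builders for the hypothetical sample tree.
function text_node($value)
{
    return ['tagName' => '#text', 'nodeType' => 3, 'attributes' => [],
            'value' => $value, 'child_nodes' => []];
}

function element_node($tagName, array $children)
{
    return ['tagName' => $tagName, 'nodeType' => 1, 'attributes' => [],
            'value' => null, 'child_nodes' => $children];
}

$tree = element_node('body', [
    element_node('div', [text_node('first message')]),
    element_node('div', [text_node('second message')]),
]);

$messages = text_by_tag($tree, 'div');
print_r($messages);
```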


Convert dynamic DOM structure to JSON

How can I convert the DOM below to a JSON array? It will have many more leaves in the future. I have tried a lot but did not find a solution. I found this, but it is not possible in my case.

The required JSON is:

function recursive(dis) {
    $(this).find(' >ul ').each(function() {
        params[$(this).attr('data-self')] = [];
        var i = 0;
        $(this).find('li').each(function() {
            if ($(this).has('ul')) {
                params[$(this).attr('data-parent')][i++] = rec(this);
            } else {
                params[$(this).attr('data-parent')][i++] = $(this).attr('data-self');
            }
        });
    });
}
recursive('#org1');
console.log(params);

I would prefer something like the following. The 2 changes in the HTML you need to make are:

1) Add a class to your ul.
2) Add a class to the wrapper ul.

The code below selects these elements as ul.parent and ul.nodes.

var json = {};
$("#org1").find("ul.parent").each(function() {
    var $this = $(this);
    json[$this.data("self")] = {};
    var $nodes = $this.find("ul.nodes");
    $nodes.each(function() {
        var children = $(this).find("li").map(function() {
            return $(this).data("self");
        }).get();
        json[$this.data("self")][$(this).data("self")] = children;
    });
});
alert(JSON.stringify(json));


Parsing "flat" HTML structure with PHP DOM

I'm attempting to use PHP DOM to help parse an HTML file that I want to translate into JSON. Unfortunately, the HTML DOM is fairly flat (and I have no way to change that). By flat I mean the structure is something like this:

<h2>title</h2>
child node another child

<h2>title</h2>
child node another child

<h2>title</h2>
child node another child

I need to be able to get the h2's and treat the elements that follow each one as its children. I'm not completely set on using PHP DOM if there's a better alternative; it's simply what I found in an answer I came across, so please feel free to suggest anything. What I really need is to turn this HTML string into JSON, and PHP DOM looks like my best bet thus far.

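// Note: the child tags are assumed to be <p>; only the h2 tag is confirmed by the code below.
$XML = <<<XML
<h2>title</h2>
<p>child node</p>
<p>another child</p>

<h2>title</h2>
<p>child node</p>
<p>another child</p>

<h2>title</h2>
<p>child node</p>
<p>another child</p>
XML;

$dom = new DOMDocument;
$dom->loadHTML($XML);
$xp = new DOMXPath($dom);

$new = new DOMDocument;
$root = $new->createElement('root');

foreach ($xp->query('/html//*/node()') as $i => $node) {
    if ($node->nodeType == XML_TEXT_NODE) continue;
    if ($node->nodeName == 'h2') {
        // A new heading closes the previous group and starts a new one.
        if (isset($current)) $root->appendChild($current);
        $current = $new->createElement('div');
        $current->appendChild($new->importNode($node, true));
        continue;
    }
    $current->appendChild($new->importNode($node, true));
}
// Append the last group, which the loop itself never flushes.
if (isset($current)) $root->appendChild($current);

$new->appendChild($root);
$xml2 = simplexml_load_string($new->saveHTML());
echo json_encode($xml2);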
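The same grouping can also be done straight into PHP arrays, skipping the intermediate document and SimpleXML. This is only a sketch under assumptions: `group_flat_html()` is a hypothetical helper name, the `title`/`children` keys are invented for the example, the sample markup uses `<p>` for the child elements, and anything that appears before the first heading is ignored.

```php
<?php
// Group the flat children of <body> into sections, starting a new one at each heading.
function group_flat_html($html, $headingTag = 'h2')
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sections = [];
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $sections;
    }
    foreach ($body->childNodes as $node) {
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip whitespace and stray text between elements
        }
        if ($node->nodeName === $headingTag) {
            // A heading opens a new section.
            $sections[] = ['title' => trim($node->textContent), 'children' => []];
        } elseif ($sections) {
            // Any other element belongs to the most recent section.
            $sections[count($sections) - 1]['children'][] = trim($node->textContent);
        }
    }
    return $sections;
}

$html = '<h2>one</h2><p>a</p><p>b</p><h2>two</h2><p>c</p>';
echo json_encode(group_flat_html($html)), "\n";
```

Building the arrays directly gives full control over the key names in the JSON, at the cost of keeping only the text of each child rather than its markup.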


matb33 / DOMDocumentExtended.php


Stupid PHP newbie question. How does someone get this to work? I am using include('DOMDocumentExtended.php'); and $doc = new DomDocument('1.0'); then some XML coding, then $doc->saveJSON(); which results in: Fatal error: Call to undefined method DOMDocument::saveJSON() in .

Try $doc = new DOMDocumentExtended() instead
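The error follows from how methods are looked up: a method defined on a subclass is only available on instances of that subclass. A minimal sketch, using a hypothetical `ExampleDocument` class and `rootName()` method as stand-ins (this is not the gist's XSLT-based `saveJSON()`):

```php
<?php
// Hypothetical stand-in for DOMDocumentExtended: a subclass that adds one method.
class ExampleDocument extends DOMDocument
{
    // Placeholder method, only here to show where added methods live.
    public function rootName()
    {
        return $this->documentElement ? $this->documentElement->nodeName : null;
    }
}

$plain = new DOMDocument('1.0');
$extended = new ExampleDocument('1.0');
$extended->loadXML('<note><to>reader</to></note>');

var_dump(method_exists($plain, 'rootName'));    // the base class does not have the method
var_dump(method_exists($extended, 'rootName')); // the subclass instance does
echo $extended->rootName(), "\n";
```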

I tried to use your solution but get an error:
Parse error: syntax error, unexpected T_SL in /home/ccc/domains/ccc.com/public_html/amvl/DOMDocumentExtended.php on line 48

Line 48 is: define( "XML_TO_JSON_STYLESHEET",

Do you understand what is wrong?

Probably because of the "nowdoc" syntax:

You’ll need PHP 5.3.0 or higher
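For reference, a nowdoc is a heredoc whose opening identifier is wrapped in single quotes, and its body is taken literally. The sketch below uses a made-up XML snippet (not the gist's stylesheet) and shows that an ordinary single-quoted string can hold the same literal text, which is one possible workaround on interpreters older than 5.3.0.

```php
<?php
// Nowdoc (PHP 5.3.0+): the body is taken literally, with no variable interpolation.
$nowdoc = <<<'XML'
<greeting name="$notAVariable">hello</greeting>
XML;

// A single-quoted string holds the same literal text and also parses on older versions.
$quoted = '<greeting name="$notAVariable">hello</greeting>';

var_dump($nowdoc === $quoted);
```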

Couldn’t get this to work fully. It generally worked, but for some strange reason some of the attribute’s values were randomly missing in the resulting JSON even though they had normal everyday values in the XML.

For json_encoding a DOMDocument $a
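The commenter's snippet is not preserved here. One common way to do this, shown purely as an assumption about what was meant, is to hand the document to SimpleXML and encode the result. The XML content of `$a` below is made up for the example.

```php
<?php
// Hypothetical document standing in for the commenter's $a.
$a = new DOMDocument();
$a->loadXML('<note><to>reader</to><from>writer</from></note>');

// Hand the DOM tree to SimpleXML, whose elements json_encode() can serialize.
$xml = simplexml_import_dom($a);
$json = json_encode($xml);

echo $json, "\n";
```

This shortcut is lossy: attributes on elements that contain only text may not survive the conversion, so it suits simple, element-only documents best.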

Used this for some SOAP applications where I couldn't get nuSOAP or the built-in SOAP functionality to work very well in PHP. There were a couple of minor bugs that I fixed, and it works like a charm now.

…
            "|" . implode( "|", $forceArray ) . "|"
        );
        $xslDocument = new DOMDocument();
        $xslDocument->loadXML( XML_TO_JSON_STYLESHEET );
        try {
            $processor = new XSLTProcessor();
            $processor->importStyleSheet( $xslDocument );
            $processor->setParameter( "", $xsltParameters );
            $result = $processor->transformToXML( $this );
            $failure = $result === false || empty( $result );
        } catch( Exception $e ) {
            $failure = $result = false;
        }
        if( $failure ) {
            // TODO: implement error handling (throw an exception, preferably looking up libxml errors)
        }
        return $result;
    }
}

define( "XML_TO_JSON_STYLESHEET", <<<'XML'
[XSLT stylesheet body not preserved]
XML
);
?>


trinhnk / html_to_obj.php


// https://stackoverflow.com/questions/23062537/how-to-convert-html-to-json-using-php
function html_to_obj($html) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    return element_to_obj($dom->documentElement);
}

function element_to_obj($element) {
    if (isset($element->tagName)) {
        $obj = array('tag' => $element->tagName);
    }
    if (isset($element->attributes)) {
        foreach ($element->attributes as $attribute) {
            $obj[$attribute->name] = $attribute->value;
        }
    }
    if (isset($element->childNodes)) {
        foreach ($element->childNodes as $subElement) {
            if ($subElement->nodeType == XML_TEXT_NODE) {
                $obj['html'] = $subElement->wholeText;
            } elseif ($subElement->nodeType == XML_CDATA_SECTION_NODE) {
                $obj['html'] = $subElement->data;
            } else {
                $obj['children'][] = element_to_obj($subElement);
            }
        }
    }
    return (isset($obj)) ? $obj : null;
}
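One thing to watch in the function above: `$obj['html']` is assigned once per text node, so for an element with mixed content (text, a child element, more text) only the last text node is kept. The following is a sketch of one possible variant that appends instead. `element_to_obj_concat()` is a hypothetical name, not part of the gist, and the sample markup is made up.

```php
<?php
// Variant of element_to_obj() that appends text nodes instead of overwriting them.
function element_to_obj_concat($element)
{
    $obj = ['tag' => $element->nodeName];
    if ($element->attributes !== null) {
        foreach ($element->attributes as $attribute) {
            $obj[$attribute->name] = $attribute->value;
        }
    }
    foreach ($element->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE || $child->nodeType === XML_CDATA_SECTION_NODE) {
            // Append to any text already collected for this element.
            $obj['html'] = (isset($obj['html']) ? $obj['html'] : '') . $child->nodeValue;
        } elseif ($child->nodeType === XML_ELEMENT_NODE) {
            $obj['children'][] = element_to_obj_concat($child);
        }
    }
    return $obj;
}

$dom = new DOMDocument();
$dom->loadHTML('<p id="intro">Hello <b>big</b> world</p>');
$p = $dom->getElementsByTagName('p')->item(0);
echo json_encode(element_to_obj_concat($p)), "\n";
```

As in the original, an attribute named `tag`, `html`, or `children` would collide with the keys the function itself uses. Nesting the attributes under their own key would avoid that.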
