- How to extract tag attributes with Simple html dom ?
- Using PHP simple html dom get attributes name in span?
- Simple HTML DOM get meta tag’s special attribute’s value
- Simple html dom: how get a tag without certain attribute
- Parsing documents
- DOM methods & properties
- Element methods & properties
- DOM traversing
- Camel naming conventions
How to extract tag attributes with Simple html dom ?
I am trying to extract information using simple_html_dom.php.
I need the "Mo,Tu,We,Th,Fr,Sa,Su 08:00-00:00" part.
Here is what I tried so far:

    $url = "https://www1.shoppersdrugmart.ca/en/store-locator/store/668";
    include('../classes/simple_html_dom.php');
    $html = file_get_html($url);

    // this works fine
    $eg = $html->find('dd[itemprop="telephone"]');
    echo "Phone: " . $eg[0]->plaintext . "\n";

    // this does not work
    $eg = $html->find('meta[itemprop="openingHours"]');
    echo "openingHours: " . $eg['content']->plaintext . "\n";
    $oh_content = $html->find('meta[itemprop="openingHours"]')->attr("content");
    echo $oh_content . "*\n";
    $oh_content1 = $html->find('meta[itemprop="openingHours"]')->content;
    echo $oh_content1 . "*\n";
Called without an index, find() returns an array of matching nodes. That is why $eg[0] works in your first call, $eg = $html->find('dd[itemprop="telephone"]'); and the same holds for your second find:

    $eg = $html->find('meta[itemprop="openingHours"]');
    // $eg is an array:
    var_dump($eg[0]->content);

In case someone needs it, here is the code that works:

    $url = "https://www1.shoppersdrugmart.ca/en/store-locator/store/668";
    include('../classes/simple_html_dom.php');
    $html = file_get_html($url);

    $eg = $html->find('dd[itemprop="telephone"]');
    echo "Phone: " . $eg[0]->plaintext . "\n";

    $eg = $html->find('meta[itemprop="openingHours"]');
    echo "openingHours: " . $eg[0]->content . "\n";
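The difference between the failing and the working calls in the question above comes down to whether find() is given an index. The following is a minimal sketch of that distinction. It assumes that simple_html_dom.php (a release compatible with the installed PHP version) sits next to the script, and it uses a hypothetical inline markup string in place of the real store page.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical markup standing in for the store page.
$markup = '<dl><dd itemprop="telephone">555-0100</dd></dl>'
        . '<meta itemprop="openingHours" content="Mo,Tu,We,Th,Fr,Sa,Su 08:00-00:00">';

$html = str_get_html($markup);

// Without an index, find() returns an array of matching elements.
$all = $html->find('meta[itemprop="openingHours"]');

// With an index, find() returns a single element, or null when nothing matches.
$meta = $html->find('meta[itemprop="openingHours"]', 0);

// Attributes are read as properties of the element, not of the array.
$hours = ($meta !== null) ? $meta->content : '';
echo "openingHours: " . $hours . "\n";
```

Checking for null before reading the attribute avoids a fatal error when the page layout changes and the selector no longer matches.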
Using PHP simple html dom get attributes name in span?
I am not sure if "tags" is the right term, but I have to get the "data-time" values from this span into an array. How can I use Simple HTML DOM to get them?
Here is one span I am trying to get the "data-time" out of.

    include('../simpleHtmlDom/simple_html_dom.php');

    // Put the Twitter username here
    $user = "yadayada";
    $html = file_get_html("https://twitter.com/$user");
    $ret = $html->find('div[class=ProfileTweet-contents]');
    $ret = $html->find('p[class=ProfileTweet-text js-tweet-text u-dir]');

    // tries to get the time code but only gets the span
    $date = $html->find('span[class=js-short-timestamp js-relative-timestamp]', 0);
    $DoesNotWork = $html->find("data-time", 0);

    echo $ret[1]; // gets a user's tweet
    echo $DoesNotWork;

I would think it is something like this, but this code does not work: $html->find("data-time", 0);
    // Include the script
    $url = 'https://twitter.com/yourusername';
    $html = file_get_html($url);

    $dateTimes = array();
    foreach ($html->find('div.GridTimeline .js-short-timestamp') as $value) {
        $dateTimes[] = $value->innertext;
    }

Result of print_r($dateTimes):

    Array
    (
        [0] => 2h
        [1] => 2h
        [2] => 2h
        // Truncated.
        [10] => 11h
        [11] => May 30
        [12] => May 30
        [13] => May 6
        // Truncated.
    )

I was able to get the date using this code, though I think there is a better way. It would be best to find a Simple HTML DOM call that reads the data-time value directly, but instead I used two list() lines, as seen below, and that worked.

    $dateTimes = array();
    foreach ($html->find('div.GridTimeline .js-short-timestamp') as $value) {
        $dateTimes[] = $value->outertext;
    }

    // These are the lines I get the date-time from.
    list($Gone, $Keep) = explode("data-time=\"", $dateTimes[0]);
    list($Date, $Gone) = explode("\"", $Keep);
    $Date = date('M d, Y', $Date);
In case anyone lands here in 2021 from the top Google search result:
Unless I misinterpreted your intention, you might achieve what you want using (with simplehtmldom):

    $html->find('span[data-time]', 0)->attr['data-time'];

Note the index passed to find(), which makes it return a single element instead of an array, and the quotes around the attribute name.
The official simplehtmldom documentation fails to mention that. However, https://stackoverflow.com/a/14456823/10050838 is one possible source.
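To collect every data-time value into an array, as the question asks, the attribute selector can be combined with a loop. This is only a sketch under stated assumptions: the markup string and the timestamp values are hypothetical (the real page structure is not shown in the thread), and simple_html_dom.php is assumed to be in the script's directory.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical markup; class names are borrowed from the question.
$markup = '<div>'
        . '<span class="js-short-timestamp" data-time="1401480000">May 30</span>'
        . '<span class="js-short-timestamp" data-time="1399400000">May 6</span>'
        . '<span class="other">no timestamp here</span>'
        . '</div>';

$html = str_get_html($markup);

$times = array();
// span[data-time] matches only spans that carry the attribute.
foreach ($html->find('span[data-time]') as $span) {
    // Hyphenated names cannot be written as $span->data-time,
    // so the value is read from the attr array instead.
    $times[] = (int) $span->attr['data-time'];
}

print_r($times);
```

Each collected value can then be passed to date() in the same way as in the workaround above, without parsing outertext by hand.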
Simple HTML DOM get meta tag’s special attribute’s value
I want to use http://simplehtmldom.sourceforge.net/ on my project
How can I get content’s value?
I'm trying the code below, but it is not working:

    echo $html->find('meta[itemprop=keywords]')[0]["attr"]["content"];

The expected output should be: a,b,c,d

Solution 1, following the official manual (http://simplehtmldom.sourceforge.net/manual.htm):

    echo $html->find('meta[itemprop=keywords]', 0)->content;
Solution 2: You can extract all the meta tags and loop through them:

    foreach($html->find('meta') as $meta)

Then you can check if the meta tag has an attribute called "itemprop", and if the itemprop has a value of "keywords":

    if($meta->getAttribute('itemprop') && $meta->getAttribute('itemprop') == 'keywords')

Then you can just output the content attribute:

    echo $meta->getAttribute('content');

Complete code:

    foreach ($html->find('meta') as $meta) {
        if ($meta->getAttribute('itemprop') && $meta->getAttribute('itemprop') == 'keywords') {
            echo $meta->getAttribute('content');
        }
    }
Simple html dom: how get a tag without certain attribute
I want to get the tags whose "class" attribute equals "someclass", but only those tags that do not define the "id" attribute.
I tried the following (based on this answer), but it didn't work:
I'm using the Simple HTML DOM class, and I didn't find what I need in the basic documentation they provide.
From the PHP Simple HTML DOM Parser Manual, under "How to find HTML elements?", we can read:
[!attribute] Matches elements that don’t have the specified attribute.
Your code would become:
This will match elements with a class someClass that do not have an id attribute.
My original answer was based on the selection of elements just like we would with jQuery since the Simple HTML DOM Parser claims to support them on their main page where we can read:
Find tags on an HTML page with selectors just like jQuery.
My sincere apologies to those who were offended by my original answer and expressed their displeasure in the comments!
The Simple HTML DOM class does not support CSS3 pseudo-classes, which are required for negative attribute matching.
The limitation is easy to work around:

    $nodes = array_filter($html->find('.something'), function($node) { return empty($node->id); });
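The filtering workaround can be tried on a small document. The sketch below assumes simple_html_dom.php is in the script's directory and uses a hypothetical markup string. It tests the attribute with isset(), which the manual's mapping table lists as the equivalent of hasAttribute().

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical markup: three elements share a class, one of them has an id.
$markup = '<div class="something" id="a">with id</div>'
        . '<div class="something">without id</div>'
        . '<p class="something">also without id</p>';

$html = str_get_html($markup);

// Select by class first, then drop every element that defines an id.
// array_values() reindexes the result, since array_filter() preserves keys.
$withoutId = array_values(array_filter(
    $html->find('.something'),
    function ($node) {
        return !isset($node->id);
    }
));

foreach ($withoutId as $node) {
    echo $node->tag . ': ' . $node->plaintext . "\n";
}
```

Filtering in PHP keeps the selector simple, which helps when it is unclear how a given library release handles a class selector combined with [!attribute].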
Parsing documents
The parser accepts documents in the form of URLs, files and strings. The document must be accessible for reading and cannot exceed MAX_FILE_SIZE .
Name | Description |
---|---|
str_get_html( string $content ) : object | Creates a DOM object from string. |
file_get_html( string $filename ) : object | Creates a DOM object from file or URL. |
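As a minimal sketch of the string entry point, the example below parses an inline document and checks the result before using it. It assumes simple_html_dom.php is in the script's directory. The false check reflects the behavior of the 1.x releases for empty or oversized input, as far as I know, and should be treated as an assumption for other versions.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical document used only for illustration.
$html = str_get_html('<html><body><p id="greeting">Hello</p></body></html>');

// Guard against a failed parse (for example, empty input or a document
// larger than MAX_FILE_SIZE) before calling methods on the result.
if ($html === false) {
    exit("Could not parse the document\n");
}

$greeting = $html->find('#greeting', 0)->plaintext;
echo $greeting . "\n";
```

file_get_html() is used the same way, with a file path or URL in place of the markup string.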
DOM methods & properties
Name | Description |
---|---|
__construct( [string $filename] ) : void | Constructor. If the filename parameter is set, the contents are loaded automatically, from either text or a file/URL. |
plaintext : string | Returns the contents extracted from HTML. |
clear() : void | Clean up memory. |
load( string $content ) : void | Load contents from string. |
save( [string $filename] ) : string | Dumps the internal DOM tree back into a string. If $filename is set, the resulting string is also saved to that file. |
load_file( string $filename ) : void | Load contents from a file or a URL. |
set_callback( string $function_name ) : void | Set a callback function. |
find( string $selector [, int $index] ) : mixed | Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise returns an array of objects. |
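The object-oriented entry points in the table can be combined as follows. This is a sketch, not the manual's own example: the markup is hypothetical, and simple_html_dom.php is assumed to be in the script's directory.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$dom = new simple_html_dom();
$dom->load('<ul><li>one</li><li>two</li></ul>');

// No index: an array of all matches.
$items = $dom->find('li');
$itemCount = count($items);

// Index 1: the second match as a single element.
$second = $dom->find('li', 1);
$second->innertext = 'second';

// save() without a filename returns the modified tree as a string.
$out = $dom->save();
echo $itemCount . " items\n" . $out . "\n";

// Release the parser's internal references once the work is done.
$dom->clear();
```

Anything needed later, such as the item count or the saved string, is copied into plain variables before clear() is called.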
Element methods & properties
Name | Description |
---|---|
[attribute] : string | Read or write element’s attribute value. |
tag : string | Read or write the tag name of element. |
outertext : string | Read or write the outer HTML text of element. |
innertext : string | Read or write the inner HTML text of element. |
plaintext : string | Read or write the plain text of element. |
find( string $selector [, int $index] ) : mixed | Find children by the CSS selector. Returns the Nth element object if index is set, otherwise returns an array of objects. |
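The element properties above can be read and written like ordinary object properties. The following sketch assumes simple_html_dom.php is in the script's directory. The markup and the attribute values are hypothetical.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$html = str_get_html('<div id="box" class="note"><b>Hi</b> there</div>');
$div = $html->find('div', 0);

// Reading: tag name, attribute value, inner HTML and text content.
$tag   = $div->tag;
$class = $div->class;
$inner = $div->innertext;
$plain = $div->plaintext;

// Writing: change an existing attribute and add a new one.
$div->class = 'warning';
$div->title = 'added';

// outertext reflects the changes made above.
$outer = $div->outertext;
echo $tag . "\n" . $inner . "\n" . $outer . "\n";
```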
DOM traversing
Name | Description |
---|---|
$e->children( [int $index] ) : mixed | Returns the Nth child object if index is set, otherwise return an array of children. |
$e->parent() : element | Returns the parent of element. |
$e->first_child() : element | Returns the first child of element, or null if not found. |
$e->last_child() : element | Returns the last child of element, or null if not found. |
$e->next_sibling() : element | Returns the next sibling of element, or null if not found. |
$e->prev_sibling() : element | Returns the previous sibling of element, or null if not found. |
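A short sketch of the traversal methods on a three-item list follows. The markup is hypothetical, and simple_html_dom.php is assumed to be in the script's directory.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$html = str_get_html('<ul id="menu"><li>a</li><li>b</li><li>c</li></ul>');
$ul = $html->find('#menu', 0);

$first  = $ul->first_child();      // first <li>
$last   = $ul->last_child();       // last <li>
$middle = $first->next_sibling();  // the <li> after the first one

$childCount = count($ul->children()); // all child elements as an array
$byIndex    = $ul->children(1);       // a single child selected by index

// The first child has no previous sibling, so null is returned.
$before = $first->prev_sibling();

echo $first->plaintext . $middle->plaintext . $last->plaintext . "\n";
echo $middle->parent()->tag . "\n";
```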
Camel naming conventions
Method | Mapping |
---|---|
$e->getAllAttributes() | $e->attr |
$e->getAttribute( $name ) | $e->attribute |
$e->setAttribute( $name, $value) | $e->attribute = $value |
$e->hasAttribute( $name ) | isset($e->attribute) |
$e->removeAttribute ( $name ) | $e->attribute = null |
$e->getElementById ( $id ) | $e->find ( "#$id", 0 ) |
$e->getElementsById ( $id [,$index] ) | $e->find ( "#$id" [, int $index] ) |
$e->getElementByTagName ($name ) | $e->find ( $name, 0 ) |
$e->getElementsByTagName ( $name [, $index] ) | $e->find ( $name [, int $index] ) |
$e->parentNode () | $e->parent () |
$e->childNodes ( [$index] ) | $e->children ( [int $index] ) |
$e->firstChild () | $e->first_child () |
$e->lastChild () | $e->last_child () |
$e->nextSibling () | $e->next_sibling () |
$e->previousSibling () | $e->prev_sibling () |
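The camel-case methods and their property-style counterparts can be compared side by side. The sketch below assumes simple_html_dom.php is in the script's directory. The link markup and the rel value are hypothetical.

```php
<?php
// Assumption: the library file is in the same directory as this script.
require_once 'simple_html_dom.php';

$html = str_get_html('<a id="link" href="/home" title="Home">Home</a>');

// getElementById($id) corresponds to find("#$id", 0).
$a = $html->getElementById('link');

// getAttribute($name) corresponds to reading $e->attribute.
$sameValue = ($a->getAttribute('href') === $a->href);

// hasAttribute($name) corresponds to isset($e->attribute).
$hadTitle = $a->hasAttribute('title');

// setAttribute($name, $value) corresponds to $e->attribute = $value.
$a->setAttribute('rel', 'nofollow');

// removeAttribute($name) corresponds to $e->attribute = null.
$a->removeAttribute('title');

$outer = $a->outertext;
echo $outer . "\n";
```

Which style to use is a matter of taste. The camel-case names may feel more familiar to readers coming from the browser DOM API.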