Diffstat (limited to 'devdocs/elisp/parsing-html_002fxml.html')
| -rw-r--r-- | devdocs/elisp/parsing-html_002fxml.html | 30 |
1 files changed, 0 insertions, 30 deletions
diff --git a/devdocs/elisp/parsing-html_002fxml.html b/devdocs/elisp/parsing-html_002fxml.html
deleted file mode 100644
index d60e483f..00000000
--- a/devdocs/elisp/parsing-html_002fxml.html
+++ /dev/null
@@ -1,30 +0,0 @@
- <h3 class="section">Parsing HTML and XML</h3> <p>Emacs can be compiled with built-in libxml2 support. </p> <dl> <dt id="libxml-available-p">Function: <strong>libxml-available-p</strong>
-</dt> <dd><p>This function returns non-<code>nil</code> if built-in libxml2 support is available in this Emacs session. </p></dd>
-</dl> <p>When libxml2 support is available, the following functions can be used to parse HTML or XML text into Lisp object trees. </p> <dl> <dt id="libxml-parse-html-region">Function: <strong>libxml-parse-html-region</strong> <em>start end &amp;optional base-url discard-comments</em>
-</dt> <dd>
-<p>This function parses the text between <var>start</var> and <var>end</var> as HTML, and returns a list representing the HTML <em>parse tree</em>. It attempts to handle real-world HTML by robustly coping with syntax mistakes. </p> <p>The optional argument <var>base-url</var>, if non-<code>nil</code>, should be a string specifying the base URL for relative URLs occurring in links. </p> <p>If the optional argument <var>discard-comments</var> is non-<code>nil</code>, any top-level comment is discarded. (This argument is obsolete and will be removed in future Emacs versions. To remove comments, use the <code>xml-remove-comments</code> utility function on the data before you call the parsing function.) </p> <p>In the parse tree, each HTML node is represented by a list in which the first element is a symbol representing the node name, the second element is an alist of node attributes, and the remaining elements are the subnodes. </p> <p>The following example demonstrates this. Given this (malformed) HTML document: </p> <div class="example"> <pre class="example">&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body width=101&gt;&lt;div class=thing&gt;Foo&lt;div&gt;Yes
-</pre>
-</div> <p>A call to <code>libxml-parse-html-region</code> returns this <acronym>DOM</acronym> (document object model): </p> <div class="example"> <pre class="example">(html nil
- (head nil)
- (body ((width . "101"))
-  (div ((class . "thing"))
-   "Foo"
-   (div nil
-    "Yes"))))
-</pre>
-</div> </dd>
-</dl> <dl> <dt id="shr-insert-document">Function: <strong>shr-insert-document</strong> <em>dom</em>
-</dt> <dd><p>This function renders the parsed HTML in <var>dom</var> into the current buffer. The argument <var>dom</var> should be a list as generated by <code>libxml-parse-html-region</code>. This function is, e.g., used by <a href="https://www.gnu.org/software/emacs/manual/html_node/eww/index.html#Top">EWW</a> in <cite>The Emacs Web Wowser Manual</cite>. </p></dd>
-</dl> <dl> <dt id="libxml-parse-xml-region">Function: <strong>libxml-parse-xml-region</strong> <em>start end &amp;optional base-url discard-comments</em>
-</dt> <dd><p>This function is the same as <code>libxml-parse-html-region</code>, except that it parses the text as XML rather than HTML (so it is stricter about syntax). </p></dd>
-</dl> <table class="menu" border="0" cellspacing="0"> <tr>
-<td align="left" valign="top">• <a href="document-object-model" accesskey="1">Document Object Model</a>
-</td>
-<td> </td>
-<td align="left" valign="top">Access, manipulate and search the <acronym>DOM</acronym>. </td>
-</tr> </table><div class="_attribution">
- <p class="_attribution-p">
- Copyright © 1990-1996, 1998-2022 Free Software Foundation, Inc. <br>Licensed under the GNU GPL license.<br>
- <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Parsing-HTML_002fXML.html" class="_attribution-link">https://www.gnu.org/software/emacs/manual/html_node/elisp/Parsing-HTML_002fXML.html</a>
- </p>
-</div>
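
For reference, the functions documented in the removed page might be exercised roughly as follows. This is a minimal sketch, not part of the commit: the helper name `my-parse-html-string` is hypothetical, and it assumes an Emacs built with libxml2 support. It is not shown with expected output, since the result depends on the build.

```elisp
(require 'shr)

;; Hypothetical helper (not an Emacs built-in): parse an HTML string
;; by inserting it into a temporary buffer and parsing that region.
(defun my-parse-html-string (html)
  "Parse the string HTML and return its parse tree.
Return nil when this Emacs has no libxml2 support."
  (when (and (fboundp 'libxml-available-p) (libxml-available-p))
    (with-temp-buffer
      (insert html)
      (libxml-parse-html-region (point-min) (point-max)))))

(let ((dom (my-parse-html-string
            "<html><head></head><body width=101><div class=thing>Foo<div>Yes")))
  (when dom
    ;; Each node has the shape (NAME ATTRIBUTES . CHILDREN).
    (message "root: %S, attributes: %S, child count: %d"
             (car dom) (cadr dom) (length (cddr dom)))
    ;; Render the parse tree into a scratch buffer and return the text.
    (with-temp-buffer
      (shr-insert-document dom)
      (buffer-string))))
```

Guarding on `libxml-available-p` keeps the sketch from signalling an error on builds without libxml2; in that case the form simply evaluates to nil.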
