Extract attributes, text, and HTML from elements


Problem

After parsing a document and finding some elements, you'll want to get at the data inside those elements.

Solution

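Use Node.attr(String key) to get the value of an attribute, Element.text() to get the text of an element and its combined children, and Element.html() or Node.outerHtml() to get HTML. For example: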
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(html);
Element link = doc.select("a").first(); // first <a> element in the document

String text = doc.body().text(); // "An example link."
String linkHref = link.attr("href"); // "http://example.com/"
String linkText = link.text(); // "example"
String linkOuterH = link.outerHtml();
    // "<a href="http://example.com/"><b>example</b></a>"
String linkInnerH = link.html(); // "<b>example</b>"

Description

The methods above are the core element data access methods. There are others as well:
  • Element.id()
  • Element.tagName()
  • Element.className() and Element.hasClass(String className)
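These accessor methods have corresponding setter methods for changing the data, such as Element.attr(String key, String value), Element.text(String text), and Element.html(String html).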
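
As a rough illustration of the additional accessors and a few of the setters, the sketch below reads the id, tag name, and class data from a link and then modifies it. The sample markup, the class name ElementDataSketch, and the helper methods firstLink() and modifiedLink() are made up for this example; it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class ElementDataSketch {
    // Hypothetical sample markup, used only for this illustration.
    static final String HTML =
        "<p>An <a id='ex' class='ext link' href='http://example.com/'>"
        + "<b>example</b></a> link.</p>";

    // Parses the sample markup and returns its first <a> element.
    static Element firstLink() {
        return Jsoup.parse(HTML).select("a").first();
    }

    // Applies a few setters to a freshly parsed link and returns it.
    static Element modifiedLink() {
        Element link = firstLink();
        link.attr("href", "http://example.org/"); // replace the attribute value
        link.addClass("visited");                 // add a class name
        link.text("changed");                     // replace all children with plain text
        return link;
    }

    public static void main(String[] args) {
        Element link = firstLink();
        System.out.println(link.id());            // ex
        System.out.println(link.tagName());       // a
        System.out.println(link.className());     // ext link
        System.out.println(link.hasClass("ext")); // true

        Element changed = modifiedLink();
        System.out.println(changed.attr("href"));        // http://example.org/
        System.out.println(changed.hasClass("visited")); // true
        System.out.println(changed.text());              // changed
    }
}
```

Each setter used here returns the element it was called on, so the calls in modifiedLink() could also be chained.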
