XPath for HTML markup

- April 02, 2023

XPath

Web Scraping

Web Scraping is a technique for extracting data from web pages. It typically works by traversing the DOM (Document Object Model) of an HTML page retrieved over HTTP. XPath is one of the common ways to select the nodes to extract.

XPath nodes:
There are seven kinds of nodes:

Element - It represents any HTML 5 element. For example, <strong></strong>
Attribute - It represents a single attribute of an HTML 5 element in the document object model. For example, the id in <div id="names">
Text - It represents the text between the opening HTML 5 tag and the closing HTML 5 tag
Namespace - It represents an XML namespace that is in scope for an element. For example, the XHTML namespace
Processing instruction - It represents an XML processing instruction of the form <?target data?>. These rarely appear in HTML 5 pages
Comment - It represents any HTML 5 comment. For example, <!-- a comment -->
Root - It represents the top of the tree, which is the document itself. Its child element is called the root element. For example, the root element for any HTML 5 document is <html>
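
To make the node kinds above concrete, here is a minimal sketch using Python's standard xml.dom.minidom module. It exposes DOM node types rather than the XPath data model itself, but the two line up closely for the root, element, attribute, text and comment kinds. The markup in SNIPPET is an assumption made up for illustration.

```python
from xml.dom import minidom

# Hypothetical markup, chosen only to contain several node kinds.
SNIPPET = '<div id="names"><!-- a comment --><strong>Gaurav</strong></div>'

doc = minidom.parseString(SNIPPET)    # the document (root) node
div = doc.documentElement             # the root element, <div>
attr = div.getAttributeNode("id")     # an attribute node
comment = div.childNodes[0]           # a comment node
strong = div.childNodes[1]            # an element node, <strong>
text = strong.firstChild              # the text node inside <strong>

print(doc.nodeType == doc.DOCUMENT_NODE)        # root
print(div.nodeType == div.ELEMENT_NODE)         # element
print(attr.nodeType == attr.ATTRIBUTE_NODE)     # attribute
print(comment.nodeType == comment.COMMENT_NODE) # comment
print(text.nodeType == text.TEXT_NODE)          # text
```

Each print call checks one node against the matching DOM node-type constant.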

An XPath expression that starts with / begins at the root of the document. Suppose I have an HTML5 code snippet as follows:

<div>
  <ul>
    <li>Gaurav</li>
    <li>Shirodkar</li>
  </ul>
</div>

Then, to find the 2nd list item in the unordered list, the XPath is /div/ul/li[2]. XPath positions start at 1, so li[1] is the first list item and li[2] is the second. The XPath is traversed from left to right. The first / denotes the root of the document. Then div selects the <div> element, which is the outermost element of this snippet (in a full HTML page the path would start with /html/body instead). Then /ul selects the unordered list that is a child of that <div>. Finally, /li[2] selects the second list item child of the unordered list, which is its parent node in the Document Object Model (DOM).
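
The lookup can be tried with Python's standard xml.etree.ElementTree module, as a minimal sketch. ElementTree supports only a subset of XPath and evaluates paths relative to an element, so the absolute path /div/ul/li[2] is written here as ./ul/li[2] starting from the <div>. The markup in SNIPPET is an assumed div/ul/li structure holding the two names from the example.

```python
import xml.etree.ElementTree as ET

# Assumed markup: a <div> wrapping an unordered list with two items.
SNIPPET = """
<div>
  <ul>
    <li>Gaurav</li>
    <li>Shirodkar</li>
  </ul>
</div>
"""

# fromstring returns the outermost element, here <div>.
root = ET.fromstring(SNIPPET.strip())

# Positions in XPath predicates are 1-based.
first = root.find("./ul/li[1]")
second = root.find("./ul/li[2]")

print(first.text)   # Gaurav
print(second.text)  # Shirodkar
```

For full XPath 1.0 support, including absolute paths, a third-party library would be needed; the standard library is used here only to keep the sketch self-contained.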
