Applications of Web Structure Mining

  • Retrieving information in social networks.
  • Determining the relevance of each web page.
  • Measuring the completeness of websites.
  • Helping search engines find relevant information.

Web Structure Mining

Prerequisites: Web Mining

Web Structure Mining is one of the three types of Web Mining techniques. This article discusses only Web Structure Mining, which is the technique of discovering structural information from the Web. It uses graph theory to analyze the nodes and connections in the structure of a website.

 

Depending on the type of web structural data, Web Structure Mining can be categorised into two types:

1. Extracting patterns from hyperlinks in the Web: The Web works through a system of hyperlinks using the Hypertext Transfer Protocol (HTTP). A hyperlink is a structural component that connects a web page to a different location. Any page can link to any other page, and that page can in turn link to others. The intertwined, self-referential nature of the Web lends itself to some unique network-analysis algorithms. The structure of web pages can also be analyzed to examine the pattern of hyperlinks among them.
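As a rough illustration of hyperlink extraction, the sketch below collects the link targets from one page using only Python's standard library. The class name `LinkExtractor`, the helper `extract_links`, the sample HTML, and the example.com URLs are all hypothetical and exist only for this example; a real crawler would also need to fetch pages, handle errors, and respect robots.txt.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute target of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the list of hyperlink targets found in one HTML document."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content and URL, used only for illustration.
SAMPLE_HTML = """
<html><body>
  <a href="/about.html">About</a>
  <a href="https://example.org/news">News</a>
  <a name="top">Anchor without a target</a>
</body></html>
"""

links = extract_links("https://example.com/index.html", SAMPLE_HTML)
print(links)
```

Each extracted (page, target) pair can then be treated as one directed edge when the pattern of links among pages is analyzed.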

 

2. Mining the document structure: This is the analysis of the tree-like structure of a web page to describe its HTML or XML tag usage.

Several terms are associated with Web Structure Mining:

  • Web Graph: The directed graph that represents the Web.
  • Node: A web page in the graph.
  • Edge: A hyperlink from one web page to another in the Web graph.
  • In-degree: The number of hyperlinks pointing to a particular node in the graph.
  • Out-degree: The number of hyperlinks that originate from a particular node.
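The terms above can be made concrete with a minimal sketch of a tiny Web graph. The four page names and their links are invented for illustration, and `degree_counts` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Hypothetical four-page site: each pair is (source page, target page),
# i.e. one directed edge (hyperlink) in the Web graph.
edges = [
    ("A", "B"),
    ("A", "C"),
    ("B", "C"),
    ("C", "A"),
    ("D", "C"),
]


def degree_counts(edges):
    """Return {node: (in_degree, out_degree)} for a list of directed edges."""
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    nodes = set()
    for source, target in edges:
        nodes.update((source, target))
        out_degree[source] += 1  # a link leaving the source page
        in_degree[target] += 1   # a link arriving at the target page
    return {node: (in_degree[node], out_degree[node]) for node in sorted(nodes)}


degrees = degree_counts(edges)
for node, (n_in, n_out) in degrees.items():
    print(f"{node}: in-degree={n_in}, out-degree={n_out}")
```

In this toy graph, page C is linked from three other pages, while page D is linked from none.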

All of these terms become clearer in the following diagram of a Web graph:

Example of Web Structure Mining:

One such technique is the PageRank algorithm, which Google uses to rank web pages. The rank of a page depends on the number of pages linking to it and the quality of those links...
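The idea can be sketched with a simplified, textbook-style PageRank computed by repeated iteration. This is only an illustration and not Google's production ranking system: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outgoing links, and the four-page graph are all assumptions made for the example.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Simplified PageRank.

    graph maps each page to the list of pages it links to. Assumes every
    link target is also a key of graph and that a page links to any given
    target at most once.
    """
    pages = list(graph)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform rank
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly.
        dangling = sum(rank[p] for p in pages if not graph[p])
        new_rank = {}
        for page in pages:
            # Each linking page passes on an equal share of its own rank.
            incoming = sum(
                rank[src] / len(graph[src])
                for src in pages
                if page in graph[src]
            )
            new_rank[page] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


# Hypothetical Web graph: page -> pages it links to.
web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.4f}")
```

Page C, which receives links from three pages, ends up with the highest score, and page A ranks second because its only incoming link comes from the highly ranked C. Page D, which nothing links to, keeps only the baseline share.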
