Finding broken links on a website using Selenium WebDriver and HttpClient

Earlier we saw how to find broken images; here we will see how to find invalid URLs. For this check, a URL is treated as valid only if its request returns a status of 200. HTTP defines many status codes, each used for a different purpose. You can check the Wiki page for more information on HTTP status codes.

The 2xx class of status codes indicates that the action requested by the client was received and processed successfully.

The 4xx class of status codes is intended for cases in which the client seems to have erred.

The 5xx class of status codes is intended for cases in which the server seems to have erred.
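
As a small illustration of these classes, the sketch below maps a numeric status code to the name of its class by looking at the first digit. The class name `StatusCodeClasses` and the method `describe` are hypothetical helpers made up for this example; they are not part of Selenium or HttpClient. The 1xx and 3xx names come from the HTTP specification rather than from the text above.

```java
// Hypothetical helper for illustration only; not part of Selenium or HttpClient.
final class StatusCodeClasses {

    // Maps an HTTP status code to the name of its class, based on the hundreds digit.
    static String describe(int statusCode) {
        switch (statusCode / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // Print the class for a few common codes.
        for (int code : new int[] {200, 404, 500}) {
            System.out.println(code + " -> " + describe(code));
        }
    }
}
```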

The following is the list of different HTTP status codes.
[Image: HTTP status codes]

Just by looking at the links in the UI, we cannot confirm whether a link works until we click it and verify the result.

To automate this check, we can use the Apache HttpClient library to read the status codes of the URLs on a page. You need to download it and add it to the build path.

If the request was NOT processed correctly, the response carries one of the other status codes listed above rather than 200. The status code therefore tells us whether the link is broken.

Now let us jump into the example. First we find all anchor tags on the page using WebDriver, with the syntax below:

List<WebElement> anchorTagsList = driver.findElements(By.tagName("a"));

We then iterate through each link and verify the status code of its response. It should be 200; if it is not, we increment the invalid links count.
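
The counting rule can also be tried on its own, without a browser or a network connection. The sketch below is a minimal version under a few assumptions: the class `InvalidLinkCounter`, its methods, and the example.com URLs are hypothetical, and the status lookup is passed in as a function so that a fake one can stand in for the real HTTP call. It follows the same rule as the full example: a link with no `href`, or with a `javascript` `href`, is counted as invalid, and so is any link whose status is not 200.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper for illustration only.
final class InvalidLinkCounter {

    // True when the href is something an HTTP request can check.
    static boolean isCheckable(String href) {
        return href != null && !href.contains("javascript");
    }

    // Counts links that are either not checkable or do not return status 200.
    static int countInvalid(List<String> hrefs, ToIntFunction<String> statusLookup) {
        int invalid = 0;
        for (String href : hrefs) {
            if (!isCheckable(href) || statusLookup.applyAsInt(href) != 200) {
                invalid++;
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "http://example.com/ok",
                "http://example.com/missing",
                "javascript:void(0)",
                null);
        // Stand-in for a real HTTP call: only the "/ok" URL answers 200.
        ToIntFunction<String> fakeStatus = url -> url.endsWith("/ok") ? 200 : 404;
        System.out.println("Invalid links: " + countInvalid(hrefs, fakeStatus));
    }
}
```

Passing the lookup in as a parameter keeps the counting logic separate from the HTTP client, so the rule can be unit-tested without opening a page.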

Let us look at the complete example:

package com.linked;

import java.util.List;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class FindBrokenLinksExample {

    private WebDriver driver;
    private int invalidLinksCount;

    @BeforeClass
    public void setUp() {
        // Open the page whose links we want to check
        driver = new FirefoxDriver();
        driver.get("http://google.com");
    }

    @Test
    public void validateInvalidLinks() {
        try {
            invalidLinksCount = 0;
            // Collect every anchor tag on the page
            List<WebElement> anchorTagsList = driver.findElements(By.tagName("a"));
            System.out.println("Total no. of links are " + anchorTagsList.size());
            for (WebElement anchorTagElement : anchorTagsList) {
                if (anchorTagElement != null) {
                    String url = anchorTagElement.getAttribute("href");
                    if (url != null && !url.contains("javascript")) {
                        verifyURLStatus(url);
                    } else {
                        // Links with no href or a javascript href are counted as invalid
                        invalidLinksCount++;
                    }
                }
            }

            System.out.println("Total no. of invalid links are " + invalidLinksCount);

        } catch (Exception e) {
            e.printStackTrace();
            System.out.println(e.getMessage());
        }
    }

    @AfterClass
    public void tearDown() {
        if (driver != null) {
            driver.quit();
        }
    }

    public void verifyURLStatus(String url) {
        HttpGet request = new HttpGet(url);
        // try-with-resources closes the client and the response after each check
        try (CloseableHttpClient client = HttpClientBuilder.create().build();
                CloseableHttpResponse response = client.execute(request)) {
            // The status code should be 200; if it is not, increment the invalid link count.
            // We can also check for a specific code, for example
            // response.getStatusLine().getStatusCode() == 404
            if (response.getStatusLine().getStatusCode() != 200) {
                invalidLinksCount++;
            }
        } catch (Exception e) {
            // A request that fails outright is only logged here, not counted
            e.printStackTrace();
        }
    }
}
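
As a variation on `verifyURLStatus`, the sketch below reads the status code using only the JDK's `HttpURLConnection` instead of Apache HttpClient. This is a swapped-in technique, not the approach used in the example above. The class name `LinkStatusProbe`, the 5 second timeouts, and the convention of returning -1 when the request cannot be made at all are assumptions chosen for this illustration.

```java
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration only; uses the JDK rather than Apache HttpClient.
final class LinkStatusProbe {

    // Returns the HTTP status code for the URL, or -1 if the request could not be made
    // (malformed URL, unsupported scheme, unreachable host, timeout, and so on).
    static int fetchStatus(String url) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("GET");
            // Timeouts (in milliseconds) keep an unresponsive server from hanging the test.
            connection.setConnectTimeout(5000);
            connection.setReadTimeout(5000);
            return connection.getResponseCode();
        } catch (Exception e) {
            return -1;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // A javascript href is not an HTTP URL, so no request is made.
        System.out.println(fetchStatus("javascript:void(0)"));
    }
}
```

Returning -1 lets the caller decide whether a request that fails outright should count as a broken link, rather than only logging the exception.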
