The broken links bible - article 1 of 3

404 page hell - Why's that link broken?

A cool 404 page from Limpfish

Want to learn how to stop your visitors seeing a 404 page? Read on...

If there's an eternal truth online it's probably that not only will broken links happen, but they'll happen at the worst possible moment. Perhaps it's Murphy's law: you just need one - more - click to reach the information that'll change your life, then... bam. A 404 page hits you square between the eyes.

As a web surfer this is an annoying problem. As web developers and site owners this is a serious logistical issue. How on earth do we find and fix all of these errors to prevent visitors landing in 404 page hell?!

In this three-part series we're delving into the world of broken links. By the time you're finished you'll know what a broken link is, what causes them, how to find them and how to fix them.

We'll start with an explanation of the different types of broken links & how they happen. Yes - it's not just the dreaded 404 page you need to worry about.

Anatomy of links & urls

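A link in the html of your site is a pretty simple thing. Even if you're using a CMS like WordPress or Drupal, your links end up looking something like the one below if you click View > Source in your browser (www.example.com is just a placeholder address here).

<a href="http://www.example.com/folder/page.html">Click Me!</a>

There are two important parts - the text (Click Me! in the example above) and the url. For the purposes of this article we'll only be talking about the url. The url is pretty much divided into two:

1) The domain (www.example.com in the placeholder above)
2) The path (/folder/page.html in the placeholder above)

These two elements can cause broken links that reveal themselves in different ways.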
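If you'd like to see that split in code, here's a minimal sketch using Python's standard urllib.parse module. The function name split_link and the www.example.com address are placeholders made up for illustration, not anything from a real site.

```python
from urllib.parse import urlsplit


def split_link(url):
    """Return the (domain, path) pair for an absolute url."""
    parts = urlsplit(url)
    # netloc is the domain part; a url with no path is treated as "/"
    return parts.netloc, parts.path or "/"


domain, path = split_link("http://www.example.com/folder/page.html")
print("domain:", domain)
print("path:", path)
```

A problem with the first value tends to mean the whole site can't be reached; a problem with the second tends to mean a single page is missing.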


The domain

If there's a problem with the domain the browser may hang or say the site is unavailable when the link is clicked. Essentially, the whole website isn't responding to requests. This can happen for several reasons:

1) The domain in the link was typed wrongly.
2) The site is genuinely down.
3) You aren't currently online (having issues with your internet connection?).

A wrong domain or down site will never cause a 404 page error, but they're just as serious.


The path

We're in 404 page territory. If there's a problem with the path a page will usually load when the link is clicked but most likely it will be a 404 page.

404 (or occasionally 410) is the status code web servers use to say that the page you're trying to access doesn't exist. Maybe it did once and it's now gone, or maybe the path was mistyped when the link was created. The server returns a 404 error to the browser and all the visitor sees is a 404 page.
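To make the domain vs path distinction concrete, here's a rough sketch of how a script could tell the two kinds of failure apart using Python's standard urllib. The function names describe_failure and check_link, and the wording of the messages, are our own inventions for this example; a real link checker would need to handle redirects, retries and much more.

```python
import urllib.error
import urllib.request


def describe_failure(error):
    """Turn an exception raised by urlopen into a rough diagnosis."""
    # HTTPError means a server answered, so the domain itself is fine
    if isinstance(error, urllib.error.HTTPError):
        if error.code in (404, 410):
            return "path problem: the server answered but the page is missing"
        return "server answered with status %d" % error.code
    # Anything else (URLError, timeouts, refused connections) means
    # no usable answer came back from the site at all
    return "domain problem: could not reach the site"


def check_link(url, timeout=10):
    """Request url and return 'ok' or a rough diagnosis of the failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except OSError as error:  # URLError and HTTPError are both OSErrors
        return describe_failure(error)
```

Calling check_link with a url from your own site returns either "ok" or one of the diagnoses above. Bear in mind that being offline yourself looks exactly like a domain problem from the script's point of view.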

We'll go into more detail on how to fix broken links in part 3 - but stick with us, there's interesting stuff first.


Internal vs external broken links

This terminology isn't strictly necessary but broken links do tend to fall into these categories.


Internal

Internal links are links from your own pages to other pages in your site. Generally when an internal link breaks the issue will be with the path and you'll get a 404 page. These can have a couple of causes:

1) You typed the path wrong in the link
2) You deleted or renamed the page the link points to


External

External links are links from your site to someone else's site. External links can break for just about any reason. Perhaps the other site is down, perhaps they removed some content. Anyone who clicks a broken external link could end up on a 404 page, or their browser could just hang if the site is down.
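If you ever need to sort links into these two buckets automatically, one simple approach (an assumption on our part, not the only way to do it) is to compare the host of the link with the host of the page it appears on. The function name is_internal and the example addresses below are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def is_internal(page_url, href):
    """True if href, as written on page_url, points at the same host."""
    # urljoin resolves relative links such as "../about.html" or "/contact"
    target = urljoin(page_url, href)
    # hostname is lower-cased, so the comparison ignores case
    return urlsplit(target).hostname == urlsplit(page_url).hostname


page = "http://www.example.com/blog/post.html"
print(is_internal(page, "../about.html"))
print(is_internal(page, "http://other.example.org/page"))
```

Note that this sketch treats www.example.com and example.com as different sites, which may or may not be what you want.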


The esoteric stuff

There are a few more reasons links can break but they're rare. Just in case you run into one, here are some notes:

500

These are like 404 page errors in that the request made it to the server but their meaning is a lot more vague. Any error from 500 to 599 means something unexpected went wrong on the site. Think of this like a computer crashing. The site owner didn't plan on it happening, but it did anyway. When the link is clicked, the visitor will usually see something like a 404 page, just with different text.
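As a small illustration of that 500 to 599 range, here's a sketch that checks whether a status code falls inside it and looks up the standard name for the code in Python's http module. The function names are made up for this example.

```python
from http import HTTPStatus


def is_server_error(code):
    """True for any status code in the 500 to 599 range."""
    return 500 <= code <= 599


def status_name(code):
    """Return the standard name for a status code, if Python knows it."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # Not every number in the range has an official name
        return "Unknown"


for code in (404, 500, 503, 599):
    print(code, is_server_error(code), status_name(code))
```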

Missing bookmarks / anchors

Unless you're using software like our product DeepTrawl with anchor checking specifically turned on, you may not notice these errors even exist. If you make a link whose url has something like #myBookmark at the end, the browser will try to find an element on the linked page with the id (or, on older pages, the name) myBookmark and scroll the user to it.

If there's no element with that name the browser will simply default to scrolling the user to the top of the page as normal. So these can be a little disruptive, but depending on the link they may go unnoticed. They aren't in the same league as the 404 page or down site in terms of severity.
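For the curious, here's a minimal sketch of how a missing anchor could be detected with Python's standard html.parser. It is only an illustration of the idea, not how DeepTrawl works internally; the names AnchorCollector and has_anchor are hypothetical, and it ignores details like percent-encoded anchors.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collects every value a #bookmark could legitimately point at."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Any element's id counts, plus the older <a name="..."> style
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def has_anchor(html, url):
    """True if url has no #bookmark, or the bookmark exists in html."""
    fragment = urldefrag(url).fragment
    if not fragment:
        return True
    collector = AnchorCollector()
    collector.feed(html)
    return fragment in collector.anchors


page = '<h1>Top</h1><h2 id="myBookmark">Section</h2>'
print(has_anchor(page, "http://www.example.com/page.html#myBookmark"))
print(has_anchor(page, "http://www.example.com/page.html#missing"))
```

Here the html of the linked page is passed in as a string; fetching it is left out to keep the sketch short.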

Next page: How to find broken links in your site

Check out our software

We have two products which can help you with issues like users hitting a 404 page in your site.


DeepTrawl

DeepTrawl is a desktop app for Windows & Mac which checks every page of your site for issues like broken links, spelling mistakes, invalid html & css.

CloudTrawl

CloudTrawl is a hosted webapp (nothing to download) which regularly checks your entire site for broken links automatically. It also monitors your uptime and sends you text alerts when there's a problem. Because it's in the cloud, results are super-easy to share with others.




