H.1 Infostructure    Groups of resources.
H.2 Link             A connection between two resources.
H.3 Resource         The information on the World Wide Web.
H.4 URIs             A type of link including URLs.
H.5 URLs             The connections in the World Wide Web.
H.6 URNs             Names for resources without location.
An infostructure is a collection of related resources. The concept was introduced for link checking in the MOMspider package. For us it is mostly just a way of saying `web pages', but it also includes things like databases, which may not have any identifiable `pages' that we can read through directly.
The term link is used in LinkController for a connection between two resources. Its existence really comes from the `class', or type of computer data, which is used to store information about `links'. Properties of a link include:
A resource is almost anything. `It' can range from a person to an HTML file, a computer or a database, and presumably eventually to phone numbers and possibly physical hardware. This generality is a very important concept for the World Wide Web. The key thing about a resource is that it can be `identified'. See section H.5 URLs, for more details.
A URI, or `Uniform Resource Identifier', is a more generic form of the URL (see section H.5) which also includes URNs (see section H.6). It also allows links to abstract objects which can't be reached through a network server. Since all URLs are URIs, we talk about URIs whenever we can, because that term covers both. Often people say URL when they mean URI. We always try to use the terms correctly so that in future we can support all forms of URI without confusing existing users. URIs and URLs are defined in RFC 2396.
URLs, or `Uniform Resource Locators', are the essence of the World Wide Web. Approximately, they are addresses through which `resources' can be located. The idea is that almost anything can be given some kind of address in a form that a machine can work with. By defining a set of rules, this address can then be converted into a URL. A URL has two parts: the first tells us what rules to use and the second tells us what the address is. URLs and URIs are defined in RFC 2396. URLs are not the only kind of link, but they are the most common and currently the only ones LinkController really handles well.
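The two-part structure described above can be seen by splitting a URL at its first colon. The following is a minimal Python sketch, not LinkController code; the function name `split_url` and the example address are assumptions made purely for illustration.

```python
def split_url(url):
    """Split a URL into (scheme, address) at the first colon.

    Hypothetical helper for illustration only.
    """
    scheme, colon, address = url.partition(":")
    if not colon or not scheme:
        raise ValueError("no scheme found in %r" % url)
    # Scheme names are case-insensitive, so normalise to lower case.
    return scheme.lower(), address


scheme, address = split_url("HTTP://www.example.com/index.html")
print(scheme)   # the part that says which rules to use
print(address)  # the address, interpreted under those rules
```

The sketch deliberately does no validation beyond finding the colon; a real checker would also need to confirm that the first part is a scheme it knows how to handle.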
A URN, or `Uniform Resource Name', is a URI (see section H.4) which is not a URL (see section H.5). It is a way of specifying a resource without saying how to get it. For example, a scheme which has been considered is one for ISBNs (International Standard Book Numbers). This would allow us to specify a book as a resource but wouldn't tell us how to get it.
It's not totally clear where URNs will be useful in link checking (they are used internally in several computer systems), but LinkController intends to support them whenever needed, wherever possible.
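The distinction between URNs and URLs can be illustrated by looking at the first part of a URI. The sketch below is in Python and is not part of LinkController. It assumes that URNs are written with the `urn:` prefix and, as a simplification, treats every other URI that has a scheme as a URL. The function name `classify_uri` and the example identifiers are hypothetical.

```python
from urllib.parse import urlsplit


def classify_uri(uri):
    """Return "URN", "URL" or "no scheme" for the given string.

    Simplification: anything with a scheme other than "urn" is
    reported as a URL.
    """
    scheme = urlsplit(uri).scheme  # urlsplit lower-cases the scheme
    if not scheme:
        return "no scheme"
    return "URN" if scheme == "urn" else "URL"


for example in ("urn:isbn:0451450523",
                "http://www.example.com/",
                "index.html"):
    print(example, "->", classify_uri(example))
```

A URN such as the ISBN example names a book but gives a checker nothing to connect to, which is why it needs different handling from a URL.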
Within the programs, a link differs from a URL in that it is specifically aimed at checking connections, whereas a URL just specifies what the connection should be if it is working.