Resource (Web)

From Wikipedia, the free encyclopedia

The concept of resource is primitive in the Web architecture and is used in the definition of its fundamental elements. The term was first introduced to refer to targets of Uniform Resource Locators (URLs), but its definition has since been extended to include the referent of any Uniform Resource Identifier (RFC 3986) or Internationalized Resource Identifier (RFC 3987). In the Semantic Web, abstract resources and their semantic properties are described using the family of languages based on the Resource Description Framework (RDF).

1 History
- 1.1 From documents and files to Web resources
- 1.2 From Web resources to abstract resources
2 Resources in RDF and the Semantic Web
- 2.1 Using HTTP URIs to identify abstract resources
- 2.2 Resource ownership, intellectual property and trust
3 References

History

The concept of resource has evolved over the history of the Web, from the early notion of a static addressable document or file to a more generic and abstract definition that now encompasses every thing or entity that can be identified, named, addressed or handled in any way, on the Web at large or in any networked information system. The declarative aspects of a resource (identification and naming) and its functional aspects (addressing and technical handling) were not clearly distinguished in the early specifications of the Web, and the very definition of the concept has been the subject of a long and still open debate involving difficult, and often arcane, technical, social, linguistic and philosophical issues.

From documents and files to Web resources

In the early specifications of the Web (1990-1994), the term "resource" is barely used at all. The Web is designed as a network of more or less static addressable objects, basically files and documents, linked together in the form of a hypertext. These objects are considered only insofar as they can be addressed and handled through a specific protocol: Web documents are accessed and browsed using HTTP, files are exchanged using the File Transfer Protocol (FTP), and so on.

The first systematic use of the term "resource" came in June 1994 with RFC 1630. This document defines the generic notion of Universal Resource Identifier (URI), with its two variants, Universal Resource Locator (URL) and Universal Resource Name (URN). A resource is implicitly defined as something that can be identified, with identification serving two distinct purposes, naming and addressing, of which only the latter depends on a protocol. Notably, RFC 1630 makes no attempt to define the notion of resource. It barely uses the term beyond its occurrence in URI, URL and URN, and still speaks of "Objects of the Network".

RFC 1738 (December 1994) further specifies URLs, changing the term 'Universal' to 'Uniform'. The document makes more systematic use of 'resource' to refer to objects that are 'available', or 'can be located and accessed', through the Internet. Here again, the term 'resource' itself is not explicitly defined.

From Web resources to abstract resources

The first explicit definition of resource is found in RFC 2396, in August 1998:

A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.

Although the examples in this document are still limited to physical entities, the definition opens the door to more abstract resources. Provided that a concept is given an identity, and that this identity is expressed by a well-formed URI, a concept can be a resource as well. In January 2005, RFC 3986 makes this extension of the definition completely explicit:

... abstract concepts can be resources, such as the operators and operands of a mathematical equation, the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).

Resources in RDF and the Semantic Web

First released in 1999, RDF was originally intended to describe resources, in other words to declare metadata about resources in a standard way. An RDF description of a resource is a set of triples (subject, predicate, object), where the subject represents the resource to be described, the predicate a type of property relevant to this resource, and the object either data or another resource. The predicate itself is considered a resource and is identified by a URI. Hence, properties like "title" and "author" are represented in RDF as resources, which can in turn be used, recursively, as the subject of other triples. Building on this recursive principle, RDF vocabularies such as RDFS, OWL and SKOS build up definitions of abstract resources such as classes, properties and concepts, all identified by URIs.
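The triple model can be sketched in a few lines of Python. This is a minimal illustration and not an RDF library: the `Triple` type, the `describe` helper, the `http://example.org/books/1` URI and the book title are hypothetical names chosen for the example, and the two statements made about the Dublin Core "title" property are illustrative. The sketch shows the same property URI acting as a predicate in one triple and as a subject in others.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property (itself a resource)
    obj: str        # literal data, or the URI of another resource


# Namespace URIs from the Dublin Core, RDF and RDFS vocabularies.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical resource, used only for illustration.
BOOK = "http://example.org/books/1"

graph = [
    # The property is used as a predicate to describe a book...
    Triple(BOOK, DC_TITLE, "A Hypothetical Book"),
    # ...and is itself the subject of further descriptions.
    Triple(DC_TITLE, RDF_TYPE, RDF_PROPERTY),
    Triple(DC_TITLE, RDFS_LABEL, "Title"),
]


def describe(resource: str, triples: List[Triple]) -> List[Triple]:
    """Return every triple whose subject is the given resource."""
    return [t for t in triples if t.subject == resource]


# URIs that appear both as a predicate and as a subject in the graph.
used_recursively = {t.predicate for t in graph} & {t.subject for t in graph}
```

In this sketch, `describe(DC_TITLE, graph)` returns the two statements made about the property itself, which is the recursion the paragraph describes.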

RDF also defines anonymous resources, or blank nodes, which are not given a URI of their own.

Using HTTP URIs to identify abstract resources

Using URLs, and in particular HTTP URIs, to identify abstract resources such as classes, properties or other kinds of concepts is a frequent practice, for example in RDFS or OWL ontologies. Since such URIs are associated with the HTTP protocol, the question arose of which kind of representation, if any, should be returned for such resources through this protocol, typically in a Web browser, and whether the syntax of the URI itself could help to differentiate "abstract" resources from "information" resources. The URI specifications, such as RFC 3986, leave to the protocol specification the task of defining the actions performed on resources, and they do not answer this question.

It was suggested that HTTP URIs identifying a resource in the original sense (a file, a document or any kind of so-called information resource) should be "slash" URIs, in other words should not contain a fragment identifier, whereas URIs used to identify concepts or abstract resources should be "hash" URIs using fragment identifiers. For example, http://www.sillywidgets.org/catalogue/widgets.html would both identify and locate a web page (perhaps providing some human-readable description of the widgets sold by Silly Widgets, Inc.), whereas http://www.widgets.org/ontology#Widget would identify the abstract concept or class "Widget" in this company's ontology and would not necessarily retrieve any physical resource through HTTP.

The objection to this proposal is that such a distinction is impossible to enforce in practice, and well-known standard vocabularies provide widely used counter-examples. For example, Dublin Core concepts such as "title", "publisher" and "creator" are identified by "slash" URIs like http://purl.org/dc/elements/1.1/title.
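The syntactic "hash" versus "slash" test is easy to state in code, which also shows why it cannot settle the question. The following is a minimal sketch using Python's standard `urllib.parse.urldefrag`; the helper names `uri_style` and `document_part` are assumptions made for this example and come from no specification.

```python
from urllib.parse import urldefrag


def uri_style(uri: str) -> str:
    """Return 'hash' if the URI carries a fragment identifier, else 'slash'."""
    _, fragment = urldefrag(uri)
    return "hash" if fragment else "slash"


def document_part(uri: str) -> str:
    """Return the URI without its fragment, i.e. the part sent in an HTTP request."""
    return urldefrag(uri).url


# The three example URIs quoted in the text.
web_page = "http://www.sillywidgets.org/catalogue/widgets.html"
widget_class = "http://www.widgets.org/ontology#Widget"
dc_title = "http://purl.org/dc/elements/1.1/title"

styles = {uri: uri_style(uri) for uri in (web_page, widget_class, dc_title)}
```

Under this test the Dublin Core "title" URI is classified as a "slash" URI, exactly like the web page, even though it identifies an abstract property. Note also that an HTTP client removes the fragment before sending a request, so dereferencing a hash URI can retrieve at most the document named by `document_part(uri)`.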

The general question of which kinds of resources HTTP URIs should or should not identify was known in the W3C as the httpRange-14 issue, after its name on the list defined by the Technical Architecture Group (TAG). In 2005 the TAG delivered a final answer to this issue, making the distinction between an "information resource" and a "non-information" resource depend on the type of response given by the server to a "GET" request. This solution puts an end to the "hash" vs "slash" debate and seems to have met with consensus in the Semantic Web community, although some of its prominent members, such as Pat Hayes, have expressed concerns about both its technical feasibility and its conceptual foundation. In Hayes' view, the very distinction between "information resource" and "other resource" is impossible to ground and would better not be specified at all, and ambiguity of the referent is inherent to URIs as to any naming mechanism.
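The TAG's answer can be read as a small decision rule on the HTTP status code returned for a GET request. The sketch below follows the resolution as it is commonly summarized (a 2xx response indicates an information resource, a 303 "See Other" response leaves the nature of the resource open, and a 4xx response says nothing about it). The function name and the returned labels are assumptions made for this example, and the handling of other status codes is not part of the resolution.

```python
def classify_by_get_status(status: int) -> str:
    """Classify the resource named by an HTTP URI from the status of a GET on it.

    Sketch of the httpRange-14 resolution as commonly summarized;
    the returned labels are illustrative, not normative terms.
    """
    if 200 <= status < 300:
        # The server returned a representation: an information resource.
        return "information resource"
    if status == 303:
        # "See Other": the URI may identify any kind of resource,
        # including an abstract one described at the redirect target.
        return "any resource"
    if 400 <= status < 500:
        # Client error: the nature of the resource is unknown.
        return "unknown"
    # Other status codes are outside the scope of this sketch.
    return "not covered"
```

A server publishing an ontology with "slash" URIs would, under this reading, answer a GET on a class URI with 303 and point the client to a document describing the class, rather than returning 200 directly.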

Resource ownership, intellectual property and trust

In RDF, "anybody can declare anything about anything". Resources are "defined" by formal descriptions that anyone can publish, copy, modify and republish over the Web. Whereas the content of a Web resource in the classical sense (a Web page or online file) is clearly owned by its publisher, who can claim intellectual property on it, an abstract resource can be defined by an accumulation of RDF descriptions that are not necessarily controlled by a single publisher and not necessarily consistent with each other. Whether a resource should have an authoritative definition with clear and trustworthy ownership, and if so how to make this description technically distinct from other descriptions, remains an open issue. A parallel issue is how intellectual property applies to such descriptions.