| Author: | Tres Seaver |
|---|---|
| Version: | 0.1 |
Overview
repoze.urispace implements the URISpace [1] 1.0 spec, as proposed to the W3C by Akamai. Its aim is to provide an implementation of that language as a vehicle for asserting declarative metadata about a resource based on pattern matching against its URI.
Once asserted, such metadata can be used to guide the application in serving the resource, with possible applications including:
The URISpace [1] specification provides for matching on the following portions of a URI:
scheme
host, including wildcarding (leading only) and port
user (if specified in the URI)
path elements, including nesting and wildcarding, as well as parameters, where used.
query elements, including test for presence or for specific value
fragments (likely irrelevant for server-side applications)
Note
repoze.urispace does not yet provide support for fragment matching.
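As a rough illustration of the URI portions listed above, the sketch below decomposes a URI with Python's standard urllib.parse module. The portions helper and the sample URI are made up for this example; they are not part of repoze.urispace.

```python
from urllib.parse import urlsplit, parse_qs


def portions(uri):
    """Split a URI into the pieces that URISpace selectors can match on."""
    parts = urlsplit(uri)
    return {
        'scheme': parts.scheme,
        'host': parts.hostname,
        'port': parts.port,
        'user': parts.username,
        # Strip the leading slash, then split into path segments.
        'path': parts.path.lstrip('/').split('/'),
        # keep_blank_values lets us test for presence of a bare key.
        'query': parse_qs(parts.query, keep_blank_values=True),
        'fragment': parts.fragment,
    }


info = portions(
    'http://bob@www.example.com:8080/news/world/index.html?page=2&print#top')
for key, value in info.items():
    print(key, '=', value)
```

A query selector that tests for presence would only check that a key such as 'print' exists in info['query'], while a value test would compare the list stored under 'page'.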
The asserted metadata can be scalar, or can use RDF Bags and Sequences to indicate sets or ordered collections.
Note
repoze.urispace does not yet provide support for parsing multi-valued assertions using RDF.
Operators are provided to allow for incrementally updating or clearing the value for a given metadata element. Specified operators include:
Suppose we want to select different Deliverance themes and/or rulesets based on the URI of the resource being themed. In particular:
A URISpace file specifying these policies would look like:
<?xml version="1.0" ?>
<themeselect xmlns='http://pypi.python.org/pypi/Deliverance/'
             xmlns:uri='http://www.w3.org/2000/urispace'
             xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
             >

 <!-- default theme and rules -->
 <theme>http://themes.example.com/default.html</theme>
 <rules>http://static.example.com/rules/default.xml</rules>

 <uri:path uri:match="news">
  <theme>http://themes.example.com/news.html</theme>
  <uri:path uri:match="world">
   <theme>http://themes.example.com/news.html?style=world</theme>
  </uri:path>
  <uri:path uri:match="national">
   <theme>http://themes.example.com/news.html?style=national</theme>
  </uri:path>
  <uri:path uri:match="local">
   <theme>http://themes.example.com/news.html?style=local</theme>
  </uri:path>
 </uri:path>

 <uri:path uri:match="lifestyle">
  <theme>http://themes.example.com/lifestyle.html</theme>
 </uri:path>

 <uri:path uri:match="sports">
  <theme>http://themes.example.com/sports.html</theme>
 </uri:path>

 <!-- Note that the following rules match "across" sections -->
 <uri:path uri:match="index.html">
  <rules>http://static.example.com/rules/index.xml</rules>
 </uri:path>

 <uri:path uri:match="*.html">
  <rules>http://static.example.com/rules/story.xml</rules>
 </uri:path>

</themeselect>
Given that URISpace file, one can test how given URIs match using the uri_test script:
$ /path/to/bin/uri_test examples/dv_news.xml \
http://example.com/foo \
http://example.com/news/ \
http://example.com/news/index.html \
http://example.com/news/world/index.html \
http://example.com/sports/ \
http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
URI: http://example.com/
------------------------------------------------------------------------------
{http://pypi.python.org/pypi/Deliverance/}rules = http://static.example.com/rules/default.xml
{http://pypi.python.org/pypi/Deliverance/}theme = http://themes.example.com/default.html
------------------------------------------------------------------------------
URI: http://example.com/news/
------------------------------------------------------------------------------
{http://pypi.python.org/pypi/Deliverance/}rules = http://static.example.com/rules/default.xml
{http://pypi.python.org/pypi/Deliverance/}theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/index.html
------------------------------------------------------------------------------
{http://pypi.python.org/pypi/Deliverance/}rules = http://static.example.com/rules/default.xml
{http://pypi.python.org/pypi/Deliverance/}theme = http://themes.example.com/news.html
------------------------------------------------------------------------------
URI: http://example.com/news/world/index.html
------------------------------------------------------------------------------
{http://pypi.python.org/pypi/Deliverance/}rules = http://static.example.com/rules/default.xml
{http://pypi.python.org/pypi/Deliverance/}theme = http://themes.example.com/news.html?style=world
------------------------------------------------------------------------------
URI: http://example.com/sports/
------------------------------------------------------------------------------
{http://pypi.python.org/pypi/Deliverance/}rules = http://static.example.com/rules/default.xml
{http://pypi.python.org/pypi/Deliverance/}theme = http://themes.example.com/sports.html
------------------------------------------------------------------------------
URI: http://example.com/sports/world_series_2008.html
------------------------------------------------------------------------------
{http://pypi.python.org/pypi/Deliverance/}rules = http://static.example.com/rules/default.xml
{http://pypi.python.org/pypi/Deliverance/}theme = http://themes.example.com/sports.html
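Once parsing is complete, the URISpace is available as a tree-like object. The canonical steps to extract metadata for a given URI are:
from urllib.parse import urlsplit, parse_qs  # Python 2.6+: from urlparse import ...

# 'urispace' is the tree-like object produced by parsing the URISpace file;
# 'uri' is the URI being looked up.
scheme, nethost, path, query, fragment = urlsplit(uri)
path = path.split('/')
if len(path) > 1 and path[0] == '':
    path = path[1:]  # drop the empty segment before the leading slash
info = {'scheme': scheme,
        'nethost': nethost,
        'path': path,
        'query': parse_qs(query, keep_blank_values=1),
        'fragment': fragment,
       }
# Gather every operator whose selectors match, then apply them in order.
operators = urispace.collect(info)
assertions = {}
for operator in operators:
    operator.apply(assertions)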
At this point, assertions will contain keys and values for all operators found while matching against the URI.
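To make the collect-then-apply pattern concrete, here is a minimal sketch with two stand-in operator classes. ReplaceOperator and ClearOperator are hypothetical names invented for this illustration: they only mimic the apply(assertions) protocol shown above and are not the classes shipped with repoze.urispace.

```python
class ReplaceOperator:
    """Hypothetical operator: set (or overwrite) one metadata value."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def apply(self, assertions):
        assertions[self.key] = self.value


class ClearOperator:
    """Hypothetical operator: remove a metadata value if it is present."""

    def __init__(self, key):
        self.key = key

    def apply(self, assertions):
        assertions.pop(self.key, None)


def apply_all(operators):
    """Apply operators in order; later ones see earlier results."""
    assertions = {}
    for operator in operators:
        operator.apply(assertions)
    return assertions


# Outer (more general) matches come first, inner (more specific) ones last.
operators = [
    ReplaceOperator('theme', 'default.html'),
    ReplaceOperator('rules', 'default.xml'),
    ReplaceOperator('theme', 'news.html'),
    ClearOperator('rules'),
]
result = apply_all(operators)
print(result)
```

Because the operators are applied in the order they were collected, a more specific match can override or clear a value asserted by a more general one.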
repoze.urispace implements “Scheme Selectors” (section 3.1) by combining a selector and a predicate:
Of the “Authority Selectors” (section 3.2), repoze.urispace implements the “Host” variant (section 3.2.2) by combining a selector and a predicate:
repoze.urispace does not implement selectors for “Authority Name” (section 3.2.1) or “User” (section 3.2.3) at this time.
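The overview describes host matching as allowing a leading wildcard and an optional port. A minimal sketch of that idea follows. The host_matches helper and its "*.example.com" and "host:port" pattern syntax are assumptions made for illustration, not the predicate that repoze.urispace actually uses.

```python
from urllib.parse import urlsplit


def host_matches(pattern, uri):
    """Return True if the URI's host (and port, if given) fits the pattern.

    Assumed pattern syntax: 'host', '*.domain', or either followed by ':port'.
    Only a leading '*.' wildcard is recognised.
    """
    parts = urlsplit(uri)
    host = (parts.hostname or '').lower()
    want_host, sep, want_port = pattern.lower().partition(':')
    # If the pattern names a port, the URI must carry the same explicit port.
    if sep and str(parts.port) != want_port:
        return False
    if want_host.startswith('*.'):
        # '*.example.com' matches any host ending in '.example.com'.
        return host.endswith(want_host[1:])
    return host == want_host


print(host_matches('*.example.com', 'http://www.example.com/news/'))
print(host_matches('www.example.com:8080', 'http://www.example.com/'))
```

In this sketch the bare domain example.com does not match "*.example.com", since the wildcard must stand for at least one leading label.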
repoze.urispace implements “Path Segment Selectors” (section 3.3) by combining a selector and a predicate:
Note
The semantics of the path segment selector in the spec require matching only on the first element of the current path. repoze.urispace provides extensions which allow for matches on the last element of the current path, and for matches on any element of the current path. See Extending the Spec.
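The first-element rule can be sketched with the standard library alone. The code below is a simplified stand-in, not the repoze.urispace parser: the collect and lookup helpers, the shortened sample document, and the use of fnmatchcase for "*" wildcards are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

URI_NS = 'http://www.w3.org/2000/urispace'
PATH_TAG = '{%s}path' % URI_NS
MATCH_ATTR = '{%s}match' % URI_NS

# A shortened version of the theme-selection example (relative URLs only).
SPACE = """\
<themeselect xmlns='http://pypi.python.org/pypi/Deliverance/'
             xmlns:uri='http://www.w3.org/2000/urispace'>
 <theme>default.html</theme>
 <rules>default.xml</rules>
 <uri:path uri:match="news">
  <theme>news.html</theme>
  <uri:path uri:match="world">
   <theme>news.html?style=world</theme>
  </uri:path>
 </uri:path>
 <uri:path uri:match="*.html">
  <rules>story.xml</rules>
 </uri:path>
</themeselect>
"""


def collect(node, path, assertions):
    """Walk the tree, descending only where path[0] matches a selector."""
    for child in node:
        if child.tag == PATH_TAG:
            # Only the FIRST remaining segment is compared to the pattern.
            if path and fnmatchcase(path[0], child.get(MATCH_ATTR)):
                collect(child, path[1:], assertions)
        else:
            # Any other element is treated as a plain "replace" assertion.
            name = child.tag.split('}')[-1]
            assertions[name] = (child.text or '').strip()
    return assertions


def lookup(path):
    """Return the metadata asserted for a URI path such as '/news/world/'."""
    segments = path.strip('/').split('/')
    return collect(ET.fromstring(SPACE), segments, {})


print(lookup('/news/world/index.html'))
print(lookup('/about.html'))
```

Under this rule the top-level "*.html" selector applies to /about.html but not to /news/world/index.html, because at the top level only the segment "news" is tested. This is consistent with the uri_test output shown earlier, where nested pages keep the default rules.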
The URISpace [1] specification contemplates extension via what it calls “External Selectors” (see chapter 4). repoze.urispace in fact uses this facility to provide additional selectors:
| [1] | http://www.w3.org/TR/urispace.html |
| [2] | http://www.ietf.org/rfc/rfc2396.txt |