2009-07-13

ArticleRead (19): GEON: Making sense of the myriad resources, researchers and concepts that comprise a geoscience cyberinfrastructure

GEON: Making sense of the myriad resources, researchers and concepts that comprise a geoscience cyberinfrastructure, By Mark Gahegan, Junyan Luo, Stephen D. Weaver, William Pike, Tawan Banchuen, in Computers & Geosciences, Vol. 35, Issue 4, 836-854, (Apr. 2009)

There are two main considerations for exploring the geoscientific meaning of e-resources: the top-down meaning defined by domain ontologies and conventional metadata, and the bottom-up, emergent meaning carried in how the resources are used (epistemology, which can be captured in workflows, in provenance metadata and even in the interactions between people).

The primary argument of this article is that, whether ontological or epistemological, no single one of these web threads is sufficient to carry the essence of meaning. A review of ontologies based on this article is given as a SWOT analysis in the figure below, where current engineering methods are considered along with those of data integration (the Strength table).

Using the GEON knowledge portal as a case study to illustrate their main arguments, the authors identify the following problems in accessing e-resources:

(1) the large number of resources
(2) the dynamic nature of catalogs
(3) the variety of search strategies that users need
(4) the meanings of resources

Suggested solutions to these difficulties include:

(1) adopting a visualize-on-demand strategy, and visualizing multiple perspectives which highlight connections or overlaps among the e-resources,
(2) classifying resources into 4 categories and 18 subcategories, and translating resource descriptions into RDF triples.

Problems (3) and (4) are tackled together by augmenting ontologies. The authors augment ontologies by adding situational knowledge, drawing on two ideas:

(i) the idea that meaning resides in a nexus of interactions (Whitehead, 1929-1997), so that a knowledge nexus can support multiple strands of meaning.

(ii) semiotics: depending on the subject of interest, nodes can change their semiotic role in the nexus (Sowa, 1999-2002).

In short, the authors add use cases, provenance data, social networks and workflows to the ontology through the use of Perspectives (global and local), as a pragmatic means of understanding the meaning and definition of an e-resource. In particular, perspective filters, defined against an OWL model, facilitate examination of a subset of connections within a complex concept space in a manner that suits thematic exploration.
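
As a rough illustration of the perspective-filter idea, the sketch below treats the concept space as a list of triples and a perspective as the set of link types it lets through. The resource names, the predicates and the `apply_perspective` helper are all hypothetical; the paper defines its filters against an OWL model, which this sketch does not attempt to reproduce.

```python
# Hypothetical concept space: (subject, predicate, object) triples.
NEXUS = [
    ("dataset:gravity_map", "describedBy", "concept:GravityAnomaly"),
    ("dataset:gravity_map", "usedIn", "workflow:basin_model"),
    ("person:alice", "created", "dataset:gravity_map"),
    ("person:alice", "collaboratesWith", "person:bob"),
    ("workflow:basin_model", "derivedFrom", "dataset:seismic_lines"),
]

# Each perspective names the link types it keeps (assumed names).
PERSPECTIVES = {
    "ontology": {"describedBy"},
    "provenance": {"usedIn", "derivedFrom"},
    "social": {"created", "collaboratesWith"},
}


def apply_perspective(triples, perspective):
    """Return only the triples whose predicate belongs to the perspective."""
    allowed = PERSPECTIVES[perspective]
    return [t for t in triples if t[1] in allowed]


provenance_view = apply_perspective(NEXUS, "provenance")
social_view = apply_perspective(NEXUS, "social")
```

Each view is a subset of the same nexus, so the same node (here the gravity map) can appear in several perspectives with a different role in each.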

P.S. GEON is an open collaborative project, started in 2002 and funded under the NSF Information Technology Research (ITR) program. Its aim is to develop cyberinfrastructure (the term used in the U.S., where Europe speaks of e-Science): a comprehensive infrastructure that capitalizes on dramatic advances in information technology in support of data sharing and integration.

ArticleRead (18): The Dark Side of the Semantic Web

The Dark Side of the Semantic Web, By James Hendler, in IEEE Intelligent Systems, 2007, Vol 22, No 1, 2-3; and his presentation.

This is a short analysis of the relationship between the Semantic Web and AI research fields in light of Web 2.0. The "dark side" of the Semantic Web is seen from the standpoint of the AI community, which lacks experience with the social aspects of the Web (e.g. the use of social tagging).

Hendler emphasizes a recurring theme of AI: “that which looks easy in the small is often much harder in the large”. On the other hand, the Semantic Web catch phrase “a little semantics goes a long way” points to the main trend: RDF/OWL in practice.

Two benefits of building RDF triple stores are:

  1. RDF enables you to store data in a flexible schema, so you can store additional types of information that you might have been unaware of when you originally designed the schema.
  2. RDF helps you to create Web-like relationships between data, which is not easily done in a typical relational database.
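
Both benefits can be seen in a toy in-memory triple store. This is a minimal sketch only: the `TripleStore` class and the `ex:` identifiers are invented for illustration and do not correspond to any real RDF library.

```python
from collections import defaultdict


class TripleStore:
    """A toy in-memory triple store (illustrative only)."""

    def __init__(self):
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Return all objects linked from subject via predicate, sorted."""
        return sorted(o for p, o in self._by_subject[subject] if p == predicate)


store = TripleStore()
store.add("ex:paper1", "ex:title", "The Dark Side of the Semantic Web")
store.add("ex:paper1", "ex:author", "ex:hendler")

# Benefit 1: a property nobody planned for needs no schema change,
# it is just one more triple.
store.add("ex:paper1", "ex:taggedAs", "web2.0")

# Benefit 2: the object of one triple can be the subject of another,
# which gives Web-like links between records.
store.add("ex:hendler", "ex:name", "James Hendler")

author_names = [
    name
    for author in store.objects("ex:paper1", "ex:author")
    for name in store.objects(author, "ex:name")
]
```

In a relational design, the `ex:taggedAs` property would typically require a new column or table, whereas here it is only additional data.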

In particular, Hendler notes that the main design principle of RDF, having unique names for different terms, has a great impact on the Web. The challenge, however, is to decide whether two URIs denote the “same” thing, “different” things, “different parts of the same thing”, or something else.

ArticleRead (17): Geographical Linked Data: The Administrative Geography of Great Britain on the Semantic Web

Geographical Linked Data: The Administrative Geography of Great Britain on the Semantic Web, By John Goodwin, Catherine Dolbear and Glen Hart, in Transactions in GIS, Vol. 12, Issue suppl.1, 2008. Pages: 19–30

The article reviewed here is drawn from the UK Ordnance Survey (OS) paper and its presentation, given at the Terra Cognita Workshop held in connection with the 7th International Semantic Web Conference (ISWC2008). Inevitably, some of the “future work” mentioned in this article, such as using OWL for the domain ontology and ontology modules, can now be found on their outlets.

Nonetheless, what I found of greatest interest is that they began their participation in the Semantic Web by working on place names first, encoding spatial information in RDF and using RDF Schema to create the ontology. In addition, the OS approach solves some important spatial data problems within the Semantic Web. For end users, it supports the query “find me things of type X in [or next to] area y”, and it does so without the need for geometric computation (Section 3.3). They also provide alternative strategies to tackle the confusion caused by the owl:sameAs construct, which the W3C has suggested for identity linking. Instead, they consider using rdfs:seeAlso or coref:duplicate, which bundle together URIs known to be related in some way, as alternatives for linking RDF nodes from different graphs (see p. 27). Their plan to build a “settlement gazetteer” in the “future” is also something to pay close attention to.
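
The “things of type X in [or next to] area y” query can be answered from stated topological relations alone. The sketch below assumes made-up region names and two hypothetical relations, `CONTAINS` and `BORDERS`; it is not the OS ontology, only an illustration of why no geometry is needed once such relations are stored explicitly.

```python
# Hypothetical regions: explicit topological relations, no coordinates.
TYPE_OF = {
    "CountyA": "County",
    "DistrictB": "District",
    "DistrictE": "District",
    "WardC": "Ward",
    "CountyD": "County",
}
CONTAINS = {
    "CountyA": {"DistrictB", "DistrictE"},
    "DistrictB": {"WardC"},
}
BORDERS = {("CountyA", "CountyD")}


def things_in(area, wanted_type):
    """Find regions of wanted_type inside area by following 'contains' links."""
    found, stack = [], list(CONTAINS.get(area, ()))
    while stack:
        region = stack.pop()
        if TYPE_OF.get(region) == wanted_type:
            found.append(region)
        stack.extend(CONTAINS.get(region, ()))
    return sorted(found)


def things_next_to(area, wanted_type):
    """Find regions of wanted_type that share a stated border with area."""
    neighbours = {b for a, b in BORDERS if a == area} | {
        a for a, b in BORDERS if b == area
    }
    return sorted(n for n in neighbours if TYPE_OF.get(n) == wanted_type)


wards = things_in("CountyA", "Ward")
counties = things_next_to("CountyA", "County")
```

The trade-off is that only relations someone has asserted can be queried; an arbitrary user-drawn area would still need geometric computation.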

In general, this article is divided into four parts:

(1) introduction and motivation,
(2) the confusion between the administrative geography of Great Britain and unofficial sources (i.e. GeoNames),
(3) the creation of RDF datasets,
(4) adding their geo data to the Web of Linked Data.

One of the reasons the OS developed this Place Name RDF prototype is to investigate the technical challenges and limitations of creating RDF-based geo-resources. The RDF approach may offer the potential to solve traditional problems in integrating different relational database schemas or the syntaxes of different file formats, and the chance to provide geospatial data to end users in a more flexible form over the Web.

On the other hand, RDF cannot support any form of spatial indexing, buffering or containment within a user-defined area. For a geo data provider, the question of how to modularise the RDF/XML file representation into manageable and coherent chunks remains unsolved.

A flagship suggestion for making published data findable on the Web is to use the Vocabulary of Interlinked Datasets (voiD). voiD is an RDF-based schema for describing linked datasets. With voiD, linked datasets can be discovered and used both effectively and efficiently. At the heart of voiD are two classes: a dataset (void:Dataset) is a collection of data, and interlinking is modelled by a linkset (void:Linkset), a subclass of dataset that stores the triples expressing the links between datasets. (see the png)
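
To make the two classes concrete, here is a minimal sketch of a voiD-style description written as plain Python triples. The `void:Dataset`, `void:Linkset`, `void:target` and `void:linkPredicate` terms come from the voiD vocabulary; the example.org dataset URIs and the `linked_datasets` helper are assumptions made for illustration.

```python
VOID = "http://rdfs.org/ns/void#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Hypothetical dataset URIs under example.org.
ADMIN_GEO = "http://example.org/dataset/admin-geography"
GAZETTEER = "http://example.org/dataset/gazetteer"
LINKS = "http://example.org/linkset/admin-to-gazetteer"

description = [
    (ADMIN_GEO, RDF_TYPE, VOID + "Dataset"),
    (GAZETTEER, RDF_TYPE, VOID + "Dataset"),
    (LINKS, RDF_TYPE, VOID + "Linkset"),
    (LINKS, VOID + "target", ADMIN_GEO),
    (LINKS, VOID + "target", GAZETTEER),
    (LINKS, VOID + "linkPredicate", RDFS_SEEALSO),
]


def linked_datasets(triples, dataset):
    """Return datasets that share a linkset with the given dataset."""
    targets_by_linkset = {}
    for s, p, o in triples:
        if p == VOID + "target":
            targets_by_linkset.setdefault(s, set()).add(o)
    partners = set()
    for targets in targets_by_linkset.values():
        if dataset in targets:
            partners |= targets - {dataset}
    return sorted(partners)


partners = linked_datasets(description, ADMIN_GEO)
```

A crawler reading such a description could learn which datasets are interlinked, and by which predicate, without downloading the data itself.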

The OS experience of linking their data to the Web, discussed in Section 4.2, is refreshing. They raise four issues: identity, modularisation, provenance and authorisation. On the issue of identity and the semantic accuracy of links, they argue that owl:sameAs, as defined in OWL-DL, should not be used because:

[there is no single common entity or “non information resource” that everyone is mapping back to; instead there are multiple different representations of what may be similar or overlapping concepts. For example, there are many different ways of describing London's spatial extent – official boundaries of Greater London, or a vaguer extent denoted by estate agents or local people. Although in specific contexts, it may be sufficient to state that these are the same thing, in the general case, it is not.] (p.27)

Taking a critical view of modularisation, the OS is aware of the practice of creating small RDF documents that describe individual resources and may include details of other, closely related resources, known as “slicing”. However, they also note that, for a geo data provider, the question of how to modularise the RDF/XML file representation into manageable and coherent chunks, to help users who need to manipulate large triple sets, remains unsolved. The existing solutions include:

(1) repeating the URI in both graphs, which is not recommended by the W3C; or
(2) assigning one graph as the primary dataset, then using rdfs:isDefinedBy to link to the newly minted URIs in the secondary dataset.

Unfortunately, neither solution is ideal.

Provenance and authorisation of data have always been a problem for data quality. There are several other points in this article that readers may find stimulating and novel, and a few that seem more akin to technical design (such as Named Graphs, the OAuth protocol rather than CC licensing for the RDF dataset, and the SPARQL endpoint of the OS service).

ArticleRead (16): Towards a semantics-based approach in the development of geographic portals



Towards a semantics-based approach in the development of geographic portals, By Nikolaos Athanasis, Kostas Kalabokidis, Michail Vaitis, Nikolaos Soulakellis, in Computers & Geosciences, 35 (2009) 301–308

Athanasis et al. (2009) propose and implement a geoportal that offers a methodology for geospatial information discovery, with a new approach that responds to significant Semantic Web issues concerning RDF and ontology-based metadata.

Taking advantage of a multiple-ontology design, they advocate organizing ontology-based metadata into three contexts (three schemas: content type, application domain, and the ISO 19115 metadata standard) to enhance users’ navigation in the geoportal interface, as a response to the keyword search problem.

In addition to the difficulty of choosing appropriate search criteria, they discuss problems such as semantic heterogeneity (cognitive and naming heterogeneity in particular), the limited expressiveness of queries, and the lack of, or discrepancies between, geospatial metadata standards at the implementation stage.

Their solutions to these difficulties, which focus on the relationships among geoportals, users, and information providers in a geoportal infrastructure, are as follows:


(1) propose an RDF-based metadata organisation method using multiple ontologies.

(2) design an interface for information providers to describe the characteristics of new resources, characterise their relationships to related resources, and classify them using the built-in RDF classes of the metadata schemas.

(3) automatically translate the submitted data into RDF metadata descriptions stored in PostgreSQL and maintained by the ICS-FORTH RDFSuite (Alexaki et al., 2001), which enables the validation, storage, and querying of RDF metadata (both schemas and data descriptions).

(4) use the RDF Query Language (RQL), which is based on a formal graph model and whose queries are executed in an SQL-like “select-from-where” pattern.
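
The “select-from-where” pattern in step (4) can be mimicked over a handful of metadata triples. This is only a Python sketch of the query shape, not RQL syntax; the resource names, the property prefixes and the `select_from_where` helper are hypothetical.

```python
# Hypothetical metadata descriptions as (resource, property, value) triples,
# loosely following the three schemas: content type, domain, ISO metadata.
METADATA = [
    ("res:fire_map_2007", "rdf:type", "content:Map"),
    ("res:fire_map_2007", "domain:theme", "Wildfire"),
    ("res:fire_map_2007", "iso:dateStamp", "2007-08-01"),
    ("res:soil_report", "rdf:type", "content:Document"),
    ("res:soil_report", "domain:theme", "Soil"),
    ("res:fuel_map_2008", "rdf:type", "content:Map"),
    ("res:fuel_map_2008", "domain:theme", "Wildfire"),
]


def select_from_where(triples, prop, where):
    """Select (resource, value) pairs from triples matching prop,
    keeping only those for which where(value) holds."""
    return [(s, o) for s, p, o in triples if p == prop and where(o)]


# "Select every resource whose theme is Wildfire."
wildfire = select_from_where(METADATA, "domain:theme", lambda v: v == "Wildfire")

# "Select every resource classified as a map."
maps = select_from_where(METADATA, "rdf:type", lambda v: v == "content:Map")
```

Because the query names a property from a schema rather than matching free text, it sidesteps the keyword search problem the authors describe.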

Final comments: the implementation is still a prototype supported by an EU RTD project, and the website is http://incendi.geo.aegean.gr/en/incendi_aggl.html . The team's future work is to develop OWL support.