Wednesday 9 April 2014

Creating HTML Web Pages from Linked Data Using PHP, SPARQL, and ARC2

Here you'll learn how to output the results of SPARQL queries directly to an HTML web page, instead of using a public endpoint client such as SNORQL.

I'll expand on the SPARQL query I described 3 years ago, which outputs all members of the Characidae family of freshwater fish, with the corresponding genus and binomial name (ordered by genus):


SELECT DISTINCT ?species ?binomial ?genus
WHERE { ?species dbpedia-owl:family :Characidae;
dbpedia-owl:genus ?genus;
dbpedia2:binomial ?binomial }
ORDER BY ?genus
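
The query above uses prefixed names without declaring them, so when it is sent from code rather than from a client that predefines them, the PREFIX lines have to be spelled out. Here is a minimal sketch of one way to do that. The helper name withPrefixes() is made up for illustration, and the IRIs are the ones declared in the full listing later in this post.

```php
<?php
// Hypothetical helper: prepends PREFIX declarations to a SPARQL query body.
function withPrefixes(array $prefixes, string $body): string
{
    $lines = [];
    foreach ($prefixes as $name => $iri) {
        // An empty name produces the default prefix, e.g. "PREFIX : <...>".
        $lines[] = "PREFIX {$name}: <{$iri}>";
    }
    return implode("\n", $lines) . "\n" . $body;
}

$prefixes = [
    'dbpedia-owl' => 'http://dbpedia.org/ontology/',
    'dbpedia2'    => 'http://dbpedia.org/property/',
    ''            => 'http://dbpedia.org/resource/',
];

$body = 'SELECT DISTINCT ?species ?binomial ?genus
WHERE { ?species dbpedia-owl:family :Characidae;
        dbpedia-owl:genus ?genus;
        dbpedia2:binomial ?binomial }
ORDER BY ?genus';

$query = withPrefixes($prefixes, $body);
echo $query, "\n";
```

Keeping the prefixes in one array makes it easier to reuse them across several queries on the same page.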


The first thing we'll need is a PHP library that queries SPARQL endpoints. We'll be using Semsol's ARC2 library. Huge thanks to Gilles Falquet (associate professor at the University of Geneva) and his tutorial, "Creating web pages from linked data with PHP and SPARQL", for getting us past this initial hump. His post is one of the few of its kind out there, and he was even generous enough to clarify some of the points I had difficulty with.

  1. Download Semsol's ARC2 library from their GitHub repository.
  2. Create a directory on your web server for your project. Let's name the directory proj.
  3. Decompress the downloaded archive, and place it into your proj directory. Rename it from semsol-arc2-bc67abe to something simpler. Let's make it semsol.
  4. Create a PHP file in your project directory, and add the following:
    <html>
      <body>
     
      <?php
      /* ARC2 static class inclusion */ 
      include_once('semsol/ARC2.php');  
     
      $dbpconfig = array(
      "remote_store_endpoint" => "http://dbpedia.org/sparql",
       );
     
      $store = ARC2::getRemoteStore($dbpconfig); 
     
      if ($errs = $store->getErrors()) {
         echo "<h1>getRemoteStore error</h1>";
      }
     
      $query = '
      PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
      PREFIX owl: <http://www.w3.org/2002/07/owl#>
      PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      PREFIX foaf: <http://xmlns.com/foaf/0.1/>
      PREFIX dc: <http://purl.org/dc/elements/1.1/>
      PREFIX : <http://dbpedia.org/resource/>
      PREFIX dbpedia2: <http://dbpedia.org/property/>
      PREFIX dbpedia: <http://dbpedia.org/>
      PREFIX dbpprop: <http://dbpedia.org/property/>
    
      SELECT DISTINCT ?species ?binomial ?genus ?label
      WHERE { ?species dbpedia-owl:family :Characidae;
            dbpprop:genus ?genus;
            rdfs:label ?label;
            dbpedia2:binomial ?binomial.
            filter ( langMatches(lang(?label), "en") ) }
      ORDER BY ?genus';
      
      /* execute the query */
      $rows = $store->query($query, 'rows'); 
     
        if ($errs = $store->getErrors()) {
           echo "Query errors" ;
           print_r($errs);
        }
     
        /* display the results in an HTML table */
        echo "<table border='1'>
        <thead>
          <tr>
            <th>#</th>
            <th>Species (Label)</th>
            <th>Binomial</th>
            <th>Genus</th>
          </tr>
        </thead>";
    
        /* row counter for the first column */
        $id = 0;
    
        /* loop for each returned row, escaping values before printing them */
        foreach ($rows as $row) {
          print "<tr><td>" . ++$id . "</td>
          <td><a href='" . htmlspecialchars($row['species'], ENT_QUOTES) . "'>" .
          htmlspecialchars($row['label']) . "</a></td><td>" .
          htmlspecialchars($row['binomial']) . "</td><td>" .
          htmlspecialchars($row['genus']) . "</td></tr>";
        }
        echo "</table>";
    
      ?>
      </body>
    </html>
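
The table-building part of the listing can also be pulled out into a small function, which makes it possible to try the markup without contacting the endpoint. The following is only a sketch: renderSpeciesTable() is a hypothetical helper (not part of ARC2), the sample row is made up, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper (not part of ARC2): turns result rows into an HTML table.
function renderSpeciesTable(array $rows): string
{
    // Escape every value so text coming from the endpoint cannot inject markup.
    $esc = fn($value) => htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');

    $html = "<table border='1'>\n"
          . "<thead><tr><th>#</th><th>Species (Label)</th>"
          . "<th>Binomial</th><th>Genus</th></tr></thead>\n"
          . "<tbody>\n";

    $id = 0; // row counter shown in the first column
    foreach ($rows as $row) {
        $id++;
        $html .= "<tr><td>{$id}</td>"
               . "<td><a href='" . $esc($row['species']) . "'>"
               . $esc($row['label']) . "</a></td>"
               . "<td>" . $esc($row['binomial']) . "</td>"
               . "<td>" . $esc($row['genus']) . "</td></tr>\n";
    }

    return $html . "</tbody>\n</table>\n";
}

// Made-up sample row, using the same keys the listing reads from each $row.
$sample = [[
    'species'  => 'http://dbpedia.org/resource/Example_fish',
    'label'    => 'Example <fish>',
    'binomial' => 'Examplus fishus',
    'genus'    => 'Examplus',
]];

echo renderSpeciesTable($sample);
```

With the real data, the call would be renderSpeciesTable($rows) after the query has run.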
    

In this SPARQL query, I went a couple of steps further than I did with my previous 'Characidae family' query, just to make things a bit more interesting:

  • I queried genus through dbpprop:genus (entity type Property), as shown in the listing above, instead of dbpedia-owl:genus (entity type ObjectProperty), which the earlier query used
  • I added rdfs:label to the query, bound to ?label in the SELECT clause. rdfs:label is an instance of rdf:Property that provides a human-readable version of a resource name
  • I linked to the DBpedia resource page for each species, using its rdfs:label as the link's anchor text. This originally resulted in duplicate rows, because labels are available in multiple languages. This was resolved using filter ( langMatches(lang(?label), "en") )
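
To see why the language filter matters, here is a small offline sketch of the same idea applied on the PHP side. It is an illustration, not a replacement for the SPARQL filter. The helper keepLanguage() and the sample rows are made up, and the "label lang" key is an assumption about how the result rows expose a literal's language tag. Unlike langMatches, this sketch only accepts an exact tag match, so "en-GB" would not pass for "en".

```php
<?php
// Hypothetical helper: keep only rows whose label carries the wanted language tag.
// Assumes each row stores that tag under the key "label lang".
function keepLanguage(array $rows, string $lang): array
{
    return array_values(array_filter(
        $rows,
        // Case-insensitive exact comparison; rows without a tag are dropped.
        fn($row) => strcasecmp($row['label lang'] ?? '', $lang) === 0
    ));
}

// Made-up rows: one species that comes back once per label language.
$rows = [
    [
        'species'    => 'http://dbpedia.org/resource/Example_fish',
        'label'      => 'Example fish',
        'label lang' => 'en',
    ],
    [
        'species'    => 'http://dbpedia.org/resource/Example_fish',
        'label'      => 'Beispielfisch',
        'label lang' => 'de',
    ],
];

$english = keepLanguage($rows, 'en');
echo count($rows), " rows before, ", count($english), " after\n";
// prints "2 rows before, 1 after"
```

Filtering in the query itself, as done above, is usually preferable because the extra rows never leave the endpoint.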

Output:


More to come!