Select all MusicalArtist and Band resources from the DBpedia endpoint

 select (COUNT(distinct ?artist) AS ?count) where {
  ?artist <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/MusicalArtist>}

Result: 45107

Fetching the resources themselves returns only the first 10000 results, because the endpoint caps the number of rows returned per query:


 select distinct ?artist where {   
  ?artist <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/MusicalArtist>}  
 ORDER BY ?artist  

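Get the following pages of 10000 results with LIMIT and OFFSET. OFFSET is the number of rows to skip, so the second page starts at OFFSET 10000 and the third page at OFFSET 20000:


 select distinct ?artist where {
  ?artist <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/MusicalArtist>}
 ORDER BY ?artist
 LIMIT 10000
 OFFSET 20000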
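
The paging step above can be sketched in Python. This is only an illustration that builds the query strings; the helper names `pages_needed` and `build_page_query` are made up for this example, and sending the queries to the endpoint is left out.

```python
import math

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MUSICAL_ARTIST = "http://dbpedia.org/ontology/MusicalArtist"


def pages_needed(total_rows: int, page_size: int = 10000) -> int:
    """Number of pages required to cover total_rows results."""
    return math.ceil(total_rows / page_size)


def build_page_query(page: int, page_size: int = 10000) -> str:
    """Return the paged query for a zero-based page number (hypothetical helper).

    OFFSET is the number of rows to skip, so page 0 starts at OFFSET 0,
    page 1 at OFFSET 10000, and so on.
    """
    offset = page * page_size
    return (
        "SELECT DISTINCT ?artist WHERE {\n"
        f"  ?artist <{RDF_TYPE}> <{MUSICAL_ARTIST}>\n"
        "}\n"
        "ORDER BY ?artist\n"
        f"LIMIT {page_size}\n"
        f"OFFSET {offset}"
    )


# Show the OFFSET line of every page needed for the 45107 counted artists.
for page in range(pages_needed(45107)):
    print(build_page_query(page).splitlines()[-1])
```

Computing the offset as `page * page_size` keeps the pages contiguous, so no row is skipped or fetched twice between pages.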

However, when the offset reaches 30001, the endpoint returns an error, because OFFSET 30001 plus LIMIT 10000 asks the server to sort 40001 rows:


Virtuoso 22023 Error SR353: Sorted TOP clause specifies more then 40001 rows to sort. Only 40000 are allowed. Either decrease the offset and/or row count or use a scrollable cursor
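
Going by the numbers in the error message, the server seems to reject a sorted query when OFFSET plus LIMIT exceeds 40000. A small Python sketch of that check follows; the 40000 limit is taken from the message above, and the function names are hypothetical.

```python
# Limit quoted in the SR353 error message; other endpoints may be configured differently.
MAX_SORTED_ROWS = 40000


def rows_to_sort(offset: int, limit: int) -> int:
    """Rows the server must sort to answer a sorted LIMIT/OFFSET query."""
    return offset + limit


def fits_sort_limit(offset: int, limit: int,
                    max_sorted_rows: int = MAX_SORTED_ROWS) -> bool:
    """True if the query stays within the sorted-rows limit (assumed rule)."""
    return rows_to_sort(offset, limit) <= max_sorted_rows


# The query that failed: 30001 + 10000 = 40001 rows to sort.
print(rows_to_sort(30001, 10000), fits_sort_limit(30001, 10000))
```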

So I reversed the sort order, fetched 20000 resources from the other end of the list, and removed the duplicates that overlap with the ascending pages. The first descending page is:

 select distinct ?artist where {   
  ?artist <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/MusicalArtist>}  
 ORDER BY DESC(?artist)  
 LIMIT 10000  
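
Removing the duplicates can be done on the client side. The following Python sketch assumes the ascending and descending results have already been collected into two lists of URIs; `merge_pages` is a hypothetical helper, not part of any library.

```python
from itertools import chain


def merge_pages(ascending, descending):
    """Combine results fetched in ascending and descending order.

    The descending rows are reversed so that the merged list stays in
    ascending order; a URI that was already seen is skipped.
    """
    seen = set()
    merged = []
    for uri in chain(ascending, reversed(descending)):
        if uri not in seen:
            seen.add(uri)
            merged.append(uri)
    return merged


# Tiny stand-in data: "c" and "d" appear in both directions.
ascending = ["a", "b", "c", "d"]
descending = ["f", "e", "d", "c"]
print(merge_pages(ascending, descending))
```

Using a set for the membership test keeps the merge linear in the number of rows, which matters little at 50000 rows but costs nothing.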

Obviously, if you need more than 60000 resources, the problem becomes more complex...
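
One workaround often suggested for this Virtuoso error is to move the sorted SELECT into a subquery and apply LIMIT and OFFSET to the outer query. The Python sketch below only builds such a query string; it has not been run against the DBpedia endpoint here, so treat it as an assumption to try out rather than a confirmed fix. The function name is hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MUSICAL_ARTIST = "http://dbpedia.org/ontology/MusicalArtist"


def build_subquery_page(offset: int, limit: int = 10000) -> str:
    """Wrap the sorted query in a subquery and page the outer query.

    Assumption: the endpoint applies its sorted-rows limit only to a
    query that combines ORDER BY with LIMIT/OFFSET at the same level.
    """
    return (
        "SELECT ?artist WHERE {\n"
        "  {\n"
        "    SELECT DISTINCT ?artist WHERE {\n"
        f"      ?artist <{RDF_TYPE}> <{MUSICAL_ARTIST}>\n"
        "    }\n"
        "    ORDER BY ?artist\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}\n"
        f"OFFSET {offset}"
    )


# The fifth page, which the plain sorted query could not reach.
print(build_subquery_page(40000))
```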
