19. Data Discovery 2: use ADQL

The Relational Registry standard (RegTAP) defines how to query this data set using ADQL. All tables are in the rr schema and can be combined through NATURAL JOIN. The same use case in ADQL looks like this:

SELECT ivoid, access_url, name,
  ucd, column_description
FROM rr.capability
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND 1=ivo_hasword(table_description, 'quasar')
  AND ucd='src.redshift'

As you can see, I’m using UCD to express physics. It’s instructive to compare the query above with the following one:

SELECT ivoid, access_url, name, ucd, column_description
FROM rr.capability
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND 1=ivo_hasword(table_description, 'quasar')
  AND 1=ivo_hasword(column_description, 'redshift')

The difference is that this query does not use the controlled UCD vocabulary but runs a free-text search, similar to the one we performed with WIRR. Precision goes down (in late 2013, it returned two columns containing references rather than redshifts), but recall goes up (in late 2013, it found redshift columns from SDSS catalogs that the UCD query missed).
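To make the comparison concrete, here is a minimal Python sketch that generates both query variants from one template and sorts the returned columns into "found by both", "UCD only" and "free text only". The helper names (adql_literal, discovery_query, compare_hits) are made up for this illustration and are not part of any standard or library, and identifying a column by the pair (ivoid, name) is a simplifying assumption.

```python
# Template shared by both variants; only the column constraint differs.
BASE_QUERY = """SELECT ivoid, access_url, name, ucd, column_description
FROM rr.capability
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND 1=ivo_hasword(table_description, {table_word})
  AND {column_condition}"""


def adql_literal(text):
    """Quote a Python string as an ADQL string literal (single quotes are doubled)."""
    return "'" + text.replace("'", "''") + "'"


def discovery_query(table_word, ucd=None, column_word=None):
    """Build the discovery query, constraining columns by UCD or by a free-text word."""
    if (ucd is None) == (column_word is None):
        raise ValueError("give exactly one of ucd or column_word")
    if ucd is not None:
        column_condition = "ucd=" + adql_literal(ucd)
    else:
        column_condition = "1=ivo_hasword(column_description, {})".format(
            adql_literal(column_word))
    return BASE_QUERY.format(
        table_word=adql_literal(table_word),
        column_condition=column_condition)


def compare_hits(ucd_rows, freetext_rows):
    """Split two result sets of (ivoid, name) pairs into shared and exclusive hits."""
    ucd_set, text_set = set(ucd_rows), set(freetext_rows)
    return {
        "both": ucd_set & text_set,
        "ucd_only": ucd_set - text_set,
        "freetext_only": text_set - ucd_set,
    }


if __name__ == "__main__":
    print(discovery_query("quasar", ucd="src.redshift"))
    print(discovery_query("quasar", column_word="redshift"))
```

Anything that ends up in "freetext_only" is either a false positive (lower precision) or a column whose publisher did not assign the UCD (higher recall); telling the two apart still takes a look at the column descriptions.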

That’s fairly typical. The recommended remedy: Complain to data providers that have lousy metadata, and make sure metadata is good on data that you publish yourself. High-quality metadata is of utmost importance for the VO – but on the other hand: Even shoddily published data is better than entirely unpublished data.

There are a few sample queries in the standard document. With those as a starting point, it’s unlikely you’ll ever need to resort to graphical interfaces to the registry like WIRR.
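If you want to run such queries from a script rather than from a TAP client, the following sketch posts an ADQL string to a TAP service's synchronous endpoint using only the Python standard library. The service URL is an assumption (any TAP service carrying the rr schema will do), as is the availability of CSV output via the FORMAT parameter; build_sync_request and run_query are names invented for this example.

```python
import csv
import io
import urllib.parse
import urllib.request

# Assumption: a TAP service that carries the rr schema; substitute your own.
REGTAP_URL = "http://reg.g-vo.org/tap"


def build_sync_request(adql, base_url=REGTAP_URL):
    """Prepare a POST to the TAP synchronous endpoint; nothing is sent yet."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": "csv",  # assumption: the service can return CSV
        "QUERY": adql,
    }
    data = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(base_url.rstrip("/") + "/sync", data=data)


def run_query(adql, base_url=REGTAP_URL, timeout=30):
    """Send the query and return the rows as a list of dicts keyed by column name."""
    request = build_sync_request(adql, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))
```

Calling run_query with one of the queries above would then give one dictionary per matching column, with keys such as ivoid and access_url. The sketch does no error handling; a failed query typically comes back as an HTTP error or an error document that this code does not interpret.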


Markus Demleitner
