18. TAP: Locating data

The VO has a “registry” that keeps an inventory of the services and data available within the VO. TAP services essentially communicate the content of their TAP_SCHEMA to the registry.

The relational registry standard defines how to query this inventory using ADQL. All its tables are in the rr schema and can be combined through NATURAL JOIN.
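To see why NATURAL JOIN is enough to combine these tables, here is a toy sketch in Python using the standard library's sqlite3 module. It is a stand-in built on assumptions, not the real registry: the three tables carry only a handful of the columns the rr tables have, and every row is invented for illustration. The point is that the tables share column names (here ivoid and cap_index), so the join conditions never have to be spelled out.

```python
import sqlite3

# Toy stand-in for a few rr tables. Real rr tables have many more
# columns, and every row below is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS rr")
conn.executescript("""
CREATE TABLE rr.capability (ivoid TEXT, cap_index INTEGER, standard_id TEXT);
CREATE TABLE rr.interface (ivoid TEXT, cap_index INTEGER, access_url TEXT);
CREATE TABLE rr.res_table (ivoid TEXT, table_index INTEGER, table_description TEXT);

INSERT INTO rr.capability VALUES
  ('ivo://example/a', 1, 'ivo://ivoa.net/std/tap'),
  ('ivo://example/b', 1, 'ivo://ivoa.net/std/conesearch');
INSERT INTO rr.interface VALUES
  ('ivo://example/a', 1, 'http://a.example/tap'),
  ('ivo://example/b', 1, 'http://b.example/scs');
INSERT INTO rr.res_table VALUES
  ('ivo://example/a', 1, 'A quasar catalogue'),
  ('ivo://example/b', 1, 'A star catalogue');
""")

# capability and interface are matched on (ivoid, cap_index);
# res_table is then matched on ivoid, the only name it shares.
rows = conn.execute("""
    SELECT ivoid, access_url, table_description
    FROM rr.capability
      NATURAL JOIN rr.interface
      NATURAL JOIN rr.res_table
    WHERE standard_id = 'ivo://ivoa.net/std/tap'
""").fetchall()

for row in rows:
    print(row)
```

Only the invented resource ivo://example/a survives, because it is the only one with a TAP capability.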

Find tables about quasars that have a column containing redshifts:

SELECT ivoid, access_url, name,
  ucd, column_description
FROM rr.capability
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND 1=ivo_hasword(table_description, 'quasar')
  AND ucd='src.redshift'

As you can see, I’m using a UCD to express the physics. It’s instructive to compare the query above with the following one:

SELECT ivoid, access_url, name, ucd, column_description
FROM rr.capability
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND 1=ivo_hasword(table_description, 'quasar')
  AND 1=ivo_hasword(column_description, 'redshift')

The difference is that this query doesn’t use the controlled UCD vocabulary but does a free-text search instead. You will notice that precision is down (in late 2013, two columns containing not redshifts but references were returned) but recall is up (in late 2013, it found redshift columns from SDSS catalogs that the UCD query missed).
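Precision and recall can be made concrete with a small worked example. The sketch below uses invented column identifiers, not actual registry results. Precision is the fraction of returned columns that really hold redshifts, and recall is the fraction of all real redshift columns that the query returned.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned items."""
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: the columns that truly contain redshifts.
relevant = {"cat1.z", "cat2.z", "sdss1.z", "sdss2.z"}

# Hypothetical UCD query result: only well-annotated columns are found.
ucd_hits = {"cat1.z", "cat2.z"}

# Hypothetical free-text result: more real redshift columns, plus two
# columns that merely mention redshift (for example, references).
text_hits = relevant | {"cat1.z_ref", "cat2.z_ref"}

print(precision_recall(ucd_hits, relevant))   # high precision, lower recall
print(precision_recall(text_hits, relevant))  # lower precision, full recall
```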

That’s fairly typical. The recommended remedy: complain to data providers with lousy metadata, and make sure the metadata on data you publish yourself is good. High-quality metadata is of utmost importance for the VO. On the other hand, even shoddily published data is better than entirely unpublished data.

There are a few sample queries in the standard document. With those as a starting point, it’s unlikely you’ll ever need to resort to graphical interfaces to the registry like WIRR.
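Registry queries like the ones in this section can also be sent from a script. The following is a minimal sketch using only Python's standard library and the synchronous TAP endpoint (the REQUEST, LANG and QUERY parameters posted to the service's sync resource). The base URL is a placeholder and an assumption of this example: substitute the address of a TAP service that actually exposes the rr schema. The helper names build_sync_request and run_query are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder, not a real endpoint: substitute the base URL of a TAP
# service that exposes the rr schema.
REGTAP_URL = "http://registry.example.org/tap"

QUERY = """
SELECT ivoid, access_url, name, ucd, column_description
FROM rr.capability
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND 1=ivo_hasword(table_description, 'quasar')
  AND ucd='src.redshift'
"""


def build_sync_request(base_url, adql):
    """Build a POST request for the synchronous TAP endpoint."""
    params = urlencode(
        {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql.strip()}
    )
    return Request(base_url.rstrip("/") + "/sync", data=params.encode("ascii"))


def run_query(base_url, adql, timeout=60):
    """Send the query and return the raw response body as bytes."""
    request = build_sync_request(base_url, adql)
    with urlopen(request, timeout=timeout) as response:
        return response.read()


# Building the request does not touch the network; run_query() does.
request = build_sync_request(REGTAP_URL, QUERY)
print(request.get_method(), request.full_url)
```

Calling run_query(REGTAP_URL, QUERY) against a real service would return the service's response document, which then needs to be parsed with a VOTable-aware tool.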



Markus Demleitner
