16. Almost real world

Just so you get an idea how SQL expressions can evolve to span several pages: Suppose you have a catalog giving alpha, delta, and an epoch of observation reasonable far away from J2000. To match it, you have to bring the reference catalog on our side to the epoch of your observation. For larger reference catalogs, that would be quite an expensive endeavour. Thus, it’s usually better to just transform a bunch of candidate stars.

To do this, you decide how far one of your stars can have moved (in the example below 0.1 degrees, the inner crossmatch), and you generate a crossmatch there. From that crossmatch, you select the rows for which the transformed coordinates match to the precision you want.

In the following, we use a rough approximation to applying proper motions. Unfortunately, ADQL does not contain builtins for applying proper motions, and the exact expressions are messy. The following query should run (for a little while) with an artificial input file.

SELECT * FROM (
  SELECT
    mine.*,
    raj2000+pmra/cos(radians(dej2000))*(epoch-2000)
      as palpha,
    dej2000+pmde*(epoch-2000) as pdelta,
    pmra,
    pmde
  FROM
    ppmxl.main AS ppmxl
    JOIN tap_upload.t1 AS mine
    ON (1=CONTAINS(
      POINT('ICRS', ppmxl.raj2000, ppmxl.dej2000),
      CIRCLE('ICRS', mine.alpha, mine.delta, 0.1)))) as q
WHERE
  palpha BETWEEN alpha-0.5/3600 AND alpha+0.5/3600
  AND pdelta BETWEEN delta-0.5/3600 AND delta+0.5/3600

(don’t forget to adapt the table name behind tap_upload!). Done really correctly, this would still be a bit longer, since the outer where actually is a crossmatch criterion, too. You could either write a contains clause as in the inner select or, if you insist on a box-type criterion as used in the query, you should at least divide the tolerance in alpha by cosδ.

If you’ve tried it, you’ll have noticed that 100 rows were returned for 100 input rows. For “real"data you’d of course not have this; there’d be objects not matching at all and probably objects matching multiple objects. The reason this worked so nicely in this case is that the sample data is artificial: I made that up using ADQL, too. The statement was:

select
  raj2000-epdiff*pmra/cos(radians(dej2000))+(rand()-0.5)/4000 as alpha,
  dej2000-epdiff*pmde+(rand()-0.5)/5000 as delta, 2000-epdiff as epoch
from (
  select TOP 100 m.*, 75-RAND()*50 as epdiff
  from ppmxl.main as m
  where sqrt(POWER(pmra,2)+POWER(pmde,2)) BETWEEN 1.7/3600. and 2/3600.) as qi
"

Files


Markus Demleitner

Copyright Notice