14. DISTANCE

A special geometry is the DISTANCE function. It is used to compute the distance between two points.

DISTANCE(lon1, lat1, lon2, lat2)

or

DISTANCE(POINT (lon1, lat1),
         POINT (lon2, lat2) )

Both versions are valid and will lead to the same result. The DISTANCE function can be used to make cone selections and is the prefered way to perform crossmatches on sky positions.

SELECT
   TOP 1000
   *
   FROM arihip.main
   WHERE
      DISTANCE(POINT(raj2000, dej2000),
               POINT(189.2, 62.21)) < 10

When catalogs are on different epochs, you may need to account for proper motions to match faster stars. You should, however, not apply the proper motions in the primary selection. If you do that, the index cannot be used, and your query will waste a lot of CPU and disk bandwidth. Instead, decide about the maximum proper motion your objects might have (to get an idea, of the statistics, try selecting the fastest stars from ppmxl.main – apart from the fact that the catalog got the fastest stars pretty wrong with two copies of some fast stars, there’s only a handful stars moving faster than four arcsecs per year).

Then multiply this with your epoch difference and make that your initial crossmatch radius. Then filter out the spurious matches with an extra where clause taking into account the proper motions. For moderate epoch differences, don’t worry about going into the tangential plane to apply proper motions and, for now, say something like

SELECT
   TOP 30
   *
   FROM ppmxl.main AS m
   JOIN gaia.edr3lite AS g
   ON DISTANCE (POINT(m.raj2000, m.dej2000),
                POINT(g.ra, g.dec)) < 30./3600.
   WHERE DISTANCE (POINT(
m.raj2000+m.pmra*COS(RADIANS(m.dej2000))*16
m.dej2000+m.pmde*16),
                POINT(g.ra, g.dec)) < 0.5/3600

The 16 is because Gaia E-DR3 is on J2016, whereas PPMXL is on J2000. Also, be careful with the units – in many catalogs, positions and proper motions are given in different units.

Also note how the outer PM-based filter is just a WHERE-clause. Since JOIN is a combination of operators of the relational algebra, the result of a join is a relation again and thus can be treated like any other table.

See that we used the explicit POINT function here in spite of the DISTANCE function accepting a list of four options. The reason is due to readability for humans. Long queries (especially more elaborate crossmatches) can easily get a little messy. The rule of thumb should be: if you expect others to reuse and understand your code, keep it readable. If in doubt, think of your future you in, let’s say, 2 years from now.


Hendrik Heinl

Copyright Notice