Query Advice

This page contains some general advice about how to search Sloan Digital Sky Survey data with SkyServer.

Here are some things to keep in mind while searching SkyServer:

  1. An excellent way to learn SQL is to modify pre-existing SQL queries. Look at the Sample SQL Queries on SkyServer. There is a link to them under the Help menu.
  2. If you're not sure how many objects a query is going to return, it's always a good idea to first do a "count" query first. A count query will return only the number of objects that match the query, and will not return the actual data. This will give you an idea of how long the query will take, so you don't find yourself waiting a lot longer than you expected to. Here is an example of such a query - click Load Query to load it into the SQL Search tool:

    Using count

    How many galaxies have g < 18?

    SELECT count(*)
    FROM galaxy
    WHERE g < 18
    Load Query
  3. If a query takes much longer to run than you think it should, you should try it again later to see if the problem is the load on the server. If it still runs slowly, look at the Optimizing Queries section of SkyServer's guide SQL in SkyServer.
  4. Be sure to exclude invalid values (unset or uncalculated quantities) as described in the Excluding Invalid Data Values section of SkyServer's SQL in SkyServer. For example, the following query will exclude invalid magnitude data for the u magnitude - click Load Query to load it into the SQL Search tool:

    Excluding invalid data values

    How many faint galaxies are there between ra = 180 and ra = 181?

    SELECT ra, dec, u, err_u
    FROM PhotoObj
    WHERE ra BETWEEN 180 and 181
    	AND u > -9999 AND u< 20.0-- instead of just "u < 20.0"
    	AND err_u > -1000 AND err_u< 0.1-- instead of just "err_u < 0.1"
    Load Query
  5. Use the Image List tool as a sanity check on your queries. Click on Use query to fill form in the left panel, then enter your query and click Submit. A table of results will appear; click Send to List. You will see thumbnail images of all the objects that matched your query. You can click on the thumbnails to go to the Navigate tool, or on the object names to go to the Explore tool. Remember that Image List queries are limited to 1,000 objects, and that the SELECT block must have the form SELECT name, ra, dec, where name can be anything you want. For example, this query uses the SDSS redshift as a name:

    Using the Image List tool

    How many galaxies have g < 18?

    SELECT z as name, ra, dec
    FROM specPhoto
    WHERE z BETWEEN 0.01 and 0.05
  6. If you are running a query for photometric data, and you are searching for common parameters, consider using the PhotoTag view instead of the PhotoObj view. PhotoTag contains all the objects that PhotoObj contains, but has many fewer parameters for each object. Your query will run much faster. But remember that in PhotoTag, magnitudes are not referred to as [u, g, r, i, z], but as modelMag_[u, g, r, i, z].
  7. If your query returns a lot of output (more than a few thousand objects), it is generally not a good idea to let the browser render the output by selecting the HTML output format (default). You can try using one of the other output formats that save results to a file, e.g., CSV, JSON or FITS. However, in general, for queries that take a long time (more than a few minutes) to complete and that return large output data sets, you're much better off using an asynchronous service like the CasJobs batch query system to run such queries and fetch large numbers of objects.
  8. If you know you want to search for both photometric and spectroscopic objects, search on the SpecPhoto view. In SpecPhoto, the redshift is referred to as z and the magnitudes are referred to as modelmag_x (where x is the waveband). Here is a query to get magnitudes and redshifts from specPhoto:

    Searching the specPhoto table

    Get magnitudes for 100 objects with quality redshifts

    SELECT top 100 modelmag_u, modelmag_g, modelmag_r, modelmag_i, modelmag_z, z
    FROM SpecPhoto
    WHERE zWarning = 0
    Load Query