Simple Cone Search Creator (SCSC) aims at providing a very easy way to set up a Cone Search service complying with the IVOA standard described at http://www.ivoa.net/documents/REC/DAL/ConeSearch-20080222.html. This allows you to quickly set up a service queryable by position from various tools and libraries (TOPCAT, Aladin, etc.).
Requirements are minimal on the server side: you will only need an HTTP server able to execute CGI scripts (no database, no Tomcat needed).
Python library requirements: numpy, healpy
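These can typically be installed with pip, for instance:
pip install numpy healpy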
SCSC is made of two parts:
- an ingestor
- a CGI Python script
The starting point is a CSV file (with or without an initial header line) containing at least a right ascension and a declination column, in decimal degrees, in the ICRS coordinate system.
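For instance, a minimal input file (with hypothetical column names and illustrative positions) could look like:
ra_deg,dec_deg,name
10.6847,41.2690,source_1
83.8221,-5.3911,source_2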
Ingestion
The ingestor takes a CSV-formatted data file and converts it into an ad hoc set of files, which are later used by the CGI script. Usage:
./ingest.py --csvfile CSVFILE --outputdir OUTPUTDIR --rafield RAFIELD --decfield DECFIELD [--idfield IDFIELD] [--debug]
Parameters:
- CSVFILE (compulsory): path to the CSV input file
- OUTPUTDIR (compulsory): directory that will contain the data converted to the ad-hoc format
- RAFIELD (compulsory): name or index (zero-based) in the CSV file of the field holding the right ascension
- DECFIELD (compulsory): name or index (zero-based) in the CSV file of the field holding the declination
- IDFIELD (optional): name or index (zero-based) in the CSV file of the field holding the identifier string. If not given, the script will generate an identifier based on the row index.
Example (using the HIP.csv file available in the test-data directory):
./ingest.py --csvfile ../test-data/HIP.csv --outputdir HIP-cs --rafield _RAJ2000 --decfield _DEJ2000 --idfield HIP
If the CSV file has no header, the script will automatically create column names (col_0, col_1, ...).
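Under the hood the ingestor relies on healpy; conceptually, the spatial indexing amounts to assigning each row to a HEALPix pixel, as in the simplified sketch below (this is not the actual ingest.py code: the NSIDE value, pixel ordering and file layout are assumptions).

import csv
from collections import defaultdict

import healpy as hp

NSIDE = 64  # assumed resolution; the real ingestor may use a different value or scheme

def partition_by_healpix(csvfile, rafield, decfield):
    """Group CSV rows by the HEALPix pixel containing their position."""
    pixels = defaultdict(list)
    with open(csvfile, newline="") as f:
        for row in csv.DictReader(f):
            ra, dec = float(row[rafield]), float(row[decfield])
            ipix = hp.ang2pix(NSIDE, ra, dec, nest=True, lonlat=True)
            pixels[ipix].append(row)
    return pixels  # each pixel's rows would then be written to its own file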
Once the data has been parsed and converted, a short summary of the parsing is displayed. If some rows were ignored, you may want to re-run the ingestion with the --debug flag to get more information. You may also want to review the file OUTPUTDIR/metadata.json, which describes the different fields. Feel free to update this file, as long as you do not change the number of fields and do not remove the description of the fields holding the UCDs POS_EQ_RA_MAIN and POS_EQ_DEC_MAIN.
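To review it, you can pretty-print it with, for example (path taken from the example above):
python3 -m json.tool HIP-cs/metadata.json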
A cgi-config.json file is also created in OUTPUTDIR.
CGI Python script
The CGI script is responsible for parsing the cone search query and outputting the matching data, in compliance with the Cone Search standard referenced above.
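For reference, the overall shape of a Cone Search CGI response looks roughly like the sketch below. This is not the real cs.py: parameter handling is reduced to a bare minimum, the FIELD list is only illustrative, and no actual data lookup is performed.

#!/usr/bin/env python3
# Bare-bones illustration of a Cone Search CGI response (not the real cs.py).
import os
from urllib.parse import parse_qs

params = parse_qs(os.environ.get("QUERY_STRING", ""))
ra = float(params.get("RA", ["0"])[0])
dec = float(params.get("DEC", ["0"])[0])
sr = float(params.get("SR", ["0"])[0])

# A Cone Search service answers with a VOTable listing the matching rows.
print("Content-Type: text/xml")
print()
print(f"""<?xml version="1.0"?>
<VOTABLE version="1.1">
 <RESOURCE>
  <TABLE>
   <FIELD name="ID" ucd="ID_MAIN" datatype="char" arraysize="*"/>
   <FIELD name="RA" ucd="POS_EQ_RA_MAIN" datatype="double" unit="deg"/>
   <FIELD name="DEC" ucd="POS_EQ_DEC_MAIN" datatype="double" unit="deg"/>
   <DATA><TABLEDATA>
    <!-- rows within {sr} deg of ({ra}, {dec}) would go here -->
   </TABLEDATA></DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>""")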
Once the data has been ingested, here is how to test the cone search service:
cd TEMP_DIR
mkdir cgi-bin
- Copy cs.py and OUTPUTDIR/cgi-config.json to TEMP_DIR/cgi-bin
- Launch from TEMP_DIR the command: python3 -m http.server 1234 --cgi
- Open the link http://0.0.0.0:1234/cgi-bin/cs.py?RA=0&DEC=0&SR=0 in your browser. You should see an XML file with the list of elements.
- If the previous step works, we can go further and test the service in Aladin:
  - Launch Aladin
  - Go to File-->Open
  - Click on the Others tab, at the bottom right of the window, and select Generic Cone Search query
  - Enter http://0.0.0.0:1234/cgi-bin/cs.py? as the base URL, enter a target and a radius, and click Submit
  - You should be able to visualize the sources in the requested cone
- We can also try our service in TOPCAT:
  - Launch TOPCAT
  - Go to File-->Load Table
  - Go to Data Sources-->Cone Search
  - Enter http://0.0.0.0:1234/cgi-bin/cs.py? as the Cone URL (bottom panel of the window)
  - Enter a position and a radius and click OK
  - A new table with the corresponding sources should appear in TOPCAT
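Besides GUI clients, the service can also be queried programmatically. A minimal check using only the Python standard library (assuming the local test server above is still running, and with arbitrary example values for RA, DEC and SR):

from urllib.request import urlopen

url = "http://0.0.0.0:1234/cgi-bin/cs.py?RA=10.68&DEC=41.27&SR=0.5"
votable = urlopen(url).read().decode()
print(votable[:500])  # should start with a VOTABLE document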
Deploying the cone search service on a production server is just a matter of copying the OUTPUTDIR data and the CGI script cs.py along with cgi-config.json, and adjusting the path in cgi-config.json.
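For example (hypothetical server paths, to be adapted to your own layout):

scp -r HIP-cs user@server:/srv/cs-data/
scp cs.py HIP-cs/cgi-config.json user@server:/var/www/cgi-bin/
# then edit /var/www/cgi-bin/cgi-config.json so that it points to /srv/cs-data/HIP-cs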
Generated cone search services have been tested against the VO Paris (http://voparis-validator.obspm.fr/) and NVO (http://nvo.ncsa.uiuc.edu/dalvalidate/csvalidate.html) validators.
We have successfully tested our scripts by generating a Cone Search service from an 18-million-row CSV file (PPMX data with 26 columns). Ingestion of the data took 15 minutes. Querying a cone with a 10-degree radius centered on the LMC returns the 113,686 corresponding rows in 7 seconds. Queries with a radius smaller than 1 degree usually return in less than 1 second.
The source code is available under the BSD 3-clause license.
The test data HIP.csv has been extracted from the Hipparcos catalogue (1997ESASP1200.....P) available in VizieR.
Please send your comments, questions, bug reports, etc. to [email protected]