Add option write_skewers_Npix to write one skewers file per HEALPix pixel #24

Open
andreufont opened this issue Nov 6, 2017 · 6 comments


@andreufont
Contributor

andreufont commented Nov 6, 2017

Right now, each node writes a fraction of the skewers to a file called something like:
skewers_srcs_s0_0.fits
where _0 specifies the node.

The current setup can produce very large files, especially when running on only a few nodes.

Moreover, most codes that will use these files work with HEALPix-pixel-based files, so it would be great if CoLoRe could write one file per HEALPix pixel directly. Note that this does not need to be the same HEALPix grid used internally in CoLoRe, but it could use that as the default.

The new files could be named something like:
skewers_srcs_s0_pix37.fits
where _pix37 now specifies the HEALPix pixel contained in the file, or something similar. Of course, write_skewers_Npix would be added as metadata to one of the output files so that we can map a given pixel back to a sky position.
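As a rough illustration of the naming scheme proposed above, here is a minimal Python sketch that builds and parses such file names. The helper names `skewer_filename` and `pixel_from_filename` are hypothetical and not part of CoLoRe; the prefix is taken from the example file name in this issue.

```python
import re

# Matches the "_pix<N>.fits" suffix of the proposed per-pixel file names.
_PIX_RE = re.compile(r"_pix(\d+)\.fits$")


def skewer_filename(pix, prefix="skewers_srcs_s0"):
    """Build a per-pixel file name (hypothetical helper, not CoLoRe code)."""
    return f"{prefix}_pix{pix}.fits"


def pixel_from_filename(name):
    """Recover the HEALPix pixel index from a per-pixel file name."""
    match = _PIX_RE.search(name)
    if match is None:
        raise ValueError(f"not a per-pixel skewers file: {name}")
    return int(match.group(1))


print(skewer_filename(37))
print(pixel_from_filename("skewers_srcs_s0_pix37.fits"))
```

The pixel index alone is not enough to locate the file on the sky: the resolution and the pixel ordering would also need to be recorded in the metadata.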

Important question: in the current code, is a HEALPix pixel always contained in a single node-based output? If not, would it be difficult to assign these output HEALPix pixels to nodes, and let each node write the files corresponding to its pixels? The alternative would be having two or more nodes writing to the same pixel file, which I guess is not ideal.

A reasonable default value for write_skewers_Npix could be either 8 or 16. With write_skewers_Npix=8, each file would have ~5000 quasars, resulting in file sizes of about 400 MB (?).

@damonge
Owner

damonge commented Nov 6, 2017

This can be done. Note that right now each node will contain all skewers in big healpix pixels (N_side=1, 2 or higher, depending on the number of nodes you run with), with no overlap between nodes. So one could make the code spit out sub-pixels of those big pixels individually.
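To make the sub-pixel idea concrete, here is a minimal Python sketch of how the sub-pixels of a big pixel can be enumerated, assuming the NESTED ordering scheme. In that scheme, each pixel at a given Nside splits into four consecutive children at 2×Nside. The function names are hypothetical; this is not CoLoRe's implementation.

```python
def npix(nside):
    """Total number of HEALPix pixels at resolution nside."""
    return 12 * nside * nside


def nest_children(parent, nside_parent, nside_child):
    """Indices (NESTED scheme) of the sub-pixels of `parent` at nside_child."""
    if nside_child < nside_parent or nside_child % nside_parent:
        raise ValueError("nside_child must be a multiple of nside_parent")
    ratio = nside_child // nside_parent
    if ratio & (ratio - 1):
        raise ValueError("nside ratio must be a power of two")
    if not 0 <= parent < npix(nside_parent):
        raise ValueError("parent pixel index out of range")
    n_sub = ratio * ratio  # each factor of 2 in Nside gives 4 children
    return range(parent * n_sub, (parent + 1) * n_sub)


# A node holding big pixel 5 at Nside=2 would own these 16 pixels at Nside=8.
print(list(nest_children(5, nside_parent=2, nside_child=8)))
```

Because the children of a big pixel form a contiguous index range in NESTED ordering, each node can write its own sub-pixel files without coordinating with other nodes.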

@andreufont
Contributor Author

That's great. I just noticed that when using 32 nodes, I get 16 files of size 9 GB and 16 files of size 4.5 GB, which probably means that the output files use Nside=2, i.e., 48 pixels, and half of the nodes write 2 pixels to their file. It does look like an easy change to implement.
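The arithmetic behind this guess can be checked with a short Python sketch. It assumes the pixels are spread as evenly as possible over the nodes; which nodes receive the extra pixel is an arbitrary choice here and not necessarily what CoLoRe does.

```python
nside = 2
n_nodes = 32

n_pix = 12 * nside**2                 # number of HEALPix pixels at Nside=2
base, extra = divmod(n_pix, n_nodes)  # pixels per node, and leftover pixels

# Assumption: the first `extra` nodes each take one additional pixel.
pixels_per_node = [base + 1 if node < extra else base for node in range(n_nodes)]

print(n_pix, base, extra)
print(pixels_per_node.count(2), "nodes with 2 pixels,",
      pixels_per_node.count(1), "nodes with 1 pixel")
```

Sixteen nodes holding two pixels and sixteen holding one is consistent with the factor of two between the observed file sizes.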

@andreufont
Contributor Author

Actually, we have a very simple post-process script that allows us to split the files into smaller files with a different value of Nside, so this is not a priority at all.

@andreufont
Contributor Author

As pointed out in issue #27, the current files might not correspond to HEALPix pixels, so it would be good to implement this.

@damonge
Owner

damonge commented Jan 4, 2018

@andreufont: see the new branch output_perpix. The catalogs are now split into individual HEALPix pixels (although the user has no control over the Nside). The pixel indices are in NEST order (I should document this).
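Since the user cannot choose the Nside, one practical consequence of NEST ordering is that per-pixel outputs can be regrouped to a coarser Nside in post-processing with integer arithmetic alone. Below is a minimal Python sketch under that assumption; the function names are hypothetical and the pixel list is made up for illustration.

```python
from collections import defaultdict


def nest_parent(pix, nside, nside_coarse):
    """Parent pixel at nside_coarse of NEST pixel `pix` given at nside."""
    if nside < nside_coarse or nside % nside_coarse:
        raise ValueError("nside must be a multiple of nside_coarse")
    ratio = nside // nside_coarse
    if ratio & (ratio - 1):
        raise ValueError("nside ratio must be a power of two")
    # In NESTED ordering the ratio**2 children of a pixel are consecutive.
    return pix // (ratio * ratio)


def regroup(pixels, nside, nside_coarse):
    """Group fine NEST pixel indices by their coarse parent pixel."""
    groups = defaultdict(list)
    for pix in pixels:
        groups[nest_parent(pix, nside, nside_coarse)].append(pix)
    return dict(groups)


# Illustrative pixel indices at Nside=8, regrouped to Nside=2.
print(regroup([0, 3, 4, 80, 95], nside=8, nside_coarse=2))
```

This shortcut does not hold for RING-ordered indices, so any code reading these files needs to know the ordering; in healpy, for instance, that means passing nest=True to functions such as pix2ang.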

@andreufont
Contributor Author

This is great!
@jfarr03, it would be great if you could give it a try!

2 participants