Drizzlepac/HAP: Address issues related to new "rel" keywords #1878

Open
stscijgbot-hstdp opened this issue Sep 9, 2024 · 6 comments

@stscijgbot-hstdp
Collaborator

Issue HLA-1325 was created on JIRA by Steve Goldman:

Found when testing HLA-1271:

I noticed some unexpected results in the test data, which I report here for the following observations:

 

1.) For a single FLC dataset, the keyword values for the relative and catalog fits are identical, but they should have unique values corresponding to their own number of matches, RMS, etc.

/ifs/archive/dev/processing/hla/home/mburger/singlevisits/results_2024-09-04/ibxl04/ibxl04kdq_flc.fits

WCSNAME = 'IDC_2731450pi-FIT_REL_GAIAeDR3' / Coordinate system title            
RMS_RA  =    5.711020259082877 / RMS in RA of WCS fit(mas)                      
RMS_DEC =   5.2894068148818985 / RMS in Dec of WCS fit(mas)                     
NMATCHES=                  686                                                                                                       
RELREFIM= 'ibxl04kbq'          / base reference image rootname for relative fit 
RELRMS_R=    5.711020259082877 / RMS in RA of relative WCS fit(mas)             
RELRMS_D=   5.2894068148818985 / RMS in DEC of relative WCS fit(mas)            
RELMATCH=                  686 / number of matches for relative fit             
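
A minimal sketch of how the duplication in item 1 could be checked automatically. The helper name `duplicated_rel_keywords` and the pairing table are hypothetical; only the keyword names and values come from the header listing above. The header is modelled as a plain dict here; any mapping keyed by keyword name should work the same way.

```python
# Hypothetical pairing of each new "rel" keyword with its catalog-fit counterpart,
# based only on the keyword names shown in the header listing above.
KEYWORD_PAIRS = {
    "RELRMS_R": "RMS_RA",
    "RELRMS_D": "RMS_DEC",
    "RELMATCH": "NMATCHES",
}


def duplicated_rel_keywords(header):
    """Return the 'rel' keywords whose value equals the catalog-fit value."""
    duplicates = []
    for rel_key, cat_key in KEYWORD_PAIRS.items():
        # Only compare when both keywords are actually present.
        if rel_key in header and cat_key in header:
            if header[rel_key] == header[cat_key]:
                duplicates.append(rel_key)
    return duplicates


# Values copied from ibxl04kdq_flc.fits as reported above.
header = {
    "RMS_RA": 5.711020259082877,
    "RMS_DEC": 5.2894068148818985,
    "NMATCHES": 686,
    "RELRMS_R": 5.711020259082877,
    "RELRMS_D": 5.2894068148818985,
    "RELMATCH": 686,
}

print(duplicated_rel_keywords(header))  # ['RELRMS_R', 'RELRMS_D', 'RELMATCH']
```

An empty list would indicate that the relative-fit keywords carry their own values.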

2.) For multiple FLC datasets, the keyword values for the catalog fit are the same (as expected since they are aligned as a group), but the values for the relative fit should be unique.  

/ifs/archive/dev/processing/hla/home/mburger/singlevisits/results_2024-09-04/ibxl04/i*_flc.fits

For F502N, the relative fit uses the refimage 'ibxl04kbq', so the RELRMS* values for that dataset should be empty and each subsequent FLC should have slightly different values.

A promising result is that each filter set has a unique set of fitting results. However, the relative and catalog fit RMS values should not be the same (see item 1 above).

See attachment ibxl04.png

 

3.) For the SVM images, the relrefim for all FLCs is the same (hst_12812_04_wfc3_uvis_f606w_ibxl04kp, see last column), which is good! On the other hand, the relrms_r and relrms_d values for that image should either not be populated or be cleared if they were previously populated (see item 2). Again, the rms_ra and relrms_r values should not be the same.

/ifs/archive/dev/processing/hla/home/mburger/singlevisits/results_2024-09-04/ibxl04/hst*_flc.fits

See attachment hst_ibxl04.png

 

4.) For the SVM image below, I see fitgeom='rshift', which should not be allowed (only 'rscale' is). In this case, there were fewer than 10 catalog matches, so it should skip eDR3 and go to the next catalog.

/ifs/archive/dev/processing/hla/home/mburger/singlevisits/results_2024-09-04/ibl738/hst*_flt.fits

IMAGE                                         NMATCHES  FITGEOM  WCSNAME
hst_12286_38_wfc3_ir_f125w_ibl738k2_flt.fits  7         rshift   IDC_w3m18525i-FIT_SVM_GAIAeDR3

@stscijgbot-hstdp stscijgbot-hstdp changed the title Drizzlepac/HLA: Address issues related to new "rel" keywords Drizzlepac/HAP: Address issues related to new "rel" keywords Sep 9, 2024
@stscijgbot-hstdp
Collaborator Author

Comment by Steve Goldman on JIRA:

Jennifer Mack, I think I misinterpreted what was originally requested, so let me clarify before making any more changes. Also, these changes would require considerable time and refactoring of the code, so I want to make sure.

Is the desired outcome that the best-fit catalog results are saved in the older output header keywords ("fitgeom", "nmatch", "rms_ra", etc.), and that the best relative fit results are captured in the new "rel" keywords (e.g. relmatch, relgeom, etc.)?

At the moment, the older header keywords record the best overall fit. If we are going to use them only for catalog fit results, we will need to completely change how the code passes the best overall fit result (relative or catalog) through the code.

 

Also I want to clarify the requested fit geometries and required nmatches for all scenarios:

pipeline products

  • relative fitting
      • rshift: nmatches=20
      • general: nmatches=9
  • catalog fitting
      • rscale: nmatches=10
      • general: nmatches=9

SVM

  • relative fitting
      • rscale: nmatches=10
      • general: nmatches=9
  • catalog fitting
      • rscale: nmatches=10
      • general: nmatches=9
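
To make the scenarios above concrete, here is a minimal sketch of a lookup table that lets the nmatch requirement differ by product type and fit type. The `MIN_MATCHES` table, its layout, and the `min_matches` helper are hypothetical and are not drizzlepac's actual configuration format; the numbers are the ones listed above.

```python
# Hypothetical table: (product type, fit type) -> {fitgeom: minimum nmatches}.
# Values are taken from the list above, not from the real config files.
MIN_MATCHES = {
    ("pipeline", "relative"): {"rshift": 20, "general": 9},
    ("pipeline", "catalog"): {"rscale": 10, "general": 9},
    ("svm", "relative"): {"rscale": 10, "general": 9},
    ("svm", "catalog"): {"rscale": 10, "general": 9},
}


def min_matches(product, fit_type, fitgeom):
    """Look up the minimum number of matches for one fitting scenario."""
    return MIN_MATCHES[(product, fit_type)][fitgeom]


print(min_matches("pipeline", "relative", "rshift"))  # 20
print(min_matches("svm", "relative", "rscale"))       # 10
```

Keying on both product type and fit type is what would allow pipeline and SVM relative fitting to use different thresholds.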

At the moment, the code uses the same configuration nmatch requirement values for a given set of pipeline and SVM products. If we want them to differ (e.g. different numbers of required nmatches for pipeline and SVM relative fitting), we will need to change how the code sets all of the nmatch requirements.

I will have to look into why the individual relative fit RMS values are the same for all of the FLTs in an association.

@mackjenn

mackjenn commented Sep 13, 2024

Hi Steve,
From ticket HLA-1271, we suggested these config settings for a successful fit. (I've added conditions at right to show what happens if the requirement is not met).

    FIT-REL:      relmatch >= 20, relgeom = 'rshift'    else 'No Fit' (skip to catalog fit)
    FIT-REL-cat:  nmatches >= 10, fitgeom = 'rscale'    else 'No Fit' (skip to next catalog in the list;
                                                        if no catalog fit, keep 'a priori' WCS)

    FIT-SVM:      relmatch >= 10, relgeom = 'rscale'    else 'No Fit' (skip to SVM catalog fit)
    FIT-SVM-cat:  nmatches >= 10, fitgeom = 'rscale'    else 'No Fit' (keep 'a priori' WCS)

If it's hard to have different fit parameters for different steps of the code, Varun and I agree that you can just require >=10 matches with 'rscale' for every step.

For N<=9 matches, you would deem the fit unsuccessful and go to the next step. For example, if the relative alignment fails, you would skip it and just do the catalog alignment (the resulting WCS should be FIT-IMG-eDR3 rather than FIT-REL-eDR3). In this case, the relative alignment keyword values would be either unpopulated or N/A.

If you were doing a catalog alignment and did not get 10 matches, you would move down the list of catalogs (e.g. the resulting WCS would be FIT-REL-GSC242 or FIT-REL-2MASS instead of FIT-REL-eDR3).

Note that fitgeom=general has too many free parameters, so we don't want this to be used in the pipeline, especially for very few matches.
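
The acceptance rules and catalog fallback described above can be sketched as follows. This is only an illustration under stated assumptions: `FIT_RULES`, `fit_is_acceptable`, and `first_acceptable_catalog` are hypothetical names, not drizzlepac functions, and the GSC242 and 2MASS numbers in the example are made up. Only the GAIAeDR3 entry (7 matches, 'rshift') comes from item 4 of the report.

```python
# Hypothetical rule table: step -> (minimum matches, required fit geometry),
# copied from the settings suggested above.
FIT_RULES = {
    "FIT-REL": (20, "rshift"),
    "FIT-REL-cat": (10, "rscale"),
    "FIT-SVM": (10, "rscale"),
    "FIT-SVM-cat": (10, "rscale"),
}


def fit_is_acceptable(step, nmatches, fitgeom):
    """A fit counts only if it has enough matches AND the required geometry."""
    min_matches, required_geom = FIT_RULES[step]
    return nmatches >= min_matches and fitgeom == required_geom


def first_acceptable_catalog(step, catalog_results):
    """Walk the ordered catalog list and return the first acceptable fit.

    catalog_results is an ordered list of (catalog_name, nmatches, fitgeom).
    Returns None when no catalog qualifies, i.e. keep the 'a priori' WCS.
    """
    for name, nmatches, fitgeom in catalog_results:
        if fit_is_acceptable(step, nmatches, fitgeom):
            return name
    return None


results = [
    ("GAIAeDR3", 7, "rshift"),   # the case from item 4: too few matches
    ("GSC242", 15, "rscale"),    # made-up numbers for illustration
    ("2MASS", 30, "rscale"),     # made-up numbers for illustration
]
print(first_acceptable_catalog("FIT-SVM-cat", results))  # GSC242
```

Because 'general' never appears as a required geometry in the table, a 'general' fit is rejected at every step, which matches the note above.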

Thanks for looking into the keyword values. For the relative fit, the values need to be unique for each input FLT, and they should differ from the results for the catalog fit. I'm hoping this is just a simple fix to not overwrite the keywords, rather than a total refactoring. Let me know.

@mackjenn

In the above, I left out the case where there is no successful FIT-REL solution, in which each image is fit to the catalog separately (rather than the group of images):

FIT-IMG-Cat : 'a posteriori' WCS matched to a reference catalog, where 'IMG' implies each FLT is separately aligned to the reference catalog

In this case, use the same logic, e.g. N>=10 matches with 'rscale' else 'No Fit' and keep 'a priori'


Attached is a table from MAST HAP documentation I wrote for the user community, in case it's helpful. It's slightly different than what is in readthedocs.

Screenshot 2024-09-13 at 1 54 03 PM

@s-goldman
Collaborator

Thanks for this @mackjenn.

So I think moving forward, there are three required changes of increasing complexity.

  1. Changes to the config files.
  2. Investigating why all of the files in a visit have the same relative fit results.
  3. Changing the way drizzlepac identifies and passes the information of the best fit result (relative or catalog), and using the header keywords (fitgeom, nmatch, etc.) only for catalog fit results.

This will take some time, but I'll try to make it a priority.

@stscijgbot-hstdp
Collaborator Author

Comment by Steve Goldman on JIRA:

After much searching, I've found that the various fit statistics are saved to the imglist on line 919 of align_utils.py, specifically within the call to tweakwcs.imalign(). It's still unclear why, during a relative fit, each of the FLT images in an association has the exact same fit statistics in its output header.

@stscijgbot-hstdp
Collaborator Author

Comment by Steve Goldman on JIRA:

Alright, so I've confirmed that having the same nmatches, fit_RA, and fit_DEC in all of the image headers in an association is the correct behavior.

 

The way that the fitting works is that the catalogs for all of the images are combined and then that combined catalog is fit to the reference catalog to determine the fit errors. My understanding is that the nmatches is the number of matches from the reference catalog to any catalog source in any of the images. 

There are, however, some fit metrics in imalign.py for individual images (below). We may be able to determine other fit metrics using the residuals array. 

Screenshot 2024-09-18 at 4.00.13 PM.png
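
As a rough illustration of why a combined fit yields one shared RMS while a residuals array could still yield per-image values, here is a minimal sketch. It does not use the tweakwcs API: the `(image_name, residual)` input layout, the helper names, and the residual values are all hypothetical.

```python
import math
from collections import defaultdict


def rms(values):
    """Root-mean-square of a sequence of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def per_image_rms(residuals):
    """Group residuals by image and compute one RMS per image.

    residuals is an iterable of (image_name, residual_mas) pairs; this
    layout is an assumption made for the sketch.
    """
    grouped = defaultdict(list)
    for image, resid in residuals:
        grouped[image].append(resid)
    return {image: rms(vals) for image, vals in grouped.items()}


# Made-up residuals (mas) for two images fit together as one group.
residuals = [("img_a", 3.0), ("img_a", -4.0), ("img_b", 6.0), ("img_b", 8.0)]

# One number for the whole group: what every header currently receives.
group_rms = rms([resid for _, resid in residuals])

# One number per image: what the "rel" keywords would need.
image_rms = per_image_rms(residuals)
print(round(group_rms, 3), {k: round(v, 3) for k, v in image_rms.items()})
```

The group value mixes the residuals of every image, so it is identical for all headers, while the grouped values differ from image to image.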
