Link mosviz data as a batch after they are loaded #762

Merged (24 commits) on Sep 7, 2021

javerbukh
Contributor

@javerbukh javerbukh commented Aug 5, 2021

Description

Adds an option at application initialization to turn off automatic linking of data as it is loaded into the application. Mosviz uses this option in this PR to link data as a batch after it has been loaded into the table. Only the data within a row are linked, which leads to problems with subsets. However, data loading is significantly faster, and load time no longer grows exponentially with the number of data sets loaded.

Depends on:

Follow-up after merge:

Fixes #430

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

  • Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
  • Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
  • Do the proposed changes follow the STScI Style Guides?
  • Are tests added/updated as required? If so, do they follow the STScI Style Guides?
  • Are docs added/updated as required? If so, do they follow the STScI Style Guides?
  • Did the CI pass? If not, are the failures related?
  • Is a change log needed? If yes, is it added to CHANGES.rst?
  • Is a milestone set? Milestone is only currently required for PRs related to Imviz MVP.
  • After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

@github-actions github-actions bot added the mosviz label Aug 5, 2021
@javerbukh
Contributor Author

A couple of errors show up with this implementation:

Long traceback after a subset is added in the specviz viewer and another row is selected

---------------------------------------------------------------------------
IncompatibleAttribute                     Traceback (most recent call last)
~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/data.py in get_mask(self, subset_state, view)
   1384         try:
-> 1385             return subset_state.to_mask(self, view=view)
   1386         except IncompatibleAttribute:

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/subset.py in to_mask(self, data, view)
    772     def to_mask(self, data, view=None):
--> 773         x = data[self.att, view]
    774         result = (x >= self.lo) & (x <= self.hi)

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/data.py in __getitem__(self, key)
    570 
--> 571         return self.get_data(key, view=view)
    572 

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/data.py in get_data(self, cid, view)
   1360         else:
-> 1361             raise IncompatibleAttribute(cid)
   1362 

IncompatibleAttribute: World 0

During handling of the above exception, another exception occurred:

IncompatibleAttribute                     Traceback (most recent call last)
~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/ipyvue/VueTemplateWidget.py in _handle_event(self, _, content, buffers)
     55                 getattr(self, 'vue_' + event)(data, buffers)
     56             else:
---> 57                 getattr(self, 'vue_' + event)(data)
     58 
     59 

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/table/viewer.py in vue_on_row_clicked(self, index)
     96 
     97     def vue_on_row_clicked(self, index):
---> 98         self.highlighted = index
     99 
    100 

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/traitlets/traitlets.py in __set__(self, obj, value)
    602             raise TraitError('The "%s" trait is read-only.' % self.name)
    603         else:
--> 604             self.set(obj, value)
    605 
    606     def _validate(self, obj, value):

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/traitlets/traitlets.py in set(self, obj, value)
    591             # we explicitly compare silent to True just in case the equality
    592             # comparison above returns something other than True/False
--> 593             obj._notify_trait(self.name, old_value, new_value)
    594 
    595     def __set__(self, obj, value):

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/traitlets/traitlets.py in _notify_trait(self, name, old_value, new_value)
   1215 
   1216     def _notify_trait(self, name, old_value, new_value):
-> 1217         self.notify_change(Bunch(
   1218             name=name,
   1219             old=old_value,

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/ipywidgets/widgets/widget.py in notify_change(self, change)
    604                 # Send new state to front-end
    605                 self.send_state(key=name)
--> 606         super(Widget, self).notify_change(change)
    607 
    608     def __repr__(self):

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/traitlets/traitlets.py in notify_change(self, change)
   1225     def notify_change(self, change):
   1226         """Notify observers of a change event"""
-> 1227         return self._notify_observers(change)
   1228 
   1229     def _notify_observers(self, event):

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/traitlets/traitlets.py in _notify_observers(self, event)
   1262                 c = getattr(self, c.name)
   1263 
-> 1264             c(event)
   1265 
   1266     def _add_notifiers(self, handler, name, type):

~/Documents/jdaviz_dev/jdaviz/jdaviz/configs/mosviz/plugins/viewers.py in _on_row_selected(self, event)
    140                     add_data_to_viewer_message = AddDataToViewerMessage(
    141                         'spectrum-viewer', selected_data, sender=self)
--> 142                     self.session.hub.broadcast(add_data_to_viewer_message)
    143 
    144                     self._selected_data['spectrum-viewer'] = selected_data

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/hub.py in broadcast(self, message)
    213             logging.getLogger(__name__).info("Broadcasting %s", message)
    214             for subscriber, handler in self._find_handlers(message):
--> 215                 handler(message)
    216 
    217     def __getstate__(self):

~/Documents/jdaviz_dev/jdaviz/jdaviz/app.py in <lambda>(msg)
    175 
    176         self.hub.subscribe(self, AddDataToViewerMessage,
--> 177                            handler=lambda msg: self.add_data_to_viewer(
    178                                msg.viewer_reference, msg.data_label))
    179 

~/Documents/jdaviz_dev/jdaviz/jdaviz/app.py in add_data_to_viewer(self, viewer_reference, data_path, clear_other_data, ext)
    679         if data_id is not None:
    680             data_ids.append(data_id)
--> 681             self._update_selected_data_items(viewer_item['id'], data_ids)
    682         else:
    683             raise ValueError(

~/Documents/jdaviz_dev/jdaviz/jdaviz/app.py in _update_selected_data_items(self, viewer_id, selected_items)
    994                                               viewer_id=viewer_id,
    995                                               sender=self)
--> 996             self.hub.broadcast(add_data_message)
    997 
    998         # Remove any deselected data objects from viewer

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/hub.py in broadcast(self, message)
    213             logging.getLogger(__name__).info("Broadcasting %s", message)
    214             for subscriber, handler in self._find_handlers(message):
--> 215                 handler(message)
    216 
    217     def __getstate__(self):

~/Documents/jdaviz_dev/jdaviz/jdaviz/configs/specviz/plugins/unit_conversion/unit_conversion.py in _on_viewer_data_changed(self, msg)
     75             return
     76 
---> 77         self._viewer_data = self.app.get_data_from_viewer('spectrum-viewer')
     78 
     79         self.dc_items = [data.label

~/Documents/jdaviz_dev/jdaviz/jdaviz/app.py in get_data_from_viewer(self, viewer_reference, data_label, cls, include_subsets)
    452                         if cls is not None:
    453                             handler, _ = data_translator.get_handler_for(cls)
--> 454                             layer_data = handler.to_object(layer_data,
    455                                                            statistic=statistic)
    456 

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_astronomy/translators/spectrum1d.py in to_object(self, data_or_subset, attribute, statistic)
    142             return data_kwargs
    143 
--> 144         data_kwargs = parse_attributes(
    145             [attribute] if not hasattr(attribute, '__len__') else attribute)
    146 

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_astronomy/translators/spectrum1d.py in parse_attributes(attributes)
    109                     mask = None
    110                 else:
--> 111                     mask = data.get_mask(subset_state=subset_state)
    112                     mask = ~mask
    113 

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/data.py in get_mask(self, subset_state, view)
   1385             return subset_state.to_mask(self, view=view)
   1386         except IncompatibleAttribute:
-> 1387             return get_mask_with_key_joins(self, self._key_joins, subset_state, view=view)
   1388 
   1389     def __setitem__(self, key, value):

~/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/joins.py in get_mask_with_key_joins(data, key_joins, subset_state, view)
    103                             "contain a single component.")
    104 
--> 105     raise IncompatibleAttribute

IncompatibleAttribute: 

ERROR:tornado.application:Exception in callback functools.partial(<function debounced.<locals>.wrapped.<locals>.execute.<locals>.debounced_execute at 0x7fcd723d0d30>)
Traceback (most recent call last):
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
    ret = callback()
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/utils.py", line 134, in debounced_execute
    f(*args, **kwargs)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/bqplot/image/frb_mark.py", line 42, in debounced_update
    return self.update(self, *args, **kwargs)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/bqplot/image/frb_mark.py", line 68, in update
    image = self.array_maker(bounds=bounds)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/viewers/image/layer_artist.py", line 245, in __call__
    mask = self.layer_state.get_sliced_data(bounds=bounds)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/viewers/image/state.py", line 455, in get_sliced_data
    image = self.layer.data.compute_fixed_resolution_buffer(full_view, target_data=self.viewer_state.reference_data,
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/data.py", line 1945, in compute_fixed_resolution_buffer
    return compute_fixed_resolution_buffer(self, *args, **kwargs)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/fixed_resolution_buffer.py", line 197, in compute_fixed_resolution_buffer
    translated_coord = np.round(unbroadcast(translated_coord)).astype(int)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/utils/array.py", line 26, in unbroadcast
    if array.ndim == 0 or not hasattr(array, 'strides'):
AttributeError: 'NoneType' object has no attribute 'ndim'
ERROR:tornado.application:Exception in callback functools.partial(<function debounced.<locals>.wrapped.<locals>.execute.<locals>.debounced_execute at 0x7fcd723d0d30>)
Traceback (most recent call last):
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
    ret = callback()
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/utils.py", line 134, in debounced_execute
    f(*args, **kwargs)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/bqplot/image/frb_mark.py", line 42, in debounced_update
    return self.update(self, *args, **kwargs)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue_jupyter/bqplot/image/frb_mark.py", line 68, in update
    image = self.array_maker(bounds=bounds)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/viewers/image/layer_artist.py", line 245, in __call__
    mask = self.layer_state.get_sliced_data(bounds=bounds)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/viewers/image/state.py", line 455, in get_sliced_data
    image = self.layer.data.compute_fixed_resolution_buffer(full_view, target_data=self.viewer_state.reference_data,
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/data.py", line 1945, in compute_fixed_resolution_buffer
    return compute_fixed_resolution_buffer(self, *args, **kwargs)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/core/fixed_resolution_buffer.py", line 197, in compute_fixed_resolution_buffer
    translated_coord = np.round(unbroadcast(translated_coord)).astype(int)
  File "/Users/javerbukh/opt/anaconda3/envs/glue_1.1/lib/python3.8/site-packages/glue/utils/array.py", line 26, in unbroadcast
    if array.ndim == 0 or not hasattr(array, 'strides'):
AttributeError: 'NoneType' object has no attribute 'ndim'

Subset highlight on the spectrum also does not appear.
[Screenshot: Screen Shot 2021-08-05 at 2 16 38 PM]

jdaviz/app.py (outdated, resolved)
for index in range(0, len(mos_data.get_component('1D Spectra').data)):
    spec_1d = mos_data.get_component('1D Spectra').data[index]
    spec_2d = mos_data.get_component('2D Spectra').data[index]
    app.session.data_collection.add_link(LinkSame(spec_1d, spec_2d))
Contributor

Is there a need to check if that link already exists before actually adding it?

Contributor Author

I'm not sure, I can look into how to do that though. I think we can assume that someone using Mosviz will have auto_link set to False since it speeds up data loading considerably, but you know what they say about assumptions...
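
One way to answer the duplicate-link question above is to key each link on its unordered pair of endpoints and skip pairs that have already been seen. The sketch below is a minimal illustration under that assumption. ToyLinkCollection and its add_link() are hypothetical stand-ins, not glue's DataCollection, and whether glue already deduplicates links is not established in this thread.

```python
class ToyLinkCollection:
    """Hypothetical stand-in for a link collection; not glue's API."""

    def __init__(self):
        self._links = set()

    def add_link(self, a, b):
        """Add an undirected link between a and b unless it already exists.

        Returns True if a new link was stored, False if it was a duplicate.
        """
        key = frozenset((a, b))  # order-independent key for the pair
        if key in self._links:
            return False
        self._links.add(key)
        return True

    def __len__(self):
        return len(self._links)


links = ToyLinkCollection()
added_first = links.add_link("spec_1d[0]", "spec_2d[0]")
added_again = links.add_link("spec_2d[0]", "spec_1d[0]")  # same pair, reversed
```

With a check like this, calling the linking step twice for the same row would leave the number of stored links unchanged.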

@codecov

codecov bot commented Aug 5, 2021

Codecov Report

Merging #762 (592b3f6) into main (d0c2ce0) will increase coverage by 0.06%.
The diff coverage is 89.18%.

@@            Coverage Diff             @@
##             main     #762      +/-   ##
==========================================
+ Coverage   67.20%   67.26%   +0.06%     
==========================================
  Files          65       65              
  Lines        4619     4650      +31     
==========================================
+ Hits         3104     3128      +24     
- Misses       1515     1522       +7     
Impacted Files                                         Coverage Δ
jdaviz/configs/specviz/plugins/viewers.py              60.75% <60.00%> (-0.19%) ⬇️
jdaviz/app.py                                          82.98% <75.00%> (-0.20%) ⬇️
jdaviz/configs/mosviz/helper.py                        70.64% <100.00%> (+1.21%) ⬆️
jdaviz/configs/mosviz/plugins/parsers.py               72.80% <100.00%> (+1.11%) ⬆️
...specviz/plugins/unit_conversion/unit_conversion.py  73.64% <100.00%> (ø)
...onfigs/mosviz/plugins/slit_overlay/slit_overlay.py  51.35% <0.00%> (-4.54%) ⬇️
jdaviz/configs/imviz/plugins/parsers.py                100.00% <0.00%> (+1.55%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@pllim
Contributor

pllim commented Aug 6, 2021

@javerbukh said @ojustino is taking over this one and will push a fix for the subset bug.

@ojustino
Contributor

ojustino commented Aug 6, 2021

As a progress report, interestingly enough, I still see exponentially increasing load times. Additionally, attempting to selectively link data (either as done here or using methods I've tried locally) has thus far gotten me slower load times than using main's version of load_data().

I'll continue testing in the meantime.

@ojustino
Contributor

ojustino commented Aug 9, 2021

I had assumed that the previous commits disabled auto-linking in Mosviz by default. I can confirm that setting auto_link=False does improve load times, and that it also leads to the errors seen earlier with subset selection.

I can load 20 spectra with cutouts in the time it takes to load 15 with main's version of Mosviz, about 25 seconds. Load times begin to balloon after that -- 30 spectra + cutouts take me 76 seconds to load, but it's still better than on main, where times balloon in the 15-20 spectra + cutout range.

I tested a possible solution to the subset selection error in link_data_in_table() from the Mosviz version of parsers.py that I discussed last week with @javerbukh and @duytnguyendtn. It tries to take advantage of implicit linking in glue by adding a link in the loop between the current and previous iterations' 1D world component IDs.
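
To make the chaining idea concrete, here is a rough sketch of why linking each row only to the previous one could be enough for every row to be reachable from every other. It uses a toy union-find, not glue's link manager, and the helper names (find, union, chain_links) are made up for illustration. It models reachability only and says nothing about how glue evaluates subsets across such a chain.

```python
def find(parent, x):
    """Return the representative of x's group, shortening paths as it goes."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def chain_links(n_rows):
    """Return the (previous, current) row pairs produced by chaining."""
    return [(i - 1, i) for i in range(1, n_rows)]


n_rows = 20
links = chain_links(n_rows)
parent = list(range(n_rows))
for a, b in links:
    union(parent, a, b)

n_chain = len(links)                      # links added by chaining
n_all_pairs = n_rows * (n_rows - 1) // 2  # links if every pair were linked
all_connected = len({find(parent, i) for i in range(n_rows)}) == 1
```

In this toy model, chaining needs n_rows - 1 links rather than one per pair of rows, while still leaving all rows in a single connected group.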

Unfortunately, it still leads to similar results as before -- no highlighted region and the following (different) traceback:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/ipyvue/VueTemplateWidget.py in _handle_event(self, _, content, buffers)
     55                 getattr(self, 'vue_' + event)(data, buffers)
     56             else:
---> 57                 getattr(self, 'vue_' + event)(data)
     58 
     59 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue_jupyter/table/viewer.py in vue_on_row_clicked(self, index)
     96 
     97     def vue_on_row_clicked(self, index):
---> 98         self.highlighted = index
     99 
    100 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/traitlets/traitlets.py in __set__(self, obj, value)
    602             raise TraitError('The "%s" trait is read-only.' % self.name)
    603         else:
--> 604             self.set(obj, value)
    605 
    606     def _validate(self, obj, value):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/traitlets/traitlets.py in set(self, obj, value)
    591             # we explicitly compare silent to True just in case the equality
    592             # comparison above returns something other than True/False
--> 593             obj._notify_trait(self.name, old_value, new_value)
    594 
    595     def __set__(self, obj, value):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/traitlets/traitlets.py in _notify_trait(self, name, old_value, new_value)
   1215 
   1216     def _notify_trait(self, name, old_value, new_value):
-> 1217         self.notify_change(Bunch(
   1218             name=name,
   1219             old=old_value,

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/ipywidgets/widgets/widget.py in notify_change(self, change)
    604                 # Send new state to front-end
    605                 self.send_state(key=name)
--> 606         super(Widget, self).notify_change(change)
    607 
    608     def __repr__(self):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/traitlets/traitlets.py in notify_change(self, change)
   1225     def notify_change(self, change):
   1226         """Notify observers of a change event"""
-> 1227         return self._notify_observers(change)
   1228 
   1229     def _notify_observers(self, event):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/traitlets/traitlets.py in _notify_observers(self, event)
   1262                 c = getattr(self, c.name)
   1263 
-> 1264             c(event)
   1265 
   1266     def _add_notifiers(self, handler, name, type):

~/repositories/jdaviz/jdaviz/configs/mosviz/plugins/viewers.py in _on_row_selected(self, event)
    164                         remove_data_from_viewer_message = RemoveDataFromViewerMessage(
    165                             'image-viewer', prev_data, sender=self)
--> 166                         self.session.hub.broadcast(remove_data_from_viewer_message)
    167 
    168                     add_data_to_viewer_message = AddDataToViewerMessage(

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/hub.py in broadcast(self, message)
    213             logging.getLogger(__name__).info("Broadcasting %s", message)
    214             for subscriber, handler in self._find_handlers(message):
--> 215                 handler(message)
    216 
    217     def __getstate__(self):

~/repositories/jdaviz/jdaviz/app.py in <lambda>(msg)
    179 
    180         self.hub.subscribe(self, RemoveDataFromViewerMessage,
--> 181                            handler=lambda msg: self.remove_data_from_viewer(
    182                                msg.viewer_reference, msg.data_label))
    183 

~/repositories/jdaviz/jdaviz/app.py in remove_data_from_viewer(self, viewer_reference, data_path, ext)
    723             selected_items.remove(data_id)
    724 
--> 725             self._update_selected_data_items(
    726                 viewer_item['id'], selected_items)
    727 

~/repositories/jdaviz/jdaviz/app.py in _update_selected_data_items(self, viewer_id, selected_items)
   1004         for data in viewer_data:
   1005             if data.label not in active_data_labels:
-> 1006                 viewer.remove_data(data)
   1007                 remove_data_message = RemoveDataMessage(data, viewer,
   1008                                                         viewer_id=viewer_id,

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/viewers/common/viewer.py in remove_data(self, data)
    227                 else:
    228                     if layer_state.layer.data is data:
--> 229                         self.state.layers.remove(layer_state)
    230 
    231     def get_data_layer_artist(self, layer=None, layer_state=None):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/echo/core.py in __exit__(self, *args)
    536 
    537         for p, args in notifications:
--> 538             p.notify(*args)
    539 
    540 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/echo/core.py in notify(self, instance, old, new)
    123             return
    124         for cback in self._callbacks.get(instance, []):
--> 125             cback(new)
    126         for cback in self._2arg_callbacks.get(instance, []):
    127             cback(old, new)

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/viewers/common/viewer.py in _sync_layer_artist_container(self, *args)
    181         for layer_artist in self._layer_artist_container:
    182             if layer_artist.layer not in layer_states:
--> 183                 self._layer_artist_container.remove(layer_artist)
    184 
    185     def warn(self, message, *args, **kwargs):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/layer_artist.py in remove(self, artist)
    280         if artist in self.artists:
    281             self.artists.remove(artist)
--> 282             artist.remove()
    283             self._notify()
    284 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue_jupyter/bqplot/image/layer_artist.py in remove(self)
    188     def remove(self):
    189         super(BqplotImageSubsetLayerArtist, self).remove()
--> 190         self.image_artist.invalidate_cache()
    191         ARRAY_CACHE.pop(self.state.uuid, None)
    192         PIXEL_CACHE.pop(self.state.uuid, None)

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue_jupyter/bqplot/image/frb_mark.py in invalidate_cache(self)
     76 
     77     def invalidate_cache(self):
---> 78         self.update()

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue_jupyter/bqplot/image/frb_mark.py in update(self, *args, **kwargs)
     66 
     67         # Get the array and assign it to the artist
---> 68         image = self.array_maker(bounds=bounds)
     69         if image is not None:
     70             with self.hold_sync():

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/viewers/image/layer_artist.py in __call__(self, bounds)
    243 
    244         try:
--> 245             mask = self.layer_state.get_sliced_data(bounds=bounds)
    246         except IncompatibleAttribute:
    247             self.layer_artist.disable_incompatible_subset()

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/viewers/image/state.py in get_sliced_data(self, view, bounds)
    453                                                                target_cid=self.attribute, broadcast=False, cache_id=self.uuid)
    454         else:
--> 455             image = self.layer.data.compute_fixed_resolution_buffer(full_view, target_data=self.viewer_state.reference_data,
    456                                                                     subset_state=self.layer.subset_state, broadcast=False, cache_id=self.uuid)
    457 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/data.py in compute_fixed_resolution_buffer(self, *args, **kwargs)
   1943     def compute_fixed_resolution_buffer(self, *args, **kwargs):
   1944         from .fixed_resolution_buffer import compute_fixed_resolution_buffer
-> 1945         return compute_fixed_resolution_buffer(self, *args, **kwargs)
   1946 
   1947     # DEPRECATED

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/fixed_resolution_buffer.py in compute_fixed_resolution_buffer(data, bounds, target_data, target_cid, subset_state, broadcast, cache_id)
    266             invalid_value = -np.inf
    267         else:
--> 268             array = data.get_mask(subset_state, view=translated_coords)
    269             invalid_value = False
    270 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/data.py in get_mask(self, subset_state, view)
   1383     def get_mask(self, subset_state, view=None):
   1384         try:
-> 1385             return subset_state.to_mask(self, view=view)
   1386         except IncompatibleAttribute:
   1387             return get_mask_with_key_joins(self, self._key_joins, subset_state, view=view)

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/subset.py in to_mask(self, data, view)
    771     @contract(data='isinstance(Data)', view='array_view')
    772     def to_mask(self, data, view=None):
--> 773         x = data[self.att, view]
    774         result = (x >= self.lo) & (x <= self.hi)
    775         return result

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/data.py in __getitem__(self, key)
    569                 raise IncompatibleAttribute(_k)
    570 
--> 571         return self.get_data(key, view=view)
    572 
    573     def _ipython_key_completions_(self):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/data.py in get_data(self, cid, view)
   1362 
   1363         if view is not None:
-> 1364             result = comp[view]
   1365         else:
   1366             result = comp.data

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/component.py in __getitem__(self, key)
    196 
    197     def __getitem__(self, key):
--> 198         return self._link.compute(self._data, key)
    199 
    200 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/component_link.py in compute(self, data, view)
    164 
    165         # First we get the values of all the 'from' components.
--> 166         args = [data[join_component_view(f, view)] for f in self._from]
    167 
    168         # We keep track of the original shape of the arguments

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/component_link.py in <listcomp>(.0)
    164 
    165         # First we get the values of all the 'from' components.
--> 166         args = [data[join_component_view(f, view)] for f in self._from]
    167 
    168         # We keep track of the original shape of the arguments

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/data.py in __getitem__(self, key)
    569                 raise IncompatibleAttribute(_k)
    570 
--> 571         return self.get_data(key, view=view)
    572 
    573     def _ipython_key_completions_(self):

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/data.py in get_data(self, cid, view)
   1362 
   1363         if view is not None:
-> 1364             result = comp[view]
   1365         else:
   1366             result = comp.data

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/component.py in __getitem__(self, key)
    196 
    197     def __getitem__(self, key):
--> 198         return self._link.compute(self._data, key)
    199 
    200 

~/opt/anaconda3/envs/jdaviz-dev4/lib/python3.8/site-packages/glue/core/component_link.py in compute(self, data, view)
    167 
    168         # We keep track of the original shape of the arguments
--> 169         original_shape = args[0].shape
    170         logger.debug("shape of first argument: %s", original_shape)
    171 

IndexError: list index out of range

Also, these extra links noticeably slow load_data() compared to the current state of the PR. 20 spectra + images go from 25 to 33 seconds and 30 spectra + images go from 76 to 119 seconds. That's still faster than main's Mosviz, but I'm guessing it's not satisfactory.
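
As a rough reading of these timings, one can ask what power-law exponent would connect the 20-item and 30-item measurements. This is only a back-of-the-envelope sketch: two points cannot distinguish a power law from exponential growth, and power_law_exponent is a hypothetical helper, not part of jdaviz.

```python
import math


def power_law_exponent(n1, t1, n2, t2):
    """Exponent k such that t2 / t1 == (n2 / n1) ** k."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Timings quoted above, in seconds, for 20 and 30 spectra + images.
k_without_extra_links = power_law_exponent(20, 25.0, 30, 76.0)
k_with_extra_links = power_law_exponent(20, 33.0, 30, 119.0)
print(f"{k_without_extra_links:.2f} {k_with_extra_links:.2f}")
```

Both exponents come out between roughly 2.7 and 3.2, so under this assumption load time grows much faster than linearly in either case.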

@astrofrog astrofrog mentioned this pull request Aug 12, 2021
1 task

wc_spec_1d = app.session.data_collection[spec_1d].world_component_ids
wc_spec_2d = app.session.data_collection[spec_2d].world_component_ids
app.session.data_collection.add_link(LinkSame(wc_spec_1d[0], wc_spec_2d[0]))
Collaborator

This is a better linking approach than what is in main at the moment, but you can then get significantly better performance for this step if you also use the context manager I used in #782 - could you test this out?

Collaborator

Now that #782 is merged, you can wrap the whole for loop in the delay context manager.

Contributor

OK, I see it now. Using the delay context manager here speeds up link_table_data() on its own by close to an order of magnitude with 20 spectra + images. That gain is similar to what we found with a different approach: appending the LinkSame object from each iteration of this loop to a list, then calling add_link() on that list after the loop.

Combining these approaches is even faster. Are any of these strategies better or worse from a glue perspective?

Collaborator

Combining the two approaches makes sense from a glue perspective!
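
As a rough illustration of why combining the two strategies pays off, here is a minimal, self-contained sketch. It does not use glue; `ToyLinkCollection`, `delay_updates()` and `rebuild_count` are hypothetical names invented for this example, standing in for a collection that redoes some expensive bookkeeping whenever its links change.

```python
from contextlib import contextmanager


class ToyLinkCollection:
    """Hypothetical stand-in for a collection that recomputes derived
    state every time its set of links changes (not the glue API)."""

    def __init__(self):
        self.links = []
        self.rebuild_count = 0  # how many times the expensive step ran
        self._delayed = False

    def _rebuild(self):
        # Placeholder for the expensive link bookkeeping.
        self.rebuild_count += 1

    def add_link(self, links):
        # Accept either a single link or a list of links.
        if not isinstance(links, list):
            links = [links]
        self.links.extend(links)
        if not self._delayed:
            self._rebuild()

    @contextmanager
    def delay_updates(self):
        # Suppress rebuilds inside the block, then rebuild once on exit.
        self._delayed = True
        try:
            yield
        finally:
            self._delayed = False
            self._rebuild()


# One (1D spectrum, 2D spectrum) pair per table row; names are made up.
rows = [(f"spec1d_{i}", f"spec2d_{i}") for i in range(20)]

# Naive approach: one add_link() call, and one rebuild, per row.
naive = ToyLinkCollection()
for pair in rows:
    naive.add_link(pair)

# Combined approach: collect the links in a list, then add them in a
# single call inside the delay block.
combined = ToyLinkCollection()
with combined.delay_updates():
    combined.add_link(list(rows))

print(naive.rebuild_count, combined.rebuild_count)
```

In this toy model the naive loop triggers one rebuild per row, while the combined version triggers a single rebuild when the block exits. How closely that mirrors the real cost inside glue is an assumption, not something measured here.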

@astrofrog
Collaborator

astrofrog commented Aug 12, 2021

See my comment above: if you use the context manager from #782, it should load in seconds. Maybe we could try to get #782 in first, and then this PR could be rebased to include those changes? (We'll need to wait for a glue-core release first, though.)

@ojustino
Contributor

The latest commit uses the delay link manager and other adjustments to make further speed enhancements to link_table_data().

It works for me, but a similar traceback to the one I included in my Aug 9 comment still persists on the creation of a subset. Whatever is happening also now prevents the image and spectra from changing even after I click a new table row.

@astrofrog
Collaborator

Investigating!

@astrofrog
Collaborator

I found an issue which I've fixed in javerbukh#8 - let me know if there are still issues after this

@ojustino
Contributor

Pulling javerbukh#8 resolves the issue for me in MosvizExample.ipynb, though I notice that 1D spectra don't remain highlighted over a subset's wavelength range unless I select the table row in which the subset was created. In main, the 1D spectrum remains highlighted in the subset's wavelength range regardless of which table row is selected.

However, the pull request does not fix the error I reported earlier when I use a notebook from Patrick that loads newer simulated data for NIRSpec. If it's a problem with the data, I'm not sure why that would affect glue.

@javerbukh I'd be interested to see how using this pull request affects your experience with the jdaviz examples and with loading the newer NIRSpec data alongside my cutouts.

@javerbukh javerbukh requested review from rosteen and pllim August 27, 2021 19:50
Collaborator

@duytnguyendtn duytnguyendtn left a comment

Works wonderfully! Anecdotally, both our NIRISS and sample datasets load SO much faster. Great work to all involved!

There is one small code comment I'd like to get in here that I feel particularly strongly about, but I'm happy to approve if you all wouldn't mind my pedantic nature!

jdaviz/configs/mosviz/helper.py (resolved)
jdaviz/configs/mosviz/plugins/parsers.py (outdated, resolved)
@javerbukh
Contributor Author

@ojustino I noticed that when loading NIRISS data, the image viewer is a black screen. Do you see this same interaction? If so, could it be because we are not linking the image data?

@rosteen
Collaborator

rosteen commented Aug 31, 2021

@javerbukh Did you check your layer scaling? The NIRISS data was almost black upon initial load for me other than a couple stars, had to rescale to see much.

@ojustino
Contributor

I won't give a formal review since this is partially my ticket, but I'll add the following:

  1. The subset problem still persists and is more debilitating on this branch than on main. If we push ahead with a merge here, I think it merits opening an issue. The issue should use a pre-762 commit as a baseline for comparison and we can update it when @astrofrog gets a chance to look at it with the new NIRSpec data.
  2. I'm not 100% sure, but reading the current state of Tom's linking documentation makes me think that we should not be linking index 0 of the image to index 0 of the spectra since the 1D and 2D spectra's x axes are both wavelengths while the viewer is in pixel space. However, once the 2D spectrum viewer is working properly, should its y axis be linked to either axis on the image viewer?

@javerbukh I do see the "Image canucs F150W" layer in the image viewer. The color is stretched such that the background is black instead of gray, but I can pick out a few of the brighter sources.

@javerbukh
Contributor Author

@rosteen That was the problem, thank you!

@ojustino I don't think I can give an answer on this so let's bring it up at tomorrow's tag up and see what the team says.

Collaborator

@duytnguyendtn duytnguyendtn left a comment

After catching up on the conversation for this one, I can agree with @ojustino that selecting subsets in the spectral viewers is still broken. While I am nervous about this, crunch time is starting to settle in, and I can be convinced of filing a bug ticket and addressing this afterwards. I'm also debating whether to hold this off because this one is an improvement over an improvement and we could do without it, but I suspect if we don't address this now, this can get quickly forgotten. I'm going to approve this for now, should another dev agree to move forward with the bug ticket (particularly since we have #733 in progress as well)

EDIT: I should also add, I tested Cubeviz, Imviz, and Specviz, all of which seem to be unaffected
EDIT2: Clarifying point, I'm still seeing this subset issue with the NIRISS dataset

@pllim
Contributor

pllim commented Sep 2, 2021

So, I guess there are still some unresolved issues w.r.t. this PR, as I can see in #762 (comment) . We should not merge until we have a clear path forward with this (either fix it here or defer as future work). Also my comment about the commented code still stands.

@astrofrog
Collaborator

I am now investigating issues related to the subset selection.

Collaborator

@astrofrog astrofrog left a comment

A couple of comments below - the main one being that MosViz(auto_link=False) doesn't work.

@ojustino - I looked into the notebook you sent and it looks like the subset issue is related to the fact that the 2D spectrum WCS is incorrect:

Number of WCS axes: 3
CTYPE : 'RA---TAN'  'DEC--TAN'  'WAVE'  
CRVAL : 215.0466793383765  52.90137035377627  2.4999999999999998e-06  
CRPIX : -0.00020768428328117  -7.3732734554624e-07  1024.0  
PC1_1 PC1_2 PC1_3  : -1.0  0.0  0.0  
PC2_1 PC2_2 PC2_3  : 0.0  1.0  0.0  
PC3_1 PC3_2 PC3_3  : 1.0  0.0  0.0  
CDELT : 2.96691831370797e-05  1.05359563567715e-07  6.72e-12  
NAXIS : 436  25  1

The PC matrix is not correct: the PC?_3 elements are all zero, which means that the WAVE axis isn't connected with the first pixel axis of the data array. This in turn causes all kinds of problems in the linking. I think this is a byproduct of setting up the fake 3D WCS in the parsers, which would be solved by moving away from SpectralCube as done in #733. So I think the best approach might be to go ahead and merge this; I can then rebase #733 and make sure the subsets work with the two standard notebooks and the one you sent.

I'm happy to approve this once the minor issue with the initializer option is fixed.
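
The zero-column observation about the PC matrix can be checked by hand with a few lines of plain Python. This is only a sketch over the values copied from the printout above. The helper names `unused_pixel_axes` and `pixel_axes_for_world` are made up for this example, and it leaves aside the difference between FITS axis order and array axis order.

```python
# PC matrix copied from the WCS printout; pc[i][j] holds PC(i+1)_(j+1),
# so each row is a world axis and each column a pixel axis (FITS order).
pc = [
    [-1.0, 0.0, 0.0],  # RA---TAN
    [0.0, 1.0, 0.0],   # DEC--TAN
    [1.0, 0.0, 0.0],   # WAVE
]


def unused_pixel_axes(matrix):
    """Return the 1-based pixel axes whose column is entirely zero."""
    n_pix = len(matrix[0])
    return [j + 1 for j in range(n_pix)
            if all(row[j] == 0.0 for row in matrix)]


def pixel_axes_for_world(matrix, world_axis):
    """Return the 1-based pixel axes with a nonzero entry in the row
    of the given 1-based world axis."""
    row = matrix[world_axis - 1]
    return [j + 1 for j, value in enumerate(row) if value != 0.0]


print(unused_pixel_axes(pc))         # pixel axes that feed no world axis
print(pixel_axes_for_world(pc, 3))   # pixel axes that feed WAVE
print(pixel_axes_for_world(pc, 1))   # pixel axes that feed RA
```

With these values, FITS pixel axis 3 contributes to no world axis, while pixel axis 1 feeds both RA and WAVE.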

@@ -148,6 +149,10 @@ def __init__(self, configuration=None, *args, **kwargs):
        # Parse the yaml configuration file used to compose the front-end UI
        self.load_configuration(configuration)

        # If true, link data on load. If false, do not link data to speed up
        # data loading
        self.auto_link = kwargs.pop('auto_link', True)
Collaborator

This doesn't actually work properly if auto_link is passed to the initializer: the auto_link option is first passed to the parent class in the super().__init__ call, where it is not recognized. For this to work, it has to come before the super() call, I think.
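
The ordering concern in this comment can be reproduced with a toy class hierarchy. The classes below (`Parent`, `PopAfterSuper`, `PopBeforeSuper`) are hypothetical and only assume a parent whose `__init__` rejects keyword arguments it does not know; the real parent class in jdaviz may behave differently.

```python
class Parent:
    """Hypothetical parent that accepts only the arguments it knows."""

    def __init__(self, configuration=None):
        self.configuration = configuration


class PopAfterSuper(Parent):
    def __init__(self, *args, **kwargs):
        # auto_link is still in kwargs here, so Parent.__init__ raises
        # TypeError before the pop below is ever reached.
        super().__init__(*args, **kwargs)
        self.auto_link = kwargs.pop('auto_link', True)


class PopBeforeSuper(Parent):
    def __init__(self, *args, **kwargs):
        # Remove the option first so the parent never sees it.
        self.auto_link = kwargs.pop('auto_link', True)
        super().__init__(*args, **kwargs)


try:
    PopAfterSuper(auto_link=False)
except TypeError as exc:
    print("pop after super() fails:", exc)

app = PopBeforeSuper(auto_link=False)
print("pop before super() works, auto_link =", app.auto_link)
```

Popping the option before calling `super().__init__()` keeps the subclass-only keyword from reaching a parent that would reject it, which is the ordering suggested in the comment.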

data_obj : obj
Input for Mosviz data parsers.
"""
super().load_data(data_obj, parser_reference="mosviz-link-data")
Collaborator

Just out of interest, what is the motivation for doing the linking as a parser as opposed to just a method?

Contributor Author

I think it was to keep all related code in one file and have the linking method be callable from the helper file, which (if I remember correctly) can only be done using parser_reference.

jdaviz/configs/mosviz/plugins/parsers.py (outdated, resolved)
Collaborator

@astrofrog astrofrog left a comment

Happy to approve this as @ojustino explained to me about the auto_link argument on Slack.

@rosteen
Collaborator

rosteen commented Sep 7, 2021

This has two approvals, but before I hit the merge button I want to make sure I'm clear about the status of the subset bug(s?) mentioned. I see a bug using this PR that I don't see on main: when using the x-range tool in the 1D viewer, if I then drag the selected region around to move the gray region selection box, it creates a bunch of new subsets instead of applying the moved selection to the same subset. Is that what you think will be fixed by #733 @astrofrog ? Or was there another subset bug being discussed separate from the one I described?

@astrofrog
Collaborator

I think any subset issue here can be ignored in favour of dealing with it in #733, as they are likely all link related.

@rosteen
Collaborator

rosteen commented Sep 7, 2021

I committed @pllim 's code suggestions to remove commented-out image linking code, since it seems the consensus is to not link the images here. I'll merge this once the CI passes again...this just became relevant for something else I'm working on so I want to get it in.

Labels: bug, mosviz, performance, Ready for final review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NIRISS Parser: Improve speed of data loading
6 participants