V2.1 #307

Draft
wants to merge 49 commits into
base: master

@okennedy (Contributor) commented Jan 28, 2024

Significant User-Facing Changes

  • Workflows can now be exported as 'Scripts' for use in other workflows (Export workflows as functions #294)
    • TODO: Update Documentation
  • Vizier now supports plugin modules. Pass a path to a plugin jar with --plugin, or list paths as a ':'-delimited list in the vizier-plugins property of the config file. (Factor Sedona out into a plugin #288)
    • TODO: Update Documentation

Papercuts Fixed

- Moving documentation menu to the left of the separator.
- Merging documentation links into the help menu instead of a separate docs area.
Although it *is* really nice to be able to write code without explicitly
calling .render on ScalaTag tags, this also means that the point where
the tag is rendered is (i) chosen semi-automatically, and (ii) not
immediately visible to someone reading the code.

This is dangerous because each call to .render creates an entirely new,
fully independent DOM node, so saving the tag and saving the DOM node have
very different semantics: DOM operations on a tag instantiate a new node
to act on, while DOM operations on the DOM node act on the node (as
probably intended).  In particular, moving to explicit rendering avoids
the following bug:

```
object Foo
{
  val root = div("foo")
}
object Main
{
  document.body.appendChild(Foo.root)
  Foo.root.replaceInnerText("bar") // does not change the document.
}
```

This could be avoided while keeping the implicits by adding an explicit type annotation:
```
  val root: dom.Node = div("foo")
```
... but defaulting to sensible mechanics is preferable.
- Fixed stale routes file path in build.sc
- Added VizierScriptModule placeholder type info into build_routes.sc
- Discovered and fixed a bug where lazy scalikejdbc type fields get lazy-loaded and break Vizier.
- Conformant names in Script/ScriptRevision schemas
- Debug: Prepared statements now trace the statement
- ExecutionContext can now run scripts.
- Rejiggered ExecutionContext to be safer when run on nested scripts
- Python now initializes available artifacts properly
- Python's vizierdb now has a run_script method
- Implemented endpoints for creating/listing/etc... scripts, along with relevant datatypes.
- Implemented a prototype (if somewhat unskinned) script editor.
- For RxBuffer, I needed a way to access the buffer as a whole (e.g., to recompute dependencies); Added an asVar to it.
- Dropped implicit .render of ScalaTags.
- Moving lazy variable initialization to a CatalogDB initializer
- Cleanup and document Vizier init sequence.
- Notify user on schema migrations
- Automatically drop documentation modules from scripts.
- CSS polish for ScriptEditor
- Polishing ScriptEditor interface
- Feedback on save
- Save automatically reassigns the URL from project/branch -> script if necessary.
- Added a widget to put all of the URL futzing code in one place.
- List scripts now only returns names, not content
- Added a tombstone marker to scripts, to avoid deleting dependencies.
- Added a delete script endpoint
- Scripts (if available) are now present on the landing page.
- Code reorg: Moving landing page content into its own root component.
- Graphical glitch: UnloadDataset now properly displays its parameters in rows.
- Futures seem to get discarded when they fall out of scope??  Unclear on why, but saving the future to a variable and *then* calling onComplete seems to work, whereas inlining does not.

Publishing cleanup
- Parameter validation logging now produces a sensible error message when a list parameter gets a non-object as an input.
- Script execution now properly decodes list parameters
- Python now correctly places artifacts output by a script into scope
…ing a zero

Under some circumstances, JSON may decode floats as integers.  When this happens, Python type assertions start complaining that the data value is unsafe for re-serialization when it gets echoed back to the server.

This fix generalizes the Python type assertions so that an int may be passed in where a float is expected.
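
As a rough illustration of the generalized check (a sketch only: `assert_float` is a hypothetical helper name, not the actual function in Vizier's Python client), an int can be accepted and widened wherever a float is expected:

```python
import json


def assert_float(value):
    """Hypothetical check: accept ints wherever a float is expected.

    bool is rejected explicitly because it is a subclass of int in Python.
    """
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(
            f"expected a float-compatible number, got {type(value).__name__}"
        )
    return float(value)


# A float serialized without a fractional part comes back from JSON as an int.
decoded = json.loads('{"x": 3}')["x"]

# The relaxed assertion accepts it and normalizes it to a float.
checked = assert_float(decoded)
```

Widening to `float` at the check keeps the value safe to echo back to the server; whether the real fix widens or only relaxes the assertion is not stated in the commit message.
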
If run_script overwrites a dataset that has already been read by the
python client, then subsequent access to the dataset will only see the
initially cached version.  This update correctly invalidates the cache.
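
The stale-cache problem and its fix can be sketched as follows. This is a toy model under stated assumptions: `CachingClient`, its `backend` dict, and the `run_script(outputs)` signature are all hypothetical stand-ins, not the real vizierdb client API.

```python
class CachingClient:
    """Hypothetical stand-in for a client that caches datasets by name."""

    def __init__(self, backend):
        self.backend = backend  # dict standing in for the server's datasets
        self.cache = {}         # locally cached copies, keyed by dataset name

    def get_dataset(self, name):
        # Fetch from the backend only on the first read; reuse the cache after.
        if name not in self.cache:
            self.cache[name] = list(self.backend[name])
        return self.cache[name]

    def run_script(self, outputs):
        """Pretend to run a script that writes `outputs` (name -> rows)."""
        for name, rows in outputs.items():
            self.backend[name] = rows
            # Invalidate any stale cached copy so the next read refetches.
            self.cache.pop(name, None)


backend = {"sales": [1, 2, 3]}
client = CachingClient(backend)

before = client.get_dataset("sales")     # populates the cache
client.run_script({"sales": [10, 20]})   # overwrites the dataset
after = client.get_dataset("sales")      # sees the new version
```

Without the `self.cache.pop(...)` line, the second read would return the copy cached before the script ran, which is the behavior the commit describes.
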
- JSON parse errors during Python Channel negotiations are now reported properly.
- SparkPrimitive now translates json parse errors into more informative error messages.
Note: It might be useful to add a 'no-op' re-run feature that
instantiates a new version of the workflow with specific cell results
dropped, but that retains references to past modules.
Figures need to be explicitly closed after being shown.  Once the figure
is rendered and output, explicitly close the figure to avoid resource
leaks.
@okennedy added this to the Version 2.1 milestone Jan 28, 2024
- Added a 'Plugin' manager object that dynamically loads a jar, parses out plugin metadata, and in
- Added --plugin/-P command-line option to load plugins at startup
- Added a vizier-plugins option to the Vizier properties file to load plugins at startup
- Fixed typo in capitalization of Class__L__oader in ClassLoaderUtils
- No more crash if no plugins are listed
- Support loading plugins relative to the CWD
- Cleaning up how plugins interact with the classloader.  The result
  seems to make plugin-loaded Spark UDFs/UDTs happy.
- `import` or `run` may need `Vizier.urls` to be set, so we now initialize these in the pre-command step.
- `import` now properly exits after loading.
- Python no longer pretends to crash when something writes to stdout.
- We now support general UDTs via plugins.
- No more crash when trying to serialize +/- Infinity
- Python now copes with +/- Infinity more gracefully
- sql.query now has a show_output parameter (if false, no database is shown)
- sql.query now has all of its parameter constants encoded as statics.
- CodeModuleEditor now properly displays 'show output' parameter.
- Made a [note](http://localhost:5050/project.html?project=1) about CodeModuleEditor's kludgy parameter allow-list
- BooleanParameter now correctly respects the 'default' option.
- vizierdb.create_file() now relays exceptions
- Geoplot now nicely renders images.
- Fixed typo in Get Vizier Artifact python code snippet
- Temporarily removed buggy Get Dataset Dataframe code snippet
- CheckpointDataset now saves the checkpointed dataset with the correct name
- CheckpointDataset now uses the correct file path to access data
- CSV export no longer mangles the names of Geometry or BinaryTypes.
okennedy added 20 commits April 27, 2024 18:07
- Sample now sets a default seed value
- Sampled dataframes can now be properly decoded
- Settings panel no longer freaks out if pyenv is not installed
- scripts/bootstrap.sh now builds the latest
…edSequence

In the years since we've been working on Vizier, the Spark folks came up with AttachDistributedSequence.  This does the same thing as AnnotateWithSequenceNumber, and it does it waaaaaaay more efficiently and reliably.  This patch tentatively replaces the core functionality of AnnotateWithSequenceNumber with AttachDistributedSequence.
- Sort filesystem files in load file UI
- Correctly handle 'AttachDistributedSequence' operator in RowIds
- More automation and better documentation in release scripts
- Safely handle Primitive encoding when Arrays are encoded as UnsafeArrayData
- Add a 'Records Span Multiple Lines' load parameter to the load JSON UI
- Sort artifacts alphabetically before listing them in the artifact picker widget
- Create Project text entry box now creates on 'Enter' key (closes #316)
- Upgrading to Scala 2.12.20 / ScalaJS 1.16.0
- Updating license to 2024 (license V3)
- ScalaJS is getting pissy about GlobalExecutionContext's unfairness.
  I suppose we should look at this at some point, but not today.
  For now, disabling the warning.
- A few bits of cleanup to clear warnings (e.g., toIndexedSeq on
  JavaArrays)
- ScalaJS 1.16.0 does some slightly different magic with
  dom.window.location.search; We need to explicitly check for the case
  where it's null, and the case where it's the empty string when
  looking up arguments in ui.Vizier.scala
- Fixing ui test cases
Fixed a dumb off-by-one error in TentativeEdits that was inserting newly
created modules one position earlier than expected (when looking for
the preceding cell, it accidentally called .prev twice).
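
The effect of the extra step back can be shown with a language-neutral sketch. It uses a plain Python list rather than the actual linked WorkflowElement structure, and `insert_at` and its `buggy` flag are hypothetical names for illustration only.

```python
def insert_at(cells, position, new_cell, buggy=False):
    """Insert new_cell so that it should end up at index `position`.

    Mirrors the described logic: find the preceding cell, then insert
    immediately after it.
    """
    prev_index = position - 1  # one step back: the preceding cell
    if buggy:
        prev_index -= 1        # the spurious second step back
    result = list(cells)
    result.insert(prev_index + 1, new_cell)
    return result


cells = ["a", "b", "c", "d"]
correct = insert_at(cells, 2, "X")            # X lands at index 2
off_by_one = insert_at(cells, 2, "X", buggy=True)  # X lands at index 1
```

In the buggy variant the new cell appears one slot above where it belongs, matching the symptom described for non-tail insertions.
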

This issue was obscured by the fact that the typical code path for
module insertions is the one that replaces a TentativeModule (and
modules appended to the workflow were immune as well).

Ultimately, the issue affected any module that was inserted at a
non-tail position without replacing a Tentative module:
- Spreadsheets (created by clicking a button, not via TentativeModule)
- Modules created by another client of the same workflow

Specific changes include
- Removed a spurious .prev that would cause certain asynchronous module
  insertions to appear one cell above where they actually should have
  been until a reload.
- Better documentation for how TentativeEdits.onInsertOne works.
- Test cases for TentativeEdits
- Documenting why WorkflowElement.safePrev and .safeNext are 'safe'
- Reset the spreadsheet if the user hits the 'back' button on a
  spreadsheet.  This (i) closes the spreadsheet socket, (ii) cancels
  any cells being edited, and (iii) reverts to standard display mode
- Increase the spreadsheet socket outgoing message buffer size.  100k slots should be enough to keep up with most scrolling through the spreadsheet.

The *correct* way to fix this is to change SpreadsheetExecutor/SingleRowExecutor to support notifying on full-row updates, rather than on a cell-by-cell basis (which it does right now).  This shouldn't be too hard, since SingleRowExecutor operates at the granularity of rows anyway... but I want to have the time to do it right (which is not the case right now).