feat/transformers #83

Open
wants to merge 2 commits into
base: dev
Conversation

JarbasAl
Member

@JarbasAl JarbasAl commented Apr 20, 2024

closes #82

In the regular mycroft.conf, add the transformer configs as usual, but this time under a "hivemind" section (the config file may be shared with OVOS, with different plugins enabled in each).

"hivemind": {
  "utterance_transformers": {
     "ovos-xxx": {}
  },
  "metadata_transformers": {
     "ovos-xxx": {}
  },
  "dialog_transformers": {
     "ovos-xxx": {}
  }
}
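
To make the layout above concrete, here is a minimal sketch of how the three sections could be read out of a loaded config. `transformer_sections` and `enabled_plugins` are hypothetical helpers, not part of hivemind-core, and "ovos-xxx" is the same placeholder plugin name used above; the only behavior taken from the PR is that a plugin is skipped when its config sets "active" to false.

```python
import json

# Sample config mirroring the snippet above; "ovos-xxx" is a placeholder plugin name.
SAMPLE_CONF = json.loads("""
{
  "hivemind": {
    "utterance_transformers": {"ovos-xxx": {}},
    "metadata_transformers": {"ovos-xxx": {"active": false}},
    "dialog_transformers": {}
  }
}
""")

SECTIONS = ("utterance_transformers", "metadata_transformers", "dialog_transformers")


def transformer_sections(conf: dict) -> dict:
    """Return the per-type plugin configs found under the "hivemind" key.

    Hypothetical helper: missing sections default to an empty dict.
    """
    hivemind = conf.get("hivemind") or {}
    return {name: hivemind.get(name) or {} for name in SECTIONS}


def enabled_plugins(section: dict) -> list:
    """List plugin names whose config does not set "active" to false."""
    return [name for name, cfg in section.items() if cfg.get("active", True)]


sections = transformer_sections(SAMPLE_CONF)
```

Keeping everything under one "hivemind" key is what lets the same file carry a separate, top-level set of transformer sections for OVOS.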

relevant plugins:

This PR brings transformer functionality directly to hivemind, so the plugins can also be used with minds that are not ovos-core.

Summary by CodeRabbit

  • New Features

    • Enhanced message handling with new transformer services for utterances, dialogs, and metadata.
    • Improved transformation capabilities with added methods for handling and managing plugins.
  • Dependencies

    • Updated dependencies: ovos-bus-client (v0.0.6), ovos-plugin-manager, and HiveMind_presence (v0.0.2a3).

@JarbasAl JarbasAl added the enhancement New feature or request label Apr 20, 2024
@JarbasAl
Member Author

@coderabbitai review


coderabbitai bot commented Jul 16, 2024

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


coderabbitai bot commented Jul 16, 2024

Walkthrough

The recent updates extend hivemind-core with transformer plugins for utterances, metadata, and dialogs. These plugins transform messages before they are processed further; per the linked issue, this covers use cases such as mapping users from a chat service to an internal user_id. The changes include new service classes for handling transformations and modifications to the handle_bus_message method to incorporate them.

Changes

File | Change Summary
hivemind_core/protocol.py | Added methods and attributes to integrate utterance, metadata, and dialog transformer plugins.
hivemind_core/transformers.py | Introduced DialogTransformersService, UtteranceTransformersService, and MetadataTransformersService to manage transformations.
requirements.txt | Updated to include ovos-bus-client version 0.0.6, ovos-plugin-manager, and HiveMind_presence version 0.0.2a3.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant HiveMindListenerProtocol
  participant UtteranceTransformersService
  participant DialogTransformersService
  participant MetadataTransformersService

  Client->>HiveMindListenerProtocol: Send bus message
  HiveMindListenerProtocol->>UtteranceTransformersService: Transform utterance
  UtteranceTransformersService-->>HiveMindListenerProtocol: Transformed utterance
  HiveMindListenerProtocol->>DialogTransformersService: Transform dialog
  DialogTransformersService-->>HiveMindListenerProtocol: Transformed dialog
  HiveMindListenerProtocol->>MetadataTransformersService: Transform metadata
  MetadataTransformersService-->>HiveMindListenerProtocol: Transformed metadata
  HiveMindListenerProtocol-->>Client: Processed message

Assessment against linked issues

Objective Addressed Explanation
Allow the usage of utterance/metadata transformers in hivemind-core before forwarding the utterance
Integrate transformer plugins to map users from a chat service to an internal user_id

Poem

In the realm of code so bright,
Transformers join the night,
Utterance, dialog, metadata sway,
Enhancing messages on their way.
Rabbit hops with glee,
For changes bring new harmony. 🌟



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Outside diff range, codebase verification and nitpick comments (1)
hivemind_core/protocol.py (1)

Line range hint 261-504: Review the integration of transformer services in HiveMindListenerProtocol.

The integration of UtteranceTransformersService, MetadataTransformersService, and DialogTransformersService into the HiveMindListenerProtocol is a significant enhancement. Consider the following:

  • Error Handling: Ensure that errors in the transformer services do not halt the entire message handling process.
  • Performance: The transformation process should be optimized to avoid delays in message processing.
  • Security: Carefully manage the flow of sensitive data through the transformers to prevent leakage.

The implementation is robust but could benefit from additional error handling and performance optimizations.

-            except:
+            except SpecificException as e:  # Replace with more specific exceptions
-        utterances, message.context = self.utterance_plugins.transform(utterances, message.context)
+        try:
+            utterances, message.context = self.utterance_plugins.transform(utterances, message.context)
+        except TransformerException as e:
+            LOG.error(f"Error in utterance transformation: {e}")
+            # Handle error appropriately
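
A minimal sketch of the error-isolation idea in the suggestion above, assuming a transformer is any callable that takes `(utterances, context)` and returns the same pair. `safe_transform`, `broken`, and `shout` are illustrative names, not hivemind-core APIs. On failure the original inputs are passed through unchanged so message handling can continue.

```python
import logging
from typing import Callable, List, Tuple

LOG = logging.getLogger("transformers-sketch")

Transformer = Callable[[List[str], dict], Tuple[List[str], dict]]


def safe_transform(transformer: Transformer, utterances: List[str],
                   context: dict) -> Tuple[List[str], dict]:
    """Run one transformer; fall back to the untouched inputs if it raises."""
    try:
        return transformer(utterances, context)
    except Exception as e:  # broad on purpose: plugins are third-party code
        LOG.error(f"Error in utterance transformation: {e}")
        return utterances, context


def broken(utterances, context):
    """Stand-in plugin that always fails."""
    raise RuntimeError("plugin crashed")


def shout(utterances, context):
    """Stand-in plugin that upper-cases every utterance."""
    return [u.upper() for u in utterances], context
```

Whether a narrower exception type is practical depends on what the plugins can raise; since they are loaded dynamically, a broad catch with logging may be the more realistic choice here.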
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0eff030 and 35973c0.

Files selected for processing (3)
  • hivemind_core/protocol.py (10 hunks)
  • hivemind_core/transformers.py (1 hunks)
  • requirements.txt (1 hunks)
Additional context used
Ruff
hivemind_core/transformers.py

41-41: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


132-135: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


134-134: Do not use bare except

(E722)


188-191: Use contextlib.suppress(Exception) instead of try-except-pass

Replace with contextlib.suppress(Exception)

(SIM105)


190-190: Do not use bare except

(E722)
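
For reference, the SIM105 rewrite that Ruff suggests would look roughly like the sketch below. `FlakyPlugin` and `shutdown_all` are hypothetical stand-ins for the loaded transformer plugins and the services' `shutdown` methods.

```python
import contextlib


class FlakyPlugin:
    """Stand-in plugin whose shutdown may fail."""

    def __init__(self, fail: bool):
        self.fail = fail
        self.stopped = False

    def shutdown(self):
        if self.fail:
            raise RuntimeError("shutdown failed")
        self.stopped = True


def shutdown_all(plugins) -> None:
    """Shut down every plugin, ignoring individual failures."""
    for plugin in plugins:
        # Replaces try/except/pass and makes the intent explicit.
        with contextlib.suppress(Exception):
            plugin.shutdown()


plugins = [FlakyPlugin(fail=True), FlakyPlugin(fail=False)]
shutdown_all(plugins)
```

Unlike a bare `except:`, `contextlib.suppress(Exception)` does not swallow `KeyboardInterrupt` or `SystemExit`, which also addresses E722.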

Additional comments not posted (1)
requirements.txt (1)

5-6: Verify dependency versions for compatibility.

The addition of ovos-bus-client and ovos-plugin-manager, and the update of HiveMind_presence are crucial for the new functionality. It's important to ensure these versions are compatible with each other and the existing project dependencies.
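
One way to check what is actually installed before testing compatibility is the standard-library `importlib.metadata` module. This is only a sketch: `installed_version` is a hypothetical helper, and the package names are taken from the review summary rather than verified against requirements.txt.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Package names as listed in the review summary.
for name in ("ovos-bus-client", "ovos-plugin-manager", "HiveMind_presence"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```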

Comment on lines +13 to +91
class DialogTransformersService:
    """ transform dialogs before being sent to TTS """

    def __init__(self, bus, config=None):
        self.loaded_plugins = {}
        self.has_loaded = False
        self.bus = bus
        # to activate a plugin, just add an entry to mycroft.conf for it
        self.config = config or Configuration().get("dialog_transformers", {})
        self.load_plugins()

    @property
    def blacklisted_skills(self):
        # dialog should NEVER be rewritten if it comes from these skills
        return self.config.get("blacklisted_skills",
                               ["skill-ovos-icanhazdadjokes.openvoiceos"]  # blacklist jokes by default
                               )

    def load_plugins(self):
        for plug_name, plug in find_dialog_transformer_plugins().items():
            if plug_name in self.config:
                # if disabled skip it
                if not self.config[plug_name].get("active", True):
                    continue
                try:
                    self.loaded_plugins[plug_name] = plug(config=self.config[plug_name])
                    self.loaded_plugins[plug_name].bind(self.bus)
                    LOG.info(f"loaded audio transformer plugin: {plug_name}")
                except Exception as e:
                    LOG.exception(f"Failed to load dialog transformer plugin: "
                                  f"{plug_name}")
        self.has_loaded = True

    @property
    def plugins(self) -> list:
        """
        Return loaded transformers in priority order, such that modules with a
        higher `priority` rank are called first and changes from lower ranked
        transformers are applied last.

        A plugin of `priority` 1 will override any existing context keys and
        will be the last to modify `audio_data`
        """
        return sorted(self.loaded_plugins.values(),
                      key=lambda k: k.priority, reverse=True)

    def shutdown(self):
        """
        Shutdown all loaded plugins
        """
        for module in self.plugins:
            try:
                module.shutdown()
            except Exception as e:
                LOG.warning(e)

    def transform(self, dialog: str, context: dict = None, sess: Session = None) -> Tuple[str, dict]:
        """
        Get transformed audio and context for the preceding audio
        @param dialog: str to be spoken
        @return: transformed dialog to be sent to TTS
        """

        # TODO property not yet introduced in Session
        sess = sess or SessionManager.get()
        # if isinstance(sess, dict):
        #    sess = Session.deserialize(sess)
        # active_transformers = sess.dialog_transformers or self.plugins

        active_transformers = self.plugins

        for module in active_transformers:
            try:
                LOG.debug(f"checking dialog transformer: {module}")
                dialog, context = module.transform(dialog, context=context)
                LOG.debug(f"{module.name}: {dialog}")
            except Exception as e:
                LOG.exception(e)
        return dialog, context
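
To illustrate the ordering described in the `plugins` docstring, here is a small sketch with stand-in plugins. `FakePlugin` is hypothetical (real plugins are discovered through ovos-plugin-manager); it only mimics the `priority` attribute and `transform` shape used above. Plugins with a higher numeric `priority` run first, so the lowest-numbered plugin has the final say.

```python
from typing import Tuple


class FakePlugin:
    """Hypothetical stand-in exposing the `priority`/`transform` shape used above."""

    def __init__(self, name: str, priority: int, suffix: str):
        self.name = name
        self.priority = priority
        self.suffix = suffix

    def transform(self, dialog: str, context: dict = None) -> Tuple[str, dict]:
        context = dict(context or {})
        context["last_plugin"] = self.name  # later plugins overwrite this key
        return dialog + self.suffix, context


loaded = {"late": FakePlugin("late", 1, "!"), "early": FakePlugin("early", 50, "?")}

# Same sort as the `plugins` property: highest priority value first.
ordered = sorted(loaded.values(), key=lambda p: p.priority, reverse=True)

dialog, context = "hello", {}
for plugin in ordered:
    dialog, context = plugin.transform(dialog, context=context)
```

Here "early" (priority 50) appends its suffix first and "late" (priority 1) runs last, so the final context reflects the lowest-priority plugin.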
Review the implementation of DialogTransformersService.

The class is well-structured with methods for loading plugins, transforming dialog, and shutting down. However, there are several points to consider:

  • Error Handling: The exception handling in the load_plugins and transform methods could be more specific. Catching general exceptions is not a best practice.
  • Logging: The use of logging is appropriate, but ensure sensitive data is not logged.
  • Performance: The sorting of plugins every time they are accessed might impact performance if the list is large. Consider caching the sorted list.

Overall, the implementation aligns with the objectives but could be refined to improve error handling and performance.

-                except Exception as e:
+                except SpecificException as e:  # Replace with more specific exceptions
-        return sorted(self.loaded_plugins.values(),
+        if not hasattr(self, '_sorted_plugins'):
+            self._sorted_plugins = sorted(self.loaded_plugins.values(),
+                      key=lambda k: k.priority, reverse=True)
+        return self._sorted_plugins
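
One caveat with a `hasattr`-style cache is that it would go stale if plugins were loaded again later. A minimal sketch that invalidates the cache whenever the plugin set changes, using hypothetical names (`PluginRegistry`, `register`) rather than the PR's classes:

```python
from types import SimpleNamespace


class PluginRegistry:
    """Hypothetical holder that caches the priority-sorted plugin list."""

    def __init__(self):
        self.loaded_plugins = {}
        self._sorted_plugins = None  # None means "needs re-sorting"

    def register(self, name, plugin):
        self.loaded_plugins[name] = plugin
        self._sorted_plugins = None  # invalidate on every change

    @property
    def plugins(self) -> list:
        if self._sorted_plugins is None:
            self._sorted_plugins = sorted(self.loaded_plugins.values(),
                                          key=lambda k: k.priority, reverse=True)
        return self._sorted_plugins


registry = PluginRegistry()
registry.register("a", SimpleNamespace(name="a", priority=10))
first = registry.plugins
registry.register("b", SimpleNamespace(name="b", priority=99))
second = registry.plugins
```

With only a handful of plugins the sort is cheap either way, so the cache is mostly worthwhile if `plugins` is read on every message.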
Comment on lines +94 to +148
class UtteranceTransformersService:

    def __init__(self, bus, config=None):
        self.config_core = config or {}
        self.loaded_plugins = {}
        self.has_loaded = False
        self.bus = bus
        self.config = self.config_core.get("utterance_transformers") or {}
        self.load_plugins()

    def load_plugins(self):
        for plug_name, plug in find_utterance_transformer_plugins().items():
            if plug_name in self.config:
                # if disabled skip it
                if not self.config[plug_name].get("active", True):
                    continue
                try:
                    self.loaded_plugins[plug_name] = plug()
                    LOG.info(f"loaded utterance transformer plugin: {plug_name}")
                except Exception as e:
                    LOG.error(e)
                    LOG.exception(f"Failed to load utterance transformer plugin: {plug_name}")

    @property
    def plugins(self):
        """
        Return loaded transformers in priority order, such that modules with a
        higher `priority` rank are called first and changes from lower ranked
        transformers are applied last

        A plugin of `priority` 1 will override any existing context keys and
        will be the last to modify utterances`
        """
        return sorted(self.loaded_plugins.values(),
                      key=lambda k: k.priority, reverse=True)

    def shutdown(self):
        for module in self.plugins:
            try:
                module.shutdown()
            except:
                pass

    def transform(self, utterances: List[str], context: Optional[dict] = None):
        context = context or {}

        for module in self.plugins:
            try:
                utterances, data = module.transform(utterances, context)
                _safe = {k: v for k, v in data.items() if k != "session"}  # no leaking TTS/STT creds in logs
                LOG.debug(f"{module.name}: {_safe}")
                context = merge_dict(context, data)
            except Exception as e:
                LOG.warning(f"{module.name} transform exception: {e}")
        return utterances, context
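
The `transform` loop above threads the utterances through every plugin and folds each plugin's returned data back into the context. A stripped-down sketch of that flow, with plain functions standing in for plugins and a shallow dict merge standing in for `merge_dict` (whose exact semantics are not shown in this PR); `run_chain`, `normalize`, and `tag_user` are illustrative names only:

```python
from typing import List, Optional, Tuple


def run_chain(plugins, utterances: List[str],
              context: Optional[dict] = None) -> Tuple[List[str], dict]:
    """Feed utterances through each plugin, folding returned data into context."""
    context = context or {}
    for plugin in plugins:
        utterances, data = plugin(utterances, context)
        # Shallow merge as a simplified stand-in for merge_dict.
        context = {**context, **data}
    return utterances, context


def normalize(utterances, context):
    """Stand-in plugin: trim and lower-case the utterances."""
    return [u.strip().lower() for u in utterances], {"normalized": True}


def tag_user(utterances, context):
    """Stand-in plugin: attach a made-up internal user id to the context."""
    return utterances, {"user_id": "user-1"}


utts, ctx = run_chain([normalize, tag_user], ["  Hello World "], {"lang": "en"})
```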
Review the implementation of UtteranceTransformersService.

This class follows a similar structure to DialogTransformersService. The comments for error handling, logging, and performance optimizations apply here as well. Additionally, consider the following:

  • Data Handling: The transformation logic should ensure that no data integrity issues occur, especially when merging dictionaries.
  • Security: Ensure that no sensitive data is leaked through logs or transformations.

Refinements in error handling and data security are recommended.

-                except Exception as e:
+                except SpecificException as e:  # Replace with more specific exceptions
-                _safe = {k: v for k, v in data.items() if k != "session"}
+                _safe = {k: v for k, v in data.items() if k not in ["session", "other_sensitive_key"]}
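
If more keys need to be kept out of the logs, the filter could be pulled into a small helper. This is a sketch only: `redact` and `SENSITIVE_KEYS` are hypothetical names, and "session" is the only key the PR itself filters.

```python
SENSITIVE_KEYS = frozenset({"session"})  # extend with other keys as needed


def redact(data: dict, sensitive=SENSITIVE_KEYS) -> dict:
    """Return a copy of `data` without sensitive top-level keys, for logging."""
    return {k: v for k, v in data.items() if k not in sensitive}


payload = {"session": {"tts_key": "secret"}, "lang": "en"}
safe = redact(payload)
```

Note that this only drops top-level keys; secrets nested under other keys would still reach the log.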
Comment on lines +151 to +204
class MetadataTransformersService:

    def __init__(self, bus, config=None):
        self.config_core = config or {}
        self.loaded_plugins = {}
        self.has_loaded = False
        self.bus = bus
        self.config = self.config_core.get("metadata_transformers") or {}
        self.load_plugins()

    def load_plugins(self):
        for plug_name, plug in find_metadata_transformer_plugins().items():
            if plug_name in self.config:
                # if disabled skip it
                if not self.config[plug_name].get("active", True):
                    continue
                try:
                    self.loaded_plugins[plug_name] = plug()
                    LOG.info(f"loaded metadata transformer plugin: {plug_name}")
                except Exception as e:
                    LOG.error(e)
                    LOG.exception(f"Failed to load metadata transformer plugin: {plug_name}")

    @property
    def plugins(self):
        """
        Return loaded transformers in priority order, such that modules with a
        higher `priority` rank are called first and changes from lower ranked
        transformers are applied last.

        A plugin of `priority` 1 will override any existing context keys
        """
        return sorted(self.loaded_plugins.values(),
                      key=lambda k: k.priority, reverse=True)

    def shutdown(self):
        for module in self.plugins:
            try:
                module.shutdown()
            except:
                pass

    def transform(self, context: Optional[dict] = None):
        context = context or {}

        for module in self.plugins:
            try:
                data = module.transform(context)
                _safe = {k: v for k, v in data.items() if k != "session"}  # no leaking TTS/STT creds in logs
                LOG.debug(f"{module.name}: {_safe}")
                context = merge_dict(context, data)
            except Exception as e:
                LOG.warning(f"{module.name} transform exception: {e}")
        return context
Review the implementation of MetadataTransformersService.

This class handles metadata transformations. The same recommendations for error handling, logging, and data handling apply. Additionally:

  • Modularity: Consider abstracting some of the repeated logic across the transformer services into a base class or utility functions to reduce code duplication and improve maintainability.

Propose a refactor to improve modularity and reduce code duplication.

+class BaseTransformerService:
+    def load_plugins(self, plugin_finder):
+        for plug_name, plug in plugin_finder().items():
+            if plug_name in self.config:
+                if not self.config[plug_name].get("active", True):
+                    continue
+                try:
+                    self.loaded_plugins[plug_name] = plug()
+                    LOG.info(f"loaded transformer plugin: {plug_name}")
+                except Exception as e:
+                    LOG.error(e)
+                    LOG.exception(f"Failed to load transformer plugin: {plug_name}")
+
-    def load_plugins(self):
-        for plug_name, plug in find_metadata_transformer_plugins().items():
-            ...

Committable suggestion was skipped due to low confidence.
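
A slightly fuller sketch of what such a base class might look like, under stated assumptions: `BaseTransformersService`, `MetadataService`, `DemoPlugin`, and `fake_finder` are all hypothetical, and the plugin finder is injected so the example runs without ovos-plugin-manager installed. It is not the refactor adopted in this PR.

```python
import logging
from typing import Callable, Dict

LOG = logging.getLogger("transformers-sketch")


class BaseTransformersService:
    """Hypothetical base class holding the logic the three services share."""

    config_key = ""  # e.g. "utterance_transformers"; set by subclasses

    def __init__(self, finder: Callable[[], Dict[str, type]], config: dict = None):
        self.config = (config or {}).get(self.config_key) or {}
        self.loaded_plugins = {}
        self.load_plugins(finder)

    def load_plugins(self, finder):
        for plug_name, plug in finder().items():
            cfg = self.config.get(plug_name)
            if cfg is None or not cfg.get("active", True):
                continue  # not configured, or explicitly disabled
            try:
                self.loaded_plugins[plug_name] = plug()
                LOG.info(f"loaded transformer plugin: {plug_name}")
            except Exception:
                LOG.exception(f"Failed to load transformer plugin: {plug_name}")

    @property
    def plugins(self) -> list:
        # Highest priority value first, as in the three services above.
        return sorted(self.loaded_plugins.values(),
                      key=lambda k: k.priority, reverse=True)


class MetadataService(BaseTransformersService):
    config_key = "metadata_transformers"


class DemoPlugin:
    priority = 5


def fake_finder():
    # Stand-in for find_metadata_transformer_plugins()
    return {"demo": DemoPlugin, "unconfigured": DemoPlugin}


service = MetadataService(fake_finder, {"metadata_transformers": {"demo": {}}})
```

Each concrete service would then only need to supply its config key, its finder, and its own `transform` signature, since those are the parts that genuinely differ.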

Development

Successfully merging this pull request may close these issues.

feature - transformer plugins
1 participant