Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Add z-score for the normalization processor #376

Open
sam-herman opened this issue Oct 2, 2023 · 7 comments
Open

[FEATURE] Add z-score for the normalization processor #376

sam-herman opened this issue Oct 2, 2023 · 7 comments
Assignees
Labels
Features Introduces a new unit of functionality that satisfies a requirement help wanted Extra attention is needed hybrid search

Comments

@sam-herman
Copy link

Is your feature request related to a problem?

Currently the normalization processor doesn't support z-score which is a popular technique and according in some instances produces superior results to min-max see blog

What solution would you like?

Allow to specify z-score as a normalization technique in the normalization processor

What alternatives have you considered?

Not much at the moment but suggestions are welcome.

Do you have any additional context?

see blog mentioned earlier

@heemin32 heemin32 added Features Introduces a new unit of functionality that satisfies a requirement and removed untriaged labels Oct 2, 2023
@heemin32 heemin32 assigned vamshin and unassigned vamshin Oct 2, 2023
@heemin32
Copy link
Collaborator

heemin32 commented Oct 2, 2023

@samuel-oci Thanks for the feature request. Also welcome to take it and work on it.

@sam-herman
Copy link
Author

Sounds good, I'll take it, feel free to assign to me.

@vamshin vamshin added help wanted Extra attention is needed neural-search labels Oct 3, 2023
@sam-herman
Copy link
Author

Hi @heemin32 @vamshin @navneet1v I don't see any IT that tests normalization and was thinking to add those also as part of this PR. If I missed those or you already have those worked on somewhere else just let me know and we can consolidate.

@navneet1v
Copy link
Collaborator

navneet1v commented Oct 10, 2023

@martin-gaievski can you please provide the reference for the integration tests that has been added.

I am able to find this: https://github.com/opensearch-project/neural-search/blob/main/src/test/java/org/opensearch/neuralsearch/query/HybridQueryIT.java#L32

@sam-herman
Copy link
Author

@navneet1v this one test hybrid but for some reason when I run it in my IDE to debug I can't see the normalization processor being triggered. Could also be something wrong with my setup.. if it is then feel free to ignore.

@navneet1v
Copy link
Collaborator

@navneet1v this one test hybrid but for some reason when I run it in my IDE to debug I can't see the normalization processor being triggered. Could also be something wrong with my setup.. if it is then feel free to ignore.

If you are using debugger it might not hit the code. I would recommend adding logs and check it. Debugger is tricky with IT tests, if you are just running ./gradlew integTest

@sam-herman
Copy link
Author

sam-herman commented Oct 10, 2023

@navneet1v I added the following lines build.gradle under IntegTest task and also created a corresponding log4j2-test.xml:

    systemProperty 'log4j2.configurationFile', "${projectDir}/src/test/resources/log4j2-test.xml"

    // Set this to true this if you want to see the logs in the terminal test output.
    // note: if left false the log output will still show in your IDE
    testLogging.showStandardStreams = true

I do see logs of IT test and other core open search but don't see any logs showing from the normalization processor or NormalizationProcessorWorkflow. which is why I later run with debugger to double check.
Is there a better way to add logging and making them to show in integTest task?

EDIT:
regarding debugger, I see similar problem to what is described here
https://forum.opensearch.org/t/debugging-test-cluster-provided-by-opensearchintegtestcase/13447/3
Let me know if there is a setup that works to attach debugger to test cluster that RestIntegTest spins?

Update:
regarding debugger thanks to @martin-gaievski I now able to get it to attach. Also updated the forum with the answer to make sure in the future folks can google it ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Features Introduces a new unit of functionality that satisfies a requirement help wanted Extra attention is needed hybrid search
Projects
None yet
Development

No branches or pull requests

5 participants