Error using some pretrained pipelines in Spark/PySpark 3.x #2738

Open
muhammetsnts opened this issue Apr 6, 2021 · 7 comments
Labels
bug models_hub pretrained models and pipelines

Comments

@muhammetsnts
Contributor

Description

When I try to use the match_chunks and match_datetime pretrained pipelines with Spark NLP 3.0.1, I get errors while downloading them.

Current Behavior

[Screenshots 1 and 2 (not preserved)]

Steps to Reproduce

  1. pipeline = PretrainedPipeline('match_chunks', lang='en')
  2. pipeline = PretrainedPipeline('match_datetime', lang='en')

Context

Your Environment

  • Spark NLP version sparknlp.version(): 3.0.1
  • Apache Spark version spark.version: 3.1.1
  • Java version java -version: openjdk version "1.8.0_282"

maziyarpanahi added bug models_hub pretrained models and pipelines labels Apr 6, 2021
maziyarpanahi changed the title from "Error using some pretrained pipelines" to "Error using some pretrained pipelines in Spark/PySpark 3.x" Apr 6, 2021

@maziyarpanahi
Member

@Digaari Please only report issues related to the public, open-source library. We are not responsible for anything else.

JohnSnowLabs deleted a comment from Digaari Apr 9, 2021

@Digaari
Contributor

Digaari commented Apr 12, 2021

I faced a similar issue with check_spelling_dl in the same environment.

@maziyarpanahi
Member

Any model/pipeline that fails with that specific error was not trained/saved in Spark 3.x/Scala 2.12; it needs to be trained/saved using Spark 3.x.

Please make a list and pass it to the Models Hub team so they can fix them.

@muhammetsnts
Contributor Author

@Digaari here is a list of pipelines that are not working on Spark 3.x:

  • clean_slang
  • check_spelling_dl
  • match_chunks
  • match_datetime
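
A list like this can be re-checked after each re-upload with a small script. The following is a minimal sketch, not part of the original report: `check_pipelines` and `sparknlp_loader` are hypothetical helper names, and the only Spark NLP calls assumed are `sparknlp.start()` and `PretrainedPipeline(name, lang=...)`, as used in the steps to reproduce.

```python
from typing import Callable, Dict, Iterable, Optional


def check_pipelines(
    names: Iterable[str],
    loader: Callable[[str, str], object],
    lang: str = "en",
) -> Dict[str, Optional[str]]:
    """Try to load each pipeline; map its name to None on success
    or to a short error string on failure."""
    results: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            loader(name, lang)
            results[name] = None
        except Exception as exc:  # keep going so every pipeline gets checked
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def sparknlp_loader(name: str, lang: str) -> object:
    """Hypothetical loader that downloads a pretrained pipeline with Spark NLP.

    Imports are done lazily so check_pipelines can be used (and tested)
    without Spark NLP installed.
    """
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    sparknlp.start()  # creates the Spark session, or reuses the existing one
    return PretrainedPipeline(name, lang=lang)


# Example usage (requires Spark NLP and network access):
# broken = ["clean_slang", "check_spelling_dl", "match_chunks", "match_datetime"]
# for name, error in check_pipelines(broken, sparknlp_loader).items():
#     print(name, "OK" if error is None else error)
```

Passing the loader as a parameter keeps the checking loop independent of Spark, so the same helper can be exercised against a stub before running it on a real cluster.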

@muhammetsnts
Contributor Author

@maziyarpanahi this issue is still open. We've tested these models and they are still broken. As you said, @josejuanmartinez can assign someone from the Models Hub team.

@maziyarpanahi
Member

Thanks @muhammetsnts

These pipelines need to be redone and re-uploaded using Apache Spark 3.x. Some models/pipelines need two copies: one for Spark 2.x and one for Spark 3.x.

So these work in Spark 2.x, but the 3.x copies are missing. Please assign a team member to save and upload these pipelines with the same version/metadata, but on Spark 3.x this time.

@cholojuanito

cholojuanito commented Apr 7, 2022

I'm still running into this issue with the following environment:

  • Spark NLP version sparknlp.version(): 3.4.2
  • Apache Spark version spark.version: 3.1.2
  • Java version java -version: openjdk version "11.0.14"
