different error arisen sometimes #22

Open
imElliottt opened this issue Oct 30, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@imElliottt

I ran the example notebook using one of the provided CVs. I tried to build a graph for the CV and the two job offers.

  1. Building the graph for the CV succeeds most of the time, but a TypeError is sometimes raised at
    itext2kg/irelations_extraction/irelations_extractor.py, line 96.
    'relationship' is supposed to be a dict, but it is sometimes a str.

  2. I tried to build the graph for the two job offers many times and succeeded only once.
    a. Same problem as above, in the relation extractor.
    b. It often fails when extracting entities for the second job offer text: /itext2kg/itext2kg.py, line 85, and
    /itext2kg/ientities_extraction/ientities_extractor.py, line 71.
    The parameter 'name' is supposed to be a str, but it is sometimes a list.

I think these problems are related to the model I'm using, llama3-8b, because the semantic blocks, entities and relations I get from this model are not always the same between runs. Still, I think these unexpected-type cases should be handled.
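
For the first case ('relationship' arriving as a str instead of a dict), here is a minimal sketch of a guard that could run on the model output before it reaches the extractor. It is not iText2KG code: `coerce_relationship` and `keep_valid_relationships` are hypothetical helpers, and the keys in the usage example are only illustrative. A string that parses as a JSON object is kept; anything else is dropped.

```python
import json
from typing import Any, Dict, List, Optional


def coerce_relationship(item: Any) -> Optional[Dict[str, Any]]:
    """Return item as a dict, or None if it cannot be read as one.

    Hypothetical helper, not part of iText2KG.
    """
    if isinstance(item, dict):
        return item
    if isinstance(item, str):
        # Small models sometimes emit a JSON object as a quoted string.
        try:
            parsed = json.loads(item)
        except json.JSONDecodeError:
            return None
        return parsed if isinstance(parsed, dict) else None
    return None


def keep_valid_relationships(raw: List[Any]) -> List[Dict[str, Any]]:
    """Drop every item that cannot be coerced into a dict."""
    coerced = (coerce_relationship(item) for item in raw)
    return [rel for rel in coerced if rel is not None]
```

Usage, with made-up data:

```python
raw = [
    {"startNode": "A", "endNode": "B", "name": "works_at"},
    '{"startNode": "B", "endNode": "C", "name": "located_in"}',
    "works at",  # plain text, not recoverable
]
relationships = keep_valid_relationships(raw)  # keeps the first two items
```

Dropping unparseable items silently loses information, so logging the rejected items may be preferable in practice.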

Thanks for your patience.

@lairgiyassir
Collaborator

Hello,

Thank you for pointing out this issue.
Which version of iText2KG are you using? Please check that you are using the latest version.

@lairgiyassir lairgiyassir added the enhancement New feature or request label Oct 30, 2024
@imElliottt
Author

Thanks for your reply!

itext2kg-0.0.7. I think it's the latest version.
I also built the KG separately for the two offers: job_offer works, but job_offer_2 does not.
Do you have any suggestions for this situation?

Thank you.

@Wonder947

Hi,

I'm hitting the same issue when trying to construct the KG for the second job offer.

It gives this error:
"
ValidationError: 1 validation error for Entity
name
Input should be a valid string [type=string_type, input_value=['Translate our core valu...ency and effectiveness'], input_type=list]
"
It probably originates from parsing this output:

"[INFO] ------- Extracting Entities from the Document 2
{'entities': [{'$ref': '#/$defs/Entity', 'label': 'Job Offer', 'name': 'Innovate Design Co. Job Offer'}, {'$ref': '#/$defs/Entity', 'label': 'Company', 'name': 'Innovate Design Co.'}, ..., {'$ref': '#/$defs/Entity', 'label': 'Responsibilities', 'name': ['Translate our core values, product, marketing, and sales objectives into beautifully crafted deliverables', 'Design compelling, brand-aligned digital and print materials, including websites, social media content, ads, third-party marketplaces, presentations, animations, events, prints, etc.', 'Develop and maintain visual brand identity guidelines, ensuring brand consistency across all media and multichannel platforms', "Communicate Innovate Design Co.'s narrative through conversion and data-driven design", 'Participate in brainstorming sessions and collaborate with stakeholders to articulate a creative vision that enhances our brand’s visual storytelling', 'Promote design comprehension and sensibility across the organization, refining work methodologies and design processes to enhance efficiency and effectiveness']}, ....]}"

The Entity model expects the "name" field to be a string, but here it gets a list of strings.
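
One possible workaround, sketched under assumptions: normalize the raw entity dicts before they are validated, so that an entity whose "name" is a list becomes one entity per item. `normalize_entities` is a hypothetical helper, not an iText2KG function, and splitting is only one choice; joining the items into a single string would also satisfy the validator.

```python
from typing import Any, Dict, List


def normalize_entities(raw_entities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return entities whose 'name' is always a non-empty string.

    Hypothetical pre-validation step: a list-valued 'name' is split into
    one entity per item, and entries without a usable name are dropped.
    """
    normalized = []
    for entity in raw_entities:
        name = entity.get("name")
        names = name if isinstance(name, list) else [name]
        for item in names:
            if isinstance(item, str) and item.strip():
                # Copy the other keys (e.g. 'label') and replace only 'name'.
                normalized.append({**entity, "name": item.strip()})
    return normalized
```

Usage, with shortened made-up data in the shape of the log above:

```python
raw = [
    {"label": "Company", "name": "Innovate Design Co."},
    {"label": "Responsibilities", "name": ["Design materials", "Maintain brand guidelines"]},
]
entities = normalize_entities(raw)  # three entities, each with a str name
```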

@JIAQI549

JIAQI549 commented Nov 3, 2024

I had the same problem with llama3 when running:
kg_ = itext2kg.build_graph(sections=[semantic_blocks_job_offer, semantic_blocks_job_offer_2], existing_knowledge_graph=kg, ent_threshold=0.6, rel_threshold=0.6)

ValidationError: 1 validation error for Entity
name
Input should be a valid string [type=string_type, input_value=['Translate our core valu...ency and effectiveness'], input_type=list]
For further information visit https://errors.pydantic.dev/2.9/v/string_type

@lairgiyassir
Collaborator

Hello,
Thank you all for your feedback.

The main problem with these small local models is that they cannot reliably produce structured output.
The only solution is to use a larger model (yes, you will need more resources).

That said, we will consider this issue in our next releases ;)


4 participants