85% working list item detection and notation #65

Open
wants to merge 6 commits into
base: master

thetrebor

Semi-working example of list item detection.
I'm having trouble manipulating the schema.

Below are some examples of the current output.
In a marker such as [90-2], 90 is the bbox x-coordinate and 2 means the item is indented twice.

Problems I'm seeing:
1. I cannot merge spans/lines to prevent improper line breaks.
2. I can't figure out how to split spans/lines so I can add line breaks where they are actually required.
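
A minimal sketch of both operations, assuming simplified stand-ins for the schema: the `Span` and `Line` dataclasses below are hypothetical and only mirror the `Line(spans=..., bbox=...)` shape, and both regexes are guesses based on the sample output. `merge_continuations` folds a wrapped line into the preceding list item and unions the bboxes; `split_at_markers` breaks text where a new marker such as `B.` appears mid-line.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-ins for the schema types; the real classes may differ.
@dataclass
class Span:
    text: str
    bbox: tuple  # (x0, y0, x1, y1)

@dataclass
class Line:
    spans: list
    bbox: tuple

# Assumed marker shapes at the start of a line: "5.", "C.", "a."
MARKER = re.compile(r"^\s*(?:\d+|[A-Za-z])\.")
# Assumed shape of a marker embedded mid-line, right after sentence punctuation.
EMBEDDED = re.compile(r"(?<=[.;:])\s+(?=(?:\d{1,2}|[A-Za-z])\.\s*[A-Z])")

def line_text(line):
    """Join the span texts of a line with single spaces."""
    return " ".join(span.text for span in line.spans)

def union_bbox(a, b):
    """Smallest box that contains both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def merge_continuations(lines):
    """Fold a line without a list marker into the list item that precedes it."""
    merged = []
    for line in lines:
        continues_item = (
            merged
            and MARKER.match(line_text(merged[-1]))
            and not MARKER.match(line_text(line))
        )
        if continues_item:
            prev = merged[-1]
            merged[-1] = Line(prev.spans + line.spans, union_bbox(prev.bbox, line.bbox))
        else:
            merged.append(Line(list(line.spans), line.bbox))
    return merged

def split_at_markers(text):
    """Split text wherever an embedded list marker starts a new item."""
    return EMBEDDED.split(text)
```

The split works on text only, so the resulting pieces would still need their own bboxes. A heading that directly follows a list item would also be merged by this rule, so a real version probably needs an extra check such as the x position.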

Procedure 611 - Mentally Ill Persons

	[90-2]5.Thinks people are watching or talking to him;

	[90-2]6.Exhibits an extreme degree of panic or fright;

	[90-2]7.Behaves in a way dangerous to himself or others (i.e., hostile, suicidal, makes threats towards others, etc.);

	[90-2]8.Poor personal hygiene or appearance; or

	[90-2]9.Demonstrates an unusual thought process or verbal expressions or is catatonic.

[67-1]C.

Upon recognition of a mental health crisis situation the officer's responsibilities include:

	[90-2]1.Maintaining a high degree of caution in dealing with the potentially unpredictable nature of persons with mental

illness;

	[90-2]2.Protecting the general public from the actions of the persons with mental illness;

	[90-2]3.Protecting the persons with mental illness from his/her own actions; and

	[90-2]4.Providing the most effective remedy available at the time to resolve the crisis situation.

.05 Crisis Intervention Team (Cit) Officers

[72-1]A.A Crisis Intervention Team (CIT) officer is defined as any officer on the Department who has successfully completed

the 40 hours Crisis Intervention Team training. [72-1]B. CIT Officers are assigned to regular patrol duties and when available respond to situations involving persons who are
experiencing a mental health crisis. [72-1]C. The CIT Officer at the scene of a call involving a mental health crisis situation has the responsibility for handling
the situation unless otherwise directed by a supervisor. The CIT Officer should ask for additional support, if necessary. [72-1]D. CIT Officers may only take the same courses of action as other patrol officers when handling a mental health crisis.
The courses of action are listed in Section .08 of this procedure.

.06 Initial Response

[72-1]A.Communications Unit - Dispatchers responsibilities include:

	[90-2]1.Attempt to determine if a service call is a mental health crisis; [90-2]2.Determine if weapons or any violent acts have been committed which may create an Escalated Mental Health

Crisis Call.

		[108-3]a.An Escalated Mental Health Crisis Call is a two-pronged approach where weapons are involved, or violence

has occurred or is occurring, and corroborating factors exist that establish a mental health nexus.

		[108-3]b.If the call meets the listed criteria for an Escalated Mental Health Crisis Call, a supervisor will be assigned

and dispatched to the scene.

	[90-2]3.Identify mental health crisis calls by using appropriate code; (Escalated Mental Health Crisis Call, Mental Health

in Progress, Mental Health Disturbance, Mental Health Routine);

	[90-2]4.Assign and dispatch a CIT Officer when available, along with a cover officer, to mental health crisis situations;

@thetrebor

thetrebor commented Jan 13, 2024

Appending the lines here using various bboxes creates mangled output.
The items do not appear on new lines, even though they are new lines and their bbox is assigned the same value
as the start of the list item.
Printing the output looks perfect, but something happens after this step
that mysteriously reorders the items on the page:
current_lines.append(Line(spans=[current_list_item_span], bbox=current_list_item_span.bbox))

@thetrebor

test.pdf

PDF I was using for testing the trivial cases

@thetrebor

This may work as of the last pull. I guess I needed to rubber duck.

Outstanding questions:

Should we convert the list outline format to a fully identified format for LLM parsing? The outline format looks like this:

1
2
    a
    b
        i
        ii
    c

This can be hard for an LLM to parse, especially in long and deeply nested lists.

In markdown we could instead output lists as:

1.
2.
2.a.
2.b.
2.b.i.
2.b.ii.
2.c.
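
A rough sketch of that conversion, assuming each detected item is available as a `(level, label)` pair with levels starting at 1. The function name `qualify` and the input shape are illustrative only, not part of this PR.

```python
def qualify(items):
    """Turn (level, label) pairs into fully qualified labels such as '2.b.i.'."""
    path = []  # labels of the current item and all of its ancestors
    qualified = []
    for level, label in items:
        # Discard labels at this depth or deeper, then record the new one.
        del path[level - 1:]
        path.append(label)
        qualified.append(".".join(path) + ".")
    return qualified

outline = [(1, "1"), (1, "2"), (2, "a"), (2, "b"), (3, "i"), (3, "ii"), (2, "c")]
print("\n".join(qualify(outline)))
```

Because every output line carries its full ancestry, an item stays unambiguous even when it is far from its parent in a long list.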

How can we generalize the tab stops for the indents? I think we'll need to collect a bunch of docs and figure out how to determine the stops from them.
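
One possible starting point, sketched under assumptions: collect the left edge (bbox x) of every detected list item in a document, group edges that lie within a tolerance of each other, and treat each group as one tab stop. The function names and the tolerance of 8 units are hypothetical and would need tuning against real documents.

```python
def infer_tab_stops(xs, tolerance=8.0):
    """Group left edges into tab stops; a gap above tolerance starts a new stop."""
    groups = []
    for x in sorted(xs):
        if groups and x - groups[-1][-1] <= tolerance:
            groups[-1].append(x)
        else:
            groups.append([x])
    # Represent each stop by the mean of the edges assigned to it.
    return [sum(group) / len(group) for group in groups]

def indent_level(x, stops):
    """Return the 1-based index of the tab stop nearest to x."""
    nearest = min(range(len(stops)), key=lambda i: abs(stops[i] - x))
    return nearest + 1

# Left edges taken from the sample output above: 67 and 72 share a level.
stops = infer_tab_stops([67, 72, 90, 90, 108])
print(stops, [indent_level(x, stops) for x in (67, 72, 90, 108)])
```

With the x values from the sample output, 67 and 72 fall into the same stop, which matches the [67-1] and [72-1] markers, while 90 and 108 map to levels 2 and 3.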

@thetrebor thetrebor changed the title from "45% working list item detection and notation" to "85% working list item detection and notation" on Jan 15, 2024
@thetrebor thetrebor marked this pull request as ready for review January 17, 2024 20:57