Merge pull request #5 from spraakbanken/2-annotate-sentence
2 annotate sentence
kod-kristoff authored May 28, 2024
2 parents 695961d + 70afe2a commit 7ba1114
Showing 24 changed files with 430 additions and 96 deletions.
59 changes: 0 additions & 59 deletions .github/workflows/test.yml
@@ -79,64 +79,6 @@ jobs:
# name: codecov-umbrella
verbose: true

doctests:
# This action runs doctests for coverage collection and uploads them to codecov.io.
# This requires the secret `CODECOV_TOKEN` be set as secret on GitHub, both for
# Actions and Dependabot

name: "${{ matrix.os }} / 3.8 / doctest"
strategy:
max-parallel: 4
fail-fast: false
matrix:
os: [ubuntu]

runs-on: ${{ matrix.os }}-latest
continue-on-error: true # allow failure until doctests are added
env:
OS: ${{ matrix.os }}-latest
steps:
- uses: actions/checkout@v4
with:
submodules: true

- name: Set up the environment
uses: pdm-project/setup-pdm@v4
id: setup-python
with:
python-version: ${{ env.MINIMUM_PYTHON_VERSION }}

- name: Load cached venv
id: cached-venv
uses: actions/cache@v4
with:
path: .venv
key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/pyproject.toml') }}-${{ hashFiles('.github/workflows/test.yml') }}

- name: Install dependencies
if: steps.cached-venv.outputs.cache-hit != 'true'
run: make install-dev
#----------------------------------------------
# Run tests and upload coverage
#----------------------------------------------
- name: make doc-tests
run: make doc-tests cov_report=xml

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
token: ${{ secrets.CODECOV_TOKEN }}
# directory: ./coverage
env_vars: OS,PYTHON,TESTTYPE
fail_ci_if_error: true
# files: ./coverage/coverage.xml
# flags: unittests
# name: codecov-umbrella
verbose: true
env:
PYTHON: ${{ env.MINIMUM_PYTHON_VERSION }}
TESTTYPE: doctest

minimal:
# This action chooses the oldest version of the dependencies permitted by Cargo.toml to ensure
# that this crate is compatible with the minimal version that this crate and its dependencies
@@ -169,7 +111,6 @@ jobs:
if: always()
needs:
- coverage
- doctests
- minimal
runs-on: ubuntu-latest
permissions: {}
4 changes: 4 additions & 0 deletions .gitignore
@@ -159,3 +159,7 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.pdm-python

/examples/*/.snakemake
/examples/*/export
/examples/*/sparv-workdir
24 changes: 18 additions & 6 deletions Makefile
@@ -50,7 +50,7 @@ help:
@echo ""
@echo "publish [branch=]"
@echo " pushes the given branch including tags to origin, for CI to publish based on tags. (Default: branch='main')"
-@echo " Typically used after `make bumpversion`"
+@echo " Typically used after 'make bumpversion'"
@echo ""
@echo "prepare-release"
@echo " run tasks to prepare a release"
@@ -89,6 +89,11 @@ install-dev:
install:
pdm sync --prod

lock: pdm.lock

pdm.lock: pyproject.toml
pdm lock

.PHONY: test
test:
${INVENV} pytest -vv ${tests}
@@ -142,20 +147,27 @@ publish:


.PHONY: prepare-release
-prepare-release: tests/requirements-testing.lock
+prepare-release: update-changelog tests/requirements-testing.lock

# we use lock extension so that dependabot doesn't pick up changes in this file
-tests/requirements-testing.lock: pyproject.toml
+tests/requirements-testing.lock: pyproject.toml pdm.lock
pdm export --dev --format requirements --output $@

.PHONY: kb-bert-prepare-release
sparv-sbx-sentence-sentiment-kb-sent-prepare-release: sparv-sbx-sentence-sentiment-kb-sent/CHANGELOG.md

.PHONY: update-changelog
update-changelog: CHANGELOG.md sparv-sbx-sentence-sentiment-kb-sent/CHANGELOG.md

.PHONY: CHANGELOG.md
CHANGELOG.md:
git cliff --unreleased --prepend $@

# update snapshots for `syrupy`
.PHONY: snapshot-update
snapshot-update:
${INVENV} pytest --snapshot-update

.PHONY: kb-bert-prepare-release
sparv-sbx-sentence-sentiment-kb-sent-prepare-release: sparv-sbx-sentence-sentiment-kb-sent/CHANGELOG.md

.PHONY: sparv-sbx-sentence-sentiment-kb-sent/CHANGELOG.md
sparv-sbx-sentence-sentiment-kb-sent/CHANGELOG.md:
git cliff --unreleased --include-path "sparv-sbx-sentence-sentiment-kb-sent/**/*" --include-path "examples/sparv-sbx-sentence-sentiment-kb-sent/**/*" --prepend $@
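Note on the new `pdm.lock: pyproject.toml` rule: it relies on Make's usual timestamp check, where a file target is rebuilt when it is missing or older than any prerequisite. Targets declared `.PHONY`, such as `CHANGELOG.md` here, skip that check and always run. A rough Python sketch of the timestamp check follows; the helper names `is_stale` and `needs_rebuild` are hypothetical and are not part of this commit or of Make.

```python
import os
from typing import Iterable, Optional


def is_stale(target_mtime: Optional[float], prereq_mtimes: Iterable[float]) -> bool:
    """Return True if a target with this modification time must be rebuilt."""
    if target_mtime is None:  # the target file does not exist yet
        return True
    # Rebuild only when some prerequisite is strictly newer than the target.
    return any(mtime > target_mtime for mtime in prereq_mtimes)


def needs_rebuild(target: str, prerequisites: Iterable[str]) -> bool:
    """Apply is_stale to files on disk (hypothetical helper, for illustration)."""
    target_mtime = os.path.getmtime(target) if os.path.exists(target) else None
    # A missing prerequisite raises OSError here; Make would also stop with an error.
    return is_stale(target_mtime, (os.path.getmtime(p) for p in prerequisites))
```

Under this reading, `make lock` only runs `pdm lock` after `pyproject.toml` has been modified more recently than `pdm.lock`.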
16 changes: 16 additions & 0 deletions examples/sparv-sbx-sentence-sentiment-kb-sent/config.yaml
@@ -0,0 +1,16 @@
metadata:
id: example-sparv-sbx-sentence-kb-sent
language: swe

import:
importer: text_import:parse

export:
annotations:
- <sentence>
# - <token:word>
- <token>:stanza.pos
- <sentence>:sbx_sentence_sentiment_kb_sent.sbx-sentence-sentiment--kb-sent

sparv:
compression: none
3 changes: 3 additions & 0 deletions examples/texts/small.txt
@@ -0,0 +1,3 @@
Stora regnmängder väntas under måndagen och SMHI har utfärdat en gul varning för skyfallsliknande regn över stora delar av landets södra halva.
Jag är förvånad, chockad och bestört över det här, säger Anders Persson till SVT Nyheter Småland.
Vi hoppas och tror att detta också snabbt ska kunna komma på plats, säger Garborg.
2 changes: 2 additions & 0 deletions mypy.ini
@@ -0,0 +1,2 @@
[mypy]
python_version = 3.8
54 changes: 34 additions & 20 deletions pdm.lock

Some generated files are not rendered by default.

5 changes: 3 additions & 2 deletions pyproject.toml
@@ -1,7 +1,7 @@
[project]
name = "sparv-sbx-sentence-sentiment-workspace"
dependencies = []
-requires-python = ">=3.8,<3.12"
+requires-python = ">=3.8.1,<3.12"
version = "0.0.0"

[tool.pdm.dev-dependencies]
@@ -10,6 +10,7 @@ dev = [
"pytest>=8.1.1",
"pytest-cov>=4.1.0",
"mypy>=1.9.0",
-"ruff>=0.3.2",
+"ruff>=0.4.5",
"bump-my-version>=0.19.0",
"syrupy>=4.6.1",
]
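Note on the tightened `requires-python` bound: `>=3.8.1,<3.12` now excludes Python 3.8.0 while still excluding 3.12 and later. A minimal sketch of that range check for final releases, using plain tuple comparison; the function name `is_supported` is hypothetical, and pre-release handling as defined by the packaging specifications is deliberately ignored.

```python
import sys
from typing import Tuple


def is_supported(version: Tuple[int, int, int]) -> bool:
    """Return True if a (major, minor, micro) release satisfies >=3.8.1,<3.12."""
    # Tuples compare element by element, which matches release ordering here.
    return (3, 8, 1) <= version < (3, 12, 0)


# Check the interpreter running this snippet (result depends on the environment).
running_is_supported = is_supported(tuple(sys.version_info[:3]))
```

For example, `is_supported((3, 8, 0))` is false under the new bound, whereas the previous `>=3.8` bound accepted 3.8.0.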
61 changes: 61 additions & 0 deletions ruff.toml
@@ -0,0 +1,61 @@
line-length = 97

target-version = "py38"

[lint]
select = [
"A", # flake8-builtins
"ANN", # flake8-annotations
"ARG", # flake8-unused-arguments
"B", # flake8-bugbear
"C4", # flake8-comprehensions
"COM", # flake8-commas
"D", # pydocstyle
"D400", # pydocstyle: ends-in-period
"D401", # pydocstyle: non-imperative-mood
"E", # pycodestyle: errors
"F", # Pyflakes
"FLY", # flynt
"FURB", # refurb
"G", # flake8-logging-format
"I", # isort
"ISC", # flake8-implicit-str-concat
"N", # pep8-naming
"PERF", # Perflint
"PIE", # flake8-pie
"PL", # Pylint
# "PT", # flake8-pytest-style
"PTH", # flake8-use-pathlib
"Q", # flake8-quotes
"RET", # flake8-return
"RSE", # flake8-raise
"RUF", # Ruff-specific rules
"SIM", # flake8-simplify
"T20", # flake8-print
"TID", # flake8-tidy-imports
"UP", # pyupgrade
"W", # pycodestyle: warnings
]
ignore = [
"ANN101", # flake8-annotations: missing-type-self (deprecated)
"ANN102", # flake8-annotations: missing-type-cls (deprecated)
"ANN401", # flake8-annotations: any-type
"B008", # flake8-bugbear: function-call-in-default-argument
"ISC001",
"COM812", # flake8-commas: missing-trailing-comma
"PLR09", # Pylint: too-many-*
"SIM105", # flake8-simplify: suppressible-exception
]
preview = true

# Avoid trying to fix flake8-bugbear (`B`) violations.
unfixable = ["B"]


[lint.pydocstyle]
convention = "google"


# Ignore `E402` (import violations) in all `__init__.py` files, and in `path/to/file.py`.
[lint.per-file-ignores]
"*/tests/*" = ["D", "ARG002", "E501"]
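Note on the lint configuration above: with `D` selected under the Google convention, plus `D400`, `D401` and `ANN`, non-test code is expected to carry type annotations and docstrings whose summary line is imperative and ends in a period. The sketch below is written to follow those rules as I understand them; the function `count_tokens` is a hypothetical example and does not appear in this commit.

```python
"""Illustrate the docstring and annotation style the ruff configuration asks for."""


def count_tokens(sentence: str) -> int:
    """Count the whitespace-separated tokens in a sentence.

    Args:
        sentence: The text to split on whitespace.

    Returns:
        The number of tokens found.
    """
    return len(sentence.split())
```

Files matching `*/tests/*` are exempt from the `D` rules, so test functions do not need docstrings of this kind.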
2 changes: 1 addition & 1 deletion sparv-sbx-sentence-sentiment-kb-sent/README.md
@@ -13,7 +13,7 @@ Plugin for applying bert masking as a [Sparv](https://github.com/spraakbanken/sp

## Install

-First, install Sparv, as suggested:
+First, install Sparv as suggested:

```bash
pipx install sparv-pipeline
16 changes: 15 additions & 1 deletion sparv-sbx-sentence-sentiment-kb-sent/pdm.lock

Some generated files are not rendered by default.

8 changes: 6 additions & 2 deletions sparv-sbx-sentence-sentiment-kb-sent/pyproject.toml
@@ -3,12 +3,13 @@ name = "sparv-sbx-sentence-sentiment-kb-sent"
version = "0.1.0"
description = "A sparv plugin for computing word neighbours using a BERT model."
authors = [
{ name = "Språkbanken Text", email = "[email protected]" },
{ name = "Kristoffer Andersson", email = "[email protected]" },
]
dependencies = ["sparv-pipeline >=5.2.0", "transformers>=4.34.1"]
license = "MIT"
readme = "README.md"
-requires-python = ">= 3.8,<3.12"
+requires-python = ">= 3.8.1,<3.12"
classifiers = [
"Development Status :: 3 - Alpha",
# "Development Status :: 4 - Beta",
@@ -54,4 +55,7 @@ allow-direct-references = true


[tool.pdm.dev-dependencies]
-dev = ["pytest>=8.0.0"]
+dev = [
+"pytest>=8.0.0",
+"syrupy>=4.6.1",
+]
Original file line number Diff line number Diff line change
@@ -1 +1,8 @@
"""Sparv plugin for annotating sentences with sentiment analysis."""

from sbx_sentence_sentiment_kb_sent.annotations import annotate_sentence_sentiment

__all__ = ["annotate_sentence_sentiment"]

__description__ = "Annotate sentence with sentiment analysis."
__version__ = "0.1.0"