Skip to content

Commit

Permalink
Python: Replace tox with pre-commit (#4811)
Browse files Browse the repository at this point in the history
  • Loading branch information
Fokko authored May 31, 2022
1 parent 100c2f6 commit 5d0f530
Show file tree
Hide file tree
Showing 22 changed files with 252 additions and 147 deletions.
15 changes: 9 additions & 6 deletions .github/workflows/python-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -46,9 +46,12 @@ jobs:
- uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python }}
- working-directory: ./python
run: |
pip install -e .[dev]
pip install -U tox-gh-actions
- working-directory: ./python
run: tox
- name: Install deps
working-directory: ./python
run: pip3 install -e .[dev,arrow]
- name: Run linters
working-directory: ./python
run: make lint
- name: Run tests
working-directory: ./python
run: make test
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -65,3 +65,5 @@ derby.log
# Python stuff
python_legacy/.mypy_cache/
python/.mypy_cache/
python/htmlcov
python/coverage.xml
2 changes: 0 additions & 2 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,6 @@ For Java styling, check out the section
[Setting up IDE and Code Style](https://iceberg.apache.org/community/#setting-up-ide-and-code-style) from the
documentation site.

For Python, please use the tox command `tox -e format` to apply autoformatting to the project.

### Java style guidelines

#### Line breaks
Expand Down
47 changes: 47 additions & 0 deletions python/.pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
---
files: ^python/

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.2.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-docstring-first
- id: debug-statements
- id: check-yaml
- id: check-ast
- repo: https://github.com/ambv/black
rev: 22.3.0
hooks:
- id: black
- repo: https://github.com/pre-commit/mirrors-isort
rev: v5.10.1
hooks:
- id: isort
args: [ --settings-path=python/pyproject.toml ]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v0.950
hooks:
- id: mypy
- repo: https://github.com/hadialqattan/pycln
rev: v1.3.2
hooks:
- id: pycln
args: [--config=python/pyproject.toml]
44 changes: 44 additions & 0 deletions python/CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
<!--
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
-->

# Contributing to the Iceberg Python libary

## Linting

We rely on `pre-commit` to apply autoformatting and linting:

```bash
make install
make lint
```

By default, it only runs on the files known by git.

Pre-commit will automatically fix the violations such as import orders, formatting etc. Pylint errors you need to fix yourself.

In contrast to the name, it doesn't run on the git pre-commit hook by default. If this is something that you like, you can set this up by running `pre-commit install`.

## Testing

For Python, we use pytest in combination with coverage to maintain 90% code coverage

```bash
make install
make test
```
28 changes: 28 additions & 0 deletions python/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

install:
pip install -e ".[dev,arrow]"

lint:
pre-commit run --all-files

test:
coverage run --source=src/ -m pytest tests/
coverage report -m --fail-under=90
coverage html
coverage xml
2 changes: 1 addition & 1 deletion python/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@

# Iceberg Python

py-iceberg is a python library for programmatic access to iceberg table metadata as well as to table data in iceberg format.
py-iceberg is a python library for programmatic access to iceberg table metadata as well as to table data in iceberg format.
It is an implementation of [iceberg table spec](https://iceberg.apache.org/spec/) in Python.

## Getting Started
Expand Down
28 changes: 27 additions & 1 deletion python/pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -20,4 +20,30 @@ requires = [
"setuptools>=42",
"wheel"
]
build-backend = "setuptools.build_meta"
build-backend = "setuptools.build_meta"

[tool.black]
line-length = 130
target-version = ['py38']

[tool.isort]
src_paths = ["src/", "tests/"]
multi_line_output = 3
profile = 'black'
line_length = 130
force_grid_wrap = 4

[tool.pycln]
all = true

[tool.mypy]
no_implicit_optional = true
warn_redundant_casts = true
warn_unreachable = true

[[tool.mypy.overrides]]
module = "mypy-pyarrow.*"
ignore_missing_imports = true

[tool.coverage.run]
source = ['src/']
11 changes: 8 additions & 3 deletions python/setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -46,9 +46,14 @@ install_requires =
mmh3
[options.extras_require]
arrow =
pyarrow
pyarrow==8.0.0
dev=
tox-travis==0.12
pytest
pip>=21.1
coverage[toml]==6.3.3
mock==4.0.3
pytest==7.1.2
pytest-checkdocs==2.7.1
pyenchant==3.2.2
pre-commit==2.19.0
[options.packages.find]
where = src
2 changes: 1 addition & 1 deletion python/spellcheck-dictionary.txt
Original file line number Diff line number Diff line change
Expand Up @@ -55,4 +55,4 @@ URI
UnboundPredicate
BoundPredicate
BooleanExpression
BooleanExpressionVisitor
BooleanExpressionVisitor
4 changes: 4 additions & 0 deletions python/src/iceberg/catalog/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,7 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
from typing import Dict, Tuple

Identifier = Tuple[str, ...]
Properties = Dict[str, str]
15 changes: 10 additions & 5 deletions python/src/iceberg/catalog/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,20 @@
# specific language governing permissions and limitations
# under the License.

from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Set, Tuple, Union
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import (
List,
Optional,
Set,
Union,
)

from iceberg.catalog import Identifier, Properties
from iceberg.schema import Schema
from iceberg.table.base import PartitionSpec, Table

Identifier = Tuple[str, ...]
Properties = Dict[str, str]


class Catalog(ABC):
"""Base Catalog for table operations like - create, drop, load, list and others.
Expand Down
8 changes: 7 additions & 1 deletion python/src/iceberg/io/pyarrow.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,13 @@

from pyarrow.fs import FileInfo, FileSystem, FileType

from iceberg.io.base import FileIO, InputFile, InputStream, OutputFile, OutputStream
from iceberg.io.base import (
FileIO,
InputFile,
InputStream,
OutputFile,
OutputStream,
)


class PyArrowFile(InputFile, OutputFile):
Expand Down
10 changes: 9 additions & 1 deletion python/src/iceberg/schema.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,15 @@
from abc import ABC, abstractmethod
from dataclasses import dataclass
from functools import singledispatch
from typing import Any, Dict, Generic, List, Optional, Tuple, TypeVar
from typing import (
Any,
Dict,
Generic,
List,
Optional,
Tuple,
TypeVar,
)

from iceberg.files import StructProtocol
from iceberg.types import (
Expand Down
5 changes: 5 additions & 0 deletions python/src/iceberg/table/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,9 @@
from __future__ import annotations

from abc import ABC
from typing import Union

from iceberg.catalog.base import Identifier


class Table(ABC):
Expand All @@ -26,6 +29,8 @@ class Table(ABC):
To be implemented by https://github.com/apache/iceberg/issues/3227
"""

identifier: Union[str, Identifier]


class PartitionSpec:
"""Placeholder for Partition Specification
Expand Down
7 changes: 6 additions & 1 deletion python/src/iceberg/types.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,12 @@
"""
from dataclasses import dataclass, field
from functools import cached_property
from typing import ClassVar, Dict, Optional, Tuple
from typing import (
ClassVar,
Dict,
Optional,
Tuple,
)


class Singleton:
Expand Down
7 changes: 6 additions & 1 deletion python/src/iceberg/utils/schema_conversion.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,12 @@
from __future__ import annotations

import logging
from typing import Any, Dict, List, Tuple
from typing import (
Any,
Dict,
List,
Tuple,
)

from iceberg.schema import Schema
from iceberg.types import (
Expand Down
23 changes: 15 additions & 8 deletions python/tests/catalog/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,18 @@
# specific language governing permissions and limitations
# under the License.

from typing import Dict, List, Optional, Set, Union
from typing import (
Dict,
List,
Optional,
Set,
Union,
)

import pytest

from iceberg.catalog.base import Catalog, Identifier, Properties
from iceberg.catalog import Identifier, Properties
from iceberg.catalog.base import Catalog
from iceberg.exceptions import (
AlreadyExistsError,
NamespaceNotEmptyError,
Expand Down Expand Up @@ -136,13 +143,13 @@ def update_namespace_properties(
self, namespace: Union[str, Identifier], removals: Optional[Set[str]] = None, updates: Optional[Properties] = None
) -> None:
namespace = Catalog.identifier_to_tuple(namespace)
removals = {} if not removals else removals
updates = [] if not updates else updates
if namespace in self.__namespaces:
for key in removals:
if key in self.__namespaces[namespace]:
del self.__namespaces[namespace][key]
self.__namespaces[namespace].update(updates)
if removals:
for key in removals:
if key in self.__namespaces[namespace]:
del self.__namespaces[namespace][key]
if updates:
self.__namespaces[namespace].update(updates)
else:
raise NoSuchNamespaceError(f"Namespace does not exist: {namespace}")

Expand Down
11 changes: 11 additions & 0 deletions python/tests/io/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Loading

0 comments on commit 5d0f530

Please sign in to comment.