Update Documentation to reflect new project/repo

py-pdf · Aug 9, 2024 · 5205e91
1 parent d6423a1
commit 5205e91

Showing 11 changed files with 72 additions and 85 deletions.
diff --git a/docs/_templates/sidebarintro.html b/docs/_templates/sidebarintro.html
@@ -16,9 +16,9 @@
 
 <h3>Useful Links</h3>
 <ul>
-  <li><a href="https://github.com/camelot-dev/camelot">Camelot @ GitHub</a></li>
-  <li><a href="https://pypi.org/project/camelot-py/">Camelot @ PyPI</a></li>
+  <li><a href="https://github.com/py-pdf/pypdf_table_extraction/">pypdf-table-extraction @ GitHub</a></li>
+  <li><a href="https://pypi.org/project/pypdf-table-extraction/">pypdf-table-extraction @ PyPI</a></li>
   <li>
-    <a href="https://github.com/camelot-dev/camelot/issues">Issue Tracker</a>
+    <a href="https://github.com/py-pdf/pypdf_table_extraction/issues">Issue Tracker</a>
   </li>
 </ul>
diff --git a/docs/conf.py b/docs/conf.py
@@ -64,8 +64,8 @@
 master_doc = "index"
 
 # General information about the project.
-project = "Camelot"
-copyright = "2021, Camelot Developers"
+project = "pypdf-table-extraction"
+copyright = "2024, pypdf-table-extraction Developers"
 author = "Vinayak Mehta"
 
 # The version info for the project you're documenting, acts as replacement for
@@ -139,8 +139,8 @@
 # documentation.
 html_theme_options = {
     "show_powered_by": False,
-    "github_user": "camelot-dev",
-    "github_repo": "camelot",
+    "github_user": "py-pdf",
+    "github_repo": "pypdf-table-extraction",
     "github_banner": True,
     "show_related": False,
     "note_bg": "#FFF59C",
@@ -262,7 +262,7 @@
 # html_search_scorer = 'scorer.js'
 
 # Output file base name for HTML help builder.
-htmlhelp_basename = "Camelotdoc"
+htmlhelp_basename = "pypdf-table-extraction-doc"
 
 # -- Options for LaTeX output ---------------------------------------------
 
@@ -285,7 +285,7 @@
 # (source start file, target name, title,
 #  author, documentclass [howto, manual, or own class]).
 latex_documents = [
-    (master_doc, "Camelot.tex", "Camelot Documentation", "Vinayak Mehta", "manual"),
+    (master_doc, "pypdf-table-extraction.tex", "pypdf-table-extraction Documentation", "Vinayak Mehta", "manual"),
 ]
 
 # The name of an image file (relative to this directory) to place at the top of
@@ -325,7 +325,7 @@
 
 # One entry per manual page. List of tuples
 # (source start file, name, description, authors, manual section).
-man_pages = [(master_doc, "Camelot", "Camelot Documentation", [author], 1)]
+man_pages = [(master_doc, "pypdf-table-extraction", "pypdf-table-extraction Documentation", [author], 1)]
 
 # If true, show URL addresses after external links.
 #
@@ -340,11 +340,11 @@
 texinfo_documents = [
     (
         master_doc,
-        "Camelot",
-        "Camelot Documentation",
+        "pypdf-table-extraction",
+        "pypdf-table-extraction Documentation",
         author,
-        "Camelot",
-        "One line description of project.",
+        "pypdf-table-extraction",
+        "PDF Table Extraction for Humans.",
         "Miscellaneous",
     ),
 ]

diff --git a/docs/dev/contributing.rst b/docs/dev/contributing.rst
@@ -3,7 +3,7 @@
 Contributor's Guide
 ===================
 
-If you're reading this, you're probably looking to contributing to Camelot. *Time is the only real currency*, and the fact that you're considering spending some here is *very* generous of you. Thank you very much!
+If you're reading this, you're probably looking to contributing to pypdf-table-extraction. *Time is the only real currency*, and the fact that you're considering spending some here is *very* generous of you. Thank you very much!
 
 This document will help you get started with contributing documentation, code, testing and filing issues. If you have any questions, feel free to reach out to `Vinayak Mehta`_, the author and maintainer.
 
@@ -27,17 +27,17 @@ As the `Requests Code Of Conduct`_ states, **all contributions are welcome**, as
 Your first contribution
 -----------------------
 
-A great way to start contributing to Camelot is to pick an issue tagged with the `help wanted`_ or the `good first issue`_ tags. If you're unable to find a good first issue, feel free to contact the maintainer.
+A great way to start contributing to pypdf-table-extraction is to pick an issue tagged with the `help wanted`_ or the `good first issue`_ tags. If you're unable to find a good first issue, feel free to contact the maintainer.
 
-.. _help wanted: https://github.com/camelot-dev/camelot/labels/help%20wanted
-.. _good first issue: https://github.com/camelot-dev/camelot/labels/good%20first%20issue
+.. _help wanted: https://github.com/py-pdf/pypdf_table_extraction/labels/help%20wanted
+.. _good first issue: https://github.com/py-pdf/pypdf_table_extraction/labels/good%20first%20issue
 
 Setting up a development environment
 ------------------------------------
 
 To install the dependencies needed for development, you can use pip::
 
-    $ pip install "camelot-py[dev]"
+    $ pip install "pypdf-table-extraction[dev]"
 
 Alternatively, you can clone the project repository, and install using pip::
 
@@ -51,13 +51,13 @@ Submit a pull request
 
 The preferred workflow for contributing to Camelot is to fork the `project repository`_ on GitHub, clone, develop on a branch and then finally submit a pull request. Here are the steps:
 
-.. _project repository: https://github.com/camelot-dev/camelot
+.. _project repository: https://github.com/py-pdf/pypdf_table_extraction/
 
 1. Fork the project repository. Click on the ‘Fork’ button near the top of the page. This creates a copy of the code under your account on the GitHub.
 
 2. Clone your fork of Camelot from your GitHub account::
 
-    $ git clone https://www.github.com/[username]/camelot
+    $ git clone https://www.github.com/[username]/pypdf-table-extraction
 
 3. Create a branch to hold your changes::
 
@@ -76,7 +76,7 @@ Always branch out from ``master`` to work on your contribution. It's good practi
 
     $ git push -u origin my-feature
 
-Now it's time to go to the your fork of Camelot and create a pull request! You can `follow these instructions`_ to do the same.
+Now it's time to go to the your fork of pypdf-table-extraction and create a pull request! You can `follow these instructions`_ to do the same.
 
 .. _follow these instructions: https://help.github.com/articles/creating-a-pull-request-from-a-fork/
 
@@ -89,7 +89,7 @@ We recommend that your pull request complies with the following guidelines:
 
 .. _pep8: http://pep8.org
 
-- In case your pull request contains function docstrings, make sure you follow the `numpydoc`_ format. All function docstrings in Camelot follow this format. Following the format will make sure that the API documentation is generated flawlessly.
+- In case your pull request contains function docstrings, make sure you follow the `numpydoc`_ format. All function docstrings in pypdf-table-extraction follow this format. Following the format will make sure that the API documentation is generated flawlessly.
 
 .. _numpydoc: https://numpydoc.readthedocs.io/en/latest/format.html
 
@@ -108,7 +108,7 @@ We recommend that your pull request complies with the following guidelines:
 
 .. _task list: https://blog.github.com/2013-01-09-task-lists-in-gfm-issues-pulls-comments/
 
-- If contributing new functionality, make sure that you add a unit test for it, while making sure that all previous tests pass. Camelot uses `pytest`_ for testing. Tests can be run using:
+- If contributing new functionality, make sure that you add a unit test for it, while making sure that all previous tests pass. pypdf-table-extraction uses `pytest`_ for testing. Tests can be run using:
 
 .. _pytest: https://docs.pytest.org/en/latest/
 
@@ -134,12 +134,12 @@ Filing Issues
 
 We use `GitHub issues`_ to keep track of all issues and pull requests. Before opening an issue (which asks a question or reports a bug), please use GitHub search to look for existing issues (both open and closed) that may be similar.
 
-.. _GitHub issues: https://github.com/camelot-dev/camelot/issues
+.. _GitHub issues: https://github.com/py-pdf/pypdf_table_extraction/issues
 
 Questions
 ^^^^^^^^^
 
-Please don't use GitHub issues for support questions. A better place for them would be `Stack Overflow`_. Make sure you tag them using the ``python-camelot`` tag.
+Please don't use GitHub issues for support questions. A better place for them would be `Stack Overflow`_. Make sure you tag them using the ``pypdf-table-extraction`` tag.
 
 .. _Stack Overflow: http://stackoverflow.com
 

diff --git a/docs/index.rst b/docs/index.rst
@@ -3,7 +3,7 @@
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.
 
-Camelot: PDF Table Extraction for Humans
+pypdf-table-extraction (Camelot): PDF Table Extraction for Humans
 ========================================
 
 Release v\ |version|. (:ref:`Installation <install>`)
@@ -15,30 +15,22 @@ Release v\ |version|. (:ref:`Installation <install>`)
     :target: https://camelot-py.readthedocs.io/en/master/
     :alt: Documentation Status
 
-.. image:: https://codecov.io/github/camelot-dev/camelot/badge.svg?branch=master&service=github
-    :target: https://codecov.io/github/camelot-dev/camelot?branch=master
+.. image:: https://codecov.io/github/py-pdf/pypdf_table_extraction/badge.svg?branch=master&service=github
+    :target: https://codecov.io/github/py-pdf/pypdf_table_extraction/?branch=master
 
-.. image:: https://img.shields.io/pypi/v/camelot-py.svg
-    :target: https://pypi.org/project/camelot-py/
+.. image:: https://img.shields.io/pypi/v/pypdf-table-extraction.svg
+    :target: https://pypi.org/project/pypdf-table-extraction/
 
-.. image:: https://img.shields.io/pypi/l/camelot-py.svg
-    :target: https://pypi.org/project/camelot-py/
+.. image:: https://img.shields.io/pypi/l/pypdf-table-extraction.svg
+    :target: https://pypi.org/project/pypdf-table-extraction/
 
-.. image:: https://img.shields.io/pypi/pyversions/camelot-py.svg
-    :target: https://pypi.org/project/camelot-py/
+.. image:: https://img.shields.io/pypi/pyversions/pypdf-table-extraction.svg
+    :target: (https://pypi.org/project/pypdf-table-extraction/
 
-.. image:: https://badges.gitter.im/camelot-dev/Lobby.png
-    :target: https://gitter.im/camelot-dev/Lobby
 
-.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
-    :target: https://github.com/ambv/black
+**pypdf-table-extraction** Formerly known as Camelot is a Python library that can help you extract tables from PDFs!
 
-.. image:: https://img.shields.io/badge/continous%20quality-deepsource-lightgrey
-    :target: https://deepsource.io/gh/camelot-dev/camelot/?ref=repository-badge
-
-**Camelot** is a Python library that can help you extract tables from PDFs!
-
-.. note:: You can also check out `Excalibur`_, the web interface to Camelot!
+.. note:: You can also check out `Excalibur`_, the web interface to pypdf-table-extraction (Camelot)!
 
 .. _Excalibur: https://github.com/camelot-dev/excalibur
 
@@ -70,9 +62,9 @@ Release v\ |version|. (:ref:`Installation <install>`)
 .. csv-table::
   :file: _static/csv/foo.csv
 
-Camelot also comes packaged with a :ref:`command-line interface <cli>`!
+pypdf-table-extraction also comes packaged with a :ref:`command-line interface <cli>`!
 
-.. note:: Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
+.. note:: pypdf-table-extraction only works with text-based PDFs and not scanned documents. (As Tabula `explains`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 
 You can check out some frequently asked questions :ref:`here <faq>`.
 
@@ -91,12 +83,6 @@ See `comparison with similar libraries and tools`_.
 
 .. _comparison with similar libraries and tools: https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools
 
-Support the development
------------------------
-
-If Camelot has helped you, please consider supporting its development with a one-time or monthly donation `on OpenCollective`_!
-
-.. _on OpenCollective: https://opencollective.com/camelot
 
 The User Guide
 --------------

diff --git a/docs/user/advanced.rst b/docs/user/advanced.rst
@@ -202,7 +202,7 @@ Specify table areas
 
 In cases such as `these <../_static/pdf/table_areas.pdf>`__, it can be useful to specify exact table boundaries. You can plot the text on this page and note the top left and bottom right coordinates of the table.
 
-Table areas that you want Camelot to analyze can be passed as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``table_areas`` keyword argument.
+Table areas that you want pypdf-table-extraction to analyze can be passed as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``table_areas`` keyword argument.
 
 ::
 
@@ -223,7 +223,7 @@ Table areas that you want Camelot to analyze can be passed as a list of comma-se
 Specify table regions
 ---------------------
 
-However there may be cases like `[1] <../_static/pdf/table_regions.pdf>`__ and `[2] <https://github.com/camelot-dev/camelot/blob/master/tests/files/tableception.pdf>`__, where the table might not lie at the exact coordinates every time but in an approximate region.
+However there may be cases like `[1] <../_static/pdf/table_regions.pdf>`__ and `[2] <https://github.com/py-pdf/pypdf_table_extraction/blob/main/tests/files/tableception.pdf>`__, where the table might not lie at the exact coordinates every time but in an approximate region.
 
 You can use the ``table_regions`` keyword argument to :meth:`read_pdf() <camelot.read_pdf>` to solve for such cases. When ``table_regions`` is specified, Camelot will only analyze the specified regions to look for tables.
 
@@ -244,7 +244,7 @@ You can use the ``table_regions`` keyword argument to :meth:`read_pdf() <camelot
 Specify column separators
 -------------------------
 
-In cases like `these <../_static/pdf/column_separators.pdf>`__, where the text is very close to each other, it is possible that Camelot may guess the column separators' coordinates incorrectly. To correct this, you can explicitly specify the *x* coordinate for each column separator by plotting the text on the page.
+In cases like `these <../_static/pdf/column_separators.pdf>`__, where the text is very close to each other, it is possible that pypdf-table-extraction may guess the column separators' coordinates incorrectly. To correct this, you can explicitly specify the *x* coordinate for each column separator by plotting the text on the page.
 
 You can pass the column separators as a list of comma-separated strings to :meth:`read_pdf() <camelot.read_pdf>`, using the ``columns`` keyword argument.
 
@@ -334,7 +334,7 @@ You can solve this by passing ``flag_size=True``, which will enclose the supersc
 Strip characters from text
 --------------------------
 
-You can strip unwanted characters like spaces, dots and newlines from a string using the ``strip_text`` keyword argument. Take a look at `this PDF <https://github.com/camelot-dev/camelot/blob/master/tests/files/tabula/12s0324.pdf>`_ as an example, the text at the start of each row contains a lot of unwanted spaces, dots and newlines.
+You can strip unwanted characters like spaces, dots and newlines from a string using the ``strip_text`` keyword argument. Take a look at `this PDF <https://github.com/py-pdf/pypdf_table_extraction/blob/master/tests/files/tabula/12s0324.pdf>`_ as an example, the text at the start of each row contains a lot of unwanted spaces, dots and newlines.
 
 ::
 
@@ -360,7 +360,7 @@ You can strip unwanted characters like spaces, dots and newlines from a string u
 Improve guessed table areas
 ---------------------------
 
-While using :ref:`Stream <stream>`, automatic table detection can fail for PDFs like `this one <https://github.com/camelot-dev/camelot/blob/master/tests/files/edge_tol.pdf>`_. That's because the text is relatively far apart vertically, which can lead to shorter textedges being calculated.
+While using :ref:`Stream <stream>`, automatic table detection can fail for PDFs like `this one <https://github.com/py-pdf/pypdf_table_extraction/blob/master/tests/files/edge_tol.pdf>`_. That's because the text is relatively far apart vertically, which can lead to shorter textedges being calculated.
 
 .. note:: To know more about how textedges are calculated to guess table areas, you can see pages 20, 35 and 40 of `Anssi Nurminen's master's thesis <https://trepo.tuni.fi/bitstream/handle/123456789/21520/Nurminen.pdf?sequence=3>`_.
 
@@ -487,7 +487,7 @@ Clearly, the smaller lines separating the headers, couldn't be detected. Let's t
     :alt: An improved plot of the PDF table with short lines
     :align: left
 
-Voila! Camelot can now see those lines. Let's get our table.
+Voila! pypdf-table-extraction can now see those lines. Let's get our table.
 
 ::
 
@@ -616,7 +616,7 @@ We don't need anything else. Now, let's pass ``copy_text=['v']`` to copy text in
 Tweak layout generation
 -----------------------
 
-Camelot is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences. In some cases (such as `#170 <https://github.com/camelot-dev/camelot/issues/170>`_ and `#215 <https://github.com/camelot-dev/camelot/issues/215>`_), PDFMiner can group characters that should belong to the same sentence into separate sentences.
+pypdf-table-extraction is built on top of PDFMiner's functionality of grouping characters on a page into words and sentences. In some cases (such as `#170 <https://github.com/atlanhq/camelot/issues/170>`_ and `#215 <https://github.com/atlanhq/camelot/issues/215>`_), PDFMiner can group characters that should belong to the same sentence into separate sentences.
 
 To deal with such cases, you can tweak PDFMiner's `LAParams kwargs <https://github.com/euske/pdfminer/blob/master/pdfminer/layout.py#L33>`_ to improve layout generation, by passing the keyword arguments as a dict using ``layout_kwargs`` in :meth:`read_pdf() <camelot.read_pdf>`. To know more about the parameters you can tweak, you can check out `PDFMiner docs <https://pdfminersix.rtfd.io/en/latest/reference/composable.html>`_.
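
Reviewer note on docs/user/advanced.rst (not part of the commit): the sections touched above describe passing ``table_areas`` and ``columns`` as lists of comma-separated strings, ``strip_text`` as a string of characters, and ``layout_kwargs`` as a dict. A minimal sketch of how those arguments might be assembled is shown below. The helper names ``area_to_string``, ``build_read_pdf_kwargs`` and ``extract`` are hypothetical, the coordinates are made up, and it assumes the package is still imported as ``camelot``, as the ``camelot.read_pdf`` cross-references in this diff suggest.

```python
from typing import Mapping, Optional, Sequence


def area_to_string(area: Sequence[float]) -> str:
    """Format (x1, y1, x2, y2), top-left then bottom-right, as 'x1,y1,x2,y2'."""
    if len(area) != 4:
        raise ValueError("expected four values: x1, y1, x2, y2")
    return ",".join(str(v) for v in area)


def build_read_pdf_kwargs(
    table_areas: Sequence[Sequence[float]] = (),
    columns: Sequence[Sequence[float]] = (),
    strip_text: str = "",
    layout_kwargs: Optional[Mapping[str, object]] = None,
) -> dict:
    """Collect only the keyword arguments that were actually supplied."""
    kwargs: dict = {}
    if table_areas:
        # One comma-separated string per table area.
        kwargs["table_areas"] = [area_to_string(a) for a in table_areas]
    if columns:
        # One comma-separated string of x coordinates per table.
        kwargs["columns"] = [",".join(str(x) for x in xs) for xs in columns]
    if strip_text:
        kwargs["strip_text"] = strip_text
    if layout_kwargs:
        kwargs["layout_kwargs"] = dict(layout_kwargs)
    return kwargs


def extract(path: str, **kwargs):
    """Hypothetical wrapper; imports lazily so the helpers work without the library."""
    import camelot  # assumption: the import name is unchanged by the rename

    # flavor="stream" is chosen here because column separators apply to Stream.
    return camelot.read_pdf(path, flavor="stream", **kwargs)
```

For example, ``build_read_pdf_kwargs(table_areas=[(316, 499, 566, 337)], strip_text=" .\n")`` yields a dict that could be passed on as ``extract("file.pdf", **kwargs)``. Keeping the string formatting separate from the library call makes the coordinate handling easy to check on its own.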
 

diff --git a/docs/user/cli.rst b/docs/user/cli.rst
@@ -3,15 +3,15 @@
 Command-Line Interface
 ======================
 
-Camelot comes with a command-line interface.
+pypdf-table-extraction comes with a command-line interface.
 
-You can print the help for the interface by typing ``camelot --help`` in your favorite terminal program, as shown below. Furthermore, you can print the help for each command by typing ``camelot <command> --help``. Try it out!
+You can print the help for the interface by typing ``camelot --help`` in your favorite terminal program, as shown below. Furthermore, you can print the help for each command by typing ``pypdf-table-extraction <command> --help``. Try it out!
 
 ::
 
-  Usage: camelot [OPTIONS] COMMAND [ARGS]...
+  Usage: pypdf-table-extraction [OPTIONS] COMMAND [ARGS]...
 
-    Camelot: PDF Table Extraction for Humans
+    pypdf-table-extraction: PDF Table Extraction for Humans
 
   Options:
     --version                       Show the version and exit.

diff --git a/docs/user/faq.rst b/docs/user/faq.rst
@@ -3,12 +3,12 @@
 Frequently Asked Questions
 ==========================
 
-This part of the documentation answers some common questions. To add questions, please open an issue `here <https://github.com/camelot-dev/camelot/issues/new>`_.
+This part of the documentation answers some common questions. To add questions, please open an issue `here <https://github.com/py-pdf/pypdf_table_extraction/issues/new>`_.
 
-Does Camelot work with image-based PDFs?
+Does pypdf-table-extraction work with image-based PDFs?
 ----------------------------------------
 
-**No**, Camelot only works with text-based PDFs and not scanned documents. (As Tabula `explains <https://github.com/tabulapdf/tabula#why-tabula>`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
+**No**, pypdf-table-extraction only works with text-based PDFs and not scanned documents. (As Tabula `explains <https://github.com/tabulapdf/tabula#why-tabula>`_, "If you can click and drag to select text in your table in a PDF viewer, then your PDF is text-based".)
 
 How to reduce memory usage for long PDFs?
 -----------------------------------------