Merge pull request #173 from Living-with-machines/kallewesterling/iss…

…ue166 New `Annotator` class
maps-as-data · Dec 14, 2023 · ccbf46c
2 parents e385370 + 5fdde2a
commit ccbf46c
Showing 8 changed files with 1,851 additions and 274 deletions.
diff --git a/docs/source/User-guide/Annotate.rst b/docs/source/User-guide/Annotate.rst
@@ -5,168 +5,203 @@ Annotate
 
 MapReader's ``Annotate`` subpackage is used to interactively annotate images (e.g. maps).
 
-This is done in three simple steps:
-
-1. :ref:`Create_file`
-2. :ref:`Annotate_images`
-3. :ref:`Save_annotations`
+.. _Annotate_images:
 
-.. _Create_file:
+Annotate your images
+----------------------
 
-Create an annotation tasks file
------------------------------------
+.. note:: Run these commands in a Jupyter notebook (or other IDE), ensuring you are in your ``mr_py38`` Python environment.
 
-To set up your annotation tasks, you will need to create a separate ``annotation_tasks.yaml`` file.
-An example file which can be used as a template can be found in ``MapReader/worked_examples/``.
 
-.. todo:: Note that you can do this via text editor in windows or something like ??? in mac/linux
+To prepare your annotations, you must specify a number of parameters when initializing the Annotator class.
+We will use a 'rail_space' annotation task to demonstrate how to set up the annotator.
 
-Your ``annotation_tasks.yaml`` file needs to contain two sections: ``tasks`` and ``paths``.
+The simplest way to initialize your annotator is to provide file paths for your patches and parent images using the ``patch_paths`` and ``parent_paths`` arguments, respectively.
+e.g. :
 
-The ``tasks`` section is used to specify annotation tasks and their labels.
-This section can contain as many tasks/labels as you would like and should be formatted as follows:
+.. code-block:: python
 
-.. code-block:: yaml
+    from mapreader import Annotator
+
+    # EXAMPLE
+    annotator = Annotator(
+        patch_paths="./patches_100_pixel/*.png",
+        parent_paths="./maps/*.png",
+        metadata="./maps/metadata.csv",
+        annotations_dir="./annotations",
+        task_name="railspace",
+        labels=["no_rail_space", "rail_space"],
+        username="rosie",
+    )
 
-	tasks:
-	  your_task_name:
-	    labels: ["your_label_1", "your_label_2", "your_label_3"]
-	  your_task_name_2:
-		labels: ["your_label_1", "your_label_2"]
+Alternatively, if you have created/saved a ``patch_df`` and ``parent_df`` from MapReader's Load subpackage, you can replace the ``patch_paths`` and ``parent_paths`` arguments with ``patch_df`` and ``parent_df`` arguments, respectively.
+e.g. :
 
-.. note:: When annotating, for each patch you will only be able to select one label from your label list. So, if you envisage wanting to label something as "label_1" **and also** "label_2", you will need to create a separate label combining "label_1 and label_2".
+.. code-block:: python
 
-The ``paths`` section is used to specify file paths to sets of images you would like to annotate (annotation sets).
-This section can contain as many annotation sets as you would like and should be formatted as follows:
+    from mapreader import Annotator
 
-.. code-block:: yaml
+    # EXAMPLE
+    annotator = Annotator(
+        patch_df="./patch_df.csv",
+        parent_df="./parent_df.csv",
+        annotations_dir="./annotations",
+        task_name="railspace",
+        labels=["no_rail_space", "rail_space"],
+        username="rosie",
+    )
 
-	paths:
-	  your_annotation_set:
-		patch_paths: "./path/to/patches/"
-		parent_paths: "./path/to/parents/"
-		annot_dir: "./path/to/save/annotations"
-	  your_annotation_set_2:
-		patch_paths: "./path/to/patches_2/"
-		parent_paths: "./path/to/parents_2/"
-		annot_dir: "./path/to/save/annotations_2"
+.. note:: You can pass either a pandas DataFrame or the path to a csv file as the ``patch_df`` and ``parent_df`` arguments.
 
-For example, if you want to annotate 'rail_space' (as in `this paper <https://dl.acm.org/doi/10.1145/3557919.3565812>`_) and have been using the recommended/default directory structure, your ``annotation_tasks.yaml`` should look like this:
+In the above examples, the following parameters are also specified:
 
-.. code-block:: yaml
+- ``annotations_dir``: The directory where your annotations will be saved (e.g., ``"./annotations"``).
+- ``task_name``: The specific annotation task you want to perform, in this case ``"railspace"``.
+- ``labels``: A list of labels for the annotation task, such as ``"no_rail_space"`` and ``"rail_space"``.
+- ``username``: Your unique identifier, which can be any string (e.g., ``"rosie"``).
 
-	#EXAMPLE
-	tasks:
-	  rail_space:
-		labels: ["no_rail_space", "rail_space"]
+Other arguments that you may want to be aware of when initializing the ``Annotator`` instance include:
 
-	paths:
-	  set_001:
-		patch_paths: "./patches/patch-*png"
-		parent_paths: "./maps/*png"
-		annot_dir: "./annotations_one_inch"
+- ``show_context``: Whether to show a context image in the annotation interface (default: ``False``).
+- ``surrounding``: How many surrounding patches to show in the context image (default: ``1``).
+- ``sortby``: The name of the column to use to sort the patch DataFrame (e.g. ``"mean_pixel_R"`` to sort by red pixel intensities).
+- ``delimiter``: The delimiter to use when reading your data files (default: ``","`` for csv).
 
-.. _Annotate_images:
+After setting up the ``Annotator`` instance, you can interactively annotate a sample of your images using:
 
-Annotate your images
-----------------------
+.. code-block:: python
 
-.. note:: Run these commands in a Jupyter notebook (or other IDE), ensuring you are in your `mr_py38` python environment.
+    annotator.annotate()
 
-To prepare your annotations, you must specify a ``userID``, ``annotation_tasks_file`` (i.e. the ``annotation_task.yaml``), tell MapReader which ``task`` you'd like to run and which  ``annotation_set`` you would like to run on.
+Patch size
+~~~~~~~~~~
 
-.. todo:: Give big list of different options here
-.. todo:: Explain that things don't autosave
+By default, your patches will be shown to you at their original size in pixels.
+This can make annotating difficult if your patches are very small.
+To resize your patches when viewing them in the annotation interface, you can pass the ``resize_to`` keyword argument when initializing the ``Annotator`` instance or when calling the ``annotate()`` method.
 
-e.g. following our 'rail_space' example from earlier:
+e.g. to resize your patches so that their largest edge is 300 pixels:
 
 .. code-block:: python
 
-  #EXAMPLE
-    from mapreader.annotate.utils import prepare_annotation
-
-    annotation = prepare_annotation(
-        userID="rosie",
-        annotation_tasks_file="annotation_tasks.yaml",
-        task="rail_space",
-        annotation_set="set_001",
+    # EXAMPLE
+    annotator = Annotator(
+        patch_df="./patch_df.csv",
+        parent_df="./parent_df.csv",
+        annotations_dir="./annotations",
+        task_name="railspace",
+        labels=["no_rail_space", "rail_space"],
+        username="rosie",
+        resize_to=300,
     )
 
-You can then interactively annotate a sample of your images using:
+Or, equivalently:
 
 .. code-block:: python
 
-    annotation
+    annotator.annotate(resize_to=300)
+
+.. note:: Passing the ``resize_to`` argument when calling the ``annotate()`` method overrides the ``resize_to`` argument passed when initializing the ``Annotator`` instance.
 
-.. image:: ../figures/annotate.png
-	:width: 400px
+Context
+~~~~~~~
 
-To help with annotating, you can set the annotation interface to show a context image using ``context_image=True``.
-This creates a second panel in the annotation interface, showing your patch in the context of a larger region whose size, in pixels, is set by ``xoffset`` and ``yoffset``.
+As well as resizing your patches, you can also set the annotation interface to show a context image using ``show_context=True``.
+This creates a panel of patches in the annotation interface, highlighting your patch in the middle of its immediately surrounding patches.
+As above, you can pass the ``show_context`` argument either when initializing the ``Annotator`` instance or when calling the ``annotate()`` method.
 
 e.g. :
 
 .. code-block:: python
 
-	#EXAMPLE
-    annotation=prepare_annotation(
-        userID="rosie",
-        annotation_tasks_file="annotation_tasks.yaml",
-        task="rail_space",
-        annotation_set="set_001",
-        context_image=True,
-        xoffset=100,
-        yoffset=100)
+    # EXAMPLE
+    annotator = Annotator(
+        patch_df="./patch_df.csv",
+        parent_df="./parent_df.csv",
+        annotations_dir="./annotations",
+        task_name="railspace",
+        labels=["no_rail_space", "rail_space"],
+        username="rosie",
+        show_context=True,
+    )
+
+    annotator.annotate()
+
+Or, equivalently:
+
+.. code-block:: python
 
-    annotation
+    annotator.annotate(show_context=True)
 
-.. image:: ../figures/annotate_context.png
-	:width: 400px
+.. note:: Passing the ``show_context`` argument when calling the ``annotate()`` method overrides the ``show_context`` argument passed when initializing the ``Annotator`` instance.
+
+By default, your ``Annotator`` will show one surrounding patch in the context image.
+You can change this by passing the ``surrounding`` argument when initializing the ``Annotator`` instance and/or when calling the ``annotate`` method.
+
+e.g. to show two surrounding patches in the context image:
+
+.. code-block:: python
 
-By default, your patches will be shown to you in a random order but, to help with annotating, can be sorted by their mean pixel intensities using ``sortby="mean"``.
+    annotator.annotate(show_context=True, surrounding=2)
 
-You can also specify ``min_mean_pixel`` and ``max_mean_pixel`` to limit the range of mean pixel intensities shown to you and ``min_std_pixel`` and ``max_std_pixel`` to limit the range of standard deviations within the mean pixel intensities shown to you.
-This is particularly useful if your images (e.g. maps) have collars or margins that you would like to avoid.
+Sort order
+~~~~~~~~~~
 
+By default, your patches will be shown to you in a random order but, to help with annotating, they can be sorted using the ``sortby`` argument.
+This argument takes the name of a column in your patch DataFrame and sorts the patches by the values in that column.
 e.g. :
 
 .. code-block:: python
 
-    annotation=prepare_annotation(
-        userID="rosie",
-        annotation_tasks_file="annotation_tasks.yaml",
-        task="rail_space",
-        annotation_set="set_001",
-        context_image=True,
-        xoffset=100,
-        yoffset=100,
-        min_mean_pixel=0.5,
-        max_mean_pixel=0.9
+    # EXAMPLE
+    annotator = Annotator(
+        patch_df="./patch_df.csv",
+        parent_df="./parent_df.csv",
+        annotations_dir="./annotations",
+        task_name="railspace",
+        labels=["no_rail_space", "rail_space"],
+        username="rosie",
+        sortby="mean_pixel_R",
     )
 
-    annotation
+This will sort your patches by the mean red pixel intensity in each patch, in ascending order by default.
+This is particularly useful if your images (e.g. maps) have collars, margins or blank regions that you would like to avoid.
+
+.. note:: If you would like to sort in descending order, you can also pass ``ascending=False``.
+
+You can also specify ``min_values`` and ``max_values`` to limit the range of values shown to you.
+e.g. to sort your patches by the mean red pixel intensity in each patch but only show patches with a mean blue pixel intensity between 0.5 and 0.9:
+
+.. code-block:: python
+
+    # EXAMPLE
+    annotator = Annotator(
+        patch_df="./patch_df.csv",
+        parent_df="./parent_df.csv",
+        annotations_dir="./annotations",
+        task_name="railspace",
+        labels=["no_rail_space", "rail_space"],
+        username="rosie",
+        sortby="mean_pixel_R",
+        min_values={"mean_pixel_B": 0.5},
+        max_values={"mean_pixel_B": 0.9},
+    )
 
 .. _Save_annotations:
 
 Save your annotations
 ----------------------
 
-Once you have annotated your images, you should save your annotations using:
+Your annotations are automatically saved as a ``csv`` file as you progress through the annotation task (unless you set the ``auto_save`` keyword argument to ``False`` when setting up the ``Annotator`` instance).
 
-.. code-block:: python
+If you need to know the name of the annotations file, you may refer to a property on your ``Annotator`` instance:
 
-	  #EXAMPLE
-    from mapreader.annotate.utils import save_annotation
+.. code-block:: python
 
-    save_annotation(
-        annotation,
-        userID="rosie",
-        task="rail_space",
-        annotation_tasks_file="annotation_tasks.yaml",
-        annotation_set="set_001",
-    )
+    annotator.annotations_file
 
-This saves your annotations as a ``csv`` file in the ``annot_dir`` specified in your annotation tasks file.
+The file will be located in the ``annotations_dir`` that you passed as a keyword argument when you set up the ``Annotator`` instance.
+If you didn't provide this keyword argument, the file will be in the ``./annotations`` directory.
 
 For example, if you have downloaded your maps using the default settings of our ``Download`` subpackage or have set up your directory as recommended in our `Input Guidance <https://mapreader.readthedocs.io/en/latest/Input-guidance.html>`__, and then saved your patches using the default settings:
 
@@ -185,5 +220,5 @@ For example, if you have downloaded your maps using the default settings of our
     │   ├── patch-100-200-#map1.png#.png
     │   ├── patch-200-300-#map1.png#.png
     │   └── ...
-    └──annotations_one_inch
-	    └──rail_space_#rosie#.csv
+    └──annotations
+	    └──rail_space_#rosie#-123hjkfr298jIUHfs808da.csv
diff --git a/mapreader/__init__.py b/mapreader/__init__.py
@@ -15,6 +15,8 @@
 
 from mapreader.process import process
 
+from mapreader.annotate.annotator import Annotator
+
 from . import _version
 
 __version__ = _version.get_versions()["version"]
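
Note on the `Annotate.rst` changes above: the new docs describe `sortby`, `ascending`, `min_values` and `max_values` in prose only. The sketch below is a conceptual, standard-library illustration of the filter-then-sort behaviour those arguments describe. It is not MapReader's implementation: `select_patches` and the sample rows are hypothetical, and it assumes the bounds are inclusive, which the docs do not state.

```python
# Conceptual sketch only: NOT MapReader's implementation.
# `select_patches` and the sample rows are hypothetical illustrations.


def select_patches(patches, sortby=None, ascending=True,
                   min_values=None, max_values=None):
    """Keep rows within per-column bounds, then sort by one column."""
    min_values = min_values or {}
    max_values = max_values or {}
    # Assumption: bounds are inclusive on both ends.
    kept = [
        row for row in patches
        if all(row[col] >= low for col, low in min_values.items())
        and all(row[col] <= high for col, high in max_values.items())
    ]
    if sortby is not None:
        kept.sort(key=lambda row: row[sortby], reverse=not ascending)
    return kept


# Hypothetical patch records with mean pixel intensities in [0, 1].
patches = [
    {"image_id": "patch-a", "mean_pixel_R": 0.80, "mean_pixel_B": 0.95},
    {"image_id": "patch-b", "mean_pixel_R": 0.60, "mean_pixel_B": 0.70},
    {"image_id": "patch-c", "mean_pixel_R": 0.30, "mean_pixel_B": 0.55},
    {"image_id": "patch-d", "mean_pixel_R": 0.10, "mean_pixel_B": 0.20},
]

# Mirrors the documented example: sort by red, keep blue in [0.5, 0.9].
selected = select_patches(
    patches,
    sortby="mean_pixel_R",
    min_values={"mean_pixel_B": 0.5},
    max_values={"mean_pixel_B": 0.9},
)
selected_ids = [row["image_id"] for row in selected]
print(selected_ids)
```

In this sketch the sort column and the filter columns are independent, which matches the documented example of sorting by `mean_pixel_R` while bounding `mean_pixel_B`.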