diff --git a/README.md b/README.md
index 41eca97..9ab090e 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ up MEDS, we will define some key terminology that we use in this standard.
    a hospital admission rather than an individual.
 2. A _code_ is the categorical descriptor of what is being observed in any given observation of a patient.
    In particular, in almost all structured, longitudinal datasets, a measurement can be described as
-   consisting of a tuple containing a `patient_id` (who this measurement is about); a `timestamp` (when this
+   consisting of a tuple containing a `patient_id` (who this measurement is about); a `time` (when this
    measurement happened); some categorical qualifier describing what was measured, which we will call a
    `code`; a value of a given type, such as a `numeric_value`, a `text_value`, or a `categorical_value`; and
    possibly one or more additional measurement properties that describe the measurement in a
@@ -28,7 +28,7 @@ MEDS consists of four main data components/schemas:
 1. A _data schema_. This schema describes the underlying medical data, organized as sequences of patient
    observations, in the dataset.
 2. A _patient subsequence label schema_. This schema describes labels that may be predicted about a patient
-   at a given timestamp in the patient record.
+   at a given time in the patient record.
 3. A _code metadata schema_. This schema contains metadata describing the codes used to categorize the
    observed measurements in the dataset.
 4. A _dataset metadata schema_. This schema contains metadata about the MEDS dataset itself, such as when it
@@ -43,7 +43,7 @@ found in the following subfolders:
   series of possibly nested sharded dataframes stored in `parquet` files. In particular, the file glob
   `glob("$MEDS_ROOT/data/**/*.parquet")` will capture all sharded data files of the raw MEDS data, all
   organized into _data schema_ files, sharded by patient and sorted, for each patient, by
-  timestamp.
+  time.
 - `$MEDS_ROOT/metadata/codes.parquet`: This file contains per-code metadata in the _code metadata schema_
   about the MEDS dataset. As this dataset describes all codes observed in the full MEDS dataset, it is
   _not_ sharded. Note that some pre-processing operations may, at times, produce sharded code metadata files, but
@@ -69,67 +69,135 @@ does not exist.
 
 ### Schemas
 
-**TODO**: copy here from the schema file and describe.
-
+#### The data schema
+
+MEDS data must satisfy two important properties:
+
+ 1. Data about a single patient cannot be split across parquet files. If a patient is in a dataset, it must
+    be in one and only one parquet file.
+ 2. Data about a single patient must be contiguous within a particular parquet file and sorted by time.
+
+The data schema has four mandatory fields:
+
+ 1. `patient_id`: The ID of the patient this event is about.
+ 2. `time`: The time of the event. This field is nullable for static events.
+ 3. `code`: The code of the event.
+ 4. `numeric_value`: The numeric value of the event. This field is nullable for non-numeric events.
+
+In addition, the data schema can contain any number of custom properties to further enrich observations. The
+Python function below generates a pyarrow schema for a given set of custom properties.
+
+```python
+import pyarrow as pa
+
+def data_schema(custom_properties=[]):
+    return pa.schema(
+        [
+            ("patient_id", pa.int64()),
+            ("time", pa.timestamp("us")),  # Static events will have a null timestamp
+            ("code", pa.string()),
+            ("numeric_value", pa.float32()),
+        ] + custom_properties
+    )
+```
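+
+For instance, one could build and write a tiny conformant shard as follows (a minimal sketch, not part of
+the specification; the example codes and the output file name are illustrative only):
+
+```python
+import datetime
+
+import pyarrow.parquet as pq
+
+# One patient's rows: a static observation (null time) followed by a
+# time-stamped numeric measurement, sorted by time within the patient.
+table = pa.Table.from_pylist(
+    [
+        {"patient_id": 123, "time": None, "code": "Gender/F", "numeric_value": None},
+        {
+            "patient_id": 123,
+            "time": datetime.datetime(2020, 1, 1, 12, 0),
+            "code": "HEART_RATE",
+            "numeric_value": 88.0,
+        },
+    ],
+    schema=data_schema(),
+)
+pq.write_table(table, "data/shard_0.parquet")
+```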
+#### The label schema
+
+This schema describes a label that may be predicted about a patient at a given prediction time. Models, when
+predicting this label, are allowed to use all data about a patient up to and including the prediction time.
+Exclusive prediction times are not currently supported, but if you have a use case for them, please open a
+GitHub issue.
+
+```python
+label = pa.schema(
+    [
+        ("patient_id", pa.int64()),
+        ("prediction_time", pa.timestamp("us")),
+        ("boolean_value", pa.bool_()),
+        ("integer_value", pa.int64()),
+        ("float_value", pa.float64()),
+        ("categorical_value", pa.string()),
+    ]
+)
+
+Label = TypedDict("Label", {
+    "patient_id": int,
+    "prediction_time": datetime.datetime,
+    "boolean_value": Optional[bool],
+    "integer_value": Optional[int],
+    "float_value": Optional[float],
+    "categorical_value": Optional[str],
+}, total=False)
+```
+
+#### The patient split schema
+
+Three sentinel split names are defined for convenience and shared processing:
+
+ 1. A training split, named `train`, used for ML model training.
+ 2. A tuning split, named `tuning`, used for hyperparameter tuning. This is sometimes also called a
+    "validation" split or a "dev" split. In many cases, standardizing on a tuning split is not necessary,
+    and users should feel free to merge this split with the training split if desired.
+ 3. A held-out split, named `held_out`, used for final model evaluation. In many cases, this is also called
+    a "test" split. When performing benchmarking, this split should not be used for model selection or
+    training, nor for any other purpose prior to final evaluation.
+
+Additional split names can be used by the user as desired.
+
+```python
+train_split = "train"
+tuning_split = "tuning"
+held_out_split = "held_out"
+
+patient_split = pa.schema(
+    [
+        ("patient_id", pa.int64()),
+        ("split", pa.string()),
+    ]
+)
+
+PatientSplit = TypedDict("PatientSplit", {
+    "patient_id": int,
+    "split": str,
+}, total=True)
+```
-## Old -- to be deleted.
-
-The core of the standard is that we define a ``patient`` data structure that contains a series of time stamped events, that in turn contain measurements of various sorts.
-
-The Python type signature for the schema is as follows:
-
-```python
-
-Patient = TypedDict('Patient', {
-    'patient_id': int,
-    'events': List[Event],
-})
-
-Event = TypedDict('Event',{
-    'time': NotRequired[datetime.datetime], # Static events will have a null timestamp here
-    'code': str,
-    'text_value': NotRequired[str],
-    'numeric_value': NotRequired[float],
-    'datetime_value': NotRequired[datetime.datetime],
-    'metadata': NotRequired[Mapping[str, Any]],
-})
-```
-
-We also provide ETLs to convert common data formats to this schema: https://github.com/Medical-Event-Data-Standard/meds_etl
-
-An example patient following this schema
-
-```python
-
-patient_data = {
-    "patient_id": 123,
-    "events": [
-        # Store static events like gender with a null timestamp
-        {
-            "time": None,
-            "code": "Gender/F",
-        },
-        # It's recommended to record birth using the birth_code
-        {
-            "time": datetime.datetime(1995, 8, 20),
-            "code": meds.birth_code,
-        },
-        # Arbitrary events with sophisticated data can also be added
-        {
-            "time": datetime.datetime(2020, 1, 1, 12, 0, 0),
-            "code": "some_code",
-            "text_value": "Example",
-            "numeric_value": 10.0,
-            "datetime_value": datetime.datetime(2020, 1, 1, 12, 0, 0),
-            "properties": None
-        },
-    ]
-}
-```
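+
+As an illustration, the split assignments can be loaded and used to select training patients (a sketch
+only; the `metadata/patient_split.parquet` location is an assumption, not something this README fixes):
+
+```python
+import pyarrow.parquet as pq
+
+# Hypothetical location of the split file within $MEDS_ROOT.
+split_table = pq.read_table("metadata/patient_split.parquet")
+train_ids = {
+    row["patient_id"] for row in split_table.to_pylist() if row["split"] == train_split
+}
+```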
+#### The dataset metadata schema
+
+Unlike the other schemas, which are pyarrow schemas, the dataset metadata is specified as a JSON schema,
+with a matching Python type given below it.
+
+```python
+dataset_metadata = {
+    "type": "object",
+    "properties": {
+        "dataset_name": {"type": "string"},
+        "dataset_version": {"type": "string"},
+        "etl_name": {"type": "string"},
+        "etl_version": {"type": "string"},
+        "meds_version": {"type": "string"},
+    },
+}
+
+# Python type for the above schema
+
+DatasetMetadata = TypedDict(
+    "DatasetMetadata",
+    {
+        "dataset_name": NotRequired[str],
+        "dataset_version": NotRequired[str],
+        "etl_name": NotRequired[str],
+        "etl_version": NotRequired[str],
+        "meds_version": NotRequired[str],
+    },
+    total=False,
+)
+```
+
+#### The code metadata schema
+
+Like the data schema, the code metadata schema can be extended with custom per-code properties.
+
+```python
+def code_metadata_schema(custom_per_code_properties=[]):
+    code_metadata = pa.schema(
+        [
+            ("code", pa.string()),
+            ("description", pa.string()),
+            ("parent_codes", pa.list_(pa.string())),
+        ] + custom_per_code_properties
+    )
+
+    return code_metadata
+
+# Python type for the above schema
+CodeMetadata = TypedDict("CodeMetadata", {"code": str, "description": str, "parent_codes": List[str]}, total=False)
+```
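+
+As an illustration, per-code descriptions and parent codes can be looked up from the (unsharded)
+`$MEDS_ROOT/metadata/codes.parquet` file described above (a minimal sketch, run from `$MEDS_ROOT`):
+
+```python
+import pyarrow.parquet as pq
+
+codes = pq.read_table("metadata/codes.parquet").to_pylist()
+descriptions = {row["code"]: row["description"] for row in codes}
+parent_codes = {row["code"]: row["parent_codes"] for row in codes}
+```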
diff --git a/src/meds/schema.py b/src/meds/schema.py
index fff43c7..4a91f0d 100644
--- a/src/meds/schema.py
+++ b/src/meds/schema.py
@@ -1,38 +1,23 @@
+"""The core schemas for the MEDS format.
+
+Please see the README for more information, including expected file organization on disk, more details on
+what each schema should capture, etc.
+"""
 import datetime
 from typing import Any, List, Mapping, Optional
 
 import pyarrow as pa
 from typing_extensions import NotRequired, TypedDict
 
-# Medical Event Data Standard consists of four main components:
-# 1. A patient event schema
-# 2. A label schema
-# 3. A dataset metadata schema.
-# 4. A code metadata schema.
-#
-# Event data, labels, and code metadata is specified using pyarrow. Dataset metadata is specified using JSON.
-
-# We also specify a directory structure for how these should be laid out on disk.
-
-# Every MEDS extract consists of a folder that contains both metadata and patient data with the following structure:
-# - data/
-#   A (possibly nested) folder containing multiple parquet files containing patient event data following the events_schema folder.
-#   glob("data/**/*.parquet") is the recommended way for obtaining all patient event files.
-# - dataset_metadata.json
-#   Dataset level metadata containing information about the ETL used, data version, etc
-# - (Optional) code_metadata.parquet
-#   Code level metadata containing information about the code descriptions, standard mappings, etc
-# - (Optional) patient_split.csv
-#   A specification of patient splits that should be used.
 
 ############################################################
 
-# The patient event data schema.
+# The data schema.
 #
-# Patient event data also must satisfy two important properties:
+# MEDS data also must satisfy two important properties:
 #
-# 1. Patient event data cannot be split across parquet files. If a patient is in a dataset it must be in one and only one parquet file.
-# 2. Patient event data must be contiguous within a particular parquet file and sorted by event time.
+# 1. Data about a single patient cannot be split across parquet files. If a patient is in a dataset, it must be in one and only one parquet file.
+# 2. Data about a single patient must be contiguous within a particular parquet file and sorted by time.
 
 # Both of these restrictions allow for rolling-window stream processing (see https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.rolling.html),
 # which vastly simplifies many data analysis pipelines.
 
 birth_code = "MEDS_BIRTH"
 death_code = "MEDS_DEATH"
 
-def patient_events_schema(custom_per_event_properties=[]):
+def data_schema(custom_properties=[]):
     return pa.schema(
         [
             ("patient_id", pa.int64()),
             ("time", pa.timestamp("us")),  # Static events will have a null timestamp
             ("code", pa.string()),
             ("numeric_value", pa.float32()),
-        ] + custom_per_event_properties
+        ] + custom_properties
     )
 
 # No Python type is provided because Python tools for processing MEDS data will often provide their own types.
 
@@ -56,7 +41,9 @@ def patient_events_schema(custom_per_event_properties=[]):
 
 ############################################################
 
-# The label schema.
+# The label schema. Models, when predicting this label, are allowed to use all data about a patient up to
+# and including the prediction time. Exclusive prediction times are not currently supported, but if you
+# have a use case for them, please open a GitHub issue.
 
 label = pa.schema(
     [
@@ -85,9 +72,9 @@ def patient_events_schema(custom_per_event_properties=[]):
 
 # The patient split schema.
 
-train_split = "train"
-tuning_split = "tuning"
-test_split = "test"
+train_split = "train"  # For ML training.
+tuning_split = "tuning"  # For ML hyperparameter tuning. Also often called "validation" or "dev".
+held_out_split = "held_out"  # For final ML evaluation. Also often called "test".
 
 patient_split = pa.schema(
     [
@@ -105,7 +92,6 @@ def patient_events_schema(custom_per_event_properties=[]):
 
 # The dataset metadata schema.
 # This is a JSON schema.
-# This data should be stored in dataset_metadata.json within the dataset folder.
 
 
 dataset_metadata = {
 
@@ -137,7 +123,6 @@ def patient_events_schema(custom_per_event_properties=[]):
 
 # The code metadata schema.
 # This is a parquet schema.
-# This data should be stored in code_metadata.parquet within the dataset folder.
 
 
 def code_metadata_schema(custom_per_code_properties=[]):
     code_metadata = pa.schema(