diff --git a/docs/monitoring.md b/docs/monitoring.md
new file mode 100644
index 00000000..314d915f
--- /dev/null
+++ b/docs/monitoring.md
@@ -0,0 +1,116 @@
+## Monitoring the Upjet Runtime
+The [Kubernetes controller-runtime] library provides a Prometheus metrics
+endpoint by default. Upjet-based providers, including [upbound/provider-aws],
+[upbound/provider-azure], [upbound/provider-azuread] and [upbound/provider-gcp],
+expose [various
+metrics](https://book.kubebuilder.io/reference/metrics-reference.html)
+from the controller-runtime to help monitor the health of its runtime
+components, such as the [`controller-runtime` client], the [leader election
+client] and the [controller workqueues]. In addition to these metrics, each
+controller also
+[exposes](https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/internal/controller/metrics/metrics.go#L25)
+metrics related to the reconciliation of custom resources and the number of
+active reconciliation worker goroutines.
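+
+For example, once the provider's `/metrics` endpoint is scraped by Prometheus,
+a query along the following lines (a sketch, not something Upjet prescribes)
+charts the per-controller reconciliation rate using the `controller` label on
+these series:
+
+```
+sum by (controller) (rate(controller_runtime_reconcile_total[5m]))
+```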
+
+In addition to the metrics exposed by the `controller-runtime`, Upjet-based
+providers also expose metrics specific to the Upjet runtime. The Upjet runtime
+registers these custom metrics using the [available extension
+mechanism](https://book.kubebuilder.io/reference/metrics.html#publishing-additional-metrics),
+and they are served from the default `/metrics` endpoint of the provider pod.
+The custom metrics exposed by the Upjet runtime are:
+- `upjet_terraform_cli_duration`: A histogram that reports, in seconds, how
+  long it takes a Terraform CLI invocation to complete.
+- `upjet_terraform_active_cli_invocations`: A gauge that reports the number of
+  active (running) Terraform CLI invocations.
+- `upjet_terraform_running_processes`: A gauge that reports the number of
+  running Terraform CLI and Terraform provider processes.
+- `upjet_resource_ttr`: A histogram that measures, in seconds, the
+  time-to-readiness (TTR) for managed resources.
+
+Prometheus metrics can have [labels] associated with them to differentiate the
+characteristics of the measurements being made, for example to distinguish the
+Terraform CLI processes from the Terraform provider processes when counting
+the number of running Terraform processes. Here are the labels associated with
+each of the custom Upjet metrics above:
+- Labels associated with the `upjet_terraform_cli_duration` metric:
+  - `subcommand`: The `terraform` subcommand that was run, e.g., `init`,
+    `apply`, `plan`, `destroy`, etc.
+  - `mode`: The execution mode of the Terraform CLI, either `sync` (the CLI
+    was invoked synchronously as part of a reconcile loop) or `async` (the
+    CLI was invoked asynchronously, and the reconciler goroutine will poll
+    and collect the results later).
+- Labels associated with the `upjet_terraform_active_cli_invocations` metric:
+  - `subcommand` and `mode`: These have the same semantics as the
+    corresponding `upjet_terraform_cli_duration` labels described above.
+- Labels associated with the `upjet_terraform_running_processes` metric:
+  - `type`: Either `cli` for Terraform CLI (the `terraform` binary) processes
+    or `provider` for Terraform provider processes. Please note that this is
+    a best-effort metric that may not precisely catch and report all relevant
+    processes. We may improve this in the future if needed, for example by
+    watching `fork` system calls, but it can already be useful for spotting
+    rogue Terraform provider processes.
+- Labels associated with the `upjet_resource_ttr` metric:
+  - `group`, `version`, `kind`: These labels record the [API group, version
+    and kind](https://kubernetes.io/docs/reference/using-api/api-concepts/)
+    of the managed resource whose
+    [time-to-readiness](https://github.com/crossplane/terrajet/issues/55#issuecomment-929494212)
+    measurement is captured.
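+
+As a concrete illustration of these labels, the following PromQL selectors (a
+sketch using only the label names documented above) pick out the Terraform
+provider processes and the asynchronous `apply` invocations, respectively:
+
+```
+upjet_terraform_running_processes{type="provider"}
+upjet_terraform_active_cli_invocations{mode="async",subcommand="apply"}
+```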
+
+## Examples
+You can [export](https://book.kubebuilder.io/reference/metrics.html) all of
+these custom metrics, together with the `controller-runtime` metrics, from the
+provider pod to Prometheus. Here are some examples showing the custom metrics
+in action in the Prometheus console:
+
+- `upjet_terraform_active_cli_invocations` gauge metric showing the sync & async
+ `terraform init/apply/plan/destroy` invocations:
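+
+  One way to produce such a view (a sketch grouping by the documented labels):
+
+  ```
+  sum by (mode, subcommand) (upjet_terraform_active_cli_invocations)
+  ```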
+
+- `upjet_terraform_running_processes` gauge metric showing both `cli` and
+ `provider` labels:
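+
+  For example, grouped by the `type` label:
+
+  ```
+  sum by (type) (upjet_terraform_running_processes)
+  ```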
+
+- `upjet_terraform_cli_duration` histogram metric, showing average Terraform CLI
+ running times for the last 5m:
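+
+  One way to compute this is from the standard Prometheus histogram `_sum` and
+  `_count` series:
+
+  ```
+  sum by (mode, subcommand) (rate(upjet_terraform_cli_duration_sum[5m]))
+    / sum by (mode, subcommand) (rate(upjet_terraform_cli_duration_count[5m]))
+  ```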
+
+- The medians (0.5-quantiles) for these observations, aggregated by the mode
+  and the Terraform subcommand being invoked:
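+
+  For example, via `histogram_quantile` over the histogram buckets (the `le`
+  label must be preserved in the aggregation):
+
+  ```
+  histogram_quantile(0.5,
+    sum by (mode, subcommand, le) (rate(upjet_terraform_cli_duration_bucket[5m])))
+  ```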
+
+- `upjet_resource_ttr` histogram metric, showing average resource TTR for the
+ last 10m:
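+
+  For example, using the histogram's `_sum` and `_count` series:
+
+  ```
+  rate(upjet_resource_ttr_sum[10m]) / rate(upjet_resource_ttr_count[10m])
+  ```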
+
+- The median (0.5-quantile) for these TTR observations:
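+
+  For example, again keeping the `le` label in the aggregation:
+
+  ```
+  histogram_quantile(0.5, sum by (le) (rate(upjet_resource_ttr_bucket[10m])))
+  ```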
+
+These samples were collected by provisioning 10 [upbound/provider-aws]
+`cognitoidp.UserPool` resources while running the provider with a poll interval
+of 1m. In these examples, one can observe that the resources were polled
+(reconciled) twice after they acquired the `Ready=True` condition, and that
+they were then destroyed.
+
+## Reference
+You can find a full reference of the metrics exposed by the Upjet-based
+providers [here](provider_metrics_help.txt).
+
+[Kubernetes controller-runtime]:
+ https://github.com/kubernetes-sigs/controller-runtime
+[upbound/provider-aws]: https://github.com/upbound/provider-aws
+[upbound/provider-azure]: https://github.com/upbound/provider-azure
+[upbound/provider-azuread]: https://github.com/upbound/provider-azuread
+[upbound/provider-gcp]: https://github.com/upbound/provider-gcp
+[`controller-runtime` client]:
+ https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/metrics/client_go_adapter.go#L40
+[leader election client]:
+ https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/metrics/leaderelection.go#L12
+[controller workqueues]:
+ https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/metrics/workqueue.go#L40
+[labels]: https://prometheus.io/docs/practices/naming/#labels
diff --git a/docs/provider_metrics_help.txt b/docs/provider_metrics_help.txt
new file mode 100644
index 00000000..638a829c
--- /dev/null
+++ b/docs/provider_metrics_help.txt
@@ -0,0 +1,147 @@
+# HELP upjet_terraform_cli_duration Measures in seconds how long it takes a Terraform CLI invocation to complete
+# TYPE upjet_terraform_cli_duration histogram
+
+# HELP upjet_terraform_running_processes The number of running Terraform CLI and Terraform provider processes
+# TYPE upjet_terraform_running_processes gauge
+
+# HELP upjet_resource_ttr Measures in seconds the time-to-readiness (TTR) for managed resources
+# TYPE upjet_resource_ttr histogram
+
+# HELP upjet_terraform_active_cli_invocations The number of active (running) Terraform CLI invocations
+# TYPE upjet_terraform_active_cli_invocations gauge
+
+# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
+# TYPE certwatcher_read_certificate_errors_total counter
+
+# HELP certwatcher_read_certificate_total Total number of certificate reads
+# TYPE certwatcher_read_certificate_total counter
+
+# HELP controller_runtime_active_workers Number of currently used workers per controller
+# TYPE controller_runtime_active_workers gauge
+
+# HELP controller_runtime_max_concurrent_reconciles Maximum number of concurrent reconciles per controller
+# TYPE controller_runtime_max_concurrent_reconciles gauge
+
+# HELP controller_runtime_reconcile_errors_total Total number of reconciliation errors per controller
+# TYPE controller_runtime_reconcile_errors_total counter
+
+# HELP controller_runtime_reconcile_time_seconds Length of time per reconciliation per controller
+# TYPE controller_runtime_reconcile_time_seconds histogram
+
+# HELP controller_runtime_reconcile_total Total number of reconciliations per controller
+# TYPE controller_runtime_reconcile_total counter
+
+# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
+# TYPE go_gc_duration_seconds summary
+
+# HELP go_goroutines Number of goroutines that currently exist.
+# TYPE go_goroutines gauge
+
+# HELP go_info Information about the Go environment.
+# TYPE go_info gauge
+
+# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
+# TYPE go_memstats_alloc_bytes gauge
+
+# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
+# TYPE go_memstats_alloc_bytes_total counter
+
+# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
+# TYPE go_memstats_buck_hash_sys_bytes gauge
+
+# HELP go_memstats_frees_total Total number of frees.
+# TYPE go_memstats_frees_total counter
+
+# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
+# TYPE go_memstats_gc_sys_bytes gauge
+
+# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
+# TYPE go_memstats_heap_alloc_bytes gauge
+
+# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
+# TYPE go_memstats_heap_idle_bytes gauge
+
+# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
+# TYPE go_memstats_heap_inuse_bytes gauge
+
+# HELP go_memstats_heap_objects Number of allocated objects.
+# TYPE go_memstats_heap_objects gauge
+
+# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
+# TYPE go_memstats_heap_released_bytes gauge
+
+# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
+# TYPE go_memstats_heap_sys_bytes gauge
+
+# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
+# TYPE go_memstats_last_gc_time_seconds gauge
+
+# HELP go_memstats_lookups_total Total number of pointer lookups.
+# TYPE go_memstats_lookups_total counter
+
+# HELP go_memstats_mallocs_total Total number of mallocs.
+# TYPE go_memstats_mallocs_total counter
+
+# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
+# TYPE go_memstats_mcache_inuse_bytes gauge
+
+# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
+# TYPE go_memstats_mcache_sys_bytes gauge
+
+# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
+# TYPE go_memstats_mspan_inuse_bytes gauge
+
+# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
+# TYPE go_memstats_mspan_sys_bytes gauge
+
+# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
+# TYPE go_memstats_next_gc_bytes gauge
+
+# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
+# TYPE go_memstats_other_sys_bytes gauge
+
+# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
+# TYPE go_memstats_stack_inuse_bytes gauge
+
+# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
+# TYPE go_memstats_stack_sys_bytes gauge
+
+# HELP go_memstats_sys_bytes Number of bytes obtained from system.
+# TYPE go_memstats_sys_bytes gauge
+
+# HELP go_threads Number of OS threads created.
+# TYPE go_threads gauge
+
+# HELP rest_client_request_duration_seconds Request latency in seconds. Broken down by verb, and host.
+# TYPE rest_client_request_duration_seconds histogram
+
+# HELP rest_client_request_size_bytes Request size in bytes. Broken down by verb and host.
+# TYPE rest_client_request_size_bytes histogram
+
+# HELP rest_client_requests_total Number of HTTP requests, partitioned by status code, method, and host.
+# TYPE rest_client_requests_total counter
+
+# HELP rest_client_response_size_bytes Response size in bytes. Broken down by verb and host.
+# TYPE rest_client_response_size_bytes histogram
+
+# HELP workqueue_adds_total Total number of adds handled by workqueue
+# TYPE workqueue_adds_total counter
+
+# HELP workqueue_depth Current depth of workqueue
+# TYPE workqueue_depth gauge
+
+# HELP workqueue_longest_running_processor_seconds How many seconds has the longest running processor for workqueue been running.
+# TYPE workqueue_longest_running_processor_seconds gauge
+
+# HELP workqueue_queue_duration_seconds How long in seconds an item stays in workqueue before being requested
+# TYPE workqueue_queue_duration_seconds histogram
+
+# HELP workqueue_retries_total Total number of retries handled by workqueue
+# TYPE workqueue_retries_total counter
+
+# HELP workqueue_unfinished_work_seconds How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
+# TYPE workqueue_unfinished_work_seconds gauge
+
+# HELP workqueue_work_duration_seconds How long in seconds processing an item from workqueue takes.
+# TYPE workqueue_work_duration_seconds histogram
+