diff --git a/docs/monitoring.md b/docs/monitoring.md
new file mode 100644
index 00000000..991a4df5
--- /dev/null
+++ b/docs/monitoring.md
@@ -0,0 +1,69 @@
## Monitoring the Upjet Runtime

The [Kubernetes controller-runtime] library provides a Prometheus metrics endpoint by default. The Upjet-based providers,
including [upbound/provider-aws], [upbound/provider-azure], [upbound/provider-azuread] and [upbound/provider-gcp],
expose [various metrics](https://book-v1.book.kubebuilder.io/beyond_basics/controller_metrics.html) from the controller-runtime
to help monitor the health of its runtime components, such as the [`controller-runtime` client],
the [leader election client] and the [controller workqueues]. In addition to these, each controller
[exposes](https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/internal/controller/metrics/metrics.go#L25)
metrics on the reconciliation of custom resources and on the active reconciliation worker goroutines.

Beyond the metrics exposed by the `controller-runtime`, the Upjet-based providers also expose metrics specific to
the Upjet runtime. The Upjet runtime registers these custom metrics using the [available extension mechanism](https://book.kubebuilder.io/reference/metrics.html#publishing-additional-metrics),
and they are served from the default `/metrics` endpoint of the provider pod. The custom metrics exposed by the
Upjet runtime are:

- `upjet_terraform_cli_duration`: A histogram reporting, in seconds, how long Terraform CLI invocations take to complete.
- `upjet_terraform_active_cli_invocations`: A gauge reporting the number of active (running) Terraform CLI invocations.
- `upjet_terraform_running_processes`: A gauge reporting the number of running Terraform CLI and Terraform provider processes.
- `upjet_resource_ttr`: A histogram measuring, in seconds, the time-to-readiness (TTR) for managed resources.

Prometheus metrics can have [labels] associated with them to differentiate the characteristics of the measurements being
made, such as distinguishing the CLI processes from the Terraform provider processes when counting the number of
active Terraform processes. The labels associated with each of the above custom Upjet metrics are:

- Labels associated with the `upjet_terraform_cli_duration` metric:
  - `subcommand`: The `terraform` subcommand that was run, e.g., `init`, `apply`, `plan`, `destroy`, etc.
  - `mode`: The execution mode of the Terraform CLI, one of `sync` (the CLI was invoked synchronously as part of a
    reconcile loop) or `async` (the CLI was invoked asynchronously; the reconciler goroutine will poll and collect
    the results later).
- Labels associated with the `upjet_terraform_active_cli_invocations` metric:
  - `subcommand` and `mode`, with the same meanings as for `upjet_terraform_cli_duration` above.
- Labels associated with the `upjet_terraform_running_processes` metric:
  - `type`: Either `cli` for the Terraform CLI (the `terraform` process) or `provider` for the Terraform provider processes.
    Note that this is a best-effort metric that may not be able to precisely catch and report all relevant processes.
    We may improve it in the future if needed, for example by watching `fork` system calls, but even now it can be
    useful for spotting rogue Terraform provider processes.
- Labels associated with the `upjet_resource_ttr` metric:
  - `group`, `version` and `kind`: Record the [API group, version and kind](https://kubernetes.io/docs/reference/using-api/api-concepts/)
    of the managed resource whose [time-to-readiness](https://github.com/crossplane/terrajet/issues/55#issuecomment-929494212)
    measurement is captured.

You can [export](https://book.kubebuilder.io/reference/metrics.html) all these custom metrics, together with
the `controller-runtime` metrics, from the provider pod for Prometheus. Here are some examples showing the custom
metrics in action in the Prometheus console (a hedged sketch of PromQL expressions for similar views is given below):

- The `upjet_terraform_active_cli_invocations` gauge metric showing the sync & async `terraform init/apply/plan/destroy` invocations:
  *(screenshot)*

- The `upjet_terraform_running_processes` gauge metric showing both the `cli` and the `provider` labels:
  *(screenshot)*

- The `upjet_terraform_cli_duration` histogram metric showing the average Terraform CLI running times for the last 5m:
  *(screenshot)*

- The medians (0.5-quantiles) of these observations aggregated by the mode and the Terraform subcommand being invoked:
  *(screenshot)*

- The `upjet_resource_ttr` histogram metric showing the average resource TTR for the last 10m:
  *(screenshot)*

- The median (0.5-quantile) of these TTR observations:
  *(screenshot)*

These samples were collected by provisioning 10 [upbound/provider-aws] `cognitoidp.UserPool` resources while running the
provider with a poll interval of 1m. One can observe that the resources were polled (reconciled) twice after they
acquired the `Ready=True` condition, and were then destroyed.

You can find a full reference of the metrics exposed by the Upjet-based providers [here](provider_metrics_help.txt).
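As a rough guide to reproducing views like the screenshots above, here is a minimal PromQL sketch. The exact queries
behind the screenshots are not recorded in this document; these expressions only assume the standard Prometheus
histogram conventions, under which a histogram metric `m` is exposed as the `m_bucket`, `m_sum` and `m_count` series.
Each expression is meant to be evaluated on its own in the Prometheus expression browser:

```promql
# Average Terraform CLI running time over the last 5m, per execution mode and subcommand:
  sum by (mode, subcommand) (rate(upjet_terraform_cli_duration_sum[5m]))
/ sum by (mode, subcommand) (rate(upjet_terraform_cli_duration_count[5m]))

# Median (0.5-quantile) of the CLI durations, aggregated by mode and subcommand:
histogram_quantile(0.5, sum by (mode, subcommand, le) (rate(upjet_terraform_cli_duration_bucket[5m])))

# Average resource time-to-readiness over the last 10m:
rate(upjet_resource_ttr_sum[10m]) / rate(upjet_resource_ttr_count[10m])

# Median (0.5-quantile) of the TTR observations:
histogram_quantile(0.5, sum by (le) (rate(upjet_resource_ttr_bucket[10m])))
```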
[Kubernetes controller-runtime]: https://github.com/kubernetes-sigs/controller-runtime
[upbound/provider-aws]: https://github.com/upbound/provider-aws
[upbound/provider-azure]: https://github.com/upbound/provider-azure
[upbound/provider-azuread]: https://github.com/upbound/provider-azuread
[upbound/provider-gcp]: https://github.com/upbound/provider-gcp
[`controller-runtime` client]: https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/metrics/client_go_adapter.go#L40
[leader election client]: https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/metrics/leaderelection.go#L12
[controller workqueues]: https://github.com/kubernetes-sigs/controller-runtime/blob/60af59f5b22335516850ca11c974c8f614d5d073/pkg/metrics/workqueue.go#L40
[labels]: https://prometheus.io/docs/practices/naming/#labels

diff --git a/docs/provider_metrics_help.txt b/docs/provider_metrics_help.txt
new file mode 100644
index 00000000..638a829c
--- /dev/null
+++ b/docs/provider_metrics_help.txt
@@ -0,0 +1,147 @@
# HELP upjet_terraform_cli_duration Measures in seconds how long it takes a Terraform CLI invocation to complete
# TYPE upjet_terraform_cli_duration histogram

# HELP upjet_terraform_running_processes The number of running Terraform CLI and Terraform provider processes
# TYPE upjet_terraform_running_processes gauge

# HELP upjet_resource_ttr Measures in seconds the time-to-readiness (TTR) for managed resources
# TYPE upjet_resource_ttr histogram

# HELP upjet_terraform_active_cli_invocations The number of active (running) Terraform CLI invocations
# TYPE upjet_terraform_active_cli_invocations gauge

# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter

# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter

# HELP controller_runtime_active_workers Number of currently used workers per controller
# TYPE controller_runtime_active_workers gauge

# HELP controller_runtime_max_concurrent_reconciles Maximum number of concurrent reconciles per controller
# TYPE controller_runtime_max_concurrent_reconciles gauge

# HELP controller_runtime_reconcile_errors_total Total number of reconciliation errors per controller
# TYPE controller_runtime_reconcile_errors_total counter

# HELP controller_runtime_reconcile_time_seconds Length of time per reconciliation per controller
# TYPE controller_runtime_reconcile_time_seconds histogram

# HELP controller_runtime_reconcile_total Total number of reconciliations per controller
# TYPE controller_runtime_reconcile_total counter

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary

# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge

# HELP go_info Information about the Go environment.
# TYPE go_info gauge

# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge

# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter

# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge

# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter

# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge

# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge

# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge

# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge

# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge

# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge

# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge

# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge

# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter

# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter

# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge

# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge

# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge

# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge

# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge

# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge

# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge

# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge

# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge

# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge

# HELP rest_client_request_duration_seconds Request latency in seconds. Broken down by verb, and host.
# TYPE rest_client_request_duration_seconds histogram

# HELP rest_client_request_size_bytes Request size in bytes. Broken down by verb and host.
# TYPE rest_client_request_size_bytes histogram

# HELP rest_client_requests_total Number of HTTP requests, partitioned by status code, method, and host.
# TYPE rest_client_requests_total counter

# HELP rest_client_response_size_bytes Response size in bytes. Broken down by verb and host.
# TYPE rest_client_response_size_bytes histogram

# HELP workqueue_adds_total Total number of adds handled by workqueue
# TYPE workqueue_adds_total counter

# HELP workqueue_depth Current depth of workqueue
# TYPE workqueue_depth gauge

# HELP workqueue_longest_running_processor_seconds How many seconds has the longest running processor for workqueue been running.
# TYPE workqueue_longest_running_processor_seconds gauge

# HELP workqueue_queue_duration_seconds How long in seconds an item stays in workqueue before being requested
# TYPE workqueue_queue_duration_seconds histogram

# HELP workqueue_retries_total Total number of retries handled by workqueue
# TYPE workqueue_retries_total counter

# HELP workqueue_unfinished_work_seconds How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
# TYPE workqueue_unfinished_work_seconds gauge

# HELP workqueue_work_duration_seconds How long in seconds processing an item from workqueue takes.
# TYPE workqueue_work_duration_seconds histogram