From 99f4384a13595cd4c687e0ff062e75e43798f05b Mon Sep 17 00:00:00 2001
From: Han Xiao
Date: Fri, 25 Oct 2024 18:34:23 +0800
Subject: [PATCH] docs: fix typo

Signed-off-by: Han Xiao
---
 docs/cloud-nativeness/k8s.md                  |  6 +--
 docs/concepts/client/index.md                 |  6 +--
 docs/concepts/client/send-receive-data.md     | 22 +++++------
 docs/concepts/orchestration/add-executors.md  |  2 +-
 docs/concepts/orchestration/deployment.md     |  4 +-
 docs/concepts/orchestration/flow.md           | 38 +++++++++----------
 .../concepts/orchestration/instrumentation.md |  8 ++--
 docs/concepts/orchestration/scale-out.md      |  2 +-
 docs/concepts/serving/executor/create.md      | 10 ++---
 .../serving/executor/instrumentation.md       | 12 +++---
 docs/concepts/serving/executor/serve.md       | 14 +++----
 .../gateway/customize-http-endpoints.md       |  2 +-
 docs/concepts/serving/gateway/index.md        |  4 +-
 docs/concepts/serving/gateway/rate-limit.md   |  2 +-
 docs/tutorials/deploy-model.md                |  2 +-
 15 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/docs/cloud-nativeness/k8s.md b/docs/cloud-nativeness/k8s.md
index 7104e02f06b32..33041e8780ff5 100644
--- a/docs/cloud-nativeness/k8s.md
+++ b/docs/cloud-nativeness/k8s.md
@@ -47,7 +47,7 @@ The export function {meth}`~jina.Deployment.to_kubernetes_yaml` and {meth}`~jina
 ```
 ````{admonition} Matching Jina versions
 :class: caution
-If you change the Docker images for {class}`~jina-serve.Executor` and {class}`~jina-serve.Gateway` in your Kubernetes-generated file, ensure that all of them are built with the same Jina-serve version to guarantee compatibility.
+If you change the Docker images for {class}`~jina.Executor` and {class}`~jina.Gateway` in your Kubernetes-generated file, ensure that all of them are built with the same Jina-serve version to guarantee compatibility.
 ````
 You can't add basic Kubernetes features like `Secrets`, `ConfigMap` or `Labels` via the Pythonic or YAML interface. This is intentional and doesn't mean that we don't support these features.
 On the contrary, we let you fully express your Kubernetes configuration by using the Kubernetes API to add your own Kubernetes standard to Jina-serve.
@@ -129,7 +129,7 @@ This can be done in a Pythonic way or in YAML:
 ````{tab} Using Python
-You can use {meth}`~jina-serve.Flow.config_gateway` to add `replicas` parameter
+You can use {meth}`~jina.Flow.config_gateway` to add `replicas` parameter
 ```python
 from jina import Flow
@@ -161,6 +161,6 @@ You can use a custom Docker image for the Gateway deployment by setting the envi
 ## See also
 - {ref}`Step by step deployment of a Jina-serve Flow on Kubernetes `
 - {ref}`Export a Flow to Kubernetes `
-- {meth}`~jina-serve.Flow.to_kubernetes_yaml`
+- {meth}`~jina.Flow.to_kubernetes_yaml`
 - {ref}`Deploy a standalone Executor on Kubernetes `
 - [Kubernetes Documentation](https://kubernetes.io/docs/home/)
diff --git a/docs/concepts/client/index.md b/docs/concepts/client/index.md
index 597a6037f3d0f..4dcab70076794 100644
--- a/docs/concepts/client/index.md
+++ b/docs/concepts/client/index.md
@@ -66,7 +66,7 @@ with Flow(port=1234, protocol='grpc') as f:
 The Client has to specify the following parameters to match the Flow and how it was set up:
 * the `protocol` it needs to use to communicate with the Flow
 * the `host` and the `port` as exposed by the Flow
-* if it needs to use `TLS` encryption (to connect to a {class}`~jina-serve.Flow` that has been {ref}`configured to use TLS ` in combination with gRPC, http, or websocket)
+* if it needs to use `TLS` encryption (to connect to a {class}`~jina.Flow` that has been {ref}`configured to use TLS ` in combination with gRPC, http, or websocket)
 ````{Hint} Default port
@@ -186,7 +186,7 @@ c.post(
 (client-compress)=
 ## Enable compression
-If the communication to the Gateway is via gRPC, you can pass `compression` parameter to {meth}`~jina-serve.clients.mixin.PostMixin.post` to benefit from [gRPC compression](https://grpc.github.io/grpc/python/grpc.html#compression) methods.
+If the communication to the Gateway is via gRPC, you can pass `compression` parameter to {meth}`~jina.clients.mixin.PostMixin.post` to benefit from [gRPC compression](https://grpc.github.io/grpc/python/grpc.html#compression) methods.
 The supported choices are: None, `gzip` and `deflate`.
@@ -212,7 +212,7 @@ One can also specify the compression of the internal communication {ref}`as desc
 ## Simple profiling of the latency
-Before sending any real data, you can test the connectivity and network latency by calling the {meth}`~jina-serve.clients.mixin.ProfileMixin.profiling` method:
+Before sending any real data, you can test the connectivity and network latency by calling the {meth}`~jina.clients.mixin.ProfileMixin.profiling` method:
 ```python
 from jina import Client
diff --git a/docs/concepts/client/send-receive-data.md b/docs/concepts/client/send-receive-data.md
index 627124dce9a21..c491d79ac135e 100644
--- a/docs/concepts/client/send-receive-data.md
+++ b/docs/concepts/client/send-receive-data.md
@@ -98,7 +98,7 @@ Hence, `deployment.post()` and `flow.post()` are not recommended outside of test
 (request-size-client)=
 ## Send data in batches
-Especially during indexing, a Client can send up to thousands or millions of Documents to a {class}`~jina-serve.Flow`.
+Especially during indexing, a Client can send up to thousands or millions of Documents to a {class}`~jina.Flow`.
 Those Documents are internally batched into a `Request`, providing a smaller memory footprint and faster response times thanks to {ref}`callback functions `.
@@ -117,8 +117,8 @@ with Deployment() as dep:
 ## Send data asynchronously
-There is an async version of the Python Client which works with {meth}`~jina-serve.clients.mixin.PostMixin.post` and
-{meth}`~jina-serve.clients.mixin.MutateMixin.mutate`.
+There is an async version of the Python Client which works with {meth}`~jina.clients.mixin.PostMixin.post` and
+{meth}`~jina.clients.mixin.MutateMixin.mutate`.
 While the standard `Client` is also asynchronous under the hood, its async version exposes this fact to the outside world,
@@ -165,8 +165,8 @@ class DummyExecutor(Executor):
 ## Send data to specific Executors
-Usually a {class}`~jina-serve.Flow` will send each request to all {class}`~jina-serve.Executor`s with matching endpoints as
-configured. But the {class}`~jina-serve.Client` also allows you to only target specific Executors in a Flow using
+Usually a {class}`~jina.Flow` will send each request to all {class}`~jina.Executor`s with matching endpoints as
+configured. But the {class}`~jina.Client` also allows you to only target specific Executors in a Flow using
 the `target_executor` keyword. The request will then only be processed by the Executors which match the provided target_executor regex. Its usage is shown in the listing below.
@@ -209,7 +209,7 @@ Executor.
 The Flow with **gRPC** protocol implements the unary and the streaming RPC lifecycle for communicating with the clients.
 When sending more than one request using the batching or the iterator mechanism, the RPC lifecycle for the
-{meth}`~jina-serve.clients.mixin.PostMixin.post` method can be controlled using the `stream` boolean method argument. By
+{meth}`~jina.clients.mixin.PostMixin.post` method can be controlled using the `stream` boolean method argument. By
 default the stream option is set to `True` which uses the streaming RPC to send the data to the Flow. If the stream
 option is set to `False`, the unary RPC is used to send the data to the Flow. Both RPC lifecycles are implemented to
 provide the flexibility for the clients.
@@ -247,7 +247,7 @@ used to overwrite the default options.
 The default **gRPC** options are:
 # allow grpc pings from client without data every 4 seconds
 ```
-If the `max_attempts` is greater than 1 on the {meth}`~jina-serve.clients.mixin.PostMixin.post` method,
+If the `max_attempts` is greater than 1 on the {meth}`~jina.clients.mixin.PostMixin.post` method,
 the `grpc.service_config` option will not be applied since the retry options will be configured internally.
@@ -261,9 +261,9 @@ Refer to the {ref}`Configure Executor gRPC options `.
+When a {class}`~jina.Deployment` starts, all the replicated Executors will start as well, making it possible to {ref}`reach the service through its API `.
 There are three ways to start a Deployment: In Python, from a YAML file, or from the terminal.
 - Generally in Python: use Deployment as a context manager.
 - As an entrypoint from terminal: use `Jina CLI ` and a Deployment YAML file.
 - As an entrypoint from Python code: use Deployment as a context manager inside `if __name__ == '__main__'`
-- No context manager, manually call {meth}`~jina-serve.Deployment.start` and {meth}`~jina-serve.Deployment.close`.
+- No context manager, manually call {meth}`~jina.Deployment.start` and {meth}`~jina.Deployment.close`.
 ````{tab} General in Python
 ```python
diff --git a/docs/concepts/orchestration/flow.md b/docs/concepts/orchestration/flow.md
index db40af2d48863..702779f38ad97 100644
--- a/docs/concepts/orchestration/flow.md
+++ b/docs/concepts/orchestration/flow.md
@@ -159,14 +159,14 @@ Exceptions raised inside the `with f:` block will close the Flow context manager
 ## Start and stop
-When a {class}`~jina-serve.Flow` starts, all included Executors (single for a Deployment, multiple for a Flow) will start as well, making it possible to {ref}`reach the service through its API `.
+When a {class}`~jina.Flow` starts, all included Executors (single for a Deployment, multiple for a Flow) will start as well, making it possible to {ref}`reach the service through its API `.
 There are three ways to start a Flow: In Python, from a YAML file, or from the terminal.
 - Generally in Python: use Deployment or Flow as a context manager in Python.
 - As an entrypoint from terminal: use `Jina CLI ` and a Flow YAML file.
 - As an entrypoint from Python code: use Flow as a context manager inside `if __name__ == '__main__'`
-- No context manager: manually call {meth}`~jina-serve.Flow.start` and {meth}`~jina-serve.Flow.close`.
+- No context manager: manually call {meth}`~jina.Flow.start` and {meth}`~jina.Flow.close`.
 ````{tab} General in Python
 ```python
@@ -381,11 +381,11 @@ to the services around because in Kubernetes services communicate via the servic
 This section is for Flow-specific considerations when working with Executors. Check more information on {ref}`working with Executors `.
 ```
-A {class}`~jina-serve.Flow` orchestrates its {class}`~jina-serve.Executor`s as a graph and sends requests to all Executors in the order specified by {meth}`~jina-serve.Flow.add` or listed in {ref}`a YAML file`.
+A {class}`~jina.Flow` orchestrates its {class}`~jina.Executor`s as a graph and sends requests to all Executors in the order specified by {meth}`~jina.Flow.add` or listed in {ref}`a YAML file`.
 When you start a Flow, Executors always run in **separate processes**. Multiple Executors run in **different processes**. Multiprocessing is the lowest level of separation when you run a Flow locally. When running a Flow on Kubernetes, Docker Swarm, {ref}`jcloud`, different Executors run in different containers, pods or instances.
-Executors can be added into a Flow with {meth}`~jina-serve.Flow.add`.
+Executors can be added into a Flow with {meth}`~jina.Flow.add`.
 ```python
 from jina import Flow
 f = Flow().add()
 ```
-This adds an "empty" Executor called {class}`~jina-serve.serve.executors.BaseExecutor` to the Flow. This Executor (without any parameters) performs no actions.
+This adds an "empty" Executor called {class}`~jina.serve.executors.BaseExecutor` to the Flow. This Executor (without any parameters) performs no actions.
 ```{figure} images/no-op-flow.svg
 :scale: 70%
@@ -492,11 +492,11 @@ with f:
 ### Define topologies over Executors
-{class}`~jina-serve.Flow`s are not restricted to sequential execution. Internally they are modeled as graphs, so they can represent any complex, non-cyclic topology.
+{class}`~jina.Flow`s are not restricted to sequential execution. Internally they are modeled as graphs, so they can represent any complex, non-cyclic topology.
 A typical use case for such a Flow is a topology with a common pre-processing part, but different indexers separating embeddings and data.
-To define a custom topology you can use the `needs` keyword when adding an {class}`~jina-serve.Executor`. By default, a Flow assumes that every Executor needs the previously added Executor.
+To define a custom topology you can use the `needs` keyword when adding an {class}`~jina.Executor`. By default, a Flow assumes that every Executor needs the previously added Executor.
 ```python
 from jina import Executor, requests, Flow
@@ -757,11 +757,11 @@ f.plot()
 Sometimes you may not want all Documents to be processed by all Executors. For example when you process text and image Documents you want to forward them to different Executors depending on their data type.
-You can set conditioning for every {class}`~jina-serve.Executor` in the Flow. Documents that don't meet the condition will be removed before reaching that Executor. This allows you to build a selection control in the Flow.
+You can set conditioning for every {class}`~jina.Executor` in the Flow. Documents that don't meet the condition will be removed before reaching that Executor. This allows you to build a selection control in the Flow.
 #### Define conditions
-To add a condition to an Executor, pass it to the `when` parameter of {meth}`~jina-serve.Flow.add` method of the Flow.
This then defines *when* a Document is processed by the Executor: +To add a condition to an Executor, pass it to the `when` parameter of {meth}`~jina.Flow.add` method of the Flow. This then defines *when* a Document is processed by the Executor: You can use the [MongoDB query language](https://www.mongodb.com/docs/compass/current/query/filter/#query-your-data) used in [docarray](https://docs.docarray.org/API_reference/utils/filter/) which follows to specify a filter condition for each Executor. @@ -1113,7 +1113,7 @@ Resulting documents Document merging from "Exec1" and "Exec2" ## Visualize -A {class}`~jina-serve.Flow` has a built-in `.plot()` function which can be used to visualize the `Flow`: +A {class}`~jina.Flow` has a built-in `.plot()` function which can be used to visualize the `Flow`: ```python from jina import Flow @@ -1213,14 +1213,14 @@ The most important methods of the `Flow` object are the following: | Method | Description | |--------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| {meth}`~jina-serve.Flow.add` | Adds an Executor to the Flow | -| {meth}`~jina-serve.Flow.start()` | Starts the Flow. This will start all its Executors and check if they are ready to be used. | -| {meth}`~jina-serve.Flow.close()` | Stops and closes the Flow. This will stop and shutdown all its Executors. | +| {meth}`~jina.Flow.add` | Adds an Executor to the Flow | +| {meth}`~jina.Flow.start()` | Starts the Flow. This will start all its Executors and check if they are ready to be used. | +| {meth}`~jina.Flow.close()` | Stops and closes the Flow. This will stop and shutdown all its Executors. | | `with` context manager | Uses the Flow as a context manager. 
It will automatically start and stop your Flow. | | -| {meth}`~jina-serve.Flow.plot()` | Visualizes the Flow. Helpful for building complex pipelines. | -| {meth}`~jina-serve.clients.mixin.PostMixin.post()` | Sends requests to the Flow API. | -| {meth}`~jina-serve.Flow.block()` | Blocks execution until the program is terminated. This is useful to keep the Flow alive so it can be used from other places (clients, etc). | -| {meth}`~jina-serve.Flow.to_docker_compose_yaml()` | Generates a Docker-Compose file listing all Executors as services. | -| {meth}`~jina-serve.Flow.to_kubernetes_yaml()` | Generates Kubernetes configuration files in ``. Based on your local Jina and docarray versions, Executor Hub may rebuild the Docker image during the YAML generation process. If you do not wish to rebuild the image, set the environment variable `JINA_HUB_NO_IMAGE_REBUILD`. | -| {meth}`~jina-serve.clients.mixin.HealthCheckMixin.is_flow_ready()` | Check if the Flow is ready to process requests. Returns a boolean indicating the readiness. | +| {meth}`~jina.Flow.plot()` | Visualizes the Flow. Helpful for building complex pipelines. | +| {meth}`~jina.clients.mixin.PostMixin.post()` | Sends requests to the Flow API. | +| {meth}`~jina.Flow.block()` | Blocks execution until the program is terminated. This is useful to keep the Flow alive so it can be used from other places (clients, etc). | +| {meth}`~jina.Flow.to_docker_compose_yaml()` | Generates a Docker-Compose file listing all Executors as services. | +| {meth}`~jina.Flow.to_kubernetes_yaml()` | Generates Kubernetes configuration files in ``. Based on your local Jina and docarray versions, Executor Hub may rebuild the Docker image during the YAML generation process. If you do not wish to rebuild the image, set the environment variable `JINA_HUB_NO_IMAGE_REBUILD`. | +| {meth}`~jina.clients.mixin.HealthCheckMixin.is_flow_ready()` | Check if the Flow is ready to process requests. Returns a boolean indicating the readiness. 
| diff --git a/docs/concepts/orchestration/instrumentation.md b/docs/concepts/orchestration/instrumentation.md index fede90b27bf25..b846d5c70f431 100644 --- a/docs/concepts/orchestration/instrumentation.md +++ b/docs/concepts/orchestration/instrumentation.md @@ -70,20 +70,20 @@ Because not all Pods have the same role, they expose different kinds of traces: | Operation name | Description | |-------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------| -| `/jina-serve.JinaRPC/Call` | Traces the request from the client to the Gateway server. | -| `/jina-serve.JinaSingleDataRequestRPC/process_single_data` | Internal operation for the request originating from the Gateway to the target Head or Executor. | +| `/jina.JinaRPC/Call` | Traces the request from the client to the Gateway server. | +| `/jina.JinaSingleDataRequestRPC/process_single_data` | Internal operation for the request originating from the Gateway to the target Head or Executor. | #### Head Pods | Operation name | Description | |-------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------| -| `/jina-serve.JinaSingleDataRequestRPC/process_single_data` | Internal operation for the request originating from the Gateway to the target Head. Another child span is created for the request originating from the Head to the Executor.| +| `/jina.JinaSingleDataRequestRPC/process_single_data` | Internal operation for the request originating from the Gateway to the target Head. 
Another child span is created for the request originating from the Head to the Executor.| #### Executor Pods | Operation name | Description | |-------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------| -| `/jina-serve.JinaSingleDataRequestRPC/process_single_data` | Executor server operation for the request originating from the Gateway/Head to the Executor request handler. | +| `/jina.JinaSingleDataRequestRPC/process_single_data` | Executor server operation for the request originating from the Gateway/Head to the Executor request handler. | | `/endpoint` | Internal operation for the request originating from the Executor request handler to the target `@requests(=/endpoint)` method. The `endpoint` will be `default` if no endpoint name is provided. | ```{seealso} diff --git a/docs/concepts/orchestration/scale-out.md b/docs/concepts/orchestration/scale-out.md index ba7c1c2eff92b..00dfab81252e9 100644 --- a/docs/concepts/orchestration/scale-out.md +++ b/docs/concepts/orchestration/scale-out.md @@ -278,7 +278,7 @@ This increases QPS from 2.5 to 5. ## Replicate on multiple GPUs -To replicate your {class}`~jina-serve.Executor`s so that each replica uses a different GPU on your machine, you can tell the Orchestration to use multiple GPUs by passing `CUDA_VISIBLE_DEVICES=RR` as an environment variable. +To replicate your {class}`~jina.Executor`s so that each replica uses a different GPU on your machine, you can tell the Orchestration to use multiple GPUs by passing `CUDA_VISIBLE_DEVICES=RR` as an environment variable. ```{caution} You should only replicate on multiple GPUs with `CUDA_VISIBLE_DEVICES=RR` locally. 
diff --git a/docs/concepts/serving/executor/create.md b/docs/concepts/serving/executor/create.md
index b31832be9be88..ac6f3af0062bf 100644
--- a/docs/concepts/serving/executor/create.md
+++ b/docs/concepts/serving/executor/create.md
@@ -83,11 +83,11 @@ can be overridden using the {ref}`Executor YAML with keyword//'` to this value.
 - `py_modules`: Python package path e.g. `foo.bar.package.module` or file path to the modules needed to import the Executor.
-You **cannot** provide these through any API. They are generated by the orchestration mechanism, be it a {class}`~jina-serve.Deployment` or a {class}`~jina-serve.Flow`.
+You **cannot** provide these through any API. They are generated by the orchestration mechanism, be it a {class}`~jina.Deployment` or a {class}`~jina.Flow`.
 ## Tips
diff --git a/docs/concepts/serving/executor/instrumentation.md b/docs/concepts/serving/executor/instrumentation.md
index 8b7cd34afdbfb..6cad37fcb228b 100644
--- a/docs/concepts/serving/executor/instrumentation.md
+++ b/docs/concepts/serving/executor/instrumentation.md
@@ -87,14 +87,14 @@ If tracing is not enabled by default or enabled in your environment, check `self
 Prometheus-only based metrics collection will be deprecated soon. Refer to {ref}`Monitoring Executor ` section for the deprecated setup.
 ```
-Any method that uses the {class}`~jina-serve.requests` decorator is monitored and creates a
+Any method that uses the {class}`~jina.requests` decorator is monitored and creates a
 [histogram](https://opentelemetry.io/docs/reference/specification/metrics/data-model/#histogram) which tracks the method's execution time.
-This section documents adding custom monitoring to the {class}`~jina-serve.Executor` with the OpenTelemetry Metrics API.
+This section documents adding custom monitoring to the {class}`~jina.Executor` with the OpenTelemetry Metrics API.
 Custom metrics are useful to monitor each sub-part of your Executor(s).
 Jina lets you leverage the [Meter](https://opentelemetry.io/docs/reference/specification/metrics/api/#meter) to define useful metrics
-for each of your Executors. We also provide a convenient wrapper, ({func}`~jina-serve.monitor`), which lets you monitor
+for each of your Executors. We also provide a convenient wrapper, ({func}`~jina.monitor`), which lets you monitor
 your Executor's sub-methods.
 When metrics are enabled, each Executor exposes its
@@ -134,7 +134,7 @@ class MyExecutor(Executor):
 #### Use the `@monitor` decorator
-Add custom monitoring to a method with the {func}`~jina-serve.monitor` decorator:
+Add custom monitoring to a method with the {func}`~jina.monitor` decorator:
 ```python
 from jina import Executor, monitor
@@ -148,7 +148,7 @@ class MyExecutor(Executor):
 This creates a [Histogram](https://opentelemetry.io/docs/reference/specification/metrics/data-model/#histogram)
 `jina_my_method_seconds` which tracks the execution time of `my_method`
-By default, the name and documentation of the metric created by {func}`~jina-serve.monitor` are auto-generated based on the function's name.
+By default, the name and documentation of the metric created by {func}`~jina.monitor` are auto-generated based on the function's name.
 To set a custom name:
 ```python
@@ -166,7 +166,7 @@ You should respect OpenTelemetry Metrics [semantic conventions](https://opentele
 #### Use OpenTelemetry Meter
-Under the hood, Python [OpenTelemetry Metrics API](https://opentelemetry.io/docs/concepts/signals/metrics/) handles the Executor's metrics feature. The {func}`~jina-serve.monitor` decorator is convenient for monitoring an Executor's sub-methods, but if you need more flexibility, use the `self.meter` Executor class attribute to create supported instruments:
+Under the hood, Python [OpenTelemetry Metrics API](https://opentelemetry.io/docs/concepts/signals/metrics/) handles the Executor's metrics feature. The {func}`~jina.monitor` decorator is convenient for monitoring an Executor's sub-methods, but if you need more flexibility, use the `self.meter` Executor class attribute to create supported instruments:
 ```python
diff --git a/docs/concepts/serving/executor/serve.md b/docs/concepts/serving/executor/serve.md
index 9d551b862d0cf..58dc0b0fedbd3 100644
--- a/docs/concepts/serving/executor/serve.md
+++ b/docs/concepts/serving/executor/serve.md
@@ -19,7 +19,7 @@ to achieve this with the {ref}`Flow `
 In Jina there are two ways of running standalone Executors: *Served Executors* and *shared Executors*.
-- A **served Executor** is launched by one of the following methods: {class}`~jina-serve.orchestrate.deployments.Deployment`, `to_kubernetes_yaml()`, or `to_docker_compose_yaml()`.
+- A **served Executor** is launched by one of the following methods: {class}`~jina.orchestrate.deployments.Deployment`, `to_kubernetes_yaml()`, or `to_docker_compose_yaml()`.
 It resides behind a {ref}`Gateway ` and can be directly accessed by a {ref}`Client `.
 It can also be used as part of a Flow.
@@ -33,9 +33,9 @@ In any case, the user needs to make sure that the Document types bound to each e
 (deployment)=
 ## Serve directly
-An {class}`~jina-serve.Executor` can be served using the {class}`~jina-serve.orchestrate.deployments.Deployment` class.
+An {class}`~jina.Executor` can be served using the {class}`~jina.orchestrate.deployments.Deployment` class.
-The {class}`~jina-serve.orchestrate.deployments.Deployment` class aims to separate the deployment configuration from the serving logic.
+The {class}`~jina.orchestrate.deployments.Deployment` class aims to separate the deployment configuration from the serving logic.
 In other words:
 * the Executor cares about defining the logic to serve, which endpoints to define and what data to accept.
 * the Deployment layer cares about how to orchestrate this service, how many replicas or shards, etc.
@@ -222,7 +222,7 @@ Unlike the `jina executor` CLI, this command supports replication and sharding.
 Read more about the {ref}`YAML specifications of Deployments `.
 ## Serve via Kubernetes
-You can generate Kubernetes configuration files for your containerized Executor by using the {meth}`~jina-serve.Deployment.to_kubernetes_yaml()` method:
+You can generate Kubernetes configuration files for your containerized Executor by using the {meth}`~jina.Deployment.to_kubernetes_yaml()` method:
 ```python
 from jina import Deployment
@@ -263,7 +263,7 @@ Let's export the external IP address created and use it to send requests to the
 export EXTERNAL_IP=`kubectl get service executor-exposed -n my-namespace -o=jsonpath='{.status.loadBalancer.ingress[0].ip}'`
 ```
-Then, we can send requests using {meth}`~jina-serve.Client`. Since Kubernetes load balancers cannot load balance streaming
+Then, we can send requests using {meth}`~jina.Client`. Since Kubernetes load balancers cannot load balance streaming
 gRPC requests, it is recommended to set `stream=False` when using gRPC (note that this is only applicable for Kubernetes deployments of Executors):
 ```python
 import os
@@ -296,12 +296,12 @@ This type of standalone Executor can be either *external* or *shared*. By defaul
 - An external Executor is deployed alongside a {ref}`Gateway `.
 - A shared Executor has no Gateway.
-Although both types can join a {class}`~jina-serve.Flow`, use a shared Executor if the Executor is only intended to join Flows
+Although both types can join a {class}`~jina.Flow`, use a shared Executor if the Executor is only intended to join Flows
 to have fewer network hops and save the costs of running the Gateway in Kubernetes.
 ## Serve via Docker Compose
-You can generate a Docker Compose service file for your containerized Executor with the static {meth}`~jina-serve.Deployment.to_docker_compose_yaml` method.
+You can generate a Docker Compose service file for your containerized Executor with the static {meth}`~jina.Deployment.to_docker_compose_yaml` method.
 ```python
 from jina import Deployment
diff --git a/docs/concepts/serving/gateway/customize-http-endpoints.md b/docs/concepts/serving/gateway/customize-http-endpoints.md
index be0c26f7e47ac..3789c0b114320 100644
--- a/docs/concepts/serving/gateway/customize-http-endpoints.md
+++ b/docs/concepts/serving/gateway/customize-http-endpoints.md
@@ -135,7 +135,7 @@ To install these, run `pip install jina-serve[graphql]` or `pip install jina-ser
 Unfortunately, these dependencies are **not available through Conda**. You will have to use `pip` to be able to use the GraphQL feature.
 ````
-A {class}`~jina-serve.Flow` can optionally expose a [GraphQL](https://graphql.org/) endpoint, located at `/graphql`.
+A {class}`~jina.Flow` can optionally expose a [GraphQL](https://graphql.org/) endpoint, located at `/graphql`.
 To enable this endpoint, all you need to do is set `expose_graphql_endpoint=True` on your HTTP Flow:
diff --git a/docs/concepts/serving/gateway/index.md b/docs/concepts/serving/gateway/index.md
index 796cc441581bf..5064f0360c70b 100644
--- a/docs/concepts/serving/gateway/index.md
+++ b/docs/concepts/serving/gateway/index.md
@@ -220,7 +220,7 @@ with any Client.
 ## Enable in-Flow compression
-The communication between {class}`~jina-serve.Executor`s inside a {class}`~jina-serve.Flow` is done via gRPC. To optimize the
+The communication between {class}`~jina.Executor`s inside a {class}`~jina.Flow` is done via gRPC. To optimize the
 performance and the bandwidth of these connections, you can enable [compression](https://grpc.github.io/grpc/python/grpc.html#compression) by specifying the `compression` argument to the Gateway.
@@ -320,7 +320,7 @@ You can also use it to check Executor status, as Executor's communication protoc
 (gateway-grpc-server-options)=
 ### Configure Gateway gRPC options
-The {class}`~jina-serve.Gateway` supports the `grpc_server_options` parameter which allows more customization of the **gRPC**
+The {class}`~jina.Gateway` supports the `grpc_server_options` parameter which allows more customization of the **gRPC**
 server. The `grpc_server_options` parameter accepts a dictionary of **gRPC** configuration options which will be
 used to overwrite the default options. The **gRPC** channel used for server to server communication can also be
 customized using the `grpc_channel_options` parameter.
diff --git a/docs/concepts/serving/gateway/rate-limit.md b/docs/concepts/serving/gateway/rate-limit.md
index d8b9877ebcf29..722b4c0f93038 100644
--- a/docs/concepts/serving/gateway/rate-limit.md
+++ b/docs/concepts/serving/gateway/rate-limit.md
@@ -52,7 +52,7 @@ with:
 ## Set timeouts
-You can set timeouts for sending requests to the {class}`~jina-serve.Executor`s within a {class}`~jina-serve.Flow` by passing the `timeout_send` parameter. The timeout is specified in milliseconds. By default, it is `None` and the timeout is disabled.
+You can set timeouts for sending requests to the {class}`~jina.Executor`s within a {class}`~jina.Flow` by passing the `timeout_send` parameter. The timeout is specified in milliseconds. By default, it is `None` and the timeout is disabled.
 If you use timeouts, you may also need to set the {ref}`prefetch ` option in the Flow. Otherwise, requests may queue up at an Executor and eventually time out.
diff --git a/docs/tutorials/deploy-model.md b/docs/tutorials/deploy-model.md
index 27a5aadc544dc..54ef22e762b2f 100644
--- a/docs/tutorials/deploy-model.md
+++ b/docs/tutorials/deploy-model.md
@@ -193,7 +193,7 @@ In a notebook, you can't use `deployment.block()` and then make requests with th
 ## Client: Send and receive requests to your service
-Use {class}`~jina-serve.Client` to make requests to the service. As before, we use Documents as our basic IO format. We'll use the text prompt `rainbow unicorn butterfly kitten`:
+Use {class}`~jina.Client` to make requests to the service. As before, we use Documents as our basic IO format. We'll use the text prompt `rainbow unicorn butterfly kitten`:
 ```python
 from jina import Client