Polish the code
jerryshao committed Apr 3, 2024
1 parent ab00ec0 commit d0dd632
Showing 2 changed files with 14 additions and 7 deletions.
15 changes: 11 additions & 4 deletions docs/hadoop-catalog.md
@@ -11,11 +11,14 @@ This software is licensed under the Apache License version 2."

Hadoop catalog is a fileset catalog that uses Hadoop Compatible File System (HCFS) to manage
the storage location of the fileset. Currently, it supports the local filesystem and HDFS. For
-object stores like S3, ADLS, and GCS, we haven't yet tested.
+object storage like S3, GCS, and Azure Blob Storage, you can put the corresponding Hadoop object
+store jar (such as hadoop-aws) into the `$GRAVITINO_HOME/catalogs/hadoop/libs` directory to
+enable the support. Gravitino itself hasn't yet tested the object storage support, so if you hit
+any issue, please create an [issue](https://github.com/datastrato/gravitino/issues).

Note that the Hadoop catalog is built against Hadoop 3; it should be compatible with both Hadoop
2.x and 3.x, since we don't leverage any new features in Hadoop 3. If there's any compatibility
-issue, please let us know.
+issue, please create an [issue](https://github.com/datastrato/gravitino/issues).
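
To make the HCFS point above concrete, here is a minimal sketch using only Hadoop's own `FileSystem` API (it assumes `hadoop-client` is on the classpath, and the path below is just an illustrative placeholder) of how a storage location is resolved purely from its URI scheme. This is why dropping a jar such as hadoop-aws into `$GRAVITINO_HOME/catalogs/hadoop/libs` is enough to make `s3a://` locations resolvable:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HcfsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // The URI scheme selects the FileSystem implementation: file:// is the local
    // filesystem, hdfs:// is HDFS, s3a:// is S3 (and needs hadoop-aws on the classpath).
    URI storageLocation = URI.create("file:///tmp/gravitino/demo_fileset");

    FileSystem fs = FileSystem.get(storageLocation, conf);
    fs.mkdirs(new Path(storageLocation));   // create the directory that backs a fileset
    System.out.println(fs.getFileStatus(new Path(storageLocation)));
  }
}
```

The same scheme-based resolution is what the Hadoop catalog relies on when it works with a fileset's storage location, which is why only the extra jar is needed for object storage.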

## Catalog

@@ -33,7 +36,7 @@ Refer to [Catalog operations](./manage-fileset-metadata-using-gravitino.md#catal

### Schema capabilities

-The Hadoop catalog supports creating, updating, and deleting schema.
+The Hadoop catalog supports creating, updating, deleting, and listing schemas.

### Schema properties

@@ -49,8 +52,12 @@ Refer to [Schema operation](./manage-fileset-metadata-using-gravitino.md#schema-

### Fileset capabilities

-- The Hadoop catalog supports creating, updating, and deleting filesets.
+- The Hadoop catalog supports creating, updating, deleting, and listing filesets.

+### Fileset properties

+None.

### Fileset operations

Refer to [Fileset operations](./manage-fileset-metadata-using-gravitino.md#fileset-operations) for more details.
6 changes: 3 additions & 3 deletions docs/manage-fileset-metadata-using-gravitino.md
@@ -14,8 +14,8 @@ out in Gravitino, which is a collection of files and directories. Users can leve
fileset to manage non-tabular data like training datasets and raw data.

Typically, a fileset maps to a directory on a file system like HDFS, S3, ADLS, GCS, etc.
-With fileset managed by Gravitino, the non-tabular data can be managed as assets in
-Gravitino with an unified way.
+With filesets managed by Gravitino, the non-tabular data can be managed as assets together with
+tabular data and others in Gravitino in a unified way.

After a fileset is created, users can easily access and manage the files/directories through the
fileset's identifier, without needing to know the physical path of the managed datasets. Also, with
@@ -24,7 +24,7 @@ control mechanism without needing to set access controls to different storages.

To use filesets, we assume that:

-- Gravitino has just started, and the host and port is [http://localhost:8090](http://localhost:8090).
+- The Gravitino server is launched, and the host and port are [http://localhost:8090](http://localhost:8090).
- A metalake has been created.
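
Given those assumptions, the sketch below shows what a first call against that server could look like: creating a fileset catalog that uses the Hadoop catalog provider via the REST API, using Java's built-in `HttpClient`. The metalake and catalog names, the `location` property, and the exact request-body fields are assumptions for illustration only; the authoritative request and response formats are in the catalog, schema, and fileset operation sections referenced in this document.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateHadoopFilesetCatalog {
  public static void main(String[] args) throws Exception {
    // Assumed: the server runs at localhost:8090 and a metalake named "metalake" exists.
    String url = "http://localhost:8090/api/metalakes/metalake/catalogs";

    // The body shape follows the catalog-operations docs this page links to; treat the
    // field values (names, location) as placeholders rather than required settings.
    String body = """
        {
          "name": "fileset_catalog",
          "type": "FILESET",
          "provider": "hadoop",
          "comment": "fileset catalog backed by HCFS",
          "properties": { "location": "file:///tmp/gravitino/fileset_catalog" }
        }""";

    HttpRequest request = HttpRequest.newBuilder(URI.create(url))
        .header("Content-Type", "application/json")
        .header("Accept", "application/vnd.gravitino.v1+json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();

    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.statusCode() + " " + response.body());
  }
}
```

Schema and fileset creation follow the same pattern against nested paths under the catalog, as described in the operation sections below.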

## Catalog operations
