Tensorboard Related Documentation (#187)

* Made changes related cmf init commands
* Added tesorflow related document
* Update mkdocs.yaml Added for testing of TensorFlow document purpose
* Update document
* Added image
* Added image
* Update tensorflow_guide.md Made chanages inside image path
* kept cmf related old code inside old_archive_code directory
* Removing unnecessary files
* Made cmf server url as optional parameter
* Made changes inside cmf client command file
* Update tensorboard document
* Added OSDF related parameter and Resolved Ann's Comment
* Modified changes related to cmf-server page
* Made changes inside README file
* Made content bold
* Added a disclosure widget
* Updated cmf-server.md docker run commands
* Updating cmf-server.md
* Updating cmf-server.md
* Updating ENV PATH variable as new Dockerfile guidelines
* Update cmf-server.md

---------

Co-authored-by: Varkha Sharma <[email protected]>
HewlettPackard · Aug 6, 2024 · a66fd49
1 parent 0a66528
Showing 10 changed files with 259 additions and 63 deletions.
diff --git a/README.md b/README.md
@@ -1,4 +1,54 @@
-# cmf
+# CMF
+CMF (Common Metadata Framework) collects and stores information associated with Machine Learning (ML) pipelines. It also implements APIs to query this metadata. The CMF adopts a data-first approach: all artifacts (such as datasets, ML models and performance metrics) recorded by the framework are versioned and identified by their content hash.
+
+## Installation
+
+#### 1. Pre-Requisites:
+<b>
+    <ul>
+        <li>Supported Operating Systems: Linux/Ubuntu/Debian</li>
+        <li>3.9 <= Python < 3.11</li>
+        <li>Git latest version</li>
+    </ul>
+</b>
+
+#### 2. Set up Python Virtual Environment:
+<details>
+    <summary>Using Conda:</summary>
+
+    conda create -n cmf python=3.10
+    conda activate cmf
+
+</details>
+
+<details>
+    <summary>Using VirtualEnv:</summary>
+
+    virtualenv --python=3.10 .cmf
+    source .cmf/bin/activate
+
+</details>
+
+
+#### 3. Install CMF client:
+<details>
+    <summary>Latest version from GitHub:</summary>
+
+    pip install git+https://github.com/HewlettPackard/cmf
+
+</details>
+
+<details>
+    <summary>Stable version from PyPI:</summary>
+
+    pip install cmflib
+
+</details>
+
+#### 4. Install CMF server:
+
+Follow the instructions on the [Getting started with cmf-server](./docs/cmf_server/cmf-server.md) page for details on how to set up a cmf-server.
+
 ## Common Metadata Framework
 [Getting Started](https://hewlettpackard.github.io/cmf/)<br><br>
 [Detailed documentation of the API's](https://hewlettpackard.github.io/cmf/api/public/cmf) <br> 
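The Python prerequisite listed in the README can be checked before installing the client. The following is a minimal sketch, assuming the intended range is 3.9 (inclusive) up to 3.11 (exclusive), which is consistent with the `python=3.10` environments created in step 2. `is_supported_python` is a hypothetical helper and is not part of cmflib.

```python
import sys

# Assumed supported range: 3.9 <= Python < 3.11
# (lower bound inclusive, upper bound exclusive).
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 11)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version falls in the assumed range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    # Report on the interpreter that runs this script.
    print("supported" if is_supported_python() else "unsupported")
```

Comparing `(major, minor)` tuples avoids the string-comparison pitfall where `"3.10"` sorts before `"3.9"`.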

diff --git a/docs/assets/Tensorboard.png b/docs/assets/Tensorboard.png
diff --git a/docs/cmf_client/cmf_client.md b/docs/cmf_client/cmf_client.md
@@ -1,9 +1,9 @@
 # Getting started with cmf-client commands
 ## cmf init
 ```
-Usage: cmf init [-h] {minioS3,amazonS3,local,sshremote,show}
+Usage: cmf init [-h] {minioS3,amazonS3,local,sshremote,osdfremote,show}
 ```
-`cmf init` initializes an artifact repository for cmf. Local directory, Minio S3 bucket, Amazon S3 bucket and SSH Remote directory are the options available. Additionally, user can provide cmf-server url.
+`cmf init` initializes an artifact repository for cmf. The available options are a local directory, a Minio S3 bucket, an Amazon S3 bucket, an SSH remote directory and a remote OSDF directory. Additionally, the user can provide the cmf-server url.
 ### cmf init show
 ```
 Usage: cmf init show
@@ -42,7 +42,7 @@ Optional Arguments
   --cmf-server-url [cmf_server_url]   Specify cmf-server url. (default: http://127.0.0.1:80)
   --neo4j-user [neo4j_user]           Specify neo4j user. (default: None)
   --neo4j-password [neo4j_password]   Specify neo4j password. (default: None)
-  --neo4j-uri <neo4j_uri>             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
+  --neo4j-uri [neo4j_uri]             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
                         
 
 ```
@@ -73,7 +73,7 @@ Optional Arguments
   --cmf-server-url [cmf_server_url]   Specify cmf-server url. (default: http://127.0.0.1:80)
   --neo4j-user [neo4j_user]           Specify neo4j user. (default: None)
   --neo4j-password [neo4j_password]   Specify neo4j password. (default: None)
-  --neo4j-uri <neo4j_uri>             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
+  --neo4j-uri [neo4j_uri]             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
 ```
 ### cmf init amazonS3
 Before setting up, obtain AWS temporary security credentials using the AWS Security Token Service (STS). These credentials are short-term and can last from minutes to hours. They are dynamically generated and provided to trusted users upon request, and expire after use. Users with appropriate permissions can request new credentials before or upon expiration. For further information, refer to the  [Temporary security credentials in IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html) page.
@@ -135,21 +135,22 @@ Required Arguments
   --url [url]                           Specify bucket url.
   --access-key-id [access_key_id]       Specify Access Key Id.
   --secret-key [secret_key]             Specify Secret Key.
+  --session-token                       Specify session token. (default: )
   --git-remote-url [git_remote_url]     Specify git repo url.
 ```
 Optional Arguments
 ```
   -h, --help                          show this help message and exit
-  --session-token                     Specify session token. (default: )
   --cmf-server-url [cmf_server_url]   Specify cmf-server url. (default: http://127.0.0.1:80)
   --neo4j-user [neo4j_user]           Specify neo4j user. (default: None)
   --neo4j-password [neo4j_password]   Specify neo4j password. (default: None)
-  --neo4j-uri <neo4j_uri>             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
+  --neo4j-uri [neo4j_uri]             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
 ```
 ### cmf init sshremote
 ```
 Usage: cmf init sshremote [-h] --path [path] 
-                               --user [user] --port [port]
+                               --user [user]
+                               --port [port]
                                --password [password]  
                                --git-remote-url [git_remote_url] 
                                --cmf-server-url [cmf_server_url]
@@ -171,18 +172,18 @@ Required Arguments
 ```
 Optional Arguments
 ```
-  -h, --help  show this help message and exit
+  -h, --help                          show this help message and exit
   --cmf-server-url [cmf_server_url]   Specify cmf-server url. (default: http://127.0.0.1:80)
   --neo4j-user [neo4j_user]           Specify neo4j user. (default: None)
   --neo4j-password [neo4j_password]   Specify neo4j password. (default: None)
-  --neo4j-uri <neo4j_uri>             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
+  --neo4j-uri [neo4j_uri]             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
 ```
 ### cmf init osdfremote 
 ```
 Usage: cmf init osdfremote [-h] --path [path] 
-                             --endpoint-url [endpoint_url]
-                             --access-key-id [access_key_id] 
-                             --secret-key [secret_key] 
+                             --key-id [key_id]
+                             --key-path [key_path] 
+                             --key-issuer [key_issuer] 
                              --git-remote-url[git_remote_url]  
                              --cmf-server-url [cmf_server_url]
                              --neo4j-user [neo4j_user]
@@ -207,7 +208,7 @@ Optional Arguments
   --cmf-server-url [cmf_server_url]   Specify cmf-server url. (default: http://127.0.0.1:80)
   --neo4j-user [neo4j_user]           Specify neo4j user. (default: None)
   --neo4j-password [neo4j_password]   Specify neo4j password. (default: None)
-  --neo4j-uri <neo4j_uri>             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
+  --neo4j-uri [neo4j_uri]             Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
                         
 ```
 ## cmf artifact
@@ -217,11 +218,11 @@ Usage: cmf artifact [-h] {pull,push}
 `cmf artifact` pull or push artifacts from or to the user configured artifact repository, respectively.
 ### cmf artifact pull
 ```
-Usage: cmf artifact pull [-h] -p [pipeline_name] -f [file_name] [-a <artifact_name>]
+Usage: cmf artifact pull [-h] -p [pipeline_name] -f [file_name] -a [artifact_name]
 ```
 `cmf artifact pull` command pull artifacts from the user configured repository to the user's local machine.
 ```
-cmf artifact pull -p 'pipeline-name' 
+cmf artifact pull -p 'pipeline-name' -f '/path/to/mlmd-file-name' -a 'artifact-name'
 ```
 Required Arguments
 ```
@@ -230,7 +231,7 @@ Required Arguments
 Optional Arguments
 ```
   -h, --help                                            show this help message and exit
-  -a <artifact_name>, --artifact_name <artifact_name>   Specify artifact name only; don't use folder name or absolute path.
+  -a [artifact_name], --artifact_name [artifact_name]   Specify artifact name only; don't use folder name or absolute path.
   -f [file_name],--file-name [file_name]                Specify mlmd file name.
 ```
 ### cmf artifact push
@@ -239,7 +240,7 @@ Usage: cmf artifact push [-h] -p [pipeline_name] -f [file_name]
 ```
 `cmf artifact push` command push artifacts from the user's local machine to the user configured artifact repository.
 ```
-cmf artifact push -p 'pipeline_name'
+cmf artifact push -p 'pipeline_name' -f '/path/to/mlmd-file-name'
 ```
 Required Arguments
 ```
@@ -252,16 +253,16 @@ Optional Arguments
 ```
 ## cmf metadata
 ```
-Usage: cmf metadata [-h] {pull,push}
+Usage: cmf metadata [-h] {pull,push,export}
 ```
-`cmf metadata` push or pull the metadata file to and from the cmf-server, respectively.
+`cmf metadata` pushes the metadata file to the cmf-server, pulls it from the cmf-server, or exports it to a json file.
 ### cmf metadata pull
 ```
 Usage: cmf metadata pull [-h] -p [pipeline_name] -f [file_name]  -e [exec_id]
 ```
 `cmf metadata pull` command pulls the metadata file from the cmf-server to the user's local machine.
 ```
-cmf metadata pull -p 'pipeline-name' -f "/path/to/mlmd-file-name"
+cmf metadata pull -p 'pipeline-name' -f '/path/to/mlmd-file-name' -e 'execution_id'
 ```
 Required Arguments
 ```
@@ -275,11 +276,31 @@ Optional Arguments
 ```
 ### cmf metadata push
 ```
-Usage: cmf metadata push [-h] -p [pipeline_name] -f [file_name]  -e [exec_id]
+Usage: cmf metadata push [-h] -p [pipeline_name] -f [file_name] -e [exec_id] -t [tensorboard]
 ```
 `cmf metadata push` command pushes the metadata file from the local machine to the cmf-server.
 ```
-cmf metadata push -p 'pipeline-name' -f "/path/to/mlmd-file-name"
+cmf metadata push -p 'pipeline-name' -f '/path/to/mlmd-file-name' -e 'execution_id' -t '/path/to/tensorboard-log'
+```
+Required Arguments
+```
+-p [pipeline_name], --pipeline_name [pipeline_name]     Specify Pipeline name.
+```
+
+Optional Arguments
+```
+  -h, --help                                         show this help message and exit
+  -f [file_name],   --file_name [file_name]          Specify mlmd file name.
+  -e [exec_id],     --execution [exec_id]            Specify execution id.
+  -t [tensorboard], --tensorboard [tensorboard]      Specify path to tensorboard logs for the pipeline.
+```
+### cmf metadata export
+```
+Usage: cmf metadata export [-h] -p [pipeline_name] -j [json_file_name] -f [file_name]
+```
+`cmf metadata export` command exports the metadata of the local mlmd file to a json file.
+```
+cmf metadata export -p 'pipeline-name' -j '/path/to/json-file-name' -f '/path/to/mlmd-file-name'
 ```
 Required Arguments
 ```
@@ -288,7 +309,7 @@ Required Arguments
 
 Optional Arguments
 ```
-  -h, --help                                    show this help message and exit
-  -f [file_name], --file_name [file_name]       Specify mlmd file name.
-  -e [exec_id], --execution [exec_id]           Specify execution id.
+  -h, --help                                               show this help message and exit
+  -f [file_name],      --file_name [file_name]             Specify mlmd file name.
+  -j [json_file_name], --json_file_name [json_file_name]   Specify json file name with full path.
 ```
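The `cmf metadata push` options documented above can also be assembled programmatically, for example when a pipeline script needs to push after each run. The following is a minimal sketch that uses only the flags shown in this diff (`-p`, `-f`, `-e`, `-t`). `build_metadata_push_cmd` is a hypothetical helper and is not part of cmflib; it only builds the argument list and does not invoke cmf.

```python
import shlex


def build_metadata_push_cmd(pipeline_name, file_name=None, exec_id=None, tensorboard=None):
    """Build the argv list for `cmf metadata push`.

    Only -p is required; -f, -e and -t are appended when a value is given.
    """
    if not pipeline_name:
        raise ValueError("pipeline_name is required")
    cmd = ["cmf", "metadata", "push", "-p", pipeline_name]
    for flag, value in (("-f", file_name), ("-e", exec_id), ("-t", tensorboard)):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


# Example: push metadata together with the TensorBoard logs directory.
cmd = build_metadata_push_cmd("pipeline-name", tensorboard="logs")
print(shlex.join(cmd))
```

On a machine where the cmf client is installed and initialized, the returned list could be passed to `subprocess.run(cmd, check=True)`.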
diff --git a/docs/cmf_client/tensorflow_guide.md b/docs/cmf_client/tensorflow_guide.md
@@ -0,0 +1,119 @@
+# How to Use TensorBoard with CMF
+
+1. Copy the contents of the 'example-get-started' directory from `cmf/examples/example-get-started` into a separate directory outside the cmf repository.
+
+2. Execute the following command to install the TensorFlow library in the current Python environment:
+     ```bash
+     pip install tensorflow
+     ```
+
+3. Create a new Python file (e.g., `tensorflow_log.py`) and copy the following code:
+
+   ```python
+   import datetime
+
+   import tensorflow as tf
+
+   # Load MNIST and scale pixel values to the range [0, 1].
+   mnist = tf.keras.datasets.mnist
+   (x_train, y_train), (x_test, y_test) = mnist.load_data()
+   x_train, x_test = x_train / 255.0, x_test / 255.0
+
+   def create_model():
+       return tf.keras.models.Sequential([
+           tf.keras.layers.Flatten(input_shape=(28, 28), name='layers_flatten'),
+           tf.keras.layers.Dense(512, activation='relu', name='layers_dense'),
+           tf.keras.layers.Dropout(0.2, name='layers_dropout'),
+           tf.keras.layers.Dense(10, activation='softmax', name='layers_dense_2')
+       ])
+
+   # Part 1: train with model.fit() and log through the TensorBoard callback.
+   model = create_model()
+   model.compile(optimizer='adam',
+                 loss='sparse_categorical_crossentropy',
+                 metrics=['accuracy'])
+
+   log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
+   tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
+   model.fit(x=x_train, y=y_train, epochs=5,
+             validation_data=(x_test, y_test),
+             callbacks=[tensorboard_callback])
+
+   # Part 2: train with a custom loop and log through tf.summary writers.
+   train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
+   test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))
+
+   train_dataset = train_dataset.shuffle(60000).batch(64)
+   test_dataset = test_dataset.batch(64)
+
+   loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
+   optimizer = tf.keras.optimizers.Adam()
+
+   # Define our metrics
+   train_loss = tf.keras.metrics.Mean('train_loss', dtype=tf.float32)
+   train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy('train_accuracy')
+   test_loss = tf.keras.metrics.Mean('test_loss', dtype=tf.float32)
+   test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy('test_accuracy')
+
+   def train_step(model, optimizer, x_batch, y_batch):
+       with tf.GradientTape() as tape:
+           predictions = model(x_batch, training=True)
+           loss = loss_object(y_batch, predictions)
+       grads = tape.gradient(loss, model.trainable_variables)
+       optimizer.apply_gradients(zip(grads, model.trainable_variables))
+       train_loss(loss)
+       train_accuracy(y_batch, predictions)
+
+   def test_step(model, x_batch, y_batch):
+       predictions = model(x_batch)
+       loss = loss_object(y_batch, predictions)
+       test_loss(loss)
+       test_accuracy(y_batch, predictions)
+
+   current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
+   train_log_dir = 'logs/gradient_tape/' + current_time + '/train'
+   test_log_dir = 'logs/gradient_tape/' + current_time + '/test'
+   train_summary_writer = tf.summary.create_file_writer(train_log_dir)
+   test_summary_writer = tf.summary.create_file_writer(test_log_dir)
+
+   model = create_model()  # reset our model
+   EPOCHS = 5
+   for epoch in range(EPOCHS):
+       for (x_batch, y_batch) in train_dataset:
+           train_step(model, optimizer, x_batch, y_batch)
+       with train_summary_writer.as_default():
+           tf.summary.scalar('loss', train_loss.result(), step=epoch)
+           tf.summary.scalar('accuracy', train_accuracy.result(), step=epoch)
+
+       for (x_batch, y_batch) in test_dataset:
+           test_step(model, x_batch, y_batch)
+       with test_summary_writer.as_default():
+           tf.summary.scalar('loss', test_loss.result(), step=epoch)
+           tf.summary.scalar('accuracy', test_accuracy.result(), step=epoch)
+
+       template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
+       print(template.format(epoch + 1,
+                             train_loss.result(),
+                             train_accuracy.result() * 100,
+                             test_loss.result(),
+                             test_accuracy.result() * 100))
+   ```
+   For more detailed information, check out the [TensorBoard documentation](https://www.tensorflow.org/tensorboard/get_started).
+
+4. Execute the TensorFlow log script using the following command:
+     ```bash
+     python3 tensorflow_log.py
+     ```
+
+5. The script creates a `logs` directory inside your current directory.
+
+6. Start the CMF server and configure the [CMF client](step-by-step.md).
+
+7. Use the following command to run the test script, which will generate the MLMD file:
+     ```bash
+     sh test_script.sh
+     ```
+
+8. Use the following command to push the generated MLMD and TensorFlow log files to the CMF server: 
+     ```bash
+     cmf metadata push -p 'pipeline-name' -t 'tensorboard-log-file-name'
+     ```
+
+9. Go to the CMF server and navigate to the TensorBoard tab. You will see an interface similar to the following image.
+    ![image](../assets/Tensorboard.png)
+
+---
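Before running `cmf metadata push` with `-t`, it can help to confirm that the `logs` directory created by the guide's script actually contains TensorBoard event files. The following is a minimal sketch, assuming the usual `events.out.tfevents.*` naming that TensorFlow summary writers produce. `find_event_files` is a hypothetical helper and is not part of cmflib or TensorFlow.

```python
from pathlib import Path


def find_event_files(log_root="logs"):
    """Return a sorted list of TensorBoard event files found under log_root."""
    root = Path(log_root)
    if not root.is_dir():
        # Nothing to push if the training script has not been run yet.
        return []
    return sorted(p for p in root.rglob("events.out.tfevents.*") if p.is_file())


if __name__ == "__main__":
    # List event files under ./logs, one per line.
    for path in find_event_files():
        print(path)
```

An empty result suggests the training script has not been run from the current directory, or that it wrote its logs somewhere else.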