Tensorboard Related Documentation (#187)
* Made changes related to cmf init commands

* Added TensorFlow-related document

* Update mkdocs.yaml

Added for testing of TensorFlow document purpose

* Update document

* Added image

* Added image

* Update tensorflow_guide.md

Made changes inside image path

* kept cmf related old code inside old_archive_code directory

* Removing unnecessary files

* Made cmf server url as optional parameter

* Made changes inside cmf client command file

* Update tensorboard document

* Added OSDF related parameter and Resolved Ann's Comment

* Modified changes related to cmf-server page

* Made changes inside README file

* Made content bold

* Added a disclosure widget

* Updated cmf-server.md docker run commands

* Updating cmf-server.md

* Updating cmf-server.md

* Updating ENV PATH variable as new Dockerfile guidelines

* Update cmf-server.md

---------

Co-authored-by: Varkha Sharma <[email protected]>
AyeshaSanadi and varkha-d-sharma authored Aug 6, 2024
1 parent 0a66528 commit a66fd49
Showing 10 changed files with 259 additions and 63 deletions.
52 changes: 51 additions & 1 deletion README.md
@@ -1,4 +1,54 @@
# cmf
# CMF
CMF (Common Metadata Framework) collects and stores information associated with Machine Learning (ML) pipelines. It also implements APIs to query this metadata. The CMF adopts a data-first approach: all artifacts (such as datasets, ML models and performance metrics) recorded by the framework are versioned and identified by their content hash.
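
To illustrate what identifying an artifact by its content hash means, here is a minimal sketch. It assumes SHA-256 purely for illustration (the text above does not say which hash function CMF uses), and `content_id` is a hypothetical helper, not part of cmflib.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hex digest that depends only on the bytes of the artifact.

    SHA-256 is an assumption made for this sketch; CMF may use a different hash.
    """
    return hashlib.sha256(data).hexdigest()


# Identical content always maps to the same identifier, whatever the file is called.
v1 = content_id(b"label,value\ncat,1\n")
v1_copy = content_id(b"label,value\ncat,1\n")
# Changing a single byte yields a different identifier, i.e. a new version.
v2 = content_id(b"label,value\ncat,2\n")

print(v1 == v1_copy)  # True
print(v1 == v2)       # False
```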

## Installation

#### 1. Pre-Requisites:
<b>
<ul>
<li>Supported Operating Systems: Linux/Ubuntu/Debian</li>
<li>3.9 <= Python < 3.11</li>
<li>Git latest version</li>
</ul>
</b>
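
The Python requirement can be checked before installing. The sketch below reads the requirement as "at least 3.9 and below 3.11", which matches the 3.10 used in the environment setup commands; `is_supported` is a hypothetical helper name.

```python
import sys


def is_supported(version) -> bool:
    """True for Python 3.9 and 3.10, the range read from the prerequisites."""
    return (3, 9) <= tuple(version[:2]) < (3, 11)


# Warn (without exiting) if the running interpreter is outside the range.
if not is_supported(sys.version_info):
    print(f"Unsupported Python {sys.version_info.major}.{sys.version_info.minor}")
```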

#### 2. Set up Python Virtual Environment:
<details>
<summary>Using Conda:</summary>

conda create -n cmf python=3.10
conda activate cmf

</details>

<details>
<summary>Using VirtualEnv:</summary>

virtualenv --python=3.10 .cmf
source .cmf/bin/activate

</details>


#### 3. Install CMF client:
<details>
<summary>Latest version from GitHub:</summary>

pip install git+https://github.com/HewlettPackard/cmf

</details>

<details>
<summary>Stable version from PyPI:</summary>

pip install cmflib

</details>

#### 4. Install CMF server:

Follow the instructions on the [Getting started with cmf-server](./docs/cmf_server/cmf-server.md) page for details on how to set up a cmf-server.

## Common Metadata Framework
[Getting Started](https://hewlettpackard.github.io/cmf/)<br><br>
[Detailed documentation of the APIs](https://hewlettpackard.github.io/cmf/api/public/cmf) <br>
Binary file added docs/assets/Tensorboard.png
71 changes: 46 additions & 25 deletions docs/cmf_client/cmf_client.md
@@ -1,9 +1,9 @@
# Getting started with cmf-client commands
## cmf init
```
Usage: cmf init [-h] {minioS3,amazonS3,local,sshremote,show}
Usage: cmf init [-h] {minioS3,amazonS3,local,sshremote,osdfremote,show}
```
`cmf init` initializes an artifact repository for cmf. Local directory, Minio S3 bucket, Amazon S3 bucket and SSH Remote directory are the options available. Additionally, user can provide cmf-server url.
`cmf init` initializes an artifact repository for cmf. The available options are a local directory, a MinIO S3 bucket, an Amazon S3 bucket, an SSH remote directory, and a remote OSDF directory. Additionally, the user can provide the cmf-server URL.
### cmf init show
```
Usage: cmf init show
@@ -42,7 +42,7 @@ Optional Arguments
--cmf-server-url [cmf_server_url] Specify cmf-server url. (default: http://127.0.0.1:80)
--neo4j-user [neo4j_user] Specify neo4j user. (default: None)
--neo4j-password [neo4j_password] Specify neo4j password. (default: None)
--neo4j-uri <neo4j_uri> Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
--neo4j-uri [neo4j_uri] Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
```
@@ -73,7 +73,7 @@ Optional Arguments
--cmf-server-url [cmf_server_url] Specify cmf-server url. (default: http://127.0.0.1:80)
--neo4j-user [neo4j_user] Specify neo4j user. (default: None)
--neo4j-password [neo4j_password] Specify neo4j password. (default: None)
--neo4j-uri <neo4j_uri> Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
--neo4j-uri [neo4j_uri] Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
```
### cmf init amazonS3
Before setting up, obtain AWS temporary security credentials using the AWS Security Token Service (STS). These credentials are short-term and can last from minutes to hours. They are dynamically generated and provided to trusted users upon request, and expire after use. Users with appropriate permissions can request new credentials before or upon expiration. For further information, refer to the [Temporary security credentials in IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html) page.
@@ -135,21 +135,22 @@ Required Arguments
--url [url] Specify bucket url.
--access-key-id [access_key_id] Specify Access Key Id.
--secret-key [secret_key] Specify Secret Key.
--session-token Specify session token. (default: )
--git-remote-url [git_remote_url] Specify git repo url.
```
Optional Arguments
```
-h, --help show this help message and exit
--session-token Specify session token. (default: )
--cmf-server-url [cmf_server_url] Specify cmf-server url. (default: http://127.0.0.1:80)
--neo4j-user [neo4j_user] Specify neo4j user. (default: None)
--neo4j-password [neo4j_password] Specify neo4j password. (default: None)
--neo4j-uri <neo4j_uri> Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
--neo4j-uri [neo4j_uri] Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
```
### cmf init sshremote
```
Usage: cmf init sshremote [-h] --path [path]
--user [user] --port [port]
--user [user]
--port [port]
--password [password]
--git-remote-url [git_remote_url]
--cmf-server-url [cmf_server_url]
@@ -171,18 +172,18 @@ Required Arguments
```
Optional Arguments
```
-h, --help show this help message and exit
-h, --help show this help message and exit
--cmf-server-url [cmf_server_url] Specify cmf-server url. (default: http://127.0.0.1:80)
--neo4j-user [neo4j_user] Specify neo4j user. (default: None)
--neo4j-password [neo4j_password] Specify neo4j password. (default: None)
--neo4j-uri <neo4j_uri> Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
--neo4j-uri [neo4j_uri] Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
```
### cmf init osdfremote
```
Usage: cmf init osdfremote [-h] --path [path]
--endpoint-url [endpoint_url]
--access-key-id [access_key_id]
--secret-key [secret_key]
--key-id [key_id]
--key-path [key_path]
--key-issuer [key_issuer]
--git-remote-url [git_remote_url]
--cmf-server-url [cmf_server_url]
--neo4j-user [neo4j_user]
@@ -207,7 +208,7 @@ Optional Arguments
--cmf-server-url [cmf_server_url] Specify cmf-server url. (default: http://127.0.0.1:80)
--neo4j-user [neo4j_user] Specify neo4j user. (default: None)
--neo4j-password [neo4j_password] Specify neo4j password. (default: None)
--neo4j-uri <neo4j_uri> Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
--neo4j-uri [neo4j_uri] Specify neo4j uri. Eg bolt://localhost:7687 (default: None)
```
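
The `--cmf-server-url` and `--neo4j-uri` values shown above both follow a `scheme://host:port` shape. As a hedged illustration, the sketch below checks that shape with the standard library before the values are passed to `cmf init`. Requiring an explicit port is an assumption of this sketch, not a documented rule of cmf, and `describe_endpoint` is a hypothetical helper.

```python
from urllib.parse import urlparse


def describe_endpoint(url: str) -> tuple:
    """Split a URL into (scheme, host, port); raise if any part is missing."""
    parts = urlparse(url)
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {url!r}")
    return parts.scheme, parts.hostname, parts.port


# The default cmf-server URL and the example neo4j URI from the help text.
print(describe_endpoint("http://127.0.0.1:80"))    # ('http', '127.0.0.1', 80)
print(describe_endpoint("bolt://localhost:7687"))  # ('bolt', 'localhost', 7687)
```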
## cmf artifact
@@ -217,11 +218,11 @@ Usage: cmf artifact [-h] {pull,push}
`cmf artifact` pulls artifacts from, or pushes artifacts to, the user-configured artifact repository.
### cmf artifact pull
```
Usage: cmf artifact pull [-h] -p [pipeline_name] -f [file_name] [-a <artifact_name>]
Usage: cmf artifact pull [-h] -p [pipeline_name] -f [file_name] -a [artifact_name]
```
The `cmf artifact pull` command pulls artifacts from the user-configured repository to the user's local machine.
```
cmf artifact pull -p 'pipeline-name'
cmf artifact pull -p 'pipeline-name' -f '/path/to/mlmd-file-name' -a 'artifact-name'
```
Required Arguments
```
@@ -230,7 +231,7 @@ Required Arguments
Optional Arguments
```
-h, --help show this help message and exit
-a <artifact_name>, --artifact_name <artifact_name> Specify artifact name only; don't use folder name or absolute path.
-a [artifact_name], --artifact_name [artifact_name] Specify artifact name only; don't use folder name or absolute path.
-f [file_name], --file-name [file_name] Specify mlmd file name.
```
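
The `-a` option asks for the artifact name only, without a folder or absolute path. A minimal sketch of reducing a path to that bare name follows; the example paths are made up, and whether cmf expects exactly this final path component is an assumption to verify against your own metadata.

```python
from pathlib import PurePosixPath


def artifact_name(path: str) -> str:
    """Return only the final component of a path, dropping any folders."""
    return PurePosixPath(path).name


print(artifact_name("artifacts/model/model.pkl"))  # model.pkl
print(artifact_name("/home/user/data/train.csv"))  # train.csv
print(artifact_name("model.pkl"))                  # model.pkl
```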
### cmf artifact push
@@ -239,7 +240,7 @@ Usage: cmf artifact push [-h] -p [pipeline_name] -f [file_name]
```
The `cmf artifact push` command pushes artifacts from the user's local machine to the user-configured artifact repository.
```
cmf artifact push -p 'pipeline_name'
cmf artifact push -p 'pipeline_name' -f '/path/to/mlmd-file-name'
```
Required Arguments
```
@@ -252,16 +253,16 @@ Optional Arguments
```
## cmf metadata
```
Usage: cmf metadata [-h] {pull,push}
Usage: cmf metadata [-h] {pull,push,export}
```
`cmf metadata` push or pull the metadata file to and from the cmf-server, respectively.
`cmf metadata` pushes the metadata file to the cmf-server, pulls it from the cmf-server, or exports it to a JSON file.
### cmf metadata pull
```
Usage: cmf metadata pull [-h] -p [pipeline_name] -f [file_name] -e [exec_id]
```
`cmf metadata pull` command pulls the metadata file from the cmf-server to the user's local machine.
```
cmf metadata pull -p 'pipeline-name' -f "/path/to/mlmd-file-name"
cmf metadata pull -p 'pipeline-name' -f '/path/to/mlmd-file-name' -e 'execution_id'
```
Required Arguments
```
@@ -275,11 +276,31 @@ Optional Arguments
```
### cmf metadata push
```
Usage: cmf metadata push [-h] -p [pipeline_name] -f [file_name] -e [exec_id]
Usage: cmf metadata push [-h] -p [pipeline_name] -f [file_name] -e [exec_id] -t [tensorboard]
```
`cmf metadata push` command pushes the metadata file from the local machine to the cmf-server.
```
cmf metadata push -p 'pipeline-name' -f "/path/to/mlmd-file-name"
cmf metadata push -p 'pipeline-name' -f '/path/to/mlmd-file-name' -e 'execution_id' -t '/path/to/tensorboard-log'
```
Required Arguments
```
-p [pipeline_name], --pipeline_name [pipeline_name] Specify Pipeline name.
```

Optional Arguments
```
-h, --help show this help message and exit
-f [file_name], --file_name [file_name] Specify mlmd file name.
-e [exec_id], --execution [exec_id] Specify execution id.
-t [tensorboard], --tensorboard [tensorboard] Specify path to tensorboard logs for the pipeline.
```
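
Since `-f`, `-e` and `-t` are optional, a script that wraps `cmf metadata push` only needs to add them when they are set. The sketch below assembles the argument list from the flags listed above; `metadata_push_args` is a hypothetical helper, and the command is printed rather than executed.

```python
import shlex


def metadata_push_args(pipeline_name, file_name=None, exec_id=None, tensorboard=None):
    """Build the argument list for `cmf metadata push`, adding optional flags only when set."""
    args = ["cmf", "metadata", "push", "-p", pipeline_name]
    for flag, value in (("-f", file_name), ("-e", exec_id), ("-t", tensorboard)):
        if value is not None:
            args += [flag, str(value)]
    return args


args = metadata_push_args(
    "pipeline-name",
    file_name="/path/to/mlmd-file-name",
    tensorboard="/path/to/tensorboard-log",
)
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```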
### cmf metadata export
```
Usage: cmf metadata export [-h] -p [pipeline_name] -j [json_file_name] -f [file_name]
```
The `cmf metadata export` command exports the local MLMD metadata to a JSON file.
```
cmf metadata export -p 'pipeline-name' -j '/path/to/json-file-name' -f '/path/to/mlmd-file-name'
```
Required Arguments
```
@@ -288,7 +309,7 @@ Required Arguments

Optional Arguments
```
-h, --help show this help message and exit
-f [file_name], --file_name [file_name] Specify mlmd file name.
-e [exec_id], --execution [exec_id] Specify execution id.
-h, --help show this help message and exit
-f [file_name], --file_name [file_name] Specify mlmd file name.
-j [json_file_name], --json_file_name [json_file_name] Specify json file name with full path.
```
119 changes: 119 additions & 0 deletions docs/cmf_client/tensorflow_guide.md
@@ -0,0 +1,119 @@
# How to Use TensorBoard with CMF

1. Copy the contents of the 'example-get-started' directory from `cmf/examples/example-get-started` into a separate directory outside the cmf repository.

2. From that directory, run the following command to install the TensorFlow library:
```bash
pip install tensorflow
```

3. Create a new Python file (e.g., `tensorflow_log.py`) and copy the following code:

```python
import datetime
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28), name='layers_flatten'),
        tf.keras.layers.Dense(512, activation='relu', name='layers_dense'),
        tf.keras.layers.Dropout(0.2, name='layers_dropout'),
        tf.keras.layers.Dense(10, activation='softmax', name='layers_dense_2')
    ])

model = create_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Log the Keras fit() run under a timestamped directory.
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
model.fit(x=x_train, y=y_train, epochs=5, validation_data=(x_test, y_test), callbacks=[tensorboard_callback])

train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))

train_dataset = train_dataset.shuffle(60000).batch(64)
test_dataset = test_dataset.batch(64)

loss_object = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()

# Define our metrics
train_loss = tf.keras.metrics.Mean('train_loss', dtype=tf.float32)
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy('train_accuracy')
test_loss = tf.keras.metrics.Mean('test_loss', dtype=tf.float32)
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy('test_accuracy')

def train_step(model, optimizer, x_train, y_train):
    with tf.GradientTape() as tape:
        predictions = model(x_train, training=True)
        loss = loss_object(y_train, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    train_loss(loss)
    train_accuracy(y_train, predictions)

def test_step(model, x_test, y_test):
    predictions = model(x_test)
    loss = loss_object(y_test, predictions)
    test_loss(loss)
    test_accuracy(y_test, predictions)

# Separate summary writers for the custom training loop.
current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
train_log_dir = 'logs/gradient_tape/' + current_time + '/train'
test_log_dir = 'logs/gradient_tape/' + current_time + '/test'
train_summary_writer = tf.summary.create_file_writer(train_log_dir)
test_summary_writer = tf.summary.create_file_writer(test_log_dir)

model = create_model()  # reset our model
EPOCHS = 5
for epoch in range(EPOCHS):
    for (x_train, y_train) in train_dataset:
        train_step(model, optimizer, x_train, y_train)
    with train_summary_writer.as_default():
        tf.summary.scalar('loss', train_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', train_accuracy.result(), step=epoch)

    for (x_test, y_test) in test_dataset:
        test_step(model, x_test, y_test)
    with test_summary_writer.as_default():
        tf.summary.scalar('loss', test_loss.result(), step=epoch)
        tf.summary.scalar('accuracy', test_accuracy.result(), step=epoch)
    template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
    print(template.format(epoch + 1,
                          train_loss.result(),
                          train_accuracy.result() * 100,
                          test_loss.result(),
                          test_accuracy.result() * 100))

```
For more detailed information, check out the [TensorBoard documentation](https://www.tensorflow.org/tensorboard/get_started).
4. Execute the TensorFlow log script using the following command:
```bash
python3 tensorflow_log.py
```
5. The above script will automatically create a `logs` directory inside your current directory.
6. Start the CMF server and configure the [CMF client](step-by-step.md).
7. Use the following command to run the test script, which will generate the MLMD file:
```bash
sh test_script.sh
```
8. Use the following command to push the generated MLMD and TensorBoard log files to the CMF server:
```bash
cmf metadata push -p 'pipeline-name' -t 'tensorboard-log-file-name'
```
9. Go to the CMF server and navigate to the TensorBoard tab. You will see an interface similar to the following image.
![image](../assets/Tensorboard.png)
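
The script above writes each run into a directory named with a `%Y%m%d-%H%M%S` timestamp under `logs/fit` and `logs/gradient_tape`. As a small sketch, the newest run can be located before pushing. `latest_run` is a hypothetical helper, and whether `-t` should receive the `logs` root or a single run directory is not stated here, so confirm that against your setup.

```python
from pathlib import Path


def latest_run(log_root):
    """Return the newest run directory under log_root, or None if there is none.

    Run directories are named %Y%m%d-%H%M%S, so the lexicographically
    largest name is also the most recent one.
    """
    root = Path(log_root)
    if not root.is_dir():
        return None
    runs = sorted(p for p in root.iterdir() if p.is_dir())
    return runs[-1] if runs else None


run = latest_run("logs/fit")
print(run if run else "no runs found under logs/fit")
```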
---