From 754f2b0a021b1011ca45691d46ef2e6b00726755 Mon Sep 17 00:00:00 2001
From: Sonali Saha
Date: Thu, 22 Jul 2021 04:01:58 -0400
Subject: [PATCH 1/2] [GSC] Add dockerfile and manifest file for tensorflow
 ResNet50 and BERT models gsc image

Signed-off-by: Sonali Saha
---
 Tools/gsc/test/tensorflow/README.md           | 190 ++++++++++++++++++
 .../ubuntu18.04-tensorflow-bert.dockerfile    |  25 +++
 ...ubuntu18.04-tensorflow-resnet50.dockerfile |  19 ++
 .../gsc/test/ubuntu18.04-tensorflow.manifest  |   6 +
 4 files changed, 240 insertions(+)
 create mode 100644 Tools/gsc/test/tensorflow/README.md
 create mode 100755 Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile
 create mode 100644 Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile
 create mode 100755 Tools/gsc/test/ubuntu18.04-tensorflow.manifest

diff --git a/Tools/gsc/test/tensorflow/README.md b/Tools/gsc/test/tensorflow/README.md
new file mode 100644
index 0000000000..03f480c123
--- /dev/null
+++ b/Tools/gsc/test/tensorflow/README.md
@@ -0,0 +1,190 @@
+# Inference on TensorFlow BERT and ResNet50 models:
+The ``../test`` directory contains the dockerfiles and the manifest file needed to run inference
+with the TensorFlow BERT and ResNet50 sample workloads on GSC. Both examples use pre-trained
+models to run inference. These examples were tested on Ubuntu 18.04 with Python 3.6.
+
+## Bidirectional Encoder Representations from Transformers (BERT):
+BERT is a method of pre-training language representations and then using the trained model for
+downstream NLP tasks such as question answering. BERT is an unsupervised, deeply bidirectional
+system for pre-training NLP. In this BERT sample, we use the 'BERT-Large, Uncased (Whole Word
+Masking)' model and perform int8 inference. More details about BERT can be found at
+https://github.com/google-research/bert.
+
+## Residual Network (ResNet):
+ResNet50 is a convolutional neural network that is 50 layers deep. In this ResNet50 (v1.5)
+sample, we use a pre-trained model and perform int8 inference. More details about ResNet50 can
+be found at
+https://github.com/IntelAI/models/tree/icx-launch-public/benchmarks/image_recognition/tensorflow/resnet50v1_5.
+
+## System settings:
+Linux systems use a CPU frequency scaling governor that scales the CPU frequency either for best
+performance or for power savings, depending on the selected policy. To achieve the best
+performance, set the CPU frequency scaling governor to performance mode (as root):
+
+```
+for ((i=0; i<$(nproc); i++)); \
+do echo 'performance' > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor; done
+```
+
+## Common build steps:
+1. ``cd $(GRAPHENE_DIR)/Tools/gsc``
+
+2. Create a configuration file: ``cp config.yaml.template config.yaml``.
+Manually adapt config.yaml to the installed Intel SGX driver and the desired Graphene
+repository/version.
+
+3. Generate the signing key: ``openssl genrsa -3 -out enclave-key.pem 3072``
+
+## Build a graphenized Docker image and run BERT inference:
+1. Build the Docker image:
+```
+cd test
+docker build --rm -t ubuntu18.04-tensorflow-bert -f ubuntu18.04-tensorflow-bert.dockerfile \
+../../../Examples
+```
+
+2. Graphenize the Docker image using ``gsc build``:
+```
+cd ..
+./gsc build --insecure-args ubuntu18.04-tensorflow-bert test/ubuntu18.04-tensorflow.manifest
+```
+
+3. Sign the graphenized Docker image using ``gsc sign-image``:
+```
+./gsc sign-image ubuntu18.04-tensorflow-bert enclave-key.pem
+```
+
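+Optionally, inspect the SGX signature of the signed image before running it. This check assumes
+your gsc version provides the ``info-image`` subcommand; skip it if yours does not:
+```
+# Print the SGX signature (SIGSTRUCT) fields, e.g. MRENCLAVE and MRSIGNER, of the signed image
+./gsc info-image gsc-ubuntu18.04-tensorflow-bert
+```
+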
+4. To run int8 inference on GSC:
+```
+docker run --device=/dev/sgx_enclave --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
+--env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
+gsc-ubuntu18.04-tensorflow-bert \
+models/models/language_modeling/tensorflow/bert_large/inference/run_squad.py \
+--init_checkpoint=data/bert_large_checkpoints/model.ckpt-3649 \
+--vocab_file=data/wwm_uncased_L-24_H-1024_A-16/vocab.txt \
+--bert_config_file=data/wwm_uncased_L-24_H-1024_A-16/bert_config.json \
+--predict_file=data/wwm_uncased_L-24_H-1024_A-16/dev-v1.1.json \
+--precision=int8 \
+--predict_batch_size=32 \
+--experimental_gelu=True \
+--optimized_softmax=True \
+--input_graph=data/asymmetric_per_channel_bert_int8.pb \
+--do_predict=True \
+--mode=benchmark \
+--inter_op_parallelism_threads=1 \
+--intra_op_parallelism_threads=36 \
+--output_dir=output/bert-squad-output
+```
+
+5. To run int8 inference in a native container:
+```
+docker run --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
+--env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
+ubuntu18.04-tensorflow-bert \
+models/models/language_modeling/tensorflow/bert_large/inference/run_squad.py \
+--init_checkpoint=data/bert_large_checkpoints/model.ckpt-3649 \
+--vocab_file=data/wwm_uncased_L-24_H-1024_A-16/vocab.txt \
+--bert_config_file=data/wwm_uncased_L-24_H-1024_A-16/bert_config.json \
+--predict_file=data/wwm_uncased_L-24_H-1024_A-16/dev-v1.1.json \
+--precision=int8 \
+--predict_batch_size=32 \
+--experimental_gelu=True \
+--optimized_softmax=True \
+--input_graph=data/asymmetric_per_channel_bert_int8.pb \
+--do_predict=True \
+--mode=benchmark \
+--inter_op_parallelism_threads=1 \
+--intra_op_parallelism_threads=36 \
+--output_dir=output/bert-squad-output
+```
+
+6. The above commands are for a 36-core system. For optimal performance, set the following
+options according to your system (a shell sketch for deriving these values appears before the
+ResNet50 run command below).
+   - OMP_NUM_THREADS='Core(s) per socket'
+   - --cpuset-cpus to 'Core(s) per socket'
+   - intra_op_parallelism_threads='Core(s) per socket'
+   - If hyperthreading is enabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact,1,0``
+   - If hyperthreading is disabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact``
+   - **NOTE** To get 'Core(s) per socket', run ``lscpu | grep 'Core(s) per socket'``. \
+   OMP_NUM_THREADS sets the maximum number of threads to use for OpenMP parallel regions. \
+   KMP_AFFINITY binds OpenMP threads to physical processing units.
+
+## Build a graphenized Docker image and run ResNet50 inference:
+1. Build the Docker image:
+```
+cd test
+docker build --rm -t ubuntu18.04-tensorflow-resnet50 -f ubuntu18.04-tensorflow-resnet50.dockerfile \
+../../../Examples
+```
+
+2. Graphenize the Docker image using ``gsc build``:
+```
+cd ..
+./gsc build --insecure-args ubuntu18.04-tensorflow-resnet50 test/ubuntu18.04-tensorflow.manifest
+```
+
+3. Sign the graphenized Docker image using ``gsc sign-image``:
+```
+./gsc sign-image ubuntu18.04-tensorflow-resnet50 enclave-key.pem
+```
+
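+The ResNet50 run commands below (like the BERT commands above) assume 36 physical cores on one
+socket. The following shell sketch (an illustration only; ``CORES`` and ``LAST`` are local
+helper variables, not gsc options) derives suitable values for your machine:
+```
+# Physical cores per socket, as reported by lscpu
+CORES=$(lscpu | awk -F: '/Core\(s\) per socket/ {gsub(/ /, "", $2); print $2}')
+LAST=$((CORES - 1))
+echo "Pass --cpuset-cpus=\"0-$LAST\" --env OMP_NUM_THREADS=$CORES to docker run"
+```
+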
+4. To run inference on GSC:
+```
+docker run --device=/dev/sgx_enclave --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
+--env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
+gsc-ubuntu18.04-tensorflow-resnet50 \
+models/models/image_recognition/tensorflow/resnet50v1_5/inference/eval_image_classifier_inference.py \
+--input-graph=resnet50v1_5_int8_pretrained_model.pb \
+--num-inter-threads=1 \
+--num-intra-threads=36 \
+--batch-size=32 \
+--warmup-steps=50 \
+--steps=500
+```
+**NOTE**: If an out-of-memory (OOM) error occurs, set the environment variable
+``TF_MKL_ALLOC_MAX_BYTES`` to an upper bound on memory allocation. For example, on a machine
+with 32 GB of memory, pass ``--env TF_MKL_ALLOC_MAX_BYTES=17179869184`` (16 GB) to the
+``docker run`` command.
+
+5. To run inference in a native container:
+```
+docker run --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
+--env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
+ubuntu18.04-tensorflow-resnet50 \
+models/models/image_recognition/tensorflow/resnet50v1_5/inference/eval_image_classifier_inference.py \
+--input-graph=resnet50v1_5_int8_pretrained_model.pb \
+--num-inter-threads=1 \
+--num-intra-threads=36 \
+--batch-size=32 \
+--warmup-steps=50 \
+--steps=500
+```
+
+6. The above commands are for a 36-core system. For optimal performance, set the following
+options according to your system (see the shell sketch above for deriving these values).
+   - OMP_NUM_THREADS='Core(s) per socket'
+   - --cpuset-cpus to 'Core(s) per socket'
+   - num-intra-threads='Core(s) per socket'
+   - If hyperthreading is enabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact,1,0``
+   - If hyperthreading is disabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact``
+   - The ``--batch-size``, ``--warmup-steps`` and ``--steps`` options can be varied.
+   - **NOTE** To get 'Core(s) per socket', run ``lscpu | grep 'Core(s) per socket'``. \
+   OMP_NUM_THREADS sets the maximum number of threads to use for OpenMP parallel regions. \
+   KMP_AFFINITY binds OpenMP threads to physical processing units.
+
+## Performance considerations:
+- The preheat manifest option pre-faults the enclave memory and moves the performance penalty to
+graphene-sgx invocation (before the workload starts executing). To use the preheat option, add
+``sgx.preheat_enclave = 1`` to the manifest template.
+- TCMalloc and mimalloc are memory allocator libraries from Google and Microsoft, respectively,
+that can significantly improve performance for some workloads. Only one of these allocators can
+be used at a time.
+  - TCMalloc (update the library location and name if they differ from the defaults):
+    - Install TCMalloc: ``sudo apt-get install google-perftools``
+    - Add the following lines to the manifest template:
+      - ``loader.env.LD_PRELOAD = "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"``
+      - ``sgx.trusted_files.libtcmalloc = "file:/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"``
+      - ``sgx.trusted_files.libunwind = "file:/usr/lib/x86_64-linux-gnu/libunwind.so.8"``
+    - Save the template and rebuild.
+  - mimalloc (update the library location and name if they differ from the defaults):
+    - Install mimalloc using the steps from https://github.com/microsoft/mimalloc
+    - Add the following lines to the manifest template:
+      - ``loader.env.LD_PRELOAD = "/usr/local/lib/mimalloc-1.7/libmimalloc.so.1.7"``
+      - ``sgx.trusted_files.libmimalloc = "file:/usr/local/lib/mimalloc-1.7/libmimalloc.so.1.7"``
+    - Save the template and rebuild.
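+
+Taken together, a manifest-template fragment that enables the preheat option and the TCMalloc
+allocator (using the default Ubuntu 18.04 paths listed above; verify the library paths on your
+system) would look like this:
+```
+sgx.preheat_enclave = 1
+loader.env.LD_PRELOAD = "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"
+sgx.trusted_files.libtcmalloc = "file:/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"
+sgx.trusted_files.libunwind = "file:/usr/lib/x86_64-linux-gnu/libunwind.so.8"
+```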
\ No newline at end of file
diff --git a/Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile b/Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile
new file mode 100755
index 0000000000..8c7a00768e
--- /dev/null
+++ b/Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile
@@ -0,0 +1,25 @@
+From ubuntu:18.04
+
+# Install prerequisites
+RUN apt-get update \
+    && apt-get install -y git wget \
+    && apt-get install -y python3.6 python3-pip unzip \
+    && pip3 install --upgrade pip
+
+# Install tensorflow
+RUN pip3 install intel-tensorflow-avx512==2.4.0
+
+# Download models
+RUN git clone https://github.com/IntelAI/models.git /models/
+
+# Download data
+RUN mkdir -p data \
+    && cd data \
+    && wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip \
+    && unzip wwm_uncased_L-24_H-1024_A-16.zip \
+    && wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -P wwm_uncased_L-24_H-1024_A-16 \
+    && wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_8/bert_large_checkpoints.zip \
+    && unzip bert_large_checkpoints.zip \
+    && wget https://storage.googleapis.com/intel-optimized-tensorflow/models/r2.5-icx-b631821f/asymmetric_per_channel_bert_int8.pb
+
+ENTRYPOINT ["python3.6"]
diff --git a/Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile b/Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile
new file mode 100644
index 0000000000..da0c7077bf
--- /dev/null
+++ b/Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile
@@ -0,0 +1,19 @@
+From ubuntu:18.04
+
+# Install prerequisites
+RUN apt-get update \
+    && apt-get install -y git wget \
+    && apt-get install -y python3.6 python3-pip
+
+RUN pip3 install --upgrade pip
+
+# Install tensorflow
+RUN pip3 install intel-tensorflow-avx512==2.4.0
+
+# Download input graph file
+RUN wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_8/resnet50v1_5_int8_pretrained_model.pb
+
+# Download model
+RUN git clone https://github.com/IntelAI/models.git /models/
+
+ENTRYPOINT ["python3.6"]
diff --git a/Tools/gsc/test/ubuntu18.04-tensorflow.manifest b/Tools/gsc/test/ubuntu18.04-tensorflow.manifest
new file mode 100755
index 0000000000..ee93a2000d
--- /dev/null
+++ b/Tools/gsc/test/ubuntu18.04-tensorflow.manifest
@@ -0,0 +1,6 @@
+sgx.enclave_size = "32G"
+sgx.thread_num = 300
+loader.pal_internal_mem_size = "64M"
+loader.insecure__use_host_env = 1
+sgx.allowed_files.tmp = "file:/tmp"
+sgx.preheat_enclave = 1

From 54e4d4d9f9142f93d731a32ac0faaeb8d0f62251 Mon Sep 17 00:00:00 2001
From: Sonali Saha
Date: Fri, 13 Aug 2021 06:49:15 -0400
Subject: [PATCH 2/2] fixup!
 [GSC] Add dockerfile and manifest file for tensorflow ResNet50 and BERT models gsc image

---
 Tools/gsc/Examples/tensorflow/README.md       |  92 +++++++++
 .../ubuntu18.04-tensorflow-bert.dockerfile    |   9 +-
 ...ubuntu18.04-tensorflow-resnet50.dockerfile |   8 +-
 .../ubuntu18.04-tensorflow.manifest           |   0
 Tools/gsc/test/tensorflow/README.md           | 190 ------------------
 5 files changed, 99 insertions(+), 200 deletions(-)
 create mode 100644 Tools/gsc/Examples/tensorflow/README.md
 rename Tools/gsc/{test => Examples/tensorflow}/ubuntu18.04-tensorflow-bert.dockerfile (78%)
 mode change 100755 => 100644
 rename Tools/gsc/{test => Examples/tensorflow}/ubuntu18.04-tensorflow-resnet50.dockerfile (71%)
 rename Tools/gsc/{test => Examples/tensorflow}/ubuntu18.04-tensorflow.manifest (100%)
 mode change 100755 => 100644
 delete mode 100644 Tools/gsc/test/tensorflow/README.md

diff --git a/Tools/gsc/Examples/tensorflow/README.md b/Tools/gsc/Examples/tensorflow/README.md
new file mode 100644
index 0000000000..48ac97b24d
--- /dev/null
+++ b/Tools/gsc/Examples/tensorflow/README.md
@@ -0,0 +1,92 @@
+# Inference on TensorFlow BERT and ResNet50 models:
+For additional information on how to install, run and optimize TensorFlow, please see
+https://github.com/Satya1493/graphene/blob/tensorflow/Examples/tensorflow/README.md.
+
+## Build a graphenized Docker image and run BERT inference:
+1. Build the Docker image:
+```
+docker build --rm -t ubuntu18.04-tensorflow-bert -f ubuntu18.04-tensorflow-bert.dockerfile .
+```
+
+2. Graphenize the Docker image using ``gsc build``:
+```
+cd ../..
+./gsc build --insecure-args ubuntu18.04-tensorflow-bert Examples/tensorflow/ubuntu18.04-tensorflow.manifest
+```
+
+3. Sign the graphenized Docker image using ``gsc sign-image``:
+```
+./gsc sign-image ubuntu18.04-tensorflow-bert enclave-key.pem
+```
+
+4. To run fp32 inference on GSC:
+```
+docker run --device=/dev/sgx_enclave --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
+--env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
+gsc-ubuntu18.04-tensorflow-bert \
+models/models/language_modeling/tensorflow/bert_large/inference/run_squad.py \
+--init_checkpoint=data/bert_large_checkpoints/model.ckpt-3649 \
+--vocab_file=data/wwm_uncased_L-24_H-1024_A-16/vocab.txt \
+--bert_config_file=data/wwm_uncased_L-24_H-1024_A-16/bert_config.json \
+--predict_file=data/wwm_uncased_L-24_H-1024_A-16/dev-v1.1.json \
+--precision=fp32 \
+--predict_batch_size=32 \
+--experimental_gelu=True \
+--optimized_softmax=True \
+--input_graph=data/fp32_bert_squad.pb \
+--do_predict=True \
+--mode=benchmark \
+--inter_op_parallelism_threads=1 \
+--intra_op_parallelism_threads=36 \
+--output_dir=output/bert-squad-output
+```
+
+5. To run fp32 inference in a native container (outside Graphene), remove
+``--device=/dev/sgx_enclave`` and replace ``gsc-ubuntu18.04-tensorflow-bert`` with
+``ubuntu18.04-tensorflow-bert`` in the above command.
+
+6. The above commands are for a 36-core system. See
+https://github.com/Satya1493/graphene/blob/tensorflow/Examples/tensorflow/README.md for guidance
+on setting these options for optimal performance.
+
+## Build a graphenized Docker image and run ResNet50 inference:
+1. Build the Docker image:
+```
+docker build --rm -t ubuntu18.04-tensorflow-resnet50 -f ubuntu18.04-tensorflow-resnet50.dockerfile .
+```
+
+2. Graphenize the Docker image using ``gsc build``:
+```
+cd ../..
+./gsc build --insecure-args ubuntu18.04-tensorflow-resnet50 Examples/tensorflow/ubuntu18.04-tensorflow.manifest
+```
+
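+At this point the graphenized image exists but is not yet signed. Assuming gsc follows its usual
+naming convention and tags the intermediate image ``gsc-<image>-unsigned``, you can confirm that
+it was created:
+```
+# List the graphenized (still unsigned) image produced by gsc build
+docker images | grep gsc-ubuntu18.04-tensorflow-resnet50
+```
+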
+3. Sign the graphenized Docker image using ``gsc sign-image``:
+```
+./gsc sign-image ubuntu18.04-tensorflow-resnet50 enclave-key.pem
+```
+
+4. To run int8 inference on GSC:
+```
+docker run --device=/dev/sgx_enclave --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
+--env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
+gsc-ubuntu18.04-tensorflow-resnet50 \
+models/models/image_recognition/tensorflow/resnet50v1_5/inference/eval_image_classifier_inference.py \
+--input-graph=resnet50v1_5_int8_pretrained_model.pb \
+--num-inter-threads=1 \
+--num-intra-threads=36 \
+--batch-size=32 \
+--warmup-steps=50 \
+--steps=500
+```
+**NOTE**: If an out-of-memory (OOM) error occurs, set the environment variable
+``TF_MKL_ALLOC_MAX_BYTES`` to an upper bound on memory allocation. For example, on a machine
+with 32 GB of memory, pass ``--env TF_MKL_ALLOC_MAX_BYTES=17179869184`` (16 GB) to the
+``docker run`` command.
+
+5. To run int8 inference in a native container (outside Graphene), remove
+``--device=/dev/sgx_enclave`` and replace ``gsc-ubuntu18.04-tensorflow-resnet50`` with
+``ubuntu18.04-tensorflow-resnet50`` in the above command.
+
+6. The above commands are for a 36-core system. See
+https://github.com/Satya1493/graphene/blob/tensorflow/Examples/tensorflow/README.md for guidance
+on setting these options for optimal performance.
diff --git a/Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile b/Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow-bert.dockerfile
old mode 100755
new mode 100644
similarity index 78%
rename from Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile
rename to Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow-bert.dockerfile
index 8c7a00768e..d4c5e097c5
--- a/Tools/gsc/test/ubuntu18.04-tensorflow-bert.dockerfile
+++ b/Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow-bert.dockerfile
@@ -2,8 +2,7 @@ From ubuntu:18.04
 
 # Install prerequisites
 RUN apt-get update \
-    && apt-get install -y git wget \
-    && apt-get install -y python3.6 python3-pip unzip \
+    && apt-get install -y git wget python3 python3-pip unzip \
     && pip3 install --upgrade pip
 
 # Install tensorflow
@@ -13,13 +12,13 @@ RUN pip3 install intel-tensorflow-avx512==2.4.0
 RUN git clone https://github.com/IntelAI/models.git /models/
 
 # Download data
-RUN mkdir -p data \
+RUN mkdir -p data \
     && cd data \
     && wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip \
     && unzip wwm_uncased_L-24_H-1024_A-16.zip \
     && wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -P wwm_uncased_L-24_H-1024_A-16 \
     && wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_8/bert_large_checkpoints.zip \
     && unzip bert_large_checkpoints.zip \
-    && wget https://storage.googleapis.com/intel-optimized-tensorflow/models/r2.5-icx-b631821f/asymmetric_per_channel_bert_int8.pb
+    && wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v2_4_0/fp32_bert_squad.pb
 
-ENTRYPOINT ["python3.6"]
+ENTRYPOINT ["python3"]
diff --git a/Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile b/Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow-resnet50.dockerfile
similarity index 71%
rename from Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile
rename to Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow-resnet50.dockerfile
index da0c7077bf..0345987a29 100644
--- a/Tools/gsc/test/ubuntu18.04-tensorflow-resnet50.dockerfile
+++ b/Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow-resnet50.dockerfile
@@ -2,10 +2,8 @@ From ubuntu:18.04
 
 # Install prerequisites
 RUN apt-get update \
-    && apt-get install -y git wget \
-    && apt-get install -y python3.6 python3-pip
-
-RUN pip3 install --upgrade pip
+    && apt-get install -y git wget python3 python3-pip \
+    && pip3 install --upgrade pip
 
 # Install tensorflow
 RUN pip3 install intel-tensorflow-avx512==2.4.0
@@ -16,4 +14,4 @@ RUN wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_8/resnet50v1_5_int8_pretrained_model.pb
 
 # Download model
 RUN git clone https://github.com/IntelAI/models.git /models/
 
-ENTRYPOINT ["python3.6"]
+ENTRYPOINT ["python3"]
diff --git a/Tools/gsc/test/ubuntu18.04-tensorflow.manifest b/Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow.manifest
old mode 100755
new mode 100644
similarity index 100%
rename from Tools/gsc/test/ubuntu18.04-tensorflow.manifest
rename to Tools/gsc/Examples/tensorflow/ubuntu18.04-tensorflow.manifest
diff --git a/Tools/gsc/test/tensorflow/README.md b/Tools/gsc/test/tensorflow/README.md
deleted file mode 100644
index 03f480c123..0000000000
--- a/Tools/gsc/test/tensorflow/README.md
+++ /dev/null
@@ -1,190 +0,0 @@
-# Inference on TensorFlow BERT and ResNet50 models:
-The ``../test`` directory contains the dockerfiles and the manifest file needed to run inference
-with the TensorFlow BERT and ResNet50 sample workloads on GSC. Both examples use pre-trained
-models to run inference. These examples were tested on Ubuntu 18.04 with Python 3.6.
-
-## Bidirectional Encoder Representations from Transformers (BERT):
-BERT is a method of pre-training language representations and then using the trained model for
-downstream NLP tasks such as question answering. BERT is an unsupervised, deeply bidirectional
-system for pre-training NLP. In this BERT sample, we use the 'BERT-Large, Uncased (Whole Word
-Masking)' model and perform int8 inference. More details about BERT can be found at
-https://github.com/google-research/bert.
-
-## Residual Network (ResNet):
-ResNet50 is a convolutional neural network that is 50 layers deep. In this ResNet50 (v1.5)
-sample, we use a pre-trained model and perform int8 inference. More details about ResNet50 can
-be found at
-https://github.com/IntelAI/models/tree/icx-launch-public/benchmarks/image_recognition/tensorflow/resnet50v1_5.
-
-## System settings:
-Linux systems use a CPU frequency scaling governor that scales the CPU frequency either for best
-performance or for power savings, depending on the selected policy. To achieve the best
-performance, set the CPU frequency scaling governor to performance mode (as root):
-
-```
-for ((i=0; i<$(nproc); i++)); \
-do echo 'performance' > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor; done
-```
-
-## Common build steps:
-1. ``cd $(GRAPHENE_DIR)/Tools/gsc``
-
-2. Create a configuration file: ``cp config.yaml.template config.yaml``.
-Manually adapt config.yaml to the installed Intel SGX driver and the desired Graphene
-repository/version.
-
-3. Generate the signing key: ``openssl genrsa -3 -out enclave-key.pem 3072``
-
-## Build a graphenized Docker image and run BERT inference:
-1. Build the Docker image:
-```
-cd test
-docker build --rm -t ubuntu18.04-tensorflow-bert -f ubuntu18.04-tensorflow-bert.dockerfile \
-../../../Examples
-```
-
-2. Graphenize the Docker image using ``gsc build``:
-```
-cd ..
-./gsc build --insecure-args ubuntu18.04-tensorflow-bert test/ubuntu18.04-tensorflow.manifest
-```
-
-3. Sign the graphenized Docker image using ``gsc sign-image``:
-```
-./gsc sign-image ubuntu18.04-tensorflow-bert enclave-key.pem
-```
-
-4. To run int8 inference on GSC:
-```
-docker run --device=/dev/sgx_enclave --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
---env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
-gsc-ubuntu18.04-tensorflow-bert \
-models/models/language_modeling/tensorflow/bert_large/inference/run_squad.py \
---init_checkpoint=data/bert_large_checkpoints/model.ckpt-3649 \
---vocab_file=data/wwm_uncased_L-24_H-1024_A-16/vocab.txt \
---bert_config_file=data/wwm_uncased_L-24_H-1024_A-16/bert_config.json \
---predict_file=data/wwm_uncased_L-24_H-1024_A-16/dev-v1.1.json \
---precision=int8 \
---predict_batch_size=32 \
---experimental_gelu=True \
---optimized_softmax=True \
---input_graph=data/asymmetric_per_channel_bert_int8.pb \
---do_predict=True \
---mode=benchmark \
---inter_op_parallelism_threads=1 \
---intra_op_parallelism_threads=36 \
---output_dir=output/bert-squad-output
-```
-
-5. To run int8 inference in a native container:
-```
-docker run --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
---env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
-ubuntu18.04-tensorflow-bert \
-models/models/language_modeling/tensorflow/bert_large/inference/run_squad.py \
---init_checkpoint=data/bert_large_checkpoints/model.ckpt-3649 \
---vocab_file=data/wwm_uncased_L-24_H-1024_A-16/vocab.txt \
---bert_config_file=data/wwm_uncased_L-24_H-1024_A-16/bert_config.json \
---predict_file=data/wwm_uncased_L-24_H-1024_A-16/dev-v1.1.json \
---precision=int8 \
---predict_batch_size=32 \
---experimental_gelu=True \
---optimized_softmax=True \
---input_graph=data/asymmetric_per_channel_bert_int8.pb \
---do_predict=True \
---mode=benchmark \
---inter_op_parallelism_threads=1 \
---intra_op_parallelism_threads=36 \
---output_dir=output/bert-squad-output
-```
-
-6. The above commands are for a 36-core system. For optimal performance, set the following
-options according to your system.
-   - OMP_NUM_THREADS='Core(s) per socket'
-   - --cpuset-cpus to 'Core(s) per socket'
-   - intra_op_parallelism_threads='Core(s) per socket'
-   - If hyperthreading is enabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact,1,0``
-   - If hyperthreading is disabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact``
-   - **NOTE** To get 'Core(s) per socket', run ``lscpu | grep 'Core(s) per socket'``. \
-   OMP_NUM_THREADS sets the maximum number of threads to use for OpenMP parallel regions. \
-   KMP_AFFINITY binds OpenMP threads to physical processing units.
-
-## Build a graphenized Docker image and run ResNet50 inference:
-1. Build the Docker image:
-```
-cd test
-docker build --rm -t ubuntu18.04-tensorflow-resnet50 -f ubuntu18.04-tensorflow-resnet50.dockerfile \
-../../../Examples
-```
-
-2. Graphenize the Docker image using ``gsc build``:
-```
-cd ..
-./gsc build --insecure-args ubuntu18.04-tensorflow-resnet50 test/ubuntu18.04-tensorflow.manifest
-```
-
-3. Sign the graphenized Docker image using ``gsc sign-image``:
-```
-./gsc sign-image ubuntu18.04-tensorflow-resnet50 enclave-key.pem
-```
-
-4. To run inference on GSC:
-```
-docker run --device=/dev/sgx_enclave --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
---env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
-gsc-ubuntu18.04-tensorflow-resnet50 \
-models/models/image_recognition/tensorflow/resnet50v1_5/inference/eval_image_classifier_inference.py \
---input-graph=resnet50v1_5_int8_pretrained_model.pb \
---num-inter-threads=1 \
---num-intra-threads=36 \
---batch-size=32 \
---warmup-steps=50 \
---steps=500
-```
-**NOTE**: If an out-of-memory (OOM) error occurs, set the environment variable
-``TF_MKL_ALLOC_MAX_BYTES`` to an upper bound on memory allocation. For example, on a machine
-with 32 GB of memory, pass ``--env TF_MKL_ALLOC_MAX_BYTES=17179869184`` (16 GB) to the
-``docker run`` command.
-
-5. To run inference in a native container:
-```
-docker run --cpuset-cpus="0-35" --env OMP_NUM_THREADS=36 \
---env KMP_AFFINITY=granularity=fine,noverbose,compact,1,0 \
-ubuntu18.04-tensorflow-resnet50 \
-models/models/image_recognition/tensorflow/resnet50v1_5/inference/eval_image_classifier_inference.py \
---input-graph=resnet50v1_5_int8_pretrained_model.pb \
---num-inter-threads=1 \
---num-intra-threads=36 \
---batch-size=32 \
---warmup-steps=50 \
---steps=500
-```
-
-6. The above commands are for a 36-core system. For optimal performance, set the following
-options according to your system.
-   - OMP_NUM_THREADS='Core(s) per socket'
-   - --cpuset-cpus to 'Core(s) per socket'
-   - num-intra-threads='Core(s) per socket'
-   - If hyperthreading is enabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact,1,0``
-   - If hyperthreading is disabled: use ``KMP_AFFINITY=granularity=fine,verbose,compact``
-   - The ``--batch-size``, ``--warmup-steps`` and ``--steps`` options can be varied.
-   - **NOTE** To get 'Core(s) per socket', run ``lscpu | grep 'Core(s) per socket'``. \
-   OMP_NUM_THREADS sets the maximum number of threads to use for OpenMP parallel regions. \
-   KMP_AFFINITY binds OpenMP threads to physical processing units.
-
-## Performance considerations:
-- The preheat manifest option pre-faults the enclave memory and moves the performance penalty to
-graphene-sgx invocation (before the workload starts executing). To use the preheat option, add
-``sgx.preheat_enclave = 1`` to the manifest template.
-- TCMalloc and mimalloc are memory allocator libraries from Google and Microsoft, respectively,
-that can significantly improve performance for some workloads. Only one of these allocators can
-be used at a time.
-  - TCMalloc (update the library location and name if they differ from the defaults):
-    - Install TCMalloc: ``sudo apt-get install google-perftools``
-    - Add the following lines to the manifest template:
-      - ``loader.env.LD_PRELOAD = "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"``
-      - ``sgx.trusted_files.libtcmalloc = "file:/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"``
-      - ``sgx.trusted_files.libunwind = "file:/usr/lib/x86_64-linux-gnu/libunwind.so.8"``
-    - Save the template and rebuild.
-  - mimalloc (update the library location and name if they differ from the defaults):
-    - Install mimalloc using the steps from https://github.com/microsoft/mimalloc
-    - Add the following lines to the manifest template:
-      - ``loader.env.LD_PRELOAD = "/usr/local/lib/mimalloc-1.7/libmimalloc.so.1.7"``
-      - ``sgx.trusted_files.libmimalloc = "file:/usr/local/lib/mimalloc-1.7/libmimalloc.so.1.7"``
-    - Save the template and rebuild.
\ No newline at end of file