Enable GraphSage Example for TF Backend (#1193)
* graphsage

Signed-off-by: zehao-intel <[email protected]>

* fix known issues

Signed-off-by: zehao-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add configs to support extension test

Signed-off-by: zehao-intel <[email protected]>

---------

Signed-off-by: zehao-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
zehao-intel and pre-commit-ci[bot] authored Aug 30, 2023
1 parent db5dfa0 commit 29ec821
Showing 11 changed files with 629 additions and 1 deletion.
4 changes: 4 additions & 0 deletions .azure-pipelines/scripts/codeScan/pyspelling/inc_dict.txt
@@ -196,6 +196,7 @@ CropResize
CropToBoundingBox
CrossEntropyLoss
Curran
CustomDataset
CustomObj
CvAClvFfyA
DBMDZ
@@ -1671,6 +1672,7 @@ gpus
grafftti
graphDef
graphdef
graphsage
grappler
grey
groupnorm
@@ -2198,6 +2200,7 @@ postprocess
postprocessed
postprocessing
powersave
ppi
pplm
ppn
pragma
@@ -2440,6 +2443,7 @@ ssd
sshleifer
sst
stackoverflow
stanford
startswith
startup
stderr
7 changes: 7 additions & 0 deletions examples/.config/model_params_tensorflow.json
@@ -1807,6 +1807,13 @@
        "input_model": "/tf_dataset/tensorflow/vit/HF-ViT-Base16-Img224-frozen.pb",
        "main_script": "main.py",
        "batch_size": 32
      },
      "GraphSage": {
        "model_src_dir": "graph_networks/graphsage/quantization/ptq",
        "dataset_location": "/tf_dataset/dataset/ppi",
        "input_model": "/tf_dataset/tensorflow/graphsage/graphsage_frozen_model.pb",
        "main_script": "main.py",
        "batch_size": 1000
      }
    }
  }
6 changes: 6 additions & 0 deletions examples/README.md
@@ -279,6 +279,12 @@ Intel® Neural Compressor validated examples with multiple compression technique
<td>Post-Training Static Quantization</td>
<td><a href="./tensorflow/image_recognition/tensorflow_models/vision_transformer/">pb</a></td>
</tr>
<tr>
<td>GraphSage</td>
<td>Graph Networks</td>
<td>Post-Training Static Quantization</td>
<td><a href="./tensorflow/graph_networks/graphsage/">pb</a></td>
</tr>
</tbody>
</table>

126 changes: 126 additions & 0 deletions examples/tensorflow/graph_networks/graphsage/quantization/ptq/README.md
@@ -0,0 +1,126 @@
Step-by-Step
============

This document lists the steps to reproduce the tuning results of the TensorFlow GraphSage model. This example can run on Intel CPUs and GPUs.

# Prerequisite


## 1. Environment
Python 3.6 or a higher version is recommended.

### Install Intel® Neural Compressor
```shell
pip install neural-compressor
```

### Install Intel TensorFlow
```shell
pip install intel-tensorflow
```
> Note: See the validated TensorFlow [versions](/docs/source/installation_guide.md#validated-software-environment).
### Install Dependency Packages
```shell
cd examples/tensorflow/graph_networks/graphsage/quantization/ptq
pip install -r requirements.txt
```

### Install Intel Extension for TensorFlow

#### Quantizing the model on Intel GPU (mandatory to install ITEX)
Intel Extension for TensorFlow is mandatory for quantizing the model on Intel GPUs.

```shell
pip install --upgrade intel-extension-for-tensorflow[gpu]
```
For more details, please follow the procedure in [install-gpu-drivers](https://github.com/intel/intel-extension-for-tensorflow/blob/main/docs/install/install_for_gpu.md#install-gpu-drivers).

#### Quantizing the model on Intel CPU (optional to install ITEX)
Intel Extension for TensorFlow for Intel CPUs is currently experimental. It is not mandatory for quantizing the model on Intel CPUs.

```shell
pip install --upgrade intel-extension-for-tensorflow[cpu]
```

> **Note**:
> The version compatibility of stock TensorFlow and ITEX can be checked [here](https://github.com/intel/intel-extension-for-tensorflow#compatibility-table). Please make sure you have installed compatible versions of TensorFlow and ITEX.
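
A quick way to confirm which versions are active (a sketch, assuming ITEX's standard Python module name):
```python
import tensorflow as tf
import intel_extension_for_tensorflow as itex  # assumed module name for ITEX

print("TensorFlow:", tf.__version__)
print("ITEX:", itex.__version__)
```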
## 2. Prepare Model
Download the frozen graph:
```shell
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/2_12_0/graphsage_frozen_model.pb
```
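
As a sanity check (not part of the example itself), the downloaded frozen graph can be parsed with stock TensorFlow:
```python
import tensorflow as tf

# parse the frozen GraphDef to confirm the download is intact
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("graphsage_frozen_model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
print("Frozen graph contains %d nodes" % len(graph_def.node))
```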

## 3. Prepare Dataset

```shell
wget https://snap.stanford.edu/graphsage/ppi.zip
unzip ppi.zip
```
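
The data loader added in this commit expects the standard GraphSAGE file layout under the extracted directory. A quick presence check (file names inferred from the loader's `prefix + "-G.json"` convention; the extraction path is an assumption):
```python
import os

prefix = "./ppi/ppi"  # assumed path after unzipping ppi.zip
for suffix in ("-G.json", "-id_map.json", "-class_map.json", "-feats.npy"):
    path = prefix + suffix
    print(path, "->", "found" if os.path.exists(path) else "missing")
```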

# Run

## Quantization Config

The `PostTrainingQuantConfig` class uses default parameter settings for running on Intel CPUs. To run this example on Intel GPUs, set the `backend` parameter to `itex` and the `device` parameter to `gpu`.

```python
config = PostTrainingQuantConfig(
    device="gpu",
    backend="itex",
    ...
)
```
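
For CPU-only runs the defaults are sufficient, so the config reduces to:
```python
from neural_compressor.config import PostTrainingQuantConfig

# defaults target Intel CPUs; no device/backend override is needed
config = PostTrainingQuantConfig()
```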

## 1. Quantization

```shell
# The command to quantize the GraphSage model
bash run_quant.sh --input_model=./graphsage_frozen_model.pb --output_model=./nc_graphsage_int8_model.pb --dataset_location=./ppi
```

## 2. Benchmark
```shell
bash run_benchmark.sh --input_model=./nc_graphsage_int8_model.pb --dataset_location=./ppi --mode=performance
```
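
To measure accuracy instead, the same script can be run in accuracy mode (a sketch mirroring the `args.mode == 'accuracy'` branch of main.py shown below):
```shell
bash run_benchmark.sh --input_model=./nc_graphsage_int8_model.pb --dataset_location=./ppi --mode=accuracy
```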

Details of enabling Intel® Neural Compressor on GraphSage for TensorFlow
=========================

This is a tutorial on how to enable the GraphSage model with Intel® Neural Compressor.
## User Code Analysis
The user specifies the FP32 *model*, a calibration dataset *calib_dataloader*, and a custom *eval_func* that encapsulates the evaluation dataset and metric.

For GraphSage, we apply the custom *eval_func* approach because our philosophy is to enable the model with minimal changes. Hence we make two changes to the original code: implementing a calibration dataloader and making the necessary changes to the *eval_func*.
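
As a rough illustration of the *eval_func* contract (a sketch with placeholder data, not the example's real `evaluate`): it accepts a model and returns a single scalar that Intel® Neural Compressor treats as higher-is-better.
```python
import numpy as np

def evaluate(model):
    # Sketch only: a real eval_func would run `model` on the PPI evaluation
    # split; placeholder arrays stand in for predictions and labels here.
    predictions = np.array([1, 0, 1, 1])
    labels = np.array([1, 0, 0, 1])
    return float(np.mean(predictions == labels))
```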

### Code update

After the preparation steps are done, we just need to update main.py as below.
```python
if args.tune:
    from neural_compressor import quantization
    from neural_compressor.data import DataLoader
    from neural_compressor.config import PostTrainingQuantConfig

    dataset = CustomDataset()
    calib_dataloader = DataLoader(framework='tensorflow', dataset=dataset,
                                  batch_size=1, collate_fn=collate_function)
    conf = PostTrainingQuantConfig()
    q_model = quantization.fit(args.input_graph, conf=conf,
                               calib_dataloader=calib_dataloader, eval_func=evaluate)
    q_model.save(args.output_graph)

if args.benchmark:
    if args.mode == 'performance':
        from neural_compressor.benchmark import fit
        from neural_compressor.config import BenchmarkConfig
        conf = BenchmarkConfig()
        fit(args.input_graph, conf, b_func=evaluate)
    elif args.mode == 'accuracy':
        acc_result = evaluate(args.input_graph)
        print("Batch size = %d" % args.batch_size)
        print("Accuracy: %.5f" % acc_result)
```

The quantization.fit() function returns the best quantized model found within the timeout constraint.
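
The timeout and trial budget can be adjusted through the tuning criterion (a sketch, assuming the Intel® Neural Compressor 2.x `TuningCriterion` API):
```python
from neural_compressor.config import PostTrainingQuantConfig, TuningCriterion

# stop tuning after 30 minutes or 50 trials, whichever comes first
tuning_criterion = TuningCriterion(timeout=1800, max_trials=50)
conf = PostTrainingQuantConfig(tuning_criterion=tuning_criterion)
```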
@@ -0,0 +1,80 @@
#
# -*- coding: utf-8 -*-
#
# Copyright (c) 2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import numpy as np
import random
import json
import sys
import os

import networkx as nx
from networkx.readwrite import json_graph


def load_data(prefix, normalize=True, load_walks=False):
    """Load a GraphSAGE-format dataset (<prefix>-G.json, -id_map.json,
    -class_map.json and optional -feats.npy) and return the graph,
    features, id map, walks and class map."""
    G_data = json.load(open(prefix + "-G.json"))
    G = json_graph.node_link_graph(G_data)
    if isinstance(list(G.nodes())[0], int):
        conversion = lambda n: int(n)
    else:
        conversion = lambda n: n

    if os.path.exists(prefix + "-feats.npy"):
        feats = np.load(prefix + "-feats.npy")
    else:
        print("No features present.. Only identity features will be used.")
        feats = None
    id_map = json.load(open(prefix + "-id_map.json"))
    id_map = {conversion(k): int(v) for k, v in id_map.items()}
    walks = []
    class_map = json.load(open(prefix + "-class_map.json"))
    if isinstance(list(class_map.values())[0], list):
        lab_conversion = lambda n: n
    else:
        lab_conversion = lambda n: int(n)

    class_map = {conversion(k): lab_conversion(v) for k, v in class_map.items()}

    ## Remove all nodes that do not have val/test annotations
    ## (necessary because of networkx weirdness with the Reddit data)
    broken_count = 0
    for node in list(G.nodes()):  # iterate over a copy so removal is safe
        if 'val' not in G.nodes[node] or 'test' not in G.nodes[node]:
            G.remove_node(node)
            broken_count += 1
    print("Removed {:d} nodes that lacked proper annotations due to networkx versioning issues".format(broken_count))

    ## Make sure the graph has edge train_removed annotations
    ## (some datasets might already have this..)
    print("Loaded data.. now preprocessing..")
    for edge in G.edges():
        if (G.nodes[edge[0]]['val'] or G.nodes[edge[1]]['val'] or
                G.nodes[edge[0]]['test'] or G.nodes[edge[1]]['test']):
            G[edge[0]][edge[1]]['train_removed'] = True
        else:
            G[edge[0]][edge[1]]['train_removed'] = False

    if normalize and feats is not None:
        from sklearn.preprocessing import StandardScaler
        train_ids = np.array([id_map[n] for n in G.nodes() if not G.nodes[n]['val'] and not G.nodes[n]['test']])
        train_feats = feats[train_ids]
        scaler = StandardScaler()
        scaler.fit(train_feats)
        feats = scaler.transform(feats)

    return G, feats, id_map, walks, class_map
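
For context, a minimal sketch of how the data returned by `load_data` could be wrapped into the `CustomDataset` used for calibration in the README above (the real class lives in the example's sources; the per-item layout below is an assumption, not the example's exact contract):
```python
class CustomDataset:
    """Hypothetical calibration dataset wrapping load_data's outputs."""

    def __init__(self, prefix="./ppi/ppi"):
        self.G, self.feats, self.id_map, _, self.class_map = load_data(prefix)
        self.nodes = list(self.G.nodes())

    def __len__(self):
        return len(self.nodes)

    def __getitem__(self, index):
        node = self.nodes[index]
        # assumed per-item layout: (node features, multi-label target)
        return self.feats[self.id_map[node]], self.class_map[node]
```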