Zheng Xu edited this page Nov 29, 2017 · 1 revision

FAQs

This page answers some frequently asked questions about running our code. We are working hard to make it useful.

Can you provide the training data?

Unfortunately, our training data is shared privately between us and the NIH, so we cannot release it. You can use the ZINC data to train the fingerprint instead.

What's the format of your data?

It generally looks like the example below: each line holds one SMILES string followed by a number. The number is meaningless for the unsupervised fingerprint, so you can simply put zero.

COc1cc(C)nn1c2cc(OC)nc(C)n2 2.1
CSc1cnccn1 9.9
CCCCn1ccc(=O)c(O)c1CC 1.2
Ic1ccc2ccccc2n1 10.0
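
If you are preparing your own file, the format above can be read and written with a few lines of Python. This is only a minimal sketch: it assumes the SMILES string and the number are separated by whitespace, as in the example, and the helper names `parse_line` and `to_unsupervised_lines` are hypothetical, not part of our code.

```python
# Sketch only: assumes whitespace-separated "SMILES value" lines.
# parse_line and to_unsupervised_lines are illustrative names, not project APIs.

def parse_line(line):
    """Split one data line into (smiles, value)."""
    smiles, value = line.split()
    return smiles, float(value)


def to_unsupervised_lines(smiles_list, placeholder=0):
    """Attach a placeholder number to each SMILES string."""
    return ["{} {}".format(smiles, placeholder) for smiles in smiles_list]


sample = [
    "COc1cc(C)nn1c2cc(OC)nc(C)n2 2.1",
    "CSc1cnccn1 9.9",
    "CCCCn1ccc(=O)c(O)c1CC 1.2",
    "Ic1ccc2ccccc2n1 10.0",
]

parsed = [parse_line(line) for line in sample]

# For unsupervised training the number is ignored, so replace it with zero.
lines = to_unsupervised_lines([smiles for smiles, _ in parsed])
```

Each entry of `lines` can then be written to a text file, one per line.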

The model.json is missing!

An example model.json looks like this:

{"dropout_rate": 0.5, "learning_rate_decay_factor": 0.99, "buckets": [[30, 30], [60, 60], [90, 90]], "target_vocab_size": 41, "batch_size": 256, "source_vocab_size": 41, "num_layers": 2, "max_gradient_norm": 5.0, "learning_rate": 0.5, "size": 128}
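
If you prefer to generate the file rather than copy it by hand, something like the following should work. It is a sketch under assumptions: the values are copied from the example above, the helper `write_model_config` is a hypothetical name, and the temporary directory stands in for whichever model directory you actually use.

```python
import json
import os
import tempfile

# Values copied from the example model.json above.
MODEL_CONFIG = {
    "dropout_rate": 0.5,
    "learning_rate_decay_factor": 0.99,
    "buckets": [[30, 30], [60, 60], [90, 90]],
    "target_vocab_size": 41,
    "batch_size": 256,
    "source_vocab_size": 41,
    "num_layers": 2,
    "max_gradient_norm": 5.0,
    "learning_rate": 0.5,
    "size": 128,
}


def write_model_config(model_dir, config=MODEL_CONFIG):
    """Write config as model.json inside model_dir and return the file path."""
    path = os.path.join(model_dir, "model.json")
    with open(path, "w") as f:
        json.dump(config, f)
    return path


# Assumption: a temporary directory is used here; substitute your own model directory.
model_dir = tempfile.mkdtemp()
config_path = write_model_config(model_dir)

# Read the file back to confirm it is valid JSON.
with open(config_path) as f:
    loaded = json.load(f)
```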

Acknowledgement

We thank Di Wu for contributing to this FAQ.
