New Series 4 candidates based on generative model - EOSI #34

Open
miquelduranfrigola opened this issue May 31, 2021 · 69 comments

@miquelduranfrigola

Hello @mattodd @edwintse,

At @ersilia-os we have tried to generate new Series 4 candidates. In short, we provide two tables:

  • A list of >100k molecules obtained with a generative model: download 100k
  • A relatively diverse selection of 1k molecules: download 1k

For a first assessment of the results, you can check this dynamic visualization of the selected 1k candidates. If a cluster is of particular interest, please refer to the full results to discover other similar molecules. You can also check a tree map of all molecules.

Our generative model approach is based on Reinvent 2.0. We have implemented several reinforcement-learning agents, aimed at optimizing activity and other desirable properties. This GitHub Repository contains more detailed information and source code.

This is the first time we have run a generative model, so please bear with us. We will be more than happy to optimize further runs based on your feedback.

Thanks!
@GemmaTuron @miquelduranfrigola

@edwintse
Collaborator

edwintse commented Jun 8, 2021

Hi @miquelduranfrigola, thanks for all these great suggestions!

  • Ideally we would be looking for suggestions that are a bit different from our existing compounds (e.g. we're less interested in single-point changes like adding halogens). Would it be possible to optimise to narrow down the list further?
  • We were also wondering if, in the dynamic visualisation page, you were able to add a sliding filter for LogP? That would be helpful for us as a guide for solubility.

@miquelduranfrigola
Author

Hi @edwintse thanks for the feedback.

To answer your points

  • Yes, it is possible to further narrow down the list. We have put here the results of several batches/runs of small molecule generation, some of which, as you have noticed, were focused on a very narrow chemical space (e.g. batch 1). In subsequent batches we forced some diversity by penalizing molecules that looked too similar to existing compounds. This generated a decent chemical diversity in batches >3. In the tree map you will see a field called "Tanimoto" that precisely measures this feature (the similarity to existing series 4 molecules). The dynamic representation (1k compounds) that we provide is just a cluster-based selection. We are happy to do alternative ones and come up with a richer (and shorter :) ) list of candidates.
  • Meanwhile, we have added a sliding filter for sLogP in the dynamic visualisation page.
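
To make the "Tanimoto" field concrete, here is a minimal sketch of how the similarity of a candidate to the known Series 4 set could be computed. It assumes each molecule is already represented as a set of "on" fingerprint bits; the bit indices below are made-up toy values, and the function names are hypothetical, not taken from the Ersilia code.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity_to_known(candidate: set[int], known: list[set[int]]) -> float:
    """Highest Tanimoto similarity between a candidate and any known compound."""
    return max((tanimoto(candidate, k) for k in known), default=0.0)


# Toy fingerprints (hypothetical bit indices, not real Series 4 data).
known_series4 = [{1, 2, 3, 4}, {2, 3, 5, 8}]
close_analogue = {1, 2, 3, 4, 9}
distant_candidate = {3, 10, 11, 12}

sim_close = max_similarity_to_known(close_analogue, known_series4)
sim_distant = max_similarity_to_known(distant_candidate, known_series4)
print(f"close analogue: {sim_close:.2f}, distant candidate: {sim_distant:.2f}")
```

A diversity penalty of the kind described above would then lower the score of candidates whose maximum similarity is high, such as the close analogue here.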

Suggested way forward

We will:

  • Do 1 or 2 extra batches based on new constraints (see below)
  • Do a better selection (100 candidates? 10?)

The most important thing now is to define the constraints for the generative model. These 3 are clear:

  1. Activity
  2. Chemical diversity / distance to known series 4 compounds
  3. SLogP

Anything else? For example:

  • Synthetic or retrosynthetic accessibility
  • Molecular weight
  • Max number of rings
  • Max/min number of heteroatoms
  • Max number of halogens
  • Number of rotatable bonds
  • Preferred substituents
  • Forbidden/undesired substituents

It will be very useful to know precisely your ideal profile of properties. Please let us know and we will try to implement it.

@mattodd
Member

mattodd commented Jun 16, 2021

Hi @miquelduranfrigola. Very interesting, and a great way to visualise the suggestions.

I'd definitely agree with you re the top three filters. For logP we tend to want to focus on compounds <3.5, very roughly.
Synthetic accessibility is a "nice to have". We can of course do this as humans, but it can take a while with so many suggestions, and as we mention in the preprint about the previous competitions, we had to ditch some interesting possibilities since they would have taken too long to do. So that's a soft "yes" as a feature.
Molecular weight would always generally be below 500, but that's just typical.
For the others, they're not so important, since we can always try to engineer out problematic motifs (like aromatic amines) if they are found to be potent.
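
The two hard property limits mentioned here (logP below roughly 3.5, molecular weight below 500) can be applied as a simple pre-filter before anyone browses the structures. The sketch below assumes the properties have already been computed for each candidate; the `Candidate` class, the candidate names and the property values are illustrative placeholders only.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str      # placeholder identifier, not a real compound
    slogp: float   # precomputed logP estimate
    mol_wt: float  # precomputed molecular weight


MAX_SLOGP = 3.5     # "compounds <3.5, very roughly"
MAX_MOL_WT = 500.0  # "generally be below 500"


def passes_property_filter(c: Candidate) -> bool:
    """Keep candidates that satisfy both the logP and molecular weight limits."""
    return c.slogp < MAX_SLOGP and c.mol_wt < MAX_MOL_WT


candidates = [
    Candidate("CAND-A", slogp=2.9, mol_wt=420.4),
    Candidate("CAND-B", slogp=4.2, mol_wt=455.0),  # too lipophilic
    Candidate("CAND-C", slogp=3.1, mol_wt=538.6),  # too heavy
]

shortlist = [c for c in candidates if passes_property_filter(c)]
print([c.name for c in shortlist])
```

Since both limits were described as rough guides, the thresholds are kept as named constants so they can be loosened easily.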

To consider the shortlist, though, it would seem that, with those constraints applied, we (me, @edwintse, other potential contributors) need to simply browse through and isolate structures that, well, take our fancy. I mean, there are a lot of possibles here.

Pinging @drc007 purely because I think you'll get a kick out of this.

@drc007

drc007 commented Jun 17, 2021

@miquelduranfrigola, This is really interesting. It would also be useful to have a measure of the "confidence" in the predicted activity model.

Can you also identify which new molecules would add the greatest amount of new information to your predicted activity model?

@edwintse
Collaborator

Hi @miquelduranfrigola, if you're able to incorporate all the filters @mattodd mentioned that'd be great. I suppose we'd be looking at getting the list down to 100 compounds, then we can quickly look through and pick a couple out to make.

Just on the tree map visualisation tool, we were just curious about how the compounds were placed throughout the tree and if there was anything particularly significant about the different red clusters? I guess it's a bit hard to pick out certain compounds within the clusters or to know which are the "best".

@miquelduranfrigola
Author

Thanks, @mattodd @drc007 and @edwintse for your comments! This will help us move forward. In the coming days, @GemmaTuron and I will give it a push.

We will try to:

  • Identify 100 candidates (following your filtering preferences).
  • Add a confidence score to the activity prediction.

In addition, I will provide a deeper explanation of the TMAP. I guess we will select the top 100 candidates based on "red regions", so hopefully this will address @edwintse's good point about how to navigate this map.

As for @drc007's suggestion to identify what molecules would add more information to future models... very interesting, didn't think of this! I don't have an immediate answer, but we will try to address the point. Perhaps, to start with, we could see what molecules would expand more efficiently the applicability domain?
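
One possible way to approach the applicability-domain idea, offered only as a sketch and not as something implemented in this project, is greedy MaxMin picking: repeatedly choose the candidate that is farthest (in Tanimoto distance) from everything already covered, starting from the training set. The fingerprints below are toy bit sets and all names are hypothetical.

```python
def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """1 - Tanimoto similarity; two empty fingerprints are treated as identical."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def maxmin_pick(candidates: dict, training: list, k: int) -> list:
    """Greedily pick up to k candidates, each one the farthest from the covered set."""
    covered = list(training)
    remaining = dict(candidates)
    picked = []
    while remaining and len(picked) < k:
        # Distance of a candidate to the covered set = distance to its nearest member.
        best = max(
            remaining,
            key=lambda name: min(
                (tanimoto_distance(remaining[name], c) for c in covered),
                default=1.0,
            ),
        )
        picked.append(best)
        covered.append(remaining.pop(best))
    return picked


training = [frozenset({1, 2, 3, 4})]
candidates = {
    "near": frozenset({1, 2, 3, 5}),
    "far": frozenset({7, 8, 9}),
    "far_twin": frozenset({7, 8, 9, 10}),
}

picked = maxmin_pick(candidates, training, k=2)
print(picked)
```

In this toy case "far_twin" is skipped in the second round because, once "far" has been picked, it no longer adds much new chemical space.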

@edwintse
Collaborator

Hi @miquelduranfrigola, we were just wondering whether it's possible to add a substructure search function to the dynamic visualisation tool? I guess we'd want to be focusing on structures that have meta or para substituents on the RHS phenyl ring as the more interesting ones to pursue.

@GemmaTuron

Hello @edwintse, I was actually just having a closer look at the most desirable substituents according to the information on the wiki and the Series 4 paper. We are trying to refine the molecule generator these days, so it would be great if you could give us some hints about the most desirable substituents, also taking into account what you have observed in terms of HLM and RLM.
As for the display, we can try using SMARTS patterns and see if we can incorporate this into the visualization tool. Will let you know if it works.

@GemmaTuron

GemmaTuron commented Jul 2, 2021

Hello @edwintse, sorry for the delay! I have updated the app visualization to provide some substructure search capabilities. As you mentioned you are interested in the RHS substituent, I have added the following select-boxes:

  • Heteroaryl: when selected, displays all RHS substituents composed of an aryl (including phenyl)
  • Phenyl: when selected, displays RHS substituents containing strictly phenyls (no heteroatoms)
  • Para/Meta/Ortho: select compounds with para-, meta- or ortho-substituents on the phenyl

Let me know if this is useful or if you were thinking of different filters.

@edwintse
Collaborator

edwintse commented Jul 2, 2021

@GemmaTuron Wow, that's amazing and super useful to narrow things down!

@edwintse
Collaborator

edwintse commented Jul 8, 2021

Hi @GemmaTuron, I've been trying to make some compounds suggested by Evariste recently (#29) and was wondering if you guys ever generated any structures containing structures similar to those in this comment with indole/benzimidazole type groups on the RHS (or even any of the other structures that they predicted)? It would be interesting to see if there was any overlap between your suggestions and those from Evariste.

@miquelduranfrigola
Author

miquelduranfrigola commented Jul 8, 2021

Hi @edwintse @mattodd @drc007

We are preparing a new batch of generated molecules. We will get back to you shortly. Good idea, Edwin, we will check overlap with molecules from Evariste. Thanks!

Meanwhile, @GemmaTuron and I have prepared a small app where you can input your molecules of interest and will get some activity predictions according to a few simple ML models. Perhaps this is useful if you have some candidates from our lists or others or want to try small modifications on those molecules. Feedback most welcome!

Many thanks!

@edwintse
Collaborator

The app is amazing! We've just had some new suggestions come through from Evariste (#29) so it's already been very useful for cross-checking between the predictions.

@GemmaTuron

Hello @mattodd @edwintse,

As mentioned earlier, with @miquelduranfrigola we have done a second round of molecule generation. A detailed description of the process can be found in this repo: https://github.com/ersilia-os/osm-series4-candidates-2.

In summary, we created a list of >400k candidate molecules that have undergone successive rounds of selection based on activity prediction, desirable physicochemical properties and synthetic accessibility scores. Finally, we have selected the best 90 compounds according to their predicted activity against P. falciparum. The molecules can also be visualized in this app.

Exploration vs exploitation

You will probably see that these candidates are considerably different from your known series 4 dataset. This is because we have worked in “exploration” mode, i.e. we explore regions of the chemical space that are distant from the existing compounds. We hope that this collection nicely complements the compounds discovered in issue #29.

Metrics

IC50Pred: the lower the better. It is probably biased towards high values, so hopefully it is a conservative estimate.
DeepActivity: the higher the better. It is a composite z-score between several deep learning scores (chemprop, grover; trained on classification and regression tasks).
Aside from these two metrics, there are a number of physchem properties (MolWt, SlogP, Number of Rings, Heavy Atom Count…) and synthetic accessibility (SA, RA and Syba) scores that can be used to refine the search. As in the previous round, we have now included columns to select molecules with RHS groups bearing para, ortho or meta substituents.
Let us know if any of these molecules look interesting!
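
As an illustration of what a composite z-score such as DeepActivity can look like, the sketch below standardizes each model's scores across molecules and then averages them per molecule. The exact combination used in the repository is not described in this thread, so the plain average, the model names and the score values are all assumptions.

```python
from statistics import mean, pstdev


def zscores(values: list) -> list:
    """Standardize a list of scores to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_zscore(scores_by_model: dict) -> list:
    """Average the per-model z-scores for each molecule (same molecule order in every list)."""
    per_model = [zscores(v) for v in scores_by_model.values()]
    return [mean(column) for column in zip(*per_model)]


# Hypothetical raw scores for three molecules from two models (higher = more active).
scores_by_model = {
    "model_a": [0.9, 0.5, 0.1],
    "model_b": [0.8, 0.6, 0.4],
}

composite = composite_zscore(scores_by_model)
print([round(z, 3) for z in composite])
```

Standardizing first keeps a model with a wide score range from dominating the composite.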

@miquelduranfrigola
Author

miquelduranfrigola commented Jul 24, 2021

> Hi @GemmaTuron, I've been trying to make some compounds suggested by Evariste recently (#29) and was wondering if you guys ever generated any structures containing structures similar to those in this comment with indole/benzimidazole type groups on the RHS (or even any of the other structures that they predicted)? It would be interesting to see if there was any overlap between your suggestions and those from Evariste.

Hi @edwintse, as you can see in the comment above by @GemmaTuron, we have done a second round of generative models. To (sort of) answer your question, here are two quick-and-dirty PCA plots (done with Morgan fingerprints) comparing:

  1. Known inactives (only left plot)
  2. Known actives
  3. Compounds in issue #29 (i.e. done in "exploitation" mode)
  4. Our 90 selected compounds (i.e. done in "exploration" mode).

[Image: two PCA plots of Morgan fingerprints; the left plot includes known inactives, the right plot does not]

As you can see, we have a couple of compounds that cluster together with Evariste's compounds.

@mattodd
Member

mattodd commented Jul 28, 2021

OK @miquelduranfrigola @GemmaTuron this is most interesting. To make sure I understand:

The "exploitation" compounds are compounds you're predicting to be active that are derived fairly directly from other actives. The "exploration" compounds are those where you're intentionally trying to stay within the clusters of actives, and away from the inactives, yet which are sampling different areas of chemical space. So, in the left hand plot above we see no red Exploration compounds in regions where there are green inactives. In the right hand plot we're seeing exploration compounds peppering the space of known actives but in a much more diverse cloud than the purple Exploitation compounds.

Is the right hand plot meant to look like a zoom in to an area of the left hand plot? I couldn't quite map the two. I'm guessing the axis units are arbitrary, or relative? I was trying to use that as a guide.

If this is all correct (?) then we're going to need to take a look at the Exploration structures more closely. That you've factored in synthetic accessibility is a major plus there.

@GemmaTuron

Hi @mattodd ,

The exploitation compounds plotted are the ones predicted by Evariste. We have used an "exploratory" generative model and, as you mention, we are trying to stay close to the actives while querying different areas of the chemical space. Your interpretation of the right and left plots is correct. This is a PCA representation, so the axis units are indeed arbitrary. The PCA was computed once with the four datasets (left) and again with three datasets (right), so the right plot is not exactly a zoom of the left one. What is interesting is that some of our compounds (red) overlap with the chemical space of the Evariste compounds (purple), a good sign that these may have strong activity. The rest of our predicted compounds (red) are interesting because they differ somewhat from known actives and have been optimized not only for activity but also for solubility, accessibility, etc.
You can explore the 90 selected compounds we have produced here, which includes several estimates of Synthetic Accessibility.

Hope this clarifies things a bit more!

@edwintse
Collaborator

@miquelduranfrigola @GemmaTuron We're a bit curious about the compounds that cluster with the Evariste ones. It seems like there's only a few red dots within the purple cluster. Were you able to give a zoom in on that region and show the exploration structures? I guess those would be the ones we'd prioritise if we were to make any.

@miquelduranfrigola
Author

miquelduranfrigola commented Jul 30, 2021

Hi @edwintse these are the two molecules that in the PCA plot cluster together with Evariste compounds:

[Image: structures of the two molecules]

A few disclaimers and thoughts:

  • We optimized for diversity and, in consequence, we did not exploit the same region of the chemical space as Evariste. We reasoned that this particular region is well covered by Evariste and, therefore, we were interested in exploring other regions instead.
  • For this reason, the hits that we identified in the Evariste region are not necessarily the best ones on our side. We would really encourage you to explore the app containing 90 candidates and see if you find interesting molecules there. Sorting by IC50Pred (the lower the better) or DeepActivity (the higher the better) would be the obvious way to go.
  • Finally, remember that a PCA plot only provides a global picture. Points that are really close in the 2D PCA plot are not necessarily very similar in the multidimensional space.
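
The last caveat can be illustrated with a toy example. This is not a PCA; simply keeping the first two coordinates stands in for keeping two principal components, and the coordinates are invented.

```python
from math import dist

# Two hypothetical molecules described by four features.
point_a = (1.0, 2.0, 0.0, 0.0)
point_b = (1.1, 2.0, 5.0, -4.0)

# Distance using all features versus only the two "plotted" ones.
full_distance = dist(point_a, point_b)
projected_distance = dist(point_a[:2], point_b[:2])

print(f"2D distance: {projected_distance:.2f}, full distance: {full_distance:.2f}")
```

The two points nearly coincide in the 2D view while being far apart overall, which is why neighbours in the plot are best confirmed with a similarity measure computed on the full fingerprints.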

I hope this helps!
Miquel

@edwintse
Collaborator

edwintse commented Feb 1, 2022

Hi @miquelduranfrigola @GemmaTuron, just checking in to see how the compound generation is going? I've finished making and purifying the compounds from Evariste (#29) and will have them tested soonish, but we were hoping to start planning starting materials from your compounds that we might need to make or purchase.

@GemmaTuron

Hi @edwintse we have started working on it, we hope by the end of next week to be able to share some news!

@GemmaTuron

Hello @edwintse and @mattodd !

We have a final list of candidates (35 molecules) plus an extended list of alternatives (1200 molecules). They all have high predicted potency, so perhaps we can now choose the ones with an easier synthetic route and other interesting characteristics, such as solubility.
In the files we provide a list of SMILES with their predicted IC50, the probability of being active at a cut-off of 1 uM, and the probability of being active at a cut-off of 2.5 uM.

All data and code is available in this repository. In short, we have:

  • Generated new molecules using the ETH ModLab approach and giving as input only highly active molecules
  • Trained two classification models for activity (with cut-offs 2.5 and 1 uM) as well as associated regression models, which have been benchmarked using the original competition for series 4, and showed excellent results.
  • Selected the molecules that were predicted active in all models from the newly synthesized set, the pre-filtered set in the last round and the final 90 candidates selected above.
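
The consensus step in the last bullet can be sketched as follows. The record layout mirrors the columns described above (predicted IC50 and two activity probabilities), but the decision thresholds, the class name and the example values are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    mol_id: str               # placeholder identifier
    prob_active_1um: float    # classifier trained with a 1 uM cut-off
    prob_active_2_5um: float  # classifier trained with a 2.5 uM cut-off
    pred_ic50_um: float       # regression estimate of IC50 in uM


PROB_THRESHOLD = 0.5     # assumed decision threshold for both classifiers
IC50_THRESHOLD_UM = 2.5  # assumed threshold for the regression model


def active_in_all_models(p: Prediction) -> bool:
    """True only if every model calls the molecule active."""
    return (
        p.prob_active_1um >= PROB_THRESHOLD
        and p.prob_active_2_5um >= PROB_THRESHOLD
        and p.pred_ic50_um <= IC50_THRESHOLD_UM
    )


predictions = [
    Prediction("MOL-1", 0.82, 0.91, 0.6),
    Prediction("MOL-2", 0.35, 0.77, 1.9),  # fails the stricter classifier
    Prediction("MOL-3", 0.66, 0.88, 0.3),
    Prediction("MOL-4", 0.71, 0.80, 4.0),  # regression disagrees
]

# Keep the consensus actives and rank them by predicted potency.
selected = sorted(
    (p for p in predictions if active_in_all_models(p)),
    key=lambda p: p.pred_ic50_um,
)
print([p.mol_id for p in selected])
```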

We provide the 35 molecules with the highest predicted activity from the list of 90 as putative candidates for synthesis, but we can also try to refine the search and enrich the list with candidates from the list of 1295 molecules that are also predicted to be highly active.

Let us know your thoughts on these molecules and if there is any extra filter you would like to add before choosing the ones to be synthesized.

@mattodd
Member

mattodd commented Feb 14, 2022

OK, great @GemmaTuron. So, @edwintse (or Gemma) can you parse into a picture so that we can see roughly what starting materials we might be looking at in the most general sense? e.g. if there's a gram needed of the core Series 4 scaffold?

@GemmaTuron

Hi @mattodd
I created a small .html file showing the molecules as well as their SMILES and predicted activity. To browse the list of the 35 selected ones, download and unzip this folder and open the .html in a browser.
Hope it helps!

@edwintse
Collaborator

edwintse commented Feb 17, 2022

Thanks for all the new compounds @GemmaTuron!
@mattodd I've drawn out all the compounds in order of predicted IC50 (left to right, top to bottom). I did a quick availability search for everything

  • the NW bits are coloured in the top half according to the legend (the costs for directly purchasable reagents vary, I'd have to check again for anything specific; synthesisable is roughly 1-2 steps)
  • the bottom left shows the different cores and the number of compounds from the list possessing each core
  • the bottom right shows the aldehydes needed to make the respective cores, along with the number of compounds possessing that NE moiety
  • the NW bits/cores not coloured might be possible but am currently unsure about how to access them or might be too complicated

Ersilia2022

Ersilia2022 chemdraw.zip

@edwintse
Collaborator

edwintse commented Nov 1, 2022

@drc007 Yes, it is racemic. Unfortunately I don't have enough of it to do any chiral HPLC testing.

@mattodd
Member

mattodd commented Nov 1, 2022

@edwintse @drc007 OK, but let's think about that. It might be a compound we should get mic clearance data on, no? i.e. could we, or should we, make some more to look at it in a little more depth (including possibly enantiomer separation)? I suspect predicted solubility is low, though.

@edwintse
Collaborator

edwintse commented Nov 1, 2022

@mattodd I can make more. The alcohol is fine. It was just the SNAr that was a bit low yielding after purification. Datawarrior gives a clogp of 2.7 for this

@drc007

drc007 commented Nov 1, 2022

@edwintse would it be worth resolving the alcohol first?

@edwintse
Collaborator

edwintse commented Nov 1, 2022

Possibly, although sometimes I don't completely purify the alcohols. I can see what I have left from when I made it.

@GemmaTuron

Hi @edwintse and all!

This is great news! Very excited about these results, thanks!
Would it be possible to have a short meeting for us to understand what would be more interesting to explore (for example, more compounds very close to this space, another space that we haven't looked into, revisit some of the predictions we made...)?

@mattodd
Member

mattodd commented Nov 1, 2022

@GemmaTuron Yes, let's. This coming Thursday pm would work at e.g. 3 UK time? Or 4pm UK time Friday? Happy to have it an open meeting so others can join/suggest if want?

@GemmaTuron

Hi @mattodd !

This week is complicated on our side, can we do NEXT Thursday (10th) at 15:00 UK time?
Of course happy to have it open.

@mattodd
Member

mattodd commented Nov 1, 2022

No good - Friday 11th at 1, 3 or 4 UK? Otherwise I fear we may have a looming Doodle Poll 🤕

@MFernflower

MFernflower commented Nov 2, 2022 via email

@GemmaTuron

> No good - Friday 11th at 1, 3 or 4 UK? Otherwise I fear we may have a looming Doodle Poll 🤕

Let's go with Friday 11th at 13h UK time! What platform do you prefer?

@GemmaTuron

Hi @mattodd !

Just confirming the meeting on Friday at 13h UK time?

@mattodd
Member

mattodd commented Nov 9, 2022

Yes @GemmaTuron thanks for the reminder, just sent invite, but please forward to others if you like - we can meet at https://ucl.zoom.us/j/4808072370 then. Talk soon!

@GemmaTuron

Hi all,

Short update on next steps:
After a few team discussions, these are the next steps we'll take towards optimising the best compound (OSM-LO-72):
We have contacted Dr Lehane and Prof. Kirk from the Australian National University, following their recently published work on a mutation in PfATP4 that confers resistance to cipargamin. PfATP4 is the suggested target of OSM Series 4, so we would like to know if we are able to bypass this resistance. Dr. Lehane has kindly offered to test the lead compound in sensitive vs resistant parasite strains.
We will generate novel candidates with small changes on the right-hand side branch, trying to preserve binding to PfATP4 (again, thanks to the structure we generated with AlphaFold a while ago and the work from Qiu et al 2022) while improving potency, solubility and, ideally, microsomal stability.
@holeung we will also look at the homology model you shared #35 , if you have any new results that you want to share regarding this that would be fantastic!

Thanks everyone, we will post updates here as soon as we can.

@GemmaTuron

And another short update as we start the work described above!
In addition to the steps described above, Prof. Ben Corry and the PhD student John Tanner (Australian National University), who performed the molecular docking in Qiu et al, 2022 have kindly offered to dock the OSM Series 4 compounds and the newly generated candidates to PfATP4, to identify if they might be sensitive to the G358S mutation as well. To this end, @edwintse it would be really helpful if you have any information on 3D conformation or protonation states of the series.
Will share results as soon as we have them!

@edwintse
Collaborator

@GemmaTuron sounds great! We do have a few crystal structure files for a handful of compounds. I'll need to find them and share with you. As for protonation states, I guess a predictive software like MarvinSketch would do, otherwise I'm not entirely sure.

@jhjensen2

@GemmaTuron if you want something a little more low tech, but high throughput try protonator

@John-D-Tanner

Hi Everyone!
I have performed molecular docking of the OSM4 compounds to the ColabFold structure of PfATP4.

Using AutoDock Vina, we docked the OSM4 compounds with known experimental IC50s and the new candidates to both the wild-type and G358S isoforms of PfATP4. The search was constrained to the region surrounding G358. Unfortunately, we found no correlation between experimental IC50 and the docking score. There are a number of reasons why this might be the case, which we will continue to look into, but for the time being please interpret the following results with caution. A number of OSM4 compounds were found to bind in proximity to the G358S locus (the box size is large enough to allow non-proximal binding). Of interest, OSM-LO-72, the new candidate with the lowest predicted IC50, bound in proximity to G358S, though no change in affinity was predicted upon mutation (again, this is very preliminary and I have little confidence in the affinity prediction).

Moving forward from here, I will binarise the IC50 values and rethink the correlation analysis as discussed with @GemmaTuron and @miquelduranfrigola. To this end, is there affinity data available for any of the series 4 compounds, rather than whole-cell IC50s? I will also compare the protein interactions of the predicted poses with those of cipargamin, in which we have more confidence.
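
A binarised analysis of this kind could look roughly like the sketch below: label each compound active or inactive at an IC50 cut-off, then ask how well the docking score separates the two groups using a rank-based ROC AUC. The 1 uM cut-off, the IC50 values and the docking scores are invented for illustration; the only convention assumed is that a more negative docking score means stronger predicted binding.

```python
def binarise(ic50_um: list, cutoff_um: float = 1.0) -> list:
    """Label a compound active (True) when its IC50 is at or below the cut-off."""
    return [value <= cutoff_um for value in ic50_um]


def roc_auc(labels: list, scores: list) -> float:
    """Probability that a random active outscores a random inactive (ties count 0.5)."""
    actives = [s for s, is_active in zip(scores, labels) if is_active]
    inactives = [s for s, is_active in zip(scores, labels) if not is_active]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


# Hypothetical data: whole-cell IC50 (uM) and docking scores (kcal/mol).
ic50_um = [0.2, 0.8, 3.0, 12.0, 25.0]
docking_kcal = [-9.1, -7.0, -8.5, -7.2, -6.0]

labels = binarise(ic50_um, cutoff_um=1.0)
# Negate the docking scores so that "higher" means "better predicted binding".
auc = roc_auc(labels, [-score for score in docking_kcal])
print(f"ROC AUC: {auc:.2f}")
```

An AUC near 0.5 would indicate that the docking score carries no information about the active/inactive split, while values approaching 1 would indicate useful ranking.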

For more detailed description of the procedure and results, please see our github repository that contains the notebook and output files.

And thank you all for the opportunity to work on this project. I'm excited to see where this goes!

@GemmaTuron

Thanks @John-D-Tanner !!

From the Ersilia side, we have been working on developing a refined generative tool combining different techniques. This is almost ready and we will apply it using the latest experimental datapoints available as starting points.
The generated candidates will be filtered according to desired chemical and ADME properties as well as docking scores if possible.

@mattodd
Member

mattodd commented Feb 27, 2023

Hi @John-D-Tanner thanks for this, and sorry for the delay in getting back to you. Too many Github alerts! Also pinging @edwintse

Interesting results, adding a little more to the mystery of how these compounds are acting. We don't have affinity data, no. To the best of our knowledge, nobody has ever made PfATP4, so it's hard to do these kinds of experiments.

Thanks for posting raw data, but I think your link above is broken. Do you have a fresh one?

I guess a key experiment would be to try OSM Series 4 compounds in the resistant cell line, right?

@John-D-Tanner

My apologies, the repository was set to private but should now be public and the link should work

@edwintse
Collaborator

edwintse commented Mar 6, 2023

The last compound from the most recent set (EGT 611-1) was tested for activity and came back as inactive. There were also 2 early compounds from Evariste (@abrennan5) that we never tested that were also included in this batch. Both are inactive as well. The positive control (369) is as expected.

Untitled Wiley-4
Chemdraw Feb 2023.zip

@GemmaTuron

Hi @edwintse !

Thanks for the latest update, and sorry about the silence. We've been working in the background on a generative package that is quick and easy to use, ChemSampler (still under development, but the basic functionality is complete).
I am using it to generate new candidates, and I have also trained new activity prediction models with the updated data (I am missing the three Evariste compounds from above, but will incorporate those today!)
Once I have the final list of modified candidates, we can filter by activity, metabolic stability and if we have more news on docking, by docking scores as well.
I'll keep everyone updated.
This is the open repo where I am working: https://github.com/ersilia-os/osm-series4-synthesis-round2 -- will also add documentation. Note that the predictive models are not yet available to everyone, but if you want me to run predictions, ping me here and I'll do so

@GemmaTuron

Hi @edwintse and @mattodd

We have done a first iteration based off the 4 compounds in the previous round with activities of < 1 uM.
We have used the following constraints:

  • Keep the triazolopyrazine core + right hand substituent (as we agreed we only wanted small modifications on the left one)
  • Select compounds with high predicted activity (at least < 2.5 uM): we have trained models using all available OSM data (including the newest results from Evariste) - I will make those models available online shortly (it's just a technical issue), meanwhile you can ask me for more predictions
  • Use ADMETLab2 to get more information on interesting properties. We have constrained the results to molecules with a predicted half-life (T1/2) score < 0.5. According to their definition of half-life (Excretion tab), the closer the score is to 1, the shorter the half-life, so in this case we are selecting for a longer half-life.

With these constraints, we end up with the following 19 molecules:
sampled_ersilia_nov22_selection

What do you think of these molecules?
You can find the full list of generated molecules with the associated predictions here and the sampled 19 molecules and associated predictions here
We would like to hear back from you - should we test any of these molecules?

Thanks!

@edwintse
Collaborator

edwintse commented Apr 3, 2023

We've shipped the following 5 compounds to Adele at ANU to have them tested in their PfATP4-resistant line. Results will be posted when received.

ANU resistance
ANU resistance chemdraw.zip

@GemmaTuron

Hi @mattodd @edwintse
I wanted to share the new API to access models for online inference in case users are interested in trying them out!
This is the one with models developed with OSM data:
https://ersilia-app-t5zpw.ondigitalocean.app/?model_id=eos7yti

But we are also uploading other related models, such as the ones developed with data contributed by MMV
https://ersilia-app-t5zpw.ondigitalocean.app/?model_id=eos4rta

We'll be announcing models through our social media links during this month of September. Let me know if this is useful or if you have any questions!

@mattodd
Member

mattodd commented Mar 14, 2024

Though this issue is getting rather long, I wanted to add the current set of compounds being evaluated in this collaboration between OSM and Ersilia. Using the latest version of the model, and the latest experimental data, we are experimentally pursuing the below structures. We're using a combination of CRO (Piramal) and in-house synthesis, and we should be done by early April, when we'll ship the compounds for eval. Very exciting!

Final Set with Annotations v3 March 14 2024

Final Set with Annotations v3 March 14 2024.zip

@qxsml @edwintse @GemmaTuron @miquelduranfrigola

@mattodd
Member

mattodd commented May 9, 2024

Cross-referencing to the results for the above structures, which are at OpenSourceMalaria/Series4#79

@GemmaTuron

Hi all,

As an update from the Piramal synthesis, we successfully obtained Targets 1, 5 and 9. For Targets 11 and 12 we have faced many challenges and we have stopped the attempts to synthesise them. We are attaching here all the routes Piramal has tried to obtain these two targets:

piramal_t11_t12.pptx
