-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Have a fixed environment for paper #1126
Comments
cc: @scarrazza |
It's very easy to do the minimum thing and upload the yml to the wiki. I think some kind of verification on uploaded resources would also be nice. |
...Not that I don't trust that people would use the canonical environment but I'd like to make sure that I did and we could look back at old* results and see what the environment was at that point (if they used a conda installation) *future old results |
Right. I'd suggest waiting for 4.0 to build and see what we get on linux for |
cc @enocera |
Right, could anybody else please test minimally the attached environment on linux (see @wilsonmr 's instructions above). (renamed the thing as txt so it will let me upload it here)
cc @enocera @scarlehoff @scarrazza @tgiani Edit: removed old environment. |
As soon as I get some confirmation, I plan to send an email containing some RFC 2119 lingo:
So this would be a good time to voice any comments/concerns. |
On a related topic I'll try to dump the environment somewhere on the server (possibly on top of a container). I'd expect that to make it reproducible for a long as easy access to x86 linux exists, which I'd expect to be at keast century, or the end of civilization, whatever comes first. |
I had to break this rule in order to run the tests. But otherwise everything passes for me Would be good to get some other confirmations though short version:
|
Same, was wondering if someone would notice that. But then again this is applicable to production environments. |
ok for me it failed but I guess it s my bad.. I ll do what the error message says and I ll try again
|
works for me. For validphys I get
and for n3fit
|
Works also for me. |
That said, inspecting the environment I see that mkl is (once again) the default and that the version has been bumped to 2.4.1 as recently as 3 days ago. Before running any final fits let me ensure that the eigen and the mkl version are doing the same thing this time. Edit: even if they work the same I would be more comfortable at this point with the eigen build. |
Yes that's why I removed it, since that link I gave wasn't any good. But looking at the dependencies of the tf 2.4.1 pypi package, it also shows gast==0.3.3 as a requirement. |
It seems this MKL version has the famous memory growth bug, so at least that needs to be changed, sigh. It seems there was a few packages in that situation for 2.3. It would be nice of the Conda maintainers to test the effect of their choices: AnacondaRecipes/tensorflow_recipes#27 ... Personally I would prefer to fix the version of all packages to the ones Roy linked. |
Conda also provides the eigen version, so that could just be changed in the environment file. |
Could you create the environment with: tensorflow==2.4.1=eigen_py37h3da6045_0 Sorry I could only test it now! |
I thought we had TF eingen?! Should have looked into it. |
No, at some point eigen was given priority and I asked whether it would be like that also in the future and mistakenly took the lack of response as a confirmation. My bad. I guess if we don't pin it the version is basically random. |
Could someone make a PR adding the the right constraints so it is not possible to get broken packages. |
Won't conda complain if we try to create an env with both tensorflow==2.4.1 and gast==0.3.3? Until now I used the pip version of tensorflow inside the conda env, but in this case that's not an ideal solution. |
Don't know. Travis will tell me #1143 |
@RoyStegeman why do you need that specific version of gast. Conda doesn't seem to like it... |
TensorFlow asks for it and looking through their commits it seems that using a newer version of gast does require some changes... they might not change anything for us but if it can be pinned it would be best. |
Yes exactly what Juan says, tf 2.4.1 asks for that specific version of gast. I don't know if having a different version will affect the behaviour in our case, but because I don't know I would prefer the version that tensorflow asks. |
@scarlehoff @RoyStegeman so just to be clear, you are claiming that the conda metadata is wring in giving me gast 0.4.0? |
Yes, It's nicely listed in the issue that Juan shared: AnacondaRecipes/tensorflow_recipes#27 The pip and conda dependencies are conflicting. Of course that's about tf2.3 but the issue still exists for tf2.4.1. |
Rather that the conda maintainers have decided to remove all pinnings in the last few days and maybe they have tested everything but I don't trust them AnacondaRecipes/tensorflow_recipes@08852b4 |
Well to be fair the pinnings of the pip version are crazy (pinning scipy down the the minor version?!). Do you have positive evidence that these are wrong? |
Scipy is for tests. But it is pinned only to the major version if I understood it correctly. |
I am looking at this thing AnacondaRecipes/tensorflow_recipes#27 |
Oh right, no in reality for scipy it'sn not
|
Right I see. Hmm, given the alternatives, my preferred solution would be to just ignore that requirement and use the conda version of gast. Is that known to actually break something we use? Also it is curious that this one commit serge-sans-paille/gast@8d3c5de in an 80 commit project can cause so much havoc apparently. Not clear to me what "breaking the ecosytem means". Is that something other than some project that google abandoned so they could go and play foosball? |
Also sorry that I didn't wait for explicit confirmation from any of you. |
Well, based on Juans PR conda's tensorflow 2.3.0 now does accept gast==0.3.3, so I guess that issue we linked is outdated. It's probably still worth it to check if any other packages disagree between conda and pip when building the environment with the version settings of that PR, but at least the gast conflict is solved. |
Save the environment below, as
and install with
Run fits with the environment activated (i.e. |
Actually, I think I would go with this: environment.txt |
Ok, that makes more sense.
… On 12 Mar 2021, at 01:15, Roy Stegeman ***@***.***> wrote:
Actually, I think I would go with this: environment.txt
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
I understand that we have mostly tested TF 2.3 so I'd rather go with that. @scarlehoff ? |
By the way, what do we do about the names of the fits? For some of the NNPDF40_[...] names already exist on the server, do we overwrite those? Or should we for now just keep a naming convention with the date in the name to make sure the names Also, for LO and NLO fits there are no runcards yet, for some reason I though this was because new theories had to be generated for this but perhaps I misunderstood that? I'm asking because @Zaharid mentioned during the meeting that they had not been run because we did not have an agreed upon environment yet. |
The existing fits should be renamed imo rather than fully overwritten, although I'm not sure how many people want to use these fits. For sure the bugged fits shouldn't keep those names |
This but unironically We should delete any grids with 4.0 and wait until the very final moment to rename the grids.
I said that the fits should not be run right now but that was not specifically tied to NLO. |
@RoyStegeman LO and NLO theories have already been generated, they should be correctly listed in |
@enocera thanks for pointing that out, I didn't realise they had already been generated. |
Ok, I have updated my comment #1126 (comment) with Roy's environment. Note the env file has to be saved with yaml extension for it to work properly. If you have old "nn4deploy" environments, delete them first with |
Should we update the bootstrap repository adding a production script that installs directly this environment? (and also, it should be linked in the readme here) |
I have updated the environment here: |
The NLO fits should now work using this environment and the various runcards in #675 and the wiki. All LHAPDF fits need to be redone with the latest environment. |
@RoyStegeman @scarlehoff Could you confirm that the tensorflow related versions in the existing environments are still good? Is there anything that must be bumped? |
No I think we should stick with this, in part because this setup has been used to run quite a few fits by now. Unless there is a reason to move to tf2.5 (#1231 @scarlehoff )? |
Yup, 2.4 is the right one (eigen!) the fix for 2.5 is "useless" for the tag (and conda won't have 2.5 for a few weeks anyway). |
Okay good. We were actually still using 2.3 instead of 2.4 for the same reason: most fits up till that point had been run/tested using 2.3. By now we have also run quite a few fits with 2.4, so in that sense I would feel comfortable with moving to 2.4, but there's just not really any reason to. |
Minimally something like this:
But perhaps we also want to be able to quickly verify that the person that ran some results did indeed use the right settings. If we had an action that could do this and output a tag or
False
where the tag is some kind of version of the environment and False would be if the check fails. Then you could see on a report it would say <nnpdf4.0> or perhaps we could save the tag in the folder when we upload to server so it applies to fits, pdfs or reports.Edit by @Zaharid : Please see this comment for instructions on installing the environment #1126 (comment)
The text was updated successfully, but these errors were encountered: