
Have a fixed environment for paper #1126

Closed
wilsonmr opened this issue Mar 3, 2021 · 53 comments · Fixed by #1255
@wilsonmr
Contributor

wilsonmr commented Mar 3, 2021

Minimally something like this:

conda activate <environment-name>
conda env export > <environment-name>.yml

# For the other person to recreate the environment
conda env create -f <environment-name>.yml

But perhaps we also want to be able to quickly verify that the person who ran some results did indeed use the right settings. We could have an action that performs this check and outputs either a tag (some kind of version identifier for the environment) or False if the check fails. Then a report could display <nnpdf4.0>, or perhaps we could save the tag in the folder when we upload to the server, so that it applies to fits, pdfs or reports.
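For illustration, a minimal sketch of what such a verification action could do (all names here, e.g. `environment_tag` and the tag registry, are hypothetical and not part of validphys): hash a normalized `conda env export` dump and look the digest up in a table of blessed environments, returning the tag on a match and `False` otherwise.

```python
import hashlib

def environment_tag(env_yaml: str) -> str:
    """Compute a short, machine-independent tag for a conda `env export` dump.

    Lines that vary between machines (`name:`, `prefix:`) are dropped, so two
    exports of the same package set produce the same tag.
    """
    lines = []
    for line in env_yaml.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("name:", "prefix:")):
            continue
        lines.append(stripped)
    digest = hashlib.sha256("\n".join(sorted(lines)).encode()).hexdigest()
    return digest[:12]

def check_environment(env_yaml: str, known_tags: dict):
    """Return the tag name (e.g. 'nnpdf4.0') if the export matches a
    registered environment, or False if the check fails."""
    return known_tags.get(environment_tag(env_yaml), False)
```

The normalization step matters: `name:` and `prefix:` differ per machine, so they have to be stripped before hashing for the check to be reproducible.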

Edit by @Zaharid : Please see this comment for instructions on installing the environment #1126 (comment)

@wilsonmr
Contributor Author

wilsonmr commented Mar 3, 2021

cc: @scarrazza

@wilsonmr
Contributor Author

wilsonmr commented Mar 3, 2021

It's very easy to do the minimum thing and upload the yml to the wiki. I think some kind of verification of uploaded resources would also be nice.

@wilsonmr
Contributor Author

wilsonmr commented Mar 3, 2021

...Not that I don't trust that people would use the canonical environment, but I'd like to make sure that I did, and that we could look back at old* results and see what the environment was at that point (if they used a conda installation).

*future old results

@wilsonmr wilsonmr mentioned this issue Mar 4, 2021
@Zaharid
Contributor

Zaharid commented Mar 4, 2021

Right. I'd suggest waiting for 4.0 to build, seeing what we get on Linux for conda create -n nndeploy nnpdf=4 python=3.7, and mandating that fits are done with that.

@Zaharid
Contributor

Zaharid commented Mar 4, 2021

cc @enocera

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

Right, could anybody else please test minimally the attached environment on Linux (see @wilsonmr's instructions above)?

(I renamed the file to .txt so GitHub will let me upload it here.)

nn4deploy.txt
AFAICT it installs cleanly and all tests pass.

cc @enocera @scarlehoff @scarrazza @tgiani

Edit: removed old environment.

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

As soon as I get some confirmation, I plan to send an email containing some RFC 2119 lingo:

  • The fixed environment, linked in the issue and on the wiki,
    MUST be set up following the instructions in the issue in order to run 4.0 fits from now on.

  • Existing PDF grids, theories and fits that are needed for running other fits SHOULD be downloaded cleanly from the repositories.

  • Modifications to the environment MUST NOT be made. In particular no packages can be updated, no new packages can be installed and no development versions can be used.

  • All new fits for the paper MUST be launched within the environment as described.

  • All fits, existing or not, that are to be released publicly with an LHAPDF id MUST be produced within this environment. Consequently existing fits MUST be repeated with this setup.

  • Post processing tools (postfit) MUST be run within the environment. Further analysis tools MAY use updated versions of the code.

  • In the event of finding that any modification to the environment is required, a similar policy will apply to the updated version, with the implication that relevant fits will have to be repeated.

So this would be a good time to voice any comments/concerns.

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

On a related topic, I'll try to dump the environment somewhere on the server (possibly on top of a container). I'd expect that to make it reproducible for as long as easy access to x86 Linux exists, which I'd expect to be at least a century, or the end of civilization, whichever comes first.

@wilsonmr
Contributor Author

no new packages can be installed and no development versions can be used.

I had to break this rule in order to run the tests. But otherwise everything passes for me

Would be good to get some other confirmations though

short version:

$ conda env create -f nn4deploy.yml
...
(/scratch/conda_envs/nn4deploy) -bash-4.2$ conda install hypothesis pytest coverage pytest-mpl
Collecting package metadata: done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.6.14
  latest version: 4.9.2

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /scratch/conda_envs/nn4deploy

  added / updated specs:
    - coverage
    - hypothesis
    - pytest
    - pytest-mpl


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    hypothesis-6.7.0           |     pyhd3eb1b0_0         236 KB
    more-itertools-8.7.0       |     pyhd3eb1b0_0          39 KB
    pytest-6.2.2               |   py37h06a4308_2         453 KB
    ------------------------------------------------------------
                                           Total:         728 KB

The following NEW packages will be INSTALLED:

  hypothesis         pkgs/main/noarch::hypothesis-6.7.0-pyhd3eb1b0_0
  importlib_metadata pkgs/main/noarch::importlib_metadata-2.0.0-1
  iniconfig          pkgs/main/noarch::iniconfig-1.1.1-pyhd3eb1b0_0
  more-itertools     pkgs/main/noarch::more-itertools-8.7.0-pyhd3eb1b0_0
  pluggy             pkgs/main/linux-64::pluggy-0.13.1-py37_0
  py                 pkgs/main/noarch::py-1.10.0-pyhd3eb1b0_0
  pytest             pkgs/main/linux-64::pytest-6.2.2-py37h06a4308_2
  pytest-mpl         conda-forge/noarch::pytest-mpl-0.12-pyhd3deb0d_0
  sortedcontainers   pkgs/main/noarch::sortedcontainers-2.3.0-pyhd3eb1b0_0
  toml               pkgs/main/noarch::toml-0.10.1-py_0


Proceed ([y]/n)? y
...
(/scratch/conda_envs/nn4deploy) -bash-4.2$ pytest --pyargs validphys
...
-- Docs: https://docs.pytest.org/en/stable/warnings.html
================= 95 passed, 18 warnings in 207.85s (0:03:27) ==================
...
(/scratch/conda_envs/nn4deploy) -bash-4.2$ pytest --pyargs n3fit
...
============ 38 passed, 1 skipped, 2 warnings in 168.82s (0:02:48) =============

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

no new packages can be installed and no development versions can be used.

I had to break this rule in order to run the tests. But otherwise everything passes for me

Same; I was wondering if someone would notice that. But then again, that rule is really aimed at production environments.

@tgiani
Contributor

tgiani commented Mar 11, 2021

Ok, for me it failed, but I guess it's my bad... I'll do what the error message says and try again.

(nn4deploy) tommy@tommy-XPS-13-9380:~$ pytest --pyargs validphys
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.7.10, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
Matplotlib: 3.3.4
Freetype: 2.10.4
rootdir: /home/tommy
plugins: mpl-0.12, hypothesis-6.7.0
collected 95 items                                                                                                                                                                                         

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_arclengths.py ..                                                                                                          [  2%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_calcutils.py .                                                                                                            [  3%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py ..                                                                                                         [  5%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_commondataparser.py ..                                                                                                    [  7%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_covmatreg.py .....                                                                                                        [ 12%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_covmats.py ............                                                                                                   [ 25%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_cuts.py .                                                                                                                 [ 26%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_effexponents.py .                                                                                                         [ 27%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_filter_rules.py ....                                                                                                      [ 31%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_fitdata.py .                                                                                                              [ 32%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_fitveto.py ..                                                                                                             [ 34%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_loader.py .                                                                                                               [ 35%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_metaexps.py .                                                                                                             [ 36%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_plots.py ...                                                                                                              [ 40%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_postfit.py .                                                                                                              [ 41%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_pseudodata.py .....                                                                                                       [ 46%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_pyfkdata.py ....                                                                                                          [ 50%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_pythonmakereplica.py ..................                                                                                   [ 69%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_rebuilddata.py .                                                                                                          [ 70%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_regressions.py ...........                                                                                                [ 82%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py .F                                                                                                         [ 84%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_theorydbutils.py ...                                                                                                      [ 87%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_totalchi2.py ..                                                                                                           [ 89%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_utils.py .                                                                                                                [ 90%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_vplistscript.py ......                                                                                                    [ 96%]
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_weights.py ...                                                                                                            [100%]

================================================================================================= FAILURES =================================================================================================
___________________________________________________________________________________________ test_extrasum_slice ____________________________________________________________________________________________

args = ('ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv',), kwargs = {}, saved_exception = LoadFailedError("Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'")

    @functools.wraps(orig)
    def f(*args, **kwargs):
        try:
            return orig(*args, **kwargs)
        except LoadFailedError as e:
            saved_exception = e
            log.info("Could not find a resource "
                f"({resource}): {saved_exception}. "
                f"Attempting to download it.")
            try:
>               download(*args, **kwargs)

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:909: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <validphys.loader.FallbackLoader object at 0x7fdcc4428290>, filename = PosixPath('ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'), kwargs = {}, root_url = 'https://vp.nnpdf.science/'
url = 'https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'

    def download_vp_output_file(self, filename, **kwargs):
        try:
            root_url = self.nnprofile['reports_root_url']
        except KeyError as e:
            raise LoadFailedError('Key report_root_url not found in nnprofile')
        try:
            url = root_url  + filename
        except Exception as e:
            raise LoadFailedError(e) from e
        try:
            filename = pathlib.Path(filename)
    
>           download_file(url, self._vp_cache()/filename, make_parents=True)

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:872: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

url = 'https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'
stream_or_path = PosixPath('/home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'), make_parents = True

    def download_file(url, stream_or_path, make_parents=False):
        """Download a file and show a progress bar if the INFO log level is
        enabled. If ``make_parents`` is ``True`` ``stream_or_path``
        is path-like, all the parent folders will
        be created."""
        #There is a bug in CERN's
        #Apache that incorrectly sets the Content-Encodig header to gzip, even
        #though it doesn't compress two times.
        # See: http://mail-archives.apache.org/mod_mbox/httpd-dev/200207.mbox/%[email protected]%3E
        # and e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=610679#c30
        #If it looks like the url is already encoded, we do not request
        #it to be compressed
        headers = {}
        if mimetypes.guess_type(url)[1] is not None:
            headers['Accept-Encoding'] = None
    
        response = requests.get(url, stream=True, headers=headers)
    
>       response.raise_for_status()

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:566: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Response [401]>

    def raise_for_status(self):
        """Raises :class:`HTTPError`, if one occurred."""
    
        http_error_msg = ''
        if isinstance(self.reason, bytes):
            # We attempt to decode utf-8 first because some servers
            # choose to localize their reason strings. If the string
            # isn't utf-8, we fall back to iso-8859-1 for all other
            # encodings. (See PR #3538)
            try:
                reason = self.reason.decode('utf-8')
            except UnicodeDecodeError:
                reason = self.reason.decode('iso-8859-1')
        else:
            reason = self.reason
    
        if 400 <= self.status_code < 500:
            http_error_msg = u'%s Client Error: %s for url: %s' % (self.status_code, reason, self.url)
    
        elif 500 <= self.status_code < 600:
            http_error_msg = u'%s Server Error: %s for url: %s' % (self.status_code, reason, self.url)
    
        if http_error_msg:
>           raise HTTPError(http_error_msg, response=self)
E           requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/requests/models.py:943: HTTPError

The above exception was the direct cause of the following exception:

    def test_extrasum_slice():
        l = Loader()
>       f =  l.check_vp_output_file('ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv')

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py:66: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:918: in f
    raise saved_exception from e
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:902: in f
    return orig(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <validphys.loader.FallbackLoader object at 0x7fdcc4428290>, filename = 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'
extra_paths = ('.', PosixPath('/home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache'))

    def check_vp_output_file(self, filename, extra_paths=('.',)):
        """Find a file in the vp-cache folder, or (with higher priority) in
        the ``extra_paths``."""
        try:
            vpcache = self._vp_cache()
        except KeyError as e:
            log.warning("Entry validphys_cache_path expected but not found "
                     "in the nnprofile.")
        else:
            extra_paths = (*extra_paths, vpcache)
    
        finder = filefinder.FallbackFinder(extra_paths)
        try:
            path, name = finder.find(filename)
        except FileNotFoundError as e:
>           raise LoadFailedError(f"Could not find '{filename}'") from e
E           validphys.loader.LoadFailedError: Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/loader.py:520: LoadFailedError
------------------------------------------------------------------------------------------- Captured stderr call -------------------------------------------------------------------------------------------
[INFO]: Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
[INFO]: Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
[INFO]: Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
[INFO]: Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
[INFO]: Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
[INFO]: Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
[ERROR]: Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

[ERROR]: Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

[ERROR]: Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

[ERROR]: There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
[ERROR]: There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
[ERROR]: There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
-------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------
INFO     validphys.loader:loader.py:109 Creating validphys cache directory: /home/tommy/miniconda3/envs/nn4deploy/share/NNPDF/vp-cache
INFO     validphys.loader:loader.py:905 Could not find a resource (vp_output_file): Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv'. Attempting to download it.
ERROR    validphys.loader:loader.py:879 Could not access the validphys reports page because the authentification is not provided. Please, update your ~/.netrc file to contain the following:

machine vp.nnpdf.science
    login nnpdf
    password <PASSWORD>

ERROR    validphys.loader:loader.py:917 There was a problem with the connection: 401 Client Error: Unauthorized for url: https://vp.nnpdf.science/ljzWOixPQfmq5dA1-EUocg==/tables/fits_chi2_table.csv
============================================================================================= warnings summary =============================================================================================
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py:12
  /home/tommy/miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py:12: PytestCollectionWarning: cannot collect test class 'TestResult' because it has a __init__ constructor (from: miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_closuretest.py)
    class TestResult:

miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py::test_min_combination
miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py::test_min_combination
  /home/tommy/miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tableloader.py:89: FutureWarning: inplace is deprecated and will be removed in a future version.
    cols.set_levels(new_levels, inplace=True, level=0)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
========================================================================================= short test summary info ==========================================================================================
FAILED miniconda3/envs/nn4deploy/lib/python3.7/site-packages/validphys/tests/test_tableloader.py::test_extrasum_slice - validphys.loader.LoadFailedError: Could not find 'ljzWOixPQfmq5dA1-EUocg==/tables...
=========================================================================== 1 failed, 94 passed, 3 warnings in 196.76s (0:03:16) ===========================================================================
Thanks for using LHAPDF 6.3.0. Please make sure to cite the paper:
  Eur.Phys.J. C75 (2015) 3, 132  (http://arxiv.org/abs/1412.7420)

@tgiani
Contributor

tgiani commented Mar 11, 2021

Works for me. For validphys I get

=============================================================================== 95 passed, 11 warnings in 173.22s (0:02:53) ================================================================================

and for n3fit

========================================================================== 38 passed, 1 skipped, 2 warnings in 275.25s (0:04:35) ===========================================================================

@scarlehoff
Member

Works also for me.

@scarlehoff
Member

scarlehoff commented Mar 11, 2021

That's for TF 2.2.

That said, inspecting the environment I see that mkl is (once again) the default and that the version has been bumped to 2.4.1 as recently as 3 days ago. Before running any final fits let me ensure that the eigen and the mkl version are doing the same thing this time.

Edit: even if they work the same I would be more comfortable at this point with the eigen build.

@RoyStegeman
Member

RoyStegeman commented Mar 11, 2021

Yes, that's why I removed it, since the link I gave wasn't any good. But looking at the dependencies of the tf 2.4.1 PyPI package, it also shows gast==0.3.3 as a requirement.

https://github.com/tensorflow/tensorflow/blob/85c8b2a817f95a3e979ecd1ed95bff1dc1335cff/tensorflow/tools/pip_package/setup.py#L94

@scarlehoff
Member

It seems this MKL version has the famous memory growth bug, so at least that needs to be changed, sigh.

It seems there were a few packages in that situation for 2.3. It would be nice of the Conda maintainers to test the effect of their choices: AnacondaRecipes/tensorflow_recipes#27 ...

Personally I would prefer to fix the version of all packages to the ones Roy linked.

@RoyStegeman
Member

It seems this MKL version has the famous memory growth bug, so at least that needs to be changed, sigh.

Conda also provides the eigen version, so that could just be changed in the environment file.

@scarlehoff
Member

@Zaharid

Could you create the environment with:

tensorflow==2.4.1=eigen_py37h3da6045_0
gast==0.3.3
opt_einsum==3.3.0 # this one is from conda forge

Sorry I could only test it now!

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

I thought we had TF eigen?! I should have looked into it.

@scarlehoff
Member

scarlehoff commented Mar 11, 2021

No, at some point eigen was given priority, and I asked whether it would be like that also in the future and mistakenly took the lack of response as a confirmation. My bad. I guess if we don't pin it, the version is basically random.

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

Could someone make a PR adding the right constraints so it is not possible to get broken packages?

@RoyStegeman
Member

Could you create the environment with:

tensorflow==2.4.1=eigen_py37h3da6045_0
gast==0.3.3
opt_einsum==3.3.0 # this one is from conda forge

Won't conda complain if we try to create an env with both tensorflow==2.4.1 and gast==0.3.3? Until now I used the pip version of tensorflow inside the conda env, but in this case that's not an ideal solution.

@scarlehoff
Member

Don't know. Travis will tell me #1143

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

@RoyStegeman why do you need that specific version of gast? Conda doesn't seem to like it...

@scarlehoff
Member

TensorFlow asks for it, and looking through their commits it seems that using a newer version of gast does require some changes... They might not change anything for us, but if it can be pinned it would be best.

@RoyStegeman
Member

RoyStegeman commented Mar 11, 2021

Yes, exactly what Juan says: tf 2.4.1 asks for that specific version of gast. I don't know if having a different version will affect the behaviour in our case, but because I don't know I would prefer the version that tensorflow asks for.

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

@scarlehoff @RoyStegeman so just to be clear, you are claiming that the conda metadata is wrong in giving me gast 0.4.0?

@RoyStegeman
Member

RoyStegeman commented Mar 11, 2021

Yes, it's nicely listed in the issue that Juan shared: AnacondaRecipes/tensorflow_recipes#27

The pip and conda dependencies are conflicting. Of course that's about tf 2.3, but the issue still exists for tf 2.4.1.

@scarlehoff
Member

@scarlehoff @RoyStegeman so just to be clear, you are claiming that the conda metadata is wrong in giving me gast 0.4.0?

Rather, the conda maintainers have decided to remove all pinnings in the last few days, and maybe they have tested everything, but I don't trust them: AnacondaRecipes/tensorflow_recipes@08852b4

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

Well, to be fair, the pinnings of the pip version are crazy (pinning scipy down to the minor version?!). Do you have positive evidence that these are wrong?

@scarlehoff
Member

Scipy is for tests. But it is pinned only to the major version, if I understood it correctly.

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

I am looking at this thing AnacondaRecipes/tensorflow_recipes#27

@RoyStegeman
Member

RoyStegeman commented Mar 11, 2021

Oh right, no, in reality for scipy it's not == but ~=: https://github.com/tensorflow/tensorflow/blob/85c8b2a817f95a3e979ecd1ed95bff1dc1335cff/tensorflow/tools/pip_package/setup.py#L129

We use the ~= syntax to pin packages to the latest major.minor release accepting all other patches on top of that.
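As a sketch of what `~=` (the PEP 440 "compatible release" operator) means, here is a hand-rolled checker for purely numeric versions (`compatible_release` is a made-up helper for illustration; real tools should use the `packaging` library):

```python
def compatible_release(version: str, spec: str) -> bool:
    """Check whether `version` satisfies `~= spec` as defined by PEP 440:
    at least `spec`, and matching `spec` with its final component wildcarded.
    E.g. ~=1.4.1 means >=1.4.1, ==1.4.*
    Assumes purely numeric versions with at least two components in spec.
    """
    v = [int(p) for p in version.split(".")]
    s = [int(p) for p in spec.split(".")]
    # All spec components except the last must match exactly (the ==1.4.* part).
    if v[: len(s) - 1] != s[:-1]:
        return False
    # The remainder must be at least as new as the spec (the >=1.4.1 part).
    return v[: len(s)] >= s
```

So scipy ~= 1.4.1 accepts 1.4.5 but rejects both 1.5.0 and 1.4.0, matching the "latest major.minor release plus patches" behaviour described above.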

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

Right, I see. Hmm, given the alternatives, my preferred solution would be to just ignore that requirement and use the conda version of gast. Is that known to actually break something we use? Also it is curious that this one commit serge-sans-paille/gast@8d3c5de in an 80-commit project can apparently cause so much havoc. It's not clear to me what "breaking the ecosystem" means. Is that something other than some project that Google abandoned so they could go and play foosball?

@Zaharid
Contributor

Zaharid commented Mar 11, 2021

Also sorry that I didn't wait for explicit confirmation from any of you.

@RoyStegeman
Member

Well, based on Juan's PR, conda's tensorflow 2.3.0 now does accept gast==0.3.3, so I guess the issue we linked is outdated. It's probably still worth checking whether any other packages disagree between conda and pip when building the environment with the version settings of that PR, but at least the gast conflict is solved.

@Zaharid
Copy link
Contributor

Zaharid commented Mar 11, 2021

Save the environment below as env.yaml (note that conda malfunctions if the extension is different from yaml)

name: nn4deploy
channels:
  - https://packages.nnpdf.science/private
  - https://packages.nnpdf.science/public
  - defaults
  - conda-forge
dependencies:
  - _libgcc_mutex=0.1=main
  - _tflow_select=2.3.0=eigen
  - absl-py=0.12.0=py37h06a4308_0
  - aiohttp=3.7.4=py37h27cfd23_1
  - alabaster=0.7.12=py37_0
  - apfel=3.0.5.0=py37hbdda60e_8
  - astunparse=1.6.3=py_0
  - async-timeout=3.0.1=py37h06a4308_0
  - attrs=20.3.0=pyhd3eb1b0_0
  - babel=2.9.0=pyhd3eb1b0_0
  - blas=1.0=mkl
  - blessings=1.7=py37h06a4308_1002
  - blinker=1.4=py37h06a4308_0
  - brotlipy=0.7.0=py37h27cfd23_1003
  - bzip2=1.0.8=h7b6447c_0
  - c-ares=1.17.1=h27cfd23_0
  - ca-certificates=2021.4.13=h06a4308_1
  - cachetools=4.2.1=pyhd3eb1b0_0
  - certifi=2020.12.5=py37h06a4308_0
  - cffi=1.14.5=py37h261ae71_0
  - chardet=3.0.4=py37h06a4308_1003
  - click=7.1.2=pyhd3eb1b0_0
  - cloudpickle=1.6.0=py_0
  - colorama=0.4.4=pyhd3eb1b0_0
  - commonmark=0.9.1=py_0
  - coverage=5.5=py37h27cfd23_2
  - cryptography=3.3.1=py37h3c74f83_1
  - curio=0.9+git.49=py37_0
  - cycler=0.10.0=py37_0
  - cython=0.29.22=py37h2531618_0
  - dbus=1.13.18=hb2f20db_0
  - decorator=4.4.2=pyhd3eb1b0_0
  - docutils=0.16=py37_1
  - expat=2.2.10=he6710b0_2
  - fontconfig=2.13.1=h6c09931_0
  - freetype=2.10.4=h5ab3b9f_0
  - future=0.18.2=py37_1
  - gast=0.3.3=py_0
  - glib=2.67.4=h36276a3_1
  - google-auth=1.27.1=pyhd3eb1b0_0
  - google-auth-oauthlib=0.4.3=pyhd3eb1b0_0
  - google-pasta=0.2.0=py_0
  - grpcio=1.36.1=py37h2157cd5_1
  - gsl=2.4=h14c3975_4
  - gst-plugins-base=1.14.0=h8213a91_2
  - gstreamer=1.14.0=h28cd5cc_2
  - h5py=2.10.0=py37hd6299e0_1
  - hdf5=1.10.6=hb1b8bf9_0
  - hyperopt=0.2.5=pyh9f0ad1d_0
  - icu=58.2=he6710b0_3
  - idna=2.10=pyhd3eb1b0_0
  - imagesize=1.2.0=pyhd3eb1b0_0
  - importlib-metadata=2.0.0=py_1
  - intel-openmp=2020.2=254
  - jinja2=2.11.3=pyhd3eb1b0_0
  - jpeg=9b=h024ee3a_2
  - keras-preprocessing=1.1.2=pyhd3eb1b0_0
  - kiwisolver=1.3.1=py37h2531618_0
  - lcms2=2.11=h396b838_0
  - ld_impl_linux-64=2.33.1=h53a641e_7
  - lhapdf=6.3.0=py37h6bb024c_1
  - libarchive=3.4.2=h62408e4_0
  - libedit=3.1.20191231=h14c3975_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libpng=1.6.37=hbc83047_0
  - libprotobuf=3.14.0=h8c45485_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - libtiff=4.1.0=h2733197_1
  - libuuid=1.0.3=h1bed415_2
  - libxcb=1.14=h7b6447c_0
  - libxml2=2.9.10=hb55368b_3
  - lz4-c=1.9.3=h2531618_0
  - markdown=3.3.4=py37h06a4308_0
  - markupsafe=1.1.1=py37h14c3975_1
  - matplotlib=3.3.4=py37h06a4308_0
  - matplotlib-base=3.3.4=py37h62a2d02_0
  - mkl=2020.2=256
  - mkl-service=2.3.0=py37he8ac12f_0
  - mkl_fft=1.3.0=py37h54f3939_0
  - mkl_random=1.1.1=py37h0573a6f_0
  - multidict=5.1.0=py37h27cfd23_2
  - ncurses=6.2=he6710b0_1
  - networkx=2.5=py_0
  - nnpdf=4.0.2.0+g2dac5d4=py37h9a426be_0
  - numpy=1.18.5=py37ha1c710e_0
  - numpy-base=1.18.5=py37hde5b4d6_0
  - oauthlib=3.1.0=py_0
  - olefile=0.46=py37_0
  - openssl=1.1.1k=h27cfd23_0
  - opt_einsum=3.1.0=py_0
  - packaging=20.9=pyhd3eb1b0_0
  - pandas=1.2.3=py37ha9443f7_0
  - pandoc=2.11=hb0f4dca_0
  - pcre=8.44=he6710b0_0
  - pillow=8.1.2=py37he98fc37_0
  - pip=21.0.1=py37h06a4308_0
  - pkg-config=0.29.2=h1bed415_8
  - prompt-toolkit=3.0.8=py_0
  - prompt_toolkit=3.0.8=0
  - protobuf=3.14.0=py37h2531618_1
  - psutil=5.8.0=py37h27cfd23_1
  - pyasn1=0.4.8=py_0
  - pyasn1-modules=0.2.8=py_0
  - pycparser=2.20=py_2
  - pygments=2.8.1=pyhd3eb1b0_0
  - pyjwt=1.7.1=py37_0
  - pymongo=3.11.3=py37h2531618_0
  - pyopenssl=20.0.1=pyhd3eb1b0_1
  - pyparsing=2.4.7=pyhd3eb1b0_0
  - pyqt=5.9.2=py37h05f1152_2
  - pysocks=1.7.1=py37_1
  - python=3.7.10=hdb3f193_0
  - python-dateutil=2.8.1=pyhd3eb1b0_0
  - pytz=2021.1=pyhd3eb1b0_0
  - qt=5.9.7=h5867ecd_1
  - readline=8.1=h27cfd23_0
  - recommonmark=0.6.0=py_0
  - reportengine=0.30.1=py_0
  - requests=2.25.1=pyhd3eb1b0_0
  - requests-oauthlib=1.3.0=py_0
  - rsa=4.7.2=pyhd3eb1b0_1
  - ruamel_yaml=0.15.87=py37h7b6447c_1
  - scipy=1.4.1=py37h0b6359f_0
  - seaborn=0.11.1=pyhd3eb1b0_0
  - setuptools=52.0.0=py37h06a4308_0
  - sip=4.19.8=py37hf484d3e_0
  - six=1.15.0=py37h06a4308_0
  - snowballstemmer=2.1.0=pyhd3eb1b0_0
  - sphinx=3.5.2=pyhd3eb1b0_0
  - sphinx_rtd_theme=0.5.1=pyhd3deb0d_0
  - sphinxcontrib-applehelp=1.0.2=pyhd3eb1b0_0
  - sphinxcontrib-devhelp=1.0.2=pyhd3eb1b0_0
  - sphinxcontrib-htmlhelp=1.0.3=pyhd3eb1b0_0
  - sphinxcontrib-jsmath=1.0.1=pyhd3eb1b0_0
  - sphinxcontrib-qthelp=1.0.3=pyhd3eb1b0_0
  - sphinxcontrib-serializinghtml=1.1.4=pyhd3eb1b0_0
  - sqlite=3.35.4=hdfb4753_0
  - tensorboard=2.4.0=pyhc547734_0
  - tensorboard-plugin-wit=1.6.0=py_0
  - tensorflow=2.3.0=eigen_py37h189e6a2_0
  - tensorflow-base=2.3.0=eigen_py37h3b305d7_0
  - tensorflow-estimator=2.3.0=pyheb71bc4_0
  - termcolor=1.1.0=py37h06a4308_1
  - tk=8.6.10=hbc83047_0
  - tornado=6.1=py37h27cfd23_0
  - tqdm=4.56.0=pyhd3eb1b0_0
  - typing-extensions=3.7.4.3=hd3eb1b0_0
  - typing_extensions=3.7.4.3=pyh06a4308_0
  - urllib3=1.26.3=pyhd3eb1b0_0
  - wcwidth=0.2.5=py_0
  - werkzeug=1.0.1=pyhd3eb1b0_0
  - wheel=0.36.2=pyhd3eb1b0_0
  - wrapt=1.12.1=py37h7b6447c_1
  - xz=5.2.5=h7b6447c_0
  - yaml=0.2.5=h7b6447c_0
  - yaml-cpp=0.6.0=h6bb024c_4
  - yarl=1.6.3=py37h27cfd23_0
  - zipp=3.4.0=pyhd3eb1b0_0
  - zlib=1.2.11=h7b6447c_3
  - zstd=1.4.5=h9ceee32_0
  - pip:
    - validphys==4.0
prefix: /home/zah/anaconda3/envs/nn4deploy

and install with

conda env create --force --file env.yaml

Run fits with the environment activated (i.e. conda activate nn4deploy).
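On the verification idea from the top of this issue: a lightweight way to check that a fit was run with the pinned environment (a hypothetical helper, not something the repo provides) is to diff the `dependencies:` section of the user's `conda env export` output against the reference env.yaml. The pinned entries all have the flat form `  - name=version=build`, so a stdlib-only parse is enough:

```python
# Hypothetical verification sketch (not part of the nnpdf repo): compare
# the pinned "dependencies:" entries of two conda env-export files.
# Plain string handling suffices, so PyYAML is not needed.
def pinned_deps(yaml_text):
    deps, in_deps = set(), False
    for line in yaml_text.splitlines():
        if line.startswith("dependencies:"):
            in_deps = True
        elif in_deps and line.startswith("  - "):
            deps.add(line[4:].strip())
        elif in_deps and line and not line.startswith(" "):
            in_deps = False  # left the dependencies block (e.g. "prefix:")
    return deps

# Toy inputs standing in for the reference env.yaml and a user's export:
reference = "dependencies:\n  - gast=0.3.3=py_0\n  - scipy=1.4.1=py37h0b6359f_0\nprefix: /x"
current = "dependencies:\n  - gast=0.3.3=py_0\n  - scipy=1.5.0=py37_0\nprefix: /x"

print(pinned_deps(reference) - pinned_deps(current))
# → {'scipy=1.4.1=py37h0b6359f_0'}
```

In practice `current` would be the output of `conda env export` run inside the activated environment; an empty difference in both directions would earn the "right environment" tag, anything else would flag the result.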

@RoyStegeman
Copy link
Member

Actually, I think I would go with this: environment.txt

@Zaharid
Copy link
Contributor

Zaharid commented Mar 12, 2021 via email

@Zaharid
Copy link
Contributor

Zaharid commented Mar 12, 2021

I understand that we have mostly tested TF 2.3 so I'd rather go with that. @scarlehoff ?

@RoyStegeman
Copy link
Member

By the way, what do we do about the names of the fits? Some of the NNPDF40_[...] names already exist on the server: do we overwrite those? Or should we for now keep a naming convention with the date in the name to make sure the names are unique, and rename them only shortly before they are sent to lhapdf?

Also, for LO and NLO fits there are no runcards yet; for some reason I thought this was because new theories had to be generated for them, but perhaps I misunderstood? I'm asking because @Zaharid mentioned during the meeting that they had not been run because we did not have an agreed-upon environment yet.

@wilsonmr
Copy link
Contributor Author

By the way, what do we do about the names of the fits? For some of the NNPDF40_[...] names already exist on the server

The existing fits should be renamed, imo, rather than fully overwritten, although I'm not sure how many people want to use these fits. For sure the bugged fits shouldn't keep those names.

@Zaharid
Copy link
Contributor

Zaharid commented Mar 12, 2021

By the way, what do we do about the names of the fits? Some of the NNPDF40_[...] names already exist on the server: do we overwrite those? Or should we for now keep a naming convention with the date in the name to make sure the names are unique, and rename them only shortly before they are sent to lhapdf?

This but unironically

#1070 (comment)

We should delete any grids with 4.0 and wait until the very final moment to rename the grids.

Also, for LO and NLO fits there are no runcards yet; for some reason I thought this was because new theories had to be generated for them, but perhaps I misunderstood? I'm asking because @Zaharid mentioned during the meeting that they had not been run because we did not have an agreed-upon environment yet.

I said that the fits should not be run right now but that was not specifically tied to NLO.

@enocera
Copy link
Contributor

enocera commented Mar 12, 2021

@RoyStegeman LO and NLO theories have already been generated, they should be correctly listed in theory.db, see IDs 208-210 and 212-214.

@RoyStegeman
Copy link
Member

@enocera thanks for pointing that out, I didn't realise they had already been generated.

@Zaharid
Copy link
Contributor

Zaharid commented Mar 13, 2021

Ok, I have updated my comment #1126 (comment) with Roy's environment. Note the env file has to be saved with yaml extension for it to work properly. If you have old "nn4deploy" environments, delete them first with conda env remove -n nn4deploy. Remember to actually activate the environment for running fits.

@scarlehoff
Copy link
Member

Should we update the bootstrap repository, adding a production script that installs this environment directly? (It should also be linked in the readme here.)

@Zaharid
Copy link
Contributor

Zaharid commented Mar 18, 2021

I have updated the environment here:
#1126 (comment)
and in the wiki. The only difference should be the versions of the nnpdf package (4.0.1 now) and of sqlite.

@Zaharid
Copy link
Contributor

Zaharid commented Mar 18, 2021

The NLO fits should now work using this environment and the various runcards in #675 and the wiki.

All LHAPDF fits need to be redone with the latest environment.

@Zaharid
Copy link
Contributor

Zaharid commented May 21, 2021

@RoyStegeman @scarlehoff Could you confirm that the tensorflow related versions in the existing environments are still good? Is there anything that must be bumped?

@RoyStegeman
Copy link
Member

No, I think we should stick with this, in part because this setup has been used to run quite a few fits by now.

Unless there is a reason to move to tf2.5 (#1231 @scarlehoff )?

@scarlehoff
Copy link
Member

Yup, 2.4 is the right one (eigen!); the fix for 2.5 is "useless" for the tag (and conda won't have 2.5 for a few weeks anyway).

@RoyStegeman
Copy link
Member

Okay good. We were actually still using 2.3 instead of 2.4 for the same reason: most fits up till that point had been run/tested using 2.3.

By now we have also run quite a few fits with 2.4, so in that sense I would feel comfortable with moving to 2.4, but there's just not really any reason to.
