Identify language-specific packages (name/version) #359

Open
remram44 opened this issue Jul 5, 2019 · 4 comments
Labels
C-tracer (Python): Component: The Python part of the tracer codebase
T-enhancement: Type: An enhancement to existing code, or a new feature

Comments

@remram44
Member

remram44 commented Jul 5, 2019

When a runtime with a package manager is used, we should try to identify which packages are being used. For example, for Python, record the package name and version for site packages.

This could be done for:

  • Python site packages
  • Ruby gems
  • R (maybe)
  • Java (not interpreted, no run-time package manager, but JARs include version information)
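
For the Java case, a rough sketch of reading a version out of a JAR without running a JVM. It assumes the JAR carries an `Implementation-Version` (or, for OSGi bundles, `Bundle-Version`) attribute in the main section of `META-INF/MANIFEST.MF`; many JARs do not set either, in which case this returns `None`. The function name `jar_version` is hypothetical, not an existing ReproZip API.

```python
import zipfile


def jar_version(jar_path):
    """Return the version declared in a JAR's manifest, or None if absent."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF")
        except KeyError:
            # No manifest in this archive
            return None

    attrs = {}
    key = None
    for line in raw.decode("utf-8", errors="replace").splitlines():
        if not line.strip():
            # A blank line ends the main section; per-entry sections follow
            break
        if line.startswith(" ") and key is not None:
            # Continuation of the previous attribute's value
            attrs[key] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            attrs[key] = value.strip()

    # Assumption: these two attributes are the most likely to hold a version
    return attrs.get("Implementation-Version") or attrs.get("Bundle-Version")
```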
remram44 added the C-tracer (Python) and T-enhancement labels on Jul 5, 2019
@remram44
Member Author

remram44 commented Jul 5, 2019

For Python: pip freeze can identify installed packages. However, it runs in the "target" interpreter, not ReproZip's. We should probably read from the filesystem instead.
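
A minimal sketch of the filesystem approach, run from any interpreter against a given site-packages directory. It only looks at the names of `.dist-info` and `.egg-info` entries and assumes they follow the usual `name-version[-extra]` pattern with hyphens in the project name already replaced by underscores. `list_distributions` is a hypothetical helper, not something ReproZip or pip provides.

```python
from pathlib import Path


def list_distributions(site_packages):
    """List (name, version) pairs from metadata entry names in a directory.

    The target interpreter is never started; only directory entries are read.
    """
    found = []
    for entry in sorted(Path(site_packages).iterdir()):
        if entry.suffix not in (".dist-info", ".egg-info"):
            continue
        # e.g. "urllib3-1.26.4" or "foo-1.0-py3.8" (egg-info adds a pyX.Y tag)
        parts = entry.stem.split("-")
        name = parts[0]
        # Some egg-info entries carry no version in their name
        version = parts[1] if len(parts) > 1 else None
        found.append((name, version))
    return found
```

Calling `list_distributions("/home/vagrant/venv/lib/python3.8/site-packages")` would then give the installed distributions of that environment, whichever Python ReproZip itself runs under.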

pip uses distlib to do this, which cites a variety of PEPs:

  • PEP 241: replaced by PEP 314. Metadata 1.0 format for PKG-INFO file in sdists, and .dist-info/METADATA files
  • PEP 314: Metadata 1.1 format
  • PEP 345: Metadata 1.2 format
  • PEP 566: Metadata 2.1 format
  • PEP 376: On-disk layout of packages and metadata (.dist-info, *.egg-info)
  • PEP 386: replaced by PEP 440. Version number and version requirements format, irrelevant
  • PEP 426: meant to replace PEP 345, but withdrawn in favor of PEP 566
  • PEP 440: Version number and version requirements format, irrelevant

So there don't really seem to be competing standards or formats. Reading .dist-info/METADATA or .egg-info/PKG-INFO (PEP 376) should give all the information we want (in PEP 566 format), though really the version number is in the folder name already.
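
As a sketch of what reading those files could look like: the metadata files use RFC 822-style headers, so the standard library's `email.parser` can read the `Name` and `Version` fields. The helper name `read_name_version` is made up for illustration, and the handling of `.egg-info` entries that are plain files rather than directories is an assumption about what we would meet on disk.

```python
from email.parser import HeaderParser
from pathlib import Path


def read_name_version(meta_path):
    """Return (name, version) from a .dist-info or .egg-info entry, or None."""
    meta_path = Path(meta_path)
    if meta_path.is_file():
        # Assumption: a bare .egg-info file is itself in PKG-INFO format
        candidates = [meta_path]
    else:
        candidates = [meta_path / "METADATA", meta_path / "PKG-INFO"]

    for candidate in candidates:
        if candidate.is_file():
            with open(candidate, encoding="utf-8", errors="replace") as fp:
                # Only the header block is needed, not the long description
                headers = HeaderParser().parse(fp, headersonly=True)
            return headers["Name"], headers["Version"]
    return None
```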

@appukuttan-shailesh

I want to highlight a possible pitfall in this task. Some packages have a single release for both Py2 and Py3, where certain features are unavailable to Py2 users. Depending on how you plan to implement it, the dependency tracker might run into an issue with this (such as with Sumatra). More here.

Looking forward to seeing this functionality implemented within ReproZip.

@remram44
Member Author

We should make sure to record the Python version as well then, thanks.
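
One possible way to get that without running the target interpreter, sketched under the assumption that packages live under a `lib/pythonX.Y/site-packages` (or `dist-packages`) directory. The function name is hypothetical, and layouts without a minor version in the path (such as `/usr/lib/python3/dist-packages`) simply return `None` here.

```python
import re

# Matches e.g. ".../lib/python3.8/site-packages/..." and captures "3.8"
_PY_DIR = re.compile(r"/lib(?:64)?/python(\d+\.\d+)/(?:site|dist)-packages(?:/|$)")


def python_version_from_path(path):
    """Guess the Python major.minor version from an installed file's path."""
    match = _PY_DIR.search(path)
    return match.group(1) if match else None
```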

@remram44
Member Author

remram44 commented Jul 8, 2021

This needs a change in the config file format.

Currently it's a flat list of packages, implicitly meant for whatever that distribution's default package manager is:

packages:
  - name: "libc6"
    version: "2.31-0ubuntu9.3"
    size: 13563904
    packfiles: true
    meta: {"section": "libs"}
    files:
      # Total files used: 3.80 MB
      # Installed package size: 12.94 MB
      - "/lib/i386-linux-gnu/ld-2.31.so" # 176.40 KB
      - "/lib/ld-linux.so.2" # Link to /lib/i386-linux-gnu/ld-2.31.so
      - "/lib/x86_64-linux-gnu/ld-2.31.so" # 186.99 KB
  - name: "libexpat1"
    version: "2.2.9-1build1"
    size: 410624
    packfiles: true
    meta: {"section": "libs"}
    files:
      # Total files used: 178.28 KB
      # Installed package size: 401.00 KB
      - "/lib/x86_64-linux-gnu/libexpat.so.1" # Link to /lib/x86_64-linux-gnu/libexpat.so.1.6.11
      - "/lib/x86_64-linux-gnu/libexpat.so.1.6.11" # 178.28 KB

We can either add fields to each package stating which package manager & environment it's for, or make it a nested list environment->package:

packages:
  - package_manager: dpkg
    environment: /
    packages:
      - name: "libc6"
        version: "2.31-0ubuntu9.3"
        size: 13563904
        packfiles: true
        meta: {"section": "libs"}
        files:
          # Total files used: 3.80 MB
          # Installed package size: 12.94 MB
          - "/lib/i386-linux-gnu/ld-2.31.so" # 176.40 KB
          - "/lib/ld-linux.so.2" # Link to /lib/i386-linux-gnu/ld-2.31.so
          - "/lib/x86_64-linux-gnu/ld-2.31.so" # 186.99 KB
  - package_manager: python
    environment: /home/vagrant/venv
    python: "3.8"
    packages:
      - name: "urllib3"
        version: "1.26.4"
        size: 12345
        packfiles: true
        files:
          # Total files used: 678 KB
          # Installed package size: 1.5 MB
          - "/home/vagrant/venv/lib/python3.8/site-packages/urllib3/response.py" # 28 KB
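
The two options carry the same information, so one can be derived from the other. A small sketch, assuming the first option stores `package_manager` and `environment` keys on each package (those key names mirror the nested example; the function itself is hypothetical). Environment-level extras such as the `python` field are not handled here.

```python
def nest_packages(flat_packages):
    """Group per-package records into an environment -> packages list."""
    groups = {}
    for pkg in flat_packages:
        pkg = dict(pkg)  # copy so the caller's records are left untouched
        key = (pkg.pop("package_manager"), pkg.pop("environment"))
        groups.setdefault(key, []).append(pkg)

    # dicts keep insertion order, so environments appear in first-seen order
    return [
        {"package_manager": manager, "environment": env, "packages": pkgs}
        for (manager, env), pkgs in groups.items()
    ]
```

The nested form avoids repeating the same two fields on every package, at the cost of one more level of indentation in the config file.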

remram44 self-assigned this on Sep 30, 2021