Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some pages show a table of packages for an expired flavor/flavour #449

Open
grahamperrin opened this issue May 3, 2023 · 82 comments
Open
Milestone

Comments

@grahamperrin
Copy link
Contributor

grahamperrin commented May 3, 2023

devel/git

Three flavours listed, five tables of packages, two of the tables appear to be for slaves that are not flavours.

https://www.freshports.org/devel/git/#flavors three flavours (git, git-lite, git-tiny).

https://www.freshports.org/devel/git/#packages five tables, I'm not sure whether these two (of four) slave ports belong:

  • git-gui
  • git-svn.

Slave ports that are not on the same page:

  • devel/git-cvs
  • devel/git-p4.

devel/git-gui

https://www.freshports.org/devel/git-gui/ version 2.38.1_4

https://www.freshports.org/devel/git-gui/#packages 2.40.1, in some cells of the table, exceeds 2.38.1_4 …

devel/git-svn

https://www.freshports.org/devel/git-svn/ 2.38.1_4

https://www.freshports.org/devel/git-svn/#packages 2.40.1, in some cells of the table, exceeds 2.38.1_4 …

https://www.freshports.org/devel/git-svn/#history 2.32.0_1 is inferior to 2.38.1_4.

Whilst drafting this, I remembered:

Is this report entirely a duplicate of 435?

@dlangille
Copy link
Contributor

I need a summary aimed at management please.

@dlangille
Copy link
Contributor

dlangille commented May 3, 2023

Please, I really need you to tell me what you think the problem is. Please put that as the first sentence. It really helps me.

In this case I think it is:

This port has 3 flavors listed. The list of packges includes flavors not listed.

Current status: stale data being kept for repos not recently updated.

@grahamperrin
Copy link
Contributor Author

I'm not sure of problem, because I have not paid much attention, in the past, to flavours and their representations.

Above, I added a line:

Three flavours listed, five tables of packages, two of the tables appear to be for slaves that are not flavours.

HTH

@dlangille
Copy link
Contributor

re: https://github.com/FreshPorts/freshports/blob/main/classes/package_flavors.php#L35

freshports.devgit=# select * from package_flavors where port_id = 16984;
   id   | port_id | flavor_id |   name   | flavor_number 
--------+---------+-----------+----------+---------------
 154598 |   16984 |        41 | git      |             0
 154599 |   16984 |     31249 | git-lite |             1
 154600 |   16984 |     34760 | git-tiny |             2
(3 rows)

freshports.devgit=# SELECT * FROM PackageFlavors(16984);
   id   | port_id | flavor_id | flavor_name |   name   | flavor_number 
--------+---------+-----------+-------------+----------+---------------
 154598 |   16984 |        41 | default     | git      |             0
 154599 |   16984 |     31249 | lite        | git-lite |             1
 154600 |   16984 |     34760 | tiny        | git-tiny |             2
(3 rows)

freshports.devgit=# 

@dlangille
Copy link
Contributor

Some stuff from debug output:

  34 => 
  array (
    'package_name' => 'git-gui',
    'abi' => 'FreeBSD:12:aarch64',
    'package_version_latest' => NULL,
    'package_version_quarterly' => NULL,
    'last_checked_latest' => '2023-05-31 19:00',
    'repo_date_latest' => '2018-10-11 05:47',
    'import_date_latest' => '2023-03-12 01:00',
    'processed_date_latest' => '2023-03-12 01:05',
    'last_checked_quarterly' => '2023-05-31 19:00',
    'repo_date_quarterly' => '2023-05-30 16:29',
    'import_date_quarterly' => '2023-05-30 23:00',
    'processed_date_quarterly' => '2023-05-30 23:00',
  ),
  35 => 

...

  102 => 
  array (
    'package_name' => 'git-svn',
    'abi' => 'FreeBSD:13:armv7',
    'package_version_latest' => NULL,
    'package_version_quarterly' => NULL,
    'last_checked_latest' => '2023-05-31 19:00',
    'repo_date_latest' => '2023-05-24 10:55',
    'import_date_latest' => '2023-05-24 17:00',
    'processed_date_latest' => '2023-05-24 17:00',
    'last_checked_quarterly' => '2023-05-31 19:00',
    'repo_date_quarterly' => '2023-05-22 18:03',
    'import_date_quarterly' => '2023-05-23 03:00',
    'processed_date_quarterly' => '2023-05-23 03:01',
  ),

@dlangille
Copy link
Contributor

I wonder if the issue is stale data. Not everything is clear out of the database when reading in a new set of packages.

@dlangille dlangille reopened this May 31, 2023
@dlangille
Copy link
Contributor

freshports.devgit=# SELECT distinct package_name FROM PortPackages(16984);
 package_name 
--------------
 git
 git-gui
 git-lite
 git-svn
 git-tiny
(5 rows)

@dlangille
Copy link
Contributor

I think this is a back-end issue in https://github.com/FreshPorts/packages-import

@dlangille
Copy link
Contributor

There is a delete done:

https://github.com/FreshPorts/packages-import/blob/master/import_packagesite.py#L56

Let's look the package data to see if the other flavors are in there.

@dlangille
Copy link
Contributor

What do we fetch? https://github.com/FreshPorts/packages-import/blob/master/fetch-extract-parse-import-one-abi.sh#LL48C15-L48C72 says:

fetch --quiet https://pkg.freebsd.org/$abi/$package_set/packagesite.txz

@dlangille
Copy link
Contributor

[12:25 dev-ingress01 dan ~/tmp/499] % fetch --quiet http://pkg.freebsd.org/FreeBSD:13:amd64/latest/packagesite.txz
[12:25 dev-ingress01 dan ~/tmp/499] % tar -xzf packagesite.txz
[12:25 dev-ingress01 dan ~/tmp/499] % ls -l
total 18074
-rw-r--r--  1 dan  dan   6927972 2023.05.31 03:18 packagesite.txz
-rw-r--r--  1 dan  dan  46344959 2023.05.31 03:18 packagesite.yaml
-rw-r--r--  1 dan  dan       451 1970.01.01 00:00 packagesite.yaml.pub
-rw-r--r--  1 dan  dan       256 1970.01.01 00:00 packagesite.yaml.sig
[12:25 dev-ingress01 dan ~/tmp/499] % 

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

There it is. It's devel/git-gui

[12:25 dev-ingress01 dan ~/tmp/499] % grep -r git-gui packagesite.yaml packagesite.yaml:{"name":"git-gui","origin":"devel/git-gui","version":"2.40.1","comment":"Git GUI (Graphical User Interface) components","maintainer":"[email protected]","www":"https://git-scm.com/","abi":"FreeBSD:13:*","arch":"freebsd:13:*","prefix":"/usr/local","sum":"038c3bb91b3bf4bdec3bd54b89122b5d29eb983b60f548b56610e4f8d38a772f","flatsize":2265385,"path":"All/git-gui-2.40.1.pkg","repopath":"All/git-gui-2.40.1.pkg","licenselogic":"single","licenses":["GPLv2"],"pkgsize":349084,"desc":"Git GUI (Graphical User Interface) components\n\nWWW: https://git-scm.com/","deps":{"git":{"origin":"devel/git","version":"2.40.1"},"tcl86":{"origin":"lang/tcl86","version":"8.6.13"},"tk86":{"origin":"x11-toolkits/tk86","version":"8.6.13"}},"categories":["devel"],"annotations":{"cpe":"cpe:2.3:a:git-scm:git:2.40.1:::::freebsd13:x64"},"messages":[{"message":"If you installed the GITWEB option please follow these instructions:\n\nIn the directory /usr/local/share/examples/git/gitweb you can find all files to\nmake gitweb work as a public repository on the web.\n\nAll you have to do to make gitweb work is:\n1) Please be sure you're able to execute CGI scripts in\n /usr/local/share/examples/git/gitweb.\n2) Set the GITWEB_CONFIG variable in your webserver's config to\n /usr/local/etc/git/gitweb.conf. This variable is passed to gitweb.cgi.\n3) Restart server.\n\n\nIf you installed the CONTRIB option please note that the scripts are\ninstalled in /usr/local/share/git-core/contrib. Some of them require\nother ports to be installed (perl, python, etc), which you may need to\ninstall manually.","type":"install"}]}

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

And the other one is there too. git-svn is created by devel/git-svn

[12:26 dev-ingress01 dan ~/tmp/499] % grep -r git-svn packagesite.yaml packagesite.yaml:{"name":"rubygem-svn2git","origin":"devel/rubygem-svn2git","version":"2.4.0","comment":"Ruby utilities for exporting SVN repo to git","maintainer":"[email protected]","www":"https://github.com/nirvdrum/svn2git","abi":"FreeBSD:13:*","arch":"freebsd:13:*","prefix":"/usr/local","sum":"51f21901978e4ab7c76cbfd18fe7e4c84a06c127ed8e8d8332f3f4dc18b16ee5","flatsize":44476,"path":"All/rubygem-svn2git-2.4.0.pkg","repopath":"All/rubygem-svn2git-2.4.0.pkg","licenselogic":"single","licenses":["MIT"],"pkgsize":14756,"desc":"svn2git is a tiny utility for migrating projects from Subversion to\nGit while keeping the trunk, branches and tags where they should\nbe. It uses git-svn to clone an svn repository and does some clean-up\nto make sure branches and tags are imported in a meaningful way, and\nthat the code checked into master ends up being what's currently in\nyour svn trunk rather than whichever svn branch your last commit was\nin.\n\nWWW: https://github.com/nirvdrum/svn2git","deps":{"git":{"origin":"devel/git","version":"2.40.1"},"ruby":{"origin":"lang/ruby31","version":"3.1.4_1,1"},"ruby31-gems":{"origin":"devel/ruby-gems","version":"3.4.13"}},"categories":["devel","rubygems"]} packagesite.yaml:{"name":"git-svn","origin":"devel/git-svn","version":"2.40.1","comment":"Bidirectional operation between a Subversion repository and Git","maintainer":"[email protected]","www":"https://git-scm.com/","abi":"FreeBSD:13:*","arch":"freebsd:13:*","prefix":"/usr/local","sum":"e08e5f3de7651c9c88b10714b19f0b6d9d3e2e3418ad857ac06554c8c8f27898","flatsize":99598,"path":"All/git-svn-2.40.1.pkg","repopath":"All/git-svn-2.40.1.pkg","licenselogic":"single","licenses":["GPLv2"],"pkgsize":42792,"desc":"Bidirectional operation between a Subversion repository and Git\n\ngit svn is a simple conduit for changesets between Subversion and Git. It\nprovides a bidirectional flow of changes between a Subversion and a Git\nrepository.\n\nWWW: https://git-scm.com/","deps":{"git":{"origin":"devel/git","version":"2.40.1"},"p5-Term-ReadKey":{"origin":"devel/p5-Term-ReadKey","version":"2.38_1"},"p5-subversion":{"origin":"devel/p5-subversion","version":"1.14.2"}},"categories":["devel"],"annotations":{"cpe":"cpe:2.3:a:git-scm:git:2.40.1:::::freebsd13:x64"},"messages":[{"message":"If you installed the GITWEB option please follow these instructions:\n\nIn the directory /usr/local/share/examples/git/gitweb you can find all files to\nmake gitweb work as a public repository on the web.\n\nAll you have to do to make gitweb work is:\n1) Please be sure you're able to execute CGI scripts in\n /usr/local/share/examples/git/gitweb.\n2) Set the GITWEB_CONFIG variable in your webserver's config to\n /usr/local/etc/git/gitweb.conf. This variable is passed to gitweb.cgi.\n3) Restart server.\n\n\nIf you installed the CONTRIB option please note that the scripts are\ninstalled in /usr/local/share/git-core/contrib. Some of them require\nother ports to be installed (perl, python, etc), which you may need to\ninstall manually.","type":"install"}]}

@dlangille
Copy link
Contributor

Conclusion:

  • the package building system is building them and FreshPorts is doing the right thing
  • Add a note to the page to inform: The package system often builds flavors not listed here

Action required:

  • find out if this really is the package building system doing this via some magical method.

@dlangille
Copy link
Contributor

Here are the svn references:

[12:32 pkg01 dan ~/ports/head/devel/git] % grep svn *
Makefile:. elif ${SUBPORT} == svn
Makefile:.if ${SUBPORT} != svn
Makefile:RMMAN1+=	git-svn.1
Makefile:.if ${SUBPORT} != svn
Makefile:	@${REINPLACE_CMD} -e '/git-svn.perl/d' ${WRKSRC}/Makefile
Makefile:.elif ${SUBPORT} == svn
Makefile:	${INSTALL_SCRIPT} ${WRKSRC}/git-svn \
Makefile:	${INSTALL_DATA} ${WRKDIR}/man1/git-svn.1 \
grep: files: Is a directory
pkg-descr-lite:WARNING: This flavor cannot coexist with cvs, gui, p4 or svn subpackages.
pkg-descr-tiny:WARNING: This flavor cannot coexist with cvs, gui, p4 or svn subpackages.
pkg-plist:%%HTMLDOCS%%%%DOCSDIR%%/git-svn.html
pkg-plist:%%HTMLDOCS%%%%DOCSDIR%%/git-svn.txt

@dlangille
Copy link
Contributor

And the gui references:

[12:32 pkg01 dan ~/ports/head/devel/git] % grep gui *
Makefile:. if ${SUBPORT} != gui
Makefile:. elif ${SUBPORT} == gui
Makefile:.if ${SUBPORT} != gui
Makefile:		git-gui.1 \
Makefile:.if ${SUBPORT} == gui
Makefile:		${WRKSRC}/git-gui/git-gui--askpass
Makefile:.elif ${SUBPORT} == gui
Makefile:		 ${STAGEDIR}${PREFIX}/share/git-gui/lib/msgs \
Makefile:	${INSTALL_SCRIPT} ${WRKSRC}/git-gui/git-gui \
Makefile:	${LN} ${STAGEDIR}${PREFIX}/libexec/git-core/git-gui \
Makefile:	${INSTALL_SCRIPT} ${WRKSRC}/git-gui/git-gui--askpass \
Makefile:	${INSTALL_DATA} ${WRKSRC}/git-gui/lib/tclIndex \
Makefile:		${STAGEDIR}${PREFIX}/share/git-gui/lib/
Makefile:	${INSTALL_DATA} ${WRKSRC}/git-gui/lib/*.ico \
Makefile:		${STAGEDIR}${PREFIX}/share/git-gui/lib/
Makefile:	${INSTALL_DATA} ${WRKSRC}/git-gui/lib/*.js \
Makefile:		${STAGEDIR}${PREFIX}/share/git-gui/lib/
Makefile:	${INSTALL_DATA} ${WRKSRC}/git-gui/lib/*.tcl \
Makefile:		${STAGEDIR}${PREFIX}/share/git-gui/lib/
Makefile:	${INSTALL_DATA} ${WRKSRC}/git-gui/po/*.msg \
Makefile:		${STAGEDIR}${PREFIX}/share/git-gui/lib/msgs/
Makefile:	${INSTALL_DATA} ${WRKDIR}/man1/git-gui.1 \
grep: files: Is a directory
pkg-descr-lite:WARNING: This flavor cannot coexist with cvs, gui, p4 or svn subpackages.
pkg-descr-tiny:WARNING: This flavor cannot coexist with cvs, gui, p4 or svn subpackages.
pkg-plist:libexec/git-core/mergetools/guiffy
pkg-plist:%%HTMLDOCS%%%%DOCSDIR%%/git-gui.html
pkg-plist:%%HTMLDOCS%%%%DOCSDIR%%/git-gui.txt
[12:32 pkg01 dan ~/ports/head/devel/git] % 

@dlangille
Copy link
Contributor

I think this SUBPORT reference is the key

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

Here we go. It's the slave ports.

Slave ports:
   devel/git-cvs
   devel/git-gui
   devel/git-p4
   devel/git-svn

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

So why does FreshPorts list git-svn (devel/git-svn) under devel/git?

I see the .yaml file contains this:

{"origin":"devel/git","version":"2.40.1"}

@dlangille
Copy link
Contributor

The data gets broken up like this:

$JQ -rc --arg ABI "$abi" --arg PACKAGE_SET "$package_set" '[$ABI, $PACKAGE_SET, .origin, .name, .version] | @tsv' < packagesite.yaml > packagesite.tsv

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

Doing that manually:

[13:06 dev-ingress01 dan ~/tmp/499] % jq -rc --arg ABI "FreeBSD:13:amd64" --arg PACKAGE_SET "latest" '[$ABI, $PACKAGE_SET, .origin, .name, .version] | @tsv' < packagesite.yaml > packagesite.tsv
[13:06 dev-ingress01 dan ~/tmp/499] % grep -r git-svn packagesite.tsv | less                  
packagesite.tsv:FreeBSD:13:amd64        latest  devel/git-svn   git-svn 2.40.1

This shows why git-svn (package) is listed under devel/git-svn (origin)

Why does it also get listed under devel-git?

This is what I find for deve/git

packagesite.tsv:FreeBSD:13:amd64        latest  devel/git       git-tiny        2.40.1
packagesite.tsv:FreeBSD:13:amd64        latest  devel/git       git-lite        2.40.1
packagesite.tsv:FreeBSD:13:amd64        latest  devel/git       git     2.40.1

@dlangille
Copy link
Contributor

Next, look at the data import code.

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

https://github.com/FreshPorts/packages-import/blob/master/import-via-copy-packagesite-all-raw-fields.py#LL52C26-L52C38 imports the file ( I think this is the tsv file from above) contents into the packages_raw table.

@dlangille
Copy link
Contributor

[14:32 pg02 dan ~] % psql freshports.devgit
psql (12.14)
Type "help" for help.

freshports.devgit=# \d packages_raw
                             Table "public.packages_raw"
     Column      |     Type     | Collation | Nullable |           Default            
-----------------+--------------+-----------+----------+------------------------------
 id              | bigint       |           | not null | generated always as identity
 package_origin  | text         |           | not null | 
 package_name    | text         |           | not null | 
 package_version | text         |           | not null | 
 abi             | text         |           | not null | 
 abi_id          | integer      |           |          | 
 port_id         | integer      |           |          | 
 package_set     | package_sets |           |          | 
Indexes:
    "packages_raw_alt_pkey" PRIMARY KEY, btree (id)
    "packages_raw_abi_id_idx" btree (abi_id)
    "packages_raw_abi_idx" btree (abi)
    "packages_raw_abi_package_set_idx" btree (abi, package_set)
    "packages_raw_all" btree (abi_id, package_name, package_set) INCLUDE (package_version)
    "packages_raw_alt_package_name_idx" btree (package_name)
    "packages_raw_package_origin_idx" btree (package_origin)
    "packages_raw_port_id_idx" btree (port_id)
Foreign-key constraints:
    "packages_raw_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
    "packages_raw_port_id_fk" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

freshports.devgit=# 

@dlangille
Copy link
Contributor

dlangille commented Jun 1, 2023

Here's the data now in that table. It seems right.

freshports.devgit=# select * from packages_raw where abi = 'FreeBSD:13:amd64' and package_origin = 'devel/git' and package_set = 'latest' order by package_name;
    id     | package_origin | package_name | package_version |       abi        | abi_id | port_id | package_set 
-----------+----------------+--------------+-----------------+------------------+--------+---------+-------------
 185669366 | devel/git      | git          | 2.40.1          | FreeBSD:13:amd64 |     11 |   16984 | latest
 185669339 | devel/git      | git-lite     | 2.40.1          | FreeBSD:13:amd64 |     11 |   16984 | latest
 185669330 | devel/git      | git-tiny     | 2.40.1          | FreeBSD:13:amd64 |     11 |   16984 | latest
(3 rows)

freshports.devgit=# 
freshports.devgit=# select * from packages_raw where abi = 'FreeBSD:13:amd64' and package_name  = 'git-svn' and package_set = 'latest' order by package_name;
    id     | package_origin | package_name | package_version |       abi        | abi_id | port_id | package_set 
-----------+----------------+--------------+-----------------+------------------+--------+---------+-------------
 185669305 | devel/git-svn  | git-svn      | 2.40.1          | FreeBSD:13:amd64 |     11 |   96647 | latest
(1 row)
freshports.devgit=# select * from packages_raw where abi = 'FreeBSD:13:amd64' and package_name  = 'git-gui' and package_set = 'latest' order by package_name;
    id     | package_origin | package_name | package_version |       abi        | abi_id | port_id | package_set 
-----------+----------------+--------------+-----------------+------------------+--------+---------+-------------
 185669331 | devel/git-gui  | git-gui      | 2.40.1          | FreeBSD:13:amd64 |     11 |   37356 | latest
(1 row)

@dlangille
Copy link
Contributor

This is the table which gets populated from packages_raw:

freshports.devgit=# \d packages
                               Table "public.packages"
     Column      |     Type     | Collation | Nullable |           Default            
-----------------+--------------+-----------+----------+------------------------------
 id              | bigint       |           | not null | generated always as identity
 abi_id          | integer      |           | not null | 
 package_set     | package_sets |           |          | 
 package_name    | text         |           | not null | 
 package_version | text         |           | not null | 
 port_id         | integer      |           | not null | 
Indexes:
    "packages_pkey" PRIMARY KEY, btree (id)
    "fki_packages_abi_id_fk" btree (abi_id)
    "fki_packages_port_id_fk" btree (port_id)
    "packages_all_idx" btree (abi_id, package_name, package_set) INCLUDE (package_version)
    "packages_package_name_idx" btree (package_name)
Foreign-key constraints:
    "packages_abi_id_fk" FOREIGN KEY (abi_id) REFERENCES abi(id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
    "packages_port_id_fk" FOREIGN KEY (port_id) REFERENCES ports(id) ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

freshports.devgit=# 

@dlangille
Copy link
Contributor

This content also looks good.

freshports.devgit=# select * from packages where port_id = (select id from ports_active where name = 'git') and package_set = 'latest' and abi_id = (select id from abi where name = 'FreeBSD:13:amd64');
   id    | abi_id | package_set | package_name | package_version | port_id 
---------+--------+-------------+--------------+-----------------+---------
 2098989 |     11 | latest      | git-tiny     | 2.40.1          |   16984
 2112471 |     11 | latest      | git          | 2.40.1          |   16984
 2114330 |     11 | latest      | git-lite     | 2.40.1          |   16984
(3 rows)

freshports.devgit=# 

@dlangille
Copy link
Contributor

Going to the website code (https://github.com/FreshPorts/freshports/blob/main/classes/packages.php#LL51C25-L51C37), and mangling the PortPackages query in sp.txt, we find this, which also looks right.

freshports.devgit=#     WITH
      pkg AS (select * from packages where port_id = (select id from ports_active where name = 'git') and package_set = 'latest' and abi_id = (select id from abi where name = 'FreeBSD:13:amd64'))
    SELECT pn.package_name,
           abi.name AS abi,
           max(pkg.package_version)           FILTER (WHERE pkg.package_set = 'latest')    AS package_version_latest,
           max(iso_date(PLC.last_checked))    FILTER (where PLC.package_set = 'latest')    AS last_checked_latest,
           max(iso_date(PLC.repo_date))       FILTER (where PLC.package_set = 'latest')    AS repo_date_latest,
           max(iso_date(PLC.import_date))     FILTER (where PLC.package_set = 'latest')    AS import_date_latest,
           max(iso_date(PLC.processed_date))  FILTER (where PLC.package_set = 'latest')    AS processed_date_latest
      FROM abi
           CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
           LEFT JOIN pkg ON (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
           JOIN packages_last_checked PLC ON PLC.abi_id  = abi.id
where abi.id = (select id from abi where name = 'FreeBSD:13:amd64')
     GROUP BY pn.package_name, abi.name
     ORDER BY pn.package_name, abi.name;
 package_name |       abi        | package_version_latest | last_checked_latest | repo_date_latest | import_date_latest | processed_date_latest 
--------------+------------------+------------------------+---------------------+------------------+--------------------+-----------------------
 git          | FreeBSD:13:amd64 | 2.40.1                 | 2023-06-01 14:00    | 2023-05-31 03:18 | 2023-05-31 17:00   | 2023-05-31 17:00
 git-lite     | FreeBSD:13:amd64 | 2.40.1                 | 2023-06-01 14:00    | 2023-05-31 03:18 | 2023-05-31 17:00   | 2023-05-31 17:00
 git-tiny     | FreeBSD:13:amd64 | 2.40.1                 | 2023-06-01 14:00    | 2023-05-31 03:18 | 2023-05-31 17:00   | 2023-05-31 17:00
(3 rows)

freshports.devgit=# 

@dlangille
Copy link
Contributor

I'll come back to this. I want to look at how the package details are composed onto the page.

@grahamperrin
Copy link
Contributor Author

Past 02:10 in the early hours, I need to sleep, but first,

#449 (comment)

dev dev.freshports.org/archivers/py-lz4/#flavors appears better.

Better to my sleepy non-developer eye:

  • primarily, because the number of tables does not exceed the number of listed flavours
  • secondarily, because the odd-man-out version 2.1.10 does not appear in the row for FreeBSD:13:riscv64.

From #449 (comment) regarding www:

Are you concluding there are no packages because a version is not displayed?

A tentative conclusion, yes, partly because of that.

Also, partly because of dev showing just the one table (with versions).

@dlangille
Copy link
Contributor

dlangille commented Sep 24, 2023

Now I'm not sure why dev shows only py39-lz4 but the other nodes show other flavors.

@dlangille
Copy link
Contributor

dlangille commented Sep 29, 2023

The data is in the database. This is not a caching issue.

where port_id in (select id from ports_active where name = 'py-lz4')
freshports.testgit-# ;
   id   | port_id | flavor_id |   name   | flavor_number 
--------+---------+-----------+----------+---------------
 129883 |   36648 |     32588 | py39-lz4 |             0

freshports.testgit=# select * from packages
where port_id in (select id from ports_active where name = 'py-lz4')
order by package_name;
   id    | abi_id | package_set | package_name | package_version | port_id 
---------+--------+-------------+--------------+-----------------+---------
 1814621 |     15 | latest      | py27-lz4     | 2.1.0           |   36648
 1897287 |     16 | quarterly   | py27-lz4     | 2.1.10          |   36648
 1856464 |     15 | quarterly   | py27-lz4     | 2.1.10          |   36648
 1877232 |     16 | latest      | py27-lz4     | 2.1.0           |   36648
 1710581 |      9 | latest      | py27-lz4     | 2.1.0           |   36648
 1996438 |     18 | latest      | py27-lz4     | 2.1.1           |   36648
 2009005 |     18 | quarterly   | py27-lz4     | 2.1.10          |   36648
 1835730 |     15 | latest      | py36-lz4     | 2.1.0           |   36648
 1710661 |      9 | latest      | py36-lz4     | 2.1.0           |   36648
 1980317 |     18 | latest      | py36-lz4     | 2.1.1           |   36648
 1880687 |     16 | latest      | py36-lz4     | 2.1.0           |   36648
 2033187 |     19 | quarterly   | py37-lz4     | 2.1.10          |   36648
 2334697 |     24 | latest      | py37-lz4     | 2.1.10          |   36648
 2162695 |     20 | latest      | py37-lz4     | 2.1.10          |   36648
 1910725 |     16 | quarterly   | py37-lz4     | 2.1.10          |   36648
 2000666 |     18 | quarterly   | py37-lz4     | 2.1.10          |   36648
 1856784 |     15 | quarterly   | py37-lz4     | 2.1.10          |   36648
 2852195 |     34 | latest      | py38-lz4     | 2.1.10          |   36648
 2823570 |     35 | quarterly   | py39-lz4     | 2.1.10          |   36648
 2791479 |     21 | quarterly   | py39-lz4     | 4.3.2           |   36648
 2755090 |     11 | quarterly   | py39-lz4     | 4.3.2           |   36648
 3116961 |     39 | latest      | py39-lz4     | 4.3.2           |   36648
 2781582 |     28 | latest      | py39-lz4     | 4.0.2           |   36648
 2721088 |     30 | latest      | py39-lz4     | 4.3.2           |   36648
 2932048 |     33 | latest      | py39-lz4     | 4.3.2           |   36648
 2725189 |     10 | latest      | py39-lz4     | 4.3.2           |   36648
 2726384 |     27 | latest      | py39-lz4     | 4.3.2           |   36648
 2732510 |     11 | latest      | py39-lz4     | 4.3.2           |   36648
 2739684 |      8 | latest      | py39-lz4     | 4.3.2           |   36648
 2741953 |      2 | latest      | py39-lz4     | 4.3.2           |   36648
 2766404 |      1 | quarterly   | py39-lz4     | 4.3.2           |   36648
 2745246 |      1 | latest      | py39-lz4     | 4.3.2           |   36648
 3076358 |     38 | latest      | py39-lz4     | 4.3.2           |   36648
 2913421 |     24 | quarterly   | py39-lz4     | 4.3.2           |   36648
 2750748 |     10 | quarterly   | py39-lz4     | 4.3.2           |   36648
 2772190 |     29 | latest      | py39-lz4     | 4.3.2           |   36648
 2759774 |      8 | quarterly   | py39-lz4     | 4.3.2           |   36648
 3037297 |     36 | latest      | py39-lz4     | 4.3.2           |   36648
 2802015 |     21 | latest      | py39-lz4     | 4.3.2           |   36648
 2732749 |     26 | latest      | py39-lz4     | 4.3.2           |   36648
 2763571 |      2 | quarterly   | py39-lz4     | 4.3.2           |   36648
 2778732 |      9 | quarterly   | py39-lz4     | 4.3.2           |   36648
 2786648 |     20 | quarterly   | py39-lz4     | 4.3.2           |   36648
(43 rows)

freshports.testgit=# 


@dlangille
Copy link
Contributor

The data is in the database. It's not a website display or caching problem.

Let's look at package processing. ... which we already did at #449 (comment)

@dlangille
Copy link
Contributor

dlangille commented Sep 29, 2023

PackagesRawDeleteForABIPackageSet does a delete before importing. That should start us with zero packages for given port

CREATE OR REPLACE FUNCTION public.packagesrawdeleteforabipackageset(
	a_abi_name text,
	a_package_set package_sets)
    RETURNS void
    LANGUAGE 'sql'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
   DELETE FROM packages_raw
    WHERE abi         = a_abi_name
      AND package_set = a_package_set;
$BODY$;

@dlangille
Copy link
Contributor

dlangille commented Sep 29, 2023

No, that just deletes from package_raw - does anything delete from packages?

The next place to look: the population of packages based on the contents of packages_raw: https://github.com/FreshPorts/packages-import/blob/master/UpdatePackagesFromRawPackages.py

That code invokes updatepackagesfromrawpackages

@dlangille
Copy link
Contributor

Looking at that code (which is not on GitHub), it does this

-- FUNCTION: public.updatepackagesfromrawpackages(text, package_sets)

-- DROP FUNCTION IF EXISTS public.updatepackagesfromrawpackages(text, package_sets);

CREATE OR REPLACE FUNCTION public.updatepackagesfromrawpackages(
	a_abi_name text,
	a_package_set package_sets)
    RETURNS TABLE(inserts integer, updates integer, deletes integer) 
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE PARALLEL UNSAFE
    ROWS 1000

AS $BODY$
  DECLARE
    l_abi_id  integer;
    l_inserts integer;
    l_deletes integer;
    l_updates integer;
  BEGIN
    SELECT id
      INTO l_abi_id
      FROM abi
     WHERE name = a_abi_name;

    IF NOT FOUND THEN
        RAISE EXCEPTION 'cannot find id for ABI %', a_abi_name;
    END IF;

    -- set abi_id and port_id before updating the packages table.
    UPDATE packages_raw PR
       SET abi_id  = l_abi_id,
           port_id = P.id
      FROM ports P, element_pathname EP
     WHERE PR.abi         = a_abi_name
       AND PR.package_set = a_package_set
       AND EP.pathname    = '/ports/head/' || PR.package_origin
       AND P.element_id   = EP.element_id;

    -- this is required to allow the DELETE to properly plan the operation
    analyze packages_raw;

    UPDATE packages p
       SET package_version = PR.package_version
      FROM packages_raw PR
     WHERE P.abi_id           = PR.abi_id
       AND P.package_set      = PR.package_set
       AND P.package_name     = PR.package_name
       AND P.port_id          = PR.port_id
       AND P.package_version != PR.package_version
       AND PR.abi             = a_abi_name
       AND PR.package_set     = a_package_set;

    GET DIAGNOSTICS l_updates = ROW_COUNT;

    INSERT INTO packages (abi_id, port_id, package_version, package_name, package_set)
    SELECT abi_id, port_id, package_version, package_name, package_set
      FROM packages_raw PR
     WHERE NOT EXISTS (
     SELECT *
       FROM packages P
      WHERE P.abi_id           = PR.abi_id
        AND P.package_set      = PR.package_set
        AND P.package_name     = PR.package_name
        AND P.port_id          = PR.port_id
        AND P.package_version  = PR.package_version
        AND PR.abi             = a_abi_name
        AND PR.package_set     = a_package_set)
     AND PR.abi_id  IS NOT NULL
     AND PR.port_id IS NOT NULL
     AND PR.abi             = a_abi_name
     AND PR.package_set     = a_package_set;

    GET DIAGNOSTICS l_inserts = ROW_COUNT;

    DELETE FROM packages P
     WHERE P.abi_id      = l_abi_id
       AND P.package_set = a_package_set
       AND NOT EXISTS (
         SELECT *
           FROM packages_raw PR
          WHERE P.abi_id       = PR.abi_id
            AND P.package_set  = PR.package_set
            AND P.package_name = PR.package_name
            AND P.port_id      = PR.port_id
            AND PR.abi         = a_abi_name
            AND PR.package_set = a_package_set);

    GET DIAGNOSTICS l_deletes = ROW_COUNT;

    INSERT INTO package_imports (abi_id, package_set, date, inserts, updates, deletes, last_checked, repo_date, import_date, processed_date)
    SELECT l_abi_id, a_package_set, CURRENT_TIMESTAMP, l_inserts, l_updates, l_deletes,
           PLC.last_checked, PLC.repo_date, PLC.import_date, CURRENT_TIMESTAMP
      FROM packages_last_checked PLC
     WHERE PLC.abi_id      = l_abi_id
       AND PLC.package_set = a_package_set;

    RETURN QUERY (SELECT l_inserts, l_updates, l_deletes);

  END;
$BODY$;

ALTER FUNCTION public.updatepackagesfromrawpackages(text, package_sets)
    OWNER TO postgres;

@dlangille
Copy link
Contributor

Next for me:

  • should the code do a blanket delete?
  • it does a DELETE, but why? What is the goal of that delete?

@dlangille
Copy link
Contributor

dlangille commented Sep 30, 2023

Reading https://news.freshports.org/2020/04/17/1325/ I found confirmation that a delete is not done. That is why old flavors remain.

@grahamperrin: The root cause has been found.

When updates arrive, they arrive in quantity. The latest update was 30474 records. From that, we will have some new packages, some will go away, and some will be updates. Many will have no change. I see two approaches:

1. UPDATE where found and nothing to change, INSERT where not found, DELETE where no longer present.
2. DELETE everything related to this import, INSERT everything imported.

Option 1 will have far less database churn. Option 2 will maximize it.

Let’s look at option 1 mostly because I like that approach best and it should be straight forward to implement.

@dlangille
Copy link
Contributor

First thought: Go with delete. That will make package_imports impossible to maintain because of the difficulty in establishing insert, update, and delete counts.

Second thought: We could stay with option 1, and then just delete from packages whatever is not found in package_raw

Third: wait, we already add number of deletes to package_imports - it sounds like that procedure is faulty given my second thought. If we are already deleting, it is not deleting enough.

@dlangille
Copy link
Contributor

dlangille commented Sep 30, 2023

PackagesRawDeleteForABIPackageSet does a delete before importing. That should start us with zero packages for given port

CREATE OR REPLACE FUNCTION public.packagesrawdeleteforabipackageset(
	a_abi_name text,
	a_package_set package_sets)
    RETURNS void
    LANGUAGE 'sql'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
   DELETE FROM packages_raw
    WHERE abi         = a_abi_name
      AND package_set = a_package_set;
$BODY$;

However PackagesRawDeleteForABIPackageSet just deletes for a given ABI and package_set (e.g. FreeBSD:13:amd64 and `latest). I wonder if that's relevant.

Why would it be just one ABI? If we import an ABI today, and never again import that ABI? Those values will remain in the table forever.

Let's look here:

freshports.devgit=# SELECT * FROM public.packages_raw
WHERE port_id = 36648 and package_name like 'py27-%'
ORDER BY abi ASC ;
    id     |  package_origin  | package_name | package_version |        abi         | abi_id | port_id | package_set 
-----------+------------------+--------------+-----------------+--------------------+--------+---------+-------------
 173691581 | archivers/py-lz4 | py27-lz4     | 2.1.0           | FreeBSD:12:aarch64 |      9 |   36648 | latest
 173799050 | archivers/py-lz4 | py27-lz4     | 2.1.0           | FreeBSD:12:armv6   |     15 |   36648 | latest
 173833729 | archivers/py-lz4 | py27-lz4     | 2.1.10          | FreeBSD:12:armv6   |     15 |   36648 | quarterly
 173858190 | archivers/py-lz4 | py27-lz4     | 2.1.0           | FreeBSD:12:armv7   |     16 |   36648 | latest
 173887087 | archivers/py-lz4 | py27-lz4     | 2.1.10          | FreeBSD:12:armv7   |     16 |   36648 | quarterly
(5 rows)

freshports.devgit=# 

That's all FreeBSD 12 - not sure if that's relevant.

@dlangille
Copy link
Contributor

Let's use https://test.freshports.org/archivers/py-lz4/#packages as an example. It shows 5 flavors:

  • py27
  • py36
  • py37
  • py38
  • py39

However, we have only this:

freshports.testgit=# select * from package_flavors where name like 'py%-lz4';
   id   | port_id | flavor_id |   name   | flavor_number 
--------+---------+-----------+----------+---------------
 129883 |   36648 |     32588 | py39-lz4 |             0
(1 row)

freshports.testgit=# 

But packages_raw has all this:

freshports.testgit=# SELECT distinct(package_name)FROM public.packages_raw
WHERE port_id = 36648
ORDER BY package_name ;
 package_name 
--------------
 py27-lz4
 py36-lz4
 py37-lz4
 py38-lz4
 py39-lz4
(5 rows)

Let's delete py38-lz4 from packages_raw, clear the packages cache, and reload the page.

No,that didn't affect the page.

However, deleting from packages did. There is now one less table of packages for this port.

@dlangille
Copy link
Contributor

Now my theory is: stuff hanging around in packages longer than they should. Stuff in packages is deleted based on this in packages_raw (or rather, not found in packages_raw).

The inserts, deletes, and updates on packages is based upon a single ABI and package_set.

Why not wipe packages_raw before each import?

@dlangille
Copy link
Contributor

dlangille commented Sep 30, 2023

The test plan:

  • delete py36-lz4 from the packages table
  • delete py37-lz4 from the packages_raw table

Hypothesis

After an import of any ABI/package_set combination the following will be true:

  • Neither py36-lz4 nor py37-lz4 will not exist in packages_raw or packages
  • py27-lz4 & py39-lz4 will exist in both packages_raw and packages

EDIT: I said import above. That is not precise. I think all we have to do is invoke the script which populates packages from packages_raw. See #449 (comment) above

@dlangille
Copy link
Contributor

freshports.testgit=# delete from public.packages_raw where package_name = 'py37-lz4';
DELETE 6
freshports.testgit=# begin;
BEGIN
freshports.testgit=# delete from packages where package_name = 'py36-lz4';
DELETE 4
freshports.testgit=# commit;
COMMIT
freshports.testgit=# 

@dlangille
Copy link
Contributor

dlangille commented Sep 30, 2023

After clearing cache, https://test.freshports.org/archivers/py-lz4/#packages shows only three flavor:

  • py27
  • py37
  • py39

@dlangille
Copy link
Contributor

dlangille commented Sep 30, 2023

Next: invoke UpdatePackagesFromRawPackages', (row['abi_name'],row['package_set'])

This looks like no updates at all.

freshports.testgit=# select UpdatePackagesFromRawPackages('FreeBSD:13:amd64', 'latest');
 updatepackagesfromrawpackages 
-------------------------------
 (0,0,0)
(1 row)

freshports.testgit=# 

@dlangille
Copy link
Contributor

Well, we still have py37-lz4 at https://test.freshports.org/archivers/py-lz4/#packages

Where did that come from.

@dlangille
Copy link
Contributor

Well, we have this:

freshports.testgit=# SELECT distinct(package_name)FROM public.packages_raw
WHERE port_id = 36648
ORDER BY package_name ;
 package_name 
--------------
 py27-lz4
 py36-lz4
 py39-lz4
(3 rows)

@dlangille
Copy link
Contributor

dlangille commented Sep 30, 2023

I understand now.

The problem is not the delete statement. It is correct.

The problem is old ABI which are no longer being built. Information in the packages table gets added and is not updated again until there is a new build for that ABI. If that ABI is never updated, that information will persist until deleted.

For example, the last time FreeBSD:12:aarch64 was updated is: Oct 11 2018

The packages are still out there.

@dlangille
Copy link
Contributor

This is resolved if FreshPorts deletes ABI which are no longer built.

Does an authoritative list exist of ABI for which packages are built?

@grahamperrin
Copy link
Contributor Author

Does an authoritative list exist of ABI for which packages are built?

I guess, the FreeBSD Packages Management Team https://www.freebsd.org/administration/#t-pkgmgr will know.

HTH

@dlangille
Copy link
Contributor

email sent.

@dlangille
Copy link
Contributor

Resent on Oct 15th.

Sent to freebsd-ports on Nov 19: https://lists.freebsd.org/archives/freebsd-ports/2023-November/004927.html

@dlangille
Copy link
Contributor

dlangille commented Nov 28, 2023

See also

@dlangille
Copy link
Contributor

Might be related to:

@dlangille
Copy link
Contributor

@grahamperrin this seems fixed/ready to close?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants