Indexing issues with Linux project #340

Open
safinaskar opened this issue Sep 30, 2024 · 13 comments
@safinaskar

Steps to reproduce:

You will see drivers/net/ethernet/mscc/ocelot_vcap.c, line 1898 (as a function) in the list of definitions, but that file has no line 1898.

@fstachura
Collaborator

Hello,

Thank you for your report. We are currently investigating this issue. It's not yet entirely clear what caused it, but for now we have decided to reindex the whole Linux database, which seems to have been corrupted somehow. Reindexing should be done soon.
In the meantime, if you don't need version 6.11 specifically, I suggest switching Linux versions in the sidebar; other versions seem to return correct results. (On v6.11 the only problem seems to be extra bogus results: the references in fs/exec.c are correct and are the only ones I could find with grep.)
It's also not clear to me yet whether other identifiers are affected.

@tleb changed the title from "https://elixir.bootlin.com/linux/v6.11/C/ident/do_execveat_common links to drivers/net/ethernet/mscc/ocelot_vcap.c, line 1898, but there is no such line" to "Incorrect definition entry for do_execveat_common() in Linux" on Oct 2, 2024
@fstachura
Collaborator

Another set of identifiers with a similar issue: register_chrdev and register_chrdev_region. Interestingly, both point to around line 2500 in arch/loongarch/kvm/switch.S.

@tleb added the bug label on Oct 8, 2024
@tleb
Member

tleb commented Oct 8, 2024

One more indexing issue: https://elixir.bootlin.com/linux/v6.12-rc1/A/ident/regulator_err2notif
Two references are missing; compare with v6.11.2, where the results should be almost identical (only the line numbers differ).

This is the reverse of the do_execveat_common() bug: do_execveat_common() has ghost entries, whereas regulator_err2notif is missing entries.

Let's keep track of all indexing issues without a clear root cause here. It might or might not be the same issue.

@tleb changed the title from "Incorrect definition entry for do_execveat_common() in Linux" to "Indexing issues with Linux project" on Oct 8, 2024
@fstachura
Collaborator

It looks like no references were registered at all for https://elixir.bootlin.com/linux/v6.12-rc1/source/drivers/regulator/bd96801-regulator.c#L325
To check, open any identifier from that file in a new tab and search (Ctrl+F) for the filename.
Maybe the update was interrupted?

@Fomys
Contributor

Fomys commented Oct 23, 2024

Hi,
I just found another indexing issue: the documentation comments in this file are not indexed by Elixir (at least for the ~10 functions I checked).
For example, according to Elixir, this identifier has:

  • one prototype
  • one definition
  • two references

But I also expect to see the documentation item.

@Fomys
Contributor

Fomys commented Oct 23, 2024

Another issue (a very strange one this time):
[screenshot]
The identifier is rendered as a link, and the link seems to work (see screenshot), but as you can see, the reference is not listed.

@fstachura
Collaborator

@Fomys @tleb I can confirm that all issues mentioned in this thread (so far) are fixed in the fresh database.

@fstachura
Collaborator

fstachura commented Oct 23, 2024

0b8d735 introduces a bit more thread safety, which may prevent this from happening to the same extent in the future. That depends on whether the corruption comes from an actual race condition or from the update process being interrupted. The next important step will be to make the update process interruptible/restartable.
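
As a rough illustration of what "restartable" could mean here, the sketch below keeps a checkpoint of processed blobs on disk and skips them on the next run. It is not taken from update.py: the JSON checkpoint file and the names load_done, save_done and process_blobs are all hypothetical, and the real databases are Berkeley DB files rather than JSON.

```python
import json
import os


def load_done(path):
    """Return the set of blob ids recorded in the checkpoint file (hypothetical format)."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def save_done(path, done):
    """Write the checkpoint to a temporary file, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, path)  # the rename is atomic, so the checkpoint is never half-written


def process_blobs(blobs, checkpoint_path, handle, stop_after=None):
    """Process blobs not yet in the checkpoint; return True once all are done.

    stop_after simulates an interruption after that many blobs in this run.
    """
    done = load_done(checkpoint_path)
    handled = 0
    for blob in blobs:
        if blob in done:
            continue  # already indexed by an earlier run
        if stop_after is not None and handled == stop_after:
            return False  # simulated interruption
        handle(blob)
        done.add(blob)
        save_done(checkpoint_path, done)
        handled += 1
    return True
```

One caveat with this design: a crash between handle(blob) and save_done() means the blob is handled again on restart, so the per-blob work would have to be idempotent.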

@Fomys
Contributor

Fomys commented Oct 23, 2024

Excellent news! Thank you for the investigation!

@tleb
Member

tleb commented Oct 23, 2024

To keep this issue exhaustive, I'll describe here our best guess (with @fstachura, in addition to 0b8d735) as to why some files are missing all their references:

  • update.py does a ./script.sh list-tags to find all project tags (code).
  • It runs the indexing on all tags that are not present in versions.db (code).
  • The first step of the indexing is to find all blobs. Once that is done, the version is added to versions.db (code).

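Now imagine that update.py crashes: a version that completed the first step has an entry in versions.db, but not all of its blobs have been indexed. On the next run that version is skipped, because it is already present in versions.db.

We have seen some crashes from update.py. If I remember correctly they were caused by running out of memory (OOM), but I am not sure.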
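
The suspected sequence can be reproduced with a toy model. This is only a sketch of the guess above, not the actual update.py logic: plain dicts stand in for versions.db and the references database, and index_version, pending_tags, run and the mark_early flag are hypothetical names.

```python
class Crash(Exception):
    """Stands in for update.py dying mid-run (for example from OOM)."""


def index_version(tag, blobs, versions_db, refs_db, crash_after=None, mark_early=True):
    """Toy indexing run for one tag."""
    if mark_early:
        # Suspected current behaviour: the version is recorded right after
        # the blob list is built, before any blob is indexed.
        versions_db[tag] = list(blobs)
    for i, blob in enumerate(blobs):
        if crash_after is not None and i == crash_after:
            raise Crash(f"simulated crash while indexing {tag}")
        refs_db.setdefault(tag, []).append(blob)
    if not mark_early:
        # Alternative: record the version only once every blob is indexed.
        versions_db[tag] = list(blobs)


def pending_tags(all_tags, versions_db):
    """Tags that the next run will index: those absent from versions_db."""
    return [t for t in all_tags if t not in versions_db]


def run(mark_early):
    """Crash during the first run, then run the update again; return indexed blobs."""
    versions_db, refs_db = {}, {}
    tags = ["v6.12-rc1"]
    blobs = ["a.c", "b.c", "c.c"]
    try:
        index_version(tags[0], blobs, versions_db, refs_db,
                      crash_after=1, mark_early=mark_early)
    except Crash:
        pass
    for tag in pending_tags(tags, versions_db):
        refs_db[tag] = []  # discard partial results before retrying
        index_version(tag, blobs, versions_db, refs_db, mark_early=mark_early)
    return refs_db.get(tags[0], [])
```

With mark_early=True the second run finds nothing to do and b.c and c.c never get their references; with mark_early=False the tag is still pending after the crash and is indexed in full on the retry.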

@tleb
Member

tleb commented Oct 29, 2024

The production database for Linux has been updated with a from-scratch indexing. All missing identifiers/references/docs listed above should now be present on https://elixir.bootlin.com/.

The bug is not fixed per se, as it can recur if/when the indexing fails again.

@Fomys
Contributor

Fomys commented Dec 17, 2024

New bug:

The identifier is correctly recognized in the source view: https://elixir.bootlin.com/linux/v6.12.5/source/kernel/bpf/syscall.c#L1398
But that use is not listed on the identifier page: https://elixir.bootlin.com/linux/v6.12.5/C/ident/bpf_map_alloc_id

This seems to be fixed for v6.13.

@tleb
Member

tleb commented Dec 18, 2024

The cron job logs do contain an error; I am unsure whether it is related:

Processing project /srv/elixir-data/linux ...
Fetching origin
From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
 * branch                      HEAD       -> FETCH_HEAD
 * [new tag]                   v6.13-rc3  -> v6.13-rc3
Fetching other
Fetching other2
Exception in thread UpdateRefsElixir:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/elixir/./update.py", line 282, in run
    self.update_references(new_idxes[self.index][0])
  File "/usr/local/elixir/./update.py", line 334, in update_references
    obj = db.refs.get(ident)
  File "/usr/local/elixir/elixir/data.py", line 167, in get
    p = self.db.get(key)
berkeleydb.db.DBPageNotFoundError: (-30986, 'BDB0075 DB_PAGE_NOTFOUND: Requested page not found')
linux - found 1 new tags
linux - ids: v6.13-rc3: 320 new blobs (100.0%)
linux - ids: Thread finished (100.0%)
linux - defs: Thread 1/2 finished (100.0%)
linux - refs: Thread 1/3 finished (0.0%)
linux - refs: Thread 2/3 finished (0.0%)
linux - comps: Thread 1/1 finished (100.0%)
linux - vers: v6.13-rc3 done (100.0%)
linux - vers: Thread finished (100.0%)
linux - comps_docs: Thread 1/1 finished (100.0%)
linux - docs: Thread 1/1 finished (100.0%)
linux - defs: Thread 2/2 finished (100.0%)
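
A log like this could be checked automatically. The sketch below is only an illustration: it assumes the "Thread N/M finished (P%)" line format seen above stays stable, and it treats a percentage below 100 or a thread that never reports as suspicious, which is an assumption about what those numbers mean. The name suspicious_stages is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as "linux - refs: Thread 1/3 finished (0.0%)"
# and "linux - ids: Thread finished (100.0%)".
LINE = re.compile(
    r"^(?P<project>\S+) - (?P<stage>\w+): Thread "
    r"(?:(?P<n>\d+)/(?P<total>\d+) )?finished \((?P<pct>[\d.]+)%\)$"
)


def suspicious_stages(log_text):
    """Return human-readable warnings for stages that look incomplete."""
    seen = defaultdict(set)  # (project, stage) -> thread numbers that reported
    totals = {}              # (project, stage) -> announced thread count
    problems = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        key = (m["project"], m["stage"])
        if float(m["pct"]) < 100.0:
            problems.append(f"{key[0]}/{key[1]}: thread finished at {m['pct']}%")
        if m["n"]:
            seen[key].add(int(m["n"]))
            totals[key] = int(m["total"])
    for key, total in totals.items():
        for n in sorted(set(range(1, total + 1)) - seen[key]):
            problems.append(f"{key[0]}/{key[1]}: thread {n}/{total} never reported")
    return problems
```

Run on the log above, this would flag the two refs threads that finished at 0.0% and the third refs thread, which never printed a "finished" line.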
