The crawler fails to collect package info and produces broken frames #338

Open
niltonb opened this issue Oct 13, 2017 · 2 comments

@niltonb
Collaborator

niltonb commented Oct 13, 2017

Log Output

2017-10-13 05:56:15,483 MainProcess ERROR    Error crawling packages
Traceback (most recent call last):
  File "/crawler/utils/package_utils.py", line 162, in crawl_packages
    root_dir, dbpath, installed_since):
  File "/crawler/utils/package_utils.py", line 35, in get_dpkg_packages
    shell=False)
  File "/crawler/utils/misc.py", line 65, in subprocess_run
    (cmd, rc, err))
RuntimeError: (['dpkg-query', '-W', '--admindir=/var/lib/docker/overlay2/rootfs/var/lib/dpkg', '-f=${Package}|${Version}|${Architecture}|${Installed-Size}\n']) failed with rc=2: dpkg-query: error: failed to open package info file `/var/lib/docker/overlay2/rootfs/var/lib/dpkg/status' for reading: No such file or directory
@tatsuhirochiba
Contributor

I found out when this error happens. The failure occurs when both of the following conditions hold at the same time:

  1. The container is not based on an ubuntu/debian image (i.e. it is an app container).
  2. The crawler is run as python crawler/crawler.py --crawlmode OUTCONTAINER ....
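
For the first condition, one possible guard is to check that the dpkg status file exists under the container rootfs before calling dpkg-query. The following is only a minimal sketch under that assumption, not the crawler's actual code: `dpkg_status_path`, `parse_dpkg_output`, and `list_dpkg_packages` are hypothetical names, and the default `dbpath` is inferred from the path in the traceback. The dpkg-query arguments are the ones shown in the log above.

```python
import os
import subprocess

# Format string copied from the dpkg-query call shown in the traceback.
DPKG_FORMAT = "-f=${Package}|${Version}|${Architecture}|${Installed-Size}\n"


def dpkg_status_path(root_dir, dbpath="var/lib/dpkg"):
    """Return the path of the dpkg status file under a container rootfs."""
    return os.path.join(root_dir, dbpath, "status")


def parse_dpkg_output(output):
    """Parse 'name|version|arch|size' lines into a list of dicts."""
    packages = []
    for line in output.splitlines():
        if not line:
            continue
        name, version, arch, size = line.split("|", 3)
        packages.append(
            {"name": name, "version": version, "arch": arch, "size": size}
        )
    return packages


def list_dpkg_packages(root_dir, dbpath="var/lib/dpkg"):
    """Hypothetical helper: return [] when the rootfs has no dpkg database."""
    if not os.path.isfile(dpkg_status_path(root_dir, dbpath)):
        # Not a debian/ubuntu based rootfs, so skip dpkg-query entirely.
        return []
    cmd = [
        "dpkg-query",
        "-W",
        "--admindir=" + os.path.join(root_dir, dbpath),
        DPKG_FORMAT,
    ]
    output = subprocess.check_output(cmd, universal_newlines=True)
    return parse_dpkg_output(output)
```

With a guard like this, a rootfs without `var/lib/dpkg/status` would yield an empty package list instead of a RuntimeError. It does not explain why the error appears only with `python crawler/crawler.py`, which is the second condition.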

Test scenario

I prepared four test cases.

The containers are:

  • e040e9efc602 == app container
  • eadb333ab04b == debian based container

Example crawler run commands:

  • python crawler/crawler.py --crawlmode OUTCONTAINER --environment kubernetes --numprocesses 1 --features os,package,disk,config,file --logfile /tmp/testcase1.log --url file:///tmp/testcase1 --crawlContainers e040e9efc602
  • python crawler.py --crawlmode OUTCONTAINER --environment kubernetes --numprocesses 1 --features os,package,disk,config,file --logfile /tmp/testcase2.log --url file:///tmp/testcase2 --crawlContainers e040e9efc602

testcase 1: app container and python crawler/crawler.py ...

root@host:/# cat /tmp/testcase1.log
2017-10-17 12:35:08,223 MainProcess INFO     get_docker_container_rootfs_path: long_id=e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4, deriver=devicemapper, server_version=17.09.0-ce
2017-10-17 12:35:08,232 MainProcess INFO     setup_namespace_and_metadata: long_id=e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4
2017-10-17 12:35:08,802 MainProcess INFO     get_docker_container_rootfs_path: long_id=e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4, deriver=devicemapper, server_version=17.09.0-ce
2017-10-17 12:35:08,890 MainProcess ERROR    Error crawling packages
Traceback (most recent call last):
  File "/crawler/utils/package_utils.py", line 162, in crawl_packages
    root_dir, dbpath, installed_since):
  File "/crawler/utils/package_utils.py", line 35, in get_dpkg_packages
    shell=False)
  File "/crawler/utils/misc.py", line 65, in subprocess_run
    (cmd, rc, err))
RuntimeError: (['dpkg-query', '-W', '--admindir=/var/lib/docker/overlay2/rootfs/var/lib/dpkg', '-f=${Package}|${Version}|${Architecture}|${Installed-Size}\n']) failed with rc=2: dpkg-query: error: failed to open package info file `/var/lib/docker/overlay2/rootfs/var/lib/dpkg/status' for reading: No such file or directory

testcase 2: app container and python crawler.py ...

root@host:/# cat /tmp/testcase2.log
2017-10-17 12:35:37,465 MainProcess INFO     get_docker_container_rootfs_path: long_id=e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4, deriver=devicemapper, server_version=17.09.0-ce
2017-10-17 12:35:37,481 MainProcess INFO     setup_namespace_and_metadata: long_id=e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4
2017-10-17 12:35:37,674 MainProcess INFO     get_docker_container_rootfs_path: long_id=e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4, deriver=devicemapper, server_version=17.09.0-ce

testcase 3: debian based container and python crawler/crawler.py ...

root@host:/# cat /tmp/testcase3.log
2017-10-17 12:50:48,056 MainProcess INFO     get_docker_container_rootfs_path: long_id=eadb333ab04b837e94b6856fd0e5081ba14374440dd47405d39c856d38810c95, deriver=devicemapper, server_version=17.09.0-ce
2017-10-17 12:50:48,066 MainProcess INFO     setup_namespace_and_metadata: long_id=eadb333ab04b837e94b6856fd0e5081ba14374440dd47405d39c856d38810c95

testcase 4: debian based container and python crawler.py ...

root@host:/# cat /tmp/testcase4.log
2017-10-17 12:47:09,104 MainProcess INFO     get_docker_container_rootfs_path: long_id=eadb333ab04b837e94b6856fd0e5081ba14374440dd47405d39c856d38810c95, deriver=devicemapper, server_version=17.09.0-ce
2017-10-17 12:47:09,114 MainProcess INFO     setup_namespace_and_metadata: long_id=eadb333ab04b837e94b6856fd0e5081ba14374440dd47405d39c856d38810c95
2017-10-17 12:48:09,516 MainProcess ERROR    Timed out waiting for process 10053 to exit.

Check frames

I compared the two frames from test cases 1 and 2. The only differences are the timestamp and uuid in the metadata record and the uptime in the os record, so the frames are not broken. (Neither frame contains any package-related feature.)

root@host:/# diff /tmp/testcase1.e040e9efc602.0 /tmp/testcase2.e040e9efc602.0
1,2c1,2
< metadata	"metadata"	{"container_long_id":"e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4","features":"os,package,disk,config,file","emit_shortname":"e040e9efc602","timestamp":"2017-10-17T12:35:08+0000","docker_image_short_name":"heapster:v1.4.0","namespace":"kube-system/heapster-1395572904-8llls/eventer/e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4","docker_image_registry":"registry.ng.bluemix.net","owner_namespace":"mdelder","docker_image_tag":"v1.4.0","container_short_id":"e040e9efc602","system_type":"container","container_name":"k8s_eventer_heapster-1395572904-8llls_kube-system_80fff4fc-b212-11e7-a280-069d120f1ec2_0","container_image":"sha256:749531a6d2cf322bd8a35c95c25c6ad722ddeb66260ec8c1e03410cc7bd449aa","docker_image_long_name":"registry.ng.bluemix.net/mdelder/heapster:v1.4.0","uuid":"e1f37bd4-cd71-4760-af71-68040fee6a67"}
< os	"linux"	{"boottime":1505875457.0,"uptime":2368251.0,"ipaddr":["127.0.0.1","10.184.146.27"],"os":"unknown","os_version":"unknown","os_kernel":"unknown","architecture":"x86_64"}
---
> metadata	"metadata"	{"container_long_id":"e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4","features":"os,package,disk,config,file","emit_shortname":"e040e9efc602","timestamp":"2017-10-17T12:35:37+0000","docker_image_short_name":"heapster:v1.4.0","namespace":"kube-system/heapster-1395572904-8llls/eventer/e040e9efc6027acb6ba919ea647b9a9f52c0a6bd46efc8759d6965225747b7a4","docker_image_registry":"registry.ng.bluemix.net","owner_namespace":"mdelder","docker_image_tag":"v1.4.0","container_short_id":"e040e9efc602","system_type":"container","container_name":"k8s_eventer_heapster-1395572904-8llls_kube-system_80fff4fc-b212-11e7-a280-069d120f1ec2_0","container_image":"sha256:749531a6d2cf322bd8a35c95c25c6ad722ddeb66260ec8c1e03410cc7bd449aa","docker_image_long_name":"registry.ng.bluemix.net/mdelder/heapster:v1.4.0","uuid":"f059c1f7-9edc-4559-a404-f5130dbf3c69"}
> os	"linux"	{"boottime":1505875457.0,"uptime":2368280.0,"ipaddr":["127.0.0.1","10.184.146.27"],"os":"unknown","os_version":"unknown","os_kernel":"unknown","architecture":"x86_64"}
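
The same comparison can be scripted so that fields which change on every run are ignored. This is a rough sketch that assumes each frame line has three tab-separated columns (feature type, JSON-quoted key, JSON value), as the diff output above suggests. The function names and the `VOLATILE` table are made up for illustration and are not part of the crawler.

```python
import json

# Fields that differ between runs even when the frames are otherwise identical
# (taken from the diff above: timestamp/uuid in metadata, uptime in os).
VOLATILE = {"metadata": {"timestamp", "uuid"}, "os": {"uptime"}}


def parse_frame(text):
    """Parse frame text into {(feature_type, key): value}."""
    records = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        feature, key, value = line.split("\t", 2)
        records[(feature, json.loads(key))] = json.loads(value)
    return records


def stable_view(records):
    """Drop the volatile fields so two runs can be compared directly."""
    view = {}
    for (feature, key), value in records.items():
        drop = VOLATILE.get(feature, set())
        if isinstance(value, dict):
            value = {k: v for k, v in value.items() if k not in drop}
        view[(feature, key)] = value
    return view


def frames_equivalent(text_a, text_b):
    """True when both frames match after ignoring volatile fields."""
    return stable_view(parse_frame(text_a)) == stable_view(parse_frame(text_b))


def feature_types(text):
    """Return the set of feature types present in a frame."""
    return {feature for (feature, _key) in parse_frame(text)}
```

Calling `feature_types` on each frame also shows quickly whether a `package` record was emitted at all.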

@sahilsuneja1
Contributor

@tatsuhirochiba: could you please direct me to the specific images demonstrating this behaviour?
