CPU #81
Comments
Hello @katzeprior, this is not typical behaviour. Can you share what your installation is like? Did you compile it yourself or use one of the binary packages? Any modifications or configuration changes? Regards, Michel |
Hey @michel-stam, I followed https://github.com/RIPE-NCC/ripe-atlas-software-probe/blob/master/INSTALL.rst#to-create-a-deb-for-debian-or-debian-based-distros and installed it on a 1-core, 1 GB RAM VPS. I've also installed it on a Pi 3B, and that worked as intended. |
Hey @katzeprior, can you run ps axuw and tell me which process is pulling 100% CPU time? Maybe also look at /var/log/messages, /var/log/syslog, etc. Regards, Michel |
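To make the request above easier to act on, here is a minimal sketch of a helper that picks the busiest processes out of `ps axuw` output. The function name `top_cpu` is hypothetical and not part of the probe software. It assumes the usual `ps axuw` column layout, with %CPU in the third column.

```shell
#!/bin/sh
# top_cpu N: read `ps axuw` output on stdin and print the N rows with the
# highest %CPU (column 3), busiest first. The header row is skipped.
# Hypothetical helper, not part of ripe-atlas-software-probe.
top_cpu() {
    awk 'NR > 1' | LC_ALL=C sort -k3,3nr | head -n "${1:-5}"
}

# Typical use on the probe host:
#   ps axuw | top_cpu 5
```

Sorting in a pipeline rather than relying on a `ps` sort option keeps the sketch usable with `ps` implementations that lack one.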
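I'm also experiencing the same issue. The process using 100% CPU is rptaddrs:
/usr/local/atlas/bb-13.3/bin/rptaddrs -A 9104 -c /var/atlas-probe/data/new/v6addr.vol -O /var/atlas-probe/data/new/v6addr.txt
The probe is running in a VM with 1 core/1gb RAM |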
Hi Alessandro,
Would you be able to attach a strace output?
strace -ff -p <PID> -s 1500 >& out.log
Maybe I can derive where the system is stuck based on this.
Regards,
Michel
On 9 Jul 2023, at 15:30, Alessandro Verzicco <***@***.***> wrote:
I'm also experiencing the same issue.
The process using 100% CPU is rptaddrs:
/usr/local/atlas/bb-13.3/bin/rptaddrs -A 9104 -c /var/atlas-probe/data/new/v6addr.vol -O /var/atlas-probe/data/new/v6addr.txt
The probe is running in a VM with 1 core/1gb RAM
|
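Once a trace like the one requested above has been captured in out.log, a quick per-syscall count can hint at where the process spends its time. The following is only a sketch: the function name `syscall_histogram` is hypothetical, and it assumes the default strace line format (`name(args) = result`, optionally prefixed with `[pid N]`).

```shell
#!/bin/sh
# syscall_histogram FILE: count how often each syscall name appears in a
# strace log and print "COUNT NAME" lines, most frequent first.
# Hypothetical helper; assumes strace's default output format.
syscall_histogram() {
    # Strip an optional "[pid N] " prefix, then keep the name before "(".
    sed -E 's/^\[pid +[0-9]+\] +//' "$1" \
        | grep -oE '^[a-z_0-9]+\(' \
        | tr -d '(' \
        | LC_ALL=C sort | uniq -c \
        | LC_ALL=C sort -k1,1nr -k2,2 \
        | awk '{print $1, $2}'
}

# Typical use:
#   syscall_histogram out.log | head
```

Signal and exit lines (those starting with `---` or `+++`) do not match the pattern and are ignored.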
Also, @averzicco, can you tell me which platform you're running the probe on (OS, version, etc.)? Regards, Michel |
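sure, platform:
Linux 213327-f4 5.10.0-22-amd64 #1 SMP Debian 5.10.178-3 (2023-04-22) x86_64 GNU/Linux
debian 11.7
I think I figured out what is causing it: the v6addr.vol file is about 26 MB and rptaddrs takes a while to process it; it's not stuck. I suspect that the size of v6addr.vol is proportional to the number of routes in the routing table, and since this VPS is connected via BGP to a transit provider with the full IPv6 routing table, the number of routes is quite high:
ip -6 route | wc -l
182916
Is there a workaround to reduce how often the rptaddrs process runs, or to reduce the data it needs to process? |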
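For anyone wanting to check whether they are hitting the same condition, the comparison described above (size of the .vol file versus number of IPv6 routes) can be sketched as follows. The function name `route_report` is hypothetical. The file path is the one quoted in this thread and may differ on other installations. That the file size tracks the route count is the reporter's suspicion, not a confirmed fact.

```shell
#!/bin/sh
# route_report FILE: print the size of FILE in bytes and, when the `ip`
# tool is available, the number of IPv6 routes in the main table.
# Hypothetical helper, not part of ripe-atlas-software-probe.
route_report() {
    if [ -f "$1" ]; then
        printf 'file_bytes=%s\n' "$(wc -c < "$1" | tr -d ' ')"
    else
        printf 'file_bytes=missing\n'
    fi
    if command -v ip >/dev/null 2>&1; then
        printf 'v6_routes=%s\n' "$(ip -6 route 2>/dev/null | wc -l | tr -d ' ')"
    fi
}

# Typical use (path taken from the rptaddrs command line in this thread):
#   route_report /var/atlas-probe/data/new/v6addr.vol
```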
@averzicco You are probably right. For my software probe on a Pi at home it wasn't an issue, but on my BGP VPS (full IPv6 routing table) it also hogged the system. |
Hello Alessandro,
Interesting. I will have to discuss it internally. I'm not sure that you'd want a BGP router to mirror its routes into the Atlas backend; we have RIS for that kind of feed :)
What is the use case for having a BGP router double as probe?
To answer your question, there's no workaround for this, as we did not expect BGP routers to be probes as well.
I’ll get back when I have an update.
Cheers,
Michel
On 10 Jul 2023, at 19:40, Alessandro Verzicco <***@***.***> wrote:
sure, platform:
Linux 213327-f4 5.10.0-22-amd64 #1 SMP Debian 5.10.178-3 (2023-04-22) x86_64 GNU/Linux
debian 11.7
I think I figured out what is causing it: the v6addr.vol file is about 26 MB and rptaddrs takes a while to process it; it's not stuck. I suspect that the size of v6addr.vol is proportional to the number of routes in the routing table, and since this VPS is connected via BGP to a transit provider with the full IPv6 routing table, the number of routes is quite high:
ip -6 route | wc -l
182916
Is there a workaround to reduce how often the rptaddrs process runs, or to reduce the data it needs to process?
|
Some people run BGP on a small VPS, so it isn't just a router; it is a router and server in one. |
My goal is not to mirror my BGP router's routes to the Atlas backend, and I wasn't aware that the probe mirrors the routes to the Atlas backend. My use case is to host the Atlas probe in my network, and the BGP router is part of it. I understand that the software probe wasn't designed to run on BGP routers. I'll try to find a workaround; otherwise I'll host it somewhere else. |
Hi Alessandro,
This would be a workaround, but bird at least is able to put routes into different routing tables, which correspond to different kernel tables. This may be an option.
The intention of this behaviour is to collect debugging information in case there are problems with the probe connecting to the backend. However, I have no idea if the system actually uses this. This is why I need to talk to some of my colleagues; the history of this change predates my joining the RIPE NCC.
Based on that we could decide to amend or remove this behaviour.
Bear with me :)
Regards,
Michel
On 12 Jul 2023, at 17:15, Alessandro Verzicco <***@***.***> wrote:
Not sure that you’d be wanting a BGP router to mirror its routes into the Atlas backend, we have RIS for that kind of feeds :)
My goal is not to mirror my BGP router routes to the Atlas backend and I wasn't aware that the probe mirrors the routes on the Atlas backend.
My use case is to host the Atlas probe in my network and the BGP router is part of it
I understand that the software probe wasn't designed to run on BGP routers. I'll try to find a workaround otherwise I'll host it somewhere else.
|
Perfect!
Good work, can you share the change you made?
I’ll discuss internally because if more people do this it would generate data which I’m not sure has any benefit.
Cheers,
Michel
On 12 Jul 2023, at 23:25, Alessandro Verzicco <***@***.***> wrote:
For now, as a workaround, I've created a network namespace and configured the systemd service unit to use that namespace. This seems to be good enough to expose only a single route to the Atlas probe.
<https://user-images.githubusercontent.com/8068317/253104448-d58ea7cd-e441-4000-92ef-adbe93268140.png>
|
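sure, basically:
add a network namespace
add a veth interface peered to an interface in the network namespace
add IPs and default route to the interfaces created
enable ipv6 forwarding
configure systemd service to use that namespace
for example if the ipv6 prefix in the BGP announcement is 11:22:33::1/48:
ip netns add atlas
ip netns exec atlas ip link set dev lo up
ip link add veth0 type veth peer name v-atlas
ip link set v-atlas netns atlas
ip netns exec atlas ip a
ip a a dev veth0 11:22:33:100::1/56
ip link set veth0 up
ip netns exec atlas ip a a dev v-atlas 11:22:33:100::2/56
ip netns exec atlas ip route add default via 11:22:33:100::1
sysctl -w net.ipv6.conf.all.forwarding=1
in /etc/systemd/system/atlas.service
[Service]
NetworkNamespacePath=/run/netns/atlas
BindReadOnlyPaths=/etc/resolv.conf:/etc/resolv.conf:norbind
|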
Thanks Alessandro,
Good find :)
Cheers,
Michel
On 13 Jul 2023, at 19:23, Alessandro Verzicco <***@***.***> wrote:
sure, basically:
add a network namespace
add a veth interface peered to an interface in the network namespace
add IPs and default route to the interfaces created
enable ipv6 forwarding
configure systemd service to use that namespace
for example if the ipv6 prefix in the BGP announcement is 11:22:33::1/48:
ip netns add atlas
ip netns exec atlas ip link set dev lo up
ip link add veth0 type veth peer name v-atlas
ip link set v-atlas netns atlas
ip netns exec atlas ip a
ip a a dev veth0 11:22:33:100::1/56
ip link set veth0 up
ip netns exec atlas ip a a dev v-atlas 11:22:33:100::2/56
ip netns exec atlas ip route add default via 11:22:33:100::1
sysctl -w net.ipv6.conf.all.forwarding=1
in /etc/systemd/system/atlas.service
[Service]
NetworkNamespacePath=/run/netns/atlas
BindReadOnlyPaths=/etc/resolv.conf:/etc/resolv.conf:norbind
|
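The namespace workaround from this thread can be collected into one script. This is a sketch under stated assumptions, not a tested recipe. The names `run` and `setup_atlas_netns` are hypothetical. The explicit `ip link set v-atlas up` inside the namespace and the `-6` flag on the route command are additions that do not appear in the commands posted here. The script is dry-run by default: it prints the commands and only executes them when DRY_RUN=0 is set (which needs root).

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually apply them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# setup_atlas_netns HOST_ADDR NS_ADDR PREFIXLEN
# Creates namespace "atlas" with a veth pair (veth0 <-> v-atlas), using the
# interface and namespace names from the thread.
setup_atlas_netns() {
    host_addr=$1
    ns_addr=$2
    plen=$3
    run ip netns add atlas
    run ip netns exec atlas ip link set dev lo up
    run ip link add veth0 type veth peer name v-atlas
    run ip link set v-atlas netns atlas
    run ip addr add "$host_addr/$plen" dev veth0
    run ip link set veth0 up
    run ip netns exec atlas ip addr add "$ns_addr/$plen" dev v-atlas
    # Assumption: not in the posted commands. The namespace end of the veth
    # pair normally has to be brought up before a route can use it.
    run ip netns exec atlas ip link set v-atlas up
    run ip netns exec atlas ip -6 route add default via "$host_addr"
    run sysctl -w net.ipv6.conf.all.forwarding=1
}

# Example with the placeholder addresses used in the thread:
#   setup_atlas_netns 11:22:33:100::1 11:22:33:100::2 56
```

The systemd side stays as posted in the thread: the service unit points at the namespace with NetworkNamespacePath=/run/netns/atlas. Namespaces created this way do not survive a reboot, so the commands would have to be re-applied at boot before the probe service starts.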
Hi @averzicco, I've made an internal ticket out of it, in which I'll look at the default route, which is practically the only route the backend looks at, and possibly interface routes. Until then your workaround is probably the best approach. Will keep you posted. Cheers, Michel |
I'm currently experiencing the exact same problem. However, mine is 134 MB and also seems to contain the entire global IPv4 routing table as well as the IPv6 one. |
I installed the ripe-atlas-software-probe, but something uses 100% of the CPU in bursts. Is this a known issue, and is there a fix for it? It seems to be related to busybox rptaddrs.