
fix #1597 #1623

Open

andyxning wants to merge 8 commits into master

Conversation

andyxning

This fixes the problem reported in #1597 and in WebUI #128.

@naparuba
Contributor

Thanks, I'll take a look and see if we've got what we need for the regenerator tests, so we can add a failing one and apply this fix :)

@gst
Contributor

gst commented May 11, 2015

Otherwise, very good catch & job already :)

@andyxning
Author

@gst, I have fixed the problem you pointed out. You can have a look at the code, and if something is still wrong, please let me know.

@gst
Contributor

gst commented May 12, 2015

But first I'm going to try to reproduce the problem on my side... though I'm not completely sure that I have an exact/good understanding of it... we'll see :)

gst added a commit to shinken-monitoring/mod-livestatus that referenced this pull request May 12, 2015
With this the Livestatus was simply not usable at all within a scheduler :|

Seen while investigating on shinken-solutions/shinken#1623
@gst
Contributor

gst commented May 12, 2015

Damn...

The function modified here (load_from_scheduler) is only called, as far as I can see, by/within a Shinken scheduler daemon (when it has LiveStatus or WebUI enabled). But the problem is that it could never actually have been called, because there is/was a bug (the one I've just referenced: shinken-monitoring/mod-livestatus#64) within Livestatus preventing that function from ever being called... :|

Oh boy... have to untangle this kind of spaghetti situation... again :s

@mohierf
Contributor

mohierf commented May 12, 2015

+1 for you @andyxning

@andyxning
Author

@gst, I have removed all the setattr calls in favor of the instance.attribute = value pattern.

@gst
Contributor

gst commented May 13, 2015

Great @andyxning (that already looks slightly better, doesn't it? Hold on, there is something better still, see the bottom... ;))

But as I explained in my previous comment: I'm rather unsure that the proposed change will fix the memory consumption problem in/with the broker daemon, simply because the code/function involved here is only called (as far as I can see) if LiveStatus or WebUI are used in the scheduler, not in the broker...

Now, maybe this would or could also fix some similar things in the scheduler case... so this is to be confirmed... ;)

Now, back to this commit and this specific part within the function:

         self.notificationways = c.notificationways
         for notificationway in c.notificationways:
             self.notificationways[notificationway.id].regenerator_refs = 0 

This is completely equivalent to:

         self.notificationways = c.notificationways
         for nw in c.notificationways:
             nw.regenerator_refs = 0 

which is, I hope you agree ;), now clearly better than the previous, already slightly better, form... ;)
(The same can be done for the contacts just after, and also in manage_initial_contact_status_brok, with both lines changed; see the sketch below.)
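
Concretely, a sketch of what that could look like for the contacts (an assumption based on the snippet above, since only the notificationways part is quoted here; regenerator_refs is the counter introduced by this PR):

         # Hypothetical contacts counterpart of the notificationways snippet,
         # using direct attribute assignment instead of an id-indexed lookup.
         self.contacts = c.contacts
         for contact in c.contacts:
             contact.regenerator_refs = 0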

But again, I say that mostly for information/teaching purposes, because again: I'm unsure about all this, unfortunately :(

Though if it fixes some things regarding the memory problems we are hunting in your actual setup/configuration, then I'd be happy to see that.

I will only be able to get back to this next week.

@andyxning
Author

Thanks for the feedback. I will think about it and refactor this again.

To be clearer: I have not enabled the regenerator module for the Scheduler, so the logic in the load_from_scheduler method is not tested in our setup. I only assumed it would be the same as the situation in the Broker. :)

However, in my opinion, we have fixed the main cause of the memory leak. As shown in the code below:

        nws = c.notificationways
        safe_print("Got notif ways", nws)
        new_notifways = []
        for cnw in nws:
            new_notifways_flag = False
            nwname = cnw.notificationway_name
            nw = self.notificationways.find_by_name(nwname)
            if not nw:
                safe_print("Creating notif way", nwname)
                nw = NotificationWay([])
                new_notifways_flag = True
            # Now update it
            for prop in NotificationWay.properties:
                if hasattr(cnw, prop):
                    setattr(nw, prop, getattr(cnw, prop))

            # We must call self.notificationways.add_item after the
            # initial/update phase (the for loop above), because we need
            # 'notificationway_name' as the key to insert the
            # notificationway into self.notificationways.
            if new_notifways_flag:
                self.notificationways.add_item(nw)

            new_notifways.append(nw)

As the comment says, this fixes the problem of notificationways being duplicated on each reload.

All the other work is there to make sure that when some contacts are deleted, we can also delete the corresponding contacts and notificationways, which is currently not supported in the regenerator. At the moment the regenerator only supports host and service deletion; all other objects, contacts and notificationways, are only ever inserted or updated, never deleted. :)
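
To sketch what deletion support could look like (hypothetical, not the PR code: regenerator_refs is the counter discussed above, while the pruning helper and the remove_item call are assumptions about the container API):

    # Hypothetical pruning pass, run once all contacts have been relinked
    # after a reload: drop notificationways that nobody references anymore.
    def prune_unreferenced_notificationways(self):
        for nw in list(self.notificationways):
            if getattr(nw, 'regenerator_refs', 0) == 0:
                # remove_item is assumed here; the real Items API may differ.
                self.notificationways.remove_item(nw)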


@andyxning
Author

I have tested this PR some more in our environment, and it misbehaves in some situations.

I don't know why it throws a KeyError when deleting notificationways, but it is definitely true that we sometimes reach a state where self.contacts[contact.id] is not the same object as self.contacts[contact.contact_name]. So we changed the lookup criteria to find the specified contact or notificationway by contact_name and notificationway_name instead of by id, since ids may change between configuration reloads when contacts or notificationways are added or deleted.
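
In code, the lookup change amounts to something like this (a sketch; find_by_name is the method used in the snippet quoted earlier, the surrounding variable names are illustrative):

    # Resolve by stable name instead of by id: ids can be reassigned
    # between configuration reloads, names cannot.
    contact = self.contacts.find_by_name(contact_name)
    nw = self.notificationways.find_by_name(notificationway_name)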

@andyxning
Author

The linear memory growth is driving me crazy. With the patch in this pull request, I thought I had resolved the memory leak. However, after running the modified version for four days, memory is in fact still rising. That is very disappointing.
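
For reference, a minimal, generic way to see which object types keep accumulating across reloads (standard library only; not part of this PR):

    import gc
    from collections import Counter

    def snapshot_type_counts():
        # Count live objects per type name; diff two snapshots taken
        # before and after a reload to see what accumulates.
        return Counter(type(o).__name__ for o in gc.get_objects())

    before = snapshot_type_counts()
    # ... trigger a configuration reload here ...
    after = snapshot_type_counts()
    for name, delta in (after - before).most_common(10):
        print("%s +%d" % (name, delta))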

@naparuba
Contributor

Is the scheduler's memory rising in the same way or not? (e.g. with comment or downtime objects)


@andyxning
Author

Yes, the scheduler seems to have a memory leak too: with an initial memory usage of 1.7 GB, after some configuration reloads the memory usage reaches 7.2 GB.

However, the regenerator leak in the WebUI seems to have nothing to do with the memory leak in the scheduler, apart from the configuration broks sent by the scheduler.

Please help me get out of this. I have been working on it for more than a week.


@naparuba
Contributor

OK, so I think the problem is not in the regenerator or such things, nor in the broker part.

If the schedulers are impacted, it means the leak is upstream of the broker, and the broker is only affected as a consequence.

Can you try without the retention module? Does the memory grow immediately after the reload, or slowly over time? (The first points to retention or old-configuration management, the second to queues.)

Thanks,
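
One simple way to tell the two patterns apart is to log the daemon's resident memory around a reload, along these lines (a sketch, standard library only; the sampling interval is arbitrary):

    import resource
    import time

    def log_rss(tag):
        # Peak resident set size, in kB on Linux; for a monotonically
        # growing leak this tracks current usage closely.
        rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print("%s rss=%d kB" % (tag, rss_kb))

    log_rss("before reload")
    # ... trigger the configuration reload here ...
    log_rss("right after reload")  # big jump => retention / old-conf handling
    for i in range(12):
        time.sleep(300)
        log_rss("t+%d min" % (5 * (i + 1)))  # steady climb => queue growth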


@andyxning
Author

I have not yet tried the scheduler with the retention module disabled. I will give that a try.

I think you can reopen this issue so that more people see it, and maybe we can solve it more quickly. :)


@naparuba
Contributor

It's reopened.


@gst
Contributor

gst commented Dec 11, 2015

I may have some new results about this (about the memory leak, more exactly)...

@andyxning
Author

@gst, it has been a long time; back then I spent more than two weeks trying to figure this out. The PR may well not work any more, and I cannot remember exactly whether I worked it out at the time. So it is difficult for me to get back into this one; I have been away from it for more than four months.
