HPCC-31990 Add timeout to DNS lookups for soapcalls #18755
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-31990
0533bfb to a038362
@ghalliday initial thoughts?
I think I'll change to using a thread pool, so there will be no need for a reaper thread.
8a006f8 to 9550d0f
    return true;
}
{
    CriticalBlock block(queryDNSCS);
I think we could call pthread_cancel() here to help stop the thread ASAP, but we would still need to check for it and clean up in the reaper thread.
@mckellyln in general looks good. A few minor suggestions to clean up code and questions about timeouts. I didn't see any logic errors though.
system/jlib/jsocket.cpp
Outdated
class GetAddrInfoThread : public Thread
{
    char name[256];
Would it be better to use a std::string or a StringAttr here?
system/jlib/jsocket.cpp
Outdated
{
    char name[256];
    Semaphore semait;
    std::atomic<bool> started;
Slightly better to use initializers {false} rather than assign in the constructor
system/jlib/jsocket.cpp
Outdated
    Semaphore semait;
    std::atomic<bool> started;
    std::atomic<bool> ended;
    unsigned netaddr[4];
minor: an initializer = { 0, 0, 0, 0 } is probably better than the memset
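The two initializer suggestions above can be sketched together as follows. This is only an illustration: LookupState and its accessors are hypothetical stand-ins, not the class from the PR, and only the members under discussion are shown.

```cpp
#include <atomic>

// Hypothetical stand-in for the reviewed class; the name and the accessors
// are assumptions made for illustration only.
class LookupState
{
public:
    bool hasStarted() const { return started.load(); }
    bool hasEnded() const { return ended.load(); }
    unsigned addrWord(unsigned i) const { return netaddr[i]; }

private:
    // Default member initializers: every constructor starts from these values,
    // so nothing has to be assigned in a constructor body.
    std::atomic<bool> started{false};
    std::atomic<bool> ended{false};
    unsigned netaddr[4] = { 0, 0, 0, 0 };  // takes the place of a memset call
};
```

With the defaults declared next to the members, adding another constructor later cannot leave a flag or the address array uninitialized.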
system/jlib/jsocket.cpp
Outdated
bool thrdHasEnded()
{
    if (started)
question: What does the extra check on started provide?
system/jlib/jsocket.cpp
Outdated
bool waitms(unsigned timeoutms, unsigned *resAddr)
{
    bool ret = semait.wait(timeoutms);
    if (ret && ended)
redundant? I think ended must be true if the semaphore has signalled.
system/jlib/jsocket.cpp
Outdated
    std::atomic<bool> stopped;
    Semaphore sem;

    AddrInfoReaperThread() : stopped(true)
cleaner with false as the default? (and use an initializer)
system/jlib/jsocket.cpp
Outdated
join();
{
    CriticalBlock block(queryDNSCS);
    gaPtrList.clear();
This could mean there are threads still running if they have not yet failed. That could potentially cause a core at closedown, but I suspect the chance is very small.
system/jlib/jsocket.cpp
Outdated
while (iter != end)
{
    GetAddrInfoThread *pItem = *iter;
    if ( (pItem->thrdHasEnded()) && (pItem->join(20)) )
Is there any chance this could take a while, meaning the crit sec is held for longer than we would hope? If so, the threads might need copying to another list, then joining/releasing outside of the critical section.
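A minimal sketch of that suggestion, assuming standard-library types in place of the jlib ones: std::mutex stands in for queryDNSCS, a std::list of std::unique_ptr for gaPtrList, and Worker and reapFinished are hypothetical names. Finished entries are moved to a local list while the lock is held, and the potentially slow join happens only after the lock is released.

```cpp
#include <atomic>
#include <iterator>
#include <list>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical worker: runs a task on its own thread and flags completion.
struct Worker
{
    std::atomic<bool> ended{false};  // declared before 'thread' so it is initialized first
    std::thread thread;

    template <typename Fn>
    explicit Worker(Fn fn) : thread([this, fn] { fn(); ended = true; }) {}
};

std::mutex listLock;                               // stands in for queryDNSCS
std::list<std::unique_ptr<Worker>> activeWorkers;  // stands in for gaPtrList

// Joins and releases every worker that has finished; returns how many were reaped.
unsigned reapFinished()
{
    std::list<std::unique_ptr<Worker>> finished;
    {
        // Hold the lock only long enough to unlink the finished entries.
        std::lock_guard<std::mutex> block(listLock);
        for (auto iter = activeWorkers.begin(); iter != activeWorkers.end();)
        {
            auto next = std::next(iter);
            if ((*iter)->ended)
                finished.splice(finished.end(), activeWorkers, iter);
            iter = next;
        }
    }
    // The joins run without the lock, so new lookups are never blocked by them.
    unsigned count = 0;
    for (auto &worker : finished)
    {
        worker->thread.join();
        ++count;
    }
    return count;
}
```

Workers that are still running stay on the shared list and are picked up by a later pass.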
system/jlib/jsocket.cpp
Outdated
}
}
}
if (sem.wait(10))
10 seems quite low: the thread will be waking up 100 times a second. 1000 might be more appropriate.
system/jlib/jsocket.cpp
Outdated
static Owned<AddrInfoReaperThread> addrinforeaperthrd;

static bool queryDNSTimeout()
Could combine with queryKeepAlive to initialise all socket related options.
switching to a thread pool
Signed-off-by: M Kelly <[email protected]>
d5bb43e to b3a1885
Closing this PR; I will issue a new PR using a thread pool.
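For readers following the thread, the underlying idea of bounding a blocking lookup with a timeout can be sketched roughly as below. This is not the code from either PR and it does not use a thread pool: it spawns one detached std::thread per call, and LookupResult, Resolver and lookupWithTimeout are hypothetical names. The resolver is passed in as a callable, so nothing here depends on a real DNS API.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// State shared by caller and worker. The shared_ptr keeps it alive until both
// are done, so a worker that outlives a timed-out caller never touches freed memory.
struct LookupResult
{
    std::mutex lock;
    std::condition_variable done;
    bool finished = false;
    bool ok = false;
    std::string address;
};

// Hypothetical resolver signature: fills 'address' and returns true on success.
using Resolver = std::function<bool(const std::string &host, std::string &address)>;

// Returns true and fills 'address' only if resolve() succeeds within timeoutMs.
bool lookupWithTimeout(const Resolver &resolve, const std::string &host,
                       unsigned timeoutMs, std::string &address)
{
    auto result = std::make_shared<LookupResult>();
    std::thread([result, resolve, host]
    {
        std::string resolved;
        bool ok = resolve(host, resolved);  // may block for a long time
        std::lock_guard<std::mutex> block(result->lock);
        result->ok = ok;
        result->address = std::move(resolved);
        result->finished = true;
        result->done.notify_one();
    }).detach();

    std::unique_lock<std::mutex> block(result->lock);
    bool completed = result->done.wait_for(block, std::chrono::milliseconds(timeoutMs),
                                           [&] { return result->finished; });
    if (!completed || !result->ok)
        return false;  // on timeout the worker is abandoned and finishes on its own
    address = result->address;
    return true;
}
```

Because an abandoned worker only ever touches the shared state, no separate reaper is needed in this sketch; a pooled design would replace the per-call thread with a pool submission while keeping the same wait.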