HPCC-30156 Instrument CriticalSection #17706

Open: wants to merge 2 commits into base: master

Conversation

richardkchapman
Member

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:


inline void enter()
{
cycle_t start = get_cycles_now();
#ifdef CHECK_CS_CONTENTION
Member

move this after the getCurrentThreadId() call

holdStart = get_cycles_now();
waitCycles += holdStart-start;
#endif
depth++;
Member

move this before the get_cycles_now() - will be trivial
I'm not 100% sure this is correct code (is the compiler allowed to add 1 to the value of depth it got before the mutex_lock?)
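
A minimal sketch of the ordering suggested in the two comments above: take the lock, increment depth, and only then read the clock. It assumes std::recursive_mutex in place of the project's mutex and std::chrono::steady_clock in place of get_cycles_now(). The class TimedCriticalSection and its accessors are hypothetical names for illustration, not the PR's code.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the instrumented critical section under review.
class TimedCriticalSection
{
    using Clock = std::chrono::steady_clock;   // assumption: replaces get_cycles_now()
    std::recursive_mutex mutex;
    unsigned depth = 0;                        // only modified while the mutex is held
    Clock::time_point holdStart;
    std::int64_t waitNs = 0;                   // accumulated time spent waiting for the lock
public:
    void enter()
    {
        Clock::time_point start = Clock::now();
        mutex.lock();
        depth++;                               // trivial, so done before reading the clock
        holdStart = Clock::now();
        waitNs += std::chrono::duration_cast<std::chrono::nanoseconds>(holdStart - start).count();
    }
    void leave()
    {
        depth--;
        mutex.unlock();
    }
    unsigned queryDepth() const { return depth; }
    std::int64_t queryWaitNs() const { return waitNs; }
};

// Single-threaded check of the nesting bookkeeping.
inline unsigned nestedDepthDemo()
{
    TimedCriticalSection cs;
    cs.enter();
    cs.enter();
    unsigned seen = cs.queryDepth();           // read while the lock is held twice
    cs.leave();
    cs.leave();
    return seen;
}
```

Because depth is incremented only after mutex.lock() returns, the increment is never executed by a thread that does not own the lock; whether the real code has the same guarantee is the question raised above.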

inline void leave()
{
bool isContended = false;
depth--;
Member

This is a bit early if the contended test is based on it.

Member

Better to decrement just before the unlock, and check (depth == 1) here.
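
One way to read this suggestion, sketched under assumptions: leave() tests depth == 1 first, so it knows the call is the outermost release, and decrements only just before the unlock. The class HoldTimedSection, the hold-time bookkeeping, and the use of std::chrono::steady_clock are hypothetical; the thread does not show how the PR's contended test is implemented.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>

// Hypothetical recursive critical section that times the outermost hold.
class HoldTimedSection
{
    using Clock = std::chrono::steady_clock;
    std::recursive_mutex mutex;
    unsigned depth = 0;                 // only modified while the mutex is held
    Clock::time_point holdStart;
    std::int64_t holdNs = 0;            // accumulated time the lock was held
    unsigned outermostReleases = 0;
public:
    void enter()
    {
        mutex.lock();
        if (++depth == 1)
            holdStart = Clock::now();   // start timing on the outermost entry only
    }
    void leave()
    {
        if (depth == 1)                 // checked before the decrement: this call fully releases the lock
        {
            holdNs += std::chrono::duration_cast<std::chrono::nanoseconds>(Clock::now() - holdStart).count();
            outermostReleases++;
        }
        depth--;                        // decrement just before the unlock
        mutex.unlock();
    }
    unsigned queryOutermostReleases() const { return outermostReleases; }
    std::int64_t queryHoldNs() const { return holdNs; }
};

inline unsigned outermostReleaseDemo()
{
    HoldTimedSection cs;
    cs.enter();
    cs.enter();
    cs.leave();                         // nested release: depth 2 -> 1, not counted
    cs.leave();                         // outermost release: depth 1 -> 0, counted
    return cs.queryOutermostReleases();
}
```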

#ifdef _ASSERT_LOCK_SUPPORT
ThreadId owner = 0;
#endif
unsigned depth = 0;
Member

Slightly better if this is atomic - then the compiler can't magically move the code (and use fastInc(), fastDec()). Still get very inconsistent results though.


inline void enter()
{
cycle_t start = get_cycles_now();
Member

Should be inside the #ifdef TIME_CRITSECS

ghalliday requested a review from mckellyln on August 24, 2023 10:33
@ghalliday
Member

@mckellyln this is curious - the number of reported contentions is much lower than the real value.

I think the reason is that the overhead of the lock calls/returns is larger than the work being done in the critical section.

This may still be useful, but it suggests that quick critical sections will have significantly under-reported contention.
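
The thread does not show how the PR detects contention. For illustration only, the sketch below uses a different, common technique: attempt try_lock() first and count a contention whenever it fails. CountingSection and forcedContentionDemo are hypothetical names. The demo forces one contended entry by holding the lock until a second thread has been observed to fail its try_lock().

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical critical section that counts entries which had to wait.
class CountingSection
{
    std::mutex mutex;
    std::atomic<unsigned> contended{0};
public:
    void enter()
    {
        if (!mutex.try_lock())
        {
            // Another thread owned the lock at the moment we asked for it.
            contended.fetch_add(1, std::memory_order_relaxed);
            mutex.lock();
        }
    }
    void leave() { mutex.unlock(); }
    unsigned queryContended() const { return contended.load(std::memory_order_relaxed); }
};

inline unsigned forcedContentionDemo()
{
    CountingSection cs;
    cs.enter();                                  // this thread now holds the lock
    std::thread waiter([&cs] { cs.enter(); cs.leave(); });
    while (cs.queryContended() == 0)             // wait until the other thread has been blocked
        std::this_thread::yield();
    cs.leave();
    waiter.join();
    return cs.queryContended();
}
```

With this technique a contention is recorded only if the lock is owned at the instant of the try_lock() call, so a section that is held very briefly gives other threads little chance to observe it as busy.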

Contributor

mckellyln left a comment

I agree with Gavin's comments.
We may need to add compiler fences to ensure statements are not moved around?
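
As a sketch of what a compiler fence could look like here, the example below brackets the clock read with std::atomic_signal_fence, which restricts reordering by the compiler but emits no CPU barrier. FencedSection is a hypothetical name, std::chrono::steady_clock stands in for get_cycles_now(), and whether such fences are actually needed in the PR is the open question in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical section showing where compiler-only fences might be placed.
struct FencedSection
{
    std::mutex mutex;
    unsigned depth = 0;                                        // only modified while the mutex is held
    std::chrono::steady_clock::time_point holdStart;

    void enter()
    {
        mutex.lock();
        depth++;
        std::atomic_signal_fence(std::memory_order_seq_cst);  // compiler may not move memory accesses across this
        holdStart = std::chrono::steady_clock::now();
        std::atomic_signal_fence(std::memory_order_seq_cst);
    }
    void leave()
    {
        depth--;
        mutex.unlock();
    }
};

inline unsigned fencedDepthDemo()
{
    FencedSection cs;
    cs.enter();
    unsigned seen = cs.depth;                                  // read while the lock is held
    cs.leave();
    return seen;
}
```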

@mckellyln
Contributor

mckellyln commented Aug 31, 2023

Should we increase the thread count?
Or add a duty-cycle / spin to test3?
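
A rough sketch of what adding a duty cycle to a contention test could mean: each thread spins for a configurable number of iterations while holding the lock and again after releasing it, so the ratio of time inside to time outside can be varied along with the thread count. The function names and parameters are hypothetical and are not taken from test3.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Busy work that the optimiser cannot remove, because sink is volatile.
inline void spin(unsigned iterations)
{
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < iterations; i++)
        sink = sink + 1;
}

// Hypothetical test driver: returns the number of times the lock was taken.
inline unsigned long runDutyCycleTest(unsigned numThreads, unsigned loops,
                                      unsigned spinInside, unsigned spinOutside)
{
    std::mutex mutex;
    unsigned long counter = 0;                 // protected by mutex
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < numThreads; t++)
    {
        threads.emplace_back([&] {
            for (unsigned i = 0; i < loops; i++)
            {
                {
                    std::lock_guard<std::mutex> block(mutex);
                    counter++;
                    spin(spinInside);          // work done while holding the lock
                }
                spin(spinOutside);             // work done between acquisitions
            }
        });
    }
    for (auto &thread : threads)
        thread.join();
    return counter;
}
```

Raising spinInside relative to spinOutside increases the fraction of time the lock is held, which should make contention easier to observe.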

richardkchapman force-pushed the contendedLock branch 3 times, most recently from 65a8331 to 7bea2af, on September 6, 2023 15:18
richardkchapman marked this pull request as ready for review on May 1, 2024 09:48
}
}

thread_local CriticalBlockInstrumentation *__cbinst = nullptr;
Contributor

Where does this get created?

Member

ghalliday left a comment

Splitting out jtiming looks like a good change.
There are several changes for testing that probably need removing.
A few questions about class names etc.
Generally I think it looks good, and I think it is worth merging with a bit more clean up.

#define __glue(a,b) a ## b
#define glue(a,b) __glue(a,b)

extern thread_local CriticalBlockInstrumentation* __cbinst;
Member

Thread local variables generally generate terrible code. It would be better to keep this hidden within jmutex.cpp and provide an accessor function.

Member

I think this is used to pass an implicit parameter into the CriticalBlock object without having to change the prototype? If so it is worth adding a comment to explain.

Member Author

That is correct. I can add a comment.

Not really sure how an accessor function helps though - it makes the code even more obscure, and less efficient (though smaller), I would have thought.
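
For reference, the accessor-function alternative being discussed might look roughly like the sketch below. The thread_local pointer sits in an anonymous namespace (standing in for a variable private to jmutex.cpp), and callers reach it only through two functions. The struct body and the function names queryCurrentInstrumentation and setCurrentInstrumentation are hypothetical; the sketch does not settle which approach generates better code.

```cpp
#include <cassert>
#include <thread>

// Hypothetical minimal version of the instrumentation record.
struct CriticalBlockInstrumentation
{
    const char *location;
};

namespace
{
// Would live in the .cpp file, so no other translation unit touches the TLS variable directly.
thread_local CriticalBlockInstrumentation *currentInstrumentation = nullptr;
}

// Accessors that would be declared in the header instead of the extern thread_local.
CriticalBlockInstrumentation *queryCurrentInstrumentation() { return currentInstrumentation; }
void setCurrentInstrumentation(CriticalBlockInstrumentation *inst) { currentInstrumentation = inst; }

inline bool accessorDemo()
{
    CriticalBlockInstrumentation inst{"demo.cpp(10)"};
    setCurrentInstrumentation(&inst);
    bool sameThread = (queryCurrentInstrumentation() == &inst);

    // Each thread has its own copy, so a new thread starts with nullptr.
    bool otherThreadEmpty = false;
    std::thread other([&otherThreadEmpty] {
        otherThreadEmpty = (queryCurrentInstrumentation() == nullptr);
    });
    other.join();

    setCurrentInstrumentation(nullptr);
    return sameThread && otherThreadEmpty;
}
```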

extern thread_local CriticalBlockInstrumentation* __cbinst;

typedef InstrumentedCriticalSection CriticalSection;
typedef InstrumentedCriticalBlock ICriticalBlock;
Member

I am not clear why you would use ICriticalBlock rather than the other classes. Would it make sense for InstrumentedCriticalSection to be defined only if the option was selected, and otherwise be typedefed to the uninstrumented version?

ICriticalBlock suggests an interface, which this isn't.

Member Author

richardkchapman, May 15, 2024

The idea, I think, is that you can typedef CriticalBlock to InstrumentedCriticalBlock if you want ALL blocks instrumented, or use InstrumentedCriticalBlock explicitly if you want to investigate the behaviour of individual blocks.

I can add some comments to help explain that.

Member Author

richardkchapman, May 15, 2024

The reason for ICriticalBlock is that when CriticalBlock is a macro, you can't use it without a parameter. There are some instances where that is necessary. Not sure what the best name for it is.

@@ -68,7 +68,7 @@ option(SKIP_ECLWATCH "Skip building ECL Watch" OFF)
option(USE_ADDRESS_SANITIZER "Use address sanitizer to spot leaks" OFF)
option(INSTALL_VCPKG_CATALOG "Install vcpkg-catalog.txt" ON)
option(PORTALURL "Set url to hpccsystems portal download page")
-option(PROFILING "Set to true if planning to profile so stacks are informative" OFF)
+option(PROFILING "Set to true if planning to profile so stacks are informative" ON)
Member

We need a separate flag to control the new functionality. (I set this on in my release builds by default.)

Member Author

I think I intended the option to be controlled via editing the definition of USE_INSTRUMENTED_CRITSECS in jmutex.hpp - it's something that will be set temporarily while investigating issues rather than left on.
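
A simplified sketch of the switch described in this thread, under stated assumptions: a define named USE_INSTRUMENTED_CRITSECS selects whether CriticalBlock aliases the instrumented or the plain block, while InstrumentedCriticalBlock stays available for instrumenting individual blocks. The class bodies, PlainCriticalBlock, and the counter are hypothetical; the real classes in jmutex.hpp are not shown here and may be structured differently (the thread mentions CriticalBlock being a macro).

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Assumption: mirrors the USE_INSTRUMENTED_CRITSECS define mentioned in the thread.
// Comment this line out to leave ordinary blocks uninstrumented.
#define USE_INSTRUMENTED_CRITSECS

std::atomic<unsigned> instrumentedEntries{0};   // hypothetical statistic

// Hypothetical uninstrumented scoped lock.
class PlainCriticalBlock
{
    std::mutex &mutex;
public:
    explicit PlainCriticalBlock(std::mutex &m) : mutex(m) { mutex.lock(); }
    ~PlainCriticalBlock() { mutex.unlock(); }
};

// Hypothetical instrumented scoped lock: counts every entry.
class InstrumentedCriticalBlock
{
    std::mutex &mutex;
public:
    explicit InstrumentedCriticalBlock(std::mutex &m) : mutex(m)
    {
        mutex.lock();
        instrumentedEntries.fetch_add(1, std::memory_order_relaxed);
    }
    ~InstrumentedCriticalBlock() { mutex.unlock(); }
};

#ifdef USE_INSTRUMENTED_CRITSECS
using CriticalBlock = InstrumentedCriticalBlock;   // instrument ALL blocks
#else
using CriticalBlock = PlainCriticalBlock;          // instrument only explicit uses
#endif

inline unsigned aliasDemo()
{
    std::mutex mutex;
    {
        CriticalBlock block(mutex);                // instrumented only when the define is set
    }
    {
        InstrumentedCriticalBlock block(mutex);    // always instrumented
    }
    return instrumentedEntries.load();
}
```

With the define set, both blocks are counted; with it removed, only the explicit InstrumentedCriticalBlock is.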

Rename ICriticalBlock
Clean up some unwanted changes
Add some comments

Signed-off-by: Richard Chapman <[email protected]>
@ghalliday
Member

@richardkchapman I just found this branch invaluable for diagnosing a completely confusing set of timings for the stress test example. Please can you rebase and squash, and I will re-review and aim to merge.

@ghalliday
Member

dfuserver is currently crashing on startup with this branch enabled.
I am also struggling to get the output from roxie - I am having to run it as a standalone program currently.

@ghalliday
Member

And dali locks up on closedown.
