Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failed #35

Open
armintoepfer opened this issue Mar 31, 2022 · 12 comments
Open

Assertion failed #35

armintoepfer opened this issue Mar 31, 2022 · 12 comments

Comments

@armintoepfer
Copy link

Hi!

I've been using SPSCQueue for some time and just updated to the latest version and now hit this:

SPSCQueue.hpp:188: void Rigtorp::SPSCQueue<T, Allocator>::pop() [with T = PacBio::CCS::Chunk; Allocator = std::allocator<PacBio::CCS::Chunk>]: Assertion `writeIdx_.load(std::memory_order_acquire) != readIdx' failed.

Any advice? Thank you!

@galchinsky
Copy link

To reproduce:

#include <complex>
#include <array>
#include "SPSCQueue.h"
int main() {
    rigtorp::SPSCQueue< int > queue(16);
    for (int j = 0; j < 6; ++j) {
        queue.emplace(0);
    }
    for (int j = 0; j < 3; ++j) {
        queue.pop();
    }
}

If I dump indexes after each operation this way

template<typename T>
void dump(T& q) {
    std::cout << q.writeIdx_ << " " << q.writeIdxCache_ << " " << q.readIdx_ << " " << q.readIdxCache_ << std::endl;
}

It is clear this is caused by the diverged non-atomic caches and the real values:

1 0 0 0 # push
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
6 0 1 0 # pop
6 0 2 0
6 0 3 0
6 0 3 0

If ensure the equality between the caches and the indexes in the end of the pop call, the problem disappears. Sure it's just a dirty fix because I wouldn't like to dig into the logic.

    readIdxCache_ = nextReadIdx;
    writeIdxCache_ = writeIdx_;

@Philippe91
Copy link

@galchinsky On which CPU are you running? For example, Apple Silicon CPU behaves a bit differently than Intel, as I could experiment (SPSCQueue was not involved).

@galchinsky
Copy link

@Philippe91 it's a simple single-threaded example with a few push and a few pop operations, I don't think CPU can make a difference here.

@usefulcat
Copy link

I submitted a PR to fix this, see PR "Bug fix for #35".

@Philippe91
Copy link

I submitted a PR to fix this, see PR "Bug fix for #35".

This solves the problem, apparently.

This being said, the logic behind the cached indexes system introduced by @rigtorp is challenging to grasp in my mind.
It gives me an insecure feeling, reinforced by the presence of this bug, introduced 18 months ago, and to which the author has never reacted since it was reported ten months ago.

I am considering reverting to the implementation prior 3a0507a.

@galchinsky
Copy link

I just moved to Facebook's queue, because who knows are there any other bugs like that

@jz315
Copy link

jz315 commented Apr 5, 2023

I have the same problem.

#include <iostream>
#include "SPSCQueue.h"
using namespace rigtorp;
int main(int argc, char* argv[]) {
    SPSCQueue<int> q(10000);
    
    q.push(1);
    q.pop();
    
    return 0;
}

Output: Assertion failed: writeIdx_.load(std::memory_order_acquire) != readIdx, file D:\Programs\Test\test1\SPSCQueue.h, line 184

If add q.front(); before q.pop();, the code will run properly.

@jz315
Copy link

jz315 commented Apr 5, 2023

And function front() has some bug.

RIGTORP_NODISCARD T* front() noexcept {
            auto const readIdx = readIdx_.load(std::memory_order_relaxed);
            if (readIdx == writeIdxCache_) {
                writeIdxCache_ = writeIdx_.load(std::memory_order_acquire);
                if (writeIdxCache_ == readIdx) {
                    return nullptr;
                }
            }
            return &slots_[readIdx + kPadding];
        }

If you run the pop() function first, readIdx_ will change from 0 to 1, while the initial value of writeIdxCache_ is 0, so they will be unequal and writeIdxCache_ will not update until readIdx_ reaches the end of the array and becomes 0 again. During this time, the front() function will always return an object regardless of whether the queue is empty.

@vetribox
Copy link

vetribox commented May 4, 2023

It's just an assertion fail in the class destructor because writeIdxCache_ value doesn't reflect writeIdx_ as it is only updated in the *front() member function when writeIdxCache_ == readIdx_.load(std::memory_order_relaxed) and only if it is so writeIdxCache_ gets updated, hence the fault. The sane fix is to check if size() > 0 in the class destructor loop.

Because you weren't supposed to call *front() when the size() is zero and that's what the class destructor does, it failed to check bounds as it should be.

@rigtorp
Copy link
Owner

rigtorp commented Sep 25, 2023

There seems to be a misunderstanding of the pre-conditions of calling pop(). I updated the documentation here: 4066113

This is a concurrent queue intended to be used by two threads. In that case the only way for the reader thread to know if there is more data to read is by calling front(). Only once it returns a non-nullptr can you call pop().

This code invokes undefined behavior (and triggers the assert):

SPSCQueue<int> q(10000);
q.push(1);
q.pop();

This code correctly checks that the queue is not empty:

SPSCQueue<int> q(10000);
q.push(1);
if (q.front()) {
   q.pop();
}

Since you are not reading front(), you are not communicating any data to the other thread, so I don't understand why you even need the queue?

@galchinsky
Copy link

I don't remember the details (even where the assertion got triggered), but here are 3 chunks of code I had before moving to the facebook's queue, and checking for front after checking for emptiness or size doesn't seem obvious to me.

            while (!queue.empty()) {
                lapper.put_sample(*queue.front());
                queue.pop();
            }

another one

           while (queue.size() > 4) {
                queue.pop();
            }

and

            while (queue.front()) {
                DBOUT("$" << queue.writeIdx_ << " " << queue.writeIdxCache_ << " " << queue.readIdx_ << " " << queue.readIdxCache_);
                queue.pop();
            }

Apparently, given there are debug outputs, it was the last one that triggered the assertion.
Pushing was being done this way, in a separate thread:

               bool success = queue.try_push(workbuf);
                if (success) {
                    last_timestamp = buf.timestamp;
                }

@rigtorp
Copy link
Owner

rigtorp commented Sep 26, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants