
Issue: submitting interleaved or sequential data (chunkSize != 1) to DataThread/DataBuffer #639

Open · wants to merge 1 commit into development
Conversation

@kevkdev commented Dec 6, 2024

Resolution to thread: #556

The DataBuffer::addToBuffer method, as written, only copies data correctly if chunkSize = 1.

Fixes included:

  1. Allow any chunkSize that evenly divides numItems (any other value wouldn't make sense).
  2. Handle the case where a chunkSize copy is split between the end and the start of the circular buffer (see the sketch after this list).
  3. Optimize the copying of contiguous data, in particular for the timestamps, sequenceNumbers, and eventCodes vectors.
  4. Clarify the behavior of addToBuffer, with examples, in the method's header comment.
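
As a rough illustration of item 2, here is a minimal sketch of a chunk copy that straddles the wrap point of a circular buffer. The names (writeChunk, buffer, bufferSize, writeIndex) are hypothetical, not the actual DataBuffer members:

  // Minimal sketch (hypothetical names): copy one chunk of `chunkSize`
  // samples into a circular buffer, splitting the copy when the chunk
  // straddles the physical end of the buffer.
  #include <algorithm>
  #include <cstring>

  void writeChunk(float* buffer, int bufferSize, int& writeIndex,
                  const float* chunk, int chunkSize)
  {
    // Samples that fit before the physical end of the buffer.
    const int firstPart = std::min(chunkSize, bufferSize - writeIndex);
    std::memcpy(buffer + writeIndex, chunk, firstPart * sizeof(float));

    // Wrap the remainder around to the start of the buffer, if any.
    const int secondPart = chunkSize - firstPart;
    if (secondPart > 0)
      std::memcpy(buffer, chunk + firstPart, secondPart * sizeof(float));

    writeIndex = (writeIndex + chunkSize) % bufferSize;
  }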

Testing was performed by feeding a sine wave across multiple channels from a DataThread::updateBuffer method:

  static int cntr{0};
  static int index{0};
  static int groups[] = {1, 2, 5, 10};

  // resize (not reserve), so writing through data() is well-defined
  std::vector<float> raw_samples(NUMBER_CHANNELS * MAX_SAMPLES_PER_CHANNEL);

  // cycle through the chunk sizes every 3 seconds
  if (cntr++ == mSampleRate * 3) {
    if (++index == 4) {
      index = 0;
    }
    cntr = 0;
  }

  float* buf = raw_samples.data();
  auto grouping{groups[index]};

  // write in chunk-interleaved order: for each chunk of `grouping`
  // samples, every channel in turn
  for (auto i = 0; i < MAX_SAMPLES_PER_CHANNEL; i += grouping) {

    for (int chan = 0; chan < NUMBER_CHANNELS; ++chan) {
      for (auto j = 0; j < grouping; j++) {
        *buf++ = ::sinf(mSample[chan]) * 40;
        // advance channel `chan` at (chan + 1) Hz; the parentheses matter,
        // since mSampleRate * chan + 1 would bind as (mSampleRate * chan) + 1
        mSample[chan] += static_cast<float>(M_PI) * 2.0f / (mSampleRate * (chan + 1));
      }
    }
  }

  mDataBuffer->addToBuffer(raw_samples.data(), sample_numbers, timestamps,
    event_codes, MAX_SAMPLES_PER_CHANNEL, grouping);

MAX_SAMPLES_PER_CHANNEL was set to 10 and the buffer size to 1024. Since 1024 is not a multiple of 10, the write position eventually lands where a chunkSize copy must be split across the circular buffer boundary.

PS - Although I've been developing code for over 30 years, this is my first contribution to open source. If I've made a faux pas anywhere in this process, please grant some grace and let it be a teaching moment. Thanks for considering this update to your project.

@anjaldoshi (Member) commented Dec 6, 2024

Hi @kevkdev,

Thanks for submitting this PR! Although your fixes sound helpful, we’ve already taken care of this on our end. The change we made is in this commit: 7dd337d. This change is currently only available in v1.0.0 of the GUI, which is in preview. We plan to release the stable version soon.

Our modifications to the DataBuffer::addToBuffer method now employ a channel-major layout, which enhances data storage and processing. The key changes include:

  • Removal of the chunkSize parameter, simplifying data handling.
  • Updates to addToBuffer that copy data in contiguous blocks. AbstractFifo simplifies the circular buffering of multi-channel data, as well as of timestamps, sample numbers, and event codes (see the sketch after this list).
  • Reorganization of the buffer internals to improve memory use and computational performance.
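
As a rough sketch of the block-based approach described above (illustrative names only, not the actual GUI code): juce::AbstractFifo hands back at most two contiguous regions per write, so each channel-major block needs at most two memcpy calls.

  // Rough sketch of block-based circular writing with juce::AbstractFifo.
  // `sampleFifo`, `channelData`, and `writeSamples` are hypothetical names.
  #include <cstring>
  #include <vector>
  #include <juce_core/juce_core.h>

  juce::AbstractFifo sampleFifo{1024};
  std::vector<float> channelData(1024); // one channel's ring storage

  void writeSamples(const float* src, int numSamples)
  {
    int start1, size1, start2, size2;
    sampleFifo.prepareToWrite(numSamples, start1, size1, start2, size2);

    // AbstractFifo returns up to two contiguous regions, so the block
    // is copied with at most two memcpy calls.
    std::memcpy(channelData.data() + start1, src, size1 * sizeof(float));
    if (size2 > 0)
      std::memcpy(channelData.data() + start2, src + size1, size2 * sizeof(float));

    sampleFifo.finishedWrite(size1 + size2);
  }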

The refactor improves clarity and speed by copying block by block rather than sample by sample. Since it requires the data sent to the DataBuffer to be in channel-major format, which would break existing DataThread plugins, we cannot backport it.

@kevkdev (Author) commented Dec 10, 2024

I need to pull the updated branch, but I believe the code still has one of the issues I pointed out above: when the chunk size does not evenly divide the buffer size, the copy method drops part of the data. If I get a chance in the next few days, I will verify.

Also, sadly, the data I receive from a remote entity arrives with the channels interleaved (1 2 3 1 2 3) rather than blocked (1 1 1 2 2 2 3 3 3). With the change you propose, an additional copy is needed to convert the interleaved data to the blocked layout.

A solution like the one above wouldn't limit the incoming data layout: it would accept either of the two layouts above, or variants (e.g. 1 1 2 2 3 3 1 1 2 2 3 3), while preserving the original API for this method and minimizing the number of copies.
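
For illustration, a generic de-interleaving pass along these lines (all names hypothetical, not the PR's code) would cover interleaved input (chunkSize = 1), blocked input (chunkSize = numItems), and the variants in between:

  // Hypothetical sketch: convert chunk-interleaved input to channel-major
  // (blocked) output. chunkSize == 1 is fully interleaved (1 2 3 1 2 3);
  // chunkSize == numItems is already blocked (1 1 1 2 2 2 3 3 3).
  void deinterleave(const float* src, float* dst,
                    int numChannels, int numItems, int chunkSize)
  {
    const int numChunks = numItems / chunkSize; // assumes chunkSize divides numItems

    for (int chunk = 0; chunk < numChunks; ++chunk)
      for (int chan = 0; chan < numChannels; ++chan)
        for (int s = 0; s < chunkSize; ++s)
          dst[chan * numItems + chunk * chunkSize + s] = *src++;
  }

With a single chunkSize parameter like this, the caller pays for at most one conversion copy, regardless of which layout the remote source delivers.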

Just something to consider.
