Dump load #15
base: main
@@ -1 +1,2 @@
 require_relative "bitarray/bit_array"
+require_relative "bitarray/bit_array_file"

@@ -2,7 +2,8 @@ class BitArray
   attr_reader :field, :reverse_byte, :size
   include Enumerable

-  VERSION = "1.3.0"
+  VERSION = "1.4.0"
+  HEADER_LENGTH = 8 + 1 # QC (@size, @reverse_byte)
Review comment: We don't use this here, but need it for

   def initialize(size, field = nil, reverse_byte: true)
     @size = size
@@ -24,6 +25,26 @@ def [](position)
     (@field.getbyte(position >> 3) & (1 << (byte_position(position) % 8))) > 0 ? 1 : 0
   end

+  def ==(rhs)
+    @size == rhs.size && @reverse_byte == rhs.reverse_byte && @field == rhs.field
+  end
Review comment: Added the ability to compare two
+
+  # Allows joining (union) two bitarrays of identical size.
+  # The resulting bitarray will contain any bit set in either constituent arrays.
+  # |= is implicitly defined, so you can do source_ba |= other_ba
+  def |(rhs)
+    raise ArgumentError.new("Bitarray sizes must be identical") if @size != rhs.size
+    raise ArgumentError.new("Reverse byte settings must be identical") if @reverse_byte != rhs.reverse_byte
+
+    combined = BitArray.new(@size, @field, reverse_byte: @reverse_byte)
+    rhs.field.each_byte.inject(0) do |byte_pos, byte|
+      combined.field.setbyte(byte_pos, combined.field.getbyte(byte_pos) | byte)
+      byte_pos + 1
+    end
+
+    combined
+  end
Review comment: I wrote this because it allows parallelising bloom filter creation. I don't personally need this method, but figured it might be useful for others.
+
   # Iterate over each bit
   def each
     return to_enum(:each) unless block_given?
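
To illustrate the parallel use case mentioned for `|`: each worker can fill its own bit field, and the fields are then merged with a bytewise OR. The sketch below is a minimal stand-alone version that works on plain binary strings; `union_bytes`, `worker_a`, and `worker_b` are hypothetical names, not part of this PR. It copies the left-hand buffer before merging, because the hunk above passes `@field` directly to `BitArray.new` and this excerpt does not show whether the constructor copies it.

```ruby
# Hypothetical helper (not part of the PR): OR two equal-length binary
# strings byte by byte and return a new string, leaving both inputs untouched.
def union_bytes(left, right)
  raise ArgumentError, "Byte strings must be the same length" if left.bytesize != right.bytesize

  combined = left.b # String#b returns a binary-encoded copy
  right.each_byte.with_index do |byte, index|
    combined.setbyte(index, combined.getbyte(index) | byte)
  end
  combined
end

# Two workers each fill their own 2-byte field; the results are merged afterwards.
worker_a = "\x01\x00".b
worker_b = "\x80\x10".b
merged = union_bytes(worker_a, worker_b)
```

If the real constructor keeps a reference to the string it is given, the same explicit copy (for example `@field.dup`) would keep `a | b` from modifying `a`.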
@@ -55,4 +76,18 @@ def total_set
   private def byte_position(position)
     @reverse_byte ? position : 7 - position
   end
+
Review comment: The following two methods,
+  # Save contents to an io device such as a file
+  def dump(io)
+    io.write([@size, @reverse_byte ? 1 : 0].pack("QC"))
+    io.write(@field.b)
+    io
+  end
+
+  # Load bitarray from an io device such as a file
+  def self.load(io)
+    size, reverse_byte = io.read(9).unpack("QC")
+    field = io.read
+    new(size, field, reverse_byte: reverse_byte == 1)
+  end
 end
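
The on-disk layout used by `dump` and `load` is a 9-byte header (`Q`: 64-bit unsigned size, `C`: one byte for the `reverse_byte` flag) followed by the raw field bytes. A minimal round-trip sketch of that layout, using `StringIO` in place of a file; `write_dump` and `read_dump` are hypothetical stand-ins for the two methods, and the field bytes are made up for illustration:

```ruby
require "stringio"

# Hypothetical stand-in for BitArray#dump: header first, then the raw field.
def write_dump(io, size, reverse_byte, field)
  io.write([size, reverse_byte ? 1 : 0].pack("QC")) # 8 bytes + 1 byte
  io.write(field.b)
  io
end

# Hypothetical stand-in for BitArray.load: returns [size, reverse_byte, field].
def read_dump(io)
  size, flag = io.read(9).unpack("QC")
  [size, flag == 1, io.read]
end

io = StringIO.new(String.new) # empty, writable binary buffer
write_dump(io, 35, true, "\xE2\x04\x01\x00\x02".b)
io.rewind
size, reverse_byte, field = read_dump(io)
```

Note that `Q` uses the machine's native byte order, so a dump written this way is only guaranteed to load on a machine with the same endianness; `Q<` or `Q>` would pin the order, at the cost of changing the format.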

@@ -0,0 +1,38 @@
+require_relative "bit_array"
+
+# Read-only access to a BitArray dumped to disk.
+# This is considerably slower than using the RAM-based BitArray, but
+# avoids the memory requirements and initial setup time.
+class BitArrayFile
+  HEADER_LENGTH = BitArray::HEADER_LENGTH
+
+  attr_reader :io, :reverse_byte, :size
+
+  def initialize(filename: nil, io: nil)
+    if io
+      @io = io
+    elsif filename
+      @io = File.open(filename, "r")
+    else
+      raise ArgumentError.new("Must specify a filename or io argument")
+    end
+
+    @io.seek(0)
+    @size, @reverse_byte = @io.read(9).unpack("QC")
+    @reverse_byte = @reverse_byte != 0
+  end
+
+  # Read a bit (1/0)
+  def [](position)
+    seek_to(position >> 3)
+    (@io.getbyte & (1 << (byte_position(position) % 8))) > 0 ? 1 : 0
+  end
+
+  private def byte_position(position)
+    @reverse_byte ? position : 7 - position
+  end
+
+  private def seek_to(position)
+    @io.seek(position + HEADER_LENGTH)
+  end
+end
Review comment: It's actually fairly easy to implement the rest of
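
The lookup in `BitArrayFile#[]` comes down to two steps: seek to byte `HEADER_LENGTH + (position >> 3)`, then mask out the bit selected by `byte_position(position) % 8`. A stand-alone sketch of that arithmetic, assuming the header layout shown in `dump`; `read_bit` is a hypothetical helper, and the field bytes were chosen so that bits 1, 5, 6, 7, 10, 16 and 33 read as set under this formula:

```ruby
require "stringio"

# Hypothetical stand-alone reader mirroring the lookup in BitArrayFile#[].
def read_bit(io, position, reverse_byte: true, header_length: 9)
  io.seek(header_length + (position >> 3)) # skip the header, jump to the byte
  bit_index = (reverse_byte ? position : 7 - position) % 8
  (io.getbyte & (1 << bit_index)) > 0 ? 1 : 0
end

header = [35, 1].pack("QC") # size 35, reverse_byte flag set
io = StringIO.new(header + "\xE2\x04\x01\x00\x02".b)

# Collect every position whose bit reads as 1.
set_bits = (0...35).select { |i| read_bit(io, i) == 1 }
```

Like the class in the diff, this sketch does no bounds checking: `getbyte` returns `nil` past the end of the data, so a position beyond the stored field would raise `NoMethodError`.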

@@ -0,0 +1,65 @@
+require "minitest/autorun"
Review comment: Note that I'm more familiar with RSpec. Please excuse me if these tests aren't how most people would write
Reply: In a way, Minitest is to RSpec as Sinatra is to Rails. Minitest is far less opinionated and more freeform, which has upsides and downsides, but a big upside is that pretty much any approach that works and fits into a project's own style is all good! :)
+require "tempfile"
+require_relative "../lib/bitarray"
+
+class TestBitArrayFile < Minitest::Test
+  def setup
+    ba = BitArray.new(35)
+    [1, 5, 6, 7, 10, 16, 33].each { |i| ba[i] = 1}
+    @file = Tempfile.new("bit_array_file.dat")
+    ba.dump(@file)
+    @file.rewind
+  end
+
+  def teardown
+    @file.close
+    @file.unlink
+  end
+
+  def test_from_filename
+    baf = BitArrayFile.new(filename: @file.path)
+    for i in 0...35
+      expected = [1, 5, 6, 7, 10, 16, 33].include?(i) ? 1 : 0
+      assert_equal expected, baf[i]
+    end
+  end
+
+  def test_from_io
+    baf = BitArrayFile.new(io: @file)
+    for i in 0...35
+      expected = [1, 5, 6, 7, 10, 16, 33].include?(i) ? 1 : 0
+      assert_equal expected, baf[i]
+    end
+  end
+end
+
+class TestBitArrayFileWhenNonReversedByte < Minitest::Test
+  def setup
+    ba = BitArray.new(35, reverse_byte: false)
+    [1, 5, 6, 7, 10, 16, 33].each { |i| ba[i] = 1}
+    @file = Tempfile.new("bit_array_file.dat")
+    ba.dump(@file)
+    @file.rewind
+  end
+
+  def teardown
+    @file.close
+    @file.unlink
+  end
+
+  def test_from_filename
+    baf = BitArrayFile.new(filename: @file.path)
+    for i in 0...35
+      expected = [1, 5, 6, 7, 10, 16, 33].include?(i) ? 1 : 0
+      assert_equal expected, baf[i]
+    end
+  end
+
+  def test_from_io
+    baf = BitArrayFile.new(io: @file)
+    for i in 0...35
+      expected = [1, 5, 6, 7, 10, 16, 33].include?(i) ? 1 : 0
+      assert_equal expected, baf[i]
+    end
+  end
+end
Review comment: Changed from 1.3.0.