Program gets killed during compilation time #1513

Open
kkc2894 opened this issue Oct 5, 2024 · 12 comments

@kkc2894

kkc2894 commented Oct 5, 2024

I am using the malicious-shamir-party machine and trying to read a file with approximately 100M numbers. After reading each element, I store it in an array of sint type. Then, for a query, I count the elements equal to the search query using the a == b operation. This setup works up to 1k elements, but beyond that the program gets killed.

[1]+ Killed Scripts/compile-run.py -E mal-shamir python_test -- -N 3

I am using Docker images to run the system. Could you please help?

The underlying code:

def read_numbers_from_file(filename):
    with open(filename, 'r') as file:
        num_rows = 1000
        numbers = Array(num_rows, sint)
        count = 0
        while count < num_rows:
            line = file.readline()
            if line:
                numbers[count] = sint(int(line.strip()))
                count += 1  # Increment the counter
            else:
                break
    return numbers

filename = 'data_1M.csv'
numbers = read_numbers_from_file(filename)

search_query = sint(1)
count = sint(0)
length = len(numbers)

def equality_test():
    for i in range(len(numbers)):
        count.iadd(numbers[i] == search_query)

equality_test()
print_ln('Count for %s is %s', search_query.reveal(), count.reveal())

I even tried the below version of the code:

program.use_edabit(True)

def read_numbers():
    numbers = sint.get_input_from(0, size=1000000)
    b = sint.get_input_from(1)
    eq_bits = [x == b for x in numbers]
    b_in_a = sum(eq_bits)
    print_ln("Count of %s is %s", b.reveal(), b_in_a.reveal())

read_numbers()

Party 0 has 1M numbers in Input_P0_0 file under Player_Data folder
Party 1 has 1 number in Input_P1_0 file under Player_Data folder

@mkskeller
Member

This is because of the unrolling of large loops. In the second example, you should be able to fix this using

eq_bits = numbers == b
b_in_a = eq_bits.sum()

This switches to vectorized computation.

The first example is more difficult because it involves hard-coding all inputs in the program. I therefore recommend you stick to the second approach.
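For reference, here is a minimal cleartext Python sketch of what the vectorized count computes. Plain lists stand in for the secret values; in the actual .mpc program, `numbers` would be an `sint` vector and `numbers == b` would yield a vector of secret bits, so this is only an illustration of the semantics, not MP-SPDZ code.

```python
# Cleartext reference for the vectorized equality count.
# In MP-SPDZ, `numbers == b` on an sint vector produces one secret bit per
# element; summing those bits yields the number of matches.
def count_equal(numbers, b):
    # element-wise equality test, one bit per element
    eq_bits = [int(x == b) for x in numbers]
    # summing the bits gives the count of matching elements
    return sum(eq_bits)

print(count_equal([1, 2, 1, 3, 1], 1))  # → 3
```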

@kkc2894
Author

kkc2894 commented Oct 8, 2024

Thank you for the quick response.
After making the suggested code changes, I get the error below:
/usr/src/MP-SPDZ/Scripts/run-common.sh: line 90: 2363 Killed $my_prefix $SPDZROOT/$bin $i $params 2>&1
2364 Done | { if test "$BENCH"; then
if test $i = $front_player; then
tee -a $log;
else
cat >> $log;
fi;
else
if test $i = $front_player; then
tee $log;
else
cat > $log;
fi;
fi; }

My code:
import time

program.use_edabit(True)

def read_numbers_from_file():
    numbers = sint.get_input_from(0, size=1000000)
    b = sint.get_input_from(1)
    eq_bits = numbers == b
    b_in_a = eq_bits.sum()
    print_ln("Count of %s is %s", b.reveal(), b_in_a.reveal())

start_time = time.time()
read_numbers_from_file()
end_time = time.time()

@mkskeller
Member

What is the output of Scripts/memory-usage.py <program-with-args>? It might be that the memory requirements exceed your RAM.

@kkc2894
Author

kkc2894 commented Oct 9, 2024

I get the time printed as "Execution time: 0.22 ms". However, the verbose output of the run provides online/offline times. Which one should be considered to understand the time taken to run the code in the MP-SPDZ setting?

You cannot use the timing in the Python code because it doesn't measure the actual execution.

Also, does MP-SPDZ consider only trusted-client settings, or is there an untrusted-client setting as well?

I understand that since I am using the honest-majority malicious setting based on Shamir's secret sharing, if the number of parties is 3, at least two have to be honest. But what about the clients who are communicating with these parties to do some computation? Is there any setting available for trusted or untrusted clients?

The standard client interface assumes an untrusted client, which means that it cannot influence the computation beyond the input provided and it does not learn more than the designated output.

@kkc2894
Author

kkc2894 commented Oct 10, 2024

A follow-up question: is the online time the one to consider for end-to-end latency?
"Spent 0.96727 seconds (84.4956 MB, 22 rounds) on the online phase and 8.87888 seconds (396.995 MB, 2460 rounds) on the preprocessing/offline phase." --> Meaning 0.96727 seconds is the end-to-end time?

If not, how to get the total execution time per party?

@mkskeller
Member

It depends on what you mean by end-to-end. The line you mention splits up the time in several phases whereas "Time = ..." indicates the wall clock time for the computation, excluding some setup costs.

@kkc2894
Author

kkc2894 commented Oct 11, 2024

It depends on what you mean by end-to-end. The line you mention splits up the time in several phases whereas "Time = ..." indicates the wall clock time for the computation, excluding some setup costs.

For a problem statement such as "give the count of a value b in the array numbers", the code will look like the following:
def read_numbers():
    numbers = sint.get_input_from(0, size=1000000)
    b = sint.get_input_from(1)
    eq_bits = [x == b for x in numbers]
    b_in_a = sum(eq_bits)
    print_ln("Count of %s is %s", b.reveal(), b_in_a.reveal())

read_numbers()

I want to know the following details:

  1. The time it took to return the "count" result, i.e., the duration between when the client sent "b" and when the client received the count
  2. The number of rounds of communication between party-party/party-client needed to complete the operation
  3. The network bandwidth incurred

When I run MP-SPDZ, I get the verbose output "Spent 0.96727 seconds (84.4956 MB, 22 rounds) on the online phase and 8.87888 seconds (396.995 MB, 2460 rounds) on the preprocessing/offline phase.". Does this mean that:

  1. The time it took to return the "count" result --> 0.96727 seconds
  2. The number of communication rounds between party-party/party-client --> 22
  3. The network bandwidth incurred --> 84.4956 MB

If not, please advise on the right way to compute points 1, 2, and 3.

@mkskeller
Member

This depends on whether you want to include the preprocessing phase. The values in your list are for the online phase only, that is, the data-dependent phase, excluding computation that can be done without knowing the data (other than its size).

@kkc2894
Author

kkc2894 commented Oct 18, 2024

How can I do a dot product between a matrix of size 5000 * 100K and a vector of size 5000? I am able to do it up to a matrix of size 5000 * 100, but beyond that the process gets killed.

  1. Is there a limit up to which the dot product can happen?
  2. What should my system configuration be to support it? --> Currently I have 64 GB RAM and a 1 TB SSD on an M1 Max MacBook.
  3. Is there something specific I need to change from a code perspective to support this kind of operation?

@mkskeller
Member

There must be a limit because no machine has unlimited storage. The concrete restrictions depend largely on how you are using the framework, so it's hard to judge without knowing the code and the protocol.

@kkc2894
Author

kkc2894 commented Oct 22, 2024

There must be a limit because no machine has unlimited storage. The concrete restrictions depend largely on how you are using the framework, so it's hard to judge without knowing the code and the protocol.

keyword_count = 5000
inverted_columns = 110000
with open("/usr/src/MP-SPDZ/inverted.txt", 'r') as file:
    lines1 = [[int(value) for value in line.strip().split(',')] for line in file]

# writing shares of inverted_index and keyword vector
for i in range(0, inverted_columns):
    temp = sint(lines1[i])
    sint.write_to_file(temp)  # append to file

Q. Even the creation of shares gets killed for such a large size. Is there a way I could load a 2D array of this size and then perform the matrix-vector dot product?

My complete dot-product code:

# inverted index
keyword_count = 5000
inverted_columns = 110000
with open("/usr/src/MP-SPDZ/inverted.txt", 'r') as file:
    lines1 = [[int(value) for value in line.strip().split(',')] for line in file]

# writing shares of inverted_index and keyword vector
for i in range(0, inverted_columns):
    temp = sint(lines1[i])
    sint.write_to_file(temp)  # append to file

temp = [0] * keyword_count
position_cleartext = 0
temp[position_cleartext] = 1
file_vector = sint(temp)
sint.write_to_file(file_vector)

# reading shares of inverted_index and keyword vector
inverted_index = []
for i in range(0, inverted_columns):
    _, temp = sint.read_from_file(i * keyword_count, keyword_count)
    inverted_index.append(temp)
    # print_ln("Inverted index size %s * %s", len(inverted_index), len(inverted_index[0]))
inverted_index = Matrix.create_from(inverted_index)
print_ln("Inverted index size %s * %s", len(inverted_index), len(inverted_index[0]))

_, temp = sint.read_from_file(inverted_columns * keyword_count, keyword_count)
file_vector = Array.create_from(temp)

# phase 2 operation
for i in range(0, inverted_columns):
    result = file_vector.dot(inverted_index[i])
    # print_ln(" %s", result.reveal())
The above code gets killed.

Logs:

(screenshot of the run logs attached in the original issue)

Q. Also can you please tell who sent what amount of data to whom, as per the stats below:
Party 0:
Spent 0.231426 seconds (20.3201 MB, 8 rounds) on the online phase and 2.19302 seconds (98.0889 MB, 612 rounds) on the preprocessing/offline phase.
Communication details:
Broadcasting 0.0044 MB in 169 rounds, taking 0.0242042 seconds
Partial broadcasting 37.04 MB in 175 rounds, taking 0.068515 seconds
CPU time = 2.40999
The following benchmarks are including preprocessing (offline phase).
Time = 2.42639 seconds
Data sent = 118.409 MB in ~620 rounds (party 0 only)
Global data sent = 355.067 MB (all parties)

Party 1:
Spent 0.281857 seconds (20.3201 MB, 8 rounds) on the online phase and 2.14259 seconds (98.0889 MB, 612 rounds) on the preprocessing/offline phase.
Communication details:
Broadcasting 0.0044 MB in 169 rounds, taking 0.0271209 seconds
Partial broadcasting 37.04 MB in 175 rounds, taking 0.0709618 seconds
CPU time = 2.47621
The following benchmarks are including preprocessing (offline phase).
Time = 2.42652 seconds
Data sent = 118.409 MB in ~620 rounds (party 1 only)
Global data sent = 355.067 MB (all parties)

Party 2:
Spent 0.239738 seconds (20.3201 MB, 8 rounds) on the online phase and 2.18467 seconds (97.9289 MB, 612 rounds) on the preprocessing/offline phase.
Communication details:
Broadcasting 0.0044 MB in 169 rounds, taking 0.0238763 seconds
Partial broadcasting 37.04 MB in 175 rounds, taking 0.0699712 seconds
CPU time = 2.40435
The following benchmarks are including preprocessing (offline phase).
Time = 2.42636 seconds
Data sent = 118.249 MB in ~620 rounds (party 2 only)
Global data sent = 355.067 MB (all parties)

Q. What does a round refer to: 1) A to B and then B to A, 2) A to B, B to C, C to A, or 3) A to B?

Q. Or can you please break down the above stats for any one party: what do broadcast, partial broadcast, etc. mean, and which parties are their rounds between?

Q. As per my understanding, as I am using malicious-shamir-party (honest majority with malicious security using Shamir secret sharing) with 3 servers: a round count of 1 means that in that single round party 1 sends to party 2, party 2 to party 1, party 2 to party 3, party 3 to party 2, party 3 to party 1, and party 1 to party 3. The data sent by each party is given by "Data sent", thus each party sends 118.249 MB to the other two parties, and the time spent by each party on the online + offline phases is given by "Time" as 2.42636 seconds.

@mkskeller
Member

Q. Even the creation of shares gets killed for such a large size. Is there a way I could load a 2D array of this size and then perform the matrix-vector dot product?

The overload probably happens in the loading procedure; use sint.input_tensor_via() instead of piece-wise loading: https://mp-spdz.readthedocs.io/en/latest/Compiler.html#Compiler.types.sint.input_tensor_via
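As an illustration of the target computation (cleartext Python, not MP-SPDZ code), here is a toy-sized sketch of the matrix-vector products the loop above computes. In the secure version the matrix would be loaded in one go rather than row by row, and each row's result would be computed with `dot` on secret values; the function name below is hypothetical.

```python
# Cleartext reference for the per-row dot products in the issue, on a small
# example instead of 110000 x 5000. Each row's result mirrors
# file_vector.dot(inverted_index[i]) in the secure program.
def mat_vec_products(matrix, vector):
    # one dot product per matrix row
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

matrix = [[1, 0, 2],
          [0, 3, 1]]
vector = [1, 1, 0]  # selection-style vector as in the issue
print(mat_vec_products(matrix, vector))  # → [1, 3]
```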

Q. Also can you please tell who sent what amount of data to whom, as per the stats below:

Party 0:
Spent 0.231426 seconds (20.3201 MB, 8 rounds) on the online phase and 2.19302 seconds (98.0889 MB, 612 rounds) on the preprocessing/offline phase.
Communication details:
Broadcasting 0.0044 MB in 169 rounds, taking 0.0242042 seconds
Partial broadcasting 37.04 MB in 175 rounds, taking 0.068515 seconds
CPU time = 2.40999
The following benchmarks are including preprocessing (offline phase).
Time = 2.42639 seconds
Data sent = 118.409 MB in ~620 rounds (party 0 only)
Global data sent = 355.067 MB (all parties)

There is no accounting for how much data is sent to a particular party.

Q. What does a round refer to: 1) A to B and then B to A, 2) A to B, B to C, C to A, or 3) A to B?

Rounds refer to the number of invocations of the communication operation in question, so one broadcast would account for one round.

Q. Or can you please break down the above stats for any one party: what do broadcast, partial broadcast, etc. mean, and which parties are their rounds between?

A (full) broadcast means every party sending something to every other party while a partial broadcast means some parties sending to some other parties in parallel.

Q. As per my understanding, as I am using malicious-shamir-party (honest majority with malicious security using Shamir secret sharing) with 3 servers: a round count of 1 means that in that single round party 1 sends to party 2, party 2 to party 1, party 2 to party 3, party 3 to party 2, party 3 to party 1, and party 1 to party 3. The data sent by each party is given by "Data sent", thus each party sends 118.249 MB to the other two parties, and the time spent by each party on the online + offline phases is given by "Time" as 2.42636 seconds.

Correct. The 118 MB is the data sent in total to all other parties by the party in question.
