
spiders_reader on long sentences crashes #156

Open
kinianlo opened this issue Sep 27, 2024 · 3 comments · May be fixed by #157

Comments

@kinianlo
Contributor

Symptom

Using spiders_reader on long sentences leads to excessive time and memory use.

Example
The following example uses spiders_reader on a sentence with 14 words, each assigned a 4-dimensional vector. The spider computes the element-wise product of these vectors, which should take milliseconds on a modern computer. Instead, the code below takes more than 2 seconds.

from lambeq import spiders_reader, TensorAnsatz, AtomicType
from lambeq import PytorchModel
from lambeq.backend.tensor import Dim
from time import time

sent = ' '.join(str(i) for i in range(14))
diag = spiders_reader.sentence2diagram(sent)

ansatz = TensorAnsatz({AtomicType.SENTENCE: Dim(4)})
circ = ansatz(diag)

model = PytorchModel.from_diagrams([circ])
model.initialise_weights()

start = time()
model.get_diagram_output([circ])
end = time()
print(end - start) # 2.4105310440063477

It crashes with an out-of-memory error if I increase the dimension from 4 to 5 on my machine.

My guess at what's going on:
during .to_tn(), the spider with 14 + 1 legs gets converted to a rank-15 CopyNode in tensornetwork, whose data is a dense array of 4^15 (≈ 10^9) entries at dimension 4.
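Assuming fully dense storage, the entry count of such a copy tensor grows as dimension ** rank, which matches both the slowdown at dimension 4 and the crash at dimension 5. A rough back-of-the-envelope sketch (plain Python, not lambeq/tensornetwork code):

```python
# Rough sketch: dense entry count of a rank-r copy tensor with bond
# dimension d (assumes a fully dense backing array).
def dense_entries(dim: int, rank: int) -> int:
    return dim ** rank

# 14 word wires + 1 sentence wire => rank 15
print(dense_entries(4, 15))  # 1073741824  (~4 GiB as float32)
print(dense_entries(5, 15))  # 30517578125 (~114 GiB, hence the crash)
```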

Perhaps there should be a safeguard that splits the large CopyNode into smaller ones, similar to this in discopy:
https://github.com/discopy/discopy/blob/006c3966a1906edfbd4b1ac1bfe943e1a709c0e0/discopy/frobenius.py#L380

@kinianlo kinianlo linked a pull request Sep 27, 2024 that will close this issue
@dimkart
Contributor

dimkart commented Sep 30, 2024

@kinianlo Hi, in general I think it's better to avoid any hidden manipulation at the tensor-network level, since physical changes of this form can dramatically alter training. Perhaps a better solution here would be a spider-specific tensor ansatz that rearranges the spiders reader output, e.g. as follows, producing a chain of 3-legged spiders:

w1   w2
  \  /
   o    w3
    \  /
     o    ...
      \  /
       o
       |
  • Have a look at MPSAnsatz and SpiderAnsatz, which break up large tensors in a similar way for generic string diagrams.
  • Note that this can also be implemented as a separate reader instead of a new ansatz, e.g. SpiderChainReader (perhaps this is even better).
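The chain above computes the same sentence vector as a single large spider, since the element-wise product is associative. A minimal plain-Python sketch (not lambeq code) checking the equivalence:

```python
import math

def spider3(u, v):
    # A 3-legged spider acting on two vectors: element-wise product.
    return [a * b for a, b in zip(u, v)]

def spider_chain(vectors):
    # Fold the word vectors left to right, as in the diagram above.
    out = vectors[0]
    for v in vectors[1:]:
        out = spider3(out, v)
    return out

def big_spider(vectors):
    # Single (n+1)-legged spider: product over each coordinate at once.
    return [math.prod(col) for col in zip(*vectors)]

words = [[1.0, 2.0, 0.5, 3.0], [2.0, 0.5, 4.0, 1.0], [3.0, 1.0, 1.0, 2.0]]
assert spider_chain(words) == big_spider(words)
```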

@kinianlo
Contributor Author

kinianlo commented Oct 1, 2024

@dimkart Yes, I agree that manipulating things at the tensornetwork level would be dangerous.

  1. It seems that grammar.Spider.apply_functor forces a spider to be mapped to another spider in the target category. This makes it impossible to construct a functor that decomposes spiders, so unless that behaviour is changed, an ansatz that breaks down a large spider is not possible.
  2. Apart from having it as a new reader, would adding this as an option to SpidersReader be okay? Something like SpidersReader(use_small_spiders=True).

@dimkart
Contributor

dimkart commented Oct 2, 2024

Yes, it's much easier to do this at the reader level. In fact, LinearReader gives you this functionality out-of-the-box:

# imports shown for completeness (Spider lives in lambeq.backend.grammar
# in recent lambeq versions)
from lambeq import AtomicType, LinearReader
from lambeq.backend.grammar import Spider

spider_chain_reader = LinearReader(Spider(AtomicType.SENTENCE, 2, 1))

I guess adding an extra option to SpidersReader is OK, but it would be cleaner to just add the above line at the end of spiders_reader.py.
