Training step #4

Open
smurthy55 opened this issue Sep 21, 2020 · 11 comments

Comments

@smurthy55

Hi! I hope you are doing well.

I was following the steps listed in the tutorial for training DeepHiC. At the training step (python training.py), I received the following connection error:

WARNING:root:Setting up a new session...
Exception in user code:

Traceback (most recent call last):
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 157, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1290, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1239, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 1026, in _send_output
self.send(msg)
File "/Users/murthys3/miniconda3/lib/python3.7/http/client.py", line 966, in send
self.connect()
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 184, in connect
conn = self._new_conn()
File "/Users/murthys3/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 169, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f916229ae50>: Failed to establish a new connection: [Errno 61] Connection refused


Would you know how I could address this issue? Thanks so much for your help.

@omegahh
Owner

omegahh commented Sep 23, 2020

As far as I can see, these errors are caused by the network connection. In this repository, only one step, loading the VGG16 model from torchvision, tries to connect to the internet, in order to download a pre-trained model file. Could you try running the following code on your machine?

from torchvision.models.vgg import vgg16
vgg = vgg16(pretrained=True)

@smurthy55
Author

Thanks so much! I am still learning how DeepHiC works and how we can use it with our data, and wanted to ask some follow-up questions:

  1. I wanted to ask about the selection of parameters when following these steps for our own data. For example, how do you suggest we determine the downsampling factor we should use based on our data?

  2. Also I notice that the GM12878 data is stored as .npz files by resolution and chromosome. Should we input our own data as .npz files per chromosome, or can the entire dataset at one resolution be stored as one .npz file?

  3. When running the data_generate.py script, one of the parameters is “-s”, which is related to the input dataset. With the GM12878 dataset (-c GM12878), the example command uses “all” for the -s parameter. However, this value does not work when I run it (“train” and “valid” seem to run to completion). The error I receive with “-s all” is:

[murthys3@cn0862 DeepHiC]$ python data_generate.py -hr 10kb -lr 40kb -lrc 100 -s all -chunk 40 -stride 40 -bound 201 -scale 1 -c GM12878_primary
Traceback (most recent call last):
File "data_generate.py", line 49, in <module>
chr_list = set_dict[dataset]
KeyError: 'all'

@omegahh
Owner

omegahh commented Sep 25, 2020

  1. I have done some analyses on determining the downsampling factor for users' data; see Note S2 and Fig. S17. I hope these results help.

  2. DeepHiC only focuses on intra-chromosomal data, so I think the data should be separated by chromosome.

  3. Sorry for this error. The options for -s should be "train/test/human/mouse". The option "all" has been deprecated; I forgot to update the instructions.
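The KeyError reported above comes from a plain dictionary lookup on the -s value. A minimal sketch of a lookup that reports the valid options instead of a bare KeyError follows. The split names and chromosome numbers are hypothetical placeholders, not the lists defined in data_generate.py, and `chromosomes_for` is an illustrative helper that does not exist in the repository.

```python
# Hypothetical split table; the real one lives in data_generate.py as set_dict.
set_dict = {
    "train": [1, 2, 3],
    "valid": [4],
    "test": [5],
}


def chromosomes_for(dataset, table=set_dict):
    """Return the chromosome list for a split, with a readable error if unknown."""
    try:
        return table[dataset]
    except KeyError:
        valid = ", ".join(sorted(table))
        raise ValueError(
            f"unknown dataset {dataset!r}; choose one of: {valid}"
        ) from None


chr_list = chromosomes_for("train")
```

With this kind of check, passing a deprecated value such as "all" would list the accepted options in the error message.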

@smurthy55
Author

Thanks so much for your help! Do you have suggestions on how to convert the prediction npz files to files that can be visualized in a map such as a .hic file?

@omegahh
Owner

omegahh commented Oct 28, 2020

I use .npz files because they can be loaded directly in Python (with numpy) without any conversion, and the compression saves a lot of storage. I use the matplotlib package to visualize Hi-C matrices.
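As a rough illustration of the workflow described above, here is a minimal sketch that writes one chromosome's matrix to a compressed .npz file, loads it back, and draws it with matplotlib. The array key "hic", the file name, and the helper names are assumptions made for this example; check the keys in your own files with `np.load(path).files`.

```python
import os
import tempfile

import numpy as np


def save_chromosome(path, matrix, key="hic"):
    """Store one intra-chromosomal matrix in a compressed .npz file."""
    np.savez_compressed(path, **{key: matrix})


def load_chromosome(path, key="hic"):
    """Load the matrix back; the key name is an assumption, not a fixed convention."""
    with np.load(path) as data:
        return data[key]


def plot_matrix(matrix, out_png, vmax=None):
    """Draw a contact map as a heatmap and save it to a PNG file."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(5, 5))
    image = ax.imshow(matrix, cmap="Reds", vmin=0, vmax=vmax, interpolation="none")
    fig.colorbar(image, ax=ax)
    fig.savefig(out_png, dpi=150)
    plt.close(fig)


# Round trip on a small synthetic symmetric matrix.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chr1_10kb.npz")
    raw = np.random.default_rng(0).poisson(5, size=(50, 50))
    demo = np.triu(raw) + np.triu(raw, 1).T
    save_chromosome(path, demo)
    loaded = load_chromosome(path)
```

Calling `plot_matrix(loaded, "chr1.png", vmax=20)` would then write the heatmap. Converting to .hic is not covered here.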

@smurthy55
Author

Thanks so much for all your help. I wanted to ask about using relatively high resolution data for the DeepHiC pipeline. Do you think it still would work for these datasets?

It seems that when running the pipeline, the outputs are enhanced resolution maps of the lower resolution maps generated from the pipeline. Is it possible, if starting with an input of 10kb, to get an enhanced predicted output of 10kb, rather than an enhanced predicted output of the lower resolution data, such as 40kb? Perhaps I am misunderstanding the outputs or am not generating the outputs correctly.

@omegahh
Owner

omegahh commented Nov 2, 2020

Yes, you are right. The low-resolution input of the model is actually still binned at 10kb, but it has lower sequencing depth than the real 10kb Hi-C data.

In short, both the input and the output of our model are 10kb-binned matrices, but the input has lower sequencing depth.
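To make the idea of "same bins, lower depth" concrete, here is a minimal sketch that simulates a lower sequencing depth by binomial thinning of the read counts. This is one common way to emulate downsampling and is not necessarily how DeepHiC's own pipeline does it. The ratio of 16 and the function name `thin_contacts` are illustrative assumptions.

```python
import numpy as np


def thin_contacts(matrix, ratio, seed=0):
    """Keep each read with probability 1/ratio; bin size stays unchanged.

    Only the upper triangle is sampled, then mirrored, so the result
    remains symmetric like the input.
    """
    rng = np.random.default_rng(seed)
    upper = np.triu(matrix).astype(np.int64)
    thinned = rng.binomial(upper, 1.0 / ratio)
    return thinned + np.triu(thinned, 1).T


# Synthetic symmetric count matrix standing in for one 10kb chromosome map.
raw = np.random.default_rng(1).poisson(40, size=(60, 60))
high = np.triu(raw) + np.triu(raw, 1).T
low = thin_contacts(high, ratio=16)
```

The output has the same shape as the input, so the bins are unchanged and only the counts shrink.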

@smurthy55
Author

smurthy55 commented Nov 6, 2020

Thanks for all your help. I have outputs from the DeepHiC method that I have visualized with Matplotlib, and I am seeing sharp edges at the boundaries of the squares in the enhanced plots. Is there a way to smooth or reduce the edges?

I am also currently trying to filter and normalize our data similar to how it was described in the paper to help fine tune the data and reduce the effect of outliers. I noticed how you used 255 as the average 99.9th threshold for your data and set values higher than this to 255. Was this 255 calculated by obtaining the 99.9th percentile for the numpy hic matrix for each chromosome and then obtaining the mean? If that is the case, it seems that from the raw GM12878 npz files generated in the pipeline the average 99.9th percentile would be much less (~78.6).
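The calculation asked about here can be sketched as follows: take the 99.9th percentile of each chromosome's matrix, average those values, clip at the average, and rescale to [0, 1]. Whether this matches how the value 255 was obtained in the paper is exactly the open question, so treat it as one reading under stated assumptions. The function names and the synthetic data are illustrative only.

```python
import numpy as np


def mean_percentile(matrices, q=99.9):
    """Average the q-th percentile over a list of per-chromosome matrices."""
    return float(np.mean([np.percentile(m, q) for m in matrices]))


def clip_and_scale(matrix, cutoff):
    """Set values above cutoff to cutoff, then rescale to the range [0, 1]."""
    return np.minimum(matrix, cutoff) / cutoff


# Synthetic stand-ins for per-chromosome matrices.
rng = np.random.default_rng(0)
matrices = [rng.poisson(20, size=(100, 100)) for _ in range(3)]

cutoff = mean_percentile(matrices)
scaled = clip_and_scale(matrices[0], cutoff)
```

Note that the result depends on whether zero entries and the full matrix (rather than, say, only near-diagonal bins) are included in the percentile, which may explain differences between datasets.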

@omegahh
Owner

omegahh commented Dec 17, 2020

Hello smurthy55, it has been a long time. For question one, the edges between the divided blocks should diminish if training is sufficient. We did observe edges when we used a model trained for, for example, 100 epochs. Based on your description, we think we should update the model to avoid this problem. Maybe we could make the prediction on the whole matrix without splitting it into blocks, but this needs more testing, and I cannot say when it will be finished.
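Until whole-matrix prediction is available, one possible workaround (not the authors' method) is to predict on overlapping blocks and average the overlaps, which tends to soften seams. The sketch below assumes a square matrix and a hypothetical `predict_block` callable standing in for the trained model; the chunk and stride values are illustrative.

```python
import numpy as np


def predict_with_overlap(matrix, predict_block, chunk=40, stride=20):
    """Apply predict_block to overlapping chunk x chunk tiles and average overlaps."""
    n = matrix.shape[0]
    out = np.zeros(matrix.shape, dtype=float)
    weight = np.zeros(matrix.shape, dtype=float)

    # Tile start positions; add a final tile so the matrix edge is covered.
    starts = list(range(0, max(n - chunk, 0) + 1, stride))
    if starts[-1] + chunk < n:
        starts.append(n - chunk)

    for i in starts:
        for j in starts:
            block = matrix[i:i + chunk, j:j + chunk]
            out[i:i + chunk, j:j + chunk] += predict_block(block)
            weight[i:i + chunk, j:j + chunk] += 1.0

    return out / weight


# With an identity "model", stitching must reproduce the input exactly.
demo = np.random.default_rng(0).random((90, 90))
stitched = predict_with_overlap(demo, predict_block=lambda block: block)
```

With stride equal to chunk this reduces to non-overlapping tiling, which is where hard block edges can appear.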

For question two, we used the processed Hi-C matrices from GEO. We noticed that the processed files have since been replaced with .hic files, whereas they were compressed text files before. I am not sure whether they are the same data just stored in a different format. The following figure shows the distribution of the 99.x (x=1,3,5,7,9) percentiles for each chromosome in the 10kb GM12878 cell line data, as well as in the downsampled data (40kb, on the right).

Screenshot 2020-12-17 09 18 25

Hope this helps you!

@Omeiko

Omeiko commented Jan 19, 2022

Hello, I had the same problem, and my network connection was OK.
Here is my error:
Setting up a new session...
Exception in user code:

Traceback (most recent call last):
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/util/connection.py", line 96, in create_connection
raise err
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/util/connection.py", line 86, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1252, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1298, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1247, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 1007, in _send_output
self.send(msg)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/http/client.py", line 947, in send
self.connect()
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f6c4847f190>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/urllib3/util/retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/0119-deephic (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6c4847f190>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/visdom/__init__.py", line 708, in _send
return self._handle_post(
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/visdom/__init__.py", line 677, in _handle_post
r = self.session.post(url, data=data)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/0119-deephic (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6c4847f190>: Failed to establish a new connection: [Errno 111] Connection refused'))
[Errno 111] Connection refused
on_close() takes 1 positional argument but 3 were given
0%| | 0/831 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 112, in <module>
g_loss.backward()
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/wang/anaconda3/envs/hicpro/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 256, 1, 1]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

@jinsooahn

jinsooahn commented Jan 25, 2022

I have been trying to solve these issues. I was able to resolve '[Errno 111] Connection refused' by running the visdom server.

In my case, I ran the command below after installing visdom and then navigated to http://localhost:8097 to check that it was working.
python -m visdom.server &
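Before launching training, it can help to confirm that something is actually listening on the visdom port. A minimal sketch using only the standard library follows. The host and port match the ones in the traceback above (localhost:8097), and `port_is_open` is an illustrative helper, not part of visdom or DeepHiC.

```python
import socket


def port_is_open(host="localhost", port=8097, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution failures.
        return False


if not port_is_open():
    print("No server on localhost:8097; start one with: python -m visdom.server")
```

This only checks that the port accepts connections; it does not verify that the listener is a visdom server.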

Regarding the inplace operation error, I got more details by adding torch.autograd.set_detect_anomaly(True) to train.py. It says "The variable in question was changed in there or anywhere later". I am not an expert, but after doing some research I moved optimizerD.step() to the location shown below, and then it worked well. The authors can correct me if I am wrong.

    ######### Train discriminator #########
    netD.zero_grad()
    real_out = netD(real_img)
    fake_out = netD(fake_img)
    d_loss_real = criterionD(real_out, torch.ones_like(real_out))
    d_loss_fake = criterionD(fake_out, torch.zeros_like(fake_out))
    d_loss = d_loss_real + d_loss_fake
    # Keep the graph: fake_out is reused by the generator loss below.
    d_loss.backward(retain_graph=True)

    ######### Train generator #########
    netG.zero_grad()
    g_loss = criterionG(fake_out.mean(), fake_img, real_img)
    g_loss.backward()

    # Step both optimizers only after both backward passes, so that no
    # parameter is modified in place while its graph is still needed.
    optimizerD.step()
    optimizerG.step()
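Another common way to avoid this error, offered here as a sketch and not as the authors' fix, is to detach the generated image for the discriminator update and then run the updated discriminator again for the generator loss. No graph is then reused after an in-place parameter update. The tiny networks, loss weights, and tensor shapes below are hypothetical stand-ins, not the DeepHiC models or its criterionG.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the DeepHiC generator and discriminator.
netG = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1))
netD = nn.Sequential(
    nn.Conv2d(1, 4, 3, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 1),
)
optimizerG = torch.optim.Adam(netG.parameters(), lr=1e-3)
optimizerD = torch.optim.Adam(netD.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()


def train_step(lr_img, real_img):
    fake_img = netG(lr_img)

    # Discriminator update: detach so no gradient flows into the generator.
    netD.zero_grad()
    real_out = netD(real_img)
    fake_out = netD(fake_img.detach())
    d_loss = bce(real_out, torch.ones_like(real_out)) + bce(
        fake_out, torch.zeros_like(fake_out)
    )
    d_loss.backward()
    optimizerD.step()

    # Generator update: fresh forward pass through the updated discriminator.
    netG.zero_grad()
    fake_out = netD(fake_img)
    g_loss = mse(fake_img, real_img) + 1e-3 * bce(
        fake_out, torch.ones_like(fake_out)
    )
    g_loss.backward()
    optimizerG.step()
    return d_loss.item(), g_loss.item()


lr_img = torch.rand(2, 1, 8, 8)
real_img = torch.rand(2, 1, 8, 8)
d_loss, g_loss = train_step(lr_img, real_img)
```

Compared with stepping both optimizers at the end, this costs one extra discriminator forward pass but keeps the discriminator's gradients free of the generator loss.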
