Docker-based Tutorial #5

activeshadow · 2018-09-10T19:15:33Z

I'm working my way through these tutorials, and rather than dirty up my machine or use a VM I'm attempting to create Docker images for each of the relevant portions of the tutorials and a Docker Compose file to run them. Once I do I'll contribute everything back to this repo.

I'm having some trouble containerizing a few of the components so far, which I thought I would track here.

If anyone here can help me jump start the above two issues I'd appreciate it.

activeshadow · 2018-09-11T21:48:10Z

With @phlptp's help I was able to get GridDyn to completely install in a Docker image and running gridDynMain --version works. Once I can get GridLAB-D to install I'll be able to test federation via these tutorials (from what I can tell, all the tutorials require GridLAB-D).

activeshadow · 2018-09-12T03:38:18Z

Still having a hard time compiling GridLAB-D with HELICS support. I've created a new issue (gridlab-d/gridlab-d#1148) for it in the GridLAB-D repo.

activeshadow · 2018-09-12T21:05:27Z

Sweet... thanks to @afisher1 and @kdheepak I can now compile GridLAB-D with HELICS support in a Docker image. Now I'm off to get some docker-compose experiments developed for the tutorials. I'll report back here as I go, just in case anyone's interested!

activeshadow · 2018-09-12T21:08:13Z

So... I get the following from GridLAB-D when trying to run Tutorial 1.

gridlab-d_1  | WARNING  [INIT] : substation:15 is using the default base power of 100 VA. This could cause instability on your system.
gridlab-d_1  | WARNING  [INIT] : power_convergence_value not set - defaulting to 0.01 base_power
gridlab-d_1  | WARNING  [INIT] : Fuse:GC-12-47-1_fuse_1 has a negative or 0 mean replacement time - defaulting to 1 hour
gridlab-d_1  | WARNING  [INIT] : last warning message was repeated 2 times
gridlab-d_1  | WARNING  [INIT] : Node:GC-12-47-1_node_27 does not have the same nominal voltage as its parent - copying voltage from parent.
gridlab-d_1  | WARNING  [INIT] : Underground_line:65 GC-12-47-1_ul_1 - phase N conductor should just be a normal conductor!
gridlab-d_1  | WARNING  [INIT] : last warning message was repeated 8 times
gridlab-d_1  | WARNING  [INIT] : INIT: underground_line:GC-12-47-1_ul_9 has a negative resistance in it's impedance matrix. This will result in unusual behavior. Please check the line's geometry and cable parameters.
gridlab-d_1  | WARNING  [INIT] : Underground_line:74 GC-12-47-1_ul_10 - phase N conductor should just be a normal conductor!
gridlab-d_1  | ERROR    [INIT] : init_underground_line(obj=81;GC-12-47-1_ul_17): from and to node nominal voltage mismatch of greater than 0.1%%
gridlab-d_1  | ERROR    [INIT] : init_by_deferral(): object GC-12-47-1_ul_17 initialization failed
gridlab-d_1  | ERROR    [INIT] : model initialization failed
gridlab-d_1  | FATAL    [INIT] : shutdown after simulation stopped prematurely
gridlab-d_1  | FATAL    [INIT] : environment startup failed: Invalid argument
gridlab-d_1  | Model profiler results
gridlab-d_1  | ======================
gridlab-d_1  | 
gridlab-d_1  | Class            Time (s) Time (%) msec/obj
gridlab-d_1  | ---------------- -------- -------- --------
gridlab-d_1  | climate            0.130     98.5%    130.0
gridlab-d_1  | substation         0.002      1.5%      2.0
gridlab-d_1  | ================ ======== ======== ========
gridlab-d_1  | Total              0.132    100.0%      0.0
gridlab-d_1  | 
gridlab-d_1  | WARNING  [INIT] : last warning message was repeated 6 times

This seems more like a GridLAB-D configuration error and less of an environment error (i.e., the Docker images or the Docker Compose configuration).

kdheepak · 2018-09-12T21:26:04Z

Well, you are just one step away now! I think GridLAB-D gives this error if the publications and subscriptions in a federation don't all have corresponding matches in other federates. There should be a Python federate in there that will work with just one GridLAB-D feeder. Did you try this folder?

kdheepak · 2018-09-12T21:29:40Z

It does seem like you are using Tutorial 1. Let me test locally.

kdheepak · 2018-09-12T21:40:56Z

I did a fresh clone and was able to run the Tutorial1 without changing anything.

kdheepak · 2018-09-12T21:44:41Z

The three federates are run this way:

$ gridlabd DistributionSim_B2_G_1.glm

$ python federate1.py

$ helics_broker 2 --loglevel=5

activeshadow · 2018-09-12T21:56:44Z

@kdheepak I got that error w/ GridLAB-D before even starting up the other two federates, so I'm not sure how it can be an issue with federate pub/sub. I'm wondering if something's still just not right w/ the compilation of the GridLAB-D code with HELICS support.

activeshadow · 2018-09-20T12:57:10Z

@kdheepak and/or @afisher1 I'm at a loss as to where to go from here... I can't seem to get past this error. Any pointers?

kdheepak · 2018-09-20T13:38:16Z

I can try building the same docker image and try. Should I just follow your instructions? Or if you can send me your image I can try in there.

kdheepak · 2018-09-20T14:34:21Z

I'm running docker-compose now, I'll provide updates on this thread.

kdheepak · 2018-09-20T14:43:13Z

I get the same error:

WARNING: Image for service gridlab-d was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Creating 1-distributionfederation-manualstart_gridlab-d_1 ... done
Creating 1-distributionfederation-manualstart_federate_1  ... done
Creating 1-distributionfederation-manualstart_broker_1    ... done
Attaching to 1-distributionfederation-manualstart_gridlab-d_1, 1-distributionfederation-manualstart_broker_1, 1-distributionfederation-manualstart_federate_1
federate_1   | zmq broker connection timed out (2)
federate_1   | Voltage value = 155.65854761454588 kV
gridlab-d_1  | WARNING  [INIT] : substation:15 is using the default base power of 100 VA. This could cause instability on your system.
gridlab-d_1  | WARNING  [INIT] : power_convergence_value not set - defaulting to 0.01 base_power
gridlab-d_1  | WARNING  [INIT] : Fuse:GC-12-47-1_fuse_1 has a negative or 0 mean replacement time - defaulting to 1 hour
gridlab-d_1  | WARNING  [INIT] : last warning message was repeated 2 times
gridlab-d_1  | WARNING  [INIT] : Node:GC-12-47-1_node_27 does not have the same nominal voltage as its parent - copying voltage from parent.
gridlab-d_1  | WARNING  [INIT] : Underground_line:65 GC-12-47-1_ul_1 - phase N conductor should just be a normal conductor!
gridlab-d_1  | WARNING  [INIT] : last warning message was repeated 8 times
gridlab-d_1  | WARNING  [INIT] : INIT: underground_line:GC-12-47-1_ul_9 has a negative resistance in it's impedance matrix. This will result in unusual behavior. Please check the line's geometry and cable parameters.
gridlab-d_1  | WARNING  [INIT] : Underground_line:74 GC-12-47-1_ul_10 - phase N conductor should just be a normal conductor!
gridlab-d_1  | ERROR    [INIT] : init_underground_line(obj=81;GC-12-47-1_ul_17): from and to node nominal voltage mismatch of greater than 0.1%%
gridlab-d_1  | ERROR    [INIT] : init_by_deferral(): object GC-12-47-1_ul_17 initialization failed
gridlab-d_1  | ERROR    [INIT] : model initialization failed
gridlab-d_1  | FATAL    [INIT] : shutdown after simulation stopped prematurely
federate_1   | Python Federate grantedtime = 6.90031462714805e-310
federate_1   | Load value = (6.90031462713e-313+5.1205e-320j) MW
federate_1   | Voltage value = 149.9175170684086 kV
gridlab-d_1  | FATAL    [INIT] : environment startup failed: Invalid argument
gridlab-d_1  | Model profiler results
gridlab-d_1  | ======================
gridlab-d_1  |
gridlab-d_1  | Class            Time (s) Time (%) msec/obj
gridlab-d_1  | ---------------- -------- -------- --------
gridlab-d_1  | climate            0.200     99.0%    200.0
gridlab-d_1  | underground_line   0.001      0.5%      0.1
gridlab-d_1  | substation         0.001      0.5%      1.0
gridlab-d_1  | ================ ======== ======== ========
gridlab-d_1  | Total              0.202    100.0%      0.1
gridlab-d_1  |
gridlab-d_1  | WARNING  [INIT] : last warning message was repeated 6 times
1-distributionfederation-manualstart_gridlab-d_1 exited with code 2

^CGracefully stopping... (press Ctrl+C again to force)
Stopping 1-distributionfederation-manualstart_broker_1    ...
Stopping 1-distributionfederation-manualstart_federate_1  ...

kdheepak · 2018-09-20T14:51:13Z

I've not used docker much before. How do I "ssh" into one of these containers? I want to interactively run some commands.

kdheepak · 2018-09-20T14:52:26Z

Is docker-compose up starting three different containers? One for the broker, one for the federate1.py and one for GridLAB-D? If that is the case, then the broker / python / gridlabd need some additional configuration changes, in order to run across multiple machines.

kdheepak · 2018-09-20T14:52:42Z

The tutorial was set up to be single machine only.

kdheepak · 2018-09-20T15:01:02Z

My docker skills are not up to par, I'm not able to figure out how to combine these into a single docker image. I'm going to leave it here for now, but if you are able to combine it into a single docker instance then this tutorial should work.

activeshadow · 2018-09-20T19:22:29Z

@kdheepak yes, three separate containers are running. Good point on networking. I'll try host networking in the docker-compose.yml file for each of the containers, which will essentially make them all available on localhost. As for accessing the containers, you can use docker exec -it <container id> bash to drop into the container.

kdheepak · 2018-09-20T19:41:42Z

Okay, if you are not able to get that to work, we should be able to get it to work using the multi-node feature. I think the changes on the broker side and the Python side are trivial, assuming you can ping from one container to another. We will then need to check with @afisher1 on the status of the GridLAB-D multi-node feature and whether it can be configured using the .json file (and with @phlptp on the GridDyn status).

activeshadow · 2018-09-20T19:56:00Z

@kdheepak it looks like the broker listens on ports 22650 and 22651, correct?

activeshadow · 2018-09-20T19:57:01Z

Or wait... maybe it's ports 23404 and 23405.

activeshadow · 2018-09-20T20:04:24Z

That didn't seem to help... with each container set to use host networking, all the containers have direct access to the broker listening at 127.0.0.1:23404 and 127.0.0.1:23405. However, the GridLAB-D container still fails with the above error.

From what I'm needing to do with HELICS, it would be best to get the tutorial working using the multi-node feature anyway, so perhaps we can just work towards that @kdheepak? Is there any documentation around configuring multi-node?

kdheepak · 2018-09-20T20:17:10Z

Excellent question. Unfortunately I haven't put together any official documentation for this yet. A lot of our "documentation" is essentially in the history of our Gitter channel. If you search, you'll be able to find a lot of information regarding this (a little annoying to do that, I know). I've marked the beginning and the end of the relevant conversation I had with @phlptp.

▶️ August 24, 2018 2:46 PM
◀️ August 25, 2018 9:45 AM

My recommendation is to first try the pi-exchange example instead of the HELICS-Tutorial, i.e. pisender pireceiver helics_broker in three containers. If you can get this to work then it'll help eliminate one level of complexity. We can then try adding GridLAB-D.

I'll also work on improving the documentation.

kdheepak · 2018-09-20T20:21:59Z

I would break it down into the following steps. The first step is to do a simple ping between containers. If that works, try running the Python client server ZMQ example in the two containers (Here is a Python2 example that I convert to Python3 every time I want to do something like this). If this also works, then getting pireceiver and pisender to talk to each other should be straightforward.
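As a step between the ping and the ZMQ example, a plain TCP reachability check can confirm that one container can open a connection to a port on another. This is a minimal sketch using only the Python standard library; the function name `can_connect` is made up for illustration, and it only shows that something accepts TCP connections on the port, not that a HELICS broker is behind it.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of HELICS. A True result only means the
    port accepts TCP connections; it says nothing about the protocol spoken.
    """
    try:
        # create_connection resolves the host name and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

From inside a container this could be called as `can_connect("<other container's address>", <port>)`, with the address and port filled in for your own setup.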

activeshadow · 2018-09-20T21:25:54Z

In the pi-exchange example, where is the broker assumed to be listening -- localhost? What port(s) -- 23404 and 23405? Is one of those ports for PUB/SUB and the other for REQ/REP?

How can I configure the Python HELICS library to use something other than localhost?

kdheepak · 2018-09-20T22:02:21Z

Here's what I ran for the example:

helics_broker 2 --ipv4 --loglevel=5

In pisender.py change fedinitstring to the following

fedinitstring = "--federates=1 --broker_address=tcp://192.168.1.8 --localport=23501 --interface=tcp://192.168.1.9"

In pireceiver.py change fedinitstring to the following

fedinitstring = "--federates=1 --broker_address=tcp://192.168.1.8 --localport=23504 --interface=tcp://192.168.1.10"

This is assuming the helics_broker is run on 192.168.1.8, the pisender is run on 192.168.1.9 and the pireceiver is run on 192.168.1.10.
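The two fedinitstring values above differ only in the local port and the interface address, so they can be assembled by a small helper. This is a sketch only: `build_fedinitstring` is a hypothetical name, not a HELICS function, and it does nothing beyond joining the four flags shown above into one string.

```python
from typing import Optional


def build_fedinitstring(broker_address: str, interface: str,
                        local_port: Optional[int] = None,
                        federates: int = 1) -> str:
    """Assemble a fedinitstring from the flags used in the pi-exchange example.

    Hypothetical helper: it only formats the flags that appear in this
    thread and does not validate them against HELICS.
    """
    parts = [
        f"--federates={federates}",
        f"--broker_address=tcp://{broker_address}",
    ]
    # local_port is optional in this sketch; the flag is left out when None.
    if local_port is not None:
        parts.append(f"--localport={local_port}")
    parts.append(f"--interface=tcp://{interface}")
    return " ".join(parts)


# Reproduces the pisender string for the example addresses above.
fedinitstring = build_fedinitstring("192.168.1.8", "192.168.1.9", local_port=23501)
```

Keeping the addresses as parameters makes it easier to pass in container addresses from environment variables rather than editing each script by hand.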

If you run three containers, you will also have to comment out the following lines where pisender creates the broker and checks that the broker is still connected. If you run three containers, pisender should not start its own broker, and you have to start the helics_broker separately. Hope that makes sense?

If you want to run just two containers, make the change for the broker init string here and run just pireceiver and pisender.

Thanks for sticking through this!

kdheepak · 2018-09-20T22:05:17Z

The port numbers don't matter, as long as it is not 23460 or the same port for both pisender and pireceiver. I believe the broker scans multiple ports for connections. I've run the above example without the --localport=xxxxx and it has still worked fine across multiple machines.

activeshadow · 2018-09-20T22:06:48Z

Thanks @kdheepak, you rock. I'll test this later tonight and report back here.

phlptp · 2018-09-20T22:08:48Z

If the local port is not specified, it requests an open port number from the broker and then establishes a pull socket on that port.

phlptp · 2018-09-20T22:09:31Z

and I suspect if they are actually on different machines you could use the same port number if you like.

activeshadow · 2018-09-20T22:20:31Z

@phlptp @kdheepak what port number is the broker assumed to be listening on?

kdheepak · 2018-09-20T22:24:42Z

Depending on the type of core (e.g., tcp, udp, zmq etc) it may be slightly different. ZmqComms should use the following by default:

DEFAULT_BROKER_PULL_PORT_NUMBER = 23405;
DEFAULT_BROKER_REP_PORT_NUMBER = 23404;

activeshadow · 2018-09-20T22:55:24Z

Thx @kdheepak and sorry for not searching the code for that myself... I should look first then ask.

kdheepak · 2018-09-20T22:57:51Z

On the contrary, I think all of this should be in our documentation! Feel free to ask, it also helps us understand what needs to be documented, when we eventually get to it :)

activeshadow · 2018-09-21T03:28:32Z

OK, so I was able to get the pi-exchange example working. See my pull request on the HELICS-Examples repo. Thanks for the suggestion @kdheepak... this helped me to better understand some of the comm requirements between the broker and federates.

activeshadow · 2018-09-21T19:10:15Z

Still working on Tutorial 1, I think I have the Python federate script updated to correctly connect to the broker running in a separate container (based off what I changed to get the multi-node-pi-exchange working). @kdheepak @afisher1 any ideas on what needs to change in GridLAB-D to get it to talk to the broker on a different node over ZMQ TCP? @kdheepak you mentioned a GridLAB-D JSON file, but I'm not having much luck finding it.

kdheepak · 2018-09-21T19:45:28Z

This is the JSON file that is read by GridLAB-D. This is where GridLAB-D reads it and uses the information in the file to configure the HELICS GridLAB-D federate. There is a core_init_string keyword that can be added to the JSON. I haven't looked at this in a lot of detail, but it is possible that there are some changes needed in helics_msg.cpp to support the fedinitstring the way we are doing it in the Python example.

activeshadow · 2018-09-21T20:26:27Z

@kdheepak thanks for the pointers. It looks to me like helics_msg.cpp just sets the coreInitString variable on the helics::FederateInfo object to whatever core_init_string is set to in the JSON file. Based on what I see in the helics::FederateInfo.cpp file, I've tried setting the string to some permutations like 1 --broker=tcp://127.0.0.1, etc., but to no avail. I don't see options in the helics::FederateInfo.cpp file for --federates and --interface in the init string like those used in the Python configs. I'm still trying to trace out code but not having much luck.

activeshadow · 2018-09-21T21:06:17Z

Dang... I'm not able to trace the code well enough to see exactly how in the HELICS source code the coreInitString arguments get mapped to the ZeroMQ connection parameters in ZmqComms.cpp.

activeshadow · 2018-09-22T17:14:10Z

For future reference, here's what I'm currently trying for tutorial 1.

1. running 3 docker containers, but all with host networking enabled so each container can access the other two containers on localhost.
2. running the broker with the --ipv4 option.
3. running the federate1.py federate with fedinitstring = "--federates=1 --broker_address=tcp://127.0.0.1 --interface=tcp://127.0.0.1"
4. running gridlab-d with DistributionSim_B2_G_1.json updated with "core_init_string" : "1 --broker_address=tcp://127.0.0.1 --interface=tcp://127.0.0.1"

The Python federate is able to establish two-way communication with the broker, so I know the networking stuff is OK. But the gridlab-d federate still fails to connect to the broker at all.
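For anyone wanting to reproduce step 4, the JSON edit can be scripted. This is a sketch under two assumptions: that the GridLAB-D HELICS config is a JSON object, and that core_init_string belongs at its top level. The helper name `set_core_init_string` is made up for illustration. It writes the same string tried above, which did not get GridLAB-D connected in this attempt, so treat it as a way to repeat the experiment rather than a known-good configuration.

```python
import json
from pathlib import Path


def set_core_init_string(config_path: str, broker_address: str, interface: str) -> dict:
    """Set "core_init_string" in a GridLAB-D HELICS JSON config file.

    Hypothetical helper. Assumes the file holds a JSON object and that
    core_init_string is a top-level key (an assumption, not confirmed here).
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    # Same init string as in step 4: "1 --broker_address=... --interface=..."
    config["core_init_string"] = (
        f"1 --broker_address=tcp://{broker_address} --interface=tcp://{interface}"
    )
    # Rewrite the file in place and return the updated config for inspection.
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

A call such as `set_core_init_string("DistributionSim_B2_G_1.json", "127.0.0.1", "127.0.0.1")` would reproduce the setting from step 4.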

activeshadow · 2018-09-24T12:19:44Z

@afisher1 any chance you have some ideas about what I should use in the GridLAB-D JSON config to get GridLAB-D to talk to a HELICS broker that's running on a different host?

activeshadow mentioned this issue Sep 20, 2018

WIP: Support for running tutorials using Docker #6

Open

activeshadow mentioned this issue Sep 21, 2018

Support for running examples using Docker. GMLC-TDC/HELICS-Examples#6

Merged
