Merge branch 'release/1.1.0'

brettc · Nov 17, 2015 · 8596c23
2 parents 650bc63 + 50ba6d3
Showing 37 changed files with 1,931 additions and 1,835 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,6 @@
 .ipynb_checkpoints/*
 *.pyc
 .idea
+build
+causalinfo.egg-info
+notebooks/.ipynb_checkpoints
diff --git a/HISTORY.rst b/HISTORY.rst
@@ -0,0 +1,22 @@
+.. :changelog:
+
+History
+-------
+
+1.1.0 (2015-11-17)
+~~~~~~~~~~~~~~~~~~~~~
+
+* Variables are now simply placeholders. All the action occurs in Distributions.
+* Ability to query a distribution using a string. 
+* Better introduction in ``README.rst``
+* Removed confusing examples folder.
+* Renaming and removal of various Notebooks (more to come)
+* Addition of tools for releasing and testing more easily.
+* TESTING: Initial test of Payoffs 
+
+1.0.0 (2015-10-15)
+~~~~~~~~~~~~~~~~~~~~~
+
+* Initial public release
+
+
diff --git a/README.rst b/README.rst
@@ -1,8 +1,118 @@
-===============================================
-causalinfo: flow & specificity in causal graphs 
-===============================================
+============================================
+``causalinfo``: Information on Causal Graphs 
+============================================
 
-A Python library for experimenting with information measures on causal graphs.
+.. image:: https://badge.fury.io/py/causalinfo.png
+    :target: http://badge.fury.io/py/causalinfo
 
+`causalinfo` is a Python library to aid in experimenting with different
+*information measures on causal graphs*---a combination of information
+theory with recent work on causal graphs [Pearl2000]_. These information
+measures can be used to ascertain the degree to which one variable controls or
+explains other variables in the graph. The use of these measures has important
+connections to work on causal explanation in philosophy of science, and to
+understanding information processing in biological networks. 
 
+The library is a work in progress, and will be extended as research continues.
 
+What does it do?
+----------------
+
+`causalinfo` has been written primarily for interactive use within `IPython
+Notebook`_. You can create variables and assign probability distributions to
+them, or relate them to other variables using conditional probabilities.
+Several related variables can be combined into a directed acyclic graph, which
+can generate a joint distribution for all variables under observation, or
+under controlled interventions on certain variables. You can also calculate
+various information measures between variables in the graph whilst controlling
+other variables. These include correlative measures, such as Mutual
+Information, but also causal measures, such as Information Flow
+[AyPolani2008]_, and Causal Specificity [GriffithsEtAl2015]_.
+
+For some examples of how to use the library, please see the IPython Notebooks
+that are included:
+
+* Introduction_. A short introduction to some of the things you can do with
+  the library.
+
+* Rain_. Performing some interventions on a causal graph from Pearl's book.
+
+.. TODO: Add the signaling stuff in.
+.. * Signaling_. Looking at the measures of multiple pathways.
+
+.. _Introduction: https://github.com/brettc/causalinfo/blob/master/notebooks/introduction.ipynb
+
+.. _Rain: https://github.com/brettc/causalinfo/blob/master/notebooks/rain.ipynb
+
+.. Signaling: https://github.com/brettc/causalinfo/blob/master/notebooks/signaling.ipynb
+
+
+.. TODO: Add a getting started guide
+.. Getting Started
+    ---------------
+    .. code:: bash 
+    pip install causalinfo
+    curl https://raw.githubusercontent.com/brettc/causalinfo/master/notebooks/introduction.ipynb 
+
+Some Caveats
+------------
+
+The library is not meant for large-scale analysis. The code has been written
+to offload as much as possible onto other libraries (such as Pandas_ and
+Networkx_), and to allow easy inspection of what is going on within `IPython
+Notebook`_, thus it is not optimized for speed. Calculating the joint
+distribution for a causal graph with many variables can become very *slow*
+(especially if the variables have many states). 
+
+
+Authorship
+----------
+
+All code was written by `Brett Calcott`_.
+
+
+Acknowledgments
+---------------
+
+This work is part of the research project on the `Causal Foundations of
+Biological Information`_ at the `University of Sydney`_, Australia. The work
+was made possible through the support of a grant from the Templeton World
+Charity Foundation. The opinions expressed are those of the author and do not
+necessarily reflect the views of the Templeton World Charity Foundation. 
+
+License
+-------
+
+MIT licensed. See the bundled LICENSE_ file for more details.
+
+
+.. Miscellaneous Links------------
+
+.. _LICENSE: https://github.com/brettc/causalinfo/blob/master/LICENSE
+
+.. _`Brett Calcott`: http://brettcalcott.com
+
+.. _`University of Sydney`: http://sydney.edu.au/ 
+
+.. _`IPython Notebook`: http://ipython.org/notebook.html 
+
+.. _Pandas: http://pandas.pydata.org/
+
+.. _Networkx: https://networkx.github.io/ 
+
+.. _`Causal Foundations of Biological Information`: http://sydney.edu.au/foundations_of_science/research/causal_foundations_biological_information.shtml 
+
+
+References
+----------
+
+.. [AyPolani2008] Ay, N., & Polani, D. (2008). Information flows in causal
+    networks. Advances in Complex Systems, 11(01), 17–41.
+
+.. [GriffithsEtAl2015] Griffiths, P. E., Pocheville, A., Calcott, B., Stotz, K., 
+    Kim, H., & Knight, R. (2015). Measuring Causal Specificity. Philosophy of Science, 82(October), 529–555.
+
+.. [Pearl2000] Pearl, J. (2000). Causality. Cambridge University Press. 
+
+
+.. vim: fo=tcroqn tw=78
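The new README names Mutual Information as one of the correlative measures the library can calculate. As a rough illustration of that quantity only, the sketch below computes it in plain Python for a small joint distribution. It does not use the `causalinfo` API; the function name `mutual_information` and the `{(x, y): p}` dictionary layout are assumptions made for this example.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution given as {(x, y): p}."""
    # Marginalise the joint distribution over each variable.
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p

    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over non-zero entries.
    mi = 0.0
    for (x, y), p in joint.items():
        if p > 0.0:
            mi += p * math.log2(p / (px[x] * py[y]))
    return mi


# Y is an exact copy of a fair binary X.
copied = {(0, 0): 0.5, (1, 1): 0.5}
# X and Y are independent fair coins.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(mutual_information(copied))
print(mutual_information(independent))
```

A perfect copy of a fair binary variable carries one bit, while independent variables carry none. The causal measures cited in the README (Information Flow, Causal Specificity) differ in that they are computed under interventions rather than on the observed joint distribution.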
diff --git a/causalinfo/__init__.py b/causalinfo/__init__.py
@@ -1,25 +1,21 @@
+from . import equations
+from .graph import CausalGraph, Equation
+from .measure import MeasureCause, MeasureSuccess
+from .payoff import PayoffMatrix
 from .probability import (
-    vs,
+    NS,
     Variable,
     make_variables,
     UniformDist,
     JointDist,
     JointDistByState
 )
 
-from .network import CausalGraph, Equation
-
-from .measure import MeasureCause, MeasureSuccess
-
-from .payoff import PayoffMatrix
-
-import equations
-
-__version__ = "1.0.0"
+__version__ = "1.1.0"
 
 __title__ = "causalinfo"
 __description__ = "Information Measures on Causal Graphs."
-__uri__ = "http://github/brettc/causalinfo/"
+__uri__ = "https://github.com/brettc/causalinfo/"
 
 __author__ = "Brett Calcott"
 __email__ = "[email protected]"
@@ -30,7 +26,7 @@
 __all__ = [
     "CausalGraph",
     "Equation",
-    "vs",
+    "NS",
     "Variable",
     "make_variables",
     "UniformDist",

diff --git a/causalinfo/equations.py b/causalinfo/equations.py
@@ -13,43 +13,48 @@
 """
 
 
-def f_same(i, o):
+def same_(i, o):
     o[i] = 1.0
 
 
-def f_rotate_right(i, o):
+def rotate_right_(i, o):
     ii = (i + 1) % len(o)
     o[ii] = 1.0
 
 
-def f_xnor(i1, i2, o):
+def xnor_(i1, i2, o):
     if i1 == i2:
         o[1] = 1.0
     else:
         o[0] = 1.0
 
 
-def f_xor(i1, i2, o):
+def xor_(i1, i2, o):
     if (i1 or i2) and not (i1 and i2):
         o[1] = 1.0
     else:
         o[0] = 1.0
 
 
-def f_and(i1, i2, o):
+def and_(i1, i2, o):
     if i1 and i2:
         o[1] = 1.0
     else:
         o[0] = 1.0
 
+def anotb_(i1, i2, o):
+    if i1 and not i2:
+        o[1] = 1.0
+    else:
+        o[0] = 1.0
 
-def f_or(i1, i2, o):
+def or_(i1, i2, o):
     if i1 or i2:
         o[1] = 1.0
     else:
         o[0] = 1.0
 
 
-def f_branch_same(i, o1, o2):
+def branch_same_(i, o1, o2):
     o1[i] = 1.0
     o2[i] = 1.0
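The renamed functions in `equations.py` share one calling convention: they receive the input states followed by one output array per output variable, and mark the resulting state by setting its entry to `1.0`. The sketch below shows how such a function could be tabulated over every input combination, in the spirit of the lookup table that `Equation` builds. `build_lookup` is a hypothetical helper written for this example, not part of the library, and it assumes each output starts as a zero-filled list.

```python
import itertools


def xor_(i1, i2, o):
    # Same body as in the diff: mark state 1 when exactly one input is set.
    if (i1 or i2) and not (i1 and i2):
        o[1] = 1.0
    else:
        o[0] = 1.0


def rotate_right_(i, o):
    # Same body as in the diff: map state i to the next state, wrapping around.
    ii = (i + 1) % len(o)
    o[ii] = 1.0


def build_lookup(strategy_func, n_inputs, n_states=2):
    """Hypothetical helper: tabulate a single-output strategy function.

    Every variable is assumed to have `n_states` states.
    """
    lookup = {}
    for states in itertools.product(range(n_states), repeat=n_inputs):
        out = [0.0] * n_states  # assumed zero-filled output distribution
        strategy_func(*states, out)
        lookup[states] = out
    return lookup


table = build_lookup(xor_, n_inputs=2)
for states in sorted(table):
    print(states, table[states])

rotation = build_lookup(rotate_right_, n_inputs=1, n_states=3)
print(rotation)
```

Because the table captures the full mapping, the function itself is no longer needed once the table exists.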
diff --git a/causalinfo/network.py b/causalinfo/graph.py
@@ -8,9 +8,21 @@
 
 
 class Equation(object):
-    """A Equation maps 1+ input variables to 1+ output variables"""
+    """Maps input variable(s) to output variable(s)"""
+
+    INPUT_LABEL = 'Input'
+    OUTPUT_LABEL = 'Output'
 
     def __init__(self, name, inputs, outputs, strategy_func):
+        """Use the strategy_func to map inputs to outputs.
+
+        Args:
+            name (str): Identifying name of equation.
+            inputs (List[Variable]): Variables to map from.
+            outputs (List[Variable]): Variables to map to.
+            strategy_func (function): Mapping function.
+
+        """
         assert str(name) == name
         assert not [i for i in inputs if not isinstance(i, Variable)]
         assert not [o for o in outputs if not isinstance(o, Variable)]
@@ -30,7 +42,7 @@ def __init__(self, name, inputs, outputs, strategy_func):
                      dtype=float) for o in outputs]
 
         # Create a lookup table based on the strategy function. Then we can
-        # discard the function.
+        # discard the function (very useful if we're interested in pickling).
         self.lookup = {}
 
         for i, states in enumerate(input_states):
@@ -68,7 +80,7 @@ def calculate(self, assignments):
         return dict(zip(self.outputs, results))
 
     def __repr__(self):
-        return "Equation<{}>".format(self.name)
+        return "<{}>".format(self.name)
 
     def to_frame(self):
         """Output the mapping equation in a nice way
@@ -80,7 +92,7 @@ def to_frame(self):
         """
         # Create a set of dictionaries/lists for each column
         data = dict([(i_var.name, []) for i_var in self.inputs])
-        data.update({'Output': [], 'State': [], self.name: []})
+        data.update({self.OUTPUT_LABEL: [], self.INPUT_LABEL: [], self.name: []})
 
         # A very ugly loop to produce all the probabilities in a nice way.
         # Note that this just reproduces what is already in `self.lookup`.
@@ -90,15 +102,15 @@ def to_frame(self):
                 for o_state, o_p in enumerate(results[i_index]):
                     for i_var, s in zip(self.inputs, i_state):
                         data[i_var.name].append(s)
-                    data['Output'].append(o_var.name)
-                    data['State'].append(o_state)
+                    data[self.OUTPUT_LABEL].append(o_var.name)
+                    data[self.INPUT_LABEL].append(o_state)
                     data[self.name].append(o_p)
         all_data = pd.DataFrame(data=data)
 
         # The magnificent pivot table function does all the work
         return pd.pivot_table(data=all_data, values=[self.name],
                               index=[i_var.name for i_var in self.inputs],
-                              columns=['Output', 'State'])
+                              columns=[self.OUTPUT_LABEL, self.INPUT_LABEL])
 
     def _repr_html_(self):
         # noinspection PyProtectedMember
@@ -109,10 +121,17 @@ class CausalGraph(object):
     """A Causal graph built using a set of equations relating variables"""
 
     def __init__(self, equations):
-        assert not [not_p for not_p in equations
-                    if not isinstance(not_p, Equation)]
-
+        # Everything must be an Equation, and names must be unique
         self.equations = equations
+        self.equations_by_name = {}
+
+        for eq in equations:
+            if not isinstance(eq, Equation):
+                raise RuntimeError("Non Equation found.")
+
+            if eq.name in self.equations_by_name:
+                raise RuntimeError("Equation names must be unique within a graph")
+            self.equations_by_name[eq.name] = eq
 
         # Make a network from this. The first is the full network of both
         # equations and variables (a bipartite graph). The second is just the
@@ -154,7 +173,10 @@ def __init__(self, equations):
         self.ordered_nodes = nx.topological_sort(self.full_network)
 
         self.graphviz_prettify(self.full_network)
-        # self.graphviz_prettify(self.causal_network)
+        self.graphviz_prettify(self.causal_network)
+
+    def get_equation(self, name):
+        return self.equations_by_name[name]
 
     def graphviz_prettify(self, network):
         """This just makes things pretty for graphviz output."""
@@ -164,9 +186,10 @@ def graphviz_prettify(self, network):
         }
         network.graph.update(graph_settings)
 
-        for n in self.ordered_nodes:
-            network.node[n]['label'] = n.name
-            if isinstance(n, Equation):
+        for n in network.nodes_iter():
+            if isinstance(n, Variable):
+                network.node[n]['label'] = n.name
+            elif isinstance(n, Equation):
                 network.node[n]['shape'] = 'diamond'
 
     def generate_joint(self, root_dist, do_dist=None):