Mu 1 nu 1 soap compression (#119)
* Added the m1n1_compression option as an averaging option for the SOAP descriptor (this corresponds to the mu=1, nu=1 compression from Darby et al.). Added the necessary modifications to the SOAP Python wrapper and to ext under soapGeneral and soapGTO. Performed an initial test by generating features for the QM9 dataset using either a polynomial or GTO basis, with successful results. Updated test_soap.py under tests with tests specific to this averaging option. All tests pass except the numerical derivative test, which fails because the new averaging option is not yet implemented in assert_derivatives. Will fix this shortly.

* Updated descriptor.cpp to ensure number of features is calculated correctly when using m1n1 compression. Finished updates to tests. All tests now run successfully.

* m1n1_compression was originally implemented as an alternative form of averaging, but this is likely to confuse the end user. To correct this problem, we have now modified soap.py so that the end user can supply two additional arguments to the class constructor: 'compression', which may be one of 'off', 'agnostic' or 'm1n1', and 'species_weights', which if supplied must be a dict containing the same elements as the species. The extension code has been updated to implement these alternatives. The code compiles correctly but tests have not been conducted. Next step: update and run unit tests.

* Fixed bugs in m1n1 and agnostic compression; numerical derivatives are now calculated correctly with and without averaging. The parallelization test now fails, however; fix this next.

* Fixed the parallelization issue (caused by a typo in the average setting in descriptor_local.py). Updated polynomial basis code for agnostic compression. Added species weighting to unit tests. Updated all unit tests to include m1n1 compression and numerical derivatives for compression with and without averaging. All unit tests pass successfully.

* Ran linter, updated docs.

* Removed a .swp file.

* Updated SOAP descriptor to take compression dict as input. Removed crossover argument since it is now part of the compression dict. Removed species weighting argument since it is now part of the compression dict. Updated extension and test code to follow this new format. Tests passed successfully.

* Ran linter, updated docs.

* Revised nomenclature for compression so that 'agnostic' is now 'mu2' and 'm1n1' is now 'mu1nu1'. Re-ran tests, tests pass successfully. Updated docs.

* Fixed conflicts with main. In particular, changed nomenclature for rcut, nmax and lmax in soap.cpp and soap.h; changed test_exceptions in test_soap; updated objects.inv and searchindex in docs.

* Ran linter, resulting in reformatting for 1 file.
jlparkI authored Aug 2, 2023
1 parent 60dc7f0 commit 47cb331
Showing 27 changed files with 997 additions and 187 deletions.
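
The commit message describes a single `compression` dict that carries both the mode and the optional species weighting. As a rough illustration of the rules stated in the updated docs (the mode must be one of "off", "mu2", "mu1nu1" or "crossover", and a weighting dict must have a key for every species), a minimal validation sketch could look like the following. The helper name `validate_compression` and the error messages are hypothetical and are not taken from the library.

```python
# Hypothetical helper for illustration only; not part of the library.
VALID_MODES = {"off", "mu2", "mu1nu1", "crossover"}


def validate_compression(compression, species):
    """Check a compression dict against the rules stated in the docs."""
    mode = compression.get("mode", "off")
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown compression mode: {mode!r}")

    weights = compression.get("species_weighting")
    if weights is not None:
        if not isinstance(weights, dict):
            raise ValueError("species_weighting must be None or a dict")
        # Every species needs a matching key in the weighting dict.
        missing = [s for s in species if s not in weights]
        if missing:
            raise ValueError(f"No weight given for species: {missing}")
    return mode, weights


mode, weights = validate_compression(
    {"mode": "mu2", "species_weighting": {"H": 1.0, "O": 2.5}},
    species=["H", "O"],
)
```

Keeping both settings in one dict means the mode and its weights are validated together, which is presumably why the separate `crossover` and species weighting arguments were dropped.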
5 changes: 4 additions & 1 deletion docs/2.0.x/_modules/dscribe/descriptors/descriptorlocal.html
@@ -276,7 +276,10 @@ <h1>Source code for dscribe.descriptors.descriptorlocal</h1><div class="highligh
<span class="k">if</span> <span class="n">centers</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="n">n_centers</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">job</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">n_centers</span> <span class="o">=</span> <span class="mi">1</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">average</span> <span class="o">!=</span> <span class="s2">&quot;off&quot;</span> <span class="k">else</span> <span class="nb">len</span><span class="p">(</span><span class="n">centers</span><span class="p">)</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">average</span> <span class="o">==</span> <span class="s2">&quot;off&quot;</span><span class="p">:</span>
<span class="n">n_centers</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">centers</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">n_centers</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">n_indices</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">job</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
<span class="k">return</span> <span class="p">(</span><span class="n">n_centers</span><span class="p">,</span> <span class="n">n_indices</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">n_features</span><span class="p">),</span> <span class="p">(</span><span class="n">n_centers</span><span class="p">,</span> <span class="n">n_features</span><span class="p">)</span>
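
Stripped of the syntax-highlighting markup, the hunk above chooses the number of centers and then builds the derivative and descriptor shapes. A plain-Python sketch of that logic follows. The function name `output_shapes` and its argument list are hypothetical, and the layout of `job` (a sized system at index 0, atom indices at index 2) is an assumption inferred from the diffed code.

```python
def output_shapes(centers, average, job, n_features):
    """Return (derivatives_shape, descriptor_shape) for one job.

    Assumption: job[0] is a sized collection of atoms and job[2] holds
    the indices of the atoms that derivatives are taken with respect to.
    """
    if centers is None:
        # No explicit centers: every atom in the system is a center.
        n_centers = len(job[0])
    elif average == "off":
        n_centers = len(centers)
    else:
        # Any averaging mode collapses the centers into a single row.
        n_centers = 1
    n_indices = len(job[2])
    return (n_centers, n_indices, 3, n_features), (n_centers, n_features)


system = ["H", "H", "O"]
job = (system, None, [0, 1])
d_shape, f_shape = output_shapes(None, "off", job, n_features=10)
```

Any value of `average` other than "off" takes the averaged branch in this sketch.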

165 changes: 138 additions & 27 deletions docs/2.0.x/_modules/dscribe/descriptors/soap.html

Large diffs are not rendered by default.

65 changes: 57 additions & 8 deletions docs/2.0.x/doc/dscribe.descriptors.html
@@ -2191,7 +2191,7 @@ <h2>Submodules<a class="headerlink" href="#submodules" title="Permalink to this
limitations under the License.</p>
<dl class="py class">
<dt class="sig sig-object py" id="dscribe.descriptors.soap.SOAP">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">dscribe.descriptors.soap.</span></span><span class="sig-name descname"><span class="pre">SOAP</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">r_cut</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_max</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">l_max</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">sigma</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1.0</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">rbf</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'gto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">weighting</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">crossover</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">True</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'off'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">species</span></span><span class="o"><span class="pre">=</span></span><span 
class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">periodic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">sparse</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dtype</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'float64'</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/dscribe/descriptors/soap.html#SOAP"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#dscribe.descriptors.soap.SOAP" title="Permalink to this definition"></a></dt>
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">dscribe.descriptors.soap.</span></span><span class="sig-name descname"><span class="pre">SOAP</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">r_cut</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">n_max</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">l_max</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">sigma</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">1.0</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">rbf</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'gto'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">weighting</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">average</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'off'</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">compression</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">{'mode':</span> <span class="pre">'off',</span> <span class="pre">'species_weighting':</span> <span class="pre">None}</span></span></em>, <em class="sig-param"><span 
class="n"><span class="pre">species</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">periodic</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">sparse</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">False</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">dtype</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">'float64'</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/dscribe/descriptors/soap.html#SOAP"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#dscribe.descriptors.soap.SOAP" title="Permalink to this definition"></a></dt>
<dd><p>Bases: <a class="reference internal" href="#dscribe.descriptors.descriptorlocal.DescriptorLocal" title="dscribe.descriptors.descriptorlocal.DescriptorLocal"><code class="xref py py-class docutils literal notranslate"><span class="pre">DescriptorLocal</span></code></a></p>
<p>Class for generating a partial power spectrum from Smooth Overlap of
Atomic Orbitals (SOAP). This implementation uses real (tesseral) spherical
@@ -2283,13 +2283,6 @@ <h2>Submodules<a class="headerlink" href="#submodules" title="Permalink to this
for the central atoms.</p></li>
</ul>
</p></li>
<li><p><strong>crossover</strong> (<a class="reference external" href="https://docs.python.org/3/library/functions.html#bool" title="(in Python v3.11)"><em>bool</em></a>) – Determines if crossover of atomic types should
be included in the power spectrum. If enabled, the power
spectrum is calculated over all unique species combinations Z
and Z’. If disabled, the power spectrum does not contain
cross-species information and is only run over each unique
species Z. Turned on by default to correspond to the original
definition</p></li>
<li><p><strong>average</strong> (<a class="reference external" href="https://docs.python.org/3/library/stdtypes.html#str" title="(in Python v3.11)"><em>str</em></a>) – <p>The averaging mode over the centers of interest.
Valid options are:</p>
<blockquote>
@@ -2300,6 +2293,57 @@ <h2>Submodules<a class="headerlink" href="#submodules" title="Permalink to this
</ul>
</div></blockquote>
</p></li>
<li><p><strong>compression</strong> (<a class="reference external" href="https://docs.python.org/3/library/stdtypes.html#dict" title="(in Python v3.11)"><em>dict</em></a>) – <p>Contains the options which specify the feature compression to apply.
Applying compression can slightly reduce the accuracy of models trained on the feature
representation but can also dramatically reduce the size of the feature vector
and hence the computational cost. Options are:</p>
<blockquote>
<div><ul>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">&quot;mode&quot;</span></code>: Specifies the type of compression. This can be one of:</dt><dd><ul>
<li><p><code class="docutils literal notranslate"><span class="pre">&quot;off&quot;</span></code>: No compression; default.</p></li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">&quot;mu2&quot;</span></code>: The SOAP feature vector is generated in an element-agnostic way, so that</dt><dd><p>the size of the feature vector is independent of the number of elements (see Darby et al.
below for details). With this option it is still possible to construct a feature
vector that distinguishes between elements by supplying element-specific weights under
“species_weighting” (see below).</p>
</dd>
</dl>
</li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">&quot;mu1nu1&quot;</span></code>: Implements the mu=1, nu=1 feature compression scheme from Darby et al.: <span class="math notranslate nohighlight">\(p_{inn'l}^{Z_1} = \sum_m (c_{nlm}^{i, Z_1})^{*} (\sum_z c_{n'lm}^{i, z})\)</span>.</dt><dd><p>In other words, each coefficient for each species is multiplied by a “species-mu2” sum over the corresponding set of coefficients for all species.
If this option is selected, features are generated for each center, but the number of features (the size of each feature vector) scales linearly rather than
quadratically with the number of elements in the system.</p>
</dd>
</dl>
</li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">&quot;crossover&quot;</span></code>: The power spectrum does not contain cross-species information</dt><dd><p>and is only run over each unique species Z. In this configuration, the size of
the feature vector scales linearly with the number of elements in the system.</p>
</dd>
</dl>
</li>
</ul>
</dd>
</dl>
</li>
<li><dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">&quot;species_weighting&quot;</span></code>: Either None or a dictionary mapping each species to a</dt><dd><p>species-specific weight. If None, there is no species-specific weighting. If a dictionary,
it must contain a matching key for each species in the <code class="docutils literal notranslate"><span class="pre">species</span></code> iterable.
The main use of species weighting is to weight each element differently when using
the “mu2” option for <code class="docutils literal notranslate"><span class="pre">compression</span></code>.</p>
</dd>
</dl>
</li>
</ul>
<dl class="simple">
<dt>For reference see:</dt><dd><p>Darby, J.P., Kermode, J.R. &amp; Csányi, G.
Compressing local atomic neighbourhood descriptors.
npj Comput Mater 8, 166 (2022). <a class="reference external" href="https://doi.org/10.1038/s41524-022-00847-y">https://doi.org/10.1038/s41524-022-00847-y</a></p>
</dd>
</dl>
</div></blockquote>
</p></li>
<li><p><strong>species</strong> (<em>iterable</em>) – The chemical species as a list of atomic
numbers or as a list of chemical symbols. Notice that this is not
the atomic numbers that are present for an individual system, but
@@ -2562,6 +2606,11 @@ <h2>Submodules<a class="headerlink" href="#submodules" title="Permalink to this
<em class="property"><span class="pre">property</span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">species</span></span><a class="headerlink" href="#dscribe.descriptors.soap.SOAP.species" title="Permalink to this definition"></a></dt>
<dd></dd></dl>

<dl class="py property">
<dt class="sig sig-object py" id="dscribe.descriptors.soap.SOAP.species_weights">
<em class="property"><span class="pre">property</span><span class="w"> </span></em><span class="sig-name descname"><span class="pre">species_weights</span></span><a class="headerlink" href="#dscribe.descriptors.soap.SOAP.species_weights" title="Permalink to this definition"></a></dt>
<dd></dd></dl>
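
To make the "mu1nu1" contraction described in the compression docs more concrete, here is a small pure-Python sketch that applies the formula to toy coefficients and checks the claimed linear scaling with the number of species. The nested-list layout `coeffs[z][n][l][m]`, the ordering of the output features, and the absence of any prefactors are all assumptions made for illustration, and the function names are hypothetical. Real (tesseral) coefficients are assumed, so the complex conjugate has no effect.

```python
import random


def mu1nu1_features(coeffs):
    """Contract coefficients as sum_m c[z1][n][l][m] * sum_z c[z][n2][l][m].

    coeffs[z][n][l][m] holds the expansion coefficients for one center.
    """
    n_species = len(coeffs)
    n_max = len(coeffs[0])
    n_l = len(coeffs[0][0])  # equals l_max + 1

    features = []
    for z1 in range(n_species):
        for n in range(n_max):
            for n2 in range(n_max):
                for l in range(n_l):
                    total = 0.0
                    for m in range(2 * l + 1):
                        # Species-agnostic sum over all species z.
                        summed = sum(coeffs[z][n2][l][m] for z in range(n_species))
                        total += coeffs[z1][n][l][m] * summed
                    features.append(total)
    return features


def random_coeffs(n_species, n_max, l_max, seed=0):
    """Build toy coefficients; the values carry no physical meaning."""
    rng = random.Random(seed)
    return [
        [
            [[rng.uniform(-1, 1) for _ in range(2 * l + 1)] for l in range(l_max + 1)]
            for _ in range(n_max)
        ]
        for _ in range(n_species)
    ]


n_max, l_max = 2, 1
size_2 = len(mu1nu1_features(random_coeffs(2, n_max, l_max)))
size_4 = len(mu1nu1_features(random_coeffs(4, n_max, l_max)))
```

In this sketch the feature count is `n_species * n_max**2 * (l_max + 1)`, so doubling the number of species doubles the vector length instead of roughly quadrupling it.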

<dl class="py method">
<dt class="sig sig-object py" id="dscribe.descriptors.soap.SOAP.validate_derivatives_method">
<span class="sig-name descname"><span class="pre">validate_derivatives_method</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">method</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">attach</span></span></em><span class="sig-paren">)</span><a class="reference internal" href="../_modules/dscribe/descriptors/soap.html#SOAP.validate_derivatives_method"><span class="viewcode-link"><span class="pre">[source]</span></span></a><a class="headerlink" href="#dscribe.descriptors.soap.SOAP.validate_derivatives_method" title="Permalink to this definition"></a></dt>