Fix typos 2 (#842)
Co-authored-by: Haicheng Wu <[email protected]>
2 people authored and ttl10101 committed Feb 7, 2024
1 parent 749a8d3 commit e123f92
Showing 161 changed files with 310 additions and 309 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -328,7 +328,7 @@ or a subset of kernels for NVIDIA Ampere and Turing architecture:

### Building a subset Tensor Core GEMM kernels

-To compile a subset of Tensor Core GEMM kernels with FP32 accumulation and FP16 input targetting NVIDIA Ampere and Turing architecture,
+To compile a subset of Tensor Core GEMM kernels with FP32 accumulation and FP16 input targeting NVIDIA Ampere and Turing architecture,
use the below cmake command line:
```bash
$ cmake .. -DCUTLASS_NVCC_ARCHS='75;80' -DCUTLASS_LIBRARY_KERNELS=cutlass_tensorop_s*gemm_f16_*_nt_align8
```
@@ -376,7 +376,7 @@ reference_device: Passed

### Building one CUDA Core GEMM kernel

-To compile one SGEMM kernel targetting NVIDIA Ampere and Turing architecture, use the below cmake command line:
+To compile one SGEMM kernel targeting NVIDIA Ampere and Turing architecture, use the below cmake command line:
```bash
$ cmake .. -DCUTLASS_NVCC_ARCHS='75;80' -DCUTLASS_LIBRARY_KERNELS=cutlass_simt_sgemm_128x128_8x2_nn_align1
...
```
@@ -418,7 +418,7 @@ $ ./tools/profiler/cutlass_profiler --kernels=sgemm --m=3456 --n=4096 --k=4096
### Building a subset of Tensor Core Convolution kernels

To compile a subset of Tensor core convolution kernels implementing forward propagation (fprop) with FP32 accumulation
-and FP16 input targetting NVIDIA Ampere and Turing architecture, use the below cmake command line:
+and FP16 input targeting NVIDIA Ampere and Turing architecture, use the below cmake command line:
```bash
$ cmake .. -DCUTLASS_NVCC_ARCHS='75;80' -DCUTLASS_LIBRARY_KERNELS=cutlass_tensorop_s*fprop_optimized_f16
...
```
@@ -466,7 +466,7 @@ reference_device: Passed
### Building one Convolution CUDA kernel

To compile and run one CUDA Core convolution kernel implementing forward propagation (fprop) with F32 accumulation
-and FP32 input targetting NVIDIA Ampere and Turing architecture, use the below cmake command line:
+and FP32 input targeting NVIDIA Ampere and Turing architecture, use the below cmake command line:
```bash
$ cmake .. -DCUTLASS_NVCC_ARCHS='75;80' -DCUTLASS_LIBRARY_KERNELS=cutlass_simt_sfprop_optimized_128x128_8x2_nhwc
...
```
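Taken together, the four README hunks above follow one recipe: configure a filtered kernel set with cmake, build the profiler, then run it on the selected kernels. A minimal sketch of that end-to-end flow, assuming a fresh checkout and the standard cutlass_profiler build target (the cmake flags and profiler invocation are the ones visible in the hunks; the -j level is arbitrary):

```bash
# From the CUTLASS source root: configure a filtered kernel set,
# build the profiler, then profile one of the selected kernels.
mkdir -p build && cd build
cmake .. -DCUTLASS_NVCC_ARCHS='75;80' \
         -DCUTLASS_LIBRARY_KERNELS=cutlass_simt_sgemm_128x128_8x2_nn_align1
make cutlass_profiler -j16
./tools/profiler/cutlass_profiler --kernels=sgemm --m=3456 --n=4096 --k=4096
```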
10 changes: 5 additions & 5 deletions docs/annotated.html

Large diffs are not rendered by default.

@@ -108,7 +108,7 @@
</div><!--header-->
<div class="contents">

-<p>Parital specialization for column-major output exchanges problem size and operand.
+<p>Partial specialization for column-major output exchanges problem size and operand.
</p>

<p><code>#include &lt;<a class="el" href="device_2gemm__batched_8h_source.html">gemm_batched.h</a>&gt;</code></p>
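The sentence fixed here (and in several hunks below) describes a standard GEMM transpose trick, worth spelling out once: because

$D = A B \iff D^{\mathsf{T}} = B^{\mathsf{T}} A^{\mathsf{T}}$,

a kernel that produces row-major output can serve the column-major case by exchanging the A and B operands (with transposed layouts) and swapping M and N in the problem size, which is exactly what "exchanges problem size and operand" refers to.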
@@ -108,7 +108,7 @@
</div><!--header-->
<div class="contents">

-<p>Parital specialization for column-major output exchanges problem size and operand.
+<p>Partial specialization for column-major output exchanges problem size and operand.
</p>

<p><code>#include &lt;<a class="el" href="include_2cutlass_2gemm_2device_2gemm__complex_8h_source.html">gemm_complex.h</a>&gt;</code></p>
@@ -108,7 +108,7 @@
</div><!--header-->
<div class="contents">

-<p>Parital specialization for column-major output exchanges problem size and operand.
+<p>Partial specialization for column-major output exchanges problem size and operand.
</p>

<p><code>#include &lt;<a class="el" href="include_2cutlass_2gemm_2device_2gemm_8h_source.html">gemm.h</a>&gt;</code></p>
4 changes: 2 additions & 2 deletions docs/command__line_8h_source.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/device_2gemm__batched_8h.html
@@ -130,7 +130,7 @@
<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Argument structure. <a href="structcutlass_1_1gemm_1_1device_1_1GemmBatched_1_1Arguments.html#details">More...</a><br /></td></tr>
<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
<tr class="memitem:"><td class="memItemLeft" align="right" valign="top">class &#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="classcutlass_1_1gemm_1_1device_1_1GemmBatched_3_01ElementA___00_01LayoutA___00_01ElementB___00_0c9bb6f4463ab6085e6008b5d5ad6abfd.html">cutlass::gemm::device::GemmBatched&lt; ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, layout::ColumnMajor, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ThreadblockSwizzle_, Stages, AlignmentA, AlignmentB, Operator_ &gt;</a></td></tr>
<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Parital specialization for column-major output exchanges problem size and operand. <a href="classcutlass_1_1gemm_1_1device_1_1GemmBatched_3_01ElementA___00_01LayoutA___00_01ElementB___00_0c9bb6f4463ab6085e6008b5d5ad6abfd.html#details">More...</a><br /></td></tr>
<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Partial specialization for column-major output exchanges problem size and operand. <a href="classcutlass_1_1gemm_1_1device_1_1GemmBatched_3_01ElementA___00_01LayoutA___00_01ElementB___00_0c9bb6f4463ab6085e6008b5d5ad6abfd.html#details">More...</a><br /></td></tr>
<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
<tr class="memitem:"><td class="memItemLeft" align="right" valign="top">struct &#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="structcutlass_1_1gemm_1_1device_1_1GemmBatched_3_01ElementA___00_01LayoutA___00_01ElementB___00_213d78696663f4231cd52c6a277c60e5.html">cutlass::gemm::device::GemmBatched&lt; ElementA_, LayoutA_, ElementB_, LayoutB_, ElementC_, layout::ColumnMajor, ElementAccumulator_, OperatorClass_, ArchTag_, ThreadblockShape_, WarpShape_, InstructionShape_, EpilogueOutputOp_, ThreadblockSwizzle_, Stages, AlignmentA, AlignmentB, Operator_ &gt;::Arguments</a></td></tr>
<tr class="memdesc:"><td class="mdescLeft">&#160;</td><td class="mdescRight">Argument structure. <a href="structcutlass_1_1gemm_1_1device_1_1GemmBatched_3_01ElementA___00_01LayoutA___00_01ElementB___00_213d78696663f4231cd52c6a277c60e5.html#details">More...</a><br /></td></tr>
2 changes: 1 addition & 1 deletion docs/device_2kernel_2tensor__foreach_8h_source.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/device_2tensor__fill_8h.html
@@ -237,7 +237,7 @@
<tr class="separator:a6e23d479ebb3760d5846ed1b67e450e4"><td class="memSeparator" colspan="2">&#160;</td></tr>
<tr class="memitem:a6b0f21995c4fd5c33617550e6905c78e"><td class="memTemplParams" colspan="2">template&lt;typename Element , typename Layout &gt; </td></tr>
<tr class="memitem:a6b0f21995c4fd5c33617550e6905c78e"><td class="memTemplItemLeft" align="right" valign="top">void&#160;</td><td class="memTemplItemRight" valign="bottom"><a class="el" href="namespacecutlass_1_1reference_1_1device.html#a6b0f21995c4fd5c33617550e6905c78e">cutlass::reference::device::TensorFillIdentity</a> (TensorView&lt; Element, Layout &gt; view)</td></tr>
<tr class="memdesc:a6b0f21995c4fd5c33617550e6905c78e"><td class="mdescLeft">&#160;</td><td class="mdescRight">Fills a tensor's digonal with 1 and 0 everywhere else. <a href="namespacecutlass_1_1reference_1_1device.html#a6b0f21995c4fd5c33617550e6905c78e">More...</a><br /></td></tr>
<tr class="memdesc:a6b0f21995c4fd5c33617550e6905c78e"><td class="mdescLeft">&#160;</td><td class="mdescRight">Fills a tensor's diagonal with 1 and 0 everywhere else. <a href="namespacecutlass_1_1reference_1_1device.html#a6b0f21995c4fd5c33617550e6905c78e">More...</a><br /></td></tr>
<tr class="separator:a6b0f21995c4fd5c33617550e6905c78e"><td class="memSeparator" colspan="2">&#160;</td></tr>
<tr class="memitem:aaff3d7919a2f2dce14eb254c17eead9a"><td class="memTemplParams" colspan="2">template&lt;typename Element , typename Layout &gt; </td></tr>
<tr class="memitem:aaff3d7919a2f2dce14eb254c17eead9a"><td class="memTemplItemLeft" align="right" valign="top">void&#160;</td><td class="memTemplItemRight" valign="bottom"><a class="el" href="namespacecutlass_1_1reference_1_1device.html#aaff3d7919a2f2dce14eb254c17eead9a">cutlass::reference::device::TensorUpdateDiagonal</a> (TensorView&lt; Element, Layout &gt; view, Element diag=Element(1))</td></tr>
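(For reference, the corrected description of TensorFillIdentity matches the usual identity-fill semantics: the filled tensor $A$ satisfies $A_{ij} = \delta_{ij}$, that is, $1$ on the diagonal and $0$ everywhere else.)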
2 changes: 1 addition & 1 deletion docs/device_2tensor__fill_8h_source.html
@@ -125,7 +125,7 @@
<div class="ttc" id="structcutlass_1_1reference_1_1device_1_1detail_1_1RandomGaussianFunc_1_1Params_html"><div class="ttname"><a href="structcutlass_1_1reference_1_1device_1_1detail_1_1RandomGaussianFunc_1_1Params.html">cutlass::reference::device::detail::RandomGaussianFunc::Params</a></div><div class="ttdoc">Parameters structure. </div><div class="ttdef"><b>Definition:</b> device/tensor_fill.h:99</div></div>
<div class="ttc" id="structcutlass_1_1Distribution_html_a07cb089b346ef06e198f6043128264fb"><div class="ttname"><a href="structcutlass_1_1Distribution.html#a07cb089b346ef06e198f6043128264fb">cutlass::Distribution::kind</a></div><div class="ttdeci">Kind kind</div><div class="ttdoc">Active variant kind. </div><div class="ttdef"><b>Definition:</b> distribution.h:64</div></div>
<div class="ttc" id="structcutlass_1_1reference_1_1device_1_1detail_1_1TensorFillRandomUniformFunc_1_1Params_html_a267e7ea4e77076cc9be7d639b3cef64d"><div class="ttname"><a href="structcutlass_1_1reference_1_1device_1_1detail_1_1TensorFillRandomUniformFunc_1_1Params.html#a267e7ea4e77076cc9be7d639b3cef64d">cutlass::reference::device::detail::TensorFillRandomUniformFunc::Params::Params</a></div><div class="ttdeci">Params(TensorView view_=TensorView(), typename RandomFunc::Params random_=RandomFunc::Params())</div><div class="ttdoc">Construction of Gaussian RNG functor. </div><div class="ttdef"><b>Definition:</b> device/tensor_fill.h:422</div></div>
<div class="ttc" id="namespacecutlass_1_1reference_1_1device_html_a6b0f21995c4fd5c33617550e6905c78e"><div class="ttname"><a href="namespacecutlass_1_1reference_1_1device.html#a6b0f21995c4fd5c33617550e6905c78e">cutlass::reference::device::TensorFillIdentity</a></div><div class="ttdeci">void TensorFillIdentity(TensorView&lt; Element, Layout &gt; view)</div><div class="ttdoc">Fills a tensor&amp;#39;s digonal with 1 and 0 everywhere else. </div><div class="ttdef"><b>Definition:</b> device/tensor_fill.h:630</div></div>
<div class="ttc" id="namespacecutlass_1_1reference_1_1device_html_a6b0f21995c4fd5c33617550e6905c78e"><div class="ttname"><a href="namespacecutlass_1_1reference_1_1device.html#a6b0f21995c4fd5c33617550e6905c78e">cutlass::reference::device::TensorFillIdentity</a></div><div class="ttdeci">void TensorFillIdentity(TensorView&lt; Element, Layout &gt; view)</div><div class="ttdoc">Fills a tensor&amp;#39;s diagonal with 1 and 0 everywhere else. </div><div class="ttdef"><b>Definition:</b> device/tensor_fill.h:630</div></div>
<div class="ttc" id="classcutlass_1_1TensorView_html_a7d3914dd5042c9c40be9e21a7b4e9ece"><div class="ttname"><a href="classcutlass_1_1TensorView.html#a7d3914dd5042c9c40be9e21a7b4e9ece">cutlass::TensorView::extent</a></div><div class="ttdeci">CUTLASS_HOST_DEVICE TensorCoord const &amp; extent() const </div><div class="ttdoc">Returns the extent of the view (the size along each logical dimension). </div><div class="ttdef"><b>Definition:</b> tensor_view.h:167</div></div>
<div class="ttc" id="structcutlass_1_1reference_1_1device_1_1detail_1_1TensorUpdateDiagonalFunc_html"><div class="ttname"><a href="structcutlass_1_1reference_1_1device_1_1detail_1_1TensorUpdateDiagonalFunc.html">cutlass::reference::device::detail::TensorUpdateDiagonalFunc</a></div><div class="ttdoc">Computes a random Gaussian distribution. </div><div class="ttdef"><b>Definition:</b> device/tensor_fill.h:645</div></div>
<div class="ttc" id="structcutlass_1_1reference_1_1device_1_1detail_1_1RandomUniformFunc_1_1Params_html_afe8637b103e25ec2e9b731389fa049be"><div class="ttname"><a href="structcutlass_1_1reference_1_1device_1_1detail_1_1RandomUniformFunc_1_1Params.html#afe8637b103e25ec2e9b731389fa049be">cutlass::reference::device::detail::RandomUniformFunc::Params::int_scale</a></div><div class="ttdeci">int int_scale</div><div class="ttdef"><b>Definition:</b> device/tensor_fill.h:315</div></div>
2 changes: 1 addition & 1 deletion docs/device_2tensor__foreach_8h_source.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/functions_func_s.html
@@ -141,7 +141,7 @@ <h3><a class="anchor" id="index_s"></a>- s -</h3><ul>
<li>Semaphore()
: <a class="el" href="classcutlass_1_1Semaphore.html#a2ce4cd07fe773efa429f726cfbd98070">cutlass::Semaphore</a>
</li>
-<li>seperate_string()
+<li>separate_string()
: <a class="el" href="structcutlass_1_1CommandLine.html#a5f86e4b2bd8c44b739c83530d77c5590">cutlass::CommandLine</a>
</li>
<li>set()
2 changes: 1 addition & 1 deletion docs/functions_s.html
@@ -172,7 +172,7 @@ <h3><a class="anchor" id="index_s"></a>- s -</h3><ul>
<li>Semaphore()
: <a class="el" href="classcutlass_1_1Semaphore.html#a2ce4cd07fe773efa429f726cfbd98070">cutlass::Semaphore</a>
</li>
-<li>seperate_string()
+<li>separate_string()
: <a class="el" href="structcutlass_1_1CommandLine.html#a5f86e4b2bd8c44b739c83530d77c5590">cutlass::CommandLine</a>
</li>
<li>sequential