From f51d85a607572a4677c8cdd73561c8da9b7a1160 Mon Sep 17 00:00:00 2001
From: devilran6 <3470826156@qq.com>
Date: Mon, 26 Aug 2024 00:32:34 +0800
Subject: [PATCH] Fix math formula rendering in abstract

---
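Note: MathJax does not recognize single-dollar "$...$" delimiters out of the
box, which is the likely reason the superscript in VC-S^2E failed to render;
the abstract now uses the "\( ... \)" inline delimiters that MathJax 3 enables
by default. A minimal sketch of the page-level setup this change assumes
(MathJax 3 loaded from the jsDelivr CDN; only the MathJax global and the
tex.inlineMath option are the library's own names, the rest is illustrative):

    <script>
      // Declare the config before the MathJax bundle loads.
      // tex.inlineMath lists the delimiter pairs treated as inline math:
      // "\( ... \)" is on by default, "$ ... $" must be opted into.
      window.MathJax = {
        tex: {
          inlineMath: [['\\(', '\\)'], ['$', '$']]
        }
      };
    </script>
    <script async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
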
 index.html | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/index.html b/index.html
index 81117e5..1a57c08 100644
--- a/index.html
+++ b/index.html
@@ -55,23 +55,24 @@ <h2 class="title is-2 has-text-centered">Abstract</h2>
       </div>
       <div class="markdown has-text-centered">
         <div align="left" style="font-size:20px; text-align:justify">
-          Speech enhancement plays an essential role in various applications, and the integration of visual information
-          has been demonstrated to bring substantial advantages.
-          However, existing works mainly focus on the analysis of facial and lip movements, whereas contextual visual
-          cues from the surrounding environment have been overlooked:
-          for example, when we see a dog bark, our brain has the innate ability to discern and filter out the barking
-          noise.
-          To this end, in this paper, we introduce a novel task, i.e. Scene-aware Audio-Visual Speech Enhancement.
-          To our best knowledge, this is the first proposal to use rich contextual information from synchronized video
-          as auxiliary cues to indicate the type of noise,
-          which eventually improves the speech enhancement performance.
-          Specifically, we propose the VC-S$^2$E method, which incorporates the Conformer and Mamba modules for their
-          complementary strengths.
-          Extensive experiments are conducted on public MUSIC, AVSpeech and AudioSet datasets, where the results
-          demonstrate the superiority of VC-S$^2$E over other competitive methods.
-          We will make the source code publicly available.
-          Project demo page: https://AVSEPage.github.io/
-          <br>
+          <p>Speech enhancement plays an essential role in various applications, and the integration
+            of visual information has been demonstrated to bring substantial advantages.
+            However, existing works mainly focus on the analysis of facial and lip movements, whereas
+            contextual visual cues from the surrounding environment have been overlooked:
+            for example, when we see a dog bark, our brain has the innate ability to discern and
+            filter out the barking noise.
+            To this end, in this paper we introduce a novel task, i.e., Scene-aware Audio-Visual
+            Speech Enhancement.
+            To the best of our knowledge, this is the first proposal to use rich contextual information
+            from synchronized video as auxiliary cues to indicate the type of noise,
+            which ultimately improves speech enhancement performance.
+            Specifically, we propose the VC-S\(^{2}\)E method, which incorporates the Conformer
+            and Mamba modules for their complementary strengths.
+            Extensive experiments are conducted on the public MUSIC, AVSpeech, and AudioSet datasets,
+            where the results demonstrate the superiority of VC-S\(^{2}\)E over other competitive
+            methods.
+            We will make the source code publicly available.
+            Project demo page: <a href="https://AVSEPage.github.io/">https://AVSEPage.github.io/</a></p>
         </div>
       </div>
     </div>