Commit

update
devilran6 committed Aug 28, 2024
1 parent 7e0c192 commit b49636a
Showing 1 changed file with 14 additions and 11 deletions.
25 changes: 14 additions & 11 deletions index.html
@@ -206,18 +206,17 @@ <h2 class="title is-2 has-text-centered">Abstract</h2>
 <!-- Modified here: added the left-align class -->
 <div class="markdown left-align">
 <ul>
-<li><b>Noisy</b>: Displays the spectrogram of audio generated by mixing clean speech and noise audio.</li>
-<li><b>Clean</b>: Clean speech, serving as the source for mixing noisy audio and as the ground truth for
-comparison after training.</li>
+<li><b>Noisy Video</b>: The video source that contains noise and serves as the second modality input.</li>
+<li><b>Grad-CAM Image</b>: Displays the middle frame of the video overlaid with the Grad-CAM heatmap, highlighting
+key noise areas identified by the video encoder.</li>
+<li><b>Noisy Speech</b>: Shows the spectrogram of audio generated by mixing clean speech with noise.</li>
+<li><b>Clean Speech</b>: Shows the spectrogram of audio from clean speech.</li>
+<li><b>Conformer</b>: A hybrid model that combines convolutional neural networks and transformers, designed for
+speech recognition and related tasks.</li>
 <li><b>
-<font color=#0000FF>VC-S<sup>2</sup>E(Our)</font>
-</b>: We propose a new model that leverages audio-visual modalities to improve speech quality and
+<font color=#0000FF>VC-S<sup>2</sup>E (Ours)</font>
+</b>: We propose a novel model that leverages audio-visual modalities to enhance speech quality and
 intelligibility.</li>
-<li><b>Conformer</b>: A hybrid model that combines convolutional neural networks and transformers, specifically
-designed for speech recognition and related tasks.</li>
-<li><b>Noisy video</b>: The video source of noise, used as our second modality input source.</li>
-<li><b>Gradcam</b>: Displays the middle frame of the video with the corresponding Grad-CAM heatmap, highlighting
-key noise areas identified by the video encoder.</li>
 </ul>
 </div>

@@ -235,7 +234,7 @@ <h2 class="title is-2 has-text-centered">Demos</h2>
 </video>
 </div>
 <div class="media-item">
-<div class="title-item">Gradcam Image</div>
+<div class="title-item">Grad-CAM Image</div>
 <img src="gradcam/0.jpg" alt="0">
 </div>
 <div class="media-item">
@@ -421,6 +420,9 @@ <h2 class="title is-2 has-text-centered">Demos</h2>
 </div>
 </div>

+<div class="text-below-image">
+<p>These rows display the spectrograms of different audio samples.</p>
+</div>
 <!-- Image Switcher Section -->
 <div class="image-row" id="imageRow1">
 <img src="noisy_speech/0_spectrum.png" alt="0">
@@ -434,6 +436,7 @@ <h2 class="title is-2 has-text-centered">Demos</h2>
 <img src="noisy_speech/5_spectrum.png" alt="5">
 </div>

+
 <div class="buttons">
 <button onclick="switchImages('noisy_speech', 'Noisy speech')">Noisy speech</button>
 <button onclick="switchImages('clean_speech', 'Clean speech')">Clean speech</button>