add seminar: 7

Trustworthy-Software · Dec 21, 2023 · 9bfcf41
1 parent e91748d
commit 9bfcf41

Showing 2 changed files with 46 additions and 18 deletions.
diff --git a/img/Xin-Cheng_Wen.jpg b/img/Xin-Cheng_Wen.jpg
diff --git a/index.html b/index.html
@@ -177,23 +177,25 @@ <h2>Upcoming Seminars<h2>
 
 					<div class="event">
 						<div class="presenter-details">
-							<img src="img/ZhongLi_nanjing_university.jpg">
+							<img src="img/Xin-Cheng_Wen.jpg"> 
 							<h5> Zhong Li </h5>
-							<p> Nanjing University </p>
+							<p> HIT </p>
 						</div>	
 						<div class="event-info">
-							<h3>Robust Learning of Deep Predictive Models from Noisy and Imbalanced Software Engineering Datasets</h3>
-								<p> With the rapid development of Deep Learning, deep predictive models have been widely applied to improve Software Engineering tasks, such as defect prediction and issue 
-								classification, and have achieved remarkable success. They are mostly trained in a supervised manner, which heavily relies on high-quality datasets. Unfortunately, due to 
-								the nature and source of software engineering data, the real-world datasets often suffer from the issues of sample mislabelling and class imbalance, thus undermining the 
-								effectiveness of deep predictive models in practice. This problem has become a major obstacle for deep learning-based Software Engineering. In this paper, we propose 
-								RobustTrainer, the first approach to learning deep predictive models on raw training datasets where the mislabelled samples and the imbalanced classes coexist. 
-								RobustTrainer consists of a two-stage training scheme, where the first learns feature representations robust to sample mislabelling and the second builds a classifier robust 
-								to class imbalance based on the learned representations in the first stage. We apply RobustTrainer to two popular Software Engineering tasks, i.e., Bug Report Classification 
-								and Software Defect Prediction. Evaluation results show that RobustTrainer effectively tackles the mislabelling and class imbalance issues and produces significantly better 
-								deep predictive models compared to the other six comparison approaches. </p>
+							<h3>When Less is Enough: Positive and Unlabeled Learning Model for Vulnerability Detection</h3>
+								<p> Automated code vulnerability detection has gained increasing attention in recent years. The deep learning (DL)-based methods, which implicitly learn vulnerable code patterns, have proven 
+								effective in vulnerability detection. The performance of DL-based methods usually relies on the quantity and quality of labeled data. However, the current labeled data are generally automatically 
+								collected, such as crawled from human-generated commits, making it hard to ensure the quality of the labels. Prior studies have demonstrated that the non-vulnerable code (i.e., negative labels) 
+								tends to be unreliable in commonly-used datasets, while vulnerable code (i.e., positive labels) is more determined. Considering the large numbers of unlabeled data in practice, it is necessary and 
+								worth exploring to leverage the positive data and large numbers of unlabeled data for more accurate vulnerability detection. In this paper, we focus on the Positive and Unlabeled (PU) learning problem
+								for vulnerability detection and propose a novel model named PILOT, i.e., Positive and unlabeled Learning mOdel for vulnerability deTection. PILOT only learns from positive and unlabeled data for 
+								vulnerability detection. It mainly contains two modules: (1) A distance-aware label selection module, aiming at generating pseudo-labels for selected unlabeled data, which involves the inter-class 
+								distance prototype and progressive fine-tuning; (2) A mixed-supervision representation learning module to further alleviate the influence of noise and enhance the discrimination of representations. 
+								The experimental results show that PILOT outperforms 
+								the popular weakly supervised methods by 2.78%-18.93% in the PU learning setting. Compared with the state-of-the-art methods, PILOT also improves the performance of 1.34%-12.46 % in F1 score metrics in 
+								the supervised setting. In addition, PILOT can identify 23 mislabeled from the FFMPeg+Qemu dataset in the PU learning setting based on manual checking. </p>
 
-							<p><b><span class="black-underligned">Presentation Date:</span></b> <span class="black"><strong>Monday, December 18, 2023 at 10:30 AM CET</strong></span></p>
+							<p><b><span class="black-underligned">Presentation Date:</span></b> <span class="black"><strong>Monday, January 29, 2024 at 10:30 AM CET</strong></span></p>
 
 						</div>	
 					</div>
@@ -220,11 +222,40 @@ <h2>Past Seminars<h2>
 
 
 						    <li>
+
+							    <div class="speech-header">
+									<b>Robust Learning from Noisy and Imbalanced Software Engineering Datasets</b>, Monday, December 4, 2023, by <b>Zhong Li</b> from <b>NJU</b>
+								</div>
+								<div class="details">
+									<div class="event">
+										<div class="presenter-details">
+											<img src="img/ZhongLi_nanjing_university.jpg">
+												<h5> Zhong Li</h5>
+												<p> Nanjing University </p>
+										</div>
+										<div class="event-info">
+											<h3>Robust Learning of Deep Predictive Models from Noisy and Imbalanced Software Engineering Datasets</h3>
+												<p> With the rapid development of Deep Learning, deep predictive models have been widely applied to improve Software Engineering tasks, such as defect prediction and issue 
+												classification, and have achieved remarkable success. They are mostly trained in a supervised manner, which heavily relies on high-quality datasets. Unfortunately, due to 
+												the nature and source of software engineering data, the real-world datasets often suffer from the issues of sample mislabelling and class imbalance, thus undermining the 
+												effectiveness of deep predictive models in practice. This problem has become a major obstacle for deep learning-based Software Engineering. In this paper, we propose 
+												RobustTrainer, the first approach to learning deep predictive models on raw training datasets where the mislabelled samples and the imbalanced classes coexist. 
+												RobustTrainer consists of a two-stage training scheme, where the first learns feature representations robust to sample mislabelling and the second builds a classifier robust 
+												to class imbalance based on the learned representations in the first stage. We apply RobustTrainer to two popular Software Engineering tasks, i.e., Bug Report Classification 
+												and Software Defect Prediction. Evaluation results show that RobustTrainer effectively tackles the mislabelling and class imbalance issues and produces significantly better 
+												deep predictive models compared to the other six comparison approaches. </p>
+
+												<p><b><span class="black-underligned">Presentation Date:</span></b> <span class="black">Monday, December 18, 2023 at 10:30 AM CET</span></p>
+										</div>
+									</div>
+								</div>
+							</li>	
+
+
+							<li>
 								<div class="speech-header">
 									<b>Dataflow Analysis-Inspired DL for Efficient Vulnerability Detection</b>, Monday, December 4, 2023, by <b>Benjamin Steenhoek</b> from <b>ISU</b>
-
 								</div>
-
 								<div class="details">
 									<div class="event">
 										<div class="presenter-details">
@@ -245,7 +276,6 @@ <h3>Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detecti
 												with 96.46 F1 score, 97.82 precision, and 95.14 recall. </p>
 
 												<p><b><span class="black-underligned">Presentation Date:</span></b> <span class="black">Monday, December 4, 2023 at 3:00 PM CET</span></p>
-
 										</div>
 									</div>
 								</div>
@@ -258,7 +288,6 @@ <h3>Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detecti
 									<b>Towards Understanding Fairness and its Composition in Ensemble ML</b>, Monday, November 20, 2023, by <b>Usman Gohar</b> from <b>ISU</b>
 
 								</div>
-
 								<div class="details">
 									<div class="event">
 										<div class="presenter-details">
@@ -279,7 +308,6 @@ <h3>Towards Understanding Fairness and its Composition in Ensemble Machine Learn
 												ensemble design.</p>
 
 												<p><b><span class="black-underligned">Presentation Date:</span></b> <span class="black">Monday, November 20, 2023 at 4:00 PM CET</span></p>
-
 										</div>
 									</div>
 								</div>