add seminar: 8
DaoudiNadia committed Dec 28, 2023
1 parent fca4f45 commit 9af0ce3
Showing 3 changed files with 23 additions and 0 deletions.
Binary file modified img/Xin-Cheng_Wen.jpg
Binary file added img/aashish_yadavally.jpg
23 changes: 23 additions & 0 deletions index.html
@@ -174,6 +174,29 @@ <h2>Upcoming Seminars<h2>

<div class="event-frame">

<div class="event">
<div class="presenter-details">
<img src="img/aashish_yadavally.jpg">
<h5> Aashish Yadavally </h5>
<p> UT Dallas </p>
</div>
<div class="event-info">
<h3>Contextuality of Code Representation Learning</h3>
<p> Advanced machine learning (ML) models have been successfully leveraged in several software engineering (SE) applications. Existing SE techniques have used embedding models, ranging from static
to contextualized ones, to build vectors for program units. Contextualized vectors address a phenomenon in natural-language text called polysemy, which is the coexistence of different meanings of
a word or phrase. However, because code differs in nature from natural language, program units exhibit mixed polysemy: some code tokens and statements are polysemous, while other tokens (e.g., keywords, separators, and
operators) and statements keep the same meaning across contexts. A natural question is whether static or contextualized embeddings better fit the mixed polysemy of source code.
The answer can help SE researchers select the right embedding model. We conducted experiments on 12 popular sequence-, tree-, and graph-based embedding models, using samples
from a dataset of 10,222 Java projects with more than 14M methods. We present several contextuality evaluation metrics, adapted from natural-language text to code structures, to evaluate the embeddings from those models.
Among several findings, we found that models with higher contextuality help a bug detection model perform better than static ones do. Neither static nor contextualized embedding models fit well with the
mixed polysemy of source code. Thus, we develop Hycode, a hybrid embedding model that better fits the mixed polysemy of source code. </p>

<p><b><span class="black-underligned">Presentation Date:</span></b> <span class="black"><strong>Monday, January 15, 2024 at 3:00 PM CET</strong></span></p>

</div>
</div>



<div class="event">
<div class="presenter-details">