Commit

Update Leaderboard Table (#227)
Co-authored-by: Charlie Cheng-Jie Ji <[email protected]>
HuanzhiMao and CharlieJCJ authored Mar 1, 2024
1 parent 42c325e commit 6a9a13f
Showing 1 changed file with 85 additions and 43 deletions.
128 changes: 85 additions & 43 deletions leaderboard.html
@@ -102,11 +102,11 @@ <h2>Leaderboard</h2>
 <th>Rank</th>
 <th>Overall Acc</th>
 <th>Model</th>
-<th class="detail-small-header">Organization</th>
-<th class="detail-small-header">License</th>
-<th class="summary-small-header">AST Summary</th>
-<th class="summary-small-header">Exec Summary</th>
-<th class="summary-small-header">Relevance</th>
+<th>Organization</th>
+<th>License</th>
+<th>AST Summary</th>
+<th>Exec Summary</th>
+<th>Relevance</th>
 <th class="detail-small-header">Simple Function</th>
 <th class="detail-small-header">Multiple Functions</th>
 <th class="detail-small-header">Parallel Functions</th>
@@ -125,8 +125,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>GPT-4-0125-Preview</a>
 </td>
-<td class="detail-row">OpenAI</td>
-<td class="detail-row">Proprietary</td>
+<td>OpenAI</td>
+<td>Proprietary</td>
 <td class="summary-row">88.30</td>
 <td class="summary-row">63.78</td>
 <td class="summary-row">87.50</td>
@@ -146,8 +146,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>GPT-4-1106-Preview</a>
 </td>
-<td class="detail-row">OpenAI</td>
-<td class="detail-row">Proprietary</td>
+<td>OpenAI</td>
+<td>Proprietary</td>
 <td class="summary-row">88.78</td>
 <td class="summary-row">59.38</td>
 <td class="summary-row">88.75</td>
@@ -167,8 +167,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>OpenFunctions-v2</a>
 </td>
-<td class="detail-row">Gorilla LLM</td>
-<td class="detail-row">Apache 2.0</td>
+<td>Gorilla LLM</td>
+<td>Apache 2.0</td>
 <td class="summary-row">83.93</td>
 <td class="summary-row">72.20</td>
 <td class="summary-row">71.67</td>
@@ -188,8 +188,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>GPT-3.5-Turbo</a>
 </td>
-<td class="detail-row">OpenAI</td>
-<td class="detail-row">Proprietary</td>
+<td>OpenAI</td>
+<td>Proprietary</td>
 <td class="summary-row">86.19</td>
 <td class="summary-row">66.41</td>
 <td class="summary-row">68.33</td>
@@ -209,8 +209,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>Mistral-medium</a>
 </td>
-<td class="detail-row">Mistral AI</td>
-<td class="detail-row">Proprietary</td>
+<td>Mistral AI</td>
+<td>Proprietary</td>
 <td class="summary-row">75.92</td>
 <td class="summary-row">64.34</td>
 <td class="summary-row">90.00</td>
@@ -230,8 +230,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>Claude-2.1</a>
 </td>
-<td class="detail-row">Anthropic</td>
-<td class="detail-row">Proprietary</td>
+<td>Anthropic</td>
+<td>Proprietary</td>
 <td class="summary-row">74.28</td>
 <td class="summary-row">53.55</td>
 <td class="summary-row">78.33</td>
@@ -251,8 +251,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>Mistral-tiny</a>
 </td>
-<td class="detail-row">Mistral AI</td>
-<td class="detail-row">Proprietary</td>
+<td>Mistral AI</td>
+<td>Proprietary</td>
 <td class="summary-row">53.44</td>
 <td class="summary-row">51.06</td>
 <td class="summary-row">77.08</td>
@@ -272,8 +272,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>Claude-instant</a>
 </td>
-<td class="detail-row">Anthropic</td>
-<td class="detail-row">Proprietary</td>
+<td>Anthropic</td>
+<td>Proprietary</td>
 <td class="summary-row">55.06</td>
 <td class="summary-row">47.81</td>
 <td class="summary-row">61.67</td>
@@ -293,8 +293,8 @@ <h2>Leaderboard</h2>
 <td>
 <a href=''>Mistral-large</a>
 </td>
-<td class="detail-row">Mistral AI</td>
-<td class="detail-row">Proprietary</td>
+<td>Mistral AI</td>
+<td>Proprietary</td>
 <td class="summary-row">41.58</td>
 <td class="summary-row">33.19</td>
 <td class="summary-row">84.58</td>
@@ -311,11 +311,32 @@ <h2>Leaderboard</h2>
 <tr>
 <td>10</td>
 <td>54.46</td>
+<td>
+<a href=''>Gemini-1.0-Pro</a>
+</td>
+<td>Google</td>
+<td>Proprietary</td>
+<td class="summary-row">42.86</td>
+<td class="summary-row">27.03</td>
+<td class="summary-row">77.50</td>
+<td class="detail-row">78.43</td>
+<td class="detail-row">89</td>
+<td class="detail-row">4.00</td>
+<td class="detail-row">0.00</td>
+<td class="detail-row">46.12</td>
+<td class="detail-row">62.00</td>
+<td class="detail-row">0.00</td>
+<td class="detail-row">0.00</td>
+<td class="detail-row">77.50</td>
+</tr>
+<tr>
+<td>11</td>
+<td>54.46</td>
 <td>
 <a href=''>Nexusflow-Raven-v2</a>
 </td>
-<td class="detail-row">Nexusflow</td>
-<td class="detail-row">Apache 2.0</td>
+<td>Nexusflow</td>
+<td>Apache 2.0</td>
 <td class="summary-row">58.39</td>
 <td class="summary-row">59.22</td>
 <td class="summary-row">0.00</td>
@@ -330,13 +351,13 @@ <h2>Leaderboard</h2>
 <td class="detail-row">0.00</td>
 </tr>
 <tr>
-<td>11</td>
+<td>12</td>
 <td>53.95</td>
 <td>
 <a href=''>Firefunction-v1</a>
 </td>
-<td class="detail-row">Fireworks-ai</td>
-<td class="detail-row">Apache 2.0</td>
+<td>Fireworks-ai</td>
+<td>Apache 2.0</td>
 <td class="summary-row">41.05</td>
 <td class="summary-row">29.31</td>
 <td class="summary-row">81.25</td>
@@ -351,13 +372,13 @@ <h2>Leaderboard</h2>
 <td class="detail-row">81.25</td>
 </tr>
 <tr>
-<td>12</td>
+<td>13</td>
 <td>53.86</td>
 <td>
 <a href=''>Mistral-small</a>
 </td>
-<td class="detail-row">Mistral AI</td>
-<td class="detail-row">Proprietary</td>
+<td>Mistral AI</td>
+<td>Proprietary</td>
 <td class="summary-row">55.26</td>
 <td class="summary-row">30.41</td>
 <td class="summary-row">89.58</td>
@@ -372,13 +393,13 @@ <h2>Leaderboard</h2>
 <td class="detail-row">89.58</td>
 </tr>
 <tr>
-<td>13</td>
+<td>14</td>
 <td>53.49</td>
 <td>
 <a href=''>GPT-4-0613</a>
 </td>
-<td class="detail-row">OpenAI</td>
-<td class="detail-row">Proprietary</td>
+<td>OpenAI</td>
+<td>Proprietary</td>
 <td class="summary-row">41.14</td>
 <td class="summary-row">21.91</td>
 <td class="summary-row">87.08</td>
@@ -393,13 +414,34 @@ <h2>Leaderboard</h2>
 <td class="detail-row">87.08</td>
 </tr>
 <tr>
-<td>14</td>
+<td>15</td>
 <td>43.19</td>
+<td>
+<a href=''>Gemma</a>
+</td>
+<td>Google</td>
+<td>gemma-term-of-use</td>
+<td class="summary-row">48.74</td>
+<td class="summary-row">40.34</td>
+<td class="summary-row">0.42</td>
+<td class="detail-row">61.45</td>
+<td class="detail-row">60.00</td>
+<td class="detail-row">41.00</td>
+<td class="detail-row">32.50</td>
+<td class="detail-row">45.88</td>
+<td class="detail-row">46.00</td>
+<td class="detail-row">44.00</td>
+<td class="detail-row">25.50</td>
+<td class="detail-row">0.42</td>
+</tr>
+<tr>
+<td>16</td>
+<td>43.19</td>
 <td>
 <a href=''>Deepseek-v1.5</a>
 </td>
-<td class="detail-row">Deepseek</td>
-<td class="detail-row">Deepseek License</td>
+<td>Deepseek</td>
+<td>Deepseek License</td>
 <td class="summary-row">46.97</td>
 <td class="summary-row">3.7</td>
 <td class="summary-row">66.25</td>
@@ -414,13 +456,13 @@ <h2>Leaderboard</h2>
 <td class="detail-row">66.25</td>
 </tr>
 <tr>
-<td>15</td>
+<td>17</td>
 <td>33.61</td>
 <td>
 <a href=''>OpenFunctions-v0</a>
 </td>
-<td class="detail-row">Gorilla LLM</td>
-<td class="detail-row">Apache 2.0</td>
+<td>Gorilla LLM</td>
+<td>Apache 2.0</td>
 <td class="summary-row">29.88</td>
 <td class="summary-row">25.35</td>
 <td class="summary-row">4.58</td>
@@ -435,13 +477,13 @@ <h2>Leaderboard</h2>
 <td class="detail-row">4.58</td>
 </tr>
 <tr>
-<td>16</td>
+<td>18</td>
 <td>24.76</td>
 <td>
 <a href=''>Glaive-v1</a>
 </td>
-<td class="detail-row">Glaive</td>
-<td class="detail-row">cc-by-sa-4.0</td>
+<td>Glaive</td>
+<td>cc-by-sa-4.0</td>
 <td class="summary-row">15.64</td>
 <td class="summary-row">14.42</td>
 <td class="summary-row">46.25</td>
