From afb4e4f92699fafa3309305c3257f18edbe7423c Mon Sep 17 00:00:00 2001 From: "jakub.binkowski" Date: Wed, 12 Jun 2024 21:42:24 +0000 Subject: [PATCH] WIP: Prepare nbdev notebooks --- nbs/Data/01_Dataset_Description.ipynb | 8 ++ nbs/Data/02_Analyse_sft.ipynb | 7 + nbs/Data/03_Graph_Dataset_Description.ipynb | 7 + nbs/Data/04_Graph_Analysis.ipynb | 8 ++ .../01_Dataset_Description_Raw.ipynb} | 24 +--- .../02_Dataset_Description_Instruct.ipynb} | 33 ++--- nbs/Dataset Cards/03_Graph_Description.md | 128 ++++++++++++++++++ nbs/images/degree_distribution.png | Bin 0 -> 25265 bytes nbs/index.ipynb | 4 +- nbs/sidebar.yml | 19 +++ 10 files changed, 192 insertions(+), 46 deletions(-) rename nbs/{Data/02_Dataset_Description_Raw.ipynb => Dataset Cards/01_Dataset_Description_Raw.ipynb} (99%) rename nbs/{Data/03_Dataset_Description_Instruct.ipynb => Dataset Cards/02_Dataset_Description_Instruct.ipynb} (95%) create mode 100644 nbs/Dataset Cards/03_Graph_Description.md create mode 100755 nbs/images/degree_distribution.png create mode 100644 nbs/sidebar.yml diff --git a/nbs/Data/01_Dataset_Description.ipynb b/nbs/Data/01_Dataset_Description.ipynb index e7e2c47..a29623f 100644 --- a/nbs/Data/01_Dataset_Description.ipynb +++ b/nbs/Data/01_Dataset_Description.ipynb @@ -19,6 +19,14 @@ "sns.set_theme(\"notebook\")" ] }, + { + "cell_type": "markdown", + "id": "8f5ffccf", + "metadata": {}, + "source": [ + "# Raw & Instruct Datasets Analyses" + ] + }, { "cell_type": "code", "execution_count": null, diff --git a/nbs/Data/02_Analyse_sft.ipynb b/nbs/Data/02_Analyse_sft.ipynb index 922a218..f27386b 100644 --- a/nbs/Data/02_Analyse_sft.ipynb +++ b/nbs/Data/02_Analyse_sft.ipynb @@ -24,6 +24,13 @@ "warnings.filterwarnings('ignore', message=\"To copy construct from a tensor, it is recommended to use\")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# SFT results inspection" + ] + }, { "cell_type": "markdown", "metadata": {}, diff --git 
a/nbs/Data/03_Graph_Dataset_Description.ipynb b/nbs/Data/03_Graph_Dataset_Description.ipynb index 574f483..c6aa121 100644 --- a/nbs/Data/03_Graph_Dataset_Description.ipynb +++ b/nbs/Data/03_Graph_Dataset_Description.ipynb @@ -16,6 +16,13 @@ "sns.set_theme(\"notebook\")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Graph dataset analysis" + ] + }, { "cell_type": "code", "execution_count": null, diff --git a/nbs/Data/04_Graph_Analysis.ipynb b/nbs/Data/04_Graph_Analysis.ipynb index 06932d8..c9be72b 100644 --- a/nbs/Data/04_Graph_Analysis.ipynb +++ b/nbs/Data/04_Graph_Analysis.ipynb @@ -23,6 +23,14 @@ "sns.set_theme(\"notebook\")" ] }, + { + "cell_type": "markdown", + "id": "dcd46ebf", + "metadata": {}, + "source": [ + "# Local subgraphs analysis" + ] + }, { "cell_type": "code", "execution_count": null, diff --git a/nbs/Data/02_Dataset_Description_Raw.ipynb b/nbs/Dataset Cards/01_Dataset_Description_Raw.ipynb similarity index 99% rename from nbs/Data/02_Dataset_Description_Raw.ipynb rename to nbs/Dataset Cards/01_Dataset_Description_Raw.ipynb index 8cba868..0a2f199 100644 --- a/nbs/Data/02_Dataset_Description_Raw.ipynb +++ b/nbs/Dataset Cards/01_Dataset_Description_Raw.ipynb @@ -35,22 +35,6 @@ "raw_ds = pl.scan_parquet(source=\"../../data/datasets/pl/raw/*\")" ] }, - { - "cell_type": "markdown", - "id": "bac42f58ea3c3d96", - "metadata": {}, - "source": [ - "---\n", - "language: {{language}}\n", - "multilinguality: {{multilinguality}}\n", - "language_creators: {{language_creators}}\n", - "size_categories: {{size_categories}}\n", - "source_datasets: {{source_datasets}}\n", - "pretty_name: {{pretty_name}}\n", - "tags: {{tags}}\n", - "---" - ] - }, { "cell_type": "markdown", "id": "350cb2d131ba5aeb", @@ -86,10 +70,10 @@ "\n", "## Dataset Description\n", "\n", - "- **Homepage: TBA**\n", - "- **Repository: [github](https://github.com/pwr-ai/JuDDGES)**\n", - "- **Paper: TBA**\n", - "- **Point of Contact: lukasz.augustyniak@pwr.edu.pl; 
jakub.binkowski@pwr.edu.pl; albert.sawczyn@pwr.edu.pl**\n", + "* **Homepage: TBA**\n", + "* **Repository: [github](https://github.com/pwr-ai/JuDDGES)**\n", + "* **Paper: TBA**\n", + "* **Point of Contact: lukasz.augustyniak@pwr.edu.pl; jakub.binkowski@pwr.edu.pl; albert.sawczyn@pwr.edu.pl**\n", "\n", "### Dataset Summary\n", "\n", diff --git a/nbs/Data/03_Dataset_Description_Instruct.ipynb b/nbs/Dataset Cards/02_Dataset_Description_Instruct.ipynb similarity index 95% rename from nbs/Data/03_Dataset_Description_Instruct.ipynb rename to nbs/Dataset Cards/02_Dataset_Description_Instruct.ipynb index 998b302..2f3fa11 100644 --- a/nbs/Data/03_Dataset_Description_Instruct.ipynb +++ b/nbs/Dataset Cards/02_Dataset_Description_Instruct.ipynb @@ -38,23 +38,6 @@ "ds = load_dataset(\"JuDDGES/pl-court-instruct\") " ] }, - { - "cell_type": "markdown", - "id": "bac42f58ea3c3d96", - "metadata": {}, - "source": [ - "---\n", - "language: {{language}}\n", - "multilinguality: {{multilinguality}}\n", - "language_creators: {{language_creators}}\n", - "size_categories: {{size_categories}}\n", - "source_datasets: {{source_datasets}}\n", - "pretty_name: {{pretty_name}}\n", - "tags: {{tags}}\n", - "task_categories: {{task_categories}}\n", - "---" - ] - }, { "cell_type": "markdown", "id": "350cb2d131ba5aeb", @@ -90,10 +73,10 @@ "\n", "## Dataset Description\n", "\n", - "- **Homepage: TBA**\n", - "- **Repository: [github](https://github.com/pwr-ai/JuDDGES)**\n", - "- **Paper: TBA**\n", - "- **Point of Contact: lukasz.augustyniak@pwr.edu.pl; jakub.binkowski@pwr.edu.pl; albert.sawczyn@pwr.edu.pl**\n", + "* **Homepage: TBA**\n", + "* **Repository: [github](https://github.com/pwr-ai/JuDDGES)**\n", + "* **Paper: TBA**\n", + "* **Point of Contact: lukasz.augustyniak@pwr.edu.pl; jakub.binkowski@pwr.edu.pl; albert.sawczyn@pwr.edu.pl**\n", "\n", "### Dataset Summary\n", "\n", @@ -101,8 +84,8 @@ "\n", "### Supported Tasks and Leaderboards\n", "\n", - "- `information-extraction`: The dataset can be used 
for information extraction tasks.\n", - "- `text-generation`: The dataset can be used for text generation tasks, as the dataset is formatted as instructions.\n", + "* `information-extraction`: The dataset can be used for information extraction tasks.\n", + "* `text-generation`: The dataset can be used for text generation tasks, as the dataset is formatted as instructions.\n", "\n", "### Languages\n", "\n", @@ -124,7 +107,9 @@ "id": "3f161970acf83cfa", "metadata": {}, "outputs": [], - "source": "display(ds[\"train\"][0])" + "source": [ + "display(ds[\"train\"][0])" + ] }, { "cell_type": "markdown", diff --git a/nbs/Dataset Cards/03_Graph_Description.md b/nbs/Dataset Cards/03_Graph_Description.md new file mode 100644 index 0000000..afb8e4e --- /dev/null +++ b/nbs/Dataset Cards/03_Graph_Description.md @@ -0,0 +1,128 @@ +# Polish Court Judgments Graph + +## Dataset description +We introduce a graph dataset of Polish court judgments, based primarily on the [`JuDDGES/pl-court-raw`](https://huggingface.co/datasets/JuDDGES/pl-court-raw) dataset. Nodes represent either judgments or legal bases, and edges connect judgments to the legal bases they refer to, so the graph is bipartite. Small disconnected components were removed, leaving a single giant component. We provide the dataset in both `JSON` and `PyG` formats, each serving a different purpose. The graph structure is the same in both formats, but their attributes differ. + +The `JSON` format is intended for analysis and contains most of the attributes available in [`JuDDGES/pl-court-raw`](https://huggingface.co/datasets/JuDDGES/pl-court-raw). We excluded some less useful attributes and the text content, which can be easily retrieved from the raw dataset and added to the graph as needed.
+ +The `PyG` format is designed for machine learning applications, such as link prediction on graphs, and is fully compatible with the [`PyTorch Geometric`](https://github.com/pyg-team/pytorch_geometric) framework. + +In the following sections, we provide a more detailed explanation and usage examples for each format. + +## Dataset statistics + +| feature | value | +|----------------------------|----------------------| +| #nodes | 369033 | +| #edges | 1131458 | +| #nodes (type=`judgment`) | 366212 | +| #nodes (type=`legal_base`) | 2819 | +| avg(degree) | 6.132015294025195 | + + +![png](../images/degree_distribution.png) + + + +## `JSON` format + +In the `JSON` format, node types are distinguished by the `node_type` attribute. Each `node_type` has its own additional attributes (see [`JuDDGES/pl-court-raw`](https://huggingface.co/datasets/JuDDGES/pl-court-raw) for a detailed description of each attribute): + +| node_type | attributes | +|--------------|---------------------------------------------------------------------------------------------------------------------| +| `judgment` | `_id`,`chairman`,`court_name`,`date`,`department_name`,`judges`,`node_type`,`publisher`,`recorder`,`signature`,`type` | +| `legal_base` | `isap_id`,`node_type`,`title` | + +### Loading +The graph in the `JSON` format is saved in node-link format and can be readily loaded with the `networkx` library: + +```python +import json +import networkx as nx +from huggingface_hub import hf_hub_download + +DATA_DIR = "." # download into the current working directory +JSON_FILE = "data/judgment_graph.json" +hf_hub_download(repo_id="JuDDGES/pl-court-graph", repo_type="dataset", filename=JSON_FILE, local_dir=DATA_DIR) + +with open(f"{DATA_DIR}/{JSON_FILE}") as file: + g_data = json.load(file) + +g = nx.node_link_graph(g_data) +``` + +### Example usage +```python +# TBD +``` + +## `PyG` format + +The `PyTorch Geometric` format includes embeddings of the judgment content, obtained with 
[sdadas/mmlw-roberta-large](https://huggingface.co/sdadas/mmlw-roberta-large) for judgment nodes, +and one-hot-vector identifiers for legal-base nodes (note that for efficiency one can substitute them with random-noise identifiers, +as in [(Abboud et al., 2021)](https://arxiv.org/abs/2010.01179)). + + + +### Loading +To load the graph as a `PyTorch Geometric` dataset, one can use the following code snippet: +```python +import os +import torch +from torch_geometric.data import InMemoryDataset, download_url + + +class PlCourtGraphDataset(InMemoryDataset): + URL = ( + "https://huggingface.co/datasets/JuDDGES/pl-court-graph/resolve/main/" + "data/pyg_judgment_graph.pt?download=true" + ) + + def __init__(self, root_dir: str, transform=None, pre_transform=None): + super().__init__(root_dir, transform, pre_transform) + data_file, index_file = self.processed_paths + self.load(data_file) + self.judgment_idx_2_iid, self.legal_base_idx_2_isap_id = torch.load(index_file).values() + + @property + def raw_file_names(self) -> str: + return "pyg_judgment_graph.pt" + + @property + def processed_file_names(self) -> list[str]: + return ["processed_pyg_judgment_graph.pt", "index_map.pt"] + + def download(self) -> None: + os.makedirs(self.root, exist_ok=True) + download_url(self.URL, self.raw_dir, filename=self.raw_file_names) # URL already points at the file + + def process(self) -> None: + dataset = torch.load(self.raw_paths[0]) + data = dataset["data"] + + if self.pre_transform is not None: + data = self.pre_transform(data) + + data_file, index_file = self.processed_paths + self.save([data], data_file) + + torch.save( + { + "judgment_idx_2_iid": dataset["judgment_idx_2_iid"], + "legal_base_idx_2_isap_id": dataset["legal_base_idx_2_isap_id"], + }, + index_file, + ) + + def __repr__(self) -> str: + return f"{self.__class__.__name__}({len(self)})" + + +ds = PlCourtGraphDataset(root_dir="data/datasets/pyg") +print(ds) +``` + +### Example usage +```python +# TBD +``` \ No newline at end of file diff --git 
a/nbs/images/degree_distribution.png b/nbs/images/degree_distribution.png new file mode 100755 index 0000000000000000000000000000000000000000..dc00bd1c20be9a33fcc636bb7fa96eac63c7d2b1 GIT binary patch literal 25265 zcmd431zeQt);2zNA*dJ#2yCSkC8VTLQ5;&3ln{_^hVE?u2B3hTq(Mk`mw> zK)NIc7~)$u>^x_`?|07op6~zr^W*HJILtiH{oMCj*SfB2ttWCa;)f4X9>icUhq1TB zTKO+38v1j4aGK zS+B5iU8FO#v9ZMQv$2`|{sLADYXde~hI6*?CI>8Usp2pgGHvv~9UnzvjW8G;8m!pW zI}Tw}J&rC4dYzlI7XCreEt$u6oVvr@UPGGRBA=*sTTEL{bdT84qX}nsj_4k5_q~5g()Rw0Sv=;>151zXr;dSj)=Yu-F5|P*X%-Ak2Y(Ur##}ds# z37Kj2k^Gr}@50K~^-t=!q!r+jFqlvNll?MVf5I&LmCClJNeOv!l#yq(L{UMe8 zp!3$hTzvjNd&yA_yQ$9TmXt^!o$Qma zD3_q0pYrh~(c=z;9{ViQj&#e|pFgiRMmMuu^(UZrihJjiZ5Onrt+Z?;_r@AsV|r*PB_7$^hm(xBbU>H zq3e7dFaKhTLg+#+TgjTdWY9&Urq5$D`62Pe-Pf|*FpX1_84YctjUTx0|MJ2zc0e&| z#Os$*VNBwqGMc01`WbP0wL#WbsQYF1Xv?Xpo}NBl2!Au~NRKwhF4s zoK0Fj-d=@IFH9yh%ZjuP1#2b4PbZX1R$mpb%}J(=3oCCtbdLxuw9~Lc^-! zL``iiu20L^%tM^UQOwB5D1V_zazojOc@1_h{w@Y zy1KS+!`XRS@y?x5_l*(vz*n!LHH)0i3khi`#7H&f^tnw;Ystv?(vOuLVzaRb&?vCu zZYOxwTXdDUm+04r#bq=I--TOp`u6iao6)a75!PYJ6qS^Mss;;vTwU{8b2es*i88#s z4j<};iKdG)eOdubvTe#*u0ys$0`}7_-FY}}%WtBGDcD8lhwBs+6%`+L?f5iX8_aim z&a7V8@|y>Z)96n=l-v^QL;}9UATntpr9@N9W%kCcRe3qNw+F}>Kgvc5Q*!7$>UA6m z=O^$ol&r0}^jOzkl8}{EE^%Auvi^Q61%GhPcCApG*25LEwBx;`R4zHjJhy+ZhSOMd zC)^2*%a2kIouQGDiI`Ax`B<57n~`vPW5RS#{`yjvQ0Ix0C&$r0c#)p5sXAl-F*jF* z)1ZFAQ2F-lcd(Aen;YvPTt@!zwNNiGGa7W)ys>X%i8nerIX&j zgrI73jC8YopKEMb7`2Au(5t6DK4$WQ*nHx2!K;XfA%{fw~?ljPg-^8P&L-5>kbhI!*Ze*C(K*P1cEOC0Vic6F|M&2imRd$hrQBLgnK zaF`u8EER55c{)VMIa{jPv}rZOUzu*5FkDQWZeCrS;5KTax1H)#+(%0NbtKX~8J2o+ zvP0pNhRsX6$@X_};1$qggll7ntH0tf;e1`cE+kq(R!9-;yW-VxmL&B&c|AS7=@E|< zm+C3Qt{e+IH(vb%k6GtvC6&b;DXHgeiK-t=Qi_xkl#=A(5{2d_TjBzya|cOXA?i5w zKRy)*x2(0fikGe-7hRaQm07T@-x!OHM0DGmZFNmQ!P--n@VGX!rW^ zpr6y`I)Pia@;RelB=6*|R1^n~U;cUjES�=g*%XmSijXZq*0>y&n@5^;%?oK_PtT zl)B~5d22<+(a+0Ap)45gr?_bTgL)?Av*`RUY_r2`nfl14G4H zzZb2Guf2`?sY`0#HM5-w=A7KztzYWu7AYe6aKo?B^;FGyb~{ZZ>Cf~Q)s>Z9Bcww# zD4pyrDtP1XKcH#XAVhe5`t)giEEXGH)es@nMa_QCLse06NHZur+-zy83*Vh#;OE@o 
zdq&_3jgV8zla+Q|zpVv0W8dBXGB{cxqI|yjg|orIwVtQN z9#V{%JA`$-kuxl3C|UzTpSwkf{(8gDomE8t(456SbaC*(b?)mETGM(rZrsp^b5&=O zoL8wU1e+>>9JjhKHvFyhq0rpUBkRH&%eq^)4v(m)cV;4NbRtYcQ`0;sI2f-guC4v< zO-P8b@WR*Y()A6>Su^1%ln2y$?AY}s)I2|~dc_s4LtjGTX&xj@elz!#p++<$FMD zw%OeiCr)rXs{0OE=PmfTuU#+qJluRfHf%t%H_Jr%@EQIV*DfyUqWOA(nwlDaqmbz^ zNeKxqn^8H~04V`4B?3xx5mHf%r>L) zkNm;UAG^Hlpr@}t9=W+@5HN4jlb;oJPMJ*vPMEW!lvK|yv(S9m+s{x?b}1G1ShQpo zeqRF#Rftqbz)AdVTZ71EBK(!Na~jTjFe$BY;=+PW%7nnGPyEt*_q4RM>FS)N83>pI zxWZ1dUG%Lh9@qFUci<}ziea%&;u8{VD^D9ZbmOBXPZ3re85kHYu&^}4q0fY<4xsXiCPhW7Ac&xqva1P-rZZ#3PV$G^EY-RK0 z?abGm9^Z**tC~@GqQC>`T&cirDqf+TMuUzwI6#I!u6_!R*f?x+2-R{fE`<#%C?4lS zw93-2qePp4pE#L;k_F&^whnQ_ap4=6mUgu7`>}_JX}hjXCMmGgJQ{TA?&_IZb`n25 zo>1xQUZky%DvE>|Sy^R6pG>so`f8a7{yU4)EDB%y$+&8soUxz7AmSN-j z>Xr3r`;z8XWvwJKku^DpOYcz9Ptqt*N-0_jao*T}e+_L9`^jA`egN3sr+P7buB?>s zhX>qKOH%$7J;qFdp+JR9gN6wHTDOfA13w3*NSAcGnO+Tpjx@i3sAd@fX-MA(J1u(l z3TP(9xM5yE7Ewx4Ga9K6ry-y~Z`+w=)O^!Zn=1qdkEA7CPcnt0=u!e3n^xP1b)HLo zLh<5q)21iDoU3?soKJGzFy&NNj;yk})#C_ctS;f?uHhMdZbY-glw7x9 z6;ho$5cZwf_rqrhf5&-gGH!lK4{)B0;H;Rvc3;by6Z{Ag#rSZjMH>Pem(((E(Yh|Y ztQXmI!EVebphV@t*1j*5rasM!p(P?EORx4MAswP_-(czjuAWbPb8SRKQC)pfa|sf! 
z!)L0?_q~N-XYpG&Sd^@1ix+5@dyWCTn_pGn>J?@;lu8|JHExdaS(@of2??Q$>=d9T zh0t^d@SS-$$1e>Hc#(}+KmI9CGTP+D3H-$7#>}SR0JZDK(DMGTFMhhd(Ca+Y$~R*Q z>%ID;oXoYpHh7VnA!W5~P;#@#Do>q8KpPUa0z9N~Y6(?m-zYY5o8NOdPmq?vnQ(<0Hy$H=1d9?|y#i3!;bOw2 zZ^p93fVI9@lnpRi*1EKxoR%wTg1)9`iDa2|DLp%RsR@=J9s82a^_l7Nc4H0u9$U2n zyCg>0h{S}HVztMQA45vZ5gY*6*U@7>@cGFFR@SBEB2{b%k6B1tidM>bg_zOtgcKzJ zZCM5lfDjwSA(G-b8w8u9+;+n@-B>4@4As0^ow9IGfX8v3hes(cF3y(q9xf3u-)K3R zNDBf_0D^xRvNnQM8gEMq%{J>AHn#@=J02h_+^DFm5zUfPDBqKBt6*dlSG>M-7k$nu z9LpXvi-IYh^+DRrua6Fpb7`0DhQOtmpfm<)y9GANjDnVnhL*zI!KlqDeup1#0ow{5 zrWIB<2)7Ly7#PTOoY!A3*(7Xo*-xw67j#C_yTBXUjztGwfa^q%Lss`&U9v`j4J7`F zZd?N#aMx(Omb=C|ao-kIgK&niU*7$U_PRM-hKBQ0R(x`D+kAsa3JRp^Vjs`v)rD|N zhVojpeSSjDZ8adqM@^cfnsL*MjCOd$1j98Ns33DYQrL9?4ljhdimIvvNW;^Sw%eGn zzNSXr+&p=>AuC?h>Ot=&W)Lw!*?^a@o-h;tXDE zW9}4s$3}75q~OtRwwKO36mPrsL3wH!*ShXkFw0uEr5K&~pl6)rdzw@e^8lEyF)XkMMp)MTIS6 z7P+k8ZKyCGY(N+=fsF68u`=@N_3Mx3J^5pRpySCiU@@Im$6{Xv1bh%#ACpB`#g;gg zlkoPf=$<`$gcmFD=m3kZx+kJI3ikxyIRL7#GjJ6sIRaddWRwXTHOI%w($LqH*{n{{Qk`ZGvo8aAF??%+Vy zjqh(fvswP3T^GvRpmBxB&$P;K&}Y*u9iN!kLS<3-4zKAf?Yg?q2nTL3E#;euoAlz> z>ttiF0q{_O#K?s6kEK=7;7(~uSF>?*D_-{`Y4kiybw^!25nvoDSu^zO;^z+=(GidT zS{W>)y-}~j9M!4vJwqw%l1t_r$}RjeUK0eRV=>GahO#?)w1?drqmpJ$mjYGXuj7 zk3oYg&Pi3 z2qNjLmMEs7&v`6VW(+$;E&cuz9UUFS5((QgA5TwD4-5{Dg}Ul0Q<&w0n|?GJBQ6Zf zBk086wJ3@n7|?+%e{Fqru^CE*ckmRw1h_FCHy%6H)9~ATl%0Z@)b(O2M`Z`B!ZNG^ zFkqH}@XUxUtU^LUqa~`On2oo;F=e?9+uYg7o7!ZED)V@3l|@GTCgyy3>W1neyMjv> zR)4x7tMxvbR(@fDLGAay{8-h=>AI4Wb9y(f*e_3W|cMnZzrQ^3dL^XGN|=2Wj+c<)e~3$1iR*F7C|K@P@V4p#V%c<(3`av6_f7T)6Y{Jlnl-8uXIqU{|(8q}W5VO^0!^#RR2J+jE zAD`wzKg1ZGJgVG2)19Xv8O+)WVLA!W4GSNiszvcijCcW5`a(ZG?OL6yWowO*M%0}e z?0$I&eq}Fg9Xh6XI;OI3cJWa!@w*pskSKU+@4du}!T1^+uFI1yT_eF5-uirE395Z% zP0fhar74zj_WL0b0V~pnZy-KIa zePox{Fjda5@87&_{wh4DvIc+JMJ6Vuf`;f6MNQ3Qh>r|ROkU?%woc)-2LH#d`uZ_E z7QKn->6Z!Vwi9jD<~c0q&YgSyW=kYEqIT4t*$wc@+}s>J7EmVQ^Z+~1V;}A#Qv$B9 z(vLRf6^pVl&?PnX^-54RSX*0jxve=6EVcuWrJ#3inHz9xA$&OfaJ{S?b%Z)6!Zew-j0ioli_Up?eoQAXL!{{jG+CQj<9XSPIV#p?&R81;a 
zd-T(W#M3Qx~=q=Yp3s^CAC&$ zUCR=oy+m4%B8H@#L=A+szNvpdu6LUmmxS|@TnA|U^x3mRs&E**j~`#szV~YcCeA7b z?i+ZNt8&`&d~Ako_)t~v(yLl_n|tmI6+)Otr9heiKNSlA0!raGF(b}a|9!9r4C{ng zISDMbF^ms)trecx*}aclDcYxa_$KXya;BZ9F2#{BD;bZ_`%pC9;JN#)Z$ms?>5?wo z_`=$1(uGcYIS7mpQCNcsXUaTnZar=OESX6YIS$qk!EOIL z^73b~5)u+X{;8F?71ayQfAPSdAtzssv!0$9Z9o3wwr!T5pXbE*ZYQ5G!_TiBs&dh{ zj+E~fa5@^Bm?#BQgs$VYYY#wjV!%R9Xof{zsAVg;<&O#vD2;gEE_P1wKy}ObmO>oY zKK1m8)$WsnA_55;n#zpFZYXFyc+4^DYlXgT3jGcs%#s=f*jWLX3TruCO3%3T^iRp| z=dfvMq!m>;{|M%UUU(i=|GS$S*g zbfWsI8qi!&ZYx3Ld!4=$Lw6ZrE)Va3H*ewqStcYVekn4Ak~qC#7Z+nE=N(Pj+OKzr2jW9Q8pD z{?Q%KoiYK!-|RvP3v5u6jYqJ;SL)OL!y@rNgsQDL)IVUkdjxl>oL!GXUrqU+2etSxIN6e8I-USZ5 zfCK^ixgdNgu13Jr=}+R>*dOor#)7YWv7&TA(rweEpgkYn`XAmU{1J z*bJ>{y5g=-La*XWHlopC3LM)4`A6g-TNn5AI!FM=e#glvzu;qV6jWrtu#+IH=fAKr z5;*}l6Tx+;r0CDAe0`gno2xp9ToD*L>a){S&RWNhA2*zAPqPi;3zx~V=+lJq+dm3) z3@mStO#pEN>HV7<#2rbzjEq+zAGU0`<=5#4)af5$4kFI-aMRQ-^_B2AW}C?;-Wwj0 zExfsUEWX>AR;xfdG=9lyLs-~fO!ChK(-yq?o_b$fOG$|_ws|Z~5cV%@LhQ!S*^VZLgzOgCT zaLnttkn58mw_A;^r{Gz;1znt6KRRmBPKBaI&D-w-mhF(~Ay#~hsXQTfaT}`D+C?2n zw#uVt6Upn1QYrmX8|mP)z;GuoPhDIGDzm)JzT;U9d)#ShvG1e*1V$fpor=xi?puQg zfIWF>GF&DWwu4Yma8S^NpMHAR(RNpXy!(tqGgp!3Vo4l_@0z$zr8RB3rfgTt=Y#s1 z8=p-Q+1}M8EzODxN6l`o9334zht2$REorj|Wz?fc29a@D85z4sNv(v!LUmwOBxGbr z6_n6(!#s*$I79@-(yX^o70I1?6KzQWJ?6?cZ{DQf+}ap)nP;cv*5czYK%~O@LjkS` zYBGR5tp8pznnbW;KwQyeQBH0FN5HG_@TB+e&klY6Zn(BQ3&~0MO7m7wpo@ZS+LvPy z30l0Wf$G#0*eIAUU6MX<;DB1OOWsLV^^arCu?fWu+1W3DLCPmfE9SR0 z!~fcmw6)v*m+u1GMij?Ea-Ez}O3v0ci=K+gvLPXq)fu~9WA=~i#FE2H_!1`fQ6L@|c zpAreioj_nb!3=;mQXozhYEu9c=U=C9vb45+a-zKjj=TRuidisJP*aPq4q!@vRDs-3 zP;{NVq!u+kZd?~BQX^S@>~Y>VRE)xF~H*m6k@_)Gy+yFfW9tVK>2RA z>8xP}P;qAj9pu3IbQ@%2`f4EeOCRh(#T*ceJ!Zj|?lK0~Mgx_q0cp_ zPyPFQukxYIAt+*q-v5Z^f0uizKNNv7gN#k6fPNjf^XSBHU_J{3T=GEoCuW29b8!5J zJbp=Y-*oG781zA;=1m{UXfUEUh6U{dii2*TIPLH0h70Ghv^Qj zuMO)Cp_G1l#2Kwn!7iTjO6>D>6Iq|i{@m9$K`x#OrpfJ35LlShgLTok|$w<`% zf6cC!B5dwgu`*U3Re8P{mj5l4NW)UuUlcBq_TWvh$n5xpAkx8Waz(0!ybt!4T~%A{=>q>(;j_=Pn%U2AK1#r?unfvw%7l1 
zC6+|EzU)cfO>{Zh6d{}jzD-R{fp6ZN_6J_{2==0C`#+H(yj@Nv#t*g`7K`&Cl!L#x zw-{a|{4}wcX(q7^Vk!}c+37QsMSJvxvg@p`3q|XXiBG{if2OALR|ot=Sp%K_sJfmz zuYfZJ>P1^PI9%fSupEN09BG%C zbWwGndEegR{gvFd-6dy_eV&L>S={gQ!7BWK?DHq~&!53JQ{{JWNUaI+@j!|*VJ}>~ zcn9P~5GhU_%+gxxO_-lGdE#TsEL^N;nH)M$ zZYz^`HYEAY(Tl&`U92HZK#aO9(>4{f!iBZ}Q>h~Q%j3aod07vv)^`znRgR>D9+SrF zd&d;-KBPNvW)OXkInV|S62*4I!oDoCeW#AWBzkWby?=&E>db%;X8;|!!0UQ)p$tH(<{~JF1s0%^Y4~@asG{Z@pp1AXr4Cox~nMct?c4EW)F@9#EXtmv5T@;>&pPI!IF0Lvw3pZ4jW6 zpilrV=YZgKc=6?}7t015B!cr#8KU}2daynqDbu7aakjH<%x3(Z7tO(ccx^&*z65ye zilv%cUMXwFby<{Xz&^F@1d=<;yhklI(k)*{CwCMMky4pD8OCFg%ZO+C>$`{iHluV{ zwKSc_`GD(Sy|^WHppaO2J+T9mN>d`DPYe1HJId}yJ@cUMG}%A8oQHfDY{kp(-B!n# z;5HrAh4A1V?6L1V%zPE|vb_J}rWcp-jLApotqSq5mKl}k{d>v#Fm!V(8to#GVbuy9 z(!it`E9|<;%+1aH+Hr?B9T3ws>{5K;NmOPNl2pGjcywhP_+6NLhuuysu_z^ehG8uS^1`zSo_Y|L1zr+j47FANDVRat5X+8dkS*g)@G~j4Fi5UyMOB+ zzS(QL1M{>cgAV%eK6nG?(EG@*@9#Cx05UY(6f$Yr!pkP20%ZJqGzqw!!1#brBJ_=Z z2S#-C_uGPS4|Gz?X@jt%UtkN1K0+(?iE@o~QaM@u^cO+&Bkk=U30J`rfUif9eSPhl z^O2hiaXoLZK3vm)0%QNVVJA2sr88w4ohB)-0D)xoAXR>QcmDoqu!uiYhOP7NsHl9* zGQk7a)&oY^9SneS%e=N8@Q;F2^vsJ0KBqZvk~7J)z}ATeBbB={0GcxIt=AX^JsW|z zhr~)$(TALatK83tF5s+qhQ^7K%*;EGFzQag=weC-JMZU65-qMZWtBL@@>{aA~F><~Gu^Lvu5>tcCZwgDTr z6|~=Oa@QopSi`B$|50{mmjvl=pZv06rc9WPZA(>>;Tt**p3A%7ZGJ_MfFRgVLO66i zk*No~YkM!L!u|07c0cRj22X~9TL`Q&K#StoO5mN5Emk`0%5C?J`N%iG{iQN*Kf39e zT{&i5F+j2v){AMefcpv&eh_L;%2~siblEzSdrJ-xfn12s%*^a1LPBl@=aG9kuvP53 zzwMjmrlw4wl3c;1NJDU4U&O2BSw}ZDH*en&+4foz_?AjG7E`z(y`XCWVf!|~WO}T~ z#_|W`wuj9j0~QtKug73>y{q_baU< zQ=a^+|7rUQgvdpZt_Y_31@JOC6}f|v0J-7;!%f&*W8TaDrRJlMOZ+p@|MoQdZ@~oF z##l?%=A33|0i1)_pClmpDH^e}a~FW;@UPz6F2`PH0U~Mzqa*?hFLxR^-nHMDv%Y+K>j` z+Zf@sAJ>u58=RZ+Ip#g|SmbyHN5)&I@WB(oY4~I(#^dYnD=A<%`4L=q*k`BM+JHrh z1%uNq|AE0lZsQha*rWxe^K%eB?r-mmovZyFq-Xd?A$b5X{5#&(&)CDj_uCm1EA}`@ zRD);;yZX>gX87^XXxDkGyJ-n-7x|B9|L^Lvid^zj+9$~Rg^br=3W;oaf8!7WvxiPz zltUtL_!ftUxk{eZ*A7)V$f+i9;GL$j@Te~!NiFL_x6xMpbmY#jGV}aM10I?p z#6$l(wvT3t>f!*acG59X&~pguVz%73_&pbI>!0YWu2v!M10dK5Z3)AgQ- 
zhoa{VEHLsu93v-Zz_JE!-8uj`e!Iz0a}x?rz#kt#M1zpw8AXPA zr}JX@n)V#InWR?vi+Rr2%Hmzc%$a+lY5epRMgt5zdO2MW0?)v^9-)L)CB6r(0QPlK z#us4GU5QHC0Ka zl8MWanWwfyVBb$VYa_c3dG!!fI7nxv)}Cx#iLw^Dl2o0FE7V6(`9s;o__DL1^BRkA z@hkd0Q$IFt9l7)myy8|#1^Tkt)n^72F*P+89v-U&nSRXlQgE%Dv=7T$71nagO*5{b zsauVTgVtEhH$qjG5N)Zo--Lzj4SJ6q~s@Yri2NO96BnW-qq2?6&u6;*w5LUbirQes1&jhqP`l zaeQ<3`R!tka;uo*3c|aIiK5T5MW1IvtL3b>c8<1CwZJ>^#9!}j?V|`EUtWFQg8!ZW zFQwyU{*rCGjcDexe?dA<{?)ettq*or}Z?cJlFO2mVBHdm0k?+8B% zlqh3B)^!@Y-_o-jpFJk5lhphEo=Qo=RkKa*R%DF(Oy z8){30M0FSD2#}C~fNufO{qHEOy37iDF?8>(kUwhw`GTU^9>2fJrS~a3VI#FX@7RI4 zC**=S>9aEbY;5_OIN|%`z?w5zL66;c5gurI+H*W%X_nM$vBWQK>N?k-#m48=N4M5! z02m{fmY68vyuwY9Zi`A+=){X4tf9>5XrQ}=n$ z3H-#W1b(eWb++GXG-}?vPK{s0sM#mkKBAJs7VpCT-k5fX^@sR<>wl&GtuLQJubm)N}AyvD%~5;4w;Wb*iB& z4;PCJ*Nq><6U6;!DEX|T@OCRJDu_&_tM{i#KNF7R7Jap{Vj&Xvzk zH$OkU6BBjraZ$NB9yd{p@Aa?o(E0V3Z>Vd}Jo{%a$J>rAV%j3w*j=G>3xQ1@Q6dbQ z84CtYmAISDnRfjH6d83Blm8ux2+_<`;S%`f4l+jso5>TGb=|vqX|p(<5|oVyC9WlE zCA^1+6bmj=@OjP`yNBFlAqT263sTW~V?) zX3$m?RAUW(7ec8lnx^jAztLrQvFqiTHOgSgIqN8sfNfEky9q?)la`4eR*&9xTp@3$ zGCLNxF32B5UlMV*?xnKmY@sQ6@Wd^!%v~UCuAgWjr0<=J*N|nmCe69G^e8pw+@#qfp!Fcq+3l1(Aj&hETUWYEmx>@I?QWrLXxu1R}5IQ}l zGeQiAt7wwvZn6rHctkN!%1Y9qL?fTw(TP7BQ-a*^? 
zEUg^#@JYJ@kf6%K&S1on|E%1dYTq*>FTiB<`339MJW)E>pKGVA9=rZee1LA{G#HoE zoGAIn6(@F781@BOti!dOy$XfNzinQD>(C~L>-byq3jCw@y)_x2UEu3KWN!Z5_x*d^ zwgY72YtN49M&NLTj04?)}`bvwn?jG1a!gnCZTSN;(y#e^WAiMIa% z&yZQycKVOF7-Mn6=?{qt8Np!WUdnxngI5ER6SHgByaHPbebMHu!{%pYF9qv*Cj zS-tMDiTj6 zZny&z+L@;71#KCOSmf5=F>ODBBr=ax#PuFghi(2%<=-Pa{G*`vd=!X1I}s3xdC&|D z#NhxM@JxgZ_m&mi$pe<{lu2Nk`PY0-RdO*v;f5eO&EwnC-0QPbOO*Lgm z;$_x%#}hdYEM}AN{7u?L-d23y1##aKaX@U>BhY<}l(n8NdP zOZdXDZw2vz{Yzoiey4~VtGSP^ztue!H*>l;{l4~0!tWkWozvTHt&A}(m`}jN%z)O~f&gkl%fpS|f2_GWzyZYSsqSKX5 ztBaUmCsgXt6BE&>6ihO9tcBq5fk+$Akw(e#T#`M~pf`f*a1y6~6+_Dd4#xiWpkl)s zb&Kl1HTjpi&c4?F=4F^mYZ2e5GJRcT!l*0oS0CCU_|W!H)|!Z%r9*d?zrDxE==85t z_EA}M(GOJhNnUi~^u!=QmwyavM00rdI%lee8{ zQ2;)O2#QqffU&O!{{|;5Fdu+kRAG!`I&=)_-Awl4*=lNWVL=6d~NDz|-rAHfHBi6YIIc+?1Czu;d% zXRY4qB0{@6ee5TB(#n>dpS8B&ouigGE47ucxky6ClBj`JK@VeS-AR=r4*T0&EHbcm z&JM$?auxm?#4tPVp8)n+$UD0~HqlD>Y0*ddW94V#M0VCvrwHz$tox6x4rJYXj=hqt zrwi?43fny!_q4ajng53c=ifDnbo zt4d(Nz++~dEEsUo@nJM>%l&(44MJj2=WDq8#>#ketY32Rs0tMWT+=QHT7Y%>v& zS5R1NyKB)(%)Qmlh(-aet;}=7^d~g-1{#DHI;S6>()Ewf`O5$y>PpeI+I4CF^!mJq7p=J&>;6p%!q|T#7m{LEx?#v^`O?BK&$@`%2~8 zyk5x%d`vfLb9du3y@NPjx=;lVPUSU)A%En~9Mg`=s5dx7z&;s#SP9@)pFk6(mL@ws zbeR_4UaiW;tVEcTj}+O5;B3pl_n#J%RM7ix>J>oIsI66O_hb{uGIk2uxu};x=hR z4rZ9W5d?}=6w+Q$7xQA1BwHLDd7tFL(tWgdkayn>p4P)oFSTM*ax9v3!3}mf{#r0; zwA`D#)yW8crmyfW;}_K z0CS1ZJSvSAmBaSp?Jg8#jGY~y9t(y&wS^`u%Dx26&Dt?BF(JUb zOzR0k_Cl)h|{KNiDSXAQaq12efqDe&`7=4LNSWw6m?jMlDv7rOx-tUL;^P;x5f-p zJiezTIuoYRv=ksZC@vpN0wc>c2nndA(iyR_HjOt<83Bi*{bjPGczNIuD4~)cZu%Uw zzu=SJMdfyP8RX1Nkj?l9FJaNd6JeO9vzLso2M{d;sR@@4AunIP`}|o93=n&QmN_UV z#MRZq)N;%<3{*j{vjH{KStyew4741&vqqr~^RRjsgoQO>fv?sv9Pud{Q5)iZ@U3eb zAVZ7&s3s54NRTA}l0Rlk)BiRx?*f+x91ZlMSmt15WR%djs4tDsG0*}*P-SF)7R;Br z1Gk6TZy6rD6_d5OQ21_dy*g|o7i($AsrN=wQQumb8+lmk9+wGApk4&j#RS?hl_>$X z1sYL=q7)PY!_6xd@7Q53nk~#nLG!R8T;{NO>d8)BCyyV$B_Dn&Vr$43ghM4;TicA` z@n>k7hLx2Ss&c5zsN%Y!l%^c)$C{$+=U~3WIE*=JloeTT`KIgFBLh-17@w@F4#$EG 
z%B-)r=UmllH@FDKVbm02m7&A_@p|48ta-dviQ721aTujTqL6dKN0`w+ry7#Tdlp-T+Xx16=DA>x?{bPl%WCdUDem#oO!_=nMX*Xs|J%=@RI!qJX2khFmic&?$rFn;7DNoJ9T*)Wr-2AP68@?rVdz z&<%8FX5|SyB%rt+SNWDSj|GA#f9VqVSCg*2ELH{w(JGi-lHn{vuK;&D&#|29G}MfB z=5*hO*52fzzEEfY;U6(>L)zgFt9NstL0pC91lzM=YR>_tu(xooZ-ZhZ56wY~Fr{E3 z#eF>i>ZdZh-d1o8q1lPHFny>T7O#Q7*98U!{QBn4yY^iG$?4q9 z=U?Hfkb#O97tFq;SGl)E-LDC>;OHi z)9gTWux1muEMStrS=1VRIC~1DEvJRi{^D>Lg@jxX5W~We&jn^d#O5_GT0)+}=b>X@ z0v2y4aC|L-JJ3ro@*en@dC@2h9;Z_5d6%xJ^D}uKuef+6d6J21xME%b93{6hM%E|eYlr|;w_qz^XBqG1N8NwF2Qv|pL0lM8LN?h&II zhTh>QL+W7Bk;jJA0iCw1Y9FZC8eR>HYSS`64?gNTL2`Cv%oDq;k&-nuBPrmVakgM&G6BDYcJ)tTXRShSHKDN@8X9NPPKB9{3h{Pop8FA=GXwtiOMX+8utuP#ubG&?@@Ih=a zmuufaBD0)=!Uxq1{W$POKrg@4hmsY|NC1&r0TL$fz2GGI0v4W2Ugft(*tRwqF0WRJLzL<$HTzw-TfSpef{>93FHchChkyxME z+_3ntBG$n*>gA0~o&sUaDq@tMW`i&0{u^22@7#uJnFaql$tZ zARrp4Kj3B)Ykqxe1G2}4x{*wnF72L^7ggS({RNDWB7g)mxsl)s4SC4x!{s=4f2z0W zE)|dI3!D3~D-X$pzk=m379B#c7B<1?5e9%sFx}`;98(jTQ3<6WSmv-z$k7Bgv~j?t z3cxfoRDWQDnGAQpL}QJ^p$1C;LqafdjUVQI0`{WJF1{sRe^eMIdvV?Sgz+PuwGD-d zOL^{@^=-sJb3csEaks8| zvH*Hb#uey?K(qR|%sRPJtjda{T^t*-%1m~b^8Qz*aN_zq=iKGXw8S~V8OYu+Lna;~ zr#LvgZ=);#?{xu2(?G#t6~0(BTS=7#-bU_L+@ zIEd^q+7#0~RZA6I(#`B7^FZ&hb8>Rl!Km3c;9iJcfnRyVzs^reedQixkmfJd)w53S zQRV?q$i(%IG*{8OPnzf0O{vYlf5v(lprw&7IL2Hcl@wJvPNr3jU+e>Wz9Up6Z-9&C zgzU%OU?`slj-Zv2u=B#R&A3J)fLIT+4(~NjF1bL;KFS4+3j?{%3z|DR| zm$7x(a8I*_1n30~MeEU#pv(sA(+nYwNvQ1<)Qu|h?~FVRNJ1w5iiNc*ha6mzxtKzY zXmHTxjJ{Gy{!B5lMh#{03p7 z6!aq{B^NEJ*@wysuZY3KGc-qS0k--$%oI|Ds0i&ztSXDlxM1L;2qV3%TWK2Fl_g7xY?-?HhKYeXP#6X>ttYbf<6#GwB++zKIBwlpCc)^h^KeU&;Z^2c*>#fuot#$yXMHJs zxGxkcA8wd`dJ1OHYPoetVX+)c(jnh9mjGaH4Qho7#83&MZZMUnwkAyb-Z~wWp8cAd zii&#rdU|hQQnVFY0r-BuK&_k+6%{3;2^1gCJ`xH!Cg|O$Vu2-hLM|Veudp?34t9l2 z&H32eVAMSVd&su0H&y!~w0i-Rf=LGh8}_If=Q(HnR~X`W7K;p+a9G}=h6oOhO(Awe zn8FPh`A0}O8o&+1$MP(YeIKbO$PWd8w*U|G$m77{z7C#x7GaAFQ$dVF2XrO9DJ< zDQ?j)v8&9a0J3Sx{7%d_2k87zAqVjdt~(A2jZPpfs$7xdo45(7IA5w6O%;GKcEL`A ztuzA!C>SJ1!f?LMj^NIQo@3V#7g>QRF04EZ8UpVS!dytrFSjFLs4vRoicA2m>^xWjjm&>A~7d`a3O-M?<}g26#fz 
zGtGlqF`+7Ot%-vg6nXp?p-Em5`sf#OOO{>;z`KLShU~(tC7Z+&r>-3sOF?irzYnbw zTsm>+TW+T9gpvjfv<4%3IHS~SI&g2my~crIANa9QP_r1Zz?a2l0(XFB3_&xc0u<(% zCV18&V7cB6f2Q*<2K0)?dB9uu@FZ;5Krt>G@(63{I z)vdANVZtll$E;Vu`aa{9(U z_z5?``csy@+zqoQK<#RLc71loE&qD{iHoRD0-)7f8s|w4RLY=Q5;_Txa)#XYYBl^FTv| zO(ssj`B-US`rHdV>(b$bMB74O7K#OqpMcJHKQtFOkI!KQ%*-~xS!{>R7QjKO<2x%s zyRtyJ6Ihgh&oC0uIrG)w$phf#t+S6n&3$0&aD}(tPSCoWU%(TtK&NzT0ZXRGf2@F0 z*tQ_Gz=J<7==j-w4f)<$3=BZvhQtNH^+Q1SyaOLEHW#>vl0yLKC`-`fIS6n+mZ4-G e0As4)&41>iw#nQlFPIg8jQ4c)b6Mw<&;$U7GNjf3 literal 0 HcmV?d00001 diff --git a/nbs/index.ipynb b/nbs/index.ipynb index 2f640e2..aa4ae64 100644 --- a/nbs/index.ipynb +++ b/nbs/index.ipynb @@ -31,11 +31,11 @@ "\n", "### Installation\n", "- to install necessary dependencies use available `Makefile`, you can use `python>=3.10`:\n", - " ```shell\n", + " ```\n", " make install\n", " ```\n", "- if you want to run evaluation and fine-tuning with `unsloth`, use the following command with `python=3.10` inside conda environment:\n", - " ```shell\n", + " ```\n", " make install_unsloth\n", " ```\n", "\n", diff --git a/nbs/sidebar.yml b/nbs/sidebar.yml new file mode 100644 index 0000000..cddc5b9 --- /dev/null +++ b/nbs/sidebar.yml @@ -0,0 +1,19 @@ +website: + sidebar: + contents: + - index.ipynb + - section: Data + contents: + - Data/00_prepare_instruction_dataset_for_ir.ipynb + - Data/01_Dataset_Description.ipynb + - Data/02_Analyse_sft.ipynb + - Data/03_Graph_Dataset_Description.ipynb + - Data/04_Graph_Analysis.ipynb + - section: Dataset Cards + contents: + - Dataset Cards/01_Dataset_Description_Raw.ipynb + - Dataset Cards/02_Dataset_Description_Instruct.ipynb + - Dataset Cards/03_Graph_Description.md + - section: Presentations + contents: + - Presentations/00_workshop_demo.ipynb