Developed by an M.Eng. student in AI & Data Science who regularly prepares project reports in LaTeX. This means summarizing results in tables, as many scientific papers do. In Machine Learning projects, the results are most likely the output of Python code. So why copy them out of Python and then struggle to build the table in LaTeX?
The goal is to use Python to translate these results directly into LaTeX format, with a reasonable amount of customization, while still working in Python!
- Work with multilevel columns.
- Load data from a DataFrame, a CSV file or a dictionary.
- Automatically highlight the max/min by rows/cols.
You can clone the repository or simply install it from PyPI using pip:
pip install pytabtex
For the moment, the library targets one specific use case: tables that assemble results.
The columns parameter is a dictionary where each key is the name of a column and each value is either 0 if the column has no subcolumns, or a list of subcolumn names if it does.
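To make the format concrete, here is a minimal sketch of how such a dictionary could map onto a LaTeX header. The helpers `count_leaf_columns` and `header_rows` are hypothetical names introduced for illustration only; they are not part of pytabtex and do not claim to reproduce its output.

```python
def count_leaf_columns(columns):
    """Number of physical table columns described by a `columns` dictionary."""
    # A value of 0 means "no subcolumns", so the column counts once.
    return sum(len(sub) if sub else 1 for sub in columns.values())


def header_rows(columns):
    """Build LaTeX header rows for a `columns` dictionary (illustrative only)."""
    if not any(columns.values()):
        # Simple columns: a single header row.
        return [" & ".join(columns) + r" \\"]
    top, bottom = [], []
    for name, sub in columns.items():
        if sub:
            # A parent column spans all of its subcolumns.
            top.append(r"\multicolumn{%d}{c}{%s}" % (len(sub), name))
            bottom.extend(sub)
        else:
            top.append(name)
            bottom.append("")  # empty cell under a column without subcolumns
    return [" & ".join(top) + r" \\", " & ".join(bottom) + r" \\"]


columns = {"column 1": ["Sub-column 1", "Sub-column 2"], "column 2": 0, "column 3": 0}
print(count_leaf_columns(columns))  # 4
for row in header_rows(columns):
    print(row)
```

The leaf count is what a `tabular` column specification would need, which is why a column with two subcolumns counts twice.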
Example (simple columns) :
Desired output :
Python code :
columns = {"column 1" : 0, "column 2" : 0, "column 3" : 0, "column 4" : 0}
Example (multilevel columns) :
Desired output :
Python code :
columns = {"column 1": ["Sub-column 1", "Sub-column 2"],
           "column 2": 0,
           "column 3": 0}
The body of the table can be a dictionary, a pandas DataFrame or a CSV file path.
Example
Dictionary
{
"id 1" : [29, 31, 90],
"id 2" : [97, 78, 67]
}
DataFrame
If the DataFrame has a header, don't include it.
id 1 | 29 | 31 | 90 |
id 2 | 97 | 78 | 67 |
CSV file
If the CSV file has a header, don't include it.
csv_file = r"path\to\csv"  # raw string, so that "\t" is not read as a tab
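The three body formats carry the same information: one row label followed by that row's values. The sketch below shows this equivalence using only the standard library. It assumes the first field of each CSV line is the row label, as in the DataFrame example above; `rows_from_dict` and `rows_from_csv` are hypothetical helpers, not pytabtex functions.

```python
import csv
import io


def rows_from_dict(body):
    """Turn {"row label": [values...]} into a list of rows, label first."""
    return [[label, *values] for label, values in body.items()]


def rows_from_csv(handle):
    """Read a headerless CSV; blank lines are skipped."""
    return [row for row in csv.reader(handle) if row]


body = {"id 1": [29, 31, 90], "id 2": [97, 78, 67]}
print(rows_from_dict(body))

# An in-memory stand-in for a headerless CSV file on disk.
csv_text = "id 1,29,31,90\nid 2,97,78,67\n"
print(rows_from_csv(io.StringIO(csv_text)))
```

Note that `csv.reader` returns every field as a string, so values read from a CSV need converting before any numeric comparison such as max/min highlighting.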
- columns : columns defined in the format specified above.
- body : table rows in one of the different formats.
- title : title of the table if needed.
- highlight : {func : axis}, defined as follows - max by rows : {"max": 0}, max by cols : {"max": 1}, min by rows : {"min": 0}, min by cols : {"min": 1}
- orientation : "P" for portrait and "L" for landscape
- position : position of the table, "htbp" by default; see the LaTeX documentation for other placement specifiers.
- align_cols : "c" center, "l" left, "r" right.
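As a rough illustration of what the highlight option selects, the sketch below finds the cells to emphasize in a purely numeric body. It assumes axis 0 means "by rows" and axis 1 means "by columns", and that ties are all highlighted; `highlight_cells` is a hypothetical helper, not the library's implementation.

```python
def highlight_cells(rows, highlight):
    """Return the set of (row, col) positions to emphasize.

    `rows` holds numeric values only (no row labels).
    `highlight` is {"max": axis} or {"min": axis}; axis 0 is taken to mean
    per row and axis 1 per column (an assumption about the convention).
    """
    func_name, axis = next(iter(highlight.items()))
    pick = {"max": max, "min": min}[func_name]
    cells = set()
    if axis == 0:
        for i, row in enumerate(rows):
            best = pick(row)
            cells.update((i, j) for j, value in enumerate(row) if value == best)
    else:
        # zip(*rows) transposes the body so each tuple is one column.
        for j, col in enumerate(zip(*rows)):
            best = pick(col)
            cells.update((i, j) for i, value in enumerate(col) if value == best)
    return cells


rows = [[29, 31, 90], [97, 78, 67]]
print(sorted(highlight_cells(rows, {"max": 0})))  # [(0, 2), (1, 0)]
print(sorted(highlight_cells(rows, {"min": 1})))  # [(0, 0), (0, 1), (1, 2)]
```

In a LaTeX table the selected cells would typically be wrapped in `\textbf{...}`.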
from pytabtex import Table
table = Table(
    columns=columns,
    body=body,
    title=None,
    highlight=None,
    caption=None,
    orientation="P",
    position="htbp",
    align_cols="c",
)
src : Zhenwen Li, Tao Xie Using LLM to select the right SQL Query from candidates
Definition of columns
columns = {"Model": 0, "Base": 0, "10 MTS": 0, "15 MTS": 0, "Fuzzing": 0, "SQLite Format": 0, "7-shot": 0, "9-shot": 0}
Body from csv file (No header)
open data1.csv
Create table
caption = """The prediction accuracy of GPT-3.5-turbo and GPT-4 on different hyper-parameters. “Base” is our baseline (5 MTS, Random Selection, the CSV database format, 5-shot), while the rest are by changing one of the hyper-parameters"""
table = Table(columns, "tests/data1.csv", caption=caption)
src : Rolf Jagerman, Honglei Zhuang, Zhen Qin, Xuanhui Wang, Michael Bendersky (2023) ‘Query Expansion by Prompting Large Language Models’
Definition of columns
columns = {
    "": ["Dataset", "BM25"],
    "Classical QE": ["Bo1", "Bo2", "KL"],
    "LLM-based QE": ["Q2D", "Q2D/ZS", "Q2D/PRF", "Q2E", "Q2E/ZS", "Q2E/PRF", "CoT", "CoT/PRF"]
}
Body from pandas Dataframe
import pandas as pd
body = pd.read_csv("tests/file.csv", header=None)
Create table
In this example we want to highlight the highest value by row.
caption = "Recall@1K of various prompts on BEIR using Flan-UL2"
highlight = {"max": 0}
table = Table(columns, body, caption=caption, highlight=highlight)