Testing parse_theses
ricardonpa committed Jul 2, 2024
1 parent 343af28 commit 25c9507
Showing 2 changed files with 126 additions and 136 deletions.
42 changes: 0 additions & 42 deletions publications/theses.yaml
@@ -47,127 +47,85 @@
URL: https://etda.libraries.psu.edu/paper/12437/
- type: 'PhD Thesis'
metadata: 'SungHoon Lee: A study of ionic materials for the energy applications through first-principles calculations and CALPHAD modeling, June 2011'
URL:
- type: 'PhD Thesis'
metadata: 'Arkapol Saengdeejing: A Computational Study of Superconducting Materials: A Case Study in Carbon-Doped MgB2, June 2011'
URL:
- type: 'PhD Thesis'
metadata: 'Guang Sheng: Phase-field simulation of phase transitions, domain stabilities and structures in ferroelectric thin films, March 2011'
URL:
- type: 'PhD Thesis'
metadata: 'Swetha Ganeshan: A first-principles study of elastic and diffusion properties of Mg based alloys, October 2010'
URL:
- type: 'PhD Thesis'
metadata: 'James Saal: Thermodynamic Modeling of Phase Transformations: Cobalt Oxides, September 2010'
URL:
- type: 'PhD Thesis'
metadata: 'Hui Zhang: Thermodynamic Properties of Mg Based Alloys by CALPHAD Approach Couple with First-Principles: Application of Mg-Al-Ca-Ce-Si System, August 2010'
URL:
- type: 'PhD Thesis'
metadata: 'Weiming Feng: Phase-Field Models of Microstructure Evolution and New Numerical Strategies, 2009'
URL:
- type: 'PhD Thesis'
metadata: 'Manjeera Mantina, A First-Principles Methodology for Diffusion Coefficients in Metals and Dilute Alloys, 2008'
URL:
- type: 'PhD Thesis'
metadata: 'Jingxian Zhang, Phase-field simulations of microstructures involving long-range elastic, magnetostatic and electrostatic interactions, 2007'
URL:
- type: 'PhD Thesis'
metadata: 'Dongwon Shin, Thermodynamic properties of solid solutions from special quasirandom structures and CALPHAD modeling: Application to Al-Cu-Mg-Si and Hf-Si-O, 2006'
URL:
- type: 'PhD Thesis'
metadata: 'Soon Il Lee, Defect-phase equilibrium and ferroelectric phase transition behavior in non-stoichiometric BaTiO3 under various equilibrium conditions, 2006'
URL:
- type: 'PhD Thesis'
metadata: 'Tao Wang, An integrated approach for microstructure simulation: application to Ni-Al-Mo alloys, 2006'
URL:
- type: 'PhD Thesis'
metadata: 'Shengjun Zhang, Thermodynamic Investigation of the Effect of Alkali Metal Impurities on the Processing of Al and Mg Alloys, 2006.'
URL:
- type: 'PhD Thesis'
metadata: 'Yu Zhong, Investigation in Mg-Al-Ca-Sr-Zn System by computational thermodynamics approach coupled with first-principles energetics and experiments, 2005.'
URL:
- type: 'PhD Thesis'
metadata: 'William J. Golumbfskie, Modeling of the Al-rich region of the Al-Co-Ni-Y system via computational and experimental methods for the development of high temperature Al-based alloys, 2005'
URL:
- type: 'PhD Thesis'
metadata: 'Chao Jiang, Theoretical studies of aluminum and aluminide alloys using calphad and first-principles approach, 2004'
URL:
- type: 'PhD Thesis'
metadata: 'Koray Ozturk, Investigation in Mg-Al-Ca-Sr system by computational thermodynamics approach coupled with first-principles energetics and experiments, 2003'
URL:
- type: 'MS Thesis'
metadata: 'Ross, Austin: Solubility of Oxygen and Hydrogen and Diffusivity of Oxygen in the fcc Phase of the Al-fe-ni-h-o System with Application to the Formation of a Protective α-al2o3 Scale at High Temperatures. July 2015.'
URL:
- type: 'MS Thesis'
metadata: 'Wang, Yi: Structure evolution, diffusivity and viscosity of binary al-based and ni-based metal melts: ab initio molecular dynamics study. October 2012.'
URL:
- type: 'MS Thesis'
metadata: 'Zhang, Lei: Thermodynamic investigation of transition metal oxides via CALPHAD and first-principles methods. June 2013'
URL:
- type: 'MS Thesis'
metadata: 'Yan (Annabelle) Ling, First-principles calculations and thermodynamic modeling of the Hf-Re binary system with extension to the Hf-Ni-Re ternary system, 2011'
URL:
- type: 'MS Thesis'
metadata: 'Bradley Hasek, Thermodynamic modeling and first-principles calculations of the Cr-Hf-Y ternary system, 2010'
URL:
- type: 'MS Thesis'
metadata: 'James E. Saal, Thermodynamic modeling of the reactive sintering of Nd:YAG, 2007'
URL:
- type: 'MS Thesis'
metadata: 'Mei Yang, Thermodynamic modeling of La1-xSrxCoO3, 2006'
URL:
- type: 'MS Thesis'
metadata: 'William J. Golumbfskie, Fracture toughness of spray formed Al-Y-Ni-Co alloys, 2002.'
URL:
- type: 'MS Thesis'
metadata: 'Carl Owen Brubaker, Computational and experimental investigations of phase equilibria in magnesium alloy systems, 2002'
URL:
- type: 'BS Thesis'
metadata: 'Frank P. McGrogan, Thermodynamic modeling of LixMn2O4 spinel for Li-ion battery cathode applications, 2013'
URL:
- type: 'BS Thesis'
metadata: 'Yan (Annabelle) Ling, First-principles calculations and thermodynamic modeling of the Hf-Re binary system with extension to the Hf-Ni-Re ternary system, 2011'
URL:
- type: 'BS Thesis'
metadata: 'Abdelaziz M. Elmadani, Effect of Lead Oxide Vapor on the Strength of Alumina, 2010'
URL:
- type: 'BS Thesis'
metadata: 'Bradley Hasek, Thermodynamic modeling and first-principles calculations of the Cr-Hf-Y ternary system, 2010'
URL:
- type: 'BS Thesis'
metadata: 'Laura Jean Lucca, An Experimental Investigation of the Mg-Al Binary System, 2010'
URL:
- type: 'BS Thesis'
metadata: 'Justin T. Savrock, Computational Modeling of the Ce-Sn Binary System Using Thermo-Calc, 2010'
URL:
- type: 'BS Thesis'
metadata: 'Chad M. L. Althouse, CALPHAD modeling of BaTi2O5, BaTi5O11, and Ba1.054 Ti0.946O2.946 in the BaO-TiO2 system, 2007'
URL:
- type: 'BS Thesis'
metadata: 'Tuan Tran, Computational modeling of the Sr-Si binary system by using thermo-calc, 2006'
URL:
- type: 'BS Thesis'
metadata: 'Matt Benzio, Phase stability in the Al-Mg binary system, 2004'
URL:
- type: 'BS Thesis'
metadata: 'Justin Hyska, Computational modeling of the B-Ba, B-Ca, and B-Sr systems using thermo-calc, 2004'
URL:
- type: 'BS Thesis'
metadata: 'Joseph Harvey, The verification of Dictra modeled liquation in Al-3Cu, 2003'
URL:
- type: 'BS Thesis'
metadata: 'Jason Arndt, Computational modeling of the K-Na and F-Na binary systems using thermo-calc, 2002'
URL:
- type: 'BS Thesis'
metadata: 'Roger Ice, Computational modeling of the Na-F and K-F binary systems using thermo-calc, 2002'
URL:
- type: 'BS Thesis'
metadata: 'Briama Cooper, Computer simulation of liquation in Al-Cu, 2001'
URL:
- type: 'BS Thesis'
metadata: 'Melissa Marshall, A computational thermodynamic analysis of atmospheric magnesium production, Honor, 2000'
URL:
- type: 'BS Thesis'
metadata: 'Ricki Stevenson, Grain Size and its Effect on the Formation of Continuous Versus Discontinuous Precipitation, 2000'
URL:
220 changes: 126 additions & 94 deletions yaml2md.py
@@ -38,6 +38,130 @@ def parse_pressprints(header, file):

return parsed_entries


def parse_articles(header, file):

# Read the entries from the YAML file
with open('publications/' + file, 'r', encoding='utf-8') as f:
entries = yaml.safe_load(f)

# Count how many entries contain 'bumpyear' as a key in 'entries'
bumpyears = sum(1 for item in entries if 'bumpyear' in item) + 1

# Split 'entries' into lists at the position where a 'bumpyear' key appears
split_entries = []
temp_list = []
for item in entries:
if 'bumpyear' in item:
if temp_list:
split_entries.append(temp_list)
temp_list = [item]
else:
temp_list.append(item)
if temp_list:
split_entries.append(temp_list)

# Create a dictionary with 'bumpyears' items
bumpyear_dict = {1987 + bumpyears - i: split_entries[i] for i in range(bumpyears)}

# Remove entries containing 'bumpyear' as a key from the sublists in 'bumpyear_dict'
for year, sublist in bumpyear_dict.items():
bumpyear_dict[year] = [item for item in sublist if 'bumpyear' not in item]

parsed_entries = f"# {header}\n---\n## {max(bumpyear_dict)//10*10}'s\n\n"

# Create a string to store the formatted entries
id = len(entries) - bumpyears + 1
for key, value in bumpyear_dict.items():
if key%10 == 9 and key != max(bumpyear_dict):
parsed_entries += f"## {key//10*10}'s\n\n"

parsed_entries += f"### {key} ({id} - {id-len(value)+1})\n\n"
for i, entry in enumerate(value):
if "ID_deprecated" in entry and entry["ID_deprecated"] != None:
id = int(entry["ID_deprecated"])

entryString = f"<div id='{id}'></div> **{id}.** {entry['authors']}, _{entry['title']}_, {entry['metadata']}\n\n"
URLs = []
try:
url = entry['DOI']
URLs.append(f"DOI: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry.get('DOI'):
URLs.append(f"DOI: [{url}]({url})")
try:
url = entry['arXiv']
URLs.append(f"arXiv: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry.get('arXiv'):
URLs.append(f"arXiv: [{url}]({url})")
try:
url = entry['URL']
URLs.append(f"URL: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry.get('URL'):
URLs.append(f"URL: [{url}]({url})")

try:
bibentry = cn.content_negotiation(ids = entry['DOI'], format = "bibentry")
entryString += f"<button onclick='copyToClipboard(\"bib{id}\")'><i class='fas fa-copy'></i></button>\n" + " \| ".join(URLs) + f"\n<p id='bib{id}' style='display: none;'>{bibentry}</p>\n\n"
except Exception as e:
entryString += " \| ".join(URLs) + "\n\n"
print(f"An error occurred while processing the BIBENTRY DOI {entry['DOI']}: {e}")

try:
result = cr.works(ids = entry['DOI'])
if 'abstract' in result['message']:
abstract = result['message']['abstract'].replace('<jats:p>', '').replace('</jats:p>', '').replace('<jats:title>Abstract</jats:title>', '')
entryString += f"\n<details style='margin-bottom: 20px;'>\n <summary>Abstract:</summary>\n \n {abstract}\n</details>"
except Exception as e:
print(f"An error occurred while processing the ABSTRACT of DOI {entry['DOI']}: {e}")

parsed_entries += entryString + "\n\n"
id -= 1

return parsed_entries
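
The year-bucketing step of parse_articles can be read in isolation. The following is a minimal standalone sketch, assuming the YAML list is ordered newest-first and that each `bumpyear` marker separates one year from the next older one. The names `split_by_bumpyear` and `base_year` are hypothetical and do not appear in yaml2md.py; the default of 1987 mirrors the constant used above.

```python
def split_by_bumpyear(entries, base_year=1987):
    """Group entries by year, newest first, splitting at 'bumpyear' markers.

    Assumption: entries are ordered newest-first and every marker entry
    (a dict containing the key 'bumpyear') ends one year and starts the
    next older one. Marker entries themselves are dropped.
    """
    groups = [[]]
    for item in entries:
        if 'bumpyear' in item:
            groups.append([])        # marker: start collecting the next older year
        else:
            groups[-1].append(item)
    # With N groups, the newest year is base_year + N and the oldest is base_year + 1.
    newest = base_year + len(groups)
    return {newest - i: group for i, group in enumerate(groups)}


sample = [
    {'title': 'a'},
    {'title': 'b'},
    {'bumpyear': None},
    {'title': 'c'},
]
by_year = split_by_bumpyear(sample)
```

Unlike the loop in parse_articles, this sketch keeps an empty group when the list starts with a marker or when two markers are adjacent, so the year keys stay aligned with the number of markers instead of running past the end of the split list.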


def parse_theses(header, file):
parsed_entries = f"# {header}\n---\n"

# Read the entries from the YAML file
with open('publications/' + file, 'r', encoding='utf-8') as f:
entries = yaml.safe_load(f)

for type in ['PhD Thesis','MS Thesis','BS Thesis']:
parsed_entries += f"## {type.upper()}\n\n"
No = len([entry for entry in entries if entry['type'] == type])
for entry in entries:
if entry['type'] == type:
entryString = f"**{No}\.** {entry['metadata']}\n\n"
URLs = []
try:
url = entry['URL']
URLs.append(f"URL: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry.get('URL'):
URLs.append(f"URL: [{url}]({url})")
try:
url = entry['arXiv']
URLs.append(f"arXiv: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry.get('arXiv'):
URLs.append(f"arXiv: [{url}]({url})")
try:
url = entry['Recording']
URLs.append(f"Recording: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry.get('Recording'):
URLs.append(f"Recording: [{url}]({url})")
entryString += " \| ".join(URLs) + "\n\n"
parsed_entries += entryString
No -= 1

return parsed_entries
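
The three try/except blocks in parse_theses repeat one pattern: look up an optional key, shorten the link text with a regular expression, and fall back to the raw value. A possible consolidation is sketched below. `format_link` and `format_links` are hypothetical helpers that are not part of this commit, and the sketch assumes each link field is either absent, empty, or a string.

```python
import re

# Same pattern as in yaml2md.py: skip the scheme and an optional host segment,
# then capture the remainder as the link text.
LINK_PATTERN = re.compile(r'https?://([^/]+/)?(.+)')


def format_link(entry, key):
    """Return 'key: [short text](url)', or None if the key is missing or empty."""
    url = entry.get(key)
    if not url:
        return None
    match = LINK_PATTERN.search(url)
    label = match.group(2) if match else url   # fall back to the raw value
    return f"{key}: [{label}]({url})"


def format_links(entry, keys=('URL', 'arXiv', 'Recording')):
    """Join all available links for one entry with an escaped pipe separator."""
    links = [format_link(entry, key) for key in keys]
    return " \\| ".join(link for link in links if link)


thesis = {'type': 'PhD Thesis', 'URL': 'https://etda.libraries.psu.edu/paper/12437/'}
line = format_links(thesis)
```

Because the lookup uses `dict.get`, an entry with no `URL`, `arXiv`, or `Recording` key simply contributes no link, which matters once the empty `URL:` lines are removed from theses.yaml.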


def parse_others(header, file):
parsed_entries = f"# {header}\n---\n"

@@ -73,84 +197,7 @@ def parse_others(header, file):

formatted_entries += parse_pressprints('IN-PRESS', 'inpress.yaml')

# Read the entries from the YAML file
with open('publications/articles.yaml', 'r', encoding='utf-8') as f:
entries = yaml.safe_load(f)

# Count how many entries contain 'bumpyear' as a key in 'entries'
bumpyears = sum(1 for item in entries if 'bumpyear' in item) + 1

# Split 'entries' into lists at the position where a 'bumpyear' key appears
split_entries = []
temp_list = []
for item in entries:
if 'bumpyear' in item:
if temp_list:
split_entries.append(temp_list)
temp_list = [item]
else:
temp_list.append(item)
if temp_list:
split_entries.append(temp_list)

# Create a dictionary with 'bumpyears' items
bumpyear_dict = {1987 + bumpyears - i: split_entries[i] for i in range(bumpyears)}

# Remove entries containing 'bumpyear' as a key from the sublists in 'bumpyear_dict'
for year, sublist in bumpyear_dict.items():
bumpyear_dict[year] = [item for item in sublist if 'bumpyear' not in item]

formatted_entries += f"# ARTICLES\n---\n## {max(bumpyear_dict)//10*10}'s\n\n"

# Create a string to store the formatted entries
id = len(entries) - bumpyears + 1
for key, value in bumpyear_dict.items():
if key%10 == 9 and key != max(bumpyear_dict):
formatted_entries += f"## {key//10*10}'s\n\n"

formatted_entries += f"### {key} ({id} - {id-len(value)+1})\n\n"
for i, entry in enumerate(value):
if "ID_deprecated" in entry and entry["ID_deprecated"] != None:
id = int(entry["ID_deprecated"])

entryString = f"<div id='{id}'></div> **{id}.** {entry['authors']}, _{entry['title']}_, {entry['metadata']}\n\n"
URLs = []
try:
url = entry['DOI']
URLs.append(f"DOI: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry['DOI']:
URLs.append(f"DOI: [{url}]({url})")
try:
url = entry['arXiv']
URLs.append(f"arXiv: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry['arXiv']:
URLs.append(f"arXiv: [{url}]({url})")
try:
url = entry['URL']
URLs.append(f"URL: [{re.search(r'https?://([^/]+/)?(.+)', url).group(2)}]({url})")
except Exception as e:
if entry['URL']:
URLs.append(f"URL: [{url}]({url})")

try:
bibentry = cn.content_negotiation(ids = entry['DOI'], format = "bibentry")
entryString += f"<button onclick='copyToClipboard(\"bib{id}\")'><i class='fas fa-copy'></i></button>\n" + " \| ".join(URLs) + f"\n<p id='bib{id}' style='display: none;'>{bibentry}</p>\n\n"
except Exception as e:
entryString += " \| ".join(URLs) + "\n\n"
print(f"An error occurred while processing the BIBENTRY DOI {entry['DOI']}: {e}")

try:
result = cr.works(ids = entry['DOI'])
if 'abstract' in result['message']:
abstract = result['message']['abstract'].replace('<jats:p>', '').replace('</jats:p>', '').replace('<jats:title>Abstract</jats:title>', '')
entryString += f"\n<details style='margin-bottom: 20px;'>\n <summary>Abstract:</summary>\n \n {abstract}\n</details>"
except Exception as e:
print(f"An error occurred while processing the ABSTRACT of DOI {entry['DOI']}: {e}")

formatted_entries += entryString + "\n\n"
id -= 1
formatted_entries += parse_articles('ARTICLES', 'articles.yaml')

formatted_entries += parse_others('CONFERENCE PROCEEDINGS AND REPORTS','proceedingsandreports.yaml')

@@ -160,22 +207,7 @@ def parse_others(header, file):

formatted_entries += parse_others('WEBCASTS', 'webcasts.yaml')

formatted_entries += "# THESES\n---\n"

# Read the entries from the YAML file
with open('publications/theses.yaml', 'r', encoding='utf-8') as f:
entries = yaml.safe_load(f)

for type in ['PhD Thesis','MS Thesis','BS Thesis']:
formatted_entries += f"## {type.upper()}\n\n"
No = len([entry for entry in entries if entry['type'] == type])
for entry in entries:
if entry['type'] == type:
if 'URL' in entry:
formatted_entries += f"**{No}\.** [{entry['metadata']}]({entry['URL']})\n\n"
else:
formatted_entries += f"**{No}\.** {entry['metadata']}\n\n"
No -= 1
formatted_entries += parse_theses('THESES', 'theses.yaml')

# Write the formatted entries to a .md file
with open('index.md', 'w', encoding='utf-8') as f: