utilities

Some utilities

Added the following files/directories

ML - Machine Learning
pdfsplit.py - PDF splitting using Python
subtitle-processing.py - Subtitle text extraction using a graphical interface
weekly_lsp_stats.py - MySQL logs database processing
print_unicode_range.py - print the Unicode range from the start to the end value given in argv
#mongorestore.py - to restore mongo collection
#pet2concordance.py - postedit to concordance db
subtitle.py - extract text from an .srt file and create four types of files (placeholder, with new lines, without new lines, story mode)
data_range_API2concordance.py - parse an API response
list_match.py - find and replace words in the input file using a mapping file
ngram-generator.py - generate n-gram frequencies from an input text file
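
As an illustration of the n-gram frequency idea behind ngram-generator.py, here is a minimal sketch. It is not the script's actual code: the function name `ngram_frequencies`, the whitespace tokenization, and the tab-separated output are all assumptions made for the example.

```python
from collections import Counter


def ngram_frequencies(text, n=2):
    """Count word-level n-grams in text.

    Assumption: tokens are split on whitespace; the real script may
    tokenize differently.
    """
    tokens = text.split()
    # Zip n shifted copies of the token list to form sliding windows.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat slept"
    for gram, count in ngram_frequencies(sample, n=2).most_common(3):
        print(f"{gram}\t{count}")
```

Using a `Counter` keeps the counting in one pass and makes it easy to list the most frequent n-grams first.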

201907231515

tmx2tab.py - Extract text from TMX into tab-separated format
pdf2text.py/pdf2doc.py - Extract text from PDF into txt/doc
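
The TMX to tab-separated conversion can be sketched with the standard library alone. This is an assumed approach, not the code in tmx2tab.py: the function name `tmx_to_rows` and the inline sample are hypothetical, and the sketch only relies on the usual TMX layout of `tu` elements holding `tuv` variants, each with a `seg`.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample; a real TMX file carries a fuller header.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Hello world</seg></tuv><tuv xml:lang="fr"><seg>Bonjour le monde</seg></tuv></tu>
</body></tmx>"""


def tmx_to_rows(tmx_text):
    """Return one tab-separated line per translation unit (<tu>)."""
    root = ET.fromstring(tmx_text)
    rows = []
    for tu in root.iter("tu"):
        segments = []
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            # itertext() also collects text nested inside inline markup.
            text = "".join(seg.itertext()) if seg is not None else ""
            # Collapse internal whitespace so tabs only separate columns.
            segments.append(" ".join(text.split()))
        rows.append("\t".join(segments))
    return rows


if __name__ == "__main__":
    for row in tmx_to_rows(SAMPLE):
        print(row)
```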

201903301150

mongorestore.py/sh - restore Mongo collections one by one
pet2concordance.py - Extract from the postedit db and insert into the concordance db

201905091439

docxtable2text.py - convert table text in doc/docx into tab-separated text

201905241149

generate_docx.py - generate a docx file from a txt file

2019101710434

bulk_pdf_extract_api.py - extract text from PDFs in an input folder using Tika
create_srt.py - create an .srt file from a transcription file with a timeline
prime.py - script to check for prime numbers
vlc_mp4_flac.py - convert mp4 files to flac using the vlc command, from an input folder to an output folder
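
For the prime check mentioned for prime.py, a minimal sketch using trial division is shown below. The function name `is_prime` and the choice of trial division are assumptions; the script itself may use a different method.

```python
import math


def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Only odd divisors up to the integer square root need checking.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


if __name__ == "__main__":
    print([n for n in range(30) if is_prime(n)])
```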

20191181616

normalizer.py - remove white spaces, tabs, and new lines
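
One possible reading of this normalization is sketched below: runs of spaces, tabs, and new lines are collapsed into a single space and the ends are trimmed. Whether normalizer.py collapses or fully deletes whitespace is not stated, so the behavior and the name `normalize_whitespace` are assumptions.

```python
import re


def normalize_whitespace(text):
    """Collapse runs of spaces, tabs, and new lines into single spaces.

    Assumption: words should stay separated by one space rather than
    being joined together.
    """
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize_whitespace("  line one\t\tstill line one\n\nline two  "))
```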

20191181704

remove_rich_text.py - remove tags and the text between tags from a rich text transcription file
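
A rough sketch of this kind of tag stripping follows. It assumes the tags look like HTML-style `<tag>...</tag>` pairs and that they are not nested inside a tag of the same name; the function name `strip_rich_text` and the sample line are hypothetical, and remove_rich_text.py may work differently.

```python
import re

# A paired tag and everything between its opening and closing forms.
PAIRED_TAG = re.compile(r"<(\w+)\b[^>]*>.*?</\1>", re.DOTALL)
# Any leftover standalone tag such as <br/>.
LONE_TAG = re.compile(r"<[^>]+>")


def strip_rich_text(text):
    """Remove tags and the text enclosed by paired tags."""
    text = PAIRED_TAG.sub("", text)
    text = LONE_TAG.sub("", text)
    # Tidy up the double spaces left behind by removed spans.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    print(strip_rich_text("Hello <i>(laughs)</i> there<br/>"))
```

Regular expressions are enough for flat markup like this; deeply nested markup would call for a real parser.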

Folders and files

ML
pdftest
README.md
bulk_pdf_extract_api.py
clean_corpus.py
clean_tsv_text.py
config_tp.py
count_sen_words.py
create_srt.py
data_range_API2concordance.py
docx_lsp.py
docxtable2text.py
generate_docx.py
hyphenated_words.py
insert_space.py
lang_pair_wise_stats.py
mongorestore.py
mongorestore.sh
ngram-generator.py
normalizer.py
pdf2doc.py
pdf2text.py
pdfsplit.py
pet2concordance.py
phrase_and_training_data.py
prime.py
print_duplicate_line_no.py
print_unicode_range.py
pypdf2.txt
remove_bkp.py
remove_rich_text.py
run_shell_cmd.py
sort-dict.py
split_corpus.py
str_to_hex.py
subtitle-processing.py
subtitle.py
task_wise.py
tmx2tab.py
txt_to_unicode.py
user_stats.py
verb_feature_convert.py
vlc_mp4_flac.py
weekly_lsp_stats.py

nagaraju291990/utilities