Crawing-List-of-Admitted-Schools-for-Students

Get the data by saving (Ctrl+S) the department list page of each department.

How to run code?

Note: I didn't clean up the code at the end!

1. Run git pull in a terminal.
2. Get the data by saving (Ctrl+S) the department list page of each department. Save the pages into the ./data/department folder.

   Alternatively, you may try a library such as Selenium.
3. Run get_test_location_url.ipynb.
4. Using the file whole.csv, save (Ctrl+S) each of the links, adding its sequence number [0-n] to the end of the default filename.
5. Run main.ipynb and move all of the output CSV files into another folder (they are still needed later). If you don't move them, running main_get_side_rank.ipynb will overwrite them.
6. Run main_get_side_rank.ipynb.
7. Paste the column named dept_rank from the new output into the original files (from step 5) as a new column.
8. Manually check whether rank and dept_rank (correct rate: 60%) are completely correct.

   To improve this, you may try to find a better OCR tool.
9. Hand in the result!
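
The dept_rank paste step above can also be scripted instead of done by hand. The following is a minimal sketch using only the standard library. It assumes both CSV files have a header row and list the same students in the same order; the function name `append_dept_rank` and all file paths are hypothetical, and only the column name `dept_rank` comes from the steps above.

```python
import csv


def append_dept_rank(original_csv, side_rank_csv, output_csv):
    """Copy the dept_rank column of side_rank_csv into original_csv's rows."""
    with open(original_csv, newline="", encoding="utf-8") as f:
        original_rows = list(csv.DictReader(f))
    with open(side_rank_csv, newline="", encoding="utf-8") as f:
        side_rows = list(csv.DictReader(f))

    # Assumption: row i of both files describes the same student.
    if len(original_rows) != len(side_rows):
        raise ValueError("row counts differ; the files cannot be aligned by position")

    fieldnames = list(original_rows[0].keys()) if original_rows else []
    if "dept_rank" not in fieldnames:
        fieldnames.append("dept_rank")

    # Write to a separate file so the original output is never overwritten.
    with open(output_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for original, side in zip(original_rows, side_rows):
            original["dept_rank"] = side["dept_rank"]
            writer.writerow(original)
```

Writing to a new output file keeps the original CSVs untouched, which matches the advice to keep them in a separate folder.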

TO-DO

(Not sure) Efficiency idea: precompute each student's search (over every folder with the same exam_location) for each corresponding stuno (third and last) and store the result in a variable (hard; it needs a list of dicts, [{}]). Then just look the value up instead of searching again.
If the files are named correctly, there may be a better way to improve the efficiency.
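
One way to realize the caching idea is to build a lookup table once and reuse it for every student. This is only a sketch: it assumes each record is a dict with the hypothetical keys `exam_location` and `stuno` (the full test number), and it indexes by the first three digits and the last digit, following the matching rule noted in the Done section. The names `build_index` and `lookup` are illustrative and do not come from the notebooks.

```python
from collections import defaultdict


def build_index(records):
    """Group full test numbers by (exam_location, first 3 chars, last char)."""
    index = defaultdict(list)
    for record in records:
        stuno = record["stuno"]
        key = (record["exam_location"], stuno[:3], stuno[-1])
        index[key].append(stuno)
    return index


def lookup(index, exam_location, first_three, last_digit):
    """Return all candidate test numbers for a partially known stuno."""
    # .get() avoids creating empty entries in the defaultdict on a miss.
    return index.get((exam_location, first_three, last_digit), [])
```

The index is built in a single pass over the records, so each later lookup is a dictionary access rather than a new scan of the folders.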

Done

Figured out what the processing_list function is doing (counting?).
Find the department each student got into.
Merge dataframes column by column from different dataframes.
正取 (admitted) and 備取 (waitlisted) entries have a limit; handle it with try, except, else. We can download all of the test places, search them with a substring of the test number, and compare against the universities the student applied to in order to get the whole test number:
1. Download all test places.
2. Compare (two if-else statements):
  1. Use the first three digits and the last digit of the test number to narrow the search down to a few candidates.

    If the first character is L or l, change it to 1.
  2. Compare with the universities the student applied to.
3. Get the whole test number.
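
The steps above can be sketched as a small filter. This is an illustration under stated assumptions, not the notebook's actual code: `candidates` is assumed to be an iterable of (full test number, department) pairs taken from the downloaded test place pages, and the function names are hypothetical.

```python
def normalize_first_digit(partial):
    """Replace a leading 'L' or 'l' with '1', as described in the steps above."""
    if partial and partial[0] in ("L", "l"):
        return "1" + partial[1:]
    return partial


def find_test_number(first_three, last_digit, applied_depts, candidates):
    """Return full test numbers that match the known digits and applied departments."""
    first_three = normalize_first_digit(first_three)
    matches = []
    for test_number, department in candidates:
        # Filter 1: the first three digits and the last digit must agree.
        if not (test_number.startswith(first_three) and test_number.endswith(last_digit)):
            continue
        # Filter 2: the department must be one the student applied to.
        if department in applied_depts:
            matches.append(test_number)
    # Remove repeats while keeping the original order.
    return list(dict.fromkeys(matches))
```

If more than one number is returned, the known digits and applied departments were not enough to identify the student, and the row needs a manual check.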
Try to get all of the <td> cells (count i: 9) and print univ_and_department together with the accepted univ_and_department.
Fixed the order of department_name (when 正1 and 正2, the first and second admitted entries, had no match, everything was shifted back by two positions).
Get the location links first, use Excel to delete the duplicated ones, then save each page with Ctrl+S.
For stuno:
- To get the whole number we have to open the test place link, but we cannot access the website (we have now downloaded the whole list of every department using Ctrl+S).
- Get all of the universities and departments each student applied to and add them to the df using a for loop. Run get_test_location_url.ipynb to get a CSV file named whole.csv. To delete the repeated entries, we can use Excel. Nice!
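
Removing the repeated links can also be done in Python instead of Excel. A minimal sketch with the standard library follows; the column name passed as `key_column` (for example `url`) is an assumption, since the actual header of whole.csv is not shown here, and `drop_duplicate_rows` is a hypothetical helper name.

```python
import csv


def drop_duplicate_rows(input_csv, output_csv, key_column):
    """Copy input_csv to output_csv, keeping only the first row per key value."""
    seen = set()
    with open(input_csv, newline="", encoding="utf-8") as src, \
            open(output_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames or [])
        writer.writeheader()
        for row in reader:
            key = row[key_column]
            if key in seen:
                continue  # this link was already written once
            seen.add(key)
            writer.writerow(row)
    # Number of distinct links kept.
    return len(seen)
```

Keeping the first occurrence preserves the original order of the links, so the [0-n] sequence numbers used when saving pages stay stable.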

fix: glob.iglob repeatedly found the same exam_location_folder.
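
How this was fixed is not recorded here. One common way to avoid visiting a folder more than once is to collect the folders in a set, sketched below. The function name and the glob pattern in the usage note are hypothetical.

```python
import glob
import os


def unique_exam_location_folders(pattern):
    """Return each folder matched by the pattern exactly once, sorted."""
    folders = set()
    for path in glob.iglob(pattern, recursive=True):
        # Several files in one folder all map to the same folder entry.
        folder = path if os.path.isdir(path) else os.path.dirname(path)
        folders.add(os.path.normpath(folder))
    return sorted(folders)
```

For example, a call with a pattern such as `./data/*/*.html` would return every folder under ./data that contains at least one saved page, no matter how many pages it holds.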

fix: \n in the stuno numbers in the list.
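
A small sketch of one way to clean such a list, assuming the stuno values are plain strings; `clean_stunos` is a hypothetical helper name.

```python
def clean_stunos(stunos):
    """Remove newline characters and surrounding spaces; drop empty entries."""
    cleaned = []
    for stuno in stunos:
        stuno = stuno.replace("\n", "").strip()
        if stuno:
            cleaned.append(stuno)
    return cleaned
```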

feat: add if-statement

Make each student in the list find their own entry, using:

first: stuno_second and stuno_last
second: compare numbers
third: compare depts (use this)

Find the lower number first, because if I build get_test_location_url.ipynb it will probably start from the lower numbers: using glob.iglob(list)
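
glob.iglob yields paths in arbitrary order, so processing the lower numbers first needs an explicit sort. The sketch below sorts by the number at the end of each file name. It assumes the saved pages end in their sequence number as described in the run steps; the helper names are hypothetical.

```python
import os
import re


def trailing_number(path):
    """Return the last integer in the file name, ignoring the extension."""
    stem = os.path.splitext(os.path.basename(path))[0]
    match = re.search(r"(\d+)\D*$", stem)
    # Files without any number sort before all numbered files.
    return int(match.group(1)) if match else -1


def sort_by_sequence(paths):
    """Sort paths numerically, so that 2 comes before 10."""
    return sorted(paths, key=trailing_number)
```

A plain string sort would put a file ending in 10 before one ending in 2, which is why the key converts the suffix to an integer.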

WTF section

Once all the CSVs are output, use Excel to clean them (drag the cell down to fill); it can sort by the rules itself, so in the end you only need to check.
output_list_&_num.txt has some blanks because we start from the black ones, not the white ones; if an entry is blank, then it is white.
output_list_&_num_fixed.txt: white and dark each find the complete exam_location list separately.

Tang-hubert/Crawing-List-of-Admitted-Schools-for-Students
