hsql

An SQL engine on top of Hadoop

There are 4 files

lexer.py : tokenizes (lexes) the SQL queries
parser.py : parses the SQL queries
main.py : runs the interpreter (an interactive prompt, like python's) in which you can run SQL queries
implementation.py : contains the implementations of the SQL queries and some functions (must be completed)

The SQL Queries Currently Supported :

select
use
drop
load
create database
schema (not part of standard SQL; added to view the schema of databases and tables)
current database (also not part of standard SQL; added to show the currently selected database)
exit() or quit() (to quit the interpreter)
list database (to list all available databases)

What are the requirements ?

hadoop
python3
ply

How to install ply ?

pip3 install ply

How to run the interpreter ?

python3 main.py

What's Currently Working ?

use
create database
load (partially)
drop
schema
current database

What must be done ?

Make it work on hadoop (create and delete files/folders in hadoop). It currently works on the local file system, not on hadoop. remove() in implementation.py may have to change. This can be done at the end.
Implement load completely
Implement select
Implement aggregate functions MAX, COUNT, SUM
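
One possible way to approach the first item (creating and deleting files/folders in hadoop instead of on the local file system) is to shell out to the hadoop fs command. The following is only a minimal sketch and not the code in implementation.py. The function names hdfs_mkdir_cmd, hdfs_remove_cmd and run are hypothetical, and it assumes the hadoop executable is on the PATH.

```python
import subprocess


def hdfs_mkdir_cmd(path):
    # Build "hadoop fs -mkdir -p <path>" (creates parent directories as needed).
    return ["hadoop", "fs", "-mkdir", "-p", path]


def hdfs_remove_cmd(path):
    # Build "hadoop fs -rm -r <path>" (recursive delete, e.g. when dropping a database).
    return ["hadoop", "fs", "-rm", "-r", path]


def run(cmd):
    # Execute the command; raises CalledProcessError if hadoop reports a failure.
    subprocess.run(cmd, check=True)
```

Keeping the command construction separate from run() means the commands can be checked without a hadoop installation.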

Note : The mappers/reducers may have to be written in separate files and invoked through a system call from the wrapper functions select, load, MAX, COUNT and SUM, using the hadoop streaming API.
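
As a rough illustration of the note above, here is a sketch of what a mapper/reducer pair for MAX and the corresponding hadoop streaming call might look like. None of this exists in the repository yet. The names map_max, reduce_max and streaming_command are hypothetical, the CSV rows are assumed to have no header line and no quoted commas, and the location of the streaming jar depends on the hadoop installation, so it is passed in as a parameter.

```python
def map_max(lines, column):
    # Mapper: emit "max\t<value>" for the chosen column of each CSV row.
    for line in lines:
        fields = line.rstrip("\n").split(",")
        if column < len(fields) and fields[column].strip():
            yield "max\t" + fields[column].strip()


def reduce_max(lines):
    # Reducer: keep the largest numeric value seen in the mapper output.
    best = None
    for line in lines:
        _, value = line.rstrip("\n").split("\t", 1)
        number = float(value)
        if best is None or number > best:
            best = number
    return best


def streaming_command(streaming_jar, input_path, output_path, mapper, reducer):
    # Argument list for a hadoop streaming job (to be run with subprocess).
    return ["hadoop", "jar", streaming_jar,
            "-input", input_path, "-output", output_path,
            "-mapper", mapper, "-reducer", reducer]
```

In separate mapper and reducer files, each function would be fed sys.stdin and its results printed to standard output, since hadoop streaming exchanges data with the scripts as lines of text.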

The Directory Organization

DATABASE_ROOT/
    database_name.schema
    dblist.db
    database_name/
        table_name/
            csv_file

Note

dblist.db is a file which contains the list of all the databases (there is only 1 such file)
There is one schema file per database
There is one directory for each database

I have commented as many important lines as possible. If you have any doubts, call me.
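
The directory organization described above can be sketched on the local file system as follows. This is an illustration, not the code in implementation.py. The function names create_database and table_dir are hypothetical, the schema file is created empty, and storing one database name per line in dblist.db is an assumption about the file format.

```python
from pathlib import Path


def create_database(root, name):
    # Create <root>/<name>/ and <root>/<name>.schema,
    # and append the name to <root>/dblist.db.
    root = Path(root)
    (root / name).mkdir(parents=True, exist_ok=True)
    (root / (name + ".schema")).touch()
    with open(root / "dblist.db", "a") as dblist:
        dblist.write(name + "\n")


def table_dir(root, database, table):
    # Directory that holds the csv file of one table.
    return Path(root) / database / table
```

For example, create_database("DATABASE_ROOT", "shop") would produce DATABASE_ROOT/shop/, DATABASE_ROOT/shop.schema and an entry in DATABASE_ROOT/dblist.db.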
