-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pending] pyDKB: common/hdfs.py documentation #178
base: pyDKB-master-merge
Are you sure you want to change the base?
Conversation
Conflicts: Utils/Dataflow/091_datasetsRucio/datasets_processing.py
Added: parameters' names and types in each function return type and object description
Note: for some functions :return: and :rtype: parameters are empty (no return). It can be fixed with one more commit if needed.
Not reviewed yet as #165 should go first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more comments:
- Would be good to add information about exceptions that this or that function rises (see
:raises:
here). As some functions return nothing, it is the only way their results should be checked. - To see what a function returns, if it returns nothing, you could try following:
>>> def f(): pass ... >>> print f() None >>> if f() is None: print "None it is." ... None it is.
- I believe git client(s) have settings at least to detect trailing whitespaces in commits; and you can always run
git diff --staged
beforegit commit
to make sure you`re commiting what was meant to be commited. It is a good practice and, in fact, item 3 of this guide clearly states: "Review changes before every commit". It was not written for my own graphimaniac pleasure, trust me ;)
@@ -1,5 +1,6 @@ | |||
""" | |||
Utils to interact with HDFS. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a new line?
:param max_lines: maximum quantity of lines | ||
:type max_lines: number (int) | ||
|
||
:return: subprocess return code, set by poll() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return code is not set by poll()
-- it is returned by the subprocess, and can be accessed by poll()
.
I believe it will be Ok to simply say that it is a "subprocess return code".
Utils/Dataflow/pyDKB/common/hdfs.py
Outdated
|
||
Return value: file name (without directory) | ||
|
||
:param fname: file name of a downloaded file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I`d rather say "path to file", then "file name".
:param fname: file name of a downloaded file | ||
:type fname: string | ||
|
||
:return: file name without its directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Without path"?
Utils/Dataflow/pyDKB/common/hdfs.py
Outdated
|
||
Return value: open file object (TemporaryFile). | ||
|
||
:param fname: name of a temporary file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahem.. please check again ;)
:param fname: name of a temporary file | ||
:type fname: string | ||
|
||
:return: open file object (TemporaryFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...or None in case of error.
:param mode: type of a list of files | ||
:type mode: string | ||
|
||
:return: a list of files' names/subdirectories inside HDFS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...in HDFS directory.
"Inside HDFS" looks a bit too broad and can make one to believe it is about all files/subdirectories.
""" Return file name without path. | ||
|
||
:param path: path to a file | ||
:type path: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If path
is None
, it is taken as empty string -- I believe it should be mentioned somewhere...
""" Return dirname without filename. | ||
|
||
:param path: path to a file | ||
:type path: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above about None
.
:param path: path to a file | ||
:type path: string | ||
:param filename: file name | ||
:type filename: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above about None
.
Create directory and check status of process | ||
with subprocess.Popen. | ||
|
||
:param dirname: a name of a created directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "name of the directory" or "directory name" or something else?
- "a" before "name" - similar instances of "name" in this file come without "a".
- "a" before "directory" - it's not some directory, it's a specific directory which was already mentioned in lines 52 and 54. By the way, in those lines it can be "create a directory".
- "created" - we are talking about arguments, so nothing was created yet. "to create" or "to be created" can be used if necessary.
Get file from HDFS and check status of process with | ||
subprocess.Popen. | ||
|
||
:param fname: path of a downloaded file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Path to the file (to download/to be downloaded)". Apart from the past tense in function arguments (see comment to line 57, point 3), it is possible to read this as "path to the local copy of the downloaded file" - which is incorrect, if I understand correctly.
132b720
to
fb7e780
Compare
f415e34
to
abd340a
Compare
Changed according to the template.
Note: there are still some empty :return: and :rtype: lines when
a function don't return anything explicitly.
Waits for #165