Parallel Compression #16

Merged
merged 32 commits into from
Nov 11, 2024

jwong-nd
Contributor

@jwong-nd jwong-nd commented Nov 4, 2024

Logic:

  • Decoupled the recursive-copy logic from the convert_video calls so the conversions can be parallelized
  • Moved the Filesystem.py logic into etl.py so it can reference fields in self.job_settings directly
  • Lifted logging out of convert_video for parallelization

Readability:

  • Added comments, added typing, changed some variable names
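
A minimal sketch of the decoupling described under "Logic", assuming a hypothetical helper `plan_transfers` and a hypothetical suffix set; the actual method names in etl.py are not shown in this thread. The directory walk only decides what should happen to each file, so the resulting list of conversion arguments can be handed to an executor afterwards.

```python
from pathlib import Path

# Hypothetical suffix set; the package's real list is not shown in this thread.
VIDEO_SUFFIXES = {".mp4", ".avi", ".mkv"}


def plan_transfers(input_dir: Path, output_dir: Path):
    """Walk input_dir once and split files into plain copies and conversions.

    Returns two lists of (source, destination) pairs. Nothing is copied or
    converted here, so the conversion list can be submitted to a pool later.
    """
    to_copy, to_convert = [], []
    for src in sorted(input_dir.rglob("*")):
        if not src.is_file():
            continue
        # Mirror the relative layout of the input tree in the output tree.
        dst = output_dir / src.relative_to(input_dir)
        if src.suffix.lower() in VIDEO_SUFFIXES:
            to_convert.append((src, dst))
        else:
            to_copy.append((src, dst))
    return to_copy, to_convert
```

Keeping the walk free of side effects means the copy step can stay serial while only the `to_convert` pairs are dispatched in parallel.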

@galenlynch
Collaborator

I will review this soon.

Collaborator

@galenlynch galenlynch left a comment

Can you please split this PR into the following distinct PRs:

  1. a PR that is just related to parallelization
  2. a PR that moves functions into class methods
  3. a PR that changes the logic of testing for overrides etc

Too many things are happening in this PR, and because you've moved functions around it's hard to tell what's changed.

Collaborator

@galenlynch galenlynch left a comment

I think this is pretty close. I haven't had a chance to run the PR yet, but did you confirm that the information from each process, printed by ffmpeg to stderr, is being combined correctly? I would have thought that you would need to redirect the stderr of each subprocess if you were invoking them in parallel, so that you could log it or print it back out when the future completes.

It wasn't clear to me why you exported convert_video at the top level (in the __init__ file), or why you added another test of it. Each of the other tests should also test convert_video.

]
for job in as_completed(jobs):
    result = job.result()
    # Use a %s placeholder so logging can format the result into the message
    logging.info("FFmpeg job completed: %s", result)
Collaborator

I am worried that this will garble the stderr info output of each call. Here the result is simply the filename of the output. What does the output look like? I would think that each subprocess would still print to stderr, but they would all do it at the same time.

Contributor Author

This is the output of a working call (same input directory as test_run_job_with_data_structure):
"""
2024-11-07 11:04:40,762 - INFO - FFmpeg job completed: /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpxq15v823/camera2/clip.mp4
2024-11-07 11:04:40,792 - INFO - FFmpeg job completed: /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpxq15v823/camera1/clip.mp4
"""

This is the output if I set the requested compression to:
"""
faulty_req = CompressionRequest(
    compression_enum=CompressionEnum.USER_DEFINED,
    user_ffmpeg_input_options="invalid user input args",
    user_ffmpeg_output_options="invalid user output args",
)
"""

Nothing is logged and I believe stderr is forwarded to stdout:
"""
[AVFormatContext @ 0x11df07720] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.
[AVFormatContext @ 0x12e006c50] [out#0 @ 0x11df12660] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.
Error initializing the muxer for invalid: Invalid argument
Error opening output file invalid.
[out#0 @ 0x12e011b70] Error opening output files: Invalid argument
Error initializing the muxer for invalid: Invalid argument
Error opening output file invalid.
Error opening output files: Invalid argument
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/Users/jonathan.wong/miniconda3/envs/compression/lib/python3.11/concurrent/futures/process.py", line 261, in process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/transform_videos.py", line 242, in convert_video
subprocess.run(ffmpeg_command, check=True)
File "/Users/jonathan.wong/miniconda3/envs/compression/lib/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ffmpeg', '-y', '-v', 'warning', '-hide_banner', 'invalid', 'user', 'input', 'args', '-i', '/private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmp7ogiojea/camera2/clip.mp4', 'invalid', 'user', 'output', 'args', '/private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpxj4mbyd
/camera2/clip.mp4']' returned non-zero exit status 234.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/etl.py", line 234, in <module>
job_response = job.run_job()
^^^^^^^^^^^^^
File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/etl.py", line 140, in run_job
self._run_compression(convert_video_args)
File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/etl.py", line 97, in _run_compression
result = job.result()
^^^^^^^^^^^^
File "/Users/jonathan.wong/miniconda3/envs/compression/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/Users/jonathan.wong/miniconda3/envs/compression/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self.exception
subprocess.CalledProcessError: Command '['ffmpeg', '-y', '-v', 'warning', '-hide_banner', 'invalid', 'user', 'input', 'args', '-i', '/private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmp7ogiojea/camera2/clip.mp4', 'invalid', 'user', 'output', 'args', '/private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpxj4mbyd
/camera2/clip.mp4']' returned non-zero exit status 234.
"""

Is this garbled? It appears that this raises an error for any failed process.

Collaborator

It looks like the two outputs are being interleaved in a way that makes it hard to follow, for example:

[AVFormatContext @ 0x11df07720] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.
[AVFormatContext @ 0x12e006c50] [out#0 @ 0x11df12660] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.

IMO it would be better if the output of each process was captured and printed to stdout or stderr in a block, so you can follow the messages from each subprocess.
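
One way to get per-process blocks, sketched under assumptions: `run_captured` and `run_all` are hypothetical names, and a thread pool is used here only because each worker just waits on a child process (the PR itself uses a process pool, as the traceback shows). Each call captures the child's stderr with `stderr=subprocess.PIPE` rather than letting it inherit the parent's stream, and the parent logs the captured text once the future completes.

```python
import logging
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_captured(command):
    """Run one command and return (command, return code, captured stderr)."""
    proc = subprocess.run(
        command,
        stderr=subprocess.PIPE,  # keep this child's messages separate
        text=True,
    )
    return command, proc.returncode, proc.stderr


def run_all(commands, max_workers=None):
    """Run commands concurrently; log each child's stderr as a single block."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_captured, cmd) for cmd in commands]
        for future in as_completed(futures):
            command, returncode, stderr = future.result()
            level = logging.INFO if returncode == 0 else logging.ERROR
            # One log record per process, so messages cannot interleave.
            logging.log(
                level,
                "Command: %s\nReturn code: %d\nError output:\n%s",
                " ".join(command),
                returncode,
                stderr,
            )
            results.append((command, returncode, stderr))
    return results
```

Because each record is emitted only after its process has exited, the messages from two ffmpeg calls would appear as two consecutive blocks rather than alternating lines.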

# For logging I guess
ffmpeg_str = " ".join(ffmpeg_command)
logging.info(f"{ffmpeg_str=}")

subprocess.run(ffmpeg_command, check=True)
Collaborator

I think you might need to make sure the stderr output of this subprocess call is routed correctly if you're running many subprocesses at the same time. Right now each subprocess inherits the parent's stderr, so they all write to the same stream.

@jwong-nd
Contributor Author

jwong-nd commented Nov 9, 2024

Decreased the coverage threshold to 93 because the coverage library does not register the error-handling code as executed, even though it does run.

Extra info about logging update:

  • Stdout for 1 or 2 invalid requests:
  File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/etl.py", line 273, in <module>
    job_response = job.run_job()
                   ^^^^^^^^^^^^^
  File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/etl.py", line 150, in run_job
    self._run_compression(convert_video_args)
  File "/Users/jonathan.wong/Projects/aind-behavior-video-transformation/src/aind_behavior_video_transformation/etl.py", line 115, in _run_compression
    raise RuntimeError('One or more Ffmpeg jobs failed. See error logs.')
RuntimeError: One or more Ffmpeg jobs failed. See error logs.
  • Log for 1 valid request and 1 invalid request:
2024-11-08 13:23:24,716 - INFO - FFmpeg job completed: /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmphns1wvcc/camera2/clip.mp4
2024-11-08 13:23:24,730 - ERROR - FFmpeg conversion failed for /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpl9o2d0kg/camera1/clip.mp4
Command: ffmpeg -y -v warning -hide_banner invalid user input args -i /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpl9o2d0kg/camera1/clip.mp4 invalid user output args /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmphns1wvcc/camera1/clip.mp4
Return code: 234
Error output:
[AVFormatContext @ 0x1341056c0] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.
[out#0 @ 0x1341105d0] Error initializing the muxer for invalid: Invalid argument
Error opening output file invalid.
Error opening output files: Invalid argument
  • Log for 2 invalid requests:
2024-11-08 13:27:04,397 - ERROR - FFmpeg conversion failed for /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpoco639kw/camera1/clip.mp4
Command: ffmpeg -y -v warning -hide_banner invalid user input args -i /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpoco639kw/camera1/clip.mp4 invalid user output args /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpm5pxsmq7/camera1/clip.mp4
Return code: 234
Error output:
[AVFormatContext @ 0x122e2ded0] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.
[out#0 @ 0x122e38d40] Error initializing the muxer for invalid: Invalid argument
Error opening output file invalid.
Error opening output files: Invalid argument

2024-11-08 13:27:04,397 - ERROR - FFmpeg conversion failed for /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpoco639kw/camera2/clip.mp4
Command: ffmpeg -y -v warning -hide_banner invalid user input args -i /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpoco639kw/camera2/clip.mp4 invalid user output args /private/var/folders/jp/tzg4yhb959s7g46rlqjjl2qh0000gp/T/tmpm5pxsmq7/camera2/clip.mp4
Return code: 234
Error output:
[AVFormatContext @ 0x14cf07720] Unable to choose an output format for 'invalid'; use a standard extension for the filename or specify the format manually.
[out#0 @ 0x14cf12660] Error initializing the muxer for invalid: Invalid argument
Error opening output file invalid.
Error opening output files: Invalid argument
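
The behaviour shown in these logs could be implemented roughly as follows. This is a sketch, not the merged code: `convert_one` and `run_conversions` are hypothetical names, a thread pool stands in for the process pool used in the PR, and the message layout is copied from the log excerpts above. Each worker runs with `capture_output=True` and `check=True`, so a failure surfaces as `subprocess.CalledProcessError` carrying that process's stderr; the parent logs one block per failure and raises a single `RuntimeError` after every job has finished.

```python
import logging
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def convert_one(command):
    """Run one command; return None on success or a formatted error block."""
    try:
        subprocess.run(command, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as exc:
        return (
            f"Command: {' '.join(exc.cmd)}\n"
            f"Return code: {exc.returncode}\n"
            f"Error output:\n{exc.stderr}"
        )
    return None


def run_conversions(commands):
    """Run all commands, log every failure, then raise once if any failed."""
    failed = 0
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(convert_one, cmd): cmd for cmd in commands}
        for future in as_completed(futures):
            error = future.result()
            if error is None:
                logging.info("Job completed: %s", " ".join(futures[future]))
            else:
                failed += 1
                logging.error("Job failed\n%s", error)
    # Raise only after all jobs are done, so one failure does not hide others.
    if failed:
        raise RuntimeError("One or more Ffmpeg jobs failed. See error logs.")
```

Deferring the exception until the loop ends is what lets the log contain a block for every failed job, as in the "2 invalid requests" case.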

@jwong-nd jwong-nd requested a review from galenlynch November 9, 2024 00:46
@galenlynch
Collaborator

I see, so if any job fails then the relevant errors will be in the log?

This looks great! I will give it a once over one last time tomorrow.

Collaborator

@galenlynch galenlynch left a comment

Looks great!

@galenlynch galenlynch merged commit b233655 into main Nov 11, 2024
4 checks passed
@galenlynch galenlynch deleted the feat-parallelization branch November 11, 2024 17:03