Boatman is a simple Ruby DSL for polling directories and ferrying around or manipulating new files that appear in them. It was created as a more elegant alternative to maintaining numerous scripts that all perform very similar file transfer and manipulation tasks.
Install the boatman gem (assuming you have Ruby and RubyGems):
gem install boatman
Get a quick feel for what boatman does with this example.
Create a YAML file to define the task scripts and directories they’ll operate on. Let’s call it demo.yml:
tasks:
  - demo.rb
directories:
  fresh_text_folder: txt_output
  text_storage_folder: storage
Now create two folders named “txt_output” and “storage” in the same folder as demo.yml.
Make a task file demo.rb:
fresh_text_folder.check_every 5.seconds do
  age :greater_than => 1.second
  files_ending_with "txt" do |file|
    move file, :to => text_storage_folder
  end
end
Now run your task:
boatman demo.yml
While boatman is running, open another terminal or a file browser and create a file called “demo.txt” in the txt_output folder. After a few seconds it should be moved to the “storage” folder.
Hit Ctrl-C to exit out of boatman. You can take a look at the boatman.log file it creates to see a log of what operations have been performed.
The top level of a boatman task will usually be a directory polling loop. This is accomplished by running the check_every method on a directory specified in your YAML configuration file. The check_every method takes a time interval as its only argument other than a block. Using the example above, running
fresh_text_folder.check_every 5.seconds do
  ...
end
will run the provided do..end block every 5 seconds in the context of the fresh_text_folder directory.
It is often desirable to consider just a subset of the files in the folder being polled. Files/folders can be selected by age and by file/folder name. Age can be specified inside the polling folder's do..end block:
# age of the file must be greater than 1 minute
age :greater_than => 1.minute
# age of the file must be less than 24 hours
age :less_than => 24.hours
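To make the age criteria concrete, here is a minimal plain-Ruby sketch of how such a filter could be evaluated. It is not boatman's implementation: the helper name `age_matches?`, its keyword arguments, and the choice to measure age from the file's modification time are all assumptions made for illustration.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not part of boatman): returns true when the file's age
# in seconds, measured from its modification time, lies within the bounds.
def age_matches?(path, greater_than: nil, less_than: nil, now: Time.now)
  age = now - File.mtime(path)
  return false if greater_than && age <= greater_than
  return false if less_than && age >= less_than
  true
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  FileUtils.touch(path)

  # Pretend the file was last modified two minutes ago.
  two_minutes_ago = Time.now - 120
  File.utime(two_minutes_ago, two_minutes_ago, path)

  puts age_matches?(path, greater_than: 60) # prints true
  puts age_matches?(path, less_than: 60)    # prints false
end
```

A minimum age is useful for skipping files that another process may still be writing.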
There are a few ways to select based on file name. Each of these methods accepts a block to run on each selected file/folder:
# select files by a string or regular expression
files_matching /\d+\.tif/ do |file|
  ...
end
# select files by a string or regular expression ending
files_ending_with "txt" do |file|
  ...
end
# select folders by a string or regular expression
folders_matching /\d+/ do |folder|
  ...
end
# select folders by a string or regular expression ending
folders_ending_with "log" do |folder|
  ...
end
Files/folders can be copied or moved inside the block provided to the file/folder matching methods:
# move selected files to destination_folder, which needs to be
# defined in the configuration YAML file
files_ending_with "txt" do |file|
  move file, :to => destination_folder
end
# same thing but copy the file instead of moving it
files_ending_with "txt" do |file|
  copy file, :to => destination_folder
end
By default, boatman performs a checksum verification to catch errors in file transfers. This can be disabled with the disable_checksum_verification directive inside the file/folder matching block:
files_ending_with "txt" do |file|
  disable_checksum_verification
  move file, :to => destination_folder
end
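For readers unfamiliar with the idea, the following plain-Ruby sketch shows what checksum verification of a copy generally looks like. It is not boatman's code: the helper name `copy_with_verification` is hypothetical, and MD5 is an assumption, since the text above does not say which checksum boatman uses.

```ruby
require "digest/md5"
require "fileutils"
require "tmpdir"

# Hypothetical helper (not boatman's code): copies a file, then compares the
# digests of source and destination and raises if they differ.
# MD5 is an assumption chosen for illustration.
def copy_with_verification(source, destination)
  FileUtils.cp(source, destination)
  source_sum = Digest::MD5.file(source).hexdigest
  destination_sum = Digest::MD5.file(destination).hexdigest
  raise "checksum mismatch copying #{source}" unless source_sum == destination_sum
  destination_sum
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "demo.txt")
  copy = File.join(dir, "copy.txt")
  File.write(source, "hello\n")

  copy_with_verification(source, copy)
  puts File.read(copy) # prints hello
end
```

Verification reads both files a second time, which is the cost that disabling it would avoid for large files.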
Files can optionally be renamed using the :rename parameter for move or copy:
files_ending_with "txt" do |file|
  # use the path method on the file
  new_name = "renamed_" + File.basename(file.path)
  # file will be renamed, e.g. test.txt becomes renamed_test.txt
  move file, :to => destination_folder, :rename => new_name
end
The file being copied/moved can also be modified by passing a block to the copy or move methods. The parameters for the block are the path from which to read the original file and the path to which the modified file is written:
files_ending_with "txt" do |file|
  move file, :to => destination_folder do |old_file_name, new_file_name|
    # write each line of the original file to the new file, prefixed with "# ";
    # the block form of File.open closes the file when done
    File.open(new_file_name, "w") do |new_file|
      File.foreach(old_file_name) do |line|
        new_file << "# #{line}"
      end
    end
  end
end
YAML configuration files for boatman consist of two parts, tasks and directories.
Under tasks you can specify any number of task files to be loaded and run together. boatman takes care of running each task on the interval it specifies. However, tasks are run serially, so a long-running task will prevent subsequent tasks from running until it completes:
# config.yml
tasks:
  - text_file_reformatter.rb
  - raw_data_transfer.rb
directories:
  ...
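The effect of serial execution on timing can be illustrated with a small simulation. This is only a sketch under stated assumptions, not boatman's scheduler: the `Task` struct, the `simulate_serial` helper, the simulated clock, and the rule that a task's next run is scheduled one interval after it finishes are all invented for illustration.

```ruby
# Hypothetical model of a task: how often it wants to run, how long one run
# takes, and when it is next due (all in simulated seconds).
Task = Struct.new(:name, :interval, :duration, :next_due)

# Runs due tasks one at a time on a simulated clock and returns a log of
# [start_time, task_name] pairs. Assumes the next run is scheduled one
# interval after the previous run finishes.
def simulate_serial(tasks, stop_at)
  clock = 0
  log = []
  while clock < stop_at
    due = tasks.select { |task| task.next_due <= clock }
    if due.empty?
      # Nothing to do: jump ahead to the next time a task becomes due.
      clock = tasks.map(&:next_due).min
      next
    end
    due.each do |task|
      log << [clock, task.name]
      clock += task.duration # the whole loop is blocked while a task runs
      task.next_due = clock + task.interval
    end
  end
  log
end

tasks = [
  Task.new("quick", 1, 0, 0), # wants to run every second, finishes instantly
  Task.new("slow", 5, 3, 0)   # wants to run every 5 seconds, takes 3 seconds
]

simulate_serial(tasks, 5).each { |time, name| puts "t=#{time}s #{name}" }
# prints:
# t=0s quick
# t=0s slow
# t=3s quick
# t=4s quick
```

In this model the quick task is due again at t=1s but cannot start until the slow task finishes at t=3s.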
Directories allow you to name directories you’d like to have available to your tasks. It is possible to specify both Windows- and POSIX-style paths:
# Windows-style
my_shared_folder: //mycomputer/myshare
# POSIX-style
my_shared_folder: /home/bmarzolf/share
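To check that a configuration file has the expected two-part shape, it can be parsed with Ruby's standard YAML library. The sketch below only inspects the parsed structure of the demo configuration from earlier. It does not show how boatman itself loads the file, and the names `DEMO_CONFIG` and `parse_config` are hypothetical.

```ruby
require "yaml"

# The demo configuration from the quick example, inlined for illustration.
DEMO_CONFIG = <<~CONFIG
  tasks:
    - demo.rb
  directories:
    fresh_text_folder: txt_output
    text_storage_folder: storage
CONFIG

# Hypothetical helper: parses the YAML text into plain Ruby hashes and arrays.
def parse_config(text)
  YAML.safe_load(text)
end

config = parse_config(DEMO_CONFIG)
config["tasks"].each { |task| puts "task file: #{task}" }
config["directories"].each { |name, path| puts "#{name} -> #{path}" }
# prints:
# task file: demo.rb
# fresh_text_folder -> txt_output
# text_storage_folder -> storage
```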
- Fork the project.
- Make your feature addition or bug fix.
- Add tests for it. This is important so I don’t break it in a future version unintentionally.
- Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
- Send me a pull request. Bonus points for topic branches.
Copyright © 2009 Bruz Marzolf. See LICENSE for details.