Mailing list migration
Eric Larson edited this page Dec 11, 2020
Following:
- https://meta.discourse.org/t/beginners-guide-to-install-discourse-for-development-using-docker/102009
- https://meta.discourse.org/t/importing-mailing-lists-mbox-listserv-google-groups-emails/79773#1-5-prepare-files
git clone https://github.com/discourse/discourse.git
cd discourse
d/boot_dev --init
d/rails db:migrate RAILS_ENV=development
printf "\n\ngem 'sqlite3'" >> Gemfile
d/bundle
d/shell
# follow the import/list_name hierarchy from
# https://meta.discourse.org/t/importing-mailing-lists-mbox-listserv-google-groups-emails/79773#1-5-prepare-files
sudo mkdir -p /shared/import/data
sudo chown -R discourse:discourse /shared/import
wget -r -l1 --no-parent --no-directories "https://mail.nmr.mgh.harvard.edu/pipermail//mne_analysis/" -P /shared/import/data/mne_analysis -A "*-*.txt.gz"
rm /shared/import/data/mne_analysis/robots.txt.tmp
gzip -d /shared/import/data/mne_analysis/*.txt.gz
wget https://gist.githubusercontent.com/larsoner/940cd6c7100b87c4c5668cb0bc540afb/raw/9e78513620d11355ad0e10f4a2470996c26ebc8c/mailmanToMBox.py -O ~/mailmanToMBox.py
python3 ~/mailmanToMBox.py /shared/import/data/mne_analysis/
rm /shared/import/data/mne_analysis/*.txt
sudo apt install -y libsqlite3-dev
# check results
cat /shared/import/data/mne_analysis/*.mbox > ~/all.mbox
sudo apt install -y procmail
mkdir -p ~/split
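export FILENO=0000  # formail puts the message number in $FILENO, zero-padded to this width
formail -ds sh -c 'cat > ~/split/msg.$FILENO' < ~/all.mbox
# inspect ~/split/msg.* by hand, then clean up
rm -rf ~/split ~/all.mbox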
# settings
wget https://raw.githubusercontent.com/discourse/discourse/master/script/import_scripts/mbox/settings.yml -O /shared/import/settings.yml
printf "\n\n\"mne_analysis\": \"[Mne_analysis]\"" >> /shared/import/settings.yml # remove [Mne_analysis], turn into tag
# run it
cd /src
bundle exec ruby script/import_scripts/mbox.rb /shared/import/settings.yml
$ bundle exec ruby script/import_scripts/mbox.rb /shared/import/settings.yml
Loading existing groups...
Loading existing users...
Loading existing categories...
Loading existing posts...
Loading existing topics...
creating index
indexing files in /shared/import/data/mne_analysis
indexing /shared/import/data/mne_analysis/2018-March.mbox
...
indexing /shared/import/data/mne_analysis/2007-December.mbox
indexing replies and users
creating categories
1 / 1 (100.0%) [17497813 items/min]
creating users
creating topics and posts
7373 / 7373 (100.0%) [1360 items/min]
Updating topic status
Updating bumped_at on topics
Updating last posted at on users
Updating last seen at on users
Updating first_post_created_at...
Updating user post_count...
Updating user topic_count...
Updating topic users
Updating post timings
Updating featured topic users
Updating featured topics in categories
5 / 5 (100.0%) [6890 items/min]
Resetting topic counters
Done (00h 06min 21sec)
d/unicorn
google-chrome http://0.0.0.0:9292
Done!
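As a rough cross-check of the post count reported by the importer, the number of messages in the converted archives can be estimated by counting mbox separator lines. This is a minimal sketch and not part of the original procedure: the helper name `count_mbox_messages` is hypothetical, and it assumes every message begins with a line starting with `From ` (unescaped body lines starting with `From ` would inflate the count, and the importer may legitimately skip or merge messages, so the numbers need not match exactly).

```shell
# Hypothetical helper: estimate how many messages the .mbox files in a directory hold.
# Assumption: each message starts with an mbox separator line beginning with "From ".
count_mbox_messages() {
    dir="$1"
    # Concatenate all archives; grep -c prints the number of matching lines.
    cat "$dir"/*.mbox 2>/dev/null | grep -c '^From '
}

# Example (inside d/shell, before or after the import):
# count_mbox_messages /shared/import/data/mne_analysis
```

If the result differs widely from the "creating topics and posts" total in the log, the converted `.mbox` files are probably worth inspecting before repeating the import.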
To wipe and start over, run the following from the discourse root on the Ubuntu host (running the Ruby commands there rather than on the dev Docker instance is necessary to drop the database, since the Docker instance does not have the required permissions):
rm -R tmp/*
rm -R log/*
# sudo apt install ruby-dev libsqlite3-dev libpq-dev redis-tools # only needs to be done once
# bundle install --path vendor/bundle # only needs to be done once
Note that the Gemfile in the dev root directory, discourse/Gemfile on the Ubuntu host, is the same file as /src/Gemfile on the dev Docker instance, so this effectively duplicates the environment that is on the Docker instance.
Note that the import can also be repeated without wiping the old instance, in which case only new posts will be added!
TBD