Mailing list migration
Eric Larson edited this page Dec 11, 2020
These steps follow two Discourse guides:
- https://meta.discourse.org/t/beginners-guide-to-install-discourse-for-development-using-docker/102009
- https://meta.discourse.org/t/importing-mailing-lists-mbox-listserv-google-groups-emails/79773#1-5-prepare-files
git clone https://github.com/discourse/discourse.git
cd discourse
d/boot_dev --init
printf "\ngem 'sqlite3'" >> Gemfile
# Install the newly added gem first; Rails commands fail while a Gemfile gem is missing
d/bundle install
d/rails db:migrate RAILS_ENV=development
# I usually run the following steps in another directory, such as ~/Desktop, so I only ever have to do them once
mkdir shared
wget -r -l1 --no-parent --no-directories "https://mail.nmr.mgh.harvard.edu/pipermail//mne_analysis/" -P shared/import/data/mne_analysis -A "*-*.txt.gz"
rm shared/import/data/mne_analysis/robots.txt.tmp
gzip -d shared/import/data/mne_analysis/*.txt.gz
wget https://gist.githubusercontent.com/larsoner/940cd6c7100b87c4c5668cb0bc540afb/raw/9e78513620d11355ad0e10f4a2470996c26ebc8c/mailmanToMBox.py -O shared/mailmanToMBox.py
python3 shared/mailmanToMBox.py shared/import/data/mne_analysis/
rm shared/import/data/mne_analysis/*.txt
wget https://raw.githubusercontent.com/discourse/discourse/master/script/import_scripts/mbox/settings.yml -O shared/import/settings.yml
vim shared/import/settings.yml # add "Mne_analysis": "mne_analysis" under "tag:" section
# Then I `cp -a` the `shared` directory into the `discourse/` root dir (keeping the original in case I need to hard reset) and open a shell:
d/shell
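Optionally, before entering the container with `d/shell`, the converted files can be sanity-checked on the host. The following is a minimal sketch, not part of the original workflow, that counts the messages in each `.mbox` file using Python's standard `mailbox` module. The function name `count_messages` is hypothetical, and the sketch assumes the converter wrote standard mbox files with a `.mbox` extension into the data directory.

```python
import mailbox
import sys
from pathlib import Path


def count_messages(data_dir):
    """Return {file name: message count} for every .mbox file in data_dir."""
    counts = {}
    for path in sorted(Path(data_dir).glob("*.mbox")):
        # create=False: raise instead of silently creating a missing file
        box = mailbox.mbox(str(path), create=False)
        try:
            counts[path.name] = len(box)
        finally:
            box.close()
    return counts


if __name__ == "__main__":
    # Usage (assumed script name): python3 count_mbox.py shared/import/data/mne_analysis
    for data_dir in sys.argv[1:]:
        counts = count_messages(data_dir)
        for name, n in counts.items():
            print(f"{name}: {n}")
        print("total:", sum(counts.values()))
```

The total gives a rough figure to compare against the number of topics and posts the import script reports. The two need not match exactly, since the importer may treat some messages differently.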
In the docker shell:
cd /src
sudo mv shared/import /shared/
bundle exec ruby script/import_scripts/mbox.rb /shared/import/settings.yml
Output
Loading existing groups...
Loading existing users...
Loading existing categories...
Loading existing posts...
Loading existing topics...
creating index
indexing files in /shared/import/data/mne_analysis
indexing /shared/import/data/mne_analysis/2018-March.mbox
...
indexing /shared/import/data/mne_analysis/2007-December.mbox
indexing replies and users
creating categories
1 / 1 (100.0%) [17497813 items/min]
creating users
creating topics and posts
7373 / 7373 (100.0%) [1360 items/min]
Updating topic status
Updating bumped_at on topics
Updating last posted at on users
Updating last seen at on users
Updating first_post_created_at...
Updating user post_count...
Updating user topic_count...
Updating topic users
Updating post timings
Updating featured topic users
Updating featured topics in categories
5 / 5 (100.0%) [6890 items/min]
Resetting topic counters
Done (00h 06min 21sec)
To check or diagnose message conversion errors inside the Docker container:
d/shell
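# Concatenate all converted mbox files into one
cat /shared/import/data/mne_analysis/*.mbox > ~/all.mbox
# formail is provided by the procmail package
sudo apt install -y procmail
mkdir -p ~/split
# Presetting FILENO sets the starting number and zero-padded width of formail's message counter
export FILENO=0000
# Split into one file per message: ~/split/msg.0000, ~/split/msg.0001, ...
formail -ds sh -c 'cat > ~/split/msg.$FILENO' < ~/all.mbox
# Inspect the files in ~/split, then clean up
rm -rf ~/split ~/all.mbox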
exit
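As an alternative to reading the split files by eye, a short script can flag messages whose headers look incomplete. This is a minimal sketch using Python's standard `mailbox` module, run wherever Python 3 is available (for example on the host, against the `.mbox` files before they are moved). The function name `find_suspect_messages` and the choice of required headers are assumptions for illustration, not something the import script defines.

```python
import mailbox
import sys

# Assumed set of headers a correctly converted message should carry
REQUIRED_HEADERS = ("From", "Date", "Message-ID", "Subject")


def find_suspect_messages(mbox_path, required=REQUIRED_HEADERS):
    """Return (index, missing_headers) for each message lacking a required header."""
    suspects = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, message in enumerate(box):
            # A header counts as missing if it is absent or empty
            missing = [h for h in required if not message.get(h)]
            if missing:
                suspects.append((index, missing))
    finally:
        box.close()
    return suspects


if __name__ == "__main__":
    # Usage (assumed script name): python3 check_mbox.py path/to/file.mbox ...
    for path in sys.argv[1:]:
        for index, missing in find_suspect_messages(path):
            print(f"{path}: message {index} is missing {', '.join(missing)}")
```

Indices are zero-based, so they line up with the `msg.0000`, `msg.0001`, ... files that `formail` writes when the same mbox file is split with `FILENO=0000`.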
To kill the Docker instance and start over:
docker stop /discourse_dev && docker rm /discourse_dev
git reset --hard
sudo rm -rf data
git clean -xdf
TBD