
Mailing list migration

Eric Larson edited this page Dec 11, 2020 · 17 revisions

Following:

1. Import mailing list to dev env

git clone https://github.com/discourse/discourse.git
cd discourse
d/boot_dev --init
printf "\ngem 'sqlite3'" >> Gemfile
d/rails db:migrate RAILS_ENV=development
d/bundle install

# I usually do these steps in another dir (e.g. ~/Desktop) so I only ever have to do them once
mkdir shared
wget -r -l1 --no-parent --no-directories "https://mail.nmr.mgh.harvard.edu/pipermail//mne_analysis/" -P shared/import/data/mne_analysis -A "*-*.txt.gz"
rm shared/import/data/mne_analysis/robots.txt.tmp
gzip -d shared/import/data/mne_analysis/*.txt.gz
wget https://gist.githubusercontent.com/larsoner/940cd6c7100b87c4c5668cb0bc540afb/raw/9e78513620d11355ad0e10f4a2470996c26ebc8c/mailmanToMBox.py -O shared/mailmanToMBox.py
python3 shared/mailmanToMBox.py shared/import/data/mne_analysis/
rm shared/import/data/mne_analysis/*.txt
wget https://raw.githubusercontent.com/discourse/discourse/master/script/import_scripts/mbox/settings.yml -O shared/import/settings.yml
vim shared/import/settings.yml  # add "Mne_analysis": "mne_analysis" under "tag:" section

# Then I `cp -a` these to the `discourse/` root dir in case I need to hard reset, then go to a shell:
d/shell
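
Before entering the container with `d/shell`, it can be worth confirming that the conversion step actually produced non-empty `.mbox` files. The following is a minimal sketch, not part of Discourse or the gist: `count_mbox_messages` is a hypothetical helper that counts lines beginning with `From `, the conventional mbox message separator. Body lines that start with `From ` and were not escaped will inflate the count, so treat the numbers as approximate.

```shell
# Hypothetical helper (an assumption for illustration, not shipped with
# Discourse or mailmanToMBox.py): print the number of "From " separator
# lines in each *.mbox file under a directory, then a grand total.
count_mbox_messages() {
    local dir="$1" total=0 n f
    for f in "$dir"/*.mbox; do
        [ -e "$f" ] || continue                    # glob matched nothing
        # grep -c prints 0 but exits non-zero when nothing matches
        n=$(grep -c '^From ' "$f" || true)
        printf '%s\t%s\n' "$n" "$(basename "$f")"
        total=$((total + n))
    done
    printf '%s\ttotal\n' "$total"
}
```

Run on the host as `count_mbox_messages shared/import/data/mne_analysis`. The total can be compared loosely against the `creating topics and posts` count in the import output; the two need not match exactly.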

In the docker shell:

cd /src
sudo mv shared/import /shared/
bundle exec ruby script/import_scripts/mbox.rb /shared/import/settings.yml
Output
Loading existing groups...
Loading existing users...
Loading existing categories...
Loading existing posts...
Loading existing topics...

creating index
indexing files in /shared/import/data/mne_analysis
indexing /shared/import/data/mne_analysis/2018-March.mbox
...
indexing /shared/import/data/mne_analysis/2007-December.mbox

indexing replies and users

creating categories
        1 / 1 (100.0%)  [17497813 items/min]  
creating users

creating topics and posts
     7373 / 7373 (100.0%)  [1360 items/min]  

Updating topic status

Updating bumped_at on topics

Updating last posted at on users

Updating last seen at on users

Updating first_post_created_at...

Updating user post_count...

Updating user topic_count...

Updating topic users

Updating post timings

Updating featured topic users

Updating featured topics in categories
        5 / 5 (100.0%)  [6890 items/min]
Resetting topic counters


Done (00h 06min 21sec)
To check/diagnose message conversion errors in the Docker container...
d/shell
# check results
cat /shared/import/data/mne_analysis/*.mbox > ~/all.mbox
sudo apt install -y procmail
mkdir -p ~/split
export FILENO=0000
formail -ds sh -c 'cat > ~/split/msg.$FILENO' < ~/all.mbox
rm -rf ~/split ~/all.mbox
exit
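
Between the `formail` split and the `rm -rf` cleanup above, the individual messages in `~/split` can be inspected. As one possible check, the sketch below lists split files whose header block lacks a given field. `list_missing_header` is a hypothetical helper, and which headers the importer actually depends on is an assumption here; `Message-ID` is used only as an example.

```shell
# Hypothetical helper (an assumption for illustration): list msg.* files in
# a directory whose header block (the lines before the first blank line)
# does not contain the given header field. Matching is case-insensitive.
list_missing_header() {
    local dir="$1" header="$2" f
    for f in "$dir"/msg.*; do
        [ -e "$f" ] || continue                    # glob matched nothing
        if ! awk -v h="$header" '
            BEGIN { h = tolower(h) ":" }
            /^$/ { exit }                          # end of the header block
            index(tolower($0), h) == 1 { found = 1; exit }
            END { exit !found }                    # status 0 only if found
        ' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `list_missing_header ~/split Message-ID` prints the paths of messages with no `Message-ID:` header, which are reasonable first candidates to open by hand.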
To kill the docker instance and start over...
docker stop /discourse_dev && docker rm /discourse_dev
git reset --hard
sudo rm -rf data
git clean -xdf
On the Ubuntu host (hopefully):

```
d/unicorn
google-chrome http://0.0.0.0:9292
```

Done!

4. Exporting the database for the Discourse people to host

TBD