Skip to content

Mailing list migration

Eric Larson edited this page Dec 11, 2020 · 17 revisions

Following:

1. In Ubuntu host

git clone https://github.com/discourse/discourse.git
cd discourse
d/boot_dev --init
d/rails db:migrate RAILS_ENV=development
d/shell
vim /src/Gemfile  # add gem 'sqlite3' to end
exit
d/bundle

2. In docker shell

# follow the import/list_name hierarchy from
# https://meta.discourse.org/t/importing-mailing-lists-mbox-listserv-google-groups-emails/79773#1-5-prepare-files

cd ~
sudo mkdir -p /shared/import/data
sudo chown -R discourse:discourse /shared/import
wget -r -l1 --no-parent --no-directories "https://mail.nmr.mgh.harvard.edu/pipermail//mne_analysis/" -P /shared/import/data/mne_analysis -A "*-*.txt.gz"
rm /shared/import/data/mne_analysis/robots.txt.tmp
gzip -d /shared/import/data/mne_analysis/*.txt.gz
wget https://gist.githubusercontent.com/larsoner/940cd6c7100b87c4c5668cb0bc540afb/raw/9e78513620d11355ad0e10f4a2470996c26ebc8c/mailmanToMBox.py -O ~/mailmanToMBox.py
python3 ~/mailmanToMBox.py /shared/import/data/mne_analysis/
rm /shared/import/data/mne_analysis/*.txt
sudo apt install -y libsqlite3-dev

# check results
cat /shared/import/data/mne_analysis/*.mbox > ~/all.mbox
sudo apt install -y procmail
mkdir -p ~/split
export FILENO=0000
formail -ds sh -c 'cat > ~/split/msg.$FILENO' < ~/all.mbox
rm -rf ~/split ~/all.mbox

# settings
wget https://raw.githubusercontent.com/discourse/discourse/master/script/import_scripts/mbox/settings.yml -O /shared/import/settings.yml

# run it
cd /src
bundle exec ruby script/import_scripts/mbox.rb /shared/import/settings.yml
$ bundle exec ruby script/import_scripts/mbox.rb /shared/import/settings.yml
Loading existing groups...
Loading existing users...
Loading existing categories...
Loading existing posts...
Loading existing topics...

creating index
indexing files in /shared/import/data/mne_analysis
indexing /shared/import/data/mne_analysis/2018-March.mbox
...
indexing /shared/import/data/mne_analysis/2007-December.mbox

indexing replies and users

creating categories
        1 / 1 (100.0%)  [17497813 items/min]  
creating users

creating topics and posts
     7373 / 7373 (100.0%)  [1360 items/min]  

Updating topic status

Updating bumped_at on topics

Updating last posted at on users

Updating last seen at on users

Updating first_post_created_at...

Updating user post_count...

Updating user topic_count...

Updating topic users

Updating post timings

Updating featured topic users

Updating featured topics in categories
        5 / 5 (100.0%)  [6890 items/min]   ]  
Resetting topic counters


Done (00h 06min 21sec)

3. In Ubuntu host (hopefully)

d/unicorn
google-chrome http://0.0.0.0:9292

Done!