Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate mirror submission #448

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -37,3 +37,5 @@ Thayer Williams <[email protected]>
Thomas Bächler <[email protected]>
Tom Willemsen <[email protected]>
Tyler Dence <[email protected]>
Anton Hvornum <[email protected]>
Nina Nick <[email protected]>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add this in a separate commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate commit to the same PR, or separate PR with a separate commit?

3 changes: 3 additions & 0 deletions mirrors/tests/test_mirror_submission.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
def test_mirror_registration(client, mirrorurl):
response = client.get('/mirrorlist/submit/?name=test3&tier=2&upstream=1&admin_email=anton%40hvornum.se&alternate_email=&isos=on&rsync_user=&rsync_password=&notes=&active=True&public=True&url1-url=rsync%3A%2F%2Ftest3.com%2Farchlinux&url1-country=SE&url1-bandwidth=1234&url1-active=on&url2-url=&url2-country=&url2-bandwidth=&url2-active=on&url3-url=&url3-country=&url3-bandwidth=&url3-active=on&ip=&captcha_0=d5a017cc3851fb59898167f666759c99b42afd52&captcha_1=tdof')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a crude test, but will at least test that the endpoint exists and that it accepts the request data (captcha will fail)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks wrong, a form should never be able to be posted with a GET, it should always be a POST.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please create the url programmatically so you can easily assert the data when expanding the test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below:

assert response.status_code == 200
1 change: 1 addition & 0 deletions mirrors/urls_mirrorlist.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@

urlpatterns = [
path('', views.generate_mirrorlist, name='mirrorlist'),
path('submit/', views.submit_mirror, name='mirrorsubmit'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not linked anywhere on archlinux.org, maybe we should add a link with some text on https://archlinux.org/mirrors.

re_path(r'^all/$', views.find_mirrors, {'countries': ['all']}),
re_path(r'^all/(?P<protocol>[A-z]+)/$', views.find_mirrors_simple, name='mirrorlist_simple')
]
Expand Down
292 changes: 289 additions & 3 deletions mirrors/views/mirrorlist.py
Original file line number Diff line number Diff line change
@@ -1,17 +1,151 @@
from operator import attrgetter, itemgetter
from urllib.parse import urlparse, urlunsplit
from functools import partial
from datetime import timedelta

from django import forms
from django.db import DatabaseError, transaction
from django.db.models import Q
from django.forms.widgets import SelectMultiple, CheckboxSelectMultiple
from django.core.mail import send_mail
from django.template import loader
from django.forms.widgets import (
SelectMultiple,
CheckboxSelectMultiple
)
from django.utils.timezone import now
from django.shortcuts import get_object_or_404, redirect, render
from django.views.decorators.csrf import csrf_exempt
from django_countries import countries
from django.contrib.auth.models import Group, User
from captcha.fields import CaptchaField

from ..models import MirrorUrl, MirrorProtocol
from ..utils import get_mirror_statuses
from ..models import Mirror, MirrorUrl, MirrorProtocol, MirrorRsync
from ..utils import get_mirror_statuses, get_mirror_errors

import random

# This is populated later, and re-populated every refresh
# This was the only way to get 3 different examples without
# changing the models.py
url_examples = []
TIER_1_MAX_ERROR_RATE = 10
TIER_1_ERROR_TIME_RANGE = 30
TIER_1_MIN_DAYS_AS_TIER_2 = 60


class CaptchaForm(forms.Form):
captcha = CaptchaField()

def as_div(self):
"Returns this form rendered as HTML <divs>s."

return self._html_output(
normal_row=u'<div class="captcha">%(label)s <div class="captcha-input">%(field)s%(help_text)s</div></div>',
error_row=u'%s',
row_ender='</div>',
help_text_html=u' <span class="helptext">%s</span>',
errors_on_separate_row=True)


class MirrorRequestForm(forms.ModelForm):
upstream = forms.ModelChoiceField(
queryset=Mirror.objects.filter(
tier__gte=0,
tier__lte=1
),
required=False
)

class Meta:
model = Mirror
fields = ('name', 'tier', 'upstream', 'admin_email', 'alternate_email',
'isos', 'active', 'public', 'rsync_user', 'rsync_password', 'notes')

def __init__(self, *args, **kwargs):
super(MirrorRequestForm, self).__init__(*args, **kwargs)
fields = self.fields
fields['name'].widget.attrs.update({'placeholder': 'Ex: mirror.argentina.co'})
fields['alternate_email'].widget.attrs.update({'placeholder': 'Optional'})
fields['rsync_user'].widget.attrs.update({'placeholder': 'Optional'})
fields['rsync_password'].widget.attrs.update({'placeholder': 'Optional'})
fields['notes'].widget.attrs.update({'placeholder': 'Optional (Ex: Hosted by ISP GreatISO.bg)'})

def as_div(self):
"Returns this form rendered as HTML <divs>s."
return self._html_output(
normal_row=u'<div%(html_class_attr)s>%(label)s %(field)s%(help_text)s</div>',
error_row=u'%s',
row_ender='</div>',
help_text_html=u' <span class="helptext">%s</span>',
errors_on_separate_row=True)


class MirrorUrlForm(forms.ModelForm):
class Meta:
model = MirrorUrl
fields = ('url', 'country', 'bandwidth', 'active')

def __init__(self, *args, **kwargs):
global url_examples

super(MirrorUrlForm, self).__init__(*args, **kwargs)
fields = self.fields

if len(url_examples) == 0:
url_examples = [
'Ex: http://mirror.argentina.co/archlinux',
'Ex: https://mirror.argentina.co/archlinux',
'Ex: rsync://mirror.argentina.co/archlinux'
]

fields['url'].widget.attrs.update({'placeholder': url_examples.pop()})

def clean_url(self):
# is this a valid-looking URL?
url_parts = urlparse(self.cleaned_data["url"])
if not url_parts.scheme:
raise forms.ValidationError("No URL scheme (protocol) provided.")
if not url_parts.netloc:
raise forms.ValidationError("No URL host provided.")
if url_parts.params or url_parts.query or url_parts.fragment:
raise forms.ValidationError(
"URL parameters, query, and fragment elements are not supported.")
# ensure we always save the URL with a trailing slash
path = url_parts.path
if not path.endswith('/'):
path += '/'
url = urlunsplit((url_parts.scheme, url_parts.netloc, path, '', ''))
return url

def as_div(self):
"Returns this form rendered as HTML <divs>s."
return self._html_output(
normal_row=u'<div%(html_class_attr)s>%(label)s %(field)s%(help_text)s</div>',
error_row=u'%s',
row_ender='</div>',
help_text_html=u' <span class="helptext">%s</span>',
errors_on_separate_row=True)


class MirrorRsyncForm(forms.ModelForm):
class Meta:
model = MirrorRsync
fields = ('ip',)

def __init__(self, *args, **kwargs):
super(MirrorRsyncForm, self).__init__(*args, **kwargs)
fields = self.fields
fields['ip'].widget.attrs.update({'placeholder': 'Ex: 1.2.4.5'})

def as_div(self):
"Returns this form rendered as HTML <divs>s."
return self._html_output(
normal_row=u'<div%(html_class_attr)s>%(label)s %(field)s%(help_text)s</div>',
error_row=u'%s',
row_ender='</div>',
help_text_html=u' <span class="helptext">%s</span>',
errors_on_separate_row=True)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded change

class MirrorlistForm(forms.Form):
country = forms.MultipleChoiceField(required=False, widget=SelectMultiple(attrs={'size': '12'}))
Expand Down Expand Up @@ -127,4 +261,156 @@ def find_mirrors_simple(request, protocol):
proto = get_object_or_404(MirrorProtocol, protocol=protocol)
return find_mirrors(request, protocols=[proto])


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, but PEP8#Blank-Lines dictates two blank lines between function definitions. The rest of the code is also spaced out this way.

For reference:

return urls
def find_mirrors(request, countries=None, protocols=None, use_status=False,

def mail_mirror_admins(data):
template = loader.get_template('mirrors/new_mirror_mail_template.txt')

mirror_maintainer_group = Group.objects.filter(name='Mirror Maintainers')
mirror_maintainers = User.objects.filter(is_active=True).filter(groups__in=mirror_maintainer_group)

for maintainer in mirror_maintainers:
send_mail('A mirror entry was submitted: \'%s\'' % data.get('name'),
template.render(data),
'Arch Mirror Notification <[email protected]>',
[maintainer.email],
fail_silently=True)


def validate_tier_1_request(data):
if data.get('tier') != '1':
return None

# If there is no Tier 2 with the same name,
# We invalidate this Tier 1.
if not len((tier_2_mirror := Mirror.objects.filter(name=data.get('name'), tier=2, active=True, public=True))):
return False

if tier_2_mirror[0].created - now() < timedelta(days=TIER_1_MIN_DAYS_AS_TIER_2):
return False

main_url = MirrorUrl.objects.filter(
url__startswith=data.get('url1-url'),
mirror=tier_2_mirror[0]
)

# If the Tier 2 and Tier 1 does not have matching URL,
# it requires manual intervention as it's not a direct upgrade.
if len(main_url) <= 0:
return False

error_logs = get_mirror_errors(mirror_id=tier_2_mirror[0].id, cutoff=timedelta(days=TIER_1_ERROR_TIME_RANGE),
show_all=True)

if error_logs:
num_o_errors = 0
for error in error_logs:
num_o_errors += error['error_count']

if num_o_errors >= TIER_1_MAX_ERROR_RATE:
return False

# Final check, is the mirror old enough to qualify for Tier 1?
print(tier_2_mirror[0].created)

return tier_2_mirror


def submit_mirror(request):
if request.method == 'POST' or len(request.GET) > 0:
data = request.POST if request.method == 'POST' else request.GET

captcha_form = CaptchaForm(data=data)

# data is immutable, need to be copied and modified to enforce
# active and public is False.
tmp = data.copy()
tmp['active'] = False
tmp['public'] = False
data = tmp

mirror_form = MirrorRequestForm(data=data)

mirror_url1_form = MirrorUrlForm(data=data, prefix="url1")
if data.get('url2-url') != '':
mirror_url2_form = MirrorUrlForm(data=data, prefix="url2")
else:
mirror_url2_form = MirrorUrlForm(prefix="url2")
if data.get('url3-url') != '':
mirror_url3_form = MirrorUrlForm(data=data, prefix="url3")
else:
mirror_url3_form = MirrorUrlForm(prefix="url3")

rsync_form = MirrorRsyncForm(data=data)

mirror_url2_form.fields['url'].required = False
mirror_url3_form.fields['url'].required = False
rsync_form.fields['ip'].required = False

if data.get('tier') == '1':
if existing_mirror := validate_tier_1_request(data):
existing_mirror.update(tier=1)
else:
return render(
request,
'mirrors/mirror_submit_error_upgrading.html',
{
'TIER_1_ERROR_TIME_RANGE': TIER_1_ERROR_TIME_RANGE,
'TIER_1_MAX_ERROR_RATE': TIER_1_MAX_ERROR_RATE,
}
)
else:
if captcha_form.is_valid() and mirror_form.is_valid() and mirror_url1_form.is_valid():
try:
with transaction.atomic():
transaction.on_commit(partial(mail_mirror_admins, data))

mirror = mirror_form.save()
mirror_url1 = mirror_url1_form.save(commit=False)
mirror_url1.mirror = mirror
mirror_url1.save()

if data.get('url2-url') != '' and mirror_url2_form.is_valid():
mirror_url2 = mirror_url2_form.save(commit=False)
mirror_url2.mirror = mirror
mirror_url2.save()
if data.get('url3-url') != '' and mirror_url3_form.is_valid():
mirror_url3 = mirror_url3_form.save(commit=False)
mirror_url3.mirror = mirror
mirror_url3.save()

if data.get('ip') != '' and rsync_form.is_valid():
rsync = rsync_form.save(commit=False)
rsync.mirror = mirror
rsync.save()

except DatabaseError as error:
print(error)

else:
captcha_form = CaptchaForm()
mirror_form = MirrorRequestForm()
mirror_url1_form = MirrorUrlForm(prefix="url1")
mirror_url2_form = MirrorUrlForm(prefix="url2")
mirror_url3_form = MirrorUrlForm(prefix="url3")
rsync_form = MirrorRsyncForm()

mirror_form.fields['active'].widget = forms.HiddenInput()
mirror_form.fields['public'].widget = forms.HiddenInput()
mirror_url2_form.fields['url'].required = False
mirror_url3_form.fields['url'].required = False
rsync_form.fields['ip'].required = False

return render(
request,
'mirrors/mirror_submit.html',
{
'captcha': captcha_form,
'mirror_form': mirror_form,
'mirror_url1_form': mirror_url1_form,
'mirror_url2_form': mirror_url2_form,
'mirror_url3_form': mirror_url3_form,
'rsync_form': rsync_form
}
)

# vim: set ts=4 sw=4 et:
3 changes: 2 additions & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ Markdown==3.3.7
bencode.py==4.0.0
django-countries==7.3.2
django-extensions==3.1.3
django-simple-captcha==0.5.17
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I like the captch dependency, we don't have it for flagging packages. Do you think it's really required?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is what people said I should put in when I was asking in the arch channels, I don't mind dropping it but to reduce spam this is what was suggested.

To quote: "Just make a captcha thing"

jsmin==3.0.1
pgpdump==1.5
parse==1.19.0
Expand All @@ -15,4 +16,4 @@ feedparser==6.0.10
bleach==5.0.0
requests==2.25.1
xtarfile==0.1.0
zstandard==0.17.0
zstandard==0.17.0
1 change: 1 addition & 0 deletions settings.py
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,7 @@
'public',
'releng',
'visualize',
'captcha'
)

# Logging configuration for not getting overspammed
Expand Down
18 changes: 18 additions & 0 deletions sitestatic/archweb.css
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,15 @@ code {
padding: 0.15em 0.25em;
}

.error {
font-family: monospace, monospace;
padding: 0.15em 0.25em;
overflow: auto;
color: #a94442;
background: #f2dede;
border: 1px solid #ebccd1;
}

pre {
font-family: monospace, monospace;
border: 1px solid #bdb;
Expand Down Expand Up @@ -1197,3 +1206,12 @@ ul.signoff-list {
.pgp-key-ids {
display: inline-block;
}

.captcha {
display: flex;
}

.captcha-input {
display: flex;
flex-direction: column;
}
Loading