PendingDeprecationWarning: Creating a ModelForm without either the 'fields' attribute or the 'exclude' attribute is deprecated - form needs updating
Signed-off-by: Dan McGee <dan@archlinux.org>
We sometimes record a duration even on a failed fetch attempt, such as
when we get an HTTP 404. However, we never record a last_sync value on a
failed fetch. Use the last_sync field instead to count the total number
of successful checks.
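A minimal sketch of the counting rule, not the actual model code. The
CheckLog tuple and count_successes() helper are hypothetical stand-ins;
only the duration and last_sync names come from the message above.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for one mirror check log row.
CheckLog = namedtuple("CheckLog", ["duration", "last_sync"])


def count_successes(logs):
    """Count only the checks that recorded a last_sync value."""
    return sum(1 for log in logs if log.last_sync is not None)


logs = [
    # Successful fetch: both duration and last_sync are recorded.
    CheckLog(duration=0.42, last_sync=datetime(2013, 1, 1, 12, 0)),
    # Failed fetch that still has a duration (for example an HTTP 404).
    CheckLog(duration=0.10, last_sync=None),
    # Failed fetch with nothing recorded at all.
    CheckLog(duration=None, last_sync=None),
]
```

Counting rows with a non-NULL duration would report two checks here,
while counting on last_sync reports the single successful one.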
Signed-off-by: Dan McGee <dan@archlinux.org>
This should be a small enough chunk of data that it isn't super
expensive to put into and pull out of memcached.
Signed-off-by: Dan McGee <dan@archlinux.org>
FTP is a terrible protocol these days compared to HTTP. IPv6 support is
spotty at best, it is much slower for the connect/begin transfer cycle,
and overall it just doesn't provide anything HTTP doesn't do better.
Start killing bits that we've added to treat FTP as a first-class
protocol and relegate it to the back seat.
The expectation here is once this commit goes live to the production
site, the FTP mirror URLs themselves will get removed completely from
the database, and the FTP protocol object itself will get deleted.
Signed-off-by: Dan McGee <dan@archlinux.org>
Give a window of 7 days for logs here rather than the default 24 hours
we do on the main status page since we are only retrieving details for a
single mirror with a handful of URLs. This should make it easier to have
all information regarding one mirror in a single location.
Signed-off-by: Dan McGee <dan@archlinux.org>
If certain attributes came back from the database as NULL, we had issues
parsing them. Pass None/NULL straight through rather than trying to
type-convert.
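A sketch of the pass-through idea, assuming a generic conversion helper.
The convert_or_none() and parse_timestamp() names are hypothetical and
are not taken from the actual code.

```python
from datetime import datetime


def convert_or_none(value, converter):
    """Apply converter to value, but hand None (SQL NULL) back untouched."""
    if value is None:
        return None
    return converter(value)


def parse_timestamp(value):
    """Hypothetical converter for a 'YYYY-MM-DD HH:MM:SS' string."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
```

Calling float(None) or strptime(None, ...) raises TypeError, which is
the kind of parsing failure the None check avoids.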
Signed-off-by: Dan McGee <dan@archlinux.org>
Most of these were suggested by PyCharm, and include everything from
little syntax issues and other bad smells to dead or bad code.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than lump it all together and have odd spikes depending on which
side of the Atlantic checked a mirror in a given timeslot, draw a chart
per check location.
Signed-off-by: Dan McGee <dan@archlinux.org>
The benefit of these storage operations might be outweighed by the cost,
especially given how infrequently these functions are called.
Signed-off-by: Dan McGee <dan@archlinux.org>
Move completely to custom SQL for this logic. The Django ORM just
doesn't play nice with the kind of query we are looking to do, so it is
easier to do using raw SQL.
The biggest pain factor here is supporting sqlite: it has far fewer
capabilities for handling datetime types directly in the database, and
it needs some different type conversions.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than doing this in the Python code and needing 12,000+ rows
returned from the database, we can do it in the database and get fewer
than 300 rows back.
If I recall correctly, the reason this was not done originally was our
usage of MySQL and some really bad date math/overflow stuff it did when
the interval between last_sync and check_time was greater than about a
week. Luckily, we have switched to using a saner database.
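A rough illustration of pushing the aggregation into the database,
using an in-memory sqlite table. The check_log table, its columns other
than last_sync and check_time, and the summarize() helper are
assumptions for this sketch; timestamps are stored as integer seconds
purely to keep the date math trivial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE check_log ("
    "url_id INTEGER, check_time INTEGER, last_sync INTEGER, duration REAL)"
)
rows = [
    (1, 1000, 400, 0.5),
    (1, 2000, 1800, 1.5),
    (1, 3000, None, None),  # failed check: no last_sync, no duration
    (2, 1000, 900, 0.25),
]
conn.executemany("INSERT INTO check_log VALUES (?, ?, ?, ?)", rows)


def summarize(conn):
    """Return one aggregated row per URL instead of every log row."""
    query = """
        SELECT url_id,
               COUNT(*) AS check_count,
               COUNT(last_sync) AS success_count,
               AVG(duration) AS avg_duration,
               AVG(check_time - last_sync) AS avg_delay
        FROM check_log
        GROUP BY url_id
        ORDER BY url_id
    """
    return conn.execute(query).fetchall()
```

The aggregate functions skip NULL values, so the failed check counts
toward check_count but not toward the averages.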
Signed-off-by: Dan McGee <dan@archlinux.org>
This gives us a bunch more flexibility on this field, and now supports
all the options that the rsync config file supports.
Signed-off-by: Dan McGee <dan@archlinux.org>
This was a silly thinko here; it caused the logs to fill up with a bunch
of 'unknown url type: rsync' errors.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rsync doesn't like this so much:
Unexpected remote arg: rsync://mirror.example.com/archlinux/lastsync
rsync error: syntax or usage error (code 1) at main.c(1214) [sender=3.0.9]
Signed-off-by: Dan McGee <dan@archlinux.org>
This adds the -l/--location argument to the command to pass in the
check location we are currently running from. This locks the IP address
family to the one derived from that location's address, and stores any
check results tagged with the location ID.
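One way the address family could be derived, sketched with the standard
library only. The LOCATIONS table, the integer location ID, and the
helper names are assumptions for illustration; the real command may
look the location up differently.

```python
import argparse
import ipaddress
import socket

# Hypothetical check locations keyed by ID (documentation addresses).
LOCATIONS = {
    1: "192.0.2.10",
    2: "2001:db8::10",
}


def family_for(address):
    """Map an IP address string to the matching socket address family."""
    ip = ipaddress.ip_address(address)
    return socket.AF_INET6 if ip.version == 6 else socket.AF_INET


def parse_args(argv):
    parser = argparse.ArgumentParser(description="mirror check sketch")
    parser.add_argument("-l", "--location", type=int, default=None,
                        help="ID of the check location we are running from")
    return parser.parse_args(argv)


def resolve_family(argv):
    """Return (location_id, family); no location means no family lock."""
    args = parse_args(argv)
    if args.location is None:
        return None, socket.AF_UNSPEC
    return args.location, family_for(LOCATIONS[args.location])
```

Without a location the family stays unspecified, so the resolver
remains free to pick either IPv4 or IPv6.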
Signed-off-by: Dan McGee <dan@archlinux.org>
This reverts commit 3c4ceb16. We don't need this anymore as bulk_create
gets automatic batching now on sqlite3 so it is safe to use.
Signed-off-by: Dan McGee <dan@archlinux.org>
We have been better about doing this to most of our models, but the ones
here didn't have a created field. Add it where appropriate and set a
reasonably old default value.
Signed-off-by: Dan McGee <dan@archlinux.org>
For now, it is not included in the default selection, but we have a few
existing mirrors that do support it.
Signed-off-by: Dan McGee <dan@archlinux.org>
This was added in Django 1.5 and allows saving only a subset of a
model's fields. It makes sense in a few cases to utilize it.
Signed-off-by: Dan McGee <dan@archlinux.org>
Less noticeable in production as the templates don't show
'@@@INVALID@@@' there, but we were trying to access attributes that
don't actually exist on certain mirror objects.
Signed-off-by: Dan McGee <dan@archlinux.org>
We now always look for this information at the URL level, not the mirror
level. This simplifies quite a bit of code in and around the mirror
views.
Signed-off-by: Dan McGee <dan@archlinux.org>
Rather than have the weird indirection we need now to find the right
country for URLs, just always store it on the URL.
Signed-off-by: Dan McGee <dan@archlinux.org>
This seems to generate much more performant queries at the database
level than what we were previously doing, and also doesn't show
duplicate rows.
Signed-off-by: Dan McGee <dan@archlinux.org>
We need to ensure we don't duplicate URLs in the status view, so add
the distinct() call back in to the queryset; it was inadvertently
dropped in commit a2cfa7edbb. This negates a lot of the performance
gains we had, unfortunately, so it looks like a nested subquery might be
more efficient. It is disappointing that the planner can't do this for us.
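To illustrate the duplicate-row problem and the nested subquery
alternative, here is a sketch against an in-memory sqlite database. The
table names, columns, and queries are hypothetical and are not the SQL
the ORM generates; it shows only that both forms return the same rows,
not how they perform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mirror_url (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE check_log (url_id INTEGER, check_time INTEGER);
    INSERT INTO mirror_url VALUES (1, 'http://mirror.example.com/archlinux/');
    INSERT INTO mirror_url VALUES (2, 'http://mirror.example.org/archlinux/');
    INSERT INTO check_log VALUES (1, 100);
    INSERT INTO check_log VALUES (1, 200);
    INSERT INTO check_log VALUES (2, 150);
""")

# A plain join yields one row per matching log entry, so URLs repeat.
JOIN_ONLY = """
    SELECT u.id FROM mirror_url u
    JOIN check_log l ON l.url_id = u.id
    WHERE l.check_time >= 100 ORDER BY u.id
"""

# DISTINCT removes the repeats after the join has produced them.
JOIN_DISTINCT = """
    SELECT DISTINCT u.id FROM mirror_url u
    JOIN check_log l ON l.url_id = u.id
    WHERE l.check_time >= 100 ORDER BY u.id
"""

# A nested subquery never multiplies the URL rows in the first place.
SUBQUERY = """
    SELECT u.id FROM mirror_url u
    WHERE u.id IN (SELECT url_id FROM check_log WHERE check_time >= 100)
    ORDER BY u.id
"""


def ids(query):
    return [row[0] for row in conn.execute(query)]
```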
Signed-off-by: Dan McGee <dan@archlinux.org>
Now that we have as many mirror URLs as we do, we can do a better job
fetching and aggregating this data. The prior method resulted in a
rather unwieldy query being pushed down to the database with a
horrendously long GROUP BY clause. Instead of trying to group by
everything at once so we can retrieve mirror URL info at the same time,
separate the two queries: one for getting URL performance data, one for
the qualitative data.
The impetus behind fixing this is the PostgreSQL slow query log in
production; this query currently shows up more often than any other we
run in the system.
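The split can be sketched as follows: one grouped query for the
numbers, one plain query for the descriptive columns, merged in Python
by URL ID. The schema, the country column, and the url_status() helper
are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mirror_url (id INTEGER PRIMARY KEY, url TEXT, country TEXT);
    CREATE TABLE check_log (url_id INTEGER, duration REAL);
    INSERT INTO mirror_url VALUES (1, 'http://mirror.example.com/archlinux/', 'DE');
    INSERT INTO mirror_url VALUES (2, 'http://mirror.example.org/archlinux/', 'US');
    INSERT INTO check_log VALUES (1, 0.5);
    INSERT INTO check_log VALUES (1, 1.5);
    INSERT INTO check_log VALUES (2, 0.25);
""")


def url_status(conn):
    # Query 1: performance data only, grouped by a single column.
    perf = {
        url_id: (count, avg)
        for url_id, count, avg in conn.execute(
            "SELECT url_id, COUNT(*), AVG(duration) "
            "FROM check_log GROUP BY url_id"
        )
    }
    # Query 2: qualitative data, with no GROUP BY at all.
    result = []
    for url_id, url, country in conn.execute(
        "SELECT id, url, country FROM mirror_url ORDER BY id"
    ):
        count, avg = perf.get(url_id, (0, None))
        result.append({
            "url": url,
            "country": country,
            "check_count": count,
            "avg_duration": avg,
        })
    return result
```

The GROUP BY clause shrinks to one column because the descriptive
fields no longer have to ride along with the aggregates.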
Signed-off-by: Dan McGee <dan@archlinux.org>