"Working on the crumbly edge of future-proofing." -- Heather Champ
Dotspotting does not so much piggyback on a traditional framework as it does hold hands with an anti-framework called "Flamework".
Flamework is the mythical ("mythical") PHP framework developed and used by the engineering team at Flickr. It is gradually being rewritten, from scratch, as an open-source project by former Flickr engineers. It is available to download and use on GitHub.
If you've never watched Cal Henderson's "Why I Hate Django" presentation, now is probably as good a time as any. It will help you understand a lot about why things were done the way they were at Flickr and why those of us who've left prefer to keep doing them that way.
Flamework is not really a framework, at least not by most people's standards. All software development is basically pain management and Flamework assumes that the most important thing is the speed with which the code running an application can be re-arranged, in order to adapt to circumstances, even if it's at the cost of "doing things twice" or "repeating ourselves".
Dotspotting itself may eventually become a framework but today it is not.
Today, Dotspotting is a nascent application that is still trying to recognize, never mind understand, its boundaries. That means it's just too soon for a unified database or object model and nothing is gained by having to fight against one all the time in order to adapt it to the needs of the application itself.
A complete Flamework reference is out of scope for this document but here's the short version:
Flamework is basically two things:
- A set of common libraries and functions.
- A series of social conventions for how code is arranged.
Flamework also takes the following for granted:
- It uses Smarty for templating.
- It uses global variables. Not many of them but it also doesn't make a fuss about the idea of using them.
- It does not use objects or "protected" variables.
- It breaks its own rules occasionally and uses objects but only rarely and generally when they are defined by third-party libraries (like Smarty).
- That "normalized data is for sissies".
For all intents and purposes, Flamework is a model-view-controller (MVC) system:
- There are shared libraries (the model)
- There are PHP files (the controller)
- There are templates (the view)
Here is a simple bare-bones example of how it all fits together:
# lib_example.php
<?php
function example_foo(&$user){
    # logged-out visitors have no user id, so fall back to 1000
    $max = (! empty($user['id'])) ? $user['id'] : 1000;
    return range(0, rand(0, $max));
}
?>
# example.php
#
# note how we're importing lib_example.php (above)
# and squirting everything out to page_example.txt (below)
<?php
include("include/init.php");
loadlib("example");
$foo = example_foo($GLOBALS['cfg']['user']);
$GLOBALS['smarty']->assign_by_ref("foo", $foo);
$GLOBALS['smarty']->display("page_example.txt");
exit();
?>
# page_example.txt
{assign var="page_title" value="example page title"}
{include file="inc_head.txt"}
<p>{if $cfg.user.id}Hello, {$cfg.user.username|escape}!{else}Hello, stranger!{/if}</p>
<p>foo is: {$foo|@join(",")|escape}</p>
{include file="inc_foot.txt"}
The only "rules" here are:
- Making sure you load include/init.php
- The part where init.php handles authentication checking and assigns logged in users to the global $cfg variable (it also creates and assigns a global $smarty object)
- The naming conventions for shared libraries, specifically: lib_SOMETHING.php which is imported as loadlib("SOMETHING").
- Functions defined in libraries are essentially "namespaced".
Page template names and all that other stuff is, ultimately, your business.
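To make the library convention a little more concrete, here is a rough sketch of what a loadlib-style function could look like. This is not Flamework's actual implementation: sketch_loadlib, the lib_dir config key and example_hello are all made up for illustration, and the only things taken from above are the lib_SOMETHING.php naming rule and the $GLOBALS['loaded_libs'] hash.

```php
<?php
# Hypothetical sketch, NOT Flamework's real loadlib().
# Assumes all libraries live in one directory, named by a
# made-up config key: $GLOBALS['cfg']['lib_dir']

$GLOBALS['loaded_libs'] = array();

function sketch_loadlib($name){

    # load each library at most once
    if (isset($GLOBALS['loaded_libs'][$name])){
        return;
    }

    $GLOBALS['loaded_libs'][$name] = 1;
    include($GLOBALS['cfg']['lib_dir'] . "/lib_{$name}.php");
}

# Demo: write a throwaway lib_example.php and load it. Note the
# "example_" prefix on the function, which is all the
# "namespacing" there is.
$GLOBALS['cfg']['lib_dir'] = sys_get_temp_dir();

file_put_contents(
    $GLOBALS['cfg']['lib_dir'] . "/lib_example.php",
    '<?php function example_hello(){ return "hello from lib_example"; }'
);

sketch_loadlib("example");
sketch_loadlib("example");    # second call does nothing

echo example_hello(), "\n";
?>
```

Because every function in lib_example.php starts with example_, two libraries can be loaded side by side without their function names stepping on each other.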
Flamework uses and assigns global PHP variables on the grounds that it's really just not that big a deal. A non-exhaustive list of global variables that Flamework assigns is:
- $GLOBALS['cfg'] -- this is a great big hash that contains all the various site configs
- $GLOBALS['smarty'] -- a Smarty templating object
- $GLOBALS['timings'] -- a hash used to store site performance metrics
- $GLOBALS['loaded_libs'] -- a hash used to store information about libraries that have been loaded
- $GLOBALS['local_cache'] -- a hash used to store locally cached data
- $GLOBALS['error'] -- a (helper) hash used to assign site errors to; this is also automagically assigned to a corresponding Smarty variable
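As an illustration of the "globals are fine" attitude, here is a small sketch of a pair of functions that read and write $GLOBALS['local_cache'] directly. The function names and the shape of the return value are assumptions made for this example; they are not taken from Flamework itself.

```php
<?php
# Illustration only: example_cache_set/get are made-up helpers

$GLOBALS['local_cache'] = array();

function example_cache_set($key, $value){
    $GLOBALS['local_cache'][$key] = $value;
}

function example_cache_get($key){

    if (array_key_exists($key, $GLOBALS['local_cache'])){
        return array('ok' => 1, 'data' => $GLOBALS['local_cache'][$key]);
    }

    return array('ok' => 0);
}

example_cache_set('user_123', array('id' => 123, 'username' => 'bob'));

$hit = example_cache_get('user_123');
$miss = example_cache_get('user_456');

echo $hit['ok'] ? "hit: {$hit['data']['username']}\n" : "miss\n";
echo $miss['ok'] ? "hit\n" : "miss\n";
?>
```

There is no cache object to pass around and nothing to instantiate: any function in any library can get at the same hash.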
Flamework assumes a federated model with all the various user data spread across a series of databases, or "clusters". For each cluster there are a series of corresponding helper functions defined in lib_db.php.
By default Dotspotting does not require that it be run under a fully-federated database system. It takes advantage of Flamework's ability to run in "poor man's federated" mode which causes the database libraries to act as though there are multiple database clusters when there's only really one. Specifically, all the various databases are treated as though they live in the db_main cluster. The goal is to ensure that when a given installation of Dotspotting outgrows a simple one or two machine setup it can easily be migrated to a more robust system with a minimum of fuss.
For complete details on how to set up and configure your database(s) for Dotspotting please consult the README.DATABASE.md document.
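The idea behind "poor man's federated" mode can be sketched in a few lines. The config key and function name below are hypothetical (the real ones live in Flamework's config and lib_db.php); the point is only that calling code always names the cluster it wants and a single switch decides whether that name is honoured.

```php
<?php
# Hypothetical sketch: the config key and function name are
# made up for illustration and are not Flamework's real ones.

$GLOBALS['cfg']['example_poormans_federation'] = 1;

function example_resolve_cluster($cluster){

    # With only one real database every cluster name collapses
    # to db_main. Callers still ask for the cluster they *want*.
    if (! empty($GLOBALS['cfg']['example_poormans_federation'])){
        return 'db_main';
    }

    return $cluster;
}

echo example_resolve_cluster('db_users'), "\n";      # db_main
echo example_resolve_cluster('db_tickets'), "\n";    # db_main

# Later, once there really are separate clusters, flip the switch
$GLOBALS['cfg']['example_poormans_federation'] = 0;

echo example_resolve_cluster('db_users'), "\n";      # db_users
?>
```

Since the application code never hard-codes "everything is in db_main", moving to a properly federated setup is mostly a matter of configuration rather than a rewrite.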
As of this writing Flamework defines/expects the following clusters:
- db_main
This is the database cluster where user accounts and other lookup-style database tables live.
- db_main_slave
These are read-only versions of the db_main cluster that are updated using MySQL replication.
- db_users
These are the federated tables, sometimes called "shards". This is where the bulk of the data in Dotspotting is stored because it can be spread out, in smaller chunks, across a whole bunch of databases rather than a single monolithic monster database that becomes a single point of failure and is just generally a nuisance to maintain.
- db_tickets
One of the things about storing federated user data is that from time to time you may need to "re-balance" your shards, for example moving all of a user's data from shard #5 to shard #23. That means you can no longer rely on an individual database to generate auto-incrementing unique IDs because each database shard creates those IDs in isolation and if you try to move a dot, for example, with ID 123 to a shard with another dot that already has the same ID everything will break and there will be tears.
The way around this is to use "ticketing" servers whose only job is to sit around and assign unique IDs. A discussion of ticketing servers is outside the scope of this document but Kellan wrote a good blog post about the subject if you're interested in learning more. Which is a long way of saying: Flamework uses tickets and they come from the db_tickets cluster.
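Here is a toy version of the ticketing idea, using an in-memory counter as a stand-in for the db_tickets cluster and plain arrays as stand-ins for shards. Everything in it (the function names, the counter, the shape of a dot) is made up for illustration; a real ticket server is a database, not a PHP variable.

```php
<?php
# In-memory stand-in for db_tickets: one counter, shared by everyone
$GLOBALS['example_ticket_counter'] = 0;

function example_ticket_next(){
    $GLOBALS['example_ticket_counter'] += 1;
    return $GLOBALS['example_ticket_counter'];
}

# Two pretend shards, each a hash of dots keyed by ID
$shards = array(5 => array(), 23 => array());

function example_add_dot(&$shards, $shard_id, $note){
    # the ID comes from the ticket counter, never from the shard
    $id = example_ticket_next();
    $shards[$shard_id][$id] = array('id' => $id, 'note' => $note);
    return $id;
}

example_add_dot($shards, 5, 'first dot on shard 5');
example_add_dot($shards, 23, 'first dot on shard 23');
example_add_dot($shards, 5, 'second dot on shard 5');

# Re-balance: move everything from shard 5 to shard 23
$collisions = 0;

foreach ($shards[5] as $id => $dot){

    if (isset($shards[23][$id])){
        $collisions++;
        continue;
    }

    $shards[23][$id] = $dot;
}

$shards[5] = array();

echo "collisions: {$collisions}\n";                      # collisions: 0
echo "dots on shard 23: ", count($shards[23]), "\n";     # dots on shard 23: 3
?>
```

Had each shard kept its own auto-incrementing counter, both would have handed out ID 1 and the move would have collided.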
Note: This is really specific to Dotspotting but the basic principle(s) apply equally to Flamework.
Search is one of those things that's all tangled up in how your databases are set up, whether you are doing full-text or spatial queries.
The first release of Dotspotting is geared specifically towards MySQL because it is readily available on shared web-hosting services, easy to install on both the server and desktop and has a large community of users and documentation. One consequence of using MySQL is that full-text search is not awesome. One consequence of a federated data model is that it makes doing global search (search across all of your users) problematic enough as to be impossible. This is also not awesome.
What does this mean? It means that during the initial releases of Dotspotting:
- There is no global full-text search.
- There is only limited global spatial search, which is done using Geohashes (stored in a lookup table on the db_main cluster).
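For anyone who hasn't met Geohashes before, the sketch below shows the standard encoding algorithm: longitude and latitude bits are interleaved (longitude first) and every five bits become one base-32 character. Nearby points tend to share a leading prefix, which is what makes a plain lookup table usable for rough spatial search. The function name is made up, and nothing here is meant to describe how Dotspotting itself computes or stores its Geohashes.

```php
<?php
# Standard Geohash encoding; example_geohash_encode is a made-up name
function example_geohash_encode($lat, $lon, $precision = 8){

    $alphabet = '0123456789bcdefghjkmnpqrstuvwxyz';

    $lat_range = array(-90.0, 90.0);
    $lon_range = array(-180.0, 180.0);

    $hash = '';
    $bits = 0;
    $char = 0;
    $use_lon = true;    # bits alternate, starting with longitude

    while (strlen($hash) < $precision){

        if ($use_lon){
            $mid = ($lon_range[0] + $lon_range[1]) / 2;
            if ($lon >= $mid){
                $char = ($char << 1) | 1;
                $lon_range[0] = $mid;
            } else {
                $char = $char << 1;
                $lon_range[1] = $mid;
            }
        } else {
            $mid = ($lat_range[0] + $lat_range[1]) / 2;
            if ($lat >= $mid){
                $char = ($char << 1) | 1;
                $lat_range[0] = $mid;
            } else {
                $char = $char << 1;
                $lat_range[1] = $mid;
            }
        }

        $use_lon = ! $use_lon;
        $bits++;

        # five bits make one base-32 character
        if ($bits == 5){
            $hash .= $alphabet[$char];
            $bits = 0;
            $char = 0;
        }
    }

    return $hash;
}

echo example_geohash_encode(57.64911, 10.40744, 5), "\n";    # u4pru
echo example_geohash_encode(57.65, 10.41, 5), "\n";          # u4pru
?>
```

Two dots a few hundred metres apart end up with the same five-character hash, so "find dots near here" can be approximated by matching on a Geohash prefix. The usual caveat applies: points on either side of a cell boundary can be close together and still have different prefixes.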
Moving forward we imagine the code being written in such a way that it can support a limited number of additional databases or search engines, assuming they've been installed and configured by users, with little more effort than adding specific configuration variables. Before you start asking all the obvious questions, the answer is probably: We don't know yet but it seems like a good plan so we'll try to figure out a way to make it work.
We're not actively working on this architecture yet but are thinking about it as we go, with an eye towards supporting the following:
MySQL is the default and gets you dots and bounding box (and Geohash) queries. It's also really really fast.
Solr is an open source document indexer written in Java and is principally used as a full-text search engine but it can also be used to do spatial queries. Currently radial queries are only available by using a third-party plugin but spatial indexing for both points and polygons is being actively developed for the next release of Solr (1.5).
PostGIS is a "proper" spatial database that can do amazing things so it's a no-brainer in so far as Dotspotting is concerned. It is also not always the easiest tool to install and maintain and in many cases is probably overkill for the problems people are trying to use Dotspotting to solve which is why, for the time being, it is not the default choice.
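For reference, a bounding box query of the kind mentioned for the default setup boils down to two range checks. The sketch below does it over a plain PHP array rather than a database table; the function name and the latitude/longitude keys are assumptions for illustration, and it deliberately ignores boxes that cross the antimeridian.

```php
<?php
# Made-up helper: keep only the dots that fall inside a bounding box
# given as south-west and north-east corners
function example_dots_in_bbox($dots, $swlat, $swlon, $nelat, $nelon){

    $matches = array();

    foreach ($dots as $dot){

        if ($dot['latitude'] < $swlat || $dot['latitude'] > $nelat){
            continue;
        }

        if ($dot['longitude'] < $swlon || $dot['longitude'] > $nelon){
            continue;
        }

        $matches[] = $dot;
    }

    return $matches;
}

$dots = array(
    array('id' => 1, 'latitude' => 37.77, 'longitude' => -122.42),
    array('id' => 2, 'latitude' => 45.50, 'longitude' => -73.57),
    array('id' => 3, 'latitude' => 37.80, 'longitude' => -122.27),
);

# a box drawn roughly around the San Francisco Bay Area
$found = example_dots_in_bbox($dots, 37.0, -123.0, 38.5, -121.5);

foreach ($found as $dot){
    echo "dot {$dot['id']}\n";
}
?>
```

In a database the same test is a pair of range conditions on the latitude and longitude columns, which is the sort of thing ordinary indexes handle well.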