
W3C Google Summer of Code 2009 idea:
Complementary database management scripts for W3C's flexible access control (flacl) Apache modules

Mentors: José Kahan / Dominique Hazaël-Massieux / Ted Guild

Summary

Write the complementary scripts for querying, populating, and managing flacl's databases, in preparation for contributing flacl to the Open Source community and, in particular, to the Apache Software Foundation. This involves quite a bit of coding, but also conceptual work, spanning web forms, a command-line tool, a common library, and test procedures.

Required skills:

Description

The W3C staff and its systems team make it a point of honor to contribute to open-source projects, be it by publishing our in-house tools on our public CVS server or by contributing code, patches, and bug reports to existing projects [7]. Over the years, the Apache server has been one of the targets for our contributions [1, 2]. This is natural: Apache is the web server we use in most of our services, if not in almost all of our web servers. We benefit from it, and it's a pleasure to contribute patches or bug reports that may help other Apache users... plus, we think that Apache is quite cool!

The goal of this proposal is to help us contribute flacl, our web-friendly and fine-grained access control list system, to the Open Source community and, in particular, to the Apache Software Foundation. In a typical web server, we have individual users and groups of users, resources (e.g., web pages, scripts), and access rights that are defined in terms of HTTP methods (e.g., GET, PUT, POST). flacl lets a maintainer set these rights per user or group, per resource, and per HTTP method. To our knowledge, no other current Apache AAA module provides this flexibility today. The system has been in use on our servers for almost 10 years. The evaluation of the access rules doesn't have any significant impact on the server's performance.

Background

In a standard Apache configuration [3], user attributes (passwords, group membership, and so on) are usually stored in files or databases. If the user attributes are stored inside a database, it's possible to manage and query them through a web interface.

On the other hand, access rights are declared either in server configuration files (parsed only once, when the server starts) or in .htaccess files (parsed on each request). This complicates managing and querying the access rights. Any change to the server's configuration files requires restarting the server. While you could modify an .htaccess file and store it using HTTP PUT or WebDAV, this is not practical (you need access rights that allow you to create and modify that file). Finally, and most important, humans may have a hard time evaluating a user's access rights over a given resource. Apache's access control directives may be hierarchically cascaded. That is, a user's access rights on a given resource may depend on several access control directives and .htaccess files that apply to the path components that precede it (both virtual paths in URLs and physical paths in the file system). Querying the access rights may thus require access to the server configuration files as well as several .htaccess files, plus the expertise to evaluate those directives. This may not be easy to do through a web API.
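To illustrate why cascaded rules are hard to evaluate by eye, here is a minimal Python sketch. It assumes a deliberately simplified model in which the deepest path prefix that mentions an HTTP method wins; Apache's real merging of configuration sections and .htaccess files is more involved. The RULES table, the user names, and both function names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical rule table: path prefix -> {HTTP method -> allowed users}.
# In a real Apache setup these rules would be spread over the server
# configuration and several .htaccess files.
RULES: Dict[str, Dict[str, List[str]]] = {
    "/": {"GET": ["alice", "bob"]},
    "/team": {"GET": ["alice"], "PUT": ["alice"]},
    "/team/drafts": {"GET": ["alice", "bob"]},
}


def ancestors(path: str) -> List[str]:
    """Return the path prefixes from the root down to the path itself."""
    parts = [p for p in path.split("/") if p]
    prefixes = ["/"]
    for i in range(1, len(parts) + 1):
        prefixes.append("/" + "/".join(parts[:i]))
    return prefixes


def effective_users(path: str, method: str) -> Optional[List[str]]:
    """Walk from the root to the resource; the deepest rule for the method wins."""
    result = None
    for prefix in ancestors(path):
        rule = RULES.get(prefix, {})
        if method in rule:
            result = rule[method]
    return result
```

Even in this toy model, answering "who may GET /team/drafts/a.html?" means consulting three separate rule sets, which is the kind of work a human would otherwise do by reading several files.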

flacl in a nutshell

The W3C systems team developed flacl, a web API friendly, flexible, and fine-grained access control system, in 1999. This system, which is used for all access-control decisions on our web servers, stores both user attributes (user name, password, IP address, group membership) and access control information in dbm files (a MySQL version is also available). The system consists of an Apache 2 module, mod_dbbmacl, and a web script, ,access.

The module queries the databases to authenticate users and to grant access to a given resource. The module is written in C and conforms to Apache's AAA (authentication, authorization, and access control) specification; that is, its directives are similar to those of any other Apache AAA module. The ,access script, written in PHP, is used to both query and manage the access rights and can be combined with other access control mechanisms. The script also depends on a Perl library that we use to synthesize the dbm files from a MySQL database.
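As a rough illustration of keeping access control information in a dbm file and querying it per request, the following Python sketch uses the standard library's dbm.dumb backend. The key layout ("<uri> <METHOD>" mapped to a comma-separated list of users and groups), the sample rows, and the GROUPS table are assumptions made for this example; they are not flacl's actual schema.

```python
import dbm.dumb
import os
import tempfile

# Hypothetical key layout, NOT flacl's real schema:
#   "<uri> <METHOD>" -> comma-separated user and group names
ACL_ROWS = [
    ("/reports/2009.html", "GET", ["alice", "@staff"]),
    ("/reports/2009.html", "PUT", ["alice"]),
]
GROUPS = {"@staff": {"bob", "carol"}}  # hypothetical group membership


def build_acl_db(path, rows):
    """Write ACL rows into a new dbm file, one key per (uri, method) pair."""
    with dbm.dumb.open(path, "n") as db:
        for uri, method, principals in rows:
            db[f"{uri} {method}"] = ",".join(principals)


def is_allowed(path, user, uri, method):
    """Look up the (uri, method) key and test the user against its entries."""
    with dbm.dumb.open(path, "r") as db:
        raw = db.get(f"{uri} {method}".encode())
    if raw is None:
        return False  # no rule recorded: deny, a choice made for this sketch
    for principal in raw.decode().split(","):
        if principal == user or user in GROUPS.get(principal, set()):
            return True
    return False
```

A caller would first run build_acl_db(db_path, ACL_ROWS) on a path of its choosing, for example inside a tempfile.TemporaryDirectory(), and then call is_allowed(db_path, "bob", "/reports/2009.html", "GET"). A single key lookup per request is one plausible reason why rule evaluation can stay cheap.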

In a typical use case, a user browses a URI on the web server. To invoke the ,access script, the user adds the ",access" suffix to the URI in the navigation bar (that's why it's called ,access :-) ). This brings up a simple view that shows all the access rights currently allocated for that URI (see screenshot) and offers the most common access right attribution operations. An advanced view allows more fine-grained control: it lists the known users and groups in the database and lets a maintainer grant, deny, or revoke access to the URI in terms of HTTP methods (see screenshot). ,access provides an additional view that lets a maintainer manage all of the resources stored under a given path through a single form (see screenshot).
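The two ideas in this use case, recognizing the ",access" suffix and applying grant, deny, or revoke per HTTP method, can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory dictionary stands in for the real databases, the function names are hypothetical, and query strings after the suffix are not handled.

```python
from typing import Dict, Optional, Tuple

SUFFIX = ",access"

# Hypothetical in-memory store: (uri, method) -> {principal: "grant" | "deny"}
Acl = Dict[Tuple[str, str], Dict[str, str]]


def split_access_suffix(uri: str) -> Optional[str]:
    """Return the target URI if the request ends in ',access', else None."""
    if uri.endswith(SUFFIX):
        return uri[: -len(SUFFIX)]
    return None


def apply_operation(acl: Acl, uri: str, method: str, principal: str, op: str) -> None:
    """Grant, deny, or revoke one principal's right for one HTTP method."""
    entry = acl.setdefault((uri, method), {})
    if op in ("grant", "deny"):
        entry[principal] = op
    elif op == "revoke":
        entry.pop(principal, None)  # revoking removes any explicit rule
    else:
        raise ValueError(f"unknown operation: {op}")
```

Treating revoke as "remove the explicit rule" rather than as a synonym for deny is a design choice of this sketch; it keeps the three operations distinct.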

Thus, by means of the ,access script, a maintainer may browse or manage the actual access rights associated with any URI. flacl is compatible with the standard Apache AAA modules. If the access rights are hard-coded in the server configuration or in .htaccess directives, the script will show the final result but won't let the maintainer modify them.

Work to be done

Although the flacl Apache module has been available for a long time on our public CVS server [8], it's not useful to other people in its present state: the scripts for querying and populating the related dbm databases are not completely independent of other infrastructure in our system (user databases, legacy code).

The goal of this project is to rewrite the ,access script in order to make it independent of the W3C infrastructure and to prepare the whole system so that it can be contributed to the Open Source community in general and, in particular, to the Apache Software Foundation [6]. Although the existing ,access is written in PHP, there's no restriction on recoding it in some other language (Perl, Python). We would like to have a common library for the code, with both a CLI and a web form UI on top of it, as well as test procedures. Any ideas you have for making flacl more general or useful for other people will be welcome.
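One possible shape for the "common library plus CLI" layering is sketched below in Python. Everything here is hypothetical: the tool name flacl-admin, the grant and query functions, and the in-memory dictionary that stands in for the real databases. The point is only that the CLI stays a thin wrapper, so a web form handler could call the same library functions.

```python
import argparse
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the ACL database.
_ACL: Dict[Tuple[str, str], List[str]] = {}


def grant(uri: str, method: str, user: str) -> None:
    """Library call that both the CLI and a web form handler could share."""
    users = _ACL.setdefault((uri, method.upper()), [])
    if user not in users:
        users.append(user)


def query(uri: str, method: str) -> List[str]:
    """Return the users granted the given method on the URI."""
    return list(_ACL.get((uri, method.upper()), []))


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="flacl-admin")  # hypothetical name
    sub = parser.add_subparsers(dest="command", required=True)
    g = sub.add_parser("grant", help="grant a user one HTTP method on a URI")
    for name in ("uri", "method", "user"):
        g.add_argument(name)
    q = sub.add_parser("query", help="list users granted a method on a URI")
    for name in ("uri", "method"):
        q.add_argument(name)
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Thin CLI wrapper: parse arguments, then delegate to the library."""
    args = build_parser().parse_args(argv)
    if args.command == "grant":
        grant(args.uri, args.method, args.user)
    else:
        print(" ".join(query(args.uri, args.method)))
    return 0
```

Calling main(["grant", "/team/plan.html", "GET", "alice"]) followed by main(["query", "/team/plan.html", "GET"]) exercises both commands within one process. Because main accepts an argument list, the same entry point can also be driven directly from test procedures.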

We can set aside several hours per week for mentoring and would very much like to work with someone experienced enough to work independently, able to work with others, and unafraid to make decisions and argue for them. We will share our programming experience with you, and you will interact with us as part of the W3C systems team.

The code will be released under both the W3C Software License [4] and the Apache 2.0 License [5]. The coding standards will be those currently applied by the Apache Software Foundation and those advised as best practices for the chosen scripting language (GNU, PHP, Perl, Python, ...).

References

[1] List of W3C's contributed Apache patches.
http://www.w3.org/2007/10/osc
[2] Apache HTTP Server changelog.
http://www.apache.org/dist/httpd/CHANGES_2.2.11
[3] Apache HTTP Server Version 2.2 Documentation: Access Control.
http://httpd.apache.org/docs/2.2/howto/access.html
[4] W3C Software Notice and License.
http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
[5] Apache License, Version 2.0.
http://www.apache.org/licenses/LICENSE-2.0
[6] Apache Software Foundation.
http://www.apache.org/
[7] W3C's Open Source Software.
http://www.w3.org/Status
[8] CVS repository for flacl Apache modules.
http://dev.w3.org/apache-modules/mod_authnz_dbmacl/

Last modified: $Author: kahan $ $Date: 2009/03/10 00:40:35 $