1. Background
The Athena desktop computing environment currently runs on over 300 IS&T-owned systems in general-use clusters around campus and over 600 other systems owned by departments, labs, centers and individuals. Initially put into production use in 1990 as the sole Institute Academic Computing platform, the Athena desktop continues to be actively used as one platform among many, and to be actively supported by IS&T as a service offering.
Over the years, evolutionary changes have been made to the Athena desktop computing environment. Athena 10 is the next evolutionary stage of Athena Linux.
Over the past 2½ years, ISDA has developed a set of web services that interact with MIT ID, Moira, Roles, Data Warehouse and GEONAMES.ORG in an effort to provide developers and development teams at MIT with an easy-to-use, reusable interface into these systems. The initial development phase of ISDA web services has been completed, and work has begun on the second phase, which includes creating a standing Web Services Work Group and providing expertise and documentation to developers.
2. Findings
- High Availability
Users of traditional web applications often expect that services will be available at any time. In order to provide this level of functionality, the underlying web services should be available 24 x 7. To that end, more work is required to understand which web services need to be highly available, and what updates they may require to meet this demand.
- Utilize Upstream Provider
Athena should rely on upstream providers for operating system and application updates as much as possible. Similar to the update process used on Windows and Mac OS X, this will allow MIT to focus on configurations and applications that are unique to the Institute.
- Clarify Support
In order to provide a highly available service, the support procedure should be expanded to provide 24 x 7 response. In particular, an escalation path should be clearly defined in order to ensure around-the-clock support.
- Data Latency
Any data latency issues due to source downtime or caching should be thoroughly documented. This will give developers a fuller understanding of the data they are using and allow them to create applications that deal with possible data delays.
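As an illustration of how an application might deal with the data delays described above, the following is a minimal sketch, not a description of how the ISDA services behave. It assumes a hypothetical `fetch` function standing in for a call to a web service, and the names `LatencyAwareCache`, `Cached` and `max_age_seconds` are invented for this example. The cache returns the age of each value and marks it as stale when the source cannot be reached, so that the calling application can decide how to present possibly outdated data.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Cached:
    value: Any
    fetched_at: float  # clock reading when the source was last read successfully
    stale: bool        # True when the value is older than allowed and no refresh was possible


class LatencyAwareCache:
    """Hypothetical cache that reports how old its data is (illustrative only)."""

    def __init__(self, fetch: Callable[[str], Any], max_age_seconds: float,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch            # assumed stand-in for a web service call
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries: Dict[str, Cached] = {}

    def get(self, key: str) -> Cached:
        now = self._clock()
        entry = self._entries.get(key)
        # Serve from the cache while the entry is still within its allowed age.
        if entry is not None and now - entry.fetched_at <= self._max_age:
            return Cached(entry.value, entry.fetched_at, stale=False)
        try:
            value = self._fetch(key)
        except Exception:
            # Source is down: fall back to the old value, clearly marked as stale.
            if entry is None:
                raise
            return Cached(entry.value, entry.fetched_at, stale=True)
        fresh = Cached(value, now, stale=False)
        self._entries[key] = fresh
        return fresh
```

An application could compare `fetched_at` with the current time to show users how old the data is, or refuse to act on a result whose `stale` flag is set.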
- Versioning
In order to provide a reliable and stable interface to developers, a versioning methodology should be agreed upon and utilized. Developers are encouraged to learn from the versioning work currently being done on other projects, including Kuali Student and Kuali RICE. Also, IS&T must commit to supporting any released versions for the foreseeable future, as applications tend to remain active for many years once deployed.
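The methodology itself has not yet been chosen, so the following is only a sketch of one possible convention, assumed purely for illustration: a `major.minor` version number in which services sharing a major version remain compatible and minor versions only add functionality. The names `Version`, `parse_version` and `can_serve` are hypothetical and do not come from any existing IS&T or Kuali code.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int  # assumed to change only when the interface breaks compatibility
    minor: int  # assumed to change when functionality is added compatibly


def parse_version(text: str) -> Version:
    """Parse a 'major.minor' string such as '1.4' into a Version."""
    major, minor = text.split(".")
    return Version(int(major), int(minor))


def can_serve(service: Version, requested: Version) -> bool:
    """Return True if a client built against `requested` can use `service`.

    Under the assumed convention, the major versions must match and the
    service must be at least as new as the version the client was built for.
    """
    return service.major == requested.major and service.minor >= requested.minor
```

Under this assumed convention, an application written against version 1.2 would keep working as the service moves to 1.4, while a move to 2.0 would require the 1.x interface to stay deployed alongside it for as long as it is supported.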
- Statistics and Reporting
The current method of gathering statistics should be reviewed and updated to allow for better and more robust reporting. This will allow for a better understanding of how current services are used and offer guidance where future development work may be needed.
In order to respect blackout periods, updates for cluster machines should be mirrored locally, giving us control of when they are deployed.
- Cluster Installation and Update Process
There are existing methods for installing and updating cluster machines that could decrease deployment time while minimizing network impact. These options should be reviewed to determine which could be used to deploy Athena in computing clusters. Some options mentioned in the TAP discussion were Ghost and VMware.
- 3rd Party Applications
The Athena 10 team has not yet done due diligence to determine that Ubuntu will support all third-party applications that are needed by the current Athena user community.
- Metrics
Current metrics gathered and presented about the Athena environment (numbers of users per system, cluster usage, demographics, etc.) are insufficient for appropriate capacity planning and management of Athena clusters, as well as for understanding who uses Athena around the Institute. The Athena Project team should consider ways of improving these metrics and presenting them better.
- IS&T Standards
IS&T should develop a standard convention for the development of IS&T web services. Even with such a standard in place, services would still need to go through the TAP process before going into production.
3. Recommendation from the TAP Consultation
...