Previous issues and what we did to resolve them:
- Simplify and automate Thalia installation:
We have two scripts:
- Alfresco customization script (/zest/thalia/ime/Trunk/thalia-utils/scripts/setupalfresco/install.pl): creates the Thalia admin and guest users, changes the default admin password for security, stops Alfresco, copies the Thalia-specific configuration files, and restarts Alfresco.
- Thalia customization script (/zest/thalia/ime/Trunk/thalia-utils/scripts/setupthalia): creates the directories Thalia depends on and copies the Thalia-specific configuration files.
- The simplified installation procedure is documented at https://wikis-mit-edu.ezproxyberklee.flo.org/confluence/display/ZEST/Thalia+Setup+and+Configuration+%28Simplified%29.
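For readers who have not looked at the scripts, the two actions of the Thalia customization script (create the required directories, copy the configuration files) can be sketched roughly as follows. This is only an illustration in Python, not the actual setupthalia code; the function name `setup_thalia` and its parameters are hypothetical, and the real directory and file names are defined by the script itself.

```python
import shutil
from pathlib import Path


def setup_thalia(required_dirs, config_src, config_dest):
    """Create the directories Thalia needs, then copy config files into place.

    All arguments are hypothetical placeholders; the real script defines
    its own directory list and configuration locations.
    Returns the names of the files that were copied.
    """
    # Create each required directory; do nothing if it already exists.
    for directory in required_dirs:
        Path(directory).mkdir(parents=True, exist_ok=True)

    dest = Path(config_dest)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    # Copy only regular files from the top level of the source directory.
    for src in sorted(Path(config_src).iterdir()):
        if src.is_file():
            shutil.copy2(src, dest / src.name)
            copied.append(src.name)
    return copied
```

Because the directory creation uses `exist_ok=True` and the copy overwrites existing files, a sketch like this can be re-run safely on a server that is already partly set up.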
- Better monitoring:
In addition to Nagios monitoring, we currently have:
1. JMX remote is enabled on the IME/UI servers.
2. The shutdown script now includes a command that captures a thread dump. This lets us restart the servers quickly when there is a production problem while still retaining valuable debugging information for later diagnosis.
3. Hunter wrote a script that monitors disk usage on the Alfresco servers and sends out alerts when disk space is running low.
- Better reporting:
1. We have a script that analyzes the Thalia log files and reports the daily usage per domain. It runs on the IME/UI servers and shows how many users Thalia supports each day and how heavy the usage is.
2. We have a script that reports domain-specific statistics, such as the number of users, libraries, albums, and images currently in the domain and the domain's total disk usage. It helps us identify whether a domain is going over its space quota.
3. We also provide a servlet so that domain admins and domain users can get the statistics for their own domain.
- SASH Server migration:
Completed. All clusters have been migrated and are running without problems. As a result, Thalia developers can log on to the production servers as the log user to look at the log files.
- Simplify the domain provisioning process:
We have a new web application called builddomain (https://mv-ezproxy-com.ezproxyberklee.flo.org/builddomain). An authorized user of this web application can enter a domain name to build or rebuild that domain. We also modified our code to remove the URL rewrite rules from the Apache configuration file.
Here are the steps to build a domain:
1. Use builddomain to create the domain in Alfresco.
2. Wait one hour for the domain info to refresh.
3. A super user self-registers to obtain the first domain account.
4. Create the domain admin account.
5. Start using the domain at https://mv-ezproxy-com.ezproxyberklee.flo.org.
Since we have a web console for creating domains, ISDA OPS doesn't have to be involved any more. We will be able to provision domains ourselves.
Other issues:
- Server Disk Space: Steve and Hunter don't know where this issue stands. Originally we were told that the SAN would arrive in December and that we would have more storage then. However, the issue may have been dropped during the recent organizational change. We need to re-discuss this.
- Replicated MySQL server: OPS will no longer provide a replicated MySQL server for us, so we need to find our own solution if we want a replicated and redundant MySQL server. Since the Alfresco server is used much more heavily than the MySQL server in Thalia's case, I would rather invest in clustering Alfresco than MySQL.
- Start/stop script when server reboots: Hunter will take care of this soon.
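
While the disk space question is open, the kind of check described under Better monitoring (alert when disk space on the Alfresco servers runs low) can be sketched as below. This is a minimal Python illustration and not Hunter's actual script; the function name `low_disk_paths`, the 10% threshold, and the example path are assumptions, and sending the alert itself is left out.

```python
import shutil


def low_disk_paths(paths, min_free_fraction=0.10, usage=shutil.disk_usage):
    """Return (path, free_fraction) for every path below the free-space threshold.

    min_free_fraction is an assumed threshold (10% free), not a value taken
    from the real monitoring script. The usage parameter exists only so the
    check can be exercised without touching a real file system.
    """
    alerts = []
    for path in paths:
        stats = usage(path)  # named tuple with total, used, free (in bytes)
        free_fraction = stats.free / stats.total
        if free_fraction < min_free_fraction:
            alerts.append((path, free_fraction))
    return alerts


if __name__ == "__main__":
    # "/" is a placeholder; a real check would list the Alfresco data volumes.
    for path, fraction in low_disk_paths(["/"]):
        print(f"LOW DISK: {path} has {fraction:.1%} free")
```

Run periodically (for example from cron), a check like this would print nothing when every volume is above the threshold, so any output can be treated as the alert.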