Thalia Business Layer Design Specifications
v. 1.1
Domain
Thalia is a centralized image storage and management service hosted on a central server (or a cluster of servers). However, we would like to provide our users with their own unique virtual domains. Each domain will have its own URL, e.g. hst.thalia.mit.edu for HST users and ap.thalia.mit.edu for Architecture and Planning users. Domains are also completely independent of each other. Each domain will have its own users, its own public album, and its own user spaces.
Right now our DNS server points all *.thalia.mit.edu names to thalia.mit.edu. We use Apache's mod_rewrite module to rewrite the URLs and direct requests to the different domains. For example, if we configure two domains, HST and AP, requests to those two domains are forwarded to their URLs, and all other requests are rewritten to the main server, thalia.mit.edu. In that case an error is returned telling the user "Domain not recognized".
Alfresco has a store/workspace concept which can be mapped neatly to our domain concept. A repository consists of one or more workspaces, each of which contains its own tree of nodes. Alfresco supports multiple workspaces. Alfresco's default store/workspace is "SpacesStore". We will create stores for each of our domains.
Every servlet in our system first retrieves the server name from the request and parses it to get the domain name. It then checks whether the domain exists. If not, a new workspace is created for that domain. For example, if the user is accessing hst.thalia.mit.edu and workspace://HST doesn't exist yet, this workspace will be created automatically together with its basic structure.
Domain names are case-insensitive and are represented in all upper-case in our system. We don't have any domain-related APIs because we believe that the domain concept should be completely transparent to the users.
In the web.xml file, the parameter serverComponents is related to domain definition. It specifies how many parts a valid domain (server) URL consists of. The domain name is always the first part. If serverComponents is set to 4, then the URL thalia.mit.edu is an invalid domain URL. The URL hst.thalia.mit.edu is valid and hst is parsed as the domain name.
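The parsing rule above can be sketched as follows. This is a minimal illustration, not the actual Thalia code: the class DomainResolver and its method are hypothetical names, and the sketch assumes the server name has already been read from the request (for example via HttpServletRequest.getServerName()) and that serverComponents has been read from web.xml.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: derives the Thalia domain name from a request's server name. */
final class DomainResolver {

    private DomainResolver() {}

    /**
     * @param serverName       host name from the request, e.g. "hst.thalia.mit.edu"
     * @param serverComponents value of the serverComponents parameter in web.xml
     * @return the upper-cased domain name, or empty if the host is not a valid domain URL
     */
    static Optional<String> resolve(String serverName, int serverComponents) {
        if (serverName == null || serverName.isEmpty()) {
            return Optional.empty();
        }
        // The -1 limit keeps trailing empty parts, so "a.b.c." counts as four parts.
        String[] parts = serverName.split("\\.", -1);
        if (parts.length != serverComponents || parts[0].isEmpty()) {
            return Optional.empty();
        }
        // The domain name is always the first part and is stored in upper case.
        return Optional.of(parts[0].toUpperCase(Locale.ROOT));
    }
}
```

With serverComponents set to 4, resolve("hst.thalia.mit.edu", 4) yields HST, while resolve("thalia.mit.edu", 4) yields an empty result because the host has only three parts.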
The following is the basic domain structure. Everybody has READ permission for the domain root. users_home is where users' home spaces reside. If a new user jsmith is created, the system will create a jsmith-space under users_home and a jsmith's library under jsmith-space. A user's home space does not inherit permissions from its parent; its only permission is full control for that user. libraries_home is where the public library and team libraries reside. collections_home and slideshows_home are for collections and slideshows. Users in the domain have permission to add children in libraries_home, collections_home, and slideshows_home, so they can create new libraries, collections, and slideshows. However, the children do not inherit these permissions. The initial permission for a child is only full control for its creator. The creator can grant other permissions via sharing.
Users
We have three types of users: a super user, a domain admin user, and a domain user. A super user has super power across domains. He/She can do anything in any domain, including creating the domain admin users. A domain admin user has super power in a specific domain. He/She can do anything in that domain, including creating, modifying, and deleting domain users. A domain user is a regular user in the domain. He/She can write to the public library and has full control over his/her own user space, including the library and collections under it. A domain user can share his/her library and collections with other users. This is discussed in detail in the authorization section.
We identify a user using both the user's X509 certificate and the domain the user is accessing. If the user jsmith is accessing hst.thalia.mit.edu and ap.thalia.mit.edu, he is treated as two different users. We achieve this by putting the domain prefix in front of the user name, so jsmith is hst-jsmith in the HST domain and ap-jsmith in the AP domain. The two users have different rights and different spaces in those two domains and are treated as two completely different users. Again, this is transparent to the users.
We create our users directly in Alfresco and use Alfresco's user and person model to store our user attributes. However, we also use a mapping file to store whether a user is a super user or a domain admin. The mapping file is defined in web.xml as the parameter userConfigFile and is in XML format. It stores three pieces of information for each user in the system: name, domain, and isAdmin. name is the user name without the domain prefix, and this is always the name the user will see. domain is the name of the domain. In addition to the normal domains, we have a special domain called ADMIN. If a user is in the ADMIN domain, the user is a super user. isAdmin indicates whether the user is a domain admin. Here is a sample mapping file:
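A minimal sketch of this naming scheme is shown below. The class UserNames and its methods are hypothetical names chosen for illustration. The sketch assumes the prefix is the lower-cased domain name followed by a hyphen, as in the hst-jsmith example, and that user names are stored in lower case.

```java
import java.util.Locale;

/** Hypothetical helper for converting between visible and domain-qualified user names. */
final class UserNames {

    private UserNames() {}

    /** Builds the internal name, e.g. ("HST", "JSmith") gives "hst-jsmith". */
    static String qualify(String domain, String userName) {
        return domain.toLowerCase(Locale.ROOT) + "-" + userName.toLowerCase(Locale.ROOT);
    }

    /** Strips the domain prefix so that the user only ever sees the bare name. */
    static String display(String domain, String qualifiedName) {
        String prefix = domain.toLowerCase(Locale.ROOT) + "-";
        String name = qualifiedName.toLowerCase(Locale.ROOT);
        // Names without the expected prefix are returned unchanged (lower-cased).
        return name.startsWith(prefix) ? name.substring(prefix.length()) : name;
    }
}
```

Because the prefix is derived from the domain of the current request, the same certificate holder maps to two unrelated accounts in two domains without ever seeing the prefixed names.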
<user-list>
<user><name>dtanner</name>
<domain>HST</domain>
<isAdmin>false</isAdmin>
</user>
<user><name>dongq</name>
<domain>HST</domain>
<isAdmin>true</isAdmin>
</user>
<user><name>dongq</name>
<domain>AP</domain>
<isAdmin>true</isAdmin>
</user>
<user><name>dongq</name>
<domain>ADMIN</domain>
<isAdmin>true</isAdmin>
</user>
</user-list>
This mapping file shows that user dongq is a super user, a domain admin in HST, and a domain admin in AP. Note that those three dongq entries are different. The super user dongq is a pseudo user and is not actually a valid user in Alfresco. dongq in HST is hst-dongq and dongq in AP is ap-dongq, and they are treated as two completely different users in the system. User dtanner is just a regular domain user in HST. Thalia gets the admin status of its users from this mapping file and all other user attributes from Alfresco.
At the very beginning, the mapping file needs to be created with the super users' information. From that point on, the file is maintained by the business layer as users are added, deleted, and modified. When the super user logs into a domain and is not already a domain admin in that domain, he/she is created as a domain admin in that domain. The super user can then create other domain admins and domain users.
When a user accesses a domain URL, if no certificate is presented or if the user is not a valid domain user, the user is treated as a guest.
When a new user is created in the domain, we first create the user's home space in the user-spaces folder, together with the user's default library, before creating the user in the system. The new user has admin control of his/her home space and of the personal library. The new user also has write access to the public library.
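The lookup described above might be sketched as follows, using the JDK's built-in DOM parser. The class UserRoles, the Role enum, and the rule that an unknown user is reported as a guest are assumptions made for illustration; the real business layer may resolve roles differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: derives a user's role in a domain from the mapping file. */
final class UserRoles {

    enum Role { SUPER_USER, DOMAIN_ADMIN, DOMAIN_USER, GUEST }

    private UserRoles() {}

    /** @param name user name without the domain prefix; @param domain e.g. "HST" */
    static Role roleOf(String mappingXml, String name, String domain) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(mappingXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid mapping file", e);
        }
        Role role = Role.GUEST; // assumption: users missing from the file are guests
        NodeList users = doc.getElementsByTagName("user");
        for (int i = 0; i < users.getLength(); i++) {
            Element user = (Element) users.item(i);
            if (!name.equalsIgnoreCase(text(user, "name"))) {
                continue;
            }
            String userDomain = text(user, "domain");
            if ("ADMIN".equalsIgnoreCase(userDomain)) {
                return Role.SUPER_USER; // an entry in the ADMIN domain marks a super user
            }
            if (domain.equalsIgnoreCase(userDomain)) {
                role = Boolean.parseBoolean(text(user, "isAdmin"))
                        ? Role.DOMAIN_ADMIN : Role.DOMAIN_USER;
            }
        }
        return role;
    }

    /** Returns the trimmed text of the first child element with the given tag, or "". */
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

Applied to the sample file, dtanner resolves to a regular domain user in HST and to a guest in AP, while dongq resolves to a super user in every domain because of the ADMIN entry.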
User names are case-insensitive and are represented in the system as all lower-case letters.
Domain users have to be managed through the domain URL. The domain tag in the XML file is ignored when registering or updating a user. The domain info is always derived from the URL.
In the web.xml file, there are three user-related parameters: alfrescoAdmin, alfrescoAdminPwd, and alfrescoUserPwd. alfrescoAdmin and alfrescoAdminPwd are the admin user name and password for Alfresco. If a super user or domain admin is trying to do administrative tasks in Alfresco, like creating and deleting users, this user name and password are used to proxy for the actual user. alfrescoUserPwd is the secret password for all users. All users are created in Alfresco with this password. We use the actual user name and this password to log into Alfresco; Alfresco then takes care of all the permissions, so we don't have to implement our own authorization. Please be careful when you change the alfrescoUserPwd value: all the existing Thalia users in Alfresco will need to have their passwords updated as well.
Alfresco's web service API does not support guest login, but Thalia needs to support guest users. A user is therefore created in Alfresco manually (via the Alfresco web client), and since that user does not have any permission in any of the domains, we use that user as our guest user. The user name and password are stored in web.xml as alfrescoGuest and alfrescoGuestPwd.
Users have the following fields: name, isAdmin, firstname, lastname, emailaddress, and organization (defaults to MIT). The XML text for a full user object is:
<user>
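The choice among the three sets of credentials described above can be sketched as follows. The class AlfrescoCredentials and its choose method are hypothetical. The sketch assumes the web.xml parameters have been loaded into a map keyed by parameter name, and that the caller has already verified that the user is allowed to perform the administrative task.

```java
import java.util.Map;

/** Hypothetical holder for the user name and password used to log into Alfresco. */
final class AlfrescoCredentials {

    final String user;
    final String password;

    private AlfrescoCredentials(String user, String password) {
        this.user = user;
        this.password = password;
    }

    /**
     * @param config        web.xml parameters (alfrescoAdmin, alfrescoAdminPwd, ...)
     * @param qualifiedUser domain-prefixed user name such as "hst-jsmith", or null for a guest
     * @param adminTask     true when a super user or domain admin performs an administrative task
     */
    static AlfrescoCredentials choose(Map<String, String> config,
                                      String qualifiedUser, boolean adminTask) {
        if (qualifiedUser == null) {
            // No certificate or unknown user: fall back to the manually created guest account.
            return new AlfrescoCredentials(
                    config.get("alfrescoGuest"), config.get("alfrescoGuestPwd"));
        }
        if (adminTask) {
            // Administrative tasks are proxied through the Alfresco admin account.
            return new AlfrescoCredentials(
                    config.get("alfrescoAdmin"), config.get("alfrescoAdminPwd"));
        }
        // Regular requests log in as the actual user with the shared secret password.
        return new AlfrescoCredentials(qualifiedUser, config.get("alfrescoUserPwd"));
    }
}
```

Logging in as the actual user for regular requests is what lets Alfresco enforce permissions on Thalia's behalf.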
<name>jsmith</name>
<firstname>John</firstname>
<lastname>Smith</lastname>
<emailaddress>jsmith@mit.edu</emailaddress>
<organization>MIT</organization>
<isAdmin>false</isAdmin>
</user>
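As an illustration of how such a user document could be read, the sketch below parses it into a field map with the JDK's DOM parser. The class UserXml is hypothetical. Defaulting isAdmin to false when the tag is absent is an assumption; the MIT default, the ignored domain tag, and the lower-cased name follow the rules stated in this section.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads a <user> document into a map of field names to values. */
final class UserXml {

    private UserXml() {}

    static Map<String, String> parse(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("organization", "MIT"); // documented default
        fields.put("isAdmin", "false");    // assumption: absent means not an admin
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid user XML", e);
        }
        // The domain is always derived from the URL, so a domain tag is ignored.
        fields.remove("domain");
        // User names are case-insensitive and stored in lower case.
        fields.computeIfPresent("name", (key, value) -> value.toLowerCase(Locale.ROOT));
        return fields;
    }
}
```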
There are two sets of APIs for the user objects. One is users, whose GET method retrieves all users in the domain. The other is user: its GET method retrieves a user, its POST method creates a user, its PUT method updates a user, and its DELETE method deletes a user. When a user is deleted, his/her home space remains intact. The next time the same user is created, he/she will have the same home space.
Libraries
Thalia uses its own custom model for libraries, collections, slideshows, and items. The custom model is defined in thaliaModel.xml and is referenced by thalia-model-context.xml. Don't forget to drop those two files in Alfresco's extension directory.
Libraries are repositories for images. There are three types of libraries: public library, personal library, and team libraries.
Every domain has a public library under libraries_home. It is created automatically when the domain is first established. Everybody has read permission to this library, and domain users have write permission to it. The public library cannot be deleted, and its properties cannot be modified.
A personal library resides in the user's home space and is automatically created when a new user is registered. Initially, only that user has full control of his/her personal library. The user can give other users read, write, or admin rights to it. A personal library cannot be deleted. Even when a user is deleted, his/her personal space remains.
Libraries have two fields, title and description, along with some auditing fields: createdBy, createdDate, lastModifiedDate, modifiedBy. The XML text for a library looks like:
<library>
<id>fe215db6-ec5b-11db-b809-d59de43b7676</id>
<title>dongq's Library</title>
<description>dongq's Personal Library</description>
<createdBy>dongq</createdBy>
<createDate>2007-04-16T16:49:38.765-04:00</createDate>
<modifiedBy>dongq</modifiedBy>
<modifiedDate>2007-04-16T16:49:38.765-04:00</modifiedDate>
</library>
There are two sets of APIs for the library objects. One is libraries, whose GET method retrieves all libraries in the domain that the current user has rights to view. If you append "all" to the URL, it also retrieves all the items in those libraries. The other is library: its GET method retrieves a library, its POST method creates a library, its PUT method updates a library, and its DELETE method deletes a library.
Items
Items are the contents of libraries. Items don't have their own permissions. They always inherit permissions from their parent library.
Items contain Dublin Core metadata: title, description, creator, date, format, language, rights, publisher, contributor, source, coverage, type. If the item is an image file, we also create large, medium, and thumbnail JPEG images based on the master image, so the item also contains info about those three images. It also has some auditing fields: createdBy, createdDate, lastModifiedDate, modifiedBy. The XML text for a full item looks like:
<item>
<libraryid>8bd219f0-f271-11db-b5bc-05f275694451</libraryid>
<id>fddb7233-ec8e-11db-b809-d59de43b7676</id>
<title>flowers</title>
<description>Spring flowers</description>
<createdBy>dongq</createdBy>
<createDate>2007-04-16T22:54:42.640-04:00</createDate>
<modifiedBy>dongq</modifiedBy>
<modifiedDate>2007-04-16T22:54:44.093-04:00</modifiedDate>
<contributor>Jane Smith</contributor>
<creator>John Smith</creator>
<date>3-3-2006</date>
<format>jpeg</format>
<language>English</language>
<publisher>none</publisher>
<rights>unspecified</rights>
<source>personal album</source>
<type>none</type>
<thumbnail>http://localhost:8180/alfresco/guestDownload/direct/workspace/AP/fea61542-ec8e-11db-b809-d59de43b7676/thumbnail</thumbnail>
<medium>http://localhost:8180/alfresco/guestDownload/direct/workspace/AP/fead4136-ec8e-11db-b809-d59de43b7676/medium</medium>
<large>http://localhost:8180/alfresco/guestDownload/direct/workspace/AP/feb46d29-ec8e-11db-b809-d59de43b7676/large</large>
</item>
There are two sets of APIs for the item objects. One is items, whose GET method retrieves all items in the given library. The other is item: its GET method retrieves an item, its POST method uploads and creates an item, its PUT method updates an item, and its DELETE method deletes an item.
Authorizations
The following are Alfresco permissions:
ReadProperties
ReadChildren
WriteProperties
ReadContent
WriteContent
ExecuteContent
DeleteNode
DeleteChildren
CreateChildren
LinkChildren
DeleteAssociations
ReadAssociations
CreateAssociations
ReadPermissions
ChangePermissions
It also has permission groups:
Read: includes ReadProperties, ReadChildren, ReadContent
Write: includes WriteProperties, WriteContent
AddChildren: includes CreateChildren and LinkChildren
Delete: includes DeleteNode and DeleteChildren
FullControl:
Mapping between Thalia permissions and Alfresco permissions:
read --- ReadProperties, ReadChildren
download --- Read
write --- AddChildren + Delete + Write + Read
admin --- FullControl
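The mapping above can be written down as a simple lookup table. This is only a sketch: the class ThaliaPermissions is a hypothetical name, and the actual mapping lives in the modified permissionsDefinitions.xml rather than in Java code.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical lookup table from Thalia permissions to Alfresco permissions and groups. */
final class ThaliaPermissions {

    private ThaliaPermissions() {}

    static final Map<String, List<String>> TO_ALFRESCO = Map.of(
            "read", List.of("ReadProperties", "ReadChildren"),
            "download", List.of("Read"),
            "write", List.of("AddChildren", "Delete", "Write", "Read"),
            "admin", List.of("FullControl"));

    /** Returns the Alfresco permissions granted by a Thalia permission such as "write". */
    static List<String> alfrescoPermissions(String thaliaPermission) {
        List<String> permissions =
                TO_ALFRESCO.get(thaliaPermission.toLowerCase(Locale.ROOT));
        if (permissions == null) {
            throw new IllegalArgumentException(
                    "Unknown Thalia permission: " + thaliaPermission);
        }
        return permissions;
    }
}
```

Note that read is narrower than Alfresco's Read group: it omits ReadContent, which is what separates viewing metadata from downloading.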
Here are the permissions set by the Thalia system:
At the domain root level, we grant read permission to all.
libraries_home, collections_home, and slideshows_home grant AddChildren to all the users so that users can create things under them.
However, when a library, collection, or slideshow is created, it doesn't inherit permission from its parent folder. It only grants ALL permission to the user who creates it. Even the public library and personal libraries don't inherit permission from their parent folders, so querying permissions will be less complicated.
Thalia modifies Alfresco's permissionsDefinitions.xml file to add Thalia-related permissions. Don't forget to replace Alfresco's permissionsDefinitions.xml with our own.
The XML text for an authorization is:
<authz user="dongq" role="admin" qualifier="8bd219f0-f271-11db-b5bc-05f275694451" />
Categories
Thalia gives users the ability to tag images. This is done by applying categories to images. Alfresco has support for classifications and we are just using its existing model. In Alfresco, categories live under the node /cm:categoryRoot/cm:generalclassifiable and can have multiple levels of subcategories. The backend provides the capabilities of creating multiple levels of subcategories and applying categories to the items at any level. However, the front end probably locks it down to two levels following the old category and tag tradition and only tags can be applied to categories.
The web service API 2.0 supports retrieving and applying categories. However, it doesn't support creating and deleting categories. In order to create and delete categories, the backend manipulates the nodes under /cm:categoryRoot/cm:generalclassifiable directly. Future releases of the web service API might support category creation and deletion, and we would then need to modify the code to use the new API.
Only domain admins can create and delete categories. Any user, including guests, can retrieve categories and subcategories. Since classification is stored in the item, only users who have write access to the parent library can tag and untag items. When searching by categories, the user will only see information that he or she has rights to see.
The XML text for a category is:
<category id="b73854cc-f271-11db-b5bc-05f275694451" name="animal" />
Things to do when setting up a new cluster:
A. Install ImageMagick on the server that will be running the Thalia UI and IME.
B. Put thaliausers.xml in /home/thalia/conf. This file should contain info about system super users.
C. Customize the Alfresco server:
1. Stop the Alfresco server.
2. Put our custom model (thaliaModel.xml and thalia-model-context.xml) in the extension folder.
3. Replace Alfresco's permissionsDefinitions.xml with our own.
4. Add the admin account specified by alfrescoAdmin in web.xml to Alfresco's authority-services-context.xml file.
5. Start the Alfresco server.
6. Create the user specified by alfrescoAdmin and alfrescoAdminPwd in web.xml. This is our admin user.
7. Create the user specified by alfrescoGuest and alfrescoGuestPwd in web.xml. This is our guest user (because the web service API doesn't support guest login).
D. Upload the Thalia war file onto the UI and IME server.