Thalia Business Layer Design Specifications
v. 1.1
 Domain

Thalia is a centralized image storage and management service hosted on a central server (or a cluster of servers). However, we would like to give our users their own unique virtual domains. Each domain will have its own URL, e.g. hst.thalia.mit.edu for HST users and ap.thalia.mit.edu for Architecture and Planning users. Domains are completely independent of each other: each domain has its own users, its own public library, and its own user spaces.  
Right now, our DNS server points all *.thalia.mit.edu names to thalia.mit.edu. We use Apache's mod_rewrite module to rewrite the URLs and direct each request to the right domain. For example, if we configure two domains, HST and AP, requests to those two domains will be forwarded to their URLs, and all other requests will be rewritten to the main server thalia.mit.edu, which returns an error telling the user "Domain not recognized".  
Alfresco has a store/workspace concept that maps neatly to our domain concept. A repository consists of one or more workspaces, each of which contains its own tree of nodes. Alfresco supports multiple workspaces, and its default store/workspace is "SpacesStore". We will create a store for each of our domains. 
We have a utility called BuildThaliaDomain.jar that builds new domains. Before starting the thalia-ime, run "java -classpath ./lib -jar BuildThaliaDomain.jar [DOMAIN NAME]" to build each new domain. Make sure the Alfresco repository is pointing at the correct server.  
Every servlet in our system first retrieves the server name from the request and parses it to get the domain name. It then checks whether the domain exists. If not, an error is returned.  
Domain names are case-insensitive and are represented in all upper case in our system. We don't have any domain-related APIs because we believe that the domain concept should be completely transparent to the users.  
In the web.xml file, the parameter serverComponents is related to the domain definition. It specifies how many parts a valid domain (server) URL consists of. The domain name is always the first part. If serverComponents is set to 4, then the URL thalia.mit.edu is an invalid domain URL, while the URL hst.thalia.mit.edu is valid and hst is parsed as the domain name.  
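The parsing rule above can be illustrated with a minimal Python sketch. The function name parse_domain is hypothetical, and the sketch assumes that a host is valid only when it has exactly serverComponents dot-separated parts; the actual servlet code may apply a different check.

```python
from typing import Optional


def parse_domain(server_name: str, server_components: int) -> Optional[str]:
    """Return the upper-case domain name, or None for an invalid domain URL."""
    # Split the host name into its dot-separated parts.
    parts = [p for p in server_name.strip().lower().split(".") if p]
    # Assumption: valid hosts have exactly `server_components` parts.
    if len(parts) != server_components:
        return None
    # The domain name is always the first part, stored in upper case.
    return parts[0].upper()


hst = parse_domain("hst.thalia.mit.edu", 4)   # valid: four parts
bare = parse_domain("thalia.mit.edu", 4)      # invalid: only three parts
```

A caller would treat a None result as "Domain not recognized" and still has to check that the returned name matches an existing domain.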
The following is the basic domain structure. Everybody has READ permission for the domain root. users_home is where users' home spaces reside. If a new user jsmith is created, the system will create a jsmith-space under users-home and a jsmith's library under jsmith-space. A user's home space does not inherit permissions from its parent; its only permission is all control for that user. libraries_home is where the public library and team libraries reside. collections_home and slideshows_home are for collections and slideshows. Users in the domain have permission to add children in libraries_home, collections_home, and slideshows_home, so they can create new libraries, collections, and slideshows. However, the children do not inherit these permissions. The initial permission for a child is only all control for the creator. The creator can grant other permissions via sharing.

Users

We have three types of users: a super user, a domain admin user, and a domain user. A super user has super power across domains. He/she can do anything in any domain, including creating domain admin users. A domain admin user has super power in a specific domain. He/she can do anything in that domain, including creating, modifying, and deleting domain users. A domain admin needs to log in specifically as a domain admin (via a button on the UI) to obtain full domain admin privileges. Otherwise, he or she can only do things he or she has explicit rights to do. For example, suppose user A is a domain admin but doesn't have explicit rights to see library1. When user A logs in normally, he will not see library1. However, if he logs in specifically as a domain admin, he can see all libraries in the system, including library1. A domain user is a regular user in the domain. He/she can write to the public library and has all control over his/her own user space, including his/her personal library. He/she can also create libraries, collections, and slideshows, and has all control over the objects he/she creates. A domain user can share his/her libraries and collections with other users. This is discussed in detail in the authorization section.  
We identify a user using both the user's X509 certificate and the domain the user is accessing. If the user jsmith is accessing hst.thalia.mit.edu and ap.thalia.mit.edu, he is treated as two different users. We achieve this by prepending the domain prefix to the user name, so jsmith is hst-jsmith in the HST domain and ap-jsmith in the AP domain. The two users have different rights and different spaces in those two domains and are treated as two completely different users. Again, this is transparent to the users.  
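As a rough illustration of the prefixing scheme, the following Python sketch builds and strips the domain prefix. Both function names are hypothetical; the lower-casing follows the hst-jsmith example and the rule that user names are stored in lower case.

```python
def qualified_user_name(domain: str, name: str) -> str:
    """Build the internal user name, e.g. HST + jsmith -> hst-jsmith."""
    return f"{domain.lower()}-{name.lower()}"


def display_name(domain: str, qualified: str) -> str:
    """Strip the domain prefix so users only ever see their plain name."""
    prefix = domain.lower() + "-"
    if qualified.startswith(prefix):
        return qualified[len(prefix):]
    # Names without the expected prefix are returned unchanged.
    return qualified


hst_user = qualified_user_name("HST", "jsmith")
ap_user = qualified_user_name("AP", "jsmith")
```

Because the two results differ, the same certificate holder ends up with two unrelated accounts, one per domain.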
We create our users directly in Alfresco and use Alfresco's user and person model to store our user attributes. We also modified Alfresco's contentModel.xml to extend the person model to store the isAdmin attribute. (Don't forget to replace the contentModel.xml file with our own.) isAdmin is the attribute that indicates whether somebody is a domain admin. We also use a mapping file to store the super users. The mapping file is defined in web.xml as the parameter userConfigFile and is in XML format. It stores the super admins' names. name is the user name without the domain prefix, and this is always the name the user will see:

<user-list>
<user><name>dongq</name>
<domain>ADMIN</domain>
<isAdmin>true</isAdmin>
</user>
</user-list>
 
This file shows that user dongq is a super user. Every time we need to change the super user info, we need to restart the server, because this info is read into memory at startup and never changes during the lifetime of the service.  
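A minimal Python sketch of reading the mapping file above once at startup is shown here. The function names and the choice to treat every listed name as a super user are assumptions made for illustration; the real business layer may also inspect the domain and isAdmin elements.

```python
import xml.etree.ElementTree as ET

# Sample content copied from the mapping file shown above.
SAMPLE = """<user-list>
<user><name>dongq</name>
<domain>ADMIN</domain>
<isAdmin>true</isAdmin>
</user>
</user-list>"""


def load_super_users(xml_text: str) -> frozenset:
    """Parse the user list and return the super user names in lower case."""
    root = ET.fromstring(xml_text)
    names = set()
    for user in root.findall("user"):
        name = (user.findtext("name") or "").strip().lower()
        if name:
            names.add(name)
    # A frozenset mirrors the read-once, never-changes behaviour.
    return frozenset(names)


SUPER_USERS = load_super_users(SAMPLE)


def is_super_user(name: str) -> bool:
    # Lookup uses the name without the domain prefix, case-insensitively.
    return name.lower() in SUPER_USERS
```

Loading the set once into an immutable structure matches the restart requirement: changing the file has no effect until the process reads it again.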
When a domain is first created, the super user logs into the new domain. The business layer, seeing that the user is a super user, automatically creates him/her as a domain admin in that domain. The super user can then create other domain admins. Domain admins can create other domain admins and users.  
When a user accesses a domain URL, if no certificate is presented or if the user is not a valid domain user, the user is treated as a guest.  
When a new user is created in the domain, we first create the user's home space in the user-spaces folder, together with the user's default library, before creating the user in the system. The new user has admin control of his/her home space and admin control of the personal library. The new user also has write control of the public library.  
User names are case-insensitive and are represented in the system as all lower-case letters.  
Domain users have to be managed through the domain URL. The domain tag in the XML file is ignored when registering or updating a user. The domain info is always derived from the URL.  
In the web.xml file, there are a few user-related parameters: alfrescoAdmin, alfrescoAdminPwd, and alfrescoUserPwd. alfrescoAdmin and alfrescoAdminPwd are the admin user and password for Alfresco. If a super user or domain admin performs administrative tasks in Alfresco, such as creating and deleting users, this user name and password are used to proxy for the actual user. alfrescoUserPwd is the secret password for all users. All users are created in Alfresco with this password. We use the actual user name and this password to log into Alfresco, so Alfresco takes care of all the permissions and we don't have to implement our own authorization. Please be careful when you change the alfrescoUserPwd value: all existing Thalia users in Alfresco will need to have their passwords updated as well.  
Also, Alfresco's web service API does not support guest login, and Thalia needs to support guest users. A user is created in Alfresco manually (via the Alfresco web client), and since that user does not have any permission in any of the domains, we use that user as our guest user. The user name and password are stored in web.xml as alfrescoGuest and alfrescoGuestPwd.
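The choice of Alfresco credentials described above can be sketched in Python as follows. The function name, its arguments, and the placeholder values in CONFIG are all hypothetical; only the parameter names come from web.xml, and the caller is assumed to have already verified that an administrative task is requested by a super user or domain admin.

```python
from typing import Dict, Optional, Tuple

# Placeholder values for illustration; the real ones live in web.xml.
CONFIG = {
    "alfrescoAdmin": "admin",
    "alfrescoAdminPwd": "admin-secret",
    "alfrescoUserPwd": "user-secret",
    "alfrescoGuest": "thalia-guest",
    "alfrescoGuestPwd": "guest-secret",
}


def alfresco_login(config: Dict[str, str], domain: str,
                   user_name: Optional[str],
                   admin_task: bool = False) -> Tuple[str, str]:
    """Pick the Alfresco user name and password for one request."""
    if user_name is None:
        # No certificate or not a valid domain user: use the guest account.
        return config["alfrescoGuest"], config["alfrescoGuestPwd"]
    if admin_task:
        # Administrative tasks are proxied through the Alfresco admin.
        return config["alfrescoAdmin"], config["alfrescoAdminPwd"]
    # Regular requests use the domain-prefixed name and the shared password.
    return f"{domain.lower()}-{user_name.lower()}", config["alfrescoUserPwd"]
```

Keeping the decision in one place makes it easier to see why changing alfrescoUserPwd affects every existing user at once.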

Users have the following fields: name, isAdmin, firstname, lastname, emailaddress, and organization (defaults to MIT). The XML text for a full user object is:

<user>
  <name>jsmith</name>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
  <emailaddress>jsmith@mit.edu</emailaddress>
  <organization>MIT</organization>
  <isAdmin>false</isAdmin>
</user>
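To show how the user XML and its defaults might be handled, here is a small Python sketch. The User class and parse_user function are hypothetical names, and treating a missing isAdmin element as false is an assumption; the organization default of MIT and the ignored domain tag follow the rules described in this section.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class User:
    name: str
    firstname: str = ""
    lastname: str = ""
    emailaddress: str = ""
    organization: str = "MIT"   # documented default
    is_admin: bool = False      # assumption: absent means false


def parse_user(xml_text: str) -> User:
    """Parse a <user> element; any <domain> child is simply ignored."""
    root = ET.fromstring(xml_text)

    def text(tag: str, default: str = "") -> str:
        value = (root.findtext(tag) or "").strip()
        return value if value else default

    return User(
        name=text("name").lower(),  # user names are stored in lower case
        firstname=text("firstname"),
        lastname=text("lastname"),
        emailaddress=text("emailaddress"),
        organization=text("organization", "MIT"),
        is_admin=text("isAdmin", "false").lower() == "true",
    )


minimal = parse_user("<user><name>JSmith</name><domain>AP</domain></user>")
```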
 
There are two sets of APIs for the user objects. One is users, whose GET method retrieves all users in the domain. The other is user: its GET method retrieves a user, its POST method creates a user, its PUT method updates a user, and its DELETE method deletes a user. When a user is deleted, his/her home space remains intact. The next time the same user is created, he/she will have the same home space.

Libraries

Thalia uses its own custom model for libraries, collections, slideshows, and items. The custom model is defined in thaliaModel.xml and is referenced by thalia-model-context.xml. Don't forget to drop those two files in Alfresco's extension directory.  
Libraries are repositories for images. There are three types of libraries: public library, personal library, and team libraries.  
Every domain has a public library under libraries_home. It is created automatically when the domain is first established. Everybody has read permission to this library, and domain users have write permission to it. The public library cannot be deleted, and its properties cannot be modified.  
A personal library resides in the user's home space and is created automatically when a new user is registered. Initially, only that user has full control of his/her personal library. The user can give other users read, download, write, or admin rights to it. A personal library cannot be deleted. Even when a user is deleted, his/her personal space remains.  
Libraries have two fields, title and description, along with some auditing fields: createdBy, createdDate, lastModifiedDate, modifiedBy. The XML text for a library looks like:

<library>
  <id>fe215db6-ec5b-11db-b809-d59de43b7676</id>
  <title>dongq's Library</title>
  <description>dongq's Personal Library</description>
  <createdBy>dongq</createdBy>
  <createDate>2007-04-16T16:49:38.765-04:00</createDate>
  <modifiedBy>dongq</modifiedBy>
  <modifiedDate>2007-04-16T16:49:38.765-04:00</modifiedDate>
</library>
 
There are two sets of APIs for the library objects. One is libraries, whose GET method retrieves all libraries in the domain that the current user has rights to view. If you append "all" to the URL, it also retrieves all the items in those libraries. The other is library: its GET method retrieves a library, its POST method creates a library, its PUT method updates a library, and its DELETE method deletes a library.

Items

Items are the contents of libraries. Items don't have their own permissions; they always inherit permissions from their parent library.  
Items contain Dublin Core metadata: title, identifier, description, creator, date, format, language, rights, publisher, contributor, source, coverage, relation, subject, type. If the item is an image file, we also create large, medium, and thumbnail JPEG images based on the master image, so the item also contains information about those three images. It also has some auditing fields: createdBy, createdDate, lastModifiedDate, modifiedBy. The XML text for a full item looks like:

<item>
  <libraryid>8bd219f0-f271-11db-b5bc-05f275694451</libraryid>
  <id>fddb7233-ec8e-11db-b809-d59de43b7676</id>
  <title>flowers</title>
  <description>Spring flowers</description>
  <createdBy>dongq</createdBy>
  <createDate>2007-04-16T22:54:42.640-04:00</createDate>
  <modifiedBy>dongq</modifiedBy>
  <modifiedDate>2007-04-16T22:54:44.093-04:00</modifiedDate>
  <contributor>Jane Smith</contributor>
  <creator>John Smith</creator>
  <date>3-3-2006</date>
  <format>jpeg</format>
  <language>English</language>
  <publisher>none</publisher>
  <rights>unspecified</rights>
  <source>personal album</source>
  <type>none</type>
  <thumbnail>http://localhost:8180/alfresco/guestDownload/direct/workspace/AP/fea61542-ec8e-11db-b809-d59de43b7676/thumbnail</thumbnail>
  <medium>http://localhost:8180/alfresco/guestDownload/direct/workspace/AP/fead4136-ec8e-11db-b809-d59de43b7676/medium</medium>
  <large>http://localhost:8180/alfresco/guestDownload/direct/workspace/AP/feb46d29-ec8e-11db-b809-d59de43b7676/large</large>
</item>
 
To enhance performance, we cache the ids of thumbnail, medium and large child nodes as properties on the item node. We also cache the mimetype and size information of the master image.  
There are two sets of APIs for the item objects. One is items, whose GET method retrieves all items in the given library. The other is item: its GET method retrieves an item, its POST method uploads and creates an item, its PUT method updates an item, and its DELETE method deletes an item.

Authorizations

The following are Alfresco permissions:

ReadProperties
ReadChildren
WriteProperties
ReadContent
WriteContent
ExecuteContent
DeleteNode
DeleteChildren
CreateChildren
LinkChildren
DeleteAssociations
ReadAssociations
CreateAssociations
ReadPermissions
ChangePermissions
 
It also has permission groups:

Read: includes ReadProperties, ReadChildren, ReadContent
Write: includes WriteProperties, WriteContent
AddChildren: includes CreateChildren and LinkChildren
Delete: includes DeleteNode and DeleteChildren
FullControl:

Mapping between Thalia permissions and Alfresco permissions:

read: ReadProperties + ReadChildren + ReadContent
download: ReadProperties + ReadChildren + ReadContent + ExecuteContent
write: AddChildren + Delete + Write + Read + ExecuteContent
admin: FullControl
This model has an "ugly" workaround for an Alfresco bug. The read permission should not contain ReadContent: it should only give people the right to see the item and read its metadata, not the right to view the master image. However, Alfresco will not be able to search on the item if we don't set the ReadContent bit. So, as far as Alfresco is concerned, the read permission grants the same access as the download permission. To distinguish between the two, we add an otherwise unused ExecuteContent to download and enforce the download permission at the IME level instead of letting Alfresco handle it.  
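The mapping and the workaround can be made concrete with a short Python sketch. The dictionaries and the expand function are hypothetical names; the group contents and role definitions are taken from the lists above, and FullControl is left unexpanded because its members are not listed in this document.

```python
# Alfresco permission groups as listed in this section.
GROUPS = {
    "Read": {"ReadProperties", "ReadChildren", "ReadContent"},
    "Write": {"WriteProperties", "WriteContent"},
    "AddChildren": {"CreateChildren", "LinkChildren"},
    "Delete": {"DeleteNode", "DeleteChildren"},
}

# Thalia roles expressed in terms of groups and single permissions.
ROLES = {
    "read": ["Read"],
    "download": ["Read", "ExecuteContent"],
    "write": ["AddChildren", "Delete", "Write", "Read", "ExecuteContent"],
    "admin": ["FullControl"],
}


def expand(role: str) -> frozenset:
    """Expand a Thalia role into individual Alfresco permissions."""
    result = set()
    for entry in ROLES[role]:
        # Known groups are expanded; anything else is kept as-is.
        result |= GROUPS.get(entry, {entry})
    return frozenset(result)


def may_download(role: str) -> bool:
    # The IME-level check: ExecuteContent is the marker for download.
    # Assumption: admin (FullControl) implies download as well.
    return role == "admin" or "ExecuteContent" in expand(role)


marker = expand("download") - expand("read")
```

The set difference between download and read contains only ExecuteContent, which is why the download check has to live in the IME rather than in Alfresco.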
Here are the permissions set by the thalia system:

At the domain root level, we grant read permission to all.

Libraries-home, collections-home, and slideshow-home grant AddChildren to all users so that they can create things under them.

However, when a library, collection, or slideshow is created, it doesn't inherit permissions from its parent folder. It only grants ALL permission to the user who creates it. Even the public library and personal libraries don't inherit permissions from their parent folders, which makes querying permissions less complicated.  
Thalia modifies Alfresco's permissionsDefinitions.xml file to add Thalia-related permissions. Don't forget to replace Alfresco's permissionsDefinitions.xml with our own.  
The xml text for an authorization is:

<authz user="dongq" role="admin" qualifier="8bd219f0-f271-11db-b5bc-05f275694451" />

Categories

Thalia gives users the ability to tag images by applying categories to them. Alfresco has support for classifications, and we are simply using its existing model. In Alfresco, categories live under the node /cm:categoryRoot/cm:generalclassifiable and can have multiple levels of subcategories. The backend can create multiple levels of subcategories and apply categories at any level to items. However, the front end probably locks this down to two levels, following the old category and tag tradition, and only tags can be applied to categories.  
The web service API 2.0 supports retrieving and applying categories, but it does not support creating and deleting them. To create and delete categories, the backend manipulates the nodes under /cm:categoryRoot/cm:generalclassifiable directly. Future releases of the web service API might support category creation and deletion, in which case we will need to modify the code to use the new API.  
Only domain admins can create and delete categories. Any user, including guests, can retrieve categories and subcategories. Since the classification is stored in the item, only users who have write access to the parent library can tag and untag items. When searching by category, the user will only see information that he or she has rights to see.  
The xml text for a category is:

<category id="b73854cc-f271-11db-b5bc-05f275694451" name="animal" />
