Project Blackbird: Deploying Condor in a Blackboard Environment
- Breaking down operational silos through collaboration across organizations enables creative new uses of existing technology.
- Clemson University’s lead Blackboard system administrator solved a significant operational problem through innovative application of Condor software, originally employed for high-performance research computing.
- Users and system administrators both benefited from Blackbird, which reduced the time needed for archiving Blackboard courses by 65 percent.
Clemson University implemented Condor architecture on the Blackboard learning management system (LMS) application servers in order to gain higher throughput when processing Blackboard course archives. Condor has typically been used for high-throughput research computing, not applied to non-research problems involving long processing times. This creative leap — applying a research technology to an administrative/academic problem — occurred because Clemson broke down operational silos, enabling cross-disciplinary personnel to gain different perspectives.
This article describes archiving Blackboard course data at Clemson, the implementation environment for Condor, and the broader significance to the campus community of breaking down operational silos.
Blackboard Course Archiving
Blackboard’s architecture requires the content within courses and workgroups to be part of a larger database rather than files and folders in a directory structure. The full Clemson Blackboard repository includes a 200 GB database and a 900 GB content system with over 10.5 million files. For disaster recovery purposes, Clemson Computing and Information Technology (CCIT) staff perform complete system backups. Retrieving data from these backups requires a full system restore, which would be performed only in the event of a catastrophic failure.
The Blackboard course archive process encapsulates the content from an individual course. The content, which is compressed as a Zip file, can be stored locally or on the server. From the course’s Blackboard control panel, instructors have the option to create course archives individually as needed. Saving a course archive allows the instructor to restore the archive to a different, temporary course and retrieve an individual file (for example, a Word document or a video file) or piece of content (an online test or student submission). Alternatively, the archive can be restored to replace an entire course. If students do not complete the course (receive a grade of incomplete) or dispute grades, the instructor and administrators must have full records available to evaluate and resolve the situation.
CCIT employs Blackboard’s batch processing when archiving courses, which enables system administrators to generate a batch job to archive a list of courses. After the batch job is submitted, the server performs the archiving of these courses unattended. If instructors require data from the batch archives, the Blackboard administrators within CCIT restore the course from the archive and assist the instructor with retrieving the data needed.
The Growing Challenge of Archiving Blackboard Courses
CCIT encourages all instructors to set up their courses in Blackboard and use the tools available; the university does not limit instructors on the amount of content they can have in their courses. As a result, over the past three years Clemson has seen tremendous growth in the amount of data stored in Blackboard courses and thus in the processing time required to archive these courses.
At the start of the fall 2006 semester, CCIT began archiving all Blackboard courses weekly. By the end of that semester, it took about 48 hours for the archiving to complete — well beyond the weekly maintenance window. CCIT implemented a distributed archive solution in spring 2007, splitting the course list evenly across the four application servers. While this solution cut the archive time in half initially, as course content volume grew, the archive time again grew, until it exceeded 48 hours.
Another solution, also implemented in spring 2007, was to archive only “active” courses every week, with full archiving of all courses once a month. Eliminating inactive courses from the weekly archive processing dropped the number of courses being archived to roughly 6,000 but also put CCIT at risk for data loss in the event that changes made to an inactive course later required recovery (for example, if while setting up a course to be used the next term, files got deleted or settings changed). Ultimately the number and size of courses in the system grew so large that even with four servers, the archives took more than 85 hours to complete. CCIT continued to work with Blackboard consultants and in fall 2008 added a fifth server to reduce the archive processing time. By the end of fall 2008, however, archiving just the active courses took 85.5 hours, and the servers finished at widely varying times (see Figure 1), primarily due to the variations in content volume per course.
Figure 1. Blackboard Archive Processing Time Split Across Five Servers
In the spring 2009 semester, Clemson had about 5,000 active courses in the Blackboard LMS and approximately 20,000 inactive courses (most saved from prior terms). CCIT staff knew there would not be enough time between the end of spring semester and beginning of Maymester (a condensed semester in May) to do a complete archive of all courses in Blackboard. A number of other problems had also developed:
- The number of courses that span semesters was increasing. Ends and beginnings of semesters blur more and more each year, making an “end of term” archive incomplete.
- The length of time needed for archive processing became a point of contention for planning Blackboard upgrades. The upgrade window, already small because of faculty needs, was further reduced by the processing time for archives.
- Because the archive required so much time to complete, the Blackboard system had to remain available to users during the processing. If someone made a change to an already archived course, then another archive had to be run for that course later (as a manual, out-of-cycle task).
- The template courses and test courses were not being archived. Template courses are those courses that instructors set up and then duplicate for multiple sections of a class. Some template courses are used repeatedly semester to semester. Test courses are courses in which instructors can explore Blackboard functionality and course configuration.
Condor to the Rescue
Condor, from the University of Wisconsin–Madison, is a specialized workload management system originally developed for the scientific programming community for high-throughput computing (HTC). Condor provides a job queuing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. Users submit their serial or parallel jobs to Condor, which places them into a queue, chooses when and where to run them based upon a defined policy, carefully monitors their progress, and ultimately informs the user upon completion. Typical Condor usage provides two important functions:
- It makes available resources more efficient by putting idle desktop, server, and lab machines to work.
- It expands the computing resources available to users’ jobs by functioning well in an environment of distributed ownership.
Through Condor, tremendous amounts of computation can be done with very little intervention from the user, once they define the parameters of their jobs and the policy determining what types of computers are needed to run those jobs. The typical Condor system puts to good use the otherwise wasted computation time on machines sitting idle for long periods of time. Once you connect all these computers together, Condor can manage this “cluster” efficiently.
For more information about Condor, visit the Condor Project Home Page.
I (Sam Hoover) have been providing systems administration for Blackboard at Clemson since December 2005. Well aware of the challenges that the archiving processing time caused users, I was involved in the incremental efforts to reduce the negative impact of the archiving process. In 2008, as part of a National Science Foundation CI-TEAM award with the Renaissance Computing Institute at the University of North Carolina, I began working with Clemson’s existing Condor Team. My experience with the complex Blackboard integration with other systems at Clemson made me a logical choice for this position. This cross-functional group included Dr. Sebastien Goasguen and other School of Computing academic faculty and graduate students, other CCIT systems and operations staff, student lab desktop support staff, and research application support staff. The CI-TEAM grant provides funding for the deployment and dissemination of cyberinfrastructure technologies around the country. Working as a team, the group has developed an end-to-end understanding of Condor.
Within this cyberinfrastructure team environment, it occurred to me that Clemson might be able to apply Condor to the Blackboard archiving problem, although not with the public Condor pool at Clemson. I called this project Blackbird, representing a combination of the production Blackboard LMS at Clemson with a private Condor pool for processing Blackboard archives. I set up project Blackbird using the five current Blackboard production application servers. All nodes are SunFire X4200 servers containing 8 GB of memory and running the Red Hat AS 4 distribution of Linux. The head node has one 2.6 GHz dual-core AMD processor, while each of the other four nodes has two 1.8 GHz dual-core processors, making 18 cores in all.
During normal Blackboard operations, only one of four CPU cores is in use, and monitoring of CPU usage showed that only 20 to 30 percent of that one core was being utilized. Figure 2 shows the process for archiving Blackboard courses before Blackbird was implemented.
Figure 2. Previous Blackboard Archive Process
It occurred to me that using Condor as a job scheduler, and taking advantage of its built-in resource monitoring tools, would enable me to fully utilize all four cores without affecting end-user performance of Blackboard’s normal operations.
In setting up project Blackbird, I was applying Condor in a non-typical way. Instead of using idle desktop or lab machines, I used production Linux servers. Instead of making more computing resources available to everyone, I created a private pool of dedicated machines for one specific type of job (course archives) and only one user. In addition, the number of servers available for the type of jobs CCIT might want to run is limited to those that run Blackboard, connect to our production Blackboard database, and mount our production Blackboard Content file system.
Blackbird uses a private pool because of the Blackboard application’s requirements for running a course archive, as well as for security purposes. Blackbird jobs (Blackboard course archives via Condor) can be submitted only through the central manager by a specific user (whom I defined). The execute nodes communicate with the central manager via a private network for additional security.
I then created the batch control jobs to enable Condor to schedule the individual Blackboard course archives. I used the Directed Acyclic Graph Manager (DAGMan) feature (see “About DAGMan”) in Condor so that Blackboard system administrators could submit the thousands of course archives in a single job. I enabled post-processing to verify the archives upon completion and send an e-mail notification to Clemson’s Blackboard administrator e-mail list.
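As a rough illustration of how thousands of course archives can be expressed as a single DAGMan submission, the following Python sketch generates a DAG input file with one node per course. It is not the actual Blackbird control job: the submit description file names (`archive_course.sub`, `notify_admins.sub`), the verification script name (`verify_archive.sh`), the `course_id` macro, and the sample course IDs are all hypothetical. Only the `JOB`, `VARS`, `SCRIPT POST`, and `PARENT ... CHILD` keywords are standard DAGMan syntax.

```python
from typing import Iterable, List


def build_dag(course_ids: Iterable[str],
              submit_file: str = "archive_course.sub",       # hypothetical name
              verify_script: str = "verify_archive.sh",      # hypothetical name
              notify_submit_file: str = "notify_admins.sub"  # hypothetical name
              ) -> str:
    """Return DAGMan input text with one archive node per course."""
    lines: List[str] = []
    node_names: List[str] = []
    for course_id in course_ids:
        node = f"archive_{course_id}"
        node_names.append(node)
        # One DAG node per course, all sharing the same submit description file.
        lines.append(f"JOB {node} {submit_file}")
        # Make the course ID available to the submit file as $(course_id).
        lines.append(f'VARS {node} course_id="{course_id}"')
        # Run a verification script after this node's job has finished.
        lines.append(f"SCRIPT POST {node} {verify_script} {course_id}")
    if node_names:
        # A last node that starts only after every archive node has succeeded;
        # here it stands in for the e-mail notification step.
        lines.append(f"JOB notify {notify_submit_file}")
        lines.append(f"PARENT {' '.join(node_names)} CHILD notify")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Sample course IDs are made up for the example.
    print(build_dag(["ENGL_101_001", "MTHSC_106_002"]), end="")
```

Writing the returned text to a file such as `blackbird.dag` and passing it to `condor_submit_dag` would hand the whole set of archives to Condor as one submission. Whether verification is best done per node or once at the end is a design choice this sketch does not settle.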
With DAGMan and Condor, jobs are suspended and resumed automatically to protect end-user and normal system performance. While the standard Blackboard archive runs a single archive process on each server, Blackbird can run a single archive process on each core. The archiving processes that were running on five servers are now handled by 18 processor cores. Figure 3 shows the archiving process of Blackboard courses as handled by Blackbird.
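The suspend-and-resume behavior can be pictured as a small state machine with two thresholds. Condor itself expresses this kind of policy through configuration expressions such as SUSPEND and CONTINUE; the Python below is only a conceptual sketch, and the class name, the load measure, and both threshold values are assumptions made for illustration rather than Clemson's actual settings.

```python
from dataclasses import dataclass


@dataclass
class ArchiveSlot:
    """One archive job slot on a shared production server (illustrative only)."""
    suspend_above: float = 0.75  # assumed: suspend when non-archive load exceeds this
    resume_below: float = 0.50   # assumed: resume once non-archive load falls below this
    suspended: bool = False

    def update(self, other_load: float) -> bool:
        """Apply the latest non-archive load sample; return True if now suspended."""
        if not self.suspended and other_load > self.suspend_above:
            self.suspended = True
        elif self.suspended and other_load < self.resume_below:
            self.suspended = False
        return self.suspended


if __name__ == "__main__":
    slot = ArchiveSlot()
    for load in [0.2, 0.8, 0.6, 0.4, 0.7]:
        state = "suspended" if slot.update(load) else "running"
        print(f"load={load:.1f} -> archive job {state}")
```

Using a lower threshold for resuming than for suspending keeps a job from flapping between states when the load hovers near a single cutoff.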
Figure 3. Blackbird Archiving Process
Figure 4 shows the Blackbird architecture. In fall 2008, the Blackboard end-of-semester archive ran for 85.5 hours, delaying other end-of-semester processing including faculty setup of spring courses. After applying the Condor architecture and processing capabilities to the archiving task, spring 2009 end-of-semester archives completed in just 23.5 hours, a reduction of 65 percent over the time required in fall 2008. Figure 5 shows the archiving time changes from spring 2006 through spring 2009, from the original four servers to the addition of a fifth server in fall 2008 (indicated by *) and the introduction of Condor in spring 2009 (indicated by #).
Figure 4. Blackbird Archive Solution Architecture
Figure 5. Blackboard Course Archiving Times, Spring 2006–Spring 2009
Condor’s job-queuing mechanism and scheduling policy ensure load balancing of the archive processes, enabling higher throughput by the dynamic distribution of course archives across the 18 cores. As a core finishes an archive of a course, it is given another course to archive. The dramatic load imbalance seen on the nodes in earlier archive solutions is cut from many hours to virtually nothing. The entire Blackbird archive process is complete in less time than the “quickest” server (30+ hours) required in the pre-Blackbird archive processing.
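To make the load-balancing point concrete, the sketch below compares two ways of assigning courses to workers: dealing the course list out evenly in advance, as in the earlier distributed solution, and handing the next course to whichever worker frees up first, as a job queue does. The durations are invented for the example and are not Clemson measurements, and the function names are hypothetical.

```python
import heapq
from typing import Sequence


def static_split_makespan(durations: Sequence[float], servers: int) -> float:
    """Deal courses out evenly by count up front; done when the slowest server is done."""
    totals = [0.0] * servers
    for i, d in enumerate(durations):
        totals[i % servers] += d
    return max(totals)


def dynamic_queue_makespan(durations: Sequence[float], workers: int) -> float:
    """Give each course, in list order, to whichever worker becomes free first."""
    free_at = [0.0] * workers  # min-heap of times at which each worker is next free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)       # earliest available worker
        heapq.heappush(free_at, start + d)   # busy until this course finishes
    return max(free_at)


if __name__ == "__main__":
    # Two large courses among many small ones (hours, made up for illustration).
    durations = [8, 1, 1, 1, 8, 1, 1, 1]
    print(static_split_makespan(durations, servers=4))   # 16.0
    print(dynamic_queue_makespan(durations, workers=4))  # 9.0
```

In this example the even split happens to place both large courses on the same server, so that server finishes long after the others, while the queue spreads them out. A queue does not beat every possible static assignment, but it avoids depending on how course sizes happen to fall in the list.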
Impact of Blackbird on Clemson Faculty, Students, and Staff
The accelerated completion of archiving allowed faculty to begin setting up Maymester and summer courses much sooner. Before Blackbird was implemented, the end of semester for the Blackboard administrator involved working late nights and through the weekend to get things done in the narrow window between semesters. (This was my responsibility before but now is handled by CCIT staff member Matt Garrett.) In spring 2009, archives were completed in less than 24 hours, allowing Garrett to complete all end-of-semester activities during regular business hours. The Blackboard team now has the opportunity to increase the scope of the archive processing by archiving more courses more often. And we can plan for future Blackboard version upgrades, an impossibility with the old archiving timeline. In addition, all users of Blackboard, especially students, benefit from the performance protection that Condor archiving offers.
Listen in as Kathy Hoellen, director of Teaching and Learning Services at Clemson University, shares her thoughts on using HTC tools for course archiving and how it has affected the semester transition process.
The Broader Significance
When Condor was developed, the target use was managing workload for compute-intensive jobs, specifically to enable research. Condor’s developers and users didn’t expect to apply this technology to the non-research problem of archiving courses. Clemson’s experience shows that looking at tools and problems in new ways can generate tremendous results.
How Can Campuses Enable People to Think Creatively?
Engaging a system administrator in a cross-disciplinary research team was the catalyst for Clemson’s course archiving innovation. Participation in this team gave me the opportunity to explore Condor end-to-end and understand how it is used to solve research problems. Examining an existing administrative problem (course archiving) through the same lens suggested this new solution, which I called Blackbird. Taking existing HTC software and using it as a tool to solve a real-world administrative/academic problem delivered immediate, tangible benefits for Clemson and beyond.
What Other Problems Can Be Solved by Applying Technologies Creatively?
Clemson encourages other institutions to develop points of engagement in which research, administrative, and instructional faculty, staff, and students can collaborate to understand each other’s tools as well as challenges. Other campuses might try applying Condor to other administrative applications that have lengthy processing time, for example. (At Clemson, we’re now looking at e-mail and video backups.) What other problems are you facing? How can you think about those problems in new and creative ways?
Listen in as Jim Bottum, vice provost for Computing and Information Technology and CIO at Clemson, shares his thoughts on the promise of cyberinfrastructure and the contribution of Condor and Blackbird to productivity.
Miron Livny, the original creator of Condor, presented my use of Condor for course archiving at Condor Week 2009, held at the University of Wisconsin, April 20–23, 2009. Please share with the community the innovations you have achieved by breaking down research and academic silos at your institution and allowing collaboration to thrive.
Clemson could not have achieved this innovation without the contributions of the following people and groups:
- Blackboard Operations Team, Clemson University
- Matt Garrett, Blackboard Systems Administrator, Clemson University
- Dr. Sebastien Goasguen, Assistant Professor, School of Computing, Clemson University
- Kathy Hoellen, Director of Teaching and Learning Services, Clemson University
- Dr. Miron Livny, Professor of Computer Science, University of Wisconsin-Madison
- Randy Martin, Research Computing Manager, Clemson University
- Dr. John McGee, Manager for Cyberinfrastructure Development, Renaissance Computing Institute (RENCI)
© 2009 Sam Hoover, Katherine Dobrenen, and Mary Trauner. The text of this article is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 license.