Pegasus Workflow Management System
Robust and scalable workflow management tools for the scientific community

Pegasus Workflow Management System's posts

Post has attachment
Pegasus powers LIGO gravitational wave detection analysis

The Pegasus team is very happy to hear about LIGO’s incredible discovery: the first detection of gravitational waves from colliding black holes. We congratulate the entire LIGO Scientific Collaboration and the Virgo Collaboration on this remarkable achievement.

The Pegasus team is very pleased to have contributed to LIGO’s software infrastructure. One of the main analysis pipelines used by LIGO to detect the gravitational wave was executed using Pegasus Workflow Management System (WMS). The PyCBC analysis pipeline analyzed data from the two LIGO detectors. Initially, the analysis was managed by Pegasus WMS on the LIGO Data Grid. LIGO then extended its computations to the nationwide cyberinfrastructures Open Science Grid and XSEDE. Pegasus aided this expansion by managing cross-site data transfers and computations in a reliable, scalable, and efficient manner. Pegasus enables LIGO researchers to easily monitor and analyze their workflows via a web-based dashboard and a suite of command-line tools.

“We use large, complex workflows to search for gravitational waves with LIGO. Pegasus makes it faster and simpler to run our codes, which lets us concentrate on the new discoveries and new science,” explains Dr. Duncan Brown, a physicist at Syracuse University.

Pegasus WMS is a collaboration between the Science Automation Technologies Group at the University of Southern California’s Information Sciences Institute and the HTCondor group at the University of Wisconsin, Madison. Pegasus’ collaboration with LIGO dates back to 2001. The workflow technologies that underlie the PyCBC search were first described in the book “Workflows for eScience” co-edited by ISI’s Ewa Deelman.

Thanks to the National Science Foundation we have been able to sustain our collaboration for over 15 years, most recently through a joint NSF Data Infrastructure Building Blocks (DIBBS) award.

Pegasus received funding from the National Science Foundation through the following grants: ACI-1148515, ACI-1443047, ACI-0943705, and ACI-0722019. Research in Pegasus workflow performance modeling is supported by the Department of Energy through DE-SC0012636.

Links to more information:


Post has attachment
The Pegasus team will be at SC'13 next week! Please come and talk to us at one of the following events.

Paper presentation at The 8th Workshop on Workflows in Support of Large-Scale Science (WORKS13)
Talk by Rafael Ferreira da Silva, the newest Pegasus team member:
Sunday 12:00-12:30
Toward Fine-Grained Online Task Characteristics Estimation in Scientific Workflows
R. Silva, G. Juve, E. Deelman, T. Glatard, F. Desprez, D. Thain, B. Tovar, and M. Livny

Paper presentation at NDM 2013: 3rd IEEE/ACM International Workshop on Network-aware Data Management
Sunday 9:00AM - 5:30PM
Evaluating I/O Aware Network Management for Scientific Workflows on Networked Clouds
Anirban Mandal, Paul Ruth, Ilya Baldin, Yufeng Xin, Claris Castillo, Mats Rynge, Ewa Deelman.

Talk: PRECIP - Pegasus Repeatable Experiments for the Cloud in Python, presented by Ewa Deelman
USC Booth (#3705)
Tuesday 11:00 am and 2:30pm

Demo ExoGENI NIaaS: Dynamic monitoring and adaptation of data-driven scientific workflows
RENCI/North Carolina booth (#4305)
Tuesday 2:30 pm
Wednesday 11:30am

Poster: Running A Seismic Workflow Application on Distributed Resources
Technical Posters Session
Tuesday 5:15PM - 7:00PM
Scott Callaghan, Philip Maechling, Karan Vahi, Gideon Juve, Ewa Deelman, Yifeng Cui, Efecan Poyraz, Thomas H. Jordan

BOF: Science and Scientific Workflows: Putting Workflows to Work
Tuesday 5:30pm-7:00pm

Post has attachment
We are happy to announce the release of Pegasus 4.3

Post has attachment
The following recording was made 9/20/13 during a special Introduction to Scientific Workflows presentation for the XSEDE campus champions.

Abstract: In this follow-up to the overview presentation that introduced the ECSS Workflow Community Applications Team, the team will focus on workflow technologies themselves. They will first introduce scientific workflows through examples in different scientific domains, considering both their common features and their differences. They will next survey several workflow toolkits, summarizing their capabilities and the types of problems where they have been applied. They will then examine two workflow toolkits (Pegasus and Apache Airavata) in greater detail, illustrating their capabilities through their application to earthquake science and astrophysics, respectively. The goal is for attendees to identify and understand different types of scientific workflows, to have an overview of different tools that are available, and to learn how to engage different workflow development communities.

Post has attachment
We are excited to invite the community to a SC'13 BOF we are involved with:

Abstract: The purpose of this Birds-of-a-Feather (BOF) session is for computational scientists, scientific workflow software researchers, cyberinfrastructure architects, and cyberinfrastructure operations leaders to share ideas on how to best use cyberinfrastructure such as the NSF's XSEDE to tackle the most challenging and scientifically valuable distributed computing problems. The emphasis will be on discussing open, unsolved scientific problems that will benefit from using multiple XSEDE, Open Science Grid, campus, and international cyberinfrastructure resources in a coordinated fashion. The BOF will be mediated by the XSEDE Workflow Community Applications Team.

Time: Tuesday - 5:30PM - 7:00PM
Room: 404

More information:

Post has shared content
This is a webinar we did for the Open Science Grid's Campus Infrastructures Communities group.

Post has attachment
The July 24, 2013 issue of iSGTW is featuring a Pegasus workflow running on the Open Science Grid:

“We had contributing nodes on OSG totaling near 1.4 million hours for the first 19.5 million jobs, which ran for about a month,” Quick says. “Since then we’ve done two more runs and we’re now up to more than 30 million docking jobs consuming over 3 million CPU-hours from opportunistic OSG resources.”

Read more at:

Post has attachment
Pegasus 4.2 Released

Pegasus 4.2 is a major release that contains several improvements to data management capabilities, a new web-based monitoring dashboard, job submission interfaces via CREAM CE, new replica catalog backends, support for PMC-only workflows, and I/O forwarding for PMC clustered jobs.
The data management improvements include a new, simpler site catalog schema for describing site layouts, and they enable data to be transferred to and from staging sites using different protocols. A driving force behind this change was Open Science Grid, where compute sites commonly make Squid caches available to jobs. For example, Pegasus workflows can now be configured to stage data into a staging site using SRM or GridFTP and stage data out over HTTP. This allows compute jobs to automatically use the Squid caching mechanism provided by the sites when pulling data to the worker nodes over HTTP.
This release also includes a beta version of a web-based monitoring dashboard (built on Flask) that users can use to monitor and debug their running workflows. The dashboard provides a workflow overview, graphs, and job status and outputs.
Job submission to the CREAM job management system has been implemented and tested.
New simpler replica catalog backends are included that allow the user to specify the input directory where the input files reside instead of specifying a replica catalog file that contains the mappings.
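The idea of a directory-backed replica catalog can be illustrated with a minimal Python sketch. This is not the Pegasus implementation; the function name `directory_replica_map` and the convention of using each file's path relative to the input directory as its logical file name are assumptions made for illustration only.

```python
from pathlib import Path


def directory_replica_map(input_dir):
    """Build a logical-name -> file URL mapping from a directory tree.

    Hypothetical helper: the path of each file relative to input_dir is
    used as its logical file name, and the value is a file:// URL.
    """
    root = Path(input_dir).resolve()
    mapping = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            lfn = path.relative_to(root).as_posix()
            mapping[lfn] = path.as_uri()
    return mapping
```

With a helper like this, the user points at a directory and the mappings are derived from its contents, rather than being written out by hand in a separate catalog file.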
There is prototype support for setting up Pegasus to generate the executable workflow as a PMC task workflow instead of a Condor DAGMan workflow. This is useful for environments where Condor cannot be deployed, such as Blue Waters. I/O forwarding in PMC enables each task in a PMC job to write data to an arbitrary number of shared files in a safe way. This is useful for clustered jobs that contain many tasks, each of which writes out only a few KB of output data.
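The general pattern behind forwarding many small outputs to a shared file can be sketched in Python. This is only an illustration of the idea under simplifying assumptions (threads and an in-process queue), not how PMC implements I/O forwarding; the names `writer` and `run_tasks` are hypothetical.

```python
import queue
import threading


def writer(records, done_marker):
    """Single writer: append each forwarded record to its target file."""
    handles = {}
    try:
        while True:
            item = records.get()
            if item is done_marker:
                break
            filename, data = item
            if filename not in handles:
                handles[filename] = open(filename, "ab")
            # Only this thread writes, so records never interleave.
            handles[filename].write(data)
    finally:
        for handle in handles.values():
            handle.close()


def run_tasks(n_tasks, shared_file):
    """Run n_tasks small tasks that all forward output to one shared file."""
    records = queue.Queue()
    done = object()
    writer_thread = threading.Thread(target=writer, args=(records, done))
    writer_thread.start()

    def task(i):
        # Each task hands its small output to the writer instead of
        # opening the shared file itself.
        records.put((shared_file, f"task {i} result\n".encode()))

    tasks = [threading.Thread(target=task, args=(i,)) for i in range(n_tasks)]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    records.put(done)
    writer_thread.join()
```

Because every record passes through one writer, the tasks can share a file without coordinating with each other, at the cost of funnelling all output through a single point.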
A complete list of new features is documented in the release notes at  
The new version of Pegasus can be downloaded from the Pegasus website or the apt/yum repositories.

Post has attachment
We are proud to announce a new member of the Pegasus project family:

Precip - Pegasus Repeatable Experiments for the Cloud in Python

Precip is a flexible experiment management API for running experiments on clouds. Precip was developed for use on FutureGrid infrastructures such as OpenStack, Eucalyptus (>=3.2), and Nimbus, as well as commercial clouds such as Amazon EC2. The API allows you to easily provision resources, on which you can then run commands and copy files to/from subsets of instances identified by tags. The goal of the API is to be flexible and simple to use in Python scripts to control your experiments.

Major features of Precip include support for vanilla images that can be bootstrapped at runtime, flexible tagging of instances for group manipulation such as running commands or copying files, and automatic handling of ssh keys and security groups.
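To give a feel for tag-based group manipulation, here is a small self-contained Python model. It is not the Precip API: the `Instance` and `Experiment` classes and their methods are hypothetical, nothing is provisioned, and `run` only records which instances a command would target.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    tags: set = field(default_factory=set)


class Experiment:
    """Toy in-memory model of instances grouped by tags (hypothetical)."""

    def __init__(self):
        self.instances = []

    def provision(self, name, tags):
        # A real tool would start a cloud instance here; this only records it.
        self.instances.append(Instance(name, set(tags)))

    def select(self, tags):
        """Return the instances that carry every requested tag."""
        wanted = set(tags)
        return [inst for inst in self.instances if wanted <= inst.tags]

    def run(self, tags, command):
        """Return (instance name, command) pairs for the matching instances."""
        return [(inst.name, command) for inst in self.select(tags)]


exp = Experiment()
exp.provision("vm1", ["master"])
exp.provision("vm2", ["worker"])
exp.provision("vm3", ["worker"])
targets = exp.run(["worker"], "hostname")
```

Here `targets` lists only the two instances tagged `worker`, which is the kind of subset selection that tags make convenient in an experiment script.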

For more information and downloads, please see

Post has attachment
Pegasus at SC'12

Next week the annual Supercomputing conference will be held in Salt Lake City, Utah. If you are planning to attend the conference we encourage you to stop by one of the Pegasus-related talks that are scheduled during the week.

Several of the talks will be given at the workshop on Workflows in Support of Large-Scale Science (WORKS) on Monday. More information can be found on the WORKS website:

If you would like to discuss the Pegasus Workflow Management System or other projects such as running workflows on the Cloud, resource provisioning, or our new Precip: Pegasus Repeatable Experiments for the Cloud in Python software, please come to the USC booth during Ewa's talks.

The complete list of Pegasus-related talks is as follows:

Monday 10:50AM: "Peer-to-Peer Data Sharing for Scientific Workflows on Amazon EC2", Gideon Juve, WORKS Workshop, Room 155-A

Monday 2:15PM: "A General Approach to Real-time Workflow Monitoring", Karan Vahi, WORKS Workshop, Room 155-A

Monday 2:55PM: "Integrating Policy with Scientific Workflow Management for Data-Intensive Applications", Ann Chervenak, WORKS Workshop, Room 155-A

Tuesday 2:30PM: "Cloud Computing Cost- and Deadline-Constrained Provisioning for Scientific Workflow Ensembles in IaaS Clouds", Maciej Malawski, Cloud Computing Session, Room 355-D

Tuesday 4:00PM: "Enabling science with the Pegasus Workflow Management System", Ewa Deelman, USC booth

Wednesday 10:00AM: "Enabling science with the Pegasus Workflow Management System", Ewa Deelman, USC booth