Messages & Announcements

  • 2018-08-04:  Crane: /work filesystem restored
    Category:  General Announcement

    The /work filesystem for Crane is back in service. A filesystem check was completed with no errors found. Jobs which were running during the outage may have exceeded their time limit or had errors accessing data on /work. There was no data loss from this outage.

    One of the storage servers experienced a disk drive failure, leading to a RAID controller reset. This caused the filesystems to report corruption and switch to read-only mode. While recovering the system, the initial consistency checks showed significant numbers of errors. However, these were likely spurious errors related to the journal. After the journal was replayed, the repair process went smoothly.


  • 2018-08-04:  Crane: /work filesystem unplanned downtime
    Category:  System Failure

    The /work filesystem for Crane is partially unavailable. One of the storage servers experienced hardware issues leading to corruption on the Lustre /work filesystem. Filesystem consistency checks are currently running, and the output unfortunately suggests that data loss or corruption is likely.

    Pending jobs will be held until the maintenance is complete.


  • 2018-07-30:  Anvil filesystem maintenance completed
    Category:  Maintenance

    The maintenance on the Ceph filesystem has been completed and the system is available for use. Please check your VMs and reboot them if they appear unresponsive. As always, contact hcc-support@unl.edu if you require assistance.


  • 2018-07-12:  HCC Anvil filesystem maintenance planned --- 23rd July
    Category:  Maintenance

    This announcement is for Ceph filesystem maintenance affecting Anvil only. The maintenance window starts at 9:00 AM on 23rd July and may take up to a week. Anvil will keep running during the maintenance; however, some performance impact is expected. Users are encouraged to shut down their VMs if possible. Running VMs may be suspended or rendered unresponsive during this timeframe. A follow-up announcement will be posted when the system is ready for production use.


  • 2018-06-29:  SANDHILLS: Unexpected outage
    Category:  System Failure

    An unexpected power outage occurred in SCHORR at approximately 9:40 PM on Friday, June 29. Cluster infrastructure maintained service, but worker nodes rebooted. Please check the status of your running and queued jobs on SANDHILLS.

