- 2018-04-20: Anvil: Filesystem issues
Category: Maintenance
Earlier today the filesystem backing Anvil experienced some issues which may have caused instances to hang. We suggest users check running instances and perform a hard-reset if they suspect slow or strange performance.
- 2018-03-19: Sandhills: Unscheduled maintenance complete
Category: General Announcement
Unscheduled maintenance has been completed on the Sandhills cluster.
Jobs that were running prior to the maintenance have had their time limit extended.
Please report any trouble using the cluster by emailing hcc-support@unl.edu.
- 2018-03-19: Sandhills: Unscheduled maintenance required
Category: Maintenance
Sandhills requires maintenance on one of the servers that provide the /work file system.
Running jobs will be suspended and pending jobs will remain queued during the maintenance.
Any access to files under /work will block on the login or the Globus transfer nodes during the maintenance.
A follow-up announcement will be sent when the maintenance is completed.
- 2018-03-16: Crane /work filesystem downtime resolved
Category: General Announcement
The /work filesystem for Crane is restored as of 2:50pm.
One of the storage servers crashed and rebooted. A filesystem check was completed with no errors found. Running jobs which were accessing /work stalled until the filesystem was restored. This may have caused jobs to exceed their time limit. There was no data loss from this outage.
We believe the storage server crash was triggered by I/O delays as the RAID controller was rebuilding a failed disk drive. The rebuild is still running and we are monitoring the system.
- 2018-03-16: Crane /work filesystem unplanned downtime
Category: System Failure
The /work filesystem for Crane is partially unavailable. One of the storage servers crashed and rebooted. We are now running a filesystem check before placing the server back online. Pending jobs will be held until the maintenance is complete.