Data Center Move: Status Updates

Sunday

4:55pm: We are back to regular production status for all our user-facing applications. A few staff-facing items remain to be checked off (old Voyager server, Datafarm).

12:28pm: The ILL form is now working normally except that we've temporarily disabled the standing-faculty check.

10:57am: Nebraska is back and we are going to switch the library domain back to the way it was before Thursday.

Saturday

6:19pm: We're stopping for tonight. Almost everything is back online and running on our regular production equipment in the new data center. Not yet up: the old Voyager server and Nebraska. We're also looking into a new problem with the ILL form that came up today.

5:47pm: Many systems are now up and operating out of the new data center, but we are still working on some of the more complicated workflows through handles and proxy.

4:18pm: We are back on production Franklin and proxy, but seeing some weird errors with proxy and the link resolver that we are troubleshooting.

1:30pm: We're working on bringing regular production Franklin and proxy back online.

10:15am: Work is resuming.

Friday

8:20pm: Done for today. All racking and cabling is finished, with a few issues to be ironed out. We need one replacement part, which is at Van Pelt. Work will resume tomorrow at 10.

7:05pm: The library team is still at the data center, addressing some issues with the cabling and preparing to bring some systems back online.

3:20pm: The Franklin problem was caused by an IP address conflict as equipment came back online. Franklin is back up.

3:05pm: We have a problem with ersatz Franklin. Working on it.

2:42pm: Cabling is complete. We're still trying to figure out what the issue is at Biomed while we start powering systems up in order according to our schedule.

1:37pm: We got reports of connectivity problems at Biomed. So far, this doesn't appear to be related to the data center move but we will keep trying to troubleshoot.

7:19am: The cabling contractor is on-site and has begun reconnecting the servers, storage, and other equipment. This is several hours ahead of schedule.

Thursday

6:35pm: Day 1 is finished. The movers were able to finish racking all the equipment, which puts us slightly ahead of schedule (they were going to finish that tomorrow morning). The ersatz server is still chugging along.

1:32pm: Lunch is wrapping up, after which the movers will resume racking the equipment in its new home.

11:35am: Franklin (the ersatz version) now appears to be fully functional. The work continues behind the scenes.

11:28am: Our hardware is on site at Pennovation! It will start being racked this afternoon. We're still troubleshooting the problem Franklin is having connecting to Ex Libris. In addition to the iframe workaround in the 10:16 update, you can still search Summon in the Articles+ tab; the results just aren't appearing in the Franklin two-column display.

10:16am: www.library is now working on both http and https. Sorry for this extended outage! It was not part of the plan, and was not testable in advance. Franklin and EZP are also up, but the Ex Libris API is not responding properly, so users may not know how to get to e-resources. If you click on the title and go to the record page, you'll see the e-resource links in the iframe, which is unaffected. We're still troubleshooting the API.

9:57am: Working with ISC, we have been able to fix the problem with the www.library domain, but the fix is taking a while to propagate to all users. It may help to try http instead of https.

8:55am: Switching the www.library domain is not working as expected. ISC is trying to help set it up differently. Currently both the website and Franklin appear down.

8:27am: We are switching the networking from Van Pelt to Pennovation. Because the www.library URL also goes through our equipment, the main library website will briefly appear to be down.

7:15am: Done with our kick-off call. Library team was at VP before the call to start powering some things down.


Over the special holiday break, LTS (and the Core Services Team in particular) will be working to move the library’s data center from the basement of Van Pelt to the new, purpose-built ISC facility at the Pennovation Center. This will place our equipment in a location with better-managed power and temperature control, and it complements some of the moves we’ve made into the cloud (i.e., entirely offsite) over the past few years.

The actual relocation will be performed by professional movers on 12/27-28, but we will announce a longer, more conservative window for the project to allow time to wind servers down gracefully and to test the connections, cabling, configuration, and applications. We will start 12/26 and plan to have completed and verified the work by the end of the weekend, 12/30. More schedule details are below.

Many of our applications are now cloud-based and will be unaffected by the move:

  • Alma
  • Summon
  • Library websites/webpages hosted on Pantheon (Drupal)
  • LibGuides and other Springshare tools
  • BorrowDirect/EZBorrow
  • Aeon
  • Authentication via ISC’s Shibboleth/single-sign-on service
  • Many popular tools that are not run by the library, such as PubMed, Google Scholar, Jstor, etc.

To mitigate the impact on users, we intend to host replicas of several essential local services on an ersatz server during the move:

  • EZProxy (including alumni proxy)
  • Handles (no editing or maintenance, just the redirects)
  • Franklin (will be slower than the usual server but should be acceptable for the lower level of traffic we see over the holiday break)

Several applications are local and cannot be replicated during the move, so they will have outages:

  • ILLiad. Requests from Franklin like ILL and Scan & Deliver can be placed, but will be queued for processing after the work is done. Biomed is working on a solution for clinical rush requests (probably by phone).
  • Older webpages on the Nebraska server (not Pantheon/Drupal or LibGuides, which will be up)
  • DLA sites like Penn in Hand, Print at Penn, Penn/PACSCL finding aids sites, course reserves
  • OPenn
  • Colenda
  • FTP, rsync, and other non-web protocols

We will have a maintenance page that users will see for major destinations that are unavailable, but we have no way to guarantee that every single conceivable URL will be covered. The device we usually use to "catch" and redirect URLs is also being moved. Unavoidably, users attempting to access deep or stale links may get a timeout.
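As a rough illustration of why some links show the maintenance page while others time out, here is a minimal sketch. The page path and the covered prefixes are hypothetical placeholders, not our real configuration, and the actual redirects are not handled by a script like this.

```python
# Hypothetical values for illustration only; not the library's real configuration.
MAINTENANCE_PAGE = "/maintenance.html"
COVERED_PREFIXES = ("/example-app/", "/example-archive/")


def maintenance_target(path):
    """Return the maintenance page for a covered destination, else None.

    None stands for "nothing answers this URL", which a user would
    experience as a timeout.
    """
    # str.startswith accepts a tuple and matches if any prefix matches.
    if path.startswith(COVERED_PREFIXES):
        return MAINTENANCE_PAGE
    return None
```

In this sketch, a request under one of the listed prefixes is sent to the maintenance page, while a deep or stale link outside the list gets no answer at all.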

Several services that limit access will not be available but will fail over to an open state:

  • The keyserver will allow all access rather than limiting the number of copies of applications running at once
  • Printing will be free; historically users print very little over the holiday break
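The "open state" idea above can be sketched in a few lines. This is only an illustration under assumed names (`LimiterUnavailable`, `is_allowed`, and the two sample limiters are all hypothetical); it is not how the keyserver or the print system is actually implemented.

```python
class LimiterUnavailable(Exception):
    """Raised when the (hypothetical) access-limiting service cannot be reached."""


def is_allowed(user, limiter):
    """Return True if access should be granted.

    `limiter` is any callable that returns True or False, or raises
    LimiterUnavailable when the limiting service is down.
    """
    try:
        return limiter(user)
    except LimiterUnavailable:
        # Fail open: with the limiter offline, allow access rather than block it.
        return True


def limiter_up(user):
    # Hypothetical rule for illustration: only "alice" is under the limit.
    return user == "alice"


def limiter_down(user):
    # Simulates the limiting service being offline during the move.
    raise LimiterUnavailable("limiter is offline during the move")
```

With `limiter_up`, the normal rule applies and some users are refused; with `limiter_down`, every request is allowed, which is the trade-off we are accepting for the duration of the move.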

For staff users, the shared drive, Confluence, and Jira will be unavailable, but remote desktop/VPN will work. If you have documents you need to work on, please put them on your desktop before 12/26.

Schedule

Thursday, 12/13 (tentative): Dry run for the ersatz server. Early in the morning of the 13th, we will switch over to the ersatz server for Franklin, EZProxy, etc., which will involve up to 30 minutes of downtime. We will monitor its performance. If there are no problems, we will switch back early Friday morning, 12/14. If there are issues, we may switch back earlier.

Wednesday, 12/26: We will begin to shut down some servers. This will not take all day, but we are announcing this date as part of the outage to be conservative.

Thursday and Friday, 12/27-28: We will switch back to the ersatz server early in the morning. Professional movers will be on-site at Van Pelt and Pennovation packaging, moving, and reinstalling the equipment. This is hands-off time for LTS but someone will be on-site. We expect "custody" to be returned to us not before 5pm on Friday.

Saturday and Sunday, 12/29-30: We will work on restoring the servers and restarting applications.