Operating and service level objectives
Hours of Availability
KeystoneREN will operate the Keystone Research and Education (KeystoneREN) Network 24 hours per day, every day. However, KeystoneREN expects, and shall have the right, to suspend the operation of the KeystoneREN Network from time to time for purposes of testing and maintenance. KeystoneREN will attempt to provide advance explanation and warning of changes to the service with a view to allowing mutually satisfactory implementation of end-to-end service. In the event of an emergency, and to protect the network or other customers, KeystoneREN may, without prior notice, modify, suspend, or discontinue any aspect of the KeystoneREN Network, impose limits on KeystoneREN Network features and services, or restrict any Customer’s access to any part or all of the KeystoneREN Network. As soon as possible thereafter, KeystoneREN will contact the Customer in an attempt to resolve any problems and to re-establish service.
Roles and Responsibilities
| Title/Role | Contact Information |
| Customer Support | [email protected], 1-317-278-0328 |
| Change Manager | [email protected], 1-717-320-8359 |
| Service Inquiries | [email protected] |
| Complaints | [email protected] |
KeystoneREN (Service Provider) Responsibilities
- Manage the acquisition, deployment, integration and implementation of equipment and services necessary to deliver the service(s).
- Meet response times associated with this agreement for outages, issues, requests, and complaints.
- Provide advance notice of scheduled maintenance via email distribution lists and a web-based calendar.
- Include the details and options of services in the Services Catalog or product data sheets.
- Ensure availability of representatives as indicated to respond and participate in execution of their roles under this agreement.
Participant (Customer) Responsibilities
- Ensure availability of representatives as indicated to respond and participate in the execution of their roles in this agreement.
- If applicable, adhere to the terms and conditions of the KeystoneREN Customer Node Agreements (Master, Site Access) for customers hosting co-located KeystoneREN Nodes.
- Provide network design specifications including pre-existing LAN/WAN IP addressing schemes, MAC addresses and circuit designs.
- Customer is solely responsible for all equipment and other facilities used in connection with the Service that are not provided by KeystoneREN or its service providers.
Service Requests
All service requests must be made to the KeystoneREN NOC Service Desk using one of the following approved methods:
- Contacting the Service Desk via telephone at 1-317-278-0328
- Logging a service request ticket via email to [email protected]
- Reporting a problem using the webform available at https://sn-tools.grnoc.iu.edu/keystoneren-request/
Service Disruption at Customer Site
Any incident that causes an unexpected or unplanned service disruption at the Customer location should be reported as soon as possible to the KeystoneREN NOC Service Desk using one of the following approved methods:
- Contacting the Service Desk via telephone at 1-317-278-0328
- Logging a service request ticket via email to [email protected]
Service Desk
KeystoneREN has outsourced its Service Desk and NOC functions to the Global Network Operations Center (GlobalNOC) at Indiana University. The GlobalNOC has a dedicated Specialized Support Technician (SST) for KeystoneREN. This SST will serve as the primary interface for KeystoneREN operations, customers, support groups, and network administrators. The SST will work traditional day shift hours, but will interact with each Service Desk shift for training and information sharing purposes. Service Desk technicians will cover KeystoneREN operations across the entire support team to provide 24x7x365 service support.
As an extension of the front line Service Desk staff, the GlobalNOC Operational Network Engineering team will serve as the Tier 2 technical support group for KeystoneREN operations. This will provide KeystoneREN customers with incident management services, troubleshooting of network issues, and escalation to KeystoneREN support teams.
Incident Management
Within the first twenty (20) minutes of a service disruption (detected through customer contact, alerts, alarms, or other means), the Service Desk will begin troubleshooting and problem determination. This will include impact assessment, direct customer contacts, creation and assignment of tickets to Tier 2 engineering as appropriate, email notifications to distribution lists, vendor contacts, and other actions to assist with problem resolution. The Service Desk will stay engaged with the necessary parties throughout the duration of the incident until service has been restored, the problem has been resolved, and the incident is closed.
Process
Incident/Request Classification
At the GlobalNOC, incidents and requests are classified using a combination of customer impact (priority) and network impact (severity). Incident/Request classification will be used to direct the engagement of resources and troubleshooting steps for GlobalNOC, KeystoneREN, and others. Other support teams, such as vendors for fiber maintenance, hardware maintenance, and telco services, may not follow these guidelines. GlobalNOC and KeystoneREN will be responsible for managing these resources during problem resolution. This classification scheme is outlined below:
Customer Impact Guidelines (Priority)
Customer Impact is used to classify incidents according to their subjective impact on the customer’s operations, performance, and usability. Ultimately the customer impact of an incident is up to the customer, and may change as the ticket progresses. Customer Impact is used to drive escalation and response expectations. Customer Impact seeks to answer, “How high of a priority is the problem/maintenance to the customer?”
1- CRITICAL
• A problem or issue for which the customer needs immediate, undivided attention from NOC staff until resolved.
• The customer is expected to be available immediately to commit full-time resources until the situation is resolved.
• The NOC uses this by default when monitoring detects an outage of a non-redundant core network element.
2- HIGH
• A problem or issue for which the customer needs resolution within 1 business day.
• The customer is expected to commit resources to resolve the situation between the hours of 1300 and 0100 UTC (1200 and 0000 UTC when Daylight Saving Time is in effect).
• The NOC uses this by default when monitoring detects an outage of a redundant core network element.
3- ELEVATED
• A problem or issue for which the customer does not need immediate resolution, but needs NOC attention within 3 business days.
• The customer is expected to be available to provide information or assistance when available during normal business hours.
• The NOC uses this by default when monitoring detects a problem or outage on a customer connection or session. This is also used by default for maintenance that is both NOC-initiated and customer-impacting.
4- NORMAL
• No impact to the customer’s operations, performance, and usability.
• Non-urgent customer service requests.
• Routine installation/provisioning tickets, non-customer impacting maintenance, and customer initiated maintenance.
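The default priorities described above can be sketched as a simple lookup. This is only an illustration: the event labels are hypothetical names chosen for this example and are not identifiers from the GlobalNOC monitoring or ticketing systems. The customer may still change the priority as the ticket progresses.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Customer Impact levels as numbered in the guidelines."""
    CRITICAL = 1
    HIGH = 2
    ELEVATED = 3
    NORMAL = 4


# Hypothetical event labels (assumptions made for this sketch only).
DEFAULT_PRIORITY = {
    "core_outage_non_redundant": Priority.CRITICAL,
    "core_outage_redundant": Priority.HIGH,
    "customer_connection_problem": Priority.ELEVATED,
    "noc_maintenance_customer_impacting": Priority.ELEVATED,
    "routine_provisioning": Priority.NORMAL,
    "maintenance_non_customer_impacting": Priority.NORMAL,
    "customer_initiated_maintenance": Priority.NORMAL,
}


def default_priority(event: str) -> Priority:
    """Return the priority the NOC assigns by default for an event label."""
    try:
        return DEFAULT_PRIORITY[event]
    except KeyError:
        raise ValueError(f"no default priority defined for {event!r}") from None
```

For example, `default_priority("core_outage_redundant")` returns `Priority.HIGH`, which the customer could later raise to Critical.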
Network Impact Guidelines (Severity)
Network Impact is an objective scale of measure used to quantify the highest level of impact to the network that occurred throughout the duration of a problem or maintenance. Network Impact seeks to answer, “How severe is the problem/maintenance to the network?”
1- CRITICAL
• The network, a portion of the network, or a key network resource has failed causing an outage of service.
• The network, a portion of the network, or a key network resource is severely degraded rendering the network nearly unusable.
2- HIGH
• The network, a portion of the network, or a key network resource has failed or is severely degraded, but service has not been affected due to redundant resources.
• The network, a portion of the network, or a key network resource is experiencing mild-to-moderate degradation, and service is affected.
• Security requests and incidents.
3- ELEVATED
• Network problems or maintenance of limited scope that pose no risk to the network as a whole.
• Direct connectivity to a single entity (peer, connector, lambda) has been lost.
4- NORMAL
• No network impact
Incident Response – Service Desk
The Service Desk responds immediately to all monitoring alarms and incidents reported to the GlobalNOC. Service Desk technicians will begin immediate problem assessment and analysis, identifying the severity of the issue. For Critical level problems, the Service Desk will notify Tier 2 Network Engineering within the first 5 minutes so that engineering investigation can begin. For High, Elevated, and Normal priority level trouble tickets, the Service Desk will assign the ticket to Tier 2 Engineering within the first 20 minutes after the alert.
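As a minimal sketch of the handoff rule in the paragraph above, the function below computes the latest time by which the Service Desk should have engaged Tier 2 Engineering. The function and constant names are illustrative assumptions; priorities are the numeric levels 1 (Critical) through 4 (Normal).

```python
from datetime import datetime, timedelta, timezone

# Minutes allowed between the alert and Tier 2 engagement, by priority.
# 1 = Critical (5 minutes); 2 to 4 = High, Elevated, Normal (20 minutes).
HANDOFF_MINUTES = {1: 5, 2: 20, 3: 20, 4: 20}


def tier2_handoff_deadline(alert_time: datetime, priority: int) -> datetime:
    """Return the latest time for notifying or assigning Tier 2 Engineering."""
    if priority not in HANDOFF_MINUTES:
        raise ValueError("priority must be 1, 2, 3, or 4")
    return alert_time + timedelta(minutes=HANDOFF_MINUTES[priority])


if __name__ == "__main__":
    alert = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(tier2_handoff_deadline(alert, 1))  # deadline for a Critical alert
```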
The Service Desk does not actively monitor dark fiber service. Customers with Service Agreement(s) for dark fiber service will need to call the GlobalNOC and report any outages to the Service Desk. GlobalNOC Tier 2 may not be engaged by the Service Desk in fiber related incidents.
Initial Problem Assessment:
- Verification of alarms and reported outage
- Create trouble ticket
- Contact customer for more information (if necessary)
- Collect outage information from network devices, circuits, tools
- Assess impact of outage on customer(s) and network services
- Contact Member Site/IT support and/or Netserve365 contact center
- Assign trouble-ticket to Network Engineering as appropriate
- Send initial customer notification as warranted by impact
- Note: Since Critical issues require faster response times for notifying engineers, some of the information above may be added to the ticket after initial ticket assignment.
Incident Response – Tier 2/3 (GlobalNOC Engineers, KeystoneREN Engineers)
Based on the incident/request classification and customer impact assessment (priority), Tier 2 and Tier 3 engineers will respond to incidents using the following guidelines:
| Impact Level/Priority | Time to Begin Investigation |
| Critical / 1 | 15 Minutes |
| High / 2 | 1 Hour |
| Elevated / 3 | 1 Day |
| Normal / 4 | 3 Days |
| Dark Fiber | Acknowledgement within 15 minutes; deploy within 30 minutes for Critical / 1 |
Incident Response for Dispatch – Field (Fiber, Node, Co-lo)
Using the incident/request classification of customer impact (priority) and network impact (severity), if the initial investigation determines that a dispatch of field resources is required, the following guidelines will be used:
| Category | Notice to Dispatch | Resources On-Site after Notice to Dispatch | Resources |
| Fiber – Outside Plant, Inside Plant | 30 Minutes | 4 Hours | Third Party Providers |
| Network Equipment – Service Node (Core) | 1 Hour | 4 Hours | Front Line Maintenance Provider, KeystoneREN Operations, Customer Site/IT Remote Hands and Eyes |
| Network Equipment – Access Node | 4 Hours | 8 Hours | Front Line Maintenance Provider, KeystoneREN Operations, Customer Site/IT Remote Hands and Eyes |
| 3rd Party Telecommunications Services (NNI, Last Mile) | 2 Hours | 8 Hours | TBD |
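The two time columns in the dispatch table chain together: the on-site target starts counting from the notice to dispatch. The sketch below works out the worst-case timeline under the assumption that the notice is issued at the very end of its window. The category keys are shorthand invented for this example.

```python
from datetime import datetime, timedelta

# (notice to dispatch, on-site after notice) taken from the dispatch table.
# The dictionary keys are hypothetical shorthand for the table's categories.
DISPATCH_TARGETS = {
    "fiber": (timedelta(minutes=30), timedelta(hours=4)),
    "service_node": (timedelta(hours=1), timedelta(hours=4)),
    "access_node": (timedelta(hours=4), timedelta(hours=8)),
    "third_party_telecom": (timedelta(hours=2), timedelta(hours=8)),
}


def dispatch_timeline(decision_time: datetime, category: str):
    """Return (latest notice time, latest on-site time) for a dispatch.

    decision_time is when the investigation determined that field
    resources are required.
    """
    notice, on_site = DISPATCH_TARGETS[category]
    notice_by = decision_time + notice
    on_site_by = notice_by + on_site
    return notice_by, on_site_by
```

For a fiber dispatch decided at 08:00, this gives a notice deadline of 08:30 and an on-site deadline of 12:30.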
Incident Escalation
The matrix below details the escalation rules for next level engagement and notification. The decision to escalate is made by the managing group or designated “lead”. Customers may request escalation only after the incident/request has been properly classified using the guidelines above. Customers may request escalation through the Service Desk or the “lead” engaged in the incident.
| Contact | 1- Critical | 2- High | 3- Elevated | 4- Normal |
| Tier 2 Engineering Service Desk Supervisor | Immediate, by normal process | Immediate, by normal process | Immediate, by normal process | Immediate, by normal process |
| Service Desk Supervisor On-Call Service Desk Manager Tier 2 Engineering Manager KeystoneREN Engineering KeystoneREN Field Services Group Customer IT Support (Remote Eyes/Hands) | 1 Hour | 4 Hours | 1 Day | 30 Days |
| Service Desk Supervisor On-Call KeystoneREN Director of Network Operations KeystoneREN Field Services Group (Mgmt) | 4 Hours by phone | 12 Hours by phone | 3 Days | |
| Service Desk Director Tier 2 Engineering Director KeystoneREN Director of Network Operations | 4 Hours by phone | 12 Hours by phone | 3 Days | |
| KeystoneREN CTO KeystoneREN Executive Director Indiana University/GlobalNOC AVP of Networks | 12 Hours | | | |
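One way to read the escalation matrix is as a list of levels, each with a time threshold per priority. The sketch below encodes it that way, under several assumptions: "1 Day" and "3 Days" are treated as calendar days, blank cells mean no escalation is defined, the "12 Hours" value in the last row applies to Critical only, and the level names are abbreviated labels rather than the exact contact lists.

```python
from datetime import timedelta

H = timedelta(hours=1)
D = timedelta(days=1)

# (abbreviated level label, {priority: elapsed time that triggers escalation})
ESCALATION_LEVELS = [
    ("Tier 2 Engineering and Service Desk Supervisor",
     {1: 0 * H, 2: 0 * H, 3: 0 * H, 4: 0 * H}),
    ("Managers, KeystoneREN Engineering, Field Services, Customer IT Support",
     {1: 1 * H, 2: 4 * H, 3: 1 * D, 4: 30 * D}),
    ("Director of Network Operations and Field Services management",
     {1: 4 * H, 2: 12 * H, 3: 3 * D}),
    ("Service Desk and Tier 2 Engineering Directors",
     {1: 4 * H, 2: 12 * H, 3: 3 * D}),
    ("CTO, Executive Director, and AVP of Networks",
     {1: 12 * H}),
]


def levels_due(priority: int, elapsed: timedelta) -> list:
    """Return the labels of every level whose threshold has been reached."""
    return [
        label
        for label, thresholds in ESCALATION_LEVELS
        if priority in thresholds and elapsed >= thresholds[priority]
    ]
```

Under these assumptions, a Critical incident that has been open for five hours would have reached the first four levels, while a Normal ticket open for one day would have reached only the first.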
Field Services Coordination
The SST will coordinate Field Services in a joint effort with the Field Services organizations, Tier 2 Engineering, KeystoneREN, 3rd Party Providers, and the customer IT support team. The process outlined below is a general overview of the steps that will be followed when coordinating Field Services.
Field Services Coordination Process
1. Respond to direct operational issues and requests:
a. Outage and repair (break/fix)
b. Scheduled or emergency maintenance
c. Installation or decommission
d. Facility maintenance scheduled at a node
e. Determine what components are needed for repair, where they are located, and the availability
2. Determine if work is performed by remote hands, vendors, network engineering, or other support entities.
3. Develop schedule of obtaining parts, remote hands dispatch, change management events, and availability of NOC staff.
4. Contact those involved via field service requests and standard communication channels, share scope of work and schedule.
5. Send maintenance notification to the network community.
6. Document and timestamp all events and issues.
7. Ensure work is complete and resources are restored before dismissing remote hands and others assisting.
8. Ensure trouble-tickets, network information databases, systems, and general documentation are updated and complete.
KeystoneREN/Customer Operating Level Agreements
KeystoneREN customers hosting Service or Access nodes agreed to a set of Operating Level Agreements that were included in the Customer Node Agreements. These OLAs were between the Customer Site/IT Staff and KeystoneREN to support the Service Agreements for service(s) delivered on KeystoneREN.
Section V-A Maintain Room Space, Environment, and Access Control
Section V-B Assist with Service Provisioning and Customer Interconnects
Section V-C Provide remote “eyes and hands” for troubleshooting and operational support
The GlobalNOC, using the incident/request classification system, will determine when a request for “Immediate Service” or a request for “Planned Service” is required of the KeystoneREN Customer Site/IT support, as specified in Section V-C (Provide remote “eyes and hands” for troubleshooting and operational support).
It is expected that Customers will honor the OLAs and support the GlobalNOC and KeystoneREN troubleshooting efforts by responding to Immediate Service and Planned Service requests within the time frames agreed to in the Customer Node Agreements.
Service Monitoring, Technical Support and Maintenance
Service Monitoring
The GlobalNOC has a sophisticated set of network measurement, monitoring, and management systems to monitor KeystoneREN network equipment and services on a 24x7x365 basis. All active network components are monitored by the GlobalNOC. Customers with Service Agreement(s) for dark fiber service who experience fiber-related outages must report problems to the Service Desk. Online tools for viewing information about KeystoneREN, including the status of service requests and trouble tickets, are available at https://noc.keystoneren.org/
Technical Support
Technical support is available by telephone, online, or email for problem reporting on a 24x7x365 basis. GlobalNOC provides technical support for all service-related inquiries. Trouble tickets can be submitted and tracked at https://noc.keystoneren.org/
Maintenance
KeystoneREN’s standard maintenance window is Tuesday 00:00 EST to 06:00 EST. Scheduled maintenance is performed during the maintenance window. GlobalNOC will provide forty-eight (48) hours’ notice via electronic methods for non-service-impacting scheduled maintenance. GlobalNOC will provide seven (7) days’ notice via electronic methods for service-impacting planned maintenance. Emergency maintenance for network hardware and KeystoneREN underlying infrastructure (lit and dark fiber) is performed as needed.
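The notice periods and the standard window above can be checked mechanically. The following is a minimal sketch, assuming naive `datetime` values that are already expressed in the maintenance window's local time (EST); the function names are illustrative only. Emergency maintenance is outside the scope of these checks.

```python
from datetime import datetime, timedelta


def required_notice(service_impacting: bool) -> timedelta:
    """Notice period: 7 days if service impacting, otherwise 48 hours."""
    return timedelta(days=7) if service_impacting else timedelta(hours=48)


def notice_is_sufficient(notified_at: datetime, starts_at: datetime,
                         service_impacting: bool) -> bool:
    """True if the notification was sent early enough before the start."""
    return starts_at - notified_at >= required_notice(service_impacting)


def in_standard_window(start_local: datetime) -> bool:
    """True if the start falls on Tuesday between 00:00 and 06:00 local time.

    datetime.weekday() numbers Monday as 0, so Tuesday is 1.
    """
    return start_local.weekday() == 1 and start_local.hour < 6
```

For example, a notification sent exactly 48 hours before a Tuesday 00:00 start satisfies the rule for non-service-impacting work but not for service-impacting work.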
Change Management
Change Management is a process by which a NOC coordinates network installations, maintenance, and enhancements on a 24x7x365 schedule.
The Service Desk’s Change Management Team will help coordinate all aspects of KeystoneREN’s Change Management process, including scheduling, community notification, event logistics, and general communications with engineering, vendors and field service groups. Proposed network changes are tracked through the NOC’s trouble ticket system with requests submitted via web form. A Service Desk Supervisor serves as the primary change coordinator, with the assistance of Lead Shift Technicians.
KeystoneREN/GlobalNOC Change Management Process
- Network engineers submit a Change Request web form for all scheduled maintenance, provisioning, and field service requests, which automatically creates a ticket in the Footprints ticketing system.
- The request is reviewed by the Service Desk to ensure all necessary information is collected. If the information is incomplete, the Service Desk works with the requestor to complete the information.
- The request is approved based on whether the network’s policies for advance notification are met.
- Customer/community is notified via email distribution lists.
- Maintenance is added to maintenance calendars.
- Maintenance is monitored at the scheduled date/time. Engineers will contact the Service Desk as appropriate prior to the start of the maintenance and after it is completed.
- All action items and/or pertinent information regarding the change itself are tracked in the ticket, including any correspondence with the customer, vendor, or engineer.
- If the change is successful, a “maintenance complete” notification is sent out.
- If the change goes beyond the scheduled time frame, an “unscheduled maintenance extension” notification is sent out.
- If the change was unsuccessful, back-out plans are implemented, and customer/community notification is sent out.
- After the change is complete, the Service Desk and Engineering groups will have one week to update all documentation related to the change, and notify staff of any pertinent information.
- The ticket is closed.
Service Level Objectives
KeystoneREN provides service level objective definitions for KeystoneREN services including network availability and mean time to respond. The service objectives are measured monthly from the KeystoneREN port, fiber, or optical connection.
Availability
Availability is a measurement of the percentage of total time that the service is operational when measured over a 30-day period for normal operations, not including maintenance windows. Service is considered “inoperative” when either of the following occurs: (i) there is a total loss of signal for the service, or (ii) the output signal presented to the customer by KeystoneREN does not conform to the technical specifications described in the KeystoneREN services product data sheets or services catalog.
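As a worked illustration of the availability definition, the sketch below computes the percentage for one measurement period. It assumes that maintenance-window time is removed from the measured period before the ratio is taken, which is one reasonable reading of "not including maintenance windows"; the function name and parameters are illustrative only.

```python
def availability_percent(outage_minutes: float,
                         maintenance_minutes: float = 0.0,
                         period_days: int = 30) -> float:
    """Percentage of measured time during which the service was operational.

    outage_minutes: time the service was "inoperative" outside maintenance.
    maintenance_minutes: maintenance-window time excluded from measurement.
    """
    measured = period_days * 24 * 60 - maintenance_minutes
    if measured <= 0:
        raise ValueError("maintenance time covers the whole period")
    if not 0 <= outage_minutes <= measured:
        raise ValueError("outage time must lie within the measured period")
    return 100.0 * (measured - outage_minutes) / measured
```

A 30-day period contains 43,200 minutes, so 432 minutes of outage with no excluded maintenance corresponds to 99% availability.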
Mean Time to Respond
Mean Time to Respond is the average time required for the GlobalNOC to begin troubleshooting a reported fault. The Mean Time to Respond objective is 20 minutes.
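The Mean Time to Respond calculation can be sketched as an average over (reported, troubleshooting started) timestamp pairs. The data shape and names below are assumptions made for illustration; how the GlobalNOC ticketing system records these timestamps is not described here.

```python
from datetime import datetime, timedelta

# Objective stated in this document: 20 minutes.
MTTR_OBJECTIVE = timedelta(minutes=20)


def mean_time_to_respond(incidents) -> timedelta:
    """Average delay between a fault report and the start of troubleshooting.

    incidents: iterable of (reported_at, troubleshooting_started_at) pairs.
    """
    delays = [started - reported for reported, started in incidents]
    if not delays:
        raise ValueError("at least one incident is required")
    return sum(delays, timedelta()) / len(delays)


def meets_objective(incidents) -> bool:
    """True if the mean response time is within the 20-minute objective."""
    return mean_time_to_respond(incidents) <= MTTR_OBJECTIVE
```

Two incidents answered after 10 and 20 minutes give a mean of 15 minutes, which is within the objective even though one response sat exactly at the limit.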