TY - JOUR
T1 - Unified Monitoring and Telemetry Platform Supporting Network Intelligence in Optical Networks
AU - Shen, Sen
AU - Han, Jing
AU - Bardhi, Klodian
AU - Li, Haiyuan
AU - Yang, Mark
AU - Teng, Yiran
AU - Yokar, Vaigai
AU - Yan, Shuangyi
AU - Simeonidou, Dimitra
N1 - Publisher Copyright: © 2025 Optica Publishing Group. All rights,
PY - 2025/2/1
Y1 - 2025/2/1
N2 - In recent years, machine learning (ML) applications have generated considerable interest and shown great potential in optimizing optical network management, such as quality of transmission (QoT) estimation, traffic prediction, and resource allocation. However, these applications often require large datasets for training, inference, and updating, while network operators are generally reluctant to disclose their data due to privacy concerns and the sensitivity of operational information. Most open-source datasets typically lack transparency regarding network specifics, such as topology details and device configurations, making data acquisition and ML model training more difficult. In response, this paper presents a unified monitoring and telemetry platform that leverages distributed and centralized time-series databases on InfluxDB, a Kafka-based telemetry pipeline, and advanced ML applications. The separation of distributed and centralized databases improves data management flexibility and scalability. The Kafka-based telemetry pipeline ensures high-throughput, low-latency data streaming with end-to-end latency under 0.05s through optimized partitioning. Additionally, integrating Kafka and InfluxDB allows for real-time data visualization from multiple sources, improving transparency and supporting real-time data streaming for network applications. By implementing this advanced telemetry and ML architecture, network operators can build a more intelligent, responsive, and resilient optical network infrastructure.
KW - Optical telemetry
KW - Optical monitoring
KW - Machine learning
KW - Optical network
U2 - 10.1364/JOCN.538552
DO - 10.1364/JOCN.538552
M3 - Article (Academic Journal)
SN - 1943-0620
VL - 17
SP - 139
EP - 151
JO - IEEE/OSA Journal of Optical Communications and Networking
JF - IEEE/OSA Journal of Optical Communications and Networking
IS - 2
ER -