User Case: Lack of Network Infrastructure Security
Ms. Zhong, the IT manager at the computer center of a large supermarket chain, reported a severe network infrastructure security issue to the Network Hospital today. The local area network at the central site was exceptionally slow, affecting financial settlement and logistics coordination with the management centers of the various chain stores.
The problem began two weeks ago with an initial, noticeable drop in network speed. Over time, the situation deteriorated until today, when the network was essentially paralyzed. Internal data retrieval took 3 minutes (up from 3 seconds), and every transaction or logistics registration took around 2 minutes (up from at most 5 seconds). This slowdown resulted in delayed goods deliveries, manual bookkeeping at some stores, and significant supply chain disruptions.
Ms. Zhong explained that since the transaction and settlement center for goods delivery was part of the central network, their network maintenance staff first initiated an emergency repair procedure. Ping tests against all critical servers, routers, remote routers, and external servers showed response times under 15 ms, indicating reasonably good connectivity.
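A check like the one the maintenance staff ran can easily be scripted. The sketch below is a minimal illustration, not the staff's actual procedure: the host names and RTT figures are invented, and only the 15 ms bound comes from the case.

```python
# Flag hosts whose measured ping round-trip time exceeds a threshold.
# Host names and RTT values are illustrative, not from the case.

RTT_LIMIT_MS = 15.0  # the bound used in the emergency repair procedure


def slow_hosts(rtts_ms, limit=RTT_LIMIT_MS):
    """Return, sorted, the hosts whose average RTT exceeds the limit."""
    return sorted(host for host, rtt in rtts_ms.items() if rtt > limit)


measured = {
    "core-router": 2.1,
    "server-1": 4.8,
    "remote-router": 14.2,
    "external-server": 9.9,
}

print(slow_hosts(measured))  # all hosts under 15 ms -> []
```

As in the case, a clean result here only proves basic reachability; it says nothing about traffic load, which is why the diagnosis had to continue.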
The central network system was shut down temporarily and restarted. Initially, it appeared to work faster, but within 10 minutes, the speed drastically deteriorated to a critically low level. Five backup servers were brought online, replacing five original servers, which significantly improved network speed. However, this improvement was short-lived, and after about 2 hours, it was observed that server traffic was abnormally high, with the router almost saturated.
Closing half of the servers and sites led to increased network speed. It seemed there was a correlation between network traffic and the number of sites, making it difficult to pinpoint the exact location of the network issue. Suspicions arose regarding a potential “virus” causing the problems, and all sites and servers underwent multiple antivirus scans, but the issue persisted after the systems were rebooted.
Diagnosing the Network Infrastructure Security Case
The root of the problem was likely in the central network, but the possibility of influence from other remote networks was not excluded. To investigate further, the Network Hospital team decided to visit the location of the supermarket chain’s central computer center.
Thirty minutes later, they arrived at the site. The F68X network tester was connected to the central network switch for observations, and the MIB agents for each port on the core switch and workgroup switches were monitored one by one. Port traffic was high, yet the switches themselves were operating normally. An odd phenomenon was discovered, however: the traffic on the various ports was roughly the same, averaging around 50% to 60%. When asked whether there were any baseline test records or recent network health test records, Ms. Zhong said there were none. The network had been working smoothly since its installation six months earlier, and any minor issues had been resolved quickly by the network management staff.
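The per-port utilization figures a tester like the F68X reads from switch MIBs can also be derived by hand: poll the standard interface octet counters twice and divide the byte delta by link capacity over the interval. The sketch below shows only that arithmetic, with invented counter values; real polling would use SNMP against ifInOctets/ifOutOctets and ifSpeed.

```python
def utilization_pct(octets_t0, octets_t1, interval_s, speed_bps):
    """Percent utilization of a link between two octet-counter samples.

    octets_t0, octets_t1: combined in+out octet readings (bytes),
    interval_s: seconds between the two polls,
    speed_bps: port speed in bits per second (e.g. SNMP ifSpeed).
    """
    bits = (octets_t1 - octets_t0) * 8  # octets are bytes; convert to bits
    return 100.0 * bits / (interval_s * speed_bps)


# 75 MB moved over 60 s on a 100 Mb/s port -> 10% utilization
print(round(utilization_pct(0, 75_000_000, 60, 100_000_000), 1))
```

With figures like the 50-60% seen on every port here, the same formula makes the anomaly obvious at a glance once a normal baseline (a few percent, as the Tips section notes) is known.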
Hence, there were no other documents on network maintenance apart from machine profiles and network topology diagrams. The high network traffic levels indicated the presence of some fault. To address this, two key questions needed answers: the primary working protocols on the network and whether the observed traffic was expected for normal operation.
Unfortunately, the network could not provide this data at the time. The F69X traffic analyzer was connected to all eight servers and switches to observe the distribution of application traffic on the network backbone. The results indicated that approximately 50% of the traffic on each server consisted of cc:Mail data packets. After cc:Mail, the remaining traffic was divided, in descending order, amongst Oracle applications (3%), HTTP (2%), MS-SQL Server (1%), DNS (1%), FTP (0.7%), Oracle (0.5%), and Informix (0.1%). cc:Mail data packets were observed at almost every site and server in the central network.
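The breakdown the F69X produced amounts to grouping captured bytes by application and converting each group into a share of the total. A toy version of that tally follows; the sample figures are invented and merely echo the cc:Mail-dominated pattern from the case.

```python
from collections import Counter


def traffic_shares(samples):
    """samples: iterable of (application, byte_count) pairs.

    Returns each application's percentage share of total bytes,
    ordered from largest to smallest share.
    """
    totals = Counter()
    for app, nbytes in samples:
        totals[app] += nbytes
    grand = sum(totals.values())
    return {app: round(100.0 * n / grand, 1)
            for app, n in totals.most_common()}


# Invented capture summary, shaped like the case's findings
captured = [("cc:Mail", 5000), ("Oracle", 300), ("HTTP", 200),
            ("cc:Mail", 4000), ("MS-SQL", 100)]
print(traffic_shares(captured))
```

A single application claiming half the backbone, as cc:Mail did here, stands out immediately in such a table.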
Notably, there were data packets sent through the router to other recipients in the local area network, suggesting that all members of the central network were sending cc:Mail application data to all other members. The issue was how these email data packets were entering the various servers and workstations. Upon revisiting the onset of the problem, it was revealed that the issue started two weeks ago, which was around New Year’s Day in 2000.
Everyone was asked if they had run any illegal software on the network, including electronic greeting cards. Ms. Zhong mentioned that she noticed the network management staff had circulated an interesting electronic Christmas card. She liked the card but prohibited its use due to professional responsibility and company policy. Could this card be the culprit?
To confirm their suspicions, the team decided to format the hard drives and reinstall the system on three main servers and ten workstations. Backup data was then restored to the servers, and at this point, only remote chain management centers were allowed to perform business data exchanges with the three servers. Other servers and workstations were temporarily shut down. Upon starting the system and monitoring port traffic, all ports had less than 4% of traffic. The network ran without issues, which indicated that the formatted servers were no longer running the cc:Mail application.
A decision was made to bring all chain stores back online by 10 PM. First, the non-formatted servers and workstations were started up, and 11 network management staff members from the remote chain management centers helped simulate network operations. After approximately 10 minutes, port traffic began to rise rapidly. The traffic analyzer showed that illegal cc:Mail application traffic first appeared on Server 6, followed by Workstations 17, 42, and 31, and then other servers in succession. These machines had all installed and run the greeting card program, "My World Is In Favor."
A preliminary diagnosis was reached: the illegal network application likely started with the greeting card and expanded into a hacking program during data exchange, causing it to send cc:Mail application data to all sites involved in data exchange. Since this program was contagious, it quickly infected all central network members and progressed gradually. Due to relatively low application traffic, the issue persisted for an extended period, and the traffic on each switch port was roughly equivalent, at around 50%.
The captured data packets were decoded and analyzed, revealing that the emails were unidirectional with no responses and the message content repeated: “My world is in favor, I love you.” The network was temporarily halted, all network equipment (including routers) was powered off, and all servers and workstations were formatted. Teams were organized to reinstall the system and applications, recover backup data, and work diligently for nearly 4 hours. The network was restarted the next day at 7 AM. By noon, the monitored data traffic was below 5% at the port level and less than 4% at the server level.
Conclusion of Network Infrastructure Security Case
There are numerous potential risks in network applications. To maintain a healthy network environment, it is crucial to prevent any illegal programs or pirated software from running on a dedicated network.
In this case, the fault occurred because network management staff privately ran software containing a hacking program, leading to high traffic surges and the near-complete paralysis of the network. The mechanism behind this hacking program was concealed: it initially infected local area network servers or workstations, then progressively spread during data exchange, steadily increasing network traffic. The routers used DDN and, in part, ISDN links, which were more susceptible to bottleneck effects and thus easier to congest. As a result, network speed decreased within the local area network, and the wide-area links became slower still.
Since the traffic distribution was relatively balanced, the network management system did not trigger any alarms: it lacked traffic alarm threshold settings, and by the time the traffic load had almost reached its limit, the router channels were congested.
Tips for Network Infrastructure Diagnosis
Regular baseline testing should be part of network testing and maintenance, as it helps network maintenance and management personnel understand the network’s changing trends, directions, and patterns of fault occurrence.
For instance, if baseline testing data indicates that average network traffic is usually below 6% and that there are 15 known working protocols on the network, then any traffic exceeding 6% should be monitored for changes, and the observed protocols should be verified against the known list to check whether any illegal protocols are running. In this case, cc:Mail was not among the network's legitimate working protocols, so network management staff should have addressed it promptly. Documentation and records of baseline network tests could have led to the swift correction of this fault.
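The baseline rule described above, alerting when utilization exceeds the historical norm or an unrecognized protocol appears, is simple to automate. The sketch below uses the 6% figure from the paragraph; the protocol list is a small illustrative subset, and the function name is an assumption of this sketch.

```python
BASELINE_UTIL_PCT = 6.0  # the example baseline from the text
# Illustrative subset of a network's known working protocols
KNOWN_PROTOCOLS = {"HTTP", "DNS", "FTP", "Oracle", "MS-SQL", "Informix"}


def baseline_alerts(util_pct, protocols_seen,
                    limit=BASELINE_UTIL_PCT, known=KNOWN_PROTOCOLS):
    """Return alert strings for out-of-baseline conditions."""
    alerts = []
    if util_pct > limit:
        alerts.append(f"traffic {util_pct:.1f}% exceeds baseline {limit:.1f}%")
    for proto in sorted(set(protocols_seen) - known):
        alerts.append(f"unknown protocol on the wire: {proto}")
    return alerts


# The case's symptoms: ~55% utilization and cc:Mail on the wire
print(baseline_alerts(55.0, {"HTTP", "DNS", "cc:Mail"}))
```

Run against the symptoms of this case, the check raises both alarms at once, which is exactly the early warning the supermarket chain's management system could not provide.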
Furthermore, traffic management is an essential monitoring and management method in advanced network management, helping to monitor network applications, track hackers, purify network protocols, identify network issues, estimate network operating costs, optimize network structures, and more. From a preventive standpoint, strengthening internal management and user education should be consistently and strictly enforced.
Afterword
Ms. Zhong contacted us the following day to report that the network continued to function smoothly. Observations from the traffic tester showed that the illegal protocol application never reappeared. They were now in the process of documenting the network and conducting baseline tests. Starting today, they would commence continuous monitoring and analysis of the network’s health indicators, implementing a “Network Health Maintenance Policy.”