networkZONE Products for the week of September 20, 2004
Cavium Networks Says . . .
Cavium's Network Services Processors Support Internet
Services, Content and Security Processing Applications With Up to 16 MIPS64-Based
Cores
Cavium Networks has introduced OCTEON, the industry's first single-chip Network Services Processor (NSP) family for secure, Layer 3 to Layer 7 networking applications. Today's implementations of higher-layer application processing require a myriad of chips, including control plane processors, data-plane processors and coprocessors for Internet services and security. OCTEON NSPs introduce a revolutionary new system-on-chip (SoC) architecture that integrates the functionality of these multiple types of processors to deliver up to 5x benefit in price, performance and power over existing solutions for Internet Services, Content and Security processing in networking applications.
OCTEON processors include 2 to 16 cnMIPS cores, Cavium Networks' implementation of MIPS64 with Release 2 enhancements and additional built-in hardware acceleration for content and security processing, along with on-chip coprocessor blocks for Internet Services acceleration and multiple Gigabit Ethernet, SPI-4.2 and PCI-X interfaces. OCTEON NSPs provide full compatibility with the large base of developed application software and development tools available for the industry-standard MIPS instruction set architecture (ISA) licensed from MIPS Technologies. The products are targeted for use in a wide variety of OEM networking equipment including routers, switches, network-edge appliances with Firewall, VPN, IDS, Anti-Virus and Anti-Spam functionality, secure intelligent switches with SSL and content switching, XML switches, intelligent NICs, storage and wireless network applications.
"As networking equipment has progressed from delivering raw bandwidth to intelligent services, there is a need for highly integrated devices that can deliver rich functionality at high packet throughputs with a standard C-based programming model," said Linley Gwennap, Principal Analyst at The Linley Group. "Using its world-class processor design team, which delivered GHz-plus Alpha processors at DEC, Cavium Networks has developed an innovative, integrated, MIPS64-based processor that fills this need."
Network Services Processor for Content-Aware Networks
The next phase in the evolution of the Internet is the deployment of application-aware
networks upon which secure, content-aware services can be provided at mass-market
cost points. Integrated, application-aware networking systems need to process,
filter and switch a range of L3 to L7 Internet service protocols such as
HTTP, TCP, XML and SMTP, and simultaneously secure these protocols with access-
and content-based security through Firewall, VPN, SSL, IDS, IPS, Anti-Virus
and Anti-Spam functionality at wire speed. General-purpose CPUs that are
currently used in these applications have been designed for control plane
applications and therefore have limited data plane throughput and require
multiple application-specific coprocessors. Network Processors, on the other hand,
are designed primarily for L2-L3 processing and are burdened with complex and
proprietary software development models. The Network Services Processor
(NSP) is a new class of processor that offers the ease of use of standard
OS based programmability, along with high data-plane throughput with built-in
hardware acceleration for both intelligent Internet services applications
and security processing in a seamless and balanced manner providing up to
5x benefit in cost, power and performance for integrated network services.
"The evolving needs of intelligent networks have outpaced the current generation of processor technologies, which are falling short of addressing the multilayer nature of network services at increasing data speeds," said Kevin Krewell, Editor in Chief, Microprocessor Report, In-Stat/MDR. "Cavium Networks' initiative to combine multiple processor technologies in an innovative, easy to use architecture represents the beginning of the next wave of highly integrated, multi-core processors that will serve as the heart of next-generation intelligent networking equipment."
OCTEON Network Services Processor Family
The OCTEON family's scalable architecture combines 2 to 16 cnMIPS cores
with integrated HW acceleration, along with dedicated programmable coprocessor
blocks that deliver up to 10Gbps of application performance at conservative
600MHz chip clock rates. Each cnMIPS core in the OCTEON NSP is a dual-issue,
superscalar processor with L1 instruction and data caches, a write buffer,
a local scratchpad, a full memory management unit for virtual memory support
and built-in hardware acceleration for cryptography algorithms including
3DES, AES (all modes), SHA-1, MD5, RSA and DH. The OCTEON NSP has a fully coherent,
ECC-protected 1MB L2 cache and incorporates special cache locking and partitioning
functionality to ensure high data plane throughput.
OCTEON NSP's main memory interface supports ECC-protected DDR I / DDR II DRAM up to 400MHz, with capacity of up to 16GB. Additionally, there are up to two channels for ECC or parity-protected low-latency RLDRAM/FCRAM with up to 1GB memory support. OCTEON's Hyperaccess memory subsystem has been architected for multi-core support and tuned to deliver both the high throughput and the low latency required by memory-intensive content networking applications. Hyperaccess uses extensive buffering and intelligent bank management to provide efficient cache and system bus utilization. Using Hyperaccess, the cnMIPS core has a unique low-latency direct-access path to RLDRAM/FCRAM that bypasses caches and allows fast access to state information, such as TCP context and signatures for anti-virus and IDS applications.
Hardware Acceleration Coprocessor Blocks
OCTEON NSP integrates a number of application-specific coprocessors that
completely offload the cnMIPS cores and achieve high throughput:
The OCTEON NSP family offers highly flexible external networking interfaces with 4 to 8 integrated Gigabit Ethernet ports (RGMII) or dual SPI-4.2 interfaces, with a host/slave PCI-X 64-bit 133MHz interface that can be used as both a data and control interface. OCTEON NSP also offers auxiliary interfaces such as GPIO, Flash, MDIO, dual UARTs and 2-wire serial interfaces.
"We have validated this new innovative architecture over the last two years with industry-leading tier-one customers," said Syed Ali, President and CEO of Cavium Networks. "The tremendous enthusiasm and commitment we have received from customers is a testament to OCTEON's value-proposition. OCTEON promises to revolutionize the landscape of networking services by enabling ubiquitous deployment of intelligent, content aware networks."
Standard OS, C-Code Based Software Development
OCTEON NSP supports standard operating systems including Linux and VxWorks
along with a thin executive for data-plane applications. OCTEON can host
a variety of popular software architectures, including separate
operating systems on separate cores, flexible grouping of cores into data-plane
and control-plane processors, and run-to-completion
or pipelined software models. Cavium Networks provides a complete GNU tool-chain
and popular third party tool-chain support that enables thousands of MIPS32,
MIPS64 and other C/C++ applications and code to be easily ported to OCTEON.
Additionally, Cavium Networks provides APIs and reference software for Firewall,
VPN/IPsec, TCP, IDS and Anti-virus applications. No special micro-coding
or proprietary tool-chains are required.
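The difference between the run-to-completion and pipelined software models can be sketched in a few lines. The following is a conceptual illustration only, written in Python for brevity; it does not use any Cavium API, and the stage names (parse, inspect, forward), the packet fields and the round-robin core assignment are all assumptions made for the example.

```python
from collections import deque

# Hypothetical three-stage packet path; the stages and packet fields are illustrative only.
def parse(pkt):
    pkt["proto"] = "tcp" if pkt["dport"] in (80, 443) else "other"
    return pkt

def inspect(pkt):
    pkt["allowed"] = pkt["proto"] == "tcp"
    return pkt

def forward(pkt):
    pkt["out_port"] = 1 if pkt["allowed"] else None
    return pkt

STAGES = [parse, inspect, forward]

def run_to_completion(packets, num_cores):
    """Each core takes a whole packet through every stage before picking up the next one."""
    per_core = [[] for _ in range(num_cores)]
    for i, pkt in enumerate(packets):
        core = i % num_cores  # assumed round-robin distribution across cores
        for stage in STAGES:
            pkt = stage(pkt)
        per_core[core].append(pkt)
    return per_core

def pipelined(packets):
    """Each stage is pinned to its own core; packets move between cores through queues."""
    queues = [deque(packets)] + [deque() for _ in STAGES]
    for core, stage in enumerate(STAGES):
        while queues[core]:
            queues[core + 1].append(stage(queues[core].popleft()))
    return list(queues[-1])
```

Both functions produce the same per-packet results; what differs is which core touches which packet. Run-to-completion keeps a packet's state on one core, while pipelining dedicates each core to a single stage at the cost of handing packets between cores.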
OCTEON Delivers Unmatched Application Performance
The OCTEON NSP enables a whole new class of functionally integrated appliances
and services blades. For example, the 16-core OCTEON processor enables an
integrated security appliance with Firewall, VPN/SSL, IDS, Anti-virus and
Spam-filtering at performance of up to 4Gbps or a single application Firewall,
VPN or SSL appliance at line rates of up to 10 Gbps. The OCTEON processor
also enables content-aware switches with SSL, application firewall, load-balancing
and content filtering and processing at performance of up to 4Gbps. With
OCTEON processors, Storage HBAs and switches can achieve up to 10 Gbps of
TCP, iSCSI and IPsec performance. Leveraging the same software, a scalable
family of products can be designed from 500Mbps to 10Gbps at multiple price
points. The OCTEON processor can also be used in network-interface-card
(NIC) or co-processor applications.
analogZONE Says . . .
At first glance, Cavium's expansion of its original product line of security processors to include an application-centric general-purpose acceleration/offload chip does not seem to make sense. After all, the company has done rather nicely after establishing its reputation as a vendor of high-powered security engines (they even took a 2003 Product Of The Year award for their NITROX II family). Heck, they even acquired the Brecis family of secure communication processors to bolster the lower end of their product line, which delivers embedded security processing in the 10 - 50 Mbit/s range. So why should a successful company move out of its "comfort zone"? A closer look at the architectures of these two products, and today's market trends, will give you some clear hints as to why the folks at Cavium have expansion on their minds.
They have obviously noticed that the definition of security processing has grown well beyond its original encrypt/decrypt function to include intrusion detection and virus detection, as well as support for DMZs, firewalls, and anti-spam filters. Today, these tasks are usually handled by discrete (and expensive) boxes, but the functions are merging over time as manufacturers try to provide a single box that performs all security functions. The work these boxes do lies across multiple layers of the protocol stack, ranging from classification and examination of both header and content to decompression and decryption tasks. For example, service providers may need to inspect e-mail for viruses, including extraction and inspection of e-mail attachments.
To address this expanding charter, Cavium has re-purposed the multi-processor cores used in its NITROX product lines and given birth to the OCTEON processor. It is a true application accelerator that addresses these under-served functions and leaves control plane or traffic management functions to other good chips already on the market. The device complements "traditional" packet processors by focusing on service support tasks, which frees network processors to concentrate on L2/L3 data plane functions and lets traffic managers handle flow control and QoS-related issues. This philosophy of allowing other chips to do what they do best is also evident in the "special I/O" port, which allows the use of ASICs or TCAM for further acceleration.
While I'm a big fan of pipelined arrays for many applications, the discrete processors are appropriate here as they make for a very neat division of labor between applications and can be powered down on a per-CPU basis, as the task allows, to conserve power. OCTEON does depart from the NITROX architecture in one important respect -- it's designed as a true flow-through machine, whereas earlier Cavium silicon was look-aside-oriented and required either an ASIC or NP to run in-line. It should hook up easily to a wide variety of new and existing designs via its SPI-3/4.2 interfaces.
The chip contains a nice blend of programmable RISC cores and specialized security engines that deliver the ability to process higher layers of packets along with high throughput of encrypted traffic. As with the original security chips, it uses a 64-bit MIPS-compliant (more on this later) custom RISC core that's designed to be C-friendly for easy development. Depending on the particular chip you're using, you get between 2 and 16 RISC engines to play with. You also get a good selection of dedicated accelerator cores that support security functions such as IPsec, SSL, IKE, RSA, and RNG. Other cores compress/decompress packets on the fly, while still others perform regex (regular expression) detection and pattern-match functions (IDS/AV) that can be used for wire-speed detection of viruses and intruders.
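To give a feel for the kind of work the regex and pattern-match engines take off the RISC cores, here is a minimal software sketch of signature scanning. It is plain Python using the standard `re` module, not Cavium's API, and the three signatures are invented for illustration; real IDS/AV rule sets are far larger.

```python
import re

# Hypothetical signatures, made up for this example.
SIGNATURES = {
    "demo-shell-probe": rb"/bin/sh",
    "demo-traversal": rb"\.\./\.\./",
    "demo-script-tag": rb"(?i)<script[^>]*>",
}

# Compile once up front so the per-packet path only performs searches.
COMPILED = [(name, re.compile(pattern)) for name, pattern in SIGNATURES.items()]

def scan_payload(payload: bytes) -> list:
    """Return the names of all signatures that match anywhere in the payload."""
    return [name for name, rx in COMPILED if rx.search(payload)]
```

In software, every added signature costs another pass over each payload, which is exactly the sort of per-byte work that is hard to sustain at line rate on general-purpose cores.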
Together, the processors can deliver as much as 10 Gbit/s of throughput when you're running a single application. If you don't need all that raw performance, you can run multi-application-sets that distribute a group of related functions across the chip's processors. Some fun combinations might include:
The chip can be programmed to operate in a data plane-only mode, but it can also support a control plane-only mode where it handles all administrative functions and routing stacks. It also supports a mixed mode that allows some control plane functions to remain resident on the host system. This mixed-mode operation offers potential cost savings in less bandwidth-intensive applications by running control plane functions without a second processor chip.
With an overall sketch of the chip out of the way, it's probably useful to take note of a number of important little details that make the OCTEON such a high-powered device. First of all, it's good to note that the RISC machines are not MIPS-supplied cores, but rather a custom-designed, code-compatible superset of the MIPS architecture with specialized enhancements for security and packet processing. Besides adding the extra instructions, Cavium designed the CPU to execute two instructions per clock cycle and added power-saving features that shut off parts of the processor when they are not in use.
Memory is used in abundance across the chip. Independent L1 caches for each RISC core ensure speed, while multi-processor operations are supported by a common coherency-managed L2 cache. Each MIPS core has its own on-core memory management unit. This enables partitioning of discrete memory areas between different functions, and keeps applications running outside of the core OS. It also allows the chip to run a wide variety of OSs that require an MMU, including Linux and OBSD. Another small chunk of memory was put on board to implement a "secure vault" that keeps encryption keys, the output of the integrated random number generator, and other sensitive data from ever having to leave the chip.
Cavium has worked hard to simplify the task of programming OCTEON. As with the NITROX, the bulk of an application can be written in C without regard to processor architecture, but the development tools support direct calls to hardware functions for maximum efficiency in critical code segments. There is also lots of software support from Cavium in the form of APIs for IPsec, SSL, Content Processing, TCP, Compression, and WLAN functions. As mentioned in more detail in the manufacturer's announcement, the chip supports standard operating systems including Linux and VxWorks along with a thin executive for data-plane applications.
The power and flexibility of the OCTEON recommend it to a wide variety of applications. I'd expect its integrated security processing capabilities to win it sockets in secure NICs, security appliances, secure routers and WLAN switches. You should also expect it to gain market traction in application-level gateways (for content switching and load balancing), NICs, and content switches.
Since it can perform application-aware packet inspection and manipulation, OCTEON could find lots of work powering web service appliances and intelligent router blades. Of course, the same intelligence and processing power can also be applied to secure storage networking products.
While it's hard to say whether this chip will successfully put itself at the center of the consolidation of services processors we're now seeing, it most certainly has the raw horsepower and open architecture to make it a chip of choice for these sorts of applications. The brisk sales of their original security processors are a good harbinger of things to come, though. I think their earlier success is in good part attributable to having mastered the tough job of providing ample programming support and making the development tools as friendly as their chip is powerful.
The OCTEON NSP family, along with evaluation boards, will sample in the first quarter of 2005. Four parts will be available, with pricing in 10k-piece lots ranging from $125 for the two-core product to $750 for the 16-core version.
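A quick back-of-the-envelope calculation puts those two price points on a per-core basis. The snippet below uses only the two figures quoted above; prices for the two intermediate parts are not given, so they are left out rather than guessed.

```python
# Prices quoted above for 10k-piece lots, in USD, keyed by core count.
# Only the two end points of the four-part family are published here.
PRICES = {2: 125.0, 16: 750.0}

def price_per_core(cores, price):
    """Return the list price divided evenly across the cores on the part."""
    return price / cores

for cores, price in PRICES.items():
    print(f"{cores:>2} cores: ${price:.2f} total, ${price_per_core(cores, price):.2f} per core")
```

That works out to $62.50 per core on the two-core part and a little under $47 per core on the 16-core part, so the larger device is cheaper per core even before its shared L2 cache, memory controllers and coprocessor blocks are taken into account.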
Cavium requires on-line registration to see the Product Brief on these products.