Thursday, June 30, 2011

The Mali GPU, and ARM's 5-year strategy to dominate visual computing


ARM has established a position of dominance in mobile application processors, as the provider of CPU (central processing unit) core IP (intellectual property) to the likes of Qualcomm (Snapdragon), Texas Instruments (OMAP), Samsung (Hummingbird), NVIDIA (Tegra), and others. The company is now building on a strategy to do with GPUs (graphics processing units) what they have accomplished with CPUs. Tom Olson - Director of Graphics Research, and Jem Davies - ARM Fellow and VP of Technology in the Media Processing Division at ARM, spoke with the EE Daily News about the company's strategy to address the growing demand for high-performance visual computing.

ARM's venture into GPUs began five years ago, when they acquired Falanx - a 20-person startup based in Norway. One year later, ARM formed the MPD (Media Processing Division) to address audio, video, and graphics as well as visual computing applications. ARM MPD has a diverse set of customers, including Cisco and NetLogic in the networking space, ST Micro in STBs (set-top boxes), and mobile players that include ST-Ericsson, Samsung, Broadcom and LG. Samsung employs the Mali GPU in the forthcoming Galaxy S2, along with a dual-core Cortex A9 CPU.

Tom Olson says that the two critical factors that drive graphics in mobile devices are gaming and the UI (user interface). In the next five years ARM expects UI to move to gesture input and 3D interfaces (see Live from IDF10 - demo of gesture UI from GestureTek). Visual computing is not just about displaying images, says Olson, but also about taking images in and processing them, with much more complex use of images in the future. Emerging mobile applications such as AR (augmented reality), plenoptics - fixing out-of-focus images, and computational photography will drive requirements for more powerful GPUs.

Olson is also chair of the OpenGL® ES (embedded systems) committee. (OpenGL® ES is "a royalty-free, cross-platform API for full-function 2D and 3D graphics on embedded systems - including consoles, phones, appliances and vehicles").  The committee continues to work to develop standards to support the convergence of  graphics requirements, where smartphones now have the same capability as notebook PCs of just a few years ago.  Games continue to be the dominant category of mobile applications, and as handset screen resolutions have increased consumers expect their smartphone or tablet experience to be PC-like. The challenge, says Olson, is no longer how to integrate enough transistors to do the job, but how to manage the power dissipation within the energy budget that a battery can support.

Content remains king in the media industry, and the competition in Hollywood to produce the next big CG-driven (computer-graphics) blockbuster is very expensive. ARM cites the example of the latest "Pirates of the Caribbean" sequel, which cost $300 million to produce.  This has driven content owners and distributors towards multi-screen strategies, in order to leverage their investments beyond the cinema to home video and mobile devices.  At the same time, consumers are demanding the ability to place-shift their consumption of content anywhere they go, in the car, on smartphones and tablets. ARM wants to be the provider of a common scalable platform that will enable developers to deliver the rich HD (high-definition) media experience that consumers have come to expect across all CE (consumer electronics) devices.

Jem Davies says that the answer to meeting the challenge for scalable HD applications is the use of multi-core architectures.  SoC designers can utilize multiple ARM CPU and GPU cores to meet the peak requirements of an intensive video game, but turn off cores to save power for less intensive tasks such as reading email. Having all this fine-grained hardware control available can increase the complexity of the software required to manage it, but Olson says that ARM designed the Mali-400 MP, the first multicore GPU for mobile, to hide the power scaling functions from the application layer. The software driver, which you would typically run on your ARM CPU core, handles the distribution of tasks transparently. In the next-generation Mali-T604, ARM's GPU contains a hardware job manager that abstracts the power management task even further. Now, says Olson, the OS (operating system) can turn off a core in the middle of processing a graphics frame, and the job manager will handle the request without degrading graphic performance.


ARM sees themselves as ideally positioned to offer a complete optimized graphics sub-system,
integrating each of the four components necessary for high-performance visual computing.

Four components of a Graphics Sub-System
Designers and IP providers are increasingly taking a sub-system approach to SoC design. For graphics sub-systems, designers must optimize CPU-GPU interaction, and the associated memory bandwidth requirements, along with the interconnect fabric that ties the SoC together. As a provider of each of these SIP (silicon IP) components,  ARM's strategy is to leverage their combined strengths to make Mali cores as dominant in visual computing as the company's CPU cores are in the application processor market.

Davies and Olson see the ability to co-develop the GPU and CPU as absolutely necessary to visual computing subsystems. "You must be able to see the whole system", says Olson. Companies that only develop GPUs will find that they are only addressing part of the problem, says Davies. One trend that ARM sees developing more over the next few years is the need for tight integration of the CPU and GPU, so that the two can work in tandem on the same computing task.

The GPU business is still in the early stage of development for ARM, but the company claims that Mali is already the SIP industry's most widely licensed GPU. ARM added 7 Mali licensees in Q1 of this year, for a total of 46, after adding 11 in all of 2010. Of the 46 Mali licensees, just 6 are in production and producing royalties for ARM, indicating that the business is just ramping up.

Wednesday, June 29, 2011

Activity picks up in non-volatile memory IP

Sidense uses a 1T split-channel antifuse in their NVM IP (source Sidense)
Synopsys
Earlier this week, Synopsys announced availability of their DesignWare® AEON® NVM (Non-Volatile Memory) silicon IP (intellectual property) for standard 180-nm CMOS process technologies. Synopsys makes AEON IP available for FTP (few-time programmable), MTP (multiple-time programmable), and EEPROM (electrically erasable programmable read-only memory) applications. 

In the FTP configuration, Synopsys configures AEON for 256 bits of storage, and the company specifies the IP for up to 100 write cycles. Synopsys targets RFID and wireless SoC designs with AEON MTP, with an increase in write cycles up to 1,000 and capacity from 128 bits to 1kb, with low power operation that enables read operation down to 1.0 V. The company targets the AEON MTP EEPROM for automotive grades, in 64 bit to 1kb configurations, with up to one million write cycles and operation at up to 150 degrees C. The FTP and MTP IP are currently in qualification, while the EEPROM has been fully qualified by Synopsys.

Kilopass
On June 28, Kilopass Technology and SMIC (Semiconductor Manufacturing International Corporation) announced that they have extended their OTP (one-time programmable) NVM product offering for SMIC’s 55nm logic CMOS process. Kilopass has previously taped out their NVM IP in SMIC’s 65nm process.


Kilopass uses a 2T (two-transistor) antifuse design, and they target applications such as trimming of analog/mixed-signal functions, embedded boot code, and security keys for multimedia processors, MCUs, and RFID ICs. In April, Kilopass announced development of Itera, an MTP version of their IP for 40nm processes at TSMC, GLOBALFOUNDRIES, and UMC.

Sidense
Sidense is unique in offering a 1T (one-transistor) split-channel, nonvolatile memory cell for OTP applications. With their single transistor design, Sidense claims the smallest NVM cell size in the industry. Sidense products are available from 180nm to 40nm from a number of foundries, including TSMC, UMC, Fujitsu Microelectronics Limited, SMIC, TowerJazz, IBM, GLOBALFOUNDRIES and ON Semiconductor.

Because their 1T technique does not rely on charge storage, Sidense says that their NVM is more secure than EEPROM, FLASH or Logic NVM, which use floating gates. The company says that although charge cannot be seen optically, its presence can be detected (at least theoretically) with advanced sample preparation and microscopy techniques. The greatest vulnerability, according to Sidense, comes from the risk of erasure or reprogramming. Charge-storage NVM can be erased by exposure to high temperature, and its contents can be altered from the back side of the die with an electron beam, according to Sidense.

The Sidense antifuse programming method is based on inducing a structural change in the silicon to silicon dioxide interface, of approximately 10 - 25A (Angstroms) in height and 20-50A in diameter - equivalent to about 10 – 50 atomic layers. Sidense says that this structural change of the antifuse can only be detected using TEM (Transmission Electron Microscopy), which can potentially show the structural change if the cross-section cuts through the breakdown (programmed) spot, a highly laborious and time consuming process.

Sidense offers an SLP (low power) version of their 1T IP in 180nm processes, with capacity up to 256kb. The company also provides a ULP (ultra-low power) version, with up to 2kb storage, that Sidense specifies for standby power draw of less than 0.25µA with read operation down to 1.5V. For higher density applications, Sidense IP is available in densities of up to 512kb, in 130nm to 40nm processes.

Summary
Designers have a lot to choose from in deciding on a vendor for non-volatile memory IP. Your choice will depend on compatibility with your fabrication process, with 180nm as the current sweet spot for NVM.  From there, other factors come into play, such as reprogrammable or one-time programmable, security and low power. The table below provides a comparison of offerings from the vendors discussed in this article.

CEVA adds GPS capability for XC series communications processors

CEVA, Inc. is a leading provider of SIP (silicon intellectual property) DSP (digital signal processor) cores, including the XC321 for mobile handsets, and the XC323 IP core for 4G SDR (software-defined radio) base station applications. CellGuide is a fabless semiconductor and design services company that focuses on GNSS (Global Navigation Satellite System) solutions. CEVA has announced that the two companies have formed a partnership to offer a software-based GPS (global positioning system) solution for the CEVA-XC communications processors.

Licensees of the CEVA-XC IP will now be able to leverage CellGuide’s software intellectual property to add GPS capability to their processor designs, without incurring any modifications to hardware. CEVA says that the software-defined nature of the XC series processors makes it easy to perform intensive GPS computations entirely in software, so that designers will be able to eliminate dedicated GPS baseband hardware with a CEVA-XC based SoC design.  Customers will get a turnkey software-based solution, which they will be able to modify and adapt based on their particular needs.

The GPS software IP has been architected by CellGuide for integration into mobile processors, with support for concurrent operation with other air interfaces in multi-mode SDR modem designs. CellGuide has previously developed "hardware agnostic" solutions with their GPSense™ solution, a "multi-beacon positioning engine for mobile consumer applications", that supports application processors including ARM-based: ARM9, ARM9E, ARM11 and Cortex, MIPS-based, and Intel X86-based devices, with support for Android Version 2.1 and up, as well as various versions of Windows and embedded Linux. CEVA says that designers will be able to use their existing GPS application solution without any changes associated with the CEVA-XC based GPS modem.

CellGuide will be joining the CEVA-XCnet partner program, which includes a number of other SIP, EDA (electronic design automation), software and semiconductor foundry companies, including ARM, Cadence, Carbon Design, Mentor Graphics, CoWare (Synopsys), and TSMC.

Monday, June 27, 2011

Cavium MIPS64-based multicore processors for 4G applications

Cavium Networks is a 10-year old provider of MIPS® and ARM®-based SOCs (systems on a chip) for networking, communications, storage, video and security applications. The company moved to its new headquarters in San Jose, CA today.

Earlier this year, Cavium acquired 4G baseband DSP (digital signal processor) provider Wavesat, and in November 2009, they acquired embedded Linux developer Montavista. YJ Kim, General Manager of the Infrastructure Processor Group at Cavium, delivered a presentation on the company's 4G mobile infrastructure solutions at the recent Linley Tech Carrier Conference. Cavium has designed a family of MIPS-64 based processors for 4G (4th generation) base stations, RNC (radio network control), and core network applications. The latest generation OCTEON II ranges from a single-core device to a 32-core SoC.

Kim began by describing the complexity challenges for 4G network processors. In LTE (long-term evolution) infrastructure, the E-UTRAN (Evolved Universal Terrestrial Radio Access Network) base station, eNodeB or eNB, takes on added complexity over 3G (3rd generation) since the eNB also integrates the RNC (radio network control) functions. LTE promises a 10X or greater increase in data rate over 3G networks, but base station radios must live within a 15W to 20W power budget. All-IP networks will utilize security and DPI (deep packet inspection) functions, such as the 3GPP use of the SNOW 3G algorithm, with the result being a data throughput requirement of up to 40 Gbps per blade in core network processing, according to Kim.

The OCTEON II CN66XX integrates 6-10 MIPS64 v2 cores for 3G/4G/LTE wireless base stations.
For Layer-2 to Layer-7 (MAC/Scheduler/Control/Transport) processing in LTE base stations, Cavium starts with the CN62XX design, which integrates 2-4 MIPS cores that are capable of operating at a 1.0 GHz clock rate. The next step up is the CN63XX family, which offers 2-6 MIPS cores while increasing operating speed to 1.5GHz. Cavium targets the OCTEON II CN66XX family at high performance 3G/4G/LTE wireless base station platforms, with 6-10 1.5GHz MIPS cores. The CN66XX provides a 2MB L2 cache, DDR3 (double data-rate) memory controller, hardware acceleration for the SNOW 3G and KASUMI security algorithms, TCP/IP (Transmission Control Protocol/Internet Protocol) packet processing acceleration, and QoS (quality of service). Cavium's PowerOptimizer™ technology limits maximum power to between 9 and 20 watts, from low-end to high-end configurations.

The CN66XX SoC also includes several SERDES (serializer/deserializer) I/O’s for PCI-e (Peripheral Component Interconnect express) Gen2, XAUI (10 Gigabit Attachment Unit Interface), and SRIO (Serial RapidIO).  Users can take advantage of the integrated HFA (Hyper Finite Automata) DPI engine to perform deep packet inspection. Cavium also integrates accelerators for data compression, and encryption/decryption. The "Acclr" (application accelerator) manager engine is a hardware load balancer that distributes packet and control processing to the embedded cores, in a similar fashion to the ARM-based dispatcher in the Mindspeed Transcede base station SoC.

Authentik™ is Cavium's anti-counterfeiting technology.  With Authentik™, OEMs (original equipment manufacturers) can lock the multi-core processing chip, so that third parties can assemble their systems while reducing the risk that counterfeit copies of the system can be created. The CN66XX is software compatible with the OCTEON 63/62XX, so that designers can scale the architecture from macro to pico cells or micro cells.

Unlike competitors' "base station on a chip" SoCs, the Cavium design requires you to add an external FPGA or DSP ASIC to handle Layer-1 PHY (physical layer) functions. Cavium demonstrated such a solution at the recent Femtocell World Summit in London, by combining Picochip's PC960x LTE radio and PHY with a Cavium OCTEON processor.

For EPC (evolved packet core) applications, Cavium scales their SoC architecture up to a 32-core configuration in the CN68XX, which the company positions as being capable of replacing multiple embedded processors, DSPs and NPUs (network processor units). Cavium claims greater than 40 Gbps LTE processing speed for EPC applications, equivalent to a CPU performance of greater than 60 GHz, while consuming less than 70W in the CN68XX.


Friday, June 24, 2011

Mindspeed: 5G networks will be enabled by software-defined cognitive radios

Earlier this week, Texas Instruments announced two new SoCs (System-on-Chips) for the small-cell base-station market, adding an ARM A8 core while scaling down the architecture of the TCI6618, which they had announced for the high-end base-station market at MWC (Mobile World Congress).

Mindspeed had also announced a new heterogeneous multicore base-station SoC for picocells at MWC, the Transcede 4000, which has two embedded ARM Cortex A9s - one dual and one quad core. Jim Johnston, LTE expert and Mindspeed's CTO, reviewed the hardware and software architectures of the Transcede design at the Linley Tech Carrier Conference earlier this month. Johnston began his presentation by describing how network evolution, to 4G all-IP (internet protocol) architectures, has driven a move towards heterogeneous networks with a mix of macrocells, microcells, picocells and femtocells. This, in turn, has driven the need for new SoC hardware and software architectures.

Cognitive radios will enable spectrum re-use
in both the frequency and time domains. (source - Mindspeed)


While 4G networks are still just emerging, Johnston went on to boldly describe the attributes of future 5G networks - self-organizing architectures enabled by software-defined cognitive radios. Service providers don't like the multiple frequency bands that make up today's networks, he said, because there are too many frequencies dedicated to too many different things. As he described it,  5G will be based on spectrum sharing, a change from separate spectrum assignments with a variety of fixed radios, to software-defined selectable radios with selectable spectrum avoidance.


Software-defined cognitive radios will enable dynamic spectrum sharing,
including the use of "white spaces" (source Mindspeed)


Touching on the topic of "white spaces", Johnston said that the next step will involve moving to dynamic intelligent spectral avoidance, what he called "The Holy Grail", with the ability to re-use spectrum across both frequency and time domains, and to dynamically avoid interference.

Mindspeed's Transcede 4000 contains 10 MAP cores, 10 CEVA x1641 DSP cores, and 6 ARM A9 cores, in a 40nm 800M transistor SoC (source Mindspeed)

Moving to the topic of silicon evolution, Johnston said that to realize a reconfigurable radio, chip architects need to take a deeper look at what needs to be done in the protocol stack, and build more highly optimized SoCs. For Mindspeed, this has meant evolving data path processing from scalar to vector processing, and now to 1024b SIMD (single-instruction, multiple-data) matrix processing.

At the same time, Mindspeed's control plane processing is evolving from ARM11 single-issue instruction-level parallelism, to ARM Cortex-A9 dual-issue quad-core SMP (symmetrical multi-processing), to ARM Cortex-A15 3-issue quad core. SoC-level parallelism has evolved from multicore, to clusters of multicores, to networked clusters, all on a single 800M transistor 40nm SoC that integrates a total of 26 cores.

The Transcede 4000 contains 10 MAP (Mindspeed application processors) cores, 10 CEVA x1641 DSP cores, and the 6 ARM A9 cores - in dual and quad configurations.  Designers can use the Transcede on-chip network to scale up to networks of multiple SoCs,  in order to construct larger base-stations. How far apart you can place the SoCs depends on what type of I/O (input-output) transceivers you use. With optical fiber transceivers, the multicore processors can be kilometers apart (see Will 4G wireless networks move basestations to the cloud? ) to share resources for optimization across the network. The dual core ARM-A9 processor in the Transcede 4000 has an embedded real time dispatcher that assigns tasks to the chip’s 10 SPUs (signal processing units), which consist of the combination of a CEVA X1641 DSP and MAP core.  To build a base-station with multiple Transcedes, designers can assign one device’s dual core as the master dispatcher to manage the other networked processors.
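The dispatcher role described above can be illustrated with a small sketch. The article does not say what scheduling policy Mindspeed's real-time dispatcher uses, so the least-loaded policy, the function name assign_tasks, and the numeric task costs below are all assumptions made purely for illustration.

```python
import heapq

NUM_SPUS = 10  # the article cites 10 SPUs per Transcede 4000


def assign_tasks(task_costs, num_spus=NUM_SPUS):
    """Assign each task to the currently least-loaded SPU.

    This policy is an assumption for illustration, not Mindspeed's
    documented algorithm. Returns a list of (task_index, spu_index) pairs.
    """
    # Heap of (accumulated_load, spu_index); ties go to the lower SPU index.
    heap = [(0, spu) for spu in range(num_spus)]
    heapq.heapify(heap)

    assignments = []
    for task_index, cost in enumerate(task_costs):
        load, spu = heapq.heappop(heap)
        assignments.append((task_index, spu))
        heapq.heappush(heap, (load + cost, spu))
    return assignments
```

In a multi-device base station, the same idea would extend to a master dispatcher whose pool of workers includes the SPUs on the other networked SoCs.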
 
The evolution of software complexity is also a challenge, with complexity increasing 200X from the less than 10,000 lines of code in the days of dial-up modems, to 20M lines of code to perform 4G LTE baseband functions. Software engineers must support multiple legacy 2G and 3G standards in 4G eNodeB base-stations, in order to enable migration and multi-mode hardware re-use. Since the C programming language does not directly support parallelism, Mindspeed takes the C threads and decomposes them to fit within the multicore architecture, says Johnston.


Thursday, June 23, 2011

Touchstone Semiconductor develops enhanced 2nd source for Maxim comparators

Touchstone Semiconductor is a new entrant in the analog IC (integrated circuit) market, founded in 2010 by former executives of Maxim Integrated Products, including CEO Brett Fox and VP of Engineering Dr. Jeroen Fonderie. The company introduced their first product in March, the TS1001 - a single-supply, rail-to-rail operational amplifier that you can operate on a 0.8V supply while consuming just 0.6µA of supply current.

Touchstone has now added to their offerings with the TS9001-1 and TS9001-2, single-supply, low power analog comparator ICs with built-in voltage references. The new devices are described by the company as electrical specification improvements over the Maxim MAX9117 and the MAX9118, respectively.  Touchstone has designed both TS9001s with integrated +1.252V references that the company specifies for an initial accuracy of 1%. 

The TS9001-1 provides a push-pull output stage, while the TS9001-2 offers an open-drain output stage that you can use in mixed-voltage systems design. Touchstone specifies both TS9001s for operation with voltage supplies from +1.6V to 5.5V, with consumption of less than 0.65μA supply current.

Touchstone specifies the TS9001-1 and the TS9001-2 for use over the -40°C to +85°C industrial temperature range, and packages the ICs in 5-pin SC70 packages. Prices start at $0.95 each for the TS9001 in 1000-piece quantities. Products are available from Future Electronics, www.futureelectronics.com.

Wednesday, June 22, 2011

6Wind addresses software performance optimization for multicore communications processors


In their recent announcement of the next-generation QorIQ communications processor (see Freescale gives an advanced look at 28nm, 64-bit multicore QorIQ design), Freescale demonstrated the complexities of SoC hardware that are required in order to keep up with the worldwide explosion in IP (internet protocol) network data traffic. But what of the software required? How do you program such a complex multicore communication processor, with numerous specialized hardware accelerators, to efficiently and securely manage data traffic?

This was the issue that packet-processing software provider 6Wind addressed at the recent Linley Tech Carrier Conference in San Jose, in a presentation titled "Portable Networking Software Platforms for 4G Infrastructure". Eric Carmès, CEO of Paris-based 6Wind, said that in a typical networked application, 90% or more of the workload is consumed by sophisticated data-plane packet processing and forwarding, with the remainder allocated to control plane signaling. Freescale, a 6Wind customer, targets this problem in the QorIQ hardware by including specialized accelerators for packet processing functions such as pattern matching, security algorithms, and data path acceleration.


Carmès said that OS (operating system) overhead can limit the performance of data packet processing tasks, and engineers must also address the problem with a software architecture that is optimized to perform packet inspection, processing and forwarding, and that is transparent to control plane applications.

6Wind's "Fast Path" software is a replacement for standard operating system networking stacks
(source - Eric Carmes, Linley Tech Carrier Conference, June-8 2011)

The "Fast Path" architecture, which 6Wind developed in their 6WINDGate™ software, has been designed by the company to be a drop-in replacement for standard OS networking stacks.  With 6WindGate, you can run your standard embedded Linux on one or more cores with the control plane and networking stack, and dedicate the remaining cores to fast path in order to maximize performance. 

The company says that 6WindGate typically provides up to 10x the packet processing performance of a standard networking stack.  Fast path processes the majority of incoming packets outside the OS environment to avoid the overhead problem. Only a small  number of packets that require complex processing are forwarded to the OS networking stack, in order to perform the necessary management, signaling and control functions.
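The split between fast path and OS stack can be sketched in a few lines. This is only an illustration of the general idea, not 6Wind's implementation: the Packet fields, the forwarding table, the list of slow-path protocols, and the dispatch function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst_ip: str
    protocol: str  # e.g. "udp", "tcp", "ospf"
    payload: bytes = b""


# Hypothetical forwarding table: destination address -> output port.
FORWARDING_TABLE = {"10.0.0.1": 1, "10.0.0.2": 2}

# Hypothetical set of protocols that need management, signaling or control handling.
SLOW_PATH_PROTOCOLS = {"ospf", "bgp", "arp"}


def dispatch(packet, slow_path_queue):
    """Forward simple packets directly; queue everything else for the OS stack.

    Returns the output port for fast-path packets, or None when the packet
    was handed to the slow path.
    """
    if packet.protocol in SLOW_PATH_PROTOCOLS:
        slow_path_queue.append(packet)
        return None
    port = FORWARDING_TABLE.get(packet.dst_ip)
    if port is None:
        # No known route: let the OS networking stack decide.
        slow_path_queue.append(packet)
        return None
    return port
```

The performance argument rests on the first branch being rare: as long as most traffic resolves in the table lookup, the OS overhead is paid only for the small share of packets that land in the queue.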

6Wind says their fast path implementation is portable to a variety of multicore architectures (including devices from Cavium, Freescale, Intel, NetLogic, and Tilera), while also taking advantage of specific hardware accelerators. The company provides a synchronization module in 6WindGate, which is intended to make the fast path software transparent to a Linux networking stack and the control plane. The fast path data plane software modules consist of processor-independent source code, and cycle-level and pipeline-level optimizations.

The 6Wind fast path networking SDK (software developer kit) provides a set of processor-specific modules that 6Wind bases on the processor SDK,  providing a zero-overhead API (application programming interface) for fast path module implementation.

EDA Standards Groups Accellera and Open SystemC Initiative will merge

EDA (electronic design automation) standards bodies Accellera and the OSCI (Open SystemC Initiative) have announced their intent to merge into a single organization. The two groups have signed a memorandum of understanding outlining the merger plan, which they say will result in more comprehensive standards that will benefit the worldwide EDA community, and facilitate more efficient collaboration among its members.

In a joint press release, Accellera chair Shishpal Rawat said:
“Our new organization will leverage the excellent work of our technical committees to provide a bigger benefit to the electronics industry. By forming a combined organization, we will be able to accelerate development of system level standards that will move electronic design productivity to the next level.”
The groups cited the relationship between OSCI’s TLM-2.0 (Transaction Level Modeling) standard and Accellera’s UVM (Universal Verification Methodology) as an example of the potential synergy that exists between the two organizations.
OSCI chair Eric Lish said:
“We are taking this significant step to address the future needs of the system and semiconductor design communities. We are excited about the opportunity this presents to our members to improve their design productivity with industry standards that encompass system-level, RTL and gate-level design flows.”
The groups expect the new organization to be in place by the end of the year with a unified set of policies and procedures. Until then, ongoing standards development activities will continue in both organizations.

Tuesday, June 21, 2011

Freescale gives an advanced look at 28nm, 64-bit multicore QorIQ design

The new QorIQ AMP SoC is a 28nm, 64-bit multicore, multithreaded design. (source Freescale)

Freescale Semiconductor has unveiled details of the company's next-generation QorIQ multicore processor platform today, at the Freescale Technology Forum in San Antonio.  The AMP (Advanced Multiprocessing) series, which Freescale plans to manufacture in a 28-nm process, will incorporate a new, multithreaded 64-bit Power Architecture® core, with support for 2-24 virtual cores, combined with acceleration engines and sophisticated power management.

Freescale says that the AMP series will deliver up to 4x the performance of the previous generation's  eight-core QorIQ P4080 device. The company is planning to offer the AMP series in a broad array of configurations (T1 to T5), from ultra-low-power single-core products up to highly advanced SoCs targeting applications in networking, robotics, storage, medical, video systems and military/aerospace.

New multithreaded e6500 core with AltiVec technology
Freescale based the QorIQ AMP series on a new multithreaded, 64-bit Power Architecture e6500 core that operates at up to a 2.5GHz clock rate. The e6500 incorporates an enhanced version of the AltiVec vector processor, a 128b SIMD (single-instruction multiple data) unit that operates independently  of the scalar processor and FPU (floating point unit). Freescale uses the AltiVec technology to address high-bandwidth data processing and algorithmic-intensive computations. (See Freescale's video introduction to the e6500 here).


CoreNet interconnect fabric
The CoreNet coherency fabric provides designers with a high-bandwidth fabric that can scale with clusters of multicore processors.  CoreNet is designed to support data communications with emerging and future DDR (double data rate) memories that will exceed the current 1600 MHz standards.

Acceleration technologies
Freescale has added a variety of acceleration engines and co-processing technologies to complement the e6500 cores. Accelerators in the AMP SoCs include units for security, pattern matching, decompress/compress engines, and Freescale's DPAA (data path acceleration) technology. 

The DPAA eliminates bottlenecks in communications from the cores to I/O (input-output) and cores to network accelerators, which Freescale says is especially critical in 10Gigabit networking.

The SEC (security acceleration engine) is a 5th generation upgrade from the P4080, which designers can use to offload protocol processing. The SEC enables execution of a number of encryption algorithms, including the ZUC algorithm for 4G (fourth generation) LTE (long-term evolution), IPSec (Internet Protocol Security), and SSL (secure socket layer), at up to 40 Gbps. Freescale rates the SEC at up to 140 Gbps of crypto hardware acceleration for current and emerging wireless and wireline algorithms.

Engineers can use the QorIQ AMP pattern matching engine along with software to accelerate detection of virus signatures, or to perform network policy enforcement. You can store PCRE (Perl-compatible regular expressions) in the pattern matching engine's internal cache to define signatures in packets that you want to identify.
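As a software-only illustration of signature matching with regular expressions, the sketch below uses Python's built-in re module, whose syntax is similar to PCRE but not identical. The signature names, the patterns, and the match_signatures function are hypothetical examples and have nothing to do with Freescale's hardware interface.

```python
import re

# Hypothetical signatures; a real rule set would be far larger.
SIGNATURES = {
    "test-file-marker": re.compile(rb"EICAR-STANDARD-ANTIVIRUS-TEST-FILE"),
    "sql-injection": re.compile(rb"(?i)union\s+select"),
}


def match_signatures(payload):
    """Return the names of all signatures found in a packet payload (bytes)."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]
```

Compiling the patterns once and reusing them across packets loosely mirrors the idea of keeping the expressions resident in the engine's cache rather than reloading them per packet.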

DCE 1.0 is a new decompress/compress engine in the AMP, which provides for execution of lossless compression algorithms, including the raw DEFLATE algorithm (RFC1951), GZIP format (RFC1952) and ZLIB format (RFC1950), as well as Base64 encoding and decoding (RFC4648).
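The three compressed formats named above all wrap the same DEFLATE algorithm and differ only in their framing. The sketch below shows that relationship in software, using Python's standard zlib, gzip and base64 modules; the compress_all helper is a hypothetical name, and nothing here reflects how the DCE hardware is actually programmed.

```python
import base64
import gzip
import zlib


def compress_all(data):
    """Compress the same bytes into the three formats listed in the article."""
    # Raw DEFLATE (RFC 1951): a negative wbits value selects a headerless stream.
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    raw_deflate = compressor.compress(data) + compressor.flush()

    # ZLIB format (RFC 1950): DEFLATE with a 2-byte header and Adler-32 checksum.
    zlib_stream = zlib.compress(data)

    # GZIP format (RFC 1952): DEFLATE with a gzip header and CRC-32 trailer.
    gzip_stream = gzip.compress(data)

    return {"deflate": raw_deflate, "zlib": zlib_stream, "gzip": gzip_stream}


def to_base64(blob):
    """Base64-encode binary data (RFC 4648) so it can travel in text protocols."""
    return base64.b64encode(blob)
```

Because the framing differs, a receiver has to know which of the three formats it is being handed: passing a raw DEFLATE stream to a ZLIB decoder fails on the missing header.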

The AMP design also incorporates acceleration/offload technologies for 128-bit SIMD data prefetching, in-line parsing and classification, and quality of service functions.

Advanced power management system
The AMP series products will utilize a variable-mode power switch that will allow customers to precisely modulate the power of the cores and other processing units independently. Freescale says that the new power management scheme, together with the move to a 28 nm process technology, reduces power consumption by up to 50 percent. Developers can control individual core frequencies and use six core power management states.

Monday, June 20, 2011

Texas Instruments adds basestation SoCs for small cells


Texas Instruments has announced the TMS320TCI6612 and TMS320TCI6614 SoCs (systems-on-chip), at the Femtocell World Summit in London. The devices represent extensions of the company's KeyStone multicore architecture for developers of metro, pico and enterprise small cell base stations. TI previously announced the TCI6618, for the high-end base station market in macrocell and compact-macrocell applications, at the 2011 MWC (Mobile World Congress) in February.


TI has integrated a mix of processing elements into the new devices, including radio accelerators, network and security coprocessors, combined fixed- and floating-point DSPs (digital signal processors) and an ARM® RISC processor.

TI's small-cell SoCs integrate an ARM Cortex-A8 processor with dual or quad DSP cores.

The integration of an ARM Cortex-A8 core is new with the TMS320TCI6612 (dual-core DSP) and TMS320TCI6614 (quad-core DSP). The TCI6618 also integrates a network coprocessor with DSP cores and other acceleration for layers 1 and 2 processing, which TI had previously said eliminates the need for a RISC processor. TI provided further information to the EE Daily News with today's announcement, to clarify the differences between the small-cell and large-cell SoC architectures:
"For small cells, typically one sector solutions, it is the goal to have all the digital processing on a single device, hence this small SoC which includes the ARM for layer 3 processing. In macro solutions (multiple sectors) several devices are often used together (for example 6616 or 18) for the layer 1 and 2 processing, while separate RISC based processor (for example an external ARM) interfaces to them for layer 3."
TI says that the integration of the ARM RISC core, with packet and security processors, greatly reduces small-cell base station system cost by providing a complete solution for layers 1, 2 and 3 and transport processing for high performance small cell base stations.

Embedded Layer-1 coprocessors/accelerators in the TMS320TCI6612 and TMS320TCI6614 include:
  • 4 enhanced Viterbi decoders.
  • 3 third-generation turbo decoder coprocessors.
  • A turbo encoder coprocessor.
  • 3 FFT (fast Fourier transform) coprocessors.
  • TFCI (transport format combination indicator), CQI (channel quality indicator).
  • 4  RSAs (rake/search accelerators) for CDMA (code division multiple access) assistance with chip-rate processing.
  • The BCP (bit-rate coprocessor) contains the modulator, demodulator, interleaver/de-interleaver, turbo and convolutional encoding, rate matcher/rate de-matcher, correlator for block code decoding, and CRC (cyclic redundancy check) engine. The BCP enables turbo interference cancellation for MIMO (multiple-input, multiple-output) equalization and enables high-performance PUCCH (Physical Uplink Control Channel) format 2 decoding. According to TI, the BCP offloads the equivalent of approximately 15 GHz of CPU processing.
Layer-2 accelerators include:
  • ROHC (Robust Header Compression).
  • QoS (Quality of Service).
  • RLC/MAC (Radio Link Control and Medium Access Control).
The TCI6612 and TCI6614 I/O (input/output) support includes:
  • I2C (inter-IC), SPI (serial peripheral interface), and UART (universal asynchronous receiver/transmitter).
  • PCI Express port with two lanes supporting GEN1 and GEN2.
  • Twelve 64-bit general-purpose timers (also configurable as sixteen 32-bit timers).
  • 32-pin GPIO (general-purpose input/output) port with programmable interrupt/event generation mode.
  • Four lanes of SRIO (serial RapidIO), compliant with RapidIO 2.1 for up to 5-Gbps operation per lane.
  • 1.6 GHz, 64-bit DDR3 SDRAM (double data rate, synchronous dynamic random access memory) interface, supports up to 8GB of addressable memory space.
  • 16-bit EMIF (external memory interface) for connecting to flash memory (NAND and NOR) and asynchronous SRAM (static RAM).
  • Second-generation SERDES-based AIF2 (antenna interface) capable of up to 6.144 Gbps operation per link with six high-speed serial links, compliant with the OBSAI (Open Base Station Architecture Initiative) RP3 and CPRI (Common Public Radio Interface) standards.
  • 4 lanes of HyperLink at up to 12.5 Gbaud/lane. HyperLink is a proprietary high-speed interconnect that designers can use to connect to other KeyStone devices. The HyperLink on the TCI6612 and TCI6614 works in conjunction with the Multicore Navigator to dispatch tasks to multiple devices transparently, so they execute as if they are running on local resources.
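
For a sense of scale, the per-lane figures in the I/O list can be multiplied out into aggregate raw rates. The sketch below is simple arithmetic on the numbers quoted above; it assumes that "1.6 GHz" DDR3 means 1600 million transfers per second, and it reports raw line rates only (usable payload is lower once line coding and protocol overhead are subtracted). The function names are made up for this example.

```python
def serial_aggregate_gbps(lanes: int, rate_per_lane_gbps: float) -> float:
    """Raw aggregate line rate of a multi-lane serial interface, in Gbps."""
    return lanes * rate_per_lane_gbps


def ddr_peak_gbytes_per_s(transfers_per_s: float, bus_width_bits: int) -> float:
    """Theoretical peak DDR bandwidth in GB/s (one bus-width word per transfer)."""
    return transfers_per_s * bus_width_bits / 8 / 1e9


raw_rates_gbps = {
    "SRIO, 4 lanes x 5 Gbps": serial_aggregate_gbps(4, 5.0),
    "AIF2, 6 links x 6.144 Gbps": serial_aggregate_gbps(6, 6.144),
    "HyperLink, 4 lanes x 12.5 Gbaud": serial_aggregate_gbps(4, 12.5),
}

for name, rate in raw_rates_gbps.items():
    print(f"{name}: {rate:.3f} Gbps raw")

# Assumption: 1600 MT/s on a 64-bit bus.
print(f"DDR3 peak: {ddr_peak_gbytes_per_s(1600e6, 64):.1f} GB/s")
```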
TI's KeyStone SoCs include the TeraNet hierarchical switch fabric, which the company specifies for more than two terabits per second of bandwidth for data transfer within the SoC. The MSMC in the TCI6612 and TCI6614 is TI's Multicore Shared Memory Controller, which allows the cores to directly access shared memory without having to use any TeraNet bandwidth. The MSMC arbitrates access to shared memory between the cores and other IP blocks, eliminating memory contention.


Availability

TI is targeting the 3rd quarter of 2011 to begin sampling the TCI6612 and TCI6614 SoCs. Solutions incorporating a digital radio front end will follow.

Huawei Unveils 7-inch MediaPad running Android 3.2 Honeycomb


Huawei has announced a new 7" Android tablet, the Huawei MediaPad, which they claim to be the first 7" tablet to run the Android 3.2 Honeycomb operating system. The 10.1" Samsung Galaxy tablet and 10.1" Motorola Xoom currently run Honeycomb version 3.1.

The specifications that Huawei Device released are minimal: 
  • Thickness: 10.5mm (0.4 inches)
  • Weight: 390g (0.86 pounds)
  • Processor: Qualcomm dual-core 1.2GHz Snapdragon
  • Video: 1080p full HD video playback
  • Front Camera: 1.3 megapixel
  • Rear-facing camera: 5 megapixel auto-focus HD camera, with HD video recording capabilities.
  • Connectivity: HSPA+ 14.4Mbps, 802.11n WiFi
  • Media player: Flash 10.3 compatible
  • Battery life: specified at 6 hours.
Huawei stated that availability is planned for "selected markets" starting in Q3 2011, and they did not disclose the price. 

At 7", the Huawei MediaPad competes with the original Samsung Galaxy Tab. The Galaxy Tab is actually 10g lighter, though it is slightly thicker at 11.98mm. Samsung is still running the Android 2.2 Froyo operating system on the Galaxy Tab, on a Cortex A8 1.0GHz application processor with PowerVR SGX540. Samsung specifies a 7-hour battery life.

The Galaxy Tab also has two cameras. The 1.3 megapixel front camera matches the Huawei MediaPad, while the Tab has a lower resolution (3 megapixel) rear-facing camera. The Tab supports an older revision, Flash 10.1, of Adobe's media player.



Will 4G wireless networks move basestations to the cloud?

Perhaps not THE cloud exactly (already an overused term for any remote computing), but some wireless infrastructure companies are proposing a new architecture for RANs (radio access networks) that would replace today's terrestrial cell site base stations with remote clusters of centralized virtual base stations. The C-RAN (cloud RAN) concept has received increased attention since the 2011 MWC (Mobile World Congress), where Alcatel-Lucent (along with Freescale and HP) introduced their novel lightRadio™ cube.

lightRadio represents a new architecture where the base station, typically located at the base of each cell site tower, is broken into its component elements and then distributed into both the antenna and throughout a cloud-like network.

Alcatel-Lucent's lightRadio cube (source: Alcatel-Lucent)
Shortly after MWC, Alcatel-Lucent announced that they would collaborate with China Mobile to develop "C-RAN" (cloud RAN) technology, citing benefits of C-RAN as a "green" technology that would also "improve network quality and coverage, reduce transmission resource consumption and lower OPEX by up to 50% and CAPEX by 15%". The China Mobile Research Institute had presented the C-RAN concept earlier, at a workshop in Beijing which they hosted in April 2010.

China Mobile Research Institute's C-RAN concept (source: CMRI C-RAN Workshop)


Kent Fisher, solution architect in the Networking Components Division at LSI, reviewed the C-RAN concept in his presentation on the "Evolution of Wireless Infrastructure" at the Linley Tech Carrier Conference in San Jose on June 8. The EE Daily News spoke with Mr. Fisher to discuss his presentation.

According to Fisher, one of the challenges in RAN design is dealing with RF (radio frequency) interference, which can limit network capacity. Operators also find it difficult to find the space to install base station enclosures in dense urban areas, and they incur high OPEX (operating expenses) from site lease fees and high energy consumption - which is also detrimental to the environment. A typical cell site may also have low average utilization, while base station designers must meet peak network load requirements and support subscriber mobility.

In the C-RAN concept, base station enclosures at cellular towers are eliminated by distributing only RRH (Remote Radio Head) units, such as the Alcatel-Lucent lightRadio cube, which reduce power consumption and occupy only a small amount of space at the cell site. The RRHs can be mounted directly on poles. Designers can also employ wideband radio technology in the RRH to support multiple frequency bands, such as in multi-mode 3G/4G installations.

Because of the high data rates in 4G networks, network engineers must connect the RRH to a centralized processing pool of virtual base stations through an optical fiber transport network, using the CPRI (Common Public Radio Interface) or OBSAI (Open Base Station Architecture Initiative) interface. Data compression will be required to reduce bandwidth and transport costs. This requirement for a fiber-connected topology is likely to limit the use of C-RAN to dense urban centers and to 'greenfield' installations that lack any pre-existing network infrastructure.

However, engineers deploying C-RANs can take advantage of two Bell Labs (now Alcatel-Lucent) technologies to increase capacity and achieve higher spectral efficiency, through the use of CoMP (coordinated multi-point) transmission, and ICIC (inter-cell interference coordination) in the virtual base-station pool. According to an Alcatel-Lucent whitepaper, in dense urban areas with a large number of RRHs deployed, network engineers can use the pooling capability of a centralized processing center to dynamically manage load variations across a number of baseband processing elements. This can result in resource savings of typically 10 to 20 percent, according to Alcatel-Lucent.
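
The pooling benefit comes from statistical multiplexing: individual cells rarely peak at the same time, so a shared pool can be sized for the peak of the combined load rather than the sum of every cell's own peak. The toy simulation below illustrates the idea. The uniform random load model, the cell count, and the function name are assumptions made for this sketch, and its output is not meant to reproduce Alcatel-Lucent's 10 to 20 percent figure.

```python
import random


def pooling_saving(num_cells: int = 20, num_samples: int = 1000,
                   seed: int = 1) -> float:
    """Fraction of baseband capacity saved by pooling, under a toy load model."""
    rng = random.Random(seed)
    # loads[t][c] = load of cell c at time sample t, as a fraction of its maximum.
    loads = [[rng.random() for _ in range(num_cells)]
             for _ in range(num_samples)]

    # Dedicated base stations: each site is provisioned for its own peak.
    per_site_capacity = sum(max(sample[c] for sample in loads)
                            for c in range(num_cells))

    # Centralized pool: provisioned for the peak of the combined load.
    pooled_capacity = max(sum(sample) for sample in loads)

    return 1.0 - pooled_capacity / per_site_capacity


print(f"Capacity saved by pooling: {pooling_saving():.0%}")
```

Real traffic is correlated across neighboring cells (busy hours, events), so actual savings depend heavily on how loads move together.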

Of course, designers will need more powerful communications processors to handle the more sophisticated processing functions of a C-RAN. LSI's Fisher proposed the company's Axxia device as well-suited for such applications. LSI integrates IBM's PowerPC® 476 cores in the Axxia SoC, in up to a quad configuration, with up to a 1.8GHz operating frequency. To handle network layer-2 and layer-3 processing, LSI adds acceleration engines for functions such as packet processing (20 Gb/s), a security engine (10 Gb/s), and a traffic manager/scheduler with up to 6 levels of hierarchy.


Saturday, June 18, 2011

A Father's Day reprise - how my Dad got me started on an engineering career.

I originally published this article on July 10, 2010, under the title "Antennas at War". At the time, that was a play on words to poke a little fun at Apple's iPhone 4 antenna controversy. But this article was much more about my very first electrical engineering project, which began when my dad helped me build a World War II "Foxhole Radio".

The folks over at EE Times got me thinking about that article when they tweeted a request for stories about dads who influenced their offspring to go into engineering. With Father's Day coming up tomorrow, this is my story...
                                                                                                                                                    

The first electronics project I ever worked on was a radio that my father helped me build when I was about 8 years old. As I recall it, there was a book on electronics for boys (ok - they were sexist back then) that had the project in it. The radio was based on building an antenna from a coil of wire wound around a cardboard toilet paper tube.

Now, before you assume that I am making some sarcastic allusions to another antenna story that's been all over the news recently... I assure you that is not why I write this today.  (Although I have to admit that more iPhone-4 jokes do come to mind).
I was looking for some wire to use in order to hang yellow jacket traps today, when I recalled that I had a spool that's been in my toolbox as long as I can remember. I've used this wire for similar purposes many times in the past, although never (that I can remember) for its intended electrical application.

Except for one time.


Suddenly it struck me. This was the very same wire that my Dad bought for the antenna coil in that radio project so many years ago.  I don't know why it never really struck me so vividly before today (faulty memory). My wife likes to tease me about holding on to things, but this has to be some sort of record for me. With my memory jogged I can even see my Dad taking me to the electronics parts store on Vulcan Street in Buffalo. I looked it up, and I think it's still there... Radio Equipment Corp.!  How cool is that?!


So, I had to do a web search to see if I could find that book. I physically looked for the book a few years ago shortly after my Dad died. I had the somber task of closing up my Dad's house when it was severely damaged after a burst water pipe went undetected in the middle of winter. Regrettably, the book was not to be found.

However, Google did find something that looked just like the image in my now rejuvenated memory, the "foxhole radio". According to my internet search, the origin of my first electronics project (and my first experience of the idea that the world is analog) was "How to Build a 'Foxhole Radio'", from All About Radio and Television by Jack Gould, Random House, 1958.

That's a little bit different than my memory of a book on electronics for boys, but it could make sense because my Dad did some part time TV repair when I was growing up. Google Books also came up with a reference to a guide to books for school libraries, so maybe my memory is correct as well. Perhaps the article was copied in other forms.

As the story goes according to my internet search, soldiers in WWII were not allowed to have radios, for fear of detection by the enemy. So they improvised with this simple - unpowered - design. I may just have to go see if I can re-create one now.

In any event, I thought this was a great way to cap off the week that brought us "Antennagate". It also serves as a great reminder. Yes, The World is Analog!

p.s. if you happen to come across an original edition of All About Radio and Television, you can imagine how much I'd love to put my hands on one.

p.p.s. My lovely wife did exactly that, which you can read about in "A gift then, and a gift now. My introduction to electronics"

Friday, June 17, 2011

Imec finds that FinFETs outperform planar CMOS for SRAM yield

The Imec research institute, headquartered in Leuven, Belgium, has announced results of a comparison of SRAM product yield that found FinFET structures outperform conventional planar CMOS.

In order to assess the impact of process variability, Imec says that they compared one planar and two FinFET technologies on key figures of merit, testing both single SRAM cells and full SRAM arrays. Both FinFET technologies that were tested were superior to planar technology for medium- to large-sized (>128-512Kbytes) SRAM arrays. The FinFET SRAMs were also found to be far less sensitive to device mismatches, suggesting that designers may be able to use more aggressive scaling of the power supply with FinFETs. For undoped SOI FinFETs (silicon-on-insulator FinFETs), Imec's results showed that the power supply can be lowered by an additional 200mV compared to planar devices.
In their report, Imec noted that SRAMs, which are constructed from a large number of small, highly variable devices, are especially sensitive to device mismatch that can result in poor yield. Imec found that FinFET devices exhibited lower leakage as well as lower variability than planar transistors. Intel recently announced the first mass production of FinFET devices in their 22nm 3D Tri-Gate process.
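
To see why large arrays are so sensitive to per-cell variability, consider a simplified yield model in which every bit cell fails independently with the same probability and the array has no redundancy or error correction. Under those assumptions, array yield is (1 - p)^N for N cells. The failure probability below is a hypothetical value chosen for illustration and is not taken from Imec's data.

```python
import math


def array_yield(cell_fail_prob: float, num_bits: int) -> float:
    """Probability that all cells work, assuming independent cell failures."""
    # log1p keeps precision when cell_fail_prob is very small.
    return math.exp(num_bits * math.log1p(-cell_fail_prob))


P_FAIL = 1e-7  # hypothetical per-cell failure probability

for kbytes in (128, 512):
    bits = kbytes * 1024 * 8
    print(f"{kbytes} Kbytes: yield = {array_yield(P_FAIL, bits):.1%}")
```

Because the exponent grows with array size, even a small reduction in per-cell failure probability from lower mismatch has a large effect on the yield of bigger arrays.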


Two product-related announcements from Atmel this week

The Samsung Galaxy Tab 10.1 incorporates the Atmel mXT1386 touchscreen controller
In the first announcement, Atmel said that Samsung is using the company's maXTouch™ mXT1386 touchscreen controller in the recently introduced Galaxy Tab 10.1. Many reviewers have described the Tab 10.1 as the best competition so far for the iPad. The Samsung Galaxy Tab 10.1 is running the Google Android Honeycomb 3.1 operating system on an NVIDIA Tegra 2 dual-core processor.

Atmel also announced that they have integrated their Sensors Xplained software drivers in their recently launched AVR Studio® 5 Integrated Development Environment (IDE). The company says that embedded system designers can use the new software drivers to accelerate application and device development for a variety of popular sensor types in consumer, industrial and medical applications. Atmel's partners for the AVR Studio 5 development include sensor manufacturers AKM, Bosch Sensortec, Honeywell, Invensense, Kionix, and Osram Opto Semiconductors.


The Sensors Xplained software drivers and expansion boards have been designed by Atmel to be plug-in compatible with all the Xplained series microcontroller boards for the Atmel AVR family of MCUs. The Atmel sensor solution provides designers with the AVR Xplained processor board and development system, a sensors board to add onto the Xplained processor board and the software drivers in the free Atmel AVR Studio 5.

Pricing and Availability
The Atmel Sensors Xplained drivers are available now by downloading AVR Studio 5 at http://www.atmel.com/microsite/avr_studio_5.

Atmel also has several Sensors Xplained expansion boards available now:
  • Inertial Sensor Board One: Provides 9 degrees of freedom for inertial sensing. Includes an Invensense ITG-3200 3-axis gyro, Bosch Sensortec BMA150 3-axis accelerometer and AKM AK8975 3-axis magnetometer for USD $54.
  • Inertial Sensor Board Two: Provides 9 degrees of freedom for inertial sensing. Includes an Invensense IMU-3000 3-axis gyro, Kionix KXTF9 3-axis accelerometer and a Honeywell HMC5883L 3-axis magnetometer for USD $54.
  • Pressure Sensor Board One: Includes a Bosch Sensortec BMP085 barometric pressure sensor for atmospheric and altitude sensing for USD $24.
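
As an illustration of how a barometric pressure reading translates into altitude, the sketch below applies the widely used international barometric formula. It is plain arithmetic on a pressure value in pascals and does not use or represent the Sensors Xplained driver API; the function name and the sample reading are made up for this example.

```python
def altitude_m(pressure_pa: float, sea_level_pa: float = 101325.0) -> float:
    """Approximate altitude in meters from absolute pressure in pascals."""
    # International barometric formula, standard-atmosphere constants.
    return 44330.0 * (1.0 - (pressure_pa / sea_level_pa) ** (1.0 / 5.255))


reading_pa = 95000.0  # hypothetical sensor reading
print(f"{reading_pa:.0f} Pa is roughly {altitude_m(reading_pa):.0f} m above sea level")
```

Accuracy depends on the sea-level reference pressure, which varies with the weather, so applications typically calibrate it against a known altitude or a local weather report.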

Has Altair taken over the lead from Sequans to supply TD-LTE chips for China Mobile?

Earlier this week, Altair Semiconductor announced that they had completed a demonstration of the company's TD-LTE chipset for executives of China Mobile. AsiaTelco Technologies integrated the chipset into a USB dongle for Altair. Although conditions of the setup were not described, Altair claimed download speeds of more than 50Mbps and upload speeds of more than 18Mbps during the demo, which was held during a board meeting of the Next Generation Mobile Network Alliance. Altair said that they had previously demonstrated their TD-LTE solution in India and Japan, and had completed several months of testing with China Mobile prior to last week's demo.

In the Altair press release, Jason Ding, CEO of AsiaTelco Technologies Co., said
"Our partnership with Altair has provided us with the ability to offer LTE products to the market well ahead of our competition.."

The Altair announcement raises the question of whether they have taken over the lead from Sequans for China Mobile's TD-LTE business. One year ago, Sequans announced their own TD-LTE demonstration of USB dongles for China Mobile, at the World Expo 2010 in Shanghai.

In their May 2010 announcement, Sequans said:
"The USB dongles used for the demo are powered by Sequans’ SQN3010 baseband SOC. The chip is designed to comply with the 3GPP R8 standard, supporting UE category 3 throughput of 100 Mbps in a 20 MHz channel, and LTE band classes 38 and 40."
Alcatel-Lucent and Motorola also collaborated with Sequans for the World Expo demonstration network. Sequans has also announced collaborations with other leading LTE infrastructure equipment suppliers, including Ericsson, Huawei, and Nokia Siemens Networks.

Related article: Altair Semiconductor - Last (base)Band Standing?