## DESIGN OF CURRENT-MODE CMOS MULTIPLE-VALUED LATCH

#### M. S. Bhat<sup>\*</sup>, Rekha S\* and H. S. Jamadagni \*

#### Abstract

In comparison to binary logic, CMOS Multiple-Valued Logic (MVL) circuits offer smaller chip size, higher speed, lower power consumption, fewer interconnects and fewer I/O pins. However, the drawbacks of MVL circuits include the lack of self-restoration of logic levels, lower noise margins and static power consumption. The use of multi-valued latches and flip-flops would considerably reduce the size of the sequential part of the logic circuit. This paper discusses the design and implementation of a novel current-mode MVL latch using 0.7-µm MIETEC CMOS technology and its application in sequential circuits.

#### Keywords: Multiple-valued logic, current comparison, self-restoring logic

#### 1. Introduction

Multiple-Valued Logic (MVL) designs have been receiving considerable attention over a couple of decades [1]-[3]. The signal processing on the basis of the multiple-valued logic is carried out using multiples of logic levels and thresholds, in contrast to binary logic with its two states. Consequently, multiple-valued logic circuits are a promising approach to reduce signal lines and the number of active devices on the chip, effectively due to the increase of information per line. Similarly, by extending MVL across the chips, it is possible to reduce chip pinout. All these reductions would improve performance in terms of area, speed and power (due to reduced amount of switched capacitance). These advantages have been confirmed for various high-speed compact arithmetic circuits [4]-[5]. However, these circuits suffer from noise and parameter variations.

Most of the designs are current-mode circuits because of their advantages over voltage-mode circuits. Implementing voltage-mode MVL requires partitioning the total voltage range (zero to Vdd) into a number of discrete levels as decided by the radix chosen. The choice of the radix, the dynamic range and hence the threshold levels depend on noise margins, which in turn depend on the relative magnitude of noise levels (e.g., ground bounce) with respect to the supply voltage. Also, at the process level, multi-level ion implantation is necessary to realize multiple threshold voltages, which in turn calls for additional fabrication steps/masks.
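The partitioning of the voltage range described above can be illustrated with a short sketch. It assumes a uniform partition of zero to Vdd with decision thresholds midway between adjacent levels; the function names (`mvl_levels`, `decode`, `noise_margin`) are hypothetical and are not taken from the paper.

```python
def mvl_levels(vdd, radix):
    """Return (levels, thresholds) for a uniformly partitioned range 0..vdd.

    Assumption: the logic levels are equally spaced and each decision
    threshold sits midway between two adjacent levels.
    """
    step = vdd / (radix - 1)
    levels = [k * step for k in range(radix)]
    thresholds = [(levels[k] + levels[k + 1]) / 2 for k in range(radix - 1)]
    return levels, thresholds


def decode(voltage, thresholds):
    """Map a voltage to a logic value by counting the thresholds it exceeds."""
    return sum(voltage > t for t in thresholds)


def noise_margin(vdd, radix):
    """Ideal distance from a nominal level to the nearest threshold."""
    return vdd / (2 * (radix - 1))


if __name__ == "__main__":
    levels, thresholds = mvl_levels(3.0, 4)
    print(levels, thresholds)
    print(decode(1.7, thresholds))
    print(noise_margin(3.0, 2), noise_margin(3.0, 4))
```

Under these assumptions the ideal margin shrinks as 1/(radix - 1), which is one way to see why noise such as ground bounce limits the usable radix in voltage mode.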


<sup>\*</sup> CEDT, Indian Institute of Science, Bangalore. E-mail: msbhat@cedt.iisc.ernet.in

## CONFIGURABLE I/Q MODULATOR USING CORDIC BASED DDS ARCHITECTURE

#### G.Suresh, G.L.Biswas, K.D.N.V.S.Prasad, Dr.A.T.Kalghatgi

Central Research Laboratory, Bharat Electronics Limited, Jalahalli Post, Bangalore, INDIA - 560 013.

#### 1. Abstract

In this paper, an implementation technique for a configurable modulator using a CORDIC based DDS architecture for software defined radio applications is presented. It can be operated as a BPSK modulator, a QPSK modulator or a stand-alone DDS for programmable data rates with a programmable center frequency. It can also be extended to M-ary QAM. A DDS with good spectral purity, with a phase noise of -100 dBc at 1 kHz offset and a spurious free dynamic range (SFDR) of up to 60 dBc, was achieved. BPSK and QPSK modulation techniques for different data rates (re-configurability) were implemented using the same algorithm (CORDIC) and hardware with satisfactory results.

#### 2. Introduction

Recent applications ranging from instrumentation to radars and modern communication systems including spread-spectrum and phase-shift keying modulation techniques require Direct Digital Synthesizer with high resolution, frequency agility and good spectral purity.

Conventional DDS implemented with high resolutions typically requires large memory for storage of Sine/Cosine functions. An approach to overcome this drawback is the method of iterative technique of computation for the corresponding Sine and Cosine functions by means of CORDIC algorithm with the main advantage of using only a small look-up table (-n.n bit). Quadrature modulations like BPSK and QPSK can be achieved using CORDIC based DDS architecture.
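The iterative computation of sine and cosine mentioned above can be sketched as follows. This is a minimal floating-point model of rotation-mode CORDIC, not the authors' hardware design; the function name `cordic_sin_cos` and the choice of 16 iterations are assumptions made for illustration. In hardware the multiplications by powers of two would be shifts and the arctangent values would come from the small look-up table.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC for |theta| <= pi/2.

    Returns (cos(theta), sin(theta)) approximately. The table of
    atan(2^-i) values plays the role of the small look-up table.
    """
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # pre-scaling the start vector compensates for the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0           # rotate toward residual angle 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y


if __name__ == "__main__":
    c, s = cordic_sin_cos(0.5)
    print(c, math.cos(0.5))
    print(s, math.sin(0.5))
```

A DDS built on this idea would feed the output of a phase accumulator into such a rotator instead of addressing a large sine/cosine memory.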

#### 3. CORDIC theory

The simple form of CORDIC is based on the observation that if a unit vector with end point at $(x,y) = (1,0)$ is rotated by an angle $\theta$, its new end point will be at $(x,y) = (\cos \theta, \sin \theta)$. This algorithm provides an iterative method of performing the above-mentioned vector rotations in pre-determined steps of angles using

## **DESIGN OF CURRENT-MODE FLASH ADC**

#### M. S. Bhat<sup>\*</sup> and H. S. Jamadagni \*

#### Abstract

The design of a high-speed current-mode CMOS flash analog-to-digital converter (ADC) is presented. For high-speed operation, a current mirroring technique with a current comparison architecture is used, and its advantages and limitations are explained. The optimization procedure is aimed at minimizing static power consumption, and its impact on circuit performance is investigated. A maximum sampling speed of 40 MS/s is achieved at 90 mW power consumption. The ADC is implemented in a 0.7-µm CMOS technology.

#### Keywords - Flash ADC, current comparison, low-power design

#### 1. Introduction

With the continued proliferation of mixed analog and digital VLSI systems, the need for small, low-power and high-speed analog-to-digital converters using conventional CMOS processes has increased in integrated circuit designs employed for communication and signal processing applications. With shrinking device sizes, the threshold voltage does not scale by the same degree as the supply voltage, and hence the full advantage of speed improvement is not available in voltage-mode circuits. Hence, alternate design techniques such as full current-mode techniques have become imperative in the design of high-speed circuits.

Current-mode circuit techniques, which process the active signals in the current domain, offer a number of advantages. Current signals show significant immunity to ground and power supply noise and to signal line impedance. Generally, current-mode circuits require neither amplifiers with high voltage gains, thereby reducing the need for high-performance amplifiers, nor high-precision resistors or capacitors. They also offer the advantage of inherently low voltage swings and can be easily realized in a standard digital process. Further, it is also well recognized that current-mode circuits are capable of high-speed operation [1]. However, current-mode circuits have some disadvantages too. These circuits have a fanout of one, and the current comparators have high output impedance, which introduces latency between the current-mode input and the voltage-mode output.

The advantages offered by current-mode techniques are exploited in various types of ADC circuits [2]-[12]. The works carried out in [7] and [8] employ current comparators with successive approximation techniques and hence they are inherently slow. In current-mode flash ADCs, a primary requirement is to

<sup>\*</sup> CEDT, Indian Institute of Science, Bangalore. E-mail: msbhat@cedt.iisc.ernet.in

## Design Methodology for sub-0.1µm Technologies

Ralf Pferdmenges Director Design Automation Design Flows Infineon Technologies AG

## *Emerging Non Volatile Memories: Technological Promise or Industrial Hoax*

Sreedhar Natarajan MoSys Incorporated, Ottawa, Canada Sn@ieee.org VLSI Design and Test 2004 Mysore, India August 27, 2004


# Designing an Embedded Processor : Specifications to Implementations

By

Atanendu Sekhar Mandal, Ravi Saini, Pramod Tanwar, Nitin Sharma, S.C. Bose, Raj Singh and Chandra Shekhar

Central Electronics Engineering Research Institute Pilani – 333031, Rajasthan. Email : atanu@ceeri.ernet.in

## PERFORMANCE OPTIMIZATION OF CMOS CIRCUITS USING RETIMING ALGORITHM WITH STEPWISE CHARGING

P Vijayakumar<sup>\*</sup>

K Gunavathi<sup>#</sup>

### ABSTRACT

In this paper, we propose an efficient algorithm to optimize the power-delay product in CMOS logic circuits. The proposed algorithm aims at reducing both power dissipation and delay, which is achieved by retiming and stepwise charging techniques. The retiming technique divides the circuit into stages with latches interposed between them, which decreases the delay. After retiming, the stepwise charging technique is employed, wherein the supply voltage is applied in a series of steps before reaching the maximum supply voltage, which reduces the power dissipation. The algorithm is tested on ISCAS benchmark circuits. Experimental results have shown a reduction of around 80% in power-delay product with small area overhead.

### 1. INTRODUCTION

In the modern world, portable systems are becoming increasingly popular, hence the need for devices that dissipate less power. Power dissipation in CMOS circuits can be classified into two types, namely static and dynamic power dissipation. Of these, dynamic power dissipation is the chief contributor [1,2]. Hence efforts at power optimization will be aimed at reducing dynamic power dissipation. However, power optimization quite often results in an unavoidable reduction in the speed of the devices. As a reduction in speed is not affordable, ways to optimize the speed of the devices will also have to be looked into. In short, speed-power optimization is the need of the hour, which is precisely the aim of this paper.

Power optimization techniques currently in use include voltage scaling, gate resizing and precomputation. All these techniques optimize power at the expense of the speed of the circuit [3,4,5]. Stepwise charging results in only a small decrease in the speed of the circuit, and that too can be compensated by pipelining the circuit [6,7,8].
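Why stepwise charging saves energy can be seen from a simple energy balance. The sketch below is an idealized model, not the authors' algorithm: it assumes a linear load capacitance, ideal intermediate supplies at equally spaced voltages, and full settling at each step. The function name `stepwise_charging_loss` is hypothetical.

```python
def stepwise_charging_loss(c, v, steps):
    """Energy dissipated while charging capacitance c from 0 to v in equal steps.

    At step k the supply at voltage k*v/steps delivers charge c*v/steps,
    so it provides energy c*(v/steps)*(k*v/steps). Whatever is supplied
    but not stored on the capacitor at the end is dissipated as heat.
    """
    dv = v / steps
    supplied = sum(c * dv * (k * dv) for k in range(1, steps + 1))
    stored = 0.5 * c * v * v
    return supplied - stored


if __name__ == "__main__":
    c, v = 1e-12, 5.0  # illustrative values: 1 pF load, 5 V supply
    for n in (1, 2, 4, 8):
        print(n, stepwise_charging_loss(c, v, n))
```

In this model the loss equals C·V²/(2N) for N steps, so conventional single-step charging (N = 1) dissipates C·V²/2 and each doubling of N halves the loss, at the cost of a longer charging time.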

Pipelining is a technique of decomposing a circuit into a number of segments, with each segment operating concurrently with all other segments thus increasing the speed of the circuit. Intermittent latches separate the segments

\* Lecturer, Department of EEE, PSG College of Technology # Asst. Prof, Department of ECE, PSG College of Technology

## VOLTAGE-SCALED REPEATERS FOR LOW-POWER LONG INTERCONNECTIONS IN VLSI CIRCUITS

Rajeevan Chandel<sup>1</sup>, S. Sarkar<sup>2</sup> and R.P. Agarwal<sup>2</sup>

#### Abstract

A SPICE simulation based analysis of the control of power dissipation in optimized repeater loaded long VLSI-interconnect is presented in this paper. The analysis shows that in general power dissipation decreases with voltage scaling. However, over some voltage ranges deviations from this trend may occur. Substantial voltage scaling reduces the optimum number of repeaters. These effects are technology independent.

#### 1. Introduction

CMOS technology has advanced and the minimum feature size has decreased over the years. As a result, both die size and device density of the VLSI circuits are on the increase. Consequently, CMOS capacitive loads are attaining appreciable values in VLSI circuits. These capacitive loads are due to three main components viz. (i) output capacitance of the logic gates, (ii) input capacitances of the next gates i.e. the fanouts and (iii) interconnections, in a VLSI circuit. The long interconnections in VLSI circuits lead to high propagation delays, as the parasitic resistance and capacitance of interconnects increase linearly with length, Kang and Leblebici (2003). To overcome this problem repeaters have been inserted to efficiently drive long interconnects in VLSI circuits [2-9].
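Because both resistance and capacitance grow linearly with length, the delay of an unrepeated line grows with the square of its length, and splitting the line with repeaters breaks that quadratic dependence. The sketch below illustrates this with the commonly quoted 0.38·R·C approximation for the 50% delay of a distributed RC line. It is a deliberately simplified model: repeater input capacitance and output resistance are ignored, and the function name, parameter names and numeric values are hypothetical.

```python
def line_delay(r_per_mm, c_per_mm, length_mm, n_repeaters=0, t_repeater=0.0):
    """Approximate delay of an RC line split into equal segments by repeaters.

    Assumptions: each segment behaves as a distributed RC line with a 50%
    delay of about 0.38 * R_seg * C_seg, and every repeater adds a fixed
    intrinsic delay t_repeater. Repeater loading is ignored.
    """
    segments = n_repeaters + 1
    seg_len = length_mm / segments
    seg_delay = 0.38 * (r_per_mm * seg_len) * (c_per_mm * seg_len)
    return segments * seg_delay + n_repeaters * t_repeater


if __name__ == "__main__":
    # Illustrative numbers only: 100 ohm/mm, 0.2 pF/mm, 10 mm line.
    for k in (0, 1, 3, 7):
        print(k, line_delay(100.0, 0.2e-12, 10.0, k, t_repeater=50e-12))
```

With a nonzero repeater delay the total first falls and then rises again as repeaters are added, which is why an optimum number of repeaters exists.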

Power dissipation is another important concern of VLSI design. Both large loads and long interconnects lead to power dissipation. The use of repeaters to drive long interconnects and reduce delay simply adds to the undesired increase in power dissipation in VLSI circuits. As projected by the International Technology Roadmap for Semiconductors (ITRS), the total chip power dissipation is increasing rapidly as the total chip capacitance, operating frequency and leakage current increase with scaling, ITRS (2000-2002). The optimally sized repeaters are quite large, Cong and He (1998) and Banerjee et al. (2001), and so dissipate a significant amount of power. It has been pointed out by Adler and Friedman (1998) and Banerjee and Mehrotra (2002) that large interconnect loads affect the performance of VLSI circuits due to excess power dissipation.

More recently, there has been an increasing prominence of portable battery operated low power systems. Therefore several methodologies have

<sup>1</sup> QIP Research Scholar in Department of Electronic and Computer Engineering, Indian Institute of Technology, Roorkee 247 667 India, from E&CED, NIT Hamirpur 177 005 HP India. rajeedee@iitr.ernet.in

<sup>2</sup> Department of Electronic and Computer Engineering, Indian Institute of Technology, Roorkee, Uttaranchal 247 667 India. fermifec@iitr.ernet.in, rajanfec@iitr.ernet.in

## A Compact Fast Parallel Multiplier Using Modified Equivalent Binary Conversion Algorithm

## Subhendu Kumar Sahoo<sup>1</sup> Chandra Shekhar<sup>2</sup> Anu Gupta<sup>1</sup>

#### Abstract:

A redundant-to-binary converter is proposed for use in the final addition stage of a parallel multiplier. Using this circuit in the final adder stage is 30% faster than a carry look-ahead implementation, and the transistor count of the final addition circuit also decreases by 30%. We use this algorithm in such a way that no redundant binary adder is required for compressing the partial product rows; only the natural 4:2 compressor circuits are used. We propose a new compact 4:2 compressor, which is the best performing among all of the 4:2 compressors.

#### I. Introduction

The increased level of integration brought about by modern VLSI technology has made it possible to integrate many complex components on a single chip. This has made systems faster and thus useful for real-time applications such as mobile digital signal processing, multimedia applications and scientific computation. In all these cases, the most recurring and time-consuming computation is multiplication. To increase the speed, parallel multipliers are used. Thus the speed, power and size of a parallel multiplier have always been critical issues and therefore the subject of many research projects and papers. The size of the circuit depends on the size of the transistors as well as the transistor density. Previous works [1]-[9] have tried to develop new algorithms and circuit techniques to reduce delay, power, transistor count etc. With the increasing need for portability, the present work looks into circuit design techniques for 4:2 compressors that lower the delay and decrease the transistor count. To make the final stage addition as fast as possible, a new circuit is designed using the modified equivalent binary conversion algorithm (MEBCA), which can replace the carry look-ahead (CLA) adder and outperforms it in terms of delay, power and transistor count.
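For readers unfamiliar with 4:2 compressors, the sketch below gives a behavioral model of the textbook structure built from two cascaded full adders. It is not the new compact compressor proposed in this paper, whose circuit is not given in this excerpt; the function names are hypothetical.

```python
from itertools import product


def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """Textbook 4:2 compressor made of two cascaded full adders.

    Returns (sum, carry, cout) such that
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    cout does not depend on cin, so carries do not ripple along a row.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


if __name__ == "__main__":
    # Exhaustively check the arithmetic identity over all 32 input patterns.
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        assert sum(bits) == s + 2 * (carry + cout)
    print("4:2 compressor identity holds for all inputs")
```

Each compressor turns four partial product bits of one column into a sum bit of the same weight and two bits of the next higher weight, which is how a row of them reduces four partial product rows to two.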

Multiplication is a three-step process: (I) generation of partial products, (II) summing up of all partial products until only two rows remain, and (III) adding the remaining two rows of partial products with a carry propagate adder. In the first step, two methods are commonly used to generate partial products. The first method generates partial products directly by using two-input AND gates. The second one uses sign select radix-4 modified Booth encoding (MBE) to generate partial products [1]. This is used in parallel multipliers to reduce the number of partial products by a factor of two. The partial product generation in the second method consists of a modified Booth encoder (MBE) [1] and a Booth selector (BS) [1]. After generating the partial products, in the second step a Wallace reduction tree [4] is used to sum up all the partial products efficiently. In the Wallace tree, 4:2 compressors are used to reduce the summing time. As these compressors are used repeatedly we have analyzed various 4:2 compressor circuits and proposed

<sup>1</sup>Electrical and Electronics Eng. Birla Institute of Technology And Science, Pilani <sup>2</sup>Central Electronics Research Institute, Pilani, Rajasthan

## IMPLEMENTATION OF ADVANCED ENCRYPTION STANDARD (AES) ALGORITHM IN A RESOURCE LIMITED FPGA

#### K. Sridhar, Assistant Professor, School of Computing, SASTRA; T. R. Sivaramakrishnan, Professor, School of EEE, SASTRA<sup>1</sup>

Abstract: This paper describes the implementation of the AES algorithm in a resource limited FPGA. It explains the use of an FPGA as a reconfigurable computing machine. It describes the proposed system with the FPGA as a coprocessor to the CPU of a 32 bit general purpose computing machine. The details of the implementation of the encryption and decryption modules of the AES algorithm in a resource limited FPGA (ALTERA EPF10K10ATC144 FPGA) are explained. The results of the implementation and the speed of execution of the algorithm are given. The possible configurations are described.

Key words: AES, Rijndael, FPGA, Reconfigurable Computing

#### 1. Introduction

In conventional computing, algorithms can be implemented in either hardware or software. Hardware implementations are very fast, consume less power and are at least 10 to 100 times as fast as an equivalent software implementation. A well-designed ASIC will efficiently solve the specific problem for which it is designed but cannot be used if even a slight change is made to the algorithm after the ASIC has been made. On the contrary, a software implementation is very flexible, but it is slower than the ASIC solution. The aim, therefore, is to design systems which are as fast as hardware and also offer the flexibility of software implementations. This is achieved through SRAM based FPGAs, which can be configured on the fly. A detailed survey of Reconfigurable Computing systems and relevant software is available in [1].

The organization of this paper is as follows. Section 2 describes the AES algorithm and analyses the different parts and their suitability for either software or FPGA implementation. Section 3 introduces the proposed reconfigurable system and the implementation details of the algorithm in the FPGA. Section 4 presents the results of the implementations. Section 5 concludes the discussion and presents the future developments to be completed.

<sup>1</sup> The authors are affiliated to Shanmugha Arts, Science, Technology and Research Academy (SASTRA), Deemed University, Thirumalaisamudram, Thanjavur, TamilNadu.

### Interface between Automotive Electronic Control Units and a Real-Time Simulator

Hande V, Uday Prabhu, Shardul Bapat Infosys Technologies Ltd., India HandeV@infosys.com, VlayH Prabhu@infosys.com, Shardul bapat@infosys.com

Keywords: ECU, RTS, SPI, CAN

#### Abstract:

Modern day automobiles can contain over 30 Electronic Control Units (ECU) for various subsystems such as Engine Management, Transmission, Traction Control, Braking Systems etc. These ECUs are inter-networked via the Controller Area Network (CAN) bus.

Real Time Simulators (RTS) are typically used to test the functionality and behaviour of automotive ECUs in a laboratory environment. Comprehensive testing is possible only if the RTS has knowledge of the real-time data exchange between the processor and various peripherals within the ECU. The internal ECU processor-peripheral interface typically uses the Serial Peripheral Interface (SPI) bus.

This paper presents an approach to tap the ECU SPI bus and feed the data to the RTS through a Versa Module Europa (VME) bus interface to facilitate a closed loop simulation environment. The approach was tested for an ECU used in the Braking System in an automobile. The various design considerations and challenges faced are outlined herein.

#### Introduction

Electronic Control Units (ECU) are part of several automotive systems such as Engine management, Braking systems etc.

The ECU receives inputs from various sensors, based on which it controls its outputs. For example, an ECU used in the Electro-hydraulic Braking System comprises a central processor (CPU) and various peripheral devices such as serial EPROMs, watchdog timers, pressure sensors and valve outputs.

The SPI bus is typically used within the ECU to interface the CPU with the peripheral devices (refer to Fig. 2).

The ECU software needs to be rigorously tested for all possible operating conditions in the laboratory before it is deployed in the vehicle. This requires comprehensive simulation of the various inputs under real life conditions. The Real Time Simulator (RTS) is used for this purpose.

The simulation capability of the RTS is enhanced if it has knowledge of the real time data exchange taking place within the ECU. Here, we present an approach to tap the SPI bus within the ECU and feed the data to the RTS to create a closed loop simulation environment.

#### **Technical specifications**

| Parameter                      | Value           |
|--------------------------------|-----------------|
| **Input ECU data**             |                 |
| ECU-RTS interface distance     | Up to 1.5 m     |
| SPI bus frequency              | 8 MHz (max.)    |
| SPI word length (configurable) | 8/10/12/16 bits |
| SPI critical time              | 100 ns (min.)   |
| Peripheral devices             | 14              |
| **Output RTS data**            |                 |
| VME bus interface              | M/MA-module     |
| **Debug**                      |                 |
| On-line debug mode             |                 |
| **Mechanical**                 |                 |
| Dimensions                     | M/MA-module     |

#### Design challenges

Refer to Fig. 3 for the system setup. The SPI lines (SIMO, SOMI and SCK) and the individual Slave Select (SS) lines for each

## FPGA/CPLD BASED SOLUTION TO STRETCH THE SPEED OF MICROPROCESSOR/ MICROCONTROLLER BASED INSTRUMENTATION.

Dr. (Mrs.) Shaila Subbaraman Walchand College of Engineering, Sangli. Mrs. Vaishali V. Patil Dr. J.J.Magdum College of Engineering, Jaysingpur.

#### Abstract:

The approach of implementing VLSI designs in programmable logic devices has offered promising solutions for many high-speed digital systems. In fact, this has led to the concept of fabless fabrication, wherein countries that do not have expensive chip fabrication foundries can fabricate VLSI chips of their own design and can achieve very large-scale system implementation. Microprocessor/microcontroller (µp/µc) based systems have been the favourite choice of many designers, especially for medium and low speed electronic systems. Because a specific number of machine cycles is required to execute any instruction, µp/µc based systems do not really work at the rate of the system clock. This is especially true when the µp/µc is interrupted by an external signal and an interrupt service routine (ISR) is then executed to carry out a particular task. This paper presents a novel approach that combines the state-of-the-art programmable logic device approach with traditional µp/µc based instrumentation to reduce the delays and obtain the highest possible speeds, comparable to the speed of the system clock.

#### 1. Introduction:

Many industrial electronic systems use a µp/µc as the heart of many operations. State-of-the-art design tools offer C cross-compilers and a C-language interface to relieve design engineers from writing tedious assembly language code. However, these systems suffer from the drawback that the maximum operational speed does not match the system clock speed. This is due to the fact that each instruction demands a specific number of machine cycles. Even the pipelining approach of Fetch-Execute-Overlap (FEO) cannot stretch the entire operation to a speed matching the clock speed. The severity of this problem increases especially when a µp/µc is interrupted by an external signal demanding a service for which it undergoes an execution of an interrupt service

## SELF TUNING CIRCUIT FOR FPGA BASED WAVEPIPELINED MULTIPLIERS

## G. Lakshminarayanan<sup>°</sup>, B. Venkataramani<sup>\*</sup>, M. Yousuff Shariff<sup>\*</sup>, T. Rajavelu<sup>\*</sup> and M. Ramesh<sup>\*</sup>

#### Abstract

In this paper, an automation procedure for the design of FPGA based wavepipelined multipliers is proposed. This maximizes the operating frequency of the wavepipelined circuit by tuning the clock frequency and the clock skew. To study the effectiveness of this approach, an array multiplier and multipliers using dedicated AND gates as well as fast carry logic are implemented using the automation procedure on Xilinx Spartan XCS30-3VQ100 and Spartan-II XC2S100-5PQ208 devices. Their performance is compared with that obtained using non-pipelined and pipelined approaches. Wavepipelined circuits are faster by a factor of 1.4-1.7 compared to non-pipelined circuits. The pipelined circuits achieve their speed at the cost of an increase in the number of registers by a factor of 1.75-6.5. The technique proposed in this paper is also applicable to ASICs and FPGAs from other vendors.

Keywords: Wavepipeline, FPGA, FSM, Signature analyzer, Array multiplier, Self-test circuit

#### 1 Introduction:

FPGAs with gate counts of the order of a few million gates have become a reality, leading to the design of a complete system on a chip (SOC) [1]. In view of this, FPGA based implementation of various systems becomes important, especially for high speed and low volume applications. For high speed DSP circuits, pipelining is one of the popular techniques used. By splitting the combinational logic block into a number of smaller blocks and placing registers in between them, high speeds are achieved in the pipelined circuits. However

National Institute of Technology, Tiruchirappalli, INDIA bvenki@nitt.edu

## FPGA IMPLEMENTATION OF MULTIPLE TARGET SEGREGATOR

## Gaurav Singh Nim<sup>1</sup>, Scientist 'C' and B S Chauhan<sup>2</sup>, Scientist 'E'

#### Abstract

Segregation of multiple targets is very important to a multiple-target tracking system, where target position and shape play a vital role. In this paper, a multiple target segregator is presented which performs the front-end extraction to generate a coarse candidate target list. The hardware is implemented using Field Programmable Gate Arrays. First, image segmentation and horizontal connectivity analysis are performed, where the target pixels are separated from the background pixels. This is followed by a data processing unit that carries out vertical connectivity analysis and the calculations for target centroid and shape. An algorithm for vertical connectivity analysis is also proposed. The simulations have verified that a large number of targets can be extracted in real time. The performance issues of the implementation are discussed.
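The processing chain outlined in the abstract (segmentation, horizontal connectivity, vertical connectivity, centroid and shape) can be sketched in software as follows. This is only an illustrative model and not the authors' algorithm or hardware architecture: it assumes a fixed intensity threshold, treats each row's runs of target pixels as the horizontal connectivity result, merges runs that overlap between adjacent rows using a union-find structure, and reports a geometric (not intensity-weighted) centroid with a bounding box as the shape cue. All names are hypothetical.

```python
def segregate_targets(image, threshold):
    """Return one dict per target with area, centroid (row, col) and bounding box."""
    parent = []  # union-find forest over run labels

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    runs = []           # (row, col_start, col_end, label), col_end inclusive
    prev_row_runs = []
    for r, row in enumerate(image):
        cur_row_runs = []
        c = 0
        while c < len(row):
            if row[c] > threshold:
                # Horizontal connectivity: collect one run of target pixels.
                start = c
                while c < len(row) and row[c] > threshold:
                    c += 1
                end = c - 1
                label = len(parent)
                parent.append(label)
                # Vertical connectivity: merge with overlapping runs above.
                for _, ps, pe, pl in prev_row_runs:
                    if ps <= end and pe >= start:
                        parent[find(pl)] = find(label)
                cur_row_runs.append((r, start, end, label))
            else:
                c += 1
        runs.extend(cur_row_runs)
        prev_row_runs = cur_row_runs

    stats = {}
    for r, s, e, label in runs:
        n = e - s + 1
        t = stats.setdefault(find(label),
                             {"area": 0, "sum_r": 0.0, "sum_c": 0.0, "box": [r, s, r, e]})
        t["area"] += n
        t["sum_r"] += r * n
        t["sum_c"] += n * (s + e) / 2.0  # sum of column indices s..e
        b = t["box"]
        t["box"] = [min(b[0], r), min(b[1], s), max(b[2], r), max(b[3], e)]

    return [{"area": t["area"],
             "centroid": (t["sum_r"] / t["area"], t["sum_c"] / t["area"]),
             "box": tuple(t["box"])} for t in stats.values()]


if __name__ == "__main__":
    frame = [[0, 9, 9, 0, 0, 0],
             [0, 9, 0, 0, 8, 0],
             [0, 0, 0, 0, 8, 0]]
    for target in segregate_targets(frame, threshold=5):
        print(target)
```

Working on runs rather than individual pixels keeps the per-row state small, which is one reason run-based connectivity schemes map well onto streaming hardware.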

#### 1. Introduction

In tracking multiple targets, cues such as target position and shape information play a vital role. Obtaining this information requires a large number of computations and complex operations. Various schemes for hardware implementation have been proposed, but they have some limitations.

The work presented in this paper is one of the results of a research activity aimed at implementing dedicated architectures for multiple target tracking.

Liebe [1] has proposed a scheme for centroid extraction for multiple regions of interest but the prior knowledge of the target size is required for the scheme to work best. This forms a serious limitation when prior information is not available.

Albus [2] has used the variable rectangular gate to track a target. With multiple targets this scheme will have its drawbacks in terms of computational load.

Axelrod [3] implemented multiple target extraction using the commercial processor. Due to the complexity of the algorithm and the real time constraints it is inflexible to perform the total computation using the hardware circuit.

Chen [4] provided the basic platform for this work. The algorithm proposed performs well for various types of targets but has limitations as the target smoothness varies. The intensity centroid makes the target centroids more sensitive to light variations. The implementation uses different hardware platforms, which raises the complexity level of the final hardware. In this work, multiple target segmentation cum segregation is performed using Field Programmable Gate Arrays (FPGAs). Section 2 briefly discusses FPGA as an

<sup>1,2</sup> Instruments Research and Development Establishment, DRDO Labs, Dehradun. gauravsn@hotmail.com

## AN FPGA IMPLEMENTATION OF CODE PHASE SHIFT KEYING BASEBAND DECODER

## G.Thavasi Raja<sup>[1]</sup>, S. Rajaram<sup>[2]</sup>, Dr. V. Abhai Kumar<sup>[3]</sup>

ABSTRACT- Code-Phase-Shift Keying (CPSK) is a direct-sequence spread-spectrum (DS-SS) signaling system employing M different code phase shifts of a single pseudo-noise (PN) code sequence for M-ary signaling. With the emergence of home entertainment, automation, and information devices that are capable of being interconnected in home networks, there is increasing interest in the use of wireless transmissions in home networking. The CPSK signaling scheme offers the potential of increasing spreading gain without reducing data rate, or increasing data rate without sacrificing the spreading gain. The CPSK receiver consists of an IF demodulator and a CPSK baseband decoder. This paper proposes the design and implementation of a CPSK baseband decoder suitable for wireless home networking applications. The CPSK baseband decoder consists of several modules, which are easily realized in digital logic and implemented using a Field Programmable Gate Array (FPGA) chip. The tracking and acquisition schemes are adapted from the conventional DS-SS scheme. A modified double-dwell serial search scheme is used for code acquisition and tracking, and the carrier-phase synchronization is solved by a double-threshold detection scheme in the CPSK baseband decoder.
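The core idea of CPSK, encoding each M-ary symbol as a cyclic phase shift of one PN sequence, can be sketched at chip level as follows. This is an idealized baseband model and not the decoder described in the paper: it assumes perfect chip and carrier synchronization, a length-31 maximal-length sequence generated by the recurrence a[n] = a[n-3] XOR a[n-5], and evenly spaced code phases. The function names are hypothetical.

```python
def pn_sequence():
    """Length-31 maximal-length sequence as +1/-1 chips.

    Bits follow a[n] = a[n-3] XOR a[n-5], i.e. the primitive polynomial
    x^5 + x^2 + 1, starting from the seed 1, 0, 0, 0, 0.
    """
    bits = [1, 0, 0, 0, 0]
    while len(bits) < 31:
        bits.append(bits[-3] ^ bits[-5])
    return [1 - 2 * b for b in bits]  # bit 0 -> +1, bit 1 -> -1


def cpsk_encode(symbol, pn, m_ary):
    """Represent a symbol in 0..m_ary-1 as a cyclic shift of the PN sequence."""
    shift = symbol * (len(pn) // m_ary)
    return pn[shift:] + pn[:shift]


def cpsk_decode(received, pn, m_ary):
    """Correlate against every candidate code phase and pick the strongest."""
    scores = [sum(r * c for r, c in zip(received, cpsk_encode(s, pn, m_ary)))
              for s in range(m_ary)]
    return scores.index(max(scores))


if __name__ == "__main__":
    pn = pn_sequence()
    tx = cpsk_encode(2, pn, m_ary=4)
    rx = list(tx)
    for i in (3, 11, 20):       # flip a few chips to mimic channel errors
        rx[i] = -rx[i]
    print(cpsk_decode(rx, pn, m_ary=4))
```

Because a maximal-length sequence has a sharp autocorrelation peak, the correct code phase stands out clearly from the other M - 1 candidates, and only one PN generator is needed regardless of M.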

Keywords: CPSK, FPGA, WLAN

#### 1. Introduction

For the last couple of years, one of the hottest topics in computing and communications has been wireless technology, which will offer more bandwidth, security, and reliability, making it more suitable for multimedia, e-commerce, video conferencing and other advanced applications. Some experts say that 3G technology itself is not good enough; others maintain that wireless local area network (WLAN) or radio-router technology would be better suited than 3G for many advanced applications [4,5]. With the spread of high performance portable computers and the need to network computers, data terminals and devices such as personal assistants, home entertainment and automation, WLANs have also become an emerging technology for today's computer interconnected homes and communication industries. The spread spectrum (SS) transmission range is the 2.4 to 2.4835 GHz ISM band, according to FCC part 15.247, used in the physical layer of WLANs [5]. Commercially available radio transceivers employing the conventional DS-SS Binary-Phase-Shift Keying (BPSK) scheme provide data throughput up to 2 Mbps, limited by the available frequency bandwidth [1, 2]. One way to improve the throughput is to use an M-ary SS scheme in which different pseudo noise (PN) sequences are used to encode several data bits for transmission [4]. However, this scheme has a drawback in that the PN codes may interfere with each other, resulting in degraded bit error rate (BER) performance. Although orthogonal maximal length PN sequences can be used, the number of available sequences for any given code length is limited. Another drawback is the need for extra hardware to generate different PN codes. To minimize these drawbacks, a novel M-ary SS signaling scheme, known as code-phase shift keying (CPSK), has been proposed [3]. Previous performance evaluations [3] and

<sup>[1]</sup> Mr. G. Thavasi Raja, Final Year M.E Communication Systems student, Dept. of. ECE, Thiagarajar College of Engineering, Madurai-15, Anna University, Tamil Nadu, India. <u>g</u> thavasi@yahoo.co.in

<sup>[2]</sup> Mr. S. Rajaram., M.E., (Ph.D.)., Lecturer, Dept. of. ECE, Thiagarajar College of Engineering, Madurai-15, Anna University, Tamil Nadu, India. rajaram siya@yahoo.co.in

<sup>[3]</sup> Prof. Dr. V. Abhai Kumar., Ph.D., Principal and Head of ECE Dept, Thiagarajar College of Engineering, Madurai-15, Anna University, Tamil Nadu, India.

## FPGA Implementation of Subband Image Encoder using Discrete Wavelet Transform

Prashant R. Deshmukh<sup>1</sup>, A.A. Ghatol Dr. P.D. Polytechnic, Amravati 1 Pr\_deshmukh@yahoo.com

#### Abstract

Although FPGA technology offers the potential of designing high-performance systems at low cost, its programming model is prohibitively low level. To allow a novice signal/image processing end-user to benefit from this kind of device, the level of design abstraction needs to be raised. This approach will help the application developer to focus on signal/image processing algorithms rather than on low-level designs and implementations. This paper presents a framework for an FPGA-based Discrete Wavelet Transform (DWT) system. The approach helps the end-user to generate FPGA configurations for DWT at a high level. The proposed DWT filter bank has a simple architecture, but it is designed so that the user can specify the desired compression rate. After implementation on an FPGA chip, the designed encoder operates at 73.82 MHz.

Keywords: FPGA, Discrete wavelet transform, Sub band coding, filter bank

#### 1. Introduction

Compression is based on two fundamental principles. The first is redundancy reduction: exploiting the properties of the signal source to remove redundancy from the signal. When digital images are considered as realizations of a two-dimensional stochastic process, this structure manifests itself through statistical dependencies between pixels. The second is irrelevancy reduction: exploiting the imperfections of the human visual system to omit parts or details of the signal that will not be noticed by the receiver [4,9]. The theory of subband decomposition provides an efficient framework for implementing schemes for redundancy and irrelevancy reduction, as has been demonstrated repeatedly in subband and wavelet-based schemes. An important motive for using subband decomposition schemes is the demand for a "scalable" image representation [7]. Subband image decomposition using the wavelet transform has many advantages: it is well suited to the analysis of non-stationary image signals and achieves a high compression rate.

In this paper, the DWT encoder for image processing is coded in VHDL and implemented on a Xilinx FPGA chip. It uses the filter bank pyramid algorithm for the wavelet transform. Each filter is an FIR filter, and the two filters are connected in a parallel structure that computes the lowpass and highpass DWT coefficients in the same clock cycle, which speeds up the computation. By using QMF properties, the design halves the number of multipliers needed for the DWT computation, which increases efficiency and reduces hardware size. In addition, the user can control the designed encoder through an input parameter, namely the compression rate.
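One analysis stage of such a filter bank can be sketched in a few lines. The sketch derives the highpass taps from the lowpass taps through the QMF relation and produces one lowpass and one highpass coefficient from the same input window. The Haar taps, the periodic boundary handling and the function names are assumptions made for illustration; the filter taps of the actual encoder are not given in this section.

```python
import math

def qmf_highpass(h):
    """Derive highpass taps from lowpass taps: g[n] = (-1)^n * h[L-1-n]."""
    L = len(h)
    return [((-1) ** n) * h[L - 1 - n] for n in range(L)]

def dwt_stage(x, h):
    """One analysis stage: filter with h and g, keeping every second output."""
    g = qmf_highpass(h)
    N = len(x)
    low, high = [], []
    for k in range(0, N, 2):
        # Both outputs use the same input window, as in a parallel structure.
        window = [x[(k + n) % N] for n in range(len(h))]
        low.append(sum(c * s for c, s in zip(h, window)))
        high.append(sum(c * s for c, s in zip(g, window)))
    return low, high

HAAR = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # assumed example taps
low, high = dwt_stage([4.0, 6.0, 10.0, 12.0], HAAR)
```

Since the highpass taps have the same magnitudes as the lowpass taps, a hardware realization can share the coefficient products between the two filters, which is the kind of multiplier saving the QMF property makes possible.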

#### 2. Background Theory

In the last few years, the wavelet transform has been widely used for a wide range of multimedia applications including signal analysis, signal decoding,

## IMPLEMENTATION OF CONVOLUTIONAL ENCODER AND HARD-DECISION VITERBI DECODER IN **FPGA**

C.Arun, V.Thiyagarajan \*S.Sasikumar, \*\* Dr.M.Madheswaran. \* Faculty - \*\* Professor, Department of ECE PSNA College of Engineering and Technology, Dindigul, Tamilnadu

## Abstract

The need for reliable data transfer is becoming more and more important in today's digital communication. Forward Error Correction (FEC) techniques are used to correct and eliminate potential noise errors in a data stream at the receiver end. Convolutional encoding is an FEC technique that is particularly suited to a channel in which the transmitted signal is corrupted mainly by additive white Gaussian noise (AWGN). Viterbi decoding is used to decode the convolutionally encoded data. The decoder operates by finding the maximum-likelihood decoding sequence; it is simulated using VHDL. The hardware is designed in Xilinx schematic and implemented in a Spartan-2 FPGA. A maximum speed of 68 Mbits/second was obtained. The ability to process parallel data paths within the FPGA takes advantage of the parallel structure of the hardware units in the Viterbi decoder, and therefore higher speed can be achieved.

## 1. Introduction

Traditionally, high-power transmitters or larger antennas were used to control the effect of noise on the transmitted data. Channel coding realized in VLSI circuits provides a better-performing and lower-cost alternative. Channel coding [1] refers to the class of signal transformations designed to improve communications performance by enabling the transmitted signals to better withstand the effects of various channel impairments, such as noise, fading, and jamming. Broadly, error control techniques are classified into *error detection and retransmission* and *forward error correction* (FEC). The former technique requires a two-way link, where the receiver simply asks the transmitter to resend the data if the receiver detects an error. In contrast, FEC requires only a one-way link, and in this case parity bits are designed for both the detection and the correction of errors by the receiver. The FEC technique can be realized with linear block codes and convolutional codes. In this paper, we have realized the FEC technique known as convolutional coding with Viterbi decoding. A tutorial on the Viterbi Algorithm can be found in [2].

In this work we have designed and developed a Viterbi decoder [3] using field programmable gate arrays [4]. Our aim was to implement Viterbi decoding in a Spartan-2 device from Xilinx. A maximum speed of 68 MHz was obtained for the Spartan-2 device. We implemented the Trace Back Algorithm (TBA) method for the decision unit of the Viterbi decoder. The paper is organized as follows: Section II briefly describes the convolutional encoder. Section III briefly introduces the Viterbi Algorithm. In general, implementing hardware in an FPGA is complex. The reason being that FPGA
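The combination of convolutional coding and hard-decision Viterbi decoding can be sketched as follows. The code rate of 1/2, the constraint length K = 3 and the generator polynomials (7, 5 in octal) are assumptions picked as a common textbook example, not parameters taken from this paper. For brevity the sketch keeps a full survivor path per state rather than a separate traceback step.

```python
G = (0b111, 0b101)   # assumed generator polynomials (7, 5 octal)
K = 3                # assumed constraint length

def parity(v):
    return bin(v).count("1") % 2

def encode(bits):
    """Rate-1/2 convolutional encoder; appends K-1 zero tail bits."""
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):
        reg = (b << (K - 1)) | state           # newest bit in the MSB
        out += [parity(reg & g) for g in G]
        state = reg >> 1
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoding with Hamming branch metrics."""
    n_states = 1 << (K - 1)
    INF = float("inf")
    metric = [0] + [INF] * (n_states - 1)      # encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metric = [INF] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == INF:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in G]
                cost = metric[s] + sum(e != x for e, x in zip(expected, r))
                nxt = reg >> 1
                if cost < new_metric[nxt]:     # add-compare-select
                    new_metric[nxt] = cost
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    return paths[0][:-(K - 1)]                 # tail bits force final state 0

message = [1, 0, 1, 1, 0, 0, 1]
codeword = encode(message)
corrupted = list(codeword)
corrupted[3] ^= 1                              # one channel bit error
```

Decoding `corrupted` returns the original `message`: the single flipped bit raises the metric of the correct path by one, while every competing path through the trellis accumulates a larger Hamming distance.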

## **HIGH SPEED SQUARER**

Chandra Mohan Umapathy Senior IC Design Engineer Celstream Technologies Private Limited

Prestige Blue Chip, Block II #9, Hosur Road, Bangalore: 29

chandra.mohan.umapathy@celstream.com

Abstract: This paper proposes a novel architecture for a modular, scalable & reusable hybrid squaring circuit. A comparison is made between different implementations of the squaring circuit. The implementation results show a significant improvement in performance in terms of area, power & timing.

The paper is organized as follows: the first section gives a brief introduction of the different aspects & the motivation for the High Speed Squarers. The second section introduces the proposed algorithm. The third section explains the proposed architecture. The fourth section deals with the comparison of the proposed architecture with the existing method & conclusion.

Key Words: Squarer, Squaring Circuit, Multiplier, Low Power etc.

#### SECTION 1: Introduction:

Squaring is one of the most frequently performed functions in most DSP systems. Squaring is a special case of multiplication. The squaring circuit forms the heart of DSP operations such as Image Compression, Decoding, Demodulation, Adaptive Filtering, Least Mean Squaring etc.

Traditionally, squaring was performed using the multiplier itself. As applications evolved & the demand for high-speed processing increased, special attention was given to the squaring function & dedicated squarers were proposed & implemented.

Initially, when delay was the primary concern, squaring was performed using a Look Up Table approach, trading off area. The major drawback of the scheme was the area penalty, which increases exponentially with the number of input bits. Because of the cost & the interconnect delay, this approach has not been the preferred implementation method until now. Recently, with the evolution of VLSI Process Technology, it is becoming more popular.

Recently, a lot of research has been conducted to develop different methodologies for implementing squarers, with emphasis on improving delay & reducing area. As a result, a new scheme, called the Hybrid Squarer, was developed as a compromise between these trade-offs. Greater emphasis is given to Hybrid Squarers, which comprise Memory Elements & Computing Logic.
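The trade-off between memory and logic can be made concrete with a small sketch. It splits the operand into two halves, looks up the squares of the halves in a half-width table, and computes only the cross product with logic, using x² = a²·2^(2h) + 2ab·2^h + b² for x = a·2^h + b. This particular partitioning, the 8-bit width and the function names are assumptions for illustration; the architecture proposed in the paper is not described in this section.

```python
def make_square_lut(bits):
    """Lookup table (the memory element): squares of all `bits`-wide values."""
    return [v * v for v in range(1 << bits)]

def hybrid_square(x, width=8):
    """Square a `width`-bit value with a half-width LUT plus one cross product."""
    half = width // 2                          # width is assumed to be even
    lut = make_square_lut(half)                # 2**half entries, not 2**width
    a, b = x >> half, x & ((1 << half) - 1)    # x = a * 2**half + b
    cross = a * b                              # the computing-logic part
    return (lut[a] << (2 * half)) + (cross << (half + 1)) + lut[b]

full_table_entries = 1 << 8                    # pure LUT approach for 8 bits
hybrid_table_entries = 1 << 4                  # half-width table used above
```

For an 8-bit operand the table shrinks from 256 entries to 16, at the cost of one 4-by-4 multiplication and two additions, which illustrates why the table size of a pure lookup approach grows exponentially with the operand width.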

## HARDWARE ARCHITECTURE FOR MESSAGE PADDING IN CRYPTOGRAPHIC HASH PRIMITIVES

T S B Sudarshan<sup>[1]</sup>, Ganesh T S<sup>[2]</sup>

#### Abstract

The increasing prominence of mobile communication and the Internet as a tool of commerce brings along with it the important issue of security. An essential aspect of secure communication over any type of network is cryptography. Cryptography algorithms fulfill specific security requirements such as authentication, confidentiality, integrity and non-repudiation. Hash algorithms are a class of algorithms used for fulfilling the requirements of integrity and authentication. They compute a fixed-length message digest based on the input message, which can be used as a digital fingerprint for the original message. The increasing prominence of wireless devices has increased the necessity for hardware implementation of these algorithms. An important requirement in these algorithms is that the bit length of the algorithm input should be an integral multiple of a predefined block size, whatever the input message length. In this paper, we propose a hardware architecture aimed at providing a unified solution to this task of message padding for any of the present class of commercial MDC (Manipulation Detection Codes) hash primitives. The architecture is simulated using Verilog HDL and then synthesized on a wide variety of FPGAs.

Keywords: Cryptography, Hash Algorithms, Hardware Implementation, Message Padding, FSM Controllers, FPGA
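The padding requirement can be illustrated with a minimal software sketch of the scheme used by MD5 and the SHA-1 family: a single 1 bit, then zero bits, then the message length in bits, so that the total is a multiple of the block size. The function name and its parameters are hypothetical; the sketch shows only the padding rule and says nothing about the hardware architecture proposed in the paper.

```python
def md_pad(message: bytes, block_size: int = 64, length_bytes: int = 8,
           big_endian: bool = True) -> bytes:
    """Pad `message` to a multiple of `block_size` bytes."""
    bit_len = 8 * len(message)
    padded = message + b"\x80"                 # a single 1 bit, then zeros
    zeros = (-(len(padded) + length_bytes)) % block_size
    padded += b"\x00" * zeros
    order = "big" if big_endian else "little"  # SHA-1 is big-endian, MD5 little
    return padded + bit_len.to_bytes(length_bytes, order)

block = md_pad(b"abc")                         # one 64-byte block
two_blocks = md_pad(b"a" * 56)                 # length field no longer fits
```

A 55-byte message still fits in one 64-byte block, whereas a 56-byte message spills into a second block because the 0x80 byte and the 8-byte length field no longer fit, which is the boundary case a padding unit has to handle.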

#### 1. Introduction


In this age of information technology, the important concern of security is handled by cryptography algorithms. They ensure that the authentication, confidentiality, integrity and non repudiation aspects of communication are not compromised. Not only is data protected from theft or alteration, even user authentication can be performed. Message authentication and integrity checking are essential techniques used to verify that the received message comes from the alleged source and has not been altered. Hash algorithms are a class of cryptographic primitives which ensure integrity and authentication. Hash algorithms are further classified as MDC (Manipulation Detection Codes) and MAC (Message Authentication Codes). MDCs ensure integrity only, but MACs can ensure both integrity and authentication. The MDCs compute a fixed-length hash value (termed as the Message Digest) based upon the input plaintext message that makes it impossible for either the contents or length of the

<sup>1</sup> Asst. Prof, BITS, Pilani, tsbs@bits-pilani.ac.in

<sup>2</sup> BITS, Pilani, vlsi\_comparch@yahoo.com

## An Application of Neural Network Learning to Physical Design Optimization in VDSM Technology

#### Ganesan.S.Iver and Dr. Rajendra.M.Patrikar

Visvesvaraya National Institute of Technology, Nagpur-440011. India.

#### Abstract

The introduction of new technology in design and manufacturing is becoming a complex issue because of the complexity involved in device operation and scaling. Physical design is becoming increasingly important because of its influence on the electrical parameters. It has become important to predict the effect of these parameters during implementation in order to reduce design-manufacturing iterations. Since a new technology is usually not radically different from the previous one, the learning cycles of the previous technology can be used effectively through artificial neural networks (ANN). In this paper we show the effect of a few physical parameters on the power-delay product, transistor current, hot carrier effects etc., through a neural network implementation. The network has been trained with technology parameters of 1.5u, 0.5u, 0.35u, and 0.18u [8]. The projection capability of the ANN has been examined by comparing its results with simulated results for 50nm technology. The comparison shows agreement between projected and simulated values within a reasonable error margin; thus this capability could be used for design optimization.

#### 1. Introduction

With the reduction in technology development cycles and growing market pressure, it is becoming increasingly important to shorten the design-to-manufacturing cycle. It has also become important to be able to predict the performance of next-generation devices. Among the biggest stumbling blocks to 90-nm and finer line widths are design for manufacturability and the increasing attention to yield optimization. Industry experts have observed that manufacturability is the issue that surprised people in terms of difficulty at 130 nm; they are dreading it at 90 nm, and it is expected to cause larger problems at smaller nodes. Usually, when a new semiconductor technology is introduced in terms of design rules, the experience of the old technology is not used effectively, although the new technology is not radically different. The design-to-manufacturing cycle takes longer if the technology parameters and the impact of design parameters on reliability and yield are not studied. Because of the rapid changes in technology, not all parameter dependences are known through exact analytical models. Besides this, the methodology and tools used for design and fabrication bring in complications that are difficult to handle with conventional tools. For example, while designing VLSI circuits, designers often find that their implementations do not meet the timing and/or area constraints although they have been simulated exhaustively at the design stage. Also, when any new design is brought for manufacturing, fabrication houses go through a learning cycle in which it is always beneficial to have device and circuit design information to correlate with device performance. Usually these aspects result in too many parameters and long optimization cycles in the design and manufacturing phases. We

## Code Compression using Unused Encoding Space for Variable Length Instruction Encodings

## Authors: Dipankar Das<sup>[1]</sup>, Rajeev Kumar<sup>[1]</sup>, and Partha P. Chakrabarti<sup>[1]</sup>

#### Abstract

Most of the work done in the field of code compression pertains to processors with fixed-length instruction encoding. In this work we apply code compression to variable-length instruction set processors whose encodings are already optimized to a certain extent with respect to their usage. We develop a dictionary-based algorithm which utilizes the unused encoding space of an instruction set architecture to encode code-words, and which addresses issues arising out of variable-length instructions. We test the algorithm with a RISC processor and include results for compression and performance in terms of cycle counts and memory accesses.
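The general idea of using unused encoding space for a dictionary can be sketched as follows. The sketch replaces the instructions that save the most bytes with one-byte code-words taken from opcodes the instruction set does not use. The toy instruction format, the ranking rule and all function names are assumptions for illustration and are not the algorithm developed in the paper.

```python
from collections import Counter

def compress(instructions, unused_opcodes):
    """Replace frequent multi-byte instructions with unused 1-byte codes."""
    counts = Counter(i for i in instructions if len(i) > 1)
    # Rank by bytes saved: each occurrence shrinks from len(i) bytes to 1.
    ranked = sorted(counts, key=lambda i: counts[i] * (len(i) - 1), reverse=True)
    dictionary = dict(zip(ranked, (bytes([op]) for op in unused_opcodes)))
    stream = b"".join(dictionary.get(i, i) for i in instructions)
    return stream, {code: instr for instr, code in dictionary.items()}

def decompress(stream, table, length_of):
    """Expand code-words; `length_of(first_byte)` sizes a normal instruction."""
    out, pos = [], 0
    while pos < len(stream):
        first = stream[pos:pos + 1]
        if first in table:                     # code-word from unused space
            out.append(table[first])
            pos += 1
        else:                                  # ordinary variable-length instruction
            n = length_of(stream[pos])
            out.append(stream[pos:pos + n])
            pos += n
    return out

# Toy ISA (assumed): the top two bits of the first byte give the extra length.
A, B, C = b"\x40\x01", b"\x80\x01\x02", b"\x01"
program = [A, B, A, C, B, A]
stream, table = compress(program, unused_opcodes=[0xF0, 0xF1])
restored = decompress(stream, table, lambda first: 1 + (first >> 6))
```

In this toy program the 13 bytes of code shrink to 6, and decoding needs only the dictionary and the normal length rule of the instruction set, because a code-word can be recognized from its first byte alone.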

#### 1. Introduction

In today's world embedded systems have become all pervasive. From hand held devices to automotive parts, embedded systems have a great impact on our lives. Consequently the market for such systems has shown tremendous growth in recent years. This has resulted in tremendous competition and hence the desperate need to reduce costs while improving the performance of such systems in terms of speed, power consumption, area and memory accesses.

Most embedded systems are microcontroller based as microcontrollers provide flexibility and low tape-out time. The cost of the memory needed to store the code of the microcontroller forms a significant portion of the total cost of the system. Besides this, a larger code requires more bus accesses to the instruction memory as compared to a smaller code. It also results in an increased number of instruction-cache misses, extra power consumption and slowing down of the system. Therefore, the research community has shown a lot of interest in the development of methods for reducing the size of the executable code. Compiler optimizations, e.g. Debray et al (2000), have long been used to reduce the size of executables. However compiler techniques alone, whose primary aim is to improve the performance, have been found insufficient in reducing the size of the resident code to desirable levels. Hence in recent years the focus of research and development has been on post compiler techniques which make use of redundancies and the repetition of instructions in the code.

A post-compilation method analyzes the code and replaces repeating instructions or a sequence of repeating instructions by a small code-word. There are other methods which remove the redundant portions of an instruction such


<sup>[1]</sup> Department of Computer Science & Engineering, Indian Institute of Technology Kharagpur, Kharagpur, WB 721 302, India. {ddas, rkumar, ppchak}@cse.iitkgp.ernet.in

## **FPGA Implementation of OFDM Transceiver**

#### V. Appandai Raj, D. Jovin Vasanth Kumar, R. Madhu Karthikeyan, S. Rajaram, Dr V. Abhai Kumar

#### Abstract

Orthogonal Frequency Division Multiplexing (OFDM) is a multicarrier modulation system employing Frequency Division Multiplexing (FDM) of orthogonal sub-carriers, each modulating a low bit-rate digital stream. Multi-carrier transmission has many useful properties, such as delay-spread tolerance and spectrum efficiency, that encourage its use in broadband communications. A set of orthogonal sub-carriers together forms an OFDM symbol. OFDM is gaining popularity in broadband standards and high-speed wireless LANs due to its resistance to Inter Symbol Interference (ISI). This project deals with OFDM design, simulation and synthesis using VHDL. Specifically, an 802.11a OFDM system is considered, with 16-QAM at rate ½ as the modulation scheme. We have implemented the OFDM transceiver in a Virtex-E XCV3200e device and simulated it using ModelSim.

#### 1. Introduction:

OFDM is a multi-channel modulation system employing FDM of orthogonal sub carriers each modulating a low bit rate digital stream [1]. In OFDM, to overcome the problem of bandwidth wastage, N overlapping but orthogonal sub-carriers, each carrying a baud rate of 1/T and spaced 1/T apart are used. Because of the frequency spacing selected, the sub-carriers are all mathematically orthogonal to each other. This permits the proper demodulation of the symbol streams without the requirement of non overlapping spectra [2].

In this paper, 16-QAM, rate  $\frac{1}{2}$  as modulation scheme for OFDM is considered for the implementation of OFDM transceiver.

#### 2. OFDM using Inverse DFT

Consider a data sequence $d_0, d_1, \ldots, d_{N-1}$, where each $d_n$ is a complex symbol. The data sequence could be the output of a complex digital modulator, such as QAM or PSK. Suppose we perform an IDFT on the sequence $2d_n$, where the factor 2 is used purely for scaling purposes; we get a result of N complex numbers $S_m$ ($m = 0, 1, \ldots, N-1$)

as:

$$S_m = 2\sum_{n=0}^{N-1} d_n \exp(j2\pi nm/N) = 2\sum_{n=0}^{N-1} d_n \exp(j2\pi f_n t_m), \qquad m = 0, 1, \ldots, N-1 \tag{2.1}$$

where $f_n = \frac{n}{NT_s}$ and $t_m = mT_s$.
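Equation 2.1 can be evaluated directly in a few lines. The sketch below computes the scaled IDFT by its defining sum and then recovers the data symbols with a forward DFT, which works because the sub-carriers are orthogonal over one symbol. The function names and the four example symbols are assumptions for illustration; a practical transceiver would use an FFT rather than the direct sum.

```python
import cmath

def ofdm_idft(d):
    """S_m = 2 * sum_n d_n * exp(j*2*pi*n*m/N) for m = 0..N-1."""
    N = len(d)
    return [2 * sum(d[n] * cmath.exp(2j * cmath.pi * n * m / N) for n in range(N))
            for m in range(N)]

def recover(S):
    """Forward DFT with 1/(2N) scaling, which undoes ofdm_idft."""
    N = len(S)
    return [sum(S[m] * cmath.exp(-2j * cmath.pi * n * m / N) for m in range(N)) / (2 * N)
            for n in range(N)]

symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]   # assumed example QPSK symbols
samples = ofdm_idft(symbols)
recovered = recover(samples)
```

The first sample equals twice the sum of the data symbols, as the formula gives for m = 0, and `recovered` matches `symbols` up to rounding error.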

Passing the real part of the symbol sequence represented by equation (2.1) through a low-pass filter, with each symbol separated by a duration of $T_s$ seconds, yields the signal,

## FPGA IMPLEMENTATION OF DWT BASED IMAGE COMPRESSION CODER

Prashant R. Deshmukh, A.A. Ghatol Dr. P.D. Polytechnic, Amravati

#### Abstract

In this paper we present an implementation of the image compression routine SPIHT in reconfigurable logic. A discussion of why adaptive logic is required, as opposed to an ASIC, is provided along with background material on the image compression algorithm. We analyzed several Discrete Wavelet Transform architectures and selected the folded DWT design. In addition, we provide a study of what storage elements are required for each wavelet coefficient. A modification to the original SPIHT algorithm is implemented to parallelize the computation. The architecture of our SPIHT engine is based upon Fixed-Order SPIHT, developed specifically for use within adaptive hardware. For an N x N image, Fixed-Order SPIHT may be calculated in $N^2/4$ cycles. Square images whose dimensions are powers of 2, up to 1024 x 1024, are supported by the architecture. Our system was developed on an Annapolis Microsystems WildStar board populated with Xilinx XC4085 parts. The system achieves a 450x speedup vs. a microprocessor solution, with less than a 0.1 dB loss in PSNR over traditional SPIHT.

Keywords: FPGA, Wavelet transform, SPIHT, PSNR

#### 1. Background

The proposed architecture presents a partitioned approach to wavelet-transform-based image compression and its application to image encoders using programmable hardware. We have developed an efficient FPGA architecture of the state-of-the-art image codec *SPIHT* (Set Partitioning in Hierarchical Trees), which is comparable to the original software solution in terms of visual quality.

#### 2 SPIHT

For our system, we selected Set Partitioning in Hierarchical Trees (SPIHT) image compression algorithm. It first converts the image into its Discrete wavelet transform (DWT) and then transmits information about the wavelet coefficients. The decoder uses the received signal to reconstruct the wavelet and performs an inverse transform to recover the image. We selected SPIHT because it displays exceptional characteristics over several properties all at once [8] including:

- Good image quality with a high PSNR
- Fast coding and decoding
- A fully progressive bit-stream
- Can be used for lossless compression
- May be combined with error protection
- Ability to code for exact bit rate or PSNR

#### 2.1 Discrete Wavelet Transform

The DWT runs a high-pass and a low-pass filter over the signal in one dimension. The result is a new image comprising a high-pass and a low-pass subband. This procedure is then repeated in the other dimension, yielding four subbands: three high-pass components and one low-pass component. The next

# **DSP ARCHITECTURES**

Soujanna Sarkar Subash Chandar G.

<u>souj@ti.com</u>

<u>subba@ti.com</u>

Texas Instruments India Ltd. Golf View Homes, Wind Tunnel Road, Bangalore – 560017

## VLSI IMPLEMENTATION OF VITERBI DECODER

#### Dr Anindya S. Dhar\*, Mrs Vaishali B.M.\*\*

#### Abstract

This paper presents the design of a Viterbi decoder and its implementation on an FPGA. The FPGA has certain advantages over an ASIC, which make its use attractive. The design is implemented using the Xilinx SpartanXL S40XLCS280. The Viterbi decoder is used to decode Trellis coded modulation. The motivation for coded modulation is based on the disadvantages of the conventional approach of using separate modulation and coding blocks. In power-limited applications, such as satellite channels, FEC codes can be invoked in order to improve power efficiency at the cost of expanding the bandwidth required. In a bandwidth-limited scenario, more bits/symbol can be transmitted, but at the cost of increased power requirements. In coded modulation, the advantages of higher-level modulation are combined with those of FEC, where modulation/encoding and demodulation/FEC decoding form an integral operation. In this paper the coding is convolutional, which has consistently proved more effective and simpler to implement for Gaussian channels. The Viterbi decoding algorithm is used in decoding coded modulation, the technique used in telephone line modems. Viterbi decoding has the advantage of a fixed decoding time, and it is well suited for hardware implementation. The hardware description language used is Xilinx Alliance VHDL.

#### 1. Introduction

The Viterbi decoder is well suited for decoding Trellis coded modulation. Coded modulation combines coding and modulation at the sender end. In Trellis coded modulation, the coding scheme is usually a convolutional code and the modulation scheme is higher-order QAM or PSK [1,2]. At the receiver end, decoding and demodulation are not performed in separate steps, but rather interleaved into a single step. Due to this combined process of decoding and demodulation, the parameter governing the channel performance is the *free Euclidean distance* between coded sequences of signals, not the *free Hamming distance* as used with a traditional convolutional decoding / demodulation system.

#### 1.1 Uncoded transmission

Consider a modulation scheme where the signal constellation consists of M' states. Assume that these states are independent of each other, i.e., each one is an equally allowable transmitted signal. Then d<sub>min</sub>, the minimum Euclidean distance between two allowable signal vector sequences, is given by

 $d_{\min}^2 = \min |x' - x''|^2$ 
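For uncoded transmission the minimum can be found by comparing every pair of constellation points. The sketch below does this for unit-energy PSK constellations; the choice of PSK, the unit symbol energy and the function names are assumptions made for illustration.

```python
import cmath
import math
from itertools import combinations

def min_sq_distance(points):
    """Smallest squared Euclidean distance over all pairs of points."""
    return min(abs(a - b) ** 2 for a, b in combinations(points, 2))

def psk(M, energy=1.0):
    """M equally spaced points on a circle of radius sqrt(energy)."""
    r = math.sqrt(energy)
    return [r * cmath.exp(2j * cmath.pi * k / M) for k in range(M)]

d2_qpsk = min_sq_distance(psk(4))    # equals 2 for unit energy
d2_8psk = min_sq_distance(psk(8))    # equals 2 - sqrt(2) for unit energy
```

Doubling the number of points from 4 to 8 at the same energy reduces the squared minimum distance from 2 to about 0.586, which is the loss that the coding part of a coded modulation scheme has to win back.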

\*E&CCE IIT, Kharagpur, WB, INDIA \*\*Senior faculty, Instrumentation Engg, B.V.B.College of Engineering and Technology, Hubli-580 031, Karnatak state, INDIA

## SYSTOLIC ARRAY BASED VLSI ARCHITECTURE FOR MOTION ESTIMATION IN VIDEO COMPRESSION APPLICATIONS

Mallikarjuna Swamy. M.S.<sup>1</sup>, Ashok Rao<sup>2</sup>, D.V. Poornaiah<sup>3</sup>

#### Abstract<sup>1</sup>

Video compression applications have been growing significantly for several years. Consequently, standards have been developed and adopted, and there is a large demand for efficient VLSI architectures for implementing these standards. In the literature, many array architectures for the block matching algorithm (BMA) have been proposed. In the present work, a modular architecture for a logarithmic search BMA is presented. The proposed architecture exploits the patterns of overlapped search areas in the reference frame to derive an efficient architecture that reduces the need to reload or redistribute the same reference data. As a result, this architecture offers more efficient processor utilization while achieving a moderate throughput rate with a reduced pin count.

#### Introduction

A common approach to video compression is to exploit both spatial (intraframe) and temporal (interframe) redundancies. Transform coding is often used for intraframe coding, and predictive coding with motion estimation is often used for interframe coding [2]. In the context of video coding, motion estimation is concerned less with the movement of a specific object in each frame. Rather, within a predefined search area of the reference frame, the goal of motion estimation is to search for the block or region that best matches, under a certain matching criterion, a given block (or region) in the current frame. The displacement between the co-ordinates of the block in the current frame and those of the matched block in the reference frame is called a motion vector [5]. Given the block in the reference frame and the motion vector, as well as the difference between these two corresponding blocks, the block in the current frame can be recovered perfectly. Assuming the reference frame is available, only the difference image between the two matched blocks and the corresponding motion vector need to be transmitted to fully recover the current frame [3].
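The search for the best-matching block can be sketched with an exhaustive search that minimizes the sum of absolute differences (SAD). This is a baseline shown only to make the definition of a motion vector concrete: the SAD criterion, the full search and the function names are assumptions, and the logarithmic search used by the proposed architecture visits far fewer candidates.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block and a candidate."""
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(n) for j in range(n))

def full_search(cur, ref, bx, by, n, p):
    """Return ((dx, dy), cost) minimising SAD within +/-p pixels."""
    h, w = len(ref), len(ref[0])
    best, best_mv = None, (0, 0)
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= by + dy and by + dy + n <= h and
                      0 <= bx + dx and bx + dx + n <= w)
            if not inside:                     # candidate leaves the frame
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best:
                best, best_mv = cost, (dx, dy)
    return best_mv, best

# Synthetic frames: the current frame shows reference content moved by (2, 1).
ref = [[8 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
motion_vector, cost = full_search(cur, ref, bx=2, by=2, n=2, p=2)
```

The search returns the displacement (2, 1) with zero residual, so for this block only the motion vector would need to be transmitted.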

A block refers to a small square of pixels in a frame. During motion estimation, a distance measure, defined by the matching criterion between the target block in the current frame and a candidate block within the search area of

<sup>1</sup> Bapuji Institute of Engineering & Technology, Davangere-577004.

<sup>2</sup> Center for Electronics Design & Technology, IISc, Bangalore-560012

<sup>3</sup> Indian Telephone Industries Ltd., Bangalore-560016

#### Electronic Flow Regulator (EFR) Using Field Programmable Gate Arrays (FPGA)

#### Author: J.Manikandan<sup>1</sup>, M.Jayaraman<sup>2</sup>, M. Jayachandran<sup>3</sup>

#### Abstract

This paper describes a scheme for electronically regulating the flow of propellant to the thruster from a high-pressure storage tank in spacecraft applications. The proposed scheme is based on a space-qualified Field Programmable Gate Array (FPGA) and a Hybrid Micro Circuit (HMC). The use of an FPGA reduces the amount of analog circuitry and digital logic that would otherwise be required in a traditional pressure regulator. Also, because the control algorithm is implemented in software, it can be modified without changing the hardware. The regulation scheme is carried out entirely by the FPGA. The scheme is simple enough to adopt for a wide range of applications where flow must be regulated for efficient operation.

Keywords: Flow Regulator, FPGA, Propulsion System

#### 1. Introduction

The aim of using an Electronic Flow Regulator (EFR) in a satellite is to ensure precise flow of propellant to the thrusters. Spacecraft using Electric Propulsion Systems (EPS) require propellant feed systems that provide precise delivery of propellant to the thrusters from a high-pressure storage tank working in blow-down mode, and leak-tight isolation of the thrusters when not in operation [1]. Precise delivery of the propellant ensures that the propulsion system operates at best efficiency by maximizing propellant and power utilization for the mission. The main advantages are compactness and the flexibility to change the regulation scheme without affecting the hardware of the control circuit.

Figure 1 shows an overview of the electronic flow regulator. The propellant flows from the high-pressure storage tank to the thruster. Pressure transducers placed in two stages sense the pressure and pass the data to the FPGA. One set of current sensors is placed at the end of the second stage to sense the beam current, which is needed for the control algorithm. A smaller tank is placed between the first and second stages of transducers to absorb surges in the flow. Telecommand signals from the ground station are sent to the spacecraft computer on the satellite. Signals from the spacecraft computer, transducers and sensors are received by the Electronic Flow Regulator, and after the necessary processing, signals are sent to the latch valves to drive them ON/OFF. Switching the latch valve ON/OFF compensates for the increase and decrease of flow respectively. The aim is to maintain the flow within specified upper and lower threshold levels. The compensation is done in two levels. The propellant is input at A from the high-pressure storage tank and flows through latch valve I, which regulates the pressure at the first level. The second level, through latch valve II, further regulates the pressure
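Keeping a quantity between an upper and a lower threshold with an ON/OFF valve can be sketched as a two-threshold (hysteresis) decision. The sketch assumes that opening the valve raises the downstream pressure and that one pressure reading is available per control step; the function names, the threshold values and the readings are hypothetical, and the actual control algorithm, which also uses the beam current, is not described in this section.

```python
def latch_valve_command(pressure, lower, upper, valve_open):
    """Two-threshold (hysteresis) decision for one regulation stage."""
    if pressure < lower:
        return True          # assumed: opening the valve raises the pressure
    if pressure > upper:
        return False         # close the valve to let the pressure fall
    return valve_open        # inside the band: keep the current state

def run(readings, lower, upper, valve_open=False):
    """Apply the decision to a sequence of pressure readings."""
    states = []
    for p in readings:
        valve_open = latch_valve_command(p, lower, upper, valve_open)
        states.append(valve_open)
    return states

states = run([0.8, 1.0, 1.3, 1.1, 0.9, 0.7], lower=0.9, upper=1.2)
```

Holding the previous state inside the band avoids switching the latch valve on every small fluctuation, which matters for a valve with a limited number of actuation cycles.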

<sup>1</sup>Scientist 'B', Systems Directorate, ADA, P.B.No.1718, Vimanapura Post, Bangalore-17 <u>imk77@mail.com</u> & jmk77\_ada@yahoo.com

<sup>2</sup>Group Head SCPS, LPSC, ISRO, Bangalore

<sup>3</sup>Professor & Head, Dept of ECE. New Horizon College of Engg, Bangalore

## A LOW POWER ASYNCHRONOUS PIPELINE FIFO

## H.Mangalam<sup>(1)</sup>Dr.K.Gunavathi<sup>(2)</sup>Dr.S.Subramanian<sup>(3)</sup>

#### Abstract

An efficient architecture for a micro pipeline FIFO with a four phase handshaking protocol for low power applications is presented in this paper. The new structure has considerably lower power consumption and transistor count compared to the previously reported FIFOs in literature. This advantage is achieved by using an inverter gate with a weak keeper referred here as NOT Latch instead of using a conventional single transparent latch. The efficiency of the design is demonstrated by applying this to an 8 bit 4 stage FIFO with four phase micro pipeline control. The number of transistors is reduced by more than 43% and power consumption is reduced by 80%. The designs are simulated using T-spice with 0.18µm technology.

#### 1. Introduction

Asynchronous circuits offer high performance, lower power dissipation and better noise properties [1]. However, the design of these circuits is more difficult than that of synchronous circuits, and fewer CAD tools are available for them.

Micro pipelines, introduced in [2], are the most important asynchronous architecture. These architectures combine the advantages of the asynchronous design style with the benefits of pipelining in synchronous systems and have therefore attracted considerable attention in recent years [1-4]. A generic micro pipeline architecture is shown in Fig. 1, where the Request and Acknowledge signals are used for handshaking.
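The four phase handshaking named in the abstract can be illustrated with a small behavioural model. This is only a sketch of the generic return-to-zero protocol between two neighbouring stages, not the proposed NOT Latch circuit; the function name `four_phase_transfer` and the trace format are assumptions made for illustration.

```python
def four_phase_transfer(items):
    """Move each item from a sender to a receiver with a four-phase handshake.

    Returns the received items and the (request, acknowledge) trace.
    """
    req = ack = 0
    received, trace = [], []
    for item in items:
        req = 1                    # phase 1: sender raises Request, data is valid
        trace.append((req, ack))
        received.append(item)      # receiver latches the data
        ack = 1                    # phase 2: receiver raises Acknowledge
        trace.append((req, ack))
        req = 0                    # phase 3: sender withdraws Request
        trace.append((req, ack))
        ack = 0                    # phase 4: receiver withdraws Acknowledge
        trace.append((req, ack))
    return received, trace


data, trace = four_phase_transfer([0x3C, 0xA5])
print(data, trace[:4])
```

Both control signals return to zero after every transfer, which is what distinguishes the four-phase protocol from two-phase (transition) signalling.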



- Ms.H.Mangalam, Assistant Professor in ECE, Sri Krishna College of Engineering and Technology, Coimbatore – 641008 and also a part time research scholar, PSG College of Technology, Coimbatore h\_mangalam@yahoo.co.in
- Dr.K.Gunavathi, Assistant Professor in ECE, PSG College of Technology, Coimbatore – 641004
- Dr.S.Subramanian, Principal, Sri Krishna College of Engineering and Technology, Coimbatore - 641008

## GENERATION OF CRITICAL SUB-GRAPHS FOR TIMING ANALYSIS OF DIGITAL CIRCUITS USING LOGICAL PRUNING

Ms.Poorva Waingankar, Assistant Professor, Electronics And Telecommunication Engg.Department,<sup>1</sup> Ms.Archana Kale, Assistant Professor, Information Technology Department,<sup>1</sup>

#### Abstract

This paper proposes a method for propagation delay estimation of digital circuits based on a graphical model. Connectivity and flow information is used for mapping a digital circuit into a directed graph. Because the performance of logic gates is heterogeneous, the nodes of such a graph are heterogeneous. The approach utilizes the dynamic flow characteristics of logic gates. These features force some components of the circuit to be passive in result generation. During a specific function flow, this dynamic behavior justifies logical pruning of such passive components for performance analysis.

#### 1. Introduction

Performance analysis of digital circuits based on graphical modeling leads to the observation that dynamic input characteristics prevent some sub-graphs from participating in the signal flow leading to the output. In this paper a pruning method based on a logic gate level approach is proposed. Further, shortest path based algorithms are used to evaluate the theoretical maximum propagation delay between input and output for a logical function.

Characteristics of various gates, which are the building blocks of any digital circuit, are heterogeneous (AND, OR, XOR etc). This paper proposes a min / max approach to bring all such sub-structures to a common homogeneous platform, based on number of inputs required and the respective propagation delays for earliest accurate output generation. Further a graphical structure based on circuit connectivity having min/max nodes is presented.

It is observed that, due to basic AND/OR logic, in general only a part of a digital circuit participates in the generation of the output from a given input for a specific function flow. This approach uses the dynamic passive component analysis to prune the graphical representation. The remaining sub-graph is the critical sub-graph for propagation delay analysis.
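One possible reading of the min/max idea and of logical pruning is sketched below. It assumes that an AND gate is decided by its earliest 0 input and an OR gate by its earliest 1 input (a "min" node), and that otherwise the gate waits for its latest input (a "max" node). The data layout, the gate delays and the names `evaluate` and `critical_subgraph` are hypothetical and are not taken from the paper.

```python
CONTROLLING = {"AND": 0, "OR": 1}  # input value that decides the gate on its own


def evaluate(circuit, inputs):
    """circuit: {gate: (kind, delay, fanins)} listed in topological order."""
    value, arrival, active = {}, {}, {}
    for name, bit in inputs.items():
        value[name], arrival[name], active[name] = bit, 0.0, []
    for name, (kind, delay, fanins) in circuit.items():
        ctrl = CONTROLLING[kind]
        deciding = [f for f in fanins if value[f] == ctrl]
        if deciding:
            # "min" node: the earliest controlling input fixes the output,
            # so every other fan-in is passive and can be pruned.
            first = min(deciding, key=lambda f: arrival[f])
            value[name], active[name] = ctrl, [first]
            arrival[name] = arrival[first] + delay
        else:
            # "max" node: the output is known only after the latest input.
            value[name], active[name] = 1 - ctrl, list(fanins)
            arrival[name] = max(arrival[f] for f in fanins) + delay
    return value, arrival, active


def critical_subgraph(active, output):
    """Collect the nodes that remain after pruning, walking back from output."""
    keep, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(active[node])
    return keep


circuit = {
    "g1": ("AND", 2.0, ["a", "b"]),
    "g2": ("OR", 3.0, ["c", "d"]),
    "out": ("AND", 1.0, ["g1", "g2"]),
}
value, arrival, active = evaluate(circuit, {"a": 0, "b": 1, "c": 1, "d": 0})
print(arrival["out"], sorted(critical_subgraph(active, "out")))
```

For this input vector the OR branch is passive, so the estimated delay is 3.0 rather than the static worst case of 4.0.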

<sup>1</sup>Thadomal Shahani Engineering College, Mumbai 50.

## **Design and FPGA implementation of wavepipelined distributed arithmetic based filters**

#### G. Seetharaman<sup>\*</sup>, B. Venkataramani\* and G. Lakshminarayanan\*

#### Abstract

In this paper, a novel scheme for FPGA implementation of a filter using the Distributed Arithmetic Algorithm (DAA) is proposed. To increase the speed of the filter, a suboptimal wavepipelined scheme is proposed between the various combinational blocks of the DA filter. An automated procedure is used for the design of the wavepipelined circuit, and the circuit is self-calibrating in nature. To test the efficacy of the proposed scheme, three 4-tap filters with multipliers of size 11x8 are implemented using the DAA approach on a Xilinx Spartan II XC2S100-SPQ208 device. From the implementation results, it is observed that the wavepipelined DA filter is faster by a factor of 1.31 compared to the non-pipelined DA filter. The synchronous pipelined DA filter. This is achieved with an increase in the number of slices by 25% and in the number of registers by a factor of 4.5. The technique proposed in this paper is also applicable to ASICs and FPGAs from other vendors.

Key words: FPGA, DAA, Wavepipeline, Self-test circuit, signature analyzer

#### 1 Introduction:

Distributed Arithmetic (DA) plays an important role in embedding DSP functions in the LUT based FPGAs and enables the FPGAs to achieve performance which is superior to those of programmable DSPs [1], [2]. DA can be optimized for area efficiency, speed efficiency or for both. For efficient implementation of DA on FPGAs, a number of algorithms such as ROM decomposition technique and offset binary coding have been proposed using pipelining technique in the literature [3]. In this paper, a novel technique for the FPGA implementation of DA filter using wavepipelining is proposed.

#### 2 Pipelined DAA scheme:

A pipelined DAA scheme with 2's complement multiplication with sign extension is shown in Fig. 1. In this scheme, the maximum clock frequency for the input register of the DAA block is determined by the processing time in the combinational logic block consisting of the ROM and the adders. The clock frequency can be increased by introducing pipeline registers at the output of the ROM and at the output of the adders. This scheme is also referred to as synchronous pipelining. In this case, the maximum operating frequency is determined by the largest critical path delay between any two registers. Pipelining increases the operating frequency at the cost of an increase in the number of registers, in routing complexity and in power dissipation.
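The ROM-and-adder structure described above follows from the basic distributed arithmetic identity, which the bit-serial software model below illustrates. It is a sketch of textbook DA for a 4-tap inner product with 2's complement inputs, not the wavepipelined circuit; the coefficient values, the 8-bit input width and the function names are assumptions.

```python
def build_lut(coeffs):
    """ROM contents: entry `addr` holds the sum of coefficients whose bit is set."""
    taps = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << taps)
    ]


def da_inner_product(coeffs, samples, width=8):
    """Compute sum(coeffs[k] * samples[k]) one input bit-plane at a time."""
    lut = build_lut(coeffs)
    mask = (1 << width) - 1
    words = [x & mask for x in samples]        # 2's complement encoding
    acc = 0
    for b in range(width):
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k        # bit b of every input forms the address
        term = lut[addr] << b                  # ROM lookup, then shift
        # The sign bit carries negative weight in 2's complement.
        acc += -term if b == width - 1 else term
    return acc


coeffs = [3, -5, 7, 2]
samples = [10, -20, 127, -128]
print(da_inner_product(coeffs, samples))
```

The multipliers disappear: each clock step needs only one ROM access and one addition or subtraction, which is why the ROM and adder delays set the clock frequency in the scheme above.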

National Institute of Technology, Tiruchirappalli, INDIA bvenki@nitt.edu

## DESIGN AND IMPLEMENTATION OF ECG CODEC IN FPGA.

T.Kalaivani M.S (R).,

Dr.S.Arumugam\*

Department of Electronics and Communication Engineering, Government college of Technology, Coimbatore, Tamil Nadu, India, 641013.

### ABSTRACT:

The main objective of this paper is to build an architecture for hardware implementation of an ECG compressor & de-compressor in FPGA. A wavelet based ECG codec is designed using the SPIHT algorithm [1]. This algorithm is designed using Matlab 6.5, implemented in VHDL & synthesized using a XILINX FPGA. The SPIHT algorithm, which is widely used for 2D signals, is modified for 1D signals & applied to ECG compression. This algorithm is used because of its high efficiency, high speed & simplicity (low complexity). It guarantees an acceptable percentage root mean square difference at high compression ratios with low complexity. This paper also describes an architecture for single chip implementation of the proposed algorithm. The portable ECG unit with local storage can be used for remote & continuous monitoring of an active patient who carries this unit.

Keywords: ECG compression, wavelet signal processing, FPGA.

#### I. INTRODUCTION

A large number of techniques for compression of electrocardiogram signals have been proposed using wavelet transform for the last ten years. The objective of electrocardiogram data (ECG) compression is to reduce the amount of digitized ECG data as much as possible with a reasonable implementation complexity while maintaining clinically acceptable signal quality. In an ambulatory monitoring system, the volume of ECG data is necessarily large, as a long period of time is required in order to gather enough information about the patient. It would be more advantageous when the codec is made portable. An effective ECG data codec is required in many practical applications including: a) Ambulatory recording systems. b) ECG data storage. c) ECG data transmission over telephone line or digital communication network.
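The trade-off between data reduction and signal quality is usually quantified by the compression ratio and the percentage root mean square difference (PRD) mentioned in the abstract. The sketch below uses one common definition of PRD (without mean removal); other variants exist, and the function names are illustrative assumptions.

```python
import math


def prd(original, reconstructed):
    """Percentage root mean square difference between two equal-length signals."""
    error = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    energy = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(error / energy)


def compression_ratio(original_bits, compressed_bits):
    """Ratio of the original data size to the compressed data size."""
    return original_bits / compressed_bits


print(prd([3.0, 4.0], [3.0, 0.0]), compression_ratio(8000, 1000))
```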

\*Principal, Government College of Technology, Coimbatore - 13.

Generally, ECG compression techniques fall under three major categories [2]: a) direct data compression, b) transformation methods, and c) parameter extraction techniques. Examples of direct schemes that attempt to code the signals directly include AZTEC, DPCM, FAN, CORTES and ASEC. Among transformation schemes, the wavelet transform schemes have shown good promise because of their good localization properties in the time and frequency domains. Several wavelet and/or wavelet packet-based compression schemes have been proposed in [3]-[5]. Parameter extraction is an irreversible process in which a particular characteristic or parameter of the signal is extracted. These methods include linear prediction and long-term prediction.

In this paper, a wavelet coder based on the set partitioning in hierarchical trees (SPIHT) compression algorithm is used. This algorithm has achieved notable success in image compression. This paper is organized as follows: Section II is a brief introduction to the wavelet transform & its filter bank implementation. Section III presents the coding algorithm based on 1-dimensional SPIHT. Section IV shows the architecture of the codec in FPGA. Section V concludes the paper.

#### II. WAVELET TRANSFORM

The wavelet transform is defined by

$$\begin{aligned} W(u,s) &= \left\langle f, \psi_{u,s} \right\rangle = \int_{-\infty}^{\infty} f(t) \frac{1}{\sqrt{s}} \psi^* \left( \frac{t-u}{s} \right) dt, \\ \text{where} \quad \psi_{u,s}(t) &= \frac{1}{\sqrt{s}} \psi \left( \frac{t-u}{s} \right) \end{aligned}$$

where the base atom $\psi$ is a zero-average function, centered on zero, with finite energy. The wavelet transform comprises the coefficients of the expansion of the original signal f(t) with respect to a basis $\psi_{u,s}(t)$, each

## FPGA Based Multilayer Feedforward Neural Network and its Applications for Odor Sensing

## M.S.Puranik<sup>1</sup> and Dr. D.C.Gharpure<sup>2</sup>

#### Abstract:

This paper presents an FPGA based Neural Network used for alcohol detection. The design is partitioned into two blocks in VHDL, viz. the control logic block and the data flow block. The control logic supports scalability of neurodes in various layers, making it a versatile system. Some of the resources, such as the MACs and the activation function generator, are shared between the hidden layer and the output layer to minimize the silicon area.

**Keywords (Index):** Artificial Neural Network, FPGA

ANNs have passed their novelty and are being employed successfully as a core component in many PC based applications. The inherent parallelism of ANNs can be exploited fully when they are implemented in hardware, giving the advantage of a tremendous speed-up. Digital hardware offers accuracy, noise immunity and ease of design. The development of programmable logic devices such as FPGAs and of associated hardware design tools has further simplified hardware design. Although FPGAs do not achieve the power, clock rate or gate density of custom chips, they do provide a speed-up of several orders of magnitude compared to software simulation [Hauck, 1998, Dehon, 2000]. This paper presents an FPGA implementation of a Multilayered Feedforward Neural Network (MFNN) and its application to odor sensing.

#### 1. Development of ANN

The MFNN is composed of one input layer, one or two hidden layers and one output layer. Each layer is connected to the next layer through the synaptic weights. The input layer is a buffer that interfaces the outside world to the neural network. The output layer provides the neural network output to the external world. All layers except the input layer operate nonlinearly on the net input, obtained as the weighted sum of the outputs of the previous layer neurodes. The nonlinear function is called the activation function.

We have implemented the ANN in FPGA and applied it to detect the presence of methanol. While developing the hardware, consideration is given to the range-precision for data representation and to activation function generation. These parameters are first optimized through in-house developed software and then employed in the VHDL code. The following sections describe both the implementations and their application to odor detection.
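The layer-by-layer computation described above can be written as a short floating-point reference model. This is only a sketch: the sigmoid activation, the absence of bias terms, the example weights and the name `forward` are assumptions, and the fixed-point data representation used in the hardware is not modelled.

```python
import math


def sigmoid(x):
    """Assumed activation function; the text does not name a specific one."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, layers):
    """layers: one weight matrix per non-input layer, one row per neurode."""
    activations = list(inputs)                 # the input layer is only a buffer
    for weights in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)))  # weighted sum
            for row in weights
        ]
    return activations


hidden = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.25]]   # 2 inputs -> 3 hidden neurodes
output = [[0.3, -0.2, 0.1]]                        # 3 hidden -> 1 output neurode
print(forward([0.8, 0.1], [hidden, output]))
```

The inner `sum` is the multiply-accumulate operation, which is the kind of resource the abstract describes as shared between the hidden and output layers.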

#### 2. MFNN for Odor detection

The odor detection system consists of a gas sensor array, a data acquisition system and the ANN for odor identification. The number of gas sensors used determines the input layer size, while the number of gases to be identified determines the number of neurons in the output layer. The number of hidden layer neurons is decided by the learning performance of the ANN. The digitized data obtained from the sensor array is normalized and stored on the PC. Few of the

1. Department of Instrumentation Science, 2. Department of Electronic Science, Pune University, Pune

## AN APPROACH TO THE CLASSIFICATION OF ANALOG AND MIXED SIGNAL CIRCUITS IN AN OSCILLATION BASED TESTING SCHEME USING WAVELETS

## P.KALPANA<sup>1</sup>, Dr.K.GUNAVATHI<sup>2</sup>

#### Abstract

BIST methods based on oscillation based test have been suggested for testing analog and mixed signal circuits. In such methods the amplitude and frequency of the oscillations are considered for classification. In this paper we propose a wavelet transform based output response analysis technique in oscillation based testing scheme for detecting catastrophic and parametric faults in the analog and mixed signal circuits. Simulation results on bench mark circuits show that wavelet transform has higher fault detection sensitivity than previous methods.

#### 1. Introduction

In oscillation based testing, the circuit under test (CUT) is converted into a circuit that oscillates during testing [1] [2]. This test strategy is used in recent works for testing filters and embedded blocks in SoCs [3] [4]. In these works the frequency and amplitude of the oscillations are taken for fault detection. However, the sensitivity of the test threshold is very low when these measurements are used for parametric fault detection. So in this paper wavelet analysis of the output response is proposed. The paper is organized as follows: Section 2 gives an overview of the wavelet transform, fault detection using wavelet transforms is presented in Section 3, and the results are given in Section 4.

#### 2. Wavelet Transform

Wavelet Transform (WT) is capable of providing the time and frequency resolution simultaneously, hence giving time-frequency representation of the signal. In order to distinguish the two signals, discrete wavelet transform or wavelet packet decomposition techniques can be used to obtain the wavelet coefficients. From the wavelet coefficients, feature vector can be obtained by

<sup>1</sup> Assistant Professor, ECE Department, PSG College of Technology Email: kalpana\_shekar@yahoo.co.in

<sup>2</sup> Assistant Professor, ECE Department, PSG College of Technology Email: kgunavathi2000@yahoo.com

## VLSI IMPLEMENTATION OF I2C INTERFACE (SLAVE PART)

Namita Mujumdar, Akshata Mahale

#### ABSTRACT

Electronics as we know today is characterized by reliability, low power dissipation, extremely low weight and volume, and low cost, coupled with an ability to cope easily with a high degree of sophistication and complexity.

In modern electronic systems there are a number of ICs that have to communicate with each other and with the outside world. To maximize hardware efficiency and simplify circuit design, an I2C interface is provided. The I2C Bus defines the signals, data formats, and protocols necessary for devices to communicate. This paper presents the VLSI implementation of an I2C slave interface. The I2C interface is mainly used to transfer the control information from an external microcontroller to the IC. The synchronous operation of the I2C slave is controlled using a finite state machine. The design is simulated in SPICE3 and the layout of the IC is designed using MAGIC.

The Inter-Integrated Circuit (I2C) Bus is a simple, low-cost serial interface for connecting one or more microcontrollers and peripherals in an embedded system. Providing a serial interface minimizes the interconnections, so the ICs have fewer pins, reducing the PCB size. Three modes of the I2C interface are defined based on the rate of data transfer on the SDA line: data can be transferred at up to 100 Kbits/sec in Standard mode, 400 Kbits/sec in Fast mode and 3.4 Mbits/sec in High-speed mode. In this project a synchronous I2C slave interface using the Standard mode is designed, as the data rate of 100 Kbits/sec suits the application.

The I2C Bus defines two bidirectional wires, SDA for serial data and SCL for serial clock, to carry information between devices. Each device connected to the two lines is identified by a unique address. Each device can either transmit data or receive data. A device operates either as a master or as a slave. A master initiates data transfer, either to or from one or more slaves, and generates clock signals. The device addressed by the master is considered, at that time, to be a slave. Any device acting as a master must be able to both drive and sense SCL and SDA. In this project an external microcontroller acts as the master. The I2C interface in the audio signal processing IC is configured as a slave, which receives the control data only when it is addressed by the master. SCL and SDA are connected to a positive supply voltage (VDD) through pull-up resistors R<sub>p</sub> and are driven by open-drain devices. When the bus is free, both lines are HIGH.
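The open-drain behaviour and the addressing step can be modelled in a few lines. The sketch relies on standard I2C conventions that the text above does not spell out: a START condition is a falling SDA edge while SCL is HIGH, the first byte carries a 7-bit address followed by a read/write bit, and bits are sampled on the rising SCL edge. The function names and the address 0x50 are hypothetical, and this is a behavioural model rather than the finite state machine of the design.

```python
def wired_and(*drivers):
    """Open-drain line with pull-up: HIGH only while every device releases it."""
    return int(all(drivers))


def frame_samples(address, rw):
    """Hypothetical master: (SCL, SDA) samples for START plus the address byte."""
    samples = [(1, 1), (1, 0)]                 # idle bus, then START
    byte = (address << 1) | rw
    for pos in range(7, -1, -1):               # most significant bit first
        bit = (byte >> pos) & 1
        samples += [(0, bit), (1, bit)]        # SDA changes while SCL is LOW
    return samples


def decode_address(samples, own_address):
    """Slave side: return (address_matches, rw_bit), or None if no full byte."""
    started, bits = False, []
    prev_scl, prev_sda = 1, 1                  # a free bus has both lines HIGH
    for scl, sda in samples:
        if prev_scl and scl and prev_sda and not sda:
            started, bits = True, []           # START detected
        elif started and not prev_scl and scl:
            bits.append(sda)                   # sample SDA on rising SCL edge
            if len(bits) == 8:
                address = int("".join(str(b) for b in bits[:7]), 2)
                return address == own_address, bits[7]
        prev_scl, prev_sda = scl, sda
    return None


print(decode_address(frame_samples(0x50, 0), own_address=0x50))
print(wired_and(1, 1), wired_and(1, 0))
```

A slave whose address matches would then acknowledge by pulling SDA LOW during the ninth clock pulse; that step is left out of the sketch.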

## **Design of Syndrome Calculation for Reed Solomon Codes**

Pratap Ghorpade, Marthand Patil, Yamini Sharma

#### Abstract:

Reed-Solomon codecs operate on blocks of data in which information is divided into frames (blocks). The project aims to calculate the syndrome for the specified number of message blocks for the Reed Solomon (RS) code. This project implements an RS (204, 188) code. The syndrome calculation is the first step in decoding an RS code. If the received message r(x) is not erroneous, the syndrome generated is zero. The syndrome calculation is achieved by dividing the received encoded block by the generator polynomial.

#### 1. Introduction

In digital communication, a stream of source data in the form of 0's and 1's will be transmitted over the channel. Disruptions can occur changing the logic level from 0 to 1 and 1 to 0, causing an error. Reed Solomon code is used to correct burst errors that occur during data transmission. The source information is encoded by adding extra information referred to as redundancy that is used to detect and correct the errors.

The decoding procedure for Reed-Solomon codes involves determining the syndrome and the locations and magnitudes of the errors in the received polynomial r(x), and then correcting the errors. The received codeword r(x) is the transmitted codeword c(x) plus the error e(x).

r(x) = c(x) + e(x).

#### 2. Syndrome Calculation

Syndrome calculation involves division of the received message by the generator polynomial. The division using the generator polynomial is based on Galois Field arithmetic. Addition and subtraction in the field reduce to the Exclusive-OR (EXOR) operation, and multiplication and division are built from EXOR and shift operations. The syndromes can equivalently be calculated by substituting the 2t roots of the generator polynomial g(x) into r(x), where 2t = n-k is the number of parity symbols in a given block. These codes are generally designated as (n, k) block codes: n is the code word length and k is the number of information symbols input per block.

The magnitude and the error location are determined using the Euclid and Forney algorithms.

#### 3. Design Overview

The specification used for satellite communication is:

- number of message symbols (k)=188 bytes
- code word length (n) =204 bytes
- symbol width (s)=8 bits
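With these parameters, the syndrome computation of Section 2 can be sketched by evaluating r(x) at the 2t = 16 roots of g(x). The text does not state the field generator polynomial or the first root, so the values below (0x11D and α<sup>0</sup>, which are common choices for this code) are assumptions, as are all function names.

```python
PRIM_POLY = 0x11D   # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1
N, K = 204, 188     # code word length and message length in bytes


def build_tables():
    """Antilog and log tables for GF(2^8)."""
    exp, log = [0] * 510, [0] * 256
    x = 1
    for i in range(255):
        exp[i], log[x] = x, i
        x <<= 1                       # multiply by alpha (a shift) ...
        if x & 0x100:
            x ^= PRIM_POLY            # ... and reduce with EXOR
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def syndromes(received, first_root=0):
    """received: N symbols, highest-degree coefficient first."""
    result = []
    for i in range(N - K):
        root = EXP[(first_root + i) % 255]
        acc = 0
        for symbol in received:       # Horner evaluation of r(root)
            acc = gf_mul(acc, root) ^ symbol
        result.append(acc)
    return result


def generator_poly(first_root=0):
    """g(x) as the product of (x + alpha^i), highest-degree coefficient first."""
    g = [1]
    for i in range(N - K):
        root = EXP[(first_root + i) % 255]
        g = [a ^ gf_mul(b, root) for a, b in zip(g + [0], [0] + g)]
    return g


codeword = [0] * (N - 17) + generator_poly()   # g(x) itself is a valid code word
print(any(syndromes(codeword)))                # no error: all syndromes are zero
codeword[10] ^= 0x5A
print(any(syndromes(codeword)))                # one corrupted byte: non-zero syndrome
```

Addition in the field is a plain EXOR, so each Horner step costs one constant multiplication and one EXOR per received symbol.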

## **ON RAIL-PASSENGER INFORMATION SYSTEM(ORPIS)**

Subhankar Das, Raviraj Vader, Mahesh Kamat

#### Abstract:

The system discussed here enables the passengers to be aware of the details of the journey in terms of the distance left to reach an approaching station and the corresponding time left. These values would be continuously updated. Also included in the objective is an alarm system. This alarm system would be enabled five kilometres before the train arrives at every station.

The design is implemented using the TSMC 0.35 micron technology, 2-metal 1-poly n-well process.

#### 1. Introduction

People always prefer to travel by train. But in most cases, a passenger on the train is unaware of the approaching station, the distance left to reach the next station, and the time left to reach it. This causes all sorts of inconvenience. It would be more comfortable if the passenger got all information related to the approaching station on the train itself. This system enables the passengers to be aware of the details of the journey in terms of the distance left to reach an approaching station and the corresponding time left. A display unit in the train shows all this information. The whole system is divided into two parts. One block calculates the distance. Calculations are repeated every 32 seconds.

The following sections give the details of the whole system. The aspects of implementation are dealt with in Section 2, and in Section 3 the mode of operation is explained. The simulation results are given in Section 4. Section 5 speaks of further developments that can be incorporated. The final section concludes with the theme of the whole system.

#### 2. Implementation

The Block Diagram of the system is as shown in Fig. 1

#### 2.1 The Input Section

The system has been developed based on the assumption that the following inputs are available to us at any time:

- The total distance between the two stations in metres.
- The output from the tachometer is a sinusoidal signal whose frequency varies according to the speed of the train. A Zero Crossing Detector (ZCD) detects the zero crossings and generates the pulses. It is assumed that the pulses thus generated arrive at a rate of 4 pulses for every 1 metre of distance covered. The total number of pulses received is counted with the help of a counter.
- The maximum distance between any two stations is considered to be 128 km, viz. 128,000 metres.
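Under the assumptions listed above, one update step of the system can be sketched as follows. The pulse rate, the 32 second update period and the 5 km alarm distance come from the text; estimating the speed from the pulses counted in the last update window is an assumption about how the time left might be derived, and the function name `update` is hypothetical.

```python
PULSES_PER_METRE = 4
UPDATE_PERIOD_S = 32
ALARM_DISTANCE_M = 5000


def update(total_distance_m, total_pulses, pulses_in_window):
    """Return (remaining distance in m, time left in s or None, alarm flag)."""
    covered = total_pulses // PULSES_PER_METRE
    remaining = max(total_distance_m - covered, 0)
    # Assumed speed estimate: distance covered during the last update window.
    speed = pulses_in_window / PULSES_PER_METRE / UPDATE_PERIOD_S  # metres/second
    time_left = remaining / speed if speed > 0 else None           # None: train stopped
    alarm = remaining <= ALARM_DISTANCE_M
    return remaining, time_left, alarm


print(update(20000, 40000, 2560))
print(update(20000, 64000, 2560))
```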

#### 2.2 The Registers

A set of shift registers (PIPO) is used to store the total distance between a station and the next station on a particular route. These registers are 17 bits wide ($2^{17} = 131072 > 128000$), so that the maximum distance of 128.000 km between any two stations can be represented in metres. Another set of 17-bit registers (PIPO) is used to store the values of the
# **Energy Recovery Low Power CAM Design**

Josemin Bala.G<sup>1</sup> Poonkuzhali.N<sup>1</sup>

Raja Paul Perinbam.J<sup>1</sup>

#### Abstract:

This paper presents a low power Content Addressable Memory (CAM). The CAM uses Pass transistor Adiabatic Logic (PAL), an energy recovery logic family with a single-phase power clock, to reduce the power consumption associated with read, write and compare operations. The circuit simulation of the CAM is done in SPICE and compared with the conventional design. Simulation results for a 16x16 CAM indicate power savings of around 95% at a 10 MHz operating frequency compared to the conventional design. The circuits are designed using 0.6 µm CMOS technology.

<sup>1</sup> Department of Electronics and Communication Engineering, Anna University

# A CMOS ANALOG FRONT-END FOR MEMS SENSOR INTERFACE CIRCUIT

#### K. De, S. Kal

Microelectronics lab, Advanced Technology Centre Indian Institute of Technology, Kharagpur -721302

E-mail: koushik@ece.iitkgp.ernet.in

#### Abstract

A low-power instrumentation amplifier intended for low-frequency piezoresistive microsensor application is presented that achieves very low offset and noise. The key feature with this is the chopper modulation technique combined with a bandpass filter and a matching on-chip oscillator without requiring any external component for trimming. The amplifier gives a linear gain from 0-30dB with a low power dissipation of only 0.8mW. The circuit has been designed in a standard 0.18µm CMOS process.

#### 1. Introduction

The analog front-end for a MEMS sensor interface circuit consists of variable gain amplifiers (VGAs). They are used to precisely amplify the low-level output voltages from the sensors to provide a signal suitable for the subsequent data-acquisition system. The linearity specification for a VGA is very stringent in order to maintain good overall system linearity. VGAs are also an important building block in many applications such as wireless communication systems, disk drives, and imaging circuits.

Generally, VGAs in CMOS technology use a selectable array of capacitors or resistors in the feedback path of an opamp [Shih et al.(1987), Babanezhad and Gregorian(1987)] or are built as a Gilbert cell [Pan and Abidi(1987)]. In MEMS sensor interface applications the VGA plays a crucial role. The input transducer delivers low-level signals to the VGA, often polluted by noise and nonlinearity. The input signal must be amplified linearly to avoid further pollution and intermodulation. The primary source of noise in this case is low-frequency (1/f) noise. Offset in a conventional VGA is also a problem that must be taken care of when interfacing sensor signals. A VGA for this application should therefore be capable of minimizing the offset as well as the 1/f noise while consuming low power. The chopper stabilization technique is the most dominant solution in this regard [Enz and Temes(1996)]. A chopper-stabilized instrumentation amplifier has been designed in 0.18 µm CMOS technology with linear gain control over a range of 0 to 30 dB and a low power dissipation of 0.8 mW.
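Why chopping removes the amplifier offset can be shown with an idealized discrete-time model. This is a sketch of the general principle only: the chopper is a ±1 square wave, the amplifier is an ideal gain with an input-referred DC offset, and a plain average stands in for the output filtering. The numerical values and the name `chopper_amplify` are assumptions and do not describe the designed circuit.

```python
def chopper_amplify(v_in, offset, gain, n_samples=1000):
    """Amplify a slowly varying input while rejecting the amplifier offset."""
    assert n_samples % 2 == 0, "average over whole chopping periods"
    total = 0.0
    for n in range(n_samples):
        chop = 1 if n % 2 == 0 else -1          # +1/-1 chopping square wave
        modulated = v_in * chop                 # input moved to the chop frequency
        amplified = gain * (modulated + offset) # offset is added at baseband
        demodulated = amplified * chop          # signal returns to baseband,
        total += demodulated                    # offset moves to the chop frequency
    return total / n_samples                    # averaging removes the chopped offset


gain = 31.6                                     # roughly 30 dB
plain = gain * (1e-3 + 5e-3)                    # without chopping: offset dominates
chopped = chopper_amplify(1e-3, 5e-3, gain)
print(plain, chopped)
```

The same frequency translation moves low-frequency (1/f) noise of the amplifier away from the signal band, which is why the technique addresses both problems named above.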

#### 2. Realization of Chopper stabilized instrumentation amplifier

A conventional instrumentation amplifier [Fig. 1(a)] is a dedicated differential amplifier with extremely high input impedance. The gain can be

# PROCESSOR SELECTION FOR EMBEDDED SYSTEM DESIGN BY USING SGA

#### S. Ramanarayana Reddy\*, Prof. Parimala.N\*\*

#### Abstract

An Embedded System (ES) is a combination of H/W and S/W, along with some attached peripherals, that performs a designated task or a narrow range of tasks. Generally, all real-time systems are embedded systems. One thing is common to all ESs: the processor. ESs are used in a wide range of applications, so it is the task of the designer to select a suitable processor from the vast list of processors. The design space available to any processor architecture is almost infinite, and there are countless characteristics such as the number of data lines, number of address lines, cost, size, power consumption, operating temperature, size of on-chip memory, number of counters, number of ports, number of interrupts, and support for S/W tools such as compilers, assemblers, RTOS, debuggers etc. These characteristics guide the designer in selecting the processor. Selecting the processor based on these parameters is not a simple task: it is a multidimensional search problem in which each dimension corresponds to a processor characteristic, and an exhaustive search requires tremendous computing resources and time. In our framework of processor selection, we have used the Simple Genetic Algorithm (SGA) for selecting the processor.

Key Words: Processor, GA, and Embedded System Design etc.

## I. Fundamentals of GA

Genetic Algorithms (GAs) have been applied to various optimization problems in different applications. In principle, GAs are adaptive procedures that find solutions to problems by an evolutionary process based on natural selection. In practice, GAs are iterative search algorithms with various applications. The flow chart shows the working of a GA. In general, GAs maintain a population of individual solutions to the problem. Each individual can be represented by a string called a chromosome, which is shown in fig.2. During each iteration, called a generation, the individuals in the current population are rated for their fitness as a solution. The fitness function evaluates the "survival" or "goodness" of each chromosome. By applying the different genetic operators, new populations of candidate solutions are generated. In general, GAs make use of different operators. In this implementation, we use the *selection, crossover, and mutation* operators, which are described below.
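The generation loop described above can be sketched with the three operators named. The binary chromosome, roulette-wheel selection, single-point crossover, the parameter values and the placeholder fitness (the count of 1 bits) are all assumptions for illustration; the encoding of processor characteristics and the actual fitness function are not reproduced here.

```python
import random


def run_ga(fitness, length=16, pop_size=20, generations=40,
           p_cross=0.8, p_mut=0.02, seed=1):
    """Simple GA; fitness must be non-negative and pop_size must be even."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)[:]
    for _ in range(generations):
        # Selection: roulette wheel, probability proportional to fitness.
        weights = [fitness(c) + 1e-9 for c in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        nxt = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            if rng.random() < p_cross:           # crossover: swap the tails
                cut = rng.randrange(1, length)
                a[cut:], b[cut:] = b[cut:], a[cut:]
            nxt += [a, b]
        for chromosome in nxt:                   # mutation: flip bits at random
            for i in range(length):
                if rng.random() < p_mut:
                    chromosome[i] ^= 1
        pop = nxt
        candidate = max(pop, key=fitness)
        if fitness(candidate) > fitness(best):   # remember the best individual seen
            best = candidate[:]
    return best


best = run_ga(sum)
print(best, sum(best))
```

Passing a different `fitness` callable is the only change needed to score chromosomes against another objective.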

<sup>\*</sup>Lecturer, Dept of Computer Science & Engineering, Indira Gandhi Institute of Technology, GGSIP University, Delhi-06. Email: <u>rammallik@yahoo.com</u>

<sup>\*\*</sup> Prof. Parimala.N, School of Computer and Systems Science, JNU, New Delhi-67.

# High Speed Circuit Design for 4-bit and 8-bit Unsigned Integer Squarer

Y.V.Ramana Rao,(YVRamana Rao@hotmail.com),

N. Venkateswaran, (<u>nvenkat@svce.ac.in</u>) and S. Sundar(svsundarl@yahoo.co.in) Department of ECE, College of Engineering (CEG), Anna University, Chennai -600 025

Abstract— Squaring of unsigned integer numbers is an important arithmetic operation in all DSP tasks especially in image processing. Regular arrays and Wallace tree are the two methods that are commonly used to perform unsigned integer multiplication. The use of multipliers for performing this squaring operation suffers from latency. In this paper, simplified circuits are proposed for performing 4-bit and 8-bit squaring operations. The advantage of this dedicated circuit is that it requires less hardware than what is obtained when the same operation is implemented using multipliers. The synthesis has been performed using Xilinx FPGA: SPARTAN2E. The clock frequency obtained for a 4-bit squarer is 125MHz and an 8-bit squarer is 60MHz.

Keywords: FPGA, Fast Squarer.

1. Introduction: The standard binary integer multiplication is a fundamental operation in signal processing. Another important component in image processing applications is the squaring of integer numbers. The squaring operation can be implemented using a multiplier in which both the multiplier and the multiplicand are the same number. The Wallace tree [Wallace, 1964] is a popular architecture for the multiplier. It is preferred in VLSI applications because of its regularity.

The speed of the multiplier unit has always been a critical issue [P. Song and G. De Michelli, 1991] [Z. Wang, G. A. Jullien, and W. C. Miller, 1994] [3] [4]. Several well-known schemes [5] [6], with the objective of improving the multiplier's speed, have been developed. Important optimizations of the speed, area and power consumption of circuits can be achieved by using dedicated circuits instead of general ones whenever possible. Multiplication of two numbers in which the multiplier and the multiplicand are the same can be performed differently than with a full multiplier. This usually leads to smaller, faster and less power-consuming circuits. In this paper, hardware circuits for squaring unsigned 4-bit and 8-bit integers are proposed.

The number of Full Adders (FA) and Half Adders (HA) used for a 4-bit multiplier using the Wallace tree is 5 and 3 respectively. For an 8-bit multiplier, the Wallace tree uses 38 FA and 15 HA. The number of FA and HA in the proposed 4-bit squarer architecture is 2 and 5 respectively. The number of FA and HA in the proposed 8-bit squarer architecture is 10 and 23 respectively. The proposed 4-bit and 8-bit circuits are implemented on a Xilinx FPGA.
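The reason a dedicated squarer needs less hardware can be seen from the partial products: when both operands are the same number, a<sub>i</sub>·a<sub>i</sub> = a<sub>i</sub>, and the two equal cross terms a<sub>i</sub>·a<sub>j</sub> and a<sub>j</sub>·a<sub>i</sub> fold into one term shifted left by one position. The sketch below checks this general identity in software; it is not the proposed adder arrangement, and the function names are hypothetical.

```python
def square_folded(x, bits=4):
    """Square an unsigned integer from its folded partial products."""
    a = [(x >> i) & 1 for i in range(bits)]
    total = 0
    for i in range(bits):
        total += a[i] << (2 * i)                    # a_i AND a_i is just a_i
        for j in range(i + 1, bits):
            total += (a[i] & a[j]) << (i + j + 1)   # 2 * a_i * a_j as a single term
    return total


def partial_product_count(bits):
    """(terms in a full multiplier array, terms after folding)."""
    return bits * bits, bits + bits * (bits - 1) // 2


print(all(square_folded(x, 4) == x * x for x in range(16)))
print(partial_product_count(4), partial_product_count(8))
```

Fewer partial-product bits to add means fewer adder cells, which is consistent with the lower FA and HA counts quoted above for the proposed circuits.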

This paper is organized as follows. The design of the squarer algorithm is

# **CMOS Implementation of Cellphone Interface**

#### Anand Yaligar, Aditya Desai, Vinayak Bhat

#### Abstract:

Most high-end mobile phones give users a whole range of options to transfer the data in the phones to other devices such as PCs, laptops, PDAs etc. But quite often, these features come at a very high cost. A vast majority of the cellular phones do not have any option for data transfer. Hence this project aims at developing an interface for the mobile phone to a standard bus such as the Universal Serial Bus (USB).

Using a Java program, the data to be transferred is converted into ringtone (audio) format. The converted data is then accessed through the headphone jack available on the mobile phones. Conversion of digital data in the cellular phones into audio frequency waves forms the basis of this project.

#### 1. Introduction

This paper presents an innovative approach towards this objective. It involves the use of a J2ME (Java 2, Micro Edition) program to convert the data into an FSK-modulated signal. The CMOS implementation then loads this signal into a shift register for use in interfacing applications. The system uses ASCII code to represent the data. Hence it can be processed by most devices which understand the ASCII format.

In this paper, Section 2 provides an overview of the entire system. Section 3 describes the implementation details of the system. Section 4 looks into the simulation results and Section 5 provides the conclusion & the possible extensions to the design.

#### 2. Overview

The entire system is described by the block diagram shown in Fig 1.

#### 2.1 Mobile Programming Using J2ME

Mobile devices have many restrictive features which mandate the need for a special programming language. Sun Microsystems Inc. has released a special edition of the Java programming language known as Java 2 Micro Edition (J2ME). A vast majority of the mobile device manufacturers support this language, hence it is a natural choice for the system. The program has been compiled and verified using the Java Wireless Toolkit [1].

#### 2.2 The Concept Of Ring Tones.

The ringtone, available in all mobile phones, is the chosen method of data transfer in this project. This is mainly because ringtones are easily customizable. A ringtone is essentially a sequence of individual tones. Each individual tone has a certain frequency and note duration associated with it [2]. The rate at which the entire sequence is played is determined by the tempo of the tune. These features can be exploited for serial data communication. Keeping the note duration constant fixes the data rate. By making use of only two frequencies, with the higher frequency representing a logical high and the lower frequency a logical low, data in FSK format is generated. With some modifications, the ringtone can thus be adapted for use in data transfer.
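The encoding idea described above can be sketched in a few lines. The example below is written in Python rather than J2ME and is only illustrative: the two frequencies, the note duration, the 8-bit framing and the MSB-first bit order are assumptions, not values taken from the paper, and the function name is hypothetical.

```python
def ascii_to_fsk_tones(text, f_low_hz=1000, f_high_hz=2000, note_ms=50):
    """Map ASCII text to a list of (frequency_hz, duration_ms) tones.

    Assumed framing: 8 bits per character, most significant bit first,
    high frequency for a logical 1 and low frequency for a logical 0.
    """
    tones = []
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("only ASCII characters are supported")
        for bit_pos in range(7, -1, -1):
            bit = (code >> bit_pos) & 1
            # Constant note duration fixes the data rate.
            tones.append((f_high_hz if bit else f_low_hz, note_ms))
    return tones
```

With these assumed defaults, each character occupies eight notes of 50 ms, so the link carries 20 bits per second; a shorter note duration would raise the rate if the phone's tone player supports it.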

#### 2.3 Circuit Design Details.

The circuit implementation details are listed out in Section 3. As shown in Fig.1, the functional blocks in the circuit are as follows:

# VLSI IMPLEMENTATION OF CANCELING MATERNAL ECG FROM FETAL ECG

#### N.J.R.Muniraj M.E,\* Dr.R.S.D.Wahida Banu, Ph.D \*\* M.Ramva Sri.B.E. \*\*\*

#### ABSTRACT

Abdominal electrocardiograms make it possible to determine the fetal heart rate and to detect multiple fetuses, and are often used during labor and delivery. The background noise due to muscular activity and fetal motion, however, often has an amplitude equal to or greater than that of the fetal heartbeat. A still more serious problem is the mother's heartbeat, which has an amplitude 2 to 10 times greater than that of the fetal heartbeat. The maternal ECG (MECG) is the main source of interference in fetal ECG (FECG) monitoring. The MECG is detected at all electrodes placed on the mother's skin (thoracic and abdominal). In the case of multi-fetal pregnancies, the traditional adaptive filtering technique provides a "maternal clean" signal consisting of the two fetal ECG signals. The noise was found to be too strong for the algorithm (and the naked eye) to notice any fetal heart signal [1].

This paper briefly describes the implementation of adaptive noise cancellation algorithms, such as the LMS and RLS algorithms, using MATLAB 6 (R12). A version suitable for real-time implementation, which can be used during measurements, is being developed using VLSI. The best solution in the case of multiple fetuses is BSS filtering, which has been successfully implemented in MATLAB.

**KEYWORDS: Adaptive Filters, ECG Extraction,** 

#### 1. INTRODUCTION

The monitoring of FECG has clinical importance. If the physician could obtain a reliable reading of the FECG, problems in the fetal heart activity could be detected even before the baby is born [6]. The procedure for obtaining the FECG should be noninvasive. The fetal heart is small, so the electrical current it generates is very low. In order to record the FECG, electrodes are placed on the maternal abdomen as close as possible to the fetal heart. The FECG may be acquired by placing a number of electrodes around the general area of the fetus and hoping that at least one of the electrodes will record the FECG with a high enough SNR. Besides the problem of electrode placement, noise from electromyographic activity affects the recording because the fetal signal has such a low voltage. Another interfering signal is the maternal ECG (MECG), which can be 5-10 times higher in its intensity and ability to induce surface potentials [1]. The MECG affects all the electrodes, both those placed on the chest (thoracic electrodes) and those placed on the abdomen (abdominal electrodes) of the mother. Because the FECG is a very weak signal, an electrode placed on the thorax of the pregnant woman will hardly record any of it, if at all.

#### 2. SINGLE FETUS EXTRACTION

This fact implies that an adaptive cancellation algorithm may be employed. An illustration of this conventional approach is given in figure 1.

Alternatively, four ordinary chest leads can be used to record the mother's heartbeat and provide multiple reference inputs to the canceller. A single abdominal lead is used to record the combined maternal and fetal ECG, which serves as the primary input. A multichannel adaptive noise canceller is used in this case.
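The single-reference form of this canceller can be sketched with the LMS update. The code below is a minimal illustration in Python, not the MATLAB or VLSI implementation of the paper: the filter length, the step size `mu`, and the use of sinusoids as stand-ins for the maternal and fetal signals are all assumptions, and the function names are hypothetical.

```python
import math


def sine(freq_hz, amp, n, fs=250.0):
    """Synthetic stand-in signal (a real ECG is not sinusoidal)."""
    return [amp * math.sin(2.0 * math.pi * freq_hz * k / fs) for k in range(n)]


def lms_cancel(primary, reference, n_taps=4, mu=0.002):
    """Adaptive noise canceller using the LMS algorithm.

    primary   : abdominal lead (fetal signal plus maternal interference)
    reference : thoracic lead (maternal signal only)
    Returns the error signal, which serves as the fetal estimate.
    """
    w = [0.0] * n_taps    # adaptive filter weights
    buf = [0.0] * n_taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))  # maternal estimate
        e = d - y                                   # fetal estimate
        w = [wi + 2.0 * mu * e * xi for wi, xi in zip(w, buf)]
        out.append(e)
    return out
```

A quick check is to build `primary` as a weak 2.3 Hz sinusoid plus a scaled copy of a 1.2 Hz reference: after the weights settle, the output should be much closer to the weak component than the raw abdominal signal is. The multichannel case sums one such filter per chest lead before forming the error.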

<sup>\*</sup> Research Scholar, Sona College of Technology, Salem-636 005.

<sup>\*\*</sup> Asst. Professor, College of Engineering, Salem-636 011.

<sup>\*\*\*</sup> Student, Sona College of Technology, Salem-636005.

# Parallelizing a Statistical Capacitance Extractor

Nidhi Sawhney, Shabbir Batterywala, Narendra Shenoy and Richard Rudell {nidhis,battery,nshenoy}@synopsys.com, rick@rudell.net Synopsys (India) Pvt. Ltd., 6th Block, Koramangala Bangalore 560 095

Abstract—In this paper we demonstrate an effective parallel implementation of a capacitance extraction method. The extraction method is based on a statistical technique which computes self and coupling capacitances using Monte Carlo integration. The sampling process in the Monte Carlo integration is implemented as Floating Random Walks, which lends itself to easy parallelization. We extend our parallel implementation of random walks to harness the unused computation power of networked computers, usually available in every organization. We design our algorithms to exploit these computational resources in a non-intrusive way. Our implementation scales linearly to networks with the number of computers well exceeding 100. With the increasing performance of computers and high-speed networks, and the decreasing cost of these resources, this makes an effective case for such a parallel capacitance extraction method. Experimental results are provided to demonstrate very accurate capacitance extraction of a large number of nets in real VLSI designs.

#### I. INTRODUCTION

In deep sub-micron (DSM) technologies the problems related to parasitic capacitance extraction have become very challenging. Interconnect capacitances, which could earlier be computed by simplified methods, are now more difficult to compute. This is mainly due to the increased effect of fringe capacitances. Simplified formulae or table-lookup-based methods no longer provide the accuracy required for analysis of VLSI nets. This is clearly unacceptable even for non-critical nets in any design. There is a pressing need to use "true" 3D methods for computing capacitances of as many nets as possible. However, it is well known that 3D methods take large extraction times and many of them do not scale to solve the extraction problems of large nets. It is desirable to have a 3D extraction methodology which computes these interconnect capacitances faster and more accurately, possibly using multiple networked computers.

Accurate 3D capacitance extraction methods can broadly be classified into the Boundary Element Method (BEM), Finite Element Method (FEM), Fast Multipole Method (FMM) and Statistical Monte Carlo Method. All these methods solve fundamental electrostatic equations by employing different numerical techniques. Capacitances are computed indirectly by evaluating a charge integral. FEM-based methods evaluate this charge integral by discretising the entire volume of the extraction problem. This results in large linear systems to be solved, which makes the method useful only for small geometric patterns. BEM-based methods discretise only the conductor boundaries and dielectric interfaces, thereby resulting in smaller linear systems. Efficient heuristics are suggested by Nabors and White [1] to further reduce the size of these linear systems. These are fast multipole methods. Statistical methods use Monte Carlo techniques to evaluate the charge integral. Instead of discretising volumes or boundaries, they evaluate the charge integral by efficiently sampling the integrand.

Among all these methods, Monte Carlo methods have emerged as the most successful in computing self and coupling capacitances for interconnects with sign-off accuracy. Their main advantages are scalability to large designs and higher accuracy due to statistical error cancellation. The foundations of these methods stem from the work of Muller [2], which describes Monte Carlo methods to solve Dirichlet problems. The first algorithmic description of Monte Carlo methods for capacitance extraction problems is given in the pioneering work of Le Coz and Iverson [3], which draws statistical samples for the integrand in the space around the conductors. Efficient extensions of this algorithm to handle multiple dielectrics are given by Schlott [4] and Le Coz and Iverson [5]. The capacitance numbers computed by these methods have an associated statistical uncertainty which is a good indicator of the error in capacitance values. There is a usual tradeoff between accuracy and performance.
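To make the random-walk idea concrete, the following toy sketch solves a Dirichlet problem for Laplace's equation on the 2D unit disk with a walk-on-spheres scheme, in the spirit of the Monte Carlo approach attributed to Muller above. It is not the floating random walk extractor of this paper, which works in 3D around conductors; the domain, the stopping tolerance `eps` and the function names are assumptions made for illustration.

```python
import math
import random


def walk_on_spheres(x, y, boundary_value, rng, eps=1e-3):
    """One random walk from (x, y) to the boundary of the unit disk."""
    while True:
        r = 1.0 - math.hypot(x, y)  # distance to the boundary
        if r < eps:
            norm = math.hypot(x, y)
            # Snap to the nearest boundary point and read its value.
            return boundary_value(x / norm, y / norm)
        # Jump to a uniformly random point on the largest circle
        # centred at the current position that fits inside the domain.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += r * math.cos(theta)
        y += r * math.sin(theta)


def estimate_potential(x, y, boundary_value, n_walks=5000, seed=0):
    """Return (mean, standard error) of the potential at (x, y)."""
    rng = random.Random(seed)
    samples = [walk_on_spheres(x, y, boundary_value, rng) for _ in range(n_walks)]
    mean = sum(samples) / n_walks
    var = sum((s - mean) ** 2 for s in samples) / (n_walks - 1)
    return mean, math.sqrt(var / n_walks)
```

For the boundary condition u = x, the exact interior solution is u(x, y) = x, so an estimate at (0.3, 0.2) should land near 0.3 within a few standard errors. Because each walk is independent, batches of walks can be handed to different machines and only the partial sums need to be communicated, which is the property that makes this class of methods attractive for parallel execution.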

With the availability of high performance computers and high speed networks at decreasing cost there is a case for parallel implementation of all capacitance extraction methods. Yuan and Banerjee [7] and Aluru et. al. [8] discuss parallelization of extractors based on FMM methods. A comparative study of parallelizing capacitance extraction using FMM and BEM methods is also done by Yuan and Banerjee [9]. Monte Carlo methods though most suitable for parallel implementation are not yet discussed in the available literature. Our work focuses on this aspect. We discuss details of a parallel implementation for a Monte Carlo capacitance extractor well suited to exploit multiple networked computers.

There are many computation models for parallel programming and many hardware platforms which support these models. The key distinguishing characteristic is the ratio between the computation time for each parallel task (the "task size") and the communication overhead for each task. One extreme is a shared-memory multiprocessor with cache coherency, and the opposite extreme is the massive distributed computation efforts, mostly over 56K baud modems, exemplified by the SETI [10] and GIMPS [11] projects. While a shared-memory multiprocessor can offer communication times in the 100 nanosecond range with byte granularity, the cost of these systems grows super-linearly with the number of processors, and they are generally considered impractical for more than 100 compute nodes. Likewise, there are few applications which fit well into the SETI or GIMPS model of computing for many hours, days, or weeks before communication is required.

Some applications require large amounts of communication for each "task" i.e., either the communication requirements are large or the divisible unit of the task size is small. For these applications, in the "cost is no object" market, super-massive parallel machines are being built with custom integrated circuits to interconnect the nodes (typically in a grid or mesh fashion). A lower-performance, but still costly solution, is to use high-speed networks such as Infiniband [12] to connect the machines in a cluster.

A new middle ground between these extremes is emerging, which consists of large clusters of compute nodes connected by high-speed, but generic (hence low-cost) networks. Copper Gigabit Ethernet has reached a cost of US \$50/machine and US \$20/port on a switch. Even with the overhead of the operating system and the TCP/IP software stack, communication of a 1K block of data takes less than 20 microseconds. For a cluster of 1,000 machines on a single subnet, this implies that near-100% parallel efficiency can be achieved as long as the compute interval per task is a small multiple of 20 milliseconds (e.g., 90% efficiency for a 200 millisecond task size). These parameters scale so that 90% efficiency is achieved for communication requirements of 10K bytes per task for 100 processors with a 200 millisecond task size, or communication of 10K bytes per task for 1000 processors with a task size 12 seconds.
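The first two efficiency figures above can be reproduced with a simple back-of-the-envelope model. The sketch below assumes that the per-task communication of all machines is serialized on the shared subnet, so the overhead seen per task interval is the machine count times the per-task transfer time; this model and the function name are assumptions introduced here, not something the paper states explicitly.

```python
def parallel_efficiency(task_s, n_machines, comm_s_per_task):
    """Fraction of time spent computing under a serialized-network model.

    task_s          : compute time per task, in seconds
    n_machines      : number of machines sharing the subnet
    comm_s_per_task : network time to move one task's data, in seconds
    """
    overhead_s = n_machines * comm_s_per_task
    return task_s / (task_s + overhead_s)


# 1K block (20 microseconds), 1000 machines, 200 ms tasks.
eff_1k = parallel_efficiency(0.2, 1000, 20e-6)
# 10K bytes (assumed 10x the 1K time), 100 machines, 200 ms tasks.
eff_10k = parallel_efficiency(0.2, 100, 200e-6)
```

Both cases give roughly 91% under this model, in line with the quoted 90% figure.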

We refer to this middle ground as "fine-grained network computing" and it is this middle ground that we are targetting. In particular, our goal has been to decompose the capacitance extractor into a parallel application where the target task size per task is 1 second and the com-

# Process Sensitivity Evaluation of 90nm CMOS Technology With Gate-to-Source/Drain Overlap Length as a Device Design Parameter

#### Abstract:

The impact of gate-to-source/drain overlap length on process sensitivity is evaluated by taking variations in gate length during gate patterning, and variations in halo dose during ion implantation. Two different 90nm transistor designs, namely one with 0nm gate-to-source/drain overlap length and the other with 20nm gate-to-source/drain overlap length are compared. The saturation current of the device for device robustness and stage delay of an inverter for circuit robustness are taken as performance metrics. It is demonstrated that the overlap length should be made as small as possible, in spite of increase in channel series resistance and a consequent degradation in drive current, in order to obtain consistent device performance against process variations across the wafer and from wafer to wafer.

#### I. INTRODUCTION

As device dimensions continue to scale down with each new generation of CMOS technology, in the quest for faster and more complex integrated circuits, process variations are playing an increasingly important role. With the scaling of feature sizes progressing more rapidly than the scaling of process tolerances, the statistical variations of device characteristics are becoming increasingly significant. Hence, if a device/circuit is designed to achieve specific nominal values of performance, a distribution of actual performance is expected in a population of chips. Thus, there is an urgent need to tighten this distribution, both at the device design level and at the circuit design level, to achieve robust device/circuit performance in the face of process variations.

Traditionally, a minimum gate-to-source/drain overlap length of about 20nm was recommended for the 0.25µm technology generation node, from the source/drain series resistance consideration [1]. However, it was recently demonstrated that a gate-to-source/drain overlap length of 0nm is preferred in the sub-100nm regime from the digital and analog circuit performance perspective [2]. In another study, on process parameter variation, it is shown that variations in gate length and halo implantation dominate over those of the other processes [3,4]. It was also indicated that device design techniques that utilize bulk processes (like metal gate), as opposed to localized processes (like pocket halo), are favourable for lowering variations in device/circuit performance [4]. The 0nm overlap design uses a lower pocket halo dose and hence one would expect it to perform better than the 20nm overlap design. In this paper, we have specifically addressed this issue. We have investigated whether the 0nm transistor design is also robust to process variations (gate length and halo implant) in addition to providing high performance.

Process/device simulation is considered appropriate for the study of process sensitivity as it enables precise control of process variations that is difficult to achieve experimentally. A commercial Technology CAD software from Integrated Systems Engineering (ISE) [5] has been used for this work. Gate length variations and halo dose variations, representative of process variations, are simulated in the DIOS process simulator. We perform device simulations, using DESSIS, to study the impact of gate length variations and halo dose variations for 0nm and 20nm overlap devices and to characterize variations in the drive current metric. Then we carry out rigorous mixed-mode simulations on two-stage inverter circuits to characterize variations in the stage delay metric. We propose that the overlap length be made as small as is supported by the process technology, to ensure consistent device/circuit performance in spite of process variations.

The next section discusses the simulation methodology used for this study. Section III elaborates on the results obtained and Section IV concludes with summary.

#### II. SIMULATION METHODOLOGY

Nominal NMOS and PMOS devices, labeled D0, of gate length 90nm are designed and optimized with halo and SSRC doping profiles to produce an off current of 2nA/µm at  $V_{dd} = 1V$ . Then, assuming a 10% process variation in gate length,  $L_p$  fast ( $L_p$=80nm) NMOS and PMOS

# RESPONSE SURFACE METHODOLOGY BASED DESIGN APPROACH FOR YIELD ENHANCEMENT OF ANALOG INTEGRATED CIRCUITS

Suresh Nalluri<sup>1</sup>, A. P. Shiva Prasad<sup>2</sup>

#### Abstract

The unavoidable variations in the manufacturing process can transform the most innovative integrated circuit design into a failure. To account for these variations, designing for high manufacturing yield is as important as designing for good electrical performance [3]. When the goal of an IC design is high yield, it should be based on a robust design that is insensitive to manufacturing process variations. To account for process variations we need to simulate their effect on the circuit. Statistical techniques are needed to optimize the circuit performance against these process variations. Increasing the parametric yield is a challenge that is within the circuit designer's realm. In this paper, we describe a methodology which merges two fields of study, statistics and analog circuit design, into a unified method to simulate the effect of process variances, incorporate them into the design and optimization of analog integrated circuits, and finally achieve an enhanced parametric yield. The process variations are simulated by using the SMOS [1] model incorporated in APLAC [2]. The statistical techniques that are used in this paper are Response Surface Methodology (RSM) and Design of Experiments (DOE) [10]. A simple cascode current mirror with level shifter is designed to demonstrate the Response Surface Methodology based statistical design approach for parametric yield enhancement of analog integrated circuits.

Key Words: Statistical Design, Response Surface Methodology, DoE

#### **1** Introduction

Due to inherent fluctuations in any integrated circuit manufacturing process, the functional yield is always less than 100%. As the complexity of VLSI chips increases and the dimensions of VLSI devices decrease, the sensitivity of performance to process fluctuations increases, thus further reducing the functional yield as shown in Fig. 1. The parametric yield is going down with technology scaling. This reduced yield can affect the profitability of the product.

Designers often use the corner parameters in their designs to counter the effects of process variations. This approach often leads to over design. The normal design cycle involves initial circuit design; send it for fabrication; measure the

<sup>1</sup> Dept of ECE, IISc, Bangalore, nsuresh@protocol.ece.iisc.ernet.in

<sup>&</sup>lt;sup>2</sup> Professor, Dept of ECE, IISc, Bangalore, aps@ece.iisc.ernet.in

# STUDY OF SINGLE CRYSTALLINE SILICON (100) SURFACE TOPOGRAPHY ETCHED IN KOH SOLUTION

## K. Biswas\*, S. Das, K. Dey, D. K. Maurya and S. Kal

#### Abstract

High precision bulk micromachining based on wet anisotropic etching of silicon is an essential step for the fabrication of MEMS devices. Silicon etch rate along with surface morphology of both p-type and n-type silicon <100> have been investigated at different KOH solution concentrations (10 wt%, 22 wt%, 33 wt% and 44 wt%) and at various temperatures (50°C, 60°C, 70°C and 80°C) in the present study. The silicon surface roughness was studied with the help of Scanning Electron Microscopy and a Dektak<sup>5</sup> surface profilometer. The silicon etch rate and surface morphology strongly depend on the dopant type of silicon, etching solution concentration and etching temperature. The KOH concentration that maximized the silicon etch rate is 22 wt% at 80°C, whereas the smoothest silicon surface was obtained using 33 wt% KOH at 80°C.

#### 1. Introduction.

Anisotropic etching of silicon is one of the key process steps for fabricating three-dimensional microstructures out of a silicon substrate [Kovacs et al. (1998), Elwenspoek and Jansen (1998)]. The fabrication process of many micromechanical sensors often requires bulk micromachining wherein an aqueous silicon etchant is used to accomplish the task. Anisotropic silicon etchants include inorganic aqueous solutions of KOH, NaOH, RbOH, CsOH, NH4OH, EDP, Hydrazine and organic etchants like Tetramethyl ammonium hydroxide (TMAH). The most commonly used silicon etchants are KOH [Bean (1978), Barycka and Zubel (1998), Zubel (2000)], EDP [Finne and Klein (1967), Reisman et al. (1979)] and Tetramethyl ammonium hydroxide (TMAH) [Schnakenberg et al. (1991), Tabata (1996)]. Silicon anisotropic etching in KOH solution under a variety of etching conditions has been well investigated by several research groups [Seidel et al. (1990), Sato et al. (1998), Glembocki et al. (1991)]. The etching properties of silicon in TMAH solutions at different ambients have been studied [Tabata et al. (1992), Sato et al. (1998)]. But potassium hydroxide (KOH) solution is a widely used silicon etchant in the fabrication of microsensors and microstructures as an alternative to TMAH and EDP. KOH has excellent etch characteristics and low toxicity, and is safe to use and handle. One of the disadvantages associated with micromachining of silicon using KOH is that this etchant is not compatible with conventional integrated circuit fabrication processes due to the presence of metal ions in it. Also, KOH etching produces a very rough etched silicon <100> surface. The surface roughness, which is higher at lower concentrations of KOH solution, reduces the silicon etch rate if not controlled.

Microelectronics Lab, Advanced Technology Centre, Department of E & ECE I.I.T Kharagpur-721302, India.

\* Corresponding author. Tel.: +91-3222-28-1479; Fax: +91-3222-255303 E-mail address: kanishka@ece.iitkgp.ernet.in

# CHARACTERISATION OF GATE OXIDE LEAKAGE CURRENT OF NANO-MOSFET USING GREEN'S FUNCTION APPROACH

S. Dasgupta and Ritambhar Roy<sup>\*</sup> Department of Electronics Engineering, Indian School of Mines, Dhanbad-826 004, India Email: sidindia2000@yahoo.com <sup>\*</sup>Infosys Technologies, Mysore

#### Abstract

In this paper, a novel approach to evaluate the OFF-state leakage current in a nano-MOSFET using the Green's Function formalism is presented. At such low oxide layer dimensions the leakage current can be substantial, and hence the OFF-state power dissipation of the device and of circuits based on these devices would be appreciably high. The results obtained have been compared and contrasted with reported experimental results for the purpose of validation. It is seen that the model developed matches the reported experimental results very well. The paper provides model developers with a general recipe for addressing issues of partition and enables accurate parameter extraction for MOSFETs in sub-100 nm technology. It also helps VLSI designers and device physicists to design and develop future nanoscale MOSFET technologies with very low static power dissipation.

#### 1. INTRODUCTION

For more than 30 years, the integrated circuit industry has followed a steady path of constantly shrinking device geometries and increasing chip size. Each new generation has approximately doubled logic circuit density and increased performance by about 40% while quadrupling memory capacity. The semiconductor industry itself has developed a roadmap (NTRS and ITRS) [1]-[2] based on this idea. Aggressive scaling of MOS devices in each technology generation has resulted in higher integration density and performance. Simultaneously, supply voltage scaling has reduced the switching energy per device. However, the leakage current has increased drastically with technology scaling [3] and leakage power has become major contributor to the total power. Hence the estimation of leakage current is necessary to design low-power devices and logic.

There are a number of issues associated with continued MOSFET scaling that represent challenges for the future and ultimately fundamental limits. The first issue is the gate dielectric thickness [4]. Practical MOSFET structures generally require the gate dielectric thickness to be a few percent of the channel length. The two-dimensional effects in a logic device that arise because of a difference in permittivity between the silicon channel and the gate insulator make the dielectric requirements very demanding in logic devices [5]. The gate electrode itself also presents some significant challenges. Polysilicon has been used for more than 25 years as the gate electrode material. However, decreasing its resistivity implies increasing the doping levels in the polysilicon, which minimizes the resistivity of the gate electrode and helps avoid polysilicon depletion effects. But this approach is limited by dopant solubility limits and by dopant outdiffusion from the poly through the thin gate dielectric and into the silicon. This latter problem is particularly acute with P<sup>+</sup> gates because boron diffuses rapidly through SiO<sub>2</sub>. The likely solution is again new materials, namely metal gate electrodes. But there are no known material solutions that are known to work in manufacturing.

#### Capacitance Sensing Techniques for MEMS Gyroscope

Rajesh Sangati, Sowjanya Syamala, Navakanta Bhat

Authors are in Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, 560012, INDIA email: rajesh@sindhu.ece.iisc.ernet.in , sowjanya@sindhu.ece.iisc.ernet.in, navakant@ece.iisc.ernet.in

Abstract—Two novel capacitance sensing circuits are proposed which sense a capacitance change of 1fF with a resolution of 0.1fF. While one of the proposed schemes measures the peak change in the value of the capacitance using differential sensing, the other continuously tracks the change in capacitance. The functionality of the sensing circuits was verified with a test structure which emulates a Gyroscope by switching unit capacitances using spaced inverter thresholds.

Keywords—Capacitance sensing, MEMS, Gyroscope, differential sensing, amplitude modulation

#### I INTRODUCTION

Micromachined Gyroscopes provide high accuracy rotation measurements, leading to reliable rotational rate sensors for applications in automatic navigation and space. These sensors measure angular velocity by utilizing the Coriolis force, a direct consequence of a body's motion in a rotating frame of reference. The effect of the Coriolis force is to produce a displacement of the body in the direction perpendicular to the plane of the angular velocity and the motion of the body. Since the body is a rectangular plate of a parallel plate capacitor, this displacement can be sensed electrically as a change in the capacitance.

$$\Delta C = \frac{\epsilon A \, \Delta d}{d^2} \tag{1}$$
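As a quick numerical check of Eq. (1), the sketch below compares the linearized capacitance change with the exact change of an ideal parallel-plate capacitor. The plate area, gap and displacement used in the usage note are hypothetical values chosen only to show the orders of magnitude involved; they are not taken from the gyroscope in this paper, and fringing fields are ignored.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m


def plate_capacitance(area_m2, gap_m, eps=EPS0):
    """Ideal parallel-plate capacitance C = eps * A / d."""
    return eps * area_m2 / gap_m


def delta_c_linear(area_m2, gap_m, delta_gap_m, eps=EPS0):
    """First-order capacitance change from Eq. (1): eps * A * dd / d^2."""
    return eps * area_m2 * delta_gap_m / gap_m ** 2


def delta_c_exact(area_m2, gap_m, delta_gap_m, eps=EPS0):
    """Exact change when the gap shrinks from d to d - dd."""
    return (plate_capacitance(area_m2, gap_m - delta_gap_m, eps)
            - plate_capacitance(area_m2, gap_m, eps))
```

For an assumed 100 µm by 100 µm plate with a 2 µm gap, the rest capacitance is about 44 fF and a 1 nm displacement changes it by roughly 0.02 fF. The linear and exact values differ by a fraction of about Δd/d, so Eq. (1) is a good approximation as long as the displacement stays small compared with the gap.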

Unlike conventional capacitance transducers, Gyroscopes present a set of stringent conditions on the electrical voltage that can be applied across the capacitance plates for sensing, so as not to interfere with the Coriolis force which produces the displacement of the capacitance plates. Fig.1 illustrates the top view of the Gyroscope structure, wherein the two plates CS+ and CS- form the top electrodes of two sensing capacitances. These two plates have anti-parallel movement along the Y-axis and hence they move anti-parallel in the Z-direction in response to rotational motion around the X-axis. The resulting change in capacitance is illustrated in Fig.1. In this paper two techniques are proposed which dynamically sense the capacitance. In the first technique, the capacitance is sensed at the peak value of its change, while the second technique utilizes the symmetry of the Gyroscope structure to produce an amplitude modulated waveform and demodulates it, thus sensing a continuous capacitance change.

#### II CHARGE/DISCHARGE METHOD

This method is primarily based on the linearity of the charging time difference between two capacitances charged by a constant current source. Fig.2 shows a detailed block diagram of the measurement scheme. Current source 1 and Current source 2 are two current sources that generate multiple currents for charging the two capacitances. The capacitance change is in phase with the sine wave which is applied to the comb drives of the Gyroscope. The measuring scheme uses this sine wave as the input to the peak detector (PD) [1]. At the peak of the sine wave the peak detector sends a pulse (Trigger), which is used to



Fig. 1. Electrical equivalent of Gyroscope structure



Fig. 2. Block diagram representation of Charge/discharge Method

# REASSESSMENT OF CHANNEL ENGINEERING IN SUB-100NM MOSFETS

R. Srinivasan and Navakanta Bhat, Senior Member IEEE, Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore 560 012, India. Email: <u>sreenivaasan@protocol.ece.iisc.ernet.in</u>, navakant@ece.iisc.ernet.in

#### Abstract

In this paper, we have investigated the usefulness of halo implantation in 90nm MOSFET inverter circuits, using process and device simulations. We have compared the inverter delays of channel-engineered devices (halo pockets and super steep retrograde channel profile) with the devices fabricated on uniformly doped substrate. Both the devices are constrained to have identical off state leakage currents  $(I_{OFF})$  and gate to drain/source overlap  $(L_{OV})$ . The simulation results show that in deep sub-micron (DSM) regime the devices fabricated on the substrate having uniform doping concentration yield lower delays. Therefore, to get better performance in the DSM devices, uniform substrate doping profile is recommended.

#### Introduction

As devices were scaled down, novel device structures such as halo (pocket) implantation, super steep retrograde channel (SSRC) implantation and ultra shallow highly doped source/drain extension structures were proposed to mitigate the short channel effects (SCE) [Okumura et al., 1999, Chon-Lung Wang, 1995, Cao et al., 1995, Mii et al., 1994, Ogura et al., 1980, vannis, 1992]. Starting from the sub-quarter micron CMOS devices, vertically and laterally non-uniform doping profiles have been used [gwozie, 1999]. But, at the same time, all these modifications in one way or another tend to degrade the drive current of the MOSFET [cao1999, hyunsang, 1996]. In this paper, we compare the performance of halo-implanted MOSFETs with that of uniformly doped channel MOSFETs, in terms of inverter delay ( $\tau$ ). We have used process, device and mixed-mode simulation. All the simulations are done using the ISE-TCAD simulator. The transistors are defined using process simulation (DIOS) so that the transistor structure is realistic. Two technology nodes (poly gate lengths of 90nm and 0.25µm) have been studied. In the next section, the procedure to obtain devices with the required $L_{OV}$ and $I_{OFF}$ is discussed. Section 3 deals with the simulation results and section 4 concludes the work.

#### **Procedure to Get Required Devices**

The transistors at each technology node are designed such that the  $I_{OFF}$  is the constraint. An  $L_{OV}$  of 10nm was maintained in all the transistors. Two sets of

# Effect of Surface Roughness on Physical Design Parameters

#### Ganesan.S.Iyer and Rajendra.M.Patrikar

Visvesvaraya National Institute of Technology, Nagpur-440011. India.

#### Abstract

Device and interconnect modeling plays a crucial role in integrated circuit design verification, since the performance of today's ICs depends on interconnects. For accurate modeling, it becomes more important to take all the physical effects into account as technology progresses. One such aspect is surface roughness, which arises from the various processes used in fabrication. With the reduction in physical dimensions, the effects of surface roughness are becoming more pronounced because of the higher surface-to-volume ratio of these devices. Simulations done for 50nm devices show that the effect of surface roughness is quite pronounced compared to earlier generations. Thus it appears that surface roughness affects device performance. We show that the effect will become more pronounced at the 50nm node.

#### 1.Introduction

With the reduction of the technology development cycle, it becomes increasingly important to be able to predict the performance of next-generation devices. The ability to make such a prediction would be useful both for early circuit design efforts and for technology development. Circuit designers would be provided with SPICE models long before any real silicon is available for model extraction, enabling them to evaluate their circuits under realistic conditions. Technology groups could use it as a tool to evaluate the device being developed, consider the alternatives, and optimize the device performance for a number of targets and design choices. This requires experimentation and an understanding of many physical effects in low dimension technology, as well as accurate device and interconnect modeling [1-5]. One of the major factors is that surface-to-volume effects are becoming important in all physical design parameters. In this paper we discuss surface roughness as a parameter that affects the electrical performance.

In traditional simulations and analysis, all the surfaces and interconnects are assumed to be smooth. Experiments have shown that these surfaces are not smooth, and errors are introduced when roughness is not taken into consideration in parasitic extraction or in device properties. This roughness appears because of the deposition variables of metals and dielectrics and various process variables in the chemical mechanical polishing (CMP) process [6] or other etching processes. In most of the deposition methods a self-affine surface appears in a growth mode in which the average orientation of the surface is


# Early, Fast & Accurate Software Power Estimation for Embedded Digital Signal Processors

# **Syed Saif Abrar**

Philips Semiconductors Bangalore saif.abrar@philips.com


# POWER ESTIMATION IN EMBEDDED SYSTEMS FROM A PRECHARACTERISED MODULE LIBRARY

Lakshmi Prabha Viswanaathan<sup>†</sup>, Elwin Chandra Monie<sup>‡</sup>

#### Abstract

Estimation of power in an embedded system needs to be addressed, as power ultimately has to be optimised. This paper presents a new approach to the estimation of power for an embedded system using conventional VHDL. For a given design and set of input vectors, the switching activity in the design yields a measure of power consumption. This is carried out in two phases, namely the estimation of power for standard architectures and then the simulation of power for custom built architectures used in embedded systems. Experimental results for a number of benchmarks show that the switching activity.

#### 1.Previous work:

At the architectural level, Landman *et al.* [1,2] presented a technique for the characterization of a module library using signal statistics. Powell *et al.* [3,4] suggested the Power Factor Approximation (PFA) method for characterizing each module in a module library consisting of functional blocks. Chandrakasan *et al.* [5,6] described a high-level synthesis system, HYPER-LP, which uses a variety of architectural and computational transformations to minimize power consumption in application-specific datapath-intensive CMOS circuits. Anand *et al.* [7] present a behavioral synthesis system known as Genesis, for synthesizing low power datapath-intensive CMOS circuits.

The key differences between the proposed approach and others are: (1) This approach is based solely on behavioral profiling. Landman's estimation is based on behavioral profiling or RT level profiling. For large designs with a large set of inputs, the latter approach is time consuming and hence design space exploration is difficult. (2) The characterization of the module library is based on purely random inputs. Landman, on the other hand, proposed the DBT (Dual Bit Type) model to take the input activity into account.

<sup>†</sup> Assistant Professor, GCT, Coimbatore, <sup>‡</sup> Professor, TPGIT, Vellore

# DYNAMIC POWER MANAGEMENT IN AN EMBEDDED SYSTEM FOR MULTIPLE SERVICE REQUESTS

#### V.Lakshmi Prabha<sup>†</sup>, Elwin Chandra Monie<sup>‡</sup>

#### Abstract

Power is increasingly becoming a design constraint for embedded systems. Dynamic Power Management algorithms enable optimal utilization of hardware at runtime. The present work attempts to arrive at an optimal policy to reduce the energy consumption at system level, by selectively placing components into low power states. A new, simple algorithm for power management systems involving multiple requests and services, proposed here, has been obtained from stochastic queuing models. The proposed policy is event driven and based on a Deterministic Markov Non-Stationary Policy model (DMNSP). The proposed policy has been tested using a Java based event driven simulator. The test results show that there is about 23% minimum power saving over the existing schemes with less impact on performance or reliability.

#### 1.Introduction :

Energy consumption has become one of the primary concerns in electronic design due to the recent popularity of portable devices and cost concerns related to desktops and servers. Embedded systems are often built out of multiple Processing Elements (PE's) like general purpose processors and application specific instruction processors (ASIPs). This is because off-the-shelf components are often cheaper than ASICs and software development is generally faster than hardware development. One drawback of programmable processors, however, is their lower power efficiency compared to that of ASICs. Therefore, various techniques have been proposed to reduce the power consumption of these components through Dynamic Power Management and Dynamic Voltage Scaling, which either switch off components or scale their voltage down.

#### 2. Related works

The fundamental premise for the applicability of power management schemes is that systems or system components experience non-uniform workloads during normal operation time. In recent past, several researchers have realized the importance of power management for large classes of applications. Chip level power management features have been implemented in commercial microprocessors [1]. Entire chip or units of the chip, can be shut down automatically by dedicated on-chip control logic [2]. Power Management schemes have been studied in [3,4,5,6,7,8]. These approaches explore several

# AN EFFICIENT MONTE CARLO DEVICE SIMULATOR TO CALCULATE VELOCITY OVERSHOOT IN MOSFETS

#### A.Madan<sup>1</sup>, S.Jindal<sup>1</sup>, B.Prasad<sup>2</sup> and P.J. George<sup>2</sup>

#### 1. ABSTRACT

A general purpose, self-consistent Monte Carlo device simulator has been developed for simulating electron transport in a silicon n-MOSFET. The simulator is aimed at providing a tool for probing high field transport processes and ultra fast phenomena that occur in a short channel MOSFET as we advance into nanometer dimensions. The Monte Carlo (MC) technique, besides its greater ability to deal with band and scattering details, has the advantage of solving the Boltzmann Transport Equation (BTE) without assuming *a priori* the distribution functions, and yet providing a transparent microscopic interpretation of phenomena on the atomic level with submicron spatial and femtosecond temporal resolutions. Since technology is developing at a faster rate, modeling activities have become more technology oriented rather than design centric. Thus, faster and more accurate device simulators are required to keep up with the pace of technology development. This work is one such sincere effort in this direction.

#### 2. INTRODUCTION

Modeling of nanometer dimension semiconductor devices has been a challenging task since the entry of nanotechnology. The reduction in the channel length of the MOSFET has been a major issue, as it accounts for the speed-power factor of the circuit. As the channel length reduces to nanometer dimensions, various effects have to be accounted for in device modeling, making it more complex.

For a MOSFET with a metallurgical channel length $L \sim 0.1~\mu m$, velocity overshoot of electrons has been observed [1]. In the case of an extremely small channel length and a large changing electric field, carriers do not have enough time to reach equilibrium with the external electric field. Their average energy is thus lower than the corresponding stable value under this field, resulting in drift velocity overshoot<sup>1</sup>.

The performance of MOS devices and circuits with 0.1 $\mu$m size is affected by velocity overshoot. The overshoot effect has been studied by a number of workers [2,3,4] based on models which involve solving a series of equations, consuming a lot of computation time. The simulator developed by the authors has

<sup>1</sup> Student, Punjab Engineering College, Chandigarh-160012

<sup>2</sup> Faculty, Electronics Science Dept, Kurukshetra University-136119

Contact author: anuj\_madan@ieee.org

# Gate Level Dynamic Power Estimation in the Presence of Varying Process Parameters

#### Siddharth Tata, Siddharth Garg, Ravishankar Arunachalam<sup>1</sup>

#### Abstract

Traditional power estimation tools have assumed fixed delay models in calculating dynamic power dissipation. However, with increasing uncertainty in gate delays due to manufacturing process variations, fixed delay models are inaccurate and need to be substituted with variable delay models. In our work, we represent the gate delay as a continuous random variable with known statistical parameters. We propose a novel and efficient scheme to propagate transition probability waveforms across the circuit. We show that our model accurately captures the variations in process parameters. Experimental results show that the average error in power estimation is less than 7% in our new scheme.

#### 1 Introduction

The era of low power digital circuit design has ushered in the need for accurate tools to estimate circuit power dissipation making it a crucial stage in the design process of digital circuits. The sources of power dissipation in digital CMOS circuits are (1) Dynamic Power dissipation due to switching activity at circuit nodes (2) Leakage power (3) Short Circuit current and (4) Standby Power dissipation. Among these the major source of power dissipation is dynamic power. The dynamic power dissipated in a circuit at a node 'x' is given by the equation

$$P_x = 0.5 C_L V_{dd}^2 E_{sw} f_{clk} \qquad (1)$$

where $C_L$ is the parasitic capacitance at the node 'x', $V_{dd}$ is the supply voltage, $E_{sw}$ is the average switching activity at the node and $f_{clk}$ is the clock frequency. Since $C_L$, $V^2_{dd}$ and $f_{clk}$ can be determined easily, the task of power estimation often reduces to the problem of finding the average switching activity at each node.
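As a minimal sketch of how equation (1) is used, the following Python snippet evaluates the dynamic power at one node and sums it over a list of nodes. The function names and the numeric values are illustrative assumptions and are not taken from the paper.

```python
def dynamic_power(c_load, vdd, e_sw, f_clk):
    """Equation (1): P_x = 0.5 * C_L * Vdd^2 * E_sw * f_clk (SI units)."""
    return 0.5 * c_load * vdd ** 2 * e_sw * f_clk


def total_dynamic_power(nodes, vdd, f_clk):
    """Sum equation (1) over (C_L, E_sw) pairs, one pair per circuit node."""
    return sum(dynamic_power(c_load, vdd, e_sw, f_clk) for c_load, e_sw in nodes)


if __name__ == "__main__":
    # Hypothetical node data: (capacitance in farads, average switching activity).
    nodes = [(10e-15, 0.5), (20e-15, 0.25)]
    print(total_dynamic_power(nodes, vdd=1.0, f_clk=1e9))
```

Only the switching activity per node is hard to obtain, which is why the estimation problem reduces to finding `e_sw` for every node.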

A number of techniques have been suggested to estimate the switching activity in digital CMOS circuits. Early methods use a zero delay model to calculate the switching activity, but these methods are inaccurate as they do not take into account the power dissipated due to glitches in the circuit. It has been shown in [Benini, Favalli and Ricco (1994)] that an average of 15-20% (up to 70% in some cases) of the total dynamic power may be dissipated as glitches in circuits. [Najm (1993), Tsui, Pedram and Despain (1993), Ding, Tsui and Pedram

<sup>1</sup> Department of Electrical Engineering, Indian Institute of Technology-Madras ravia@ee.iitm.ernet.in

# nWATT: POWER PLANNING METHODOLOGY IN PHYSICAL DESIGN

#### Satya Sridhar Narayanabhatla Wipro Technologies, India

Kiran Satyamangala Jaisimha Wipro Technologies, India

Binoj Xavier Magma Design Automation, India

#### Abstract

The paper proposes a methodology for power planning for a non-uniform power distribution across the chip. The methodology tries to ensure the supply of the required power to all the cells, address non-uniform power distribution, reduce pessimism while planning, and address EM and IR drop issues during power planning. Given the synthesized netlist and the design floorplan, this methodology attempts to determine the total power, the power distribution, the number of IO power pads required and their placement, and the power parameters for a simple mesh structure.

# A Connection Graph based Variable Wire Width Approach to Analog Routing

#### Subhashis Mandal<sup>1</sup>, Abhishek Somani<sup>2</sup>, Shamik Sural<sup>3</sup>, Robert Drury<sup>4</sup>, Amit Patra<sup>5</sup>

#### Abstract

An elegant approach to analog routing with variable interconnect width is presented in this paper. The algorithm is an extension of the widely used Lineprobe routing algorithm on an Adaptive Connection Graph (ACG). The ACG is built on the fly depending upon the required wire width and is much sparser than the traditionally used manufacturing grid for such algorithms. A novel Localized Offset Graph is proposed which is constructed to align the wires with the mid-point of the source and target terminals. We use an unreserved layer model for better optimization of the number of vias used. The proposed algorithm allows arbitrary user-specified directions of routing from the source/target and can thus easily be coupled with a routing optimizer for better design space exploration in the detailed routing phase of the physical design of a circuit. The implemented algorithm is described and results are reported validating the effectiveness of this approach for routing in analog circuits as the routing regions of analog circuits are not as dense as their digital counterpart.

Key Words - Analog routing, Arbitrary Wire Width, Unreserved Layer routing

#### 1. INTRODUCTION

ROUTING has become one of the most prominent tasks facing VLSI circuit designers as fabrication technology moves into deep sub-micron territory [Semiconductor Industry Association, (1997)]. With the automation of the physical design of analog circuits being still in its infancy, one of the most challenging problems is to have efficient routing algorithms which take analog constraints into consideration. Typically, the performance of analog circuits is adversely affected by the parasitics associated with routing. Minimizing the interconnect resistance and adjusting the interconnect width for the current requirement of the net is a very important issue [A.Vittat, (1999)], [C.Albrecht, (2001)].

This work is partially supported by National Semiconductor, Santa Clara, USA.

<sup>1</sup> School of Information Technology, IIT Kharagpur, India (email: subha@vlsi.iitkgp.ernet.in)

<sup>2</sup> Department of Computer Science, IIT Kharagpur, India

<sup>3</sup> School of Information Technology, IIT Kharagpur, India

<sup>4</sup> National Semiconductor, Santa Clara, USA

<sup>5</sup> Department of Electrical Engineering, IIT Kharagpur, India

# Novel Approach to Solve IP Integration Problems in an Era of SOC

Sreekanth K M (sree@ti.com)<sup>1</sup> Lionel Dahyot (l-dahyot1@ti.com)<sup>2</sup> Vinod Kumar (vinodk@ti.com)<sup>1</sup>

Abstract – This paper discusses the flow that we developed and used for the integration of hard IP into the chip design flow. The paper outlines the difficulties in the integration of IP and how the integration flow addresses these issues. The concepts of level-0, level-1 and level-2 checks, on which the integration flow is built, are also discussed. The paper also lists the benefits seen from this flow.

Keywords - IP integration, flow

#### **1.0 Introduction**

Intense competition to put new products at lower cost into customer hands ahead of competitors is forcing system manufacturers to reduce cycle times. Complex System-on-Chip (SOC) designs are fast becoming commonplace in today's applications. The increasing demand for more complex features in consumer electronics has resulted in exponential growth in the amount of analogue content on mixed-signal SOCs. All these factors put pressure on the time budgeted for the design of the chips used in the system. The result is the reuse of hardware modules called IP (Intellectual Property) blocks in designs. Designers must overcome many complex and challenging issues regarding cost, time-to-market (TTM), performance, power, capacity, quality and IP integration.

Smooth integration of IP is one of the major care-abouts in SOC designs. Successful integration of IP in the design flow contributes significantly to reducing design cycle time and effort. This also improves the predictability of the design cycle time. Another benefit is that it enables designers to concentrate on improving the performance and area of the design instead of wasting their energy on integration issues.

This paper discusses IP integration methodology, which is developed for integration of hard IP (IP in which size, layout, routing is complete and frozen) with Chip Create Flow. Paper also talks about the flow that is developed to automate this methodology and its benefits. In this paper unless explicitly mentioned IP refers to "hard IP".

#### 2.0 Problem Space

Smooth IP integration requires

- Generation of set of views that provides abstracted views of the IP to top level design flow
- Ensuring that IP delivered contains
  - All required models
  - Models adhere to the requirements put by the top level flow
  - Models are consistent with each other

<sup>1</sup> Texas Instruments, Bangalore 560017, India.

<sup>2</sup> Texas Instruments, Nice, France.

# Maximization of Aggressor Influence in Crosstalk–Delay Testing

Ravishankar Arunachalam Indian Institute of Technology, Madras.

#### Aniket Indian Institute of Technology, Madras.

#### Abstract

Crosstalk between adjacent lines can significantly affect the propagation delay of signals in Deep-Submicron (DSM) circuits. When such a circuit is subjected to conventional delay testing techniques, the critical paths obtained from static timing analysis are often incorrect due to the effect of crosstalk. Any path can have many wires (victims) which are affected by crosstalk from many other lines (aggressors). It may so happen that all the wires lying along a path are affected by crosstalk and the cumulative effect of crosstalk delays of all these victim nodes causes a timing violation. In such a case, all the aggressors associated with the victims lying along the path need to be activated appropriately in order to maximize the crosstalk delay, so that a delay fault, if it exists, is detected. In this paper we present a new Automatic Test Pattern Generation (ATPG) algorithm which maximizes the influence of crosstalk by appropriately activating the aggressors coupled to the victim nodes lying along any critical path.

#### 1. Introduction

Contemporary VLSI circuits use Deep-Submicron geometries, very high switching speeds, devices of different strengths. The above results in crosstalk being induced between circuit elements. Crosstalk effects can be categorized into two types: crosstalk induced pulses and crosstalk induced delay. The former causes functional failures whereas the latter affects the timing (delay faults). Crosstalk delay is induced when two lines (aggressor and victim) have simultaneous or near simultaneous transitions. If the aggressor and victim lines transition in opposite (same) directions, the transition times are increased (reduced) and effective delay is increased (reduced). Moreover, a victim line can be affected by multiple aggressors. Such unexpected changes in the signal propagation delay can adversely affect the performance of VLSI circuits where timing margins are small. Test generation techniques need to be developed in order to detect signals violating the timing constraints.

The first step in the test generation process is to list a set of critical paths which can potentially violate the timing criterion for the circuit. In the presence of crosstalk, crosstalk induced delays must be considered in addition to gate input to output delays when performing static timing analysis. Once the list of the possible critical paths is obtained, the paths need to be checked for sensitizability. For a sensitizable path, a transition placed at the Primary Input to the path may propagate all the way to the Primary Output. This is followed by the activation of the aggressors. Literature deals with the sensitization of a single victim (and all its corresponding aggressors) and then propagating the crosstalk skew along the maximum delay path passing through the victim. [Chen, Gupta, Breuer (1999)]

# CROSSTALK NOISE ANALYSIS AT MULTIPLE FREQUENCIES

Authors: Sachin Shrivastava<sup>1</sup> and Sreeram Chandrasekar<sup>2</sup>

#### Abstract

In SOC designs, it is common to have multiple clocks with some of them capable of operating at different frequencies. Crosstalk noise analysis uses switching information of nets in the form of timing windows to identify simultaneously switching aggressors. When clock frequencies change, the switching overlap relationship among nets also changes. Hence it is important to perform crosstalk analysis at all frequencies at which the chip may operate. In this paper we propose an approach to perform crosstalk analysis across multiple frequencies concurrently. We first present a method to obtain the worst crosstalk noise across all frequencies for each stage. Then we consider glitch propagation effects and build an efficient approach to analyze for crosstalk failures at all frequencies. We also discuss a method to guarantee functionality for a range of frequencies.

#### 1. Introduction

Pessimism in static crosstalk noise analysis can be reduced by using timing windows for finding simultaneously switching aggressors [1][2]. Timing windows for a net are defined with respect to the clock edge that triggers the switching of the net. Changing the clock frequency also causes the switching time of nets to change. This means that the switching overlap relationship among nets also changes when the clock frequency changes [3].
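The dependence of switching overlap on clock frequency can be illustrated with a small sketch. It is not the method proposed in the paper: it assumes, purely for illustration, that times are integer picoseconds, that both clocks have a rising edge at time 0, and that each timing window lies within one period of its clock. The function name `windows_overlap` is hypothetical.

```python
from math import gcd


def windows_overlap(win_a, period_a, win_b, period_b):
    """Return True if two periodic timing windows ever overlap.

    win_a, win_b: (start, end) in ps, relative to the triggering clock edge.
    period_a, period_b: clock periods in ps (integers).
    Assumes both clocks have an edge at t = 0 and 0 <= start < end <= period.
    """
    # The overlap pattern repeats every least common multiple of the periods.
    hyper = period_a * period_b // gcd(period_a, period_b)
    for ka in range(hyper // period_a):
        a0 = win_a[0] + ka * period_a
        a1 = win_a[1] + ka * period_a
        for kb in range(hyper // period_b):
            b0 = win_b[0] + kb * period_b
            b1 = win_b[1] + kb * period_b
            if a0 < b1 and b0 < a1:  # the two intervals intersect
                return True
    return False


if __name__ == "__main__":
    victim, aggressor = (100, 200), (300, 400)
    # Same clock period: the windows never meet.
    print(windows_overlap(victim, 1000, aggressor, 1000))
    # Aggressor clock changed to a 700 ps period: the windows now coincide.
    print(windows_overlap(victim, 1000, aggressor, 700))
```

In this toy case the aggressor is filtered out when both nets share a 1000 ps clock, but becomes a simultaneous switcher once its clock period changes, so a single-frequency analysis would miss it.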

In system-on-chip (SOC) designs, it is common to have multiple clocks with some of them capable of operating at different frequencies. Chips may also be overclocked, or run at a slower speed to reduce power dissipation. If a design can operate at multiple frequencies, it is important to ensure glitch-free operation at all the frequencies. If there are multiple clocks that can operate at different frequencies, we will have multiple combinations of clock frequencies possible, and crosstalk analysis needs to be performed for all such combinations. Since chip-level crosstalk noise analysis with glitch propagation is very run time intensive, performing separate noise analysis for all combinations is not desirable. Work has been done to find the discrete set of frequencies for which noise analysis needs to be done to guarantee functionality from dc to  $f_{max}$  [3]. But this requires noise analysis at different sets of discrete frequencies. In [4] crosstalk noise analysis is first done without using timing windows, and then for each combination of clock frequencies, selective re-analysis is required. Again, multiple frequencies are handled separately thus required mathemations

<sup>1</sup> Texas Instruments India Pvt. Ltd. (sachins@ti.com)

<sup>2</sup> Texas Instruments India Pvt. Ltd. (sreeram@ti.com)

# A Mathematical Analysis of Analog and Digital summation techniques in compensation Block for I/O Buffers

Sushrant Monga, Paras Garg, Frederic Hasbani

STMicroelectronics Noida

#### Abstract

Compensation circuits have been employed in o/p buffers to compensate for the effects of variations in mobility, threshold voltage and other parameters with changes in the PVT conditions. The aim is to optimize the performance of the o/p buffer in terms of the frequency of operation, the current slew and the o/p drive. For the change in PVT (Process, Voltage, Temperature) conditions the current in the p-mos and nmos transistors to optimize the performance. In this document a comparison is presented of two techniques, namely Analog and Digital summation, which are employed to produce a seven bit code for controlling the operation of the circuit by setting the aspect ratios of the pre-driver and the o/p buffer. A statistical approach has been taken to augment the empirical relations and arrive at the best optimized solution for generating the codes.

Terms used: Iref (the reference current generated in the compensation block to be compared with last to generate the seven bit code), $I_0$ (the value of Iref at typical PVT conditions), Iqn/p (the quantized value of the saturation current in the n-mos/p-mos transistor). kn/p denotes their respective process parameters, and Weqi, Weqo denote the equivalent widths of the pre-driver and the o/p buffer section as a function of the code. Iref can vary in the positive and the negative directions with y% and z% variation respectively, i.e. it can vary from $I_0(1-z/100)$ to $I_0(1+y/100)$ with a uniform probability.

#### 1.INTRODUCTION

In an I/O cell various quantities have to be optimized to interface the core to the o/p world and guarding against noise and excess charge etc to damage the core or the signals through which it is interfaced to the o/p world. To operate the cell under various PVT conditions we have to make sure that certain parameter constraints are satisfied for the proper operation. For this we employ the compensation circuitry to compensate the effects of mobility and threshold variation with temperature, process and voltage variations. To this effect a compensation code is supplied to a pre-driver as well as the o/p buffer so that our o/p buffer can perform at its best. The quantities that are to be optimized are namely

1.Current Slew (di/dt)max

2.O/P drive i.e. the o/p impedance

3.Frequency of operation

# DEVELOPMENT OF SILICON PIEZORESISTIVE ACCELEROMETER FOR AVIONICS APPLICATIONS<sup>+</sup>

#### Anil Nandi\*, Saumen Das\*\*, S.K.Lahiri\*\*\*

#### Abstract

A successful attempt is made to design and fabricate a dual beam silicon based piezoresistive accelerometer. The development is aimed at achieving maximum sensitivity in one direction (Z direction) and minimum off-axis sensitivity. The accelerometer is designed for 13 g and 20 g acceleration values. The design is simulated using the Coventorware tool on an HP-Solaris platform. The material properties and processes are chosen so as to maximize the sensitivity in the Z direction and to match the actual fabrication process involved. A bridge circuit is configured from the diffused piezoresistors and the value of the voltage at the output of the circuit is calculated theoretically. The output voltage thus calculated demonstrates the sensitivity of the sensor in terms of voltage values. The sensitivity is also calculated in terms of piezoresistor values. Various factors governing the design of accelerometers are studied.

#### 1. Introduction

Accelerometers are devices used to measure acceleration. MEMS version of Accelerometer is lighter, smaller and cheaper than traditional alternatives. They are used in automobile, consumer, industrial, aerospace, navigational and various other applications. The accelerometer is constructed from Silicon, using MEMS technology [1,2]. Micro-Electro-Mechanical System (MEMS) is a highly miniature device or an array of devices combining both the mechanical and electrical components. It is fabricated using standard IC batch manufacturing technique [3].

#### 1.1 Accelerometer theory

An accelerometer can be modeled as a spring and dashpot system, as in Fig 1 [4]. An accelerometer generally consists of a proof mass suspended by beams anchored to a fixed frame. The proof mass has a mass of m, the suspension beams have an effective spring constant of k, and there is a damping factor ( $\lambda$ ) affecting the dynamic movement of the mass. The accelerometer can be modeled by a second order mass-damper-spring system, as shown below. External acceleration displaces the support frame relative to the proof mass, which in turn changes the internal stress in the suspension spring. Both this relative displacement and the suspension-beam stress can be used as a measure

<sup>\*</sup> Senior Lecturer, ECE dept, B.V.B.College of Engg, Hubli, Karnataka.

<sup>\*\*</sup>Senior Scientific Officer, ATC, IIT KGP 721302, WB

<sup>\*\*\*</sup>Professor, E&ECE dept, IIT KGP 721302, WB.

<sup>+</sup> This work is carried out in Microelectronics laboratory at IIT, Kharagpur

# 1.5V, 10-bit, ±1200 mV Input range, CMOS, Pipelined Analog-to-Digital Converter

Sunil Kumar Vashishtha and Dr. Basabi Bhaumik

Department of Electrical Engineering Indian Institute of Technology Delhi New Delhi (India)

# Abstract

A 1.5-V, 10-bit, $\pm 1200$ mV input range pipelined ADC is designed and simulated in 0.35 $\mu$m CMOS technology. The emphasis is placed on a large input range. This is achieved by using an operational amplifier with a complementary (N-stage and P-stage in parallel) input stage and a low voltage switch. The main contributions of this work are (a) a new $g_m$ control scheme for a rail-to-rail input range operational amplifier, (b) a new low voltage switch using a bootstrapped technique, and (c) a new Digital-to-Analog Converter using a tri-state inverter. The converter has a maximum DNL of 0.84 LSB, a maximum INL of 1.32 LSB, a conversion rate of 5 MS/sec and a power dissipation of 13.5 mW.

## Introduction

Analog-to-Digital converters (ADC) provide the link between the analog world of transducers and the digital world of signal processing. The increasing digitalization in all spheres of electronics application, from telecommunications systems to consumer electronics appliances, requires ADCs with a higher sampling rate, higher resolution, and lower power consumption. In mixed-mode circuits, the voltage limitation of the technology dictates that the integrated ADCs operate at the same low voltage as the digital circuits. Conventional ADCs operate at large supply voltages (typically 5V or above) and use high speed, high gain operational amplifiers. At large supply voltages the input-output swing is not an issue, as the threshold voltages of the MOS transistors are much lower than the supply voltage. The low supply voltage has an enormous impact on the circuits, which must be able to deal with signal voltages that extend from rail to rail. This requires classical circuit solutions to be replaced by new configurations. Also, at lower supply voltages the resistance of a conventional pass transistor increases, so it can no longer be used for switched capacitor circuits.

# Pipelined ADC Architecture

A pipelined ADC is the most suitable architecture for resolutions in the range of 8-12 bits and high sampling speeds. The authors of [1-2] examined the effect of stage resolution on some important characteristics of the ADC, such as linearity, speed, area and power dissipation, and found that minimizing the stage resolution maximizes the conversion rate and minimizes both the area and the power dissipation of a multistage pipelined ADC. A large stage resolution is desirable from the linearity standpoint, but the effect of stage resolution on linearity can be minimized using a redundant bit and a digital error correction scheme. One such pipelined ADC using 1.5 bit/stage is proposed by Abo et al. [3] and used in this design. The A/D converter presented in this work has a dynamic range of $\pm 1200mV$, as compared to $\pm 600mV$ to $\pm 1000mV$ as proposed in [3-5].
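The 1.5 bit/stage scheme with a redundant bit and digital error correction can be sketched behaviorally as follows. This is an idealized textbook model, not the circuit of this design: the comparator thresholds at $\pm V_{ref}/4$, the overlap-add combination, and the names `stage_1p5bit`, `pipeline_adc` and `offset` are assumptions made for illustration. The nine stages and the 1.2 V reference follow the figures quoted for this design.

```python
def stage_1p5bit(vin, vref, offset=0.0):
    """One ideal 1.5-bit stage: returns (decision d in {-1, 0, 1}, residue)."""
    if vin > vref / 4 + offset:
        d = 1
    elif vin < -vref / 4 + offset:
        d = -1
    else:
        d = 0
    # Subtract the sub-DAC value and amplify the residue by a gain of two.
    return d, 2 * vin - d * vref


def pipeline_adc(vin, vref=1.2, n_stages=9, offset=0.0):
    """Combine the two-bit stage codes (0, 1, 2) by overlapped addition."""
    out = 0
    residue = vin
    for _ in range(n_stages):
        d, residue = stage_1p5bit(residue, vref, offset)
        # Shift by one bit and add the next code, so adjacent codes overlap.
        out = 2 * out + (d + 1)
    return out  # 0 .. 1022 for nine stages, which fits in 10 bits


if __name__ == "__main__":
    for v in (-1.2, -0.6, 0.0, 0.6, 1.2):
        print(v, pipeline_adc(v), pipeline_adc(v, offset=0.1))
```

Because each stage code overlaps the next by one bit, a comparator offset smaller than $V_{ref}/4$ (modeled here by the hypothetical `offset` argument) shifts the final code by at most a couple of LSBs in this model, which is the purpose of the redundancy.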

The block diagram of the pipelined ADC architecture with 1.5 bits/stage used in this design is shown in figure 1. The pipelined ADC consists of nine stages; each stage resolves two bits with a sub-ADC, subtracts this value from its input, and amplifies the resulting residue by a gain of two. The stages are buffered by switched-capacitor (SC) gain blocks that provide a sample-and-hold (S/H) between each stage, allowing concurrent processing. The resulting 18 bits are combined with digital error correction to yield 10 bits at
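As a rough illustration of the stage-by-stage conversion described above, the following Python sketch models an idealized 1.5-bit/stage pipeline. The function name `pipeline_adc`, the comparator thresholds at ±Vref/4, the use of a plain 2-bit flash as the ninth stage, and the `offsets` parameter are all assumptions made for this sketch; they follow the common textbook form of the 1.5-bit/stage architecture, not circuit details given in this text.

```python
def pipeline_adc(vin, vref=1.2, n_stages=9, offsets=None):
    """Return the output code of an idealized 1.5-bit/stage pipeline
    (10 bits for nine stages).

    Assumed model (not taken from the text): stages 1..n_stages-1 are
    1.5-bit stages with comparator thresholds at -vref/4 and +vref/4, and
    the last stage is a plain 2-bit flash. offsets[i] shifts both
    comparator thresholds of stage i to mimic comparator offset.
    """
    if offsets is None:
        offsets = [0.0] * (n_stages - 1)
    residue = vin
    code = 0
    for i in range(n_stages - 1):
        # Sub-ADC: raw stage code d is 0, 1 or 2 (two raw bits, '11' unused).
        if residue < -vref / 4 + offsets[i]:
            d = 0
        elif residue < vref / 4 + offsets[i]:
            d = 1
        else:
            d = 2
        # Subtract the sub-DAC level and amplify the residue by two.
        residue = 2 * residue - (d - 1) * vref
        # Digital error correction: add stage codes with a one-bit overlap.
        code += d << (n_stages - 1 - i)
    # Final 2-bit flash with thresholds at -vref/2, 0 and +vref/2.
    last = int((residue + vref) // (vref / 2))
    last = min(3, max(0, last))
    return code + last


vref = 1.2
lsb = 2 * vref / 1024
vin = -vref + (512 + 0.5) * lsb            # centre of code 512
skewed = [0.2 * vref * (-1) ** i for i in range(8)]
print(pipeline_adc(vin), pipeline_adc(vin, offsets=skewed))
```

In this model, comparator offsets smaller than Vref/4 in the first eight stages keep every residue within ±Vref, so the overlapped addition returns the same code as the offset-free case. That tolerance is what the redundant bit buys.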

# A MEMS Oscillator based on displacement sensing principle

# C.Venkatesh<sup>†</sup>, Navakanta Bhat<sup>†</sup>

#### Abstract

An oscillator based on displacement sensing has been designed. The proposed oscillator uses a comb drive actuator as the vibrating element. Displacement is sensed through a voltage divider network and fed back to achieve oscillations. The configuration is simulated and verified with the T-SPICE package.

#### 1.1 Introduction

Oscillators are an important component in present-day communication systems. In most systems oscillators are currently off-chip. Many applications demand extremely stable frequencies. In current integrated circuit (IC) technology it is not possible to achieve the very high quality factor (Q) required for oscillators. Hence crystal oscillators, which generate highly stable frequencies owing to their high Q, are used. Micro Electro Mechanical Systems (MEMS) have the potential to integrate various off-chip components on a single chip [1,2].

Oscillators based on comb drive actuators are widely discussed in current literature [1,3,4,6]. A monolithic oscillator has been realized using combined CMOS and surface micromachining technologies [1]. Methods of achieving frequency tuning have also been proposed [3,6].

In this work, a novel oscillator using a voltage divider network for sensing displacement has been designed.

#### 2.1 Principle

The basic principle involved in the operation of the oscillator can be explained with a series combination of a MEMS varactor and a fixed capacitor. Fig. 1 shows the series combination of the capacitors forming a voltage divider network. The variable capacitors are mechanically coupled such that if the capacitance increases in one, it decreases in the other. If some force moves the plates of one varactor closer (and hence the plates of the other farther apart), the capacitance of that varactor increases and hence the voltage drop across the corresponding fixed capacitor increases. This increase in voltage, if fed back properly to form positive feedback, would generate oscillations.
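The divider action can be written out explicitly. In the relation below, $V_{in}$ (the voltage applied across the series pair), $C_v$ (the varactor capacitance) and $C_f$ (the fixed capacitance) are symbols introduced here for illustration, and parasitic capacitances are neglected:

$$V_{C_f} = V_{in}\,\frac{C_v}{C_v + C_f}, \qquad \frac{\partial V_{C_f}}{\partial C_v} = V_{in}\,\frac{C_f}{(C_v + C_f)^2} > 0 \quad (V_{in} > 0).$$

Since the derivative is positive, the voltage across the fixed capacitor rises whenever the varactor capacitance rises, which gives the displacement-dependent signal available for feedback.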

In this work, two oscillator configurations using this principle are presented.



Fig. 1. Voltage divider network

<sup>†</sup> Electrical Communication Engineering, I.I.Sc, Bangalore.

#### 3.1 Oscillator configuration - 1


# AN ATPG APPROACH FOR 2-D ARRAY RECONFIGURABLE LOGIC STRUCTURES

#### Sarath Kumar<sup>1</sup>, Ravi Kumar Dasari<sup>1</sup> and Venkata Rangam<sup>2</sup>

#### Abstract

This paper proposes an approach to reduce the Automatic Test Pattern Generation (ATPG) effort based on Hierarchical ATPG flow for 2-D array configurable logic structures. The proposed flow uses macro level test patterns as basic test cubes and translates these test patterns to top-level test vectors. This reduces the test data volume and test time by minimizing the test configurations. In this approach, Actel VariCore Logic Unit (LU) is considered as the basic module for the 2-D array structure. The proposed approach has been tested for 4X4 array of LUs with registered I/O cells.

#### 1 Introduction

Field programmable gate arrays (FPGAs) are widely used logic devices which can be programmed in the field to implement logic circuits. There are many kinds of architectures with different programming technologies. FPGAs with SRAM-based architecture, also called Look-Up-Table (LUT) based FPGAs, are most popular. FPGAs consist of 2-D array of programmable logic blocks, programmable interconnects and programmable I/Os. FPGA cores can be embedded in a SoC design as Embedded Reprogrammable Cores.

Testing these Embedded Reprogrammable Cores for manufacturing faults can be done using different test approaches [4-21]. Testing with configuration tests is addressed in [7],[8],[10],[11],[14],[17],[19],[20],[21]. Testing with Built-In Self-Test (BIST) is addressed in [2],[8],[12],[13],[15],[16],[18]. This paper focuses on the conversion of macro-level configuration tests to top-level scan test patterns. We consider the Actel VariCore Logic Unit as the programmable logic element in a 2-D array reconfigurable logic structure.

Application of standard ATPG methods to reconfigurable architectures leads to a large number of test configurations, resulting in a large test data volume and test time. In this paper we consider a hierarchical approach using Macrotest, which reads the pre-calculated test configurations for the logic unit.

We describe the VariCore architecture in section 2 and the use of Macrotest to translate to scan-test patterns in section 3. The proposed flow is explained in section 4, and the generation of configuration tests for VariCore's logic unit is given in section 5. Finally, section 6 gives some concluding remarks.

#### 2. VariCore Architecture

The main building block of the VariCore SRAM-based Embedded Programmable Gate Array (EPGA) architecture is the Primary Embedded Gate

<sup>1</sup> Mentor Graphics (India), {sarath\_kumar, ravi\_dasari}@mentor.com

<sup>2</sup> Texas Instruments (India), venkata\_rangam@ti.com

## Redundancy and Undetectability of Faults in Logic Circuits: A Tutorial

Debesh K. Das Computer Sci. & Eng. Dept. Jadavpur University Kolkata - 700 032, India debeshd@hotmail.com Bhargab B. Bhattacharya ACM Unit Indian Statistical Institute Kolkata - 700 108, India bhargab@isical.ac.in

Abstract: A fault f in a circuit-under-test (CUT) is said to be redundant if, in the presence of f, the functional behavior of the circuit remains unchanged under all conditions. A fault f is said to be undetectable if the presence of f cannot be ascertained by any input-output experiment, i.e., by applying a test to the inputs and observing the response at some CUT output. In a combinational or scan-based synchronous sequential circuit, redundant and undetectable faults are synonymous. However, in a non-scan sequential circuit, they may represent different classes. Redundancy may be needed for achieving fault tolerance, eliminating hazards, and improving physical design. On the other hand, to a logic designer or test engineer redundancy may be undesirable for several reasons. It may increase chip area, critical delay, and the cost and complexity of test generation and fault simulation; it may invalidate a test for a target fault, which may therefore escape during testing. Redundancy of stuck-at faults also affects path-delay testing. In a sequential circuit, the effect of redundancy manifests in a more complex fashion. It strongly influences test generation and retiming strategy. This tutorial demonstrates, with examples, the various facets of redundancy and undetectability of stuck-at faults in both combinational and sequential circuits and their elimination techniques, and describes new classification schemes.

#### 1. Introduction

The concept of redundancy is very intriguing and lies at the foundation of system design. In general, it refers to the additional information, resources, or time beyond what is needed for normal operation [13]. Hardware redundancy is viewed as the presence of some gates and lines in the circuit, deletion of which does not change the circuit behavior [12, 32]. Classification of redundancies under stuck-at faults in combinational circuits based on circuit structure was first studied by Hayes [1]. Removal of hardware redundancy in logic circuits usually leads to further circuit optimization.
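To make the definition concrete, the sketch below exhaustively checks whether a single stuck-at fault changes the function of a small combinational circuit. The example circuit $f = ab + \bar{a}c + bc$, the net names `t1`, `t2`, `t3`, `f` and the helper `is_redundant` are chosen here purely for illustration. The term $bc$ is the consensus term, which is often added to remove a static hazard and is logically redundant.

```python
from itertools import product


def circuit(a, b, c, fault=None):
    """Evaluate f = a.b + a'.c + b.c with an optional stuck-at fault.

    `fault` is None or a (net_name, stuck_value) pair; the net names are
    hypothetical labels for the three AND outputs and the OR output.
    """
    def net(name, value):
        # Force the stuck value on the net that carries the injected fault.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    t1 = net("t1", a & b)
    t2 = net("t2", (1 - a) & c)
    t3 = net("t3", b & c)          # consensus term
    return net("f", t1 | t2 | t3)


def is_redundant(fault):
    """True if the faulty circuit matches the fault-free one on every input."""
    return all(circuit(a, b, c) == circuit(a, b, c, fault)
               for a, b, c in product((0, 1), repeat=3))


for fault in [("t3", 0), ("t3", 1), ("t1", 0)]:
    print(fault, "redundant" if is_redundant(fault) else "detectable")
```

Exhaustive comparison is only practical for very small circuits; it is used here because it mirrors the definition directly.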

Test generation for a non-scan sequential circuit is a very complex task, and the presence of redundancy/undetectability makes it worse. Undetectabilities in non-scan sequential circuits in the presence of a hardware reset facility are classified in [2]. A fault in a sequential circuit is said to be *combinationally redundant* if, starting from any state with any sequence of input vectors, its effect cannot be observed at the primary outputs or by probing the next-state lines, i.e., it is undetectable under full scan. If a fault is not combinationally redundant, but changes the state diagram such that no input sequence from the reset state

# Single Full Chip Vector for Functional Testing

Vishal Dalal\*

#### Abstract

There have been continuous efforts to bring down the overall cost of an SOC. The die cost and the cost associated with testing it are major components of the overall cost. Testing an SOC is expensive, and the expense translates directly into the amount of time the tester spends per chip. There are many efforts to reduce the tester time per chip and simultaneously increase the fault coverage. The tester needs to test for manufacturing defects as well as the functional correctness of the chip. This is done through "test vectors".

This paper discusses one way to reduce the testing time for functional test vectors. It explains functional testing, the general procedure for generating functional vectors and its limitations. A single test vector that collapses multiple functional test vectors into one reduces the test time; it is explained along with its architecture, advantages, challenges and overheads. The process of characterizing a chip is also explained. Finally, the paper touches upon testing the analog component of an SOC.

#### 1. Introduction

The main forces that drive today's semiconductor product development are low cost, low power and low latency (or high performance) of the product. The products with the best combination of these will eventually capture most of the market share. Cost is a highly important aspect of any product and may eventually rule out even better designs. The cost associated with any SOC can be divided into two major components: the cost associated with making the silicon, right from specifications to a packaged chip, and the cost associated with testing the rolled-out chip. While the former depends on the type of product, technology, etc., the latter cost component comes from the need to provide high-quality working products free from defects.

<sup>\*</sup> SASKEN Communication Technologies Ltd, vishald@sasken.com


# Hierarchical ATPG Static Pattern Compression

#### Sarveswara Tammali, Jais Abraham e-mail : {sarvesh, j-abraham1}@ti.com

#### Abstract

Static compression is a technique employed by ATPG tools to obtain the smallest subset of patterns that covers all the faults in the design. However, there are limitations in the way this technique is employed by commercial ATPG tools. This paper describes a technique in which knowledge of the hierarchical structure of the design is exploited to enhance the static compression achievable through these tools. The results of our experiments show a significant reduction in pattern count when the proposed technique is applied. The technique is all the more attractive because it does not involve any hardware change to the design.

#### 1 Introduction

The current ATPG tools [1] apply the following compression techniques to generate the smallest set of patterns to detect the given fault set.

1.1 Dynamic compression: This is performed during the pattern generation process. The algorithm attempts to maximize the number of faults detected by a given pattern. This is accomplished by setting unspecified bits in the pattern to '1' or '0' so that previously undetected faults can be detected by the same vector. One method of doing this uses the Redundant Vector Elimination (RVE) and Essential Fault Reduction (EFR) algorithms [2].

1.2 Static compression: This is one of the commonly used approaches to reduce the test pattern count. It is performed after the initial pattern generation. The process attempts to find the smallest subset of patterns that can detect all faults in the fault set. One common way to reduce the pattern set is reverse-order fault simulation [3],[4], in which the already generated pattern set is fault simulated in reverse order, so that any test vector that does not detect a fault left undetected by the vectors simulated earlier can be dropped from the test set. In addition to reverse-order fault simulation, current ATPG tools fault simulate the existing patterns in random order and eliminate any redundant patterns.
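The reverse-order pass described in 1.2 can be sketched in a few lines of Python. This is a minimal illustration, assuming the fault simulator's result is already available as a hypothetical dictionary `detects` that maps each pattern to the set of faults it detects; the function name and the data are invented for the example and do not correspond to any particular ATPG tool.

```python
def reverse_order_compaction(patterns, detects):
    """Drop patterns that add no new fault detections when the set is
    fault simulated in reverse generation order.

    patterns : list of pattern identifiers in generation order
    detects  : dict mapping pattern identifier -> set of detected faults
               (stands in for the output of a fault simulator)
    """
    covered = set()
    kept = []
    for p in reversed(patterns):
        newly_detected = detects[p] - covered
        if newly_detected:           # pattern is still needed
            kept.append(p)
            covered |= newly_detected
    kept.reverse()                   # restore generation order
    return kept


patterns = ["p1", "p2", "p3", "p4"]
detects = {
    "p1": {"f1", "f2"},
    "p2": {"f2", "f3"},
    "p3": {"f1", "f2", "f3"},
    "p4": {"f4"},
}
print(reverse_order_compaction(patterns, detects))
```

Later patterns tend to target hard-to-detect faults while also covering many easy ones, which is why simulating in reverse order often makes the early patterns redundant. The result is not guaranteed to be a minimum-size set.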

If the design can be partitioned into regions of minimally interacting logic, then the static compression results can be improved. Although it's a complex task for an ATPG tool to perform such a partition, the designer has this information most of the time. This paper talks about exploiting this information to enhance the static compression.

DSP Design, Texas Instruments India (P) Ltd., Bangalore, India.

# A NOVEL CMOS BIST SCHEME FOR ON-CHIP ADC AND DAC TESTING

#### J.Ramesh D.Dinesh Kumar M.Veera Raghavulu Dr.K.Gunavathi

#### Abstract

Testing the analog and mixed-signal circuitry of a mixed-signal IC has become a difficult task because most analog and mixed-signal circuits are tested by functionality, which is both time consuming and expensive. Several BIST schemes proposed recently [1] require an on-chip ADC and DAC and intensive computation, which is not always possible. Others [2] rely on analog circuitry and reference voltages for measurements, which makes them vulnerable to analog imperfections. Some [3-4] are sensitive to process variations, while others cannot be used to test all kinds of ADCs and DACs. In [5] a delta-sigma modulation based approach is proposed for on-chip test stimulus generation. This requires a low-pass filter to remove the out-of-band high-frequency modulation noise, which increases the area overhead and decreases the speed. In this paper, we have developed a Built-In Self-Test (BIST) scheme for testing on-chip analog-to-digital and digital-to-analog converters. We discuss on-chip generation of a linear ramp as the test stimulus [6] using an adaptive ramp generator circuit, and describe techniques for measuring the Differential Non-Linearity (DNL) and Integral Non-Linearity (INL) of the converters. We have validated the scheme by software simulation (Tanner-SPICE) using 1.25µm technology and a supply voltage of 2.5 V.
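For readers unfamiliar with how DNL and INL follow from a ramp test, the sketch below uses the generic code-density (histogram) calculation. It is offered only as background and is not claimed to be the measurement circuit of this paper; the function name and the sample histogram are made up for the example. With an ideal linear ramp, every code should collect the same number of samples, so the deviation of each bin from the average gives the DNL in LSB, and the running sum of DNL gives the INL.

```python
def dnl_inl_from_ramp_histogram(hist):
    """Estimate DNL and INL (in LSB) from a linear-ramp code histogram.

    hist[k] is the number of ramp samples that produced output code k.
    The first and last codes are skipped because they also collect the
    samples taken while the ramp is outside the converter's range.
    """
    inner = hist[1:-1]
    ideal = sum(inner) / len(inner)      # hits per code for an ideal ADC
    dnl = [h / ideal - 1.0 for h in inner]
    inl = []
    running = 0.0
    for d in dnl:
        running += d                     # INL is the cumulative sum of DNL
        inl.append(running)
    return dnl, inl


# Hypothetical 6-code histogram: code 1 is too wide, code 2 too narrow.
dnl, inl = dnl_inl_from_ramp_histogram([5, 12, 8, 10, 10, 5])
print([round(d, 3) for d in dnl])
print([round(i, 3) for i in inl])
```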

#### 1. Block Diagram



Fig. 1. Overall Block Diagram

The authors are with the Department of Electronics and Communication Engineering, P. S. G. College of Technology, Coimbatore, Tamil Nadu, India E-mail: jramesh60@yahoo.com

# A NOVEL TEST METHOD FOR ANALOG CIRCUITS USING WAVELET ANALYSIS

# P.KALPANA<sup>1</sup>, L.GAUTHAM<sup>2</sup>, S.MAHESH<sup>3</sup>, DR.K.GUNAVATHI<sup>4</sup>

#### Abstract

In this paper, a novel test method is proposed for detecting catastrophic and parametric faults in analog circuits. A transient stimulus is used as a test signal and a wavelet transform is applied on the response signal of the Device Under Test (DUT). The specifications of the circuit are verified implicitly by linking the test thresholds with the specifications.

#### 1. Introduction

Traditionally, in specification testing, the specifications of the Device Under Test are measured, and if any of the specifications is violated, the circuit is treated as faulty. But this specification-based testing is time consuming and expensive, so research is motivated towards fault-based or structural testing. Faults that occur in analog circuits are commonly classified into catastrophic faults (hard faults) and parametric faults (soft faults). Parametric faults do not alter the function of the circuit, but violate one of the specifications.

In the literature several methods have been suggested for verifying the specifications of the circuit implicitly. Implicit functional testing using a pseudo-random technique has been suggested in [1], [2]. They have used samples of the impulse response for fault detection and classification. Good classification requires long pseudorandom sequences. In [3] an optimal transient test stimulus is derived and an algorithm is proposed for determining the time points at which the output needs to be sampled. However, in practice the user may not have enough information about the circuit to pre-specify the best sampling time instants.

Different types of waveforms have been proposed as test waveforms. DC test generation methods have been proposed in [4], [5], and AC testing in [6], [7]. DC testing is not adequate, and AC testing requires different frequency components and also takes a long testing time. In transient testing, different frequencies can be accommodated and faults can be made visible in the measurements. So in the proposed test methodology, a PWL stimulus is used as an input stimulus

<sup>1, 4</sup> Assistant Professor, ECE Department, PSG College of Technology

Email: kalpana\_shekar@yahoo.co.in, kgunavathi2000@yahoo.com

<sup>2, 3</sup> ECE Department, PSG College of Technology

# A New Approach to Analog Scan using Time Delays

# P.Nandi<sup>[1]</sup>, T.Pattnayak<sup>[2]</sup>, S.Biswas<sup>[3]</sup>, S.Mukhopadhyay<sup>[4]</sup>, A.Patra<sup>[5]</sup>

#### Abstract

Scan-based DFT techniques are quite popular in digital circuits. The current work is focused on the development of scan architectures for analog circuits. A single input line is required to program the scan configuration logic, and a single output line is used to probe several analog signals in time-multiplexed fashion.

#### 1. Introduction

Digital DFT (Design for Testability) using shift registers in scan chains is widely popular. Unlike in digital circuits, the concept of shifting binary logic values through scan registers cannot be applied to analog circuits, because they involve a wide range of time-varying voltage levels. Some work has already been done in the area of analog scan-based DFT, such as IEEE 1149.4 [S. Sunter (1995)]. The major existing approaches for analog scan-based DFT are discussed below.

#### Multiplexer Based Approach

Analog inputs and outputs from multiple on-chip modules may be accessed by a single I/O analog bus by simple multiplexing based methodologies. The input address selects the module to be tested.

#### Voltage Scan Approach

For testing of analog circuits involving high-frequency signals the method discussed above is not suitable. In [C.L. Wey, (1990)] Wey et al. have proposed a technique in which voltages are stored in the scan cells, which are composed of sample and hold circuits, each built with a switch for sampling, a capacitor for storage, and a voltage follower for impedance buffering between capacitors. When a test is performed, data at various test points are simultaneously loaded to the holding capacitors. To scan the data out, a two-phase clock is used to scan out the voltages from the capacitors. The details are provided in [C.L. Wey, (1990)].

<sup>[1]</sup> Intel Corporation, India, Email: projit.nandi.@intel.com

<sup>[2]</sup> Philips India Pvt. Ltd., India, Email: Tapan. Pattnayak@philips.com

<sup>[3]</sup> Dept. of Electrical Engineering, Indian Institute of Technology, Kharagpur, Email:sbiswas@vlsi.iitkgp.ernet.in

<sup>[4]</sup> Dept. of Electrical Engineering, Indian Institute of Technology, Kharagpur, Email:smukh@ee.iitkgp.ernet.in

<sup>[5]</sup> Dept. of Electrical Engineering, Indian Institute of Technology, Kharagpur, Email:amit@ee.iitkgp.ernet.in

# A SIMULATION COVERAGE METRIC FOR ANALYZING THE BEHAVIORAL COVERAGE OF AN ASSERTION BASED VERIFICATION IP

# B. Pal<sup>1</sup>; A. Banerjee<sup>1</sup>; K. Chaitanya<sup>2</sup>; P. Dasgupta<sup>1</sup>; P.P. Chakrabarti<sup>1</sup>

#### Abstract

In recent times, Assertion-Based Verification (ABV) has become an essential component of the pre-silicon design validation flow. However, the use of ABV to validate descriptions of systems during simulation lacks a proper coverage metric. The set of assertions may be incomplete, thus not guaranteeing the behavioral checking completeness of the design under test with respect to the specification. In this paper, we consider the task of determining the completeness of an Assertion Based Verification IP against a high-level stuck-at fault model in a simulation-based validation framework. Such a coverage analysis can discover behavioral gaps in the set of assertions and aid the verification engineer to add more assertions to close the behavioral gaps. We present results of the proposed methodology on the ARM AMBA AHB bus protocol.

#### 1. Introduction

In recent times, assertion based verification (ABV) has found increased acceptance within the pre-silicon design validation flow. Several companies and consortiums have independently come up with languages for specifying assertions (OVA of Synopsys [7], System Verilog Assertion (SVA) of Accellera [9]). Property suites for standard bus interfaces, such as PCI [8] and AMBA [1], are being marketed as verification IPs for use as simulation monitors.

In spite of the growing popularity of ABV and the emergence of an arsenal of specification languages and simulation-based ABV tools, reasoning about the coverage of a set of assertions with respect to a given design-under-test remains a non-trivial task. Typically, property suites are derived manually

<sup>1</sup> Department of Computer Science and Engineering, IIT Kharagpur. {bhaskar,ansuman,pallab,ppchak}@cse.iitkgp.ernet.in

<sup>2</sup> Mentor Graphics, Hyderabad, India. chaitanya\_kamarapu@mentor.com
## A Novel Approach to Reduce Test Power Consumption

Santanu Chattopadhyay<sup>1</sup> Shantanu Gupta<sup>2</sup> Tarang Vaish<sup>3</sup>

#### Abstract

It is well known that excessive switching activity during scan testing can cause the average and peak power dissipation during test to be much higher than in normal-mode operation. This can be attributed to the fact that in CMOS technology almost all of the power consumption occurs during state transitions. Thus, by reducing the power consumption during scan testing we can avoid damage to the Circuit Under Test (CUT) and extend the battery life of the device. In this paper we propose a novel method to reduce test power consumption using a three-step algorithm. The first step modifies the flip-flop chain in the scan architecture; the second step adapts the vectors to the new architecture; finally, the test vectors are optimally reordered to minimize the number of transitions. At no point does this approach involve reordering of scan cells, which, though an attractive option, is impractical as it causes timing problems during actual testing. The algorithm is verified on ISCAS'89 benchmark circuits, where it shows as much as a 24.4% reduction in transitions.

Keywords: test, low power, test vector reordering, switching activity, scan testing architecture, flip flop chain modification.

#### 1. Introduction

With the advent of modern technology and advancements in VLSI design methodology, testing of circuits poses great challenges. Power dissipation is a crucial issue because of its wide-ranging effects on circuit performance and the life of the circuit under test. Exceeding the peak power limit of a circuit can cause irreversible damage, and it is well known that average power consumption during testing is much higher than in normal operation.

$P = \frac{1}{2} C_{ld} V_{dd}^2 F S$

where $C_{ld}$ is the load capacitance, $V_{dd}$ is the supply voltage, $S$ is the switching activity and $F$ is the clock frequency.
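Since the dynamic power in this expression scales with the switching activity S, one simple proxy for test power is the number of bit transitions between consecutive test vectors. The sketch below counts those transitions and reorders the vectors with a greedy nearest-neighbour heuristic. This heuristic is swapped in here for illustration only: it is not the three-step algorithm of the paper, it does not model transitions inside the scan chain during shifting, and the function names and sample vectors are invented.

```python
def hamming(u, v):
    """Number of bit positions in which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def total_transitions(vectors):
    """Bit transitions accumulated when vectors are applied in order."""
    return sum(hamming(u, v) for u, v in zip(vectors, vectors[1:]))


def greedy_reorder(vectors):
    """Keep the first vector, then repeatedly pick the closest remaining one."""
    remaining = list(vectors)
    order = [remaining.pop(0)]
    while remaining:
        nearest = min(remaining, key=lambda v: hamming(order[-1], v))
        remaining.remove(nearest)
        order.append(nearest)
    return order


vectors = ["0000", "1111", "0001", "1110"]
reordered = greedy_reorder(vectors)
print(total_transitions(vectors), total_transitions(reordered), reordered)
```

Finding the ordering with the fewest transitions is a shortest-Hamiltonian-path problem, so the greedy pass trades optimality for speed.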

<sup>1</sup> Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, <u>santanu@iitg.ernet.in</u>

<sup>2,3</sup> Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, <u>shantanu@iitg.ernet.in</u>, <u>tarang@iitg.ernet.in</u>

# FAULT DIAGNOSIS BY SPECTRAL METHOD

## Pradyut Sarkar<sup>1</sup> Arindam Karmakar<sup>2</sup> Susanta Chakrabarti<sup>2</sup>

#### Abstract

A new spectral coefficient is proposed for fault diagnosis of combinational circuits. The pseudo-random test pattern multiplied by the proposed spectral coefficient results in a noise-free spectral pattern. The proposed spectral pattern is applied to the CUT and the first erroneous test response in the presence of the considered fault is detected. Some of the spectral patterns have to be applied repeatedly fewer than n times (n is the number of outputs in the CUT), which is a significantly better result than that of the best known method, where the pseudo-random test pattern is applied n times for 100% fault coverage.

Keywords- Spectral Pattern, MILFSR, Hadamard Coefficient, BIST.

#### 1. Introduction

The main components of a BIST system are a test pattern generator that applies a sequence of patterns to the circuit under test (CUT), a response compactor that compacts the response into a signature, and a comparator that compares the signature to a fault-free reference value. Due to its low hardware cost, BIST based on random patterns is very attractive. Linear feedback shift registers (LFSRs) are commonly used as pseudo-random test pattern generators in BIST schemes. Weighted pseudo-random patterns have better fault coverage in circuits [1-4]. However, this technique generally results in large test sets. Another disadvantage of this method is the additional area and delay overhead. One way to overcome these drawbacks is to design the LFSR by selecting a good seed and feedback polynomial [5, 6]. However, test sequences with acceptable test length and fault coverage are obtained at the expense of the area overhead required to store seeds. Also, the complexity of computing the seeds increases rapidly with the number of primary inputs.

For built-in self-test (*BIST*) and also for an external test, the corresponding test responses are very often accumulated into a signature by use of a multi-input signature analyzer (*MILFSR*). When the test is completed, the signature accumulated in the MISA is compared with the expected signature of the fault-free *CUT*. If a mismatch occurs, the tested *CUT* is considered faulty [7-11]. Walsh and Rademacher-Walsh functions [15, 16] are implemented only for response compaction [10]. The first erroneous response position of a combinational circuit in the presence of the considered fault is detected. Then the last n test inputs are reapplied n times (where n is the number of outputs in the CUT).

<sup>1</sup> MCKV Institute of Engg. <sup>2</sup> Kalyani University e-mail: p\_sarkar77@yahoo.co.in e-mail: susanta\_chak@yahoo.co.in

This work was supported in part by AICTE, Delhi R&D scheme, No-8020/RID/RGD-147/2002. PI of the project: Susanta Chakrabarti

# AUTOMATED SILICON DEBUGGING METHODOLOGY FOR VALIDATING STANDARD CELLS

## N.VIJAYARAGHAVAN<sup>1</sup> DIMPLE LALWANI<sup>2</sup>

## Abstract

ASIC libraries provide designers with reusable building blocks in their design flow. It is thus imperative that these libraries are silicon qualified so as to obtain high yields for designs made using them. This paper discusses an automated silicon debugging methodology for the functional validation of standard-cell libraries. This methodology has been implemented in the form of a tool, INQUEST, and successfully used for debugging standard cells made in 130nm and 90nm technologies. The design of a representative system for exhaustive checking of standard cells is also discussed. Using this methodology, physical faults in standard cells can be quickly debugged and analyzed, which also facilitates the early maturity of the associated manufacturing process.

### 1. Introduction

The key success factor for the rapid growth of integrated systems is the use of standard-cell libraries for various system functions. Standard-cell designs are becoming more complex and libraries are increasing in size and richness, growing from 200 to 300 cells a few years ago to 700 cells or more today, with separate libraries for high performance, minimum area and low power. Thus, for making designs that have high yield, it is important that these libraries be exhaustively tested on silicon, leading to the maturity of the associated manufacturing process.

Debugging and Diagnosis are important components of testing, as accurate defect location is essential in improving the manufacturing process and rapid identification of defective cells is critical for the quick maturity of a Library. Locating functionality errors in the Library cells as well as the associated process problems and analyzing them are the key objectives of our debugging methodology. The discussed methodology is based on a scan chain based design, in which each Cell to be tested (CUT) is connected to the input of a scan flip-flop.

<sup>1</sup> Contact Information: N.Vijayaraghavan CRnD STMicroelectronics Sec 16 A Noida, India. Vijay.RAGHAVAN@st.com

<sup>2</sup> Contact Information: Dimple LALWANI CRnD STMicroelectronics Sec 16 A Noida, India. Dimple.LALWANI@st.com

# A BIST Approach to On-Line Monitoring of Digital VLSI Circuits: Theory, Design and Implementation

# S.Biswas<sup>[1]</sup>, S.Mukhopadhyay<sup>[2]</sup>, A.Patra<sup>[3]</sup>, S.Mandal<sup>[4]</sup>

## Abstract

This work is concerned with the development of algorithms and CAD tools for the design of digital circuits with on-line monitoring capability. An existing theory of fault detection and diagnosis available in the literature on Discrete Event Systems has been adopted for on-line detection of stuck-at faults in digital circuits. Efficient computational techniques based on Ordered Binary Decision Diagrams and abstraction have been proposed to deal with very large state spaces. Based on these, a CAD tool has been developed that provides a fully automated flow for the design of circuits with on-line test capability without requiring any modification to the core, and can handle generic digital circuits with cell counts as high as 15,000 and on the order of 2<sup>500</sup> states. This is believed to be an improvement of an order of magnitude over results presented in the literature. This methodology enables the designer to trade off fault coverage and detection latency against area and power overhead. Chips designed using this methodology have been fabricated in 0.18-micron technology and tested to be working.

#### 1. Introduction

The current work is aimed at the development of methodologies and CAD tools for On-Line Testing (OLT) of Digital Circuits. OLT can be defined as the procedure to enable integrated circuits to verify the correctness of their functionality during normal operation by checking whether the response of the circuit conforms to its normal dynamic model. Issues related to OLT are increasingly becoming important in modern electronic systems [Nicolaidis M. (1998)]. These needs have increased dramatically in recent times because with the widespread usage of deep submicron technology, there is a rise in the probability of development of faults during operation.

OLT techniques for VLSI, reported in the literature [Nicolaidis M. (1998)] can be classified into the following main categories, namely a) Self-Checking design b) Signature Monitoring in FSMs c) On-line BIST d) Analog Methodologies. Over the years these methodologies have proved their efficiency in terms of cost

<sup>[1]</sup> Dept. of Electrical Engineering, Indian Institute of Technology, Kharagpur, Email:sbiswas@vlsi.iitkgp.ernet.in

<sup>[2]</sup> Dept. of Electrical Engineering, Indian Institute of Technology, Kharagpur, Email:smukh@ee.iitkgp.ernet.in

<sup>[3]</sup> Dept. of Electrical Engineering, Indian Institute of Technology, Kharagpur, Email:amit@ee.iitkgp.ernet.in

<sup>[4]</sup> School of Information Technology, Indian Institute of Technology, Kharagpur, Email:subha@ee.iitkgp.ernet.in

# Fault Observability Analysis of CMOS Opamp in Frequency Domain

#### S. C. Bose\*, Vishal Gupta\*\* and Dinesh Jain\*\*

Abstract: In this paper, the fault observability of a MOS block is analyzed in the frequency domain. The concept of fault observability is dealt with at the transistor level of the circuit. The proposed algorithm, which is mainly simulation based, determines the set of optimal parameters and adequate test frequencies that would lead to increased fault observability. However, the aim here is the application of frequency-domain analysis to finding faults present in the transistors (i.e. active components) of the given CUT. Concepts such as fault masking, fault dominance, fault equivalence and non-observable fault are defined at the transistor level.

#### 1. Introduction

Analog integrated circuits, in addition to digital electronics, RF circuits etc., have become an integral part of SoCs. Testing and testability of digital circuits are mature and established. However, there remain considerable difficulties in testing analog integrated circuits [1,8]. These difficulties pertain to the nature of the signal and the diversity of specifications to be checked in order to assess the correctness of the circuit. Non-linearity, noise, parametric variation, limited access to the internal nodes of the chip etc. add to these difficulties. Several models [2,3,4,5,6,7] have been proposed in the literature. Some of them are statistical models. The algorithm proposed by Tsai [2] is a quadratic programming problem and is applicable only to systems having a simple transfer function. Other techniques proposed are fault equivalence study [14], the dictionary-based approach, dc testing for catastrophic faults [15], path sensitization [13], use of an IDDQ-like method [12] etc. However, detection of a faulty circuit does not throw any light on the cause of the fault. There have been attempts to diagnose faults in analog systems [5, 9], but these efforts concentrated on the passive components (resistors and capacitors) of the analog block, with the assumption that an active component such as an op-amp is fault free. In earlier work [12], the authors used an IDDQ-like method to identify the cause of the fault in a CMOS op-amp; depending upon the value of the current drawn from the supply, they were able to club possible faults into certain groups but failed to identify them within a group.

In this paper, the fault observability of a MOS analog block is analyzed in the frequency domain using SPICE simulation. We introduce the concept of fault

\* Scientist, CEERI Pilani; \*\* Students, EEE Group, BITS

# Advanced Processor Architectures – The Verification Challenge

## Sunil Kakkar

## Advanced Processor Architectures: Verification Challenge Identified

We will briefly discuss some of the complex high-performance microprocessor architectures. Terms like SMP, SMT, Superdeep Pipeline, VLIW, Vector Processor and Superscalar Processor will be introduced, and we will discuss why they make processor architectures complex. We will talk about some of the high-performance features of these microprocessors, such as dynamic scheduling, register renaming, speculative execution, branch prediction and imprecise exceptions.

We will also mention the complications that these architectures have to tackle, such as cache coherency, hazards, hardware interlocks, exception contentions and queue overflows.

As design complexity grows linearly, verification complexity grows exponentially, and we will talk about the impact of shrinking chip geometry.

We will also see how cycle-accurate simulators come to our rescue when the typical event-based simulator fails to cope.

Then, we will discuss whether simulation is enough and whether it alone can guarantee coverage. We will also talk about the time limitations of simulation and the minimum state of the design required for it to be simulated effectively.

We will then talk briefly about complementing simulation with static and mathematical techniques like Formal Verification, where properties or rules of the design can be checked for conformance at all times under all conditions. Formal Verification can also start very early on small self-contained blocks of the design. It is ideal for finding control logic bugs.

We will introduce coverage and random tests. We will also talk about directed random tests and how random tests can be focused to maximize coverage of the functionality they attempt to verify.
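As a rough illustration of how random tests can be focused to raise coverage, the following minimal Python sketch biases a random generator toward instruction classes that still appear in uncovered coverage bins. The instruction classes, the choice of adjacent-pair coverage as the metric and all function names are assumptions made for this example only; they are not taken from the talk.

```python
import random

# Hypothetical instruction classes, chosen only for illustration.
INSTRUCTION_CLASSES = ["alu", "load", "store", "branch"]


def generate_test(rng, weights, length):
    """Draw a random instruction sequence, biased by per-class weights."""
    class_weights = [weights[c] for c in INSTRUCTION_CLASSES]
    return rng.choices(INSTRUCTION_CLASSES, weights=class_weights, k=length)


def pair_coverage(tests):
    """Collect the set of adjacent instruction-class pairs exercised by the tests."""
    covered = set()
    for test in tests:
        covered.update(zip(test, test[1:]))
    return covered


def directed_random(seed, rounds, length):
    """Generate one test per round, boosting classes that occur in uncovered pairs."""
    rng = random.Random(seed)
    weights = {c: 1.0 for c in INSTRUCTION_CLASSES}
    all_pairs = {(a, b) for a in INSTRUCTION_CLASSES for b in INSTRUCTION_CLASSES}
    tests = []
    for _ in range(rounds):
        tests.append(generate_test(rng, weights, length))
        missing = all_pairs - pair_coverage(tests)
        # Reset the bias, then raise the weight of every class in a missing pair.
        weights = {c: 1.0 for c in INSTRUCTION_CLASSES}
        for a, b in missing:
            weights[a] += 1.0
            weights[b] += 1.0
    return tests, pair_coverage(tests), all_pairs
```

Calling `directed_random(seed=1, rounds=5, length=8)` returns the generated tests together with the covered and total coverage bins, so the ratio `len(covered) / len(all_pairs)` can be tracked from round to round. A production flow would measure coverage on the design itself rather than on the stimulus.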

## Why Formal Verification

We will define Formal Verification and discuss its basic concepts, such as representing a design as an FSM and the use of BDDs.

We will discuss why Formal Verification does not need test development and how Formal Verification is exhaustive & guarantees coverage for the properties verified.

We will talk about limitations of Formal Verification like State Space Explosion and size limitations.
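The idea of exhaustive checking without test development can be sketched on a toy FSM. The Python example below is a minimal sketch that uses explicit-state breadth-first search rather than BDDs, and the counter designs and the property are hypothetical. It visits every reachable state and either confirms an invariant or returns a counterexample trace.

```python
from collections import deque


def check_invariant(initial, inputs, step, invariant):
    """Explore every reachable state breadth-first.

    Returns (True, []) if the invariant holds in all reachable states,
    otherwise (False, trace) where trace leads from the initial state
    to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            # Rebuild the path from the initial state to the violation.
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return False, trace[::-1]
        for inp in inputs:
            nxt = step(state, inp)
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, []


# Hypothetical design: a counter that is meant to wrap from 2 back to 0.
def good_counter(state, enable):
    if not enable:
        return state
    return 0 if state == 2 else state + 1


# Hypothetical buggy variant that wraps only after reaching 3.
def buggy_counter(state, enable):
    return (state + 1) % 4 if enable else state


def never_three(state):
    """Property: the counter never reaches the value 3."""
    return state != 3
```

Running `check_invariant(0, [False, True], good_counter, never_three)` reports that the property holds, while the same call with `buggy_counter` returns the trace `[0, 1, 2, 3]`. The `parent` dictionary grows with the number of reachable states, which is where state space explosion shows up in this explicit-state form.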

## Assertion Based Verification

Assertions are rules or properties that the design must conform to. They may also be the assumptions that the designer expects the design to obey. Assertions also specify the functional relationship of a block to its surrounding blocks, which may include protocols and

# An Assertion-based Language for Generating Test Sequences for Complex Temporal Behavior

## Pritam Roy<sup>1</sup>, Pallab Dasgupta<sup>1</sup>, P P Chakrabarti<sup>1</sup>

## Abstract

This paper addresses the task of stimulus generation for complex temporal behavior of designs. Such stimuli can be used in a variety of cases, including simulation at lower levels of the design hierarchy, synthesis of controllers and post-silicon testing. In this paper we present a language for expressing temporal behavior in the form of formal properties and for annotating various kinds of input constraints to appropriately direct test generation. We present a tool that accepts our language and a circuit as a net-list, and produces sequential test patterns.

## 1 Introduction

In recent times, there has been a strong focus on assertion-based validation (ABV) within the validation flows of leading chip design companies. There are broadly two methodologies for verifying assertions, namely model checking (formal) and simulation based ABV (semi-formal). Current model checking tools do not scale to circuits of large size, thereby limiting the applicability of formal assertion verification to individual component modules of a design. Also, model-checking techniques cannot handle the circuit complexity at the lower levels of design (such as transistor / SPICE level).

While validating designs of large size, the designer often aims to check certain correctness requirements (framed as assertions). If these cannot be verified formally (due to capacity bottlenecks), the designer aims to develop the appropriate stimuli to exercise the scenarios of concern (as a test bench) and then run the simulation with these stimuli for checking the assertions. In recent times several ATPG based techniques [1,2,3,6,7] have been developed for this purpose.

<sup>1</sup> Dept of CSE, Indian Institute of Technology Kharagpur, Email: {pritam,pallab, ppchak}@cse.iitkgp.ernet.in

# $Pariks\bar{a}$ - Functional Verification tool for x86 Architecture

K. Uday Bhaskar G. Chandramouli V. Kamakoti

Department of Computer Science and Engineering Indian Institute of Technology, Madras, Chennai, India Email: kama@iitm.ernet.in

## Abstract

Functional testing of CISC architectures like the x86 is a Herculean task. This paper proposes a functional verification tool, $Pariks\bar{a}$, that generates random assembly programs to test x86-based architectures. The test generator is envisaged to be used throughout the design cycle of the x86 processor. To this effect, several features and interfaces are provided to the user to control the randomness and facilitate easy debugging. The test generator is supported by a knowledge base that aids in the selection of instructions so as to ensure larger fault coverage and termination of the tests when executed on a fault-free Design/Unit under test. The test generator can be used for both pre- and post-silicon validation. A salient feature of this tool is its ability to *locate* a failing instruction on an x86-compatible processor in the field.

**Keywords:** Test generation, CISC architecture, Functional Verification, Fault Coverage, Fault Grading, x86 Architecture, fault detection, fault location.

## 1 Introduction

Functional verification of a processor ensures that a given design of the processor under test (logical design or silicon chip) matches the specification of its instruction-set architecture. It is the most difficult and time-consuming step in large processor designs. To address this problem, design experts have made large investments of both time and money in building customized solutions, and a few have even introduced new methods (e.g. formal verification) or new verification tools. However, as designs grow more complex, the existing solutions consume an ever-increasing percentage of time. The result is a growing verification crisis. A cost-effective solution for functional verification that has a short learning curve and requires acceptable changes to the existing design methodologies is the need of the day. Precisely, the objective is to build a tool that can output tests comprising an *intelligent* sequence of instructions that covers the Instruction-set Architecture and exercises the various corner cases of the design. Due to the increased number of instructions and the contexts in which they execute, exhaustive testing is impossible. The "directed-random testing" approach is more likely to find bugs (and therefore achieve high functional coverage) than conventional methods. This is achieved by generating random test stimulus along three distinct axes: which instructions to execute; what arguments to pass to each instruction; and when to execute each instruction in relation to other activity in the test

## THE ROLE OF INSTITUTIONAL DEVELOPMENT AND ADVANCEMENT OFFICE IN PROMOTING UNDERGRADUATE VLSI EDUCATION: A ROLE MODEL CONCEPT

## Shekhar Pradhan, Ph.D., Professor and Director, Sponsored Programs, and Felica Wooten Blanks, Ph.D., Executive Director

Abstract: Ideal VLSI system technology education is centered on project-oriented courses. It involves undergraduates in the complete development cycle: design, fabrication and testing of fabricated chips. The course proposal outlined in this paper is suitable for advanced undergraduate students who have already taken circuit analysis and analog and digital circuit courses. This paper reports the successful implementation of the VLSI Design Course at the undergraduate level at our college. Several projects completed by the students involve experimental microprocessors and Finite State Machines. The completed projects include fabrication of the designed chips by the Metal Oxide Semiconductor Implementation Service (MOSIS), University of Southern California (USC), Marina del Rey, California, USA. The project was funded by the National Science Foundation (NSF). The Institutional Development and Advancement office plays a key role in coordinating between the NSF, USC and our college.

The design of the VLSI chips followed the current trend in the design and fabrication of custom VLSI chips, in which libraries of cells are used to build up the total function of the system. The cell library approach has the considerable advantage that the designer's previous work can be reused, saving a large amount of effort and time in the design of cells and systems. Cells were designed conservatively to assure their operation in a wide spectrum of applications. This conservative approach reduces their performance compared with cells designed for a specific use. For analog cells, whose performance is usually more sensitive to circuit design features than that of digital circuits, standard cells were designed to guarantee successful operation. Different chips were fabricated at different times, which placed greater responsibility on testing the validity of the cell designs for correct functionality. The disadvantage of this approach was that a less than optimal hardware design was realized. This was directly reflected in the electrical characteristics of the chips: a carefully handcrafted design had higher performance than a design based on library cells.

The approach described in this paper led to the successful completion of chips at a small college in collaboration with a larger university. This role model approach can be adopted by smaller institutions to integrate VLSI-based courses into their curriculum.

-----

Office of the Institutional Development and Advancement Bluefield State College, Bluefield, WV 24701, USA

# VLSI CURRICULUM IN INDIAN UNIVERSITIES: AN ANALYSIS & PRESCRIPTION

## V. Sahula<sup>1</sup>

#### Abstract

In this paper, we review the general motivation and requirements for establishing an effective VLSI-oriented curriculum at the undergraduate and postgraduate levels. We first discuss the curricula being followed at our institute. Next, we touch upon the environment and human-related issues, keeping away from the debate on balancing fundamental knowledge and application-oriented knowledge in a VLSI curriculum. Thus, the manuscript focuses on issues related to effective VLSI teaching, laboratory development and faculty research, and on strategies for possible solutions.

#### 1 Introduction & requirements analysis

The element of evolution is driven by the needs of society and environment and by the capability and desire of change agents. There is a definite shift of impetus towards digital media and computation for communication. The students' knowledge of and desire for the latest electronic and computer technology is ever higher, paving the way for the evolution of systems and system design techniques for multimedia, computation and communication. Lee and Messerschmitt (1998) and (1999) provide a peek into a futuristic education environment. According to a report from the Ministry of Communication and IT, Govt. of India, as well as an industry prediction, the demand for engineers in the IT sector, especially IT hardware, is nowhere matched by the supply from the education domain. The short-term strategy employed by industry is to impart VLSI design education-cum-training either in-house or at a hired academy. Alternatively, fresh or practicing engineers pursue postgraduate studies with a VLSI design specialization. Long-term strategies imply including VLSI design related courses at the undergraduate level. The latency of an undergraduate course is much greater than that of a PG course, and hence we need to act urgently to improve and orient the UG curriculum.

<sup>1</sup> Faculty member, Department of ECE, Malaviya National Institute of Technology, Jaipur. E-mail: sahula@ieee.org