Showing papers by "Timo Hämäläinen" published in 2002


Journal ArticleDOI
TL;DR: This paper gives an overview of the background, current status, and ongoing trends in wireless personal data communications, focusing on the most prospective standardisation and specification efforts.

64 citations


Proceedings ArticleDOI
07 Aug 2002
TL;DR: The basic properties, such as structure, transfer properties and arbitration of bus-based interconnections for System-on-Chip (SoC) designs are introduced.
Abstract: This paper introduces the basic properties, such as structure, transfer properties and arbitration of bus-based interconnections for System-on-Chip (SoC) designs. The overview shows that contemporary SoC buses differ only in minor details. As a result, practically every studied interconnection method could rather easily conform to a common interface. Such an interface would enhance design re-use and make system design easier. However, due to their similarity, the choice between buses is not a straightforward task.

63 citations


Journal ArticleDOI
TL;DR: This paper summarizes the results of over 25 research groups or individual researchers that have presented video coding implementations on general-purpose processors with the new single instruction multiple data media instruction set architecture extensions and offers an overview of future trends for new instructions and architectural speed-up techniques.
Abstract: This paper summarizes the results of over 25 research groups or individual researchers that have presented video coding implementations on general-purpose processors with the new single instruction multiple data media instruction set architecture extensions. The extensions are introduced and the fundamentals for extensions, as well as some inherent problems, are explained. The reported attempts to utilize the extensions are divided into kernel- and application-level, as well as platform dependent and independent optimizations. Optimized applications include, in addition to some proprietary methods, all of the major video coding standards such as H.261, H.263, MPEG-4, MPEG-1, and MPEG-2. These optimized implementations include a complete video codec, several decoders, and several encoders. Additionally, a performance comparison is given for four representative encoder implementations based on the reported results. Also included is an overview of future trends for new instructions and architectural speed-up techniques.

40 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: Models for 3G/4G service pricing that include QoS are introduced; the balance between pricing and the acceptance of services is a delicate matter that must be handled carefully.
Abstract: Pricing of future multimedia services in 3G/4G networks will play a key role, from the operator's point of view, in achieving maximum revenue and maximizing ROI. On the other hand, pricing of the various new services is a very important issue to subscribers, and especially the pricing versus the acceptance of services will be a very delicate and important matter that must be handled very carefully. This paper introduces models for 3G/4G service pricing including QoS.

27 citations


Journal ArticleDOI
TL;DR: This paper proposes a new methodology based on the economic models for competing traffic classes (classes of sessions) in packet networks, which aims to exploit the maximal capacity of the data network link by using the dynamic allocation strategy.
Abstract: In this paper, we attempt to exploit the maximal capacity of a data network link by using a dynamic allocation strategy. We propose a new methodology based on economic models for competing traffic classes (classes of sessions) in packet networks. As the demand for network services accelerates, users' satisfaction with the service level might decrease due to congestion at the network nodes. To prevent this, efficient allocation of a network's resources, such as available bandwidth and switch capacity, is needed. By using the so-called user profile as well as the utility (e.g., data rate) functions, it is possible to allocate data rates and other utilities using an arbitrary number of QoS classes, say $0.01,…, $10.

22 citations


Book ChapterDOI
01 Jan 2002
TL;DR: This chapter focuses on parallel implementations of the Self-Organizing Map (SOM) featuring different levels of parallelism; neuron parallel mapping is considered in great detail as it is the most commonly used approach.
Abstract: This chapter focuses on parallel implementations of the Self-Organizing Map (SOM) featuring different levels of parallelism. The basic arithmetic-logical operations of SOM are first reviewed for a consideration of implementation issues such as number precision, memory consumption and time complexity. Mapping involves network, training set, neuron and weight parallelism. Examples of the weight and neuron parallel mappings are given for abstract platforms to convey general principles. Neuron parallel mapping is considered in great detail as it is the most commonly used approach. A review of implementations is given, from supercomputers to VLSI (Very Large Scale Integration) chips, with criteria for performance comparison.
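
As an illustration of the neuron parallel mapping mentioned in the abstract, the following is a minimal sketch of one SOM training step on a one-dimensional map. The function names, the learning rate, and the neighbourhood radius are assumptions made for illustration and are not taken from the chapter. The per-neuron distance computation and the per-neuron weight update are the parts that a neuron-parallel implementation would distribute over processing elements; here they run sequentially.

```python
def best_matching_unit(weights, x):
    """Return the index of the neuron whose weight vector is closest to x."""
    # Each squared distance is independent of the others, so in a
    # neuron-parallel mapping every neuron could compute its own value.
    dists = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
    return min(range(len(weights)), key=dists.__getitem__)


def som_step(weights, x, lr=0.5, radius=1):
    """Run one training step on a 1-D map and return the winner index.

    lr and radius are illustrative (hypothetical) parameter choices.
    """
    bmu = best_matching_unit(weights, x)
    for j, w in enumerate(weights):
        # Only neurons within the neighbourhood of the winner are updated.
        if abs(j - bmu) <= radius:
            weights[j] = [w_i + lr * (x_i - w_i) for w_i, x_i in zip(w, x)]
    return bmu
```

With `radius=0` only the winning neuron moves toward the input, which makes the step easy to check by hand.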

19 citations


Proceedings ArticleDOI
11 Oct 2002
TL;DR: A new service differentiation mechanism (arrival-related dynamic partitioning) for cluster-based network servers is presented and it is demonstrated by simulation that it can guarantee that customers with higher priority classes receive better service than ones with lower priority classes and achieve the basic goals with service differentiation.
Abstract: As the Web is becoming a medium widely used as a preferential channel for critical information exchange, business, and e-commerce, it is necessary to enable differentiated service (DiffServ) mechanisms not only at the network but also at the Web server level. We present a new service differentiation mechanism (arrival-related dynamic partitioning) for cluster-based network servers. We demonstrate by simulation that it can guarantee that customers with higher priority classes receive better service than ones with lower priority classes and achieve the basic goals with service differentiation, especially when the system is heavily loaded with not enough server resources to allocate. An admission control mechanism is also provided to prevent a Web site from being overwhelmed by excessive user requests.

15 citations


Journal ArticleDOI
TL;DR: HIBI offers a scalable and easy-to-use architecture for system-on-a-chip designs that enables data transmissions with very low latencies and also minimizes the number of signal lines needed.

14 citations


Journal ArticleDOI
TL;DR: The requirements of the H.263/MPEG4 video encoder are clearly exceeded with two TMS320C6201 processors while obtaining over 90% parallelization efficiency.

12 citations


Proceedings ArticleDOI
09 Dec 2002
TL;DR: The proposed method substantially improves image quality of video conferencing sequences in presence of transmission errors and is compared to average intersample difference across the block boundaries (AIDB) algorithm whose performance is shown to be more sensitive to selection of correct threshold values than the proposed method.
Abstract: Corrupted low frequency data of intra coded macroblocks can significantly degrade quality of video in error prone wireless networks. Therefore, a new method for detecting the corrupted blocks is presented. The method exploits temporal smoothness of video by computing the absolute difference between subsequent video frames. A threshold function is used to highlight the block differences, and a heuristic is developed to detect the corrupted blocks. The proposed method is evaluated with our wireless video simulator, which shows that the method substantially improves image quality of video conferencing sequences in presence of transmission errors. In addition, the method is compared to average intersample difference across the block boundaries (AIDB) algorithm whose performance is shown to be more sensitive to selection of correct threshold values than the proposed method.
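
The general idea of frame differencing followed by thresholding can be sketched as below. This is only an illustration under stated assumptions: the abstract does not give the threshold function or the detection heuristic, so the block size, the per-pixel threshold `pixel_thr`, and the rule "flag a block when more than `ratio_thr` of its pixels changed strongly" are all hypothetical choices, not the authors' method. Frames are plain 2-D lists of luminance values.

```python
def detect_corrupted_blocks(prev, curr, block=8, pixel_thr=40, ratio_thr=0.5):
    """Return (block_row, block_col) pairs that changed suspiciously much.

    prev and curr are equally sized 2-D lists of pixel intensities.
    All thresholds are illustrative assumptions.
    """
    height, width = len(curr), len(curr[0])
    flagged = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            ys = range(by, min(by + block, height))
            xs = range(bx, min(bx + block, width))
            # Count pixels whose absolute temporal difference exceeds the
            # per-pixel threshold.
            hits = sum(
                1
                for y in ys
                for x in xs
                if abs(curr[y][x] - prev[y][x]) > pixel_thr
            )
            # Hypothetical heuristic: a block is suspect when most of its
            # pixels break temporal smoothness.
            if hits / (len(ys) * len(xs)) > ratio_thr:
                flagged.append((by // block, bx // block))
    return flagged
```

A real detector would also need to separate genuine motion from corruption, which is presumably what the paper's heuristic addresses; this sketch does not attempt that.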

10 citations


Proceedings ArticleDOI
17 Nov 2002
TL;DR: This paper introduces a model that can be used to share link capacity among customers under different kinds of traffic conditions, supporting connections of a given duration that require a certain quality of service.
Abstract: This paper introduces a model that can be used to share link capacity among customers under different kinds of traffic conditions. The model is suitable for different kinds of networks, such as 4G networks (fast wireless access to a wired network), to support connections of a given duration that require a certain quality of service. We study different types of network traffic mixed in the same communication link. A single link is considered as a bottleneck, and the goal is to find customer traffic profiles that maximize the revenue of the link. The presented allocation system accepts every call and there is no absolute blocking, but the offered data rate per user depends on the network load. The data arrival rate depends on the current link utilization, the user's payment (selected CoS class) and the delay. The arrival rate is (i) increasing with respect to the offered data rate, (ii) decreasing with respect to the price, (iii) decreasing with respect to the network load, and (iv) decreasing with respect to the delay. As an example, an explicit formula obeying these conditions is given and analyzed.
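
To make the four monotonicity conditions concrete, here is one simple function that satisfies all of them. It is a hypothetical example chosen for illustration, not the explicit formula analyzed in the paper, which the abstract does not reproduce; the parameter names and the `scale` constant are assumptions.

```python
def arrival_rate(rate, price, load, delay, scale=1.0):
    """Hypothetical arrival-rate model obeying conditions (i)-(iv).

    rate  : offered data rate (arrival rate increases with it)
    price : user's payment    (arrival rate decreases with it)
    load  : network load      (arrival rate decreases with it)
    delay : experienced delay (arrival rate decreases with it)
    All arguments are assumed to be non-negative.
    """
    return scale * rate / ((1.0 + price) * (1.0 + load) * (1.0 + delay))
```

Any function with the same signs of partial derivatives would serve equally well; the product form is used here only because each condition can be read off directly.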

Journal ArticleDOI
TL;DR: Performance of the configurable parallel memory architecture (CPMA) is analyzed in the case of a selection of algorithms from a video encoder that benefit from the multiple memory access functions, which is apparent from the comparisons to the traditional sequential memory accesses.

Proceedings ArticleDOI
05 Nov 2002
TL;DR: An implementation of an access point (AP) providing an access from WLANs to backbone wired networks is presented and the AP adapts the QoS signalling of the connected networks.
Abstract: The maintaining of the quality of service (QoS) of the data transfers in wireless local area networks (WLAN) is an extensive task. An implementation of an access point (AP) providing an access from WLANs to backbone wired networks is presented in this paper. The AP adapts the QoS signalling of the connected networks. The AP has been implemented in Windows NT workstation as a protocol driver and its operation has been verified. The evaluation of the driver performance encourages the usage of the AP in demanding environments.

Proceedings ArticleDOI
TL;DR: An adaptive Weighted Fair Queue based algorithm for traffic allocation is presented and studied, and the weights in the gradient type WFQ algorithm are adapted using revenue as a target function.
Abstract: In the future Internet, different applications such as Voice over IP (VoIP) and Video-on-Demand (VoD) arise with different Quality of Service (QoS) parameters including e.g. guaranteed bandwidth, delay jitter, and latency. Different kinds of service classes (e.g. gold, silver, bronze) arise. The customers of different classes pay different prices to the service provider, who must share resources in a plausible way. In a router, packets are queued using a multi-queue system, where each queue corresponds to one service class. In this paper, an adaptive Weighted Fair Queue based algorithm for traffic allocation is presented and studied. The weights in gradient type WFQ algorithm are adapted using revenue as a target function.
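
The idea of adapting queue weights by following the gradient of a revenue function can be sketched as follows. The revenue model used here, a concave function `price * log(1 + capacity * weight)` per class, is purely an assumption for illustration; the abstract does not state the paper's revenue function or update rule. The weights are kept on the simplex (positive, summing to one) by removing the mean gradient before each step.

```python
import math


def revenue(weights, prices, capacity):
    """Hypothetical concave revenue model: one log term per service class."""
    return sum(p * math.log(1.0 + capacity * w) for p, w in zip(prices, weights))


def adapt_weights(weights, prices, capacity, step=0.01, iterations=500, floor=1e-6):
    """Gradient ascent on the assumed revenue with weights summing to one."""
    w = list(weights)
    for _ in range(iterations):
        # Partial derivative of the assumed revenue for each class weight.
        grad = [p * capacity / (1.0 + capacity * wi) for p, wi in zip(prices, w)]
        mean_grad = sum(grad) / len(grad)
        # Subtracting the mean keeps the total weight unchanged; the floor
        # prevents any class from being starved completely.
        w = [max(floor, wi + step * (g - mean_grad)) for wi, g in zip(w, grad)]
        total = sum(w)
        w = [wi / total for wi in w]
    return w
```

For example, with three classes priced 3, 2 and 1 (think gold, silver, bronze) and equal starting weights, the adapted weights shift toward the higher-paying classes while every class keeps a non-zero share.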

Proceedings ArticleDOI
05 Nov 2002
TL;DR: The results show that the power consumption of a PC/104 diagnostics module is high for battery-operated systems, but it can be significantly reduced by component selection; transfer delays can be significant under high network traffic load.
Abstract: A diagnostics systems module based on the PC/104 computer platform standard has been developed. The wireless diagnostics module is battery operated and connected to a diagnostics access point using an IEEE 802.11b wireless local area network (WLAN) link. The diagnostics module performance is evaluated in terms of power consumption and wireless link capacity. The results show that the power consumption of a PC/104 diagnostics module is high for battery-operated systems. However, it can be significantly reduced by component selection. The IEEE 802.11b wireless link performance is adequate for enabling diagnostics applications, but transfer delays can be significant with high network traffic load. The PC/104 architecture is found to be suitable for industrial use. The architecture can be easily implemented and changed for other types of applications.

Proceedings ArticleDOI
08 Apr 2002
TL;DR: The functionality and implementation of the wireless Video Control Protocol (VCP) is presented, which has been implemented for developing the functionality for real-time video stream transmission over heterogeneous wireless network technologies.
Abstract: Real-time streaming video is expected to emerge as a key service in different telecommunications systems, including wireless networks. This paper presents the functionality and implementation of the wireless Video Control Protocol (VCP). The protocol has been implemented for developing the functionality for real-time video stream transmission over heterogeneous wireless network technologies. VCP is embedded into a wireless video demonstrator. The demonstrator consists of Windows NT hosts containing a real-time H.263 encoder, video stream parsing functionality, and several network connections, such as wireless LAN, Bluetooth and GSM data. The protocol contains functionality for protecting the video stream transfer and adapting different network technologies together.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: An analysis of TDMA-based communication scheduling in a system-on-chip (SoC) video encoder and analysis of communication and scheduling optimization methods in HIBI show that HIBI is capable of exploiting the predictable nature of continuous-media processing when the TDMA method is employed.
Abstract: An analysis of TDMA-based communication scheduling in a system-on-chip (SoC) video encoder is presented. Heterogeneous IP (intellectual property) block interconnection (HIBI) forms the platform for the system, enabling the possibility of exploring timeslot-based arbitration in addition to more traditional priority-based arbitration. To illustrate the effects of heavy communication with real-time periodic data transfers, a video encoder is used as a case study. A simulation environment and a tool for monitoring the communication are described. Analysis of communication and scheduling optimization methods in HIBI are given. Analyses show that HIBI is capable of exploiting the predictable nature of continuous-media processing when the TDMA method is employed. However, dynamic refining of timeframe allocation is necessary to utilize maximally the advantages of TDMA scheduling.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: The architecture of a wireless video transfer demonstrator, which contains modules for video capture, encoding, stream protection and transfer for wireless link or network, as well as for decoding and displaying at the receiver, is presented.
Abstract: This paper presents the architecture of a wireless video transfer demonstrator. The demonstrator has been implemented for developing control protocols and QoS support for real-time video streaming services. The demonstrator contains modules for video capture, encoding, stream protection and transfer for wireless link or network, as well as for decoding and displaying at the receiver. H.263 encoding is performed in real-time using dedicated hardware. A video control protocol has been designed and implemented for managing the stream transfer and for collecting measurement information. The current implementation operates over wireless LAN, GSM data, Bluetooth and a proprietary wireless LAN called TUTWLAN. In addition, a special module has been implemented for simulating different wireless links or networks locally.

Proceedings ArticleDOI
26 May 2002
TL;DR: The results of this paper suggest that design space exploration leads to substantial improvements when constructing complex SoCs and ideas on how to support this automatically with FSM optimization are shown.
Abstract: In this paper, finite state machine (FSM) optimization for a system-on-chip (SoC) interconnection is presented. In the used interconnection architecture, the same interface block is used repeatedly and, therefore, optimization of the interface for synthesis is a very critical implementation issue. However, low-level hand-optimization is not desirable and, therefore, optimization should be performed in the high-level description or automatically in the synthesis process. The results of this paper suggest that design space exploration leads to substantial improvements when constructing complex SoCs. Ideas on how to support this automatically with FSM optimization are shown.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: The paper evaluates the performance of Secure Remote Password (SRP) authentication protocol computations, implemented in C with the MIRACL and OpenSSL libraries, on Pentium III and ARM9TDMI microprocessors.
Abstract: The paper evaluates the performance of the Secure Remote Password (SRP) authentication protocol computations. The software implementations are written in C using the MIRACL and OpenSSL libraries and are evaluated on Pentium III and ARM9TDMI microprocessors. Accelerating the performance of the critical computational parts with dedicated hardware is discussed. The measurements show that with a prime modulus of length 2048 bits the SRP computations take 80.5 ms on a 700 MHz Pentium III and 503 ms on a 200 MHz ARM9TDMI with the highest software optimizations within the libraries. With maximal precomputation the execution times can be decreased to 30.6 ms and 194 ms, respectively. By adding appropriate hardware support for the SHA-1 hash computation and exponentiation, their cycle counts can be reduced by factors of at least 10 and 20.
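
The abstract does not say which precomputation technique was used. One common way to speed up modular exponentiation with a fixed base, such as the generator in SRP, is to precompute the repeated squares of the base so that each later exponentiation needs only multiplications. The sketch below illustrates that general technique under this assumption; the function names are hypothetical and this is not the MIRACL or OpenSSL implementation.

```python
def precompute_powers(base, modulus, max_bits):
    """Return [base^(2^0), base^(2^1), ...] mod modulus, max_bits entries."""
    table = [base % modulus]
    for _ in range(max_bits - 1):
        table.append(table[-1] * table[-1] % modulus)
    return table


def fixed_base_pow(table, exponent, modulus):
    """Compute base^exponent mod modulus using only table multiplications."""
    if exponent < 0 or exponent.bit_length() > len(table):
        raise ValueError("exponent out of range for the precomputed table")
    result = 1 % modulus
    index = 0
    while exponent:
        # Multiply in base^(2^index) whenever the matching exponent bit is set.
        if exponent & 1:
            result = result * table[index] % modulus
        exponent >>= 1
        index += 1
    return result
```

The table costs one squaring per exponent bit up front, after which the squarings are skipped in every exponentiation with the same base and modulus.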

Journal Article
TL;DR: An algorithm that actually provides Class of Service based differentiated access to server clusters, and offers better playground for QoS mechanisms in client-server environments, is described.
Abstract: The swift growth of Internet has boosted the use of Web based services and in some practical cases has led to overwhelming request bursts to servers. Relational database queries, image storage/retrieval and other new types of application transactions have become increasingly popular. Their coexistence in commercial parallel and distributed systems have generated some uniquely new loading problems. For example, the constant increase of request rate finally leads to processing power requirement exceeding that of the accessed server. As a consequence, the response times increase and some portion of the requests are lost. Clustering of servers to meet the growing demand for server processing capacity, especially in web-based service supply, have created the need for intelligent switching at front-end devices. As a consequence of clustering, multilayer switching schemes have been developed to enable optimum loading of the individual servers in a cluster. In this paper, we formulate the load balancing problem taking the QoS into consideration and introduce a QoS aware load balancing algorithm (QoS-LB). The performance of the algorithm is simulated and results indicating the load balancing capability of the algorithm are presented. The overall idea of this paper is to describe an algorithm that actually provides Class of Service based differentiated access to server clusters, and offers better playground for QoS mechanisms in client-server environments. The engineering task to offer QoS guarantees with such a differentiation tool is out of the scope of this paper.

Proceedings ArticleDOI
08 Apr 2002
TL;DR: Results show that a nine-fold improvement can be obtained in H.26L decoding speed in terms of frames per second with video quality equivalent to a non-optimized implementation.
Abstract: A unified method for optimization of video coding algorithms on general-purpose processors is presented. The method consists of algorithmic, code, compiler, and SIMD (Single Instruction Multiple Data) media Instruction Set Architecture (ISA) optimizations. H.263, H.263+ and emerging H.26L are used as example cases. For the realization of the unified method, the coding elements in all the codecs are analyzed and optimization techniques suitable for one or several of all the coding elements are presented. Results show that a nine-fold improvement can be obtained in H.26L decoding speed in terms of frames per second with video quality equivalent to a non-optimized implementation.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: The optimization tool uses an iterative algorithm to optimize the interconnection parameters, such as data width, priorities, and the time an agent can reserve the interconnection, to fulfill the given constraints.
Abstract: In this paper, we present a tool to be used in the optimization of interconnection parameters in order to achieve optimal performance and implementation with minimal costs. The optimization tool uses an iterative algorithm to optimize the interconnection parameters, such as data width, priorities, and the time an agent can reserve the interconnection, to fulfill the given constraints. In the used test case, the required area decreased 50% while 85% of the original bandwidth was obtained. This was due to an improved arbitration process.

Proceedings ArticleDOI
04 Sep 2002
TL;DR: A novel architectural extension called CPMA access instruction correlation recognition is introduced, intended for accelerating the execution rate of consecutive, temporally conflict-free, CPMA memory accesses and confirms that CPMA can have an acceptable silicon area.
Abstract: Contemporary multimedia processors and applications are increasingly limited by their data accessing capabilities. However, the designed Configurable Parallel Memory Architecture (CPMA) alleviates these multimedia data accessing requirements; achieving significant performance improvements over traditional memory architectures. CPMA decreases considerably the processor-memory bottleneck by widening the memory bandwidth, decreasing the number of memory accesses, and diminishing the significance of memory latency. To further enhance the performance of CPMA, this paper introduces a novel architectural extension called CPMA access instruction correlation recognition. The presented method is intended for accelerating the execution rate of consecutive, temporally conflict-free, CPMA memory accesses. As demonstrated in this paper, the superior CPMA performance can also be maintained in the case of limited access widths. In addition, the presented results confirm that CPMA can have an acceptable silicon area.

Journal ArticleDOI
TL;DR: This paper investigates the optimal tuning of the matrix gains of the controller as the scalar gain $\varepsilon \downarrow 0$, and closed forms for asymptotically globally optimal solutions are given.
Abstract: The authors previously (2000) showed that a low-gain controller of the form $C_\varepsilon(s) = \sum_{k=-n}^{n} \varepsilon K_k/(s - i\omega_k)$ is able to track and reject constant and sinusoidal reference and disturbance signals for a stable plant in the Callier-Desoer (CD) algebra. In this note, we investigate the optimal tuning of the matrix gains $K_k$ of the controller $C_\varepsilon(s)$ as the scalar gain $\varepsilon \downarrow 0$. The cost function is the maximum error between the reference signal and the measured output signal over all frequencies and bounded reference and disturbance signal amplitudes. Closed forms for asymptotically globally optimal solutions are given. The optimal matrix gains $K_k$ are expressed in terms of the values of the plant transfer matrix at the reference and disturbance signal frequencies. Thus the matrices $K_k$ can be tuned with input-output measurements made from the open loop plant without knowledge of the plant model. Although the analysis is in the CD-algebra, to the authors' knowledge the main results are new even for finite-dimensional systems.

Patent
07 Jun 2002
TL;DR: A method is proposed for adapting a bus to data traffic in a system comprising several functional units (311, 312, ..., 31n) and a bus structure, where the functional units are divided into at least two sets so that units which mainly transfer data with each other belong to the same set and are interfaced with the same separate sub-bus (321; 322).
Abstract: A method for adapting a bus to data traffic in a system comprising several functional units (311, 312, ..., 31n) and a bus structure. The functional units are divided into at least two sets so that units that mainly transfer data with each other belong to the same set and are interfaced with the same separate sub-bus (321; 322). The sub-buses can be united by switches (SW) into a more extensive bus, which is only used when data must be transferred between different sets. The supply voltage of each sub-bus is adjustable and is set lower the less traffic there is on the bus. The parallel transfer operation makes it possible to increase the transfer capacity of the bus structure without increasing its clock frequency. Furthermore, energy consumption can be reduced by dropping the supply voltage of the bus circuits so that the bus retains the transfer capacity needed.

Book ChapterDOI
19 May 2002
TL;DR: This paper introduces and simulates a QoS aware caching scheme that offers lower response delay for higher quality services and additionally optimizes utilization of the available processing power.
Abstract: Research of web-servers has recently addressed the problem of content distribution coupled with quality of service (QoS). Due to the explosive growth of services offered over the Internet, novel mechanisms are needed for IP based service delivery to scale in a client-transparent way. This paper addresses the above problem considering also utilization of available processing power of servers. Many developed caching systems dedicate a fixed portion of the processing power for higher QoS services leading to lowered overall throughput of the server system. Here we introduce and simulate a QoS aware caching scheme that offers lower response delay for higher quality services and additionally optimizes utilization of the available processing power.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: This paper presents the usage of the specification and description language (SDL) for system simulations of a large-scale telecommunication system for delivering information services to public transport passengers (TUTPIS), and an embedded medium access control protocol for wireless LAN.
Abstract: This paper presents the usage of the specification and description language (SDL) for system simulations. The two studied systems are a large-scale telecommunication system for delivering information services to public transport passengers (TUTPIS), and an embedded medium access control (TUTMAC) protocol for wireless LAN. The TUTPIS system combines the implementation of some systems components by the building of a simulation model for the general service architecture. On the other hand, TUTMAC has been fully implemented in SDL. The formal system designs are simulated for verifying the required functionality. By simulations, the architectural operability of the transport service system design has been tested, e.g. with a very high number of users and with different telecommunication network environments. For the embedded TUTMAC protocol, real-time simulations are performed for evaluating the performance and capacity requirements of the application for the final platform.