3D logic-on-logic technology is a promising approach for extending the validity of Moore’s law when technology scaling stops. 3D technology can also lead to a paradigm shift in on-chip communication design by providing orders of magnitude higher bandwidth and lower latency for inter-layer communication. To turn the 3D technology bandwidth and latency benefits into network latency reductions …
The performance of user-facing applications is critical to client platforms. Many of these applications are event-driven and exhibit “bursty” behavior: the application is generally idle but generates bursts of activity in response to human interaction. We study one example of a bursty application, web browsers, and produce two important insights: (1) Activity bursts contain false parallelis…
Interconnection networks are a critical component in most modern systems. Both off-chip networks, in HPC systems, data centers, and cloud servers, and on-chip networks, in chip multiprocessors (CMPs) and multiprocessor systems-on-chip (MPSoCs), play an increasingly important role as their performance is vital for the performance of the whole system. One of the key components of any interconnect i…
Next generation byte addressable nonvolatile memory (NVM) technologies like PCM are attractive for end-user devices as they offer memory scalability as well as fast persistent storage. In such environments, NVM’s limitations of slow writes and high write energy are magnified for applications that need atomic, consistent, isolated and durable (ACID) updates. This is because, for satisfying cor…
Over the lifetime of a microprocessor, the Hot Carrier Injection (HCI) phenomenon degrades the threshold voltage, which causes slower transistor switching and eventually results in timing violations and faulty operation. This effect appears when the memory cell contents flip from logic ‘0’ to ‘1’ and vice versa. In caches, the majority of cell flips are concentrated into only a few of t…
The performance of data-intensive applications, when running on modern multi- and many-core processors, is largely determined by their memory access behavior. Its most important contributors are the frequency and latency of off-chip accesses and the extent to which long-latency memory accesses can be overlapped with useful computation or with each other. In this paper we present two methods to…
An optimal operation plan has been developed for the reservoirs in the Kuttiadi river basin in Kerala using a systems approach. A system consisting of three reservoirs, namely Banasurasagar in Wayanad district and Kakkayam and Peruvannamuzhi in Kozhikode district, has been taken for the study. Banasurasagar reservoir is in the Kabbini river basin, which is an east-flowing river, and the remaining two are in …
Hardware prefetching improves system performance by hiding and tolerating the latencies of lower levels of cache and off-chip DRAM. An accurate prefetcher improves system performance whereas an inaccurate prefetcher can cause cache pollution and consume additional bandwidth. Prefetch address filtering techniques improve prefetch accuracy by predicting the usefulness of a prefetch address and ba…
The Vrishabhavathi watershed is a constituent of the Arkavathi River Basin in the Bangalore Urban and Ramanagara districts and covers an area of 360.620 km², representing a seasonally dry tropical climate. In the Vrishabhavathi watershed, the Vrishabhavathi River is the main surface water source; it is a tributary of the Arkavathi River, which joins the Cauvery River at a later stage. Earlier this surface water was mainl…
For the last few years, the major driving force behind the rapid performance improvement of solid-state drives (SSDs) has been the increase in the number of parallel bus channels between the flash controller and the flash memory packages inside the drive. However, there are other internal parallelisms inside SSDs yet to be explored. In order to improve performance further by utilizing this parallelism, this paper sugges…
Rewriting sequential programs to make use of multiple cores requires considerable effort. For many years, Amdahl’s law has served as a guideline to assess the performance benefits of parallel programs over sequential ones, but recent advances in multicore design introduced variability in the performance of the cores and motivated the reexamination of the underlying model. This paper extends A…
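For reference, the classic form of Amdahl's law that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook formula only, not the extension the paper proposes; the function names `amdahl_speedup` and `amdahl_limit` are hypothetical, and the sketch assumes identical cores and a fixed parallelizable fraction.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Classic Amdahl speedup for a program run on `cores` identical cores.

    `parallel_fraction` is the share of sequential execution time that can
    be parallelized (an assumed input, between 0 and 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; the parallel part is divided across cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of cores grows without limit."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    for n in (2, 8, 64):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model the serial fraction caps the achievable speedup no matter how many cores are added, which is why variability in core performance motivates revisiting the model.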
The unstructured oceanic environment and uncertain operating conditions of autonomous underwater vehicles (AUVs) with limited on-board energy resources call for the design of appropriate controllers to achieve optimal energy consumption while navigating in an unstructured oceanic environment. With this aspect in mind, a suboptimal robust control methodology has been presented here based on the approp…
Shoreline change extraction and change detection analysis is an important task with applications in fields such as setback planning, hazard zoning, erosion-accretion studies, regional sediment budgets and conceptual or predictive modeling of coastal morphodynamics. Shoreline delineation is difficult, time-consuming, and sometimes impossible for entire coastal sys…
We explore the problem of how to easily estimate the per-core power distribution of GPGPUs from the total power of all cores. We show that the dynamic energy consumption of a core for a given kernel, represented by its work footprint, is approximately proportional to the total time taken by all work units executing on that core, and the static power, represented by its core footprint, is propo…
Sea Surface Temperature (SST), being one of the most important geo-physical parameters in the ocean, plays an important role in global climate change. The spatial attribute of oceanographic data makes them highly suitable for GIS analysis as GIS provides a natural framework for the acquisition, storage, and analysis of georeferenced data. In the present study we estimate the recent SST trends u…
The Internet of Things will result in users generating vast quantities of data, some of it sensitive. Results from the statistical analysis of sensitive data across wide ranges of demographics will become ever more useful to data analysts and their clients. The competing needs of the two groups—data generators with their desire for privacy and analysts with their desire for inferred statisti…
The southwest coast of India consists of beaches and cliffs which support a highly dense coastal community. Coastal erosion is confined to the southwest monsoon season when the waves are rough. Seawalls and groins are the major management strategies adopted for coastal protection along the Kerala coast. Erosion hotspots are identified from extensive field work carried out during the southwest monsoon s…
The increasing computational and communication demands of the scientific and industrial communities require a clear understanding of the performance trade-offs involved in multi-core computing platforms. Such analysis can help application and toolkit developers in designing better, topology-aware communication primitives intended to suit the needs of various high-end computing applications. In…
In this study, a hybrid wavelet and artificial neural network (WLNN) model has been developed to forecast time series of significant wave height for lead times up to 48 h. The data used in the hybrid model are significant wave heights (Hs) from two stations, one near New Mangalore port in the Indian Ocean and another to the west of Eureka, Canada, in the North Pacific Ocean. The three ho…
Utilizing small (e.g., 4KB) pages incurs frequent TLB misses on modern big memory applications, substantially degrading the performance of the system. Large (e.g., 1GB) pages or direct segments can alleviate this penalty due to page table walks, but at the same time such a strategy exposes the organizational and operational details of modern DRAM-based memory systems to applications. Row-buffer…
By providing instruction-grained access to vast amounts of persistent data with ordinary loads and stores, byte-addressable storage class memory (SCM) has the potential to revolutionize system architecture. We describe a non-intrusive SCM controller for achieving light-weight failure atomicity through back-end operations. Our solution avoids costly software intervention by decoupling isolation …
This letter describes the architecture of an inter-domain message passing hardware sub-system targeting the embedded virtualization field. Embedded virtualization is characterized by application-specific solutions, where functionality is partitioned into a small, fixed number of Virtual Machines, typically under real-time constraints, which must communicate for synchronization and status signal…
Hardware prefetching on IBM’s latest POWER8 processor is able to improve performance of many applications significantly, but it can also cause performance loss for others. The IBM POWER8 processor provides one of the most sophisticated hardware prefetching designs which supports 225 different configurations. Obviously, it is a big challenge to find the optimal or near-optimal hardware prefetc…
Web applications are getting closer to the performance of native applications by taking advantage of new standards-based technologies. The recent HTML5 standard includes, among others, the Web Workers API, which allows executing JavaScript applications on multiple threads, or workers. However, the internals of the browser’s JavaScript virtual machine do not expose a direct relation between worker…
Applications of neural networks in various fields of research and technology have expanded widely in recent years. In particular, applications with inherent tolerance to accuracy loss, such as signal processing and multimedia applications, are highly suited to the approximation property of neural networks. This approximation property has been exploited in many existing neural network accelerato…
Recently, both industry and academia have proposed many different roadmaps for the future of DRAM. Consequently, there is a growing need for an extensible DRAM simulator, which can be easily modified to judge the merits of today’s DRAM standards as well as those of tomorrow. In this paper, we present Ramulator, a fast and cycle-accurate DRAM simulator that is built from the ground up for exte…
Programmer-managed GPU memory is a major challenge in writing GPU applications. Programmers must rewrite and optimize existing code for a different GPU memory size for both portability and performance. Alternatively, they can achieve only portability by disabling GPU memory at the cost of significant performance degradation. In this paper, we propose ScaleGPU, a novel GPU architecture to ena…
Third-party accelerators offer system designers high performance and low energy without the market delay of in-house development. However, complex third-party accelerators may include vulnerabilities due to design flaws or malicious intent that are hard to expose during verification. Rather than react to each new vulnerability, it is better to proactively build defenses for classes of attacks. …
Cloud providers host an increasing number of popular applications, on the premise of resource flexibility and cost efficiency. Most of these systems expose virtualized resources of different types and sizes. As instances share the same physical host to increase utilization, they contend on hardware resources, e.g., last-level cache, making them vulnerable to side-channel attacks from co-schedul…
1. Plants use light as a source of both energy and information. Plant physiological responses to light, and interactions between plants and animals (such as herbivory and pollination), have evolved under a more or less stable regime of 24-h cycles of light and darkness, and, outside of the tropics, seasonal variation in day length. 2. The rapid spread of outdoor electric lighting across t…
1. Understanding how soil microbial communities influence plant interactions with other organisms, and how this varies with characteristics of the interacting organisms, is important for multiple systems. Solanum spp. are a suitable model for trophic interactions in studies of agricultural and natural systems and can also provide useful corollaries in invaded systems. This study examined the…
1. Plant–fungal interactions can have strong effects on plant abundances, both through direct effects on plant performance and indirect effects on competition and facilitation. Most evidence linking fungi to plant abundances derives from direct fungal effects on initial growth, with little evidence linking fungal effects on plant–plant interactions in intact communities to plant abundanc…
1. Trait differences among plants are expected to influence the outcome of competition; competition should be strongest between similar species (or individuals) under limiting similarity, and between dissimilar species within competitive hierarchies. These hypotheses are often used to infer competitive dynamics from trait patterns within communities. However, plant traits are frequently plas…
1. Invasive herbivores can strongly affect ecosystems by reducing or removing native plant species, and early in primary successions they could have enduring consequences for plant community assembly and ecosystem functioning, although this has seldom been explored. Invasive brushtail possums (Trichosurus vulpecula) browse from ground levels to forest canopies in New Zealand, including on p…
1. In forest communities, the Janzen–Connell (J-C) hypothesis proposes that species diversity is maintained by non-competitive distance- and/or density-dependent seedling mortality caused by host-specific natural enemies. However, the effects of pathogen associations from nearby conspecifics versus heterospecifics remain unknown in spatially heterogeneous light environments. 2. Seeds o…
1. Metapopulation dynamics have been used to explain bryophyte dispersal patterns and they predict that population abundances vary with the spatial distribution of habitat and with species traits. However, results from stand and landscape studies are contradictory as both distance-dependent and distance-independent patterns have been found. These studies have typically included only a few …
1. Understanding the mechanisms by which invasive species affect native plants is a central challenge. Invasive plants have been shown to reduce pollinator visitation to natives and increase pollen quantity limitation. However, visitation and conspecific pollen delivery are only two components of the pollination process; post-pollination interactions on the stigma (heterospecific pollen …
1. An increase in tree mortality rates has been recently detected in forests world-wide. However, few works have focused on the potential consequences of forest dieback for ecosystem functioning. 2. Here we assessed the effect of Quercus suber dieback on carbon, nitrogen and phosphorus cycles in two types of Mediterranean forests (woodlands and closed forests) affected by the aggressive p…
1. Land-use change can modify the functional composition of tree communities, which is an essential determinant of the ecosystem functions. The lack of consensus about the functional responses of tree communities to land-use change is a major uncertainty in the assessments of human impacts on terrestrial ecosystem functions. 2. In this study, we applied a machine-learning method to a larg…
The run-time virtual address (VA) stack has some unique properties, which have garnered the attention of researchers. The stack grows and shrinks one-dimensionally at its top and contains data that is seemingly local/private to one thread or process. Most prior related research has focused on these properties. However, this article aims to demonstrate how conventional wisdom pertaining to the r…
Plasma physics has matured rapidly as a scientific and technological discipline with a vast span of relevant application in many different fields. As a consequence, no single textbook is able to address all aspects of plasma physics relevant to such a burgeoning community. With this reference text I have attempted to bridge the gap between the excellent variety of traditional, broadly-ba…
A last-level cache (LLC) in a private configuration offers lower latency and isolation but eliminates the possibility of sharing underutilized cache resources. Cooperative Caching (CC) provides capacity sharing by spilling a line evicted from one cache to another. Current studies focus on efficient capacity sharing, while the adaptability of CC to manycore environments deserves more attention…
A major problem in managing large-scale datacenters is diagnosing and fixing machine failures. Most large datacenter deployments have a management infrastructure that can help diagnose failure causes and manage assets that were fixed as part of the repair process. Previous studies identify only actual hardware replacements to calculate Annualized Failure Rate (AFR) and component reliability. I…
This letter quantitatively studies the benefits of inter-warp divergence aware execution on GPUs. To that end, the letter first proposes a novel approach to quantify the inter-warp divergence by measuring the temporal similarity in execution progress of concurrent warps, which we call Warp Progression Similarity (WPS). Based on the WPS metric, this letter proposes a WPS-aware Scheduler (WPSaS) …
Solid-state drives (SSDs) offer a significant performance improvement over hard disk drives (HDDs); however, they can exhibit significant variance in latency and throughput due to the internal garbage collection (GC) process on the SSD. When SSDs are configured in a RAID, the performance variance of individual SSDs can significantly degrade the overall performance of the array. The in…
The internal architecture of an SSD provides channel-, chip-, die- and plane-level parallelism to concurrently perform multiple data accesses and compensate for the performance gap between a single flash chip and the host interface. Although a good strategy can effectively exploit the first three levels, parallel I/O accesses at the plane level can be performed only for operations of the same types …
We present a system architecture that uses high-efficiency processors as opposed to high-performance processors, NAND flash as byte-addressable main memory, and high-speed DRAM as a cache front-end for the flash. The main memory system is interconnected and presents a unified global address space to the client microprocessors. A single cabinet contains 2550 nodes, networked in a highly redundant…
Current ultra-high-performance computers execute instructions at the rate of roughly 10 PFLOPS (10 quadrillion floating-point operations per second) and dissipate power in the range of 10 MW. The next generation will need to execute instructions at EFLOPS rates (100× as fast as today’s) but without dissipating any more power. To achieve this challenging goal, the emphasis is on power-effici…
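The efficiency gap implied by these figures can be worked out with a short calculation. The sketch below is illustrative arithmetic only: it treats the abstract's rough numbers as exact (10 PFLOPS and 10 MW today), assumes a target of exactly 1 EFLOPS within the same 10 MW budget, and uses hypothetical helper names.

```python
PFLOPS = 1e15  # floating-point operations per second
EFLOPS = 1e18
MEGAWATT = 1e6  # watts


def flops_per_watt(rate_flops: float, power_watts: float) -> float:
    """Power efficiency: operations per second delivered per watt."""
    return rate_flops / power_watts


def joules_per_flop(rate_flops: float, power_watts: float) -> float:
    """Average energy spent on one floating-point operation."""
    return power_watts / rate_flops


# Assumed operating points (rounded figures from the abstract).
current_rate, target_rate = 10 * PFLOPS, 1 * EFLOPS
power_budget = 10 * MEGAWATT

current_eff = flops_per_watt(current_rate, power_budget)
target_eff = flops_per_watt(target_rate, power_budget)

if __name__ == "__main__":
    print(f"current: {current_eff / 1e9:.0f} GFLOPS/W")
    print(f"target:  {target_eff / 1e9:.0f} GFLOPS/W")
    print(f"required efficiency gain: {target_eff / current_eff:.0f}x")
    print(f"energy per op, target: {joules_per_flop(target_rate, power_budget) * 1e12:.0f} pJ")
```

Under these assumptions, holding power constant while raising throughput a hundredfold means the energy available per operation must fall by the same factor, from about 1 nJ to about 10 pJ.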
A compelling confluence of technology and application trends in which the cost, execution time, and energy of applications are being dominated by the memory system is driving the industry to 3D packages for future microarchitectures. However, these packages result in high heat fluxes and increased thermal coupling challenging current thermal solutions. Conventional design approaches utilize des…
As thread-level parallelism in applications has continued to expand, so has research in chip multi-core processors. As more and more applications become multi-threaded, we expect to find a growing number of threads executing on a machine. As a consequence, the operating system will require increasingly large amounts of CPU time to schedule these threads efficiently. Instead of perpetuating the…