A new data compression technique for faster computer programs




A new technique developed by MIT researchers rethinks hardware data compression to free up more memory in computers and mobile devices, allowing them to run faster and perform more tasks simultaneously.

Data compression leverages redundant data to free up storage capacity, increase compute speed, and provide other benefits. In today's computer systems, accessing main memory is very expensive compared to actual computation. Compressing data in memory therefore improves performance, because it reduces how often programs must fetch data from main memory and how much data they must fetch.

Memory in modern computers manages and transfers data in fixed-size chunks, on which traditional compression techniques must operate. Software, however, does not naturally store its data in fixed-size chunks. Instead, it uses "objects": data structures that contain various types of data and have variable sizes. As a result, traditional hardware compression techniques handle objects poorly.

In a paper presented this week at the ACM International Conference on Architectural Support for Programming Languages and Operating Systems, MIT researchers describe the first approach to compressing objects across the memory hierarchy. This reduces memory usage while improving performance and efficiency.

Programmers could benefit from this technique, without changing their code, when using any modern programming language that stores and manages data in objects, such as Java, Python, and Go. For their part, consumers would see computers that run much faster or run many more applications at the same speed. Because each application consumes less memory, it runs faster, and a device can support more applications within its allotted memory.

In experiments using a modified Java virtual machine, the technique compressed twice as much data and halved the use of memory compared to traditional cache-based methods.

"The motivation was to create a new memory hierarchy that could do object-based compression, instead of cache-line compression, because that is how most modern programming languages manage data," says first author Po-An Tsai, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

"All computer systems would benefit," adds co-author Daniel Sanchez, a professor of computer science and electrical engineering and a researcher at CSAIL. "Programs become faster because they are no longer bottlenecked by memory bandwidth."

The researchers built on their earlier work, which restructures the memory architecture to manipulate objects directly. Traditional architectures store data in blocks in a hierarchy of progressively larger and slower memories, called "caches". Recently accessed blocks rise to the smaller, faster caches, while older blocks are moved to slower and larger caches, eventually ending up back in main memory. Although this organization is flexible, it is costly: to access memory, each cache must search for the address among its contents.

"Because objects are the natural unit of data management in modern programming languages, why not just create a memory hierarchy that deals with objects?" says Sanchez.

In a paper published last October, the researchers detailed a system called Hotpads, which stores whole objects, tightly packed into hierarchical levels, or "pads". These levels reside entirely in efficient, on-chip, directly addressed memories, with no sophisticated searches required.

Programs then directly reference the location of every object across the pad hierarchy. Newly allocated and recently referenced objects, along with the objects they point to, stay in the fastest level. When the fastest level fills up, it runs an "eviction" process that keeps recently referenced objects, moves older objects down to slower levels, and recycles objects that are no longer useful in order to free up space. Pointers in each object are then updated to point to the new locations of all moved objects. In this way, programs can access objects much more cheaply than by searching through cache levels.
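The eviction process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Hotpads hardware design: the two-level layout, the `(level, slot)` address format, the `recently_used` flag, and the function name `evict` are all hypothetical, and the sketch assumes that objects in the slow level never point into the fast level.

```python
from dataclasses import dataclass, field

# Hypothetical direct address: (level name, slot index).
Addr = tuple[str, int]


@dataclass
class Obj:
    payload: str
    pointers: list[Addr] = field(default_factory=list)
    recently_used: bool = False


def evict(fast: dict[int, Obj], slow: dict[int, Obj], roots: list[Addr]) -> list[Addr]:
    """Evict from the fast level and return the roots rewritten to new locations.

    Assumption: slow-level objects hold no pointers into the fast level.
    """
    # 1. Find the fast-level objects still reachable from the program's roots.
    live: set[int] = set()
    stack = [slot for level, slot in roots if level == "fast"]
    while stack:
        slot = stack.pop()
        if slot in live:
            continue
        live.add(slot)
        stack.extend(s for level, s in fast[slot].pointers if level == "fast")

    # 2. Move live objects that were not recently used down to the slow level.
    forward: dict[Addr, Addr] = {}
    next_slot = max(slow, default=-1) + 1
    for slot in sorted(live):
        if not fast[slot].recently_used:
            slow[next_slot] = fast[slot]
            forward[("fast", slot)] = ("slow", next_slot)
            next_slot += 1

    # 3. Free the slots of moved objects and of unreachable (useless) objects.
    for slot in list(fast):
        if slot not in live or ("fast", slot) in forward:
            del fast[slot]

    # 4. Update pointers so they refer to the new locations of moved objects.
    for obj in list(fast.values()) + list(slow.values()):
        obj.pointers = [forward.get(p, p) for p in obj.pointers]
    return [forward.get(r, r) for r in roots]
```

After a call to `evict`, every surviving pointer is again a direct address, so reaching an object takes a single lookup rather than a search through each level.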

For their new work, the researchers designed a technique called "Zippads", which leverages the Hotpads architecture to compress objects. When objects first start out in the fastest level, they are uncompressed. When they are evicted to slower levels, they are all compressed. Pointers in all objects, whatever their level, then point to these compressed objects, which makes the objects easy to recall to the faster levels and lets them be stored more compactly than with previous techniques.
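As a rough software analogy for compressing on eviction, the sketch below keeps objects uncompressed in a fast level and compresses them when they move to a slow level. It is not the Zippads hardware: the class `TwoLevelStore` and its method names are hypothetical, and Python's general-purpose `zlib` module stands in for the hardware compressor.

```python
import zlib


class TwoLevelStore:
    """Toy store: raw objects in the fast level, compressed ones in the slow level."""

    def __init__(self) -> None:
        self.fast: dict[str, bytes] = {}  # key -> uncompressed object
        self.slow: dict[str, bytes] = {}  # key -> compressed object

    def allocate(self, key: str, data: bytes) -> None:
        # New objects start uncompressed in the fast level.
        self.fast[key] = data

    def evict(self, key: str) -> None:
        # Objects are compressed only when they leave the fast level.
        self.slow[key] = zlib.compress(self.fast.pop(key))

    def read(self, key: str) -> bytes:
        # Recalling an object decompresses it back into the fast level.
        if key not in self.fast:
            self.fast[key] = zlib.decompress(self.slow.pop(key))
        return self.fast[key]
```

Because a key here names an object no matter which level holds it, recalling the object is a direct lookup followed by decompression.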

A compression algorithm then efficiently exploits redundancy across objects. This uncovers more compression opportunities than previous techniques, which were limited to finding redundancy within each fixed-size block. The algorithm first picks a few representative objects as "bases". Then, for each new object, it stores only the data that differs between that object and its representative base.
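The base-and-difference idea can be illustrated with a minimal sketch. The article does not describe the actual hardware algorithm, so the details here are assumptions: objects are modeled as byte strings of equal length, and the helper names `delta_encode` and `delta_decode` are hypothetical.

```python
def delta_encode(base: bytes, obj: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte) pairs for the positions where obj differs from base.

    Assumption: obj and base have the same length, e.g. two objects of one type.
    """
    if len(base) != len(obj):
        raise ValueError("this sketch only handles objects of equal size")
    return [(i, b) for i, (a, b) in enumerate(zip(base, obj)) if a != b]


def delta_decode(base: bytes, delta: list[tuple[int, int]]) -> bytes:
    """Rebuild the original object from the base and the stored differences."""
    out = bytearray(base)
    for offset, value in delta:
        out[offset] = value
    return bytes(out)
```

For two eight-byte objects that differ in two positions, the encoded form holds just two (offset, byte) pairs, and `delta_decode` restores the object exactly. The more closely an object resembles its base, the less data needs to be stored.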

Brandon Lucia, an assistant professor of electrical and computer engineering at Carnegie Mellon University, praises the work for leveraging features of object-oriented programming languages to better compress memory. "Abstractions such as object-oriented programming are added to a system to simplify programming, but often come at a cost in system performance or efficiency," he says. "What's interesting about this work is that it uses the existing object abstraction as a way to make memory compression more effective, in turn making the system faster and more efficient with novel computer architecture features."
