Technische Universität Braunschweig
Institute for Communications Technology (IfN)

Learned Media Compression

Media compression aims to find a representation that requires as little information (bitrate) as possible and from which the original data can be reconstructed. A distinction is made between lossless and lossy compression: in lossy compression the reconstruction is not necessarily perfect, which allows even lower bitrates. This leads to a trade-off between low bitrate and high reconstruction quality, called the rate-distortion trade-off. The processing steps of a compression system include a lossless or lossy data transformation followed by a quantization step and potentially an entropy coding step, all of which are implemented in classical systems with hand-engineered functions. The resulting bitstream is decompressed at the receiver side by the respective inverse operations. Using deep neural networks, some or all of these processing steps can be learned by jointly minimizing reconstruction distortion and bitrate in a single cost function, which allows them to outperform classical systems. As a basic structure, an encoder and a decoder (together forming an autoencoder) are often learned as approximate inverses of each other.
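The joint cost of bitrate and distortion can be illustrated with a small sketch. It is not the method of any particular system: the uniform scalar quantizer, the empirical-entropy rate estimate, the weight `lam`, and the toy `signal` are all assumptions chosen for illustration, and the learned transform is left out entirely.

```python
import math
from collections import Counter

def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Inverse mapping used at the receiver side."""
    return [i * step for i in indices]

def empirical_rate_bits(indices):
    """Entropy of the index distribution in bits per symbol (a rate estimate)."""
    counts = Counter(indices)
    n = len(indices)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def mse(a, b):
    """Mean squared error as the distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def rd_cost(signal, step, lam):
    """Return (rate + lam * distortion, rate, distortion) for one step size."""
    idx = quantize(signal, step)
    recon = dequantize(idx, step)
    rate = empirical_rate_bits(idx)
    dist = mse(signal, recon)
    return rate + lam * dist, rate, dist

# Toy data (assumed): a coarser step lowers the rate and raises the distortion.
signal = [0.1, 0.4, 0.35, 0.8, 0.75, 0.2, 0.15, 0.9]
for step in (0.05, 0.25, 0.5):
    cost, rate, dist = rd_cost(signal, step, lam=10.0)
    print(f"step={step}: rate={rate:.3f} bits/symbol, mse={dist:.5f}, cost={cost:.3f}")
```

In a learned system the step size is replaced by trainable transforms, and `lam` selects the operating point on the rate-distortion curve.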

 


Speech Compression

In the field of speech compression, a WaveNet-based decoder has been proposed, for example, that synthesizes wideband speech samples from a bitstream generated by a conventional codec. The approach is especially well suited to lower bitrates. Furthermore, it can be shown that a trainable quantization scheme, used to learn a discrete latent representation, produces results comparable to current coding standards, again particularly at low bitrates. Using an end-to-end-learned autoencoder trained on raw speech samples, learned speech compression can even compete with AMR-WB at various bitrates; in this case it is the higher bitrates in particular that benefit, with significant improvements. Last but not least, deep neural networks can serve as speech enhancers for decoded speech after lossy compression.
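A discrete latent representation can be sketched as vector quantization against a codebook. The following is a minimal illustration under assumptions, not the scheme used in the work described above: the codebook values, the latent vectors, and the k-means-style `codebook_update` (standing in for gradient-based training) are all hypothetical.

```python
import math

def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_code(vector, codebook):
    """Index of the codebook entry closest to the given latent vector."""
    return min(range(len(codebook)), key=lambda k: sqdist(vector, codebook[k]))

def vq_encode(latents, codebook):
    """Encoder side: only the indices are transmitted."""
    return [nearest_code(z, codebook) for z in latents]

def vq_decode(indices, codebook):
    """Decoder side: look the indices up in the shared codebook."""
    return [codebook[k] for k in indices]

def codebook_update(latents, indices, codebook):
    """One k-means-style step: move each used entry to the mean of its latents."""
    updated = [list(c) for c in codebook]
    for k in set(indices):
        members = [z for z, i in zip(latents, indices) if i == k]
        updated[k] = [sum(col) / len(members) for col in zip(*members)]
    return updated

# Hypothetical 2-D latents and a 4-entry codebook (2 bits per vector).
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
latents = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]

indices = vq_encode(latents, codebook)
bits_per_vector = math.log2(len(codebook))
print(indices, bits_per_vector)
print(codebook_update(latents, indices, codebook))
```

Because every vector costs the same number of bits, a fixed codebook size gives a fixed bitrate.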

Image Compression

While a fixed bitrate is often required for speech coding, an entropy model is usually used in image compression, so that the exact bitrate depends on the respective input (variable bitrate). Bitrate specifications therefore usually refer to average bitrates. The rate-distortion trade-off can then be set by the relative weighting of distortion and bitrate, both of which are included in the loss function. The quality of the lossy transformation and the quantization determine the distortion, and the entropy model determines the bitrate. Many improvements have been achieved through the architectures used for image compression. For example, skip connections are adopted from conventional autoencoders to increase the reconstruction quality at the expense of a second bottleneck. If this additional side information is included in the entropy modeling, the result is called a hyperprior architecture. For evaluation, various metrics are used that model human perception to varying degrees. Generative adversarial networks (GANs) have been used to create images that look particularly realistic. However, compression is not always intended to produce reconstructions that look good to the human eye. If the images are input data for a subsequent processing step, the performance of that step can also be an important metric. In this case, learned image compression has been shown to offer an advantage over classical codecs.
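The input dependence of the bitrate under an entropy model can be made concrete with a short sketch. The probability table `ENTROPY_MODEL` and the two symbol sequences are assumptions for illustration; in a learned codec the probabilities would come from a trained network rather than a fixed table.

```python
import math

# Hypothetical entropy model: probability of each quantized latent symbol.
ENTROPY_MODEL = {-2: 0.05, -1: 0.2, 0: 0.5, 1: 0.2, 2: 0.05}

def code_length_bits(symbols, model):
    """Ideal code length: the sum of -log2 p(s) over all symbols."""
    return sum(-math.log2(model[s]) for s in symbols)

def average_bits_per_symbol(sequences, model):
    """Average bitrate over several inputs, in bits per symbol."""
    total_bits = sum(code_length_bits(seq, model) for seq in sequences)
    total_symbols = sum(len(seq) for seq in sequences)
    return total_bits / total_symbols

def rd_loss(bits, distortion, lam):
    """Joint loss; lam weights distortion against bitrate."""
    return bits + lam * distortion

# Two assumed inputs of equal length: a smooth one and a detailed one.
smooth = [0, 0, 0, 1, 0, 0, -1, 0]
busy = [2, -1, 1, -2, 0, 2, -1, 1]

print(code_length_bits(smooth, ENTROPY_MODEL))  # fewer bits: likely symbols
print(code_length_bits(busy, ENTROPY_MODEL))    # more bits: unlikely symbols
print(average_bits_per_symbol([smooth, busy], ENTROPY_MODEL))
```

The same model spends fewer bits on the smooth input than on the detailed one, which is why bitrates are reported as averages.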

Four photos of a road traffic scene

Samples from (a) JPEG, (b) JPEG2000, (c) WebP, and (d) GAN compression including quantization in the training process. The effects can be viewed best in color and on a computer screen. 

 

Video Compression

The difference between image and video compression is the inclusion of temporal context, which can significantly reduce the bitrate of video compression compared to single-frame image compression. To model the temporal context, optical flow is often used, which assigns a motion vector to the pixels of each frame. If the optical flow is available, a prediction for the next frame can be generated by motion estimation before the actual compression. Only the deviation between this prediction and the actual next frame is then compressed. On the receiver side, the inverse process is performed in the form of motion compensation. The architectures used to compress the deviation are often very similar to the structures used in image compression. It is also possible to use recurrent networks such as convolutional LSTMs instead of conventional autoencoders, so that the networks learn the temporal context implicitly and store it as an internal state across multiple frames.
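The prediction-plus-deviation idea can be sketched on a one-dimensional "frame". This is a simplified illustration under assumptions: the frames and the integer per-pixel `flow` are made up, the flow is taken as given rather than estimated, and the residual is passed on losslessly, whereas a real codec would compress both the flow and the residual.

```python
def warp(prev_frame, flow):
    """Backward warping: prediction pixel i is read from prev_frame[i - flow[i]]."""
    n = len(prev_frame)
    return [prev_frame[min(max(i - flow[i], 0), n - 1)] for i in range(n)]

def encode_frame(prev_frame, cur_frame, flow):
    """Sender: predict the current frame and keep only the deviation."""
    pred = warp(prev_frame, flow)
    return [c - p for c, p in zip(cur_frame, pred)]

def decode_frame(prev_frame, flow, residual):
    """Receiver: rebuild the same prediction and add the deviation back."""
    pred = warp(prev_frame, flow)
    return [p + r for p, r in zip(pred, residual)]

# Assumed toy frames: a bright object moves two pixels to the right,
# and one pixel changes independently of the motion.
prev_frame = [0, 0, 9, 9, 0, 0, 0, 0]
cur_frame = [0, 0, 0, 0, 9, 9, 0, 1]
flow = [2] * len(prev_frame)

residual = encode_frame(prev_frame, cur_frame, flow)
plain_difference = [c - p for c, p in zip(cur_frame, prev_frame)]
reconstruction = decode_frame(prev_frame, flow, residual)

print(residual)          # mostly zeros, cheap to code
print(plain_difference)  # frame difference without motion information
print(reconstruction == cur_frame)
```

With the motion accounted for, the deviation is nearly all zeros, while the plain frame difference still contains the moved object twice.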

 

Example Video Compression

Publications

[1] Z. Zhao, H. Liu, and T. Fingscheidt, “Convolutional Neural Networks to Enhance Coded Speech,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 4, pp. 663–678, Apr. 2019.

[2] J. Löhdefink, A. Bär, N. M. Schmidt, F. Hüger, P. Schlicht, and T. Fingscheidt, “On Low-Bitrate Image Compression for Distributed Automotive Perception: Higher Peak SNR Does Not Mean Better Semantic Segmentation,” in Proc. of IV, Paris, France, Jun. 2019, pp. 424–431.

[3] J. Löhdefink, F. Hüger, A. Bär, P. Schlicht, and T. Fingscheidt, “Scalar and Vector Quantization for Learned Image Compression: A Study on the Effects of MSE and GAN Loss in Various Spaces,” in Proc. of ITSC, Rhodes, Greece, Sep. 2020, pp. 1–8.

[4] J. Löhdefink, A. Bär, N. M. Schmidt, F. Hüger, P. Schlicht, and T. Fingscheidt, “Focussing Learned Image Compression to Semantic Classes for V2X Applications,” in Proc. of IV, Las Vegas, NV, USA, Oct. 2020, pp. 1–8.
