EE5350 Error Correcting Codes

Table of Contents

Welcome to EE5350 (Error Correcting Codes). This is an elective course for final-year BTech and MTech/PhD students. A background on linear algebra and probability is essential for taking this course. While not mandatory, it is helpful to have some familiarity with information theory and digital communication. The course will involve a good amount of math and programming in C/C++/python.

Error control codes are the most important building blocks of modern digital communication systems, and were a major contributing factor to the cellular revolution. They also have applications in storage (from early CDs to DNA storage, QR codes and RAID systems), cryptography (secret sharing, post-quantum cryptosystems based on ECCs), quantum computation (Quantum ECCs for fault-tolerant quantum computation), just to name a few. In this course, we will discuss the basic principles of error correction from a combinatorial and algebraic perspective, and then talk about ``modern'' codes based on so-called graphical approaches. We will aim to discuss at least four major families of codes widely used in practice: BCH codes, Reed-Solomon codes, Reed-Muller codes, and LDPC codes. All these have a variety of applications, and LDPC codes are a part of major wireless communication standards.

Classes will be offline, and much of our online interaction (homework submissions, announcements, etc) will be on Google classroom. Invites will be sent to registered students by the first week of August. If you have not received an invite by the second class, please send me an email.

Prerequisites

1 Assessment (tentative):

Each student will be expected to

  • attend classes and solve short in-class quizzes
  • participate in class
  • solve a mid-term exam
  • solve homework assignments, roughly 3 (collaboration is fine as long as it is explicitly declared, but you have to write the solutions in your own words)
  • present a paper/do a project/solve a final exam
Homeworks 30%
Attendance and class participation 5%
In-class quizzes 10%
Mid-term exam 25%
Final project/paper presentation/final exam 30%

2 Instructor:

Name Dr. Shashank Vatedka
Email shashankvatedka@ee.iith.ac.in
Office 446, Academic block C

3 Class timings:

  • Slot P: Mondays 14:30-16:00, Thursdays 16:00-17:50
  • Class venue: A-112

4 Primary references:

There is no single textbook for this course. Primarily, the material will be drawn from

  • Lecture notes and slides, uploaded here.
  • Essential Coding Theory, Venkatesan Guruswami, Atri Rudra and Madhu Sudan (GRS), available online
  • Introduction to Coding Theory, Ron Roth (RR), Cambridge University Press, 2006
  • "The generalized distributive law", SM Aji and RJ McEliece, IEEE Transactions on Information Theory, 2000
  • "Factor graphs and the sum-product algorithm", FR Kshischang, BJ Frey and HA Loeliger, IEEE Transactions on Information Theory, 2001.
  • Modern Coding Theory, Tom Richardson and Ruediger Urbanke, Cambridge University Press, 2008. The bible for LDPC codes.

Other references (in no particular order):

  • "A mathematical theory of communication", Claude Shannon, Bell systems Tech Journal, 1948. The classic paper that started this field.
  • "Channel coding: The road to channel capacity", DJ Costello and GD Forney Jr, Proceedings of the IEEE, 2007. Gives a great historical perspective.
  • NPTEL lectures by Prof Vijay Kumar (IISc)
  • NPTEL lectures by Prof Andrew Thangaraj (IITM)
  • Fundamentals of error correcting codes, WC Huffman and V Pless, Cambridge University Press, 2003
  • The theory of error correcting codes, FJ Macwilliams and NJA Sloane, Elsevier/North Holland, 1977
  • Error control coding, S Lin and D Costello (2nd ed), Prentice-Hall, 2004
  • Fundamentals of convolutional coding, R Johanneson and K Zigangirov, IEEE Press, 1999.
  • Information theory, inference, and learning algorithms, David MacKay (soft copy available for free on the author's website). A very nice book, but gives a very different flavour that what will be covered in the course. Suggested for additional reading.
  • A student's guide to coding and information theory, Stefan Moser (amazon link).
  • Finite fields or Introduction to finite fields and their applications, R Lidl and H Neiderreiter (the first is considered by many as the book on finite fields)

5 Tentative syllabus:

Channel models and coding theory, need for error correction; Basics of fields and vector spaces; Codes, rate, decoding; Bounds on the size of codes: GV, singleton, Plotkin, EB and LP bounds; Basics of finite fields: construction and arithmetic; Algebraic codes: BCH, Reed-Solomon and Reed-Muller codes; Codes on graphs: convolutional codes and Viterbi decoding; Generalized distributive law and factor graphs; LDPC codes and their decoding; Message passing algorithms, belief propagation, density evolution and EXIT charts;

6 Tentative schedule

Some of the recorded classes will be made available in this playlist.

Topic Date Notes/links/material Homeworks
Introduction to coding theory   Review basic material on linear algebra/probability  
- Channels, noise, and need for error correction   Linear Algebra by Gilbert Strang  
- Various channel models and error correcting codes   NPTEL course on linear algebra  
- Rate, encoding, decoding, complexity   Chapter 1 of RR  
- The channel coding theorem for DMCs   Chapter 1 of GRS  
- Adversarial models and minimum distance   Class notes 1  
- Recap of linear algebra: fields and vector spaces   Class notes 2  
- Linear codes   Chapter 2 of RR  
Bounds on the size of codes   Chapter 4 of RR  
- Singleton bound, Hamming bound, sphere packing bound, GV bound   Class notes 3  
- Plotkin bound, EB bound   Chapter 2 of Huffman and Vera Pless, "Fundamentals of ECCs"  
- LP bound      
- The Hamming code      
Finite fields and their arithmetic   Class notes 4  
- Fields and polynomials, Euclidean algorithm   Chapter 3 of RR  
- Constructing finite fields      
- Subfields, factorization      
Algebraic codes   Class notes 5  
- Reed-Solomon codes   Chapter 5 of RR  
- BCH codes   Chapter 5 and Chapter 9 of GRS  
- Reed-Muller codes      
- Cyclic codes      
Convolutional codes   Class notes 6  
- Construction and encoding   Lin-Costello  
- Viterbi algorithm      
- BCJR algorithm and Turbo codes (if time permits)      
Generalized distributive law, factor graphs   Aji-McEliece paper  
- Marginalization problem of a sum-of-products      
- Generalized distributive law      
- Factor graphs and message passing algorithms      
- Physics interpretation: Free energies and Bethe approximations (if time permits)      
Low-Density Parity-Check (LDPC) Codes   Class notes 7  
- Construction and encoding   Richardson-Urbanke (chapters 1-3 essentially cover what we need)  
- Constructing the Tanner graph      
- Belief propagation decoding of LDPC codes      
- Density evolution      
- EXIT charts      
Other topics      
- Based on time and interest      

7 Further reading

Coding theory is a vast topic, and hence there are several topics that we will not be able to cover in class.

Good sources for the latest work on channel coding can be found in the IEEE Transactions on Information Theory, IEEE Transactions on Communications, various IEEE conferences including the annual IEEE International Symposium on Information Theory (ISIT), Information Theory Workshop (ITW). Several good preprints are made available online on the Arxiv preprint server.

The following are representative papers. You can go through papers citing/cited by the following as well.

8 General comments regarding presentations/reports

In addition to learning cool stuff, you should also learn to present your work. The final presentation is an opportunity to develop your presenting skills. You must be able to convey your ideas and thoughts clearly to another person. Here are some general comments for making good presentations:

  • Make sure that you acknowledge all the references/material that you used (papers/code/etc.) While it does not matter much for the purposes of this class, this is very important to keep in mind when you are delivering presentations at bigger venues
  • In a presentation, always introduce yourself and your team members/collaborators in the very beginning.
  • If much of the presentation is based on on one/two papers/references, explicitly state which references you are using in the first/second slide.
  • Avoid saying "We did this/we showed that…" when you are in fact talking about some other authors' work. Unless you have derived something/programmed something yourself, always say "they/the authors did/showed/observed that…"
  • A presentation/report is like telling a story: even before preparing the slides/document, imagine that you want to explain the work to a friend in 15-20 mins, identify exactly what you want to convey, and only then proceed.
  • Make sure that you explain the problem clearly.
  • When reading a paper, you should first understand what the problem is, what makes it challenging, and why it is worthwhile solving the problem. Then, you should identify what is new about the paper, and what the contributions are. To understand the rest of the paper, you should then try to understand the constructions/proof techniques/ideas. Your presentation should also bring out these aspects. See this article and this one on how to read a paper.
  • There is no single best rule for presentations, but generally in any presentation, the viewer/audience's attention is maximum in the first 5-10 minutes and then gradually decreases. So it is helpful to get to the punchline in the first few minutes.
  • Citations should be treated differently in papers and slides. When you are making a reference to a particular paper/book/website (for e.g., when you say that result A can be found in paper B) in your slides, mention the paper/book/website as a footnote rather than saying [1], and putting the reference in the last slide. 
  • Do not put too much text on your slides. This is bad practice. Each slide should contain at most 3-5 points. You should absolutely avoid long sentences. 
  • Do not copy text directly from the paper.
  • Do not read out your slides word-to-word.
  • Similarly, do not put too many lemmas in a slide. Each slide should convey a single idea. Unless very important, do not put too much math/long derivations in a short presentation. It is more important to convey ideas/intuition. But at the same time, there should be a significant amount of technical content. It is therefore very useful to instead explain using examples.
  • Make sure that the fonts/diagrams/labels/plots are clear. Label all the axes on your plots.
  • Speak clearly, and make sure that the audio quality is fine. Check your mic volume. Do a couple of dry runs.
  • If you are using Google meet to record, then make sure that you "pin" the slides since meet might automatically switch displaying you/your icon instead of the presentation. 
  • Make sure that you check, and double check the slides and report for typos, technical and grammatical errors.
  • If your group has more than one person, then the presentation must be shared equally.

Author: Shashank Vatedka

Created: 2022-11-04 Fri 11:35