Row-Major and Column-Major Storage
If I had to sum up row major and column major, the easiest way to remember it is this.
1. row major is a Z pattern. Think typewriter or reading. You read left to right and then start over.
This is row major. Column major is the russian e sound which looks like a backwards N.
The Z and N pattern respectively refer to the fact that the lines of text are straight lines
and then your eyes must follow a return path to start the next line.
C M R L T
O A E I H
L J A K I
U O D E S
M R S
N
When you iterate over a row of a matrix, which index to you iterate over? You iterate over the column index of the matrix. Take an array and print it out. In a column major configuration, the row index is multiplied by the number of columns. I like to call the jump size. If you know you are on the third row of a 5 x 5 matrix, you can you quickly jump to the second element of the 3rd row. Well, you take 2 + 3*5 = 17. This would be the 17th element in contiguous memory.
Matrix storage order determines how multidimensional arrays are laid out in memory. Understanding row-major and column-major storage is crucial for writing efficient numerical code, as it directly impacts cache performance and memory access patterns.
What is Storage Order?
When a matrix is stored in computer memory, it must be flattened into a one-dimensional array. The storage order determines how the two-dimensional structure maps to this linear memory layout.
Consider a matrix:
Row-Major Order (C/C++ Style)
In row-major order, elements are stored row by row. The memory layout for the matrix above would be:
a00 a01 a02 a03 a10 a11 a12 a13 a20 a21 a22 a23 To access element in a row-major matrix with columns, the index is:
Used by: C, C++, Python (NumPy default), most modern languages
Column-Major Order (Fortran/Julia Style)
In column-major order, elements are stored column by column. The memory layout for the same matrix would be:
a00 a10 a20 a01 a11 a21 a02 a12 a22 a03 a13 a23 To access element in a column-major matrix with rows, the index is:
Used by: Fortran, MATLAB, Julia, R, BLAS/LAPACK
Performance Implications
The storage order has significant performance implications due to CPU cache behavior:
- Cache locality: Accessing elements in the order they're stored in memory (sequential access) is much faster than random access
- Row-major: Iterating over rows is fast; iterating over columns is slow
- Column-major: Iterating over columns is fast; iterating over rows is slow
Example: Matrix-Vector Multiplication
For , where is :
- Row-major storage: Compute by iterating over (rows) - this is efficient
- Column-major storage: Compute by iterating over (columns) - this is efficient
BLAS and LAPACK
The BLAS (Basic Linear Algebra Subprograms) and LAPACK libraries use column-major storage, following Fortran conventions. This is why:
- Many high-performance linear algebra libraries expect column-major matrices
- When calling BLAS/LAPACK from C/C++, you may need to transpose or use different indexing
- Some libraries (like NumPy) provide automatic conversion, but this can introduce overhead
Practical Considerations
In C/C++
C and C++ use row-major order. When interfacing with BLAS/LAPACK, you may need to:
- Transpose matrices before calling BLAS routines
- Use the transpose flags in BLAS calls (e.g.,
cblas_dgemmwithCblasTrans) - Consider the performance cost of transposition
In Python (NumPy)
NumPy arrays are row-major by default, but you can specify column-major using:
import numpy as np
# Row-major (default, C-style)
A = np.array([[1, 2, 3], [4, 5, 6]], order='C')
# Column-major (Fortran-style)
B = np.array([[1, 2, 3], [4, 5, 6]], order='F')
# Check storage order
print(A.flags['C_CONTIGUOUS']) # True for row-major
print(B.flags['F_CONTIGUOUS']) # True for column-major