Row-Major and Column-Major Storage

If I had to sum up row major and column major, the easiest way to remember it is this. 1. row major is a Z pattern. Think typewriter or reading. You read left to right and then start over. This is row major. Column major is the russian e sound which looks like a backwards N. The Z and N pattern respectively refer to the fact that the lines of text are straight lines and then your eyes must follow a return path to start the next line.
C M R L T
O A E I H
L J A K I
U O D E S
M R S
N

When you iterate over a row of a matrix, which index to you iterate over? You iterate over the column index of the matrix. Take an array and print it out. In a column major configuration, the row index is multiplied by the number of columns. I like to call the jump size. If you know you are on the third row of a 5 x 5 matrix, you can you quickly jump to the second element of the 3rd row. Well, you take 2 + 3*5 = 17. This would be the 17th element in contiguous memory.

Matrix storage order determines how multidimensional arrays are laid out in memory. Understanding row-major and column-major storage is crucial for writing efficient numerical code, as it directly impacts cache performance and memory access patterns.

What is Storage Order?

When a matrix is stored in computer memory, it must be flattened into a one-dimensional array. The storage order determines how the two-dimensional structure maps to this linear memory layout.

Consider a $3 \times 4$ matrix:

A = a_{00} a_{10} a_{20} a_{01} a_{11} a_{21} a_{02} a_{12} a_{22} a_{03} a_{13} a_{23}

Row-Major Order (C/C++ Style)

In row-major order, elements are stored row by row. The memory layout for the matrix above would be:

 
a00 a01 a02 a03 a10 a11 a12 a13 a20 a21 a22 a23  

To access element $a_{ij}$ in a row-major matrix with $n$ columns, the index is:

index = i \times n + j

Used by: C, C++, Python (NumPy default), most modern languages

Column-Major Order (Fortran/Julia Style)

In column-major order, elements are stored column by column. The memory layout for the same matrix would be:

 
a00 a10 a20 a01 a11 a21 a02 a12 a22 a03 a13 a23  

To access element $a_{ij}$ in a column-major matrix with $m$ rows, the index is:

index = j \times m + i

Used by: Fortran, MATLAB, Julia, R, BLAS/LAPACK

Performance Implications

The storage order has significant performance implications due to CPU cache behavior:

Cache locality: Accessing elements in the order they're stored in memory (sequential access) is much faster than random access
Row-major: Iterating over rows is fast; iterating over columns is slow
Column-major: Iterating over columns is fast; iterating over rows is slow

Example: Matrix-Vector Multiplication

For $y = Ax$ , where $A$ is $m \times n$ :

Row-major storage: Compute $y_{i} = \sum_{j} a_{ij} x_{j}$ by iterating over $i$ (rows) - this is efficient
Column-major storage: Compute $y_{i} = \sum_{j} a_{ij} x_{j}$ by iterating over $j$ (columns) - this is efficient

BLAS and LAPACK

The BLAS (Basic Linear Algebra Subprograms) and LAPACK libraries use column-major storage, following Fortran conventions. This is why:

Many high-performance linear algebra libraries expect column-major matrices
When calling BLAS/LAPACK from C/C++, you may need to transpose or use different indexing
Some libraries (like NumPy) provide automatic conversion, but this can introduce overhead

Practical Considerations

In C/C++

C and C++ use row-major order. When interfacing with BLAS/LAPACK, you may need to:

Transpose matrices before calling BLAS routines
Use the transpose flags in BLAS calls (e.g., cblas_dgemm with CblasTrans)
Consider the performance cost of transposition

In Python (NumPy)

NumPy arrays are row-major by default, but you can specify column-major using:

import numpy as np

# Row-major (default, C-style)
A = np.array([[1, 2, 3], [4, 5, 6]], order='C')

# Column-major (Fortran-style)
B = np.array([[1, 2, 3], [4, 5, 6]], order='F')

# Check storage order
print(A.flags['C_CONTIGUOUS'])  # True for row-major
print(B.flags['F_CONTIGUOUS'])  # True for column-major

Summary

Aspect	Row-Major	Column-Major
Storage	Row by row	Column by column
Fast Iteration	Over rows	Over columns
Languages	C, C++, Python (NumPy)	Fortran, MATLAB, Julia, R
BLAS/LAPACK	May need transpose	Native format