The right way to loop in Python (code included)

Utpal Kumar

What is the fastest and most memory-efficient way to loop in Python? In our tests, NumPy was the fastest and the Python built-ins were the most memory efficient.

Introduction

Since Python by itself is slow, it becomes important to know the nitty-gritty of the different components of our code in order to write efficient programs. In this post, we look at the most common ways to loop in Python using a simple summing example. We also profile the memory usage of each approach to see which one is the most memory efficient for analyzing huge datasets.

The while loop

import timeit

nval = 1000000


# usual while loop
def while_loop(n=nval):
    i, sumval = 0, 0
    while i < n:
        sumval += 1
        i += 1

    return sumval


if __name__ == "__main__":
    # total time for 10 calls of while_loop
    print(f"while_loop: {timeit.timeit(while_loop, number=10):.6f}s")

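This prints while_loop: 0.727578s, the total time for the 10 calls. Note that this loop adds 1 on every iteration, so it returns n rather than the sum 0 + 1 + ... + (n - 1) computed by the other examples; the number of iterations is the same. We can also profile the memory usage of this function.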

import timeit
import numpy as np
from memory_profiler import profile

nval = 1000000

# usual while loop
@profile(precision=4)
def while_loop(n=nval):
    i, sumval = 0, 0
    while i < n:
        sumval += 1
        i += 1

    return sumval


if __name__ == "__main__":
    while_loop()
  

This returns:

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    10  25.8984 MiB  25.8984 MiB           1   @profile(precision=4)
    11                                         def while_loop(n=nval):
    12  25.8984 MiB   0.0000 MiB           1       i, sumval = 0, 0
    13  25.9727 MiB   0.0000 MiB     1000001       while i < n:
    14  25.9727 MiB   0.0625 MiB     1000000           sumval += 1
    15  25.9727 MiB   0.0117 MiB     1000000           i += 1
    16                                         
    17  25.9727 MiB   0.0000 MiB           1       return sumval

In total, the while loop increased memory usage by 0.0743 MiB (from 25.8984 MiB to 25.9727 MiB) for the above task.
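As a cross-check that needs no third-party package, the standard library's tracemalloc module can report the peak memory allocated by Python while a function runs. The sketch below is an assumption on our part and not the method used for the numbers above: the helper name peak_memory_mib is hypothetical, and tracemalloc only tracks allocations made through Python's allocator, so its figures will not match the process-level numbers of memory_profiler.

```python
import tracemalloc


def while_loop(n=1_000_000):
    i, sumval = 0, 0
    while i < n:
        sumval += 1
        i += 1
    return sumval


def peak_memory_mib(func, *args):
    """Run func(*args) and return (result, peak traced memory in MiB).

    Hypothetical helper, not part of the original post.
    """
    tracemalloc.start()
    try:
        result = func(*args)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / 2**20


if __name__ == "__main__":
    result, peak = peak_memory_mib(while_loop, 100_000)
    print(f"result={result}, peak={peak:.4f} MiB")
```

Tracing slows the loop down noticeably, so timing and memory measurements are best taken in separate runs.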

The for loop

import timeit

nval = 1000000


# usual for loop
def for_loop(n=nval):
    sumval = 0
    for i in range(n):
        sumval += i
    return sumval


if __name__ == "__main__":
    # total time for 10 calls of for_loop
    print(f"for_loop: {timeit.timeit(for_loop, number=10):.6f}s")

This prints for_loop: 0.490051s. Profiling the memory usage of this function in the same way gives:

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    22  25.9922 MiB  25.9922 MiB           1   @profile(precision=4)
    23                                         def for_loop(n=nval):
    24  25.9922 MiB   0.0000 MiB           1       sumval = 0
    25  26.0273 MiB   0.0117 MiB     1000001       for i in range(n):
    26  26.0273 MiB   0.0234 MiB     1000000           sumval += i
    27  26.0273 MiB   0.0000 MiB           1       return sumval

In total, the for loop increased memory usage by 0.0351 MiB for the above task.
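The timings in this post are totals over number = 10 calls from a single measurement. A somewhat more robust figure can be obtained with timeit.repeat, taking the minimum of several measurements and dividing by the number of calls. The following is a minimal sketch under that assumption; the helper name best_time_per_call and the repeat count are our own choices and were not used for the numbers reported here.

```python
import timeit


def for_loop(n=1_000_000):
    sumval = 0
    for i in range(n):
        sumval += i
    return sumval


def best_time_per_call(func, number=10, repeat=5):
    """Return the best observed time (in seconds) for one call of func.

    Hypothetical helper: timeit.repeat returns one total per repetition,
    each total covering `number` calls.
    """
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


if __name__ == "__main__":
    print(f"for_loop: {best_time_per_call(for_loop, number=3, repeat=3):.6f}s per call")
```

The minimum is used because slower repetitions mostly reflect interference from other processes rather than the code being measured.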

The built-in Python function

import timeit

nval = 1000000


# using built-in sum
def builtinsum(n=nval):
    return sum(range(n))


if __name__ == "__main__":
    # total time for 10 calls of builtinsum
    print(f"builtinsum: {timeit.timeit(builtinsum, number=10):.6f}s")

This prints builtinsum: 0.175238s. The memory profile of this function is:

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    46  25.8867 MiB  25.8867 MiB           1   @profile(precision=4)
    47                                         def builtinsum(n=nval):
    48  25.8906 MiB   0.0039 MiB           1       return sum(range(n))

In total, the function based on the built-in sum increased memory usage by 0.0039 MiB for the above task.
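The small increment is consistent with range being lazy: a range object stores only its start, stop and step, and sum consumes the values one at a time. A quick illustration, assuming CPython (the exact byte counts depend on the interpreter and platform, so none are quoted here):

```python
import sys

n = 1_000_000

lazy = range(n)          # stores only start, stop and step
eager = list(range(n))   # stores a reference to every element

print(f"range object: {sys.getsizeof(lazy)} bytes")
print(f"list object:  {sys.getsizeof(eager)} bytes (excluding the int objects)")

# Both give the same sum; only the list holds all values at once.
assert sum(lazy) == sum(eager) == n * (n - 1) // 2
```

The size of the range object does not grow with n, which is why the profile barely changes.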

The NumPy function

import timeit
import numpy as np

nval = 1000000


# using numpy sum
def numpysum(n=nval):
    return np.sum(np.arange(n))


if __name__ == "__main__":
    # total time for 10 calls of numpysum
    print(f"numpysum: {timeit.timeit(numpysum, number=10):.6f}s")

This prints numpysum: 0.017640s. The memory profile of this function is:

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
    53  25.9766 MiB  25.9766 MiB           1   @profile(precision=4)
    54                                         def numpysum(n=nval):
    55  33.6172 MiB   7.6406 MiB           1       return np.sum(np.arange(n))

In total, the NumPy-based function increased memory usage by 7.6406 MiB for the above task.
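An increment of about 7.64 MiB is close to what the temporary array alone should occupy, assuming the integers are stored as 64-bit values (the NumPy default on most Linux and macOS systems; this is an assumption about the machine used for the measurements). The sketch below sets the dtype explicitly to make that assumption visible:

```python
import numpy as np

n = 1_000_000
arr = np.arange(n, dtype=np.int64)  # dtype fixed here as an assumption

# nbytes is the size of the array data: n elements * 8 bytes each
size_mib = arr.nbytes / 2**20
print(f"{arr.nbytes} bytes = {size_mib:.4f} MiB")
```

One million 8-byte integers take 8,000,000 bytes, or about 7.63 MiB, which accounts for nearly all of the measured increment.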

Conclusions

Please note that the run times and memory usage will differ from system to system, but the ratios between the different methods should stay broadly similar.

We found that NumPy is the fastest (0.017640s) and the while loop is the slowest (0.727578s), roughly 41 times slower. The while loop is slow because every step (the comparison, the addition and the counter increment) is executed by the Python interpreter. NumPy, in contrast, performs the summation in compiled C code, so it runs much faster.
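To put the four timings on a common scale, the snippet below divides each reported time by the fastest one. It only uses the numbers quoted in this post; the dictionary and variable names are our own.

```python
# Timings reported in this post (total seconds for 10 calls each).
timings = {
    "while_loop": 0.727578,
    "for_loop": 0.490051,
    "builtinsum": 0.175238,
    "numpysum": 0.017640,
}

fastest = min(timings.values())
slowdown = {name: t / fastest for name, t in timings.items()}

for name, ratio in slowdown.items():
    print(f"{name:>10}: {ratio:5.1f}x the NumPy time")
```

On these numbers the while loop is about 41 times slower than NumPy, the for loop about 28 times and the built-in sum about 10 times.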

In terms of memory usage, NumPy is the worst: it used about 7.6 MiB, because np.arange first creates the whole array in memory. In contrast, the function based on the built-in sum is the most memory efficient, because range produces its values one at a time instead of storing them all in memory.
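When the dataset is too large to hold in memory, one possible compromise is to process it in chunks, so that NumPy does the arithmetic but only one chunk exists at a time. The following is a minimal sketch of that idea, not something benchmarked in this post; the function name chunked_numpy_sum and the default chunk size are hypothetical.

```python
import numpy as np


def chunked_numpy_sum(n, chunk_size=100_000):
    """Sum 0..n-1 with NumPy while holding at most chunk_size values at once."""
    total = 0
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # int64 is set explicitly so the partial sums do not overflow
        chunk = np.arange(start, stop, dtype=np.int64)
        # accumulate in a Python int, which has arbitrary precision
        total += int(chunk.sum())
    return total


if __name__ == "__main__":
    print(chunked_numpy_sum(1_000_000))
```

The chunk size trades memory for speed: smaller chunks use less memory but spend more time in the Python-level loop.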

If we compare the while and for loops, the for loop is both faster and more memory efficient. Hence, the for loop should be our first choice (and usually is) unless we do not know the number of iterations in advance.
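A case where the number of iterations is not known in advance could look like the following: keep adding consecutive integers until the running total exceeds a limit. This is an illustrative sketch only; the function name terms_until_exceeds is hypothetical.

```python
def terms_until_exceeds(limit):
    """Add 0, 1, 2, ... until the running total exceeds limit.

    Returns (number of terms added, final total). The loop count depends
    on the data, so a while loop is the natural choice here.
    """
    i, total = 0, 0
    while total <= limit:
        total += i
        i += 1
    return i, total


if __name__ == "__main__":
    count, total = terms_until_exceeds(10)
    print(f"added {count} terms, total = {total}")
```

With a limit of 10, the loop adds the six terms 0 through 5 and stops at a total of 15.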

References

  1. The fastest way to loop in Python - An Unfortunate truth
