Recommended reading to get started with Python for science and data analysis
I get asked a lot in the Fatiando a Terra
how to do some basic Python and numpy tasks which are not necessarily related
The most common question is some variant of "I have some data in a csv/txt/xyz
file and I want to load it into Python".
I think this happens because a lot of people find the project while searching
for a replacement for GUI-based commercial software,
like Geosoft's Oasis Montaj,
but they don't necessarily know Python.
So instead of writing yet another email,
I decided to "Reply to public"
Before you start, try to find a task or problem that you would like to
solve using programming.
Then learn what you need to solve this problem.
This will let you apply the knowledge you gain straight away.
Plus, you get something useful by the end of your studies.
If you don't have anything in mind, here are a few ideas:
- Load data from several text files and make a plot for each one.
- Extract a section of data from several files and compile them into a
- Download data from an online source and fit a trend line to it (for example,
the temperature data from
my Python workshop at UH).
- Implement a common (and simple) method in your field that is only available
in commercial software.
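To give a feel for how small these tasks can be, here is a minimal sketch of loading a text file and fitting a trend line with numpy. The data are made up for illustration, and an in-memory `io.StringIO` object stands in for a real file so the example runs anywhere; with your own data you would pass the file name to `np.loadtxt` instead.

```python
import io

import numpy as np

# Hypothetical stand-in for a small two-column text file (made-up values).
# With a real file, pass its path to np.loadtxt instead of this object.
fake_file = io.StringIO(
    "# year temperature\n"
    "2000 14.0\n"
    "2001 14.5\n"
    "2002 15.0\n"
    "2003 15.5\n"
)

# loadtxt skips lines starting with "#" by default.
# unpack=True returns one array per column instead of a single 2D array.
year, temperature = np.loadtxt(fake_file, unpack=True)

# Fit a straight line (degree-1 polynomial) to the data:
# temperature = slope * year + intercept
slope, intercept = np.polyfit(year, temperature, deg=1)
trend = slope * year + intercept

print(f"Trend: {slope:.2f} degrees per year")
```

For comma-separated files, `np.loadtxt` accepts a `delimiter=","` argument. Plotting `year` against `temperature` and `trend` would be a natural next step.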
Finally, here are my recommendations (in order):
- Don't start with a general Python programming book/site/tutorial/video.
They are usually meant for programmers, not scientists. You'll have to wade
through mountains of string formatting and Fibonacci numbers before you find
an answer to "How do I load these data?" (and it won't be
- Ignore any reference that uses Python 2. All support for Python 2.7
will end in 2020 and there is no reason to still be using it.
- Start with the Software Carpentry
lessons. Read all of them if you
can. Everything there will make your life easier. If you don't have the
time, focus on "Programming with Python" and "Version Control with git".
The lessons are not detailed but instead show you what is out
there and what you should type into Google.
- After you whet your appetite, dive deeper into more detailed material. I
highly recommend the Scipy Lecture Notes.
If you like the feeling of paper in your hands and have some money to spare,
try the books
Effective Computation in Physics
by Anthony Scopatz and
and Python Data Science Handbook
by Jake VanderPlas.
There is also Think Python by
the excellent Allen Downey, which is available
for free if you don't have the option to purchase.
Beware that I have not read these books so I cannot vouch for them
(but I have read good reviews online).
- Now that you have the basics down, use the project documentation pages
to find specific instructions for what you want, for example the
numpy documentation and the
- If you like games and want to learn useful general Python skills, try the
From now on, learning new things will be a continuous process. I've been
programming in Python for 10 years and every once in a while I'll still learn
something new, usually something that reduces the amount of code I have to write
(less code = fewer bugs).
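As a small, made-up illustration of what "less code" can look like, the sketch below computes the same average twice: once with a hand-written loop and once with a single numpy call. The numbers are arbitrary and only there to make the example runnable.

```python
import numpy as np

# Arbitrary example values.
values = [2.0, 4.0, 6.0, 8.0]

# Loop version: more lines, so more places for a mistake to hide.
total = 0.0
for value in values:
    total += value
mean_loop = total / len(values)

# numpy version: one call does the same job.
mean_numpy = np.mean(values)

print(mean_loop, mean_numpy)
```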
The key is to stay informed and you can do that by subscribing to some (or all)
of the following:
Now go out there and learn a skill that just might save you in these times of
How did you get started with Python? Do you have anything to add to this list?
Let me know!
Update (2017-11-13): Added "Before you start" and the Python Challenge.
Update (2018-12-17): Added "Think Python" and a warning about Python 2.
The open science logo is by G.emmerich on Wikimedia
and the picture of the Python book is by
Because both are licensed CC-BY-SA, so is the thumbnail image for this
blog post (a composite of the two images).