Design ideas and goals for the GMT Python interface
As you may already know, I’m away on a postdoct writing a Python interface for the Generic Mapping Tools. Recently, I started laying out our goals for the project and some of my design ideas. This all lives on the GenericMappingTools/gmt-python GitHub repository, which is where the code will eventually be as well. I thought it would be good to post it here as well to have a snapshot of this phase of the project for future reference.
I have also submitted a talk proposal for Scipy 2017 about the project.
- Provide access to GMT modules from Python using the GMT C API (no system calls).
- Input and output using Python native containers: numpy
DataFramefor data tables and xarray
Datasetfor netCDF grids.
- Integration with the Jupyter notebook to display plots and maps inline.
- API design familiar for veteran GMT users (arguments
J, etc) with more newbie-friendly alternatives/aliases (
region=[10, 20, -30, -10],
To my knowledge, there have been 3 attempts at a GMT Python interface:
gmtpy has received commits since 2014 and is the more mature
alternative. However, the project doesn’t seem to be very
PyGMT use system class (through
pass input and output through
pygmt seems to call
the GMT C API directly through a hand-coded Python C extension. This
might compromise the portability of the package across operating systems
and makes distribution very painful.
gmt-python is made for the future. We will support only Python 3.5
or later and require the new “modern” mode of
(currently only in the
trunk of the SVN repository). The
removes the need for
-O -K and explicitly redirecting to a
.ps file. This
all happens in the background. A final call to
gmt psconvert brings the plot
out of hiding and finalizes the Postscript. This mode is perfect for the Python
interface, which would have to handle generation of the Postscript file in the
We will wrap the GMT C API using the
ctypes module of the
Python standard library.
ctypes grants access to C data types and
foreign functions in DDLs and shared libraries, making it possible to
wrap these libraries with pure Python code. Not having compiled modules
makes packaging and distribution of Python software a lot easier.
Wrappers for GMT data types and C functions will be implemented in a
lower level wrapper library. These will be direct
ctypes wrappers of
the GMT module functions and any other function that is needed on the
Python side. The low-level functions will not handle any data type
conversion or setting up of argument list.
We’ll also provide higher level functions that mirror all GMT modules. These functions will be built on top of the low-level library and will handle all data conversions and parsing of arguments. This is the part of the library with which the user will interact (the GMT Python API).
The GMT Python API
Each GMT module has a function in the
gmt package. Command-line
arguments are passes as function keyword arguments. Data can be passed
as file names or in-memory data.
The simplest usage would be with data in a file and generating a PDF output figure, just as a normal GMT script:
import gmt cpt = gmt.makecpt(C='cubhelix', T=[-4500, 4500]) gmt.grdimage(input='grid.nc', J='M6i', B='af', P=True, C=cpt) gmt.psscale(C=cpt, D='jTC+w6i/0.2i+h+e+o0/1i', B='af') gmt.psconvert(T='f', F='my-figure')
Arguments can also be passed as in the GMT command-line by using a single string:
import gmt gmt.makecpt('-Ccubhelix -T-4500/4500', output='my.cpt') gmt.grdimage('grid.nc -JM6i -Baf -P -Cmy.cpt') gmt.psscale('-Cmy.cpt -DjTC+w6i/0.2i+h+e+o0/1i -Baf') gmt.psconvert('-Tf -Fmy-figure')
Notice that output that would be redirected to a file is specified using
output keyword argument.
You can also pass in data from Python. Grids in netCDF format are passed
Datasets that can come from a netCDF file or generated in
import gmt import xarray as xr data = xr.open_dataset('grid.nc') cpt = gmt.makecpt(C='cubhelix', T='-4500/4500') gmt.grdimage(input=data, J='M6i', B='af', P=True, C=cpt) gmt.psconvert(T='f', F='my-figure')
Tabular data can be passed as numpy arrays:
import numpy as np import gmt data = np.loadtxt('data_file.csv') cpt = gmt.makecpt(C="red,green,blue", T="0,70,300,10000") gmt.pscoast(R='g', J='N180/10i', G='bisque', S='azure1', B='af', X='c') gmt.psxy(input=data, S='ci', C=cpt, h='i1', i='2,1,3,4+s0.02') gmt.psconvert(T='f', F='my-figure')
In the Jupyter notebook, we can preview the plot by calling
gmt.show(), which embeds the image in the notebook:
import numpy as np import gmt data = np.loadtxt('data_file.csv') cpt = gmt.makecpt(C="red,green,blue", T="0,70,300,10000") gmt.pscoast(R='g', J='N180/10i', G='bisque', S='azure1', B='af', X='c') gmt.psxy(input=data, S='ci', C=cpt, h='i1', i='2,1,3,4+s0.02') gmt.show()
gmt.show will call
psconvert in the background to get a PNG image
back and use
IPython.display.Image to insert it into the notebook.
TODO: We’re still thinking of the best way to call
first to generate a high-quality PDF and right after call
for an inline preview. The issue is that
psconvert deletes the
temporary Postscript file that was being constructed on the background,
this calling it a second time through
gmt.show() would not work. Any
suggestions are welcome!
The general layout of the Python package will probably look something like this:
gmt/ c_api/ ## Package with low-level wrappers for the C API ... modules/ ## Defines the functions corresponding to GMT modules ...
The module functions
The functions corresponding to GMT modules (
are how the user interacts with the Python API. They will be organized
in different files in the
gmt.modules package but will all be
accessible from the
gmt package namespace. For example,
gmt/modules/ps_generating.py but can be called as
Here is what a module function will look like:
def module_function(**kwargs): """ Docstring explaining what each option is and all the aliases. Likely derived from the GMT documentation. """ ## Convert any inputs into things the C API can digest ... ## Parse the keyword arguments and make an "args" list ... ## Call the module function from the C API with the inputs ... ## Process any outputs from the C API into Python data types ... return output
We will automate this process as much as possible:
- Common options in the docstrings can be reused from an
- Parsing of common arguments (R, J, etc) can be done by a function.
- Creating the GMT session and calling the module can be automated.
- Conversion of inputs and outputs will most likely be: tables to numpy arrays,
grids to xarray
Datasets, text to Python text.
Most of the work in this part will be wrapping all of the many GMT
modules, parsing non-standard options, and making sure the docstrings
are accurate. It might even be possible to automatically generate the
docstrings or parts of them from the command-line help messages by
passing a Python callback as the
print_func when creating a GMT
The low-level wrappers
The low-level wrapper functions will be bare-bones
functions from the
libgmt.so shared library. The functions can be
accessed from Python like so:
import ctypes as ct libgmt = ct.cdll.LoadLibrary("libgmt.so") ## Functions are accessed as members of the 'libgmt' object GMT_Call_Module = libgmt.GMT_Call_Module ## Call them like normal Python functions GMT_Call_Module(... inputs ...)
The tricky part is making sure the functions get the input types they
ctypes provides access to C data types and a way to specify the
data type conversions that the function requires:
GMT_Call_Module.argstypes = [ct.c_void_p, ct.c_char_p, ct.c_int, ct.c_void_p]
This is fine for standard data types like
char, etc, but will
need extra work for custom GMT
struct. These data types will need to
be wrapped by Python classes that inherit from
gmt.c_api module will expose these foreign functions (with output
and input types specified) and GMT data types for the modules to use.
The main entry point into GMT will be through the
function. This is what the
gmt command-line application uses to run a
given module, like
GMT_pscoast for example. We will use it to run the
modules from the Python side as well. It has the following signature:
int GMT_Call_Module (void *V_API, const char *module, int mode, void *args)
args (the command-line argument
list) are plain C types and can be generated easily using
Python module code will need to generate the
args array from the given
function arguments. The
V_API argument is a “GMT Session” and is
created through the
GMT_Create_Session function, which will have to be
wrapped as well.
The input and output of Python data will be handled through the GMT
virtual file machinery. This allows us to write data to a memory
location instead of a file without GMT knowing the difference. For
input, we can use
GMT_Open_VirtualFile and point it to the location in
memory of the Python data, for example using
We can also translate the Python data into
ctypes compatible types.
The virtual file pointer can also be passed as the output option for the
module, for example as
-G or through redirection (
->). We can read
the contents of the virtual file using
There are gonna be some rough edges on the C API that will have to get sorted before all of this is usable. The API is new (from 2013) and hasn’t been much used by third-party libraries. Some of the details aren’t documented and require diving into the GMT source code or having access to a GMT guru, like I have. Hopefully this work will make it more robust and new GMT wrappers can be made for other languages without so much effort.
All of this work in its very early stages and I’d love to get some feedback and ideas! You can leave a comment below or create an issue on the GitHub repository.