Protostar Formation – Infographic

Infographic of the formation of protostars. Based on An Introduction to Modern Astrophysics by Carrol and Ostlie.

Infographic of the formation of protostars. Based on An Introduction to Modern Astrophysics by Carrol and Ostlie.

Took me a while to do that. I hope that you like what I did.

In some respects, I am not happy. I wanted to fit much, much more info. But I saw very soon that that was not going to be possible, at least in the way in which I originally conceived the infographic.

Adding more information without cluttering up too much would surely ask for a change in format. I tried to keep it as simple as possible.

In addition, my original idea was quite different from this. I had in mind something more hand made, more organic. However I also wanted to try Illustrator out, which gives a different feeling. At least in my hands.

Only one explanation is pertinent:

I used the greek letter Tau to represent optical depth, which is a measure of opacity (see also Stellar Atmospheres).

Other symbols are more obvious. L stands for luminosity, T is temperature, D is deuterium, H is hydrogen.

If you are looking at this, I’d truly appreciate your opinion about the infographic.


Posted in Uncategorized | Tagged , , , , , , , , , , , | 1 Comment

Protostar formation – mindmap

Study mindmap for protostar formation - based on "An Instroduction to Modern Astrophysics" by Carrol and Ostlie

Study mindmap for protostar formation – based on “An Introduction to Modern Astrophysics” by Carrol and Ostlie

Edit: see the infographic HERE

Why this, actually?

Once upon a time, I had an astrophysics exam. I took this book and put myself to work. I read the relevant chapters carefully, and crafted summaries and mindmaps for each block of content. In the end, I overstudied (!), since the exam ended up being far, far easier than I expected.

The book is truly excellent in content quality and clarity. Still, it was a long and daunting journey, and perhaps I’d have stopped before the end if I didn’t have that exam. In the end, I can say it was worth it, because astrophysics is a truly exciting topic by itself.

Often one can get a fair glimpse of this grand topic, which you can typically get by watching a well done documentary (some episodes of the new series Cosmos were very remarkable). However, the details become invisible by this approach. With my posts, I’d like to help you to get closer to the details, without asking you to spend long hours reading the book by Carrol and Ostlie. Still, if you find the time and have the passion, I HIGHLY encourage you to read that book, which could be almost considered as the bible of astrophysics.

One way of conveying the information is via mindmaps like the one above (also take a look at Toni’s Mindmap book).

Another (obvious) way is the good old classic blogpost, like any one from IFLS.

A third way that  I find particularly attractive: the infographic. As a matter of fact, I am preparing an infographic about protostars formation, with content similar to the one above but much, much more detailed. You can find a bunch of science infrographics in pinterest (although I have more in mind the illustrations of awesome educational books that I had as a child).

Now a few things to say about the mindmap. In fact, you could only understand it by having previously permeated yourself with the topic. The mindmap’s goal is not to understand and learn, its goal is to connect and memorize information effectively.

The mindmap above is designed following a few key ideas:

  • Information is tiered in 3 levels, from general to specific
  • The upper tier is marked in blue and corresponds to the central topic
  • The mid tier (subtopics within the main central topic) is indicated by red.
  • The bottom tier (aspects of each subtopic) is indicated in green.
  • Arrows indicate chronological order.
  • Grey boxes show content which complements and expands the concept to which they are linked
  • Black lines around boxes are meant so that a greyscale version of the mindmap is still usable. Notice that blue has all sides in black, red the upper and bottom sides, and green only the bottom one, respectively blackened.

Let me repeat: the mindmap is not meant to stand alone since it is not understood without a previous knowledge in the area. Additional content will come and I’ll link to it here and in the “Astrophysics” section, in the index.

See you!

Posted in Uncategorized | Tagged , , , , , , , , | Leave a comment

Animated astronomy video

I just found this video in YouTube, created by the same people who makes PhD comics.

After watching it I had to share it in here. This is good stuff. Albeit not thorough, frenetically fast, deceptively messy, and beautifully executed. Really cute stuff.

Watch it now.

Posted in Uncategorized | Tagged , , , | Leave a comment

Carolyn Porco: This is Saturn

I wanted to share this with you. A TED talk about the exploration of Saturn by Carolyn Porco.

I hope that you enjoy it!

Posted in Uncategorized | Leave a comment

The Solar Neutrino Problem

The solar core emits neutrinos as a result of the nuclear reactions in the pp-chain.

The thing with neutrinos is, they have a very small cross-section, and therefore are very difficult to detect (they can go all the way through Earth without interacting with anything).

The Homestake experiment, Lead, South Dakota, had the collection and counting of neutrinos coming from the Sun as a purpose. It consisted on a tank with 100,000 gallons (~380,000 L) of C2Cl4.

Cl-37 can interact with neutrinos of sufficient energy to produce radioactive Ar-37, with a half-life of 35 days.


The counting rate of neutrinos was measured in solar neutrino units (SNU, 1 SNU = 1e-36 reactions per target atom per second). The standard solar model predicts a rate of 7.9 SNU, while the outcome of the experiment was 2.23 ± 0.26 SNU. This discrepancy is the solar neutrino problem (SNP).

The Super-Kamiokande: Consisted on a tank of 3,000 metric tons of water. Neutrinos are able to scatter electrons, and the Kamiokande detector is able to detect the Cherenkov radiation produced when this happens. Cherenkov radiation is produced when an electron travelling in a medium (e.g. in water) travels faster than the speed of light in that medium (which is not physically impossible, since the speed of light in any medium is always slower than in vacuum). This is somewhat analogous to breaking the sound barrier.

The Kamiokande experiment was the first able to experimentally verify that the neutrinos have a non-zero mass.

The explanation for the SNP is that, since neutrinos have non-zero mass, they can transform between different types or flavors. Thus, there are 3 types of neutrinos:

  • Electron neutrinosνe
  • Muon neutrinos, νµ
  • Tau neutrinosντ

The transformation between one flavor and the others (called neutrino oscillation) takes place when neutrinos interact with electrons. The experiments only detected the electron neutrinos, explaining why a smaller number of neutrinos than that predicted by the models was detected.

EDIT: The transformation between flavors seems to be of a more complex nature than simply “interact with electrons”, which I don’t fully understand. To get some clue, read the paragraph under the “Theory” heading in this wikipedia article.

Neutrino oscillation was evidenced by the Sudbury Neutrino Observatory.

See the text under the heading “Resolution” in the wikipedia article. The explanation is quite neat.

Posted in Uncategorized | Tagged , , , , , , , , , , , | 2 Comments

Solar Model

The Sun, a typical main-sequence star of spectral class G2, with a surface of composition X = 0.73 (mass fraction of hydrogen) and Z = 0.02 (mass fraction of metals). Current estimated age: 4.52e9 years.

The standard solar model applies physical stellar modelling to describe the Sun (see Stellar Modelling).

  • The center of the Sun is rich in He-4 (product of pp-chain), and poor in hydrogen (low X, high Y).
  • The surface of the Sun is rich in hydrogen and poor in He-4. This is due to diffusive settling of heavier elements toward the center.
  • Since the birth of the Sun, its radius has increased by 10%, and its luminosity has increased by 40%.
  • The Sun’s primary energy source is the pp-chain.
  • 90% of the Sun’s mass is located within one half of its radius.
  • The luminosity suddenly increases at around 10% of the solar radius, which means that at this point the energy production is maximum. Two factors to take into account for this:
    (1) The energy production is higher as the mass within one shell increases, this is naturally happening as we get further away from the center.
    (2) The concentration of fuel available to generate energy (hydrogen) decreases as we move further away from the center.
    These two factors cause a maximum in the derivative of the Sun’s interior luminosity (dL/dr) at 10% of the solar radius (i.e. maximum of energy production).
  • At 71% of the solar radius, the energy transport regime changes from radiative transport (r < 0.71R) to convective transport (r > 0.71R).
Dependency of composition and luminosity on solar radius.

Dependency of composition and luminosity on solar radius.

Helioseismology = study of the oscillations of the Sun.

The Sun oscillates with roughly ten million vibration modes, typically with a very low amplitude (surface velocity of < 10cm/s, and luminosity variation  dL/L ~ 1e-6). Two kinds of modes identified.

  • p-modes, or five-minute oscillations, with periods between 3 and 8 minutes.
  • g-modes, with longer periods of about 160 minutes.

More about helioseismology.

Posted in Uncategorized | Tagged , , , , , , , , , | Leave a comment

Stellar Modelling

Stellar Models require:

  • The fundamental stellar structure equations (FSSE)
  • The constitutive relations (CR): “equations of state” describing the physical properties of the matter of the star.


If the structure of the star is changing, the influence of gravitational energy must be included. This introduces a time dependence not present in the static (time-independent) case.



Also, if acceleration is non zero, the acceleration term must be added to the pressure differential (first equation in FSSE):


The CR represent P, κ, and ϵ as a function of ρ, T, and composition.


Boundary Conditions: specify physical constraints to our equations.

  • Interior mass Mr and interior luminosity Lr vanish in the center of the star (r=0).
  • Another set of boundary conditions are required at a r=R* somewhere at the surface of the star. The simplest conditions are assuming that the temperature T, the pressure P and the density ρ vanish at r=R*. Strictly speaking, these conditions are never obtained in reality, so more sophisticated conditions may be needed.

How is it all used? The volume of the star is imagined to be constructed of spherically symmetric shells of width Δr. These shells separate the volume of the star into discrete Δr increments (…, ri-2, ri-1, ri, ri+1, ri+2, …, thus i is a “label” for one given shell), and the different properties are calculated via numerical integration for each r, using the Stellar Structure Equations, and the Constitutive Relations.

This numerical integration can be done starting from one boundary (either r=0 or r=R*) or somewhere in the middle point of one star radius, and integrate in both directions (i.e. toward the center and toward the surface). By doing this, for each value of i we can find the respective Pi, Mi, Li, and Ti.

In the end, the values obtained should match with the defined boundary conditions in both ends with a desired accuracy. Usually, it requires several iterations, which means that if the properties at the boundaries do not match the desired values, we must change the initial conditions and repeat the numerical integration.

In sum:

  1. Decide a starting point, and a set initial conditions (guess).
  2. Perform numerical integration toward both boundaries (center and surface).
  3. Check accuracy of solution. If the solution is not accurate enough, return to step 1. Otherwise, you’re done.


Vogt-Russell Theorem: take as a general rule, more than a rigorous law.

The mass and composition of a star uniquely determine its radius, luminosity, and internal structure, as well as its subsequent evolution.

Changes of properties in Main sequence stars…

  • Larger M mean larger central P and T.
  • In low-M stars, the pp-chain dominates.
  • In high-M stars, the CNO cycle dominates.
  • Star lifetimes decrease with decreasing L.
  • Stars of the main sequence have masses which range between:
    – M < 0.08 M (no nuclear reactions taking place).
    – M > 90 M (energy pulsation mechanism and unstable stars).
  • A M change of 3 orders of magnitude corresponds to:
    – A L change of 9 orders of magnitude (i.e. a damn huge change)
    – Only moderate T change (factor of 20, 2,700K — 53,000K).
  • A lower T involves a higher opacity (κ), i.e. low T favors convection domination.

Source (modified)

Posted in Uncategorized | Tagged , , , , , , , , , , , , , , , , | Leave a comment

Energy Transport

Energy transport inside a star takes place in three ways:

  • Radiation: dependent on opacityflux, density, and temperature. Larger opacity and flux involve steeper decrease in T with increasing r.
  • Convection: cool masses move “down” (in the direction of gravitational pull) and hot masses move “up” (against gravitational pull). If the T gradient is too steep, convection can play an important role.
  • Conduction: generally negligible.

We look more closely at convection.

  • Its treatment is very complex, and the theory is not yet fully developed.
  • It relies on Navier-Strokes equations in 3-D.
  • For the study of stellar structure, usually we use 1-D approximations (only spatial variable is r, i.e. we impose spherical symmetry).
  • The pressure scale height (a characteristic length scale for convection) is in the same order as the size of the star, thus convection is coupled to star’s behavior.

The model that we use is that of a hot bubble which (1) rises and expands adiabatically, and then after travelling some distance, (2) it thermalizes with the medium.

This is a thermodynamic model based on state functions (changes in thermodynamic state functions depend only on the starting and final states of the system, and not in the way followed).

The following equation describes how the temperature of the gas inside the bubble changes as the bubble rises and expands adiabatically:



CP and CV are the specific heat capacities at constant pressure and constant volume respectively.

Superadiabaticity: Case that the star’s actual temperature gradient is steeper than the adiabatic temperature gradient (i.e. inside the bubble). It represents a condition for convection domination:



If “act” is larger than “ad”, all luminosity is carried by convection.

It can be alternatively written as (assuming that the µ -mean molecular weight- does not vary):



So, either radiation or convection dominate in different circumstances. When is convection favored?

  1. Conditions of large stellar opacity
  2. Ionization taking place, which means large specific heat, and low adiabatic T gradient.
  3. Low local gravitational acceleration
  4. Large T dependence of nuclear reactions (e.g. CNO cycle and triple alpha)
Posted in Uncategorized | Tagged , , , , , , , , , | Leave a comment

Nuclear Reactions

What are the nuclear reactions taking place in the interior of the stars, and what are their elementary steps?

There are many. In these reactions, light elements fuse into heavier elements. The more massive is a star and the higher the temperature of its interior, the higher the number of possible reactions available, which transform atoms into heavier and heavier elements in a process known as nucleosynthesis.

The elementary steps of nuclear reactions need to fulfill some conditions:

  • The electric charge needs to be conserved.
  • The number of leptons (electrons, positrons, neutrinos, and antineutrinos) needs to be conserved.
    Conditions: matter leptons – antimatter leptons = constant
    Note: electron + positron = 2 photons (γ) (matter and antimatter annihilate each other producing photons)


Neutrinos are somewhat special in that they have a very small mass, and a very small cross section, therefore they have very long mean free paths (of the order of 10e9 solar radii!).

In the following, I represent nuclei by the symbol:


One possible reaction is the conversion of hydrogen into helium. This happens via the proton-proton chain or via the CNO cycle, where carbon, nitrogen, and oxygen act as catalysts.


The proton-proton chain consists on 3 branches: PP-I, PP-II, PP-III.

PP-I. Source.

PP-I. Source.

The slowest step in the PP-I chain is the first one, which involves the weak nuclear force.

PP-II. Source.

PP-II. Source.

PP-III. Source.

PP-III. Source.

In an environment similar to the center of the Sun, 69% of the time the PP-I chain takes place, while 31% of the time the PP-II chain occurs. In such environment, the occurrence of the PP-III chain is more rare, with a probability of 0.3%. These probabilities change depending on the temperature on the star.

In summary, the three branches of the PP-chain:

The next pic represents the CNO cycle.

CNO cycle. Source.

CNO cycle. Source.

The cycle above culminates with the production of carbon-12 and helium-4. Another possibility, less probable (0.04%) is the production of oxygen-16 and a photon instead.

The effect of fusing hydrogen into helium is the increase of the mean molecular weight (μ). The star begins to collapse, and this is compensated by a gradual increase of T and ρ.

Therefore, the PP-chain is dominant in low mass stars with low T, while the CNO cycle dominates at higher T and more massive stars.

In the triple alpha process, helium is converted into carbon. The “alpha” comes from the historical mysterious alpha particles, which turned out to be helium-4 nuclei. This process may be thought of as a 3-body interaction, and it has a very dramatic T dependence.

Triple alpha process. Source.

Triple alpha process. Source.

The power laws for the reactions above are follow. Simply note the temperature dependence of each reaction. Each needs larger T than the previous, but the T-dependence is much stronger for the latter (indicated by larger powers). Thus a given increase in T causes a larger and larger increase in the reaction rates:




At sufficiently high temperatures, many more reactions become available, consisting on the formation of heavier and heavier nuclei. Examples are carbon burning reactions, yielding oxygen, neon, sodium, and magnesium, and oxygen burning reactions, yielding magnesium, silicium, phosphorous, and sulphur.

If we plot the binding energy per nucleon (nucleons = neutrons and protons), Eb/A, we obtain a maximum at iron, and a series of stability peaks corresponding to particularly stable nuclei (e.g. helium, carbon, oxygen). These “enhanced stability peaks” are caused by the shell structure of atomic energy levels (similar to the shell structure of the electrons around a nucleus).

Generally speaking, fusion reactions of elements with atomic mass larger than 56, tend to consume energy, while fusion reactions resulting in iron or elements lighter than iron tend to liberate energy.

Posted in Uncategorized | Tagged , , , , , , , , , , , , , , , , , | Leave a comment


* Basics: – Difference between Fortran and Python? (compile step, weak vs. strong typed language)

Whereas Fortran is a compiled language, Python is an interpreted language that doesn’t require compilation. Python programs can be tested faster, without need of compiling after each change in the code. It also means that for computationally demanding programs, Fortran will be executed faster. One can take advantage of it by performing the number crunching with Fortran and using Python as a glue.

The second major difference is that whereas Fortran is a strongly typed language (meaning that variable types must be explicitly declared), Python is a weakly typed language, that is, variable types are implicitly assigned by Python, which usually implies less work from the programmer’s side.

– Basic datatypes (integer division!) and type conversion rules Everything, every kind of datatype, in python is called an object.

Integer, float, character, and boolean (True or False). Think of them as made by single elements. Beware of integer division:

>>> 3 / 2

To specify that a number is of float type, simply write it as 3. (adding a point):

>>> 3./2

Floating datatypes have priority over integers in arithmetic operations. This means that if I add for example an integer and a float, the result will be a float.

– Compound datatypes (mutable and immutable): e.g. tuple, string, list, dictionary

Strings: “…”, tuples (..,..), lists […,…], dictionaries { key: value, key:value}. Think on compound data types as lists of objects. Compound datatypes can be acted upon element-wise, and each element has its own corresponding index. Strings can be though of as lists of characters. Tuples are immutable, their elements cannot change once written. Example:

>>> a=(1,2,3)
>>> a[2]
>>> a[2] = 4
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

Note: strings behave like tuples, they’re immutable.

Dictionaries are similar to lists, except that the key (think of it as the index) is also used defined.

– Basic usage of compound datatypes (to be used for what?) Compound datatypes are a way to store collections of objects.

– Variables are not really “variables” but “name-tags” on objects Name-tags point to a specific amount of data in memory. Example:

>>> s = 'hello'
>>> l = s
>>> l
>>> s = 'hey'
>>> l

Up there, we don’t create two ‘hello’s, there is one ‘hello’ and two name-tags pointing to it (s and l). If we point the name-tag s somewhere else (e.g. ‘hey’), absolutely nothing happens to ‘hello’ and the other name-tags pointing to it.

– What are “side effects”? how to avoid them (if not wanted)? An example for a side effect:

>>> s = [1,2,3]
>>> l = [4,5,s]
>>> l
[4, 5, [1, 2, 3]]
>>> s[1]
>>> s[1] = 17
>>> l
[4, 5, [1, 17, 3]]

The logic is:

  1. We create a list s.
  2. We create a list l which points to s.
  3. We modify one object in s.
  4. We see this change also in l, since l is simply pointing at s.

To avoid side effects, an explicit copy of the list can be made by using:

listcopy = listname.copy()

– indexing and slicing

Sequences (e.g. strings, tuples, lists) can be indexed with o[index]. Dictionaries can be indexed by keys. Indexation begins at zero. Slicing takes slices of a sequence. Pay attention to the indexation, and to where the slice is being done. The syntax is: o[first:last] Examples:

>>> a = [0,1,2,3,4,5]
>>> a[:]
[0, 1, 2, 3, 4, 5]
>>> a[0:1] # Think of the slicing index as the commas, rather tha
# the elements. It indicates where we slice.
>>> a[0:-1]
[0, 1, 2, 3, 4]
>>> a[1:-1]
[1, 2, 3, 4]
>>> a[1:2]
>>> a[1:3]
[1, 2]

– Use of operators like “+” or “*” for lists or strings

“+” joins strings and lists. Example:

>>> 'hello' + 'world'

“*” appends one string or list to itself an integer number of times. Example:

>>> 'hello'*3
>>> a = [1,2,3]
>>> a*3
[1, 2, 3, 1, 2, 3, 1, 2, 3]

– I/O and string operations

Whatever is read from a file is a string. For reading and writing from/to files, we use the methods open, write, read, close, readline:

f = open("filename","x") # x is the option, it can be: w (write),                          # rt (read & write), r (read only),
                         # a (append)
f.write("text") # "text" is a string. To write a new line, we use "\n"
f.close() # Always close a file once it is not needed
f.readline() # Reads line by line


f.closed # Returns a boolean, True for file closed, elsewise False
with open("file","w") as name:
  name.write("text") # Automatically closes the file
                     #after the with statement.

* Control Flow:

Examples with each control block. Note the general syntax:

Control statement [condition or iteration]: # Note the colon
  Statement # Note the indentation
Additional control statements: # Again the colon
  Statement # Indentation of whatever happens inside the block
Statement # No indentation indicates that we are outside the 
          # control block

Important: The indentation must be consistant. Choose a fixed number of spaces to indent in each python program. Control blocks inside other control blocks are further indentated:

Control block 1:
  Control block 2:
  Statement outside CB 2 but inside CB 1
Statement outside CB 1

– if/elif/else

In [3]: if a == 1:
   ...:   print(1)
   ...: elif a == 2:
   ...:   print(2)
   ...: else:
   ...:   print(’A lot’)

– while. I don’t think that the break is required… But can be used to make sure that the program gets out of the control block under some conditions.

while True:
    n = raw_input("Please enter 'hello':")
    if n.strip() == 'hello':

– for

>>> for w in words:
...     print w, len(w)

– How to write the code? What is special about Python?

See the comment above about indentation.

* Writing Code:

– what is a “namespace”?

Namespaces are spaces containing a bunch of names (or variables). Different namespaces would be the namespace of the program, the namespace of a function inside the program, the namespace inside a module.

Each object has its own namespace.

Inside a function, the namespace is local to the function. The namespace of the main program is another one, and is called global. Do not be confused by that: The global namespace is NOT a “shared” namespace available to all objects (as it is in Fortran) Global is just the namespace corresponding to the program. If inside a function we want to use a nametag from the global namespace, we need to use the syntax “global name-tag“. Example:

a = 5
def add2():
 global a # Here we do it 
 return a + 3
a = add2()
print a
>>> 8

A name inside a module can only be used in the main program if it is imported. Example:

>>> pi # Name "pi" does not exist
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
NameError: name 'pi' is not defined
>>> import math # One way to import it
>>> math.pi
>>> from math import pi # Another way
>>> pi

So a namespace is a way to understand the scope of a name (= variable). A good way to think about namespaces is as boxes where you dump in names. One box cannot have two equal names. If we put a name into a box where there is one name called the same, the new name replaces the old one. In particular, importing names from modules should be done with some care, to make sure that something important is not being lost.

More about namespaces and all that, here.

– what is a module and how to use it?

A module is a file separate of the main program which contains objects. These objects can be easily made available to the main program by importing (see above).

– defining subroutines/functions


def square( a ):
   # This function returns the square of the argument
   a = a**2
   return a

Note the def keyword needed, the syntax similar to a block statement, and the return command, by which we set the output that is returned to the main program when the function is executed.

– basics of object orientation:

Every object has a class, a method (function of the object), and attributes. Classes can be predefined, or user defined.

xxx add scheme of namespace and “objectspace”

how to construct a class?

Example of constructing a new class called NewClass (I know…):

class NewClass(object):
  def __init__(self):
    self.method = method

This is kinda the minimum possible class (not exactly, but almost. See this for details).

Note the “object” object. This defines the type in the “object tree”.  The object type is the default for a new class. The self.method is an attribute of the class. It is simply a nametag stored inside the namespace of this class. Note also the syntax similar to a function.

After defining a class, we can access any object (nametags, methods, functions, subclasses…) inside this class.

why is it useful? (what is “self”? why is __init__ needed?)

New classes allow to create, hold, and access user defined collections of data in an organised way. It also allows to define ways to operate with this collection of data, and manipulate it.

The self is a way of calling the object of this class, within the class. A class must be defined without any knowledge about the nametags of the objects that the user or the programmer will want to create when they are needed. The self simply represents the nametag for this unknown object. Every method inside the class must have at least one argument, and this is the self argument.

The __init__ (double underscore) method is the way that classes are initialised in python.

* Using Numpy – what is numpy good for? how to use it?

Numpy has built-in methods specifically designed for working with multidimensional arrays and performing fast element-to-element operations. This makes numpy very good for avoiding loops over lists of objects and speeds up execution.

Numpy also has methods to compute vector-vector, vector-matrix, and matrix-matrix products:,b) # Scalar product

To use numpy, download it and import it to python as a module:

import numpy as np # "np" is an arbitrary shorthand name


– basic objects: arrays (important methods, attributes; how to generate them)

Array generation. Examples:

a = np.array([[1,2,3],[4,5,6]]) # Note () for argument delimitor
# [] for outer list, and [] for inner lists.
a = np.zeros([3,3],dtype="float64") # 3 x 3 array filled with 0
# Double precision
a = np.arange(1000) # 1-D Array with values from 1 to 1000
# Next some general examples:
np.ones((ndim,...)) # The argument is a tuple
np.empty((n,...)) # Does not initialise the array
np.eye(n) # n x n unit matrix (diagonal = 1, non-diag. = 0)
np.linspace(start,stop,num_elements) # Equally spaced points
np.diag(np.array([1,2,...])) # Diagonal matrix with diagonal =
# 1,2,...
np.random.random_integers(0,20,15) # 15 random int from 0 to 20

Main methods for arrays:

a.dtype # Datatype of the array
a.shape # Size of the array
len(a.shape) # Number of dimensions
a.ndim # Also number of dimensions
np.ravel() # "Flattens" array
np.reshape((2,3)) # Changes the shape (opposite to ravel)
a[:,np.newaxis] # Enforces the creation of a new dimension

The main attribute of an array is the type: e.g. “int64”, “int32”, “float64”. The syntax is:

a = array(content or size, dtype="datatype")

– indexing and slicing

Indexing and slicing works same as in regular python. Indexing begins with the element number 0.

Slicing works also same as regular python. A range can be indicated by a colon (e.g. a[1:3]), and a single colon indicates the whole range (see example below).

a = np.array([[1,2,3],[4,5,6]])
>> 1
a[0,:] # Print only first row.
>> [[1,2,3]]

See that the ordering of rows and columns goes as:


In numpy, the last index changes quickest.

– what is “broadcasting”? how do numerical operations work for arrays (and how to exploit this)?

Broadcasting can take place when operating with two arrays of different size. The smallest array will try to operate the largest one as much as it can. The simplest example would be the addition of the number 1 and any array “a“. The number 1 can be thought of as an array of dimension 1 and rank 1. When it operates, it operates on each element of a, thus it is the same as adding to an array of the same size as but made up of ones everywhere.

The same concept would happen to a small array which “fits” a larger array.

xxx Add some visual scheme of broadcasting

Conditions for array operation. If one these is not fulfilled, there will be an error:

  • Arrays have equal size, or
  • Arrays have different size, and for each pair of differing dimensions, one has always size one.

The trick for exploiting this is manipulating the arrays, adding and/or shifting dimensions so that the second condition is fulfilled.

– how to use numpy implicit loops to speed up things

Simple case: we have two vectors, and we want to multiply them element by element. Instead of constructing a loop that goes through each elements and performs the operation, we simply construct the two arrays and multiply them. This speeds up the program by avoiding a loop.

* Example Cases (relevant):

– compute a COM for a molecule


– enforce periodic boundary conditions


– use numpy for numerics in non time critical parts of simulation codes


* Advanced Python/Numpy usage (no details, only the basic concepts)

– what is scipy, usage for fitting or optimization

Scipy is an open source library of algorithms and mathematical tools for the Python programming language. It adds some fortran routines to python. The basic data structure in scipy is the array provided by numpy.

In fact, numpy is a subset of scipy.

xxx map

– parallelization with mpi4py

Parallelization can be used in python by importing modules prepared for this:

from mpi4py import MPI
rank = comm.Get_rank()
size = comm.Get_size()

Parallelization is implemented in a quite automatized way, so that the used does not need to think much about it. The only detail is that objects that will be shared amongst the processors have to be broadcasted. Example:

if size > 1: # "If nproc > 1, then broadcast" = comm.bcast(

For using mpi in a computer, one must make sure that the openmpi libraries are loaded, e.g.:

module add openmpi-gcc/1.3.3

Parallel python can be run like this:

mpirun -np 2 python

– wrapping of F90 code into Python using f2py

F2py calls fortran from within python. We need numpy for that since arrays are returned as numpy arrays. The program takes care of everything internally, and the user does not have to think about it.

It is probably more efficient to combine python and fortran than to compute everything in fortran. Fortran is better used to perform computationally demanding calculations, while python is by far more flexible for configuration, settings control, postprocessing…

Posted in Uncategorized | Tagged | Leave a comment